David Shapton on all the processes that combine to create a picture in the digital age, and how the essence of the picture survives them all.
There’s been a kind of duality in the air ever since digital images became a thing. On the one hand, they’re a kind of “painting by numbers” distortion of the real world, ruthlessly quantised and often compressed to within an inch of their lives. On the other, there’s the current reality in digital cinema, which is that high-resolution cinema cameras can capture the subtlest, most nuanced images ever seen. What’s going on here? Several things, I think, and we’ll talk about those in detail in a subsequent article.
But none of that is the point I want to make here. The real point is that images are much more robust than their storage and reproduction media would suggest. They seem able to survive, in essence, to a far greater degree than the processes they’ve been through should allow. The question is: what is that 'essence'?
A tale of two pictures
Have a look at this picture:
It was a pretty spectacular sunrise last June at about 4:30 am, just after I’d taken my daughter to the airport at stupid o’clock. I love this shot, taken from the living room of the house where I was staying, not far from Baildon Moor in Yorkshire, which lies in the direction of the sunrise. At the time, it looked like the sky was on fire. It’s not technically great: it was taken on an iPhone 13 Pro Max through an old glass window. I boosted the shadows because the house next to the church was too dark in the dawn light. You can tell it was still pretty dark from the porch lights on the house and the street lights in the lower right of the picture.
But there’s another reason it’s not technically great. In addition to passing through the window glass, the image had also been printed out as a birthday card and re-photographed, this time with an iPhone 15 Pro Max, from across a room, using the 5x optical zoom lens.
This is what it looked like:
So, that picture was shot on an out-of-date iPhone through a glass window, printed on a piece of cardboard, photographed from a distance of around 4.5m on another iPhone, and then reproduced here.
And yet, if you were to show the original (shown here below) to someone, asking them to describe what they see, they would say exactly the same as if you had shown them the copy.
What can we conclude from this?
Not very much, you might think. I mean, if it’s a picture of a sunrise, then it’s a picture of a sunrise. That could be true of billions of pictures that aren’t even remotely similar to each other.
But the two photos aren’t just a bit similar; they’re almost identical. Obviously they’re taken from the same viewpoint, at the same time of day, and the only difference is one of strictly technical quality.
I suppose my point is that, given the tortuous and almost comically sub-optimal processing path between the original and the “birthday card telephoto” version, shouldn’t the latter look a lot worse than it does?
It absolutely should, especially when you realise that the lower-quality shot had passed through two smartphones, both of them probably among the most sophisticated image-processing devices ever made.
Image processing under the hood
So what’s going on here? I think the answer is a lot of image processing, some of it, I suspect, driven by AI.
My iPhone 15 Pro Max has persuaded me that I no longer need to take a dedicated camera with me when I go out. And if someone asks me to take a portrait shot, I sit them on my office chair with an illuminated background and the two panel lights that I use for everything from Microsoft Teams to video podcasts. Even though I have a very nice big-name mirrorless camera set up on a tripod facing the chair, I still take the portrait with the iPhone. It actually looks better - although I’m sure that, with careful grading and manipulation, the mirrorless camera would do an equal or better job.
But the point is, unless I’m going to spend an hour or two doing the best job possible, the iPhone gives me a result that looks good and makes everyone happy straight out of the device. (And, of course, I can email the photos directly from the phone).
Whichever way you look at it, the iPhone is doing miraculous stuff with its images. Granted, it’s an optically precise device, but the computation that goes into making any modern smartphone’s pictures is immense. And now that AI is in the frame, you have to wonder whether it is “seeing” images in a different way.
One possibility is that the AI, in a sense, “knows” what it is looking at. Maybe not in any way that you or I would understand, but - let’s put it this way - if you had to say what the “essence” of that picture was, you almost certainly wouldn’t resort to pixels. You’d use high-level terms like “sunrise”, “orange”, and “church” - and you might go on to give geometrical details like “we’re looking from a viewpoint that’s to the left of the church, which is across a quiet-looking residential road”. But you needn’t stop there. Zoom in to another level, and you might say that the church is made from smoke-darkened blocks, that it is surrounded by grass and that there is a slight air of otherworldliness about the image overall.
Zoom in still further and you’ll be getting into quite fine detail.
At some point, your AI will have enough information to reconstruct the image. It might have several attempts, comparing each with the original picture.
But what it won’t do - unless you specifically ask it to - is reproduce the artefacts of the processing journey that this image has been through.
This process should be capable of making a remarkably close approximation of the image and, I suspect, one that might actually look better. No, it won’t be more accurate, but sunrise pictures aren’t engineering diagrams or legal documents. Remember that we don’t need ray tracing or anything like that here - just inference from the context of the picture.
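To make the idea concrete, here’s a minimal sketch, in Python, of the describe-regenerate-compare loop I’m imagining. I should stress this is pure speculation, not how any real phone or codec works: every name in it (describe, generate, similarity) is a hypothetical stand-in for a captioning model, an image generator and a perceptual metric respectively.

```python
from typing import Callable

# Hypothetical interfaces: any captioning model, text-to-image generator
# and perceptual metric could fill these roles. None of these is a real API.
Describe = Callable[[bytes, int], str]        # image, detail level -> description
Generate = Callable[[str], bytes]             # description -> candidate image
Similarity = Callable[[bytes, bytes], float]  # two images -> score in 0.0..1.0

def semantic_roundtrip(original: bytes,
                       describe: Describe,
                       generate: Generate,
                       similarity: Similarity,
                       max_level: int = 4,
                       good_enough: float = 0.95) -> bytes:
    """Describe the image at increasing levels of detail, regenerate it
    from the description alone, and stop once the result is close enough.
    The description - not the pixels - is what would be stored or sent."""
    best: bytes = b""
    best_score = -1.0
    for level in range(1, max_level + 1):
        prompt = describe(original, level)       # "sunrise", "orange", "church"...
        candidate = generate(prompt)             # the model's attempt at a rebuild
        score = similarity(original, candidate)  # compare with the original
        if score > best_score:
            best, best_score = candidate, score
        if best_score >= good_enough:
            break                                # enough of the 'essence' captured
    return best
```

The appealing property of such a scheme, if it ever worked, is exactly the one described above: because the picture is rebuilt from a description of what it shows rather than from its pixels, the artefacts of the processing journey simply aren’t part of what gets carried across.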
Much of this is only speculation on my part, fuelled by curiosity about how that photograph could have gone through such a drawn-out, potentially degrading process and still look okay, if not brilliant.
And I do honestly think that AI will be the still and video image codec of choice within a few years.