
What the smartphone says about the future direction of camera development

Written by Phil Rhodes | Aug 9, 2022 8:00:00 AM

Phil Rhodes on how the processing tweaks and tricks in your average smartphone signpost the future direction of camera development.

Billboards are often forty-eight feet wide. Putting a cellphone image on one, assuming the cellphone has something like a Sony IMX703 sensor which is about 7.5mm wide, represents an enlargement pushing 2000:1. That’s what we technically call a stunt, and it is stunts like this, perhaps more than detached consideration, which lead to speculation that cellphones will inevitably overtake more conventional cameras for both stills and video.
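For the record, the sum behind that figure is simple enough, taking the forty-eight feet and the 7.5mm sensor width at face value:

```python
# Enlargement from a roughly 7.5 mm wide sensor to a 48-foot billboard.
billboard_mm = 48 * 12 * 25.4     # 48 feet in millimetres (14,630.4 mm)
sensor_mm = 7.5
print(f"{billboard_mm / sensor_mm:.0f}:1")   # roughly 1950:1 - "pushing 2000:1"
```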

This involves the sort of clever processing that takes the output of a tiny sensor, which we might kindly describe as being of constrained excellence, and mallets it into shape like one of those YouTube videos where someone restores a comprehensively mangled supercar. We might conclude that it doesn’t matter so long as you end up with a Ferrari Enzo one way or the other, and that’s what’s making people think about the future of cinema cameras, which, it’s often assumed, aren’t using clever post-processing in the same way phones do.

Computational photography

Only, they are. What we’re talking about here is computational photography, although for film and TV that computation is often not done in camera, so the comparison with cellphone tech isn’t always obvious. Either way, though, the process is the same: making finished images out of the information we record, rather than expecting the unaltered information to directly represent the finished image. The material we capture in camera isn’t the shot; it’s data we’ll use to create the shot.

The fifth public beta of Resolve 18, for instance, introduced the option to use motion data recorded by some of Blackmagic’s cinema cameras to perform better stabilisation (this is also part of the intent of the lens data protocols developed by Cooke and Zeiss). Resolve’s preexisting warp stabiliser is a perfectly competent solution to minor instability problems, although with almost any related technology the word “warp” is both the solution and the problem. Many of them can wind up accurately simulating the point of view of someone staggering home after a particularly lively evening at the bar, and the new gyro data makes for much better results - almost good enough to make people question the necessity of a gimbal.
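How might that work? Blackmagic hasn’t published its maths, but the broad idea is straightforward: integrate the gyro’s readings between frames and counter-warp each frame accordingly. Here’s a deliberately simplified sketch in Python with OpenCV, using a small-angle approximation in which yaw and pitch become pixel shifts and roll becomes an in-plane rotation; the function and parameter names are illustrative, not anything from Resolve, and the signs depend on your gyro’s axis conventions.

```python
import cv2
import numpy as np

def stabilise_frame(frame, roll_rad, pitch_rad, yaw_rad, focal_px):
    """Counter-rotate one frame using gyro angles accumulated since the
    reference frame. Small-angle approximation: yaw and pitch become pixel
    shifts, roll becomes an in-plane rotation. Illustrative only."""
    h, w = frame.shape[:2]
    # Yaw and pitch move the image by roughly focal_length(px) * tan(angle).
    dx = focal_px * np.tan(yaw_rad)
    dy = focal_px * np.tan(pitch_rad)
    # Build a rotation about the frame centre, then add the counter-shift.
    m = cv2.getRotationMatrix2D((w / 2, h / 2), np.degrees(roll_rad), 1.0)
    m[0, 2] -= dx
    m[1, 2] -= dy
    # Warp; cropping away the exposed edges is left to the caller, which is
    # why shooting wider than the delivery format is so attractive.
    return cv2.warpAffine(frame, m, (w, h), flags=cv2.INTER_LINEAR)
```

A real implementation would use a full homography built from the lens’s focal length and distortion, plus rolling shutter correction, but the principle is the same.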

And the plot thickens. As with any post effect that involves moving the frame around, things work best when the original material was shot with a short exposure, so as not to besmirch the result with motion blur that no longer matches the apparent motion of objects in the scene. Assuming we’re not mimicking the beach assault scene from Saving Private Ryan, Resolve will let us add some motion blur back in, using either the motion effects controls on the colour page or the optical flow and vector motion blur nodes in Fusion. Neither is perfect (the latter is better), but either does a reasonable job of smoothing out the motion. We might also choose a 6K camera, so we can shoot wide, stabilise and trim the framing in post while still ending up with a razor-sharp 4K result.
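For the curious, the vector approach can be roughed out with off-the-shelf tools. The sketch below uses OpenCV’s Farnebäck dense optical flow and simply averages samples taken back along each pixel’s motion vector; it is not what Fusion’s node does internally, just the underlying idea, and the parameter values are guesses.

```python
import cv2
import numpy as np

def vector_motion_blur(frame_a, frame_b, samples=8, shutter=0.5):
    """Approximate motion blur on frame_b by smearing it along the dense
    optical flow from frame_a to frame_b. 'shutter' is the fraction of the
    frame interval the simulated exposure stays open."""
    grey_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    grey_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(grey_a, grey_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = grey_a.shape
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    acc = np.zeros_like(frame_b, dtype=np.float32)
    for i in range(samples):
        t = shutter * i / max(samples - 1, 1)   # step back along the motion path
        map_x = xs - flow[..., 0] * t
        map_y = ys - flow[..., 1] * t
        acc += cv2.remap(frame_b, map_x, map_y, cv2.INTER_LINEAR).astype(np.float32)
    return (acc / samples).astype(np.uint8)
```

Even this toy version is a lot of work per frame at cinema resolutions, which hints at why none of it happens in realtime.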

Moving into the lightfield

These are all perfectly serviceable examples of computational photography, but the big idea in all of this has to be the lightfield camera, as exhibited by the people at Fraunhofer with sixteen cameras in a four-by-four grid. That allows for much more accurate depth estimation than just a stereo pair – possibly even good enough for depth keying without greenscreen, or for adding properly depth-cued fog. Photographically accurate depth of field simulation is possible, and the shot can be moved around or stabilised over a physical distance equal to the size of the array of cameras. It’s an alien idea, not least because it completely obviates the artisan lenses we’ve become used to adoring, but it’s perhaps the ultimate expression of the concept.
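To see why sixteen cameras beat a single stereo pair, consider the standard depth-from-disparity relation across the grid’s various baselines. The figures below are invented for illustration rather than anything Fraunhofer has published.

```python
import numpy as np

# Invented figures: 12 mm lenses, 4 micron photosites, cameras 60 mm apart
# in a four-by-four grid (so horizontal baselines of 60, 120 and 180 mm).
focal_px = 12e-3 / 4e-6                      # focal length in pixels (3000)
baselines_m = np.array([0.06, 0.12, 0.18])
subject_m = 2.0

# Classic stereo relation: disparity = focal_px * baseline / depth.
disparity_px = focal_px * baselines_m / subject_m

# Depth uncertainty for a one-pixel disparity error: roughly Z^2 / (f * B).
depth_err_mm = 1000 * subject_m**2 / (focal_px * baselines_m)

for b, d, e in zip(baselines_m, disparity_px, depth_err_mm):
    print(f"baseline {b*100:4.0f} cm: disparity {d:5.1f} px, "
          f"~{e:4.1f} mm of depth per pixel of error")
```

Longer baselines give bigger disparities and therefore finer depth resolution, and having many pairs to cross-check is what makes depth keying without greenscreen start to look plausible.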

In the end, most of the tricks available to cellphones can be available to film and TV cameras if we want them to be – we’re all interested in better noise reduction algorithms, for instance. Sure, the computation done for film and TV work is likely to be a little heavier-duty than the sort of thing phones try to do in realtime. Even a big Threadripper or Xeon workstation will probably struggle to run, at 24 frames per second, the Fusion vector motion blur node tree we discussed. There’s also the practical and organisational consideration that cinematographers are less likely to be comfortable allowing the electronics to mess so profoundly with their images in camera, as phones do, and are more likely to defer the decision-making to post. The principle, though, is the same.
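As a sense of scale, even the most naive temporal noise reducer, a simple running average across frames, is trivial to write but quickly falls apart without motion compensation, which is where the real computational expense lives. A toy version, purely for illustration:

```python
import numpy as np

class TemporalNR:
    """Naive exponential running average across frames. Without motion
    compensation it smears anything that moves, which is exactly why the
    production-grade versions are so much heavier to compute."""
    def __init__(self, strength=0.8):
        self.strength = strength   # 0 = no smoothing, nearer 1 = more smoothing
        self.state = None

    def process(self, frame):
        frame = frame.astype(np.float32)
        if self.state is None:
            self.state = frame
        else:
            self.state = self.strength * self.state + (1 - self.strength) * frame
        return self.state.astype(np.uint8)
```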

Post involvement

Either way, all this has some pretty fundamental implications for cinematographers, who are increasingly likely to want and need to be more heavily involved in post, where the data captured on set is reconstructed into the final image. It’s not new: ever since the advent of electronic grading, camera specialists have complained about their beloved images suffering the photographic equivalent of a radical facelift under the knife of a surgeon they’ve never met, and as more and more of the image is built in post production, that’s only going to become more pronounced.

The likelihood, then, is not that cellphones will somehow outdo mainstream cinema cameras, but that cinema cameras might start doing similar things – if not technically, then at least philosophically – and suggestions to fix it in post might not carry the stigma they once did.