Replay: Phil Rhodes takes a look at the complexities involved in bringing a unified raw codec such as ProRes RAW to fruition.
The option to store raw video in a standardised format is something that's long overdue. Raw video, like any other sort of image, is a list of numbers, and it's clearly possible to put those numbers in a file, and even to make the file smaller without affecting the picture too much. This is, then, a good example of a situation where several technical approaches would work, and the best choice is really a matter of priorities.
What's important is that camera manufacturers, software developers and everyone else use the same technical approach. The reason it hasn't happened until now is largely political rather than technical, as I'll explain below. But the theory behind a design such as the new ProRes RAW is interesting to think about.
The task is reasonably straightforward, in principle. By “raw video” we generally mean minimally-processed images from a camera that doesn't have three separate sensors for the red, green and blue channels. Often this means a single sensor with red, green and blue filters according to Bryce Bayer's design, in which there are twice as many green pixels as red or blue, though there are other filter patterns, such as the vertical stripe arrays used on Sony's largely-historical F35 and the Panavision Genesis. Even three-chip cameras, which do have separate RGB sensors, could reasonably be described as “recording raw” in circumstances where minimal processing is done to the image data before it's recorded.
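As a concrete illustration, here's a minimal Python sketch of how a Bayer mosaic splits into its colour planes. The RGGB tile ordering is just one common layout, assumed here for simplicity; other orderings such as GRBG exist.

```python
import numpy as np

# A minimal sketch of how a Bayer mosaic maps to colour channels,
# assuming an RGGB tile layout. For a sensor of H x W photosites,
# each 2x2 tile holds one red, two green and one blue sample -
# hence "twice as many green pixels as red or blue".

def split_bayer_rggb(mosaic: np.ndarray):
    """Split a single-channel RGGB mosaic into its colour planes."""
    r  = mosaic[0::2, 0::2]          # top-left of each 2x2 tile
    g1 = mosaic[0::2, 1::2]          # top-right
    g2 = mosaic[1::2, 0::2]          # bottom-left
    b  = mosaic[1::2, 1::2]          # bottom-right
    return r, np.stack([g1, g2]), b

mosaic = np.random.randint(0, 4096, (8, 8), dtype=np.uint16)  # fake 12-bit data
r, g, b = split_bayer_rggb(mosaic)
print(r.size, g.size, b.size)        # 16 32 16: green has twice the samples
```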
Any file format intended to store raw video needs the ability to store frames where the three channels of data are not all the same resolution. This is already common; component video formats (often called YUV, YCbCr, etc.) do it. A raw format probably also needs a way of describing what sort of sensor layout the information in the three channels represents. That might be skipped in favour of a Bayer-only format, since Bayer sensors are overwhelmingly the application for raw, although it wouldn't take much to include a way of indicating what sort of patterning the sensor data uses.
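To make that idea concrete, here's a hypothetical sketch of the kind of per-frame metadata such a format might carry. None of these field names or values come from ProRes RAW; they're invented purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical header fields a raw container might carry. The names
# and layout here are illustrative, not taken from any real format.

class CFAPattern(Enum):
    BAYER_RGGB = 1
    BAYER_GRBG = 2
    STRIPE_RGB = 3   # vertical stripe arrays like the F35's

@dataclass
class RawFrameHeader:
    width: int             # full sensor width in photosites
    height: int
    pattern: CFAPattern    # how channel data maps back to photosites
    channel_dims: tuple    # (w, h) per channel; need not all match
    bit_depth: int = 12

hdr = RawFrameHeader(
    width=3840, height=2160,
    pattern=CFAPattern.BAYER_RGGB,
    channel_dims=((1920, 1080), (1920, 2160), (1920, 1080)),  # R, G, B
    bit_depth=12,
)
```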
Likewise, the three channels are just image data and could be compressed using any of the many approaches found in existing codecs. The new format seems intended to provide multi-stream performance on modest hardware, so the underlying compression is unlikely to be an advanced wavelet-based design like JPEG2000. The use of the ProRes name may not mean much technically, but ProRes itself uses a discrete cosine transform that's actually fairly basic and comparable to the old DV, DVCPRO and DVCPRO-HD formats, and even JPEG. The performance advantages of wavelet transforms are the subject of much debate, but as usually applied they're far harder work for the computer. It's also possible that ProRes RAW could support more than one compression technique.
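The transform-and-quantise principle behind DCT codecs can be sketched in a few lines. This shows only the generic idea; ProRes's actual quantisation tables and entropy coding are Apple's own and aren't represented here.

```python
import numpy as np
from scipy.fft import dctn, idctn

# The generic idea behind DCT codecs like ProRes, DV and JPEG, on one
# 8x8 block: transform, throw away precision in the coefficients, and
# transform back. The inverse DCT is cheap, which is why these codecs
# decode quickly on modest hardware.

block = np.random.rand(8, 8) * 255          # one 8x8 block of pixel data
coeffs = dctn(block, norm='ortho')          # forward 2D DCT-II
quantised = np.round(coeffs / 16) * 16      # crude uniform quantiser
restored = idctn(quantised, norm='ortho')   # inverse transform
print(np.abs(block - restored).max())       # small loss, fast to compute
```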
Where this potentially gets complicated is once the material has been compressed, stored in a file, pulled back out of that file and decompressed on the workstation. At that point, the task of recovering complete colour information (demosaicing) must take place, and it's here that the process becomes both technically ambiguous and more than a little political, because it involves a lot of matters of opinion. Demosaicing is intrinsically imperfect, involving compromises between sharpness and aliasing, colour precision and sensitivity, noise and saturation, and many other things too complex to go into here.
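To see why this is a matter of opinion, consider the simplest possible green-channel reconstruction, sketched below. Real decoders replace this naive averaging with edge-aware methods, and those choices are precisely where manufacturers differ; the sketch assumes the same RGGB layout as above.

```python
import numpy as np
from scipy.ndimage import convolve

# A deliberately simple bilinear demosaic of the green channel of an
# RGGB mosaic, to make the compromise concrete: averaging neighbours
# suppresses aliasing but softens edges. Every real decoder makes a
# more sophisticated version of exactly this trade-off.

def bilinear_green(mosaic: np.ndarray) -> np.ndarray:
    h, w = mosaic.shape
    green_mask = np.zeros((h, w), dtype=bool)
    green_mask[0::2, 1::2] = True            # green sites in odd columns
    green_mask[1::2, 0::2] = True            # green sites in even columns
    green = np.where(green_mask, mosaic, 0.0)
    kernel = np.array([[0, .25, 0],
                       [.25, 1, .25],
                       [0, .25, 0]])
    # At green sites the kernel returns the sample itself; at red/blue
    # sites it averages the four green neighbours.
    return convolve(green, kernel, mode='mirror')
```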
This is difficult to handle as part of a standard, because some of the information involved (the settings, if you like, for the raw decoder) may be considered proprietary. That position is, frankly, slightly questionable.
Among other things, a decoder needs to understand what colour the filters in the camera actually are, and what the desired output colourspace is, in order to produce accurate colorimetry. That's just a set of three two-dimensional colour coordinates, plus another for the white point; some ancillary data, such as an output matrix or even a full LUT, could be recorded too. Settings such as the white balance gains (based on the monitoring used when the scene was shot) are just as easily recorded, as they are now in most existing raw stills formats. Some sort of pre-compression luminance processing is likely to be required to achieve reasonable compression performance, but again, if variable, the luminance processing function can be recorded. These things are fairly straightforward because they are absolute.
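As a sketch of how that absolute metadata might be applied on decode, here's a minimal example in Python. All the numbers are invented for illustration; a real file would carry the camera's own measured values.

```python
import numpy as np

# Applying recorded "absolute" metadata on decode: per-channel white
# balance gains followed by a 3x3 camera-to-output matrix. The values
# below are invented; they stand in for what the file would record.

wb_gains = np.array([1.9, 1.0, 1.6])            # gains from the shoot
cam_to_rec709 = np.array([[ 1.6, -0.4, -0.2],
                          [-0.3,  1.5, -0.2],
                          [ 0.0, -0.5,  1.5]])  # rows sum to 1, so
                                                # white maps to white

def decode_pixel(rgb_cam: np.ndarray) -> np.ndarray:
    balanced = rgb_cam * wb_gains               # neutralise the illuminant
    return cam_to_rec709 @ balanced             # move to output colourspace

print(decode_pixel(np.array([0.2, 0.4, 0.3])))
```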
Concerns over how the actual colour recovery is done, however, are much more a matter of opinion. Some manufacturers have used this to make claims about the cleverness of their Bayer reconstruction, and may want to tweak and adjust various parts of the decoding process with their own particular aims in mind. How much difference this really makes is unclear. Arri and Blackmagic Design have routinely recorded raw data with the intention that it might meet any of several decoders, so the precedent exists; with any luck, most manufacturers will see the value of a less balkanised ecosystem for raw. Indeed, Panasonic, Sony and Canon have been working very closely with Atomos on recording the format to its recorders, and it can be assumed that a similar level of co-operation existed with Apple over the decoding in the latest update to FCP X.
A unified raw format is long overdue and completely plausible from a technical perspective. As ever, though, new standards don't help unless they're universally, or at least widely, adopted. All we can do is wait and see who's willing to get involved.