RedShark Archive Exposure: Why can't there just be one, really good, codec?

Written by Phil Rhodes | May 26, 2014 12:00:00 PM

There are so many great articles in RedShark's archive - we're publishing this again in case you missed it first time!

Phil Rhodes doesn't mince his words in wondering why there can't be just one, really good codec

For some reason(!) the world has failed to take any notice of the article I wrote a while ago on the subject of uncompressed images. To be completely fair, this might be because NAND flash chips have failed to increase in size and speed by the order of magnitude that would have actually made uncompressed workflows practical in the interim.

Codecs on the Brain

What brought the subject of codecs back to my forebrain, however, was a refreshingly honest and interesting discussion with Al Mooney, product manager at Adobe for Premiere Pro, at IBC. Apparently, they get quite a lot of requests from people to develop an intermediate or mezzanine codec in the vein of Avid's DNxHD and Apple's ProRes, presumably because those asking for it are under the impression that the ability of things like Media Composer and Final Cut to do certain desirable things is based on something that's special about DNxHD or ProRes.

Mezzanine Codecs

This is immediately reminiscent of some of the points I made in support of uncompressed workflows in the previous article, but it's probably worth a quick intermission here to think about what these mezzanine codecs actually are, technically. Neither Avid's nor Apple's pet codec is anything very new or special, technologically. Both are based on the discrete cosine transform used in image compression since JPEG, and thus have a lot in common with MJPEG, DV, DVCPRO-HD, Theora, and others. Their advantage is that they are implemented to handle currently-popular frame sizes, bit depths, and frame rates, and are constrained in such a way that nonlinear edit software can make assumptions that lead to technological efficiencies. If forced to choose, DNxHD is probably the more egalitarian since it's standardised as SMPTE VC-3, but given the huge market penetration of ProRes, the choice is made for us in many circumstances.
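For the curious, that shared DCT heritage is easy to see in miniature. Below is a minimal numpy sketch of the orthonormal 8x8 DCT that JPEG-family codecs are built around; it is purely illustrative, and the real ProRes and DNxHD implementations add quantisation, entropy coding and a great deal of engineering on top.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix, the transform JPEG-family codecs share."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def dct2(block):
    """2D DCT of a square block: smooth image data compacts into few coefficients."""
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

# A smooth horizontal gradient, typical of real picture content.
block = np.tile(np.linspace(16, 235, 8), (8, 1))
coeffs = dct2(block - 128)      # level-shift, as JPEG does
print(np.round(coeffs, 1))      # energy ends up concentrated in the first row
```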


No Adobe Interest

Thus there is no particular technological genius involved, and Adobe could quite trivially come up with some combination of preexisting technologies and stamp it with their logo. This wouldn't help anyone, since it would add another layer of incompatibility to the situation and wouldn't necessarily affect the way in which Premiere can do certain things with certain types of media. This is capably handled by current versions of Premiere in any case, since they offer a level of media compatibility on a single timeline which can still raise eyebrows among those used to other systems. To make a proprietary codec look good they'd probably actually have to start disabling things if it wasn't in use, which is clearly ludicrous (although some people have probably done worse in the name of vendor lock-in in the past). But in general, for all these reasons, it's refreshing that Adobe report absolutely no interest in going down the route of developing a branded codec, having realised the evils that lie therein.

So, we can't have uncompressed, and the current approach to proprietary codecs – given that DNxHD is still fundamentally an Avid thing, albeit ostensibly open – is less than ideal. And now I've written myself into the corner of having to suggest a reasonable alternative, but that's good, because there is actually an option out there which could hit both issues at once.

A Reasonable Alternative

The video codecs HuffYUV (or its more modern incarnation Lagarith) and Ut Video are both open source and could thus be used by anyone, and provide at least something of an answer to our desire for uncompressed images. Both use clever mathematical techniques to achieve compression without sacrificing image quality, and it's my observation that Ut Video, at least, can provide 4:4:4 8-bit RGB for only about 50% more bitrate than the highest quality ProRes. While this immediately sounds like witchcraft, it is entirely legitimate under formal information theory, and people interested in a formal mathematical description of this may wish to peruse the Wikipedia article on Huffman coding and other approaches to the minimum-redundancy storage of data.

Informally, the approach used is to store the most frequently-encountered values in a stream of data as short codes, while using longer codes to represent less frequently-encountered values. A table (ordered as a tree using David A. Huffman's algorithm) is generated for each frame, mapping the codes back to the original values. The procedure is therefore lossless: in the sense of a video codec, the stored frame can be precisely recovered. Practical applications tend to use a more conventional image compression algorithm similar to JPEG, and then Huffman-encode the error between that compressed image and the original; because the errors tend to be small, there are fewer likely values, and the encoding works better.
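To make that concrete, here is a minimal Python sketch of the table-building step, using the standard-library heapq module. It is illustrative only: real codecs such as Ut Video do this per frame in heavily optimised native code, but the principle of short codes for common values is the same.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code table: frequent symbols get short codes."""
    freq = Counter(data)
    # Heap entries are (frequency, tie-breaker, node); a node is a symbol or a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                  # degenerate case: only one symbol present
        return {heap[0][2]: "0"}
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))
        count += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):     # internal node: descend both branches
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                           # leaf: record the code for this symbol
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

# Pixel values from a fairly flat image region: 200 dominates, so it gets the shortest code.
samples = [200] * 90 + [201] * 6 + [199] * 3 + [255]
table = huffman_codes(samples)
encoded_bits = sum(len(table[s]) for s in samples)
print(table)
print(encoded_bits, "bits coded vs", 8 * len(samples), "bits raw")
```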


Entropy

It is possible to present such algorithms with high-entropy data they cannot compress, or cannot compress very well. Formally, entropy refers to complexity or disorder. It's possible to calculate a value representing the entropy of an image; a picture comprising one completely flat colour would have an entropy of zero. While it might seem that a real-world image, given noisy sensors, fuzzy lenses and all the other imperfections of photography, would contain a lot of randomness and therefore have quite high entropy, it's not as bad as it might seem. Practical images might contain a lot of sky, or a lot of someone's face, or a lot of grass, and within those areas the probability of high contrast, and therefore of big differences between adjacent pixels, is quite low. It is often claimed that photographic images contain lots of redundant information, and this is, as we can see here, strictly true, although that isn't generally what someone discussing a lossy codec means when they use the phrase.

A high-entropy image, by comparison, might contain entirely random values, making it difficult to compress using entropy coding techniques because it contains no redundant information; no value is related to any other. An image generated by Photoshop's add-noise filter will have entropy tending toward 1 (the maximum, on a normalised scale). Images of this sort are likely to be inflated, rather than compressed, by entropy coding, as almost every possible value appears in the input and no one value is much more common than any other. Without the ability to represent large numbers of values with short codes, the compression scheme ceases to work.
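For anyone who wants to see the numbers, the following sketch computes Shannon entropy in bits per pixel for an 8-bit, single-channel image (my assumptions, for illustration); a flat frame comes out at zero, and a noise-filled frame approaches 8 bits per pixel, which is 1 on the normalised scale used above.

```python
import numpy as np

def entropy_bits(image_8bit):
    """Shannon entropy of an 8-bit image in bits per pixel (0 = flat, 8 = maximally random)."""
    hist = np.bincount(image_8bit.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                    # treat 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())

flat  = np.full((1080, 1920), 128, dtype=np.uint8)                 # one completely flat colour
noise = np.random.randint(0, 256, (1080, 1920), dtype=np.uint8)    # an 'add noise'-style frame

print(entropy_bits(flat))           # 0.0 bits per pixel
print(entropy_bits(noise))          # close to 8.0 bits per pixel
print(entropy_bits(noise) / 8.0)    # normalised entropy, tending toward 1
```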

There are also algorithms such as arithmetic coding (optional in H.264 and mandatory in H.265/HEVC) which offer better compression performance than Huffman's, although at the cost of being harder work for the computer.

Implementing Huffman-Style Coding

Many implementations of Huffman-style entropy coding achieve something like 2:1 compression, depending on the source. This would, for instance, be sufficient to record 1080p24 10-bit RGB data to an LTO-5 tape in realtime, and it allows Ut Video to achieve the enticing 50%-more-than-ProRes figure stated above, give or take 10-bit precision. To date, improving the performance of lossless video codecs on practical computer hardware has been a priority. On compression, calculating a table or tree which allows the best possible encoding is not trivial, and on decompression, a JPEG-style image must generally be decoded and then corrected using the error values recovered via the table. It is probably not currently reasonable to build a battery-powered field recorder using these techniques, at least not without creating custom silicon to do certain aspects of the job. On a workstation, however, things are different.
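The LTO-5 claim is easy to sanity-check with back-of-the-envelope arithmetic; the tape's native write speed of roughly 140 MB/s is my assumed figure for illustration, not something from the article.

```python
# Back-of-the-envelope check of the LTO-5 claim. The ~140 MB/s native write speed
# is an assumed figure, and container/audio overhead is ignored.
width, height, fps = 1920, 1080, 24
bits_per_pixel = 3 * 10                          # RGB, 10 bits per channel

uncompressed_mb_s = width * height * fps * bits_per_pixel / 8 / 1e6
compressed_mb_s = uncompressed_mb_s / 2          # the roughly 2:1 figure quoted above

print(f"uncompressed: {uncompressed_mb_s:.0f} MB/s")   # about 187 MB/s, too fast for the tape
print(f"at 2:1:       {compressed_mb_s:.0f} MB/s")     # about 93 MB/s, comfortably within ~140 MB/s
```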


Ut Video Codec

Recently, I've been using the Ut Video codec, which is a much more recent and actively-developed project than HuffYUV, as a lossless intermediate to solve some compatibility problems between older versions of Premiere CS and four-channel audio files produced by Atomos recorders (the bug, I should point out, is Adobe's). Developed by Takeshi Umezawa, Ut Video emerged in 2008 and probably isn't as widely known as Ben Roudiak-Gould's earlier HuffYUV project. However, Ut Video does support truly lossless encoding in both YUV and RGB formats with various colour subsampling options.

It does, however, lack support for anything other than 8 bits per channel, and as such it is not ready to be put into use as either an intermediate or an acquisition codec for high-end work. In this brave new world of 8-bit cameras it is still not without application, although 10-bit support, and more, would obviously be great. Windows users can install it and have all applications which are aware of the Video for Windows codec management system become Ut Video-aware (strictly speaking it is a VCM codec, or rather a collection of them targeting different pixel formats). It is already supported internally by the free encoding tool ffmpeg, although VLC Media Player has yet to implement Ut Video playback. Mitigating this is the fact that even modern versions of Windows Media Player are perfectly happy with VCM codecs and will handle Ut Video AVIs without modification.
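As a sketch of the ffmpeg route, something like the following (driven from Python purely for consistency with the other examples) would transcode a clip to Ut Video in an AVI container. The file names are placeholders, and exactly which pixel formats the utvideo encoder accepts depends on the ffmpeg build.

```python
import subprocess

# Transcode a placeholder 'input.mov' to Ut Video in an AVI, keeping audio as uncompressed PCM.
# -pix_fmt yuv422p asks for 4:2:2, subject to what this build's utvideo encoder supports.
subprocess.run([
    "ffmpeg",
    "-i", "input.mov",
    "-c:v", "utvideo",
    "-pix_fmt", "yuv422p",
    "-c:a", "pcm_s16le",
    "output.avi",
], check=True)
```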

Compatibility and Image Quality

To return to our original concerns over both compatibility and image quality, it's not yet obvious that this sort of codec technology is the answer. In the first place, it's uncertain whether there will ever be a need for lossless codecs of this sort before solid-state storage becomes big enough to accommodate fully uncompressed footage. If there is, this solution is still difficult, not because it's hard to generate a codec that would be suitable for use as a standard, but because it's difficult to get people to use standards. In the case of proprietary mezzanine codecs, there's a motivation to pursue vendor lock-in, and in general the issue of patent protection has been putting non-technical limitations on new codecs for some time. The commercial imperative is an increasing problem in acquisition, where the explosive proliferation of flash card formats, and of file formats to go on them, is something that will doubtless be looked back upon with horror by future generations. The increasing capability to quickly develop and deploy new technologies as largely software- or firmware-based entities since the early 90s has allowed the problem to grow enormously, and there's no sign of this easing in the near future.

So if there's a conclusion to draw here, it is that the problem with achieving good image quality on inexpensive equipment with easy workflows is not, and has not been for some time, much to do with technology.

More from RedShark on compression:

Why uncompressed is better