After HD and 4K comes HD Audio. Or does it?

Written by RedShark News Staff | Nov 1, 2013 2:00:00 PM

Hi Res Audio

It's only reasonable that High Resolution Audio should accompany high resolution video. But it's not new. We've had High Resolution Audio for around fifteen years. It's just that we choose not to use it.

Sony has a very nice looking website extolling the undoubted virtues of High Resolution Audio. As an audio enthusiast (and, professionally, as a specialist) I love this idea. But I'm just frustrated that it's coming fifteen years late.

Here, in a couple of paragraphs, is a potted history.

The Progression of Audio

In the 80s, CDs arrived. They were good, but not that good. They still had limits - namely that they couldn't record anything above around 20kHz and had a limited dynamic range - although in theory a much wider one than vinyl. Then, throughout the 90s, most of the recording industry graduated to digital recording media. Most of it was identical in resolution to CDs - 16 bit, with 44.1kHz or 48kHz sampling. But towards the end of the decade, 24 bit recording and high sample rates arrived. It was High Resolution Audio.

And what happened? Pretty much nothing. Studio master tapes improved, but instead of embracing the new optical disc formats that were designed to deliver High Resolution Audio to end users, we all became transfixed with iPods, MiniDiscs and streamed audio (although Spotify, etc, didn't arrive for another 10 years or so). Absolutely none of these sounded anything like as good as even CDs. So instead of High Resolution, like an unruly mob, we adopted Low Resolution audio.

Distinguishing High Resolution Audio

Many people (you know the ones: one speaker on the bookshelf and the other behind the sofa) wouldn't recognise High Resolution audio even if it performed Riverdance in front of them. Nor would they have the equipment or the room acoustics to distinguish it from lesser forms of experience. And - I include myself in this - you can't deny the convenience of file-based music players. Sometimes, especially when you're sitting in a car or on an aircraft, the question of perfection doesn't arise.

But for real audio enthusiasts, MP3, or any type of (lossy) audio compression, just isn't good enough - especially when they know that higher resolution audio than even CD quality is captured in the studio.

Market Opportunity

So, you can understand why Sony sees High Resolution Audio as a market opportunity, now that we're going through another step change in video - from HD to 4K. And with storage around 1000 times cheaper than it was when the first iPods came out, there's every reason to urge portable player manufacturers to enable High Resolution audio on their devices, even though the tracks will need four, eight or even sixteen times the storage space to accommodate them.
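To put rough numbers on the storage question, here is a minimal sketch in Python of the arithmetic for uncompressed stereo PCM. The function names and the 24 bit/96kHz and 24 bit/192kHz examples are illustrative assumptions, not figures from Sony, and the multiple you end up with depends on what you compare against (a lossy file or a CD-quality one) and on whether the files are losslessly packed.

```python
def pcm_kbps(sample_rate_hz, bits, channels=2):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits * channels / 1000


def megabytes_per_minute(kbps):
    """Storage needed for one minute of audio at the given data rate."""
    return kbps * 1000 * 60 / 8 / 1_000_000


# Hypothetical formats chosen for illustration only.
formats = [("CD 16 bit/44.1kHz", 44_100, 16),
           ("24 bit/96kHz", 96_000, 24),
           ("24 bit/192kHz", 192_000, 24)]

cd_rate = pcm_kbps(44_100, 16)
for label, rate, bits in formats:
    kbps = pcm_kbps(rate, bits)
    print(f"{label}: {kbps:.1f} kbps, "
          f"{megabytes_per_minute(kbps):.1f} MB per minute, "
          f"{kbps / cd_rate:.2f}x the CD rate")
```

Measured against a compressed file rather than a CD, the multiples get bigger still, which is why the cost of storage matters so much here.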

The original High Resolution audio formats - in their physical form - never caught on. From memory, there was DVD Audio - which used the extra capacity of the DVD format to store the larger audio files and usually a surround-sound version as well (I've still got the DVD Audio version of Fleetwood Mac's Rumours - and listening to it in surround was like being on-stage with the band. It was a startling if not particularly naturalistic way to listen!). And then there was Super Audio CD, which used a completely different sampling method, and, in my view, was the best-sounding of all the formats, but it was also the most problematic to produce, edit and author.

The Technical Stuff

If you're not interested in the technical stuff, you can skip this bit, but if you understand digital video, you won't have any problems with it.

You get the same issues in audio as you get in digital video. If you record with too few bits, you get a "steppy" signal. With video, these "quantisation errors" look like contour lines, most noticeable with very gradually changing blue sky gradients. With audio, the quantisation makes a noise (or introduces a noise into the original signal). It's most audible at the end of long fade-outs (try the three-minute-or-so fade-out at the end of Stevie Wonder's He's Misstra Know-It-All) and with long piano notes, as they decay to silence. Piano's a really good example. I remember trying this: play a piano chord and let it decay to nothing. With 16 bit recording, it becomes distorted after only about ten seconds. With 24 bit recording, you can still hear it clear and undistorted after 30 or more seconds. (Your mileage may vary but the difference will be the same.)
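If you want to see the effect rather than hear it, here is a minimal sketch in Python, under some simplifying assumptions: a pure sine wave stands in for the decaying piano note, samples are simply rounded to the nearest level with no dither (real converters normally add dither), and the function names are made up for the illustration. It compares how large the rounding error is relative to a very quiet signal at 16 and at 24 bits.

```python
import math


def quantise(x, bits):
    """Round a sample in the range -1..1 to the nearest level of a signed grid."""
    scale = 2 ** (bits - 1) - 1
    return round(x * scale) / scale


def error_ratio_db(amplitude, bits, n=1000):
    """Signal-to-quantisation-error ratio, in dB, for a sine of the given peak level."""
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * 7 * i / n)
        err = quantise(x, bits) - x
        signal_power += x * x
        error_power += err * err
    return 10 * math.log10(signal_power / error_power)


# A tone 60 dB below full scale, like the tail of a long fade-out.
quiet = 0.001
for bits in (16, 24):
    print(f"{bits} bit: error is {error_ratio_db(quiet, bits):.1f} dB below the signal")
```

The gap between the two results should come out somewhere near 48 dB, which is about 6 dB for each of the eight extra bits. The quieter the signal gets, the closer the 16 bit error creeps to the music itself.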

24 bit recording in a studio is now considered essential because it allows so much more headroom for loud noises. If you allow too much "buffer" for unexpected peaks, the rest of the recording will suffer: with 16 bits, if you have to leave 4 bits for unexpected stuff, then you effectively have only a 12 bit recording.
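The headroom trade-off is easy to put into numbers. This is a rough sketch in Python using the standard rule of thumb that each bit is worth about 6 dB of theoretical dynamic range; the function names and the choice of four bits of headroom (taken from the example above) are purely illustrative.

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


def usable_bits(total_bits, headroom_bits):
    """Bits left for the programme once some are reserved for peaks."""
    return total_bits - headroom_bits


for total in (16, 24):
    left = usable_bits(total, 4)
    print(f"{total} bit: {dynamic_range_db(total):.1f} dB in total, "
          f"{left} bits ({dynamic_range_db(left):.1f} dB) after 4 bits of headroom")
```

On these figures, a 24 bit recording with four bits held back still has about 120 dB to play with, against about 96 dB for a 16 bit recording with no headroom at all.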

So, if the number of bits is the dynamic range, the sample rate is the equivalent of video resolution. The more high frequencies you have in video, the sharper and more detailed your image. The more you have in audio, the more subtle and nuanced the high notes.

Nyquist Theorem

There was a reason why the CD's sample rate was fixed at 44.1kHz. It's because it's generally reckoned that humans can't hear above about 20kHz. When you sample something, you can only record frequencies up to half the sample rate. It's called the Nyquist theorem, in case you want to impress, or annoy, your friends.
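Here is a small sketch in Python of what "half the sample rate" means in practice. It assumes an idealised sampler with no anti-alias filter (real converters filter out anything above the limit before sampling, which is why 44.1kHz leaves a little room above 20kHz), and the function names are invented for the example. A tone above the limit isn't simply lost: once sampled, it is indistinguishable from a lower "alias" tone.

```python
import math


def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency a tone appears at after sampling without an anti-alias filter."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sampled_cosine(tone_hz, sample_rate_hz, count):
    """The first `count` samples of a cosine wave at tone_hz."""
    return [math.cos(2 * math.pi * tone_hz * n / sample_rate_hz)
            for n in range(count)]


rate = 44_100
print("Limit:", nyquist_limit(rate), "Hz")
print("A 25kHz tone turns up at", alias_frequency(25_000, rate), "Hz")

# The two sets of samples match, so nothing downstream can tell them apart.
above = sampled_cosine(25_000, rate, 8)
alias = sampled_cosine(alias_frequency(25_000, rate), rate, 8)
print(all(abs(a - b) < 1e-9 for a, b in zip(above, alias)))
```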

Why Record Frequencies You Can't Hear?

Why would you want to record frequencies you can't hear anyway? That's a very good question but many people argue that even though you can't hear them directly, the higher frequencies interact with each other to create lower frequencies that have a direct bearing on the sound.

Super Audio CDs (and the format that they store) use a different way to sample. What they do is use a very high sample rate, and only one bit. That, of course, sounds ridiculous, but there is some solid theory in this madness. By sampling at an extremely high rate, at one bit, you end up with a picture of a sound wave that is built up by averaging the Ups and the Downs. If there are more Up samples at a given time, it means that the graph drawing the waveform is higher. If there are more Down samples, then it's lower. There are so many samples that it's possible to draw the waveform very accurately based on the density of Up samples versus Down samples. What this does as well is side-step the problem of what happens with very quiet sounds, when the voltages produced by decoding conventional sampling are very small, and can be affected by the inaccuracy and variation in analogue electronic components. Essentially, all you have to be able to do is tell the difference between a 1 and a 0, as opposed to any one of 65,000 or so values.
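The "density of Ups and Downs" idea can be sketched in a few lines of Python. To be clear about the assumptions: this is a textbook first-order delta-sigma modulator, not the scheme Super Audio CD actually uses (that runs at 2.8224MHz with more sophisticated noise shaping), and the function names are made up for the illustration. It turns a signal into a stream of single bits, then recovers the level by averaging them.

```python
import math


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: samples in -1..1 become a list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the last output.
        integrator += x - feedback
        bit = 1 if integrator >= 0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def density_to_level(bits):
    """Average the bits and map the density of 1s (0..1) back to a level in -1..1."""
    return 2.0 * sum(bits) / len(bits) - 1.0


# A steady level of 0.5 should produce Ups about three quarters of the time.
steady = delta_sigma_1bit([0.5] * 4000)
print(steady[:8], density_to_level(steady))

# A slow sine wave, decoded by averaging the bits in blocks of 100.
n = 2000
sine = [0.8 * math.sin(2 * math.pi * i / n) for i in range(n)]
stream = delta_sigma_1bit(sine)
decoded = [density_to_level(stream[i:i + 100]) for i in range(0, n, 100)]
print([round(v, 2) for v in decoded])
```

The decoded values trace out the sine wave, coarsely here because the blocks are short. The more bits you average over, the finer the picture, which is why the sample rate has to be so high.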

Whether or not this is academic doesn't really matter, though, now that storage and processing are so cheap. We are orders of magnitude more capable of playing High Resolution audio than we need to be - at least digitally. You still need precision analogue electronics to be able to hear it, and that's where, I think, Sony is pitching itself as it promotes better audio.

Music to our ears

I'm all in favour of this. It's not going to change the world, but it is going to make a lot of people like me happier. And what's even better is that High Resolution audio is no longer tied to a physical format.

I hope Sony, and the publishers of downloadable High Resolution audio files succeed. It will be music to my ears.
