RedShark News - Video technology news and analysis

Latency is the enemy of the metaverse

Written by David Shapton | Feb 11, 2022 8:55:09 AM

I'm (only just) old enough to remember seeing the Apollo moon landings televised live. It seemed remarkable then, and even more so now, when you realise that the entire project relied on computers that your Apple Watch would laugh at.

There were lots of eye-popping live video moments to choose from, but the one that I still remember vividly was when the Apollo 17 Lunar Module blasted off from the surface of the moon.

It was about as gripping as moon coverage could get because this was the first time we'd seen the ascent from the moon live. It would have required an independent video link (analogue, of course) from the camera - or a nearby transmitter - to Earth.

For the astronauts, it was tense. Their rocket motor was a single point of failure, and if it hadn't lit up, they would have died on the moon.

Back on Earth, someone had the job of operating that camera. I assume the controls would have been familiar to anyone who's driven a PTZ (Pan, Tilt, Zoom) camera. That's not exactly easy here on Earth, so imagine the responsibility of the brave operator tasked with tracking that take-off.

I didn't know much about video at my tender age, but something fascinated me. The TV commentator, introducing the take-off, said that we were lucky to watch it live from the surface of the moon. He continued: "I don't envy the camera operator. He's got to remember that there's something like a three-second delay between his camera commands and seeing the results".

Radio waves travel at the speed of light. But that's not much help when you're talking to something as far away as the moon. Three seconds (90 frames in NTSC) is an eternity when you're filming a moving object.
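As a rough check on the commentator's figure, here is a minimal sketch that works out the round-trip light time to the moon and converts a delay into NTSC frames. The average Earth-moon distance and the 30000/1001 NTSC frame rate are standard reference values rather than anything taken from the broadcast, and the function names are purely illustrative.

```python
# Back-of-envelope check of the Earth-moon control delay.
# Assumed figures (not from the broadcast): average Earth-moon distance
# and the NTSC frame rate of 30000/1001 frames per second.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
EARTH_MOON_KM = 384_400             # average distance; it varies over the month
NTSC_FPS = 30_000 / 1_001           # approximately 29.97


def round_trip_seconds(distance_km: float) -> float:
    """Time for a command to go out and the resulting picture to come back."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S


def frames_elapsed(delay_s: float, fps: float = NTSC_FPS) -> float:
    """Number of video frames that pass during a delay."""
    return delay_s * fps


if __name__ == "__main__":
    light_only = round_trip_seconds(EARTH_MOON_KM)
    print(f"Light-only round trip: {light_only:.2f} s")
    print(f"Frames in that time:   {frames_elapsed(light_only):.0f}")
    print(f"Frames in 3 seconds:   {frames_elapsed(3.0):.0f}")
```

Light alone accounts for roughly 2.6 seconds of the round trip. The remainder of the commentator's "something like three seconds" would presumably have come from ground links and equipment, although that is a guess on my part.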

But the nervous operator got it exactly right.

I can only imagine that they sent the command to start moving the camera up at a pre-planned point in the countdown. Whoever it was kept the spacecraft in the frame for nearly 30 seconds after take-off, and even appeared to track the module when it started moving out of the picture.

That's latency for you: the delay between commanding something to happen and it actually taking place. It might sound like a mere technicality, but the reality is that it's one of the most critical factors in our analogue - and, especially - our digital lives.

Latency is a big problem

Just imagine driving a car when there's a 1000ms (that's a second) delay between the steering wheel and the front wheels. Or the brake pedal and the brakes. It just wouldn't work: it would be lethal.
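To put a number on that, this small sketch shows how far a car travels before a one-second-late control input takes effect. The two speeds are arbitrary examples chosen for illustration.

```python
# How far does a car travel while a control input is "in flight"?
# The speeds below are arbitrary examples, not figures from any study.

def distance_travelled_m(speed_kmh: float, delay_ms: float) -> float:
    """Distance covered (in metres) during a control delay."""
    metres_per_second = speed_kmh * 1000 / 3600
    return metres_per_second * delay_ms / 1000


if __name__ == "__main__":
    for speed_kmh in (50, 100):
        metres = distance_travelled_m(speed_kmh, delay_ms=1000)
        print(f"{speed_kmh} km/h with 1000ms of delay: {metres:.1f} m")
```

At 100 km/h, that's nearly 28 metres covered before the wheels or brakes even begin to respond.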

Latency is a problem for musicians. Our ears and bodies are extremely sensitive to timing, especially when listening to or performing music. Virtual synthesisers are extremely popular because they sound nearly as good as the real thing but at a fraction of the cost. You need a controller keyboard connected to your computer to play a virtual synthesiser. That's my set-up at home, and I've noticed that it's not until you get the latency between key presses and hearing the sound down to below 20ms that it feels like you're "connected" to the instrument.

It's latency that prevents musicians from playing together remotely on Zoom. You don't notice the delays on Zoom during a conversation because people speak serially - i.e. one after another. The odd third of a second (around 300ms) here or there doesn't matter in conversation, but that delay is a show stopper for music. It would make it impossible to play on the beat.
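One way to see why 300ms is fatal for music while 20ms is tolerable is to express each delay as a fraction of a beat. The sketch below assumes a tempo of 120 beats per minute, which is just an illustrative choice.

```python
# Express a delay as a fraction of one beat at a given tempo.
# The 120 BPM tempo is an assumption chosen for illustration.

def beat_length_ms(bpm: float) -> float:
    """Duration of one beat in milliseconds."""
    return 60_000 / bpm


def delay_in_beats(delay_ms: float, bpm: float) -> float:
    """How much of a beat a given delay takes up."""
    return delay_ms / beat_length_ms(bpm)


if __name__ == "__main__":
    tempo = 120
    for delay_ms in (20, 300):
        fraction = delay_in_beats(delay_ms, tempo)
        print(f"{delay_ms}ms at {tempo} BPM is {fraction:.0%} of a beat")
```

At that tempo a beat lasts 500ms, so a 300ms delay puts you more than half a beat behind everyone else, while 20ms is only a few per cent of a beat.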

Swedish low-latency pioneer Elk has built a hardware/software ecosystem that brings remote latency for musicians right down, and the way it has done so tells us a lot about where latency comes from.

The raw, unencumbered latency of the internet is surprisingly low. It's much less than we usually experience in our homes because WiFi adds its own delays. Another culprit is the computer itself. Computers aren't optimised for real-time performance - but they can be with the right software.

Elk has written its own audio-optimised operating system, Elk OS. It reduces computer audio latency by a factor of twenty or more. With Elk OS - running on Elk's own pro audio hardware and supported by its subscription service - latency is low enough for musicians hundreds of miles apart to play together with professional audio quality.


ELK Live, allowing musicians to play together perfectly over the internet. Image: ELK.

Even the Elk system has limits, and that limit is explicitly the speed of light. Places on opposite sides of the Earth have a hard limit on theoretically achievable latency. We have yet to see a patent to travel faster than light.
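That hard limit is easy to estimate. The sketch below takes half of the Earth's circumference as the shortest surface route between opposite sides of the planet and works out the one-way delay, both in a vacuum and in optical fibre. The refractive index of 1.47 is an assumed typical value for fibre, and real cables never follow the ideal path, so treat these as lower bounds rather than predictions.

```python
# Lower bound on latency between opposite sides of the Earth.
# Assumptions: the signal follows the surface (half the equatorial
# circumference) and fibre has a refractive index of about 1.47.

SPEED_OF_LIGHT_KM_S = 299_792.458
EARTH_CIRCUMFERENCE_KM = 40_075      # equatorial circumference
FIBRE_REFRACTIVE_INDEX = 1.47        # assumed typical value


def one_way_ms(distance_km: float, refractive_index: float = 1.0) -> float:
    """One-way propagation delay in milliseconds through a given medium."""
    speed_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return distance_km / speed_km_s * 1000


if __name__ == "__main__":
    antipodal_km = EARTH_CIRCUMFERENCE_KM / 2
    vacuum = one_way_ms(antipodal_km)
    fibre = one_way_ms(antipodal_km, FIBRE_REFRACTIVE_INDEX)
    print(f"One way in a vacuum: {vacuum:.0f} ms")
    print(f"One way in fibre:    {fibre:.0f} ms")
    print(f"Round trip in fibre: {2 * fibre:.0f} ms")
```

Even under these ideal conditions, the one-way figure is several times the 20ms that feels "connected" when playing an instrument.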

We see the results of excessive latency everywhere, from a "laggy" electronic viewfinder to the excruciating ten or more seconds of uncertainty at the beginning of live-streamed shows.

Latency and the metaverse

But, within the boundaries of physical possibility, we're starting to see some significant advances. First, remote live production at broadcast levels of quality is now entirely possible thanks to devices like Matrox's Monarch Edge encoder/decoders that not only send video that's good enough to broadcast over the public internet but do so with latency as low as 100 ms (1/10th of a second).

Meanwhile, the metaverse will both pose and eventually solve many issues around latency. It will have to. Let's have a closer look at this.

First, and perhaps most obviously when you think about it, is latency in your metaverse device. Think about a fully immersive metaverse set-up. It's vital to have a convincing 3D world to wander into - and that's what tends to grab the headlines. But far more fundamental to the experience is that there's a digital version of you walking around in a virtual world. So forget for a moment about what you look like in your avatar incarnation. Instead, consider this: for the "illusion" of reality to work, your avatar has to behave as if it's a natural, living, human body. Not only that, but it has to respond in the same way, too.

And the biggest reason why it might not respond like a living human body is latency.

Let's go back to that moon video. It doesn't matter what latency exists in the system: it will be double that in a real scenario. Why? Because if you, via your avatar, are responding to stimuli fed to you by the metaverse, it's not just a question of your own reaction time: those reactions need to have a causal effect on your avatar. That, too, will incur latency. Exactly how much depends on the capabilities of your local part of the metaverse. But at the very least, any movement or action you take will set up a causal chain of events. So, for example, if you move your arm, the computer responsible for reacting to your "commands" will generate a 3D representation of that arm and the movement. It takes time. If it's ray-traced, it might take more time.

And where's the server responsible for rendering your actions as a 3D moving object in a virtual 3D world? In addition to the raw "compute" time, there might be latency between you and the server, which could be in a cloud-connected computer data centre on the other side of the world.
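The causal chain described above can be thought of as a latency budget: every stage adds its own delay, and the network is crossed twice. Here is a minimal sketch of that idea. The stage names and every number in it are placeholders invented for illustration, not measurements of any real system.

```python
# A toy latency budget for one "move your arm, see it move" round trip.
# All stage names and figures are hypothetical placeholders.
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: float


def motion_to_display(local_ms: float, network_one_way_ms: float,
                      render_ms: float) -> list[Stage]:
    """Build the chain of stages; the network appears once in each direction."""
    return [
        Stage("sensing and tracking on your device", local_ms),
        Stage("network: you to server", network_one_way_ms),
        Stage("rendering on the server", render_ms),
        Stage("network: server back to you", network_one_way_ms),
    ]


def total_latency_ms(stages: list[Stage]) -> float:
    """Sum the delay contributed by every stage in the chain."""
    return sum(stage.latency_ms for stage in stages)


if __name__ == "__main__":
    scenarios = {
        "server nearby": motion_to_display(5, 5, 10),
        "server on the far side of the world": motion_to_display(5, 100, 10),
    }
    for label, stages in scenarios.items():
        print(f"{label}: {total_latency_ms(stages):.0f} ms")
        for stage in stages:
            print(f"  {stage.name}: {stage.latency_ms:.0f} ms")
```

With these made-up numbers, the distant server turns a 25ms round trip into 215ms, and almost all of the difference comes from the two network crossings rather than from the rendering itself.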

If you even tried to walk with the sort of latency that's common in graphics rendering - even if it's "real-time" - you'd probably fall over, or at least you'd wonder why your feet had turned to jelly.

The metaverse will have to be cloud-based, and some cloud developments look like they can help reduce latency.

Amazon's "Wavelength" is a service that places its cloud computers at the "edge" of the network - not only local to the users but literally inside a 5G network. The result is a powerful computing capability with extremely low latency. Expect to see more of this topology as the metaverse rolls out.

It could go either way. If the metaverse is "thrown together" with little thought about latency, then it will probably be horrible. But if, with a lot of care and massive attention to every potential latency-inducing node in the metaverse topology, it can get latency down to levels that our own human perception can either accept or discount, then we're on the verge of an incredible virtual experience. And one that matters far more than mere pixels.