It’s beginning to look like the future of video is not going to be linked to rectangular screens. While the technology that puts moving images onto screens (from mobile to massive) will continue to evolve, other, more diverse ways of viewing video are starting to emerge.
Ultimately, we won't need screens because we won't use our eyes to watch video. Instead, somehow, we will merely become 'aware' of it. Which is to say that video will go straight from the source to our brains. I have very little idea about how this will happen, but I'm pretty sure it will. Quite when depends on a lot of research, some of which will fall into the 'hard' category, including research into a phenomenon that is intrinsically tied to awareness: consciousness itself.
Before then, we can probably expect to see retinal projection becoming a commercial reality. Why project video onto a wall when you can project it directly onto your retina?
This will be important for virtual and augmented reality. Awkward head-mounted displays that put a screen in front of your eyes can only ever be a phase we're passing through while other, much better techniques catch up.
Another barrier to cross is the one between pixels and objects. We might see pixels, but we don't perceive them. Of course, both of those statements are false if we're sitting too close or an image's resolution is too low. But forget that for a minute, because, strictly, pixels are only an intermediate stage in our perception, or our awareness, of an image.
Object awareness
Even in the most old-fashioned way of watching a video (looking at a screen), we don't care about the pixels. What we see is an image. But apart from the image itself (that is, the image taken as a whole), we are not looking at discrete video objects.
Nor is any part of the digital system that brings us that image 'aware' of anything in it, apart from the most basic parameters that apply to parts of the clip or all of it.
This is an essential distinction, because if the system were aware of objects within the video, there would be no reason why it shouldn't treat those objects as distinct entities and process them accordingly.
We do see the most basic antecedents of object orientation in Long-GOP video codecs, where movement is identified between adjacent frames, although I would argue that this is not in the same category as the object orientation we are talking about here, because the system is not 'aware' of what it is that is moving.
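To make the distinction concrete, here is a minimal sketch of block matching, the general kind of motion search that inter-frame codecs rely on. It is an illustration under simplifying assumptions, not the method of any particular codec: the frames are tiny greyscale grids, and the names make_frame, sad and find_motion_vector are hypothetical.

```python
def make_frame(width, height, bx, by, size=2):
    """Build a black greyscale frame with one bright size x size block at (bx, by)."""
    frame = [[0] * width for _ in range(height)]
    for dy in range(size):
        for dx in range(size):
            frame[by + dy][bx + dx] = 255
    return frame


def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(prev, curr, x, y, size=2, search=2):
    """Return (dx, dy) so that the block at (x, y) in curr best matches
    the block at (x + dx, y + dy) in prev, within +/- search pixels."""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = sad(curr, prev, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            # Skip candidate blocks that fall outside the previous frame.
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue
            cost = sad(curr, prev, x, y, px, py, size)
            if cost < best_cost:
                best_cost = cost
                best = (dx, dy)
    return best


# Hypothetical data: a bright block moves from (1, 1) to (3, 2) between frames.
prev_frame = make_frame(6, 6, 1, 1)
curr_frame = make_frame(6, 6, 3, 2)
vector = find_motion_vector(prev_frame, curr_frame, 3, 2)
```

The search returns only a displacement for a patch of pixels. Nothing in it records whether that patch is part of a face, a cactus or a teapot, which is the sense in which the system is not 'aware' of what is moving.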
With proper object-oriented video, the system would know that an image contains a face, a cactus and a teapot. Building on this, there would be thousands of times more information than that, going into more detail about the objects.
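As a rough sketch of what such a description might look like as data, the following assumes a very simple structure: each frame carries a list of labelled objects, each with a position and an open-ended set of attributes. The class names, fields and values are all hypothetical and do not correspond to any existing format or standard.

```python
from dataclasses import dataclass, field


@dataclass
class VideoObject:
    label: str                      # what the object is, e.g. "face"
    bbox: tuple                     # (x, y, width, height) in pixels
    attributes: dict = field(default_factory=dict)  # further detail about the object


@dataclass
class FrameDescription:
    frame_index: int
    objects: list = field(default_factory=list)

    def labels(self):
        """List the labels of every object the system knows about in this frame."""
        return [obj.label for obj in self.objects]

    def find(self, label):
        """Return all objects in this frame that carry the given label."""
        return [obj for obj in self.objects if obj.label == label]


# Hypothetical example: the frame from the text, with made-up positions and details.
frame = FrameDescription(
    frame_index=0,
    objects=[
        VideoObject("face", (120, 40, 64, 64), {"orientation": "frontal"}),
        VideoObject("cactus", (300, 200, 50, 120)),
        VideoObject("teapot", (420, 260, 80, 60), {"material": "ceramic"}),
    ],
)
```

Because each object is a distinct entity, a system holding this description could select the teapot with frame.find("teapot") and process it separately from everything else in the frame.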
So-called 'machine vision' is improving at a breathtaking rate, thanks to several techniques that have only recently become practical. Object-oriented video is related to this.
Language of vision
What's the point, ultimately? The point is that this is how we see.
Think about driving at night. The information you can see in the dark as you drive is probably millions of times less than what your eyes pick up in the daytime. The only reason that's OK is that much of the visual detail you see during daylight is completely unnecessary and might even be distracting. At night, you concentrate on the only information you can see. Ideally, the road has clear markings that will be picked up by the headlights. If it doesn't, you can still drive, but you rely on seeing the edge of the road. That isn't much to go on, but the reach of the headlights is normally enough, if you drive at a suitable speed, to pick up any hazards in time for you to respond to them.
But you quickly get used to driving at night. It soon feels comfortable and natural. I suspect that this is because our brain makes a very solid assumption, increasingly so with experience, about the space we are driving through. With very few visual cues, we're able to construct a 3D moving 'scene' that we're confident enough to propel a fast-moving vehicle through. This isn't a video game; it's reality!
It's precisely because we're able to build a 3D world with such certainty that object-oriented video will be so effective. It doesn't need masses of resolution (except on the acquisition and 'object recognition' side, hence the need for 8K and beyond) because the 'reality' is created by our own brains.
If we can find a way to encode object-oriented video into the same set of 'primitives' used by our own perception, if we can understand the 'language' of our mind's vision, then we will have the most efficient possible method of presenting the best video we have ever seen.
Graphic by Shutterstock