Replay: Accessibility is often the most under-considered aspect of production, and neglecting to create captions or subtitles could be costing you millions of viewers. Isn't it about time we finally started taking accessibility in video much more seriously?
It is the elephant in the room when it comes to producing film and video. Accessibility. It is quite often relegated to the bottom of the list, or even worse not considered at all.
Whilst it is quite true that in broadcast television and studio films there are closed captions and subtitles for the hard of hearing, and audio descriptions for those who suffer from sight problems, smaller scale production on the other hand often falls flat on its face when it comes to such consideration.
A look through YouTube, Vimeo, and a vast number of company websites with self-hosted promotional videos shows that the problem is endemic, despite the fact that companies could be on dubious legal ground for not considering such factors. Okay, I'll put my hands up right now and admit that I, too, have been guilty of this.
The reasons for not catering fully to such disabilities are few. It usually boils down to two things. Time and money. This is a particular issue if the video is on a tight deadline and/or the client is a cheapskate. But creating fully accessible video for people with disabilities is also hampered by the lack of standards and by the lack of any practical means to implement them.
There are two main components that we need for wide-ranging accessibility. Closed captioning and an audio description track. Simple, isn't it? Or is it?
Closed captions are similar to subtitles. But whereas subtitles only refer to what somebody is saying, closed captions give a visual description of sound as well. For example, if a character drops a glass on the floor off camera or reacts in a certain way, the captions might say "[Sound of glass smashing off camera]" or "[Sarah lets out a loud shriek]". Such descriptions extend to many minor sounds too, for instance if a character heaves a heavy sigh, or if there is a subtle sound of a tap dripping. Closed captioning gives a much more vivid and superior account of what is happening compared to traditional subtitling. Subtitling is still very valid, but it doesn't compare in scope to closed captions.
Likewise, audio descriptions describe what is happening on the screen. So for instance, in addition to a film's traditional soundtrack, a narrator will describe events. For example "John enters the smoky, darkened room angrily, almost slipping on a banana skin that was on the floor. He grimaces profusely as his feet slip and slide around on the squeaky floor..." Once again, this allows the audience to build a visual reference in their heads as to what is happening, as opposed to the confusion of wondering who is where and who is doing what.
Okay, so those are the main ways of additional communication within video. How do we go about implementing them, and why don't we produce them as a matter of course?
The first reason is time. If a video is long it takes quite a while to transcribe. As a result some companies fall back on YouTube's automatic subtitling system and end up annoying more people than they satisfy! The second reason is money. Isn't captioning and transcription expensive?
These days the creation of a caption track is not a huge problem. There are professional companies around, such as Rev, that will transcribe or produce a subtitle file for around $1 a minute. These aren't automated services either, but ones that use real humans to do the transcribing, and output files are available in the format of your choice, for example WebVTT, the preferred format of Vimeo.
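To give a feel for what such a file contains, here is a minimal sketch in Python that writes out a WebVTT caption track. A WebVTT file is plain text: a "WEBVTT" header line, then a series of cues, each with a start and end timestamp followed by the caption text. The Cue class, the function names and the cue timings below are hypothetical and purely for illustration. The caption text is borrowed from the examples earlier in this article.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text, including any sound descriptions


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_webvtt(cues: list[Cue]) -> str:
    """Return the text of a WebVTT file containing the given cues."""
    blocks = ["WEBVTT"]  # every WebVTT file starts with this header line
    for cue in cues:
        if cue.end <= cue.start:
            raise ValueError("a cue must end after it starts")
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    # Cues are separated from the header and from each other by a blank line.
    return "\n\n".join(blocks) + "\n"


# Hypothetical cues and timings, for illustration only.
example_cues = [
    Cue(1.0, 3.5, "[Sound of glass smashing off camera]"),
    Cue(4.0, 6.25, "[Sarah lets out a loud shriek]"),
]
example_vtt = build_webvtt(example_cues)
print(example_vtt)
```

The resulting text can be saved with a .vtt extension. In practice a transcription service will hand you a finished file, so a script like this is mostly useful for understanding, checking or lightly correcting what you receive.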
Speaking of which, until very recently both Vimeo and YouTube lacked the ability to add subtitles and captions. This has changed, and now it is possible to upload both subtitle and caption files, and in the case of YouTube you can even adjust the timings online to make sure everything appears when it should. Remember that poorly timed captions are worse than none at all!
Now that we know that producing a caption file can be a fairly painless and inexpensive thing to do, what about audio description tracks? Unfortunately this is where things fall apart, badly. None of the major video hosting sites cater for multiple audio tracks, a feature that is required for such a function. Additionally, creating an audio description track is much more involved than closed captioning.
For a start you need a good script that will give a vivid, visual description of what is happening. Second, you need to record the audio description with a good quality voice-over. This will usually need to be professionally done if it is to be effective. Just as you wouldn't want an amateur voice-over for your normal video soundtrack, you won't want one for your audio description either. Once you have this it will need to be inserted onto your timeline, edited for timings, mixed, and then output as a separate soundtrack from the NLE or mixing software.
The trouble is that once you have this you are all dressed up with nowhere to go! Unless you are distributing through professional means, i.e. broadcast, retail DVD, Blu-ray, or one of the major online services such as Netflix, you have no way to implement the additional track into your video. Companies such as Wistia do have audio description functionality, but such services don't suit all companies and individuals due to cost. The only current realistic solution is to create a separate alternative upload of your video to the hosting site using your audio description track as the main audio.
And therein lies another issue in all of this. Many of the popular videos that are online are not created by professionals. For example, while we know that captions are incredibly cheap as part of a commissioned video, would an enthusiastic video blogger be inclined to produce such things for each of their videos? They may not be making much, if any, money unless they are in the top tier, and will often be producing content on very niche subjects in their spare time. The money to transcribe each blog video over the course of a year would soon add up. Producing audio descriptions would be prohibitive in both time and price for such a person as well.
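As a rough illustration of how that adds up, here is a small Python sketch. The rate of about $1 a minute is the figure quoted earlier. The upload schedule (two ten-minute videos a week) is purely an assumption made up for the example, as is the function name.

```python
# Approximate human transcription rate quoted in this article, in US dollars.
RATE_PER_MINUTE = 1.00


def yearly_caption_cost(videos_per_week: int,
                        minutes_per_video: float,
                        rate_per_minute: float = RATE_PER_MINUTE,
                        weeks_per_year: int = 52) -> float:
    """Estimate the yearly cost of captioning every video at a per-minute rate."""
    return videos_per_week * minutes_per_video * rate_per_minute * weeks_per_year


# Hypothetical blogger: two ten-minute videos a week (an assumed schedule).
example_cost = yearly_caption_cost(videos_per_week=2, minutes_per_video=10)
print(f"${example_cost:,.2f} per year")  # prints "$1,040.00 per year"
```

Trivial for a commissioned production, but a real outlay for somebody making videos in their spare time for little or no return.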
There is one constant in this. The problem must be tackled. Somehow. It is just as much in the interests of video producers as it is for those with the disabilities that it affects. Not providing for such people is potentially depriving your videos of not just thousands of potential 'viewers', but millions! Did you realise, for example, that both YouTube and Google index videos with closed captions? In other words the captions in your video contribute significantly to SEO and therefore the number of views you receive. There is also another quirk with regard to all of this. It isn't just those with disabilities who are using accessibility options.
The British regulatory body, Ofcom, found that of the 7.5 million people who used closed captioning for television, only 1.5 million actually had a disability! To put it into stark terms, 80% of those who used closed captions had no disability. Why is this?
There are many reasons, ranging from the need to watch video in a sensitive or noisy environment such as an office, or on a train or bus, through to simple clarification of what is happening. It would appear that people simply take in more of what is being said or what is happening when it is reinforced by captioning. Furthermore, if your video is in the second language of your audience members, then having subtitling or captioning can make a big difference to how much they understand, even if the captions are in the same language as the main audio on the video.
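For anybody who wants to check the 80% figure, it follows directly from the two Ofcom numbers quoted above. A quick sketch in Python, using only those two figures (the variable names are mine):

```python
# Figures quoted from the Ofcom finding, in millions of people.
caption_users = 7.5
users_with_disability = 1.5

users_without_disability = caption_users - users_with_disability
share_without_disability = users_without_disability / caption_users

print(f"{users_without_disability:.1f} million of {caption_users:.1f} million "
      f"caption users ({share_without_disability:.0%}) had no disability")
```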
For the near future though, the processes for creating accessibility in video need to be made much easier. Better AI and speech recognition will no doubt drastically improve automated subtitling in the coming years, although this won't necessarily help much with captioning, which requires more description. Certainly, while actually producing an audio description track can be much more involved and expensive than many would like right now, the very least that the major hosting sites could do is to provide an alternative audio track function to enable such a feature. For companies of the size of Google this really shouldn't be a big problem to solve.
With computer-generated speech becoming much more human-like in sound and pronunciation, it will hopefully not be too long before audio descriptions can be read by the system rather than requiring a professional human voice-over. If the hosting companies could agree on a standardised input format that they would accept, this would mean that a script that can be read by machine could be produced alongside the closed caption files, thus making things quick, painless, and inexpensive for everyone.
One can only hope that such a process won't take too long to implement since those who are left out of the digital video revolution only have so much patience. For now, it is only right that no matter what your level of production, such matters should be given much more consideration than they are right now.