Creating subtitles (translated speech-to-text for overseas audiences with no hearing impairment), closed captions (accessible speech-to-text for people who are deaf or hard of hearing), or subtitles suitable for people who are deaf or hard of hearing is getting easier for media owners thanks to AI. Interra Systems, one of the companies driving advances in captioning (the general term that spans accessibility text and subtitles), talks of dramatic cost savings when using AI with limited human oversight for captioning, rather than relying only on human input.
Focusing on localization, Ashish Basu, Executive VP, Worldwide Sales and Business Development at Interra Systems, says captioning can be a barrier to exploiting media rights in overseas markets, including for sport, and offers an example of both the challenge and the opportunity. “We have been talking to multiple companies in the U.S. who are thinking seriously about taking their content to Latin American markets like Colombia, Venezuela, Brazil and Argentina. One exec said he is sitting on a pile of content he would like to localize but the captioning requirements are hard to fulfil.
“Even if he employed 25 or 45 people there is no guarantee the captions would comply with local regulations at the end. He told me that if he employed specialist native speakers everything would have to be checked for local compliance afterwards for each country. His estimate was that he could localize the content in 45 days using our solution.”
The solution Basu is referring to is BATON Captions, designed for automated captioning workflows covering VOD and live. It can create captions in multiple languages and, if required, take care of quality control (QC). Machine learning, automatic speech recognition (ASR) and natural language processing (NLP) are big parts of this solution, which can transcribe audio to text faster than a human, learn diction and cope with accents. Among many other things, it splits captions into human-readable sentences so that meaningful units of text, such as a first name and last name, are kept together.
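To illustrate what keeping meaningful units together can mean in practice, here is a minimal Python sketch of a row splitter that refuses to break inside a protected phrase. It is not Interra Systems' method: the function name `wrap_caption`, the `keep_together` list and the 30-character row limit are all hypothetical, and a real system would presumably identify names and phrases with NLP rather than take them as a hand-written list.

```python
NBSP = "\u00a0"  # temporary non-breaking marker used inside protected phrases


def wrap_caption(text, max_chars, keep_together=()):
    """Greedily split caption text into rows of at most max_chars characters.

    Phrases listed in keep_together (a hypothetical input, e.g. a first and
    last name) are never split across rows. A single token longer than
    max_chars is kept whole on its own row rather than broken.
    """
    # Glue each protected phrase into a single token.
    for phrase in keep_together:
        text = text.replace(phrase, phrase.replace(" ", NBSP))

    rows, current = [], ""
    for token in text.split(" "):
        if not token:
            continue  # skip empty tokens produced by repeated spaces
        candidate = token if not current else current + " " + token
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            rows.append(current)
            current = token
    if current:
        rows.append(current)

    # Restore ordinary spaces inside the protected phrases.
    return [row.replace(NBSP, " ") for row in rows]


line = "The goal was scored by Maria Lopez in extra time"
naive_rows = wrap_caption(line, 30)
protected_rows = wrap_caption(line, 30, keep_together=["Maria Lopez"])
```

Without the protected phrase, the greedy split ends the first row on "Maria" and starts the second with "Lopez"; with it, the whole name moves to the second row.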
BATON detects text that is part of the programme (including, for example, the score in the top corner of a sports match) and ensures captions never hide such content. QC and verification are key parts of the solution. Inaccuracies in captions are reported, and so are compliance issues. Interra Systems points out that captions can be regenerated in multiple versions, closely checked against the audio essence, corrected and then exported to any industry-supported caption format.
The QC checks include reading speed, row count and character count. As Basu points out, when dealing with accessibility there are many factors that go into compliance, with words per page and reading speed among the considerations to ensure everyone can benefit from captions. “Our software can reduce everything into the right number of words. It is tested in a large number of data sets, for different markets and languages and regulatory expectations,” he adds.
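A check of this kind can be sketched in a few lines of Python. The example below is only an illustration of the three measurements named above, not BATON's implementation: the `Cue` class, the `check_cue` function and the default limits (2 rows, 42 characters per row, 20 characters per second) are assumptions chosen for the example, and real limits vary by market, language and regulator.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption cue; times are in seconds, rows are separated by newlines."""
    start: float
    end: float
    text: str


def check_cue(cue, max_rows=2, max_chars_per_row=42, max_chars_per_second=20.0):
    """Return a list of human-readable issues found in a cue.

    The default limits are illustrative placeholders, not regulatory values.
    """
    issues = []
    rows = cue.text.split("\n")

    # Row count: too many rows can cover too much of the picture.
    if len(rows) > max_rows:
        issues.append(f"row count {len(rows)} exceeds {max_rows}")

    # Character count per row.
    for number, row in enumerate(rows, start=1):
        if len(row) > max_chars_per_row:
            issues.append(
                f"row {number} has {len(row)} characters, limit is {max_chars_per_row}"
            )

    # Reading speed, measured here as characters per second on screen.
    duration = cue.end - cue.start
    if duration <= 0:
        issues.append("cue has a non-positive duration")
    else:
        speed = sum(len(row) for row in rows) / duration
        if speed > max_chars_per_second:
            issues.append(
                f"reading speed {speed:.1f} chars/s exceeds {max_chars_per_second}"
            )
    return issues


ok_issues = check_cue(Cue(0.0, 2.0, "Hello there"))
long_issues = check_cue(Cue(0.0, 1.0, "a" * 50))
tall_issues = check_cue(Cue(0.0, 10.0, "a\nb\nc"))
```

Passing the limits as parameters reflects the point made in the article: the same cue can be checked against a different set of limits for each market.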
It is easy to see how multi-market localization, with a different compliance regime in each market, is a complex task. Basu says one media company that evaluated the Interra Systems captioning technology concluded it could achieve cost savings of more than 70% versus a standard manual workflow.