BBC

Speech-to-Text

Using our subtitle archive to create more accurate speech-to-text

Published: 1 January 2015

Improvements in machine learning have allowed us to train our own speech-to-text system. It's found a myriad of uses, from archive search to improving social media shareability.

Project from 2015 - present


What we're doing

Speech-to-text is the process of automatically converting spoken audio to text. It has recently moved out of the lab to become a useful new tool for broadcasters and journalists. Breakthroughs in automatic analysis and improvements in affordability mean that running it at scale over hundreds of thousands of hours of content is now feasible. Increases in accuracy mean that users have a realistic chance of finding what they want in minutes rather than hours, especially in genres such as news or factual content.


Why it matters

The BBC has one of the largest archives of broadcast material in the world, but only a fraction of it is truly searchable. We know there are hidden gems throughout the hundreds of thousands of hours of TV and radio we've digitised, but there's currently no easy way to find them. Speech-to-text is the first step towards changing this, as it allows a semi-accurate transcript of what's said to be made searchable.
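To illustrate how a transcript makes audio searchable, here is a minimal sketch of an inverted index over timestamped transcript segments. It is not the system described on this page: the function names (`build_index`, `search`), the programme identifiers and the sample segments are all made up for the example, and a real archive search would also need ranking and tolerance for transcription errors.

```python
import re
from collections import defaultdict


def build_index(transcripts):
    """Map each lower-cased word to the (programme_id, start_seconds) segments containing it."""
    index = defaultdict(list)
    for programme_id, segments in transcripts.items():
        for start_seconds, text in segments:
            # set() so a word repeated within one segment is only indexed once
            for word in set(re.findall(r"[a-z']+", text.lower())):
                index[word].append((programme_id, start_seconds))
    return index


def search(index, query):
    """Return the segments that contain every word of the query, in sorted order."""
    words = re.findall(r"[a-z']+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Hypothetical transcript data: programme id -> list of (start time in seconds, text)
transcripts = {
    "prog-001": [
        (0.0, "Good evening and welcome to the news"),
        (12.5, "The prime minister spoke about the economy"),
    ],
    "prog-002": [
        (3.0, "Tonight we look at the economy of the north"),
    ],
}

index = build_index(transcripts)
print(search(index, "the economy"))
```

Each hit pairs a programme with a start time, so a user could jump straight to the moment a phrase was spoken instead of listening through the whole recording.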


How it works

Our recent work has focused on building speech-to-text systems for both live and offline use. We used our large archive of subtitled programmes to train language and acoustic models specifically for broadcast output, which we've found to be more accurate than generalised models. We're also researching new types of recurrent neural network, which offer the promise of better accuracy when very large datasets are used.
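As a rough illustration of why a language model trained on subtitle text can help, the sketch below builds a tiny bigram model with add-one smoothing and uses it to score two candidate transcriptions that sound alike. This is a toy example under stated assumptions, not the models described above: the function names and the three sample subtitle lines are invented, and production systems use far larger corpora and more sophisticated models.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count word histories and bigrams over whitespace-tokenised, lower-cased sentences."""
    history_counts, bigram_counts = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        history_counts.update(tokens[:-1])            # every token that precedes another
        bigram_counts.update(zip(tokens, tokens[1:]))
    return history_counts, bigram_counts, len(vocab)


def log_prob(model, sentence):
    """Log-probability of a sentence under the bigram model with add-one smoothing."""
    history_counts, bigram_counts, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = history_counts[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


# Hypothetical subtitle lines standing in for a broadcast subtitle archive
subtitles = [
    "and now the weather forecast",
    "here is the weather for tonight",
    "now the news where you are",
]
model = train_bigram_model(subtitles)

# Two acoustically similar candidates; the model prefers the broadcast-style phrasing
better = log_prob(model, "now the weather")
worse = log_prob(model, "now the whether")
print(better > worse)
```

Because "the weather" appears in the training text and "the whether" does not, the first candidate receives the higher score, which is the kind of preference a domain-specific language model contributes when the audio alone is ambiguous.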


Outcomes

The engine we've built has been used in half a dozen different tools across the BBC. The biggest user has run it across almost a million hours of historic content. One of the more unexpected use cases is a tool that uses speech-to-text to rapidly subtitle short-form video for social media platforms. We have also presented a technical overview of the system we built and the uses we put it to.


Project Team

  • Matt Haynes

    Principal Web Developer
  • Rob Cooper

    Producer
  • Chrissy Pocock-Nugent

    Software Engineer
  • Andrew McParland

    Principal Engineer
  • Alex Norton

    Software Engineer

Project updates

  • Internet Research and Future Services section

    The Internet Research and Future Services section is an interdisciplinary team of researchers, technologists, designers, and data scientists who carry out original research to solve problems for the BBC. The team focuses on the intersection of audience needs and public service values, with digital media and machine learning. We develop research insights, prototypes and systems using experimental approaches and emerging technologies.
