Pinocchio
This week has been quite a week for some of BBC R&D's current research projects getting air time. A team of us from the North Lab helped The One Show draw a giant Christmas tree using light painting, and halfRF HD cameras were used for the live broadcast from Television Centre. This post is about a radio drama that was broadcast on Saturday on Radio 4. It is an adaptation of Carlo Collodi's Pinocchio. Linda, the script writer, has written a blog about adapting the story that you can read here.
Radio drama production workflows have developed over the last 60 years, but the productions are almost always a mixture of microphone recordings of live actors' performances, recorded , sound effects from libraries, and music either from a composer or a music library. These sources are then mixed together to create a stereo (two channel) file which is played out for the broadcast. Recently a few radio dramas have been produced in 5.1, for example earlier this year Private Peaceful was mixed to 5.1 and then rendered to binaural using virtual speakers. This involves the creation of two mixes, a stereo (two channel) file and a 5.1 surround (6 channel) file. The creative process of mixing to these formats had to be performed twice, once in stereo and once in surround.
For Pinocchio we did something a bit different. Steve, the sound designer, and I mixed the recordings, treating the sources as "audio objects". Rather than making a stereo mix by panning sounds somewhere between the left and right speakers, we forgot about speaker locations and positioned audio objects at locations in space. The final mix is therefore not a stereo or surround sound file but a set of audio objects, each with accompanying metadata describing things like the source level, azimuth, elevation and distance. This data is then rendered to speaker channels, either before broadcast (as in this experiment) or at the listener's home or device (potentially the case in the future).
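To make the idea concrete, here is a minimal Python sketch of an audio object and one very simple way of rendering it to two speakers. It is an illustration only, not the renderer used for Pinocchio: the field names, the angle convention (positive azimuth to the listener's left), the speakers at plus and minus 30 degrees and the constant-power pan law are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class AudioObject:
    """One sound source plus the metadata needed to render it (hypothetical fields)."""
    name: str
    level_db: float       # source level in dB relative to full scale
    azimuth_deg: float    # 0 = straight ahead, positive = to the listener's left (assumed)
    elevation_deg: float  # 0 = ear height, positive = above
    distance_m: float     # distance from the listener in metres


def stereo_gains(obj: AudioObject, speaker_angle_deg: float = 30.0) -> Tuple[float, float]:
    """Return (left, right) linear gains using a constant-power pan law.

    A two-speaker layout cannot reproduce height, so elevation is ignored
    here, and distance is ignored too to keep the sketch short.
    """
    # Objects outside the speaker arc are clamped to the nearest speaker.
    az = max(-speaker_angle_deg, min(speaker_angle_deg, obj.azimuth_deg))
    # Map azimuth to a pan angle: 0 = fully left, pi/2 = fully right.
    pan = (speaker_angle_deg - az) / (2 * speaker_angle_deg) * (math.pi / 2)
    linear = 10 ** (obj.level_db / 20)
    return linear * math.cos(pan), linear * math.sin(pan)


# The same object description could be handed to a different renderer
# (5.1, headphones, mono) without touching the mix itself.
shark = AudioObject("underwater rumble", level_db=-6.0, azimuth_deg=15.0,
                    elevation_deg=-20.0, distance_m=3.0)
print(stereo_gains(shark))
```

The point of the sketch is that the speaker layout only appears inside the rendering function; the object itself says nothing about speakers.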
There are a number of potential advantages to describing audio scenes in this way:
- The mix becomes independent of the speakers or listening device. Listeners can use as many or as few speakers as they want, wherever they want, or they can listen on headphones or a mono tablet speaker. The client system knows the listening environment and can render the scene in the way that provides the highest quality of experience.
- An object-based scene representation can be rendered differently for different people. For example, people with different hearing abilities may want a different balance between foreground and background sounds. Describing scenes with an object-based approach also opens up a lot of potential for applications like Perceptive Media.
- Interactivity can be enabled. When audio scenes are made up of separate objects, those objects can be interacted with to provide computer-game-like experiences. This opens up a lot of user experience research questions.
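As a rough illustration of the second point, the Python sketch below shifts the level of background objects for one listener while leaving foreground objects alone. It is a sketch under stated assumptions: the `is_foreground` tag, the object names and the levels are hypothetical, and a real system would need richer metadata than a single flag.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SceneObject:
    name: str
    level_db: float
    is_foreground: bool  # hypothetical tag, e.g. dialogue versus ambience


def personalised_levels(scene: List[SceneObject],
                        background_offset_db: float) -> Dict[str, float]:
    """Return per-object levels with background objects shifted by an offset.

    The scene itself is not modified, so each listener can have their own
    balance derived from the same broadcast mix.
    """
    return {
        obj.name: obj.level_db + (0.0 if obj.is_foreground else background_offset_db)
        for obj in scene
    }


scene = [
    SceneObject("dialogue", level_db=-6.0, is_foreground=True),
    SceneObject("underwater ambience", level_db=-12.0, is_foreground=False),
]

# A listener who finds speech hard to follow might pull the background down by 6 dB.
print(personalised_levels(scene, background_offset_db=-6.0))
```

With a channel-based stereo file this adjustment is not possible, because the dialogue and the ambience have already been summed into the same two channels.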
This approach meant we could mix Pinocchio in 3D, placing sounds wherever we wanted (above, below, in front, behind, etc.). For example, there is a scene where Pinocchio is swallowed by a shark. Steve and I were able to position underwater sound effects, as audio objects, all around the listener, creating a highly immersive experience.
We were able to monitor the mix in our listening room, which was equipped with 26 loudspeakers. This setup has more loudspeakers than are likely to be available to the average listener, so we used a rendering system to create the stereo mix, which was broadcast on Saturday, a 5.1 mix, and the 24.2 mix that the production team used to monitor the mix during post-production. So although you can't yet hear a full 3D surround version of Pinocchio, you can download a 5.1 version to hear it in surround sound. Instructions for how to do so are given here.
This is still very experimental and we'd love to see your feedback in the blog comments below.
Comment number 2.
At 19th Dec 2012, Riz wrote: How different is your technique from the patented Dolby Atmos system?
Comment number 3.
At 22nd Feb 2013, Tony Churnside wrote: Hi Riz,
Sorry for the delay in responding. Good question. It is generally agreed there are three schools of thought when it comes to sound representation beyond 5.1 surround:
1) Channel-based - this simply increases the number of channels, an example of which is NHK's 22.2 work.
2) Scene-based - this represents the whole audio scene as a set of dependent signals which can be decoded to recreate the audio scene. Higher Order Ambisonics is an example of this.
3) Object-based - this treats the sound sources which occur in the scene as independent audio objects, along with the metadata (parameters like position) needed to reproduce these objects on a playback system. Our Pinocchio experiment is an example of this approach.
Each of these approaches has its advantages and disadvantages, which I'll save for a future blog post.
As for the specific Dolby Atmos implementation, I don't know the fine details (only Dolby will be aware of these), but from what I understand it takes a hybrid approach, combining channel-based and object-based techniques.
I hope this helps explain the difference.
Tony Churnside
BBC Research & Development