Audio on the Web - Knobs and Waves
This is part two of a series of blog posts on how a small ´óÏó´«Ã½ R&D team has been rediscovering the era of Radiophonics at the ´óÏó´«Ã½, its sounds and technology, through the contemporary filter of our engagement in the W3C Audio Working Group.
After a couple of weeks of exploring articles and books on the matter, we were feeling ready to marry the sounds of the 1960s with our 2012 technology. Alongside the gunshot effects generator, which I'd written about in the first post of the series, we decided we would recreate a few pieces of the Radiophonic Workshop's equipment with the new web audio standards:
- A set of tape decks - the kind you could use to put together a tape loop from a set of stems;
- A "wobbulator" -- a frequency-modulated oscillator, which today would probably be called more prosaically a "sweep generator";
- ... and a ring modulator, the kind of equipment used to create the voice effects for the Daleks and Cybermen in the Doctor Who series (a code sketch of the wobbulator and the ring modulator follows just after this list).
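To make those last two a little more concrete, here is a minimal sketch of how a wobbulator and a ring modulator can be wired up with today's unprefixed Web Audio API. Our demos were built against the webkit-prefixed 2012 draft, so treat this as an illustration of the idea rather than the code we shipped; the `voiceSource` variable is a stand-in for whatever input you want to modulate.

```javascript
const context = new AudioContext();

// Wobbulator: a low-frequency oscillator (LFO) sweeps the frequency
// of an audible oscillator.
const carrier = context.createOscillator();
carrier.frequency.value = 440;            // base frequency in Hz

const lfo = context.createOscillator();
lfo.frequency.value = 2;                  // sweep rate: 2 Hz

const depth = context.createGain();
depth.gain.value = 200;                   // sweep +/- 200 Hz around the base

lfo.connect(depth);
depth.connect(carrier.frequency);         // modulate the frequency parameter
carrier.connect(context.destination);

// Ring modulator: multiply an input signal by a sine carrier by driving
// a zero-gain GainNode's gain from an oscillator -- the classic Dalek trick.
const ringCarrier = context.createOscillator();
ringCarrier.frequency.value = 30;

const ring = context.createGain();
ring.gain.value = 0;                      // base gain of zero...
ringCarrier.connect(ring.gain);           // ...modulated by the carrier

// voiceSource stands in for any input node (microphone, buffer source...):
// voiceSource.connect(ring);
ring.connect(context.destination);

carrier.start();
lfo.start();
ringCarrier.start();
```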
Pete had started thinking about the aesthetics of our demos, and within a few days he'd whipped together gorgeous interfaces for the four types of equipment, which Chris would later bring to life in HTML and JavaScript with a mix of CSS transforms and a rather frustrating amount of scripting to adapt the interactions with knobs and buttons to the screen-and-mouse paradigm.
Annotated interface sketches.
And then there was a meaty challenge: recreating the sound of the radiophonic equipment using the emerging APIs being developed in the W3C Audio WG.
What could possibly go wrong? We were only trying to recreate sounds from scattered fragments of knowledge of the actual hardware, building our code on experimental implementations of specifications which we knew would change on an almost daily basis. And indeed, we were almost immediately successful -- if generating a convincingly random imitation of bodily noises can be qualified as success.
´óÏó´«Ã½ R&D Gunshot effects generator (detail)
To understand why our output did not quite sound as it should, we had to resort to a divide-and-conquer strategy. With so many potential points of failure, we would have struggled to figure out whether the problem was with our audio synthesis algorithm, or with our implementation of it, or with our understanding of the draft specification... not to mention that we could simply be hitting a bug in the browser implementation of the specification.
Fortunately for us, we had one well-known, solid foundation to build upon. When we started work on this project, there were two competing approaches to web audio processing before the working group: the MediaStream Processing API and the Web Audio API. The latter was particularly interesting, not only because it was more mature and full-featured than the other one, but because it used the audio routing graph paradigm. This is a model where a number of blocks (nodes) are connected together to define the processing algorithm.
The audio graph was a natural fit for our work, not only because it mirrored the way most of the radiophonic-era hardware was built (blocks and wires), but also because a lot of the audio processing software since the 1980s had been working on such a model, too.
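As a concrete illustration of that routing graph model (again using today's unprefixed API names rather than the 2012 draft we were actually coding against), here is a tiny graph: a source block wired through a filter block and a gain block to the output.

```javascript
const context = new AudioContext();

// Three "blocks"...
const source = context.createOscillator();
const filter = context.createBiquadFilter();
const level  = context.createGain();

filter.type = 'lowpass';
filter.frequency.value = 800;
level.gain.value = 0.5;

// ...and the "wires" between them, defining the processing algorithm.
source.connect(filter);
filter.connect(level);
level.connect(context.destination);
source.start();
```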
And so Matt decided to use Pure Data (Pd), an open source graphical audio programming tool he knew well. His ability to use Pd to quickly prototype our audio synthesis graphs was a boon throughout the project, allowing us to refine and validate the audio processing graphs which we expected to produce the sounds of our instruments, and do so in an environment we could trust.
Making a bang with Pure Data
Once Matt had managed to make his sound processing algorithms in Pure Data sound right, Chris would translate them into JavaScript and build the demos on top of the Chrome implementation of the Web Audio API (the only implementation of the emerging specification at the time of writing).
That was not always a walk in the park, either. We had wanted to find a project that would stretch the capabilities of the API, and stretch it did.
Matt, Chris and Pete working on the demo interfaces
Our project was about the synthesis of audio, as opposed to only audio processing, which is what most of the API demos so far had been showcasing.
The Web Audio API already included a number of native nodes and built-in methods for a lot of the typical processing one would want to implement: mixing, common filters, spatialisation, and so on.
It quickly became apparent, however, that synthesis was not as well supported, and that we would have to code a lot of it from scratch in the API's JavaScriptNode, the "blank slate" node which developers can use to run their own custom processing code.
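As a rough sketch of what "from scratch" looks like, here is a sine wave generated sample by sample in a custom processing node. This uses the name the node ended up with in later drafts, ScriptProcessorNode, rather than the JavaScriptNode of the version we coded against, so it is an approximation of the approach rather than our actual demo code.

```javascript
const context = new AudioContext();
// One input (ignored here), one mono output, 4096-sample processing blocks.
const node = context.createScriptProcessor(4096, 1, 1);

let phase = 0;
node.onaudioprocess = function (event) {
  const output = event.outputBuffer.getChannelData(0);
  const increment = 2 * Math.PI * 440 / context.sampleRate; // a 440 Hz sine
  for (let i = 0; i < output.length; i++) {
    output[i] = 0.2 * Math.sin(phase);
    phase += increment;
    if (phase > 2 * Math.PI) phase -= 2 * Math.PI;  // keep phase bounded
  }
};
node.connect(context.destination);
```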
In spite of (or maybe thanks to) all the frustration we had to endure, this meant that we were able to provide a lot of feedback to the audio working group throughout the project, a lot of which quickly made its way into the specification as bug fixes or new features.
Such a quick turnaround sometimes made things even more difficult for us: we were not only shooting at a moving target, but helping the target move faster too! Shortly after building our own oscillator code in a custom JavaScriptNode and sending feedback to the group about the potential need for a native oscillator node, the specification's editor drafted one, and it quickly made its way into the "cutting-edge" release of the Chrome browser. Should we then rewrite our code to use the new native node, and thus "target" only the very latest release of a single browser engine, or keep our custom code and ensure that anyone with a reasonably recent installation of either Chrome or Safari would be able to use our demos? We chose the latter, but as time passes and we look at publicising our demos, I know the question will eventually resurface.
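The kind of choice we faced boils down to a few lines of feature detection. This is a hedged sketch, not our demo code: method names have since settled (noteOn became start, and the prefixed webkitAudioContext has been dropped by current browsers).

```javascript
// Use the prefixed constructor if the unprefixed one is not available.
const ContextClass = window.AudioContext || window.webkitAudioContext;
const context = new ContextClass();

if (typeof context.createOscillator === 'function') {
  // Newest builds only: the native oscillator node drafted after our feedback.
  const oscillator = context.createOscillator();
  oscillator.frequency.value = 440;
  oscillator.connect(context.destination);
  // Early drafts used noteOn(0); the method was later renamed start().
  (oscillator.start || oscillator.noteOn).call(oscillator, 0);
} else {
  // Otherwise, fall back to a hand-rolled JavaScriptNode oscillator
  // (as sketched in the previous snippet) so that both Chrome and Safari work.
}
```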
In several other instances, our prototyping gave us feedback material for the working group. In addition to the oscillator interface added for audio synthesis, we uncovered some issues with handling of multi-channel audio and problems with processing delays in some specific cases... and as I write this, the question of whether we need to standardise basic operator nodes such as add, subtract, and multiply is approaching a resolution.
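To give an idea of why operator nodes were on the table at all, here is how simple arithmetic on signals has to be expressed with the nodes available today: addition falls out of the graph, since connections into the same node are summed, while multiplication has to be faked with a zero-gain GainNode, exactly as in the ring modulator sketch earlier. An illustrative sketch, not part of our demos.

```javascript
const context = new AudioContext();
const a = context.createOscillator();
const b = context.createOscillator();

// a + b: fan-in connections to the same node are simply summed.
const sum = context.createGain();
a.connect(sum);
b.connect(sum);

// a * b: drive the gain of a zero-gain GainNode with the second signal.
const product = context.createGain();
product.gain.value = 0;
a.connect(product);
b.connect(product.gain);

// Subtraction has no direct equivalent: the signal to subtract must first
// pass through a GainNode with gain -1 before being summed.
a.start();
b.start();
```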
Next step: release the demos to the world. But we want to do that right: not just give our work a "cool" factor, but turn it into proper learning material.
In the next instalment of this series, we will look at how the demos were put together: writing less code and more documentation, and building demos others can learn from - or just play with!