Technical: Music and Sound in CrossCode news


We got a special technical post today, in which we share how we did music and sound in CrossCode. This includes working with the WebAudio API and our special music interface.

Posted by Regiden on Aug 23rd, 2013

It's time for a new technical post!

As promised, today we want to share with you how we implemented music and sound in CrossCode. This includes how we used the WebAudio API to increase the sound quality and make perfect loops possible. Any code you encounter is not taken directly from the game and might be pseudo-code to some extent (Oh, and any typo is intended!).

So... grab some fish'n'chips and tea, this is going to be a long post!

The WebAudio API

For starters, the WebAudio API is a more or less high-level API for processing sound via JavaScript. That means we can not only play sounds via the API, but also manipulate the sound in any way we want. To make this magic happen, the API uses a routing graph. You have a master node (called destination in the API) and attach a number of different nodes to it. The following figure shows what this looks like:

AudioNode represents a single sound that will be played. As you can see, there aren't only AudioNodes, though. The WebAudio API features lots of nodes that make it possible to manipulate the actual sound. In the graph you can see a GainNode. This node is used to change the volume of the sound. As in any graph structure, it is also possible to attach multiple nodes to another node. This can be really useful if you want to pass each sound through a single GainNode to change the volume for all sounds.
Other nodes make it possible to apply filters to a sound. Or you can use multiple sounds and cross-fade them based on your position in the game world. We don't want to cover all the possibilities here, since there already is a great article over at Html5Rocks. Check out this link:

Developing Game Audio with the Web Audio API

This article features some very nice effects you can use and is a great start for learning how to use the API.
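
To make the routing graph a bit more concrete, here is a minimal sketch of the "one GainNode for all sounds" idea described above. It is not code from the game. The function names createMasterGain and playThrough are made up for this example, context is assumed to be an AudioContext and buffer an already decoded AudioBuffer. It uses the current standard method name createGain (older WebKit builds called it createGainNode).

```javascript
// Sketch: route every sound through one shared GainNode ("master volume").
// createMasterGain and playThrough are hypothetical helper names.
function createMasterGain(context, volume) {
    const master = context.createGain();
    master.gain.value = volume;          // 1 = unchanged, 0 = silent
    master.connect(context.destination); // master gain -> speakers
    return master;
}

function playThrough(context, master, buffer) {
    const source = context.createBufferSource();
    source.buffer = buffer;
    source.connect(master); // source -> master gain -> destination
    source.start(0);
    return source;
}
```

Changing master.gain.value afterwards changes the volume of every sound that was connected through playThrough.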

The API's starting point is a single object that has all the functions to create different nodes:

javascript code:

window["AudioContext"] = window["AudioContext"] || window["webkitAudioContext"];
this.context = new window["AudioContext"]();

Goals for CrossCode

For CrossCode we made a small list of things we wanted to have for both music and sound.

loop music without stutters
play sounds in a 3D environment
loop without making the music perfectly loop-able
fade-in/out music
pausing/resuming music

Lots of these points are already covered by the audio element. But the first 2 are hard to achieve. Perfect timing for music isn't easily achieved and you might have noticed it too while playing CrossCode. The same goes for positioning sound in 3D. We might also come up with other ideas along the way that aren't possible to do with the audio element so we started working on integrating the WebAudio API.

So, let's dive into the implementations! We're going to begin with sound to explain some basic aspects of the API. We will also talk about the default implementation, which is mostly covered by the music and sound classes impact.js provides.

Sound

As said, impact.js covers a lot of ground here already. But as usual we changed a lot to make it work for us. One of these changes is to separate the default implementation, which uses the audio element, from the WebAudio Implementation. Normally we load a sound roughly like this:

javascript code:

var sound = new Sound("path");

And we wanted to keep it that way. So we have two sound implementations sharing the same interface, SoundWebAudio and SoundDefault:

javascript code:

if (window["AudioContext"] || window["webkitAudioContext"]) {
    Sound = SoundWebAudio;
} else {
    Sound = SoundDefault;
}

When the game loads, it will detect your browser and switch to the WebAudio API if it is supported. For the default implementation we simply load an audio element. Since each audio element can't play its sound multiple times in parallel, we need to load several audio elements of the same sound for parallel playback. This is all handled nicely by impact.js so we don't have to worry about it.
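
To illustrate the "several audio elements per sound" idea for the default implementation, here is a small round-robin pool. This is only a sketch of the general technique, not impact.js's actual code. createAudioPool and the createElement factory are names made up for this example; in a browser the factory could be something like function () { return new Audio(path); }.

```javascript
// Sketch: a fixed-size pool of audio elements for one sound, so the same
// sound can overlap with itself. `createElement` is an assumed factory.
function createAudioPool(createElement, size) {
    const elements = [];
    for (let i = 0; i < size; i++) {
        elements.push(createElement());
    }
    let next = 0;
    return {
        play() {
            const element = elements[next];
            next = (next + 1) % elements.length; // round-robin
            element.currentTime = 0; // rewind in case this copy was used before
            element.play();
            return element;
        }
    };
}
```

With a pool of size n, up to n copies of the sound can overlap; the next call after that restarts the oldest copy.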

However, for WebAudio we had to come up with something on our own. To load a sound via the API you mainly use an XMLHttpRequest. But simply loading isn't enough. We also need to decode the data we get from the request. This is done via the context mentioned in the WebAudio API section earlier. Let's simply assume that we have a global context created and ready to use. The code to load a sound would look like this:

javascript code:

var request = new XMLHttpRequest();
request.open('GET', path, true);
request.responseType = 'arraybuffer';
request.onload = function () {
    // decode the raw file data into an AudioBuffer and cache it by path
    sound.context.decodeAudioData(request.response, function (buffer) {
        sound.buffers[path] = buffer;
    }, someFancyErrorHandler); // called if decoding fails
};
request.send();

As you can see there isn't much effort in loading the sound file and decoding it. After we decoded the sound we cache the buffer in some global state, so we don't need to load it again. This is also true for quickly repeating the same sound! The buffer will then be referenced in the WebAudio implementation of the sound object.
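
The caching described above could be sketched roughly like this. It is an assumption-laden example rather than the game's code: createBufferLoader and fetchArrayBuffer are hypothetical names, and promises are used instead of the XMLHttpRequest callbacks shown earlier. In a browser, fetchArrayBuffer could be function (path) { return fetch(path).then(function (r) { return r.arrayBuffer(); }); }.

```javascript
// Sketch: load and decode each sound file at most once.
// `fetchArrayBuffer(path)` is an assumed function returning a
// Promise<ArrayBuffer>; `context` is an AudioContext.
function createBufferLoader(context, fetchArrayBuffer) {
    const cache = new Map(); // path -> Promise<AudioBuffer>
    return function loadBuffer(path) {
        if (!cache.has(path)) {
            const pending = fetchArrayBuffer(path).then(function (data) {
                return new Promise(function (resolve, reject) {
                    context.decodeAudioData(data, resolve, reject);
                });
            });
            cache.set(path, pending);
        }
        return cache.get(path);
    };
}
```

Caching the promise instead of the finished buffer means that two requests for the same file made in quick succession still trigger only one download and one decode.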

You'll need to create a node each time you play the sound. That's right: in contrast to the audio element, we can't simply reset the playback position to zero and play again:

javascript code:

play: function() {
    // creates an AudioNode called AudioBufferSourceNode
    var node = sound.context.createBufferSource();
    node.buffer = this.buffer;

    node.connect(sound.context.destination);
    node.start(0); // or node.noteOn(0);
}

As you can see, we assign the buffer of the sound to the node. The AudioNode here is a so-called AudioBufferSourceNode. As the name implies, it is the base node for playing back any sound. We then need to connect the node to the destination node of the context. As explained at the beginning of this post, this way the sound gets routed through the audio graph. To play it we use the start/noteOn method. Don't get confused by the parameter: it is the time at which playback should start (0 means immediately), and older implementations require it to be passed explicitly.

But of course this is not all we do. We also want to position sound, right? For this we simply use another node that WebAudio already provides to set the position of a sound in 3D: the PannerNode. We now need to extend the method above to include a possible position:

javascript code:

play: function(pos) {
    var node = sound.context.createBufferSource();
    node.buffer = this.buffer;

    var position = sound.context.createPanner();
    // setPosition expects x, y and z; z = 0 is assumed here for a 2D position
    position.setPosition(pos.x, pos.y, 0);

    // source -> panner -> destination
    node.connect(position);
    position.connect(sound.context.destination);
    node.start(0); // or node.noteOn(0);

    return {pos: position, source: node};
}

For this method, we simply assume that every sound has a position. As you can see, we create a new PannerNode here and connect the AudioBufferSourceNode to it. We then connect the position node to the destination and start playing the sound like before. In CrossCode we have a little bit of extra code here which adjusts the position to be in world space. There is also a bit of code that makes sure that the sound plays normally when you're within a certain radius of an object, to create a smoother experience. Otherwise it would be very distracting if every sound made a sudden jump from the right to the left speaker.
Notice that we also return an object here? In this pseudo-code it returns the position node and the source node. We do this to be able to adjust the position in real-time and be able to stop playback manually.
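
The post does not show the "certain radius" code, so the following is only one possible way to do it, with made-up names (pannerOffset, innerRadius). It computes the position of a sound relative to the listener and treats everything inside innerRadius as centred, so the panning ramps up from zero at the edge of that circle instead of flipping sides abruptly.

```javascript
// Sketch: position of a sound relative to the listener, with a "dead zone"
// of `innerRadius` around the listener in which the sound is not panned.
// All names here are hypothetical.
function pannerOffset(soundPos, listenerPos, innerRadius) {
    const dx = soundPos.x - listenerPos.x;
    const dy = soundPos.y - listenerPos.y;
    const distance = Math.sqrt(dx * dx + dy * dy);
    if (distance <= innerRadius) {
        return { x: 0, y: 0 }; // close enough: play centred
    }
    // Shorten the offset by the radius so it is 0 on the edge of the
    // inner circle and grows smoothly from there.
    const scale = (distance - innerRadius) / distance;
    return { x: dx * scale, y: dy * scale };
}
```

The result could then be passed to the PannerNode, for example position.setPosition(offset.x, offset.y, 0).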

This is basically what we do with WebAudio here. Of course, not all browsers currently support WebAudio. So you might wonder what we do with positioned sound for those players. We can't achieve the same effect with the audio element, so we simulate it via the volume of the sound. The further away you are from the sound, the softer it is. We hope that eventually all browsers will support the API so everyone can have positioned sound in their favorite browser.
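
As an illustration of the volume-based fallback, here is a tiny sketch. The linear falloff and the names volumeForDistance and maxDistance are assumptions made for this example; the post does not say which curve the game uses.

```javascript
// Sketch: distance-based volume for the audio element fallback.
// Linear falloff is an assumption: full volume at distance 0,
// silent at `maxDistance` and beyond.
function volumeForDistance(distance, maxDistance) {
    if (maxDistance <= 0) {
        return 0;
    }
    const volume = 1 - distance / maxDistance;
    return Math.min(1, Math.max(0, volume)); // clamp to the valid 0..1 range
}
```

The result can be assigned directly to the volume property of an audio element, which only accepts values between 0 and 1.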

And this is it for sound! Let's summarize everything in a small list:

create different implementations for WebAudio and audio element
use impact.js sound loading for the default implementation
load, decode and cache loaded sounds
play a sound with a position via PannerNode
simulate positioned sound via volume in default implementation
return a handle to manipulate the position of a sound in real-time

Music

In Weekly Update #9 we already told you that we basically rewrote the whole music player impact.js provides. We created a new interface that fits our use cases better. We also separated the music player from the actual track. Each track has a minimal interface:

javascript code:

Track = {

    loopEnd: 0,

    play: function() {},
    pause: function() {},
    stop: function() {},
    setVolume: function() {},
    getVolume: function() {}
}

Just like with the sound, we now have two alternative track implementations: one for the audio element and one for WebAudio. Again, for the default we use the basic loading that impact.js provides. When creating a track we keep a reference to the audio element and play it when we need it. The advantage of audio elements here is that they support streaming. That means we don't have to wait until the whole track is loaded to start the game. For WebAudio... well, here is the bummer: we have to load the whole piece. The API supports connecting audio elements (via the MediaElementAudioSourceNode) to allow streaming. However, with the audio element as source we again lose all the timing advantages we get from WebAudio. Thus, we're stuck with an increased loading time when using WebAudio. Of course, since a browser will cache files, restarting the game will make it load faster again, but we still decided it's a better idea to provide music via WebAudio as an option the player can choose.
As you can see in the code, we also have a property called loopEnd. It stands for the playback position at which the track should loop. So our tracks don't exactly loop at the end of the file. Instead we created a system for each track implementation which starts playing a second audio element or AudioNode when a timer hits the loopEnd mark. This makes it possible to seamlessly loop tracks. For the audio element we use the timeupdate callback that is provided by the element:

javascript code:

var self = this; // inside the callback, "this" would be the audio element
this.track.addEventListener('timeupdate', function () {
    if (self.track.currentTime >= self.loopEnd) {
        // loop track
    }
});
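
The "loop track" step for audio elements could look roughly like the sketch below: two elements of the same file take turns, and the second one starts as soon as the first passes loopEnd, while the first is left to play out its tail. This is a guess at the mechanism described above, not the game's code, and createElementLooper is a made-up name.

```javascript
// Sketch: alternate between two audio elements of the same track.
// `first` and `second` are assumed to be audio elements for the same file.
function createElementLooper(first, second, loopEnd) {
    let current = first;
    let next = second;

    function onTimeUpdate() {
        if (current.currentTime >= loopEnd) {
            // Stop watching the old element; it keeps playing its tail.
            current.removeEventListener('timeupdate', onTimeUpdate);
            const finished = current;
            current = next;
            next = finished;
            current.currentTime = 0;
            current.addEventListener('timeupdate', onTimeUpdate);
            current.play();
        }
    }

    current.addEventListener('timeupdate', onTimeUpdate);
    current.play();
    return { getCurrent() { return current; } };
}
```

Because timeupdate only fires a few times per second, the hand-over is not sample-accurate, which fits the point made earlier that perfect timing is hard to achieve with the audio element.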

For WebAudio we use the current time of the WebAudio context to precisely time the next loop. We also keep a loop count, which increases with every loop. With it we time each new loop relative to the first playback of the track, so small delays cannot add up over time. This also makes sure that the next loop is always scheduled before the previous loop ends. This is easy with WebAudio, since the start method of every source node takes a start time. We simply add loopEnd multiplied by the loop count to startTime and get the same timing for every loop. Here is some code that explains roughly how we did this:

javascript code:

// called periodically
loopcheck: function() {
    var currentTime = sound.context.currentTime;
    var nextOffset = track.duration - loopEnd;
    // has the current node played to the very end of the file?
    if (currentTime > (startTime + loopEnd * (loopCount + 1) + nextOffset)) {
        var tempNode = sound.context.createBufferSource();
        tempNode.buffer = soundBuffer;

        currentNode.stop(0);

        // the node that is already playing becomes the current one
        currentNode = nextNode;
        nextNode = tempNode;

        loopCount++;
        var offset = startTime + (loopCount + 1) * loopEnd;

        // connectMusic is not part of the WebAudio API; it stands for
        // connecting the node to the music output
        sound.context.connectMusic(nextNode);

        nextNode.start(offset);
    }
}

This might seem a little bit confusing. But it does exactly what we want. As you can see, we do not schedule the next node here, but rather the node after the next node. This also means that when starting the playback we have to start two nodes. However, thanks to WebAudio we can simply shift the start time of the playback and thus time it the way we want.
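
The "start two nodes" step mentioned above is not shown in the post, so here is a minimal sketch of how it might look. The names loopStartTime and startLoopedPlayback are invented for this example, and the nodes are connected straight to the destination for simplicity.

```javascript
// Sketch: every loop is timed from the very first start time.
function loopStartTime(startTime, loopEnd, loopIndex) {
    return startTime + loopIndex * loopEnd;
}

// Start playback by scheduling loop 0 (now) and loop 1 (at loopEnd).
// `context` is an AudioContext, `buffer` a decoded AudioBuffer.
function startLoopedPlayback(context, buffer, loopEnd) {
    const startTime = context.currentTime;
    const nodes = [0, 1].map(function (loopIndex) {
        const node = context.createBufferSource();
        node.buffer = buffer;
        node.connect(context.destination);
        node.start(loopStartTime(startTime, loopEnd, loopIndex));
        return node;
    });
    return {
        startTime: startTime,
        currentNode: nodes[0],
        nextNode: nodes[1],
        loopCount: 0
    };
}
```

The returned state matches the variables used in the loop check above: each later check only has to schedule loop loopCount + 1.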

All of this means that we have some sort of meta-data for each track. Next to the loopEnd property we might also have an intro that is played before the first loop starts. The intro has an introEnd property which works just like loopEnd. We shift the start of the loop by the time of introEnd and voila, we have a custom intro for our tracks!
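
One way to read the intro handling is that every loop is simply scheduled introEnd seconds later than it would be without an intro. The sketch below computes the resulting start times under that assumption; scheduleWithIntro is a hypothetical name and the exact behaviour in the game may differ.

```javascript
// Sketch: start times for an intro followed by `loopCount` loops.
// Assumes the intro starts at `startTime` and the first loop starts
// when the intro reaches `introEnd`.
function scheduleWithIntro(startTime, introEnd, loopEnd, loopCount) {
    const times = { intro: startTime, loops: [] };
    for (let i = 0; i < loopCount; i++) {
        times.loops.push(startTime + introEnd + i * loopEnd);
    }
    return times;
}
```

For example, with an 8 second intro and a loopEnd of 60 seconds, the loops would be scheduled 8, 68 and 128 seconds after the start.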

Now back to the interface we use to play the tracks via our music player. There are several methods our player has:

play
push
pop
inbetween
pause / resume

Play simply starts a track. It also stops the currently playing track. A track will be played by name, which means each track will be preloaded along with its meta-data and name.
Push allows us to start a new track while the old one will be paused.
Pop simply stops the track at the top of the stack and starts the paused track below the top stack element, continuing from where we left off.
With the push and pop commands we introduce a stack in our music player to control which music piece is currently played. This is done to make sure we can play tracks in between other tracks. Think of a regular background theme. Now a battle starts and the battle music begins to play. After the fight is over, the battle music stops and the regular background music starts playing again from the position where it stopped.
InBetween is a mix of push and pop. You could call it a push with an automated pop. This is even true for tracks that would normally loop. We use this for short tracks like an item get sound (shameless reference to the Zelda series here).
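
The stack behaviour of play, push, pop and inbetween can be sketched as follows. This is an illustration, not the game's music player: tracks are passed as objects instead of names, fading is left out, and the onEnded hook that a track calls when it has finished is an assumption made for this example.

```javascript
// Sketch: a stack-based music player. Tracks are assumed to offer
// play(), pause() and stop(), where play() resumes a paused track.
function createMusicPlayer() {
    const stack = []; // the top of the stack is the audible track

    function top() {
        return stack[stack.length - 1];
    }

    const player = {
        play(track) {
            const old = stack.pop(); // replace the current track
            if (old) {
                old.stop();
            }
            stack.push(track);
            track.play();
        },
        push(track) {
            if (top()) {
                top().pause();
            }
            stack.push(track);
            track.play();
        },
        pop() {
            const finished = stack.pop();
            if (finished) {
                finished.stop();
            }
            if (top()) {
                top().play(); // continue where the paused track left off
            }
        },
        inbetween(track) {
            // A push with an automated pop once the track has ended.
            // `onEnded` is a hypothetical hook, see the note above.
            track.onEnded = function () {
                if (top() === track) {
                    player.pop();
                }
            };
            player.push(track);
        },
        current: top
    };
    return player;
}
```

In the battle example above, the background theme would be started with play, the battle theme added with push and removed again with pop once the fight is over.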

All four of these methods also include parameters to fade tracks in and out. When pushing a new track, we can fade out the current one and fade in the pushed one. This is especially nice when you end a battle and a new battle starts while the battle theme is still fading out. We simply fade the battle theme back in without even restarting the track. It creates a much better feeling for the player too.
Pause and Resume simply pause/resume the currently playing track. We use this mainly when you switch tabs. The music stops until you re-enter the tab in which the game plays.
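
For the WebAudio tracks, the fades mentioned above could be built on a GainNode per track. The sketch below shows one way to do this with the standard AudioParam scheduling methods; fadeTo is a made-up helper name and the post does not say whether the game fades this way or by changing the volume step by step.

```javascript
// Sketch: fade a GainNode to `targetVolume` over `seconds`.
// `context` is an AudioContext, `gainNode` the track's GainNode.
function fadeTo(context, gainNode, targetVolume, seconds) {
    const now = context.currentTime;
    const gain = gainNode.gain;
    gain.cancelScheduledValues(now);      // drop a fade that is still running
    gain.setValueAtTime(gain.value, now); // anchor the ramp at the current volume
    gain.linearRampToValueAtTime(targetVolume, now + seconds);
}
```

A cross-fade is then two calls, for example fadeTo(context, oldTrackGain, 0, 1) and fadeTo(context, newTrackGain, 1, 1). Because the ramp starts from the current volume, a theme that is still fading out can be faded back in without restarting it.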

Okay, let's summarize again what we do for music in CrossCode:

created different track implementations for WebAudio and audio element
use custom loop mechanic to perfectly loop tracks
music player loads tracks with meta-data
can push/pop tracks at any time
fade tracks when changing tracks
play a single track in between another track

Conclusion

All in all, creating such a dynamic system sure wasn't easy. But it was worth the effort, especially for WebAudio. We can not only position sound in 3D, we can also perfectly loop our tracks even when using our own looping mechanic. There is the small issue with completely pre-loading each track in WebAudio, but we hope that somewhere along the line of web technology a solution will emerge. And if not, we will make sure that the player gets some option, since no one likes long loading times!

Pheeew~ This is all we have to say on how we do music and sound in CrossCode! If you have any questions leave a comment and we will make sure to answer it!
We hope you liked this technical rant. It was really a lot to cover and a lot to read (let's hope we did not forget something on the way).
If you wish to read a technical topic about another feature we use in CrossCode you can ask us too!

Until next time!
