Web Audio API: Add Bandwidth-Friendly Sound to Your Web Page

James Wright

This article was peer reviewed by Mark Brown and Josh Wedekind. Thanks to all of SitePoint’s peer reviewers for making SitePoint content the best it can be!


The Web Audio API allows developers to leverage powerful audio processing techniques in the browser using JavaScript, without the need for plugins. As well as defining and processing file-based audio sources in real time, it can synthesize sounds from various waveforms. This is particularly useful for web apps that are often consumed over low-bandwidth networks.

In this tutorial, I’m going to introduce you to the Web Audio API by presenting some of its more useful methods. I’ll demonstrate how it can be used to load and play an MP3 file, as well as to add notification sounds to a user interface (demo).

If you like this article and want to go into this topic in more depth, I’m producing a 5-part screencast series for SitePoint Premium named You Ain’t Heard Nothing Yet!

What Can I Do With the Web Audio API?

The use cases for the API in production are diverse, but some of the most common include:

  • Real-time audio processing e.g. adding reverb to a user’s voice
  • Generating sound effects for games
  • Adding notification sounds to user interfaces

In this article, we’ll ultimately write some code to implement the third use case.

Is it Well Supported by Browsers?

Web Audio is supported by Chrome, Edge, Firefox, Opera, and Safari. That said, at the time of writing, Safari considers the feature experimental and requires a webkit prefix (webkitAudioContext instead of AudioContext).
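A common pattern is to use the standard constructor where it’s available and fall back to the prefixed version otherwise; a minimal sketch:

// Fall back to Safari's prefixed constructor where necessary
const AudioContextClass = window.AudioContext || window.webkitAudioContext;
const context = new AudioContextClass();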

Can I Use audio-api? Data on support for the audio-api feature across the major browsers from caniuse.com.

Using the API

The entry point of the Web Audio API is a global constructor called AudioContext. When instantiated, it provides methods for defining various nodes that conform to the AudioNode interface. These can be split into three groups:

  • Source nodes – e.g. MP3 source, synthesized source
  • Effect nodes – e.g. Panning
  • Destination nodes – exposed by an AudioContext instance as destination; this represents a user’s default output device, such as speakers or headphones

These nodes can be chained in a variety of combinations using the connect method. Here’s the general idea of an audio graph built with the Web Audio API.

Building an audio graph with AudioContext
Source: MDN
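In code, such a graph is just a chain of connect calls. As a minimal sketch, assuming sourceNode and effectNode have already been created from an AudioContext called context:

// source -> effect -> default output device
sourceNode.connect(effectNode);
effectNode.connect(context.destination);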

Here’s an example of converting an MP3 file to an AudioBufferSourceNode and playing it via the AudioContext instance’s destination node:

See the Pen Playing an MP3 file with the Web Audio API by SitePoint (@SitePoint) on CodePen.
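If you’d rather read than click through, the Pen’s approach boils down to something like the following sketch, in which the file URL is a placeholder:

const context = new AudioContext();

// Fetch the MP3, decode it into an AudioBuffer, then play it
fetch('/path/to/track.mp3')
  .then(response => response.arrayBuffer())
  .then(arrayBuffer => context.decodeAudioData(arrayBuffer))
  .then(audioBuffer => {
    const source = context.createBufferSource();
    source.buffer = audioBuffer;
    source.connect(context.destination);
    source.start();
  });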

Generating Audio

As well as supporting recorded audio via AudioBufferSourceNode, the Web Audio API provides another source node called OscillatorNode. It allows frequencies to be generated against a specified waveform. But what does that actually mean?

At a high level, frequency, measured in hertz (Hz), determines the pitch of a sound: the higher the frequency, the higher the pitch. As well as custom waves, OscillatorNode provides some predefined waveforms, which can be specified via an instance’s type property:

Built-in waveforms supported by OscillatorNode
Source: Omegatron/Wikipedia

  • 'sine' – sounds similar to whistling
  • 'square' – this was often used for synthesizing sounds with old video game consoles
  • 'triangle' – almost a hybrid of a sine and square wave
  • 'sawtooth' – generates a strong, buzzing sound

Here’s an example of how OscillatorNode can be used to synthesize sound in real-time:

See the Pen Generating sound with OscillatorNode by SitePoint (@SitePoint) on CodePen.
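Reduced to its essentials, the Pen does something like this; the waveform and frequency here are arbitrary values for illustration:

const context = new AudioContext();
const oscillator = context.createOscillator();

oscillator.type = 'square';       // one of the predefined waveforms
oscillator.frequency.value = 440; // 440 Hz: concert-pitch A

oscillator.connect(context.destination);
oscillator.start();
oscillator.stop(context.currentTime + 1); // play for one second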

How Does OscillatorNode Benefit the Web?

The ability to synthesize sounds with code results in a much smaller payload than shipping audio files. This is important for maintaining parity of experience across all sorts of bandwidths, from 2G to 4G; it’s impossible to guarantee the speed of a mobile data connection, especially in emerging markets.

48 percent of those using mobile internet on 2G or 3G are unable to perceive any difference between 2G and 3G services.

Ericsson, The Changing Mobile Broadband Landscape

To demonstrate this, I recorded the above OscillatorNode example and encoded it as an MP3 file, using a bit rate that preserved the same sound quality. The resulting file is 10 KB and, according to Chrome DevTools’ network throttling feature, would take 2.15 seconds to load over a regular 2G connection. In this case, the programmatic approach is the clear winner.

Using OscillatorNode for Notification Sounds

Let’s use OscillatorNode within a real-world example. I mentioned at the beginning of the article that we would add notification sounds to a user interface. If we open this CodePen, we’ll see a messaging app UI. Upon clicking the Send button, a notification will appear to inform us that the message was sent. This boilerplate contains two parts that are of interest to us: an AudioContext instance called context, and a function named playSound.

Before starting, click the Fork button. This will create a copy of the boilerplate to which you can save your changes.

It’s worth mentioning that I have tested this in both Chrome and Firefox, so you should use one of these browsers.

In playSound, declare a variable named oscillatorNode and assign to it the return value of context.createOscillator():

const oscillatorNode = context.createOscillator();

Next, let’s configure our node. Set its type property to 'sine', and the frequency.value property to 150:

oscillatorNode.type = 'sine';
oscillatorNode.frequency.value = 150;

To play our sine wave through our speakers or headphones, call oscillatorNode.connect, passing it a reference to the context.destination node. Finally, call oscillatorNode.start, followed by oscillatorNode.stop with a parameter of context.currentTime + 0.5; this stops the sound once 500 milliseconds have passed according to the AudioContext's hardware-based clock, which is exposed as currentTime. Our playSound method now looks like this:

function playSound() {
  const oscillatorNode = context.createOscillator();

  oscillatorNode.type = 'sine';
  oscillatorNode.frequency.value = 150;

  oscillatorNode.connect(context.destination);
  oscillatorNode.start();
  oscillatorNode.stop(context.currentTime + 0.5);
}

Upon saving our changes and hitting Send, we’ll hear our notification sound.

Introducing GainNode

Needless to say, this is pretty garish. Why not use an effect node to make this sound more pleasing? GainNode is one example of an effect node. Gain is a means of altering the amplitude of an input signal, and in our case, it enables us to control the volume of an audio source.

Below the declaration of oscillatorNode, declare another variable called gainNode, and assign to it the return value of context.createGain():

const gainNode = context.createGain();

Under the configuration of oscillatorNode, set gainNode's gain.value property to 0.3. This will play the sound at 30% of its original volume:

gainNode.gain.value = 0.3;

Finally, to add the GainNode to our audio graph, pass gainNode to oscillatorNode.connect, then call gainNode.connect, to which we’ll pass context.destination:

function playSound() {
  const oscillatorNode = context.createOscillator();
  const gainNode = context.createGain();

  oscillatorNode.type = 'sine';
  oscillatorNode.frequency.value = 150;

  gainNode.gain.value = 0.3;

  oscillatorNode.connect(gainNode);
  gainNode.connect(context.destination);

  oscillatorNode.start();
  oscillatorNode.stop(context.currentTime + 0.5);
}

Upon saving our changes and hitting Send, we’ll hear that our sound plays more quietly.

Mixing Things up with AudioParam

You may have observed that, in order to set the frequency of our OscillatorNode and the gain of our GainNode, we had to set a property called value. The reason for this contract is that gain and frequency are both AudioParams: an interface that can be used not only to set a specific value, but also to schedule gradual changes to it. AudioParam exposes a number of methods and properties, but three important ones (demonstrated in the sketch after this list) are:

  • setValueAtTime – immediately changes the value at the given time
  • linearRampToValueAtTime – schedules a gradual, linear change of the value, completing at a given end time
  • exponentialRampToValueAtTime – schedules a gradual, exponential change of the value. Unlike a linear change, which progresses at a constant rate, an exponential change increases or decreases by larger increments as the scheduler approaches the end time; this can be preferable as it sounds more natural to the human ear
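Here’s the sketch promised above, scheduling all three methods against a hypothetical gainNode created from an AudioContext called context:

gainNode.gain.setValueAtTime(1, context.currentTime);                      // jump to full volume immediately
gainNode.gain.linearRampToValueAtTime(0.5, context.currentTime + 1);       // fade linearly to 0.5 over one second
gainNode.gain.exponentialRampToValueAtTime(0.01, context.currentTime + 2); // then fade exponentially towards silence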

We’re now going to exponentially ramp both the frequency and the gain. In order to use the exponentialRampToValueAtTime method, we need to schedule a prior event. Replace oscillatorNode.frequency.value with a call to oscillatorNode.frequency.setValueAtTime. Pass the same frequency of 150 Hz, and schedule it immediately by passing context.currentTime as a second parameter:

oscillatorNode.frequency.setValueAtTime(150, context.currentTime);

Below the invocation of setValueAtTime, call oscillatorNode.frequency.exponentialRampToValueAtTime, with a value of 500 Hz. Schedule this 0.5 seconds from the scheduled start time:

oscillatorNode.frequency.exponentialRampToValueAtTime(500, context.currentTime + 0.5);

Upon saving and clicking Send, you’ll hear that the frequency increases as playback progresses.

To wrap things up, replace the setting of gainNode.gain.value with an invocation of gainNode.gain.setValueAtTime, in the same vein as our OscillatorNode's frequency:

gainNode.gain.setValueAtTime(0.3, context.currentTime);

To fade out the sound, exponentially ramp the gain down to 0.01 over 0.5 seconds. Note that we target 0.01 rather than 0, as exponentialRampToValueAtTime can’t ramp to exactly zero:

function playSound() {
  const oscillatorNode = context.createOscillator();
  const gainNode = context.createGain();

  // Sweep the pitch from 150 Hz up to 500 Hz over half a second
  oscillatorNode.type = 'sine';
  oscillatorNode.frequency.setValueAtTime(150, context.currentTime);
  oscillatorNode.frequency.exponentialRampToValueAtTime(500, context.currentTime + 0.5);

  // Start at 30% volume and fade out over the same half second
  gainNode.gain.setValueAtTime(0.3, context.currentTime);
  gainNode.gain.exponentialRampToValueAtTime(0.01, context.currentTime + 0.5);

  oscillatorNode.connect(gainNode);
  gainNode.connect(context.destination);

  oscillatorNode.start();
  oscillatorNode.stop(context.currentTime + 0.5);
}

Upon hitting Save and Send, you’ll hear that our notification sound gets quieter over time. Now we’re sounding more human.

Here’s the completed demo.

See the Pen Notification Sounds with OscillatorNode by SitePoint (@SitePoint) on CodePen.

Replaying Source Nodes

Before concluding this article, it’s important to note a point of confusion for those who are getting to grips with the API. To play our sound again after it has finished, it would seemingly make sense to write this:

oscillatorNode.start();
oscillatorNode.stop(context.currentTime + 0.5);
oscillatorNode.start(context.currentTime + 0.5); // throws: a source node can only be started once
oscillatorNode.stop(context.currentTime + 1);

Upon doing this, we’ll observe that an InvalidStateError is thrown, with the message cannot call start more than once.

AudioNodes are cheap to create, so the design of the Web Audio API encourages developers to recreate nodes as and when they’re needed. In our case, we simply call the playSound function again.
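Because playSound builds a brand-new OscillatorNode on every invocation, replaying is as simple as calling it again:

playSound();                 // first notification sound
setTimeout(playSound, 1000); // a second one, one second later, using fresh nodes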

Conclusion

I hope you have enjoyed this introduction to sound synthesis with the Web Audio API. We’ve demonstrated just one of its many use cases, although whether notification sounds belong on websites and web apps is an interesting UX question that will only be answered over time.

If you want to learn more about the Web Audio API, I’m producing a 5-part screencast series for SitePoint Premium named You Ain’t Heard Nothing Yet!. The first episode is available to watch now.

Are you using the Web Audio API in your web pages and apps? I’d love to hear about your experiences and use cases in the comments below.