Web Audio API · Subtractive Synthesis · Piano Roll
Nearly every synthesizer you've ever used follows the same signal path. Start with something harmonically rich. Filter it. Shape it in time. Schedule it. This is how subtractive synths work, from a $5 VST to a Moog.
Subtractive synthesis is one of the oldest forms of synthesis in electronic music, and the most common. Moog built it. Roland built it. Most analogue synths from the 1960s to today use it, and so do most software synths. The name tells you the method: you start with a waveform that contains a lot of harmonic content (a sawtooth wave is ideal, since it contains every integer harmonic of its fundamental) and you remove frequencies until what remains is the sound you want. You're sculpting by subtraction.
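To make "harmonically rich" concrete, here is a minimal sketch of the relative harmonic amplitudes of the ideal textbook waveforms. The name `harmonicAmplitude` is made up for this illustration, and the numbers are the idealised Fourier series, not a measurement of what any particular oscillator outputs.

```javascript
// Relative amplitude of harmonic n (n = 1 is the fundamental),
// normalised so the fundamental of each waveform is 1.
function harmonicAmplitude(type, n) {
  const odd = n % 2 === 1;
  switch (type) {
    case 'sine':     return n === 1 ? 1 : 0;        // fundamental only
    case 'sawtooth': return 1 / n;                   // every harmonic
    case 'square':   return odd ? 1 / n : 0;         // odd harmonics only
    case 'triangle': return odd ? 1 / (n * n) : 0;   // odd harmonics, fading fast
    default: throw new Error(`unknown waveform: ${type}`);
  }
}

// First eight harmonics of a waveform, as an array.
const firstEight = (type) =>
  Array.from({ length: 8 }, (_, i) => harmonicAmplitude(type, i + 1));
```

Comparing `firstEight('sawtooth')` with `firstEight('triangle')` shows why the sawtooth gives the filter the most material to carve away.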
The signal chain is always the same. One or more oscillators generate the raw waveform. A filter removes frequency content: typically a lowpass filter, which attenuates everything above a cutoff frequency and passes what lies below it. An envelope controls how the amplitude of the filtered signal changes over time: how fast the note attacks, how it decays, what level it sustains at, how long it takes to release after the key is lifted. This four-stage shape (Attack, Decay, Sustain, Release) is the ADSR envelope, and it's what makes a pad sound like a pad and a pluck sound like a pluck. The same oscillator and filter settings with different ADSR values produce completely different instruments.
The two-oscillator design adds one more crucial tool: detuning. When you run two oscillators at nearly identical frequencies (say, one at 440Hz and one at 442Hz) their waveforms drift in and out of phase with each other, creating a slow beating or chorusing effect. In this case the pair beats twice per second, which is the 2Hz difference between them. This is the characteristic thickness of analogue synths. A single digital oscillator never sounds like that on its own, because it is always perfectly in tune with itself. The imperfection is the sound. By the end of this lesson you'll have a fully working two-oscillator subtractive synth and a piano roll to compose with it.
Every subtractive synthesizer routes audio through the same four stages. The oscillators provide raw material — waveforms rich in harmonics. The filter shapes the frequency content of that material. The amplitude envelope determines how the signal evolves in volume over time. The master gain sets the final output level. Each stage in the Web Audio API maps cleanly to a node: OscillatorNode, BiquadFilterNode, GainNode for the envelope, GainNode for the master.
The key architectural decision in a subtractive synth is that one voice = one complete instance of this chain. Every time you press a key, a new set of nodes is created, runs for the duration of the note, then is discarded. This is called polyphony through instantiation — no complex voice stealing needed at the scale we're working at. Each note is independent and self-contained.
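One way to handle the "then is discarded" part is to disconnect a voice's nodes when its last oscillator fires its `ended` event. This is only a sketch: `trackVoice` and `activeVoices` are hypothetical names, and it assumes you pass in the oscillator that is scheduled to stop last.

```javascript
// Voices that are currently sounding; handy for a polyphony readout.
const activeVoices = new Set();

// Register a voice and tear it down once `lastOsc` has stopped.
// `nodes` is every node that belongs to the voice (oscillators, gains, filter).
function trackVoice(lastOsc, nodes) {
  const voice = { nodes };
  activeVoices.add(voice);
  lastOsc.onended = () => {
    for (const node of nodes) node.disconnect(); // drop all connections
    activeVoices.delete(voice);
  };
  return voice;
}
```

Because each voice owns its nodes outright, cleanup never has to reason about any other note that happens to be playing.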
function triggerNote(ctx, freq, time, duration, settings) {
  // ─ Oscillators
  const osc1 = ctx.createOscillator();
  const osc2 = ctx.createOscillator();
  const osc1Gain = ctx.createGain();
  const osc2Gain = ctx.createGain();
  osc1.type = settings.osc1Type;
  osc2.type = settings.osc2Type;
  osc1.frequency.value = freq * Math.pow(2, settings.osc1Oct);
  // detune is an offset in Hz, added after the octave shift
  osc2.frequency.value = freq * Math.pow(2, settings.osc2Oct) + settings.detune;
  osc1Gain.gain.value = 1 - settings.osc2Mix;
  osc2Gain.gain.value = settings.osc2Mix;

  // ─ Filter
  const filter = ctx.createBiquadFilter();
  filter.type = 'lowpass';
  filter.frequency.value = settings.cutoff;
  filter.Q.value = settings.resonance;

  // ─ Amplitude envelope
  const peak = 0.7; // a peak below 1 leaves some headroom
  // Hold the note at least until the decay stage has finished, so the
  // automation events stay in time order even for very short notes
  const noteOff = time + Math.max(duration, settings.attack + settings.decay);
  const env = ctx.createGain();
  env.gain.setValueAtTime(0, time);
  env.gain.linearRampToValueAtTime(peak, time + settings.attack);
  env.gain.linearRampToValueAtTime(settings.sustain * peak, time + settings.attack + settings.decay);
  env.gain.setValueAtTime(settings.sustain * peak, noteOff);
  env.gain.linearRampToValueAtTime(0, noteOff + settings.release);

  // ─ Wire up (connect env to a master GainNode instead if you have one)
  osc1.connect(osc1Gain);
  osc2.connect(osc2Gain);
  osc1Gain.connect(filter);
  osc2Gain.connect(filter);
  filter.connect(env);
  env.connect(ctx.destination);

  // ─ Start now, stop just after the release has finished
  const stopTime = noteOff + settings.release + 0.05;
  osc1.start(time);
  osc2.start(time);
  osc1.stop(stopTime);
  osc2.stop(stopTime);
}
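`triggerNote` takes a frequency in Hz, so something has to turn a key or grid row into a frequency. A minimal sketch, assuming equal temperament with MIDI note 69 tuned to 440Hz; `midiToFreq` and `exampleSettings` are illustrative names and values, not part of the lesson's engine.

```javascript
// Equal temperament: MIDI note 69 is A4, and each semitone is a factor of 2^(1/12).
function midiToFreq(midiNote, a4 = 440) {
  return a4 * Math.pow(2, (midiNote - 69) / 12);
}

// Hypothetical settings object with the fields triggerNote reads.
const exampleSettings = {
  osc1Type: 'sawtooth', osc2Type: 'sawtooth',
  osc1Oct: 0, osc2Oct: 0,
  detune: 2,                  // Hz offset applied to osc2
  osc2Mix: 0.5,               // 0 = only osc1, 1 = only osc2
  cutoff: 2000, resonance: 1,
  attack: 0.01, decay: 0.2, sustain: 0.6, release: 0.3, // times in seconds, sustain 0..1
};

// In a browser, after a user gesture has allowed audio to start:
// const ctx = new AudioContext();
// triggerNote(ctx, midiToFreq(60), ctx.currentTime + 0.05, 0.5, exampleSettings);
```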
Running two oscillators is not just about being louder. The detuning relationship between them creates beating — a low-frequency amplitude variation caused by two slightly different frequencies interfering. At 1–3Hz of detuning you get slow, lush chorus. At 5–10Hz you get a faster, more agitated shimmer. At 20Hz+ it becomes a dissonant roughness. This is analogue character.
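The beat rate is simply the difference between the two frequencies. A small sketch of that arithmetic follows, along with a conversion to cents that shows why a fixed Hz offset sounds wider on low notes than on high ones. Both function names are made up for this example.

```javascript
// Beat rate heard when two steady tones are mixed: the difference of their frequencies.
function beatFrequency(f1, f2) {
  return Math.abs(f1 - f2);
}

// The same Hz offset expressed in cents (1/100 of a semitone) relative to `freq`.
function hzOffsetToCents(freq, offsetHz) {
  return 1200 * Math.log2((freq + offsetHz) / freq);
}

// A 2 Hz offset beats at 2 Hz wherever you play it, but it is a much
// larger pitch interval at 110 Hz than it is at 440 Hz.
const centsAtA4 = hzOffsetToCents(440, 2);
const centsAtA2 = hzOffsetToCents(110, 2);
```

If you would rather keep the interval constant across the keyboard, `OscillatorNode` also has a `detune` parameter measured in cents.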
The oscillator mix slider blends between osc1 and osc2. At 0% you hear only osc1. At 100% only osc2. Anywhere in between combines both. Running osc2 an octave above osc1 and mixing at 30% adds high-end brightness without losing the fundamental weight. Running osc2 an octave below adds sub weight. The combination of wave type, octave and mix across two oscillators is where most of the tonal variation in subtractive synthesis comes from — before the filter has even been touched.
// Try both oscillators on sawtooth with detune at +7Hz. That fast shimmer sits at the agitated end of what people call "analogue warmth"; drop the detune to 1–3Hz for the slower, lusher version. Now set osc2 an octave down and reduce its mix to 20%: that's a classic bass-heavy patch. Square + triangle with zero detune = hollow, glassy tone used in a lot of ambient music.
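The three patches suggested above can be written down as plain data. This is a sketch using the same field names `triggerNote` reads; the preset names are invented, and any value the text doesn't state (such as the 50% mix) is an assumption.

```javascript
const oscPresets = {
  // Both saws, +7 Hz detune. Mix is not specified in the text; 0.5 is assumed.
  detunedSaws: { osc1Type: 'sawtooth', osc2Type: 'sawtooth', osc1Oct: 0, osc2Oct: 0, detune: 7, osc2Mix: 0.5 },
  // Same, but osc2 an octave down at 20% mix.
  subBass: { osc1Type: 'sawtooth', osc2Type: 'sawtooth', osc1Oct: 0, osc2Oct: -1, detune: 7, osc2Mix: 0.2 },
  // Square + triangle, no detune. Mix assumed to be 0.5.
  hollow: { osc1Type: 'square', osc2Type: 'triangle', osc1Oct: 0, osc2Oct: 0, detune: 0, osc2Mix: 0.5 },
};

// The two oscillator frequencies a preset produces for a given note.
function oscFrequencies(freq, s) {
  return [freq * Math.pow(2, s.osc1Oct), freq * Math.pow(2, s.osc2Oct) + s.detune];
}

// The two oscillator gains: a linear crossfade driven by osc2Mix.
function oscGains(s) {
  return [1 - s.osc2Mix, s.osc2Mix];
}
```

Merging a preset into the rest of your settings with object spread keeps the filter and envelope values untouched while you audition oscillator combinations.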
The filter is where subtractive synthesis earns its name. A lowpass filter is the most common — it passes low frequencies and attenuates everything above the cutoff. Open the cutoff wide and the full bright sawtooth comes through. Close it down and only the fundamental and lowest harmonics remain: a dark, warm, muffled tone. Sweep it and you hear harmonics being progressively revealed or removed in real time. This is the classic analogue synth sweep.
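Pitch perception is roughly logarithmic, so a cutoff control usually feels more natural when it sweeps exponentially instead of linearly. Here is a minimal sketch of that mapping and of a scheduled sweep. `sliderToCutoff` and `sweepCutoff` are hypothetical helpers, and the 20Hz to 20kHz range is an assumption.

```javascript
// Map a 0..1 slider position to a cutoff in Hz on a logarithmic scale.
function sliderToCutoff(position, minHz = 20, maxHz = 20000) {
  const p = Math.min(1, Math.max(0, position)); // clamp to 0..1
  return minHz * Math.pow(maxHz / minHz, p);
}

// Schedule a smooth sweep on an AudioParam such as filter.frequency.
// Exponential ramps need strictly positive start and end values.
function sweepCutoff(cutoffParam, fromHz, toHz, startTime, seconds) {
  cutoffParam.setValueAtTime(fromHz, startTime);
  cutoffParam.exponentialRampToValueAtTime(toHz, startTime + seconds);
}

// Browser usage sketch:
// sweepCutoff(filter.frequency, 200, 8000, ctx.currentTime, 2);
```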
Resonance (Q) boosts the frequencies right at the cutoff point. At low Q values it's a subtle peak. At high Q values the peak becomes very pronounced. On many analogue filters it goes far enough that the filter begins to self-oscillate, generating a pitch at its cutoff frequency with no input at all. This is the "filter resonance squeal" at extreme Q values. The Web Audio BiquadFilterNode generally rings hard at its cutoff rather than truly self-oscillating, and for the lowpass type its Q is a resonance value in decibels, not a conventional Q factor. At moderate settings resonance adds presence and edge to the sweep. Envelope-modulated filter cutoff, where the ADSR drives the filter open on attack and closed on release, is covered in the next chapter.
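To see what cutoff and resonance do to individual harmonics, you can compute the filter's gain at any frequency. The sketch below assumes the lowpass coefficient formulas as I understand them from the Web Audio specification (the familiar "Audio EQ Cookbook" biquad, with Q read in decibels). `lowpassGain` is a made-up helper, so treat its output as an approximation of what `BiquadFilterNode` does, not a guarantee.

```javascript
// Linear gain of a lowpass biquad at `freq`, for a given cutoff (Hz),
// resonance in dB and sample rate.
function lowpassGain(freq, cutoff, qDb, sampleRate = 44100) {
  const w0 = (2 * Math.PI * cutoff) / sampleRate; // cutoff as an angle
  const w = (2 * Math.PI * freq) / sampleRate;    // test frequency as an angle
  const alpha = Math.sin(w0) / (2 * Math.pow(10, qDb / 20));
  const b0 = (1 - Math.cos(w0)) / 2;
  // Numerator is b0 * (1 + z^-1)^2, so its magnitude is b0 * (2 + 2cos w).
  const num = b0 * (2 + 2 * Math.cos(w));
  // Denominator with a0 = 1 + alpha, a1 = -2cos(w0), a2 = 1 - alpha.
  const re = 2 * Math.cos(w) - 2 * Math.cos(w0);
  const im = 2 * alpha * Math.sin(w);
  return num / Math.hypot(re, im);
}

// Gain applied to each of the first 16 harmonics of a 220 Hz note
// by a 1 kHz lowpass with 6 dB of resonance.
const harmonicGains = Array.from({ length: 16 }, (_, i) =>
  lowpassGain(220 * (i + 1), 1000, 6));
```

Under these formulas the gain right at the cutoff works out to the resonance value itself, so 20dB of resonance is a tenfold boost at that one frequency. In a browser, `BiquadFilterNode.getFrequencyResponse` reports the real node's response if you want to compare.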
The ADSR envelope is what gives a synthesized note its character over time. A fast attack and fast decay with low sustain sounds like a pluck or a stab. A slow attack with full sustain sounds like a pad. A medium attack, low sustain, and long release sounds like a string section. The same oscillator and filter settings produce completely different instruments purely through envelope shape.
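It can help to see the envelope as a plain function of time. This sketch mirrors the linear ramps used in `triggerNote` and assumes the note is held at least as long as attack plus decay. `adsrLevel` and the example presets are illustrative, not taken from the lesson's engine.

```javascript
// Envelope level at `t` seconds after note-on, for a note held `duration` seconds.
// Times are in seconds, `sustain` is a 0..1 fraction of `peak`.
function adsrLevel(t, duration, { attack, decay, sustain, release }, peak = 0.7) {
  if (t < 0) return 0;
  if (t < attack) return peak * (t / attack);                 // attack: rise to peak
  if (t < attack + decay) {
    const p = (t - attack) / decay;                            // 0..1 through the decay
    return peak * (1 + (sustain - 1) * p);                     // decay: fall to sustain
  }
  if (t < duration) return peak * sustain;                     // sustain: hold
  if (t < duration + release) {
    return peak * sustain * (1 - (t - duration) / release);    // release: fall to zero
  }
  return 0;
}

// Illustrative shapes for the instruments described above.
const pluck = { attack: 0.01, decay: 0.1, sustain: 0.2, release: 0.1 };
const pad = { attack: 0.8, decay: 0.2, sustain: 1.0, release: 1.5 };
```

Sampling `adsrLevel` every few milliseconds and drawing the result on a canvas gives you an envelope display for free.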
The filter envelope is a second ADSR that drives the filter cutoff rather than the volume — it opens the filter on attack and closes it on release. This is the "wah" shape: notes start bright and get darker as they sustain, which mimics the behaviour of real acoustic instruments. The envelope amount controls how wide the filter sweeps. Below is the complete two-oscillator subtractive voice with everything wired up. This is the same synth engine that drives the piano roll in the next chapter.
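A filter envelope can be scheduled with the same automation calls as the amplitude envelope, just pointed at `filter.frequency`. The following is a sketch under assumptions: the option names (`baseCutoff`, `amount` and so on) are invented here, the ramps are linear in Hz for simplicity, and the sum of `baseCutoff` and `amount` is assumed to stay within the filter's valid range.

```javascript
// Schedule an ADSR shape on a cutoff AudioParam.
// baseCutoff: resting cutoff in Hz; amount: how many Hz the envelope adds at its peak.
function scheduleFilterEnvelope(cutoffParam, time, duration, opts) {
  const { baseCutoff, amount, attack, decay, sustain, release } = opts;
  const peak = baseCutoff + amount;
  const hold = baseCutoff + amount * sustain;
  // Keep events in time order even if the note is shorter than attack + decay.
  const noteOff = time + Math.max(duration, attack + decay);

  cutoffParam.setValueAtTime(baseCutoff, time);
  cutoffParam.linearRampToValueAtTime(peak, time + attack);          // open
  cutoffParam.linearRampToValueAtTime(hold, time + attack + decay);  // settle
  cutoffParam.setValueAtTime(hold, noteOff);                         // hold until note-off
  cutoffParam.linearRampToValueAtTime(baseCutoff, noteOff + release); // close
}

// Browser usage sketch:
// scheduleFilterEnvelope(filter.frequency, ctx.currentTime, 0.5,
//   { baseCutoff: 400, amount: 3000, attack: 0.01, decay: 0.3, sustain: 0.2, release: 0.4 });
```

Setting `amount` to zero leaves the cutoff fixed, which makes it easy to compare a patch with and without the sweep.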
A piano roll is a two-dimensional grid: time moves left to right, pitch moves bottom to top. Each note is a rectangle: its horizontal position is when it starts, its width is how long it lasts, its vertical position is its pitch. Every DAW uses this model, including Logic, Ableton and FL Studio. They all descend from the player piano, where holes punched in a roll of paper triggered mechanical hammers.
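The note-as-rectangle model translates almost directly into data. Here is a minimal sketch, assuming notes are stored as `{ pitch, start, length }` with `pitch` as a MIDI note number and `start` and `length` counted in grid steps. The `layout` values and the `noteToRect` name are made up for illustration.

```javascript
// Hypothetical layout: fixed-width step columns, one row per semitone,
// and the highest visible pitch on the top row of the canvas.
const layout = { stepWidth: 24, rowHeight: 12, topPitch: 84 };

// Convert a note into the canvas rectangle that represents it.
function noteToRect(note, { stepWidth, rowHeight, topPitch }) {
  return {
    x: note.start * stepWidth,               // when it starts
    y: (topPitch - note.pitch) * rowHeight,  // higher pitch = nearer the top
    width: note.length * stepWidth,          // how long it lasts
    height: rowHeight,
  };
}

// Browser usage sketch:
// const r = noteToRect(note, layout);
// canvasCtx.fillRect(r.x, r.y, r.width, r.height);
```

Canvas y coordinates grow downward, which is why the pitch is subtracted from `topPitch` to make higher notes appear higher up.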
Building one in the browser is a canvas drawing and scheduling problem. The canvas handles interaction — click to place a note, drag to set its length, click an existing note to delete it. The scheduler uses the same lookahead technique as the drum machine: AudioContext.currentTime drives the clock, and notes are scheduled slightly ahead of when they need to sound. The synth engine from the previous chapters fires for each note using its pitch, start time and duration. Everything below is playable and composable in the browser.
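The lookahead idea can be sketched as a pure function that takes a time window and fires every note starting inside it. The assumptions here: notes are `{ pitch, start, length }` objects measured in grid steps, each step is a sixteenth note, and `trigger` is whatever callback plays a note. None of these names come from the lesson's own scheduler.

```javascript
// Seconds per grid step, assuming four steps per beat (sixteenth notes).
function stepSeconds(bpm) {
  return 60 / bpm / 4;
}

// Call `trigger(pitch, when, durationSeconds)` for every note whose start
// time falls inside the half-open window [windowStart, windowEnd).
function scheduleWindow(notes, bpm, songStart, windowStart, windowEnd, trigger) {
  const step = stepSeconds(bpm);
  for (const note of notes) {
    const when = songStart + note.start * step;
    if (when >= windowStart && when < windowEnd) {
      trigger(note.pitch, when, note.length * step);
    }
  }
}

// Browser usage sketch: wake up every 25 ms and schedule the next 100 ms.
// let cursor = songStart;
// setInterval(() => {
//   const end = ctx.currentTime + 0.1;
//   scheduleWindow(notes, 120, songStart, cursor, end, playNote);
//   cursor = end;
// }, 25);
```

Using half-open windows that butt up against each other means a note is never scheduled twice and never skipped, however irregularly the timer fires.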
// click empty cell → place note · drag right edge → resize · click note → delete
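The click behaviour comes down to a hit test followed by an add or a remove. This sketch again assumes notes are `{ pitch, start, length }` objects in grid steps and takes a hypothetical `layout` with `stepWidth`, `rowHeight` and `topPitch`. Drag-to-resize is left out.

```javascript
// Convert a canvas-relative pointer position into a grid cell.
function cellFromPoint(x, y, { stepWidth, rowHeight, topPitch }) {
  return {
    step: Math.floor(x / stepWidth),
    pitch: topPitch - Math.floor(y / rowHeight),
  };
}

// Return a new notes array: if a note covers the cell it is removed,
// otherwise a one-step note is added there.
function toggleNoteAt(notes, step, pitch) {
  const hit = notes.findIndex(
    (n) => n.pitch === pitch && step >= n.start && step < n.start + n.length
  );
  if (hit !== -1) return notes.filter((_, i) => i !== hit);
  return [...notes, { pitch, start: step, length: 1 }];
}
```

Returning a fresh array instead of mutating in place makes it simple to redraw the canvas, or to add undo, whenever the notes change.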
// The piano roll uses the same synth voice as all the previous chapters, with whatever settings you have dialled in above. Change the preset, then switch to a new melody. Because every note is scheduled against AudioContext.currentTime instead of a JavaScript timer, notes land where you place them regardless of BPM.