AI-powered processing. These nodes use trained neural networks for tasks that traditional DSP can’t handle — like separating a mixed song into individual instruments.
Neural Stem Separator
What it does — Splits a full mix into four separate stems (Drums, Bass, Other, Vocals) using the Demucs v4 neural network.
When you’d reach for it — You have a finished mix or a bounced track and you need to isolate one element — pull out the vocal for a remix, grab just the drums for layering, or remove the bass to replace it with your own.
Quick example
- Feed your mixed audio into Neural Stem Separator.
- Wait for inference to complete (roughly 10-30 seconds, depending on track length and on whether GPU inference is enabled).
- Choose which stem to preview in the viewer using the Preview selector.
- Route each of the four outputs (Drums, Bass, Other, Vocals) into separate processing chains.
- Blend the processed stems back together downstream.
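
The last step above, blending stems back together, comes down to a gain-weighted sum. The following is a minimal sketch, not the node graph's actual implementation: `remix`, `gains_db`, and `muted` are hypothetical names, and the stems are assumed to be equal-length lists of float samples.

```python
def remix(stems, gains_db=None, muted=()):
    """Sum equal-length stems, applying an optional per-stem gain in dB.

    stems:    dict mapping a stem name to a list of float samples
    gains_db: dict mapping a stem name to a gain in dB (default 0 dB)
    muted:    names of stems to leave out entirely
    """
    gains_db = gains_db or {}
    length = len(next(iter(stems.values())))
    out = [0.0] * length
    for name, samples in stems.items():
        if name in muted:
            continue
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)  # dB -> linear
        for i, sample in enumerate(samples):
            out[i] += gain * sample
    return out


# Tiny made-up stems, four samples each, purely for illustration.
stems = {
    "drums": [0.5, 0.0, -0.5, 0.0],
    "bass": [0.25, 0.25, 0.25, 0.25],
    "other": [0.0, 0.125, 0.0, -0.125],
    "vocals": [0.125, 0.0, 0.125, 0.0],
}
full_mix = remix(stems)                  # all four stems at unity gain
no_bass = remix(stems, muted=("bass",))  # drop the bass to replace it later
```

Muting a stem rather than setting a very low gain keeps the intent explicit when you plan to substitute your own part.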
Parameters
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Preview in Viewer | Which stem displays in the main viewer | Drums / Bass / Other / Vocals | Switch as needed to inspect each stem |
| Mix | Blend between silence and full separation | 0.00 - 1.00 | Keep at 1.00 for clean stems |
| Normalize Output | Prevents clipping by normalizing each stem | On / Off | Leave on unless you need raw levels |
| Use GPU (CUDA) | Runs inference on your GPU instead of CPU | On / Off | Turn on if you have an NVIDIA GPU — dramatically faster |
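
To make Mix and Normalize Output concrete, here is a rough sketch of what such controls typically do. It rests on assumptions: the source does not say which normalization method the node uses, so this version only attenuates a stem whose peak exceeds a ceiling, and it treats Mix as a plain scalar from silence (0.00) to the full stem (1.00). `peak_normalize` and `apply_mix` are hypothetical helper names.

```python
def peak_normalize(samples, ceiling=1.0):
    """Scale a stem down so its peak does not exceed `ceiling`.

    Assumption: stems already below the ceiling are left untouched.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= ceiling:
        return list(samples)
    scale = ceiling / peak
    return [s * scale for s in samples]


def apply_mix(samples, mix):
    """Scale a stem between silence (mix=0.0) and full level (mix=1.0)."""
    mix = min(max(mix, 0.0), 1.0)  # clamp to the documented range
    return [s * mix for s in samples]


hot_stem = [0.5, -2.0, 1.0]          # peaks at 2.0, would clip
safe_stem = peak_normalize(hot_stem)
half_level = apply_mix(safe_stem, 0.5)
```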
DDSP Resynth
What it does — Deconstructs a monophonic sound into pitch, loudness, and timbre, then rebuilds it from scratch using additive synthesis and shaped noise.
When you’d reach for it — You want to radically reshape the character of a solo instrument or voice — turn a flute into something synth-like, add breathiness to a clean vocal, or create evolving textures from a simple melodic line. Works best on monophonic material (one note sounding at a time) like voice, violin, flute, or synth leads.
Quick example
- Feed a monophonic recording into DDSP Resynth.
- Set Quality to Standard for a good speed/accuracy balance.
- Open the Synthesis section and try the Breathy preset for an airy vocal texture.
- Adjust Harmonic Level and Noise Level to taste.
- Set Phase Mode to RTPGHI (real-time phase gradient heap integration) for cleaner output.
Parameters
Global
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Frame Rate | How many analysis snapshots per second | 50 - 250 Hz | 100 Hz is the standard; raise for fast passages |
| Mix | Blend between original and resynthesized audio | 0.00 - 1.00 | 1.00 for full resynth, dial back to layer with the original |
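
As a rough guide to what these two controls mean in samples, the sketch below converts Frame Rate into a hop size and shows Mix as a linear dry/wet blend. Both helpers are hypothetical, and the linear crossfade is an assumption; the node may use a different blend law.

```python
def hop_size(sample_rate, frame_rate):
    """Samples between consecutive analysis frames."""
    return round(sample_rate / frame_rate)


def dry_wet(dry, wet, mix):
    """Linear blend: mix=0.0 is the original, mix=1.0 is the resynthesis."""
    return [(1.0 - mix) * d + mix * w for d, w in zip(dry, wet)]


standard_hop = hop_size(48000, 100)  # 480 samples per frame at 48 kHz
fast_hop = hop_size(48000, 250)      # 192 samples: finer time resolution
layered = dry_wet([1.0, 0.0], [0.0, 1.0], 0.25)
```

Raising Frame Rate shrinks the hop, which tracks fast passages more closely at the cost of more frames to analyze.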
Pitch Extraction
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Quality | CREPE model size — bigger is more accurate but slower | Draft / Standard / Precise / Extreme | Standard covers most material well |
| Voicing Threshold | Confidence level below which a frame is treated as unvoiced | 0.00 - 1.00 | 0.50 default; lower if pitched segments are dropping out |
| Fine Tune (cents) | Pitch offset to compensate for detection bias | -100 to +100 | -54 is the calibrated default; adjust if pitch drifts |
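
The sketch below illustrates how Voicing Threshold and Fine Tune could act on a pitch track. It assumes the pitch detector returns one frequency and one confidence value per frame; `clean_pitch_track` is a hypothetical helper, not part of the node's API. The cents-to-ratio conversion itself is standard: 1200 cents equal one octave.

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch offset in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


def clean_pitch_track(f0_hz, confidence, threshold=0.5, fine_tune_cents=0.0):
    """Apply the fine-tune offset and mark low-confidence frames as unvoiced.

    Frames whose confidence falls below `threshold` become None (unvoiced).
    """
    ratio = cents_to_ratio(fine_tune_cents)
    return [
        f * ratio if c >= threshold else None
        for f, c in zip(f0_hz, confidence)
    ]


track = clean_pitch_track(
    f0_hz=[440.0, 441.0, 220.0],
    confidence=[0.9, 0.3, 0.5],
    threshold=0.5,
)
# The middle frame is dropped because 0.3 is below the 0.5 threshold.
```

Lowering the threshold keeps more frames voiced, which is why it helps when pitched segments drop out.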
Loudness Extraction
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Weighting | Loudness curve applied during analysis | A-weighted / C-weighted / Flat | A-weighted approximates the ear's frequency sensitivity at moderate levels |
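
For reference, the A-weighting curve has a standard closed form. The sketch below evaluates it at a given frequency; how the node applies the curve during loudness analysis is not described in the source, so treat this only as an illustration of the curve itself. `a_weighting_db` is a hypothetical function name.

```python
import math


def a_weighting_db(freq_hz):
    """A-weighting gain in dB at `freq_hz` (standard analytic form)."""
    f2 = freq_hz * freq_hz
    numerator = (12194.0 ** 2) * f2 * f2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to about 0 dB at 1 kHz.
    return 20.0 * math.log10(numerator / denominator) + 2.0


gain_1k = a_weighting_db(1000.0)   # close to 0 dB
gain_100 = a_weighting_db(100.0)   # roughly -19 dB: lows count for less
```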
Harmonic Analysis
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Harmonics | Number of overtones extracted from the spectrum | 1 - 100 | 60 for rich timbres, lower for purer tones |
| Interpolation | How harmonic peaks are read from the spectrum | Nearest / Linear / Parabolic | Linear is the safe default; Parabolic for precision |
| Smoothing | Temporal smoothing across analysis frames | 0.00 - 0.90 | 0.10 keeps detail; raise to tame jitter |
| Power Normalize | Scales harmonic amplitudes so total energy stays consistent | On / Off | Leave on for predictable levels |
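
The Interpolation and Power Normalize options can be pictured with the sketch below. It assumes the analysis works on a list of spectral magnitudes indexed by bin, which the source does not confirm; all function names are hypothetical. The parabolic refinement is the usual three-point fit around a peak bin.

```python
import math


def read_nearest(mags, pos):
    """Magnitude at the bin closest to fractional position `pos`."""
    return mags[min(int(pos + 0.5), len(mags) - 1)]


def read_linear(mags, pos):
    """Magnitude linearly interpolated between the two neighbouring bins."""
    i = int(pos)
    if i + 1 >= len(mags):
        return mags[-1]
    frac = pos - i
    return (1.0 - frac) * mags[i] + frac * mags[i + 1]


def parabolic_peak(mags, k):
    """Fit a parabola through bins k-1, k, k+1; return (position, height)."""
    a, b, c = mags[k - 1], mags[k], mags[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(k), b
    offset = 0.5 * (a - c) / denom
    return k + offset, b - 0.25 * (a - c) * offset


def power_normalize(amps):
    """Scale amplitudes so that the sum of their squares equals 1."""
    total = math.sqrt(sum(a * a for a in amps))
    return list(amps) if total == 0.0 else [a / total for a in amps]


# Samples of a parabola whose true peak sits at bin 2.25 with height 10.
mags = [0.0, 8.4375, 9.9375, 9.4375, 0.0]
peak_pos, peak_height = parabolic_peak(mags, 2)
```

Nearest and Linear read the spectrum at the expected harmonic position, while Parabolic recovers a peak that falls between bins, which is where the extra precision comes from.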
Noise Analysis
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Bands | Number of frequency bands in the noise filter | 16 - 128 | 65 gives good resolution without excess cost |
| Smoothing | Temporal smoothing on the noise envelope | 0.00 - 0.90 | 0.20 balances detail and stability |
| Floor (dB) | Quietest level the noise model will represent | -80 to -20 dB | -60 dB catches most detail without amplifying silence |
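
One way to picture Smoothing and Floor is the sketch below. It makes two assumptions that the source does not confirm: smoothing is a one-pole filter across frames, and band levels below the floor are treated as silent rather than clamped up to the floor. The helper names are hypothetical.

```python
def smooth_frames(values, smoothing):
    """One-pole smoothing across frames; 0.0 leaves the values unchanged."""
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else smoothing * prev + (1.0 - smoothing) * v
        out.append(prev)
    return out


def noise_band_gains(levels_db, floor_db=-60.0):
    """Convert band levels in dB to linear gains, silencing sub-floor bands."""
    return [
        0.0 if level < floor_db else 10.0 ** (level / 20.0)
        for level in levels_db
    ]


envelope = smooth_frames([1.0, 0.0, 0.0], 0.5)    # decays instead of jumping
gains = noise_band_gains([-90.0, -20.0, -60.0])   # first band is gated out
```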
Synthesis
| Parameter | What it controls | Range | Sweet spot hint |
|---|---|---|---|
| Harmonic Level (dB) | Volume of the harmonic (tonal) component | -24 to +12 dB | 0 dB keeps the original balance |
| Noise Level (dB) | Volume of the noise (breath/texture) component | -24 to +12 dB | -12 dB for subtle texture; raise for breathier sounds |
| Harmonic Rolloff (dB/oct) | Spectral tilt of the harmonic series | -12 to +6 dB/oct | 0 preserves the analyzed timbre; negative darkens, positive brightens |
| Noise Color | Spectral shape of the noise component | -1.00 to +1.00 | -1 is warm/pink, 0 is white, +1 is bright/blue |
| Output Gain (dB) | Master level after synthesis | -24 to +12 dB | 0 dB; adjust to match surrounding levels |
| Phase Mode | How phase is reconstructed in the output | None / RTPGHI / Anchored | RTPGHI for clean results; Anchored to preserve original phase character |
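
To show how Harmonic Level and Harmonic Rolloff could shape the tonal component, here is a minimal additive-synthesis sketch. It is an illustration under assumptions, not the node's implementation: the rolloff is modeled as a gain of `rolloff_db_per_oct` dB for every octave above the fundamental, the pitch is held constant, and the function names are hypothetical.

```python
import math


def db_to_linear(db):
    return 10.0 ** (db / 20.0)


def harmonic_gains(num_harmonics, level_db=0.0, rolloff_db_per_oct=0.0):
    """Per-harmonic gain: overall level plus a tilt in dB per octave."""
    base = db_to_linear(level_db)
    return [
        base * db_to_linear(rolloff_db_per_oct * math.log2(k))
        for k in range(1, num_harmonics + 1)
    ]


def additive_synth(f0_hz, amps, sample_rate, num_samples):
    """Sum sine partials at integer multiples of f0, skipping any at or above Nyquist."""
    nyquist = sample_rate / 2.0
    out = []
    for n in range(num_samples):
        t = n / sample_rate
        out.append(sum(
            a * math.sin(2.0 * math.pi * k * f0_hz * t)
            for k, a in enumerate(amps, start=1)
            if k * f0_hz < nyquist
        ))
    return out


dark = harmonic_gains(4, level_db=0.0, rolloff_db_per_oct=-6.0)
tone = additive_synth(220.0, dark, sample_rate=48000, num_samples=64)
```

With a rolloff of -6 dB/oct, the second harmonic sits 6 dB below the fundamental and the fourth sits 12 dB below, which is the darkening the table describes.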
Presets
| Preset | What it sets up |
|---|---|
| Sawtooth | Rich harmonic series, minimal noise — classic synth tone |
| Square | Odd-harmonic emphasis, minimal noise — hollow, reedy character |
| Breathy | Fewer harmonics, prominent noise — airy, whispered quality |
| Warm Pad | Gentle rolloff, pink-tinted noise — soft, enveloping texture |
| Bright Lead | Full harmonics boosted, blue-tinted noise — cutting, present tone |
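
For intuition about the Sawtooth and Square presets, the textbook harmonic recipes are sketched below: an ideal sawtooth has every harmonic at amplitude 1/k, and an ideal square wave has only odd harmonics at 1/k. Whether the presets use exactly these amplitudes is an assumption; the function names are hypothetical.

```python
def sawtooth_amps(num_harmonics):
    """Ideal sawtooth: every harmonic k present at amplitude 1/k."""
    return [1.0 / k for k in range(1, num_harmonics + 1)]


def square_amps(num_harmonics):
    """Ideal square wave: odd harmonics at 1/k, even harmonics absent."""
    return [
        1.0 / k if k % 2 == 1 else 0.0
        for k in range(1, num_harmonics + 1)
    ]


saw = sawtooth_amps(4)
square = square_amps(4)
```

The missing even harmonics are what give the square recipe its hollow, reedy character.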