HPSS Extractor

HPSS (Harmonic-Percussive Source Separation) is a classic audio decomposition algorithm that splits a signal into three streams.

Parameters

Parameter     Range        Default
Mask Power    1.0 – 4.0    2.0

Mask Power — Soft-mask exponent, 1–4. Controls the sharpness of the harmonic/percussive/residual separation. 1.0 = soft separation (gradual transitions, content can leak between streams — more natural-sounding). 4.0 = hard separation (decisive split, less leakage but more potential for artifacts at boundaries). 2.0 is the standard default that balances cleanliness and naturalness.
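To make the effect of the exponent concrete, here is a minimal sketch of a soft mask. The page does not give the formula the extractor uses, so this assumes a common Wiener-style ratio; `soft_mask`, `harm`, `perc`, and `eps` are illustrative names, not part of the product.

```python
def soft_mask(harm, perc, power=2.0, eps=1e-10):
    """Share of a bin assigned to the harmonic stream, between 0.0 and 1.0.

    Assumed form: harm**power / (harm**power + perc**power).
    `eps` only guards against division by zero in silent bins.
    """
    h = harm ** power
    p = perc ** power
    return h / (h + p + eps)


# A bin whose harmonic estimate is twice its percussive estimate.
for power in (1.0, 2.0, 4.0):
    print(power, round(soft_mask(2.0, 1.0, power), 3))
# prints:
# 1.0 0.667
# 2.0 0.8
# 4.0 0.941
```

Under this assumption, raising the exponent pushes the same bin closer to an all-or-nothing assignment, which matches the soft-to-hard behaviour described above.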

Additional controls

H Kernel — Harmonic median-filter kernel size in frames, 11–51. Defines how wide the time window is when looking for “harmonic-like” stable content. Smaller kernels (11–21) are faster but can miss longer-sustained harmonic content; larger kernels (31–51) catch more sustained material but require more CPU. 31 is a balanced default — sustained tones lasting around half a second of source material will be reliably classified as harmonic.

P Kernel — Percussive median-filter kernel size in bins, 11–51. Defines how wide the frequency window is when looking for “percussive-like” broadband transient content. Smaller kernels detect narrower spectral spreads as percussive; larger kernels require wider spectral spread before content is classified as percussive. 31 is balanced; lower for surgical detection, higher for catching only the sharpest transients.
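Kernel sizes are given in frames and bins, so how much time or bandwidth they cover depends on the analysis settings, which this page does not state. The sketch below converts both kernels to physical units under assumed values (44.1 kHz sample rate, 2048-point FFT, 512-sample hop); the function names and defaults are hypothetical.

```python
def h_kernel_seconds(frames, hop_length=512, sample_rate=44100):
    """Approximate time span covered by an H Kernel of `frames` frames."""
    return frames * hop_length / sample_rate


def p_kernel_hz(bins, n_fft=2048, sample_rate=44100):
    """Approximate bandwidth covered by a P Kernel of `bins` bins."""
    return bins * sample_rate / n_fft


for size in (11, 31, 51):
    print(size, round(h_kernel_seconds(size), 3), round(p_kernel_hz(size), 1))
```

With these assumed settings the default of 31 spans roughly 0.36 s and 670 Hz. A larger hop or a lower sample rate stretches the time span, so the duration quoted for the default should be read as approximate.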

Component — Which component to extract:

  • Harmonics — sustained, pitched content (vocal sustains, chord tones, sustained instrument lines, drone content).
  • Percussives — transient, noisy, broadband content (drum hits, plosive sounds, attack edges of any sound).
  • Residual — what’s left after extracting both H and P. Typically room tone, breath, ambient noise, and other content that’s neither clearly pitched nor clearly transient.

About HPSS Extractor

HPSS relies on how sounds look in a spectrogram: harmonic content tends to be horizontal (sustained pitches → straight horizontal lines), while percussive content tends to be vertical (transient hits → vertical streaks across all frequencies). HPSS applies median filtering in both directions to identify and separate the two.

Use it for:

  • drum vs. tonal content separation (great for remixing)
  • independent processing of melodic vs. rhythmic elements
  • isolating residual room tone for noise reduction
  • generating clean training data for sample design

Pair with the Dynamic Brightness node (Texture category) for HPSS-aware brightness shaping without doing the explicit separation yourself.
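The median-filtering idea can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the extractor's implementation: it works on a magnitude spectrogram shaped bins × frames, leaves out the STFT and inverse STFT, and obtains a non-empty residual through a `margin` factor in the style of librosa's `hpss`. That `margin` argument is hypothetical here and does not appear among the parameters above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_along(mag, size, axis):
    """Median-filter a 2-D array along one axis, using reflect padding."""
    if size % 2 == 0:
        raise ValueError("kernel size must be odd")
    pad = size // 2
    widths = [(0, 0), (0, 0)]
    widths[axis] = (pad, pad)
    padded = np.pad(mag, widths, mode="reflect")
    # Each output cell sees `size` neighbours along `axis`.
    windows = sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss(mag, h_kernel=31, p_kernel=31, power=2.0, margin=1.0):
    """Split a magnitude spectrogram into harmonic, percussive, residual.

    `margin` is an assumed extra knob: values above 1.0 make both masks
    stricter, so more energy falls through to the residual.
    """
    harm_est = median_along(mag, h_kernel, axis=1)  # smooth along time
    perc_est = median_along(mag, p_kernel, axis=0)  # smooth along frequency
    eps = 1e-10  # avoids division by zero in silent bins
    h_pow = harm_est ** power
    p_pow = perc_est ** power
    m_pow = margin ** power
    mask_h = h_pow / (h_pow + m_pow * p_pow + eps)
    mask_p = p_pow / (p_pow + m_pow * h_pow + eps)
    harmonic = mask_h * mag
    percussive = mask_p * mag
    residual = mag - harmonic - percussive
    return harmonic, percussive, residual


# Synthetic picture: one sustained tone and one click.
mag = np.zeros((64, 64))
mag[10, :] = 1.0  # horizontal line: a steady pitch in bin 10
mag[:, 20] = 1.0  # vertical line: a broadband hit at frame 20

harmonic, percussive, residual = hpss(mag, margin=2.0)
```

In this toy input the horizontal line lands in `harmonic` and the vertical line in `percussive`. Only the crossing point, where both filters respond equally, leaves a sizeable share in `residual`.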


Generated 2026-05-05 from K2K_Dev@96730bdc by scripts/gen_lexique.py. Edit _intros/ or _overrides/, not this file.