Resonate: Low latency, high temporal and frequency resolution spectral analysis

Resonate is a low latency, low memory footprint, and low computational cost algorithm to evaluate perceptually relevant spectral information from audio (and other) signals.

Overview

The Resonate model, developed over a number of years, was first officially introduced in (François, 2025) and extended in (François, 2026).

Resonate builds on a resonator model that accumulates the signal contribution around its resonant frequency in the time domain using the Exponentially Weighted Moving Average (EWMA), which in signal processing terms is a first-order low-pass filter. Consistent with on-line perceptual signal analysis, the EWMA gives more weight to recent input values, whereas the contributions of older values decay exponentially.<br>A compact, iterative formulation of the model affords computing an update at each signal input sample, requiring no buffering and involving only a handful of arithmetic operations.

Each resonator (indexed \(k\)), characterized by its natural resonant frequency and its instantaneous resonant frequency \(f_k(t) = \frac{\omega_k(t)}{2\pi}\), is described by a complex number \(R_k(t)\) whose magnitude captures the contribution of the input signal component around frequency \(f_k(t)\). The formulas below give the recursive update of \(R_k(t)\) by way of a phasor \(P_k(t)\), applied for each sample of a real-valued input signal \(x(t) \in [-1,1]\), regularly sampled at sampling rate \(sr\), where \(\Delta t=1/sr\) is the sample duration.

\[P_k(t) = P_k(t-\Delta t) e^{-i \omega_k(t) \Delta t}\]

\[R_k(t) = (1-\alpha_k) R_k(t-\Delta t) + \alpha_k x(t) P_k(t)\]
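As a minimal sketch of the two update formulas above, the following Python runs a single fixed-frequency resonator over a test tone. The function names, the initial state, the test signal, and the value `alpha=0.001` are illustrative assumptions, not part of the published model.

```python
import cmath
import math

def resonator_step(P, R, x, omega, alpha, dt):
    """One per-sample update of phasor P and resonator state R."""
    P = P * cmath.exp(-1j * omega * dt)      # rotate the phasor by -omega*dt
    R = (1.0 - alpha) * R + alpha * x * P    # EWMA of the heterodyned input
    return P, R

def run_resonator(signal, f, sr, alpha):
    """Feed every sample of `signal` to one resonator tuned to f (Hz)."""
    dt = 1.0 / sr
    omega = 2.0 * math.pi * f
    P, R = 1 + 0j, 0j                        # assumed initial state
    for x in signal:
        P, R = resonator_step(P, R, x, omega, alpha, dt)
    return R

sr = 44100
# One second of a unit-amplitude 440 Hz cosine (hypothetical test input).
tone = [math.cos(2.0 * math.pi * 440.0 * n / sr) for n in range(sr)]
on_target = abs(run_resonator(tone, 440.0, sr, alpha=0.001))
off_target = abs(run_resonator(tone, 880.0, sr, alpha=0.001))
```

For a unit-amplitude cosine, the matching resonator settles near magnitude 0.5, since a real cosine splits evenly between its positive and negative frequency components, while the resonator an octave above stays close to zero.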

The parameter \(\alpha_k \in [0,1]\) dictates how much each new measurement affects the accumulated value; it can be expressed as a function of the system’s time constant, which is set heuristically as a function of the resonant frequency \(f_k\) (intuitively, inversely proportional to the frequency).<br>For the frequency range of interest in audio applications (20-20000 Hz), \(\alpha_k = 1-e^{-\Delta t\frac{f_k}{\log(1+f_k)} }\) is a reasonable heuristic.
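To get a feel for the values this heuristic produces, here is a small sketch. It assumes the logarithm is the natural logarithm (the text does not say), and the function names are hypothetical.

```python
import math

def alpha_heuristic(f, sr):
    """alpha = 1 - exp(-dt * f / log(1 + f)), with log assumed natural."""
    dt = 1.0 / sr
    return 1.0 - math.exp(-dt * f / math.log(1.0 + f))

def time_constant(f):
    """Implied time constant in seconds: tau = log(1 + f) / f."""
    return math.log(1.0 + f) / f

sr = 44100
alphas = {f: alpha_heuristic(f, sr) for f in (20.0, 440.0, 20000.0)}
taus = {f: time_constant(f) for f in (20.0, 440.0, 20000.0)}
```

Under that assumption, higher resonant frequencies get a larger \(\alpha_k\) and a shorter time constant, so high-frequency resonators react faster than low-frequency ones.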

The smoothed state \(\tilde{R}_k\) is produced by applying the EWMA to \(R_k\) with parameter \(\beta_k\) to dampen power and phase oscillations.

\[\tilde{R}_k(t) = (1-\beta_k) \tilde{R}_k(t-\Delta t) + \beta_k R_k(t)\]
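The effect of this second EWMA can be illustrated on a synthetic sequence. The sketch below assumes a resonator output made of a steady value plus a small rotating ripple; the sequence, `beta=0.01`, and the helper names are all hypothetical.

```python
import cmath

def ewma_smooth(values, beta):
    """Apply R~ <- (1 - beta) * R~ + beta * R along a sequence of complex states."""
    smoothed = []
    state = 0j                                # assumed initial smoothed state
    for r in values:
        state = (1.0 - beta) * state + beta * r
        smoothed.append(state)
    return smoothed

def magnitude_ripple(values):
    """Peak-to-peak variation of the magnitude over a sequence."""
    mags = [abs(v) for v in values]
    return max(mags) - min(mags)

# Synthetic resonator output: steady 0.5 plus a rotating ripple term.
raw = [0.5 + 0.05 * cmath.exp(-1j * 0.125 * n) for n in range(5000)]
smooth = ewma_smooth(raw, beta=0.01)

raw_ripple = magnitude_ripple(raw[-1000:])
smooth_ripple = magnitude_ripple(smooth[-1000:])
```

Once the smoothed state has settled, its magnitude oscillates far less than the raw one, at the cost of a slower response set by `beta`.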

This formulation is consistent with the first steps in the filterbank interpretation of the phase vocoder analysis as described by Dolson in his 1986 paper “The Phase Vocoder: A Tutorial”, namely heterodyning followed by lowpass filtering.

The complex numbers \(P_k(t)\), \(R_k(t)\) and \(\tilde{R}_k(t)\) capture the full state of resonator \(k\).<br>Updating the state at each input signal sample only requires a handful of arithmetic operations.<br>Calculating the power and/or magnitude is not necessary for the update, and can be carried out only when required by the application, relatively efficiently as well.

Banks of resonators, independently tuned to perceptually relevant frequency scales, compute an instantaneous, perceptually relevant estimate of the spectral content of an input signal in real time.<br>Both the memory and the per-sample computational complexity of such a bank are linear in the number of resonators, and independent of the number of input samples processed or the duration of the processed signal.<br>Furthermore, since the resonators are independent, there is no constraint on the tuning of their resonant frequencies or time constants, and all per-sample computations can be parallelized across resonators.<br>In an offline processing context, the cumulative computational cost for a given duration increases linearly with the number of input samples processed.
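A bank along these lines can be sketched in a few lines of Python. The class name, the semitone-spaced tuning, the natural-log form of the \(\alpha_k\) heuristic, and the test tone are assumptions made for illustration; magnitudes are computed only on demand, not during the update.

```python
import cmath
import math

class ResonatorBank:
    """Independent fixed-frequency resonators; state is O(number of resonators)."""

    def __init__(self, freqs, sr):
        dt = 1.0 / sr
        self.freqs = list(freqs)
        # Per-resonator phasor rotation and EWMA weight, precomputed once.
        self.rot = [cmath.exp(-2j * math.pi * f * dt) for f in self.freqs]
        self.alpha = [1.0 - math.exp(-dt * f / math.log(1.0 + f)) for f in self.freqs]
        self.P = [1 + 0j] * len(self.freqs)
        self.R = [0j] * len(self.freqs)

    def update(self, x):
        """Process one input sample; each resonator is updated independently."""
        for k in range(len(self.freqs)):
            self.P[k] *= self.rot[k]
            a = self.alpha[k]
            self.R[k] = (1.0 - a) * self.R[k] + a * x * self.P[k]

    def magnitudes(self):
        """Computed on demand only; not needed by update()."""
        return [abs(r) for r in self.R]

sr = 16000
# Two octaves of semitone-spaced resonators starting at 220 Hz (assumed tuning).
freqs = [220.0 * 2.0 ** (k / 12.0) for k in range(25)]
bank = ResonatorBank(freqs, sr)
for n in range(4000):                          # 0.25 s of a 440 Hz cosine
    bank.update(math.cos(2.0 * math.pi * 440.0 * n / sr))
mags = bank.magnitudes()
peak_index = mags.index(max(mags))             # index of the strongest resonator
```

The loop over `k` in `update` has no cross-resonator dependency, which is what makes the per-sample work parallelizable across resonators.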

The original model presented in (François, 2025) keeps the resonant frequency of the resonators fixed, as is the case with the Fast Fourier Transform (FFT).<br>However, there is no such restriction on Resonate resonators.

In the frequency tracking model presented in (François, 2026), the resonant frequency of a resonator is allowed to change over time:<br>in the absence of significant information (i.e. below a set magnitude threshold for \(R_k(t)\)), the resonant frequency remains constant, equal to the resonator’s natural resonant frequency.<br>In the presence of significant response, however, the resonant frequency tracks the estimated instantaneous frequency.

At each time step, the phase difference \(\Delta \phi_k(t)\) between the previous and current values of \(\tilde{R}_k(t)\) provides an estimate of the phase’s time derivative, from which the corresponding instantaneous frequency is computed.

\[f_k(t) = f_k(t-\Delta t) + \frac {\Delta \phi_k(t)}{2\pi \Delta t}\]

The phase of a complex number multiplied by the conjugate of another complex number is the phase difference between the two numbers. This property yields a formula for \(\Delta \phi_k(t)\) that requires only one principal value argument computation.

\[D_k(t) = \tilde{R}_k(t) \overline{\tilde{R}_k}(t-\Delta t)\]

\[\Delta \phi_k(t) = \ldots\]
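The conjugate-product property can be checked numerically. The sketch below takes the phase difference as the principal argument of the product, as the text describes; the function names and sample values are hypothetical, and the conversion to Hz uses the \(\frac{\Delta \phi_k(t)}{2\pi \Delta t}\) term of the frequency update.

```python
import cmath
import math

def phase_delta(curr, prev):
    """Wrapped phase advance from prev to curr, using a single argument call."""
    D = curr * prev.conjugate()
    return cmath.phase(D)                      # principal value in [-pi, pi]

def frequency_offset(curr, prev, dt):
    """Phase advance per sample converted to Hz: delta_phi / (2 * pi * dt)."""
    return phase_delta(curr, prev) / (2.0 * math.pi * dt)

# Two states whose phases straddle the +/-pi branch cut.
prev = cmath.rect(0.3, 3.0)
curr = cmath.rect(0.4, -3.0)
wrapped = phase_delta(curr, prev)              # small positive advance
naive = cmath.phase(curr) - cmath.phase(prev)  # off by 2*pi

# A state rotating 5 Hz faster than the resonator, sampled at 48 kHz.
dt = 1.0 / 48000
a = cmath.rect(1.0, 0.1)
b = a * cmath.exp(2j * math.pi * 5.0 * dt)
offset_hz = frequency_offset(b, a, dt)
```

Subtracting two separately computed arguments needs two argument evaluations plus explicit unwrapping, whereas the conjugate product returns the wrapped difference directly and is unaffected by the magnitudes of the two states.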
