Adaptive State-Space Multitaper Spectral Estimation
Abstract
The short-time Fourier transform (STFT) is the most common window-based approach for analyzing the spectrotemporal dynamics of time series. To mitigate the high variance of spectral estimates caused by finite-length, independent STFT windows, the state-space multitaper (SSMT) method uses a state-space framework to introduce dependency among the spectral estimates. However, the assumed time-invariance of the state-space parameters makes the spectral dynamics difficult to capture when the time series is highly nonstationary. We propose an adaptive SSMT (ASSMT) method as a time-varying extension of SSMT. ASSMT tracks highly nonstationary dynamics by adaptively updating the state parameters and Kalman gains, using a heuristic, computationally efficient exponential smoothing technique. In analyses of simulated data and real human electroencephalogram (EEG) recordings, ASSMT showed improved denoising and smoothing properties relative to standard multitaper and SSMT approaches.
Index Terms:
State-space model, spectral estimation, multitaper method, adaptive estimation, Kalman filter
This manuscript is an extended version of the IEEE Signal Processing Letters paper (doi:10.1109/LSP.2022.3142670), with the supplementary material as the appendix.
I Introduction
Nonstationary time series with time-varying probability structures are ubiquitous. Examples include speech and image recordings [1, 2], oceanographic and seismic signals [3], neural spike trains [4], and the electroencephalogram (EEG) [5]. We are interested in analyzing nonstationary data through the lens of time-varying spectral dynamics, which yield valuable information about the underlying system. The traditional approach has been to segment the data into independent overlapping or non-overlapping windows, assuming local stationarity [6] within each window, and to apply the Fourier or wavelet transform [7, 8].
Despite its popularity, the windowing approach suffers from high-variance spectral estimates within each window, due to the finite window length [9], and from the restrictive independence assumption across windows. The state-space multitaper (SSMT) framework [10] and its extensions [11, 12, 13, 14] have been proposed as solutions, positing a time-invariant latent state-space model in the time-frequency domain, with each state representing the Fourier coefficients of one window. Since the states are linked by a stochastic continuity constraint across windows, the spectral estimates are no longer independent.
However, the use of time-invariant parameters limits the capacity of SSMT to track the strong nonstationarity that commonly occurs in systems neuroscience, which manifests as strong spectral fluctuations driven by changes in system properties. Although a time-varying state-space model offers an obvious solution, traditional parameter estimation based on the expectation-maximization (EM) algorithm [15] is impractical for real-time applications: the parameters must be re-estimated with every incoming batch of data, incurring high computational cost.
We propose a time-varying extension of SSMT with an efficient and adaptive parameter estimation scheme, termed the adaptive SSMT (ASSMT) method. Based on a novel nonstationarity metric derived from spectral fluctuations, ASSMT adaptively adjusts the state parameters and, consequently, the Kalman gain. This allows ASSMT to switch between removing background noise and tracking spectral fluctuations in a data-driven manner. The estimation of the time-varying parameters requires only a single pass through the data, making the method well suited for real-time applications. The paper is organized as follows. In Section II, we review the SSMT framework. In Section III, we formulate the ASSMT framework. In Sections IV and V, we present the results and conclusions.
II Review of State-Space Multitaper Method
We first review the SSMT algorithm of [10]. Consider a nonstationary time series $\{y_t\}_{t=1}^{T}$ sampled at frequency $f_s$ as
\[ y_t = x_t + \varepsilon_t, \tag{1} \]
where $x_t$ is a locally stationary latent Gaussian process [16] and $\varepsilon_t \sim \mathcal{N}(0, \sigma_{\varepsilon}^2)$ is the measurement noise. Leveraging the local stationarity, we divide the signal into $K$ nonoverlapping stationary intervals of $J$ samples each, such that $T = KJ$. The segmented vector for interval $k$ is denoted $Y_k \in \mathbb{R}^{J}$, with $j$th element $Y_{k,j} = y_{J(k-1)+j}$ for $j = 1, \ldots, J$ and $k = 1, \ldots, K$.
To perform time-frequency analysis, we introduce the latent process $\Delta Z_k \in \mathbb{C}^{J}$, where $\Delta Z_{k,j}$ is a complex Gaussian variable whose magnitude corresponds to the power at the normalized frequency $\omega_j = 2\pi(j-1)/J$ on interval $k$ [17]. To model the evolution of the spectra across windows, we assume that $\{\Delta Z_k\}_{k=1}^{K}$ follows a random walk prior
\[ \Delta Z_k = \Delta Z_{k-1} + v_k, \tag{2} \]
where the state noise $v_k$ is a complex Gaussian process with diagonal covariance, i.e., $v_{k,j} \sim \mathcal{CN}(0, \sigma_{v,j}^2)$. The prior encodes two important properties of $\Delta Z_k$. First, the state variance $\sigma_{v,j}^2$ controls the smoothness of the process, with a large value indicating a non-smooth or fluctuating process. Second, $\Delta Z_{k,j}$ and $\Delta Z_{k,j'}$ for $j \neq j'$ are independent a priori.
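To make the smoothness property concrete, the following minimal Python sketch, with illustrative sizes and variances of our own choosing, simulates the prior of Eq. (2); the bin with larger state variance traces a visibly rougher random walk.

```python
import numpy as np

rng = np.random.default_rng(0)
K, J = 200, 64                      # illustrative number of windows and frequency bins
sigma2_v = np.full(J, 0.1)
sigma2_v[10] = 2.0                  # one deliberately "rough" frequency bin
# Complex Gaussian increments v_{k,j} ~ CN(0, sigma2_v[j])
v = (rng.standard_normal((K, J)) + 1j * rng.standard_normal((K, J))) * np.sqrt(sigma2_v / 2)
dZ = np.cumsum(v, axis=0)           # random walk of Eq. (2)
# Larger state variance -> larger typical window-to-window change
print(np.abs(np.diff(dZ[:, 10])).mean(), np.abs(np.diff(dZ[:, 0])).mean())
```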
We can link the time-frequency process $\Delta Z_k$ with the time-domain observation $Y_k$ using the Fourier transform matrix $F \in \mathbb{C}^{J \times J}$, with $F_{j,l} = J^{-1/2}\exp(-i\omega_j(l-1))$, and the inverse Fourier matrix $W = F^{-1}$, such that $X_k = W \Delta Z_k$,
\[ Y_k^{F} = F Y_k = \Delta Z_k + \varepsilon_k^{F}, \tag{3} \]
where $\varepsilon_k^{F} = F\varepsilon_k \sim \mathcal{CN}(0, \sigma_{\varepsilon}^2 I)$, since $F F^{H} = I$.
Along with the state-space model, we incorporate data tapers to further reduce the variance of the spectral estimates. Specifically, we use $M$ Slepian tapers, leading to the multitaper (MT) method that optimally balances the bias-variance trade-off via bandwidth adjustment [18, 19]. This produces $M$ independent sets of state-space models,
\[ Y_k^{F,(m)} = \Delta Z_k^{(m)} + \varepsilon_k^{F,(m)}, \tag{4} \]
\[ \Delta Z_k^{(m)} = \Delta Z_{k-1}^{(m)} + v_k^{(m)}, \tag{5} \]
where $m = 1, \ldots, M$, $S^{(m)}$ denotes the $m$th Slepian taper, $Y_k^{F,(m)}$ denotes the Fourier transform of the tapered data $Y_k^{(m)} = S^{(m)} \odot Y_k$, i.e., $Y_k^{F,(m)} = F Y_k^{(m)}$, and $\Delta Z_k^{(m)}$ represents the $m$th spectral eigen-coefficient of $Y_k$. The variants of SSMT modify Eqs. (4) or (5) [11, 12, 13, 14].
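As a concrete illustration of Eq. (4), a minimal Python sketch of the eigen-coefficient computation might look as follows, using SciPy's Slepian (DPSS) tapers; the window length J, taper count M, and time-half-bandwidth NW are illustrative values, not those of the paper.

```python
import numpy as np
from scipy.signal.windows import dpss

def eigen_coefficients(Y_k, tapers):
    """Eigen-coefficients Y_k^{F,(m)} of Eq. (4): taper the window Y_k with each
    Slepian taper S^{(m)}, then apply the (unitary) Fourier transform."""
    return np.fft.fft(tapers * Y_k[None, :], axis=1) / np.sqrt(Y_k.shape[0])

J, M, NW = 500, 5, 3.0                 # window length, taper count, half-bandwidth (illustrative)
tapers = dpss(J, NW, Kmax=M)           # (M, J) array of Slepian tapers
Y_k = np.random.randn(J)               # one stationary data window
Y_F = eigen_coefficients(Y_k, tapers)  # (M, J) complex eigen-coefficients
```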
We now focus on a single frequency $\omega_j$ for simplicity. Based on Eqs. (4) and (5), we derive a Kalman filter algorithm for the estimation of $\Delta Z_{k,j}^{(m)}$ using the Kalman gain $C_{k,j}^{(m)}$:
\[ \Delta Z_{k|k,j}^{(m)} = \Delta Z_{k|k-1,j}^{(m)} + C_{k,j}^{(m)} \big( Y_{k,j}^{F,(m)} - \Delta Z_{k|k-1,j}^{(m)} \big), \tag{6} \]
\[ \sigma_{k|k,j}^{2,(m)} = \big( 1 - C_{k,j}^{(m)} \big)\, \sigma_{k|k-1,j}^{2,(m)}, \tag{7} \]
where the notation $k|k-1$ denotes the estimate on interval $k$ given the data observed up to interval $k-1$, with $\Delta Z_{k|k-1,j}^{(m)} = \Delta Z_{k-1|k-1,j}^{(m)}$ and $\sigma_{k|k-1,j}^{2,(m)} = \sigma_{k-1|k-1,j}^{2,(m)} + \sigma_{v,j}^{2,(m)}$, and the Kalman gain $C_{k,j}^{(m)}$ is given as
\[ C_{k,j}^{(m)} = \frac{\sigma_{k|k-1,j}^{2,(m)}}{\sigma_{k|k-1,j}^{2,(m)} + \sigma_{\varepsilon}^{2}}. \tag{8} \]
The spectrogram estimate at frequency $\omega_j$ on interval $k$ is
\[ \widehat{f}_{k,j} = \frac{1}{M} \sum_{m=1}^{M} \big\| \Delta Z_{k|k,j}^{(m)} \big\|^{2}. \tag{9} \]
The parameters $\sigma_{v,j}^{2,(m)}$ and $\sigma_{\varepsilon}^{2}$ are estimated with the EM algorithm [15]. SSMT tracks nonstationarity in the data even with these time-invariant parameters, since $\sigma_{v,j}^{2,(m)} \neq 0$ and thus $\Delta Z_{k,j}^{(m)} \neq \Delta Z_{k-1,j}^{(m)}$.
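Putting Eqs. (6)-(9) together, the SSMT recursion can be sketched in a few lines of Python; this is our own minimal rendering, with a single scalar observation-noise variance for simplicity, not the authors' reference implementation.

```python
import numpy as np

def ssmt_filter(Y_F, sigma2_v, sigma2_eps):
    """Kalman recursion of Eqs. (6)-(8), run independently for each (taper m,
    frequency j), followed by the taper-averaged spectrogram of Eq. (9).
    Y_F: (K, M, J) complex eigen-coefficients; sigma2_v: (M, J) state variances;
    sigma2_eps: scalar observation-noise variance."""
    K, M, J = Y_F.shape
    z = np.zeros((M, J), dtype=complex)          # posterior mean Delta Z_{k|k}
    P = np.zeros((M, J))                         # posterior variance sigma^2_{k|k}
    spec = np.zeros((K, J))
    for k in range(K):
        P_pred = P + sigma2_v                    # sigma^2_{k|k-1}: random-walk prediction
        C = P_pred / (P_pred + sigma2_eps)       # Kalman gain, Eq. (8)
        z = z + C * (Y_F[k] - z)                 # Eq. (6); Delta Z_{k|k-1} = Delta Z_{k-1|k-1}
        P = (1.0 - C) * P_pred                   # Eq. (7)
        spec[k] = np.mean(np.abs(z) ** 2, axis=0)  # Eq. (9): average over tapers
    return spec
```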
III Adaptive SSMT
Although SSMT can model slowly time-varying spectral dynamics, it is too restrictive for highly fluctuating spectral dynamics. Such strong nonstationarity (or fluctuation) is common in EEG under external intervention to the brain or across changes of brain state during anesthesia or sleep [20]. Fig. 1 shows a snippet of human anesthesia EEG in which SSMT (Fig. 1(b)) cannot track the apparent dynamics shown by the MT approach (Fig. 1(a)). The proposed ASSMT (Fig. 1(c)), however, captures the spectral dynamics while removing non-relevant background activity.
[Fig. 1: Spectrograms of a human anesthesia EEG snippet estimated by (a) MT, (b) SSMT, and (c) ASSMT.]
The failure of SSMT is due to its time-invariant parameters, as the Kalman gain quickly converges to a steady-state value [21]. Denoting the observation prediction error as $e_{k,j}^{(m)} = Y_{k,j}^{F,(m)} - \Delta Z_{k|k-1,j}^{(m)}$ and the latent estimate change as $\Delta_{k,j}^{(m)} = \Delta Z_{k|k,j}^{(m)} - \Delta Z_{k|k-1,j}^{(m)}$, we can express Eq. (6) as $\Delta_{k,j}^{(m)} = C_{k,j}^{(m)} e_{k,j}^{(m)}$. If $C_{k,j}^{(m)}$ is low (as for SSMT in Fig. 1(b)), $\Delta_{k,j}^{(m)}$ fails to reflect fluctuations in $Y_{k,j}^{F,(m)}$. Consequently, $\Delta Z_{k|k,j}^{(m)}$ cannot reliably track the spectral dynamics.
To resolve this issue, adaptive SSMT (ASSMT) posits a time-varying state-space model that allows the Kalman gain $C_{k,j}^{(m)}$ to change adaptively. Specifically, we replace $\sigma_{v,j}^{2,(m)}$ in Eq. (5) with $\sigma_{v,k,j}^{2,(m)}$ to indicate the state variance's dependence on time. This prevents the Kalman gain from converging to a steady-state value, allowing the algorithm to produce more flexible spectrogram estimates. The observation noise variance $\sigma_{\varepsilon}^{2}$ remains constant across windows because we assume stationary background noise.
With the modified generative model, we now address when and how ASSMT tracks varying degrees of nonstationarity. We first quantify the notion of nonstationarity, and then propose an adaptive parameter estimation approach.
III-A Measure of nonstationarity
We define $b_{k,j}^{(m)} = \mathbb{E}\big[ \| Y_{k,j}^{F,(m)} - Y_{k-1,j}^{F,(m)} \|^{2} \big]$, the expected observation difference, as the measure of nonstationarity; it is a function of window, frequency, and taper. Intuitively, we use high spectral fluctuation as a proxy for high nonstationarity, flagged when $b_{k,j}^{(m)}$ exceeds a frequency-dependent threshold $\beta_j^{(m)}$.
To estimate $b_{k,j}^{(m)}$, we use an exponential moving average (EMA), a simple yet effective estimator of such expectations in the filtering literature [22]:
\[ \widehat{b}_{k,j}^{(m)} = \alpha \big\| Y_{k,j}^{F,(m)} - Y_{k-1,j}^{F,(m)} \big\|^{2} + (1 - \alpha)\, \widehat{b}_{k-1,j}^{(m)}, \tag{10} \]
where $0 < \alpha < 1$ is a smoothing factor chosen a priori. This approach allows us to use $\widehat{b}_{k,j}^{(m)}$ as a heuristic indicator of strong nonstationarity, decoupled from the generative model and hence from the estimation procedure of $\Delta Z_{k,j}^{(m)}$. The choice of $\alpha$ reflects the belief about the impulsiveness of the nonstationarity and the volatility of the state transitions: with large $\alpha$, $\widehat{b}_{k,j}^{(m)}$ is sensitive to instantaneous fluctuations.
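In code, one update of Eq. (10) is a single line; here is a minimal sketch (array shapes and names are our own):

```python
import numpy as np

def update_nonstationarity(b_prev, Y_F_k, Y_F_prev, alpha):
    """One step of Eq. (10): EMA of the squared observation difference.
    b_prev, Y_F_k, Y_F_prev are (M, J) arrays; alpha lies in (0, 1)."""
    return alpha * np.abs(Y_F_k - Y_F_prev) ** 2 + (1.0 - alpha) * b_prev
```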
III-B Estimation of parameters & adaptive thresholding
We now examine how to use $\widehat{b}_{k,j}^{(m)}$ to set the threshold and subsequently estimate the parameters. Using Eq. (4), we have
\[ \mathbb{E}\big[ \| Y_{k,j}^{F,(m)} - Y_{k-1,j}^{F,(m)} \|^{2} \big] = \mathbb{E}\big[ \| \Delta Z_{k,j}^{(m)} - \Delta Z_{k-1,j}^{(m)} \|^{2} \big] + \mathbb{E}\big[ \| \varepsilon_{k,j}^{F,(m)} - \varepsilon_{k-1,j}^{F,(m)} \|^{2} \big], \tag{11} \]
where we use the uncorrelatedness of the two differences. Next, we use the facts that 1) $\mathbb{E}\big[ \| \Delta Z_{k,j}^{(m)} - \Delta Z_{k-1,j}^{(m)} \|^{2} \big] = \sigma_{v,k,j}^{2,(m)}$ and 2) $\varepsilon_{k,j}^{F,(m)}$ and $\varepsilon_{k-1,j}^{F,(m)}$ are independent with variance $\sigma_{\varepsilon}^{2}$, hence $\mathbb{E}\big[ \| \varepsilon_{k,j}^{F,(m)} - \varepsilon_{k-1,j}^{F,(m)} \|^{2} \big] = 2\sigma_{\varepsilon}^{2}$, which leads to
\[ b_{k,j}^{(m)} = \sigma_{v,k,j}^{2,(m)} + 2\sigma_{\varepsilon}^{2}. \tag{12} \]
This establishes the connection between the nonstationarity metric $b_{k,j}^{(m)}$ and the two sources of variance in our model.
With Eqs. (10) and (12), we can estimate $\sigma_{v,k,j}^{2,(m)}$. For $\sigma_{\varepsilon}^{2}$, we use the estimate $\widehat{\sigma}_{\varepsilon}^{2}$ from SSMT. This leads to
\[ \widehat{\sigma}_{v,k,j}^{2,(m)} = \widehat{b}_{k,j}^{(m)} - 2\widehat{\sigma}_{\varepsilon}^{2}. \tag{13} \]
We further lower bound $\widehat{\sigma}_{v,k,j}^{2,(m)}$ for two reasons. First, we impose that the time-varying SNR, $\sigma_{v,k,j}^{2,(m)} / \sigma_{\varepsilon}^{2}$, be greater than the baseline SNR of the system, $\sigma_{v,j}^{2,(m)} / \sigma_{\varepsilon}^{2}$. Second, we require $\widehat{\sigma}_{v,k,j}^{2,(m)}$ to be nonnegative. Since SSMT estimates the baseline properties of the data, we set $\beta_j^{(m)} = \widehat{\sigma}_{v,j}^{2,(m)} + 2\widehat{\sigma}_{\varepsilon}^{2}$. This yields the estimation procedure
\[ \widehat{\sigma}_{v,k,j}^{2,(m)} = \max\big( \widehat{b}_{k,j}^{(m)} - 2\widehat{\sigma}_{\varepsilon}^{2},\; \widehat{\sigma}_{v,j}^{2,(m)} \big). \tag{14} \]
This obviates the need for EM beyond the initial phase of estimating $\widehat{\sigma}_{v,j}^{2,(m)}$ and $\widehat{\sigma}_{\varepsilon}^{2}$. Although EM can estimate time-varying parameters, it requires multiple forward/backward passes through the entire data, which is computationally expensive. In addition, with every new observation, the parameters in earlier windows would need to be re-estimated.
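A minimal sketch of the update in Eq. (14), with our own function name and with the baseline variances assumed to come from the initial SSMT/EM fit:

```python
import numpy as np

def adaptive_state_variance(b_hat, sigma2_v_base, sigma2_eps):
    """Eq. (14): remove the noise floor 2*sigma2_eps from the EMA estimate and
    floor the result at the SSMT baseline state variance sigma2_v_base."""
    return np.maximum(b_hat - 2.0 * sigma2_eps, sigma2_v_base)
```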
III-C Estimation of nonstationary spectra
We run the Kalman filter with $\widehat{\sigma}_{v,k,j}^{2,(m)}$ to estimate the spectrogram $\widehat{f}_{k,j}$. ASSMT thus operates in two different modes, depending on $\widehat{b}_{k,j}^{(m)}$. If $\widehat{b}_{k,j}^{(m)} > \beta_j^{(m)}$, ASSMT uses a larger state variance to track high nonstationarity. Given the Kalman gain in Eq. (8), an increase in $\sigma_{v,k,j}^{2,(m)}$, and hence in $\sigma_{k|k-1,j}^{2,(m)}$, leads to an increase in $C_{k,j}^{(m)}$. This agrees with our intuition, since we want the Kalman gain to increase so that the new observation $Y_{k,j}^{F,(m)}$ explains a greater portion of the estimate $\Delta Z_{k|k,j}^{(m)}$. For $\widehat{b}_{k,j}^{(m)} \leq \beta_j^{(m)}$, ASSMT simply uses the baseline $\widehat{\sigma}_{v,j}^{2,(m)}$ and thus operates with a fixed Kalman gain. This explains how ASSMT, with its adaptive state variance, is able to track strong nonstationarity.
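Combining the pieces, a single-pass ASSMT filter could be assembled as sketched below; this is our illustrative composition of Eqs. (6)-(10) and (14), not the authors' reference implementation, and the default smoothing factor is an arbitrary placeholder.

```python
import numpy as np

def assmt_filter(Y_F, sigma2_v_base, sigma2_eps, alpha=0.1):
    """Single-pass ASSMT: EMA nonstationarity metric (Eq. 10), adaptive state
    variance (Eq. 14), Kalman recursion (Eqs. 6-8), spectrogram (Eq. 9).
    Y_F: (K, M, J) eigen-coefficients; sigma2_v_base: (M, J) SSMT baseline;
    sigma2_eps: scalar noise variance; alpha: smoothing factor (placeholder)."""
    K, M, J = Y_F.shape
    z = np.zeros((M, J), dtype=complex)
    P = np.zeros((M, J))
    b_hat = np.zeros((M, J))
    spec = np.zeros((K, J))
    for k in range(K):
        if k > 0:  # EMA of the squared observation change, Eq. (10)
            b_hat = alpha * np.abs(Y_F[k] - Y_F[k - 1]) ** 2 + (1 - alpha) * b_hat
        sigma2_v_k = np.maximum(b_hat - 2.0 * sigma2_eps, sigma2_v_base)  # Eq. (14)
        P_pred = P + sigma2_v_k
        C = P_pred / (P_pred + sigma2_eps)        # time-varying Kalman gain
        z = z + C * (Y_F[k] - z)
        P = (1.0 - C) * P_pred
        spec[k] = np.mean(np.abs(z) ** 2, axis=0)
    return spec
```

Below the threshold the floor in Eq. (14) keeps the gain at its SSMT baseline (denoising mode); above it, the gain rises with the measured fluctuation (tracking mode).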
IV Results
We apply ASSMT to two datasets: 1) nonstationary simulated data and 2) EEG data from a patient under propofol anesthesia. We compare the spectrogram estimates of MT, SSMT, and ASSMT. All spectrograms are shown in dB.
IV-A Application to Simulated Data
We simulate the data as a superposition of an amplitude-modulated process and a frequency-modulated process. The amplitude-modulated process is generated from an AR(6) process centered at 11 Hz. The frequency-modulated process is generated from an ARMA(6,4) process with varying pole loci. The observations are the sum of the two processes and additive Gaussian noise, whose variance is chosen to achieve an SNR of 30 dB. More details can be found in [12]. For MT, we use 6-second windows with 50% overlap and Slepian tapers; for both SSMT and ASSMT, we use 6-second non-overlapping windows. For SSMT, we use the entire data to estimate the parameters. For ASSMT, we use the initial 300 seconds of the data to compute the baseline parameters, with a fixed smoothing factor $\alpha$.
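As a rough stand-in for this setup, and not the exact processes of [12], the sketch below generates an AR(6) oscillation centered at 11 Hz; the sampling rate, pole radius, and duration are assumed values for illustration only.

```python
import numpy as np
from scipy.signal import lfilter

def ar_oscillation(f0, fs, n, rho=0.98, pairs=3, seed=0):
    """Illustrative AR process with `pairs` repeated complex-conjugate pole
    pairs at f0 Hz (an AR(6) for pairs=3), driven by white Gaussian noise."""
    rng = np.random.default_rng(seed)
    pole = rho * np.exp(2j * np.pi * f0 / fs)
    a = np.poly(np.repeat([pole, np.conj(pole)], pairs)).real  # AR coefficients
    return lfilter([1.0], a, rng.standard_normal(n))

fs = 200.0                                     # assumed sampling rate (illustrative)
x = ar_oscillation(11.0, fs, n=int(600 * fs))  # 600 s of an 11 Hz AR(6) oscillation
```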
Fig. 2 shows the ground truth and the estimated spectrograms. Although MT captures the spectral dynamics reasonably well, it picks up background noise and spectral artifacts (i.e., vertical lines) and mixes adjacent frequency bands due to low resolution. SSMT (Fig. 2(c)) resolves these issues with sharper spectral localization and removal of the spectral artifacts, benefiting from the state-space prior.
ASSMT shares the artifact rejection and noise reduction properties of SSMT. Moreover, ASSMT performs better denoising, as evidenced in the low-power frequency bands. The Itakura-Saito (IS) divergence [23] between the ground truth and the spectrogram estimates confirms this observation, with ASSMT attaining the lowest value of the three methods. We attribute this difference to SSMT's state variance and Kalman gain being fixed at high values in part of the spectrum. Since ASSMT does not commit to a fixed value, it can adaptively change the Kalman gain in different regimes of the data.
[Fig. 2: (a) Ground-truth spectrogram of the simulated data and estimates from (b) MT, (c) SSMT, and (d) ASSMT.]
IV-B Application to EEG recorded during anesthesia
The EEG was recorded from a volunteer receiving propofol administered at an increasing infusion rate followed by a decreasing rate [20]. This protocol induces alternating states of unconsciousness (or brain states), resulting in varying levels of nonstationarity. We use Slepian tapers on non-overlapping windows. For SSMT, we estimate the spectrogram with parameters estimated from 1) the initial 4 minutes of data (Fig. 3(b)) and 2) the entire data (Fig. 3(c)). For ASSMT, we use the initial 4 minutes to compute the baseline parameters. Fig. 3 shows the estimated spectrograms.
[Fig. 3: Spectrograms of the anesthesia EEG, including (b) SSMT with parameters from the initial 4 minutes, (c) SSMT with parameters from the entire data, and (d-e) ASSMT.]
[Fig. 4: Time courses of the ASSMT state variance and Kalman gain at 10 Hz and 15 Hz, with the fixed SSMT Kalman gains shown for comparison.]
SSMT vs. ASSMT: SSMT based on the initial 4 minutes of data (Fig. 3(b)) produces low estimates of the state variance, owing to the absence of spectral dynamics in the baseline state. Although it removes background noise well, it consequently misses most of the strong spectral fluctuations, as evidenced by the extreme denoising of the spectra from 40 min to 120 min.
This can be mitigated by applying EM to a different section of the data, or to the entire data. With the resulting higher estimates of the state variance, SSMT better captures the strong nonstationarity (Fig. 3(c)). However, it fails to denoise the baseline state at the beginning of the recording due to the high Kalman gain, as observed similarly in the simulation. These results demonstrate a drawback of the time-invariant assumption: parameters estimated from different sections can yield markedly different spectrogram estimates.
In contrast, ASSMT adaptively denoises the spectrogram (Fig. 3(d-e)) even with the initial baseline parameters, the same setting in which SSMT failed (Fig. 3(b)). The time-varying nature of the model allows it to switch between a low state variance, appropriate for removing background noise, and a high state variance, appropriate for capturing the spectral fluctuations, without fully committing to either.
Another difference is the computation time. ASSMT requires only the initial parameter fit and a single filtering pass, whereas SSMT requires EM over the entire data followed by Kalman filtering; in our experiments this yields an appreciable reduction in computation time, making ASSMT more suitable for real-time application.
Effect of Kalman gain: To further understand ASSMT, we analyze the evolution of the state variance and the Kalman gain across time at two representative frequency bands (10 and 15 Hz), shown in Fig. 4. For both bands, the state variance, and consequently the Kalman gain, increases above the threshold in tandem with the large spectral fluctuations, as desired. In contrast, the SSMT Kalman gain stays fixed at either 0.1 (initial 4-min estimation; solid black in Fig. 4) or 0.9 (entire-data estimation; dotted black in Fig. 4).
Effect of $\alpha$: We observe that the denoising performance of ASSMT is robust to the choice of $\alpha$. To further understand how $\alpha$ affects the filtering/denoising operation, we analyze how $\widehat{b}_{k,j}^{(m)}$ and the Kalman gain change over time. ASSMT with a large $\alpha$ (red) is dominated by the heavy instantaneous fluctuation of the observations. In contrast, ASSMT with a small $\alpha$ (blue) shows a smoother state variance and Kalman gain; it starts adapting only when there is significant evidence of nonstationarity. This explains why the smaller $\alpha$ performs more denoising than the larger $\alpha$, as it is less susceptible to instantaneous fluctuations (40 min, 0 to 10 Hz) and background noise (75 to 95 min, 2 to 10 Hz).
V Conclusion
We introduced an adaptive state-space multitaper (ASSMT) framework for estimating the spectral dynamics of nonstationary time series. By relaxing the time-invariance assumption on the parameters and proposing an adaptive parameter estimation scheme, we demonstrated that ASSMT captures strong power fluctuations more reliably than SSMT.
References
- [1] T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Prentice-Hall, 2008.
- [2] J. S. Lim, Two-Dimensional Signal and Image Processing. Prentice-Hall, 1990.
- [3] W. J. Emery and R. E. Thomson, Data Analysis Methods in Physical Oceanography. Elsevier, 2001.
- [4] W. Truccolo, U. T. Eden, M. R. Fellows, J. P. Donoghue, and E. N. Brown, “A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects,” Journal of Neurophysiology, vol. 93, no. 2, pp. 1074–1089, 2005.
- [5] P. Mitra and H. Bokil, Observed Brain Dynamics. Oxford University Press, 2007.
- [6] R. Dahlhaus, “Fitting time series models to nonstationary processes,” The Annals of Statistics, vol. 25, no. 1, pp. 1–37, 1997.
- [7] T. H. Koornwinder, Wavelets : An Elementary Treatment of Theory and Applications. World Scientific, 1995.
- [8] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-time Signal Processing. Prentice-Hall, Inc., 2009.
- [9] D. B. Percival and A. T. Walden, Spectral Analysis for Physical Applications. Cambridge Univ Press, 1993.
- [10] S. Kim, M. K. Behr, D. Ba, and E. N. Brown, “State-space multitaper time-frequency analysis,” Proceedings of the National Academy of Sciences, vol. 115, no. 1, pp. E5–E14, 2018.
- [11] D. Ba, B. Babadi, P. L. Purdon, and E. N. Brown, “Robust spectrotemporal decomposition by iteratively reweighted least squares,” Proceedings of the National Academy of Sciences, vol. 111, no. 50, pp. E5336–E5345, 2014.
- [12] P. Das and B. Babadi, “Dynamic Bayesian multitaper spectral analysis,” IEEE Transactions on Signal Processing, vol. 66, no. 6, pp. 1394–1409, 2018.
- [13] A. H. Song, S. Chakravarty, and E. N. Brown, “A smoother state space multitaper spectrogram,” in 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2018, pp. 33–36.
- [14] A. H. Song, D. Ba, and E. N. Brown, “PLSO: A generative framework for decomposing nonstationary time-series into piecewise stationary oscillatory components,” in Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, vol. 161, 2021, pp. 1371–1381.
- [15] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 39, no. 1, pp. 1–22, 1977.
- [16] S. Adak, “Time-dependent spectral analysis of nonstationary time series,” Journal of the American Statistical Association, vol. 93, no. 444, pp. 1488–1501, 1998.
- [17] D. R. Brillinger, Time Series: Data Analysis and Theory. SIAM, 1981.
- [18] D. Thomson, “Spectrum estimation and harmonic analysis,” Proceedings of the IEEE, vol. 70, no. 9, pp. 1055–1096, 1982.
- [19] B. Babadi and E. N. Brown, “A review of multitaper spectral analysis,” IEEE Transactions on Biomedical Engineering, vol. 61, no. 5, pp. 1555–1564, 2014.
- [20] P. L. Purdon, E. T. Pierce, E. A. Mukamel, M. J. Prerau, J. L. Walsh, K. F. K. Wong, A. F. Salazar-Gomez, P. G. Harrell, A. L. Sampson, A. Cimenser, S. Ching, N. J. Kopell, C. Tavares-Stoeckel, K. Habeeb, R. Merhar, and E. N. Brown, “Electroencephalogram signatures of loss and recovery of consciousness from propofol,” Proceedings of the National Academy of Sciences, vol. 110, no. 12, pp. E1142–E1151, 2013.
- [21] R. H. Shumway and D. S. Stoffer, Time Series Analysis and Its Applications. Springer, 2017.
- [22] J. G. Proakis and D. K. Manolakis, Digital Signal Processing. Prentice-Hall, Inc., 2006.
- [23] F. Itakura and S. Saito, “A statistical method for estimation of speech spectral density and formant frequencies,” Electronics and Communications in Japan, vol. 53-A, no. 1, pp. 36–43, 1970.
A. Application to EEG recorded during anesthesia
We apply ASSMT to an EEG recording of another volunteer receiving propofol anesthesia. All parameter settings for SSMT and ASSMT are the same as those used in the main paper.
[Figure: Spectrogram estimates for the second anesthesia EEG recording.]