Spatially Penalised Registration of Multivariate Functional Data
Abstract
Registration of multivariate functional data involves handling of both cross-component and cross-observation phase variations. Allowing for the two phase variations to be modelled as general diffeomorphic time warpings, in this work we focus on the hitherto unconsidered setting where phase variation of the component functions are spatially correlated. We propose an algorithm to optimize a metric-based objective function for registration with a novel penalty term that incorporates the spatial correlation between the component phase variations through a kriging estimate of an appropriate phase random field. The penalty term encourages the overall phase at a particular location to be similar to the spatially weighted average phase in its neighbourhood, and thus engenders a regularization that prevents over-alignment. Utility of the registration method, and its superior performance compared to methods that fail to account for the spatial correlation, is demonstrated through performance on simulated examples and two multivariate functional datasets pertaining to EEG signals and ozone concentrations. The generality of the framework opens up the possibility for extension to settings involving different forms of correlation between the component functions and their phases.
Keywords:
Elastic metric; Functional random field; Phase trace-variogram; Warping invariance.
1 Introduction
Modern functional datasets, such as longitudinal records, medical imaging signals, geometric shapes, contain confounded amplitude and phase variations (Srivastava et al., 2011). The adverse effects of ignoring the phase variation during their analyses are well-understood (Marron et al., 2015). The process of registration, also commonly referred to as alignment or amplitude-phase separation, enables the practitioner to decouple the two sources of variation and account for them in further analyses. For multivariate functional data consisting of multiple correlated, univariate functional components , the notion of phase variation can be decomposed into two types: (i) cross-observation phase, which is common across all component univariate functions within an observation , and (ii) cross-component phase within each observed unit.
Registration procedures for multivariate functional data are driven by assumptions on the two types of temporal variation. At the two extremes are procedures which assume that cross-component variation within each are either uncorrelated or identical. The former case is quite commonly considered by researchers in neuroimaging (Makeig et al., 2007; Tsai et al., 2014; Zhao et al., 2020), motivated mainly by the simplicity of independent componentwise registration. The latter case, referred to as universal registration, is common in shape analysis of -dimensional curves (Kurtek et al., 2012), wherein one assumes that cross-component phase variation does not exist and a common warping function is estimated for all components within the same observation (Olsen et al., 2016). Thus, neither of the two extreme cases directly addresses the registration problem when cross-component variation is non-negligible.
Literature on registration methods for multivariate functional data that lie between the two extreme cases is sparse. Noteworthy exceptions within the statistics literature are recent work by Carroll et al. (2021), Carroll & Müller (2021) and Park & Ahn (2017), where specific forms of cross-component temporal variation were considered. Specifically, the first two works restrict attention to the situation where all component functions for are assumed to have the same shape: for each observation , Carroll et al. (2021) assume that time-warped components arise through time shifts of the same function, say , while Carroll & Müller (2021) extend this to the case where the time shift map is replaced by a general diffeomorphic warping of . Motivated mainly by clustering, Park & Ahn (2017) proposed a conditional observation-specific registration procedure to extract relevant features for clustering.
It is quite common to encounter multivariate functional datasets wherein the component functions do not possess the same shape and/or when correlations between component functions need to be explicitly incorporated into the registration procedure. As an example, consider data generated from electroencephalogram (EEG) signals, specifically pertaining to event-related potentials measured during EEG tests. Event-related potentials are small voltages that reflect the electrical activity on the scalp in response to specific stimuli received by study participants (Sur & Sinha, 2009). For the th observation, EEG signals are densely recorded on a certain time domain using a set of electrodes placed at different locations on the scalp, and together comprise a multivariate functional observation . In the left panel of Figure 1, we display the observed EEG signals at the (projected) two-dimensional electrode locations on a toy map of the scalp. The entire collection of signals recorded during a single EEG trial is shown as a single slide and corresponds to an observed multivariate functional data unit.

Evidently, the component functions and at electrode locations and during the th trial are spatially correlated due to functional connectivity between brain regions. Latency of the brain’s responses to stimuli varies across trials for a single subject (and among different subjects) resulting in cross-observation phase variation (Wang et al., 2001). At the same time, different response lags of different brain regions to the presented stimuli (Stam et al., 2007) result in cross-component phase variation. The right two panels of Figure 1 offer an illustration of cross-observation (bottom) and cross-component (top) phase variation among EEG signals. To account for the two different types of phase variation in multi-trial EEG, an appropriate registration procedure that accounts for spatially-correlated cross-component phase is required. Specifically, the procedure should enable simultaneous registration of all components , while taking into account the dependence structure between the cross-component phases. This cannot be achieved by implementing independent registration of for each component , since such a procedure results in additional cross-component phase variation in the estimated average components, since each component is treated independently. We note this phenomenon in the right panel of Figure 9 (and the associated discussion) in Section 6.1 during our detailed analysis of the EEG data.
Motivated by the type of registration task associated with the EEG data, we propose a registration procedure that exploits correlated phase variation between component functions of multivariate functional data and offers a compromise between the two extreme cases of independent componentwise and universal registration methods alluded to above. Our focus is on multivariate functional data with component functions whose overall phase variation, represented through a time-warping function , is a combination of cross-observation phase and spatially correlated cross-component phase ( is function composition). The flexibility afforded, and desired, by such a general phase specification while registering multivariate functional data with correlated components, such as the EEG data, comes at a cost: the individual phase components and cannot be decoupled and estimated individually, and incorporating spatial information present in the latent when estimating is hugely challenging. In view of this, our main contributions are as follows.
-
(i)
We propose a metric-based penalized registration method for multivariate functional data with a penalty term defined using the spatial correlations between the cross-component phases , using which the overall phases are estimated by eliminating cross-observation phase variation .
-
(ii)
The penalty term is defined using a spatially weighted combination of the (estimated) cross-component phases , and is, owing to the invariant property of the metric to (simultaneous) time warping, impervious to the cross-observation phases ; this allows us to entirely circumvent having to estimate the .
-
(iii)
With respect to cross-observation registration, we demonstrate clear superiority of our method over ones that fail to account for the spatial correlation under a variety of simulation settings and two real data examples.
The rest of this paper is organized as follows. Section 2 details the registration problem for multivariate functional data whose component functions have spatially correlated phase variation. Section 3 briefly reviews the geometric elastic functional data analysis framework, including univariate pairwise and multiple registration, and registration of multivariate functions with common cross-component phase. Section 4 details the proposed spatially penalised registration objective function that accounts for spatial correlation in cross-component phase, and provides an algorithm (Algorithm 1) for its optimization. Section 5 presents results of simulation studies; in Section 6, we apply the proposed registration approach in two real data scenarios.
2 The registration problem
For a positive integer , denote by the set . The notation will be used in place of . We can view multivariate functional data with with cross-observation and correlated cross-component temporal variation as being generated from the model
(1) |
where , with , is a deterministic mean/template multivariate function, are realisations of an error process with spatially correlated component functions , and is a vector of monotone increasing, end-points preserving, time warping functions assuming values in , where is the time derivative of ; here, represents componentwise warping, where denotes function composition. Note that is a group with operation , identity element , and the function inverse as the group inverse. We will interchangeably refer to as a warping function or phase depending on context.
The characterising feature of the problem is that, for , the warping functions , where are spatially correlated warping functions at spatial locations within a spatial domain , with for each . The phase of a component function is thus represented as a combination of warping functions corresponding to cross-observation and cross-component temporal variations. The index will be used to denote the spatial location , i.e., will be used in place of , and the two notations will be used interchangeably.
Under the above setup, we can interpret as representing cross-observation temporal variation within and as representing spatially correlated cross-component temporal variation within . The two-fold objective is to, jointly:
-
1.
register by estimating in a manner that ensures for every , the warping functions and in nearby spatial locations and in are ‘similar’, owing to ‘similar’ cross-component warpings and ;
-
2.
estimate components of the template .
In the context of the EEG application, e.g., in the th trial for a single subject, we desire signals and obtained from spatially proximate electrode locations and to have similar phase variation and , with respect to template components and . The requirement is compatible with findings by Stam et al. (2007) on different response lags of spatially disparate brain regions to presented stimuli.
When the template is unknown, as is typical in practice, for each pair , the warping functions and are individually not identifiable in a nonparametric (infinite-dimensional) specification of the class , and error without restrictions (e.g., rank-one error model). It is hence not possible to estimate both and since their respective variabilities are confounded with the error ; this implies that it is not possible to decompose the overall phase variation given by in into the cross-observation and cross-component phase variations.
In view of this, our registration procedure will first estimate the cross-component warping functions for each observation and then use them to estimate . Evidently then, the cross-component warping functions will depend on . The key challenge arises from the need to iterate between computing the template components and warping functions by explicitly incorporating spatial dependence between the, unobserved but observation dependent, cross-component warping functions . In Section 4.2, we will consider a simplified geometry of under a suitable transformation, that will further clarify the dependence of the cross-component phases on the observations.
3 Overview of elastic functional data registration
The proposed algorithm is based on a componentwise spatially-penalized registration of using the metric-based elastic functional data analysis framework (Srivastava et al., 2011; Srivastava & Klassen, 2016) of univariate functions. The metric will dually enable us to propose a novel penalty term that quantifies the spatial correlation between the unobserved cross-component phases , unaffected by the cross-observation phase . We begin with a brief review of the elastic framework, and for later comparison with the proposed algorithm, also review universal registration of curves under the elastic framework (Srivastava & Klassen, 2016) where all components are assumed to have identical temporal variation.
3.1 Univariate functions
We consider the representation space of univariate functional data objects to be . The group of warping functions representing phase is . For any , , the warping of by is given by the group action of composition, . The group-theoretic formulation of phase enables a definition of the amplitude of a function as the equivalence class , known as its orbit under the action of ; thus, has the same amplitude as for each . The amplitude space then is the quotient space .
Separating amplitude and phase requires a metric on the amplitude space . A convenient way to define one is through a metric on that is invariant to simultaneous warping: for every . The standard metric fails to be invariant and Srivastava et al. (2011) thus proposed to use the extended Fisher-Rao (eFR) metric. While direct use of this metric for amplitude-phase separation is difficult in practice, the square-root slope transform can be used to ‘flatten’ the complicated eFR metric on to the standard metric on the transformed space. The transform maps ( is the time derivative of ). Given , is bijective with inverse . Henceforth, for any , we will refer to as its square-root slope function (SRSF).
The transformed space is a subset of , and is denoted by . Under , the eFR metric on maps to the standard metric on , and thus analysis of SRSFs can be carried out using standard Hilbert space machinery. Warping of by induces the warping action on equipped with the metric, and the action is by isometries, i.e., for and , , since for every ; denotes the norm.
3.1.1 Pairwise registration
Given two functions , amplitude and phase separation through pairwise registration or alignment of to , or vice versa, is formulated as the determination of the relative phase of with respect to ():
(2) |
The minimization problem in (2) is typically solved using the dynamic programming algorithm. To further regularize pairwise registration, one can instead solve the penalized optimization problem given by
(3) |
where is the regularization parameter. The penalty is defined as the squared distance between the SRSFs of and the identity warping function , where . It is evident that, depending on the magnitude of , this penalty forces the estimated phase to be close to the identity element, i.e., no warping.
3.1.2 Multiple registration
Amplitude-phase separation for a sample of functions is carried out with respect to a common template function that must also be estimated. Let denote the corresponding SRSFs. A template for multiple registration is estimated via
(4) |
Then, multiple registration is carried out via pairwise registration of each of with respect to using (2), resulting in the relative phases . The minimizer of (4) is not unique, with any element of the orbit resulting in the same value of the cost function due to the isometric action of . Thus, for identifiability, we select such that the relative phases average to ; see Srivastava et al. (2011) for details on how to compute an average of a sample of warping functions.
3.2 Componentwise and universal registration
Let denote the the space of multivariate functional data. Each multivariate function contains univariate components . Independent, componentwise registration of multivariate functional data is to apply the multiple registration given by (4) to functions in each component independently. This fails to account for any correlation between and for every and .
At the other end of the spectrum is universal registration, which treats each as a parameterised curve in , and thus assumes the same relative phase for all components. Registration under such a setup is again available through the elastic framework under a suitable transformation. With a slight abuse in notation, let denote the SRSF of , where is the componentwise time derivative of and is the Euclidean norm in ; each SRSF in this case is a function and the space of such SRSFs is denoted by . Then, given and , the relative phase of with respect to is given by
(5) |
where again denotes the Euclidean norm in and is applied componentwise. A penalized version of registration can also be implemented in this case by appropriately adapting the optimization problem defined in (3). Further, multiple registration, via estimation of a template , and pairwise registration of each function to via (5), follows the approach defined for univariate functional data in Section 3.1.2.
4 Registration of multivariate functions with spatially dependent cross-component phase
We propose a penalized multiple registration method for multivariate functional data wherein cross-component phase in each observation is spatially correlated. Let denote a multivariate functional data sample; we assume that are independent. Each component function is assumed to be an element of , and the components , for a fixed , have spatially dependent phase. Further, let denote the SRSFs of the component functions in each observation. Registration of multivariate functional data requires estimation of the overall phase (composition of cross-observation and cross-component phase) for each component in each observation, . This facilitates simultaneous synchronization of the component functions across and .
As a compromise between the two extreme settings of independent componentwise and universal registration described in Section 3.2, we propose a spatially penalized registration approach that takes spatial cross-component phase correlation into account, but allows each component to have its own phase. As described in Section 2, within each the dependence between the component functions arises through spatial correlation in the cross-component phases .
4.1 Spatially penalised componentwise registration
Our approach is to carry out cross-observation registration through a modification of the multiple registration procedure in (4) using a spatially-informed penalty term. For a fixed function component , the registration of functions , with SRSFs , amounts to determination of . However, for each , we note that is spatially correlated with , through the spatial correlation between the latent and .
We thus consider a penalised modification of (4) wherein the penalty term for each depends on the phases . Our choice is to define the penalty term using (an estimate of) the conditional mean of given : we wish to discourage from assuming values that are far away from its spatially weighted conditional mean. We accordingly consider the following optimization problem for each component :
(6) |
where is the template for registration in component , is a penalty parameter, and is a distance on ; the warping function is an estimate of the conditional mean of given , which we will define next in Section 4.2.
The first term in the objective function in (12) provides a measure of synchronization for component , across observations , with respect to the template . The second term, a penalty on phase, measures the distance between the estimated phase and a target determined by the correlated phase in the other components. The regularization penalty attempts to preserve the phase-induced correlation structure in the aligned components. If the function components are assumed to be independent, and , the identity warping function, for all and , the proposed approach is equivalent to independent univariate penalized registration problems, as specified in (3), with acting as the template for each component.
4.2 Penalty using spatial model for cross-component phase
We first discuss the choice of distance on in the penalty term in (6). Elements of the function space can be viewed as differentiable, increasing distribution functions of random variables on , and thus constitute a nonlinear, convex set. A simplified geometric structure compatible with the norm preserving action under the operation is available under the SRSF transform, under which . The SRSF map is bijective, and each in its image corresponds to a square-root probability density since . As a consequence, the set can be identified with the positive orthant of the unit sphere in . While the natural candidate for distance on is the intrinsic arc-length distance on , we use the extrinsic distance
where is the standard norm. Our choice is linked to the choice of in the penalty term. Observe that with defined as above on , we have that, for every ,
(7) |
and acts on itself by isometries under the SRSF map. We thus combine the cross-observation phase and cross-component phase using their SRSFs to obtain the SRSF of as
(8) |
As described in Section 2, our registration procedure will estimate by first estimating the spatially correlated cross-component phases for each observation since and are individually not estimable; in other words, we wish to induce dependence in the generative model (1) between the confounded cross-component and cross-observation phases. We achieve this using the SRSF transform of in the manner defined in (8).
Under the SRSF transform of , the population conditional expectation is an ideal choice for . As with spatially correlated real- or vector-valued observations in traditional statistics modeled using random fields, a viable estimate of the conditional expectation in this context is the spatial interpolant or kriging estimate at location of a functional random field
assuming values in , conditioned on its values at locations . In other words, the random field is derived by time-warping the values assumed by another functional random field with a fixed .
Computing the kriging estimate of at requires estimation of . Suppose that the components are indexed by spatial locations . Denote by the SRSFs of the warping functions , which, for every , are viewed as values assumed by the random field defined above at spatial locations . Then, under the assumption that the functional random field is second-order stationary and isotropic, we can consider the phase trace-variogram, defined by Guo et al. (2022) as
using Fubini’s theorem, where and is the Euclidean norm on the spatial domain . For a fixed observation , the estimator of is given by
(9) |
where . For irregularly spaced data, can be modified to for a small .
Despite the fact that the spatially correlated cross-component phases are unobserved and confounded with the cross-observation phases , we can access their correlation structure by estimating using and . To see this, note that due to the isometry property in (7),
and thus equals the phase-trace variogram of the functional random field ; a similar argument applies to the estimate as well. We thus note that the phase-trace variogram is invariant to (simultaneous) warping (Guo et al., 2022, Lemma 1), and this crucial observation motivates our use of the extrinsic distance on and the functional random field with values in .
We now move on to computing the kriging estimate using the variogram estimate . For component in observation , given phase for the other components , the conditional mean phase is given by the weighted average
(10) |
The coefficient vector is implicitly defined as the minimizer of
(11) |
Guo et al. (2022) show that can be estimated using a quadratic optimization problem that depends on the trace-variogram and pairwise spatial distances , for . The estimator is subsequently normalized using to correspond to a valid warping function.
4.3 Registration algorithm and implementation details
The optimization problem for registration in (12) is solved in an iterative fashion in Algorithm 1 given below, where tools from Section 4.2 are used to model the spatially dependent cross-component phase within each observation. Figure 2 provides a high-level diagrammatic representation of Algorithm 1.

More specifically, given component functions of multivariate functional data , we first transform them to obtain their SRSFs . Then, for a fixed penalty parameter , Algorithm 1 can be decomposed into two blocks:
-
Initalization:
-
(i)
Obtain initial template component functions by performing independent componentwise registration of for each given by (4);
-
(ii)
initialize the phase functions to the identity warping by initializing their SRSFs to ;
-
(iii)
for each observation , implement multiple registration of using (4) to estimate cross-component phase ;
-
(iv)
for each observation , compute weights required for the kriging estimate using the estimated phase trace variogram .
-
(i)
-
Iteration: Starting with the initial values, obtain phases and template components by iterating and updating, until convergence.
-
(i)
Compute in penalty using kriging coefficients { and phases ;
-
(ii)
solve ;
-
(iii)
compute template components by averaging over .
-
(i)
Algorithm 1 contains nested while loops (lines 12-26). The inner loop indexed by (lines 15-21) updates the estimated warping functions given the templates for all components. The outer loop index by (lines 12-26) updates the template for each component given the estimated warping functions. The convergence of Algorithm 1 is assessed empirically using simulation studies in Section 5.
5 Simulation studies
We conduct simulation studies to assess the performance of the proposed multiple registration approach for multivariate functional data. Specifically, we estimate the componentwise template functions using three different approaches: (1) the proposed method, which utilizes spatial cross-component phase correlation in penalized registration (Section 4 and Algorithm 1), (2) independent componentwise registration (Section 3.1 and Section 3.4 in (Srivastava et al., 2011)), and (3) universal registration wherein each component is assumed to have the same phase (Section 3.2).
5.1 Data generating model
We consider multivariate functional data wherein dependence between function components is induced by spatial correlation. In this setting, each component of a simulated observation is indexed by a spatial coordinate ; component for observation is denoted by . Component functions of the data are generated using the model:
In the model, denotes the template for component indexed by spatial location , which is the object that we wish to estimate through a registration procedure. The random functional error and cross-component warping functions are also indexed by the spatial location. The warping function denotes the observation specific warping function (common across all components).
We consider two simulation settings characterised by choice of the component template functions (see below). For both settings, the random phase components are composed of two warping functions: , the phase that is common across components for observation , and , the cross-component phase within observation . Such a nested structure is similar to the structure in a mixed-effects model. The warping functions are taken to be cumulative distribution functions (CDFs) of a beta distribution, , with random parameter . The cross-component phase is also the CDF of a beta distribution, , where follows the correlated uniform distribution on , independently for every ; a sample from the correlated uniform distribution can be generated by transforming a correlated multivariate normal sample with mean and Matern covariance . The parameters and control magnitudes of the cross-observation and cross-component phase variations; the range parameter is set to , where is the maximum spatial distance between the simulated sites .
Simulation setting 1: In this setting the component template functions have a specific bimodal form. We consider components where the spatial coordinates are generated using a uniform distribution on . The componentwise template functions are generated as , where are independently sampled from a multivariate normal distribution with mean vector and Matern covariance ; here, is the scale parameter, is the range, and the smoothing parameter is fixed to . The random errors are generated independently for each observation in a pointwise manner: for each value of we generate a sample from the multivariate normal distribution with mean vector and covariance . The other covariance parameters are set to , and .
Simulation setting 2: In this setting, in order to replicate the structure of EEG data, the component template functions are chosen to possess the (typical) shape of EEG signals, i; thus each component is treated as a signal from an EEG electrode. We consider 16 electrode locations on the scalp with three-dimensional coordinates. To imitate the shape variation in EEG signals from different brain regions, we use a more flexible data generating model. The componentwise template functions are generated using 10 B-spline basis functions, , of order 4 on the interval : . The B-spline basis coefficients are generated independently from a multivariate normal distribution with mean vector and Matern covariance . Random functional errors, , are generated in the same way as in Simulation setting 1, but under a lower signal-to-noise ratio: the covariance parameters are set to , or and .
5.2 Assessing performance of template estimation
Denote by and the aligned multivariate functions and their SRSFs obtained after registration. We define two performance metrics that quantify the accuracy of template estimation based on the registered functions:
where is the true template for component . MSE, based on the norm on , is sensitive to phase errors, while QMSE, based on the eFR metric on , is sensitive to shape differences (Srivastava & Klassen, 2016).
We use and , respectively, for the first and second simulation settings described in Section 5.1; we replicate each simulation 50 times. When implementing the proposed method, we use 4-fold cross-validation to select a value for the regularization parameter . Denote by the estimated template function for component based on data in all folds except the th one; here, we use to denote the set of observation indices in fold . Then, the value for the regularization parameter is chosen by minimizing the criterion
where . That is, we align the functions in the validation set (fold ) to the template function estimated using the training set (all folds except ). The selected value for the regularization parameter is then used for registration of all observations.
Results for each simulation setting described in Section 5.1 are summarized in the boxplots shown in Figures 3 and 4, respectively. In all cases, we also report values of the error metrics when no registration is applied to the data; this serves as the baseline.


Simulation setting 1: We observe in Figure 3 that both componentwise registration and the proposed method clearly have smaller MSEs than universal registration; further, the mean MSE for the proposed method (0.083) is slightly lower than the mean MSE for the componentwise method (0.099). With respect to QMSE, the performance of componentwise registration is clearly worse than the proposed approach. Interestingly, universal registration outperforms componentwise registration in terms of median QMSE.
The improvement in registration quality with the proposed method over the componentwise approach is due to the addition of the regularization term. Componentwise registration tends to ‘overalign’ features of the component functions, e.g., peaks and valleys, that are due to the random error, i.e., the estimate of cross-component phase is too flexible. On the other hand, the universal approach performs poorly since all components within an observation are constrained to be identical, i.e., the estimate of cross-component phase is too restrictive. The compromise achieved by the proposed method between the componentwise and universal methods by introducing a regularization penalty that synthesizes spatial phase information from all components within an observation results in improved registration. Finally, we note that the performance of the proposed method is consistently better than the other two approaches, even if no cross-component phase variation exists in the data (the case when ).
Simulation setting 2: Recall that the setting here is designed to replicate the structure in EEG data. The low signal-to-noise ratio presents significant challenges for all of the registration procedures. Figure 4 shows that the componentwise approach, expectedly, is the most sensitive to the random function errors due to its lack of regularization. Despite the good performance of this method in the first simulation setting, the low signal-to-noise ratio when significantly deteriorates the template estimates, especially with respect to QMSE. In fact, when , the advantages of registration are not obvious since all of the procedures are essentially driven by random noise; regularization with the spatial penalty provides some help, but there is insufficient structure in to inform estimation of the conditional mean phase based on cross-component spatial correlation. In such situations, we recommend slight smoothing of the observed functions prior to registration using any of-the-shelf smoothing procedure (e.g., splines). To examine value in the pre-smoothing, we report additional results in Figure 4 after applying smoothing splines, with a small value for the smoothing parameter, to the simulated data when . The presented results show that the proposed registration method produces more accurate and stable template estimates than the other two registration approaches.
In Figure 5 we provide results of template estimation for a single component in one simulation run when . The simulated data for this component is shown in grey with the ground truth template in black. The estimated templates are shown in red. In Panel 1 of Figure 5, it is evident that when no registration is performed, the resulting template underestimates relevant shape features, e.g., the two peaks and one valley between and ; in Panel 2, componentwise registration results in a template that generates additional shape features due to ‘overalignment’ of noise, e.g., the two small peaks and three valleys between and ; in Panel 3, universal registration results in a template that has the ‘correct’ shape, but that is out of phase with respect to the ground truth template, e.g., there is significant lag between and ; Finally, in Panel 4, we note that the proposed approach results in a template that is very similar in shape to the ground truth and is largely in phase.

5.3 Assessing convergence of Algorithm 1
We empirically assess convergence properties of Algorithm 1 by monitoring the average cost function and the relative change in estimated phase components with respect to the iteration counter. We use a single simulation example generated under Simulation setting 1 with . To capture the complete trend of convergence, we remove the stopping rule in the algorithm and complete a fixed number of iterations. Recall that the algorithm has nested while loops. The inner loop updates the template for each component, while the outer loop updates the estimated warping functions given the templates. The number of iterations for the outer loop is set to 20; the number of iterations for the inner loop is set to 10. In our evaluation, the overall number of iterations is defined as the cumulative count of the iterations completed in the inner loop.
The first quantity of interest is the average of the cost function in (12) across all components . The second is the relative change in the estimated warping functions between iterations: . The left panel in Figure 6 is a plot of the average cost function after each inner (red) and outer (black) loop update with respect to the iteration counter. It appears that the algorithm has converged after approximately iterations. The right panel in Figure 6 plots the value of with respect to the iteration counter. Overall, the value of decreases with jumps occurring after an update of the component templates. The magnitudes of these jumps gradually converge to 0 as the component templates also converge.

5.4 Comparison with regularization towards identity warping
The main advantage of the proposed registration method is the regularization of estimated warping functions by use of the spatial penalty term. Regularization reduces the complexity and variance of the estimated phase component and guards against ‘overalignment’ of function features that may be due to noise. In existing functional data analysis literature, most regularized registration methods use a regularization penalty that forces estimated warping functions to be close to the identity element, with respect to some prespecified distance on (Srivastava & Klassen, 2016), e.g., the squared extrinsic Fisher-Rao distance given by , where and ; see Section 3.1.1. However, in the case of multivariate functional data wherein phase is spatially dependent, the correlation structure in cross-component phase enables us to regularize estimated warping functions toward data-driven targets rather than the fixed identity warping function. To assess this benefit, we compare the estimated warping functions generated using the proposed approach to a regularized version of componentwise registration as specified by (3).
(a) | (b) |
---|---|
![]() |
![]() |
Figure 7(a) presents estimated warping functions for all components of the same observation in a randomly selected simulation run under Simulation setting 1. We consider three different values for the regularization parameter: , , and . The results of regularized componentwise registration are given in the top row while the results generated by the proposed approach are shown in the bottom row.
The magnitude and roughness of the cross-component phase estimated using regularized componentwise registration are penalized toward the identity warping; in regularized componentwise registration, when , the estimated warping functions all converge to the same identity element. On the other hand, the proposed approach allows estimated warping functions to converge to a more flexible target (see bottom right panel of Figure 7(a)) when increases. This is because the kriging estimate in the penalty can vary from observation to observation. This phenomenon is also observed in Figure 7(b) where we fix a component and display the estimated cross-observation phases. The regularized componentwise registration forces all warping functions towards the identity when is greater than 1. These results imply very small cross-observation phase variation, which is not true for the simulated data. On the other hand, the proposed penalized registration captures the cross-observation phase variation even if is fairly large. Based on these findings, we see that the proposed approach reduces the discrepancy of estimated warping functions for components in each observation and preserves the magnitude of estimated cross-observation phases. The proposed approach using the spatial penalty results in warping functions with appropriate smoothness and magnitude, but does not force them toward identity regardless of cross-observation phase variation.
6 Real data analysis
We apply the proposed penalized registration method to two multivariate functional datasets. The first consists of multi-trial EEG signals for alcoholism patients discussed earlier, and the second considers annual ozone concentration functions observed at different locations in a small area in northern California. In both cases, we use 4-fold cross-validation to select a value for in the proposed registration approach.
6.1 Electroencephalogram data
We analyze an EEG dataset arising from a study that examined EEG correlates of genetic predisposition to alcoholism (Bache & Lichman, 2013). The study resulted in EEG measurements recorded at 64 electrodes placed on the subjects’ scalps, with a sampling rate of 256 Hz (3.9-msec epoch) for 1 second. Each subject was exposed to a stimulus in each trial and completed multiple trials where different stimuli were shown. The study used standard 64-channel electrode placements as defined by the American Electroencephalographic Association. In our analysis, we use EEG signals from 50 trials for one subject, and treat each trial as an independent multivariate functional observation. Prior to analysis, each EEG signal was smoothed using smoothing splines with a small parameter value of . To compare the performance of different methods, we register the 50 trials using the componentwise, universal and proposed registration approaches. After registration, we average the aligned signals recorded by each electrode across trials to estimate the mean event related potentials.
For easy comparison across the three methods, we display the registration results for electrode AFZ in Figure 8. As in the simulations, we also display the result of averaging across trials when no registration is applied to the data as a baseline (Panel 1). The electrode AFZ was chosen as an illustration since many of the EEG signals recorded there contain a pronounced activation peak between and . We observe similar results in this real EEG data example, based on the different registration methods, as in the simulation studies. In Panel 1, when no alignment is applied, the pronounced activation peak is almost completely nonexistent in the average signal shown in red; thus, registration of the data is necessary prior to averaging. In Panel 2, the componentwise registration method tends to ‘overalign’ many of the relatively small modes present in the signals that are likely due to noise. This results in an average that has many small fluctuations that are indistinguishable from the pronounced activation peak. In Panels 3 and 4, the universal and proposed methods are effective in capturing the large peak in the respective average signals and tend to produce fewer small fluctuations prior to . Essentially, these two methods mitigate the contribution of noise and result in average signals that reflect the most prominent feature in the given data. The magnitude of the pronounced activation peak is similar in the two averages, but it is stretched over a longer part of the domain in the average produced using the universal method.
In EEG data analysis, it is often of interest to study the relationship between average signals (across trials) at different electrode locations after registration. To show the benefits of the proposed approach in downstream EEG data analysis tasks, we explore the spatial correlation between estimated average EEG signals using the empirical trace-variogram (Giraldo et al., 2011), defined as
(13) |
where for a small and denotes the estimated average signal for electrode located at . The empirical trace-variograms, computed based on average signals generated by the three different registration methods, are shown in the left panel of Figure 9 (componentwise in red, universal in blue, proposed in green). The variograms suggest very similar spatial correlation patterns between the average signals, but the proposed approach produces averages with smallest spatial variation. The universal approach results in more spatial variation due to its very restrictive assumption of common phase across all channels. The magnitude of spatial variation in this case is similar to the proposed method at small spatial distances. This is intuitive since cross-component phase variation at nearby electrodes is very small and the common phase assumption is reasonable. However, as the spatial distance increases, this assumption becomes unrealistic.
The componentwise method conducts separate alignment of EEG signals at each electrode, resulting in large phase variation across electrode locations. On the other hand, the proposed method avoids this issue by accounting for the spatial phase correlation across electrodes through the regularization penalty. To demonstrate this, we consider the 14 electrodes in the area corresponding to the parietal lobe. Since all of the 14 electrodes are related to the same brain region, we should expect very little phase variation in the resulting average signals. The corresponding estimated averages are shown in the middle panel in Figure 9 for the proposed method and the right panel in the same figure for the componentwise method. In the middle panel, there is very little phase variation across the estimated averages, as expected. However, in the right panel, it is easy to see that considerable phase variation remains.


6.2 Ozone concentration data
Ground-level ozone is a harmful pollutant that is monitored closely by the Environmental Protection Agency. The temporal trend of daily ozone concentration varies from year to year due to variations in various environmental conditions, including the weather. To explore average ozone concentration patterns at a certain location, we desire to first eliminate phase variation that exists across different observation years using a registration procedure. At the same time, we must account for cross-component phase variation across different locations within the same year that is due to different temporal patterns of pollutant spread. Thus, we view multi-year ozone concentration functions, observed at different spatial locations, as multivariate functional data with two different sources of phase variation; here, univariate ozone concentration functions observed at different spatial locations are treated as components of a full observation corresponding to a single year. In this analysis, we focus on a small area in northern California ( N, W) with observation stations (components). Each station recorded daily average ozone concentration (in parts per million) from year 2000 to year 2019 (sample size ). The data is publicly available on the air data website111https://www.epa.gov/outdoor-air-quality-data of the United States Environmental Protection Agency. As in the previous real data analysis example that considered EEG data, we compare the performance of three registration procedures in this context: componentwise, universal and proposed.
We display the registration results for a single location, corresponding to Alameda County (37.8 N, 122.3 W), in Figure 10. We also display the result of averaging across trials when no registration is applied to the data as a baseline (Panel 1). We make several interesting observations based on these results. First, the proposed method yields an estimate of the average ozone concentration function (Panel 4) that contains clearer patterns than the average computed without alignment (Panel 1), e.g., the steep reduction in ozone concentration around day 200. The proposed approach produces an average that has very similar patterns to those computed using the componentwise and universal registration methods, but is smoother overall. The most noticeable difference occurs in early autumn where the proposed approach results in an average that has a single mode, while the other two methods result in averages with two smaller peaks. A possible reason for this phenomenon is the large regularization parameter value chosen via cross-validation; this restricts the complexity of estimated cross-component warping functions and forces the registration procedure to overlook small features that are potentially induced by noise.


As in the EEG data example, we further analyze the estimated average ozone concentration functions at the 16 spatial locations. Modeling of spatial correlation based on averages computed after registration is also of particular interest in this application. In the left panel of Figure 11, we display the empirical trace-variograms, computed using (13) after estimating the average ozone concentration functions using the componentwise (red), universal (blue) and proposed (green) approaches. The conclusions here are similar to those reached in the EEG data example. The proposed method yields componentwise averages that have smallest spatial variation. The variation gap between the universal and proposed methods appears bigger in this case, even for small spatial distances. We further display the componentwise estimated average ozone concentration functions, for each of the 16 locations, generated using the proposed and componentwise registration methods in the middle and right panels of Figure 11, respectively. It is obvious that the proposed approach (middle) results in estimated average functions that exhibit much less cross-component phase variation than the average functions generated using the componentwise registration method (right). Furthermore, most averages in the middle panel have two ozone concentration maxima around April/May and August, and a single ozone concentration minimum around June/July. In the right panel, there is much more variation in the number of extrema in the estimated averages as well as their timing.
7 Discussion
The proposed penalized registration procedure can in principle be used to align multivariate functional data with component functions that are correlated in other ways. For example, when component functions are temporally correlated, we can replace the kriging estimate in the penalty term of the objective function (6) with a temporal interpolant or temporally weighted estimate (e.g., moving time-window average). There is much to be done in this direction, and the proposed method represents a promising initial foray.
The focus in this paper was restricted to multivariate functional data wherein only the cross-component phases are spatially correlated. The objective function (6) does not quantify and incorporate any spatial correlation between the cross-component amplitudes, and this may well be of interest in certain applications. Extension of the algorithm to accommodate such a requirement is possible using the amplitude trace-variogram proposed by Guo et al. (2022), in addition to the phase trace-variogram used here. This presents a fruitful line for future work.
Importance of the elastic metric and its invariance to time warping, which is made practically useful through the square-root slope function (SRSF) representation, cannot be overstated. The invariance property is used to great benefit in both the objective function and the novel spatially-informed penalty term through the kriging estimate , and enables us to disregard estimating the cross-observation phases .
There is potential for improvement in reducing the computational burden when implementing the registration procedure when the number of functional observations with components are both large. Registration in the elastic framework uses the the dynamic programming algorithm since the class of warping functions is infinite-dimensional and unconstrained (Srivastava & Klassen, 2016). Restricting attention to a smaller parametric class (e.g., parameterised class of distribution or quantile functions on ) will reduce computing time considerably, but at the cost of flexibility in registration.
Acknowledgements: We acknowledge funding from the National Science Foundation (DMS-2015374 to KB; CCF-1740761, DMS-2015226, CCF-1839252 to SK), the National Institutes of Health (R37-CA214955 to SK and KB) and the Engineering and Physical Sciences Research Council (EP/V048104/1 to KB).
References
- Bache & Lichman (2013) Bache, K. & Lichman, M. (2013). UCI machine learning repository. https://archive.ics.uci.edu/ml/datasets/EEG+Database.
- Carroll & Müller (2021) Carroll, C. & Müller, H.-G. (2021). Latent transport models for multivariate functional data. arXiv:2107.05730 .
- Carroll et al. (2021) Carroll, C., Müller, H.-G. & Kneip, A. (2021). Cross-component registration for multivariate functional data, with application to growth curves. Biometrics 77, 839–851.
- Giraldo et al. (2011) Giraldo, R., Delicado, P. & Mateu, J. (2011). Ordinary kriging for function-valued spatial data. Environmental and Ecological Statistics 18, 411–426.
- Guo et al. (2022) Guo, X., Kurtek, S. & Bharath, K. (2022). Variograms for kriging and clustering of spatial functional data with phase variation. Spatial Statistics , 100687.
- Kurtek et al. (2012) Kurtek, S., Srivastava, A., Klassen, E. & Ding, Z. (2012). Statistical modeling of curves using shapes and related features. Journal of the American Statistical Association 107, 1152–1165.
- Makeig et al. (2007) Makeig, S., Onton, J., Sejnowski, T. & Poizner, H. (2007). Prospects for mobile, high-definition brain imaging: EEG spectral modulations during 3-D reaching. Human Brain Mapping .
- Marron et al. (2015) Marron, J. S., Ramsay, J. O., Sangalli, L. M. & Srivastava, A. (2015). Functional data analysis of amplitude and phase variation. Statistical Science 30, 468–484.
- Olsen et al. (2016) Olsen, N., Markussen, B. & Raket, L. L. (2016). Simultaneous inference for misaligned multivariate functional data. Journal of the Royal Statistical Society: Series C 67, 1147–1176.
- Park & Ahn (2017) Park, J. & Ahn, J. (2017). Clustering multivariate functional data with phase variation. Biometrics 73, 324–333.
- Srivastava & Klassen (2016) Srivastava, A. & Klassen, E. P. (2016). Functional and Shape Data Analysis. Springer.
- Srivastava et al. (2011) Srivastava, A., Wu, W., Kurtek, S., Klassen, E. & Marron, J. S. (2011). Registration of functional data using Fisher-Rao metric. arXiv:1103.3817 .
- Stam et al. (2007) Stam, C. J., Nolte, G. & Daffertshofer, A. (2007). Phase lag index: assessment of functional connectivity from multi channel EEG and MEG with diminished bias from common sources. Human Brain Mapping 28, 1178–1193.
- Sur & Sinha (2009) Sur, S. & Sinha, V. K. (2009). Event-related potential: An overview. Industrial Psychiatry Journal 18, 70–73.
- Tsai et al. (2014) Tsai, A. C., Jung, T.-P., Chien, V. S., Savostyanov, A. N. & Makeig, S. (2014). Cortical surface alignment in multi-subject spatiotemporal independent EEG source imaging. NeuroImage 87, 297–310.
- Wang et al. (2001) Wang, K., Begleiter, H. & Porjesz, B. (2001). Warp-averaging event-related potentials. Clinical Neurophysiology 112, 1917–1924.
- Zhao et al. (2020) Zhao, W., Xu, Z., Li, W. & Wu, W. (2020). Modeling and analyzing neural signals with phase variability using Fisher-Rao registration. Journal of Neuroscience Methods 346, 108954.