This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Spatially Penalised Registration of Multivariate Functional Data

Xiaohan Guo Department of Statistics, The Ohio State University, USA Sebastian Kurtek Department of Statistics, The Ohio State University, USA Karthik Bharath School of Mathematical Sciences, University of Nottingham, UK
Abstract

Registration of multivariate functional data involves handling of both cross-component and cross-observation phase variations. Allowing for the two phase variations to be modelled as general diffeomorphic time warpings, in this work we focus on the hitherto unconsidered setting where phase variation of the component functions are spatially correlated. We propose an algorithm to optimize a metric-based objective function for registration with a novel penalty term that incorporates the spatial correlation between the component phase variations through a kriging estimate of an appropriate phase random field. The penalty term encourages the overall phase at a particular location to be similar to the spatially weighted average phase in its neighbourhood, and thus engenders a regularization that prevents over-alignment. Utility of the registration method, and its superior performance compared to methods that fail to account for the spatial correlation, is demonstrated through performance on simulated examples and two multivariate functional datasets pertaining to EEG signals and ozone concentrations. The generality of the framework opens up the possibility for extension to settings involving different forms of correlation between the component functions and their phases.

Keywords:
Elastic metric; Functional random field; Phase trace-variogram; Warping invariance.

1 Introduction

Modern functional datasets, such as longitudinal records, medical imaging signals, geometric shapes, contain confounded amplitude and phase variations (Srivastava et al., 2011). The adverse effects of ignoring the phase variation during their analyses are well-understood (Marron et al., 2015). The process of registration, also commonly referred to as alignment or amplitude-phase separation, enables the practitioner to decouple the two sources of variation and account for them in further analyses. For multivariate functional data IFi=(fi1,,fiK)K\mathbb{R}\supset I\mapsto F_{i}=(f_{i1},\ldots,f_{iK})^{\top}\in\mathbb{R}^{K} consisting of multiple correlated, univariate functional components fij:I,j=1,,Kf_{ij}:I\to\mathbb{R},\ j=1,\ldots,K, the notion of phase variation can be decomposed into two types: (i) cross-observation phase, which is common across all KK component univariate functions fijf_{ij} within an observation FiF_{i}, and (ii) cross-component phase within each observed unit.

Registration procedures for multivariate functional data are driven by assumptions on the two types of temporal variation. At the two extremes are procedures which assume that cross-component variation within each FiF_{i} are either uncorrelated or identical. The former case is quite commonly considered by researchers in neuroimaging (Makeig et al., 2007; Tsai et al., 2014; Zhao et al., 2020), motivated mainly by the simplicity of independent componentwise registration. The latter case, referred to as universal registration, is common in shape analysis of KK-dimensional curves {Fi}\{F_{i}\} (Kurtek et al., 2012), wherein one assumes that cross-component phase variation does not exist and a common warping function is estimated for all components within the same observation (Olsen et al., 2016). Thus, neither of the two extreme cases directly addresses the registration problem when cross-component variation is non-negligible.

Literature on registration methods for multivariate functional data that lie between the two extreme cases is sparse. Noteworthy exceptions within the statistics literature are recent work by Carroll et al. (2021), Carroll & Müller (2021) and Park & Ahn (2017), where specific forms of cross-component temporal variation were considered. Specifically, the first two works restrict attention to the situation where all component functions fijf_{ij} for j=1,,Kj=1,\ldots,K are assumed to have the same shape: for each observation ii, Carroll et al. (2021) assume that time-warped components fij(t)f_{ij}(t) arise through time shifts t(tτ)t\mapsto(t-\tau) of the same function, say gg, while Carroll & Müller (2021) extend this to the case where the time shift map is replaced by a general diffeomorphic warping tγ(t)t\mapsto\gamma(t) of II. Motivated mainly by clustering, Park & Ahn (2017) proposed a conditional observation-specific registration procedure to extract relevant features for clustering.

It is quite common to encounter multivariate functional datasets wherein the component functions do not possess the same shape and/or when correlations between component functions need to be explicitly incorporated into the registration procedure. As an example, consider data generated from electroencephalogram (EEG) signals, specifically pertaining to event-related potentials measured during EEG tests. Event-related potentials are small voltages that reflect the electrical activity on the scalp in response to specific stimuli received by study participants (Sur & Sinha, 2009). For the iith observation, EEG signals fijf_{ij} are densely recorded on a certain time domain II using a set of electrodes placed at KK different locations on the scalp, and together comprise a multivariate functional observation FiF_{i}. In the left panel of Figure 1, we display the observed EEG signals at the (projected) two-dimensional electrode locations on a toy map of the scalp. The entire collection of signals recorded during a single EEG trial is shown as a single slide and corresponds to an observed multivariate functional data unit.

Refer to caption
Figure 1: Left: Example of a multi-trial EEG dataset with 61 electrodes from an alcoholism study. Each slide represents a single trial in the study, while the red functions plotted at the locations of electrodes (grey circles) constitute the components. Top Right: 61 signals from a single trial with considerable cross-component phase variation. Bottom Right: Signals collected at electrode AFZ across 50 trials with considerable cross-observation phase variation.

Evidently, the component functions fijf_{ij} and fikf_{ik} at electrode locations jj and kk during the iith trial are spatially correlated due to functional connectivity between brain regions. Latency of the brain’s responses to stimuli varies across trials for a single subject (and among different subjects) resulting in cross-observation phase variation (Wang et al., 2001). At the same time, different response lags of different brain regions to the presented stimuli (Stam et al., 2007) result in cross-component phase variation. The right two panels of Figure 1 offer an illustration of cross-observation (bottom) and cross-component (top) phase variation among EEG signals. To account for the two different types of phase variation in multi-trial EEG, an appropriate registration procedure that accounts for spatially-correlated cross-component phase is required. Specifically, the procedure should enable simultaneous registration of all components fijf_{ij}, while taking into account the dependence structure between the cross-component phases. This cannot be achieved by implementing independent registration of f1j,,fnjf_{1j},\ldots,f_{nj} for each component jj, since such a procedure results in additional cross-component phase variation in the estimated average components, since each component is treated independently. We note this phenomenon in the right panel of Figure 9 (and the associated discussion) in Section 6.1 during our detailed analysis of the EEG data.

Motivated by the type of registration task associated with the EEG data, we propose a registration procedure that exploits correlated phase variation between component functions {fij}\{f_{ij}\} of multivariate functional data {Fi}\{F_{i}\} and offers a compromise between the two extreme cases of independent componentwise and universal registration methods alluded to above. Our focus is on multivariate functional data with component functions fijf_{ij} whose overall phase variation, represented through a time-warping function γij:II\gamma_{ij}:I\to I, is a combination γij=αiξj\gamma_{ij}=\alpha_{i}\circ\xi_{j} of cross-observation phase αi:II\alpha_{i}:I\to I and spatially correlated cross-component phase ξj:II\xi_{j}:I\to I (\circ is function composition). The flexibility afforded, and desired, by such a general phase specification while registering multivariate functional data with correlated components, such as the EEG data, comes at a cost: the individual phase components αi\alpha_{i} and ξj\xi_{j} cannot be decoupled and estimated individually, and incorporating spatial information present in the latent {ξj}\{\xi_{j}\} when estimating γij\gamma_{ij} is hugely challenging. In view of this, our main contributions are as follows.

  1. (i)

    We propose a metric-based penalized registration method for multivariate functional data with a penalty term defined using the spatial correlations between the cross-component phases {ξj}\{\xi_{j}\}, using which the overall phases {γij}\{\gamma_{ij}\} are estimated by eliminating cross-observation phase variation {αi}\{\alpha_{i}\}.

  2. (ii)

    The penalty term is defined using a spatially weighted combination of the (estimated) cross-component phases {ξj}\{\xi_{j}\}, and is, owing to the invariant property of the metric to (simultaneous) time warping, impervious to the cross-observation phases {αi}\{\alpha_{i}\}; this allows us to entirely circumvent having to estimate the {αi}\{\alpha_{i}\}.

  3. (iii)

    With respect to cross-observation registration, we demonstrate clear superiority of our method over ones that fail to account for the spatial correlation under a variety of simulation settings and two real data examples.

The rest of this paper is organized as follows. Section 2 details the registration problem for multivariate functional data whose component functions have spatially correlated phase variation. Section 3 briefly reviews the geometric elastic functional data analysis framework, including univariate pairwise and multiple registration, and registration of multivariate functions with common cross-component phase. Section 4 details the proposed spatially penalised registration objective function that accounts for spatial correlation in cross-component phase, and provides an algorithm (Algorithm 1) for its optimization. Section 5 presents results of simulation studies; in Section 6, we apply the proposed registration approach in two real data scenarios.

2 The registration problem

For a positive integer rr, denote by [r][r] the set {1,,r}\{1,\ldots,r\}. The notation j[r]j\in[r] will be used in place of j=1,,rj=1,\ldots,r. We can view multivariate functional data {F1,,Fn}\{F_{1},\ldots,F_{n}\} with Fi=(fi1,,fiK),i[n]F_{i}=(f_{i1},\ldots,f_{iK})^{\top},\ i\in[n] with cross-observation and correlated cross-component temporal variation as being generated from the model

Fi𝒘i(t)=μ(t)+Ei(t),t[0,1],i[n],F_{i}\circ\bm{w}_{i}(t)=\mu(t)+E_{i}(t),\quad t\in[0,1],\ i\in[n], (1)

where μ=(μ1,,μK)\mu=(\mu_{1},\ldots,\mu_{K}), with μj:[0,1]\mu_{j}:[0,1]\to\mathbb{R} , is a deterministic mean/template multivariate function, Ei=(ei1,,eiK)E_{i}=(e_{i1},\ldots,e_{iK})^{\top} are realisations of an error process with spatially correlated component functions {eij:[0,1]}\{e_{ij}:[0,1]\to\mathbb{R}\}, and 𝒘i=(γi1,,γiK)\bm{w}_{i}=(\gamma_{i1},\ldots,\gamma_{iK}) is a vector of monotone increasing, end-points preserving, time warping functions assuming values in Γ:={γ:[0,1][0,1]γ(0)=0,γ(1)=1,γ˙>0}\Gamma:=\{\gamma:[0,1]\to[0,1]\mid\gamma(0)=0,\gamma(1)=1,\dot{\gamma}>0\}, where γ˙\dot{\gamma} is the time derivative of γ\gamma; here, Fi𝒘i(t)=(fi1γi1,,fiKγiK)F_{i}\circ\bm{w}_{i}(t)=(f_{i1}\circ\gamma_{i1},\ldots,f_{iK}\circ\gamma_{iK})^{\top} represents componentwise warping, where \circ denotes function composition. Note that Γ\Gamma is a group with operation \circ, identity element γid(t)=t\gamma_{id}(t)=t, and the function inverse as the group inverse. We will interchangeably refer to γ\gamma as a warping function or phase depending on context.

The characterising feature of the problem is that, for i[n],j[K]i\in[n],\ j\in[K], the warping functions γij=ξ𝐬jαi\gamma_{ij}=\xi_{\mathbf{s}_{j}}\circ\alpha_{i}, where ξ𝐬1,,ξ𝐬K\xi_{\mathbf{s}_{1}},\ldots,\xi_{\mathbf{s}_{K}} are spatially correlated warping functions at KK spatial locations 𝐬1,,𝐬K\mathbf{s}_{1},\ldots,\mathbf{s}_{K} within a spatial domain 𝒟p\mathcal{D}\subset\mathbb{R}^{p}, with ξ𝐬j,αiΓ\xi_{\mathbf{s}_{j}},\ \alpha_{i}\in\Gamma for each i[n],j[K]i\in[n],\ j\in[K]. The phase γij\gamma_{ij} of a component function fijf_{ij} is thus represented as a combination of warping functions corresponding to cross-observation and cross-component temporal variations. The index jj will be used to denote the spatial location 𝐬j\mathbf{s}_{j}, i.e., ξj\xi_{j} will be used in place of ξ𝐬j\xi_{\mathbf{s}_{j}}, and the two notations will be used interchangeably.

Under the above setup, we can interpret α1,,αn\alpha_{1},\ldots,\alpha_{n} as representing cross-observation temporal variation within {Fi}\{F_{i}\} and ξ1,,ξK\xi_{1},\ldots,\xi_{K} as representing spatially correlated cross-component temporal variation within {Fi}\{F_{i}\}. The two-fold objective is to, jointly:

  1. 1.

    register F1,,FnF_{1},\ldots,F_{n} by estimating {γij:i[n],j[K]}\{\gamma_{ij}:i\in[n],\ j\in[K]\} in a manner that ensures for every i[n]i\in[n], the warping functions γij\gamma_{ij} and γik\gamma_{ik} in nearby spatial locations 𝐬j\mathbf{s}_{j} and 𝐬k\mathbf{s}_{k} in 𝒟\mathcal{D} are ‘similar’, owing to ‘similar’ cross-component warpings ξj\xi_{j} and ξk\xi_{k};

  2. 2.

    estimate components μ1,,μK\mu_{1},\ldots,\mu_{K} of the template μ\mu.

In the context of the EEG application, e.g., in the iith trial for a single subject, we desire signals fijf_{ij} and fikf_{ik} obtained from spatially proximate electrode locations 𝐬j\mathbf{s}_{j} and 𝐬k\mathbf{s}_{k} to have similar phase variation γij\gamma_{ij} and γik\gamma_{ik}, with respect to template components μj\mu_{j} and μk\mu_{k}. The requirement is compatible with findings by Stam et al. (2007) on different response lags of spatially disparate brain regions to presented stimuli.

When the template μ\mu is unknown, as is typical in practice, for each pair (i,j)(i,j), the warping functions αi\alpha_{i} and ξj\xi_{j} are individually not identifiable in a nonparametric (infinite-dimensional) specification of the class Γ\Gamma, and error EiE_{i} without restrictions (e.g., rank-one error model). It is hence not possible to estimate both αi\alpha_{i} and ξj\xi_{j} since their respective variabilities are confounded with the error EiE_{i}; this implies that it is not possible to decompose the overall phase variation given by {γij}\{\gamma_{ij}\} in {Fi}\{F_{i}\} into the cross-observation and cross-component phase variations.

In view of this, our registration procedure will first estimate the cross-component warping functions for each observation i[n]i\in[n] and then use them to estimate γij\gamma_{ij}. Evidently then, the cross-component warping functions ξ1,,ξK\xi_{1},\ldots,\xi_{K} will depend on ii. The key challenge arises from the need to iterate between computing the template components μ1,,μK\mu_{1},\ldots,\mu_{K} and warping functions {γij}\{\gamma_{ij}\} by explicitly incorporating spatial dependence between the, unobserved but observation dependent, cross-component warping functions ξi1,,ξ1K\xi_{i1},\ldots,\xi_{1K}. In Section 4.2, we will consider a simplified geometry of Γ\Gamma under a suitable transformation, that will further clarify the dependence of the cross-component phases on the observations.

3 Overview of elastic functional data registration

The proposed algorithm is based on a componentwise spatially-penalized registration of f1j,,fnjf_{1j},\ldots,f_{nj} using the metric-based elastic functional data analysis framework (Srivastava et al., 2011; Srivastava & Klassen, 2016) of univariate functions. The metric will dually enable us to propose a novel penalty term that quantifies the spatial correlation between the unobserved cross-component phases {ξj}\{\xi_{j}\}, unaffected by the cross-observation phase {αi}\{\alpha_{i}\}. We begin with a brief review of the elastic framework, and for later comparison with the proposed algorithm, also review universal registration of curves under the elastic framework (Srivastava & Klassen, 2016) where all components are assumed to have identical temporal variation.

3.1 Univariate functions

We consider the representation space of univariate functional data objects to be :={f:[0,1]f is absolutely continuous}\mathcal{F}:=\{f:[0,1]\to\mathbb{R}\mid f\text{ is absolutely continuous}\}. The group of warping functions representing phase is Γ={γ:[0,1][0,1]γ(0)=0,γ(1)=1,γ˙>0}\Gamma=\{\gamma:[0,1]\to[0,1]\mid\gamma(0)=0,\gamma(1)=1,\dot{\gamma}>0\}. For any ff\in\mathcal{F}, γΓ\gamma\in\Gamma, the warping of ff by γ\gamma is given by the group action of composition, fγf\circ\gamma. The group-theoretic formulation of phase enables a definition of the amplitude of a function ff as the equivalence class [f]:={fγγΓ}[f]:=\{f\circ\gamma\mid\gamma\in\Gamma\}\subset\mathcal{F}, known as its orbit under the action of Γ\Gamma; thus, fγ[f]f\circ\gamma\in[f] has the same amplitude as ff for each γΓ\gamma\in\Gamma. The amplitude space then is the quotient space /Γ:={[f]f}\mathcal{F}/\Gamma:=\{[f]\mid f\in\mathcal{F}\}.

Separating amplitude and phase requires a metric on the amplitude space /Γ\mathcal{F}/\Gamma. A convenient way to define one is through a metric dd on \mathcal{F} that is invariant to simultaneous warping: for every γΓ,d(f1,f2)=d(f1γ,f2γ)\gamma\in\Gamma,\ d(f_{1},f_{2})=d(f_{1}\circ\gamma,f_{2}\circ\gamma). The standard 𝕃2\mathbb{L}^{2} metric fails to be invariant and Srivastava et al. (2011) thus proposed to use the extended Fisher-Rao (eFR) metric. While direct use of this metric for amplitude-phase separation is difficult in practice, the square-root slope transform can be used to ‘flatten’ the complicated eFR metric on \mathcal{F} to the standard 𝕃2\mathbb{L}^{2} metric on the transformed space. The transform maps fQ(f)=q:=f˙|f˙|1/2f\mapsto Q(f)=q:=\dot{f}|\dot{f}|^{-1/2} (f˙\dot{f} is the time derivative of ff). Given f(0)f(0), QQ is bijective with inverse Q1(q,f(0))(t)=f(t)=f(0)+0tq(u)|q(u)|𝑑uQ^{-1}(q,f(0))(t)=f(t)=f(0)+\int_{0}^{t}q(u)|q(u)|du. Henceforth, for any ff\in{\cal F}, we will refer to q=Q(f)q=Q(f) as its square-root slope function (SRSF).

The transformed space Q()Q(\mathcal{F}) is a subset of 𝕃2([0,1],)\mathbb{L}^{2}([0,1],\mathbb{R}), and is denoted by 𝒬\mathcal{Q}. Under QQ, the eFR metric on \mathcal{F} maps to the standard 𝕃2\mathbb{L}^{2} metric on 𝒬\mathcal{Q}, and thus analysis of SRSFs can be carried out using standard Hilbert space machinery. Warping of ff\in\mathcal{F} by γΓ\gamma\in\Gamma induces the warping action qγ:=(qγ)γ˙1/2q\odot\gamma:=(q\circ\gamma){\dot{\gamma}}^{1/2} on 𝒬\mathcal{Q} equipped with the 𝕃2\mathbb{L}^{2} metric, and the action is by isometries, i.e., for q1,q2𝒬q_{1},\ q_{2}\in\mathcal{Q} and γΓ\gamma\in\Gamma, q1q2=q1γq2γ\|q_{1}-q_{2}\|=\|q_{1}\odot\gamma-q_{2}\odot\gamma\|, since qγ=q\|q\odot\gamma\|=\|q\| for every γΓ,q𝒬\gamma\in\Gamma,\ q\in\mathcal{Q}; \|\cdot\| denotes the 𝕃2\mathbb{L}^{2} norm.

3.1.1 Pairwise registration

Given two functions f1,f2f_{1},\ f_{2}\in\mathcal{F}, amplitude and phase separation through pairwise registration or alignment of f2f_{2} to f1f_{1}, or vice versa, is formulated as the determination of the relative phase of f2f_{2} with respect to f1f_{1} (q1=Q(f1),q2=Q(f2)q_{1}=Q(f_{1}),\ q_{2}=Q(f_{2})):

γ=argminγΓq1q2γ2=argminγΓ01|q1(t)(q2γ)(t)|2𝑑t.\gamma^{*}=\underset{\gamma\in\Gamma}{\arg\min}\ \|q_{1}-q_{2}\odot\gamma\|^{2}=\underset{\gamma\in\Gamma}{\arg\min}\ \int_{0}^{1}|q_{1}(t)-(q_{2}\odot\gamma)(t)|^{2}dt. (2)

The minimization problem in (2) is typically solved using the dynamic programming algorithm. To further regularize pairwise registration, one can instead solve the penalized optimization problem given by

γ=argminγΓ{q1q2γ2+λγ˙12},\gamma^{*}=\underset{\gamma\in\Gamma}{\arg\min}\ \{\|q_{1}-q_{2}\odot\gamma\|^{2}+\lambda\|\sqrt{\dot{\gamma}}-1\|^{2}\}, (3)

where λ\lambda is the regularization parameter. The penalty is defined as the squared 𝕃2\mathbb{L}^{2} distance between the SRSFs of γ\gamma and the identity warping function γid\gamma_{id}, where γid(t)=tt\gamma_{id}(t)=t\ \forall\ t. It is evident that, depending on the magnitude of λ\lambda, this penalty forces the estimated phase to be close to the identity element, i.e., no warping.

3.1.2 Multiple registration

Amplitude-phase separation for a sample of functions fi,i[n],n>2f_{i},\ i\in[n],\ n>2 is carried out with respect to a common template function that must also be estimated. Let qi,i[n]q_{i},\ i\in[n] denote the corresponding SRSFs. A template for multiple registration is estimated via

μ^=argminμ𝒬i=1nminγiΓμqiγi2.\hat{\mu}=\underset{\mu\in\mathcal{Q}}{\arg\min}\sum_{i=1}^{n}\min_{\gamma_{i}\in\Gamma}\|\mu-q_{i}\odot\gamma_{i}\|^{2}. (4)

Then, multiple registration is carried out via pairwise registration of each of qi,i[n]q_{i},\ i\in[n] with respect to μ^\hat{\mu} using (2), resulting in the relative phases γi,i[n]\gamma_{i}^{*},\ i\in[n]. The minimizer μ^\hat{\mu} of (4) is not unique, with any element of the orbit [μ^][\hat{\mu}] resulting in the same value of the cost function due to the isometric action of Γ\Gamma. Thus, for identifiability, we select μ^[μ^]\hat{\mu}\in[\hat{\mu}] such that the relative phases γi,i[n]\gamma_{i}^{*},\ i\in[n] average to γid\gamma_{id}; see Srivastava et al. (2011) for details on how to compute an average of a sample of warping functions.

3.2 Componentwise and universal registration

Let ~={F:[0,1]RKF is absolutely continuous}\tilde{\mathcal{F}}=\{F:[0,1]\to R^{K}\mid F\text{ is absolutely continuous}\} denote the the space of multivariate functional data. Each multivariate function FF contains KK univariate components f1,,fKf_{1},\dots,f_{K}\in\mathcal{F}. Independent, componentwise registration of multivariate functional data is to apply the multiple registration given by (4) to functions in each component independently. This fails to account for any correlation between γij\gamma_{ij} and γik\gamma_{ik} for every i[n]i\in[n] and j,k[K]j,\ k\in[K].

At the other end of the spectrum is universal registration, which treats each FiF_{i} as a parameterised curve t(fi1,,fiK)t\mapsto(f_{i1},\ldots,f_{iK})^{\top} in K\mathbb{R}^{K}, and thus assumes the same relative phase for all components. Registration under such a setup is again available through the elastic framework under a suitable transformation. With a slight abuse in notation, let Q(F)=q=F˙|F˙|1/2Q(F)=q=\dot{F}|\dot{F}|^{-1/2} denote the SRSF of FF, where F˙\dot{F} is the componentwise time derivative of FF and |||\cdot| is the Euclidean norm in K\mathbb{R}^{K}; each SRSF in this case is a function q:[0,1]Kq:[0,1]\to\mathbb{R}^{K} and the space of such SRSFs is denoted by 𝒬~𝕃2([0,1],K)\tilde{\mathcal{Q}}\subset\mathbb{L}^{2}([0,1],\mathbb{R}^{K}). Then, given q1=Q(F1)q_{1}=Q(F_{1}) and q2=Q(F2)q_{2}=Q(F_{2}), the relative phase of F2F_{2} with respect to F1F_{1} is given by

γ=argminγΓq1q2γ2=argminγΓ01|q1(t)(q2γ)(t)|2𝑑t,\gamma^{*}=\underset{\gamma\in\Gamma}{\arg\min}\ \|q_{1}-q_{2}\odot\gamma\|^{2}=\underset{\gamma\in\Gamma}{\arg\min}\ \int_{0}^{1}|q_{1}(t)-(q_{2}\odot\gamma)(t)|^{2}dt, (5)

where again |||\cdot| denotes the Euclidean norm in K\mathbb{R}^{K} and qγq\odot\gamma is applied componentwise. A penalized version of registration can also be implemented in this case by appropriately adapting the optimization problem defined in (3). Further, multiple registration, via estimation of a template μ^𝒬~\hat{\mu}\in\tilde{\mathcal{Q}}, and pairwise registration of each function to μ^\hat{\mu} via (5), follows the approach defined for univariate functional data in Section 3.1.2.

4 Registration of multivariate functions with spatially dependent cross-component phase

We propose a penalized multiple registration method for multivariate functional data wherein cross-component phase in each observation is spatially correlated. Let Fi=(fi1,,fiK)~,i[n]F_{i}=(f_{i1},\dots,f_{iK})^{\top}\in\tilde{\mathcal{F}},\ i\in[n] denote a multivariate functional data sample; we assume that Fi,i[n]F_{i},\ i\in[n] are independent. Each component function fij,i[n],j[K]f_{ij},\ i\in[n],\ j\in[K] is assumed to be an element of \mathcal{F}, and the components fij,j[K]f_{ij},\ j\in[K], for a fixed ii, have spatially dependent phase. Further, let qij=Q(fij)𝒬,i[n],j[K]q_{ij}=Q(f_{ij})\in\mathcal{Q},\ i\in[n],\ j\in[K] denote the SRSFs of the component functions in each observation. Registration of multivariate functional data requires estimation of the overall phase (composition of cross-observation and cross-component phase) for each component in each observation, γijΓ,i[n],j[K]\gamma_{ij}\in\Gamma,\ i\in[n],\ j\in[K]. This facilitates simultaneous synchronization of the component functions across ii and jj.

As a compromise between the two extreme settings of independent componentwise and universal registration described in Section 3.2, we propose a spatially penalized registration approach that takes spatial cross-component phase correlation into account, but allows each component to have its own phase. As described in Section 2, within each FiF_{i} the dependence between the component functions fi1,,fiKf_{i1},\ldots,f_{iK} arises through spatial correlation in the cross-component phases ξ1,,ξK\xi_{1},\ldots,\xi_{K}.

4.1 Spatially penalised componentwise registration

Our approach is to carry out cross-observation registration through a modification of the multiple registration procedure in (4) using a spatially-informed penalty term. For a fixed function component jj, the registration of functions fij,i[n]f_{ij},\ i\in[n], with SRSFs qij,i[n]q_{ij},\ i\in[n], amounts to determination of γij,i[n]\gamma_{ij},\ i\in[n]. However, for each i[n]i\in[n], we note that γij\gamma_{ij} is spatially correlated with γil,lj,l[K]\gamma_{il},\ l\neq j,\ l\in[K], through the spatial correlation between the latent ξj\xi_{j} and ξl,lj\xi_{l},\ l\neq j.

We thus consider a penalised modification of (4) wherein the penalty term for each i[n]i\in[n] depends on the phases γi1,,γi(j1),γi(j+1),,γiK\gamma_{i1},\ldots,\gamma_{i(j-1)},\gamma_{i(j+1)},\ldots,\gamma_{iK}. Our choice is to define the penalty term using (an estimate of) the conditional mean of γij\gamma_{ij} given {γil}lj\{\gamma_{il}\}_{l\neq j}: we wish to discourage γij\gamma_{ij} from assuming values that are far away from its spatially weighted conditional mean. We accordingly consider the following optimization problem for each component j[K]j\in[K]:

(γ1j,,γnj,μ^j)=argminγ1j,,γnjΓ,μj𝒬i=1n{μjqijγij2+λd2(γij,γ~ij)},(\gamma_{1j}^{*},\dots,\gamma_{nj}^{*},\hat{\mu}_{j})=\underset{\gamma_{1j},\dots,\gamma_{nj}\in\Gamma,\ \mu_{j}\in\mathcal{Q}}{\arg\min}\ \sum\limits_{i=1}^{n}\left\{\|\mu_{j}-q_{ij}\odot\gamma_{ij}\|^{2}+\lambda\thinspace d^{2}(\gamma_{ij},\tilde{\gamma}_{ij})\right\}, (6)

where μj\mu_{j} is the template for registration in component jj, λ>0\lambda>0 is a penalty parameter, and dd is a distance on Γ\Gamma; the warping function γ~ij\tilde{\gamma}_{ij} is an estimate of the conditional mean of γij\gamma_{ij} given {γil}lj\{\gamma_{il}\}_{l\neq j}, which we will define next in Section 4.2.

The first term in the objective function in (12) provides a measure of synchronization for component jj, across observations ii, with respect to the template μj\mu_{j}. The second term, a penalty on phase, measures the distance between the estimated phase γij\gamma_{ij} and a target γ~ij\tilde{\gamma}_{ij} determined by the correlated phase in the other components. The regularization penalty attempts to preserve the phase-induced correlation structure in the aligned components. If the function components are assumed to be independent, and γ~ij=γid\tilde{\gamma}_{ij}=\gamma_{id}, the identity warping function, for all ii and jj, the proposed approach is equivalent to KK independent univariate penalized registration problems, as specified in (3), with μj\mu_{j} acting as the template for each component.

4.2 Penalty using spatial model for cross-component phase

We first discuss the choice of distance dd on Γ\Gamma in the penalty term in (6). Elements of the function space Γ\Gamma can be viewed as differentiable, increasing distribution functions of random variables on [0,1][0,1], and thus constitute a nonlinear, convex set. A simplified geometric structure compatible with the 𝕃2\mathbb{L}^{2} norm preserving action under the operation \odot is available under the SRSF transform, under which γQ(γ)=γ˙1/2=:ψ\gamma\to Q(\gamma)=\dot{\gamma}^{1/2}=:\psi. The SRSF map QQ is bijective, and each ψ\psi in its image corresponds to a square-root probability density since 01ψ2(t)dt=1\int_{0}^{1}\psi^{2}(t)\text{d}t=1. As a consequence, the set Ψ:={Q(γ)=:ψ:[0,1]+γΓ}\Psi:=\{Q(\gamma)=:\psi:[0,1]\to\mathbb{R}_{+}\mid\gamma\in\Gamma\} can be identified with the positive orthant of the unit sphere in 𝕃2([0,1],)\mathbb{L}^{2}([0,1],\mathbb{R}). While the natural candidate for distance dd on Γ\Gamma is the intrinsic arc-length distance on Ψ\Psi, we use the extrinsic distance

d(γ1,γ2):=Q(γ1)Q(γ2)=ψ1ψ2,d(\gamma_{1},\gamma_{2}):=\|Q(\gamma_{1})-Q(\gamma_{2})\|=\|\psi_{1}-\psi_{2}\|,

where \|\cdot\| is the standard 𝕃2\mathbb{L}^{2} norm. Our choice is linked to the choice of γ~ij\tilde{\gamma}_{ij} in the penalty term. Observe that with dd defined as above on Γ\Gamma, we have that, for every α,γ1,γ2Γ\alpha,\ \gamma_{1},\ \gamma_{2}\in\Gamma,

d(γ1α,γ2α)=Q(γ1α)Q(γ2α)=ψ1αψ2α=ψ1ψ2,d(\gamma_{1}\circ\alpha,\gamma_{2}\circ\alpha)=\|Q(\gamma_{1}\circ\alpha)-Q(\gamma_{2}\circ\alpha)\|=\|\psi_{1}\odot\alpha-\psi_{2}\odot\alpha\|=\|\psi_{1}-\psi_{2}\|, (7)

and Γ\Gamma acts on itself by isometries under the SRSF map. We thus combine the cross-observation phase αi\alpha_{i} and cross-component phase ξj\xi_{j} using their SRSFs to obtain the SRSF ψij\psi_{ij} of γij\gamma_{ij} as

ψij:=Q(ξjαi)=Q(ξj)αi.\psi_{ij}:=Q(\xi_{j}\circ\alpha_{i})=Q(\xi_{j})\odot\alpha_{i}. (8)

As described in Section 2, our registration procedure will estimate ψij\psi_{ij} by first estimating the spatially correlated cross-component phases for each observation i[n]i\in[n] since αi\alpha_{i} and ξj\xi_{j} are individually not estimable; in other words, we wish to induce dependence in the generative model (1) between the confounded cross-component and cross-observation phases. We achieve this using the SRSF transform of γij\gamma_{ij} in the manner defined in (8).

Under the SRSF transform of Γ\Gamma, the population conditional expectation E[ψijψ1,,ψi(j1),ψi(j+1),,ψiK]E[\psi_{ij}\mid\psi_{1},\ldots,\psi_{i(j-1)},\psi_{i(j+1)},\ldots,\psi_{iK}] is an ideal choice for ψij\psi_{ij}. As with spatially correlated real- or vector-valued observations in traditional statistics modeled using random fields, a viable estimate of the conditional expectation in this context is the spatial interpolant or kriging estimate at location 𝐬j\mathbf{s}_{j} of a functional random field

{ψ𝐬:=Q(ξ𝐬)α,𝐬𝒟}\big{\{}\psi_{\mathbf{s}}:=Q(\xi_{\mathbf{s}})\odot\alpha,\ \mathbf{s}\in\mathcal{D}\big{\}}

assuming values in Ψ\Psi, conditioned on its values at locations 𝐬1,,𝐬j1,𝐬j+1,𝐬K\mathbf{s}_{1},\ldots,\mathbf{s}_{j-1},\mathbf{s}_{j+1},\ldots\mathbf{s}_{K}. In other words, the random field {ψ𝐬}\{\psi_{\mathbf{s}}\} is derived by time-warping the values assumed by another functional random field {ξ𝐬,𝐬𝒟}\{\xi_{\mathbf{s}},\ \mathbf{s}\in\mathcal{D}\} with a fixed αΓ\alpha\in\Gamma.

Computing the kriging estimate of {ψ𝐬}\{\psi_{\mathbf{s}}\} at 𝐬j\mathbf{s}_{j} requires estimation of Q(ξi1),,Q(ξiK)Q(\xi_{i1}),\ldots,Q(\xi_{iK}). Suppose that the components fij,i[n],j[K]f_{ij},\ i\in[n],\ j\in[K] are indexed by spatial locations 𝐬j𝒟,j[K]\mathbf{s}_{j}\in\mathcal{D},\ j\in[K]. Denote by ψij\psi_{ij} the SRSFs of the warping functions γij\gamma_{ij}, which, for every ii, are viewed as values assumed by the random field {ψ𝐬:𝐬𝒟}\{\psi_{\mathbf{s}}:\mathbf{s}\in\mathcal{D}\} defined above at spatial locations 𝐬1,,𝐬K\mathbf{s}_{1},\ldots,\mathbf{s}_{K}. Then, under the assumption that the functional random field {ψ𝐬}\{\psi_{\mathbf{s}}\} is second-order stationary and isotropic, we can consider the phase trace-variogram, defined by Guo et al. (2022) as

V(h)=1201E[ψ𝐬(t)ψ𝐬(t))]2dt=12E(ψ𝐬ψ𝐬2),V(h)=\frac{1}{2}\int_{0}^{1}E\left[\psi_{\mathbf{s}}(t)-\psi_{\mathbf{s}^{\prime}}(t))\right]^{2}\text{d}t=\frac{1}{2}E(\|\psi_{\mathbf{s}}-\psi_{\mathbf{s}^{\prime}}\|^{2}),

using Fubini’s theorem, where h=|𝐬𝐬|h=|\mathbf{s}-\mathbf{s}^{\prime}| and |||\cdot| is the Euclidean norm on the spatial domain 𝒟p\mathcal{D}\subset\mathbb{R}^{p}. For a fixed observation ii, the estimator of V(h)V(h) is given by

V^i(h)=12|N(h)|a,bN(h)ψ^iaψ^ib2,\displaystyle\hat{V}_{i}(h)=\frac{1}{2|N(h)|}\sum\limits_{a,b\in N(h)}\|\hat{\psi}_{ia}-\hat{\psi}_{ib}\|^{2}, (9)

where N(h)={(𝐬a,𝐬b)a,b=1,,K,|𝐬a𝐬b|=h}N(h)=\{(\mathbf{s}_{a},\mathbf{s}_{b})\mid a,b=1,\ldots,K,\ |\mathbf{s}_{a}-\mathbf{s}_{b}|=h\}. For irregularly spaced data, N(h)N(h) can be modified to Nϵ(h)={(𝐬a,𝐬b):|𝐬a𝐬b|(hϵ,h+ϵ)}N_{\epsilon}(h)=\{(\mathbf{s}_{a},\mathbf{s}_{b}):|\mathbf{s}_{a}-\mathbf{s}_{b}|\in(h-\epsilon,h+\epsilon)\} for a small ϵ>0\epsilon>0.

Despite the fact that the spatially correlated cross-component phases ξ1,,ξK\xi_{1},\ldots,\xi_{K} are unobserved and confounded with the cross-observation phases α1,,αn\alpha_{1},\ldots,\alpha_{n}, we can access their correlation structure by estimating γij\gamma_{ij} using V(h)V(h) and V^(h)\hat{V}(h). To see this, note that due to the isometry property in (7),

E(ψ𝐬ψ𝐬2)=E(Q(ξ𝐬)αQ(ξ𝐬)α)2)=E(Q(ξ𝐬)Q(ξ𝐬)2),\displaystyle E(\|\psi_{\mathbf{s}}-\psi_{\mathbf{s}^{\prime}}\|^{2})=E(\|Q(\xi_{\mathbf{s}})\odot\alpha-Q(\xi_{\mathbf{s}^{\prime}})\odot\alpha)\|^{2})=E(\|Q(\xi_{\mathbf{s}})-Q(\xi_{\mathbf{s}^{\prime}})\|^{2}),

and VV thus equals the phase-trace variogram of the functional random field {Q(ξ𝐬)}\{Q(\xi_{\mathbf{s}})\}; a similar argument applies to the estimate V^\hat{V} as well. We thus note that the phase-trace variogram is invariant to (simultaneous) warping (Guo et al., 2022, Lemma 1), and this crucial observation motivates our use of the extrinsic distance on Ψ\Psi and the functional random field {ψ𝐬}\{\psi_{\mathbf{s}}\} with values in Ψ\Psi.

We now move on to computing the kriging estimate ψ~ij\tilde{\psi}_{ij} using the variogram estimate V^\hat{V}. For component jj in observation ii, given phase for the other components {ψil}lj\{\psi_{il}\}_{l\neq j}, the conditional mean phase is given by the weighted average

ψ~ij=ljζijlψil,where ljζijl=1,ζijl0.\tilde{\psi}_{ij}=\sum\limits_{l\neq j}\zeta_{ijl}\psi_{il},\quad\text{where }\sum\limits_{l\neq j}\zeta_{ijl}=1,\ \zeta_{ijl}\geq 0. (10)

The coefficient vector 𝜻ij={ζijl}lj\bm{\zeta}_{ij}=\{\zeta_{ijl}\}_{l\neq j} is implicitly defined as the minimizer of

𝜻ijEψ~ijψij2.\displaystyle\bm{\zeta}_{ij}\mapsto E\|\tilde{\psi}_{ij}-\psi_{ij}\|^{2}. (11)

Guo et al. (2022) show that 𝜻ij\bm{\zeta}_{ij} can be estimated using a quadratic optimization problem that depends on the trace-variogram V(h)V(h) and pairwise spatial distances |𝐬a𝐬b||\mathbf{s}_{a}-\mathbf{s}_{b}|, for a,b=1,,Ka,b=1,\ldots,K. The estimator ψ~ij\tilde{\psi}_{ij} is subsequently normalized using ψ~ijψ~ij/ψ~ij\tilde{\psi}_{ij}\to\tilde{\psi}_{ij}/\|\tilde{\psi}_{ij}\| to correspond to a valid warping function.

Summarily, for the iith observation, if ψ~ij\tilde{\psi}_{ij} denotes the kriging or predicted value at 𝐬j\mathbf{s}_{j}, the penalty term in (6) becomes d(γij,γ~ij)=ψijψ~ijd(\gamma_{ij},\tilde{\gamma}_{ij})=\|\psi_{ij}-\tilde{\psi}_{ij}\|, and the optimization problem for registration in (6) with the spatial penalty assumes the specific form

(γ1j,,γnj,μ^j)=argminγ1j,,γnjΓ,μj𝒬i=1n{μjqijγij2+λψijψ~ij2}.(\gamma_{1j}^{*},\dots,\gamma_{nj}^{*},\hat{\mu}_{j})=\underset{\gamma_{1j},\dots,\gamma_{nj}\in\Gamma,\ \mu_{j}\in\mathcal{Q}}{\arg\min}\ \sum\limits_{i=1}^{n}\left\{\|\mu_{j}-q_{ij}\odot\gamma_{ij}\|^{2}+\lambda\|\psi_{ij}-\tilde{\psi}_{ij}\|^{2}\right\}. (12)

4.3 Registration algorithm and implementation details

The optimization problem for registration in (12) is solved in an iterative fashion in Algorithm 1 given below, where tools from Section 4.2 are used to model the spatially dependent cross-component phase within each observation. Figure 2 provides a high-level diagrammatic representation of Algorithm 1.

Refer to caption
Figure 2: High-level overview of Algorithm 1.

More specifically, given component functions {fij,i[n],j[K]}\{f_{ij},\ i\in[n],\ j\in[K]\} of multivariate functional data F1,,FnF_{1},\ldots,F_{n}, we first transform them to obtain their SRSFs {qij}\{q_{ij}\}. Then, for a fixed penalty parameter λ>0\lambda>0, Algorithm 1 can be decomposed into two blocks:

Initalization:

  1. (i)

    Obtain initial template component functions μ^1(0),,μ^K(0)\hat{\mu}^{(0)}_{1},\ldots,\hat{\mu}^{(0)}_{K} by performing independent componentwise registration of {q1j,,qnj}\{q_{1j},\ldots,q_{nj}\} for each j[K]j\in[K] given by (4);

  2. (ii)

    initialize the phase functions to the identity warping by initializing their SRSFs to ψij(0)1,i[n],j[K]\psi^{(0)}_{ij}\equiv 1,\ i\in[n],\ j\in[K];

  3. (iii)

    for each observation i[n]i\in[n], implement multiple registration of {qi1,,qiK}\{q_{i1},\ldots,q_{iK}\} using (4) to estimate cross-component phase ξi1,,ξiK\xi_{i1},\ldots,\xi_{iK};

  4. (iv)

    for each observation i[n]i\in[n], compute weights ζi1,,ζiK\zeta_{i1},\ldots,\zeta_{iK} required for the kriging estimate using the estimated phase trace variogram V^i\hat{V}_{i}.

Iteration: Starting with the initial values, obtain phases {γ^ij}\{\hat{\gamma}_{ij}\} and template components {μ^j}\{\hat{\mu}_{j}\} by iterating and updating, until convergence.

  1. (i)

    Compute ψ~ij\tilde{\psi}_{ij} in penalty using kriging coefficients {ζij}j\zeta_{ij\ell}\}_{\ell\neq j} and phases {γ^i}j\{\hat{\gamma}_{i\ell}\}_{\ell\neq j};

  2. (ii)

    solve γ^ij=argminγΓ{μ^jqijγ2+λγ˙ψ~ij2}\hat{\gamma}_{ij}=\underset{\gamma\in\Gamma}{\arg\min}\ \left\{\|\hat{\mu}_{j}-q_{ij}\odot\gamma\|^{2}+\lambda\|\sqrt{\dot{\gamma}}-\tilde{\psi}_{ij}\|^{2}\right\};

  3. (iii)

    compute template components {μ^j}\{\hat{\mu}_{j}\} by averaging {qijγ^ij}\{q_{ij}\odot\hat{\gamma}_{ij}\} over ii.

Algorithm 1 Spatially penalized registration of multivariate functions
1:Input: SRSFs qij,i[n],j[K]q_{ij},\ i\in[n],\ j\in[K] and regularization parameter λ\lambda.
2:Output: Estimated componentwise template functions μ^j,j[K]\hat{\mu}_{j},\ j\in[K], and warping functions γ^ij,i[n],j[K]\hat{\gamma}_{ij},\ i\in[n],\ j\in[K].
3:for  j=1j=1 to KK do
4:     Multiple alignment of qij,i[n]q_{ij},\ i\in[n] via (4) to initialize the template μ^j(0)\hat{\mu}_{j}^{(0)}.
5:end for
6:for  i=1i=1 to nn do
7:     Multiple alignment of {qij,j[K]}\{q_{ij},\ j\in[K]\} via (4) to estimate transformed cross-component phase {ξ^ij,j[K]}\{\hat{\xi}_{ij},\ j\in[K]\};
8:     Estimation of trace-variogram V^i(h)\hat{V}_{i}(h) in (9) using {ξ^ij,j[K]}\{\hat{\xi}_{ij},\ j\in[K]\};
9:     Estimation of coefficient vector 𝜻ij\bm{\zeta}_{ij} using V^i\hat{V}_{i} (Section 4.2).
10:end for
11:Set z=0z=0, ϵ1>0\epsilon_{1}>0 and ψ^ij(0)(t)=1,i[n],j[K]\hat{\psi}_{ij}^{(0)}(t)=1,\ i\in[n],\ j\in[K] with μ^j(0)=n1i=1nqij\hat{\mu}_{j}^{(0)}=n^{-1}\sum_{i=1}^{n}q_{ij}, j[K]j\in[K].
12:while z=0z=0 or j=1Kμ^j(z)μ^j(z1)>ϵ1\sum_{j=1}^{K}\|\hat{\mu}_{j}^{(z)}-\hat{\mu}_{j}^{(z-1)}\|>\epsilon_{1} do
13:     for  i=1i=1 to nn do
14:         Initialization of k=0k=0, ϵ2>0\epsilon_{2}>0 small;
15:         while k=0k=0 or j=1Kψ^ij(k)ψ^ij(k1)2>ϵ2\sum_{j=1}^{K}\|\hat{\psi}_{ij}^{(k)}-\hat{\psi}_{ij}^{(k-1)}\|^{2}>\epsilon_{2} do
16:              for  j=1j=1 to KK do
17:                  Estimation of ψ~ij\tilde{\psi}_{ij} using 𝜻ij\bm{\zeta}_{ij} and [ψ^i1(k+1),,ψ^i(j1)(k+1),ψ^i(j+1)(k),,ψ^iK(k)][\hat{\psi}^{(k+1)}_{i1},...,\hat{\psi}^{(k+1)}_{i(j-1)},\hat{\psi}^{(k)}_{i(j+1)},...,\hat{\psi}^{(k)}_{iK}];
18:                  Solve γ^ij=argminγΓ{μ^j(z)qijγ2+λγ˙ψ~ij2}\hat{\gamma}_{ij}=\underset{\gamma\in\Gamma}{\arg\min}\ \left\{\|\hat{\mu}^{(z)}_{j}-q_{ij}\odot\gamma\|^{2}+\lambda\|\sqrt{\dot{\gamma}}-\tilde{\psi}_{ij}\|^{2}\right\} and compute SRSF ψ^ij(k+1)\hat{\psi}^{(k+1)}_{ij} of γ^ij\hat{\gamma}_{ij}.
19:                  Set k=k+1k=k+1.
20:              end for
21:         end while
22:         Set γ^ij=γ^ij(k)\hat{\gamma}_{ij}=\hat{\gamma}_{ij}^{(k)}
23:     end for
24:     Set μ^j(z+1)=1ni=1n(qijγ^ij),j[K]\hat{\mu}_{j}^{(z+1)}=\frac{1}{n}\sum_{i=1}^{n}(q_{ij}\odot\hat{\gamma}_{ij}),\ j\in[K];
25:     Set z=z+1z=z+1.
26:end while

Algorithm 1 contains nested while loops (lines 12-26). The inner loop indexed by kk (lines 15-21) updates the estimated warping functions given the templates for all components. The outer loop index by zz (lines 12-26) updates the template for each component given the estimated warping functions. The convergence of Algorithm 1 is assessed empirically using simulation studies in Section 5.

5 Simulation studies

We conduct simulation studies to assess the performance of the proposed multiple registration approach for multivariate functional data. Specifically, we estimate the componentwise template functions {μj,j[K]}\{\mu_{j},\ j\in[K]\} using three different approaches: (1) the proposed method, which utilizes spatial cross-component phase correlation in penalized registration (Section 4 and Algorithm 1), (2) independent componentwise registration (Section 3.1 and Section 3.4 in (Srivastava et al., 2011)), and (3) universal registration wherein each component is assumed to have the same phase (Section 3.2).

5.1 Data generating model

We consider multivariate functional data wherein dependence between function components is induced by spatial correlation. In this setting, each component of a simulated observation is indexed by a spatial coordinate 𝐬p\mathbf{s}\in\mathbb{R}^{p}; component jj for observation ii is denoted by fi,𝐬jf_{i,\mathbf{s}_{j}}. Component functions of the data are generated using the model:

fi,𝐬j(t)=(μ𝐬j+ei,𝐬j)(ξi,𝐬jαi)(t),t[0,1],i[n],j[K].f_{i,\mathbf{s}_{j}}(t)=(\mu_{\mathbf{s}_{j}}+e_{i,\mathbf{s}_{j}})\circ(\xi_{i,\mathbf{s}_{j}}\circ\alpha_{i})(t),\quad t\in[0,1],\ i\in[n],\ j\in[K].

In the model, μ𝐬j\mu_{\mathbf{s}_{j}}\in\mathcal{F} denotes the template for component jj indexed by spatial location 𝐬j\mathbf{s}_{j}, which is the object that we wish to estimate through a registration procedure. The random functional error ei,𝐬je_{i,\mathbf{s}_{j}} and cross-component warping functions ξi,𝐬jΓ\xi_{i,\mathbf{s}_{j}}\in\Gamma are also indexed by the spatial location. The warping function αiΓ\alpha_{i}\in\Gamma denotes the observation specific warping function (common across all components).

We consider two simulation settings characterised by choice of the component template functions (see below). For both settings, the random phase components are composed of two warping functions: αiΓ\alpha_{i}\in\Gamma, the phase that is common across components for observation ii, and ξi,𝐬jΓ\xi_{i,\mathbf{s}_{j}}\in\Gamma, the cross-component phase within observation ii. Such a nested structure is similar to the structure in a mixed-effects model. The warping functions αi\alpha_{i} are taken to be cumulative distribution functions (CDFs) of a beta distribution, Beta(1,exp(zi))\text{Beta}(1,\exp(z_{i})), with random parameter ziUnif[Z,Z]z_{i}\sim\text{Unif}[-Z,Z]. The cross-component phase ξi,𝐬j\xi_{i,\mathbf{s}_{j}} is also the CDF of a beta distribution, Beta(1,exp(bi,𝐬j))\text{Beta}(1,\exp(b_{i,\mathbf{s}_{j}})), where (bi,𝐬1,,bi,𝐬K)(b_{i,\mathbf{s}_{1}},...,b_{i,\mathbf{s}_{K}})^{\top} follows the correlated uniform distribution on [B,B][-B,B], independently for every ii; a sample from the correlated uniform distribution can be generated by transforming a correlated multivariate normal sample with mean (0,,0)T(0,\dots,0)^{T} and Matern covariance CMat(,;1,0.5,)C_{Mat}(\cdot,\cdot;1,0.5,\ell). The parameters ZZ and BB control magnitudes of the cross-observation and cross-component phase variations; the range parameter is set to =0.5×dmax\ell=0.5\times d_{max}, where dmaxd_{max} is the maximum spatial distance between the simulated sites 𝐬j{\mathbf{s}_{j}}.

Simulation setting 1: In this setting the component template functions {μ𝐬j}\{\mu_{\mathbf{s}_{j}}\} have a specific bimodal form. We consider K=20K=20 components where the spatial coordinates 𝐬j\mathbf{s}_{j} are generated using a uniform distribution on [2,2]22[-2,2]^{2}\subset\mathbb{R}^{2}. The componentwise template functions are generated as μ𝐬j(t)=a1𝐬jexp(100(t1/3)2)+a2𝐬jexp(100(t2/3)2)\mu_{\mathbf{s}_{j}}(t)=a_{1\mathbf{s}_{j}}\exp(-100(t-1/3)^{2})+a_{2\mathbf{s}_{j}}\exp(-100(t-2/3)^{2}), where (am,𝐬1,,am𝐬K),m=1,2(a_{m,\mathbf{s}_{1}},\ldots,a_{m\mathbf{s}_{K}})^{\top},\ m=1,2 are independently sampled from a multivariate normal distribution with mean vector (3,,3)(3,\ldots,3)^{\top} and Matern covariance CMat(,;σa2,0.5,)C_{Mat}(\cdot,\cdot;\sigma_{a}^{2},0.5,\ell); here, σa2\sigma_{a}^{2} is the scale parameter, \ell is the range, and the smoothing parameter is fixed to 0.50.5. The random errors (ei,𝐬1(t),,ei,𝐬K(t))(e_{i,\mathbf{s}_{1}}(t),\ldots,e_{i,\mathbf{s}_{K}}(t))^{\top} are generated independently for each observation ii in a pointwise manner: for each value of tt we generate a sample from the multivariate normal distribution with mean vector (0,,0)(0,\ldots,0)^{\top} and covariance CMat(,;σe2,0.5,)C_{Mat}(\cdot,\cdot;\sigma_{e}^{2},0.5,\ell). The other covariance parameters are set to σa=1\sigma_{a}=1, σe=0.5\sigma_{e}=0.5 and =0.5×dmax\ell=0.5\times d_{max}.

Simulation setting 2: In this setting, in order to replicate the structure of EEG data, the component template functions {μ𝐬j}\{\mu_{\mathbf{s}_{j}}\} are chosen to possess the (typical) shape of EEG signals, i; thus each component is treated as a signal from an EEG electrode. We consider 16 electrode locations on the scalp with three-dimensional coordinates. To imitate the shape variation in EEG signals from different brain regions, we use a more flexible data generating model. The componentwise template functions are generated using 10 B-spline basis functions, Bm,m[10]B_{m},\ m\in[10], of order 4 on the interval [0,1][0,1]: μ𝐬j(t)=k=110βk,𝐬jBk(t)\mu_{\mathbf{s}_{j}}(t)=\sum_{k=1}^{10}\beta_{k,\mathbf{s}_{j}}B_{k}(t). The B-spline basis coefficients (βm,𝐬1,,βm,𝐬K),m[10](\beta_{m,\mathbf{s}_{1}},\ldots,\beta_{m,\mathbf{s}_{K}})^{\top},\ m\in[10] are generated independently from a multivariate normal distribution with mean vector (0,,0)(0,\ldots,0)^{\top} and Matern covariance CMat(,;σa2,0.5,)C_{Mat}(\cdot,\cdot;\sigma_{a}^{2},0.5,\ell). Random functional errors, ei,𝐬je_{i,\mathbf{s}_{j}}, are generated in the same way as in Simulation setting 1, but under a lower signal-to-noise ratio: the covariance parameters are set to σa=2\sigma_{a}=2, σe=0.5\sigma_{e}=0.5 or 11 and =0.5×dmax\ell=0.5\times d_{max}.

5.2 Assessing performance of template estimation

Denote by f~i,𝐬j\tilde{f}_{i,\mathbf{s}_{j}} and q~i,𝐬j\tilde{q}_{i,\mathbf{s}_{j}} the aligned multivariate functions and their SRSFs obtained after registration. We define two performance metrics that quantify the accuracy of template estimation based on the registered functions:

MSE=1Knj=1Ki=1nf~i,𝐬jμ𝐬j2,QMSE=1Knj=1Ki=1nq~i,𝐬jQ(μ𝐬j)2,\text{MSE}=\frac{1}{Kn}\sum\limits_{j=1}^{K}\sum\limits_{i=1}^{n}\|\tilde{f}_{i,\mathbf{s}_{j}}-\mu_{\mathbf{s}_{j}}\|^{2},\quad\text{QMSE}=\frac{1}{Kn}\sum\limits_{j=1}^{K}\sum\limits_{i=1}^{n}\|\tilde{q}_{i,\mathbf{s}_{j}}-Q(\mu_{\mathbf{s}_{j}})\|^{2},

where μ𝐬j\mu_{\mathbf{s}_{j}} is the true template for component jj. MSE, based on the 𝕃2\mathbb{L}^{2} norm on \mathcal{F}, is sensitive to phase errors, while QMSE, based on the eFR metric on \mathcal{F}, is sensitive to shape differences (Srivastava & Klassen, 2016).

We use n=20, 20n=20,\ 20 and K=20, 16K=20,\ 16, respectively, for the first and second simulation settings described in Section 5.1; we replicate each simulation 50 times. When implementing the proposed method, we use 4-fold cross-validation to select a value for the regularization parameter λ\lambda. Denote by μ^𝐬j[k]\hat{\mu}^{[-k]}_{\mathbf{s}_{j}} the estimated template function for component jj based on data in all folds except the kkth one; here, we use [k][k] to denote the set of observation indices in fold kk. Then, the value for the regularization parameter λ\lambda is chosen by minimizing the criterion

14Knk=14j=1Ki[k]fi,𝐬jγ^i,𝐬jμ^𝐬j[k](λ)2,\frac{1}{4Kn}\sum\limits_{k=1}^{4}\sum\limits_{j=1}^{K}\sum\limits_{i\in[k]}\|f_{i,\mathbf{s}_{j}}\circ\hat{\gamma}_{i,\mathbf{s}_{j}}-\hat{\mu}^{[-k]}_{\mathbf{s}_{j}}(\lambda)\|^{2},

where γ^i,𝐬j=argminγΓQ(μ^𝐬j[k](λ))Q(fi,𝐬j)γ2\hat{\gamma}_{i,\mathbf{s}_{j}}=\arg\min_{\gamma\in\Gamma}\|Q(\hat{\mu}^{[-k]}_{\mathbf{s}_{j}}(\lambda))-Q(f_{i,\mathbf{s}_{j}})\odot\gamma\|^{2}. That is, we align the functions in the validation set (fold kk) to the template function estimated using the training set (all folds except kk). The selected value for the regularization parameter is then used for registration of all observations.

Results for each simulation setting described in Section 5.1 are summarized in the boxplots shown in Figures 3 and 4, respectively. In all cases, we also report values of the error metrics when no registration is applied to the data; this serves as the baseline.

Refer to caption
Figure 3: Simulation setting 1 with Z=0.5Z=0.5 and σe=0.5\sigma_{e}=0.5. Boxplots of the MSE (left) and QMSE (right) for template functions estimated with no registration (red), componentwise registration (green), universal registration (cyan) and proposed penalized registration (purple). The scale parameter of cross-component phase variation is set to B=0B=0 or B=0.25B=0.25.
Refer to caption
Figure 4: Simulation setting 2 with Z=0.5Z=0.5 and B=0B=0. Boxplots of the MSE (left) and QMSE (right) for template functions estimated with no registration (red), componentwise registration (green), universal registration (cyan) and proposed penalized registration (purple). The noise levels are set to σe=0.5\sigma_{e}=0.5 or σe=1\sigma_{e}=1. In the case of σe=1\sigma_{e}=1, we also present results after applying smoothing splines with a low value of the smoothing parameter.

Simulation setting 1: We observe in Figure 3 that both componentwise registration and the proposed method clearly have smaller MSEs than universal registration; further, the mean MSE for the proposed method (0.083) is slightly lower than the mean MSE for the componentwise method (0.099). With respect to QMSE, the performance of componentwise registration is clearly worse than the proposed approach. Interestingly, universal registration outperforms componentwise registration in terms of median QMSE.

The improvement in registration quality with the proposed method over the componentwise approach is due to the addition of the regularization term. Componentwise registration tends to ‘overalign’ features of the component functions, e.g., peaks and valleys, that are due to the random error, i.e., the estimate of cross-component phase is too flexible. On the other hand, the universal approach performs poorly since all components within an observation are constrained to be identical, i.e., the estimate of cross-component phase is too restrictive. The compromise achieved by the proposed method between the componentwise and universal methods by introducing a regularization penalty that synthesizes spatial phase information from all components within an observation results in improved registration. Finally, we note that the performance of the proposed method is consistently better than the other two approaches, even if no cross-component phase variation exists in the data (the case when B=0B=0).

Simulation setting 2: Recall that the setting here is designed to replicate the structure in EEG data. The low signal-to-noise ratio presents significant challenges for all of the registration procedures. Figure 4 shows that the componentwise approach, expectedly, is the most sensitive to the random function errors due to its lack of regularization. Despite the good performance of this method in the first simulation setting, the low signal-to-noise ratio when σe=1\sigma_{e}=1 significantly deteriorates the template estimates, especially with respect to QMSE. In fact, when σe=1\sigma_{e}=1, the advantages of registration are not obvious since all of the procedures are essentially driven by random noise; regularization with the spatial penalty provides some help, but there is insufficient structure in {fi,𝐬j}\{f_{i,\mathbf{s}_{j}}\} to inform estimation of the conditional mean phase based on cross-component spatial correlation. In such situations, we recommend slight smoothing of the observed functions prior to registration using any of-the-shelf smoothing procedure (e.g., splines). To examine value in the pre-smoothing, we report additional results in Figure 4 after applying smoothing splines, with a small value for the smoothing parameter, to the simulated data when σe=1\sigma_{e}=1. The presented results show that the proposed registration method produces more accurate and stable template estimates than the other two registration approaches.

In Figure 5 we provide results of template estimation for a single component in one simulation run when σe=0.5\sigma_{e}=0.5. The simulated data for this component is shown in grey with the ground truth template in black. The estimated templates are shown in red. In Panel 1 of Figure 5, it is evident that when no registration is performed, the resulting template underestimates relevant shape features, e.g., the two peaks and one valley between t=0.4t=0.4 and t=0.75t=0.75; in Panel 2, componentwise registration results in a template that generates additional shape features due to ‘overalignment’ of noise, e.g., the two small peaks and three valleys between t=0t=0 and t=0.3t=0.3; in Panel 3, universal registration results in a template that has the ‘correct’ shape, but that is out of phase with respect to the ground truth template, e.g., there is significant lag between t=0.7t=0.7 and t=1t=1; Finally, in Panel 4, we note that the proposed approach results in a template that is very similar in shape to the ground truth and is largely in phase.

Refer to caption
Figure 5: Registration result for a single component in one simulation replicate under Simulation setting 2 with σe=0.5\sigma_{e}=0.5. Panel 1: Simulated data in grey with cross-sectional average without registration in red. Panels 2-4: Aligned functions in grey with estimated template in red, generated using componentwise registration, universal registration, and the proposed method, respectively. The ground truth template function is shown in black.

5.3 Assessing convergence of Algorithm 1

We empirically assess convergence properties of Algorithm 1 by monitoring the average cost function and the relative change in estimated phase components with respect to the iteration counter. We use a single simulation example generated under Simulation setting 1 with B=0B=0. To capture the complete trend of convergence, we remove the stopping rule in the algorithm and complete a fixed number of iterations. Recall that the algorithm has nested while loops. The inner loop updates the template for each component, while the outer loop updates the estimated warping functions given the templates. The number of iterations for the outer loop is set to 20; the number of iterations for the inner loop is set to 10. In our evaluation, the overall number of iterations is defined as the cumulative count of the iterations completed in the inner loop.

The first quantity of interest is the average of the cost function in (12) across all components j[K]j\in[K]. The second is the relative change in the estimated warping functions between iterations: δ(k)=1Kni=1nj=1Kψ^ij(k)ψ^ij(k1)2,k=2,\delta(k)=\frac{1}{Kn}\sum_{i=1}^{n}\sum_{j=1}^{K}\|\hat{\psi}_{ij}^{(k)}-\hat{\psi}_{ij}^{(k-1)}\|^{2},\ k=2,\dots. The left panel in Figure 6 is a plot of the average cost function after each inner (red) and outer (black) loop update with respect to the iteration counter. It appears that the algorithm has converged after approximately 100100 iterations. The right panel in Figure 6 plots the value of δ\delta with respect to the iteration counter. Overall, the value of δ\delta decreases with jumps occurring after an update of the component templates. The magnitudes of these jumps gradually converge to 0 as the component templates also converge.

Refer to caption
Figure 6: Left: Average cost function at each iteration of the inner (red, dashed) and outer (black, solid) loop. Right: Relative change in estimated warping functions δ\delta.

5.4 Comparison with regularization towards identity warping

The main advantage of the proposed registration method is the regularization of estimated warping functions by use of the spatial penalty term. Regularization reduces the complexity and variance of the estimated phase component and guards against ‘overalignment’ of function features that may be due to noise. In existing functional data analysis literature, most regularized registration methods use a regularization penalty that forces estimated warping functions to be close to the identity element, with respect to some prespecified distance on Γ\Gamma (Srivastava & Klassen, 2016), e.g., the squared extrinsic Fisher-Rao distance given by ψ12\|\psi-1\|^{2}, where ψ=Q(γ)\psi=Q(\gamma) and 1=Q(γid)1=Q(\gamma_{id}); see Section 3.1.1. However, in the case of multivariate functional data wherein phase is spatially dependent, the correlation structure in cross-component phase enables us to regularize estimated warping functions toward data-driven targets rather than the fixed identity warping function. To assess this benefit, we compare the estimated warping functions generated using the proposed approach to a regularized version of componentwise registration as specified by (3).

(a) (b)
Refer to caption Refer to caption
Figure 7: Estimated warping functions for (a) all components of a fixed observation ii and (b) a fixed component at location 𝐬j\mathbf{s}_{j} of all observations, under Simulation setting 1 for different values of the regularization parameter λ\lambda. Top row: Independent componentwise regularized registration. Bottom row: Proposed approach.

Figure 7(a) presents estimated warping functions for all j[K]j\in[K] components of the same observation ii in a randomly selected simulation run under Simulation setting 1. We consider three different values for the regularization parameter: λ=1×104\lambda=1\times 10^{-4}, 1×1011\times 10^{-1}, and 1×1031\times 10^{3}. The results of regularized componentwise registration are given in the top row while the results generated by the proposed approach are shown in the bottom row.

The magnitude and roughness of the cross-component phase estimated using regularized componentwise registration are penalized toward the identity warping; in regularized componentwise registration, when λ=1000\lambda=1000, the estimated warping functions all converge to the same identity element. On the other hand, the proposed approach allows estimated warping functions to converge to a more flexible target (see bottom right panel of Figure 7(a)) when λ\lambda increases. This is because the kriging estimate ψ~ij\tilde{\psi}_{ij} in the penalty can vary from observation to observation. This phenomenon is also observed in Figure 7(b) where we fix a component jj and display the estimated cross-observation phases. The regularized componentwise registration forces all warping functions towards the identity when λ\lambda is greater than 1. These results imply very small cross-observation phase variation, which is not true for the simulated data. On the other hand, the proposed penalized registration captures the cross-observation phase variation even if λ\lambda is fairly large. Based on these findings, we see that the proposed approach reduces the discrepancy of estimated warping functions for components in each observation and preserves the magnitude of estimated cross-observation phases. The proposed approach using the spatial penalty results in warping functions with appropriate smoothness and magnitude, but does not force them toward identity regardless of cross-observation phase variation.

6 Real data analysis

We apply the proposed penalized registration method to two multivariate functional datasets. The first consists of multi-trial EEG signals for alcoholism patients discussed earlier, and the second considers annual ozone concentration functions observed at different locations in a small area in northern California. In both cases, we use 4-fold cross-validation to select a value for λ\lambda in the proposed registration approach.

6.1 Electroencephalogram data

We analyze an EEG dataset arising from a study that examined EEG correlates of genetic predisposition to alcoholism (Bache & Lichman, 2013). The study resulted in EEG measurements recorded at 64 electrodes placed on the subjects’ scalps, with a sampling rate of 256 Hz (3.9-msec epoch) for 1 second. Each subject was exposed to a stimulus in each trial and completed multiple trials where different stimuli were shown. The study used standard 64-channel electrode placements as defined by the American Electroencephalographic Association. In our analysis, we use EEG signals from 50 trials for one subject, and treat each trial as an independent multivariate functional observation. Prior to analysis, each EEG signal was smoothed using smoothing splines with a small parameter value of 1×1051\times 10^{-5}. To compare the performance of different methods, we register the 50 trials using the componentwise, universal and proposed registration approaches. After registration, we average the aligned signals recorded by each electrode across trials to estimate the mean event related potentials.

For easy comparison across the three methods, we display the registration results for electrode AFZ in Figure 8. As in the simulations, we also display the result of averaging across trials when no registration is applied to the data as a baseline (Panel 1). The electrode AFZ was chosen as an illustration since many of the EEG signals recorded there contain a pronounced activation peak between t=0.75t=0.75 and t=1t=1. We observe similar results in this real EEG data example, based on the different registration methods, as in the simulation studies. In Panel 1, when no alignment is applied, the pronounced activation peak is almost completely nonexistent in the average signal shown in red; thus, registration of the data is necessary prior to averaging. In Panel 2, the componentwise registration method tends to ‘overalign’ many of the relatively small modes present in the signals that are likely due to noise. This results in an average that has many small fluctuations that are indistinguishable from the pronounced activation peak. In Panels 3 and 4, the universal and proposed methods are effective in capturing the large peak in the respective average signals and tend to produce fewer small fluctuations prior to t=0.75t=0.75. Essentially, these two methods mitigate the contribution of noise and result in average signals that reflect the most prominent feature in the given data. The magnitude of the pronounced activation peak is similar in the two averages, but it is stretched over a longer part of the domain in the average produced using the universal method.

In EEG data analysis, it is often of interest to study the relationship between average signals (across trials) at different electrode locations after registration. To show the benefits of the proposed approach in downstream EEG data analysis tasks, we explore the spatial correlation between estimated average EEG signals using the empirical trace-variogram (Giraldo et al., 2011), defined as

V^(h)=12|Nϵ(h)|a,bNϵ(h)μ^saμ^sb2,\displaystyle\hat{V}(h)=\frac{1}{2|N_{\epsilon}(h)|}\sum\limits_{a,b\in N_{\epsilon}(h)}\|\hat{\mu}_{s_{a}}-\hat{\mu}_{s_{b}}\|^{2}, (13)

where Nϵ(h)={(𝐬a,𝐬b):𝐬a𝐬b(hϵ,h+ϵ)}N_{\epsilon}(h)=\{(\mathbf{s}_{a},\mathbf{s}_{b}):\|\mathbf{s}_{a}-\mathbf{s}_{b}\|\in(h-\epsilon,h+\epsilon)\} for a small ϵ>0\epsilon>0 and μ^sa\hat{\mu}_{s_{a}} denotes the estimated average signal for electrode located at 𝐬a,a=1,,64\mathbf{s}_{a},\ a=1,\ldots,64. The empirical trace-variograms, computed based on average signals generated by the three different registration methods, are shown in the left panel of Figure 9 (componentwise in red, universal in blue, proposed in green). The variograms suggest very similar spatial correlation patterns between the average signals, but the proposed approach produces averages with smallest spatial variation. The universal approach results in more spatial variation due to its very restrictive assumption of common phase across all channels. The magnitude of spatial variation in this case is similar to the proposed method at small spatial distances. This is intuitive since cross-component phase variation at nearby electrodes is very small and the common phase assumption is reasonable. However, as the spatial distance increases, this assumption becomes unrealistic.

The componentwise method conducts separate alignment of EEG signals at each electrode, resulting in large phase variation across electrode locations. On the other hand, the proposed method avoids this issue by accounting for the spatial phase correlation across electrodes through the regularization penalty. To demonstrate this, we consider the 14 electrodes in the area corresponding to the parietal lobe. Since all of the 14 electrodes are related to the same brain region, we should expect very little phase variation in the resulting average signals. The corresponding estimated averages are shown in the middle panel in Figure 9 for the proposed method and the right panel in the same figure for the componentwise method. In the middle panel, there is very little phase variation across the estimated averages, as expected. However, in the right panel, it is easy to see that considerable phase variation remains.

Refer to caption
Figure 8: Registration and estimate μ^𝐬j\hat{\mu}_{\mathbf{s}_{j}} of template component μ𝐬j\mu_{\mathbf{s}_{j}} at electrode AFZ (location 𝐬j\mathbf{s}_{j}). Panel 1: Simulated data in grey with cross-sectional average without registration in red. Panels 2-4: Aligned functions in grey with estimated average signal in red, generated using componentwise registration, universal registration, and the proposed method, respectively.
Refer to caption
Figure 9: Empirical trace-variograms V^\hat{V} (left) computed using estimated average signals {μ^𝐬j,j[64]}\{\hat{\mu}_{\mathbf{s}_{j}},j\in[64]\} obtained using three registration methods: componentwise (red, solid), universal (blue, dashed), and proposed (green, dotted). Estimated average EEG signals {μ^𝐬1,,μ^𝐬14}\{\hat{\mu}_{\mathbf{s}_{1}},\ldots,\hat{\mu}_{\mathbf{s}_{14}}\} at 14 (out of the total 64) electrodes placed in an area related to the parietal lobe, computed using the proposed (middle) and componentwise (right) registration approaches.

6.2 Ozone concentration data

Ground-level ozone is a harmful pollutant that is monitored closely by the Environmental Protection Agency. The temporal trend of daily ozone concentration varies from year to year due to variations in various environmental conditions, including the weather. To explore average ozone concentration patterns at a certain location, we desire to first eliminate phase variation that exists across different observation years using a registration procedure. At the same time, we must account for cross-component phase variation across different locations within the same year that is due to different temporal patterns of pollutant spread. Thus, we view multi-year ozone concentration functions, observed at different spatial locations, as multivariate functional data with two different sources of phase variation; here, univariate ozone concentration functions observed at different spatial locations are treated as components of a full observation corresponding to a single year. In this analysis, we focus on a small area in northern California (353935^{\circ}\sim 39^{\circ} N, 120123120\sim 123^{\circ} W) with K=16K=16 observation stations (components). Each station recorded daily average ozone concentration (in parts per million) from year 2000 to year 2019 (sample size n=20n=20). The data is publicly available on the air data website111https://www.epa.gov/outdoor-air-quality-data of the United States Environmental Protection Agency. As in the previous real data analysis example that considered EEG data, we compare the performance of three registration procedures in this context: componentwise, universal and proposed.

We display the registration results for a single location, corresponding to Alameda County (37.8 N, 122.3 W), in Figure 10. We also display the result of averaging across trials when no registration is applied to the data as a baseline (Panel 1). We make several interesting observations based on these results. First, the proposed method yields an estimate of the average ozone concentration function (Panel 4) that contains clearer patterns than the average computed without alignment (Panel 1), e.g., the steep reduction in ozone concentration around day 200. The proposed approach produces an average that has very similar patterns to those computed using the componentwise and universal registration methods, but is smoother overall. The most noticeable difference occurs in early autumn where the proposed approach results in an average that has a single mode, while the other two methods result in averages with two smaller peaks. A possible reason for this phenomenon is the large regularization parameter value chosen via cross-validation; this restricts the complexity of estimated cross-component warping functions and forces the registration procedure to overlook small features that are potentially induced by noise.

Refer to caption
Figure 10: Registration and estimate μ^𝐬j\hat{\mu}_{\mathbf{s}_{j}} of template component μ𝐬j\mu_{\mathbf{s}_{j}} where 𝐬j\mathbf{s}_{j} is Alameda County (37.8 N, 122.3 W). Panel 1: Observed data in grey with cross-sectional average without registration in red. Panels 2-4: Aligned functions in grey with estimated average signal in red, generated using componentwise registration, universal registration, and the proposed method.
Refer to caption
Figure 11: Empirical trace-variograms V^\hat{V} (left) computed using estimated average ozone concentration functions {μ^𝐬j,j[16]}\{\hat{\mu}_{\mathbf{s}_{j}},\ j\in[16]\} at 16 locations in a small area in northern California obtained using three registration methods: componentwise (red, solid), universal (blue, dashed), and proposed (green, dotted). Estimated average ozone concentration functions {μ^𝐬j,j[16]}\{\hat{\mu}_{\mathbf{s}_{j}},\ j\in[16]\} computed using the proposed (middle) and componentwise (right) registration approaches.

As in the EEG data example, we further analyze the estimated average ozone concentration functions at the 16 spatial locations. Modeling of spatial correlation based on averages computed after registration is also of particular interest in this application. In the left panel of Figure 11, we display the empirical trace-variograms, computed using (13) after estimating the average ozone concentration functions using the componentwise (red), universal (blue) and proposed (green) approaches. The conclusions here are similar to those reached in the EEG data example. The proposed method yields componentwise averages that have smallest spatial variation. The variation gap between the universal and proposed methods appears bigger in this case, even for small spatial distances. We further display the componentwise estimated average ozone concentration functions, for each of the 16 locations, generated using the proposed and componentwise registration methods in the middle and right panels of Figure 11, respectively. It is obvious that the proposed approach (middle) results in estimated average functions that exhibit much less cross-component phase variation than the average functions generated using the componentwise registration method (right). Furthermore, most averages in the middle panel have two ozone concentration maxima around April/May and August, and a single ozone concentration minimum around June/July. In the right panel, there is much more variation in the number of extrema in the estimated averages as well as their timing.

7 Discussion

The proposed penalized registration procedure can in principle be used to align multivariate functional data with component functions that are correlated in other ways. For example, when component functions are temporally correlated, we can replace the kriging estimate ψ~ij\tilde{\psi}_{ij} in the penalty term of the objective function (6) with a temporal interpolant or temporally weighted estimate (e.g., moving time-window average). There is much to be done in this direction, and the proposed method represents a promising initial foray.

The focus in this paper was restricted to multivariate functional data wherein only the cross-component phases are spatially correlated. The objective function (6) does not quantify and incorporate any spatial correlation between the cross-component amplitudes, and this may well be of interest in certain applications. Extension of the algorithm to accommodate such a requirement is possible using the amplitude trace-variogram proposed by Guo et al. (2022), in addition to the phase trace-variogram used here. This presents a fruitful line for future work.

Importance of the elastic metric and its invariance to time warping, which is made practically useful through the square-root slope function (SRSF) representation, cannot be overstated. The invariance property is used to great benefit in both the objective function and the novel spatially-informed penalty term through the kriging estimate ψ~ij\tilde{\psi}_{ij}, and enables us to disregard estimating the cross-observation phases {αi}\{\alpha_{i}\}.

There is potential for improvement in reducing the computational burden when implementing the registration procedure when the number nn of functional observations with KK components are both large. Registration in the elastic framework uses the the dynamic programming algorithm since the class of warping functions Γ\Gamma is infinite-dimensional and unconstrained (Srivastava & Klassen, 2016). Restricting attention to a smaller parametric class (e.g., parameterised class of distribution or quantile functions on [0,1][0,1]) will reduce computing time considerably, but at the cost of flexibility in registration.

Acknowledgements: We acknowledge funding from the National Science Foundation (DMS-2015374 to KB; CCF-1740761, DMS-2015226, CCF-1839252 to SK), the National Institutes of Health (R37-CA214955 to SK and KB) and the Engineering and Physical Sciences Research Council (EP/V048104/1 to KB).

References

  • Bache & Lichman (2013) Bache, K. & Lichman, M. (2013). UCI machine learning repository. https://archive.ics.uci.edu/ml/datasets/EEG+Database.
  • Carroll & Müller (2021) Carroll, C. & Müller, H.-G. (2021). Latent transport models for multivariate functional data. arXiv:2107.05730 .
  • Carroll et al. (2021) Carroll, C., Müller, H.-G. & Kneip, A. (2021). Cross-component registration for multivariate functional data, with application to growth curves. Biometrics 77, 839–851.
  • Giraldo et al. (2011) Giraldo, R., Delicado, P. & Mateu, J. (2011). Ordinary kriging for function-valued spatial data. Environmental and Ecological Statistics 18, 411–426.
  • Guo et al. (2022) Guo, X., Kurtek, S. & Bharath, K. (2022). Variograms for kriging and clustering of spatial functional data with phase variation. Spatial Statistics , 100687.
  • Kurtek et al. (2012) Kurtek, S., Srivastava, A., Klassen, E. & Ding, Z. (2012). Statistical modeling of curves using shapes and related features. Journal of the American Statistical Association 107, 1152–1165.
  • Makeig et al. (2007) Makeig, S., Onton, J., Sejnowski, T. & Poizner, H. (2007). Prospects for mobile, high-definition brain imaging: EEG spectral modulations during 3-D reaching. Human Brain Mapping .
  • Marron et al. (2015) Marron, J. S., Ramsay, J. O., Sangalli, L. M. & Srivastava, A. (2015). Functional data analysis of amplitude and phase variation. Statistical Science 30, 468–484.
  • Olsen et al. (2016) Olsen, N., Markussen, B. & Raket, L. L. (2016). Simultaneous inference for misaligned multivariate functional data. Journal of the Royal Statistical Society: Series C 67, 1147–1176.
  • Park & Ahn (2017) Park, J. & Ahn, J. (2017). Clustering multivariate functional data with phase variation. Biometrics 73, 324–333.
  • Srivastava & Klassen (2016) Srivastava, A. & Klassen, E. P. (2016). Functional and Shape Data Analysis. Springer.
  • Srivastava et al. (2011) Srivastava, A., Wu, W., Kurtek, S., Klassen, E. & Marron, J. S. (2011). Registration of functional data using Fisher-Rao metric. arXiv:1103.3817 .
  • Stam et al. (2007) Stam, C. J., Nolte, G. & Daffertshofer, A. (2007). Phase lag index: assessment of functional connectivity from multi channel EEG and MEG with diminished bias from common sources. Human Brain Mapping 28, 1178–1193.
  • Sur & Sinha (2009) Sur, S. & Sinha, V. K. (2009). Event-related potential: An overview. Industrial Psychiatry Journal 18, 70–73.
  • Tsai et al. (2014) Tsai, A. C., Jung, T.-P., Chien, V. S., Savostyanov, A. N. & Makeig, S. (2014). Cortical surface alignment in multi-subject spatiotemporal independent EEG source imaging. NeuroImage 87, 297–310.
  • Wang et al. (2001) Wang, K., Begleiter, H. & Porjesz, B. (2001). Warp-averaging event-related potentials. Clinical Neurophysiology 112, 1917–1924.
  • Zhao et al. (2020) Zhao, W., Xu, Z., Li, W. & Wu, W. (2020). Modeling and analyzing neural signals with phase variability using Fisher-Rao registration. Journal of Neuroscience Methods 346, 108954.