
A new technique for compression of data sets

Anatoli Torokhti Centre for Industrial and Applied Mathematics, School of Mathematics and Statistics, University of South Australia, SA 5095, Australia. Ph: +61 8 8302 3812. Fax: +61 8 8302 5785. E-mail: anatoli. [email protected].
Abstract

Data compression techniques are characterized by four key performance indices: (i) associated accuracy, (ii) compression ratio, (iii) computational work, and (iv) degree of freedom. The method of data compression developed in this paper allows us to substantially improve all four of these indices.

The proposed transform $\mathcal{F}$ is presented in the form of a sum with $p-1$ terms, $\mathcal{F}_1,\ldots,\mathcal{F}_{p-1}$, where each term is a particular sub-transform given by a first-degree polynomial. For $j=1,\ldots,p-1$, each sub-transform $\mathcal{F}_j$ is determined from interpolation-like conditions. This device gives the transform the flexibility to incorporate variation in the observed data and leads to improved performance. The transform $\mathcal{F}$ has two degrees of freedom: the number of sub-transforms and the compression ratio associated with each sub-transform $\mathcal{F}_j$.

1 Introduction

In this paper, a new technique for compression of a set of random signals is proposed. The technique provides better associated accuracy, compression ratio and computational load than the Karhunen-Loève transform (KLT) and its known extensions.

The basic idea is to combine the methodologies of piece-wise linear function interpolation [1] and the best rank-constrained operator approximation. This issue is discussed in more detail in Sections 2 and 3.

The KLT is the fundamental and, perhaps, most popular data compression method [2]–[12]. It is also known as Principal Component Analysis (PCA) [13, 11] and “is probably the oldest and best known of the techniques of multivariate analysis” [13]. This technique is intensively used in a wide range of research areas such as data compression [14, 15], pattern recognition [16], interference suppression [17], image processing [18], forecasting [19], hydrology [20], nonlinear phenomena in physics [21], probabilistic mechanics [22, 23], biomechanics [24], geoscience [25, 26], stochastic processes [27, 28], information theory [29, 30], 3-D object classification and video coding [31], chemical theory [32], oceanology [33], optics and laser technology [34] and others.

The purpose of techniques based on the KLT is to compress an observable data vector $\mathbf{y}$ with $n$ components to a shorter vector $\widetilde{\mathbf{x}}$ with $r$ components (also called the principal components [13]) and then reconstruct it so that the reconstruction is close to the reference signal $\mathbf{x}$. In general, $\mathbf{x}$ contains $m$ components, where $m$ is not necessarily equal to $n$. The ratio

$$c=\frac{r}{m},\quad\mbox{where } r\leq m,$$

is called the compression ratio. Smaller $r$ implies poorer accuracy in the reconstruction of the compressed data. In this sense, $r$ is the KLT degree of freedom. Therefore, the performance of the KLT and related transforms is characterized by four key performance indices, which are (i) associated accuracy, (ii) compression ratio, (iii) computational work and (iv) the degree of freedom.

The method of data compression developed in this paper allows us to substantially (i) increase the accuracy, (ii) improve the compression ratio, (iii) decrease computational work and (iv) increase the degrees of freedom.

1.1 Motivations for the proposed technique

The motivations for the proposed technique are as follows.

1.1.1 Infinite sets of signals

Most of the literature on the subject of the KLT (relevant references can be found, for example, in [7]–[12]) discusses the properties of an optimal transform for an individual finite random signal. (Here, the signal is treated as a vector with random components; we say that a random signal $\mathbf{x}$ is finite if $\mathbf{x}$ has a finite number of components.) This means that if one wishes to compress and then reconstruct an infinite set of observable signals $K_Y=\{\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_N,\ldots\}$ so that the reconstructed signals are close to an infinite set of reference random signals $K_X=\{\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N,\ldots\}$ using the KLT approach, then one is forced to find an infinite set of corresponding transforms $\{\mathcal{T}_1,\mathcal{T}_2,\ldots,\mathcal{T}_N,\ldots\}$: one element $\mathcal{T}_i$ of the transform set for each representative $\mathbf{y}_i$ of the signal set $K_Y$. Clearly, such an approach cannot be applied in practice. (Here, $K_Y$ and $K_X$ are countable sets. More generally, $K_Y$ and $K_X$ might be uncountable when they depend on a continuous parameter $\alpha$.) Although the standard KLT has been extended in [9] to the case of infinite signal sets, its associated accuracy is still not satisfactory (see Sections 1.1.3, 4.8 and 5).

1.1.2 Computational work

Note that even in the case when $K_Y=\{\mathbf{y}_1,\mathbf{y}_2,\ldots,\mathbf{y}_N\}$ and $K_X=\{\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N\}$ with $N$ fixed can be represented as finite ‘long’ signals, the KLT applied to such signals leads to the computation of large covariance matrices. Indeed, if each $\mathbf{y}_i$ has $n$ components and each $\mathbf{x}_i$ has $m$ components, then the KLT approach requires the computation of a product of an $mN\times nN$ matrix and an $nN\times nN$ matrix, and the computation of an $nN\times nN$ pseudo-inverse matrix [7]. These require $O(2mn^2N^3)$ and $O(22n^3N^3)$ flops, respectively [35]. As a result, in this case, the computational work associated with the KLT approach becomes unreasonably hard.

1.1.3 Associated accuracy

For a given compression ratio, the accuracy associated with KLT-like techniques cannot be changed. If the accuracy is not satisfactory, then one has to change the compression ratio, which can be undesirable. Thus, for the KLT approach, the compression ratio is the only degree of freedom available to improve the accuracy.

1.2 Brief description of the KLT

First, we need some notation. Let $X=(\Omega,\Sigma,\mu)$ be a probability space, where $\Omega=\{\omega\}$ is the set of outcomes, $\Sigma$ a $\sigma$-field of measurable subsets in $\Omega$ and $\mu:\Sigma\rightarrow[0,1]$ an associated probability measure on $\Sigma$.

Let $\mathbf{x}\in L^2(\Omega,\mathbb{R}^m)$ and $\mathbf{y}\in L^2(\Omega,\mathbb{R}^n)$ be a reference signal and an observed signal, respectively. Let the norm $\|\cdot\|^2_\Omega$ be given by $\|\mathbf{x}\|^2_\Omega=\int_\Omega\|\mathbf{x}(\omega)\|^2_2\,d\mu(\omega)$, where $\|\mathbf{x}(\omega)\|_2$ is the Euclidean norm of $\mathbf{x}(\omega)\in\mathbb{R}^m$.

We consider a linear operator (transform) $\mathcal{K}:L^2(\Omega,\mathbb{R}^n)\rightarrow L^2(\Omega,\mathbb{R}^m)$ defined by a matrix $K\in\mathbb{R}^{m\times n}$ so that

$$[\mathcal{K}(\mathbf{y})](\omega)=K[\mathbf{y}(\omega)]. \quad (1)$$

A generic KLT [7] is the linear transform $\mathcal{K}$ that solves

$$\min_{\mathbb{K}}\|\mathbf{x}-\mathcal{K}(\mathbf{y})\|^2_\Omega\quad\mbox{subject to}\quad\mathrm{rank}\,\mathcal{K}=r\leq\min\{m,n\}, \quad (2)$$

where $\mathbb{K}$ is the class of all linear operators defined by (1). As a result, the KLT returns two matrices, $D\in\mathbb{R}^{m\times r}$ and $C\in\mathbb{R}^{r\times n}$, so that $K=DC$. Matrix $C$ performs filtering and compression of the observed data $\mathbf{y}$ to a shorter vector $\widetilde{\mathbf{x}}\in L^2(\Omega,\mathbb{R}^r)$ with $r$ components. Matrix $D$ reconstructs $\widetilde{\mathbf{x}}$ in such a way that the reconstructed signal is close to $\mathbf{x}$ in the sense of (2).

The operator $\mathcal{K}$ that solves (2) for the particular case $\mathbf{y}=\mathbf{x}$ is the standard KLT.

Thus, for a given compression ratio, the KLT minimizes the associated error over the class of all linear transforms of the rank defined by (2).

1.3 Related techniques

Despite the known KLT optimality, it may happen that the accuracy and compression ratio associated with the KLT are still not satisfactory. Owing to this observation and the KLT versatility in applications (see Section 1 above), the KLT idea has been extended in different directions, in particular in [2]–[6]. More recent developments in this area include the polynomial transforms [7], extensions of the KLT to the case of infinite signal sets [9], the best weighted estimators [10, 11], the distributed KLT [29] and the fast KLT using wavelets [36].

Nevertheless, the known transforms based on the KLT idea still carry the intrinsic difficulties of the original KLT. In particular, most of them have been developed for the case of a single signal, and for a fixed compression ratio, the associated error cannot be improved. In the case of the polynomial transforms [7], the error can be reduced if the number of terms in the polynomial increases; the latter implies a substantial computational burden that, in many cases, may stifle this intention.

1.4 Contribution. Particular features of the proposed transform

To describe the contribution and compare the features of the proposed transform with those of the KLT-like techniques, we consider the same issues as in Section 1.1.

1.4.1 Infinite sets of signals

Unlike most of the known transforms, the proposed technique allows us to determine a single transform to compress any signal from an infinite signal set. We note that the infinite sets of stochastic signal-vectors introduced in Section 2.2 are quite general and include, for example, time series.

1.4.2 Computational work

In the case of finite signal sets, the proposed technique requires less computational work compared to the KLT approach (see Sections 4.8.3 and 5).

1.4.3 Associated accuracy

Our transform provides data compression and subsequent reconstruction with any desired accuracy (Theorem 3 in Section 4.4). (This means that any desired accuracy is achieved theoretically, as is shown in Section 4.4 below. In practice, of course, the accuracy is increased to a prescribed reasonable level.) This is achieved due to the transform structure that follows from the device of piece-wise function interpolation (Sections 2.2 and 3). In particular, it provides the related degree of freedom, the number of the so-called interpolation pairs (see Sections 3 and 4.8.2), to improve the transform performance (Section 4.8).

Moreover, the proposed technique

(i) determines the data compression transform in terms of pseudo-inverse matrices so that the transform always exists,

(ii) uses the same initial information (signal samples) as is used in the KLT-like transforms (see Sections 4.5 and 5), and

(iii) provides the simultaneous filtering and compression.

The paper is organized as follows. In Section 2, a brief description of the problem is given. Some necessary preliminaries, used in our transform construction, are given in Section 2.2. In Section 2.2.2, the transform model $\mathcal{F}$ is provided. In Section 3, a rigorous statement of the problem is given, which is a generalization and extension of problem (2) to the case of filtering of infinite sets of stochastic signals. In Section 4, the proposed transform $F$ is determined from the interpolation conditions, and the associated error analysis is presented. In particular, in Section 4.8, a comparison with the KLT-like approach is given.

2 Underlying idea. Device of the transform

2.1 Underlying idea

In this paper, the underlying idea is different from those considered in the above-mentioned works. The proposed transform is constructed from a combination of the device of piece-wise linear function interpolation [1] and the best rank-constrained operator approximation. This means the following.

Suppose $K_Y$ and $K_X$ are uncountable sets of random signals. While the rigorous notation is given in Section 2.2, here we denote $\mathbf{y}(t,\cdot)\in K_Y$ and $\mathbf{x}(t,\cdot)\in K_X$, where $t\in[a\ b]\subset\mathbb{R}$ is a time snapshot. Let $a=t_1\leq\ldots\leq t_p=b$. Choose finite numbers of signals, $\{\mathbf{y}(t_1,\cdot),\ldots,\mathbf{y}(t_p,\cdot)\}\subset K_Y$ and $\{\mathbf{x}(t_1,\cdot),\ldots,\mathbf{x}(t_p,\cdot)\}\subset K_X$. The proposed transform $\mathcal{F}$ contains $p-1$ terms $\mathcal{F}_1,\ldots,\mathcal{F}_{p-1}$ (see (5)–(6) below) where, for $j=1,\ldots,p-1$, each term $\mathcal{F}_j$ is given as a first-order operator polynomial (see (6) below) and is determined from the interpolation conditions

$$\mathcal{F}_j[\mathbf{y}(t_j,\cdot)]=\mathbf{x}(t_j,\cdot)\quad\mbox{and}\quad\mathcal{F}_j[\mathbf{y}(t_{j+1},\cdot)]\approx\mathbf{x}(t_{j+1},\cdot). \quad (3)$$

A reason for using the approximate equality in (3) is explained in Section 3.3. In Section 3.1, the second condition in (3) is represented as the best rank-constrained approximation problem (11)–(12). As a result, such a transform has advantages that are similar to the known advantages of piece-wise function interpolation, as mentioned in Section 1.4. The procedure of data compression and reconstruction is performed via $p-1$ truncated singular value decompositions (SVDs), as provided in Section 4.3.

2.2 Device of the transform

2.2.1 Signal sets under consideration

We wish to consider signals in a wide sense as follows. Let $T:=[a,\ b]\subseteq\mathbb{R}$, and let $K_X$ and $K_Y$ be infinite sets of signals,
$K_X=\{\mathbf{x}(t,\cdot)\in L^2(\Omega,\mathbb{R}^m)\ |\ t\in T\}$ and $K_Y=\{\mathbf{y}(t,\cdot)\in L^2(\Omega,\mathbb{R}^n)\ |\ t\in T\}$, where

$$\mathbf{x}(t,\cdot)=[\mathbf{x}^{(1)}(t,\cdot)\ \ldots\ \mathbf{x}^{(m)}(t,\cdot)]^T\quad\mbox{with}\quad\mathbf{x}^{(j)}(t,\cdot)\in L^2(\Omega,\mathbb{R})\quad\forall\ j=1,\ldots,m,$$

and

$$\mathbf{y}(t,\cdot)=[\mathbf{y}^{(1)}(t,\cdot)\ \ldots\ \mathbf{y}^{(n)}(t,\cdot)]^T\quad\mbox{with}\quad\mathbf{y}^{(i)}(t,\cdot)\in L^2(\Omega,\mathbb{R})\quad\forall\ i=1,\ldots,n,$$

respectively.

We interpret $\mathbf{x}(t,\cdot)$ as a reference signal and $\mathbf{y}(t,\cdot)$ as an observable signal. (In an intuitive way, $\mathbf{y}(t,\cdot)$ can be regarded as a noise-corrupted version of $\mathbf{x}(t,\cdot)$; for example, $\mathbf{y}(t,\cdot)=\mathbf{x}(t,\cdot)+\mathbf{n}$, where $\mathbf{n}$ is white noise. In this paper, we do not restrict ourselves to this simplest version of $\mathbf{y}(t,\cdot)$ and assume that the dependence of $\mathbf{y}(t,\cdot)$ on $\mathbf{x}(t,\cdot)$ and $\mathbf{n}$ is arbitrary.)

The variable $t\in T$ represents time. (More generally, $T$ can be considered as a set of parameter vectors $\alpha=(\alpha^{(1)},\ldots,\alpha^{(q)})^T\in C^q\subseteq\mathbb{R}^q$, where $C^q$ is a $q$-dimensional cube. One coordinate of $\alpha$, say $\alpha^{(1)}$, could be interpreted as time.) Then, for example, $\mathbf{x}(t,\cdot)$ can be interpreted as an arbitrary stationary time series.

Let $\{t_k\}_1^p\subset T$ be a nondecreasing sequence of fixed time-points such that

$$a=t_1\leq\ldots\leq t_p=b. \quad (4)$$

2.2.2 The transform model

Now let $\mathcal{F}:L^2(\Omega,\mathbb{R}^n)\rightarrow L^2(\Omega,\mathbb{R}^m)$ be a transform such that, for each $t\in T$,

$$\mathcal{F}[\mathbf{y}(t,\cdot)]=\sum_{j=1}^{p-1}\delta_j\mathcal{F}_j[\mathbf{y}(t,\cdot)], \quad (5)$$

where

$$\mathcal{F}_j[\mathbf{y}(t,\cdot)]=\mathbf{a}_j+\mathcal{G}_j[\mathbf{y}(t,\cdot)]\quad\mbox{and}\quad\delta_j=\left\{\begin{array}{cl}1,&\mbox{if }t_j\leq t\leq t_{j+1},\\ 0,&\mbox{otherwise.}\end{array}\right. \quad (6)$$

Here, $\mathcal{F}_j$ is a sub-transform with $\mathbf{a}_j\in L^2(\Omega,\mathbb{R}^m)$, and $\mathcal{G}_j:L^2(\Omega,\mathbb{R}^n)\rightarrow L^2(\Omega,\mathbb{R}^m)$ is a linear operator represented by a matrix $G_j\in\mathbb{R}^{m\times n}$ so that

$$[\mathcal{G}_j(\mathbf{y})](t,\omega)=G_j[\mathbf{y}(t,\omega)].$$

For each $j=1,\ldots,p-1$, the vector $\mathbf{a}_j$ and the operator $\mathcal{G}_j$ are determined from the interpolation conditions given in Section 3.

Thus, $\mathcal{F}_j$ is defined by a matrix $F_j\in\mathbb{R}^{m\times n}$ such that

$$F_j[\mathbf{y}(t,\omega)]=\mathbf{a}_j(\omega)+G_j[\mathbf{y}(t,\omega)]. \quad (7)$$
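To make the structure of (5)–(7) concrete, the following is a minimal numerical sketch of how the piecewise transform could be evaluated. The names (evaluate_F, knots, a_list, G_list) are illustrative and not part of the paper: each sub-transform $\mathcal{F}_j$ is stored as a pair $(\mathbf{a}_j, G_j)$, and the indicator $\delta_j$ is realized by locating the sub-interval $[t_j, t_{j+1}]$ containing $t$.

```python
import numpy as np

def evaluate_F(y_t, t, knots, a_list, G_list):
    """Evaluate the piecewise transform (5)-(6) at time t.

    knots  : [t_1, ..., t_p], nondecreasing, as in (4)
    a_list : the p-1 vectors a_j (zero-indexed)
    G_list : the p-1 matrices G_j of shape (m, n) (zero-indexed)
    """
    # delta_j = 1 exactly for the sub-interval [t_j, t_{j+1}] containing t
    j = np.searchsorted(knots, t, side="right") - 1
    j = min(max(j, 0), len(G_list) - 1)  # clamp t to [t_1, t_p]
    return a_list[j] + G_list[j] @ y_t   # F_j[y(t)] = a_j + G_j y(t), cf. (7)
```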

3 Statement of the problem

Let $\mathbf{x}(t_1,\cdot),\ldots,\mathbf{x}(t_p,\cdot)$ and $\mathbf{y}(t_1,\cdot),\ldots,\mathbf{y}(t_p,\cdot)$ be signals chosen from the infinite signal sets $K_X$ and $K_Y$.

3.1 Preliminaries

Ideally, in (5)–(7), for $j=1,\ldots,p-1$, we would like to determine each $\mathcal{F}_j$ so that $\mathbf{a}_j$ satisfies

$$\mathbf{a}_j+\mathcal{G}_j[\mathbf{y}(t_j,\cdot)]=\mathbf{x}(t_j,\cdot) \quad (8)$$

and $\mathcal{G}_j$ solves

$$\mathbf{a}_j+\mathcal{G}_j[\mathbf{y}(t_{j+1},\cdot)]=\mathbf{x}(t_{j+1},\cdot). \quad (9)$$

The conditions (8)–(9) are motivated by the device of piece-wise function interpolation and its associated advantages [1]. In turn, (8) implies $\mathbf{a}_j=\mathbf{x}(t_j,\cdot)-\mathcal{G}_j[\mathbf{y}(t_j,\cdot)]$ which, being substituted into (9), reduces (9) to

$$\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j[\Delta\mathbf{y}(t_j,t_{j+1},\cdot)]=\theta, \quad (10)$$

where

$$\Delta\mathbf{x}(t_j,t_{j+1},\cdot)=\mathbf{x}(t_{j+1},\cdot)-\mathbf{x}(t_j,\cdot),\quad\Delta\mathbf{y}(t_j,t_{j+1},\cdot)=\mathbf{y}(t_{j+1},\cdot)-\mathbf{y}(t_j,\cdot)$$

and $\theta$ is the zero vector.

Nevertheless, the transform $\mathcal{F}$ with such $\mathbf{a}_j$ and $\mathcal{G}_j$ does not provide data compression. To provide compression, we require that, instead of (10), $\mathcal{G}_j$ solves

$$\min_{\mathcal{G}_j}\left\|\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j[\Delta\mathbf{y}(t_j,t_{j+1},\cdot)]\right\|^2_\Omega \quad (11)$$

subject to

$$\mathrm{rank}\,\mathcal{G}_j=r_j\leq\min\{m,n\}. \quad (12)$$

In (11), we use the notation

$$\|\Delta\mathbf{x}(t_j,t_{j+1},\cdot)\|^2_\Omega=\int_\Omega\|\Delta\mathbf{x}(t_j,t_{j+1},\omega)\|^2_2\,d\mu(\omega). \quad (13)$$

The constraint (12) leads to compression of $\mathbf{y}(t,\cdot)$ to a shorter vector with $r_j$ components. This issue is discussed in more detail in Section 4.3 below.

3.2 Problem formulation

Thus, the determination of $\mathcal{F}$ presented by (5)–(7) is reduced to the following problem: Given $\{\mathbf{x}(t_j,\cdot),\mathbf{y}(t_j,\cdot)\}_{j=1}^{p-1}$, let, for $j=1,\ldots,p-1$,

$$\mathbf{a}_j=\mathbf{x}(t_j,\cdot)-\mathcal{G}_j[\mathbf{y}(t_j,\cdot)], \quad (14)$$

as in (8). Find $\mathcal{G}_j$ that solves (11) subject to (12).

The above term “given” means that the covariance matrices associated with $\mathbf{x}(t_j,\cdot)$ and $\mathbf{y}(t_j,\cdot)$ are known or can be estimated. This assumption is similar to the assumption used in the known KLT-like transforms.

3.3 Problem discussion

We note that (8) and (14) can be represented as

$$\mathcal{F}_j[\mathbf{y}(t_j,\cdot)]=\mathbf{x}(t_j,\cdot). \quad (15)$$

Also, in (11), the term $\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j[\Delta\mathbf{y}(t_j,t_{j+1},\cdot)]$ can be rewritten as

$$\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j[\Delta\mathbf{y}(t_j,t_{j+1},\cdot)]=\mathbf{x}(t_{j+1},\cdot)-(\mathbf{a}_j+\mathcal{G}_j[\mathbf{y}(t_{j+1},\cdot)])=\mathbf{x}(t_{j+1},\cdot)-\mathcal{F}_j[\mathbf{y}(t_{j+1},\cdot)].$$

Therefore, (11)–(12) can be represented as

$$\mathcal{F}_j[\mathbf{y}(t_{j+1},\cdot)]\approx\mathbf{x}(t_{j+1},\cdot). \quad (16)$$

In other words, the relations (8), (14) and (11)–(12) mean that we wish to determine $\mathcal{F}_j$ so that $\mathcal{F}_j$ exactly interpolates $\mathbf{x}(t,\cdot)$ at $t=t_j$ and approximately interpolates $\mathbf{x}(t,\cdot)$ at $t=t_{j+1}$, as in (15) and (16), respectively.

For this reason, the pairs of signals $\{\mathbf{x}(t_1,\cdot),\mathbf{y}(t_1,\cdot)\},\ldots,\{\mathbf{x}(t_{p-1},\cdot),\mathbf{y}(t_{p-1},\cdot)\}$ are called the interpolation pairs.

It is worthwhile to note that for the case of pure filtering (with no compression) the constraint (12) is omitted.

4 Main results

4.1 Best rank-constrained matrix approximation

First we recall a recent result on the best rank-constrained matrix approximation [11, 38], which will be used in the next section.

Let $\mathbb{C}^{m\times n}$ be the set of $m\times n$ complex-valued matrices, and denote by $\mathcal{R}(m,n,k)\subseteq\mathbb{C}^{m\times n}$ the variety of all $m\times n$ matrices of rank $k$ at most. Fix $A=[a_{ij}]_{i,j=1}^{m,n}\in\mathbb{C}^{m\times n}$. Then $A^*\in\mathbb{C}^{n\times m}$ is the conjugate transpose of $A$. Let the SVD of $A$ be given by

$$A=U_A\Sigma_A V_A^*, \quad (17)$$

where $U_A\in\mathbb{C}^{m\times m}$ and $V_A\in\mathbb{C}^{n\times n}$ are unitary matrices, and $\Sigma_A:=\mbox{diag}(\sigma_1(A),\ldots,\sigma_{\min(m,n)}(A))\in\mathbb{C}^{m\times n}$ is a generalized diagonal matrix with the singular values $\sigma_1(A)\geq\sigma_2(A)\geq\ldots\geq 0$ on the main diagonal.

Let $U_A=[u_1\ u_2\ \ldots\ u_m]$ and $V_A=[v_1\ v_2\ \ldots\ v_n]$ be the representations of $U_A$ and $V_A$ in terms of their $m$ and $n$ columns, respectively. Let

$$P_{A,L}:=\sum_{i=1}^{\mathrm{rank}\,A}u_iu_i^*\in\mathbb{C}^{m\times m}\quad\mbox{and}\quad P_{A,R}:=\sum_{i=1}^{\mathrm{rank}\,A}v_iv_i^*\in\mathbb{C}^{n\times n} \quad (18)$$

be the orthogonal projections on the ranges of $A$ and $A^*$, respectively. Define

$$A_k:=\langle\langle A\rangle\rangle_k:=\sum_{i=1}^{k}\sigma_i(A)u_iv_i^*=U_{Ak}\Sigma_{Ak}V_{Ak}^*\in\mathbb{C}^{m\times n} \quad (19)$$

for $k=1,\ldots,\mathrm{rank}\,A$, where

$$U_{Ak}=[u_1\ u_2\ \ldots\ u_k],\quad\Sigma_{Ak}=\mbox{diag}(\sigma_1(A),\ldots,\sigma_k(A))\quad\mbox{and}\quad V_{Ak}=[v_1\ v_2\ \ldots\ v_k]. \quad (20)\mbox{--}(21)$$

For $k>\mathrm{rank}\,A$, we write $A_k:=A\ (=A_{\mathrm{rank}\,A})$. For $1\leq k<\mathrm{rank}\,A$, the matrix $A_k$ is uniquely defined if and only if $\sigma_k(A)>\sigma_{k+1}(A)$.

Recall that $A^\dagger:=V_A\Sigma_A^\dagger U_A^*\in\mathbb{C}^{n\times m}$ is the Moore-Penrose generalized inverse of $A$, where $\Sigma_A^\dagger:=\mathrm{diag}\left(\frac{1}{\sigma_1(A)},\ldots,\frac{1}{\sigma_{\mathrm{rank}\,A}(A)},0,\ldots,0\right)\in\mathbb{C}^{n\times m}$. See, for example, [37].

Henceforth, $\|\cdot\|$ designates the Frobenius norm.

Theorem 1 below provides a solution to the problem of finding a matrix $X_0$ such that

$$\|A-BX_0C\|=\min_{X\in\mathcal{R}(p,q,k)}\|A-BXC\| \quad (22)$$

and is based on the fundamental result in [38] (Theorem 2.1), which is a generalization of the well-known Eckart-Young theorem [35]. The Eckart-Young theorem states that, in the case when $m=p$, $q=n$ and $B=I_m$, $C=I_n$, the solution is given by $X_0=A_k$, i.e.

$$\|A-A_k\|=\min_{X\in\mathcal{R}(m,n,k)}\|A-X\|,\quad k=1,\ldots,\min\{m,n\}. \quad (23)$$
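As a reading aid, here is a minimal numerical sketch of the truncated SVD $\langle\langle A\rangle\rangle_k$ of (19), i.e. the Eckart-Young minimizer in (23). The function name truncate is ours, not the paper's.

```python
import numpy as np

def truncate(A, k):
    """Best rank-k approximation <<A>>_k of A in the Frobenius norm
    (Eckart-Young): keep the k leading singular triplets, as in (19)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```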
Theorem 1

[11] Let $A\in\mathbb{C}^{m\times n}$, $B\in\mathbb{C}^{m\times p}$ and $C\in\mathbb{C}^{q\times n}$ be given matrices. Let

$$L_B:=(I_p-P_{B,R})S\quad\mbox{and}\quad L_C:=T(I_q-P_{C,L}) \quad (24)$$

where $S\in\mathbb{C}^{p\times p}$ and $T\in\mathbb{C}^{q\times q}$ are any matrices, and $I_p$ is the $p\times p$ identity matrix. Then the matrix

$$X_0:=(I_p+L_B)B^\dagger\langle\langle P_{B,L}AP_{C,R}\rangle\rangle_k C^\dagger(I_q+L_C) \quad (25)$$

is a minimizing matrix for the minimal problem (22). Any minimizing $X_0$ has the above form.
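The following is a sketch of one minimizer (25) under the simplest choice of the free matrices, $S=T=0$, so that $L_B=L_C=0$; the projections $P_{B,L}=BB^\dagger$ and $P_{C,R}=C^\dagger C$ are formed directly from pseudo-inverses. The function name is ours.

```python
import numpy as np

def rank_constrained_minimizer(A, B, C, k):
    """A minimizer X0 of ||A - B X C||_F over rank(X) <= k, per (25)
    with the free matrices S = T = 0 (hence L_B = L_C = 0)."""
    B_pinv, C_pinv = np.linalg.pinv(B), np.linalg.pinv(C)
    M = B @ B_pinv @ A @ C_pinv @ C               # P_{B,L} A P_{C,R}
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # <<M>>_k, as in (19)
    return B_pinv @ M_k @ C_pinv
```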

4.2 Determination of the transform $F$

Let us introduce the inner product

$$\left\langle\Delta\mathbf{x}^{(i)}(t_j,t_{j+1},\cdot),\Delta\mathbf{y}^{(k)}(t_j,t_{j+1},\cdot)\right\rangle=\int_\Omega\Delta\mathbf{x}^{(i)}(t_j,t_{j+1},\omega)\,\Delta\mathbf{y}^{(k)}(t_j,t_{j+1},\omega)\,d\mu(\omega) \quad (26)$$

and the covariance matrix

$$E_{\Delta x_j\Delta y_j}=\left\{\left\langle\Delta\mathbf{x}^{(i)}(t_j,t_{j+1},\cdot),\Delta\mathbf{y}^{(k)}(t_j,t_{j+1},\cdot)\right\rangle\right\}_{i,k=1}^{m,n}.$$

We denote by $X^{1/2}$ a matrix such that $X^{1/2}X^{1/2}=X$.

Theorem 2

The proposed transform $F$ is given by

$$F[\mathbf{y}(t,\omega)]=\sum_{j=1}^{p-1}\delta_jF_j[\mathbf{y}(t,\omega)] \quad (27)$$

where

$$F_j[\mathbf{y}(t,\omega)]=\mathbf{x}(t_j,\omega)+G_j[\mathbf{y}(t,\omega)-\mathbf{y}(t_j,\omega)], \quad (28)$$

$$G_j=\langle\langle E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\rangle\rangle_{r_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger+M_{Gj}[I_n-E_{\Delta y_j\Delta y_j}^{1/2}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger], \quad (29)$$

and $M_{Gj}\in\mathbb{R}^{m\times n}$ is an arbitrary matrix.

Proof 1

First, we note that $F_j$ in (28) has the form (7) with

$$\mathbf{a}_j(\omega)=\mathbf{x}(t_j,\omega)-G_j[\mathbf{y}(t_j,\omega)]. \quad (30)$$

To find $\mathcal{G}_j$ that satisfies (11) and (12), we write

$$\begin{array}{rl}\left\|\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j(\Delta\mathbf{y}(t_j,t_{j+1},\cdot))\right\|^2_\Omega&=\mbox{tr}\{E_{\Delta x_j\Delta x_j}-E_{\Delta x_j\Delta y_j}G_j^T-G_jE_{\Delta y_j\Delta x_j}+G_jE_{\Delta y_j\Delta y_j}G_j^T\}\\[2pt] &=\|E_{\Delta x_j\Delta x_j}^{1/2}\|^2-\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\|^2+\|(G_j-E_{\Delta x_j\Delta y_j}E_{\Delta y_j\Delta y_j}^\dagger)E_{\Delta y_j\Delta y_j}^{1/2}\|^2\\[2pt] &=\|E_{\Delta x_j\Delta x_j}^{1/2}\|^2-\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\|^2+\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger-G_jE_{\Delta y_j\Delta y_j}^{1/2}\|^2.\end{array} \quad (31)$$

The latter is true because

$$E_{\Delta y_j\Delta y_j}^\dagger E_{\Delta y_j\Delta y_j}^{1/2}=(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger$$

and

$$E_{\Delta x_j\Delta y_j}E_{\Delta y_j\Delta y_j}^\dagger E_{\Delta y_j\Delta y_j}=E_{\Delta x_j\Delta y_j} \quad (32)$$

by Lemma 24 in [7].

In (31), the only term that depends on $G_j$ is $\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger-G_jE_{\Delta y_j\Delta y_j}^{1/2}\|^2$. Thus the determination of $G_j$ is reduced to problem (22) with

$$A=E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger,\quad B=I_m,\quad X=G_j\quad\mbox{and}\quad C=E_{\Delta y_j\Delta y_j}^{1/2}. \quad (33)$$

Let the SVD of $E_{\Delta y_j\Delta y_j}^{1/2}$ be given by

$$E_{\Delta y_j\Delta y_j}^{1/2}=V_n\Sigma V_n^T \quad (34)$$

where $V_n=[v_1,\ldots,v_n]$ with $v_i$ the $i$-th column of $V_n$, $\Sigma=\mbox{diag}(\sigma_1,\ldots,\sigma_n)\in\mathbb{R}^{n\times n}$ and $\sigma_1,\ldots,\sigma_n$ are the associated singular values. Let $\mathrm{rank}\,E_{\Delta y_j\Delta y_j}^{1/2}=\rho$. In this case, the solution by Theorem 1 is given by

$$G_j=N_j+N_jT_j(I_n-V_\rho V_\rho^T), \quad (35)$$

where

$$N_j=\langle\langle E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger V_\rho V_\rho^T\rangle\rangle_{r_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger \quad (36)$$

and $T_j$ is an arbitrary matrix. Here,

$$(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger V_\rho V_\rho^T=V_\rho\Sigma_\rho^{-1}V_\rho^TV_\rho V_\rho^T=(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger \quad (37)$$

with $\Sigma_\rho^{-1}=\mathrm{diag}(\frac{1}{\sigma_1},\ldots,\frac{1}{\sigma_\rho})$. If we choose $M_{Gj}=N_jT_j$, then (29) follows from (35)–(37) because $V_\rho V_\rho^T=E_{\Delta y_j\Delta y_j}^{1/2}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger$.

4.3 Procedure of compression and de-compression

For each $t\in T$, we wish to filter the observed data $\mathbf{y}(t,\cdot)\in L^2(\Omega,\mathbb{R}^n)$, compress it to a shorter vector $\widetilde{\mathbf{x}}(t,\cdot)\in L^2(\Omega,\mathbb{R}^{r_j})$ with $r_j\leq\min\{m,n\}$ components (recall that in statistics those components are called the principal components [13]) and then reconstruct $\widetilde{\mathbf{x}}(t,\cdot)$ in the form $\widehat{\mathbf{x}}(t,\cdot)$ so that $\widehat{\mathbf{x}}(t,\cdot)$ is close to $\mathbf{x}(t,\cdot)$.

This procedure is provided by the proposed transform $F$ for two special cases,

(i) $t\in(t_1,\ b]$, and

(ii) $t=t_1$,

as follows. We consider the cases successively.

(i) For $t\in(t_1,\ b]$, compression and filtering of $\mathbf{y}(t,\cdot)$ are performed via a representation of $G_j$ in (29) as a product of two matrices,

$$G_j=D_{G_j}C_{G_j},$$

where $D_{G_j}$ is an $m\times r_j$ matrix and $C_{G_j}$ is an $r_j\times n$ matrix. Matrix $C_{G_j}$ compresses $\mathbf{y}(t,\omega)$ to a vector with $r_j$ components, and matrix $D_{G_j}$ reconstructs the compressed vector so that the reconstruction is close to $\mathbf{x}(t,\omega)$.

Matrices $D_{G_j}$ and $C_{G_j}$ are determined as follows.

Let us write the SVD of $G_j$ in (29) as

$$U_{G_j}\Sigma_{G_j}V_{G_j}^T=G_j \quad (38)$$

where the matrices

$$U_{G_j}=[u_{j1},\ldots,u_{jm}]\in\mathbb{R}^{m\times m},\quad\Sigma_{G_j}=\mbox{diag}(\sigma_1(G_j),\ldots,\sigma_{\min(m,n)}(G_j))\in\mathbb{R}^{m\times n}\quad\mbox{and}\quad V_{G_j}=[v_{j1},\ldots,v_{jn}]\in\mathbb{R}^{n\times n}$$

are analogous to the matrices $U_A$, $\Sigma_A$ and $V_A$ in the SVD (17). In particular, $\sigma_1(G_j),\ldots,\sigma_{\min(m,n)}(G_j)$ are the associated singular values.

Let us denote

$$U_{G_jr_j}=[u_{j1},\ldots,u_{jr_j}]\in\mathbb{R}^{m\times r_j},\quad\Sigma_{G_jr_j}=\mbox{diag}(\sigma_1(G_j),\ldots,\sigma_{r_j}(G_j))\in\mathbb{R}^{r_j\times r_j}\quad\mbox{and}\quad V_{G_jr_j}=[v_{j1},\ldots,v_{jr_j}]\in\mathbb{R}^{n\times r_j}, \quad (39)\mbox{--}(41)$$

where $r_j$ is as in (12).

Then the transform $\mathcal{F}$ in (27)–(29) can be written as

$$\mathcal{F}[\mathbf{y}(t,\cdot)]=\sum_{j=1}^{p-1}\delta_j\mathcal{F}_j[\mathbf{y}(t,\cdot)], \quad (42)$$

where

$$F_j[\mathbf{y}(t,\omega)]=\mathbf{z}_j(t_j,\omega)+D_{G_j}C_{G_j}\mathbf{y}(t,\omega), \quad (43)$$

$$\mathbf{z}_j(t_j,\omega)=\mathbf{x}(t_j,\omega)-G_j\mathbf{y}(t_j,\omega), \quad (44)$$

$$D_{G_j}=U_{G_jr_j}\Sigma_{G_jr_j}\in\mathbb{R}^{m\times r_j},\quad C_{G_j}=V_{G_jr_j}^T\in\mathbb{R}^{r_j\times n}, \quad (45)$$

or

$$D_{G_j}=U_{G_jr_j}\in\mathbb{R}^{m\times r_j},\quad C_{G_j}=\Sigma_{G_jr_j}V_{G_jr_j}^T\in\mathbb{R}^{r_j\times n}. \quad (46)$$

In particular, for $t=t_{j+1}$ and $j=1,\ldots,p-1$,

$$F_j[\mathbf{y}(t_{j+1},\omega)]=\mathbf{z}_j(t_j,\omega)+D_{G_j}C_{G_j}\mathbf{y}(t_{j+1},\omega).$$

Thus, for all $t\in(t_1,\ b]$, $C_{G_j}$ compresses $\mathbf{y}(t,\omega)$ to a shorter vector $\widetilde{\mathbf{x}}(t,\omega)=C_{G_j}\mathbf{y}(t,\omega)$ with $r_j$ components.

The reconstruction $\widehat{\mathbf{x}}(t,\cdot)=\mathcal{F}[\mathbf{y}(t,\omega)]$ of the reference signal $\mathbf{x}(t,\omega)$ from the compressed data $\widetilde{\mathbf{x}}(t,\omega)$ is performed by $\mathcal{F}_j$ in (43) so that

$$\widehat{\mathbf{x}}(t,\omega)=\mathbf{z}_j(t_j,\omega)+D_{G_j}\widetilde{\mathbf{x}}(t,\omega). \quad (47)$$

(ii) For $t=t_1$, the sub-transform $F_1$ in (28) reduces to

$$F_1[\mathbf{y}(t_1,\omega)]=\mathbf{x}(t_1,\omega),$$

which does not provide compression. To provide compression of $\mathbf{x}(t_1,\omega)$, we represent $F_1$, for $t\in[t_1,\ t_2]$, as

$$F_1[\mathbf{y}(t,\omega)]=\mathbf{x}(t_2,\omega)-G_1[\mathbf{y}(t,\omega)-\mathbf{y}(t_2,\omega)]. \quad (48)$$

The latter follows when, for $t\in[t_1,\ t_2]$, condition (8) is replaced by

$$\mathbf{a}_1+\mathcal{G}_1[\mathbf{y}(t_1,\cdot)]=\mathbf{x}(t_2,\cdot), \quad (49)$$

while condition (11)–(12), for $t\in[t_1,\ t_2]$, remains the same. Then, for $t=t_1$, (48) implies

$$F_1[\mathbf{y}(t_1,\omega)]=\widetilde{\mathbf{z}}_1(t_2,\omega)+D_{G_1}C_{G_1}\mathbf{y}(t_1,\omega), \quad (50)$$

where

$$\widetilde{\mathbf{z}}_1(t_2,\omega)=\mathbf{x}(t_2,\omega)-G_1\mathbf{y}(t_2,\omega),$$

and $D_{G_1}$ and $C_{G_1}$ are defined by (45)–(46). Here, $C_{G_1}$ and $D_{G_1}$ perform compression of $\mathbf{y}(t_1,\omega)$ and subsequent reconstruction as above.

Thus, under the transform $\mathcal{F}$, the compression of $\mathbf{y}(t,\omega)$ requires $p-1$ matrices $C_{G_j}$, and the reconstruction requires $p-1$ vectors $\mathbf{z}_j(t_j,\omega)$ and matrices $D_{G_j}$.

The compression ratio of the transform $\mathcal{F}$ in (42)–(43) is given by

$$c=\frac{r_j}{m},\quad\mbox{where }j=1,\ldots,p-1. \quad (51)$$

If $r=r_j$ for all $j=1,\ldots,p-1$, then

$$c=\frac{r}{m}. \quad (52)$$
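A minimal sketch of the compression/decompression step of this section is given below, assuming $G_j$ has already been computed; the factorization follows (45), and the function and variable names (split_compressor, z_j) are ours.

```python
import numpy as np

def split_compressor(G_j, r_j):
    """Factor G_j = D_{G_j} C_{G_j} through its truncated SVD, as in (45):
    D = U_r Sigma_r (m x r_j) reconstructs, C = V_r^T (r_j x n) compresses."""
    U, s, Vt = np.linalg.svd(G_j, full_matrices=False)
    D = U[:, :r_j] @ np.diag(s[:r_j])
    C = Vt[:r_j, :]
    return D, C

# Usage sketch for one sub-interval, with z_j = x(t_j) - G_j y(t_j) as in (44):
# x_tilde = C @ y_t            # compressed vector with r_j components
# x_hat   = z_j + D @ x_tilde  # reconstruction, cf. (47)
```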

4.4 Error associated with the transform $\mathcal{F}$ in (27)–(29)

Here, the analysis of the error associated with the transform $\mathcal{F}$ in (27)–(29) is provided.

Let us introduce the norm

$$\|\mathbf{x}(t,\cdot)\|^2_{T,\Omega}=\frac{1}{b-a}\int_T\|\mathbf{x}(t,\cdot)\|^2_\Omega\,dt.$$

For $A=E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger$, let $\mathrm{rank}\,A=\ell$, and let

$$\sigma_1\geq\sigma_2\geq\ldots\geq\sigma_\ell \quad (53)$$

be the singular values of $A$.

The following theorem establishes relationships for the errors associated with the proposed transform \cal F.

Theorem 3

Let the transform $\mathcal{F}$ be given by (42)–(50). Let $\widetilde{\mathbf{x}}(t,\cdot)=C_{G_j}\mathbf{y}(t,\cdot)\in L^2(\Omega,\mathbb{R}^{r_j})$ be the vector compressed and filtered from $\mathbf{y}(t,\cdot)\in L^2(\Omega,\mathbb{R}^n)$ by $C_{G_j}$ in (45), (46), where $r_j\leq\min\{m,n\}$. Then the reference signal $\mathbf{x}(t,\cdot)\in L^2(\Omega,\mathbb{R}^m)$ is reconstructed from $\widetilde{\mathbf{x}}(t,\cdot)$ by $\mathcal{F}$ as $\widehat{\mathbf{x}}(t,\cdot)=\mathcal{F}[\mathbf{y}(t,\omega)]$ with any given accuracy in the following sense:

$$\|\mathbf{x}(t,\cdot)-\mathcal{F}[\mathbf{y}(t,\cdot)]\|^2_{T,\Omega}\rightarrow 0 \quad (54)$$

as

$$\max_{j=1,\ldots,p-1}\|\mathbf{x}(t,\cdot)-\mathbf{x}(t_j,\cdot)\|^2_{T,\Omega}\rightarrow 0, \quad (55)$$

$$\max_{j=1,\ldots,p-1}\|\mathbf{y}(t,\cdot)-\mathbf{y}(t_j,\cdot)\|^2_{T,\Omega}\rightarrow 0 \quad (56)$$

$$\mbox{and}\quad p\rightarrow\infty. \quad (57)$$

For $t=t_{j+1}$ and $j=0,\ldots,p-1$, the error associated with $\mathcal{F}[\mathbf{y}(t_{j+1},\cdot)]$ is given by

$$\|\mathbf{x}(t_{j+1},\cdot)-\mathcal{F}[\mathbf{y}(t_{j+1},\cdot)]\|^2_\Omega=\|E_{\Delta x_j\Delta x_j}^{1/2}\|^2-\sum_{i=1}^{r_j}\sigma_i^2, \quad (58)$$

where $\sigma_1,\ldots,\sigma_{r_j}$ are defined by (53).
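For the sample-based setting of Section 4.5 below, the right-hand side of (58) can be evaluated directly, since $\|E_{\Delta x_j\Delta x_j}^{1/2}\|^2=\mathrm{tr}\,E_{\Delta x_j\Delta x_j}$. The following is a sketch under that assumption; the function name is ours.

```python
import numpy as np

def predicted_error(dX, dY, r_j, tol=1e-12):
    """Right-hand side of (58) from sample increments dX = X_{j+1} - X_j,
    dY = Y_{j+1} - Y_j: tr(E_dxdx) minus the r_j leading squared singular
    values of A = E_dxdy (E_dydy^{1/2})^+."""
    Eyy = dY @ dY.T                      # symmetric positive semidefinite
    w, V = np.linalg.eigh(Eyy)
    inv_sqrt = np.where(w > tol, 1.0 / np.sqrt(np.clip(w, tol, None)), 0.0)
    sqrt_pinv = (V * inv_sqrt) @ V.T     # (E_dydy^{1/2})^+
    A = dX @ dY.T @ sqrt_pinv
    s = np.linalg.svd(A, compute_uv=False)
    return np.trace(dX @ dX.T) - np.sum(s[:r_j] ** 2)
```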

Remark 1

Although, by assumption, $\mathbf{x}(t_{j+1},\cdot)$ is given, its compression implies the associated error presented by (58).

Proof 2

Let us first show that the statement in (54)–(57) is true. We have

$$\begin{array}{rl}\|\mathbf{x}(t,\cdot)-\mathcal{F}[\mathbf{y}(t,\cdot)]\|^2_{T,\Omega}&=\left\|\sum_{j=1}^{p-1}\delta_j[\mathbf{x}(t,\cdot)-\mathcal{F}_j[\mathbf{y}(t,\cdot)]]\right\|^2_{T,\Omega}\\[2pt] &=\left\|\sum_{j=1}^{p-1}\delta_j\{[\mathbf{x}(t,\cdot)-\mathbf{x}(t_j,\cdot)]+G_j[\mathbf{y}(t_j,\cdot)-\mathbf{y}(t,\cdot)]\}\right\|^2_{T,\Omega}\\[2pt] &\leq\max_{j=1,\ldots,p-1}\left\{\|\mathbf{x}(t,\cdot)-\mathbf{x}(t_j,\cdot)\|^2_{T,\Omega}+\|G_j\|\,\|\mathbf{y}(t,\cdot)-\mathbf{y}(t_j,\cdot)\|^2_{T,\Omega}\right\}\end{array} \quad (59)$$

where $G_j$ is given by (29). Then (55)–(57) and (59) imply (54). (In particular, $\mathbb{O}^\dagger=\mathbb{O}$, where $\mathbb{O}$ is the zero matrix.)

Let us now consider (58). In the notation of (19) and (33),

$$A_{r_j}:=\langle\langle E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\rangle\rangle_{r_j}.$$

By Lemma 42 in [7],

$$A_{r_j}(E_{\Delta y_j\Delta y_j})^\dagger E_{\Delta y_j\Delta y_j}=A_{r_j}.$$

It is also true [7] that

$$(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger=E_{\Delta y_j\Delta y_j}^\dagger E_{\Delta y_j\Delta y_j}^{1/2}.$$

Therefore, in (29), $G_j$ can be represented as

$$G_j=A_{r_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger+M_j[I_n-E_{\Delta y_j\Delta y_j}^{1/2}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger]=A_{r_j}E_{\Delta y_j\Delta y_j}^\dagger E_{\Delta y_j\Delta y_j}^{1/2}+M_j[I_n-E_{\Delta y_j\Delta y_j}^{1/2}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger].$$

On the basis of (11), (16), (29) and (31), we have

$$\|\mathbf{x}(t_{j+1},\cdot)-F[\mathbf{y}(t_{j+1},\cdot)]\|^2_\Omega=\left\|\Delta\mathbf{x}(t_j,t_{j+1},\cdot)-\mathcal{G}_j(\Delta\mathbf{y}(t_j,t_{j+1},\cdot))\right\|^2_\Omega=\|E_{\Delta x_j\Delta x_j}^{1/2}\|^2-\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\|^2+\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger-A_{r_j}\|^2 \quad (60)$$

where [35]

$$\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger-A_{r_j}\|^2=\sum_{i=r_j+1}^{\ell}\sigma_i^2. \quad (61)$$

Since

$$\|E_{\Delta x_j\Delta y_j}(E_{\Delta y_j\Delta y_j}^{1/2})^\dagger\|^2=\sum_{i=1}^{\ell}\sigma_i^2, \quad (62)$$

(60)–(62) imply (58).

4.5 A particular case: signals are given by their samples

In practice, signals $\mathbf{x}(t,\cdot)$ and $\mathbf{y}(t,\cdot)$ are given by their samples

$$X(t)=[\mathbf{x}(t,\omega_1)\ \ldots\ \mathbf{x}(t,\omega_q)]\in\mathbb{R}^{m\times q}\quad\mbox{and}\quad Y(t)=[\mathbf{y}(t,\omega_1)\ \ldots\ \mathbf{y}(t,\omega_q)]\in\mathbb{R}^{n\times q}, \quad (63)\mbox{--}(64)$$

respectively. In particular, for $t=t_j$, the samples are

$$X_j:=X(t_j)=[\mathbf{x}(t_j,\omega_1)\ \ldots\ \mathbf{x}(t_j,\omega_q)]\in\mathbb{R}^{m\times q}\quad\mbox{and}\quad Y_j:=Y(t_j)=[\mathbf{y}(t_j,\omega_1)\ \ldots\ \mathbf{y}(t_j,\omega_q)]\in\mathbb{R}^{n\times q}. \quad (65)\mbox{--}(66)$$

In this case, the transform $\mathcal{F}$ in (5)–(6) takes the form

$$F[Y(t)]=\sum_{j=1}^{p-1}\delta_jF_j[Y(t)], \quad (67)$$

where

$$F_j[Y(t)]=A_j+G_jY(t)\quad\mbox{and}\quad\delta_j=\left\{\begin{array}{cl}1,&\mbox{if }t_j\leq t\leq t_{j+1},\\ 0,&\mbox{otherwise,}\end{array}\right. \quad (68)$$

and $A_j=[\mathbf{a}_j(\omega_1),\ldots,\mathbf{a}_j(\omega_q)]\in\mathbb{R}^{m\times q}$ and $G_j\in\mathbb{R}^{m\times n}$ are matrices such that, for $j=1,\ldots,p-1$, $A_j$ satisfies the condition

$$A_j+G_jY_j=X_j \quad (69)$$

and $G_j$ solves

$$\min_{G_j}\|\Delta X_j-G_j\Delta Y_j\| \quad (70)$$

subject to

$$\mathrm{rank}\,G_j=r_j\leq\min\{m,n\}. \quad (71)$$

The solution is provided by the following corollary, which is a particular case of Theorem 2.

Corollary 1

In the practical case when the signals $\mathbf{x}(t,\cdot)$ and $\mathbf{y}(t,\cdot)$ are represented by the samples (63)–(66), the required transform is given by (67)–(68) with

$$F_j[Y(t)]=X_j+G_j[Y(t)-Y_j] \quad (72)$$

and

$$G_j=\langle\langle\Delta X_j\Delta Y_j^T[(\Delta Y_j\Delta Y_j^T)^{1/2}]^\dagger\rangle\rangle_{r_j}[(\Delta Y_j\Delta Y_j^T)^{1/2}]^\dagger+M_j[I_n-((\Delta Y_j\Delta Y_j^T)^{1/2})^\dagger(\Delta Y_j\Delta Y_j^T)^{1/2}], \quad (73)$$

where the matrix $M_j$ is arbitrary.

4.6 Numerical realization of the transform $\mathcal{F}$

In fact, formulas (72)–(73) represent a numerical realization of the transform presented by (27)–(29). In practice, $M_j$ can be chosen as the zero matrix, as is normally done in the KLT-like techniques.
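As an illustration only, the following is a minimal sketch of (73) with $M_j=0$, computing $G_j$ from the sample matrices of (65)–(66); the matrix square root of $\Delta Y_j\Delta Y_j^T$ and its pseudo-inverse are obtained from an eigendecomposition, and the function name is ours.

```python
import numpy as np

def G_from_samples(X_j, X_j1, Y_j, Y_j1, r_j, tol=1e-12):
    """Sample-based sub-transform matrix G_j of (73) with M_j = 0.
    X_j, X_j1: m x q samples X(t_j), X(t_{j+1}); Y_j, Y_j1: n x q samples."""
    dX, dY = X_j1 - X_j, Y_j1 - Y_j
    Eyy = dY @ dY.T                      # n x n, symmetric PSD
    w, V = np.linalg.eigh(Eyy)
    inv_sqrt = np.where(w > tol, 1.0 / np.sqrt(np.clip(w, tol, None)), 0.0)
    sqrt_pinv = (V * inv_sqrt) @ V.T     # [(dY dY^T)^{1/2}]^+
    A = dX @ dY.T @ sqrt_pinv            # the matrix to be rank-truncated
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_r = U[:, :r_j] @ np.diag(s[:r_j]) @ Vt[:r_j, :]   # <<A>>_{r_j}
    return A_r @ sqrt_pinv
```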

4.7 Summary of the proposed transform

Here, we provide a summary of the transform presented in Sections 4.2 and 4.3.

Let $K_X=\{\mathbf{x}(t,\cdot)\in L^2(\Omega,\mathbb{R}^m)\ |\ t\in T\}$ and $K_Y=\{\mathbf{y}(t,\cdot)\in L^2(\Omega,\mathbb{R}^n)\ |\ t\in T\}$ be the infinite sets of reference signals and observable signals, respectively (see Section 2.2).

Choose finite subsets of $K_X$ and $K_Y$, $\{\mathbf{x}(t_1,\cdot),\ldots,\mathbf{x}(t_{p-1},\cdot)\}$ and $\{\mathbf{y}(t_1,\cdot),\ldots,\mathbf{y}(t_{p-1},\cdot)\}$, respectively.

To process an observable signal $\mathbf{y}(t,\cdot)$, we use the transform $\mathcal{F}$ as presented in (42), (43), where $D_{G_j}$ and $C_{G_j}$ are given by (45) or (46). The matrices $D_{G_j}$ and $C_{G_j}$ are constructed from the SVD (38) truncated to the form (39)–(41).

The matrix $C_{G_j}$ compresses and filters $\mathbf{y}(t,\cdot)$ to a shorter vector $\widetilde{\mathbf{x}}(t,\cdot)$ (the vector of principal components). The reconstruction, $\widehat{\mathbf{x}}(t,\cdot)$, is performed in the form (47) so that $\widehat{\mathbf{x}}(t,\cdot)$ is close to the reference signal $\mathbf{x}(t,\cdot)$ in the sense of (54)–(57).

4.8 Advantages of the proposed transform. Comparison with the KLT and its extensions

The proposed methodology provides a single transform to process any signal from an infinite set of signals. This is a distinctive feature of the considered technique.

To the best of our knowledge, there is only one other known approach [9] that provides a transform with a similar property. Moreover, the KLT is a particular case of the transform in [9].

Therefore, it is natural to compare the proposed transform with that in [9]. The transform developed in [9] is presented in the special form of a sum with $p$ terms, where each term is determined from the preceding terms as a solution of a minimization problem for the associated error. The final term in such an iterative procedure provides signal compression and decompression. In particular, the KLT [7] follows from [9] if $p=1$.

Here, we first compare the transform in [9] and the proposed transform $\mathcal{F}$, and demonstrate the advantages of $\mathcal{F}$. Then we show that similar advantages also occur in comparison with other transforms based on the KLT idea.

4.8.1 Associated errors of the transform $\mathcal{F}$ and the transform in [9]

Together with the distinctive feature of the transform $\mathcal{F}$ mentioned above, its other distinctive property is an arbitrarily small associated error in the reconstruction of the reference signal $\mathbf{x}(t,\cdot)$ (Theorem 3 in Section 4.4). This is achieved under conditions (55)–(57).

The transform in [9] does not provide such a property.

The reason is that $\mathcal{F}$, given by (42)–(47), is composed from sequences of signals measured at different time instants $t_i$, not from averages of signals over the domain $T\times\Omega$ as in [9]. Therefore, the transform $\mathcal{F}$ is ‘flexible’ to variations of the signals $\mathbf{x}(t,\cdot)$ and $\mathbf{y}(t,\cdot)$, and this inherent feature leads to the decrease in the associated error established in Theorem 3 above.

4.8.2 Degrees of freedom to reduce the associated error

Both the transform $\mathcal{F}$ given by (42)–(47) and the transform in [9] have two degrees of freedom to reduce the associated error: the number of terms that compose the transform, and the matrix ranks. At the same time, the error associated with the method in [9] is bounded despite an increase in its number of terms, while for the proposed transform $\mathcal{F}$, increasing the number of terms leads to an arbitrarily small associated error (see (54)–(57)). This occurs because of the ‘flexibility’ of the transform $\mathcal{F}$ discussed in Section 4.8.3 below.

Moreover, unlike the method in [9], the proposed approach may provide one more degree of freedom: the distribution of the interpolation pairs $\{\mathbf{x}(t_j,\cdot),\mathbf{y}(t_j,\cdot)\}_{j=1}^p$ related to the distribution of the points $t_1,\ldots,t_p$. An ‘appropriate’ selection of $\{\mathbf{x}(t_j,\cdot),\mathbf{y}(t_j,\cdot)\}_{j=1}^p$ may diminish the error. At the same time, an optimal choice of $\{\mathbf{x}(t_j,\cdot),\mathbf{y}(t_j,\cdot)\}_{j=1}^p$ is a specific problem which is not under consideration in this paper.

4.8.3 Associated assumptions. Numerical realization

A numerical realization of the KLT-based transforms requires knowledge or estimation of the related covariance matrices. The estimates are normally found from the samples $X(t_1),\ldots,X(t_p)$ and $Y(t_1),\ldots,Y(t_p)$ presented in Section 4.5. That is, the assumptions used in numerical realizations of the KLT-based techniques and the proposed transform $\mathcal{F}$ (given by (42)–(50)) are, in fact, the same: it is assumed that the samples $X(t_1),\ldots,X(t_p)$, $Y(t_1),\ldots,Y(t_p)$ are known. In other words, the preparatory work that should be done for the numerical realization of the KLT and of our transform $\mathcal{F}$ is the same. At the same time, the usage of those signal samples is different. While the transform in [9] requires estimates of two covariance matrices in the form of two related averages formed from $X(t_1),\ldots,X(t_p)$, $Y(t_1),\ldots,Y(t_p)$, our transform $\mathcal{F}$ uses each representative $X(t_j)$ and $Y(t_j)$ separately. This makes the transform $\mathcal{F}$ ‘flexible’ to variations of $\mathbf{x}(t,\cdot)$ and $\mathbf{y}(t,\cdot)$.

In the case of finite signal sets $\{\mathbf{y}_{(1)},\ldots,\mathbf{y}_{(N)}\}$ and $\{\mathbf{x}_{(1)},\ldots,\mathbf{x}_{(N)}\}$, we might apply an individual KLT transform $W_i$ to each pair $\{\mathbf{y}_{(i)},\mathbf{x}_{(i)}\}$, for $i=1,\ldots,N$, separately. Such a procedure would require $N$ times more computational work compared with the transform in [9].

As a result, while the computational effort associated with the proposed technique is similar to or less than that needed for the KLT-based transforms, the transform $\cal F$ given by (42)–(46) allows us to substantially improve the accuracy of signal estimation, as has been shown in Theorem 3. This observation is also demonstrated in the simulations presented in Section 5 below.

Table 1. Accuracy associated with the proposed transform $\cal F$.

                                         Number of interpolation pairs
    Compression ratio                  p=9      p=21     p=41     p=81

    The best accuracy associated with our transform $F$, $\varepsilon_{\min}(F)\times 10^{-8}$:
    c = 5/116                          0.7      0.5      0.48     0.31
    c = 10/116                         0.5      0.5      0.45     0.23

    The worst accuracy associated with our transform $F$, $\varepsilon_{\max}(F)\times 10^{-8}$:
    c = 5/116                          1.71     1.04     0.71     0.64
    c = 10/116                         1.49     1.07     0.76     0.56
Table 2. Comparison in accuracy of the proposed transform $\cal F$ and the KLT.

                              Number of interpolation pairs
                      p=9             p=21            p=41            p=81
    Compr. ratio  Δ_min  Δ_max    Δ_min  Δ_max    Δ_min  Δ_max    Δ_min  Δ_max
    c = 5/116      4.9   15.4      9.3   18.2     10.1   19.7     12.4   22.9
    c = 10/116     4.3   17.4     10.3   18.7     12.9   22.0     15.7   25.1
Figure 1: Examples of selected reference signals: (a) $X^{(1)}$, (b) $X^{(55)}$, (c) $X^{(115)}$, (d) $X^{(160)}$, (e) $X^{(225)}$, (f) $X^{(280)}$, (g) $X^{(340)}$, (h) $X^{(400)}$.
Figure 2: Illustration of the simulation results from Tables 1 and 2. (a): Reference signal $X^{(280)}$. (b): Observed data $Y^{(280)}$. (c): Reconstruction of $X^{(280)}$ from $Y^{(280)}$ by the KLT with compression ratio $10/116$. (d): Reconstruction of $X^{(280)}$ from $Y^{(280)}$ by the proposed transform $F$ with compression ratio $10/116$.

5 Simulations

Here, we consider a case where $K_X$ and $K_Y$ (introduced in Section 2.2) are represented by finite signal sets with $N$ members, and illustrate the advantages of the proposed technique over methods based on the KLT approach. In many practical problems (arising, e.g., in DNA analysis [39]) the number $N$ is quite large, for instance, $N=\mathcal{O}(10^4)$ [39]. In these simulations, we set $N=400$.

Let us suppose that $\mathbf{y}(\tau_1,\cdot),\mathbf{y}(\tau_2,\cdot),\ldots,\mathbf{y}(\tau_{400},\cdot)$ are $400$ input stochastic signals, where $\mathbf{y}(\tau_k,\cdot)\in L^2(\Omega,\mathbb{R}^n)$ for $k=1,\ldots,400$ and $n=116$. The reference stochastic signals are $\mathbf{x}(\tau_1,\cdot),\mathbf{x}(\tau_2,\cdot),\ldots,\mathbf{x}(\tau_{400},\cdot)$, where $\mathbf{x}(\tau_k,\cdot)\in L^2(\Omega,\mathbb{R}^m)$ for $k=1,\ldots,400$ and $m=n=116$. Thus, in these simulations, the interval $[a\;b]$ introduced in Section 2.2 is modelled by the $400$ points $\tau_k$ with $k=1,\ldots,400$, so that $[a\;b]=[\tau_1,\tau_2,\ldots,\tau_{400}]$.

The signals $\mathbf{x}(\tau_k,\cdot)$ and $\mathbf{y}(\tau_k,\cdot)$ have been simulated as digital images presented by $116\times 512$ matrices $X^{(k)}$ and $Y^{(k)}$, respectively, with $k=1,\ldots,400$. Each column of the matrices $X^{(k)}$ and $Y^{(k)}$ represents a realization of the signals $\mathbf{x}(\tau_k,\cdot)$ and $\mathbf{y}(\tau_k,\cdot)$, respectively. Each matrix $X^{(k)}$ represents data obtained from the digital photograph ‘Stream and bridge’ (the database is available at http://sipi.usc.edu/services/database.html). Examples of selected images $X^{(k)}$ are shown in Fig. 1.

Observed noisy images $Y^{(1)},\ldots,Y^{(400)}$ have been simulated in the form

$Y^{(k)}=\mbox{\tt randn}\bullet X^{(k)}\bullet\mbox{\tt rand},$

for each $k=1,\ldots,400$. Here, $\bullet$ denotes the Hadamard product, and randn and rand are $116\times 512$ matrices with random entries that simulate noise. The entries of randn are normally distributed with mean zero and variance one. The entries of rand are uniformly distributed on the interval $(0,1)$. A typical example of such noisy images is given in Fig. 2 (b).
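For concreteness, a minimal sketch of this noise model in Python/NumPy follows (the function name and the seed are hypothetical; the paper itself refers to the MATLAB-style generators randn and rand):

    import numpy as np

    rng = np.random.default_rng(0)  # hypothetical fixed seed

    def simulate_observation(X):
        """Form Y = randn . X . rand, '.' being the Hadamard (entrywise) product."""
        gaussian = rng.standard_normal(X.shape)  # entries ~ N(0,1), as randn
        uniform = rng.random(X.shape)            # entries ~ U(0,1), as rand
        return gaussian * X * uniform            # entrywise multiplication

    Y = simulate_observation(rng.random((116, 512)))  # placeholder 116 x 512 image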

We wish to filter and compress the observed data $Y^{(1)},\ldots,Y^{(400)}$ so that their subsequent reconstructions are close to the reference signals $X^{(1)},\ldots,X^{(400)}$, respectively.

The proposed transform $\cal F$ given by (67)–(68) and (72)–(73), the generic KLT [7] and the transform in [9] were applied to this task. To compare their performance, we address the same three issues as those in Sections 1.1 and 1.4, as follows.

5.1 Large sets of signals

In these simulations, the signal sets are large but finite. Therefore, the known transforms developed for the compression of an individual finite random signal-vector can be applied to this case. This issue has been mentioned in Section 1.1 above. Nevertheless, the known methods entail difficulties (insufficient accuracy and excessive computational work) that are discussed below.

5.2 Associated accuracy and compression ratios

5.2.1 Proposed transform \cal F

As mentioned in Section 4.8.2, the proposed technique has two degrees of freedom: the number $p$ of interpolation pairs $\{\mathbf{x}_j,\mathbf{y}_j\}_{j=1}^{p}$ and the rank $r_j$ of the matrix $G_j$ (see (12), (2) and (71)). We set $r_j=r$ for all $j$.

To demonstrate the properties and advantages of the proposed transform $\cal F$, the interpolation pairs $\{X_1,Y_1\},\ldots,\{X_p,Y_p\}$ have been chosen in four different ways, as follows:

1st choice: $p=9$ and the interpolation pairs are $\{X_1,Y_1\}=\{X^{(1)},Y^{(1)}\}$ and $\{X_{i+1},Y_{i+1}\}=\{X^{(50i)},Y^{(50i)}\}$ for $i=1,\ldots,8$;

2nd choice: $p=21$ and the interpolation pairs are $\{X_1,Y_1\}=\{X^{(1)},Y^{(1)}\}$ and $\{X_{i+1},Y_{i+1}\}=\{X^{(20i)},Y^{(20i)}\}$ for $i=1,\ldots,20$;

3rd choice: $p=41$ and the interpolation pairs are $\{X_1,Y_1\}=\{X^{(1)},Y^{(1)}\}$ and $\{X_{i+1},Y_{i+1}\}=\{X^{(10i)},Y^{(10i)}\}$ for $i=1,\ldots,40$;

4th choice: $p=81$ and the interpolation pairs are $\{X_1,Y_1\}=\{X^{(1)},Y^{(1)}\}$ and $\{X_{i+1},Y_{i+1}\}=\{X^{(5i)},Y^{(5i)}\}$ for $i=1,\ldots,80$.

A small index-selection sketch for these four choices is given below.
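In each of the four choices the pairs are drawn uniformly from the 400 signals, with strides 50, 20, 10 and 5, respectively. The following minimal sketch (with a hypothetical helper name, not from the paper) reproduces the index selection:

    # Each choice prepends index 1 and then steps through the 400 signals
    # with a fixed stride; count = (400 / stride) indices follow, so that
    # p = count + 1.
    def interpolation_indices(stride, count):
        return [1] + [stride * i for i in range(1, count + 1)]

    assert interpolation_indices(50, 8)[-1] == 400    # 1st choice, p = 9
    assert len(interpolation_indices(5, 80)) == 81    # 4th choice, p = 81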

For $k=1,\ldots,400$, the accuracy associated with the compression and filtering of each $Y^{(k)}$, and its subsequent reconstruction by the proposed transform $F$, is represented by

\varepsilon_{k}(F)=\|X^{(k)}-F[Y^{(k)}]\|^{2}_{F} \qquad (74)

and

\varepsilon_{\min}(F)=\min_{k=1,\ldots,400}\varepsilon_{k}(F),\quad \varepsilon_{\max}(F)=\max_{k=1,\ldots,400}\varepsilon_{k}(F). \qquad (75)
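Under the assumption that the transform is available as a callable (the names below are hypothetical), (74)–(75) amount to the following Python/NumPy computation:

    import numpy as np

    def accuracy_range(X_list, Y_list, F):
        """Return (eps_min, eps_max) of eps_k = ||X_k - F(Y_k)||_F^2, as in (74)-(75)."""
        eps = [np.linalg.norm(X - F(Y), 'fro') ** 2
               for X, Y in zip(X_list, Y_list)]
        return min(eps), max(eps)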

In Fig. 2 (d), an example of the restoration of the signal $X^{(280)}$ from the noisy observed data $Y^{(280)}$ is given. $X^{(280)}$ and $Y^{(280)}$ are typical representatives of the signals under consideration. The image in Fig. 2 (d) has been evaluated for $p=9$, as in the 1st choice above, and the compression ratio $10/116$.

The values of $\varepsilon_{\min}(F)$ and $\varepsilon_{\max}(F)$ associated with the different choices of the above interpolation pairs are given in Table 1. The first column lists the compression ratios used in the transform $F$. In particular, it follows from Table 1 that the accuracy improves as the number of interpolation pairs increases. This confirms statement (54)–(57) of Theorem 3.

5.2.2 Individual KLTs [7]

To each pair $X^{(k)}$, $Y^{(k)}$, an individual KLT $K_k$ has been applied, for $k=1,\ldots,400$. Thus, a KLT has to be applied 400 times. In Fig. 2 (c), an example of the restoration of the signal $X^{(280)}$ by the KLT with the compression ratio $10/116$ is given.

The error associated with the compression and filtering of each $Y^{(k)}$, and its subsequent reconstruction by the KLT, is represented by

\varepsilon(K_{k})=\|X^{(k)}-K_{k}[Y^{(k)}]\|^{2}_{F}. \qquad (76)

To compare the proposed transform $F$ and the KLT, we denote

\Delta_{\max}=\max_{k=1,\ldots,400}[\varepsilon(K_{k})/\varepsilon_{k}(F)] \qquad (77)

and

\Delta_{\min}=\min_{k=1,\ldots,400}[\varepsilon(K_{k})/\varepsilon_{k}(F)]. \qquad (78)

In other words, $\Delta_{\max}$ and $\Delta_{\min}$ represent the maximal and minimal values of the ratio of the error associated with the KLT $K_k$ to that associated with $F$. These ratios have been calculated with the same two ranks of $F$ and $K_k$, $r=5$ and $r=10$, i.e. with the same two compression ratios of $F$ and $K_k$, $c=5/116$ and $c=10/116$.
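Given the per-signal errors from (74) and (76), the ratios (77)–(78) reduce to the following short computation (the variable names are hypothetical):

    def comparison_ratios(eps_klt, eps_F):
        """Delta_min and Delta_max of eps(K_k)/eps_k(F), as in (77)-(78)."""
        ratios = [ek / ef for ek, ef in zip(eps_klt, eps_F)]
        return min(ratios), max(ratios)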

The results are presented in Table 2. It follows from Table 2 that our transform $F$ provides substantially better associated accuracy. Depending on the number of interpolation pairs $p$ and the compression ratio $c$, the accuracy associated with $F$ is from 4 to 25 times better than that of the KLT.

5.2.3 Generic KLT [9]

In these simulations, the generic KLT [9] provides associated accuracy that is worse than that provided by the individual KLTs above.

5.3 Computational work

The proposed transform requires the computation of $p-1$ pseudo-inverse matrices (in (2), with $M_{G_j}=\mathbb{O}$), $p-1$ SVDs in (38) and $3(p-1)$ matrix multiplications in (2) and (38). In these simulations, $p=9$, $21$, $41$, $81$.

The individual KLTs applied, for $k=1,\ldots,400$, to each pair $\mathbf{x}_k:=\mathbf{x}(\tau_k,\cdot)$, $\mathbf{y}_k:=\mathbf{y}(\tau_k,\cdot)$ require the computation of 400 pseudo-inverse matrices $(E_{y_ky_k}^{1/2})^{\dagger}$ and 400 SVDs needed to form the matrices $\langle\langle E_{x_ky_k}(E_{y_ky_k}^{1/2})^{\dagger}\rangle\rangle_{r_k}(E_{y_ky_k}^{1/2})^{\dagger}$. Clearly, the KLTs require substantially more computational work than the proposed transform.
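To make the cost per pair concrete, the following sketch computes an individual KLT of the above form, assuming $\langle\langle M\rangle\rangle_r$ denotes the best rank-$r$ approximation of $M$ (computed via the SVD) and that $E_{y_ky_k}$ is symmetric positive semidefinite; the function names are hypothetical:

    import numpy as np

    def rank_r_truncation(M, r):
        """Best rank-r approximation <<M>>_r of M, via the SVD."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return (U[:, :r] * s[:r]) @ Vt[:r, :]

    def individual_klt(Exy, Eyy, r):
        """K_k = <<E_xy (E_yy^{1/2})^+>>_r (E_yy^{1/2})^+, per the formula above."""
        w, V = np.linalg.eigh(Eyy)                    # Eyy assumed symmetric PSD
        Eyy_half = (V * np.sqrt(np.clip(w, 0, None))) @ V.T
        Eyy_half_pinv = np.linalg.pinv(Eyy_half)      # one pseudo-inverse per pair
        return rank_r_truncation(Exy @ Eyy_half_pinv, r) @ Eyy_half_pinv

Each call involves one pseudo-inverse and one SVD, which, repeated 400 times, illustrates the cost comparison above.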

5.4 Summary of simulation results

The results of the simulations confirm the theoretical results obtained above. In particular,

(i) the accuracy associated with the proposed transform $F$ improves (i.e. the error $\varepsilon_F$ decreases) as the number $p$ of interpolation pairs increases (Theorem 3),

(ii) the accuracy of our transform is from 4 to 25 times better than the accuracy associated with the KLT-like transforms (depending on the number $p$ of interpolation pairs and the compression ratios), and

(iii) the proposed transform requires less computational work than the KLT-like transforms.

6 Conclusions

The proposed data compression technique is constructed from a combination of the ideas of piece-wise linear function interpolation and best rank-constrained operator approximation. This combination provides advantages that allow us to

(i) achieve any desired accuracy in the reconstruction of compressed data (Theorem 3),

(ii) find a single transform to compress and then reconstruct any signal from the infinite signal set (Sections 4.2 and 4.3),

(iii) determine the transform in terms of pseudo-inverse matrices, so that the transform always exists (Section 4.2),

(iv) decrease the computational load compared to the related techniques (Section 5.3),

(v) exploit two degrees of freedom (the number of interpolation pairs and the compression ratios) to improve the transform performance, and

(vi) use the same initial information (signal samples) as is usually used in the KLT-like transforms.

References

  • [1] E. H. W. Meijering, A chronology of interpolation: From ancient astronomy to modern signal and image processing, Proc. IEEE, 90, 3, pp. 319 - 342, 2002.
  • [2] L. L. Scharf, The SVD and reduced rank signal processing, Signal Processing, vol. 25, 113 - 133, 1991.
  • [3] Y. Yamashita and H. Ogawa, Relative Karhunen-Loève transform, IEEE Trans. on Signal Processing, vol. 44, pp. 371-378, 1996.
  • [4] Y. Hua and W. Q. Liu, Generalized Karhunen-Loève transform, IEEE Signal Processing Letters, vol. 5, pp. 141-143, 1998.
  • [5] J. S. Goldstein, I. Reed, and L. L. Scharf, A Multistage Representation of the Wiener Filter Based on Orthogonal Projections, IEEE Trans. on Information Theory, vol. 44, pp. 2943-2959, 1998.
  • [6] Y. Hua, M. Nikpour, and P. Stoica, Optimal Reduced-Rank estimation and filtering, IEEE Trans. on Signal Processing, vol. 49, pp. 457-469, 2001.
  • [7] A. Torokhti and P. Howlett, Computational Methods for Modelling of Nonlinear Systems, Elsevier, 2007.
  • [8] A. Torokhti and P. Howlett, Optimal Transform Formed by a Combination of Nonlinear Operators: The Case of Data Dimensionality Reduction, IEEE Trans. on Signal Processing, 54, No. 4, pp. 1431-1444, 2006.
  • [9] A. Torokhti and P. Howlett, Filtering and Compression for Infinite Sets of Stochastic Signals, Signal Processing, 89, pp. 291-304, 2009.
  • [10] A. Torokhti and J. Manton, Generic Weighted Filtering of Stochastic Signals, IEEE Trans. on Signal Processing, 57, issue 12, pp. 4675-4685, 2009.
  • [11] A. Torokhti and S. Friedland, Towards theory of generic Principal Component Analysis, J. Multivariate Analysis, 100, 4, pp. 661-669, 2009.
  • [12] A. Torokhti and S. Miklavcic, Data Compression under Constraints of Causality and Variable Finite Memory, Signal Processing, 90, Issue 10, pp. 2822-2834, 2010.
  • [13] I.T. Jolliffe, “Principal Component Analysis,” Springer Verlag, New York, 1986.
  • [14] I. Johnstone, A. Lu, On Consistency and Sparsity for Principal Components Analysis in High Dimensions, J. of the American Statistical Association, 104, 486, pp. 682-693, 2009.
  • [15] S. Simoens, O. Munoz-Medina, J. Vidal, and A. Del Coso, Compress-and-Forward Cooperative MIMO Relaying With Full Channel State Information, IEEE Trans. on Signal Processing, 58, No. 2, pp. 781–791, 2010.
  • [16] S.-Y. Lung, Feature extracted from wavelet eigenfunction estimation for text-independent speaker recognition, Pattern Recognition, 37, 7, pp. 1543-1544, 2004.
  • [17] M. L. Honig and J. S. Goldstein, Adaptive reduced-rank interference suppression based on multistage Wiener filter, IEEE Trans. on Communications, vol. 50, no. 6, pp. 986–994, 2002.
  • [18] A. Basso, D. Cavagnino, V. Pomponiu and A. Vernone, Blind Watermarking of Color Images Using Karhunen-Loève Transform Keying, The Computer Journal, doi: 10.1093/comjnl/bxq052, 2010.
  • [19] J. H. Stock and M. W. Watson, Forecasting using principal components from a large number of predictors, Journal of the American Statistical Association, vol. 97, pp. 1167-1179, 2002.
  • [20] D. Zhang , Z. Lu, An efficient, high-order perturbation approach for flow in random porous media via Karhunen-Loève and polynomial expansions, J. of Computational Physics, 194, 2, pp. 773-794, 2004.
  • [21] C. Schwab, R. Todor, Karhunen-Loève approximation of random fields by generalized fast multipole methods, J. of Computational Physics, 217, 1, pp. 100-122, 2006.
  • [22] K.K. Phoon, H.W. Huang and S.T. Quek, Simulation of strongly non-Gaussian processes using Karhunen-Loève expansion, Probabilistic Engineering Mechanics, 20, 2, pp. 188-198, 2005.
  • [23] M. Grigoriu, Evaluation of Karhunen-Loève, Spectral, and Sampling Representations for Stochastic Processes, J. Engrg. Mech., 132, 2, pp. 179-189, 2006.
  • [24] L. Raptopoulos, M. Dutra, F. Pinto and A. Filho, Alternative approach to modal gait analysis through the Karhunen-Loève decomposition: An application in the sagittal plane, J. of Biomechanics, 39, 15, pp. 2898-2906, 2006.
  • [25] L. Jie, L. Zhangjun, C. Jianbing, Orthogonal expansion of ground motion and PDEM-based seismic response analysis of nonlinear structures, Earthq. Eng. & Eng. Vib., 8, pp. 313-328, 2009.
  • [26] B. Penna, T. Tillo, E. Magli, G. Olmo, Transform Coding Techniques for Lossy Hyperspectral Data Compression, IEEE Trans. on Geoscience and Remote Sensing, 45, 5, pp. 1408-1421, 2007.
  • [27] P. Deheuvels, Karhunen-Loève Expansions of Mean-Centered Wiener Processes, Lecture Notes-Monograph Series, Vol. 51, High Dimensional Probability, pp. 62-76, 2006.
  • [28] A. Gassem, Goodness-of-fit test for switching diffusion, Statistical Inference for Stochastic Processes, 13, pp. 97 - 123, 2010.
  • [29] M. Gastpar, P.L. Dragotti and M. Vetterli, The Distributed Karhunen-Loève Transform, IEEE Trans. on Information Theory, 52 (12), pp. 5177-5196, 2006.
  • [30] M. Effros, H. Feng, Suboptimality of the Karhunen-Loève transform for transform coding, IEEE Trans. on Information Theory, 50, 8, pp. 1605-1619, 2004.
  • [31] Y. Gao, J. Chen, S. Yu, J. Zhou and L.-M. Po, The training of Karhunen-Loève transform matrix and its application for H.264 intra coding, Multimedia Tools & Appl., 41, pp. 111 - 123, 2009.
  • [32] H. Houjou, Coarse Graining of Intermolecular Vibrations by a Karhunen-Loève Transformation of Atomic Displacement Vectors, J. Chem. Theory Comput., 5 (7), pp. 1814-1821, 2009.
  • [33] H. Chen, B. Yin, G. Fang, Y. Wang, Comparison of nonlinear and linear PCA on surface wind, surface height, and SST in the South China Sea, Chinese J. of Oceanology and Limnology, 28, 5, pp. 981 - 989, 2010.
  • [34] A. Dubey, V. Yadava, Multi-objective optimization of Nd:YAG laser cutting of nickel-based superalloy sheet using orthogonal array with principal component analysis, Optics and Lasers in Engineering, 46, 2, pp. 124-132, 2008.
  • [35] G. H. Golub and C. F. van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1996.
  • [36] J. Castrillon-Candas, K. Amaratunga, Fast estimation of continuous Karhunen-Loève eigenfunctions using wavelets, IEEE Trans. on Signal Processing, 50, 1, pp. 78 - 86, 2002.
  • [37] A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, John Wiley & Sons, New York, 1974.
  • [38] S. Friedland and A. P. Torokhti, Generalized rank-constrained matrix approximations, SIAM J. Matrix Anal. Appl., 29, issue 2, pp. 656 - 659, 2007.
  • [39] S. Friedland, A. Niknejad and L. Chihara, A simultaneous reconstruction of missing data in DNA microarrays, Linear Alg. Appl., 416, pp. 8-28, 2006.