A new technique for compression of data sets
Abstract
Data compression techniques are characterized by four key performance indices: (i) associated accuracy, (ii) compression ratio, (iii) computational work, and (iv) degrees of freedom. The method of data compression developed in this paper allows us to substantially improve all four of these indices.
The proposed transform is presented in the form of a sum of terms, where each term is a particular sub-transform given by a first-degree polynomial. Each sub-transform is determined from interpolation-like conditions. This device gives the transform the flexibility to incorporate variation of the observed data and leads to performance improvement. The transform has two degrees of freedom: the number of sub-transforms and the compression ratio associated with each sub-transform.
1 Introduction
In this paper, a new technique for compression of a set of random signals is proposed. The technique provides better accuracy, a better compression ratio and a lower computational load than the Karhunen-Loève transform (KLT) and its known extensions.
The basic idea is to combine the methodologies of piece-wise linear function interpolation [1] and the best rank-constrained operator approximation. This issue is discussed in more detail in Sections 2 and 3.
The KLT is the fundamental and, perhaps, most popular data compression method [2]–[12]. It is also known as the Principal Component Analysis (PCA) [13, 11] and “is probably the oldest and best known of the techniques of multivariate analysis” [13]. This technique is intensively used in a wide range of research areas such as data compression [14, 15], pattern recognition [16], interference suppression [17], image processing [18], forecasting [19], hydrology [20], nonlinear phenomena in physics [21], probabilistic mechanics [22, 23], biomechanics [24], geoscience [25, 26], stochastic processes [27, 28], information theory [29, 30], 3-D object classification and video coding [31], chemical theory [32], oceanology [33], optics and laser technology [34] and others.
The purpose of techniques based on the KLT is to compress an observable data vector to a shorter vector (whose entries are also called the principal components [13]) and then reconstruct it so that the reconstruction is close to the reference signal. In general, the reference signal contains a number of components that is not necessarily equal to that of the observed vector. The ratio
is called the compression ratio. A smaller compression ratio implies poorer accuracy in the reconstruction of the compressed data. In this sense, the compression ratio is the KLT's degree of freedom. Therefore, the performance of the KLT and related transforms is characterized by four key performance indices: (i) associated accuracy, (ii) compression ratio, (iii) computational work and (iv) degrees of freedom.
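The trade-off between compression ratio and reconstruction accuracy can be illustrated with a small numerical sketch (the sizes, names and data below are hypothetical stand-ins, not the paper's setup): a signal is compressed by an orthonormal matrix, reconstructed by its transpose, and the relative error shrinks as more components are kept.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 realizations of a 32-component random signal that
# lies near an 8-dimensional subspace, so low-rank compression is effective.
n, n_samples = 32, 200
mixing = rng.standard_normal((n, 8))
X = mixing @ rng.standard_normal((8, n_samples))

U, _, _ = np.linalg.svd(X, full_matrices=False)

errors = []
for m in (2, 4, 8):                  # number of components kept after compression
    C = U[:, :m].T                   # m x n matrix: compresses x to z = C x
    R = U[:, :m]                     # n x m matrix: reconstructs x_hat = R z
    rel = np.linalg.norm(X - R @ (C @ X)) / np.linalg.norm(X)
    errors.append(rel)
    print(f"compression ratio {m/n:.3f}: relative error {rel:.2e}")
```

Here the error at m = 8 is essentially zero because the stand-in data have exact rank 8; for real data the error decreases gradually as the ratio grows.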
The method of data compression developed in this paper allows us to substantially (i) increase the accuracy, (ii) improve the compression ratio, (iii) decrease computational work and (iv) increase the degrees of freedom.
1.1 Motivations for the proposed technique
The motivations for the proposed technique are as follows.
1.1.1 Infinite sets of signals
Most of the literature on the KLT (relevant references can be found, for example, in [7]–[12]) discusses the properties of an optimal transform for an individual finite random signal; here, a signal is treated as a vector with random components, and we say that a random signal is finite if it has a finite number of components. This means that if one wishes to compress and then reconstruct an infinite set of observable signals, so that the reconstructed signals are close to an infinite set of reference random signals, using the KLT approach, then one is forced to find an infinite set of corresponding transforms: one element of the transform set for each representative of the signal set. Clearly, such an approach cannot be applied in practice. (Here, the signal sets are countable; more generally, they might be uncountable when the signals depend on a continuous parameter.) Although the standard KLT has been extended in [9] to the case of infinite signal sets, its associated accuracy is still not satisfactory (see Sections 1.1.3, 4.8 and 5).
1.1.2 Computational work
Note that even in the case when the signals under consideration can be represented as finite ‘long’ signals, the KLT applied to such signals leads to the computation of large covariance matrices. Indeed, the KLT approach then leads to the computation of a product of two large matrices and the computation of a pseudo-inverse matrix [7], each of which requires a number of flops that grows rapidly with the signal dimensions [35]. As a result, in this case, the computational work associated with the KLT approach becomes unreasonably heavy.
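The cost pattern described above can be made concrete in a short sketch (the sizes and variable names are illustrative only, not taken from the paper): forming the sample cross-covariance and covariance matrices costs on the order of m·n·s and n²·s flops, and the SVD-based pseudo-inverse of the n×n covariance costs O(n³), which dominates for long signals.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes only: in the setting above the dimensions m and n are
# large ('long' signals), which is exactly what makes this route expensive.
m, n, s = 300, 400, 50
X = rng.standard_normal((m, s))      # s samples of the reference signal
Y = rng.standard_normal((n, s))      # s samples of the observed signal

Exy = X @ Y.T / s                    # m x n sample covariance: ~2*m*n*s flops
Eyy = Y @ Y.T / s                    # n x n sample covariance: ~2*n*n*s flops
Eyy_pinv = np.linalg.pinv(Eyy)       # SVD-based pseudo-inverse: O(n^3) flops

F = Exy @ Eyy_pinv                   # Wiener-type estimator, m x n
print(F.shape)
```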
1.1.3 Associated accuracy
For a given compression ratio, the accuracy associated with KLT-like techniques cannot be changed. If the accuracy is not satisfactory, then one has to change the compression ratio, which can be undesirable. Thus, for the KLT approach, the compression ratio is the only degree of freedom available to improve the accuracy.
1.2 Brief description of the KLT
First, we need some notation as follows. Let be a probability space, where is the set of outcomes, a σ-field of measurable subsets and an associated probability measure.
Let and be a reference signal and an observed signal, respectively. Let the norm be given by where is the Euclidean norm of .
We consider a linear operator (transform) defined by a matrix so that
(1) |
A generic KLT [7] is the linear transform that solves
(2) |
where is a class of all linear operators defined by (1). As a result, the KLT returns two matrices, and , so that . Matrix performs filtering and compression of observed data to a shorter vector with components. Matrix reconstructs in such a way that the reconstructed signal is close to in the sense of (2).
Operator that solves (2) for a particular case with is the standard KLT.
Thus, for a given compression ratio, the KLT minimizes the associated error over the class of all linear transforms of the rank defined by (2).
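A hedged sketch of this construction (not the paper's exact formulas; all names and sizes are illustrative): estimate the covariances from samples, whiten the observation via a Cholesky factor, truncate an SVD to the required rank, and un-whiten. The resulting rank-r estimator factors as F = B·A, exposing the compressing matrix A and the reconstructing matrix B.

```python
import numpy as np

rng = np.random.default_rng(2)

m = n = 24      # dimensions of x and y (illustrative)
r = 6           # rank constraint: y is compressed to r components
s = 2000        # number of samples

# Hypothetical correlated pair (x, y): y is a noisy version of x.
x = rng.standard_normal((m, s))
x = np.cumsum(x, axis=0) / np.sqrt(np.arange(1, m + 1))[:, None]  # correlated rows
y = x + 0.3 * rng.standard_normal((n, s))

Exy = x @ y.T / s
Eyy = y @ y.T / s

# Whiten y, truncate the SVD to rank r, un-whiten: a standard route to the
# rank-constrained Wiener/KLT-type solution (a sketch, not the paper's formula).
L = np.linalg.cholesky(Eyy)
G = Exy @ np.linalg.inv(L).T
U, sig, Vt = np.linalg.svd(G)
F = (U[:, :r] * sig[:r]) @ Vt[:r] @ np.linalg.inv(L)

# Factor F = B A: A (r x n) compresses y, B (m x r) reconstructs the estimate.
B = U[:, :r]
A = (sig[:r, None] * Vt[:r]) @ np.linalg.inv(L)
```

The factorization mirrors the two-matrix structure described above: A performs simultaneous filtering and compression, and B performs the reconstruction.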
1.3 Related techniques
Despite the known KLT optimality, it may happen that the accuracy and compression ratio associated with the KLT are still not satisfactory. Owing to this observation and the KLT versatility in applications (see Section 1 above), the KLT idea was extended in different directions as has been done, in particular, in [2]–[6]. More recent developments in this area include the polynomial transforms [7], extensions of the KLT to the cases of infinite signal sets [9], the best weighted estimators [10, 11], distributed KLT [29] and the fast KLT using wavelets [36].
Nevertheless, the known transforms based on the KLT idea still imply intrinsic difficulties associated with the original KLT. In particular, most of them have been developed for the case of a single signal, and for a fixed compression ratio, the associated error cannot be improved. In the case of the polynomial transforms [7], the error can be improved if the number of terms in the polynomial transform increases. The latter implies a substantial computational burden that, in many cases, may stifle this intention.
1.4 Contribution. Particular features of the proposed transform
To describe the contribution and compare the features of the proposed transform with those of the KLT-like techniques, we consider the same issues as in Section 1.1.
1.4.1 Infinite sets of signals
Unlike most of the known transforms, the proposed technique allows us to determine a single transform to compress any signal from the infinite signal set. We note that infinite sets of stochastic signal-vectors introduced in Section 2.2 are quite large and include, for example, time series.
1.4.2 Computational work
1.4.3 Associated accuracy
Our transform provides data compression and subsequent reconstruction with any desired accuracy (Theorem 3 in Section 4.4). (The desired accuracy is achieved theoretically, as is shown in Section 4.4 below; in practice, of course, the accuracy is increased to a prescribed reasonable level.) This is achieved due to the transform structure, which follows from the device of piece-wise function interpolation (Sections 2.2 and 3). In particular, it provides the related degree of freedom, the number of so-called interpolation pairs (see Sections 3 and 4.8.2), to improve the transform performance (Section 4.8).
Moreover, the proposed technique
(i) determines the data compression transform in terms of pseudo-inverse matrices so that the transform always exists,
(ii) uses the same initial information (signal samples) as is used in the KLT-like transforms (see Sections 4.5 and 5), and
(iii) provides the simultaneous filtering and compression.
The paper is organized as follows. In Section 2, a brief description of the problem is given. Some necessary preliminaries, used in our transform construction, are given in Section 2.2. In Section 2.2.2, the transform model is provided. In Section 3, a rigorous statement of the problem is given, which generalizes and extends problem (2) to the case of filtering of infinite sets of stochastic signals. In Section 4, the proposed transform is determined from the interpolation conditions, and the associated error analysis is presented. In particular, in Section 4.8, a comparison with the KLT-like approach is given.
2 Underlying idea. Device of the transform
2.1 Underlying idea
In this paper, the underlying idea is different from those considered in the above mentioned works. The proposed transform is constructed from a combination of the device of the piece-wise linear function interpolation [1] and the best rank-constrained operator approximation. This means the following.
Suppose, and are uncountable sets of random signals. While the rigorous notation is given in Section 2.2, here, we denote and where is a time snapshot. Let Choose finite numbers of signals, and . The proposed transform contains terms (see (5)–(6) below) where, for , each term is given as a first order operator polynomial (see (6) below) and is determined from the interpolation conditions
(3) |
A reason for using the approximate equality in (3) is explained in Section 3.3. In Section 3.1, the second condition in (3) is represented as the best rank-constrained approximation problem (11)–(12). As a result, such a transform has advantages that are similar to the known advantages of the piece-wise function interpolation as has been mentioned in Section 1.4. The procedure of data compression-reconstruction is performed via truncated singular value decompositions (SVD) provided in Section 4.3.
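The benefit inherited from the piece-wise device can be illustrated independently of the transform itself: for plain piece-wise linear interpolation, refining the knot set drives the error down. A minimal sketch (the test function and knot counts are arbitrary choices, not taken from the paper):

```python
import numpy as np

# More interpolation knots -> smaller error: the property the proposed
# transform inherits from the piece-wise device (illustrative only).
t = np.linspace(0.0, 1.0, 1001)
f = np.sin(2 * np.pi * t)               # a stand-in signal over the time interval

errors = []
for p in (2, 4, 8, 16):                 # number of sub-intervals ("sub-transforms")
    knots = np.linspace(0.0, 1.0, p + 1)
    approx = np.interp(t, knots, np.sin(2 * np.pi * knots))
    errors.append(np.max(np.abs(f - approx)))

print(errors)                           # maximal error per knot count
```

The maximal error decreases monotonically as p grows, in analogy with the role of the number of interpolation pairs in Sections 3 and 4.8.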
2.2 Device of the transform
2.2.1 Signal sets under consideration
We wish to consider signals in a rather wide sense, as follows. Let and be infinite sets of reference signals and observable signals, respectively.
We interpret the former as a reference signal and the latter as an observable signal. (Intuitively, the observed signal can be regarded as a noise-corrupted version of the reference signal; for example, it can be interpreted as the reference signal contaminated by white noise. In this paper, we do not restrict ourselves to this simplest version and assume that the dependence of the observed signal on the reference signal and the noise is arbitrary.)
The variable represents time. (More generally, it can be considered as a parameter vector ranging over a multi-dimensional cube, with one coordinate interpreted as time. Then, for example, the signal can be interpreted as an arbitrary stationary time series.)
Let be a nondecreasing sequence of fixed time-points such that
(4) |
2.2.2 The transform model
Now let be a transform such that, for each ,
(5) |
where
(6) |
Here, is a sub-transform with and is a linear operator represented by a matrix so that
For each , vector and operator should be determined from the interpolation conditions given in the following Section 3.
Thus, is defined by a matrix such that
(7) |
3 Statement of the problem
Let and be signals chosen from infinite signal sets and .
3.1 Preliminaries
Ideally, in (5)–(7), for , we would like to determine each so that satisfies
(8) |
and solves
(9) |
The conditions (8)–(9) are motivated by the device of piece-wise function interpolation and associated advantages [1]. In turn, (8) implies which being substituted in (9), reduces (9) to
(10) |
where
and is the zero vector.
Nevertheless, the transform with such and does not provide data compression. To provide compression, we require that instead of (10), solves
(11) |
subject to
(12) |
In (11), we use the notation
(13) |
The constraint (12) leads to compression of to a shorter vector with components. This issue is discussed in more detail in Section 4.3 below.
3.2 Problem formulation
Thus, a determination of presented by (5)–(7) is reduced to the following problem: Given , let for ,
(14) |
The above term “given” means that the covariance matrices associated with the signals are known or can be estimated. This assumption is similar to the assumption used in the known KLT-like transforms.
3.3 Problem discussion
We note that (8) and (14) can be represented as
(15) |
Also, in (11), the term can be rewritten as
Therefore, (11)–(12) can be represented as
(16) |
In other words, the relations (8), (14) and (11)–(12) mean that we wish to determine so that exactly interpolates at and approximately interpolates at , as in (15) and (16), respectively.
For this reason, the pairs of signals are called the interpolation pairs.
It is worthwhile to note that for the case of pure filtering (with no compression) the constraint (12) is omitted.
4 Main results
4.1 Best rank-constrained matrix approximation
First we recall a recent result on the best rank constrained matrix approximation [11, 38] which will be used in the next section.
Let be a set of complex valued matrices, and denote by the variety of all matrices of rank at most. Fix . Then is the conjugate transpose of . Let the SVD of be given by
(17) |
where and is a generalized diagonal matrix, with the singular values on the main diagonal.
Let and be the representations of and in terms of their and columns, respectively. Let
(18) |
be the orthogonal projections on the range of and , correspondingly. Define
(19) |
for , where
(20) | |||
(21) |
For we write . For , the matrix is uniquely defined if and only if .
Recall that is the Moore-Penrose generalized inverse of , where . See for example [37].
Henceforth designates the Frobenius norm.
Theorem 1 below provides a solution to the problem of finding a matrix such that
(22) |
and is based on the fundamental result in [38] (Theorem 2.1), which is a generalization of the well-known Eckart-Young theorem [35]. The Eckart-Young theorem states that for the case when , and , the solution is given by , i.e.
(23) |
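The Eckart-Young statement can be checked numerically (a small illustrative sketch; the matrix and rank are arbitrary): the truncated SVD attains the minimal Frobenius error, which equals the square root of the sum of squares of the discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(3)
C = rng.standard_normal((8, 6))
k = 2

# Best rank-k approximation via truncated SVD (the Eckart-Young solution).
U, s, Vt = np.linalg.svd(C, full_matrices=False)
C_k = (U[:, :k] * s[:k]) @ Vt[:k]

best = np.linalg.norm(C - C_k)          # Frobenius error of the truncation

# Spot check: no random rank-k competitor does better.
for _ in range(100):
    X = rng.standard_normal((8, k)) @ rng.standard_normal((k, 6))
    assert np.linalg.norm(C - X) >= best
```

The optimal error also equals sqrt(s_{k+1}² + … ), which gives a quick consistency check on any implementation.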
Theorem 1
[11] Let and be given matrices. Let
(24) |
where and are any matrices, and is the identity matrix. Then the matrix
(25) |
is a minimizing matrix for the minimization problem. Any minimizing matrix has the above form.
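As a hedged sketch of this kind of result, the following implements one published form of the solution of min ‖A − BXC‖_F over rank(X) ≤ k, namely X = B⁺(BB⁺AC⁺C)_k C⁺, where (M)_k denotes the best rank-k truncation (the formula and the letters A, B, C here are assumptions for illustration and need not match the exact statement of Theorem 1):

```python
import numpy as np

rng = np.random.default_rng(4)

def rank_k_trunc(M, k):
    # Best rank-k approximation of M in the Frobenius norm (Eckart-Young).
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def rank_constrained_solution(A, B, C, k):
    # Sketch of X = B^+ (B B^+ A C^+ C)_k C^+ for min ||A - B X C||_F,
    # rank(X) <= k (assumed form, not necessarily the paper's notation).
    Bp, Cp = np.linalg.pinv(B), np.linalg.pinv(C)
    return Bp @ rank_k_trunc(B @ Bp @ A @ Cp @ C, k) @ Cp

k = 3
A = rng.standard_normal((7, 7))
# With B = C = I the formula reduces to the Eckart-Young truncation A_k.
X0 = rank_constrained_solution(A, np.eye(7), np.eye(7), k)
assert np.allclose(X0, rank_k_trunc(A, k))

# Generic B, C: spot-check optimality against random rank-k competitors.
B, C = rng.standard_normal((9, 7)), rng.standard_normal((7, 8))
A2 = rng.standard_normal((9, 8))
X = rank_constrained_solution(A2, B, C, k)
best = np.linalg.norm(A2 - B @ X @ C)
for _ in range(100):
    Y = rng.standard_normal((7, k)) @ rng.standard_normal((k, 7))
    assert np.linalg.norm(A2 - B @ Y @ C) >= best
```

Expressing the solution through pseudo-inverses is what guarantees that it always exists, as emphasized in Section 1.4.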
4.2 Determination of the transform
Let us introduce the inner product
(26) |
and the covariance matrix
We denote by a matrix such that .
Theorem 2
The proposed transform is given by
(27) |
where
(28) |
(29) |
and is an arbitrary matrix.
Proof 1
To find that satisfies (11) and (12), we write:
(31) | |||||
The latter is true because
and
(32) |
by Lemma 24 in [7].
In (31), the only term that depends on is . Thus a determination of is reduced to the problem (22) with
(33) | |||
Let the SVD of be given by
(34) |
where with the -th column of and are associated singular values. Let . In this case, the solution by Theorem 1 is given by
(35) |
where
(36) |
and is an arbitrary matrix. Here,
(37) |
and . If we choose then (2) follows from (35)–(37) because .
4.3 Procedure of compression and de-compression
For each , we wish to filter the observed data , compress it to a shorter vector with components (recall that in statistics those components are called the principal components [13]) and then reconstruct it in the form so that the reconstruction is close to .
This procedure is provided by the proposed transform for two special cases,
(i) , and
(ii)
as follows. We consider the cases successively.
(i) For , compression and filtering of is performed via a representation of in (2) as a product of two matrices,
where is a matrix and is a matrix. Then matrix compresses to a vector with components and matrix reconstructs the compressed vector so that the reconstruction is close to .
Matrices and are determined as follows.
Let us write the SVD of in (2) as
(38) |
where matrices
are similar to matrices , and for the SVD of matrix in (17), respectively. In particular, , are the associated singular values.
Then the transform in (27)–(2) can be written as
(42) |
where
(43) |
(44) | |||
(45) |
or
(46) |
In particular, for and ,
Thus, for all , compresses to a shorter vector with components.
The reconstruction of the reference signal from the compressed data is performed by in (43) so that
(47) |
(ii) For , the sub-transform in (28) is reduced to
that does not provide compression. To provide compression of , we represent , for , as
(48) |
The latter follows when, for , condition (8) is replaced by
(49) |
and condition (11)–(12), for , remains the same. Then, for , (48) implies
(50) |
where
and and are defined by (45)–(46). Here, and perform compression of and subsequent reconstruction as above.
Thus, by transform , the compression of requires matrices and the reconstruction requires vectors and matrices .
4.4 Error associated with the transform in (27)–(2)
Let us introduce the norm by
For , let , and let
(53) |
be singular values of .
The following theorem establishes relationships for the errors associated with the proposed transform .
Theorem 3
Remark 1
Although, by assumption, the signal is given, its compression implies the associated error presented by (58).
4.5 A particular case: signals are given by their samples. Numerical realization of transform
In practice, signals and are given by their samples
(63) | |||
(64) |
respectively. In particular, for , the samples are
(65) | |||
(66) |
In this case, the transform in (5)–(6) takes the form
(67) |
where
(68) |
and and are matrices such that, for , satisfies the condition
(69) |
and solves
(70) |
subject to
(71) |
The solution is provided by the following Corollary which is a particular case of Theorem 2.
4.6 Numerical realization of transform
4.7 Summary of the proposed transform
Let and be infinite sets of reference signals and observable signals, respectively (see Section 2.2).
Choose finite subsets of and , and , respectively.
4.8 Advantages of proposed transform. Comparison with KLT and its extensions
The proposed methodology provides a single transform to process any signal from an infinite set of signals. This is a distinctive feature of the considered technique.
To the best of our knowledge, there is only one other known approach [9] that provides a transform with a similar property. Moreover, the KLT is a particular case of the transform in [9].
Therefore, it is natural to compare the proposed transform with that in [9]. The transform developed in [9] is presented in the special form of a sum with terms where each term is determined from the preceding terms as a solution of a minimization problem for the associated error. The final term in such an iterative procedure provides signal compression and decompression. In particular, the KLT [7] follows from [9] if .
Here, we first compare the transform in [9] and the proposed transform , and demonstrate the advantages of the transform . Then we show that the similar advantages also occur in comparison with other transforms based on the KLT idea.
4.8.1 Associated errors of transforms and [9]
Together with the distinctive feature of the transform mentioned above, its other distinctive property is an arbitrarily small associated error in the reconstruction of the reference signal (Theorem 3 in Section 4.4). This is achieved under conditions (55)–(57).
The transform in [9] does not provide such a nice property.
In other words, the transform given by (42)–(47) is composed of sequences of signals measured at different time instants, not of averages of signals over the whole domain as in [9]. Therefore, the transform is ‘flexible’ with respect to variations of the signals, and this inherent feature leads to the decrease in the associated error established in Theorem 3 above.
4.8.2 Degrees of freedom to reduce the associated error
Both the transform given by (42)–(47) and the transform in [9] have two degrees of freedom for reducing the associated error: the number of terms that compose the transform, and the matrix ranks. At the same time, the error associated with the method in [9] cannot be made arbitrarily small by increasing its number of terms, while for the proposed transform, increasing the number of terms leads to an arbitrarily small associated error (see (54)–(57)). This occurs because of the ‘flexibility’ of the transform discussed in Section 4.8.3 below.
Moreover, unlike the method in [9], the proposed approach may provide one more degree of freedom: the distribution of the interpolation pairs, related to the distribution of the time points. An ‘appropriate’ selection of these points may diminish the error. At the same time, their optimal choice is a specific problem which is not considered in this paper.
4.8.3 Associated assumptions. Numerical realization
A numerical realization of the KLT-based transforms requires knowledge or estimation of the related covariance matrices. The estimates are normally found from the signal samples presented in Section 4.5. That is, the assumptions used in numerical realizations of the KLT-based techniques and of the proposed transform (given by (42)–(50)) are, in fact, the same: it is assumed that the signal samples are known. In other words, the preparatory work that should be done for the numerical realization of the KLT and of our transform is the same. At the same time, the usage of those signal samples is different. While the transform in [9] requires estimates of two covariance matrices in the form of two related averages formed from the samples, our transform uses each representative pair separately. This makes the transform ‘flexible’ with respect to variations of the signals.
In the case of finite signal sets, and , we might apply the individual KLT transform to each pair , for , separately. Such a procedure would require times more computational work compared with the transform in [9].
As a result, while computational efforts associated with the proposed technique are similar or less than those needed for the KLT based transforms, transform given by (42)–(46) allows us to substantially improve the accuracy in signal estimation, as has been shown in Theorem 3. This observation is also demonstrated in the simulations presented in Section 5 below.
Table 1. Accuracy associated with the proposed transform.
(Columns correspond to an increasing number of interpolation pairs.)

| Compression ratio | Number of interpolation pairs | | | |
|---|---|---|---|---|
| *Best accuracy associated with our transform* | | | | |
| | 0.7 | 0.5 | 0.48 | 0.31 |
| | 0.5 | 0.5 | 0.45 | 0.23 |
| *Worst accuracy associated with our transform* | | | | |
| | 1.71 | 1.04 | 0.71 | 0.64 |
| | 1.49 | 1.07 | 0.76 | 0.56 |
Table 2. Comparison in accuracy of the proposed transform and the KLT.
(Column pairs correspond to an increasing number of interpolation pairs.)

| Compr. ratio | Number of interpolation pairs | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 4.9 | 15.4 | 9.3 | 18.2 | 10.1 | 19.7 | 12.4 | 22.9 |
| | 4.3 | 17.4 | 10.3 | 18.7 | 12.9 | 22.0 | 15.7 | 25.1 |
Fig. 1. Panels (a)–(h): examples of the selected images (signals).
Fig. 2. Panels (a)–(d); panel (b) shows a noisy observed image, panel (c) its restoration by the KLT, and panel (d) its restoration by the proposed transform.
5 Simulations
Here, we consider a case where and (introduced in Section 2.2) are represented by finite signal sets with members, and illustrate the advantages of the proposed technique over methods based on the KLT approach. In many practical problems (arising, e.g., in DNA analysis [39]) the number is quite large, for instance, [39]. In these simulations, we set .
Let us suppose that , , , are input stochastic signals where , for and . Reference stochastic signals are , where , for and . Thus, in these simulations, the interval introduced in Section 2.2 is modelled as 400 points with so that .
Signals and have been simulated as digital images presented by matrices and , respectively. Each column of these matrices represents a realization of the corresponding signal. Each matrix represents data obtained from the digital photograph ‘Stream and bridge’ (the database is available at http://sipi.usc.edu/services/database.html). Examples of selected images are shown in Fig. 1.
Observed noisy images have been simulated in the form
for each . Here, denotes the Hadamard product, and randn and rand are matrices with random entries that simulate noise. The entries of randn are normally distributed with mean zero and variance one. The entries of rand are uniformly distributed in the interval . A typical example of such noisy images is given in Fig. 2 (b).
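A minimal sketch of this noise model (the image here is random stand-in data and the additive noise scale is an assumed value; the actual experiments use columns of the photograph above):

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in 'reference image'; the size is illustrative only.
q = 64
X = rng.random((q, q))

# Observed noisy image: multiplicative uniform noise applied entrywise
# (the Hadamard product) plus additive Gaussian noise; the additive
# scale 0.1 is an assumption for this sketch.
Y = X * rng.random((q, q)) + 0.1 * rng.standard_normal((q, q))
print(Y.shape)
```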
We wish to filter and compress the observed data , …, so that their subsequent reconstructions would be close to the reference signals , …, , respectively.
The proposed transform given by (67)–(68) and (72)–(73), the generic KLT [7] and the transform [9] were applied to this task. To compare their performance, we address three issues as those in Sections 1.1 and 1.4, as follows.
5.1 Large sets of signals
In these simulations, the signal sets are large but finite. Therefore, the known transforms developed for compression of an individual finite random signal-vector can be applied to this case. This issue has been mentioned in Section 1.1 above. Nevertheless the known methods imply difficulties (an insufficient accuracy and excessive computational work) that are discussed below.
5.2 Associated accuracy and compression ratios
5.2.1 Proposed transform
As has been mentioned in Section 4.8.2, the proposed technique has two degrees of freedom: the number of interpolation pairs and the rank of the matrix (see (12), (2) and (71)). The rank was set to the same value for all sub-transforms.
To demonstrate the properties and advantages of the proposed transform , the interpolation pairs have been chosen in four different ways, as follows:

- 1st choice: and interpolation pairs are , for ;
- 2nd choice: and interpolation pairs are , for ;
- 3rd choice: and interpolation pairs are , for ;
- 4th choice: and interpolation pairs are , for .
For , the accuracy associated with compression, filtering of each and its subsequent reconstruction by the proposed transform is represented by
(74) |
and
(75) |
In Fig. 2 (d), an example of the restoration of the signal from noisy observed data is given. They are typical representatives of the signals under consideration. The image in Fig. 2 (d) has been evaluated for interpolation pairs as in the 1st choice above and the compression ratio .
Values of and associated with the different choices of interpolation pairs above are given in Table 1. In the first column, the compression ratios used in the transform are given. In particular, it follows from Table 1 that the accuracy improves when the number of interpolation pairs increases. This confirms the statement (54)–(57) of Theorem 3.
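The best and worst accuracies over a signal set can be computed as in the following sketch (the signals, reconstructions and error measure are stand-ins; the actual quantities are defined by (74)–(75)):

```python
import numpy as np

rng = np.random.default_rng(6)

# Stand-in reconstructions for 400 signal pairs; the reported quantities
# are the best and worst relative errors over the whole signal set.
errors = []
for _ in range(400):
    x = rng.standard_normal(256)                 # reference signal
    x_hat = x + 0.05 * rng.standard_normal(256)  # hypothetical reconstruction
    errors.append(np.linalg.norm(x - x_hat) / np.linalg.norm(x))

best, worst = min(errors), max(errors)
print(best, worst)
```

Reporting both extremes, rather than an average, reflects the per-signal (not averaged) character of the proposed transform discussed in Section 4.8.3.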
5.2.2 Individual KLTs [7]
To each pair , , an individual KLT has been applied, where . Thus, the KLT has to be applied 400 times. In Fig. 2 (c), an example of the restoration of signal by the KLT with the compression ratio is given.
The error associated with compression, filtering of each and its subsequent reconstruction by the KLT is represented by
(76) |
To compare the proposed transform and the KLT, we denote
(77) |
and
(78) |
In other words, and represent maximal and minimal magnitudes of the ratios of the accuracies associated with and the KLT . These ratios have been calculated with the same two ranks of and , and , i.e. with the same two compression ratios of and , and .
The results are presented in Table 2. It follows from Table 2 that our transform provides substantially better accuracy. Depending on the number of interpolation pairs and the compression ratio , the accuracy associated with is from 4 to 25 times better than that of the KLT.
5.2.3 Generic KLT
5.3 Computational work
The proposed transform requires the computation of pseudo-inverse matrices (in (2), with ), SVDs in (38) and matrix multiplications in (2) and (38). In these simulations, , , , .
The individual KLTs applied, for , to each pair , require the computation of 400 pseudo-inverse matrices and 400 SVDs. Clearly, the KLTs require substantially more computational work than the proposed transform does.
5.4 Summary of simulation results
The results of the simulations confirm the theoretical results obtained above. In particular,
(i) the accuracy associated with the proposed transform increases when the number of interpolation pairs increases (Theorem 3),
(ii) the accuracy of our transform is from 4 to 25 times better than the accuracy associated with the KLT-like transforms (depending on the number of interpolation pairs and the compression ratios),
(iii) the proposed transform requires less computational work than that of the KLT-like transforms.
6 Conclusions
The proposed data compression technique is constructed from a combination of the idea of the piece-wise linear function interpolation and the best rank-constrained operator approximation. This device provides the advantages that allow us to
(i) achieve any desired accuracy in the reconstruction of compressed data (Theorem 3),
(ii) find a single transform to compress and then reconstruct any signal from the infinite signal set (Sections 4.2 and 4.3),
(iii) determine the transform in terms of pseudo-inverse matrices so that the transform always exists (Section 4.2),
(iv) decrease the computational load compared to the related techniques (Section 5.3),
(v) exploit two degrees of freedom (a number of interpolation pairs and compression ratios) to improve the transform performance, and
(vi) use the same initial information (signal samples) as is usually used in the KLT-like transforms.
References
- [1] E. H. W. Meijering, A chronology of interpolation: From ancient astronomy to modern signal and image processing, Proc. IEEE, 90, 3, pp. 319 - 342, 2002.
- [2] L. L. Scharf, The SVD and reduced rank signal processing, Signal Processing, vol. 25, 113 - 133, 1991.
- [3] Y. Yamashita and H. Ogawa, Relative Karhunen-Loève transform, IEEE Trans. on Signal Processing, vol. 44, pp. 371-378, 1996.
- [4] Y. Hua and W. Q. Liu, Generalized Karhunen-Loève transform, IEEE Signal Processing Letters, vol. 5, pp. 141-143, 1998.
- [5] J. S. Goldstein, I. Reed, and L. L. Scharf, A Multistage Representation of the Wiener Filter Based on Orthogonal Projections, IEEE Trans. on Information Theory, vol. 44, pp. 2943-2959, 1998.
- [6] Y. Hua, M. Nikpour, and P. Stoica, Optimal Reduced-Rank estimation and filtering, IEEE Trans. on Signal Processing, vol. 49, pp. 457-469, 2001.
- [7] A. Torokhti and P. Howlett, Computational Methods for Modelling of Nonlinear Systems, Elsevier, 2007.
- [8] A. Torokhti and P. Howlett, Optimal Transform Formed by a Combination of Nonlinear Operators: The Case of Data Dimensionality Reduction, IEEE Trans. on Signal Processing, 54, No. 4, pp. 1431-1444, 2006.
- [9] A. Torokhti and P. Howlett, Filtering and Compression for Infinite Sets of Stochastic Signals, Signal Processing, 89, pp. 291-304, 2009.
- [10] A. Torokhti and J. Manton, Generic Weighted Filtering of Stochastic Signals, IEEE Trans. on Signal Processing, 57, issue 12, pp. 4675-4685, 2009.
- [11] A. Torokhti and S. Friedland, Towards theory of generic Principal Component Analysis, J. Multivariate Analysis, 100, 4, pp. 661-669, 2009.
- [12] A. Torokhti and S. Miklavcic, Data Compression under Constraints of Causality and Variable Finite Memory, Signal Processing, 90 , Issue 10, pp. 2822-2834, 2010.
- [13] I.T. Jolliffe, “Principal Component Analysis,” Springer Verlag, New York, 1986.
- [14] I. Johnstone, A. Lu, On Consistency and Sparsity for Principal Components Analysis in High Dimensions, J. of the American Statistical Association, 104, 486, pp. 682-693, 2009.
- [15] S. Simoens, O. Munoz-Medina, J. Vidal, and A. Del Coso, Compress-and-Forward Cooperative MIMO Relaying With Full Channel State Information, IEEE Trans. on Signal Processing, 58, No. 2, pp. 781–791, 2010.
- [16] S.-Y. Lung, Feature extracted from wavelet eigenfunction estimation for text-independent speaker recognition, Pattern Recognition, 37, 7, pp. 1543-1544, 2004.
- [17] M. L. Honig and J. S. Goldstein, Adaptive reduced-rank interference suppression based on multistage Wiener filter, IEEE Trans. on Communications, vol. 50, no. 6, pp. 986–994, 2002.
- [18] A. Basso, D. Cavagnino, V. Pomponiu and A. Vernone, Blind Watermarking of Color Images Using Karhunen-Loève Transform Keying, The Computer Journal, doi: 10.1093/comjnl/bxq052, 2010.
- [19] J. H. Stock and M. W. Watson, Forecasting using principal components from a large number of predictors, Journal of the American Statistical Association, vol. 97, pp. 1167-1179, 2002.
- [20] D. Zhang, Z. Lu, An efficient, high-order perturbation approach for flow in random porous media via Karhunen-Loève and polynomial expansions, J. of Computational Physics, 194, 2, pp. 773-794, 2004.
- [21] C. Schwab, R. Todor, Karhunen-Loève approximation of random fields by generalized fast multipole methods, J. of Computational Physics, 217, 1, pp. 100-122, 2006.
- [22] K.K. Phoon, H.W. Huang and S.T. Quek, Simulation of strongly non-Gaussian processes using Karhunen-Loève expansion, Probabilistic Engineering Mechanics, 20, 2, pp. 188-198, 2005.
- [23] M. Grigoriu, Evaluation of Karhunen-Loève, Spectral, and Sampling Representations for Stochastic Processes, J. Engrg. Mech., 132, 2, pp. 179-189, 2006.
- [24] L. Raptopoulos, M. Dutra, F. Pinto and A. Filho, Alternative approach to modal gait analysis through the Karhunen-Loève decomposition: An application in the sagittal plane, J. of Biomechanics, 39, 15, pp. 2898-2906, 2006.
- [25] L. Jie, L. Zhangjun, C. Jianbing, Orthogonal expansion of ground motion and PDEM-based seismic response analysis of nonlinear structures, Earthq. Eng. & Eng. Vib., 8, pp. 313-328, 2009.
- [26] B. Penna, T. Tillo, E. Magli, G. Olmo, Transform Coding Techniques for Lossy Hyperspectral Data Compression, IEEE Trans. on Geoscience and Remote Sensing, 45, 5, pp. 1408 - 1421, 2007.
- [27] P. Deheuvels, Karhunen-Loève Expansions of Mean-Centered Wiener Processes, Lecture Notes-Monograph Series, Vol. 51, High Dimensional Probability, pp. 62-76, 2006.
- [28] A. Gassem, Goodness-of-fit test for switching diffusion, Statistical Inference for Stochastic Processes, 13, pp. 97 - 123, 2010.
- [29] M. Gastpar, P.L. Dragotti and M. Vetterli, The Distributed Karhunen-Loève Transform, IEEE Trans. on Information Theory, 52 (12), pp. 5177-5196, 2006.
- [30] M. Effros, H. Fen, Suboptimality of the Karhunen-Loève transform for transform coding, IEEE Trans. on Information Theory, 50, 8, pp. 1605 - 1619, 2004.
- [31] Y. Gao, J. Chen, S. Yu, J. Zhou and L.-M. Po, The training of Karhunen-Loève transform matrix and its application for H.264 intra coding, Multimedia Tools & Appl., 41, pp. 111 - 123, 2009.
- [32] H. Houjou, Coarse Graining of Intermolecular Vibrations by a Karhunen-Loève Transformation of Atomic Displacement Vectors, J. Chem. Theory Comput., 5 (7), pp. 1814 - 1821, 2009.
- [33] H. Chen, B. Yin, G. Fang, Y. Wang, Comparison of nonlinear and linear PCA on surface wind, surface height, and SST in the South China Sea, Chinese J. of Oceanology and Limnology, 28, 5, pp. 981 - 989, 2010.
- [34] A. Dubey, V. Yadava, Multi-objective optimization of Nd:YAG laser cutting of nickel-based superalloy sheet using orthogonal array with principal component analysis, Optics and Lasers in Engineering, 46, 2, pp. 124 - 132, 2008.
- [35] G. H. Golub and C. F. van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1996.
- [36] J. Castrillon-Candas, K. Amaratunga, Fast estimation of continuous Karhunen-Loève eigenfunctions using wavelets, IEEE Trans. on Signal Processing, 50, 1, pp. 78 - 86, 2002.
- [37] A. Ben-Israel and T. N. E. Greville, Generalized Inverses: Theory and Applications, John Wiley & Sons, New York, 1974.
- [38] S. Friedland and A. P. Torokhti, Generalized rank-constrained matrix approximations, SIAM J. Matrix Anal. Appl., 29, issue 2, pp. 656 - 659, 2007.
- [39] S. Friedland, A. Niknejad and L. Chihara, A simultaneous reconstruction of missing data in DNA microarrays, Linear Alg. Appl., 416, pp. 8 - 28, 2006.