
Condition numbers for the truncated total least squares problem and their estimations

Qing-Le Meng1, Huai-An Diao2, Zheng-Jian Bai3
1 School of Mathematical Sciences, Xiamen University, Xiamen 361005, P.R. China. Email: [email protected]
2 Corresponding author. School of Mathematics and Statistics, Northeast Normal University, No. 5268 Renmin
   Street, Changchun 130024, P.R. China. Email: [email protected]
3 School of Mathematical Sciences and Fujian Provincial Key Laboratory on Mathematical Modeling & High Performance Scientific Computing, Xiamen University, Xiamen 361005, P.R. China. Email: [email protected]
The research of Z.-J. Bai is partially supported by the National Natural Science Foundation of China (No. 11671337) and the Fundamental Research Funds for the Central Universities (No. 20720180008).

Abstract. In this paper, we present explicit expressions for the mixed and componentwise condition numbers of the truncated total least squares (TTLS) solution of $A\boldsymbol{x}\approx\boldsymbol{b}$ under the genericity condition, where $A$ is an $m\times n$ real data matrix and $\boldsymbol{b}$ is a real $m$-vector. Moreover, we show that the normwise, componentwise and mixed condition numbers for the TTLS problem recover the previously known counterparts for the total least squares (TLS) problem when the truncation level for the TTLS problem is $n$. When $A$ is a structured matrix, structured perturbations for the structured truncated TLS (STTLS) problem are investigated, and explicit expressions for the structured normwise, componentwise and mixed condition numbers of the STTLS problem are obtained. Furthermore, the relationships between the structured and unstructured normwise, componentwise and mixed condition numbers of the STTLS problem are studied. Based on small sample statistical condition estimation, reliable condition estimation algorithms are proposed for both the unstructured and structured normwise, mixed and componentwise cases, which utilize the SVD of the augmented matrix $[A~\boldsymbol{b}]$. The proposed condition estimation algorithms can be integrated into the SVD-based direct solver for small- and medium-size TTLS problems to give error estimates for the numerical TTLS solution. Numerical experiments are reported to illustrate the reliability of the proposed condition estimation algorithms.

Keywords: Truncated total least squares, normwise perturbation, componentwise perturbation, structured perturbation, singular value decomposition, small sample statistical condition estimation.
AMS subject classifications: 15A09, 65F20, 65F35

1. Introduction

In this paper, we consider the following linear model

$A{\boldsymbol{x}}\approx{\boldsymbol{b}}, \qquad (1.1)$

where the data matrix $A\in{\mathbb{R}}^{m\times n}$ and the observation vector ${\boldsymbol{b}}\in{\mathbb{R}}^{m}$ are both perturbed. When $m>n$, the linear model (1.1) is overdetermined. To find a solution to (1.1), one may solve the following minimization problem:

$\begin{array}{cc}\min&\big\|[\Delta A~\Delta{\boldsymbol{b}}]\big\|_{F}\\ \mbox{subject to (s.t.)}&(A+\Delta A){\boldsymbol{x}}={\boldsymbol{b}}+\Delta{\boldsymbol{b}},\end{array} \qquad (1.2)$

where $\|\cdot\|_{F}$ denotes the Frobenius matrix norm. This is the classical total least squares (TLS) problem, which was originally proposed by Golub and Van Loan [11].

The TLS problem is often used for the linear model (1.1) when the augmented matrix $[A~{\boldsymbol{b}}]$ is rank deficient, i.e., the small singular values of $[A~{\boldsymbol{b}}]$ are assumed to be separated from the others. The truncated total least squares (TTLS) method solves the linear model (1.1) after the small singular values of $[A~{\boldsymbol{b}}]$ are set to zero. For a discussion of the TTLS method, one may refer to [27, §3.6.1] and [6, 8]. The TTLS problem arises in various applications such as linear system theory, computer vision, image reconstruction, system identification, speech and audio processing, modal and spectral analysis, and astronomy. An overview of the TTLS method can be found in [25].

Let $k$ be the predefined truncation level, where $1\leq k\leq n$. The TTLS problem aims to solve the following problem:

${\boldsymbol{x}}_{k}=\arg\min\|{\boldsymbol{x}}\|_{2},\quad\mbox{subject to }A_{k}{\boldsymbol{x}}={\boldsymbol{b}}_{k}, \qquad (1.3)$

where $\|\cdot\|_{2}$ denotes the Euclidean vector norm or its induced matrix norm and $[A_{k}~{\boldsymbol{b}}_{k}]$ is the best rank-$k$ approximation of $[A~{\boldsymbol{b}}]$ in the Frobenius norm. The TTLS problem can be viewed as a regularized version of the TLS problem (1.2), obtained by truncating the small singular values of $[A~{\boldsymbol{b}}]$ to zero (cf. [6, 8]).

In order to solve (1.3), we first recall the singular value decomposition (SVD) of $[A~{\boldsymbol{b}}]$, which is given by

$[A~{\boldsymbol{b}}]=U\Sigma V^{\top}, \qquad (1.4)$

where $U\in{\mathbb{R}}^{m\times m}$ and $V\in{\mathbb{R}}^{(n+1)\times(n+1)}$ are orthogonal matrices and $\Sigma$ is an $m\times(n+1)$ real matrix with the vector $[\sigma_{1},\ldots,\sigma_{p}]^{\top}$ on its diagonal, where $\sigma_{1}\geq\sigma_{2}\geq\cdots\geq\sigma_{p}\geq 0$ and $p=\min\{m,n+1\}$. Here, $\mathop{\rm diag}\nolimits({\boldsymbol{x}})$ denotes a diagonal matrix with the vector ${\boldsymbol{x}}$ on its diagonal and the superscript ``$\cdot^{\top}$'' takes the transpose of a matrix or vector. Then we have $[A_{k}~{\boldsymbol{b}}_{k}]=U\Sigma_{k}V^{\top}$, where $\Sigma_{k}=\mathop{\rm diag}\nolimits([\sigma_{1},\ldots,\sigma_{k},0,\ldots,0])\in{\mathbb{R}}^{m\times(n+1)}$. Suppose the truncation level $k<\min\{m,n+1\}$ satisfies the condition

$\sigma_{k}>\sigma_{k+1}. \qquad (1.5)$

Define

$V=\left[\begin{array}{cc}V_{11}&V_{12}\\ V_{21}&V_{22}\end{array}\right],\quad V_{11}\in{\mathbb{R}}^{n\times k}. \qquad (1.6)$

If

$V_{22}\neq 0, \qquad (1.7)$

then the TTLS problem is generic and the TTLS solution ${\boldsymbol{x}}_{k}$ is given by [14]

${\boldsymbol{x}}_{k}=-\frac{1}{\|V_{22}\|_{2}^{2}}V_{12}V_{22}^{\top}. \qquad (1.8)$
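Formula (1.8) can be evaluated directly from the SVD of the augmented matrix. The following minimal NumPy sketch (ours, not from the paper; the function name is illustrative) computes ${\boldsymbol{x}}_k$ under the genericity condition:

```python
import numpy as np

def ttls_solution(A, b, k):
    """TTLS solution x_k of (1.3) via (1.8):
    x_k = -V12 V22^T / ||V22||_2^2, from the SVD of [A b]."""
    m, n = A.shape
    H = np.column_stack([A, b])
    U, s, Vt = np.linalg.svd(H)      # full SVD: H = U @ Sigma @ Vt
    V = Vt.T
    V12 = V[:n, k:]                  # n x (n-k+1) block of (1.6)
    V22 = V[n:, k:]                  # 1 x (n-k+1) block of (1.6)
    # genericity: sigma_k > sigma_{k+1}, cf. (1.5), and V22 != 0, cf. (1.7)
    assert s[k - 1] > s[k] and np.linalg.norm(V22) > 0
    return -(V12 @ V22.T).ravel() / np.linalg.norm(V22) ** 2
```

Under the genericity condition the returned vector satisfies $A_k{\boldsymbol{x}}_k={\boldsymbol{b}}_k$ exactly, with $[A_k~{\boldsymbol{b}}_k]$ the best rank-$k$ approximation of $[A~{\boldsymbol{b}}]$.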

Condition numbers measure the worst-case sensitivity of the solution of a problem to small perturbations in the input data. Normwise condition numbers for the TLS problem (1.2) under the genericity condition were studied in [1, 16], where SVD-based explicit formulas for normwise condition numbers were derived. The normwise condition number of the truncated SVD solution to a linear model of the form (1.1) was introduced in [2]. When the data is sparse or badly scaled, it is more suitable to consider componentwise perturbations, since normwise perturbations only measure the perturbation of the data by norms and may ignore the relative size of the perturbation on small (or zero) entries (cf. [15]). There are two types of condition numbers under componentwise perturbations. The mixed condition number measures the errors in the output using norms and the input perturbations componentwise, while the componentwise condition number measures both the error in the output and the perturbation in the input componentwise (cf. [9]). Kronecker product based formulas for the mixed and componentwise condition numbers of the TLS problem (1.2) were derived in [31, 4]. The corresponding componentwise perturbation analysis for the multidimensional TLS problem and the mixed least squares-TLS problem can be found in [29, 30].

Gratton et al. [14] investigated the normwise condition number for the TTLS problem (1.3). The normwise condition number formula and computable upper bounds for the TTLS solution (1.3) were derived (cf. [14, Theorems 2.4-2.6]), which rely on the SVD of the augmented matrix $[A~{\boldsymbol{b}}]$. Since the normwise condition number formula for the TTLS problem (1.3) involves Kronecker products, which are not easy to compute or evaluate even for medium-size TTLS problems, condition estimation methods based on the power method [15] or the Golub-Kahan-Lanczos (GKL) bidiagonalization algorithm [10] were proposed to estimate the spectral norm of the Fréchet derivative matrix related to (1.3). Furthermore, as pointed out in [14], first-order perturbation bounds based on the normwise condition number can significantly improve the previous normwise perturbation results in [7, 28] for (1.3).

As mentioned before, when the TTLS problem (1.3) is sparse or badly scaled, which often occurs in scientific computing, conditioning based on normwise perturbation analysis may severely overestimate the true error of the numerical solution to (1.3). Indeed, from the numerical results for Example 5.1 in Section 5, the TTLS problem (1.3) with respect to the specific data $A$ and ${\boldsymbol{b}}$ is well-conditioned under componentwise perturbation analysis while it is very ill-conditioned under normwise perturbations, which implies that the normwise relative error bounds for the numerical solution to (1.3) are pessimistic. In this paper, we propose the mixed and componentwise condition numbers for the TTLS problem (1.3) and derive the corresponding explicit expressions, which capture the true conditioning of (1.3) with respect to the sparsity and scaling of the input data. As shown in Example 5.1, the introduced mixed and componentwise condition numbers for (1.3) can be much smaller than the normwise condition number of [14], which can improve the first-order perturbation bounds for (1.3) significantly. Furthermore, when the truncation level $k$ in (1.3) is selected to be $n$, (1.3) reduces to (1.2). The normwise, mixed and componentwise condition numbers for the TTLS problem (1.3) are shown to be mathematically equivalent to the corresponding ones [1, 16, 31] for the untruncated case through their explicit expressions.

Structured TLS problems [17, 22, 25] have been studied extensively in the past decades. For structured TLS problems, it is natural to investigate structured perturbations of the input data, because structure-preserving algorithms that preserve the underlying matrix structure can enhance the accuracy and efficiency of the TLS solution computation. Structured condition numbers for structured TLS problems can be found in [23, 4, 5] and the references therein. In this paper, we introduce a structured perturbation analysis for the structured TTLS (STTLS) problem. Explicit structured normwise, mixed and componentwise condition numbers for the STTLS problem are obtained, and their relationships to the corresponding unstructured ones are investigated.

The Kronecker product based expressions for both the unstructured and structured normwise, mixed and componentwise condition numbers of the TTLS solution in Theorems 2.1 and 2.2 involve matrices of large dimension, which prevents the efficient calculation of these condition numbers. In practice, it is important to estimate condition numbers efficiently, since the forward error of a numerical solution can be obtained by combining condition numbers with backward errors. In this paper, based on small sample statistical condition estimation (SCE) [18], we propose reliable condition estimation algorithms for both the unstructured and structured normwise, mixed and componentwise condition numbers of the TTLS solution, which utilize the SVD of $[A~{\boldsymbol{b}}]$ to reduce the computational cost. Furthermore, the proposed condition estimation algorithms can be integrated into the SVD-based direct solver for small- or medium-size TTLS problems (1.3). Therefore, one can obtain reliable forward error estimates for the numerical TTLS solution after running the proposed condition estimation algorithms. The main computational cost of condition estimation for (1.3) is the evaluation of directional derivatives with respect to the directions generated during the loops of the estimation algorithms. We point out that the power method [15] for estimating the normwise condition number in [14] needs to evaluate directional derivatives twice per loop, whereas only one directional derivative evaluation per loop is needed in Algorithms 1 to 3. Therefore, compared with the normwise condition number estimation algorithm proposed in [14], the condition estimation algorithms proposed in this paper are more efficient in terms of computational complexity; they are also applicable to estimating the componentwise and structured perturbations for (1.3).
For recent developments of SCE for (structured) linear systems, linear least squares and TLS problems, we refer to [20, 19, 21, 5] and the references therein.

The rest of this paper is organized as follows. In Section 2 we review previous perturbation results on the TTLS problem and derive explicit expressions for the mixed and componentwise condition numbers. The structured normwise, mixed and componentwise condition numbers are also investigated in Section 2, where the relationships between the unstructured normwise, mixed and componentwise condition numbers for (1.3) and the corresponding structured counterparts are studied. In Section 3 we establish the relationship between the normwise, componentwise and mixed condition numbers for the TTLS problem and the corresponding counterparts for the untruncated TLS problem. Section 4 is devoted to proposing several condition estimation algorithms for the normwise, mixed and componentwise condition numbers of the TTLS problem; structured condition estimation is also considered there. In Section 5, numerical examples are presented to illustrate the efficiency and reliability of the proposed algorithms and to report the perturbation bounds based on the proposed condition numbers. Finally, some concluding remarks are drawn in the last section.

2. Condition numbers for the TTLS problem

In this section we review previous perturbation results on the TTLS problem. The explicit expressions of the mixed and componentwise condition numbers for the TTLS problem are derived. Furthermore, for the structured TTLS problem, we propose the normwise, mixed and componentwise condition numbers and derive explicit formulas for them. The relationships between the unstructured normwise, mixed and componentwise condition numbers for (1.3) and the corresponding structured counterparts are investigated. We first introduce some notation.

Throughout this paper, we use the following notation. Let $\|\cdot\|_{\infty}$ be the vector $\infty$-norm or its induced matrix norm. Let $I_{n}$ be the identity matrix of order $n$. Let ${\boldsymbol{e}}_{j}$ be the $j$-th column of an identity matrix of appropriate dimension. The superscripts ``$\cdot^{-1}$'' and ``$\cdot^{\dagger}$'' denote the inverse and the Moore-Penrose inverse of a matrix, respectively. The symbol ``$\boxdot$'' denotes the componentwise multiplication of two matrices of conformal dimensions. For any matrix $B=(b_{ij})$, let $|B|=(|b_{ij}|)$, where $|b_{ij}|$ denotes the absolute value of $b_{ij}$. For any two matrices $B,C\in{\mathbb{R}}^{m\times n}$, $|B|\leq|C|$ means $|b_{ij}|\leq|c_{ij}|$ for all $1\leq i\leq m$ and $1\leq j\leq n$. For any ${\boldsymbol{x}},{\boldsymbol{y}}\in{\mathbb{R}}^{n}$, we define ${\boldsymbol{z}}:=\frac{{\boldsymbol{x}}}{{\boldsymbol{y}}}$ by

${\boldsymbol{z}}_{i}=\left\{\begin{array}{ll}{\boldsymbol{x}}_{i}/{\boldsymbol{y}}_{i},&\mbox{if }{\boldsymbol{y}}_{i}\neq 0,\\ 0,&\mbox{if }{\boldsymbol{x}}_{i}={\boldsymbol{y}}_{i}=0,\\ \infty,&\mbox{otherwise}.\end{array}\right.$
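This componentwise ratio is used later in the componentwise condition number; a small NumPy helper (ours, name illustrative) that implements exactly the three cases above is:

```python
import numpy as np

def cwise_ratio(x, y):
    """Componentwise ratio z = x / y as defined above:
    z_i = x_i / y_i if y_i != 0; 0 if x_i = y_i = 0; inf otherwise."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.full(x.shape, np.inf)     # default: x_i != 0, y_i = 0
    nz = y != 0
    z[nz] = x[nz] / y[nz]
    z[(x == 0) & (y == 0)] = 0.0
    return z
```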

Let ${\sf vec}(B)$ be the column vector obtained by stacking the columns of $B$ on top of one another. For a vector ${\boldsymbol{b}}\in{\mathbb{R}}^{mn}$, let $B={\sf unvec}({\boldsymbol{b}})\in{\mathbb{R}}^{m\times n}$, where $B_{ij}={\boldsymbol{b}}_{i+(j-1)m}$ for $i=1,\ldots,m$ and $j=1,\ldots,n$. The symbol ``$\otimes$'' denotes the Kronecker product, and $\Pi_{m,n}\in{\mathbb{R}}^{mn\times mn}$ is the permutation matrix defined by

${\sf vec}(B^{\top})=\Pi_{m,n}{\sf vec}(B),\quad\forall B\in{\mathbb{R}}^{m\times n}. \qquad (2.1)$

Given the matrices $X\in{\mathbb{R}}^{m\times n}$, $D\in{\mathbb{R}}^{n\times p}$ and $Y\in{\mathbb{R}}^{p\times q}$, and $X_{1},X_{2},Y_{1},Y_{2}$ of appropriate dimensions, we have the following properties of the Kronecker product and the vec operator [13]:

$\left\{\begin{array}{c}{\sf vec}(XDY)=(Y^{\top}\otimes X){\sf vec}(D),\\ (X_{1}\otimes X_{2})(Y_{1}\otimes Y_{2})=(X_{1}Y_{1})\otimes(X_{2}Y_{2}),\\ \Pi_{p,m}(Y\otimes X)=(X\otimes Y)\Pi_{n,q}.\end{array}\right. \qquad (2.2)$

2.1. Preliminaries

In this subsection, we recall the definition of the absolute normwise condition number of the TTLS solution ${\boldsymbol{x}}_{k}$ defined by (1.3) (cf. [14]). The absolute normwise condition number of ${\boldsymbol{x}}_{k}$ in (1.3) is defined by

$\kappa(A,{\boldsymbol{b}})=\lim_{\epsilon\to 0}\sup_{\|\Delta H\|_{F}\leq\epsilon}\frac{\left\|\psi_{k}([A~{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A~{\boldsymbol{b}}])\right\|_{2}}{\|\Delta H\|_{F}}, \qquad (2.3)$

where the function ψk\psi_{k} is given by

$\psi_{k}:{\mathbb{R}}^{m\times n}\times{\mathbb{R}}^{m}\rightarrow{\mathbb{R}}^{n}:[A~{\boldsymbol{b}}]\mapsto{\boldsymbol{x}}_{k}. \qquad (2.4)$

Let the SVD of $[A~{\boldsymbol{b}}]\in{\mathbb{R}}^{m\times(n+1)}$ be given by (1.4). If the truncation level $k$ satisfies the conditions (1.5) and (1.7), then the explicit expression of $\kappa(A,{\boldsymbol{b}})$ is given by [14, Theorem 2.4]

$\kappa(A,{\boldsymbol{b}})=\|M_{k}\|_{2}, \qquad (2.5)$

where

$M_{k}=\frac{1}{\|V_{22}\|_{2}^{2}}[I_{n}\quad{\boldsymbol{x}}_{k}]\,VKD^{-1}[I_{k}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}\otimes I_{n-k+1}]\,W \qquad (2.6)$

with

\begin{align*}
\Sigma_{1}&=\mathop{\rm diag}\nolimits([\sigma_{1},\ldots,\sigma_{k}])\in{\mathbb{R}}^{k\times k},\quad \Sigma_{2}=\mathop{\rm diag}\nolimits([\sigma_{k+1},\ldots,\sigma_{p}])\in{\mathbb{R}}^{(m-k)\times(n-k+1)},\\
\sigma_{1}&\geq\cdots\geq\sigma_{k}>\sigma_{k+1}\geq\cdots\geq\sigma_{p}\geq 0,\quad k<p=\min\{m,n+1\},\\
K&=\begin{bmatrix}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}\\ V_{21}\otimes I_{n-k+1}\end{bmatrix},\quad D=\Sigma_{1}^{2}\otimes I_{n-k+1}-I_{k}\otimes(\Sigma_{2}^{\top}\Sigma_{2}),\\
W&=\begin{bmatrix}V_{1}^{\top}\otimes U_{2}^{\top}\\ \Pi_{n-k+1,k}(V_{2}^{\top}\otimes U_{1}^{\top})\end{bmatrix},\quad U=[U_{1}\quad U_{2}],\quad V=[V_{1}\quad V_{2}],\\
V_{1}&=\begin{bmatrix}V_{11}\\ V_{21}\end{bmatrix}\in{\mathbb{R}}^{(n+1)\times k},\quad V_{2}=\begin{bmatrix}V_{12}\\ V_{22}\end{bmatrix}\in{\mathbb{R}}^{(n+1)\times(n-k+1)},\quad U_{1}\in{\mathbb{R}}^{m\times k},\quad U_{2}\in{\mathbb{R}}^{m\times(m-k)}.
\end{align*}

Note that the dimension of $M_{k}$ may be large even for medium-size TTLS problems. The explicit formula for $\kappa(A,{\boldsymbol{b}})$ given by (2.5) involves the computation of the spectral norm of $M_{k}$. Hence, upper bounds for $\kappa(A,{\boldsymbol{b}})$ are obtained in [14, §2.4], which rely only on the singular values of $[A~{\boldsymbol{b}}]$ and on $\|{\boldsymbol{x}}_{k}\|_{2}$. When the data is sparse or badly scaled, the normwise condition number $\kappa(A,{\boldsymbol{b}})$ may not reveal the conditioning of (1.3), since normwise perturbations ignore the relative size of the perturbation on small (or zero) entries. Therefore, it is more suitable to consider a componentwise perturbation analysis of (1.3) when the data is sparse or badly scaled. In the next subsection, we introduce the mixed and componentwise condition numbers for (1.3).

As noted in [14, §2.3], if both ${\boldsymbol{x}}_{k}$ and the full SVD of $[A~{\boldsymbol{b}}]$ are available, then one may compute $\|M_{k}\|_{2}$ by applying the power method [15, Chap. 15] or the Golub-Kahan-Lanczos (GKL) bidiagonalization algorithm [10] to $M_{k}$, where only matrix-vector products are needed. However, as pointed out in the introduction, the normwise condition number estimation algorithm in [14] is devised based on the power method [15], which needs to evaluate the matrix-vector products $M_{k}{\boldsymbol{f}}$ and $M_{k}^{\top}{\boldsymbol{g}}$ in each loop for suitably dimensioned vectors ${\boldsymbol{f}}$ and ${\boldsymbol{g}}$. In Section 4, SCE-based condition estimation algorithms for (1.3) are proposed, where in each loop we only need to compute the directional derivative $M_{k}{\boldsymbol{f}}$; the matrix-vector product $M_{k}^{\top}{\boldsymbol{g}}$ is not involved. Therefore, compared with the normwise condition number estimation algorithm in [14], the SCE-based condition estimation algorithms in Section 4 are more efficient.
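The SCE idea used in Section 4 can be sketched generically: draw a few orthonormalized random directions, evaluate the directional derivative along each (here abstracted as a callable such as ${\boldsymbol{f}}\mapsto M_k{\boldsymbol{f}}$), and rescale by Wallis factors. The following is a minimal sketch of this simplified scheme (ours, not the paper's Algorithms 1 to 3; the function name and the per-component estimator are illustrative):

```python
import numpy as np

def sce_estimate(dirder, dim, q=3, rng=None):
    """Small-sample statistical condition estimation (Kenney-Laub style
    sketch, our simplification): estimate the 2-norm of each row of the
    Frechet derivative from q directional derivatives dirder(z) along
    orthonormalized random directions z, rescaled by Wallis factors."""
    rng = np.random.default_rng(0) if rng is None else rng
    omega = lambda t: np.sqrt(2.0 / (np.pi * (t - 0.5)))   # Wallis factor
    Z, _ = np.linalg.qr(rng.standard_normal((dim, q)))     # orthonormal dirs
    Y = np.column_stack([dirder(Z[:, j]) for j in range(q)])
    return (omega(q) / omega(dim)) * np.sqrt((Y ** 2).sum(axis=1))
```

Only $q$ directional derivative evaluations are needed, and no transposed products; when $q=\dim$ the estimate reduces to the exact row norms.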

2.2. Mixed and componentwise condition numbers

In Lemma 2.1 below, the first-order perturbation expansion of $\psi_{k}$ with respect to perturbations of the data $A$ and ${\boldsymbol{b}}$ is reviewed, which involves Kronecker products. In order to avoid forming Kronecker products explicitly in the expression for the directional derivative of $\psi_{k}$, we derive the equivalent formula (2.7) in Lemma 2.2. Furthermore, the directional derivative (2.7) can be used to reduce the memory requirements of the SCE-based condition estimation algorithms in Section 4.

Lemma 2.1.

[14, Theorem 2.4] Let the SVD of the augmented matrix $[A~{\boldsymbol{b}}]\in{\mathbb{R}}^{m\times(n+1)}$ be given by (1.4). Suppose $k$ is a truncation level such that $V_{22}\neq 0$ and $\sigma_{k}>\sigma_{k+1}$. If $[\tilde{A}~\tilde{\boldsymbol{b}}]=[A~{\boldsymbol{b}}]+\Delta H$ with $\|\Delta H\|_{F}$ sufficiently small, then, for the TTLS solution ${\boldsymbol{x}}_{k}$ of $A{\boldsymbol{x}}\approx{\boldsymbol{b}}$ and the TTLS solution $\tilde{\boldsymbol{x}}_{k}$ of $\tilde{A}{\boldsymbol{x}}\approx\tilde{\boldsymbol{b}}$, we have

$\tilde{\boldsymbol{x}}_{k}={\boldsymbol{x}}_{k}+M_{k}\,{\sf vec}(\Delta H)+{\mathcal{O}}(\|\Delta H\|_{F}^{2}).$
Lemma 2.2.

Under the same assumptions as in Lemma 2.1, if $[\tilde{A}~\tilde{\boldsymbol{b}}]=[A~{\boldsymbol{b}}]+[\Delta A~\Delta{\boldsymbol{b}}]\equiv[A~{\boldsymbol{b}}]+\Delta H$ with $\|\Delta H\|_{F}$ sufficiently small, then the directional derivative of ${\boldsymbol{x}}_{k}$ at $[A~{\boldsymbol{b}}]$ in the direction $[\Delta A~\Delta{\boldsymbol{b}}]$ is given by

$\psi_{k}^{\prime}([A~{\boldsymbol{b}}];[\Delta A~\Delta{\boldsymbol{b}}])=\frac{1}{\|V_{22}\|_{2}^{2}}\left(V_{11}\,(Z_{1}^{\top}+Z_{2})V_{22}^{\top}+V_{12}\,(Z_{1}+Z_{2}^{\top})V_{21}^{\top}+{\boldsymbol{x}}_{k}\sum_{j=1}^{4}c_{j}\right), \qquad (2.7)$

where

\begin{align*}
Z_{1}&=\left(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1}\right)\boxdot{\mathcal{D}}\in{\mathbb{R}}^{(n-k+1)\times k},\quad Z_{2}=\left(\Sigma_{1}^{\top}U_{1}^{\top}\Delta H\,V_{2}\right)\boxdot{\mathcal{D}}^{\top}\in{\mathbb{R}}^{k\times(n-k+1)},\\
c_{1}&=V_{21}Z_{1}^{\top}V_{22}^{\top},\quad c_{2}=V_{21}Z_{2}V_{22}^{\top},\quad c_{3}=V_{22}Z_{1}V_{21}^{\top},\quad c_{4}=V_{22}Z_{2}^{\top}V_{21}^{\top},
\end{align*}

and ${\mathcal{D}}=[{\mathcal{D}}(:,1),\ldots,{\mathcal{D}}(:,k)]\in{\mathbb{R}}^{(n-k+1)\times k}$ with

${\mathcal{D}}(:,i)=\left\{\begin{array}{ll}\begin{bmatrix}(\sigma_{i}^{2}-\sigma_{k+1}^{2})^{-1}\\ \vdots\\ (\sigma_{i}^{2}-\sigma_{m}^{2})^{-1}\\ \sigma_{i}^{-2}\\ \vdots\\ \sigma_{i}^{-2}\end{bmatrix}\in{\mathbb{R}}^{n-k+1},&\mbox{if }m<n+1,\\ \begin{bmatrix}(\sigma_{i}^{2}-\sigma_{k+1}^{2})^{-1}\\ \vdots\\ (\sigma_{i}^{2}-\sigma_{n+1}^{2})^{-1}\end{bmatrix}\in{\mathbb{R}}^{n-k+1},&\mbox{if }m\geq n+1.\end{array}\right. \qquad (2.8)$

Proof.   From Lemma 2.1 we have

$\psi_{k}^{\prime}([A~{\boldsymbol{b}}];[\Delta A~\Delta{\boldsymbol{b}}])=M_{k}\,{\sf vec}(\Delta H),$

where MkM_{k} is defined by (2.6). Using (2.2), it is easy to verify that

\begin{align*}
[I_{k}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}\otimes I_{n-k+1}]\,W&=(I_{k}\otimes\Sigma_{2}^{\top})\,(V_{1}^{\top}\otimes U_{2}^{\top})+(\Sigma_{1}^{\top}\otimes I_{n-k+1})\,\Pi_{n-k+1,k}(V_{2}^{\top}\otimes U_{1}^{\top})\\
&=V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})+(\Sigma_{1}^{\top}U_{1}^{\top}\otimes V_{2}^{\top})\Pi_{m,n+1}. \qquad (2.9)
\end{align*}

Using the fact that ${\sf vec}(\Delta H)=[{\sf vec}(\Delta A)^{\top}~{\sf vec}(\Delta{\boldsymbol{b}})^{\top}]^{\top}$ and (2.2), we have

\begin{align*}
[I_{k}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}\otimes I_{n-k+1}]\,W{\sf vec}(\Delta H)&=\left(V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})\right){\sf vec}(\Delta H)+(\Sigma_{1}^{\top}U_{1}^{\top}\otimes V_{2}^{\top})\Pi_{m,n+1}\,{\sf vec}(\Delta H)\\
&={\sf vec}(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1})+{\sf vec}(V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1}). \qquad (2.10)
\end{align*}

From (2.6), we see that the $i$-th diagonal block $D^{(i)}$ of $D$ is given by

$D^{(i)}=\left\{\begin{array}{ll}\mathop{\rm diag}\nolimits([\sigma_{i}^{2}-\sigma_{k+1}^{2},\ldots,\sigma_{i}^{2}-\sigma_{m}^{2},\sigma_{i}^{2},\ldots,\sigma_{i}^{2}]^{\top})\in{\mathbb{R}}^{(n-k+1)\times(n-k+1)},&\mbox{if }m<n+1,\\ \mathop{\rm diag}\nolimits([\sigma_{i}^{2}-\sigma_{k+1}^{2},\ldots,\sigma_{i}^{2}-\sigma_{n+1}^{2}]^{\top})\in{\mathbb{R}}^{(n-k+1)\times(n-k+1)},&\mbox{if }m\geq n+1,\end{array}\right. \qquad (2.11)$

for $i=1,\ldots,k$. By the definition of ${\mathcal{D}}\in{\mathbb{R}}^{(n-k+1)\times k}$ we have

$\left\{\begin{array}{c}D^{-1}{\sf vec}(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1})={\sf vec}\left(\left(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1}\right)\boxdot{\mathcal{D}}\right),\\ D^{-1}{\sf vec}(V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1})={\sf vec}\left(\left(V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1}\right)\boxdot{\mathcal{D}}\right).\end{array}\right. \qquad (2.12)$

Then, using the partition of $V$ given by (1.6), we have

\begin{align*}
[I_{n}~{\boldsymbol{x}}_{k}]VK&=[I_{n}~{\boldsymbol{x}}_{k}]\begin{bmatrix}V_{11}&V_{12}\\ V_{21}&V_{22}\end{bmatrix}\begin{bmatrix}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}\\ V_{21}\otimes I_{n-k+1}\end{bmatrix}\\
&=V_{11}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}+V_{12}(V_{21}\otimes I_{n-k+1})+{\boldsymbol{x}}_{k}V_{21}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}+{\boldsymbol{x}}_{k}V_{22}(V_{21}\otimes I_{n-k+1}).
\end{align*}

This, together with (2.10) and (2.12), yields

\begin{align*}
&[I_{n}~{\boldsymbol{x}}_{k}]VKD^{-1}[I_{k}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}\otimes I_{n-k+1}]\,W{\sf vec}(\Delta H)\\
=\;&\Big(V_{11}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}+V_{12}(V_{21}\otimes I_{n-k+1})+{\boldsymbol{x}}_{k}V_{21}(V_{22}\otimes I_{k})\Pi_{n-k+1,k}+{\boldsymbol{x}}_{k}V_{22}(V_{21}\otimes I_{n-k+1})\Big)\\
&\cdot\Big({\sf vec}\left((\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1})\boxdot{\mathcal{D}}\right)+{\sf vec}\left((V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1})\boxdot{\mathcal{D}}\right)\Big)\\
=\;&V_{11}\left(\left(V_{1}^{\top}\Delta H^{\top}U_{2}\Sigma_{2}\right)\boxdot{\mathcal{D}}^{\top}\right)V_{22}^{\top}+V_{11}\left(\left(\Sigma_{1}^{\top}U_{1}^{\top}\Delta H\,V_{2}\right)\boxdot{\mathcal{D}}^{\top}\right)V_{22}^{\top}\\
&+V_{12}\left(\left(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1}\right)\boxdot{\mathcal{D}}\right)V_{21}^{\top}+V_{12}\left(\left(V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1}\right)\boxdot{\mathcal{D}}\right)V_{21}^{\top}\\
&+{\boldsymbol{x}}_{k}V_{21}\left(\left(V_{1}^{\top}\Delta H^{\top}U_{2}\Sigma_{2}\right)\boxdot{\mathcal{D}}^{\top}\right)V_{22}^{\top}+{\boldsymbol{x}}_{k}V_{21}\left(\left(\Sigma_{1}^{\top}U_{1}^{\top}\Delta H\,V_{2}\right)\boxdot{\mathcal{D}}^{\top}\right)V_{22}^{\top}\\
&+{\boldsymbol{x}}_{k}V_{22}\left(\left(\Sigma_{2}^{\top}U_{2}^{\top}\Delta H\,V_{1}\right)\boxdot{\mathcal{D}}\right)V_{21}^{\top}+{\boldsymbol{x}}_{k}V_{22}\left(\left(V_{2}^{\top}\Delta H^{\top}U_{1}\Sigma_{1}\right)\boxdot{\mathcal{D}}\right)V_{21}^{\top}.
\end{align*}

This completes the proof. ∎
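For SCE-type estimation, the Kronecker-free formula (2.7)-(2.8) is what one actually evaluates. A NumPy sketch of the directional derivative (our code, not the paper's; the symbol $\boxdot$ becomes elementwise `*`):

```python
import numpy as np

def ttls_dirder(A, b, k, dH):
    """Directional derivative psi_k'([A b]; dH) of the TTLS solution,
    evaluated Kronecker-free via (2.7)-(2.8)."""
    m, n = A.shape
    r, p = n - k + 1, min(m, n + 1)
    U, s, Vt = np.linalg.svd(np.column_stack([A, b]))
    V = Vt.T
    U1, U2 = U[:, :k], U[:, k:]
    V1, V2 = V[:, :k], V[:, k:]
    V11, V12 = V[:n, :k], V[:n, k:]
    V21, V22 = V[n:, :k], V[n:, k:]
    S1 = np.diag(s[:k])                                  # Sigma_1
    S2 = np.zeros((m - k, r))                            # Sigma_2
    S2[:p - k, :p - k] = np.diag(s[k:p])
    xk = -(V12 @ V22.T) / np.linalg.norm(V22) ** 2       # n x 1, by (1.8)
    Dc = np.empty((r, k))                                # the matrix D of (2.8)
    for i in range(k):
        Dc[:, i] = 1.0 / s[i] ** 2                       # filler if m < n+1
        Dc[:p - k, i] = 1.0 / (s[i] ** 2 - s[k:p] ** 2)
    Z1 = (S2.T @ U2.T @ dH @ V1) * Dc                    # (n-k+1) x k
    Z2 = (S1 @ U1.T @ dH @ V2) * Dc.T                    # k x (n-k+1)
    csum = (V21 @ Z1.T @ V22.T + V21 @ Z2 @ V22.T        # c_1 + ... + c_4
            + V22 @ Z1 @ V21.T + V22 @ Z2.T @ V21.T)
    out = V11 @ (Z1.T + Z2) @ V22.T + V12 @ (Z1 + Z2.T) @ V21.T + xk * csum
    return (out / np.linalg.norm(V22) ** 2).ravel()
```

Each evaluation costs only a handful of small matrix products on top of the (reusable) SVD, which is why one evaluation per SCE loop suffices.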

When the data is sparse or badly scaled, it is more suitable to adopt a componentwise perturbation analysis to investigate the conditioning of the TTLS problem. In the following definition, we introduce the relative mixed and componentwise condition numbers for the TTLS problem.

Definition 2.1.

Suppose the truncation level $k$ is chosen such that $V_{22}\neq 0$ and $\sigma_{k}>\sigma_{k+1}$. The mixed and componentwise condition numbers for the TTLS problem (1.3) are defined as follows:

\begin{align*}
m(A,{\boldsymbol{b}})&=\lim_{\epsilon\to 0}\sup_{|\Delta H|\leq\epsilon|[A~{\boldsymbol{b}}]|}\frac{\left\|\psi_{k}([A~{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A~{\boldsymbol{b}}])\right\|_{\infty}}{\epsilon\|{\boldsymbol{x}}_{k}\|_{\infty}},\\
c(A,{\boldsymbol{b}})&=\lim_{\epsilon\to 0}\sup_{|\Delta H|\leq\epsilon|[A~{\boldsymbol{b}}]|}\frac{1}{\epsilon}\left\|\frac{\psi_{k}([A~{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A~{\boldsymbol{b}}])}{{\boldsymbol{x}}_{k}}\right\|_{\infty}.
\end{align*}

In the following theorem, we give Kronecker product based explicit expressions for $m(A,{\boldsymbol{b}})$ and $c(A,{\boldsymbol{b}})$.

Theorem 2.1.

Suppose the truncation level $k$ is chosen such that $V_{22}\neq 0$ and $\sigma_{k}>\sigma_{k+1}$. Then the mixed and componentwise condition numbers $m(A,{\boldsymbol{b}})$ and $c(A,{\boldsymbol{b}})$ defined in Definition 2.1 for the TTLS problem (1.3) can be characterized by

\begin{align*}
m(A,{\boldsymbol{b}})&=\frac{\Big\||M_{k}|\,{\sf vec}([|A|~|{\boldsymbol{b}}|])\Big\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}, \qquad (2.13\mbox{a})\\
c(A,{\boldsymbol{b}})&=\left\|\frac{|M_{k}|\,{\sf vec}([|A|~|{\boldsymbol{b}}|])}{{\boldsymbol{x}}_{k}}\right\|_{\infty}. \qquad (2.13\mbox{b})
\end{align*}

Proof.   Let ΔH=[ΔAΔ𝒃]\Delta H=[\Delta A~{}\,\Delta{\boldsymbol{b}}], where ΔAm×n\Delta A\in{\mathbb{R}}^{m\times n} and Δ𝒃m\Delta{\boldsymbol{b}}\in{\mathbb{R}}^{m}. For any ϵ>0\epsilon>0, it follows from |ΔH|ϵ|[A𝒃]|\left|\Delta H\right|\leq\epsilon\big{|}\left[A~{}\,{\boldsymbol{b}}\right]\big{|} that

|ΔA|ϵ|A|and|Δ𝒃|ϵ|𝒃|.|\Delta A|\leq\epsilon|A|\quad\mbox{and}\quad|\Delta{\boldsymbol{b}}|\leq\epsilon|{\boldsymbol{b}}|.

Define

ΘA=diag(𝗏𝖾𝖼(A))andΘb=diag(𝒃).\Theta_{A}=\mathop{\rm diag}\nolimits({\sf vec}(A))\quad\mbox{and}\quad\Theta_{b}=\mathop{\rm diag}\nolimits({\boldsymbol{b}}).

By Lemma 2.1 we have for ϵ>0\epsilon>0 sufficiently small,

ψk([A𝒃]+ΔH)ψk([A𝒃])=Mk𝗏𝖾𝖼(ΔH)+𝒪(ΔHF2)\displaystyle\psi_{k}([A~{}{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A~{}{\boldsymbol{b}}])=M_{k}\,{\sf vec}(\Delta H)+{\mathcal{O}}(\|\Delta H\|_{F}^{2})
=\displaystyle= Mk[ΘAΘb][ΘA𝗏𝖾𝖼(ΔA)Θb𝗏𝖾𝖼(Δ𝒃)]+𝒪([ΔAΔ𝒃]F2),\displaystyle M_{k}\begin{bmatrix}\Theta_{A}&\\[5.69054pt] &\Theta_{b}\end{bmatrix}\begin{bmatrix}\Theta_{A}^{\dagger}{\sf vec}(\Delta A)\\[5.69054pt] \Theta_{b}^{\dagger}{\sf vec}(\Delta{\boldsymbol{b}})\end{bmatrix}+{\mathcal{O}}(\|[\Delta A~{}\Delta{\boldsymbol{b}}]\|_{F}^{2}),

and taking infinity norms we have

ψk([A𝒃]+ΔH)ψk([A𝒃])\displaystyle\left\|\psi_{k}([A~{}\,{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A~{}\,{\boldsymbol{b}}])\right\|_{\infty} =\displaystyle= Mk[ΘAΘb][ΘA𝗏𝖾𝖼(ΔA)Θb𝗏𝖾𝖼(Δ𝒃)]+𝒪([ΔAΔ𝒃]F2)\displaystyle\left\|M_{k}\begin{bmatrix}\Theta_{A}&\\[5.69054pt] &\Theta_{b}\end{bmatrix}\begin{bmatrix}\Theta_{A}^{\dagger}{\sf vec}(\Delta A)\\[5.69054pt] \Theta_{b}^{\dagger}{\sf vec}(\Delta{\boldsymbol{b}})\end{bmatrix}\right\|_{\infty}+{\mathcal{O}}(\|[\Delta A~{}\Delta{\boldsymbol{b}}]\|_{F}^{2})
\displaystyle\leq ϵ|Mk|[|ΘA||Θb|]+𝒪(ϵ2),\displaystyle\epsilon\Big{\|}\big{|}M_{k}\big{|}\begin{bmatrix}|\Theta_{A}|&\\[5.69054pt] &|\Theta_{b}|\end{bmatrix}\Big{\|}_{\infty}+{\mathcal{O}}(\epsilon^{2}),

where the fact that 𝒪([ΔAΔ𝒃]F2)𝒪(ϵ2){\mathcal{O}}(\|[\Delta A~{}\Delta{\boldsymbol{b}}]\|_{F}^{2})\leq{\mathcal{O}}(\epsilon^{2}) is used. Thus,

m(A,𝒃)\displaystyle m(A,{\boldsymbol{b}}) \displaystyle\leq |Mk|[|ΘA||Θb|]𝒙k=|Mk|[|ΘA||Θb|]𝟏mn+m𝒙k\displaystyle\frac{\left\||M_{k}|\begin{bmatrix}|\Theta_{A}|&\\[5.69054pt] &|\Theta_{b}|\end{bmatrix}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}=\frac{\left\||M_{k}|\begin{bmatrix}|\Theta_{A}|&\\[5.69054pt] &|\Theta_{b}|\end{bmatrix}{\bf 1}_{mn+m}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}
=\displaystyle= |Mk|[𝗏𝖾𝖼(|A|)|𝒃|]𝒙k=|Mk|𝗏𝖾𝖼([|A||𝒃|])𝒙k,\displaystyle\frac{\left\||M_{k}|\begin{bmatrix}{\sf vec}(|A|)\\ |{\boldsymbol{b}}|\end{bmatrix}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}=\frac{\bigg{\|}|M_{k}|{\sf vec}\left([|A|~{}|{\boldsymbol{b}}|]\right)\bigg{\|}_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}},

where 𝟏mn+m=[1,,1]mn+m{\bf 1}_{mn+m}=[1,\ldots,1]^{\top}\in{\mathbb{R}}^{mn+m}.

On the other hand, let the index aa be such that

|Mk|𝗏𝖾𝖼([|A||𝒃|])=|Mk(a,:)|𝗏𝖾𝖼([|A||𝒃|]),\big{\|}|M_{k}|{\sf vec}\left([|A|~{}|{\boldsymbol{b}}|]\right)\big{\|}_{\infty}=|M_{k}(a,:)|{\sf vec}([|A|~{}|{\boldsymbol{b}}|]),

where |Mk(a,:)||M_{k}(a,:)| denotes the aa-th row of |Mk||M_{k}|. We choose

𝗏𝖾𝖼(ΔH)=ϵΘ𝗏𝖾𝖼([|A||𝒃|]),{\sf vec}(\Delta H)=\epsilon\,\Theta\;{\sf vec}([|A|\,\,|{\boldsymbol{b}}|]),

where Θm(n+1)×m(n+1)\Theta\in{\mathbb{R}}^{m(n+1)\times m(n+1)} is a diagonal matrix such that θjj\theta_{jj}=sign((Mk)aj)((M_{k})_{aj}) for j=1,2,,m(n+1)j=1,2,\ldots,m(n+1). Using (2.2) we have

m(A,𝒃)\displaystyle m(A,{\boldsymbol{b}}) \displaystyle\geq limϵ0ϵMkΘ𝗏𝖾𝖼([|A||𝒃|])+𝒪(ϵ2𝗏𝖾𝖼([|A||𝒃|])22)ϵ𝒙k\displaystyle\lim_{\epsilon\to 0}\frac{\left\|\epsilon M_{k}\Theta\;{\sf vec}([|A|\,\,|{\boldsymbol{b}}|])+{\mathcal{O}}(\epsilon^{2}\|{\sf vec}([|A|\,\,|{\boldsymbol{b}}|])\|_{2}^{2})\right\|_{\infty}}{\epsilon\|{\boldsymbol{x}}_{k}\|_{\infty}}
=\displaystyle= MkΘ𝗏𝖾𝖼([|A||𝒃|])𝒙k\displaystyle\frac{\left\|M_{k}\Theta\;{\sf vec}([|A|\,\,|{\boldsymbol{b}}|])\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}
=\displaystyle= |Mk|𝗏𝖾𝖼([|A||𝒃|])𝒙k.\displaystyle\frac{\left\||M_{k}|{\sf vec}([|A|\,\,|{\boldsymbol{b}}|])\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}.

Therefore, we derive (2.13a). One can use a similar argument to obtain (2.13b). ∎

Remark 2.1.

Based on (2.3) and (2.5), the relative normwise condition number for the TTLS problem (1.3) can be defined and has the following expression

κrel(A,𝒃)=limϵ0supΔHFϵ[A𝒃]Fψk([A𝒃]+ΔH)ψk([A𝒃])2ϵ𝒙k2=Mk2[A𝒃]F𝒙k2.\kappa^{\rm rel}(A,{\boldsymbol{b}})=\lim_{\epsilon\to 0}\sup_{\|\Delta H\|_{F}\leq\epsilon\big{\|}[A\,\,{\boldsymbol{b}}]\big{\|}_{F}}\frac{\left\|\psi_{k}([A\,\,{\boldsymbol{b}}]+\Delta H)-\psi_{k}([A\,\,{\boldsymbol{b}}])\right\|_{2}}{\epsilon\|{\boldsymbol{x}}_{k}\|_{2}}=\frac{\|M_{k}\|_{2}~{}\|[A\,\,{\boldsymbol{b}}]\|_{F}}{\|{\boldsymbol{x}}_{k}\|_{2}}. (2.15)

Using the fact that

{|Mk|∞=Mk∞,Mkm(n+1)Mk2,𝒙k2n𝒙k,𝗏𝖾𝖼([A𝒃])[A𝒃]F,\left\{\begin{array}[]{ll}\big{\|}\,|M_{k}|\,\big{\|}_{\infty}=\|M_{k}\|_{\infty},\\[5.69054pt] \|M_{k}\|_{\infty}\leq\sqrt{m(n+1)}\,\|M_{k}\|_{2},\\[5.69054pt] \|{\boldsymbol{x}}_{k}\|_{2}\leq\sqrt{n}\|{\boldsymbol{x}}_{k}\|_{\infty},\\[5.69054pt] \big{\|}{\sf vec}([A~{}{\boldsymbol{b}}])\big{\|}_{\infty}\leq\big{\|}[A~{}{\boldsymbol{b}}]\big{\|}_{F},\end{array}\right.

it is easy to see that

m(A,𝒃)(n+1)nmκrel(A,𝒃).m(A,{\boldsymbol{b}})\leq\sqrt{(n+1)nm}~{}\kappa^{\rm rel}(A,{\boldsymbol{b}}). (2.16)

From Example 5.1, we can see that m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}) can be much smaller than κrel(A,𝒃)\kappa^{\rm rel}(A,{\boldsymbol{b}}) when the data are sparse and badly scaled. Therefore, one should adopt the mixed and componentwise condition numbers to measure the conditioning of (1.3) instead of the normwise condition number when [A𝒃][A~{}{\boldsymbol{b}}] is sparse or badly scaled. However, the explicit expressions of m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}) are based on Kronecker products, and forming them explicitly requires a prohibitive amount of memory even for medium-size TTLS problems. It is therefore necessary to devise efficient and reliable condition estimations for m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}), which will be investigated in Section 4.
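These quantities can be probed numerically on a small dense example. The following Python sketch is illustrative only: it forms M_k column by column with central finite differences of the TTLS solution map (assuming the standard minimum-norm TTLS formula x_k = -V_{12}V_{22}^T/||V_{22}||_2^2 for the solution in (1.8)), evaluates m(A,b) and c(A,b) from (2.13) and kappa^rel(A,b) from (2.15), and verifies the bound (2.16).

```python
import numpy as np

def ttls_solution(A, b, k):
    """Truncated TLS solution x_k = -V12 V22^+ from the SVD of [A b]
    (the standard minimum-norm TTLS formula, assumed to match (1.8))."""
    n = A.shape[1]
    V = np.linalg.svd(np.column_stack([A, b]))[2].T   # (n+1) x (n+1) right factor
    V12, V22 = V[:n, k:], V[n:, k:]                   # trailing n+1-k columns
    return -(V12 @ V22.T).ravel() / (V22 @ V22.T).item()

def jacobian_Mk(A, b, k, h=1e-6):
    """M_k approximated column by column with central differences in [A b]."""
    m, n = A.shape
    H = np.column_stack([A, b])
    J = np.empty((n, m * (n + 1)))
    for j in range(m * (n + 1)):
        E = np.zeros(m * (n + 1))
        E[j] = h
        E = E.reshape(m, n + 1, order="F")            # vec(.) is column-major
        xp = ttls_solution((H + E)[:, :n], (H + E)[:, n], k)
        xm = ttls_solution((H - E)[:, :n], (H - E)[:, n], k)
        J[:, j] = (xp - xm) / (2 * h)
    return J

rng = np.random.default_rng(0)
m, n, k = 8, 4, 3
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
xk = ttls_solution(A, b, k)
Mk = jacobian_Mk(A, b, k)

vec_absAb = np.abs(np.column_stack([A, b])).ravel(order="F")
m_cond = np.linalg.norm(np.abs(Mk) @ vec_absAb, np.inf) / np.linalg.norm(xk, np.inf)
c_cond = np.linalg.norm((np.abs(Mk) @ vec_absAb) / xk, np.inf)        # (2.13b)
kappa_rel = (np.linalg.norm(Mk, 2)
             * np.linalg.norm(np.column_stack([A, b])) / np.linalg.norm(xk))

# the bound (2.16): m(A,b) <= sqrt((n+1)nm) * kappa_rel
assert m_cond <= np.sqrt((n + 1) * n * m) * kappa_rel * (1 + 1e-8)
```

Since the Jacobian is built one column at a time from SVDs of perturbed augmented matrices, such a brute-force check is feasible only for very small m and n; the estimation algorithms of Section 4 avoid forming M_k altogether.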

In [3, 17, 22], the structured TLS problem has been studied extensively. Hence, it is interesting to study the structured perturbation analysis for the structured truncated TLS (STTLS) problem. In the following, we propose the structured normwise, mixed and componentwise condition numbers for the STTLS problem, where AA is a linearly structured data matrix. Assume that 𝒮m×n{\mathcal{S}}\subset{\mathbb{R}}^{m\times n} is a linear subspace spanned by tt (tmnt\leq mn) linearly independent basis matrices S1,,StS_{1},\ldots,S_{t}, where the SiS_{i} are matrices of constants, typically 0’s and 1’s. For any A𝒮A\in\mathcal{S}, there is a unique vector 𝒂=[a1,,at]t{\boldsymbol{a}}=[a_{1},\ldots,a_{t}]^{\top}\in{\mathbb{R}}^{t} such that

A=i=1taiSi.A=\sum_{i=1}^{t}a_{i}S_{i}. (2.17)
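As a concrete illustration of (2.17), the sketch below (a hypothetical choice of structure, used here only for illustration) takes S_1,...,S_t to be the 0/1 indicators of the diagonals of an m x n Toeplitz matrix, so t = m + n - 1, and confirms that vec(A) equals [vec(S_1),...,vec(S_t)] applied to a.

```python
import numpy as np

m, n = 5, 3
# 0/1 basis matrices: one indicator S_d per diagonal of an m x n Toeplitz
# matrix (an illustrative linear structure with t = m + n - 1)
basis = []
for d in range(-(m - 1), n):
    S = np.zeros((m, n))
    for i in range(m):
        if 0 <= i + d < n:
            S[i, i + d] = 1.0
    basis.append(S)
t = len(basis)

rng = np.random.default_rng(1)
a = rng.standard_normal(t)                        # parameter vector
A = sum(ai * Si for ai, Si in zip(a, basis))      # A = sum_i a_i S_i, eq. (2.17)

# the matrix [vec(S_1), ..., vec(S_t)] appearing in Lemma 2.3 below
calM = np.column_stack([S.ravel(order="F") for S in basis])
assert np.allclose(calM @ a, A.ravel(order="F"))  # vec(A) = calM a
# disjoint 0/1 supports, hence |A| = sum_i |a_i||S_i| (cf. Proposition 2.1)
assert np.allclose(sum(abs(ai) * Si for ai, Si in zip(a, basis)), np.abs(A))
```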

In the following, we study the sensitivity of the STTLS solution 𝒙k{\boldsymbol{x}}_{k} to perturbations on the data 𝒂{\boldsymbol{a}} and 𝒃{\boldsymbol{b}}, which is defined by

ψs,k(𝒂,𝒃):t×mn:(𝒂,𝒃)𝒙k,\displaystyle\psi_{s,k}({\boldsymbol{a}},\,{\boldsymbol{b}})\quad:\quad{\mathbb{R}}^{t}\times{\mathbb{R}}^{m}\rightarrow{\mathbb{R}}^{n}\quad:\quad({\boldsymbol{a}},\,{\boldsymbol{b}})\mapsto{\boldsymbol{x}}_{k}, (2.18)

where 𝒙k{\boldsymbol{x}}_{k} is the unique solution to the STTLS problem (1.3) and (2.17).

Definition 2.2.

Suppose the truncation level kk is chosen such that V220V_{22}\neq 0 and σk>σk+1\sigma_{k}>\sigma_{k+1}. The absolute structured normwise, mixed and componentwise condition numbers for the STTLS problem (1.3) and (2.17) are defined as follows:

κs(𝒂,𝒃)\displaystyle\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}}) =limϵ0sup[Δ𝒂Δ𝒃]2ϵψs,k((𝒂,𝒃)+(Δ𝒂,Δ𝒃))ψs,k(𝒂,𝒃)2[Δ𝒂Δ𝒃]2,\displaystyle=\lim_{\epsilon\to 0}\sup_{\left\|\begin{bmatrix}\Delta{\boldsymbol{a}}\\ \Delta{\boldsymbol{b}}\end{bmatrix}\right\|_{2}\leq~{}\epsilon}\frac{\left\|\psi_{s,k}(({\boldsymbol{a}},\,{\boldsymbol{b}})+(\Delta{\boldsymbol{a}},\,\Delta{\boldsymbol{b}}))-\psi_{s,k}({\boldsymbol{a}},\,{\boldsymbol{b}})\right\|_{2}}{\left\|\begin{bmatrix}\Delta{\boldsymbol{a}}\\ \Delta{\boldsymbol{b}}\end{bmatrix}\right\|_{2}},
ms(𝒂,𝒃)\displaystyle m_{s}({\boldsymbol{a}},{\boldsymbol{b}}) =limϵ0sup|Δ𝒂|ϵ|𝒂||Δ𝒃|ϵ|𝒃|ψs,k((𝒂,𝒃)+(Δ𝒂,Δ𝒃))ψs,k(𝒂,𝒃)ϵ𝒙k,\displaystyle=\lim_{\epsilon\to 0}\sup_{|\Delta{\boldsymbol{a}}|\leq\epsilon|{\boldsymbol{a}}|\atop\left|\Delta{\boldsymbol{b}}\right|\leq\epsilon\left|{\boldsymbol{b}}\right|}\frac{\left\|\psi_{s,k}(({\boldsymbol{a}},\,{\boldsymbol{b}})+(\Delta{\boldsymbol{a}},\,\Delta{\boldsymbol{b}}))-\psi_{s,k}({\boldsymbol{a}},\,{\boldsymbol{b}})\right\|_{\infty}}{\epsilon\|{\boldsymbol{x}}_{k}\|_{\infty}},
cs(𝒂,𝒃)\displaystyle c_{s}({\boldsymbol{a}},{\boldsymbol{b}}) =limϵ0sup|Δ𝒂|ϵ|𝒂||Δ𝒃|ϵ|𝒃|1ϵψs,k((𝒂,𝒃)+(Δ𝒂,Δ𝒃))ψs,k(𝒂,𝒃)𝒙k.\displaystyle=\lim_{\epsilon\to 0}\sup_{|\Delta{\boldsymbol{a}}|\leq\epsilon|{\boldsymbol{a}}|\,\atop\left|\Delta{\boldsymbol{b}}\right|\leq\epsilon\left|{\boldsymbol{b}}\right|}\frac{1}{\epsilon}\left\|\frac{\psi_{s,k}(({\boldsymbol{a}},\,{\boldsymbol{b}})+(\Delta{\boldsymbol{a}},\,\Delta{\boldsymbol{b}}))-\psi_{s,k}({\boldsymbol{a}},\,{\boldsymbol{b}})}{{\boldsymbol{x}}_{k}}\right\|_{\infty}.

In the following lemma, we provide the first-order expansion of the STTLS solution 𝒙k{\boldsymbol{x}}_{k} with respect to the structured perturbations Δ𝒂\Delta{\boldsymbol{a}} on 𝒂{\boldsymbol{a}} and Δ𝒃\Delta{\boldsymbol{b}} on 𝒃{\boldsymbol{b}}, which helps us to derive the structured condition number expressions for the STTLS problem (1.3) and (2.17). In view of the fact that 𝗏𝖾𝖼(ΔA)=i=1tΔai𝗏𝖾𝖼(Si){\sf vec}(\Delta A)=\sum_{i=1}^{t}\Delta a_{i}{\sf vec}(S_{i}), we can prove the following lemma from Lemma 2.1. The detailed proof is omitted here.

Lemma 2.3.

Under the same assumptions of Lemma 2.1, if [A~𝐛~]=[A𝐛]+[i=1tΔaiSiΔ𝐛][\tilde{A}~{}\tilde{\boldsymbol{b}}]=[A~{}{\boldsymbol{b}}]+\big{[}\sum_{i=1}^{t}\Delta a_{i}S_{i}~{}\,\Delta{\boldsymbol{b}}\big{]} with [Δ𝐚Δ𝐛]2\big{\|}[\Delta{\boldsymbol{a}}^{\top}~{}\Delta{\boldsymbol{b}}^{\top}]^{\top}\big{\|}_{2} sufficiently small, then, for the STTLS solution 𝐱k{\boldsymbol{x}}_{k} of A𝐱𝐛A{\boldsymbol{x}}\approx{\boldsymbol{b}} and the STTLS solution 𝐱~k\tilde{\boldsymbol{x}}_{k} of A~𝐱𝐛~\tilde{A}{\boldsymbol{x}}\approx\tilde{\boldsymbol{b}}, we have

𝒙~k=𝒙k+Mk[00Im][Δ𝒂Δ𝒃]+𝒪([Δ𝒂Δ𝒃]22),\tilde{\boldsymbol{x}}_{k}={\boldsymbol{x}}_{k}+M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\begin{bmatrix}\Delta{\boldsymbol{a}}\\ \Delta{\boldsymbol{b}}\end{bmatrix}+{\mathcal{O}}\left(\left\|\begin{bmatrix}\Delta{\boldsymbol{a}}\\ \Delta{\boldsymbol{b}}\end{bmatrix}\right\|_{2}^{2}\right),

where =[𝗏𝖾𝖼(S1),,𝗏𝖾𝖼(St)]mn×t{\mathcal{M}}=\left[{\sf vec}(S_{1}),\ldots,{\sf vec}(S_{t})\right]\in{\mathbb{R}}^{mn\times t}.

The following theorem concerns the explicit expressions for the structured normwise, mixed and componentwise condition numbers κs(𝒂,𝒃)\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}}), ms(𝒂,𝒃)m_{s}({\boldsymbol{a}},{\boldsymbol{b}}), and cs(𝒂,𝒃)c_{s}({\boldsymbol{a}},{\boldsymbol{b}}) defined in Definition 2.2 when AA can be expressed by (2.17). Since the proof is similar to that of Theorem 2.1, we omit it here.

Theorem 2.2.

Suppose the truncation level kk is chosen such that V220V_{22}\neq 0 and σk>σk+1\sigma_{k}>\sigma_{k+1}. The absolute structured normwise, mixed and componentwise condition numbers κs(𝐚,𝐛)\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}}), ms(𝐚,𝐛)m_{s}({\boldsymbol{a}},{\boldsymbol{b}}), and cs(𝐚,𝐛)c_{s}({\boldsymbol{a}},{\boldsymbol{b}}) defined in Definition 2.2 for the STTLS problem (1.3) and (2.17) can be characterized by

κs(𝒂,𝒃)=Mk[00Im]2,ms(𝒂,𝒃)=|Mk[00Im]|[|𝒂||𝒃|]𝒙k,cs(𝒂,𝒃)=|Mk[00Im]|[|𝒂||𝒃|]𝒙k.\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}})=\left\|M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\right\|_{2},m_{s}({\boldsymbol{a}},{\boldsymbol{b}})=\frac{\left\|~{}\left|M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\right|\begin{bmatrix}|{\boldsymbol{a}}|\\ |{\boldsymbol{b}}|\end{bmatrix}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}},c_{s}({\boldsymbol{a}},{\boldsymbol{b}})=\left\|\frac{~{}\left|M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\right|\begin{bmatrix}|{\boldsymbol{a}}|\\ |{\boldsymbol{b}}|\end{bmatrix}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}.
Remark 2.2.

Based on Definition 2.2 and Theorem 2.2, the relative normwise condition number for the STTLS problem (1.3) and (2.17) can be defined and has the following expression

κsrel(𝒂,𝒃)\displaystyle\kappa_{s}^{\rm rel}({\boldsymbol{a}},{\boldsymbol{b}}) =limϵ0sup[Δ𝒂Δ𝒃]2ϵ[𝒂𝒃]2ψs,k((𝒂,𝒃)+(Δ𝒂,Δ𝒃))ψs,k(𝒂,𝒃)2ϵ𝒙k2\displaystyle=\lim_{\epsilon\to 0}\sup_{\left\|\begin{bmatrix}\Delta{\boldsymbol{a}}\\ \Delta{\boldsymbol{b}}\end{bmatrix}\right\|_{2}\leq~{}\epsilon~{}\left\|\begin{bmatrix}{\boldsymbol{a}}\\ {\boldsymbol{b}}\end{bmatrix}\right\|_{2}}\frac{\left\|\psi_{s,k}(({\boldsymbol{a}},\,{\boldsymbol{b}})+(\Delta{\boldsymbol{a}},\,\Delta{\boldsymbol{b}}))-\psi_{s,k}({\boldsymbol{a}},\,{\boldsymbol{b}})\right\|_{2}}{\epsilon\|{\boldsymbol{x}}_{k}\|_{2}}
=Mk[00Im]2[𝒂𝒃]2𝒙k2.\displaystyle=\frac{\left\|M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\right\|_{2}\left\|\begin{bmatrix}{\boldsymbol{a}}\\ {\boldsymbol{b}}\end{bmatrix}\right\|_{2}}{\|{\boldsymbol{x}}_{k}\|_{2}}. (2.19)

Similar to (2.16), we have

ms(𝒂,𝒃)(t+m)nκsrel(𝒂,𝒃).m_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq\sqrt{(t+m)n}~{}\kappa_{s}^{\rm rel}({\boldsymbol{a}},{\boldsymbol{b}}). (2.20)

In Example 5.2, we can see that ms(𝒂,𝒃)m_{s}({\boldsymbol{a}},{\boldsymbol{b}}) can be much smaller than κsrel(𝒂,𝒃)\kappa_{s}^{\rm rel}({\boldsymbol{a}},{\boldsymbol{b}}). Hence, structured condition numbers help to explain why structure-preserving algorithms, which preserve the underlying matrix structure, can enhance the accuracy of the numerical solution.

In the following proposition, we show that, when AA is a linearly structured matrix defined by (2.17), the structured normwise, mixed and componentwise condition numbers κs(𝒂,𝒃)\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}}), ms(𝒂,𝒃)m_{s}({\boldsymbol{a}},{\boldsymbol{b}}), and cs(𝒂,𝒃)c_{s}({\boldsymbol{a}},{\boldsymbol{b}}) are no larger than the corresponding unstructured condition numbers κ(A,𝒃)\kappa(A,{\boldsymbol{b}}), m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}), respectively.

Proposition 2.1.

Using the notations above, we have κs(𝐚,𝐛)κ(A,𝐛)\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq\kappa(A,{\boldsymbol{b}}). Moreover, if |A|=i=1t|ai||Si||A|=\sum_{i=1}^{t}|a_{i}||S_{i}|, then we have

ms(𝒂,𝒃)m(A,𝒃),cs(𝒂,𝒃)c(A,𝒃).\displaystyle m_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq m(A,{\boldsymbol{b}}),\quad c_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq c(A,{\boldsymbol{b}}).

Proof.   From [23, Theorem 4.1], the matrix {\mathcal{M}} has orthonormal columns. Hence, 2=1\|{\mathcal{M}}\|_{2}=1, and κs(𝒂,𝒃)κ(A,𝒃)\kappa_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq\kappa(A,{\boldsymbol{b}}) follows by comparing their expressions. Using the monotonicity of the infinity norm, we obtain

|Mk[00Im]|[|𝒂||𝒃|]|Mk|[||00Im][|𝒂||𝒃|]=|Mk|[𝗏𝖾𝖼(|A|)|𝒃|],\displaystyle\left\|~{}\left|M_{k}\begin{bmatrix}{\mathcal{M}}&0\\ 0&I_{m}\end{bmatrix}\right|\begin{bmatrix}|{\boldsymbol{a}}|\\ |{\boldsymbol{b}}|\end{bmatrix}\right\|_{\infty}\leq\left\|~{}|M_{k}|~{}\begin{bmatrix}|{\mathcal{M}}|&0\\ 0&I_{m}\end{bmatrix}\begin{bmatrix}|{\boldsymbol{a}}|\\ |{\boldsymbol{b}}|\end{bmatrix}\right\|_{\infty}=\left\|~{}|M_{k}|\begin{bmatrix}{\sf vec}(|A|)\\ |{\boldsymbol{b}}|\end{bmatrix}\right\|_{\infty},

which proves ms(𝒂,𝒃)m(A,𝒃)m_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq m(A,{\boldsymbol{b}}); the inequality cs(𝒂,𝒃)c(A,𝒃)c_{s}({\boldsymbol{a}},{\boldsymbol{b}})\leq c(A,{\boldsymbol{b}}) can be proved similarly. ∎
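For k = n, Proposition 2.1 can also be verified numerically. The sketch below is a non-authoritative illustration: it builds M_n from the closed form in Remark 3.1, uses a Toeplitz-type 0/1 basis (for which |A| = sum_i |a_i||S_i| holds), and compares m_s(a,b) from Theorem 2.2 with m(A,b) from Theorem 2.1.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 7, 3
# Toeplitz structure: 0/1 indicator basis matrices, one per diagonal
basis = []
for d in range(-(m - 1), n):
    S = np.zeros((m, n))
    for i in range(m):
        if 0 <= i + d < n:
            S[i, i + d] = 1.0
    basis.append(S)
t = len(basis)
a = rng.standard_normal(t)
A = sum(ai * Si for ai, Si in zip(a, basis))
b = rng.standard_normal(m)

# TLS solution x and M_n via the closed form of Remark 3.1 (case k = n)
U, s, Vt = np.linalg.svd(np.column_stack([A, b]))
V, sigma = Vt.T, s[n]
x = -V[:n, n] / V[n, n]
r = b - A @ x
P = A.T @ A - sigma**2 * np.eye(n)
Q2 = A.T + 2 * np.outer(x, r) / (1 + x @ x)
Mn = np.linalg.solve(P, np.hstack([np.kron(np.eye(n), r[None, :])
                                   - np.kron(x[None, :], Q2), Q2]))

calM = np.column_stack([S.ravel(order="F") for S in basis])
blk = np.block([[calM, np.zeros((m * n, m))], [np.zeros((m, t)), np.eye(m)]])

xinf = np.linalg.norm(x, np.inf)
m_unstr = np.linalg.norm(
    np.abs(Mn) @ np.abs(np.column_stack([A, b])).ravel(order="F"), np.inf) / xinf
m_struct = np.linalg.norm(
    np.abs(Mn @ blk) @ np.concatenate([np.abs(a), np.abs(b)]), np.inf) / xinf
assert m_struct <= m_unstr + 1e-10                # Proposition 2.1
```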

3. Revisiting condition numbers of the untruncated TLS problem

In this section, we investigate the relationship between normwise, componentwise and mixed condition numbers for the TTLS problem and the previous corresponding counterparts for the untruncated TLS. In the following, let 𝒙n{\boldsymbol{x}}_{n} be the untruncated TLS solution to (1.2). First let us review previous results on condition numbers for the untruncated TLS problem.

Let σ~n\widetilde{\sigma}_{n} be the smallest singular value of AA. As noted in [11], if

σ~n>σn+1,\widetilde{\sigma}_{n}>\sigma_{n+1}, (3.1)

then the TLS problem (1.2) has a unique TLS solution

𝒙n=(AAσn+12In)1A𝒃.{\boldsymbol{x}}_{n}=(A^{\top}A-\sigma_{n+1}^{2}I_{n})^{-1}A^{\top}{\boldsymbol{b}}.
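Under the genericity condition (3.1), this closed form can be cross-checked against the SVD characterization x_n = -V_{12}/V_{22} of (3.8); a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 10, 4
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

U, s, Vt = np.linalg.svd(np.column_stack([A, b]))
sigma = s[n]                                         # sigma_{n+1} of [A b]
# genericity condition (3.1): smallest singular value of A exceeds sigma_{n+1}
assert np.linalg.svd(A, compute_uv=False)[-1] > sigma

x_closed = np.linalg.solve(A.T @ A - sigma**2 * np.eye(n), A.T @ b)
x_svd = -Vt.T[:n, n] / Vt.T[n, n]                    # x_n = -V12/V22, cf. (3.8)
assert np.allclose(x_closed, x_svd)
```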

Let L𝒙nL^{\top}{\boldsymbol{x}}_{n} be a linear function of the TLS solution 𝒙n{\boldsymbol{x}}_{n}, where Ln×lL\in{\mathbb{R}}^{n\times l} is a fixed matrix with lnl\leq n. We define the mapping

h:m×n×ml:(A,𝒃)L𝒙n=L(AAσn+12In)1A𝒃.h\quad:\quad{\mathbb{R}}^{m\times n}\times{\mathbb{R}}^{m}\rightarrow{\mathbb{R}}^{l}\quad:\quad(A,\,{\boldsymbol{b}})\mapsto L^{\top}{\boldsymbol{x}}_{n}=L^{\top}(A^{\top}A-\sigma_{n+1}^{2}I_{n})^{-1}A^{\top}{\boldsymbol{b}}. (3.2)

As in [1], the absolute normwise condition number of L𝒙nL^{\top}{\boldsymbol{x}}_{n} can be characterized by

κ1(L,A,𝒃)\displaystyle\kappa_{1}(L,A,{\boldsymbol{b}}) =max[ΔA,Δ𝒃]0h(A,𝒃)(ΔA,Δ𝒃)2[ΔAΔ𝒃]F\displaystyle=\max_{[\Delta A,~{}\Delta{\boldsymbol{b}}]\neq 0}\frac{\|h^{\prime}(A,{\boldsymbol{b}})\cdot(\Delta A,\Delta{\boldsymbol{b}})\|_{2}}{\big{\|}[\Delta A~{}\Delta{\boldsymbol{b}}]\big{\|}_{F}}
=(1+𝒙n22)1/2LP1(AA+σn+12(In2𝒙n𝒙n1+𝒙n22))P1L21/2,\displaystyle=\left(1+\|{\boldsymbol{x}}_{n}\|_{2}^{2}\right)^{1/2}\left\|L^{\top}P^{-1}\Big{(}A^{\top}A+\sigma_{n+1}^{2}\bigg{(}I_{n}-\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{x}}_{n}^{\top}}{1+\|{\boldsymbol{x}}_{n}\|_{2}^{2}}\Big{)}\Big{)}P^{-1}L\right\|^{1/2}_{2}, (3.3)

where

P=AAσn+12In.P=A^{\top}A-\sigma_{n+1}^{2}I_{n}. (3.4)

Later, in [16], an equivalent expression of κ1(In,A,𝒃)\kappa_{1}(I_{n},A,{\boldsymbol{b}}) was given by

κ1(In,A,𝒃)=1+𝒙n22V11S2,\kappa_{1}(I_{n},A,{\boldsymbol{b}})=\sqrt{1+\|{\boldsymbol{x}}_{n}\|_{2}^{2}}\left\|V_{11}^{-\top}S\right\|_{2}, (3.5)

where V11V_{11} is defined by (1.6) with k=nk=n and S=diag([s1,,sn])S={\rm diag}([s_{1},\ldots,s_{n}]^{\top}) with

si=σi2+σn+12σi2σn+12.s_{i}=\frac{\sqrt{\sigma_{i}^{2}+\sigma_{n+1}^{2}}}{\sigma_{i}^{2}-\sigma_{n+1}^{2}}.

Recall that κ(A,𝒃)\kappa(A,{\boldsymbol{b}}) is given by (2.5). The relationship between the upper bound for κ(A,𝒃)\kappa(A,{\boldsymbol{b}}) and the corresponding counterpart for κ1(In,A,𝒃)\kappa_{1}(I_{n},A,{\boldsymbol{b}}) was studied in [14, §2.5]. The following theorem shows the equivalence of κ(A,𝒃)\kappa(A,{\boldsymbol{b}}) and κ1(In,A,𝒃)\kappa_{1}(I_{n},A,{\boldsymbol{b}}).

Theorem 3.1.

For the untruncated TLS problem (1.2), the explicit expression of κ(A,𝐛)\kappa(A,{\boldsymbol{b}}) given by (2.5) with k=nk=n is equivalent to that of κ1(In,A,𝐛)\kappa_{1}(I_{n},A,{\boldsymbol{b}}) given by (3.5).

Proof.   When k=nk=n, it is easy to see that Πn,1=Π1,n=In\Pi_{n,1}=\Pi_{1,n}=I_{n}. Also we have

Mn\displaystyle M_{n} =1V222[In𝒙n]VKD1[InΣ2Σ1]W,\displaystyle=\frac{1}{V_{22}^{2}}[I_{n}\quad{\boldsymbol{x}}_{n}]VKD_{*}^{-1}[I_{n}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}]W, (3.6)

where Σ1=diag([σ1,,σn])n×n\Sigma_{1}=\mathop{\rm diag}\nolimits([\sigma_{1},\,\dots\,,\sigma_{n}]^{\top})\in{\mathbb{R}}^{n\times n}, Σ2=[σn+1,0,,0]mn\Sigma_{2}=\left[\sigma_{n+1},0,\ldots,0\right]^{\top}\in{\mathbb{R}}^{m-n}, D=Σ12σn+12InD_{*}=\Sigma_{1}^{2}-\sigma_{n+1}^{2}I_{n}, and

K=[V22InV21],W=[V1U2V2U1].K=\begin{bmatrix}V_{22}I_{n}\\ V_{21}\end{bmatrix},\quad W=\begin{bmatrix}V_{1}^{\top}\otimes U_{2}^{\top}\\[5.69054pt] V_{2}^{\top}\otimes U_{1}^{\top}\end{bmatrix}. (3.7)

When k=nk=n, under the genericity condition (3.1), the following identities hold for the TLS solution 𝒙n{\boldsymbol{x}}_{n} (cf. [11])

[𝒙n1]=1V22V2=1V22[V12V22],V22=11+𝒙n𝒙n,\displaystyle\begin{bmatrix}{\boldsymbol{x}}_{n}\cr-1\end{bmatrix}=-\frac{1}{V_{22}}V_{2}=-\frac{1}{V_{22}}\begin{bmatrix}V_{12}\\ V_{22}\end{bmatrix},\quad V_{22}=\frac{1}{\sqrt{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}}, (3.8)

where VV has the partition in (1.6) and 𝒙n=V12/V22{\boldsymbol{x}}_{n}=-V_{12}/V_{22} given by (1.8). Thus it is not difficult to see that

1V222[In𝒙n]VK\displaystyle\frac{1}{V_{22}^{2}}[I_{n}\quad{\boldsymbol{x}}_{n}]VK =1V222[In𝒙n][V11V12V21V22][V22InV21]=1V22(V111V22V12V21).\displaystyle=\frac{1}{V_{22}^{2}}[I_{n}\quad{\boldsymbol{x}}_{n}]\begin{bmatrix}V_{11}&V_{12}\\ V_{21}&V_{22}\end{bmatrix}\begin{bmatrix}V_{22}I_{n}\\ V_{21}\end{bmatrix}=\frac{1}{V_{22}}\left(V_{11}-\frac{1}{V_{22}}V_{12}V_{21}\right). (3.9)

Since

[V11V21V12V22][V11V12V21V22]=[In001],\begin{bmatrix}V_{11}^{\top}&V_{21}^{\top}\cr V_{12}^{\top}&V_{22}\end{bmatrix}\begin{bmatrix}V_{11}&V_{12}\cr V_{21}&V_{22}\end{bmatrix}=\begin{bmatrix}I_{n}&0\cr 0&1\end{bmatrix},

we know that

V11V11+V21V21=In,V11V12+V22V21=0,V_{11}^{\top}V_{11}+V_{21}^{\top}V_{21}=I_{n},\quad V_{11}^{\top}V_{12}+V_{22}V_{21}^{\top}=0,

thus, it can be verified that

In=V11(V111V22V12V21).I_{n}=V_{11}^{\top}\left(V_{11}-\frac{1}{V_{22}}V_{12}V_{21}\right). (3.10)

Combining (3.9) and (3.10) with the expression of MnM_{n} given by (3.6), we have

Mn=1V22V11D1[InΣ2Σ1]W.M_{n}=\frac{1}{V_{22}}V_{11}^{-\top}D_{*}^{-1}[I_{n}\otimes\Sigma_{2}^{\top}\quad\Sigma_{1}]W. (3.11)

This, together with WW=ImnWW^{\top}=I_{mn}, yields

MnMn=(1+𝒙n22)V11S2V111,\displaystyle M_{n}M_{n}^{\top}=(1+\|{\boldsymbol{x}}_{n}\|_{2}^{2})V_{11}^{-\top}S^{2}V_{11}^{-1},

where SS is defined in (3.5). Therefore, when k=nk=n the expression of κ(A,𝒃)\kappa(A,{\boldsymbol{b}}) given by (2.5) is reduced to

κ(A,𝒃)=Mn2=MnMn21/2=κ1(In,A,𝒃).\kappa(A,{\boldsymbol{b}})=\left\|M_{n}\right\|_{2}=\|M_{n}M_{n}^{\top}\|_{2}^{1/2}=\kappa_{1}(I_{n},A,{\boldsymbol{b}}).

The proof is complete. ∎

In [31], Zhou et al. defined and derived the relative mixed and componentwise condition numbers for the untruncated TLS problem (1.2) as follows: Let [A~𝒃~]=[A𝒃]+[ΔAΔ𝒃][\tilde{A}~{}\tilde{\boldsymbol{b}}]=[A~{}{\boldsymbol{b}}]+[\Delta A~{}\Delta{\boldsymbol{b}}], where ΔA\Delta A and Δ𝒃\Delta{\boldsymbol{b}} are the perturbations of AA and 𝒃{\boldsymbol{b}}, respectively. When the norm [ΔA,Δ𝒃]F\|[\Delta A,\Delta{\boldsymbol{b}}]\|_{F} is small enough, for the TLS solution 𝒙n{\boldsymbol{x}}_{n} of A𝒙𝒃A{\boldsymbol{x}}\approx{\boldsymbol{b}} and the TLS solution 𝒙~n\tilde{\boldsymbol{x}}_{n} of A~𝒙𝒃~\tilde{A}{\boldsymbol{x}}\approx\tilde{\boldsymbol{b}}, we have

m1(A,𝒃)=limϵ0sup|ΔA|ϵ|A|,|Δ𝒃|ϵ|𝒃|𝒙~n𝒙nϵ𝒙n=|M+N|𝗏𝖾𝖼([|A||𝒃|])𝒙n,\displaystyle m_{1}(A,{\boldsymbol{b}})=\lim_{\epsilon\to 0}\sup_{\begin{subarray}{c}|\Delta A|\leq\epsilon|A|,\atop|\Delta{\boldsymbol{b}}|\leq\epsilon|{\boldsymbol{b}}|\end{subarray}}\frac{\|\tilde{\boldsymbol{x}}_{n}-{\boldsymbol{x}}_{n}\|_{\infty}}{\epsilon\|{\boldsymbol{x}}_{n}\|_{\infty}}=\frac{\Big{\|}\big{|}M+N\big{|}~{}{\sf vec}\Big{(}\big{[}\,|A|~{}|{\boldsymbol{b}}|\big{]}\Big{)}\Big{\|}_{\infty}}{\|{\boldsymbol{x}}_{n}\|_{\infty}}, (3.12)
c1(A,𝒃)=limϵ0sup|ΔA|ϵ|A|,|Δ𝒃|ϵ|𝒃|1ϵ𝒙~n𝒙n𝒙n=|M+N|𝗏𝖾𝖼([|A||𝒃|])𝒙n,\displaystyle c_{1}(A,{\boldsymbol{b}})=\lim_{\epsilon\to 0}\sup_{\begin{subarray}{c}|\Delta A|\leq\epsilon|A|,\atop|\Delta{\boldsymbol{b}}|\leq\epsilon|{\boldsymbol{b}}|\end{subarray}}\frac{1}{\epsilon}\left\|\frac{\tilde{\boldsymbol{x}}_{n}-{\boldsymbol{x}}_{n}}{{\boldsymbol{x}}_{n}}\right\|_{\infty}=\left\|\frac{\left|M+N\right|~{}{\sf vec}\Big{(}\big{[}|A|~{}|{\boldsymbol{b}}|\big{]}\Big{)}}{{\boldsymbol{x}}_{n}}\right\|_{\infty}, (3.13)

where

M\displaystyle M =[P1𝒃𝒙n(P1A)P1A],N=2σn+1P1𝒙n(𝒗n+1𝒖n+1),\displaystyle=\begin{bmatrix}P^{-1}\otimes{\boldsymbol{b}}^{\top}-{\boldsymbol{x}}_{n}^{\top}\otimes(P^{-1}A^{\top})&\quad P^{-1}A^{\top}\end{bmatrix},\,N=2\sigma_{n+1}P^{-1}{\boldsymbol{x}}_{n}({\boldsymbol{v}}_{n+1}^{\top}\otimes{\boldsymbol{u}}_{n+1}^{\top}),

where 𝒖n+1{\boldsymbol{u}}_{n+1} and 𝒗n+1{\boldsymbol{v}}_{n+1} are the (n+1)(n+1)-th columns of UU and VV, respectively.

Recently, in [4], Diao and Sun defined and gave mixed and componentwise condition numbers for the linear function L𝒙nL^{\top}{\boldsymbol{x}}_{n} as follows.

m1,L(A,𝒃)\displaystyle m_{1,L}(A,{\boldsymbol{b}}) =\displaystyle= |LP1(𝒙n(A+2𝒙n𝒓1+𝒙n𝒙n)In𝒓)|𝗏𝖾𝖼(|A|)+|LP1(A+2𝒙n𝒓1+𝒙n𝒙n)||𝒃|L𝒙n,\displaystyle\frac{\left\|~{}\left|L^{\top}P^{-1}\left({\boldsymbol{x}}_{n}^{\top}\otimes\left(A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right)-I_{n}\otimes{\boldsymbol{r}}^{\top}\right)\right|{\sf vec}(|A|)+\left|L^{\top}P^{-1}\left(A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right)\right||{\boldsymbol{b}}|\right\|_{\infty}}{\|L^{\top}{\boldsymbol{x}}_{n}\|_{\infty}},
c1,L(A,𝒃)\displaystyle c_{1,L}(A,{\boldsymbol{b}}) =\displaystyle= DL𝒙n|LP1(𝒙n(A+2𝒙n𝒓1+𝒙n𝒙n)In𝒓)|𝗏𝖾𝖼(|A|)\displaystyle\left\|D_{L^{\top}{\boldsymbol{x}}_{n}}^{\dagger}\left|L^{\top}P^{-1}\left({\boldsymbol{x}}_{n}^{\top}\otimes\left(A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right)-I_{n}\otimes{\boldsymbol{r}}^{\top}\right)\right|{\sf vec}(|A|)\right.
+DL𝒙n|LP1(A+2𝒙n𝒓1+𝒙n𝒙n)||𝒃|,\left.\quad\quad\quad+D_{L^{\top}{\boldsymbol{x}}_{n}}^{\dagger}\left|L^{\top}P^{-1}\left(A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right)\right||{\boldsymbol{b}}|\right\|_{\infty},

where 𝒓=𝒃A𝒙n{\boldsymbol{r}}={\boldsymbol{b}}-A{\boldsymbol{x}}_{n}. Moreover, when L=InL=I_{n}, the expressions of m1,In(A,𝒃)m_{1,I_{n}}(A,{\boldsymbol{b}}) and c1,In(A,𝒃)c_{1,I_{n}}(A,{\boldsymbol{b}}) are equivalent to the explicit expressions of m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) and c1(A,𝒃)c_{1}(A,{\boldsymbol{b}}) given by (3.12)–(3.13) (cf. [4, Theorem 3.2]).

In the following theorem, we prove that, when k=nk=n, m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}) given by Theorem 2.1 are reduced to those of m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) and c1(A,𝒃)c_{1}(A,{\boldsymbol{b}}), respectively.

Theorem 3.2.

Using the notations above, when k=nk=n, we have m(A,𝐛)=m1(A,𝐛)m(A,{\boldsymbol{b}})=m_{1}(A,{\boldsymbol{b}}) and c(A,𝐛)=c1(A,𝐛)c(A,{\boldsymbol{b}})=c_{1}(A,{\boldsymbol{b}}).

Proof.   From the proof of [5, Lemma 2], for PP given by (3.4) we have

P=V11DV11,P=V_{11}D_{*}V_{11}^{\top},

where DD_{*} and V11V_{11} are defined in (3.6) and (1.6), respectively. Thus, when k=nk=n, using (3.11) and (3.7) we have

Mn\displaystyle M_{n} =1V22V11D1(V1(Σ2U2)+Σ1(V2U1))\displaystyle=\frac{1}{V_{22}}V_{11}^{-\top}D_{*}^{-1}\left(V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})+\Sigma_{1}(V_{2}^{\top}\otimes U_{1}^{\top})\right)
=1V22P1V11(V1(Σ2U2)+Σ1(V2U1))\displaystyle=\frac{1}{V_{22}}P^{-1}V_{11}\left(V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})+\Sigma_{1}(V_{2}^{\top}\otimes U_{1}^{\top})\right)
=P1(Mn,1+Mn,2),\displaystyle=P^{-1}\left(M_{n,1}+M_{n,2}\right), (3.14)

where

Mn,2=1V22V11Σ1(V2U1),Mn,1=1V22V11(V1(Σ2U2)).\displaystyle M_{n,2}=\frac{1}{V_{22}}V_{11}\Sigma_{1}(V_{2}^{\top}\otimes U_{1}^{\top}),\,M_{n,1}=\frac{1}{V_{22}}V_{11}\left(V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})\right). (3.15)

Partition Mn,1M_{n,1} and Mn,2M_{n,2} as follows:

Mn,1=[𝒩1𝒩2],Mn,2=[𝒯1𝒯2],𝒩1,𝒯1n×mn.M_{n,1}=[{\mathcal{N}}_{1}\quad{\mathcal{N}}_{2}],\quad M_{n,2}=[{\mathcal{T}}_{1}\quad{\mathcal{T}}_{2}],\quad{\mathcal{N}}_{1},{\mathcal{T}}_{1}\in{\mathbb{R}}^{n\times mn}. (3.16)

By comparing the expressions of m1,In(A,𝒃)m_{1,I_{n}}(A,{\boldsymbol{b}}) and m(A,𝒃)m(A,{\boldsymbol{b}}) with k=nk=n, we only need to show that

𝒩1+𝒯1=Q1,𝒩2+𝒯2=Q2,\displaystyle{\mathcal{N}}_{1}+{\mathcal{T}}_{1}=Q_{1},\quad\quad{\mathcal{N}}_{2}+{\mathcal{T}}_{2}=Q_{2}, (3.17)

where

Q1=In𝒓𝒙nQ2,Q2=A+2𝒙n𝒓1+𝒙n𝒙n.\displaystyle Q_{1}=I_{n}\otimes{\boldsymbol{r}}^{\top}-{\boldsymbol{x}}_{n}^{\top}\otimes Q_{2},\quad Q_{2}=A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}.

Using the SVD of [A,𝒃][A,\,{\boldsymbol{b}}] in (1.4), the partitions of VV, UU in (1.6) and Σ\Sigma in (2.5), it follows that

A=([A𝒃][In0])=V11Σ1U1+V12Σ2U2=V11Σ1U1+σn+1V12𝒖n+1,\displaystyle A^{\top}=\left([A\quad{\boldsymbol{b}}]\begin{bmatrix}I_{n}\\ 0\end{bmatrix}\right)^{\top}=V_{11}\Sigma_{1}U_{1}^{\top}+V_{12}\Sigma_{2}^{\top}U_{2}^{\top}=V_{11}\Sigma_{1}U_{1}^{\top}+\sigma_{n+1}V_{12}{\boldsymbol{u}}_{n+1}^{\top}, (3.18)

where 𝒖n+1{\boldsymbol{u}}_{n+1} is the (n+1)(n+1)-th column of UU. We note that

Σ2=σn+1𝒆1(mn),Σ2U2=σn+1𝒖n+1,\displaystyle\Sigma_{2}=\sigma_{n+1}{\boldsymbol{e}}_{1}^{(m-n)},\quad\Sigma_{2}^{\top}U_{2}^{\top}=\sigma_{n+1}{\boldsymbol{u}}_{n+1}^{\top}, (3.19)

where 𝒆1(mn){\boldsymbol{e}}_{1}^{(m-n)} is the first column of ImnI_{m-n}. From the SVD of [A𝒃][A~{}{\boldsymbol{b}}] and (3.8), it is easy to check that

𝒓=𝒃A𝒙n=[A𝒃][𝒙n1]=1V22[A𝒃]V2=σn+1V22𝒖n+1.{\boldsymbol{r}}={\boldsymbol{b}}-A{\boldsymbol{x}}_{n}=-[A~{}{\boldsymbol{b}}]\begin{bmatrix}{\boldsymbol{x}}_{n}\cr-1\end{bmatrix}=\frac{1}{V_{22}}[A~{}{\boldsymbol{b}}]V_{2}=\frac{\sigma_{n+1}}{V_{22}}{\boldsymbol{u}}_{n+1}. (3.20)

Substituting (3.18), (3.8) and (3.20) into the expression of Q2Q_{2} and Q1Q_{1} yields

Q2\displaystyle Q_{2} =V11Σ1U1σn+1V12𝒖n+1,\displaystyle=V_{11}\Sigma_{1}U_{1}^{\top}-\sigma_{n+1}V_{12}{\boldsymbol{u}}_{n+1}^{\top}, (3.21)
Q1\displaystyle Q_{1} =σn+1V22In𝒖n+1+1V22V12(V11Σ1U1)σn+1V22V12(V12𝒖n+1).\displaystyle=\frac{\sigma_{n+1}}{V_{22}}I_{n}\otimes{\boldsymbol{u}}_{n+1}^{\top}+\frac{1}{V_{22}}V_{12}^{\top}\otimes(V_{11}\Sigma_{1}U_{1}^{\top})-\frac{\sigma_{n+1}}{V_{22}}V_{12}^{\top}\otimes\left(V_{12}{\boldsymbol{u}}_{n+1}^{\top}\right). (3.22)

Since V11V11+V12V12=In,V11V21+V22V12=0,V_{11}V_{11}^{\top}+V_{12}V_{12}^{\top}=I_{n},\,V_{11}V_{21}^{\top}+V_{22}V_{12}=0, we deduce that

V11V11=InV12V12,V11V21=V22V12.\displaystyle V_{11}V_{11}^{\top}=I_{n}-V_{12}V_{12}^{\top},\quad V_{11}V_{21}^{\top}=-V_{22}V_{12}. (3.23)

Using (3.15), (3.19), and (3.23) we have

Mn,1\displaystyle M_{n,1} =1V22(V11V1(Σ2U2))=1V22σn+1[V11V11V11V21]𝒖n+1\displaystyle=\frac{1}{V_{22}}\left(V_{11}V_{1}^{\top}\otimes(\Sigma_{2}^{\top}U_{2}^{\top})\right)=\frac{1}{V_{22}}\sigma_{n+1}\left[V_{11}V_{11}^{\top}\quad V_{11}V_{21}^{\top}\right]\otimes{\boldsymbol{u}}_{n+1}^{\top}
=σn+1[1V22(InV12V12)𝒖n+1V12𝒖n+1],\displaystyle=\sigma_{n+1}\left[\frac{1}{V_{22}}(I_{n}-V_{12}V_{12}^{\top})\otimes{\boldsymbol{u}}_{n+1}^{\top}\quad-V_{12}\otimes{\boldsymbol{u}}_{n+1}^{\top}\right],
𝒩1\displaystyle{\mathcal{N}}_{1} =σn+1V22(InV12V12)𝒖n+1,𝒩2=σn+1V12𝒖n+1=σn+1V12𝒖n+1.\displaystyle=\frac{\sigma_{n+1}}{V_{22}}(I_{n}-V_{12}V_{12}^{\top})\otimes{\boldsymbol{u}}_{n+1}^{\top},\quad{\mathcal{N}}_{2}=-\sigma_{n+1}V_{12}\otimes{\boldsymbol{u}}_{n+1}^{\top}=-\sigma_{n+1}V_{12}{\boldsymbol{u}}_{n+1}^{\top}. (3.24)

Using the partition of V2=[V12V22]V_{2}^{\top}=[V_{12}^{\top}\quad V_{22}] and (3.15) we have

Mn,2\displaystyle M_{n,2} =1V22[V11Σ1(V12U1)V11Σ1(V22U1)],\displaystyle=\frac{1}{V_{22}}\left[V_{11}\Sigma_{1}(V_{12}^{\top}\otimes U_{1}^{\top})\quad V_{11}\Sigma_{1}(V_{22}^{\top}\otimes U_{1}^{\top})\right],
𝒯1\displaystyle{\mathcal{T}}_{1} =1V22V11Σ1(V12U1)=1V22V12(V11Σ1U1),𝒯2=1V22V11Σ1V22U1=V11Σ1U1.\displaystyle=\frac{1}{V_{22}}V_{11}\Sigma_{1}(V_{12}^{\top}\otimes U_{1}^{\top})=\frac{1}{V_{22}}V_{12}^{\top}\otimes(V_{11}\Sigma_{1}U_{1}^{\top}),{\mathcal{T}}_{2}=\frac{1}{V_{22}}V_{11}\Sigma_{1}V_{22}U_{1}^{\top}=V_{11}\Sigma_{1}U_{1}^{\top}. (3.25)

From (3.24), (3.25), (3.22), and (3.21), it is easy to check that the two identities in (3.17) hold. The proof is complete. ∎

Remark 3.1.

From (3.14), (3.16), and (3.17) we have

Mn=P1[𝒙n(A+2𝒙n𝒓1+𝒙n𝒙n)+In𝒓A+2𝒙n𝒓1+𝒙n𝒙n].M_{n}=P^{-1}\left[-{\boldsymbol{x}}_{n}^{\top}\otimes\left(A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right)+I_{n}\otimes{\boldsymbol{r}}^{\top}\quad A^{\top}+\frac{2{\boldsymbol{x}}_{n}{\boldsymbol{r}}^{\top}}{1+{\boldsymbol{x}}_{n}^{\top}{\boldsymbol{x}}_{n}}\right].
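This closed form of M_n gives a direct numerical check of Theorem 3.1: by the identity M_nM_n^T = (1+||x_n||_2^2)V_{11}^{-T}S^2V_{11}^{-1} established in its proof, the spectral norm of M_n should reproduce kappa_1(I_n,A,b). A small sketch:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 9, 3
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)

U, s, Vt = np.linalg.svd(np.column_stack([A, b]))
V, sigma = Vt.T, s[n]
x = -V[:n, n] / V[n, n]                            # TLS solution, cf. (3.8)
r = b - A @ x                                      # residual, cf. (3.20)
P = A.T @ A - sigma**2 * np.eye(n)                 # (3.4)
Q2 = A.T + 2 * np.outer(x, r) / (1 + x @ x)
# M_n = P^{-1} [ I_n (x) r^T - x^T (x) Q2 ,  Q2 ]  (Remark 3.1)
Mn = np.linalg.solve(P, np.hstack([np.kron(np.eye(n), r[None, :])
                                   - np.kron(x[None, :], Q2), Q2]))

# kappa_1(I_n,A,b) from (3.5): s_i = sqrt(sigma_i^2+sigma^2)/(sigma_i^2-sigma^2)
V11 = V[:n, :n]
S = np.diag(np.sqrt(s[:n]**2 + sigma**2) / (s[:n]**2 - sigma**2))
kappa1 = np.sqrt(1 + x @ x) * np.linalg.norm(np.linalg.inv(V11).T @ S, 2)
assert np.isclose(np.linalg.norm(Mn, 2), kappa1)   # Theorem 3.1
```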

The structured condition numbers for the untruncated TLS problem with linear structures were studied by Li and Jia in [23]. For the structured matrix AA defined by (2.17), denote

𝒦\displaystyle{\mathcal{K}} =P1(2A𝒓𝒓𝒓22G(𝒙n)AG(𝒙n)+[In𝒓0]),G(𝒙n)=[𝒙n1]Im,\displaystyle=P^{-1}\left(2A^{\top}\frac{{\boldsymbol{r}}{\boldsymbol{r}}^{\top}}{\|{\boldsymbol{r}}\|_{2}^{2}}G({\boldsymbol{x}}_{n})-A^{\top}G({\boldsymbol{x}}_{n})+\left[I_{n}\otimes{\boldsymbol{r}}^{\top}\quad 0\right]\right),\quad G({\boldsymbol{x}}_{n})=\left[{\boldsymbol{x}}_{n}^{\top}\quad-1\right]\otimes I_{m}, (3.26)

where PP is defined by (3.4). The structured mixed condition number ms,n(𝒂,𝒃)m_{s,n}({\boldsymbol{a}},{\boldsymbol{b}}) is characterized as [23]

ms,n(𝒂,𝒃)\displaystyle m_{s,n}({\boldsymbol{a}},{\boldsymbol{b}}) =\displaystyle= limϵ0sup|Δ𝒂|ϵ|𝒂|,|Δ𝒃|ϵ|𝒃|𝒙~n𝒙nϵ𝒙n=|𝒦[00Im]|[|𝒂||𝒃|]𝒙n.\displaystyle\lim_{\epsilon\rightarrow 0}\sup_{\begin{subarray}{c}|\Delta{\boldsymbol{a}}|\leq\epsilon|{\boldsymbol{a}}|,\\ |\Delta{\boldsymbol{b}}|\leq\epsilon|{\boldsymbol{b}}|\end{subarray}}\frac{\|\tilde{\boldsymbol{x}}_{n}-{\boldsymbol{x}}_{n}\|_{\infty}}{\epsilon\|{\boldsymbol{x}}_{n}\|_{\infty}}=\frac{\left\|~{}\left|{\mathcal{K}}\begin{bmatrix}{\mathcal{M}}&0\cr 0&I_{m}\end{bmatrix}\right|\begin{bmatrix}|{\boldsymbol{a}}|\\ |{\boldsymbol{b}}|\end{bmatrix}\,\right\|_{\infty}}{\|{\boldsymbol{x}}_{n}\|_{\infty}}. (3.27)

In [23, Theorem 4.3], Li and Jia proved that ms,n(𝒂,𝒃)m1(A,𝒃)m_{s,n}({\boldsymbol{a}},{\boldsymbol{b}})\leq m_{1}(A,{\boldsymbol{b}}) and 𝒦=Mn{\mathcal{K}}=M_{n}, where m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) is given by (3.12). Indeed, we can prove that, when k=nk=n, the expression of ms,n(𝒂,𝒃)m_{s,n}({\boldsymbol{a}},{\boldsymbol{b}}) given by (3.27) reduces to that of ms(𝒂,𝒃)m_{s}({\boldsymbol{a}},{\boldsymbol{b}}) in Theorem 2.2. Hence we have the following proposition.

Proposition 3.1.

Using the notations above, when k=nk=n, we have ms(𝐚,𝐛)=ms,n(𝐚,𝐛)m_{s}({\boldsymbol{a}},{\boldsymbol{b}})=m_{s,n}({\boldsymbol{a}},{\boldsymbol{b}}).

4. Small sample statistical condition estimation

Based on small sample statistical condition estimation (SCE), reliable condition estimation algorithms are devised for the unstructured and structured normwise, mixed and componentwise cases, which utilize the SVD of the augmented matrix [A𝒃][A~{}{\boldsymbol{b}}]. In the following, we first review the basic idea of SCE. Let ψ:p\psi:{\mathbb{R}}^{p}\rightarrow{\mathbb{R}} be a differentiable function. For an input vector 𝒖{\boldsymbol{u}}, we want to estimate the sensitivity of the output ψ(𝒖)\psi({\boldsymbol{u}}) with respect to a small perturbation ϵ𝒅\epsilon{\boldsymbol{d}} of 𝒖{\boldsymbol{u}}, where 𝒅{\boldsymbol{d}} is a unit vector and ϵ\epsilon is a small positive number. The Taylor expansion of ψ\psi at 𝒖{\boldsymbol{u}} is given by

ψ(𝒖+ϵ𝒅)=ψ(𝒖)+ϵ(ψ(𝒖))𝒅+𝒪(ϵ2),\psi({\boldsymbol{u}}+\epsilon{\boldsymbol{d}})=\psi({\boldsymbol{u}})+\epsilon(\nabla\psi({\boldsymbol{u}}))^{\top}{\boldsymbol{d}}+{\mathcal{O}}(\epsilon^{2}),

where ψ(𝒖)p\nabla\psi({\boldsymbol{u}})\in{\mathbb{R}}^{p} is the gradient of ψ\psi at 𝒖{\boldsymbol{u}}. Neglecting the second and higher order terms of ϵ\epsilon we have

|ψ(𝒖+ϵ𝒅)ψ(𝒖)|ϵ(ψ(𝒖))𝒅,\left|\psi({\boldsymbol{u}}+\epsilon{\boldsymbol{d}})-\psi({\boldsymbol{u}})\right|\approx\epsilon(\nabla\psi({\boldsymbol{u}}))^{\top}{\boldsymbol{d}},

from which we conclude that the local sensitivity can be measured by ψ(𝒖)2\|\nabla\psi({\boldsymbol{u}})\|_{2}. Let the Wallis factor be given by [18]

ωp={1,forp=1,2π,forp=2,135(p2)246(p1),forpoddandp>2,2π246(p2)135(p1),forpevenandp>2.\omega_{p}=\begin{cases}1,&\text{for}~{}p=1,\\ \frac{2}{\pi},&\text{for}~{}p=2,\\ \frac{1\cdot 3\cdot 5\cdots(p-2)}{2\cdot 4\cdot 6\cdots(p-1)},&\text{for}~{}p~{}\text{odd}~{}\text{and}~{}p>2,\\ \frac{2}{\pi}\frac{2\cdot 4\cdot 6\cdots(p-2)}{1\cdot 3\cdot 5\cdots(p-1)},&\text{for}~{}p~{}\text{even}~{}\text{and}~{}p>2.\end{cases}

As in [18], if 𝒅{\boldsymbol{d}} is selected uniformly and randomly from the unit pp-sphere Bp1B_{p-1} (denoted as 𝒅𝒰(Bp1){\boldsymbol{d}}\in\mathcal{U}(B_{p-1})), then the expected value 𝔼(|(ψ(𝒖))𝒅|/ωp){\mathbb{E}}(|(\nabla\psi({\boldsymbol{u}}))^{\top}{\boldsymbol{d}}|/\omega_{p}) is ψ(𝒖)2\|\nabla\psi({\boldsymbol{u}})\|_{2}. In practice, the Wallis factor can be approximated accurately by [18]

ωp2π(p12).\omega_{p}\approx\sqrt{\frac{2}{\pi(p-\frac{1}{2})}}. (4.1)

Therefore, the following quantity

ν=|(ψ(𝒖))𝒅|ωp\nu=\frac{\left|(\nabla\psi({\boldsymbol{u}}))^{\top}{\boldsymbol{d}}\right|}{\omega_{p}}

can be used as a condition estimator for ψ(𝒖)2\|\nabla\psi({\boldsymbol{u}})\|_{2} with high probability for the function ψ\psi at 𝒖{\boldsymbol{u}}. For example, for γ>1\gamma>1, which indicates the accuracy of the estimator, it is shown in [18] that

(ψ(𝒖)2γνγψ(𝒖)2)12πγ+𝒪(1γ2).{\mathbb{P}}\left(\frac{\|\nabla\psi({\boldsymbol{u}})\|_{2}}{\gamma}\leq\nu\leq\gamma\|\nabla\psi({\boldsymbol{u}})\|_{2}\right)\geq 1-\frac{2}{\pi\gamma}+{\mathcal{O}}\left(\frac{1}{\gamma^{2}}\right).

In general, we are interested in finding an estimate that is accurate to within a factor of 10 (γ=10\gamma=10). The accuracy of the condition estimator can be enhanced by using multiple samples 𝒅j{\boldsymbol{d}}_{j} of 𝒅{\boldsymbol{d}}. The \ell-sample condition estimation is given by

ν()=ωωpj=1|ψ(𝒖)𝒅j|2,\nu(\ell)=\frac{\omega_{\ell}}{\omega_{p}}\sqrt{\sum_{j=1}^{\ell}\left|\nabla\psi({\boldsymbol{u}})^{\top}{\boldsymbol{d}}_{j}\right|^{2}},

where the matrix [𝒅1,,𝒅][{\boldsymbol{d}}_{1},\ldots,{\boldsymbol{d}}_{\ell}] is orthonormalized after 𝒅1,,𝒅{\boldsymbol{d}}_{1},\ldots,{\boldsymbol{d}}_{\ell} are selected uniformly and randomly from 𝒰(Bp1)\mathcal{U}(B_{p-1}). Usually, at most two or three samples are sufficient for high accuracy. For example, the accuracies of ν(2)\nu(2) and ν(3)\nu(3) are given by [18]

(ψ(𝒖)2γν(2)γψ(𝒖)2)\displaystyle{\mathbb{P}}\left(\frac{\|\nabla\psi({\boldsymbol{u}})\|_{2}}{\gamma}\leq\nu(2)\leq\gamma\|\nabla\psi({\boldsymbol{u}})\|_{2}\right) 1π4γ2,γ>1,\displaystyle\approx 1-\frac{\pi}{4\gamma^{2}},\quad\gamma>1,
(ψ(𝒖)2γν(3)γψ(𝒖)2)\displaystyle{\mathbb{P}}\left(\frac{\|\nabla\psi({\boldsymbol{u}})\|_{2}}{\gamma}\leq\nu(3)\leq\gamma\|\nabla\psi({\boldsymbol{u}})\|_{2}\right) 1323π2γ3,γ>1.\displaystyle\approx 1-\frac{32}{3\pi^{2}\gamma^{3}},\quad\gamma>1.

As an illustration, for =3\ell=3 and γ=10\gamma=10, the estimator ν(3)\nu(3) lies within a relative factor 1010 of the true value ψ(𝒖)2\|\nabla\psi({\boldsymbol{u}})\|_{2} with probability 0.99890.9989.
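The ℓ-sample estimator ν(ℓ) can be illustrated on a function whose gradient is known. In the Python sketch below (illustrative only; numpy's QR factorization stands in for the orthonormalization step of the text), the estimator recovers ‖∇ψ(𝒖)‖₂ from ℓ = 3 directional derivatives.

```python
import numpy as np

def wallis(p):
    # Approximation (4.1) to the Wallis factor.
    return np.sqrt(2.0 / (np.pi * (p - 0.5)))

def nu_ell(grad, ell, rng):
    """l-sample SCE estimate of ||grad||_2, where grad = nabla psi(u)
    is assumed available here purely for illustration."""
    p = grad.size
    D = rng.standard_normal((p, ell))
    Q, _ = np.linalg.qr(D)            # orthonormal directions d_1, ..., d_ell
    return (wallis(ell) / wallis(p)) * np.linalg.norm(grad @ Q)

rng = np.random.default_rng(0)
grad = rng.standard_normal(200)       # gradient of some psi at u
estimate = nu_ell(grad, 3, rng)
true_norm = np.linalg.norm(grad)
```

With ℓ = 3 the estimate is typically within a small factor of ‖∇ψ(𝒖)‖₂, in line with the probability bound for ν(3) stated above.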

The above results can be easily extended to vector-valued or matrix-valued functions.

4.1. Normwise perturbation analysis

In this subsection, we propose an algorithm for the normwise condition estimation of the TTLS problem (1.3) based on SCE. The input data of Algorithm 1 includes the matrix Am×nA\in{\mathbb{R}}^{m\times n}, the vector 𝒃m{\boldsymbol{b}}\in{\mathbb{R}}^{m}, the SVD of [A𝒃][A~{}{\boldsymbol{b}}], and the computed solution 𝒙kn{\boldsymbol{x}}_{k}\in{\mathbb{R}}^{n}. The output includes the condition vector 𝒦absTTLS,()\mathscr{K}_{\rm abs}^{\mathrm{TTLS},(\ell)} and the estimated relative condition number κSCETTLS,()\kappa_{\rm SCE}^{{\rm TTLS},(\ell)}.

Algorithm 1 Small sample condition estimation for the TTLS problem under normwise perturbation analysis
  • 1.

    Generate matrices [ΔA^1Δ𝒃^1],,[ΔA^Δ𝒃^]m×(n+1)[\Delta\hat{A}_{1}~{}\Delta\hat{{\boldsymbol{b}}}_{1}],\ldots,[\Delta\hat{A}_{\ell}~{}\Delta\hat{{\boldsymbol{b}}}_{\ell}]\in{\mathbb{R}}^{m\times(n+1)} with entries being in 𝒩(0,1){\mathcal{N}}(0,1), the standard Gaussian distribution. Orthonormalize the following matrix

    [𝗏𝖾𝖼(ΔA^1)𝗏𝖾𝖼(ΔA^2)𝗏𝖾𝖼(ΔA^)Δ𝒃^1Δ𝒃^2Δ𝒃^]\left[\begin{matrix}{\sf vec}(\Delta\hat{A}_{1})&{\sf vec}(\Delta\hat{A}_{2})&\cdots&{\sf vec}(\Delta\hat{A}_{\ell})\cr\Delta\hat{{\boldsymbol{b}}}_{1}&\Delta\hat{{\boldsymbol{b}}}_{2}&\cdots&\Delta\hat{{\boldsymbol{b}}}_{\ell}\end{matrix}\right]

    to obtain [𝒒1𝒒2,,𝒒][{\boldsymbol{q}}_{1}~{}{\boldsymbol{q}}_{2},\ldots,{\boldsymbol{q}}_{\ell}] via the modified Gram-Schmidt orthogonalization process. Set

    ΔHi:=[ΔAiΔ𝒃i]=𝗎𝗇𝗏𝖾𝖼(𝒒i),i=1,,.\Delta H_{i}:=[\Delta A_{i}~{}\Delta{\boldsymbol{b}}_{i}]={\sf unvec}({\boldsymbol{q}}_{i}),\quad i=1,\ldots,\ell.
  • 2.

    Let p=m(n+1)p=m(n+1). Approximate ωp\omega_{p} and ω\omega_{\ell} by (4.1).

  • 3.

    For i=1,2,,i=1,2,\ldots,\ell, compute

    𝒈i=1V2222(V11(Z1+Z2)V22+V12(Z1+Z2)V21+𝒙kj=14cj),{\boldsymbol{g}}_{i}=\frac{1}{\|V_{22}\|_{2}^{2}}\left(V_{11}\,(Z_{1}^{\top}+Z_{2})V_{22}^{\top}+V_{12}\,(Z_{1}+Z_{2}^{\top})V_{21}^{\top}+{\boldsymbol{x}}_{k}\sum_{j=1}^{4}c_{j}\right), (4.2)

    where ZiZ_{i} and cic_{i} are defined in (2.7) with ΔH=ΔHi\Delta H=\Delta H_{i}. Estimate the absolute condition vector

    𝒦absTTLS,()\displaystyle\mathscr{K}_{\rm abs}^{\mathrm{TTLS},(\ell)} =\displaystyle= ωωpj=1|𝒈j|2.\displaystyle\frac{\omega_{\ell}}{\omega_{p}}\sqrt{\sum_{j=1}^{\ell}|{\boldsymbol{g}}_{j}|^{2}}.

    Here, for any vector 𝒈=[g1,,gn]n{\boldsymbol{g}}=[g_{1},\ldots,g_{n}]^{\top}\in{\mathbb{R}}^{n}, |𝒈|2=[|g1|2,,|gn|2]{|{\boldsymbol{g}}|^{2}}=[|g_{1}|^{2},\ldots,|g_{n}|^{2}]^{\top} and |𝒈|=[|g1|,,|gn|]\sqrt{|{\boldsymbol{g}}|}=[\sqrt{|g_{1}|},\ldots,\sqrt{|g_{n}|}]^{\top}.

  • 4.

    Compute the normwise condition number as follows,

    κSCETTLS,()=NSCETTLS,()[A𝒃]F𝒙k2,\kappa_{\rm SCE}^{{\rm TTLS},(\ell)}=\frac{N_{\rm SCE}^{{\rm TTLS},(\ell)}\big{\|}[A~{}{\boldsymbol{b}}]\big{\|}_{F}}{||{\boldsymbol{x}}_{k}||_{2}},

    where NSCETTLS,():=ωωpj=1𝒈j22=𝒦absTTLS,()F.N_{\rm SCE}^{{\rm TTLS},(\ell)}:=\frac{\omega_{\ell}}{\omega_{p}}\sqrt{\sum_{j=1}^{\ell}||{\boldsymbol{g}}_{j}||_{2}^{2}}=\|\mathscr{K}_{\rm abs}^{\mathrm{TTLS},(\ell)}\|_{F}.

Next, we give some remarks on the computational cost of Algorithm 1. In Step 1, the modified Gram-Schmidt orthogonalization process [12] is adopted to form an orthonormal matrix [𝒒1𝒒2,,𝒒][{\boldsymbol{q}}_{1}~{}{\boldsymbol{q}}_{2},\ldots,{\boldsymbol{q}}_{\ell}], and the total flop count is about 𝒪(mn2){\mathcal{O}}(mn\ell^{2}). The cost associated with Step 3 is about 𝒪(k(mn+m(m)+(5kn1)(n+1k))){\mathcal{O}}(\ell k(mn+m(m-\ell)+(5k-n-1)(n+1-k))) flops, which mainly comes from computing the directional derivative in Lemma 2.2. The last step needs 𝒪(mn+n){\mathcal{O}}(mn+n\ell) flops. We note that =3\ell=3 already generates a good condition estimation. In this case, the total cost of Algorithm 1 is 𝒪(mnk+m2k+k(5kn)(n+1k)){\mathcal{O}}(mnk+m^{2}k+k(5k-n)(n+1-k)), which does not exceed the cost of computing the SVD of [A𝒃][A~{}{\boldsymbol{b}}] and 𝒙k{\boldsymbol{x}}_{k}. Furthermore, the directional derivative (4.2) needs to be computed only once per sample in Algorithm 1. In contrast, Gratton et al. [14] proposed a normwise condition number estimation algorithm based on the power method [15], which needs to evaluate the matrix-vector products Mk𝒇M_{k}{\boldsymbol{f}} and Mk𝒈M_{k}^{\top}{\boldsymbol{g}} in each iteration for suitably dimensioned vectors 𝒇{\boldsymbol{f}} and 𝒈{\boldsymbol{g}}. Therefore, Algorithm 1 is more efficient than the normwise condition number estimation method in [14].
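For readers who want to experiment, the following Python sketch mimics Algorithm 1 with two simplifications that are ours, not the paper's: the directional derivatives g_i are approximated by central finite differences of the TTLS solution map instead of the closed form (4.2), and numpy's QR factorization replaces the modified Gram-Schmidt step. The minimum-norm TTLS solution is formed as x_k = −V12 V22ᵀ/‖V22‖₂², cf. (3.8).

```python
import numpy as np

def ttls_solution(A, b, k):
    """Minimum-norm TTLS solution x_k = -V12 V22^T / ||V22||_2^2
    from the SVD of [A b] (cf. (3.8)); requires sigma_k > sigma_{k+1}."""
    n = A.shape[1]
    V = np.linalg.svd(np.column_stack([A, b]))[2].T
    V12, V22 = V[:n, k:], V[n:, k:]
    return -(V12 @ V22.T).ravel() / np.linalg.norm(V22) ** 2

def sce_normwise(A, b, k, ell=3, h=1e-7, seed=0):
    """Sketch of Algorithm 1 with finite-difference directional derivatives
    replacing the closed form (4.2)."""
    m, n = A.shape
    p = m * (n + 1)
    wallis = lambda q: np.sqrt(2.0 / (np.pi * (q - 0.5)))
    rng = np.random.default_rng(seed)
    # Step 1: Gaussian perturbations of [A b], orthonormalized.
    Q, _ = np.linalg.qr(rng.standard_normal((p, ell)))
    H = np.column_stack([A, b])
    G = np.empty((n, ell))
    for i in range(ell):
        dH = Q[:, i].reshape((m, n + 1), order='F')   # unvec (column-major)
        Hp, Hm = H + h * dH, H - h * dH
        G[:, i] = (ttls_solution(Hp[:, :n], Hp[:, n], k)
                   - ttls_solution(Hm[:, :n], Hm[:, n], k)) / (2 * h)
    # Steps 2-4: Wallis factors, absolute condition vector, normwise estimate.
    K_abs = (wallis(ell) / wallis(p)) * np.sqrt((G ** 2).sum(axis=1))
    kappa = (np.linalg.norm(K_abs) * np.linalg.norm(H, 'fro')
             / np.linalg.norm(ttls_solution(A, b, k)))
    return K_abs, kappa
```

On a small random problem, the resulting ‖K_abs‖ agrees with a brute-force finite-difference Jacobian norm to within the usual small factor.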

4.2. Componentwise perturbation analysis

If the perturbation in the input data is measured componentwise rather than by norm, the sensitivity of a function may be measured more accurately [26]. The SCE method can also be adapted to componentwise perturbations [18], which may give a more realistic indication of the accuracy of a computed solution than the normwise condition number does. In componentwise perturbation analysis, for a perturbation ΔA=(Δaij)\Delta A=(\Delta a_{ij}) of A=(aij)m×nA=(a_{ij})\in{\mathbb{R}}^{m\times n}, we assume that |ΔA|ϵ|A||\Delta A|\leq\epsilon|A|. Therefore, the perturbation ΔA\Delta A can be rewritten as ΔA=δ(𝒜A)\Delta A=\delta\,({\mathcal{A}}\boxdot A) with |δ|ϵ|\delta|\leq\epsilon and each entry of 𝒜\mathcal{A} lying in the interval [1,1][-1,1]. Based on these observations, we can obtain a componentwise sensitivity estimate of the solution 𝒙k{\boldsymbol{x}}_{k} of the TTLS problem (1.3) as follows. The detailed descriptions are given in Algorithm 2, which is a direct modification of Algorithm 1.
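The reparametrization ΔA = δ(𝒜⊡A) is easy to verify numerically. In the short Python snippet below (our illustration; ⊡ is the entrywise product), the generated perturbation satisfies |ΔA| ≤ ϵ|A| componentwise by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
eps = 1e-8

S = rng.uniform(-1.0, 1.0, size=A.shape)   # entries of the matrix "A-script" in [-1, 1]
delta = eps * rng.uniform(-1.0, 1.0)       # |delta| <= eps
dA = delta * (S * A)                       # Delta A = delta (A-script ⊡ A)
```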

Algorithm 2 Small sample condition estimation for the TTLS problem under componentwise perturbation analysis
  • 1.

    Generate matrices [ΔA^1Δ𝒃^1][ΔA^2Δ𝒃^2],,[ΔA^Δ𝒃^]m×(n+1)[\Delta\hat{A}_{1}~{}\Delta\hat{{\boldsymbol{b}}}_{1}]~{}[\Delta\hat{A}_{2}~{}\Delta\hat{{\boldsymbol{b}}}_{2}],\ldots,[\Delta\hat{A}_{\ell}~{}\Delta\hat{{\boldsymbol{b}}}_{\ell}]\in{\mathbb{R}}^{m\times(n+1)} with entries being in 𝒩(0,1){\mathcal{N}}(0,1). Orthonormalize the following matrix

    [𝗏𝖾𝖼(ΔA^1)𝗏𝖾𝖼(ΔA^2)𝗏𝖾𝖼(ΔA^)Δ𝒃^1Δ𝒃^2Δ𝒃^]\left[\begin{matrix}{\sf vec}(\Delta\hat{A}_{1})&{\sf vec}(\Delta\hat{A}_{2})&\cdots&{\sf vec}(\Delta\hat{A}_{\ell})\cr\Delta\hat{{\boldsymbol{b}}}_{1}&\Delta\hat{{\boldsymbol{b}}}_{2}&\cdots&\Delta\hat{{\boldsymbol{b}}}_{\ell}\end{matrix}\right]

    to obtain [𝒒1𝒒2,,𝒒][{\boldsymbol{q}}_{1}~{}{\boldsymbol{q}}_{2},\ldots,{\boldsymbol{q}}_{\ell}] via the modified Gram-Schmidt orthogonalization process. Set

    [ΔA~iΔ𝒃~i]=𝗎𝗇𝗏𝖾𝖼(𝒒i)i=1,,.[\Delta\widetilde{A}_{i}~{}\Delta\tilde{\boldsymbol{b}}_{i}]={\sf unvec}({\boldsymbol{q}}_{i})\quad i=1,\ldots,\ell.

    Set

    ΔHi:=[ΔAiΔ𝒃i]=[A𝒃][ΔA~iΔ𝒃~i],i=1,,.\Delta H_{i}:=[\Delta A_{i}~{}\Delta{\boldsymbol{b}}_{i}]=[A~{}{\boldsymbol{b}}]\boxdot[\Delta\widetilde{A}_{i}~{}\Delta\tilde{\boldsymbol{b}}_{i}],\quad i=1,\ldots,\ell.
  • 2.

    Let p=m(n+1)p=m(n+1). Approximate ωp\omega_{p} and ω\omega_{\ell} by (4.1).

  • 3.

    For j=1,2,,,j=1,2,\ldots,\ell, calculate 𝒈j{\boldsymbol{g}}_{j} by (4.2). Estimate the absolute condition vector

    CabsTTLS,()=ωωpj=1|𝒈j|2.\displaystyle C_{\rm abs}^{\mathrm{TTLS},(\ell)}=\frac{\omega_{\ell}}{\omega_{p}}\sqrt{\sum_{j=1}^{\ell}|{\boldsymbol{g}}_{j}|^{2}}.
  • 4.

    Set the relative condition vector CrelTTLS,()=CabsTTLS,()/𝒙kC_{\rm rel}^{\mathrm{TTLS},(\ell)}=C_{\rm abs}^{\mathrm{TTLS},(\ell)}/{\boldsymbol{x}}_{k}. Compute the mixed and componentwise condition estimations mSCETTLS,()m_{\rm SCE}^{\mathrm{TTLS},(\ell)} and cSCETTLS,()c_{\rm SCE}^{\mathrm{TTLS},(\ell)} as follows,

    mSCETTLS,()=CabsTTLS,()𝒙k,cSCETTLS,()=CrelTTLS,()=CabsTTLS,()𝒙k.m_{\rm SCE}^{\mathrm{TTLS},(\ell)}=\frac{\left\|C_{\rm abs}^{\mathrm{TTLS},(\ell)}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}},\quad c_{\rm SCE}^{\mathrm{TTLS},(\ell)}=\left\|C_{\rm rel}^{\mathrm{TTLS},(\ell)}\right\|_{\infty}=\left\|\frac{C_{\rm abs}^{\mathrm{TTLS},(\ell)}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}.

To estimate the mixed and componentwise condition numbers via Algorithm 2, only m(n+1)\ell m(n+1) additional flops are needed compared with Algorithm 1.

4.3. Structured perturbation analysis

For the structured TLS problem, it is reasonable to consider the case where the perturbation ΔA\Delta A has the same structure as AA. Suppose that the matrix AA takes the form of (2.17), i.e., A=i=1taiSiA=\sum_{i=1}^{t}a_{i}S_{i}. Here, S1,,StS_{1},\ldots,S_{t} are linearly independent matrices spanning a linear subspace 𝒮m×n{\mathcal{S}}\subset{\mathbb{R}}^{m\times n}. It is easy to see that

𝗏𝖾𝖼(A)=Φst𝒂,{\sf vec}(A)=\Phi^{st}{\boldsymbol{a}},

where 𝒂=[a1,a2,,at]t{\boldsymbol{a}}=[a_{1},a_{2},\ldots,a_{t}]^{\top}\in{\mathbb{R}}^{t} and Φst=[𝗏𝖾𝖼(S1),𝗏𝖾𝖼(S2),,𝗏𝖾𝖼(St)]\Phi^{st}=[{\sf vec}(S_{1}),{\sf vec}(S_{2}),\ldots,{\sf vec}(S_{t})]. For

𝒮={all m×n real Toeplitz matrices},\mathcal{S}=\{\mbox{all $m\times n$ real Toeplitz matrices}\},

we have t=m+n1t=m+n-1,

S1\displaystyle S_{1} =𝚝𝚘𝚎𝚙𝚕𝚒𝚝𝚣(𝒆1,0),,Sm=𝚝𝚘𝚎𝚙𝚕𝚒𝚝𝚣(𝒆m,0),\displaystyle={\tt toeplitz}({\boldsymbol{e}}_{1},0),\,\ldots,\,S_{m}={\tt toeplitz}({\boldsymbol{e}}_{m},0),
Sm+1\displaystyle S_{m+1} =𝚝𝚘𝚎𝚙𝚕𝚒𝚝𝚣(0,𝒆2),,Sm+n1=𝚝𝚘𝚎𝚙𝚕𝚒𝚝𝚣(0,𝒆n),\displaystyle={\tt toeplitz}(0,{\boldsymbol{e}}_{2}),\,\ldots,\,S_{m+n-1}={\tt toeplitz}(0,{\boldsymbol{e}}_{n}),

where the Matlab-routine notation A=𝚝𝚘𝚎𝚙𝚕𝚒𝚝𝚣(Tc,Tr)𝒮A={\tt toeplitz}(T_{c},T_{r})\in{\mathcal{S}} denotes a Toeplitz matrix having TcmT_{c}\in{\mathbb{R}}^{m} as its first column and TrnT_{r}\in{\mathbb{R}}^{n} as its first row, and A=i=1taiSiA=\sum_{i=1}^{t}a_{i}S_{i} with 𝒂=[Tc,Tr(2:end)]m+n1{\boldsymbol{a}}=[T_{c}^{\top},T_{r}(2:end)]^{\top}\in{\mathbb{R}}^{m+n-1}. This means that a Toeplitz matrix AA can be obtained from a vector 𝒂m+n1{\boldsymbol{a}}\in{\mathbb{R}}^{m+n-1} by letting τ\tau be the map

\tau({\boldsymbol{a}})=\begin{bmatrix}a_{1}&a_{m+1}&\cdots&a_{m+n-2}&a_{m+n-1}\cr a_{2}&a_{1}&\cdots&a_{m+n-3}&a_{m+n-2}\cr\vdots&\vdots&\ddots&\vdots&\vdots\cr a_{m-1}&a_{m-2}&\cdots&a_{m-n+1}&a_{m-n}\cr a_{m}&a_{m-1}&\cdots&a_{m-n+2}&a_{m-n+1}\end{bmatrix}=A\in{\mathbb{R}}^{m\times n},\quad\forall{\boldsymbol{a}}=\begin{bmatrix}a_{1}\cr a_{2}\cr\vdots\cr a_{m+n-1}\end{bmatrix}\in{\mathbb{R}}^{m+n-1}.
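The map τ and the matrix Φ^st can be checked against each other numerically. In the Python sketch below (our construction, 0-based indexing), the Toeplitz basis matrices are S_i = τ(e_i) and vec is taken column-major to match the convention vec(A) = Φ^st 𝒂.

```python
import numpy as np

def tau(a, m, n):
    """tau(a): first column a_1..a_m, first row a_1, a_{m+1}, ..., a_{m+n-1}
    (0-based: A[i, j] = a[i-j] for i >= j, else a[m-1+j-i])."""
    A = np.empty((m, n))
    for i in range(m):
        for j in range(n):
            A[i, j] = a[i - j] if i >= j else a[m - 1 + j - i]
    return A

m, n = 5, 3
t = m + n - 1
a = np.arange(1.0, t + 1)           # a = (a_1, ..., a_{m+n-1})
A = tau(a, m, n)

# Phi^st = [vec(S_1), ..., vec(S_t)] with S_i = tau(e_i), vec column-major.
I = np.eye(t)
Phi = np.column_stack([tau(I[:, i], m, n).ravel(order='F') for i in range(t)])
```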

The SCE method maintains the desired matrix structure by working with perturbations of AA and 𝒃{\boldsymbol{b}} in the linear space 𝒮×m{\mathcal{S}}\times{\mathbb{R}}^{m}. This requires only slight changes to the SCE algorithm. By generating Δ𝒂\Delta{\boldsymbol{a}} and Δ𝒃\Delta{\boldsymbol{b}} randomly instead of ΔA\Delta A and Δ𝒃\Delta{\boldsymbol{b}} as in Algorithm 1, we obtain an algorithm to estimate the condition of the composite map fτf\circ\tau. We summarize the structured normwise condition estimation in Algorithm 3, which also includes the structured componentwise condition estimation. The computational cost of Algorithm 3 is reported in Table 1.

Algorithm 3 Small sample condition estimation for the STTLS problem under structured perturbation analysis
  • 1.

    Generate matrices [Δ𝒂^1,Δ𝒂^2,,Δ𝒂^],[Δ𝒃^1,Δ𝒃^2,,Δ𝒃^][\Delta\hat{{\boldsymbol{a}}}_{1},\Delta\hat{{\boldsymbol{a}}}_{2},\ldots,\Delta\hat{{\boldsymbol{a}}}_{\ell}],~{}[\Delta\hat{{\boldsymbol{b}}}_{1},\Delta\hat{{\boldsymbol{b}}}_{2},\ldots,\Delta\hat{{\boldsymbol{b}}}_{\ell}] with entries in 𝒩(0,1),{\mathcal{N}}(0,1), where Δ𝒂^it\Delta\hat{{\boldsymbol{a}}}_{i}\in{\mathbb{R}}^{t} and Δ𝒃^im\Delta\hat{{\boldsymbol{b}}}_{i}\in{\mathbb{R}}^{m}. Orthonormalize the following matrix

    [Δ𝒂^1Δ𝒂^2Δ𝒂^Δ𝒃^1Δ𝒃^2Δ𝒃^],\left[\begin{matrix}\Delta\hat{{\boldsymbol{a}}}_{1}&\Delta\hat{{\boldsymbol{a}}}_{2}&\cdots&\Delta\hat{{\boldsymbol{a}}}_{\ell}\cr\Delta\hat{{\boldsymbol{b}}}_{1}&\Delta\hat{{\boldsymbol{b}}}_{2}&\cdots&\Delta\hat{{\boldsymbol{b}}}_{\ell}\end{matrix}\right],

    to obtain an orthonormal matrix [ξ1ξ2,,ξ][\xi_{1}~{}\xi_{2},\ldots,\xi_{\ell}] by using the modified Gram-Schmidt orthogonalization process. Set

    [Δ𝒂~iΔ𝒃~i]=ξi,i=1,,.[\Delta\tilde{\boldsymbol{a}}_{i}^{\top}~{}\Delta\tilde{\boldsymbol{b}}_{i}^{\top}]^{\top}=\xi_{i},\quad i=1,\ldots,\ell.

    Set

    [Δ𝒂iΔ𝒃i]=[𝒂𝒃][Δ𝒂~iΔ𝒃~i],i=1,,[\Delta{\boldsymbol{a}}_{i}^{\top}~{}\Delta{\boldsymbol{b}}_{i}^{\top}]^{\top}=[{\boldsymbol{a}}^{\top}~{}{\boldsymbol{b}}^{\top}]^{\top}\boxdot[\Delta\tilde{\boldsymbol{a}}_{i}^{\top}~{}\Delta\tilde{\boldsymbol{b}}_{i}^{\top}]^{\top},\quad i=1,\ldots,\ell

    and

    ΔHi:=[ΔAiΔ𝒃i],ΔAi=j=1t(Δ𝒂i)jSj,i=1,,.\Delta H_{i}:=[\Delta A_{i}\quad\Delta{\boldsymbol{b}}_{i}],\quad\Delta A_{i}=\sum_{j=1}^{t}(\Delta{\boldsymbol{a}}_{i})_{j}S_{j},\quad i=1,\ldots,\ell.
  • 2.

    Let p=t+mp=t+m. Approximate ωp\omega_{p} and ω\omega_{\ell} by (4.1).

  • 3.

    For j=1,2,,,j=1,2,\ldots,\ell, calculate 𝒈j{\boldsymbol{g}}_{j} by (4.2). Compute the absolute condition vector

    K¯abs=ωωpj=1|𝒈j|2.\bar{K}_{abs}=\frac{\omega_{\ell}}{\omega_{p}}\sqrt{\sum_{j=1}^{\ell}|{\boldsymbol{g}}_{j}|^{2}}.
  • 4.

    Compute the normwise condition numbers as follows:

    κSCESTTLS,()=K¯abs2[𝒂𝒃]2𝒙k2.\kappa_{\rm SCE}^{\mathrm{STTLS},(\ell)}=\frac{\left\|\bar{K}_{abs}\right\|_{2}\left\|~{}[{\boldsymbol{a}}^{\top}~{}{\boldsymbol{b}}^{\top}]^{\top}\right\|_{2}}{\|{\boldsymbol{x}}_{k}\|_{2}}.

    Compute the mixed and componentwise condition estimations mSCESTTLS,()m_{\rm SCE}^{\mathrm{STTLS},(\ell)} and cSCESTTLS,()c_{\rm SCE}^{\mathrm{STTLS},(\ell)} as follows,

    mSCESTTLS,()=K¯abs𝒙k,cSCESTTLS,()=K¯abs𝒙k.m_{\rm SCE}^{\mathrm{STTLS},(\ell)}=\frac{\left\|\bar{K}_{abs}\right\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}},\quad c_{\rm SCE}^{\mathrm{STTLS},(\ell)}=\left\|\frac{\bar{K}_{abs}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}.
Table 1. Computational complexity for Algorithm 3.
Step 1 2 3
Algorithm 3 𝒪(2(m+t)+(m+t)){\mathcal{O}}(\ell^{2}(m+t)+\ell(m+t)) 𝒪(mn){\mathcal{O}}(mn) 𝒪(mn+n2){\mathcal{O}}(mn\ell+n^{2}\ell)

5. Numerical examples

In this section, we present some numerical examples to illustrate the reliability of SCE for the TTLS problem (1.3). For a given TTLS problem, the TTLS solution 𝒙k{\boldsymbol{x}}_{k} with truncation level kk can be computed by utilizing the SVD of [A𝒃][A~{}{\boldsymbol{b}}] and (3.8). The corresponding exact condition numbers are computed from their explicit expressions for the given data [A𝒃][A~{}{\boldsymbol{b}}]. The sample number \ell in Algorithms 1 to 3 is set to =3\ell=3. All numerical experiments are carried out in Matlab R2019b, with machine epsilon μ2.2×1016\mu\approx 2.2\times 10^{-16}, under Microsoft Windows 10.

Example 5.1.

Let

A=[2003010s],𝒃=[10s01],A=\left[\begin{matrix}2&0\\ 0&3\\ 0&10^{-s}\\ \end{matrix}\right],\quad{\boldsymbol{b}}=\left[\begin{matrix}10^{-s}\\ 0\\ 1\end{matrix}\right],

where s+s\in\mathbb{R}_{+}.
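For k = n = 2 the TTLS solution of Example 5.1 coincides with the classical TLS solution, which gives a convenient cross-check. The Python snippet below (our illustration) computes 𝒙k via the minimum-norm TTLS formula x_k = −V12 V22ᵀ/‖V22‖₂², cf. (3.8), and compares it with the classical TLS expression (AᵀA − σ_{n+1}²I)⁻¹Aᵀ𝒃, which is valid here since σ₂(A) > σ₃([A 𝒃]).

```python
import numpy as np

s = 3.0
A = np.array([[2.0, 0.0],
              [0.0, 3.0],
              [0.0, 10.0 ** (-s)]])
b = np.array([10.0 ** (-s), 0.0, 1.0])
n, k = 2, 2

H = np.column_stack([A, b])
sig = np.linalg.svd(H, compute_uv=False)
V = np.linalg.svd(H)[2].T
V12, V22 = V[:n, k:], V[n:, k:]
x_ttls = -(V12 @ V22.T).ravel() / np.linalg.norm(V22) ** 2

# Classical TLS solution for comparison (k = n case):
x_tls = np.linalg.solve(A.T @ A - sig[n] ** 2 * np.eye(n), A.T @ b)
```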

In Table 2, we compare our SCE-based estimations κSCETTLS,()\kappa_{\rm SCE}^{{\rm TTLS},(\ell)}, mSCETTLS,()m_{\rm SCE}^{\mathrm{TTLS},(\ell)} and cSCETTLS,()c_{\rm SCE}^{\mathrm{TTLS},(\ell)} from Algorithms 1 and 2 with the corresponding exact condition numbers for Example 5.1. The symbol ``×"``\times" in Table 2 means that the condition numbers κ1rel(A,𝒃)\kappa^{rel}_{1}(A,{\boldsymbol{b}}), m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) and c1(A,𝒃)c_{1}(A,{\boldsymbol{b}}) are not defined for the truncation level k=1k=1. From the numerical results listed in Table 2, it is easy to see that the normwise condition number κrel(A,𝒃)\kappa^{rel}(A,{\boldsymbol{b}}) defined by (2.15) is much larger than the mixed and componentwise condition numbers m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}) given by Theorem 2.1. Indeed, the TTLS problem (1.3) is well conditioned under componentwise perturbations regardless of the choice of ss and kk, so the mixed and componentwise condition numbers may capture the true conditioning of this TTLS problem better than the normwise condition number does. We also observe that the SCE-based condition estimations are reliable. Moreover, for k=2k=2, the numerical values of the mixed and componentwise condition numbers m(A,𝒃)m(A,{\boldsymbol{b}}) and c(A,𝒃)c(A,{\boldsymbol{b}}) given in Theorem 2.1 are equal to the corresponding untruncated ones m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) and c1(A,𝒃)c_{1}(A,{\boldsymbol{b}}) given by (3.12)–(3.13), respectively. A similar conclusion can be drawn when comparing κrel(A,𝒃)\kappa^{rel}(A,{\boldsymbol{b}}) with κ1rel(A,𝒃)\kappa^{rel}_{1}(A,{\boldsymbol{b}}) for k=2k=2.

Table 2. Comparisons of exact normwise, mixed and componentwise condition numbers with SCE-based condition estimations for Example 5.1 via Algorithms 1 and 2 with =3\ell=3.
ss kk κrel(A,𝒃)\kappa^{rel}(A,{\boldsymbol{b}}) κ1rel(A,𝒃)\kappa^{rel}_{1}(A,{\boldsymbol{b}}) κSCETTLS,()\kappa_{\rm SCE}^{{\rm TTLS},(\ell)} m(A,𝒃)m(A,{\boldsymbol{b}}) m1(A,𝒃)m_{1}(A,{\boldsymbol{b}}) mSCETTLS,()m_{\rm SCE}^{\mathrm{TTLS},(\ell)} c(A,𝒃)c(A,{\boldsymbol{b}}) c1(A,𝒃)c_{1}(A,{\boldsymbol{b}}) cSCETTLS,()c_{\rm SCE}^{\mathrm{TTLS},(\ell)}
33 11 1.181041.18\cdot 10^{4} ×\times 1.151041.15\cdot 10^{4} 4.504.50 ×\times 2.702.70 16.2016.20 ×\times 11.6511.65
3 22 4.111034.11\cdot 10^{3} 4.111034.11\cdot 10^{3} 5.461035.46\cdot 10^{3} 3.333.33 3.333.33 1.401.40 4.504.50 4.504.50 2.192.19
66 11 1.181071.18\cdot 10^{7} ×\times 1.511071.51\cdot 10^{7} 4.504.50 ×\times 3.023.02 16.0516.05 ×\times 10.1610.16
66 22 4.111064.11\cdot 10^{6} 4.111064.11\cdot 10^{6} 5.671065.67\cdot 10^{6} 3.333.33 3.333.33 2.352.35 4.504.50 4.504.50 2.352.35
99 11 1.1810101.18\cdot 10^{10} ×\times 1.4210101.42\cdot 10^{10} 4.504.50 ×\times 2.312.31 4.504.50 ×\times 2.312.31
99 22 4.111094.11\cdot 10^{9} 4.111094.11\cdot 10^{9} 2.981092.98\cdot 10^{9} 3.333.33 3.333.33 2.432.43 4.504.50 4.504.50 3.693.69
1212 11 1.1810131.18\cdot 10^{13} ×\times 1.6110131.61\cdot 10^{13} 4.504.50 ×\times 3.373.37 4.504.50 ×\times 3.373.37
1212 22 4.1110124.11\cdot 10^{12} 4.1110124.11\cdot 10^{12} 3.8310123.83\cdot 10^{12} 3.333.33 3.333.33 1.201.20 4.504.50 4.504.50 3.173.17
Example 5.2.

[27] Let the data matrix AA and the observation vector 𝒃{\boldsymbol{b}} be given by

A=[m1111m1111m1111111]m×(m2),𝒃=[11m11]m.A=\left[\begin{matrix}m-1&-1&\cdots&-1\\ -1&m-1&\cdots&-1\\ \vdots&\vdots&\ddots&\vdots\\ -1&-1&\cdots&m-1\\ -1&-1&\cdots&-1\\ -1&-1&\cdots&-1\end{matrix}\right]\in{\mathbb{R}}^{m\times(m-2)},\quad{\boldsymbol{b}}=\left[\begin{matrix}-1\\ -1\\ \vdots\\ m-1\\ -1\end{matrix}\right]\in{\mathbb{R}}^{m}.

Since the first mm-22 singular values of the augmented matrix [A𝒃][A~{}{\boldsymbol{b}}] are equal and larger than the (m1)(m-1)-th singular value σm1\sigma_{m-1}, the truncated level kk can only be m2m-2. It is clear that AA is a Toeplitz matrix.
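The singular-value structure claimed above can be confirmed directly: writing [A 𝒃] = mE − ones(m, m−1) with E = [I_{m−1}; 0], one finds (our computation, easily checked numerically) that the first m−2 singular values all equal m while σ_{m−1} = √m. A Python check:

```python
import numpy as np

def example_52(m):
    """Data of Example 5.2: A is m x (m-2) with m-1 on the leading diagonal
    and -1 elsewhere; b has m-1 in row m-1 (1-based) and -1 elsewhere."""
    A = m * np.vstack([np.eye(m - 2), np.zeros((2, m - 2))]) - np.ones((m, m - 2))
    b = -np.ones(m)
    b[m - 2] = m - 1.0
    return A, b

m = 100
A, b = example_52(m)
sig = np.linalg.svd(np.column_stack([A, b]), compute_uv=False)
# sig[:m-2] all equal m and sig[m-2] = sqrt(m), so k = m-2 is the only
# truncation level with a strict gap sigma_k > sigma_{k+1}.
```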

A condition estimation is said to be reliable if it falls within one tenth to ten times of the corresponding exact condition number (cf. [15, Chapter 15]). Table 3 displays the numerical results for Example 5.2 with mm ranging from m=100m=100 to m=500m=500. From Table 3, we conclude that Algorithm 3 provides reliable mixed and componentwise condition estimations for this specific Toeplitz matrix AA and observation vector 𝒃{\boldsymbol{b}}, while the normwise condition estimation may seriously overestimate the true relative normwise condition number. We also see that the unstructured mixed and componentwise condition numbers m(A,𝒃)m(A,{\boldsymbol{b}}), c(A,𝒃)c(A,{\boldsymbol{b}}) given by Theorem 2.1 are not smaller than the corresponding structured ones ms(A,𝒃)m_{s}(A,{\boldsymbol{b}}), cs(A,𝒃)c_{s}(A,{\boldsymbol{b}}) shown in Theorem 2.2, which is consistent with Proposition 2.1. In particular, the structured normwise condition number is significantly smaller than its unstructured counterpart.

Table 3. Comparisons of true structured normwise, mixed and componentwise condition numbers with SCE-based condition estimations for Example 5.2 via Algorithms 3 with the truncated level k=m2k=m-2 and =3\ell=3.
mm κrel(A,𝒃)\kappa^{rel}(A,{\boldsymbol{b}}) κsrel(A,𝒃)\kappa_{s}^{rel}(A,{\boldsymbol{b}}) κSCESTTLS,()\kappa_{\rm SCE}^{\mathrm{STTLS},(\ell)} m(A,𝒃)m(A,{\boldsymbol{b}}) ms(A,𝒃)m_{s}(A,{\boldsymbol{b}}) mSCESTTLS,()m_{\rm SCE}^{\mathrm{STTLS},(\ell)} c(A,𝒃)c(A,{\boldsymbol{b}}) cs(A,𝒃)c_{s}(A,{\boldsymbol{b}}) cSCESTTLS,()c_{\rm SCE}^{\mathrm{STTLS},(\ell)}
100100 8.981028.98\cdot 10^{2} 9.241019.24\cdot 10^{1} 7.371037.37\cdot 10^{3} 2.492.49 2.492.49 2.842.84 2.492.49 2.492.49 2.842.84
200200 1.801031.80\cdot 10^{3} 1.311021.31\cdot 10^{2} 1.931041.93\cdot 10^{4} 2.502.50 2.502.50 2.272.27 2.502.50 2.502.50 2.272.27
300300 2.711032.71\cdot 10^{3} 1.601021.60\cdot 10^{2} 3.801043.80\cdot 10^{4} 2.502.50 2.502.50 2.212.21 2.502.50 2.502.50 2.212.21
400400 3.611033.61\cdot 10^{3} 1.851021.85\cdot 10^{2} 5.821045.82\cdot 10^{4} 2.502.50 2.502.50 2.192.19 2.502.50 2.502.50 2.192.19
500500 4.521034.52\cdot 10^{3} 2.061022.06\cdot 10^{2} 8.051048.05\cdot 10^{4} 2.502.50 2.502.50 2.142.14 2.502.50 2.502.50 2.142.14
Example 5.3.

This test problem comes from [14, §3.2]. The augmented matrix [A𝒃]m×(n+1)[A~{}{\boldsymbol{b}}]\in{\mathbb{R}}^{m\times(n+1)} is built via its SVD [A𝒃]=UΣV[A~{}{\boldsymbol{b}}]=U\Sigma V^{\top}. Here, UU is an arbitrary orthogonal matrix of size m×mm\times m, and Σ\Sigma is a diagonal matrix with equally spaced singular values in [102,1][10^{-2},1]. The matrix VV is generated as follows: compute the QR decomposition of the matrix

[1β2𝒄Xβ𝒅Y]\left[\begin{matrix}\sqrt{1-\beta^{2}}{\boldsymbol{c}}&X\\ \beta{\boldsymbol{d}}&Y\end{matrix}\right]

with the Q-factor QQ, where XX and YY are random matrices, and 𝒄k{\boldsymbol{c}}\in{\mathbb{R}}^{k} and 𝒅n+1k{\boldsymbol{d}}\in{\mathbb{R}}^{n+1-k} are normalized random vectors. Here, kmin{m,n}k\leq\min\{m,n\} is the truncation level. Then we set VV to be the orthogonal matrix obtained by interchanging the first and last rows of QQ^{\top}. It is easy to verify that V22=β𝒅V_{22}=\beta{\boldsymbol{d}}^{\top} and V222=β\|V_{22}\|_{2}=\beta. In this test, we take m=400m=400, n=120n=120, k=80k=80, and β=103\beta=10^{-3}. The perturbation matrices ΔA\Delta A and Δ𝒃\Delta{\boldsymbol{b}} of AA and 𝒃{\boldsymbol{b}} are generated as follows:

ΔA=ϵ(EA),Δ𝒃=ϵ(𝒇𝒃),\displaystyle\Delta A=\epsilon\,(E\boxdot A),\,\,\Delta{\boldsymbol{b}}=\epsilon\,({\boldsymbol{f}}\boxdot{\boldsymbol{b}}), (5.1)

where EE is a random matrix and 𝒇{\boldsymbol{f}} is a random vector, whose entries are uniformly distributed in the open interval (1,1)(-1,1), and ϵ=108\epsilon=10^{-8} represents the magnitude of the perturbation.
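The construction of V can be sketched as follows (our Python reading of the recipe, interpreting the row operation as interchanging the first and last rows of Qᵀ); the last row of V is then ±[√(1−β²)𝒄ᵀ, β𝒅ᵀ], so ‖V22‖₂ = β as stated.

```python
import numpy as np

def build_V(n, k, beta, rng):
    """Build V with V22 = +/- beta d^T as in the test problem of [14, S3.2]."""
    c = rng.standard_normal(k)
    c /= np.linalg.norm(c)
    d = rng.standard_normal(n + 1 - k)
    d /= np.linalg.norm(d)
    X = rng.standard_normal((k, n))
    Y = rng.standard_normal((n + 1 - k, n))
    M = np.block([[np.sqrt(1.0 - beta ** 2) * c[:, None], X],
                  [beta * d[:, None], Y]])
    Q = np.linalg.qr(M)[0]               # (n+1) x (n+1) Q-factor
    V = Q.T.copy()
    V[[0, -1]] = V[[-1, 0]]              # interchange first and last rows
    return V

rng = np.random.default_rng(0)
n, k, beta = 120, 80, 1e-3
V = build_V(n, k, beta, rng)
```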

We note that both ΔA\Delta A and Δ𝒃\Delta{\boldsymbol{b}} are componentwise perturbations on AA and 𝒃{\boldsymbol{b}} respectively. In order to illustrate the validity of these estimators κSCETTLS,(),mSCETTLS,()\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)},\,m_{\rm SCE}^{\mathrm{TTLS},(\ell)} and cSCETTLS,()c_{\rm SCE}^{\mathrm{TTLS},(\ell)} via Algorithms 1 and 2, we define the normwise, mixed and componentwise over-estimation ratios as follows

rκ=κSCETTLS,()ϵ𝒙~k𝒙k2/𝒙k2,rm=mSCETTLS,()ϵ𝒙~k𝒙k/𝒙k,rc=cSCETTLS,()ϵ𝒙~k𝒙k𝒙k,r_{\kappa}=\frac{\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)}\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{2}/\|{\boldsymbol{x}}_{k}\|_{2}},\quad r_{m}=\frac{m_{\rm SCE}^{\mathrm{TTLS},(\ell)}\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{\infty}/\|{\boldsymbol{x}}_{k}\|_{\infty}},\quad r_{c}=\frac{c_{\rm SCE}^{\mathrm{TTLS},(\ell)}\epsilon}{\|\frac{\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}}{{\boldsymbol{x}}_{k}}\|_{\infty}},

Typically, ratios in (0.1,10)(0.1,~{}10) are acceptable [15, Chap. 15].

Figure 1. SCE for the TTLS problem under unstructured componentwise perturbations with 1000 samples for Example 5.3: (a) normwise, (b) mixed, and (c) componentwise over-estimation ratios.

Figure 1 displays the numerical results for Example 5.3, where we generate 1000 random samples of [A𝒃][A~{}{\boldsymbol{b}}]. From Figure 1, we see that κSCETTLS,()\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)} may seriously overestimate the true relative normwise error: all 1000 normwise over-estimation ratios are greater than 1010, and their mean value is 453.8841453.8841. All mixed over-estimation ratios lie within (0.1,10)(0.1,~{}10), with mean value 1.35631.3563. Only 6 of the componentwise over-estimation ratios are greater than 10, and their mean value is 2.47162.4716. The maximal values of the normwise, mixed and componentwise over-estimation ratios are 712.5759712.5759, 2.79082.7908 and 11.650811.6508, while the corresponding minimal values are 263.3766263.3766, 0.60780.6078 and 0.42610.4261, respectively. Therefore, the mixed and componentwise condition estimations mSCETTLS,()m_{\rm SCE}^{\rm{TTLS},(\ell)} and cSCETTLS,()c_{\rm SCE}^{\rm{TTLS},(\ell)} are reliable.

Example 5.4.

Let the Toeplitz matrix AA and the vector 𝒃{\boldsymbol{b}} be as defined in Example 5.2 with m=500m=500. We generate 1000 structured componentwise perturbations ΔA1=ϵ(E1A)\Delta A_{1}=\epsilon\,(E_{1}\boxdot A) and 1000 unstructured componentwise perturbations ΔA2=ϵ(E2A)\Delta A_{2}=\epsilon\,(E_{2}\boxdot A), where E1E_{1} is a random Toeplitz matrix and E2E_{2} is a random matrix, both with entries uniformly distributed in the open interval (1,1)(-1,1). Moreover, Δ𝒃=ϵ(𝒇𝒃)\Delta{\boldsymbol{b}}=\epsilon\,({\boldsymbol{f}}\boxdot{\boldsymbol{b}}), where 𝒇{\boldsymbol{f}} is a random vector whose components are uniformly distributed in the open interval (1,1)(-1,1).

For Example 5.4, we use $r_{\kappa}^{\rm S}$, $r_{m}^{\rm S}$ and $r_{c}^{\rm S}$ to denote the structured normwise, mixed and componentwise over-estimation ratios corresponding to the structured componentwise perturbations $\Delta A_{1}$ and $\Delta\boldsymbol{b}$, and $r_{\kappa}$, $r_{m}$ and $r_{c}$ to denote the unstructured counterparts corresponding to the unstructured componentwise perturbations $\Delta A_{2}$ and $\Delta\boldsymbol{b}$, where

$$r_{\kappa}^{\rm S}=\frac{\kappa_{\rm SCE}^{\mathrm{STTLS},(\ell)}\,\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{2}/\|{\boldsymbol{x}}_{k}\|_{2}},\quad r_{m}^{\rm S}=\frac{m_{\rm SCE}^{\mathrm{STTLS},(\ell)}\,\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{\infty}/\|{\boldsymbol{x}}_{k}\|_{\infty}},\quad r_{c}^{\rm S}=\frac{c_{\rm SCE}^{\mathrm{STTLS},(\ell)}\,\epsilon}{\left\|\frac{\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}},$$
$$r_{\kappa}=\frac{\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)}\,\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{2}/\|{\boldsymbol{x}}_{k}\|_{2}},\quad r_{m}=\frac{m_{\rm SCE}^{\mathrm{TTLS},(\ell)}\,\epsilon}{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{\infty}/\|{\boldsymbol{x}}_{k}\|_{\infty}},\quad r_{c}=\frac{c_{\rm SCE}^{\mathrm{TTLS},(\ell)}\,\epsilon}{\left\|\frac{\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}}.$$
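These ratio definitions translate directly into code. The sketch below assumes the exact and perturbed solutions $\boldsymbol{x}_k$, $\tilde{\boldsymbol{x}}_k$ and the three scalar SCE estimates are already available; all argument names are hypothetical:

```python
import numpy as np

def overestimation_ratios(x_exact, x_pert, kappa_est, m_est, c_est, eps):
    """Over-estimation ratios: (condition estimate * eps) divided by the
    observed normwise, mixed, or componentwise relative error."""
    d = x_pert - x_exact
    r_kappa = kappa_est * eps / (np.linalg.norm(d) / np.linalg.norm(x_exact))
    r_m = m_est * eps / (np.abs(d).max() / np.abs(x_exact).max())
    r_c = c_est * eps / np.abs(d / x_exact).max()
    return r_kappa, r_m, r_c
```

A ratio near 1 indicates a tight estimate; a ratio far above 10 signals serious over-estimation, as observed for the normwise case in Example 5.3.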

Figure 2 displays the numerical results for Example 5.4 with $\ell=3$ and $\epsilon=10^{-8}$. In Figure 2(A)–(C), the blue symbol “+” denotes the values of $r_{\kappa}^{\rm S}$, $r_{m}^{\rm S}$ and $r_{c}^{\rm S}$ for the 1000 structured perturbations, while the red symbol “*” denotes the values of $r_{\kappa}$, $r_{m}$ and $r_{c}$ for the 1000 unstructured perturbations.

From Figure 2, we observe that the mixed and componentwise condition estimations $m_{\rm SCE}^{\mathrm{STTLS},(\ell)}$, $c_{\rm SCE}^{\mathrm{STTLS},(\ell)}$, $m_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ and $c_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ are reliable, while the structured normwise condition estimation $\kappa_{\rm SCE}^{\mathrm{STTLS},(\ell)}$ may seriously over-estimate the true relative normwise error. Furthermore, the over-estimation ratios associated with the structured mixed and componentwise condition numbers are smaller than their unstructured counterparts in most cases, which is consistent with the conclusion of Proposition 2.1. The mean values of $r_{m}^{\rm S}$, $r_{m}$, $r_{c}^{\rm S}$ and $r_{c}$ over the $1000$ samples are $1.7140$, $2.4642$, $1.7140$ and $2.4642$, respectively. Moreover, all the unstructured and structured mixed and componentwise over-estimation ratios lie within $(0.1,10)$, which indicates that the mixed and componentwise condition estimations are reliable.

[Figure 2: (a) Unstructured and structured normwise over-estimation ratios; (b) Unstructured and structured mixed over-estimation ratios; (c) Unstructured and structured componentwise over-estimation ratios]
Figure 2. SCE for the TTLS problem under unstructured and structured componentwise perturbations with 1000 unstructured and structured perturbations for Example 5.4.
Example 5.5.

This example comes from a model that reconstructs the Shepp-Logan “head phantom” image (Shepp and Logan 1974) by the TTLS technique, which is widely used in inverse scattering studies; in particular, the TTLS method has been used in ultrasound inverse scattering imaging [24]. Here, we utilize the MATLAB file “paralleltomo.m” from the test problem suite (Netlib: http://www.netlib.org/numeralgo/ or GitHub: https://github.com/jakobsj/AIRToolsII) to create a parallel-beam CT test problem and obtain the exact phantom. The input parameters of “paralleltomo.m” are set to $N=40$, $\theta=0{:}5{:}175$, and $p=55$; hence we obtain a $1834$-by-$1600$ matrix $A$ and a $1834$-by-$1$ right-hand side vector $\boldsymbol{b}$. The 500 perturbations $\Delta A$ and $\Delta\boldsymbol{b}$ of $A$ and $\boldsymbol{b}$ are generated as in (5.1).
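Throughout the experiments, the TTLS solution $\boldsymbol{x}_k$ is computed from the SVD of the augmented matrix $[A~\boldsymbol{b}]$. A minimal sketch of the standard SVD-based truncated TLS solver, valid under the genericity condition (the bottom-right block $V_{22}$ of the right singular vector matrix is nonzero); this is a generic illustration, not the paper's own algorithm:

```python
import numpy as np

def ttls_solve(A, b, k):
    """Truncated TLS solution via the SVD of [A b].

    Partition the right singular vectors V = [[V11, V12], [V21, V22]],
    where the last n+1-k columns correspond to the smallest singular
    values; the TTLS solution is x_k = -V12 V22^+ (pseudoinverse of the
    1-by-(n+1-k) row V22)."""
    m, n = A.shape
    _, _, Vt = np.linalg.svd(np.column_stack([A, b]))
    V = Vt.T
    V12 = V[:n, k:]                       # n x (n+1-k) top-right block
    V22 = V[n:, k:]                       # 1 x (n+1-k) bottom-right block
    return -V12 @ V22.T / (V22 @ V22.T)   # x_k as an (n, 1) array
```

For a consistent system $\boldsymbol{b}=A\boldsymbol{x}$ with full column rank $A$ and truncation level $k=n$, the smallest singular value of $[A~\boldsymbol{b}]$ is zero and the solver recovers $\boldsymbol{x}$ exactly.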

Table 4 lists the condition estimations $\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)}$, $m_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ and $c_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ with truncation level $k=1536$, together with the corresponding relative errors, for different perturbation magnitudes $\epsilon$ in Example 5.5. From Table 4, we see that the relative errors are bounded by the product of $\epsilon$ and the corresponding condition estimations, which means that the proposed condition estimations give reliable error bounds. The componentwise condition estimation $c_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ is much larger than $\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ and $m_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ because the minimum absolute component of the TTLS solution $\boldsymbol{x}_{k}$ is very small, of order $10^{-6}$; hence the tiny components of $\boldsymbol{x}_{k}$ are very sensitive to small perturbations.

Table 4. Comparison of the relative errors and the SCE-based estimations by Algorithms 1 and 2 under 500 perturbations with different perturbation magnitudes for Example 5.5.

| $\epsilon$ | $10^{-1}$ | $10^{-2}$ | $10^{-3}$ | $10^{-4}$ | $10^{-5}$ | $10^{-6}$ | $10^{-7}$ | $10^{-8}$ |
| $\frac{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{2}}{\|{\boldsymbol{x}}_{k}\|_{2}}$ | $3.25\cdot10^{-2}$ | $3.25\cdot10^{-3}$ | $3.26\cdot10^{-4}$ | $3.26\cdot10^{-5}$ | $3.26\cdot10^{-6}$ | $3.26\cdot10^{-7}$ | $3.26\cdot10^{-8}$ | $3.26\cdot10^{-9}$ |
| $\kappa_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ | $3.33\cdot10^{4}$ |
| $\frac{\|\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}\|_{\infty}}{\|{\boldsymbol{x}}_{k}\|_{\infty}}$ | $2.62\cdot10^{-2}$ | $2.82\cdot10^{-3}$ | $2.89\cdot10^{-4}$ | $2.90\cdot10^{-5}$ | $2.90\cdot10^{-6}$ | $2.90\cdot10^{-7}$ | $2.90\cdot10^{-8}$ | $2.90\cdot10^{-9}$ |
| $m_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ | $3.74\cdot10^{1}$ |
| $\left\|\frac{\tilde{\boldsymbol{x}}_{k}-{\boldsymbol{x}}_{k}}{{\boldsymbol{x}}_{k}}\right\|_{\infty}$ | $3.65\cdot10^{2}$ | $7.79\cdot10^{1}$ | $8.38\cdot10^{0}$ | $8.44\cdot10^{-1}$ | $8.44\cdot10^{-2}$ | $8.44\cdot10^{-3}$ | $8.44\cdot10^{-4}$ | $8.44\cdot10^{-5}$ |
| $c_{\rm SCE}^{\mathrm{TTLS},(\ell)}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ | $1.85\cdot10^{6}$ |
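The claim above, that each observed relative error is bounded by the corresponding condition estimation times $\epsilon$, can be checked mechanically against the values reported in Table 4:

```python
# Values transcribed from Table 4 (truncation level k = 1536).
eps_vals = [10.0 ** -p for p in range(1, 9)]
rel_err_2 = [3.25e-2, 3.25e-3, 3.26e-4, 3.26e-5, 3.26e-6, 3.26e-7, 3.26e-8, 3.26e-9]
rel_err_inf = [2.62e-2, 2.82e-3, 2.89e-4, 2.90e-5, 2.90e-6, 2.90e-7, 2.90e-8, 2.90e-9]
comp_err = [3.65e2, 7.79e1, 8.38e0, 8.44e-1, 8.44e-2, 8.44e-3, 8.44e-4, 8.44e-5]
kappa_est, m_est, c_est = 3.33e4, 3.74e1, 1.85e6

for eps, e2, einf, ec in zip(eps_vals, rel_err_2, rel_err_inf, comp_err):
    assert e2 <= kappa_est * eps    # normwise bound holds
    assert einf <= m_est * eps      # mixed bound holds
    assert ec <= c_est * eps        # componentwise bound holds
```

All three bounds hold for every perturbation magnitude in the table, confirming that the estimations are valid (if, in the normwise and componentwise cases, pessimistic) error bounds.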

6. Concluding remarks

In this paper, we study the mixed and componentwise condition numbers of the TTLS problem under the genericity condition. We also consider the structured condition estimation for the STTLS problem and investigate the relationship between the unstructured condition numbers and their structured counterparts. When the TTLS problem degenerates to the untruncated TLS problem, we prove that the condition numbers for the TTLS problem recover the previous condition numbers for the TLS problem from their explicit expressions. Based on SCE, normwise, mixed and componentwise condition estimation algorithms are proposed for the TTLS problem, which can be integrated into the SVD-based direct solver for the TTLS problem. Numerical examples indicate that, in practice, it is preferable to adopt the componentwise perturbation analysis for the TTLS problem, and that the proposed algorithms are reliable and provide a posteriori error estimations of high accuracy. The results in this paper can be extended to the truncated singular value solution of a linear ill-posed problem [2]. We will report our progress on this topic elsewhere.

References

  • [1] M. Baboulin and S. Gratton, A contribution to the conditioning of the total least-squares problem, SIAM J. Matrix Anal. Appl., 32(3) (2011), pp. 685–699.
  • [2] E. H. Bergou, S. Gratton, and J. Tshimanga, The exact condition number of the truncated singular value solution of a linear ill-posed problem, SIAM J. Matrix Anal. Appl., 35(3) (2014), pp. 1073–1085.
  • [3] A. Beck and A. Ben-Tal, A global solution for the structured total least squares problem with block circulant matrices, SIAM J. Matrix Anal. Appl., 27(1) (2005), pp. 238–255.
  • [4] H. Diao and Y. Sun, Mixed and componentwise condition numbers for a linear function of the solution of the total least squares problem, Linear Algebra Appl., 544 (2018), pp. 1–29.
  • [5] H. Diao, Y. Wei, and P. Xie, Small sample statistical condition estimation for the total least squares problem, Numer. Algorithms, 75(2) (2017), pp. 435–455.
  • [6] R. D. Fierro and J. R. Bunch, Collinearity and total least squares, SIAM J. Matrix Anal. Appl., 15(4) (1994), pp. 1167–1181.
  • [7] R. D. Fierro and J. R. Bunch, Perturbation theory for orthogonal projection methods with applications to least squares and total least squares, Linear Algebra Appl., 234(2) (1996), pp. 71–96.
  • [8] R. D. Fierro, G. H. Golub, P. C. Hansen, and D. P. O’Leary, Regularization by truncated total least squares, SIAM J. Sci. Comput., 18(4) (1997), pp. 1223–1241.
  • [9] I. Gohberg and I. Koltracht, Mixed, componentwise, and structured condition numbers, SIAM J. Matrix Anal. Appl., 14(3) (1993), pp. 688–704.
  • [10] G. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, J. SIAM Ser. B Numer. Anal., 2(2) (1965), pp. 205–224.
  • [11] G. H. Golub and C. F. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., 17(6) (1980), pp. 883–893.
  • [12] G. H. Golub and C. F. Van Loan, Matrix Computations, 4th ed., Johns Hopkins University Press, Baltimore, MD, 2012.
  • [13] A. Graham, Kronecker Products and Matrix Calculus: with Applications, Ellis Horwood, London, 1981.
  • [14] S. Gratton, D. Titley-Peloquin, and J. T. Ilunga, Sensitivity and conditioning of the truncated total least squares solution, SIAM J. Matrix Anal. Appl., 34(3) (2013), pp. 1257–1276.
  • [15] N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., SIAM, Philadelphia, PA, 2002.
  • [16] Z. Jia and B. Li, On the condition number of the total least squares problem, Numer. Math., 125(1) (2013), pp. 61–87.
  • [17] J. Kamm and J. G. Nagy, A total least squares method for Toeplitz systems of equations, BIT, 38(3) (1998), pp. 560–582.
  • [18] C. S. Kenney and A. J. Laub, Small-sample statistical condition estimates for general matrix functions, SIAM J. Sci. Comput., 15(1) (1994), pp. 36–61.
  • [19] C. S. Kenney, A. J. Laub, and M. S. Reese, Statistical condition estimation for linear least squares, SIAM J. Matrix Anal. Appl., 19(4) (1998), pp. 906–923.
  • [20] C. S. Kenney, A. J. Laub, and M. S. Reese, Statistical condition estimation for linear systems, SIAM J. Sci. Comput., 19(2) (1998), pp. 566–583.
  • [21] A. J. Laub and J. Xia, Applications of statistical condition estimation to the solution of linear systems, Numer. Linear Algebra Appl., 15 (2008), pp. 489–513.
  • [22] P. Lemmerling and S. Van Huffel, Analysis of the structured total least squares problem for Hankel/Toeplitz matrices, Numer. Algorithms, 27(1) (2001), pp. 89–114.
  • [23] B. Li and Z. Jia, Some results on condition numbers of the scaled total least squares problem, Linear Algebra Appl., 435(3) (2011), pp. 674–686.
  • [24] C. Liu, Y. Wang, and P. Heng, A comparison of truncated total least squares with Tikhonov regularization in imaging by ultrasound inverse scattering, Phys. Med. Biol., 48(15) (2003), pp. 2437–2451.
  • [25] I. Markovsky and S. Van Huffel, Overview of total least-squares methods, Signal Processing, 87(10) (2007), pp. 2283–2302.
  • [26] R. D. Skeel, Scaling for numerical stability in Gaussian elimination, J. ACM, 26(3) (1979), pp. 494–526.
  • [27] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, Philadelphia, PA, 1991.
  • [28] M. Wei, The analysis for the total least squares problem with more than one solution, SIAM J. Matrix Anal. Appl., 13(3) (1992), pp. 746–763.
  • [29] B. Zheng, L. Meng, and Y. Wei, Condition numbers of the multidimensional total least squares problem, SIAM J. Matrix Anal. Appl., 38 (2017), pp. 924–948.
  • [30] B. Zheng and Z. Yang, Perturbation analysis for mixed least squares-total least squares problems, Numer. Linear Algebra Appl., 26(4) (2019), e2239.
  • [31] L. Zhou, L. Lin, Y. Wei, and S. Qiao, Perturbation analysis and condition numbers of scaled total least squares problems, Numer. Algorithms, 51(3) (2009), pp. 381–399.