Morse index and determinant of block Jacobi matrices via optimal control
Abstract
We describe the relation between block Jacobi matrices and discrete-time optimal control problems. Using techniques developed for the continuous case, we provide new algorithms to compute spectral invariants of block Jacobi matrices. Some examples and applications are presented.
1 Introduction
1.1 Motivation
The goal of this paper is to explore an interesting connection between block Jacobi matrices and a class of discrete optimal control problems. This yields effective algorithms for computing the determinant, the negative inertia index and, more generally, the number of eigenvalues below a given threshold for large block Jacobi matrices.
Recall that a block Jacobi matrix is a matrix of the form
(1)
$$J \;=\; \begin{pmatrix} a_1 & b_1 & & \\ b_1 & a_2 & \ddots & \\ & \ddots & \ddots & b_{N-1} \\ & & b_{N-1} & a_N \end{pmatrix},$$
where $a_k$, $b_k$ are symmetric matrices of order $n$. Usually it is also assumed that the off-diagonal blocks $b_k$ are positive symmetric matrices of the same order. Here, we only require them to be non-degenerate, i.e. $\det b_k \neq 0$.
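For later experiments it is convenient to have such matrices at hand. The following is a minimal sketch in Python that assembles a matrix of the form (1) from given lists of blocks; the function name and conventions are ours, not part of the paper.

```python
import numpy as np

def block_jacobi(diag_blocks, off_blocks):
    """Assemble the block tridiagonal matrix (1) from symmetric n x n
    blocks a_1..a_N (diagonal) and b_1..b_{N-1} (off-diagonal)."""
    N, n = len(diag_blocks), diag_blocks[0].shape[0]
    J = np.zeros((N * n, N * n))
    for k, a in enumerate(diag_blocks):
        J[k*n:(k+1)*n, k*n:(k+1)*n] = a
    for k, b in enumerate(off_blocks):
        J[k*n:(k+1)*n, (k+1)*n:(k+2)*n] = b
        J[(k+1)*n:(k+2)*n, k*n:(k+1)*n] = b   # b_k is symmetric here
    return J
```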
Jacobi matrices find applications in numerical analysis [11], statistical physics [12], knot theory [8] and many other areas. Moreover, any symmetric matrix can be put in tridiagonal form using the Householder method [7]. Therefore, understanding their spectral properties is an extremely important topic.
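As a quick illustration of the tridiagonalization claim, here is a sketch assuming SciPy is available: `hessenberg` reduces a matrix by Householder-based orthogonal similarity, and for symmetric input the result is tridiagonal with the same spectrum.

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
S = rng.standard_normal((6, 6))
S = (S + S.T) / 2                          # a generic symmetric matrix

T, Q = hessenberg(S, calc_q=True)          # S = Q T Q^T, Householder-based
assert np.allclose(T, np.triu(np.tril(T, 1), -1))                # T is tridiagonal
assert np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(S)) # same spectrum
```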
In this article we establish a correspondence between Jacobi matrices and the discrete linear quadratic regulator (LQR) problem. We then show how the general machinery of optimal control theory allows us to compute the index and the determinant of such matrices. Similar formulas for determinants have also appeared in [15]; see also [16, 17] for a geometric approach to the problem.
The LQR approach nevertheless has its benefits. In particular, as a small application of our method we give a novel proof of the generalized Euler identity
(2)
using only methods of linear algebra and basic analysis.
In order to state the main results, let us introduce the necessary notation. Consider the following difference equation:
(3)
$$x_{k+1} = A_k x_k + B_k u_k, \qquad k = 0, \dots, N-1,$$
where $A_k$ and $B_k$ are $n \times n$ matrices, the $B_k$ are invertible and $x_0 \in \mathbb{R}^n$ is fixed. The variables $x_k$ are called state variables, while the $u_k$ are the controls. We denote by $x$ and $u$ the vectors having $x_k$ and $u_k$, respectively, as $k$-th component. Note that we will often consider them as elements of the space of functions on a finite set of points with values in $\mathbb{R}^n$, clearly isomorphic to Euclidean spaces of the appropriate dimensions. On the space of solutions of eq. 3 with fixed initial condition we consider the following quadratic functional:
(4)
$$J(u) \;=\; \sum_{k=0}^{N-1} \langle u_k, u_k \rangle \;+\; \sum_{k=1}^{N} \langle Q_k x_k, x_k \rangle,$$
where $Q_k$ are some symmetric matrices.
The problem of minimizing $J$ over all possible discrete trajectories of (3) is the classical linear quadratic regulator problem, which is one of the central problems in optimal control theory. Since we assume that the $B_k$ are invertible, we can resolve the constraints (3) with respect to the controls and plug them into (4). This yields a quadratic form in the state variables alone.
If we assume additionally that $x_0 = 0$ and write down the matrix associated to the quadratic form above, we find that it is given by a block Jacobi matrix (1). Indeed, the substitution produces a symmetric non-degenerate matrix, and the functional is then represented by the symmetric matrix (1) with:
(5)
We now see that to each LQR problem (3)-(4) we can associate a Jacobi matrix. The converse is also true; however, several LQR problems may correspond to the same Jacobi matrix. We can get rid of this ambiguity by fixing the invertible matrices $A_k$ and $B_k$ in advance and letting the remaining data vary. Thus to every LQR problem (3)-(4) one can associate a Jacobi matrix and vice versa, and fixing these matrices beforehand makes the correspondence an affine bijection.
1.2 Main results
Throughout the paper we will make the following assumption. It is equivalent to requiring $\det b_k \neq 0$ in (5).
Assumption 1.
The matrices $A_k$ and $B_k$ in (3) are invertible for every $k$.
Similarly to the case of continuous optimal control problems, one can characterize the extremal points of the functional (4) under the constraints (3) as solutions to a system of difference equations. To be more precise, define the following matrices:
and for the remaining indices we take:
Next, we define matrices:
and
These matrices correspond to the flow of the extremal equations that we will derive in Section 2 using the Lagrange multiplier rule. Now we can formulate the two main results of this paper.
Theorem 1.
Notice that we have reduced the problem of computing the negative inertia index of a large matrix to computing the inertia of matrices of size $2n \times 2n$, which greatly reduces the dimensionality of the problem. In the same spirit, we also prove a formula for the determinant.
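It is useful to keep a brute-force baseline against which formula (6) can be tested. A minimal sketch (ours, not the algorithm of Theorem 1): count the negative eigenvalues directly.

```python
import numpy as np

def negative_index(J, tol=1e-12):
    """Negative inertia index of a symmetric matrix, by brute force.
    Theorem 1 computes the same number from 2n x 2n flow matrices."""
    return int(np.sum(np.linalg.eigvalsh(J) < -tol))

# Example: a 2x2-block Jacobi matrix with N = 3 blocks,
# using block_jacobi from the sketch in the introduction.
a = [np.diag([1.0, -2.0])] * 3
b = [np.eye(2)] * 2
print(negative_index(block_jacobi(a, b)))
```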
Theorem 2.
Let us rephrase the two statements in terms of Jacobi matrices of the form (1). We can choose suitable matrices $A_k$ and $B_k$ for all $k$ and obtain the following relations from eq. 5:
Corollary 1.
Remark 1.
The techniques used in our proofs are deeply connected to symplectic geometry. One can extend them to much more general settings [3, 4, 6]. We have chosen to keep the exposition as simple as possible and minimize the use of symplectic language in this work. In this way, both formulas (6) and (7) can be applied directly.
The article has the following structure. In Section 2 we derive the Hamiltonian system for the extremal curves using the end-point map and the Lagrange multiplier rule. In Section 3 we prove Theorem 1, and in Section 4 Theorem 2. Finally, in Section 5, we consider some examples and prove the generalized Euler identity (2).
2 Eigenvalues of Jacobi matrices and the Lagrange multiplier rule
In order to compute the index and the determinant of a Jacobi matrix in (1) we will use the flow of a certain system of difference equations. First, though, we need some notions from optimal control theory.
The first object we introduce is the endpoint map:
This map takes a control $u$ and gives the solution of the difference equation (3) with initial condition $x_0 = 0$ evaluated at step $N$. We can write this map explicitly by iterating (3).
(8)
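A minimal sketch of this iteration in Python, assuming the dynamics takes the form $x_{k+1} = A_k x_k + B_k u_k$ as in (3) above (names are ours):

```python
import numpy as np

def end_point(As, Bs, us, x0=None):
    """Evaluate the end-point map: iterate the (assumed) dynamics
    x_{k+1} = A_k x_k + B_k u_k and return the final state x_N."""
    x = np.zeros(As[0].shape[0]) if x0 is None else x0
    for A, B, u in zip(As, Bs, us):
        x = A @ x + B @ u
    return x
```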
In particular, from formula (8) it is clear that the value of $x_k$ depends only on the first $k$ components of the control. It thus makes sense to introduce the following filtration (i.e. flag of subspaces) in the space of controls. Let $\mathcal{U}$ denote the space of controls and define:
(9)
Denote by $\pi_k$ the orthogonal projection onto $\mathcal{U}_k$ with respect to the standard Euclidean scalar product. We have:
Under Assumption 1, we can resolve the constraints quite easily. This will play an important role in all of the proofs. Indeed, consider the map
defined as
(10)
It is straightforward to check, using eq. 8, that:
(11)
We can identify this map with a row block-matrix.
As we have seen in the previous section, a Jacobi matrix can be interpreted as the functional in eq. 4 restricted to this subspace. We would like to find the eigenvalues of this restriction, i.e. of the corresponding Jacobi matrix $J$.
By definition, $\lambda$ is an eigenvalue of $J$ if and only if the bilinear form $J - \lambda$ is degenerate on the subspace. Since the form is quadratic and the domain is a linear space, this is equivalent to finding critical points of the restriction. This is a typical task in constrained optimization and can be done via the Lagrange multiplier rule. Notice that $\lambda$ is not an eigenvalue of $J$ if and only if the corresponding forms are all non-degenerate.
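This degeneracy criterion also explains the eigenvalue-counting application mentioned in the introduction: the number of eigenvalues of $J$ below a threshold $\lambda$ equals the negative inertia index of $J - \lambda I$. A short check, as a sketch with hypothetical inputs:

```python
import numpy as np

J = np.diag([3.0, -1.0, 0.5, 2.0])          # stand-in for a Jacobi matrix
lam = 1.0
count = int(np.sum(np.linalg.eigvalsh(J) < lam))
index = int(np.sum(np.linalg.eigvalsh(J - lam * np.eye(4)) < 0))
assert count == index == 2
```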
Let us multiply the form by an invertible factor, an operation that clearly preserves the kernel. After we introduce a new parameter $\lambda$, we obtain a family of functionals
The following result is a discrete version of the Pontryagin maximum principle for LQR problems, which characterizes critical points of the constrained variational problem as solutions to a Hamiltonian system.
Lemma 1.
The quadratic form is degenerate on the constraint subspace if and only if the boundary value problem
(12)
with $x_0 = 0$ and $p_N = 0$ has a non-trivial solution. Moreover, the dimensions of the space of solutions of eq. 12 and of the kernel of the form coincide.
The former quadratic form is degenerate if and only if $\lambda$ is an eigenvalue of the restriction to the constraint subspace of:
Proof.
If $u$ is a critical point of the functional restricted to the subspace, there exists a Lagrange multiplier such that
(13)
Since the end-point map is linear, it coincides with its differential. We will often use the latter notation in the formulas that follow.
In order to obtain a discrete analogue of the Hamiltonian system of PMP, we need to introduce a family of functionals
and their differentials:
(14)
These functionals differ from one another only by the range of summation, and the last one clearly coincides with the original functional. We look for multipliers which satisfy:
(15)
Using the explicit formula (8) for the end-point map we find the recurrence relation
(16)
for all $k$. From here, we can already derive an expression for the components of the control. Indeed, assume that the components of the control vanish beyond a given step. In this case
Hence, if we substitute such a control in (15), then using (14) and (16) we obtain the control law
(17)
which we can plug into (3).
The next step is to obtain a discrete equation for the multipliers. To do this we compare consecutive multipliers by subtracting (15) from the same formula with the index shifted by one. Using a suitable test variation and formulas (14) and (16) we find that:
Since by our assumption the matrices $B_k$ are invertible for all $k$, the end-point map is surjective, as can be seen from the explicit expression (8). Thus the term in brackets must vanish.
Collecting everything gives eq. 12. The boundary conditions for this system come from the fact that we have to add the final equation $p_N = 0$ and that we are assuming $x_0 = 0$.
Now, notice that the initial covector of the lift uniquely determines the whole trajectory and the control. This means that the correspondence is a bijection between the space of solutions of the boundary value problem and the kernel of the quadratic form. ∎
Under Assumption 1, we can rewrite system (12) as a forward equation. To do so, use the control law (17), multiply the second equation by the appropriate invertible matrix and plug the resulting expression into the first equation. This gives
(18)
We thus have
where
Both matrices in the product are symplectic, which makes their product symplectic as well.
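Symplecticity here means $M^{T}\Omega M = \Omega$ for the standard form $\Omega$. A small helper to check this numerically (our own utility, not from the paper):

```python
import numpy as np

def is_symplectic(M, tol=1e-10):
    """Check M^T * Omega * M == Omega for the standard symplectic form."""
    n = M.shape[0] // 2
    Omega = np.block([[np.zeros((n, n)), np.eye(n)],
                      [-np.eye(n), np.zeros((n, n))]])
    return np.allclose(M.T @ Omega @ M, Omega, atol=tol)

# e.g. the shear [[I, S], [0, I]] with S symmetric is symplectic:
S = np.array([[2.0, 1.0], [1.0, 3.0]])
M = np.block([[np.eye(2), S], [np.zeros((2, 2)), np.eye(2)]])
assert is_symplectic(M)
```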
Define the flow up to the point as
When the flow starts from the initial step, we write a single index. Also, in accordance with the notation given in the introduction, we write:
3 A recursive formula for the index
In this section we prove Theorem 1. The tools we will employ are essentially taken from linear algebra. The same kind of ideas has been applied in [4] to reformulate problems of intersection theory for paths in the Lagrange Grassmannian in terms of finite-dimensional linear algebra. The classical approach to second-order minimality conditions rephrases the problem of counting the number of negative eigenvalues of the second variation (the counterpart of our quadratic form) as counting the intersection number of a curve in a finite-dimensional manifold, known as the Lagrange Grassmannian, with an appropriate submanifold. For further references on the continuous approach see for instance [9, 14, 1] and references therein.
In contrast to the continuous case, here we work only with ordered sets of points constructed from the solutions of eq. 12. Still, all the information about the index can be recovered from them.
We will use the following lemma:
Lemma 2.
Suppose that a quadratic form $Q$ is defined on a finite-dimensional vector space $W$ and a subspace $V \subseteq W$ is given. Denote by $V^{\perp Q}$ the orthogonal subspace to $V$ with respect to $Q$, i.e.:
$$V^{\perp Q} = \{\, w \in W \;:\; Q(v, w) = 0 \text{ for all } v \in V \,\}.$$
Denote by $\operatorname{ind}^{-} Q$ the number of negative eigenvalues of $Q$; then:
$$\operatorname{ind}^{-} Q \;=\; \operatorname{ind}^{-}\big(Q|_{V}\big) + \operatorname{ind}^{-}\big(Q|_{V^{\perp Q}}\big) + \dim\big(V \cap V^{\perp Q}\big) - \dim\big(V \cap \ker Q\big).$$
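A numerical sanity check of this identity, as a sketch: with generic random data the two correction terms vanish and the index is simply additive.

```python
import numpy as np
from scipy.linalg import null_space

def ind(form, tol=1e-10):                     # negative inertia index
    return int(np.sum(np.linalg.eigvalsh(form) < -tol))

rng = np.random.default_rng(1)
Q = rng.standard_normal((8, 8)); Q = (Q + Q.T) / 2
V = rng.standard_normal((8, 3))               # columns span V
W = null_space(V.T @ Q)                       # columns span the Q-orthogonal of V

# generically V ∩ V^⊥Q = 0 and ker Q = 0, so the correction terms are zero
assert ind(Q) == ind(V.T @ Q @ V) + ind(W.T @ Q @ W)
```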
We will apply the formula of Lemma 2 iteratively to the subspaces defined in (9), which provide a filtration of our space of variations. One has a natural filtration of the complements as well. Indeed, if we extend the control by zero, we get from (8)
Notice also that each subspace of the filtration has codimension $n$ in the next one.
Thus, iteration of this formula gives:
What is left to do is to express every term using the discrete Hamiltonian system given in eq. 12.
First of all, let us describe the relevant subspaces.
Lemma 3.
Let the flow be defined as above and let $\Pi$ be the vertical subspace
Then we have the following identifications:
Proof.
A direct consequence of the definitions of the two subspaces is the first identification. Lemma 1 identifies the elements of the kernel with solutions of equation (12) with zero initial condition. In terms of the corresponding flow we get:
Similarly, applying the flow gives
Combining this with the previous formula proves the second isomorphism in the statement. ∎
Notice that the previous lemma holds for a general LQR problem. In this particular case, the subspace is transversal to the fibre. Indeed, from the explicit expression of the flow, transversality can only fail if one of the coefficient matrices is degenerate, which is never the case under Assumption 1. Thus
The next step is to compute the index of the restriction. To do so, consider the standard symplectic form on $\mathbb{R}^{2n}$:
$$\sigma\big((x, p), (y, q)\big) = \langle p, y \rangle - \langle q, x \rangle.$$
A plane $\Lambda \subset \mathbb{R}^{2n}$ is called Lagrangian if $\sigma|_{\Lambda} = 0$ and $\dim \Lambda = n$, i.e., it is a maximally isotropic subspace.
The Maslov form of a triple of Lagrangian spaces is a symmetric quadratic form on defined as follows. Let where . Then
Lemma 4.
The negative index of restricted to coincides with the negative index of .
Proof.
An element belongs to if and only if it is of the form:
where is defined in (10). Hence we can identify with .
It follows that an element belongs to if and only if for all :
where the -scalar product is defined on functions on a finite set with values in . Here we used that since .
We have thus shown that if then
for all functions .
Let us now simplify the form using this expression. Take an element . Then
where we used the characterization given in the previous lemma and performed the substitution. We now rewrite each term using the symplectic form. In particular, recall the following relations, which are consequences of (17) and (3):
This implies that:
Similarly, the remaining terms can be expressed in terms of the symplectic form.
We thus get the following expression after substituting the previous relations and grouping the terms:
On the other hand, let us consider the Maslov form and write down its explicit expression. We have:
(19)
Thus the value of the form is determined by:
(20)
Here, in the second equality we substituted from (3), and in the third equality we used the same equation again. It follows that the negative inertia index of the restricted form is the same as the negative inertia index of the Maslov form. ∎
To arrive at formula (6) we need to write down explicitly the kernel of the Maslov form in terms of the initial covector. We have
Moreover, the kernel is directly determined by the solution of the Hamiltonian system, and thus by the initial covector, as:
After inserting this into the explicit expression of the Maslov form given in (20), we find that the following quadratic form is the one to consider:
Now each term in the formula is identified, and collecting all the pieces together finishes the proof of Theorem 1.
4 Proof of the determinant formula
In this section we prove Theorem 2. Consider the quadratic form introduced above. As discussed in the introduction, we can pass from control variables to state variables. This gives the Jacobi matrix once we impose the boundary conditions. This matrix decomposes as the sum of two terms: a diagonal matrix and a tridiagonal one given by:
Notice, moreover, that the diagonal matrix is positive definite, since it is a multiple of a positive quadratic form. For this reason we can define its square root, which is again positive definite and symmetric. We will use this observation to relate the determinant of the Jacobi matrix to the characteristic polynomial of a suitable matrix.
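For completeness, such a square root can be computed from the spectral decomposition; a minimal sketch:

```python
import numpy as np

def sym_sqrt(P):
    """Unique symmetric positive definite square root of an SPD matrix P."""
    w, U = np.linalg.eigh(P)
    return U @ np.diag(np.sqrt(w)) @ U.T

P = np.array([[2.0, 1.0], [1.0, 2.0]])
R = sym_sqrt(P)
assert np.allclose(R @ R, P)
```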
Lemma 5.
The function is a polynomial of the appropriate degree and, in particular, it is a multiple of the characteristic polynomial.
Proof.
The first thing to notice is that the entries of the flow are polynomials of bounded degree, since the flow is a product of matrices whose entries are affine in $\lambda$. This bounds the degree of the determinant as well.
This bound is exactly the dimension of the space of controls. We know by Lemma 1 that this polynomial has at least that many roots. This is because our perturbation sees every eigenvalue except possibly one, whose eigenspace corresponds to the kernel of the form. In particular, when the matrices involved are all invertible, the polynomial has full degree, vanishes on the same set as the characteristic polynomial, and thus the two are multiples of one another. In general we have:
Order the non-zero eigenvalues. Since the polynomial has the corresponding roots different from zero and a root of appropriate order at zero, we can rewrite the formula as:
Thus the two polynomials have the same degree and vanish on the same set, and therefore they must be multiples of one another. ∎
By Lemma 5 there exists a constant such that
The constant can be computed by evaluating both sides at a convenient value of $\lambda$. Using induction we can prove that
(21)
Hence
Now it only remains to find the last factor. Recall that it corresponds to the quadratic form introduced above.
Given a trajectory of (3) with zero initial condition, we can always reconstruct the control. This gives a bijection between controls and trajectories which can be written using a block-triangular matrix
Now we wish to use the map (see eq. 10) to pull back the quadratic form:
seen as a quadratic form on the space of trajectories. To do so we simply need to resolve for the controls using (11) and plug the result into the form above. This gives the desired form.
We can now change coordinates and work with the state variables. One can see that the form can be written as a symmetric matrix of the form:
So if we want to compute its determinant, we need to find the determinants of the two factors. We clearly have
So it only remains to find the second determinant.
Notice that, since the map is a surjection, it has a large kernel. Any element of the kernel is an eigenvector with eigenvalue one of the operator in question. So the determinant can be computed by multiplying the remaining eigenvalues. We claim that these coincide with the eigenvalues of the corresponding lower-dimensional operator. Indeed, if $\mu$ is such an eigenvalue, then
Applying the map to both sides gives
Hence we obtain that:
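The eigenvalue argument above is an instance of a standard linear algebra fact, which we record for convenience: for any $m \times n$ matrix $V$, the non-zero eigenvalues of $V^{T}V$ and $VV^{T}$ coincide with multiplicities, as follows from Sylvester's determinant identity
$$\det\big(I_m - t\,VV^{T}\big) \;=\; \det\big(I_n - t\,V^{T}V\big).$$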
5 Some examples and applications
5.1 Tridiagonal Toeplitz Matrices
Let us compute the determinant of a symmetric tridiagonal Toeplitz matrix
(23)
$$T_N \;=\; \begin{pmatrix} a & b & & \\ b & a & \ddots & \\ & \ddots & \ddots & b \\ & & b & a \end{pmatrix} \in \mathbb{R}^{N \times N},$$
where $a, b \in \mathbb{R}$ and $b \neq 0$.
Recall that the eigenvalues of $T_N$ are explicitly known [13] and are of the form
$$\lambda_k = a + 2b \cos\left(\frac{\pi k}{N+1}\right), \qquad k = 1, \dots, N.$$
So we already have a way of computing the determinant of the matrix as the product of all the eigenvalues:
(24)
$$\det T_N \;=\; \prod_{k=1}^{N} \left( a + 2b \cos\frac{\pi k}{N+1} \right).$$
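A quick numerical confirmation of (24), as a sketch with arbitrary sample parameters:

```python
import numpy as np

a, b, N = 2.0, -0.7, 12
T = a * np.eye(N) + b * (np.eye(N, k=1) + np.eye(N, k=-1))
k = np.arange(1, N + 1)
det_by_formula = np.prod(a + 2 * b * np.cos(np.pi * k / (N + 1)))  # formula (24)
assert np.isclose(det_by_formula, np.linalg.det(T))
```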
Let us now compute the same determinant using the control-theoretic approach.
Using formulas (5), we reformulate the problem of computing $\det T_N$ as an LQR problem. We choose:
In particular, this immediately implies that
After comparing with (7), we see that only the flow matrix needs to be computed. In this particular case
Denote
The eigenvalues of this matrix are given by
(25)
Assume that $a^2 \neq 4b^2$, so that the eigenvalues are distinct. Then we can diagonalize the flow using the matrix
as can be verified by a direct computation. Thus
Once all of the matrices are multiplied, one can take the block which lies below the diagonal and use formula (7). This way, after some simplification, we obtain an expression which can be written compactly in terms of the two eigenvalues
(26)
If $a^2 = 4b^2$, then the two eigenvalues coincide and, using the continuity of the determinant, this formula reduces to
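For cross-checking (26), recall the classical closed form (a standard fact about tridiagonal Toeplitz determinants, stated here in our notation $T_N$): the determinants satisfy the three-term recurrence $D_N = a D_{N-1} - b^2 D_{N-2}$ with $D_0 = 1$, $D_1 = a$, whose characteristic roots give
$$\det T_N \;=\; \frac{\mu_+^{N+1} - \mu_-^{N+1}}{\mu_+ - \mu_-}, \qquad \mu_\pm = \frac{a \pm \sqrt{a^2 - 4b^2}}{2},$$
with the limiting value $\det T_N = (N+1)\,(a/2)^N$ when $a^2 = 4b^2$.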
5.2 Generalized Euler’s identity
The formula for the determinant of a tridiagonal matrix has been known for a long time. However, we would like to underline that the control-theoretic approach can be conceptually advantageous. To illustrate this claim, we now prove the generalized Euler identity (2).
Notice that we recover the classical Euler identity for a special choice of the parameters:
$$\frac{\sin z}{z} \;=\; \prod_{n=1}^{\infty} \left( 1 - \frac{z^{2}}{\pi^{2} n^{2}} \right).$$
The identity (2) appeared in [2]. Here we give a proof which uses only elementary tools from linear algebra and analysis. The idea is to take a certain sequence of tridiagonal Toeplitz matrices of increasing size and to compute the asymptotic expansion of their determinants in two different ways: using formulas (24) and (26).
We start with the following family of one-dimensional LQR problems
Morally, we want to compute the determinant of the quadratic functional restricted to the curves which satisfy the constraints. The difficulty here is that the functional in question is a quadratic form on an infinite-dimensional space. We will compute the determinant using finite-dimensional approximations coming from discretizations of this continuous LQR problem with time step $1/N$. We consider discrete LQR problems of the form
In order to apply our formula, we introduce a new control function, which gives an equivalent optimal control problem:
Hence we get as parameters of the LQR problem
We can now use formulas (5) to show that we are actually computing the determinants of the tridiagonal Toeplitz matrices from the previous subsection with parameters:
Let us denote those matrices by $T_N$.
As discussed previously, we want to compute the determinant in two different ways and then take the limit $N \to \infty$. Foreshadowing the computation a bit, the sequence of determinants does not converge. So instead we compute the limit
There are two motivations for using exactly this ratio. First of all, we can notice from formula (26) that if two systems have the same matrices, then the ratio of the corresponding determinants depends only on the associated discrete flows. Since we started with an approximation of a continuous system, it is natural to expect that the flows converge, and in this way we obtain a finite quantity. Another reason is the celebrated Gelfand-Yaglom formula in quantum mechanics [10], which computes the ratio of the functional determinants of a 1D quantum particle in a potential and of a free 1D quantum particle as a ratio of determinants of two finite-dimensional matrices.
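To make the limiting procedure concrete, here is a small numerical experiment in the spirit of the Gelfand-Yaglom formula (a sketch under our own choice of discretization, the standard Dirichlet finite-difference scheme for $-\frac{d^{2}}{dt^{2}} - z^{2}$ on $[0,1]$, not necessarily the matrices $T_N$ above): the ratio of determinants converges to $\sin z / z$.

```python
import numpy as np

def dirichlet_matrix(z, N):
    """Finite-difference discretization (scaled by h^2, h = 1/N) of
    -d^2/dt^2 - z^2 on [0, 1] with Dirichlet boundary conditions."""
    m = N - 1
    return ((2 - (z / N) ** 2) * np.eye(m)
            - np.eye(m, k=1) - np.eye(m, k=-1))

z = 1.3
for N in (10, 100, 1000):
    ratio = (np.linalg.det(dirichlet_matrix(z, N))
             / np.linalg.det(dirichlet_matrix(0.0, N)))
    print(N, ratio)                     # tends to sin(z)/z ≈ 0.7412
print(np.sin(z) / z)
```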
Let us first use formula (26) to compute the ratio. The two eigenvalues of the flow matrix now depend on $N$. To keep formulas clean, we omit the dependence on $N$ in our notation and indicate explicit dependence through a subscript or superscript. We also fix the signs of the parameters; other sign choices will be a direct consequence of (2) itself.
First note that is independent of . If we write , then the formula (25) for eigenvalues reads as
Formula (26) in this notation reads as
Plugging in the formulas for the two eigenvalues, we find that
Hence
Similarly we find that
Hence
and
Thus after collecting everything together, we find
which is exactly the left side of (2).
Now we use formula (24) to compute the same ratio. We get
Expanding and simplifying the denominator gives us
Hence
Changing the summation index gives
Since , we have the following inequality:
Thus for any $\varepsilon > 0$ there exists $N_0$ such that for all $N \geq N_0$:
(27)
Define the following partial product:
References
- [1] A. A. Agrachëv. Quadratic mappings in geometric control theory. In Problems in geometry, Vol. 20 (Russian), Itogi Nauki i Tekhniki, pages 111–205. Akad. Nauk SSSR, Vsesoyuz. Inst. Nauchn. i Tekhn. Inform., Moscow, 1988. Translated in J. Soviet Math. 51 (1990), no. 6, 2667–2734.
- [2] A. A. Agrachev. Spectrum of the second variation. Tr. Mat. Inst. Steklova, 304 (Optimal'noe Upravlenie i Differentsial'nye Uravneniya):32–48, 2019.
- [3] Andrei Agrachev and Ivan Beschastnyi. Jacobi fields in optimal control: Morse and Maslov indices. Nonlinear Anal., 214:Paper No. 112608, 47, 2022.
- [4] A. Agrachev, S. Baranzini, and I. Beschastnyi. Index theorems for graph-parametrized optimal control problems. Nonlinearity, submitted.
- [5] Kerstin Ammann and Gerald Teschl. Relative oscillation theory for Jacobi matrices. arXiv:0810.5648, 2008.
- [6] Stefano Baranzini. Functional determinants for the second variation. Submitted to Journal of Differential Equations, 2022.
- [7] Richard L. Burden, J. Douglas Faires, and Albert C. Reynolds. Numerical analysis. Prindle, Weber & Schmidt, Boston, Mass., 1978.
- [8] Stephan D. Burton. The determinant and volume of 2-bridge links and alternating 3-braids. New York J. Math., 24:293–316, 2018.
- [9] J. J. Duistermaat. On the Morse index in variational calculus. Advances in Math., 21(2):173–195, 1976.
- [10] Gerald Dunne. Functional determinants in quantum field theory. Journal of Physics A: Mathematical and Theoretical, 41, 2007.
- [11] Sondre Tesdal Galtung and Katrin Grunert. A numerical study of variational discretizations of the Camassa-Holm equation. BIT, 61(4):1271–1309, 2021.
- [12] Alfred Hucht. The square lattice Ising model on the rectangle III: Hankel and Toeplitz determinants. J. Phys. A, 54(37):Paper No. 375201, 29, 2021.
- [13] Said Kouachi. Eigenvalues and eigenvectors of tridiagonal matrices. Electron. J. Linear Algebra, 15:115–133, 2006.
- [14] Yiming Long. Index Theory for Symplectic Paths with Applications. Progress in Mathematics, vol. 207. Birkhäuser, Basel, 2002.
- [15] Luca Guido Molinari. Determinants of block tridiagonal matrices. Linear Algebra Appl., 429(8-9):2221–2226, 2008.
- [16] Hermann Schulz-Baldes. Sturm intersection theory for periodic Jacobi matrices and linear Hamiltonian systems. Linear Algebra and its Applications, 436(3):498–515, 2012.
- [17] Hermann Schulz-Baldes and Liam Urban. Space versus energy oscillations of Prüfer phases for matrix Sturm-Liouville and Jacobi operators. Electronic Journal of Differential Equations, 2020(76):1–23, 2020.