Distributed Optimal Control of Graph Symmetric Systems via Graph Filters
Abstract
Designing distributed optimal controllers subject to communication constraints is a difficult problem unless structural assumptions are imposed on the underlying dynamics and information exchange structure, e.g., sparsity, delay, or spatial invariance. In this paper, we borrow ideas from graph signal processing and define and analyze a class of Graph Symmetric Systems (GSSs), which are systems that are symmetric with respect to an underlying graph topology. We show that for linear quadratic problems subject to dynamics defined by a GSS, the optimal centralized controller is given by a novel class of graph filters with transfer function valued filter taps and can be implemented via distributed message passing. We then propose several methods for approximating the optimal centralized graph filter by a distributed controller only requiring communication with a small subset of neighboring subsystems. We further provide stability and suboptimality guarantees for the resulting distributed controllers. Finally, we empirically demonstrate that our approach allows for a principled tradeoff between communication cost and performance while guaranteeing stability. Our results can be viewed as a first step towards bridging the fields of distributed optimal control and graph signal processing.
I Introduction
Computing a distributed optimal controller in which subcontrollers have access to subsets of global system information is in general a computationally intractable problem. Indeed, even when restricted to quadratic costs, Gaussian noise, and linear dynamics, the resulting optimal controller can be nonlinear and difficult to compute \citep{witsenhausen1968counterexample}. Nevertheless, significant progress has been made in distributed optimal controller synthesis over the past two decades by identifying structural assumptions on the underlying dynamics and information exchange structure such that the resulting distributed controller synthesis problem is convex.
One such structural assumption that has been shown to lead to tractable distributed optimal control problems is spatial invariance \citep{bamieh2002distributed} (and other closely related notions of symmetry \citep{massioni2009distributed}). Such systems are invariant under subsystem permutations, and have been shown to have optimal centralized controllers that are approximately distributed. In particular, this allows for distributed controllers that enjoy stability and near-optimality guarantees to be computed by appropriately truncating the centralized controller.
Contributions: In this paper, inspired by results from graph signal processing, we introduce the notion of Graph Symmetric Systems (GSSs), which are systems that are symmetric with respect to an underlying graph topology (formalized in §II-A). We show that for such systems, the resulting Linear Quadratic (LQ) centralized optimal controller admits an efficient message passing implementation in the form of a novel class of graph filters defined by transfer function filter taps. We subsequently propose and analyze two complementary approaches to computing near-optimal distributed controllers by truncating the centralized optimal controller subject to stability constraints. By leveraging tools from robust System Level Synthesis (SLS) \citep{matni2017scalable, anderson_system_2019}, we show that these truncation algorithms can be solved via convex optimization, and that the resulting distributed controllers enjoy sub-optimality guarantees relative to the centralized optimal controller. These results constitute an important first step towards bridging the complementary, but traditionally disparate, fields of distributed optimal control and graph signal processing.
Related work: An alternative structural assumption for tractable distributed optimal control of linear systems can be specified in terms of the sparsity and delay patterns of the control system. In particular, it is possible to characterize conditions on the sparsity and delay patterns of the information exchanged between subcontrollers, relative to the propagation of signals through sparse and delayed distributed plants, such that distributed optimal control is tractable. The seminal paper [rotkowitz2005characterization] introduced the notion of quadratic invariance,\footnote{We note that spatially invariant systems, as defined in [bamieh2002distributed], also satisfy quadratic invariance. We show in Appendix B that graph symmetric systems and controllers lead to optimal control problems satisfying quadratic invariance.} which built upon and generalized funnel causality \citep{bamieh2005convex}, and showed that so long as subcontrollers could communicate as quickly as control signals propagated through the plant, the resulting distributed optimal control problem could be solved via convex optimization. This convex parameterization of sparse and delayed controllers has since been further generalized in the System Level Synthesis (SLS) \citep{anderson_system_2019} and Input-Output Parameterization (IOP) \citep{furieri2019input} frameworks, which allow for even richer classes of sparsity and delay patterns to be imposed on distributed controllers.
A related class of distributed controllers are those based on Graph Neural Networks (GNNs). GNNs can be viewed as graph filters followed by pointwise nonlinear activation functions \citep{Ruiz2021-GNN}, and, among other favorable properties, enjoy stability to graph perturbations \citep{Gama2020-Stability}. While recent use of GNNs for distributed control has shown promise \citep{Gama2022-DistributedLQR, Gama2022-ControlGNN, yang2021communication}, such results currently lack strong guarantees of stability. We believe the results in this paper are a first step towards addressing this gap in the literature, by explicitly connecting graph filters and distributed optimal controllers. The direct relationship between graph filters and GNNs suggests that understanding the former will give insight into the effects of the latter.
Notation: We use upper- and lower-case letters such as $A$ and $x$ to denote matrices and vectors respectively, although lower-case letters might also be used for scalars or functions (the distinction will be apparent from the context). For both upper- and lower-case letters, we use boldface such as $\mathbf{A}$ and $\mathbf{x}$ to denote transfer matrices or vector/scalar transfer functions.
II The Linear Quadratic Regulator Problem for Graph Symmetric Systems
Consider a discrete-time linear time-invariant (LTI) system composed of $N$ interconnected scalar subsystems, each with state $x_i(t) \in \mathbb{R}$ and control input $u_i(t) \in \mathbb{R}$, which evolves under the dynamics

$x_i(t+1) = \sum_{j=1}^{N} [A]_{ij}\, x_j(t) + \sum_{j=1}^{N} [B]_{ij}\, u_j(t) + w_i(t), \qquad (1)$

for suitable coefficients $[A]_{ij}, [B]_{ij}$ describing the interaction between subsystems. Here $w_i(t)$ is an i.i.d. zero-mean noise. We can compactly express the dynamics of the full system in terms of the joint state $x(t) = [x_1(t), \dots, x_N(t)]^\top$ and joint control action $u(t) = [u_1(t), \dots, u_N(t)]^\top$ as

$x(t+1) = A\, x(t) + B\, u(t) + w(t), \qquad (2)$

where $A, B \in \mathbb{R}^{N \times N}$ are defined such that the global dynamics (2) are consistent with the subsystem dynamics (1).
Our goal is to find a (potentially time-varying) state-feedback controller that minimizes the cost
$J = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}\left[ \sum_{t=0}^{T-1} x(t)^\top Q\, x(t) + u(t)^\top R\, u(t) \right], \qquad (3)$

where $Q \succeq 0$ and $R \succ 0$ are known symmetric matrices. The Linear Quadratic Regulator (LQR) problem is then given by

$\min_{u(0), u(1), \dots} \; J \quad \text{subject to the dynamics (2)}. \qquad (4)$

In the centralized setting where each subsystem has access to the global state, it is well-known that the controller that solves (4) is a linear static controller $u(t) = -K x(t)$, where $K = (R + B^\top P B)^{-1} B^\top P A$ and $P$ is the unique positive semidefinite solution to the discrete-time algebraic Riccati equation:

$P = A^\top P A - A^\top P B \left( R + B^\top P B \right)^{-1} B^\top P A + Q. \qquad (5)$
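As a minimal numerical sketch of (5), the centralized solution can be computed by iterating the Riccati map to a fixed point; the matrices below are illustrative choices (not from the paper), with a fully actuated system for simplicity:

```python
import numpy as np

# Illustrative 3-subsystem example; A, B, Q, R are stand-in choices.
rng = np.random.default_rng(0)
n = 3
A = 0.5 * rng.standard_normal((n, n))
B = np.eye(n)          # fully actuated, so (A, B) is trivially stabilizable
Q = np.eye(n)
R = np.eye(n)

# Solve the discrete-time algebraic Riccati equation (5) by fixed-point iteration.
P = np.eye(n)
for _ in range(500):
    P = A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A) + Q

# Optimal static feedback gain: u(t) = -K x(t).
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# The closed loop A - B K is stable (spectral radius < 1).
rho = max(abs(np.linalg.eigvals(A - B @ K)))
```

In practice a dedicated solver (e.g. a Schur-based DARE routine) would be preferred; the iteration above merely mirrors equation (5) directly.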
In this work, we consider a distributed variant of (4) where each subsystem can only exchange information with a small subset of subsystems. Specifically, this communication constraint is encoded as a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, \dots, N\}$ is the set of components (nodes) and $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is the set of the corresponding interconnections (edges). It is assumed that the graph is undirected, i.e. $(i,j) \in \mathcal{E}$ if and only if $(j,i) \in \mathcal{E}$. As described in the introduction, general information exchange constraints can lead to non-convex optimal control problems \citep{witsenhausen1968counterexample}. However, as we show later, under suitable graph symmetry assumptions on the dynamics matrices $A, B$ and the cost matrices $Q, R$, the optimal centralized controller admits a distributed message passing implementation allowing for a principled tradeoff between communication complexity and controller performance.
In the rest of this section, we borrow ideas from graph signal processing \citep{Sandryhaila2013-DSPG} and introduce the notion of graph symmetric systems. First, we introduce a convenient way to define operations that respect the underlying communication graph structure via the graph matrix description (GMD) $S \in \mathbb{R}^{N \times N}$. The matrix $S$ is such that the entry $[S]_{ij}$ is zero whenever there is no connection between components $i$ and $j$, i.e. $[S]_{ij} = 0$ if $i \neq j$ and $(i,j) \notin \mathcal{E}$. Note that, since the graph is undirected, the matrix $S$ is symmetric. Therefore, it has an eigendecomposition $S = V \Lambda V^\top$ in terms of an orthonormal basis of eigenvectors, where $\Lambda$ is a diagonal matrix with elements $\lambda_i$ such that $S v_i = \lambda_i v_i$ for $v_i$ being the $i$-th column of $V$. We now introduce the notion of a graph symmetric system.
Definition 1 (Graph Symmetric System).
Given a GMD $S = V \Lambda V^\top$ for a graph $\mathcal{G}$, a linear system (2) is graph symmetric with respect to $S$ if the dynamics matrices are simultaneously diagonalized by $V$, i.e.,

$A = V \Lambda_A V^\top, \qquad B = V \Lambda_B V^\top, \qquad (6)$

where $\Lambda_A, \Lambda_B$ are diagonal.
Note that Definition 1 does not require the dynamics to be sparse. In fact, matrices $A$ and $B$ of the form in Definition 1 can be arbitrarily dense, i.e., the evolution of a subsystem state $x_i$ can depend on subsystems that cannot directly communicate with subsystem $i$ \citep{Gama2019-LinearControl}. This is distinct from the sparsity/delay constraints used in [rotkowitz2005characterization, anderson_system_2019, furieri2019input], and encodes a different notion of symmetry than that exploited in the distributed control of spatially invariant systems \citep{bamieh2002distributed}.
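The density remark can be made concrete with a small sketch that constructs a graph symmetric system per Definition 1 from the eigenvectors of a sparse GMD; the path-graph GMD and random spectra below are illustrative assumptions:

```python
import numpy as np

# A graph symmetric system per Definition 1 need not be sparse.
# S is the GMD of a path graph on 5 nodes (sparse, symmetric); A and B share
# its eigenvectors but are generically dense.
N = 5
S = np.zeros((N, N))
for i in range(N - 1):                       # path graph adjacency
    S[i, i + 1] = S[i + 1, i] = 1.0

lam, V = np.linalg.eigh(S)                   # S = V diag(lam) V^T, V orthonormal

rng = np.random.default_rng(1)
Lam_A = np.diag(rng.standard_normal(N))      # arbitrary diagonal spectra
Lam_B = np.diag(rng.standard_normal(N))
A = V @ Lam_A @ V.T                          # graph symmetric, generically dense
B = V @ Lam_B @ V.T

# V^T A V is diagonal even though A has far more nonzero entries than S.
D = V.T @ A @ V
```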
By well-known results in graph signal processing \citep{Sandryhaila2013-DSPG}, simultaneous diagonalizability of the system matrices and the GMD implies\footnote{Under the assumption that $S$ corresponds to a finite graph and has all distinct eigenvalues. At a high level, the result follows directly from the Cayley-Hamilton theorem.} that they can be written as matrix polynomials of $S$ of degree at most $N-1$,

$A = \sum_{k=0}^{N-1} a_k S^k, \qquad B = \sum_{k=0}^{N-1} b_k S^k. \qquad (7)$

Matrices that can be expressed in this matrix polynomial form are called graph filters \citep{Segarra2017-GraphFilterDesign}, and the coefficients $\{a_k\}, \{b_k\}$ are referred to as the filter weights or filter taps.
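The polynomial representation (7) can be checked numerically: with distinct GMD eigenvalues, the filter taps are recovered by solving a Vandermonde system that matches the filter's spectral response to the matrix's spectrum. The GMD and spectrum below are illustrative:

```python
import numpy as np

# Any matrix simultaneously diagonalizable with S (distinct eigenvalues)
# is a polynomial in S; the taps solve a Vandermonde system.
N = 5
S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # path graph GMD
lam, V = np.linalg.eigh(S)                    # eigenvalues 2cos(k pi/6): distinct

rng = np.random.default_rng(2)
alpha = rng.standard_normal(N)                # desired spectrum of A
A = V @ np.diag(alpha) @ V.T                  # graph symmetric dynamics matrix

# Filter taps a_k satisfy sum_k a_k * lam_i^k = alpha_i for every eigenvalue.
Vand = np.vander(lam, N, increasing=True)     # Vandermonde matrix [lam_i^k]
a = np.linalg.solve(Vand, alpha)

# Reassemble A as the graph filter sum_k a_k S^k.
A_filter = sum(a_k * np.linalg.matrix_power(S, k) for k, a_k in enumerate(a))
```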
We now give a message-passing interpretation of graph symmetric systems. First, it can be seen from the sparsity pattern of the GMD that the output of $S x$ can be computed entirely as a linear combination of the states of nodes one hop away in $\mathcal{G}$. To see this, consider the operation $y = S x$, whose $i$-th entry yields

$y_i = [S x]_i = \sum_{j \,:\, (i,j) \in \mathcal{E}} [S]_{ij}\, x_j + [S]_{ii}\, x_i. \qquad (8)$

More generally, when considering polynomials, computing $S^k x$ is equivalent to exchanging information $k$ times with one-hop neighbors. Therefore, if the system matrix $A$ and the control matrix $B$ are polynomials of $S$, then the evolution of the system can be computed entirely by means of exchanges with neighboring nodes. Hence, the system dynamics can be viewed as implementing distributed message passing \citep{Ruiz2021-GNN}. Examples of such linear, distributed systems include both discrete-time and continuous-time diffusions and solutions to the heat equation, among many others; see [Gama2019-LinearControl] and references therein.
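The message-passing interpretation above can be sketched as follows, where each round of neighbor-to-neighbor exchange implements one application of $S$; the path-graph GMD and filter taps are illustrative:

```python
import numpy as np

# Message-passing evaluation of a graph filter output sum_k a_k S^k x:
# each round, every node updates using only its one-hop neighbors' values.
N, K = 6, 3
S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # path graph GMD
neighbors = [np.nonzero(S[i])[0] for i in range(N)]

rng = np.random.default_rng(3)
x = rng.standard_normal(N)
a = rng.standard_normal(K)        # filter taps (illustrative)

y = a[0] * x                      # k = 0 term: no communication needed
z = x.copy()                      # z holds S^k x after k exchange rounds
for k in range(1, K):
    z_new = np.zeros(N)
    for i in range(N):            # node i combines messages from its neighbors
        z_new[i] = sum(S[i, j] * z[j] for j in neighbors[i])
    z = z_new
    y += a[k] * z                 # accumulate the k-hop term

# Reference: direct (centralized) evaluation of the same filter.
y_direct = sum(a[k] * np.linalg.matrix_power(S, k) @ x for k in range(K))
```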
For the rest of the paper, we further assume that the cost matrices $Q, R$ of the LQR problem (4) can also be simultaneously diagonalized with the dynamics matrices. Formally, we make the following assumption.

Assumption 1.
The system (2) is graph symmetric with respect to the GMD $S = V \Lambda V^\top$, and the cost matrices satisfy $Q = V \Lambda_Q V^\top$ and $R = V \Lambda_R V^\top$, where $\Lambda_Q, \Lambda_R$ are diagonal.
III Optimal Distributed Linear Controller via System Level Synthesis
SLS provides a convex parameterization of achievable closed-loop system responses \citep{wang_separable_2018, anderson_system_2019}, which can be leveraged to show that the optimal controller for graph symmetric systems under Assumption 1 can be written as a novel class of graph filters defined by transfer function valued filter taps.
III-A Background: System-Level Synthesis
As noted in §4 of [anderson_system_2019], we can compactly write the system dynamics (2) in the frequency domain as

$z\, \mathbf{x} = A\, \mathbf{x} + B\, \mathbf{u} + \mathbf{w},$

where $\mathbf{x}(z) = \sum_{t \ge 0} x(t)\, z^{-t}$ is the signal in the $z$-domain, and idem for $\mathbf{u}$ and $\mathbf{w}$. For a (dynamic) linear state-feedback controller $\mathbf{u} = \mathbf{K} \mathbf{x}$, it follows immediately that
$\mathbf{x} = (zI - A - B\mathbf{K})^{-1} \mathbf{w} =: \mathbf{\Phi}_x\, \mathbf{w}, \qquad \mathbf{u} = \mathbf{K} (zI - A - B\mathbf{K})^{-1} \mathbf{w} =: \mathbf{\Phi}_u\, \mathbf{w}, \qquad (9)$

where $\mathbf{\Phi}_x$ and $\mathbf{\Phi}_u$ are system responses that map the disturbance $\mathbf{w}$ to the state $\mathbf{x}$ and control input $\mathbf{u}$, respectively. The following SLS theorem states that all achievable responses lie in an affine subspace of strictly proper stable rational transfer functions $\frac{1}{z}\mathcal{RH}_\infty$.
Theorem 1.
[anderson_system_2019, Thm. 4.1] For the LTI system evolving under the dynamics (2) and control policy $\mathbf{u} = \mathbf{K} \mathbf{x}$, the following statements are true:
1. The affine subspace defined by

$\begin{bmatrix} zI - A & -B \end{bmatrix} \begin{bmatrix} \mathbf{\Phi}_x \\ \mathbf{\Phi}_u \end{bmatrix} = I, \qquad \mathbf{\Phi}_x, \mathbf{\Phi}_u \in \tfrac{1}{z}\mathcal{RH}_\infty, \qquad (10)$

parameterizes all system responses from $\mathbf{w}$ to $(\mathbf{x}, \mathbf{u})$ as defined in (9), achievable by an internally stabilizing state feedback controller $\mathbf{K}$.

2. For any transfer matrices $(\mathbf{\Phi}_x, \mathbf{\Phi}_u)$ satisfying (10), the controller $\mathbf{K} = \mathbf{\Phi}_u \mathbf{\Phi}_x^{-1}$ is internally stabilizing and achieves the desired system response (9).

In particular, Theorem 1 allows us to rewrite the LQR problem (4) as the equivalent convex problem

$\min_{\mathbf{\Phi}_x, \mathbf{\Phi}_u} \; \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \mathbf{\Phi}_x \\ \mathbf{\Phi}_u \end{bmatrix} \right\|_{\mathcal{H}_2} \quad \text{subject to (10)}. \qquad (11)$
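The achievability constraint (10) can be checked numerically in the time domain, where it reads $\Phi_x[1] = I$ and $\Phi_x[t+1] = A\,\Phi_x[t] + B\,\Phi_u[t]$. The sketch below verifies this for the closed-loop responses of an illustrative static stabilizing gain $K$ (all matrices are assumptions, not from the paper):

```python
import numpy as np

# For a static stabilizing K, the closed-loop responses are
#   Phi_x[t] = (A + B K)^(t-1),  Phi_u[t] = K (A + B K)^(t-1),  t >= 1,
# and they satisfy the time-domain form of the SLS constraint (10).
n, T = 3, 20
A = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.4, 0.3],
              [0.0, 0.2, 0.6]])
B = np.eye(n)
K = -0.5 * A                      # then A + B K = 0.5 A, which is stable

Acl = A + B @ K
Phi_x = [np.linalg.matrix_power(Acl, t - 1) for t in range(1, T + 1)]
Phi_u = [K @ P for P in Phi_x]

# Check the affine achievability constraint at every time step.
residuals = [Phi_x[t] - (A @ Phi_x[t - 1] + B @ Phi_u[t - 1]) for t in range(1, T)]
```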
III-B SLS for Graph Symmetric Systems
We now proceed to show that under Assumption 1, the optimal system response for a graph symmetric system that solves the LQR problem (11) can be written as a graph filter.
Theorem 2.
Given a GMD $S = V \Lambda V^\top$, consider an instance of the LQR problem (11) where the underlying system and cost satisfy Assumption 1. Then, there exists a global optimum where both $\mathbf{\Phi}_x$ and $\mathbf{\Phi}_u$ are diagonalizable by $V$, i.e.,

$\mathbf{\Phi}_x = V \mathbf{\Lambda}_x V^\top, \qquad \mathbf{\Phi}_u = V \mathbf{\Lambda}_u V^\top,$

where $\mathbf{\Lambda}_x$ and $\mathbf{\Lambda}_u$ are diagonal transfer matrices. Hence, the optimal controller $\mathbf{K} = \mathbf{\Phi}_u \mathbf{\Phi}_x^{-1}$ can also be diagonalized by $V$.
Proof.
See Appendix. ∎
Remark 1.
Note that the elements defined by the diagonal responses $[\mathbf{\Lambda}_x(z)]_{ii}, [\mathbf{\Lambda}_u(z)]_{ii}$ are transfer functions of $z$. Thus the resulting graph filter taps are transfer functions as well, i.e., a transfer matrix $\mathbf{\Phi}(z)$ that is simultaneously diagonalizable with the matrix $S$ can be written as:

$\mathbf{\Phi}(z) = \sum_{k=0}^{N-1} \mathbf{h}_k(z)\, S^k. \qquad (12)$
The main implication of Theorem 2 is that the optimal linear state-feedback controller for graph symmetric systems under Assumption 1 is a graph filter and can thus be implemented via distributed message passing. We note, however, that the resulting optimal system response could be dense, as $S^k$ is dense for large $k$ whenever $S$ defines a connected graph. This can be undesirable in practice, as it requires up to $N-1$ rounds of communication exchanges with one-hop neighbors, potentially causing significant delays if the size of the graph is large. In the next section, we leverage a robust variant of the SLS parameterization given in Theorem 1 to restrict the optimal system responses to only the first $F < N$ filter taps while guaranteeing stability and near-optimal performance.
We end this section by noting that in the graph signal processing literature \citep{Sandryhaila2013-DSPG}, a controller of the form of (12) is known as a linear, shift-invariant (LSI) graph filter, and is analogous to an LTI filter. Note that $S\, \mathbf{\Phi}(z) = \mathbf{\Phi}(z)\, S$, hence the name. In particular, Equation (12) is a spatially finite impulse response (FIR) graph filter \citep{Segarra2017-GraphFilterDesign} that is completely characterized by a finite set of filter taps, conveniently described by the collection of transfer functions $\{\mathbf{h}_k(z)\}_{k=0}^{N-1}$. We emphasize that the transfer functions themselves are not restricted to be temporally FIR. Spatially FIR graph filters are also known as convolutional graph filters \citep{Ruiz2021-GNN} due to their sum-and-shift nature, understanding that the effect of the operation $S x$ is to shift the signal around the graph (thus, oftentimes, the GMD is referred to as the graph shift operator). Furthermore, spatially FIR graph filters satisfy the convolution theorem, which indicates that a convolution in the vertex domain can be computed by means of an elementwise multiplication in the spectrum domain \citep{Sandryhaila2013-DSPG}. Finally, in the context of finite graphs, it is observed that the space of FIR graph filters of the form (12), characterized by $N$ filter taps, is equivalent to the space of spatially infinite impulse response (IIR) graph filters as well as autoregressive moving average (ARMA) graph filters \citep{Isufi2017-ARMA}.
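The convolution theorem mentioned above can be illustrated numerically: filtering in the vertex domain coincides with elementwise multiplication by the filter's frequency response in the graph spectral domain. The GMD and taps below are illustrative:

```python
import numpy as np

# Convolution theorem for graph filters: vertex-domain filtering equals
# elementwise spectral multiplication.
N, K = 6, 4
S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # path graph GMD
lam, V = np.linalg.eigh(S)

rng = np.random.default_rng(4)
x = rng.standard_normal(N)
h = rng.standard_normal(K)                     # filter taps (illustrative)

# Vertex domain: y = sum_k h_k S^k x.
y_vertex = sum(h[k] * np.linalg.matrix_power(S, k) @ x for k in range(K))

# Spectral domain: transform, multiply by the frequency response, invert.
x_hat = V.T @ x                                # graph Fourier transform of x
freq_response = np.polynomial.polynomial.polyval(lam, h)  # sum_k h_k lam^k
y_spectral = V @ (freq_response * x_hat)
```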
IV Localized Approximations to the Optimal Distributed Linear Controller
In this section, we discuss several methods to approximate the optimal dense system response in the form of (12) with one that is localized and uses only $F < N$ filter taps. We start with a projection method based on graph filter design. We then present a robust SLS formulation of the approximation problem that guarantees the stability of the resulting localized controller. Finally, we show how these two can be combined into a robust projection method that also ensures stability. In the following, we define $\mathbf{\Phi}^\star := [\mathbf{\Phi}_x^{\star\top} \; \mathbf{\Phi}_u^{\star\top}]^\top$, and recall that $\mathbf{\Phi}^\star$ can be written as a graph filter (12) defined by transfer function filter taps $\mathbf{h}^\star_k(z)$, $k = 0, \dots, N-1$. We further recall that each transfer function filter tap admits the following expansion in terms of its Markov parameters: $\mathbf{h}_k(z) = \sum_{t=1}^{\infty} h_k[t]\, z^{-t}$.
IV-A Naive Projection
We propose an approach inspired by the graph signal processing literature, wherein we exploit the graph filter structure of the optimal system responses \citep{Segarra2017-GraphFilterDesign}. More specifically, we project the optimal system responses onto graph filters of order $F$ in the $\mathcal{H}_2$ norm by solving the following optimization problem

$\min_{\mathbf{h}_0, \dots, \mathbf{h}_{F-1}} \; \left\| \mathbf{\Phi}^\star - \sum_{k=0}^{F-1} \mathbf{h}_k(z)\, S^k \right\|_{\mathcal{H}_2}^2. \qquad (13)$

Here $\mathbf{h} = (\mathbf{h}_0, \dots, \mathbf{h}_{F-1})$ collects the transfer function filter taps defining $\mathbf{\Phi}$. If we further restrict each transfer function filter tap to be FIR of order $T$, i.e., if we write $\mathbf{h}_k(z) = \sum_{t=1}^{T} h_k[t]\, z^{-t}$, then this reduces to solving the following unconstrained quadratic program

$\min_{\{h_k[t]\}} \; \sum_{t=1}^{T} \left\| \Phi^\star[t] - \sum_{k=0}^{F-1} h_k[t]\, S^k \right\|_F^2, \qquad (14)$

where $\Phi^\star[t]$ denotes the $t$-th Markov parameter of $\mathbf{\Phi}^\star$.
Proposition 3 (Approximating filter taps).
If the eigenvalues of the GMD $S$ are all distinct, then the filter taps $\{\mathbf{h}_k\}_{k=0}^{F-1}$ that solve (13) are given by

$\mathbf{h} = \mathbf{h}^\star_{0:F-1} + \left( \mathcal{V}_F^\top \mathcal{V}_F \right)^{-1} \mathcal{V}_F^\top\, \mathbf{e}, \qquad (15)$

where $\mathbf{e}$ is the error vector computed as

$\mathbf{e} = \mathcal{V}_{\bar{F}}\, \mathbf{h}^\star_{F:N-1}, \qquad (16)$

with $\mathcal{V}_F \in \mathbb{R}^{N \times F}$ the (Vandermonde) collection of the first $F$ powers of the eigenvalues of $S$, $\mathcal{V}_{\bar{F}} \in \mathbb{R}^{N \times (N-F)}$ the collection of the remaining powers, and $\mathbf{h}^\star_{F:N-1}$ the collection of the tail optimal filter taps.
Proof.
It follows from the convexity of (14), matrix calculus, and properties of Vandermonde matrices. ∎
Prop. 3 gives a closed-form expression for the filter of order $F$ that best approximates the optimal linear distributed controller in the $\mathcal{H}_2$ norm. It also shows that each filter tap transfer function is obtained as the corresponding optimal tap plus a corrective term that accounts for the taps that could not be included.
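The structure of Prop. 3 can be sketched numerically for a single Markov parameter: with distinct eigenvalues, the projection reduces to a Vandermonde least-squares problem, and the "leading taps plus a tail correction" closed form matches a direct least-squares fit. All values below are illustrative, and the notation mirrors, but is not taken verbatim from, (15) and (16):

```python
import numpy as np

# Projection of an order-N graph filter onto order F, one Markov parameter
# at a time, via a Vandermonde least-squares problem.
N, F = 6, 3
S = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)  # path graph GMD
lam, V = np.linalg.eigh(S)                   # distinct eigenvalues

rng = np.random.default_rng(5)
h_star = rng.standard_normal(N)      # one Markov parameter of the N optimal taps

Vand = np.vander(lam, N, increasing=True)
V_F = Vand[:, :F]                    # first F powers of the eigenvalues
V_tail = Vand[:, F:]                 # remaining powers
h_tail = h_star[F:]

# Closed form in the spirit of (15)-(16): leading taps plus a corrective
# term accounting for the discarded tail.
e = V_tail @ h_tail
h_proj = h_star[:F] + np.linalg.solve(V_F.T @ V_F, V_F.T @ e)

# Sanity check against a direct least-squares fit of the full spectral response.
target = Vand @ h_star               # spectral response of the optimal filter
h_lstsq, *_ = np.linalg.lstsq(V_F, target, rcond=None)
```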
This approach is easy to implement computationally, as it only requires solving a least squares problem to minimize the projection cost. However, the resulting controller cannot be guaranteed to be stabilizing. As we show later via numerical simulation, approximations with a small number of filter taps are often unstable. This motivates an approach that takes into account the stability of the resulting controller.
IV-B Localized Approximations via Robust SLS
Robust SLS \citepmatni2017scalable, anderson_system_2019 offers a systematic way to reason about approximate system responses, i.e., system responses that do not exactly satisfy the achievability constraint (10). In particular, as shown in the following result, robust SLS allows for an explicit characterization of the effects of using approximate system responses for controller design.
Theorem 4 (Corollary 4.4 of [anderson_system_2019]).
Let $(\hat{\mathbf{\Phi}}_x, \hat{\mathbf{\Phi}}_u, \mathbf{\Delta})$ be a solution to

$\begin{bmatrix} zI - A & -B \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} = I + \mathbf{\Delta}, \qquad \hat{\mathbf{\Phi}}_x, \hat{\mathbf{\Phi}}_u \in \tfrac{1}{z}\mathcal{RH}_\infty. \qquad (17)$

Then if $\|\mathbf{\Delta}\|_{\mathcal{H}_\infty} < 1$, the controller $\mathbf{K} = \hat{\mathbf{\Phi}}_u \hat{\mathbf{\Phi}}_x^{-1}$ stabilizes the system (2), and the actual system response that is achieved is given by

$\begin{bmatrix} \mathbf{\Phi}_x \\ \mathbf{\Phi}_u \end{bmatrix} = \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} (I + \mathbf{\Delta})^{-1}.$
We leverage this result to provide an upper bound on the amount of truncation that can be applied to $\mathbf{\Phi}^\star$ without destabilizing the system.
Corollary 5.
Let $(\hat{\mathbf{\Phi}}_x, \hat{\mathbf{\Phi}}_u, \mathbf{\Delta})$ satisfy (17) with $\|\mathbf{\Delta}\|_{\mathcal{H}_\infty} < 1$. Then the controller $\mathbf{K} = \hat{\mathbf{\Phi}}_u \hat{\mathbf{\Phi}}_x^{-1}$ stabilizes the system (2) and achieves a cost bounded by

$\hat{J} \le \frac{1}{1 - \|\mathbf{\Delta}\|_{\mathcal{H}_\infty}} \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} \right\|_{\mathcal{H}_2}.$
Proof.
First, we note that by Theorem 4, the system responses achieved by $\mathbf{K} = \hat{\mathbf{\Phi}}_u \hat{\mathbf{\Phi}}_x^{-1}$ are given by

$\begin{bmatrix} \mathbf{\Phi}_x \\ \mathbf{\Phi}_u \end{bmatrix} = \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} (I + \mathbf{\Delta})^{-1}.$

Thus, the cost achieved by the controller is bounded by

$\left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} (I + \mathbf{\Delta})^{-1} \right\|_{\mathcal{H}_2} \le \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} \right\|_{\mathcal{H}_2} \left\| (I + \mathbf{\Delta})^{-1} \right\|_{\mathcal{H}_\infty} \le \frac{1}{1 - \|\mathbf{\Delta}\|_{\mathcal{H}_\infty}} \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} \right\|_{\mathcal{H}_2},$

where we used Cauchy-Schwarz (submultiplicativity of the norms) in the first inequality, and the small gain theorem together with the fact that $\|\mathbf{\Delta}\|_{\mathcal{H}_\infty} < 1$ in the last step. ∎
Corollary 5 offers a way to synthesize stable truncated system responses. Specifically, to synthesize a system response that uses only $F$ filter taps while guaranteeing both stability and performance, we propose the following optimization problem:
$\min_{\{\mathbf{h}_k\},\, \mathbf{\Delta}} \; \frac{1}{1-\gamma} \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} \right\|_{\mathcal{H}_2} \qquad (18a)$

$\text{s.t.} \quad \begin{bmatrix} zI - A & -B \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} = I + \mathbf{\Delta}, \quad \|\mathbf{\Delta}\|_{\mathcal{H}_\infty} \le \gamma < 1, \quad \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} = \sum_{k=0}^{F-1} \mathbf{h}_k(z)\, S^k. \qquad (18b)$
By Corollary 5, the solution to (18) defines a controller that is stabilizing. We further show in the next result that it enjoys guaranteed suboptimality bounds relative to the optimal controller defined by $\mathbf{\Phi}^\star$. We first introduce the following notation: for a system response of the form $\mathbf{\Phi} = \sum_{k=0}^{N-1} \mathbf{h}_k(z) S^k$, we define the $F$-truncation and the $F$-tail as

$\mathbf{\Phi}_{\le F} := \sum_{k=0}^{F-1} \mathbf{h}_k(z)\, S^k, \qquad \mathbf{\Phi}_{> F} := \sum_{k=F}^{N-1} \mathbf{h}_k(z)\, S^k.$
Theorem 6.
Suppose the tail of the optimal response satisfies $\left\| \begin{bmatrix} zI - A & -B \end{bmatrix} \mathbf{\Phi}^\star_{>F} \right\|_{\mathcal{H}_\infty} \le \gamma$ for the value $\gamma < 1$ used in (18). Then (18) is feasible, the resulting controller $\mathbf{K} = \hat{\mathbf{\Phi}}_u \hat{\mathbf{\Phi}}_x^{-1}$ is stabilizing, and the cost $\hat{J}$ it achieves satisfies

$\hat{J} \le \frac{1}{1-\gamma} \left( J^\star + \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \mathbf{\Phi}^\star_{>F} \right\|_{\mathcal{H}_2} \right), \qquad (19)$

where $J^\star$ is the cost achieved by the optimal centralized controller.
Proof.
By the constraints (18b) and Theorem 4, we immediately have that $\mathbf{K} = \hat{\mathbf{\Phi}}_u \hat{\mathbf{\Phi}}_x^{-1}$ is stabilizing. To show the given suboptimality bound, we first note that there exists some $\tilde{\mathbf{\Delta}}$ such that $(\mathbf{\Phi}^\star_{\le F}, \tilde{\mathbf{\Delta}})$ is a feasible solution to the robust optimization problem (18). To see this, observe that

$\begin{bmatrix} zI - A & -B \end{bmatrix} \mathbf{\Phi}^\star_{\le F} = \begin{bmatrix} zI - A & -B \end{bmatrix} \left( \mathbf{\Phi}^\star - \mathbf{\Phi}^\star_{>F} \right) = I - \begin{bmatrix} zI - A & -B \end{bmatrix} \mathbf{\Phi}^\star_{>F} =: I + \tilde{\mathbf{\Delta}},$

where the last step follows from the achievability of the optimal response $\mathbf{\Phi}^\star$ and the definition of $\tilde{\mathbf{\Delta}}$. Thus, $(\mathbf{\Phi}^\star_{\le F}, \tilde{\mathbf{\Delta}})$ is a feasible solution under our assumption that $\|\tilde{\mathbf{\Delta}}\|_{\mathcal{H}_\infty} \le \gamma$. Denote the robust SLS objective (18) as $g(\mathbf{\Phi}, \mathbf{\Delta})$. By the optimality of the solution $(\hat{\mathbf{\Phi}}, \hat{\mathbf{\Delta}})$, we have that

$\hat{J} \le g(\hat{\mathbf{\Phi}}, \hat{\mathbf{\Delta}}) \le g(\mathbf{\Phi}^\star_{\le F}, \tilde{\mathbf{\Delta}}) = \frac{1}{1-\gamma} \left\| \begin{bmatrix} Q^{1/2} & \\ & R^{1/2} \end{bmatrix} \mathbf{\Phi}^\star_{\le F} \right\|_{\mathcal{H}_2},$

where we applied Corollary 5 in the first inequality. The desired result then follows from the fact that

$\mathbf{\Phi}^\star_{\le F} = \mathbf{\Phi}^\star - \mathbf{\Phi}^\star_{>F}$

and an application of the triangle inequality. ∎
This optimization problem is jointly quasi-convex in the decision variables and $\gamma$, and can be solved efficiently using bisection on $\gamma$. Further, feasibility provides a stability certificate in the form of $\|\mathbf{\Delta}\|_{\mathcal{H}_\infty} \le \gamma < 1$.
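The small-gain certificate of Theorem 4 can be illustrated with a simple temporal truncation of an achievable response (the same reasoning applies to filter-tap truncation); the system and gain below are illustrative:

```python
import numpy as np

# Temporally truncating an achievable response leaves a residual Delta as in
# (17); if ||Delta||_Hinf < 1, Theorem 4 still guarantees stability.
n, T = 3, 8
A = np.array([[0.5, 0.2, 0.0],
              [0.1, 0.4, 0.3],
              [0.0, 0.2, 0.6]])
B = np.eye(n)
K = -0.5 * A                          # stabilizing: A + B K = 0.5 A
Acl = A + B @ K

# Truncated responses Phi_x[t] = Acl^(t-1), Phi_u[t] = K Acl^(t-1), t = 1..T.
Phi_x = [np.linalg.matrix_power(Acl, t) for t in range(T)]
Phi_u = [K @ P for P in Phi_x]

# Algebra shows [zI - A, -B][Phi_x; Phi_u] = I - Acl^T z^{-T}, so Delta has a
# single Markov parameter -Acl^T at lag T (the truncation error).
Delta_taps = {T: -np.linalg.matrix_power(Acl, T)}

# Estimate ||Delta||_Hinf by sampling the unit circle.
ws = np.linspace(0, 2 * np.pi, 200)
hinf = max(
    np.linalg.norm(sum(M * np.exp(-1j * w * t) for t, M in Delta_taps.items()), 2)
    for w in ws
)
```

For a longer truncation horizon the residual shrinks geometrically with the closed-loop spectral radius, so the certificate `hinf < 1` is eventually satisfied.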
IV-C Robust Projection
Lastly, we can combine the robustness constraints used in robust SLS with the signal-processing-based projection method. Specifically, we solve the following optimization problem
$\min_{\{\mathbf{h}_k\},\, \mathbf{\Delta}} \; \left\| \mathbf{\Phi}^\star - \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} \right\|_{\mathcal{H}_2} \qquad (20)$

$\text{s.t.} \quad \begin{bmatrix} zI - A & -B \end{bmatrix} \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} = I + \mathbf{\Delta}, \quad \|\mathbf{\Delta}\|_{\mathcal{H}_\infty} \le \gamma < 1, \quad \begin{bmatrix} \hat{\mathbf{\Phi}}_x \\ \hat{\mathbf{\Phi}}_u \end{bmatrix} = \sum_{k=0}^{F-1} \mathbf{h}_k(z)\, S^k.$
We note that solving this problem does not give an upper bound on the cost of the resulting controller, but the robustness constraint ensures that the resulting controller is stabilizing.
IV-D Implementation
We end this section with practical implementation details for optimization problems (18) and (20). First, we note that computationally, one cannot directly optimize over the IIR system responses as written in (18) and (20). In practice, we use an FIR approximation of the strictly proper transfer functions $\hat{\mathbf{\Phi}}_x$ and $\hat{\mathbf{\Phi}}_u$, i.e., each filter tap is parameterized as

$\mathbf{h}_k(z) = \sum_{t=1}^{T} h_k[t]\, z^{-t}$

for some given FIR order $T$. As shown in [anderson_system_2019], the suboptimality incurred by such an FIR approximation decays exponentially in the horizon $T$.
The $\mathcal{H}_\infty$-norm constraints on the (also FIR) transfer matrix $\mathbf{\Delta}$ can then be enforced via semidefinite programming (see Theorem 5.8 in [dumitrescu2007positive]), potentially introducing a nontrivial computational burden. However, we note that one can replace the $\mathcal{H}_\infty$-norm constraint in optimization problems (18) and (20) with any induced norm constraint. A particularly appealing option is the induced $\ell_1 \to \ell_1$ norm (which defines the $\mathcal{L}_1$-norm of the transpose system), as this norm decomposes columnwise. As shown in [anderson_system_2019, wang_separable_2018], the resulting robustness constraints are linear and embarrassingly parallelizable. We defer this extension to future work.
V Numerical Experiments
We show that our approach offers a principled way to trade off performance and communication complexity through numerical experiments. We also demonstrate the importance of the robustness constraints in synthesizing stable distributed controllers and compare the performance of robust SLS and projection-based methods on synthesizing localized controllers. All code needed to reproduce the examples found in this section is available at https://github.com/unstable-zeros/graph-symmetric-systems.
V-A Setup
In the following experiments, we consider the distributed linear quadratic regulator (LQR) problem (4) over a network of $N$ scalar subsystems. We generate the GMD $S$ and dynamics matrices $A$ and $B$ using a process similar to that in [Gama2022-DistributedLQR]. To generate a problem instance, we start by creating the communication network $\mathcal{G}$ by randomly sampling $N$ points on the unit interval, and creating a bi-directional link between each point and a fixed number of its nearest points under the standard metric on the interval. We then take $S$ to be a symmetric matrix that shares the sparsity pattern of the Laplacian of $\mathcal{G}$, with its entry values sampled independently at random. The GMD is then normalized in spectral radius. We generate the dynamics matrices $A$ and $B$ to share the same eigenvectors as $S$, and sample their eigenvalues i.i.d. from the standard normal distribution; hence both $A$ and $B$ are symmetric matrices. We take the cost matrices $Q$ and $R$ to be identity. For both of the following experiments, we randomly generate a collection of problem instances using this process. We end by noting that graphs $\mathcal{G}$ generated this way have, on average, a diameter of only a few hops.
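A hedged sketch of this generation process is given below; the network size, neighbor count, and entry distribution are illustrative stand-ins where the text leaves values unspecified:

```python
import numpy as np

# Illustrative instance generation: k-nearest-neighbor graph on the unit
# interval, symmetric S with Laplacian sparsity, normalized spectral radius,
# and A, B sharing the eigenvectors of S.
rng = np.random.default_rng(6)
N, k_nearest = 10, 2                           # assumed sizes, not from the paper

p = rng.uniform(0, 1, N)                       # points on the unit interval
Adj = np.zeros((N, N))
for i in range(N):
    dists = np.abs(p - p[i])
    for j in np.argsort(dists)[1:k_nearest + 1]:
        Adj[i, j] = Adj[j, i] = 1.0            # bi-directional links

# Symmetric S sharing the sparsity pattern of the Laplacian of the graph.
mask = (Adj + np.eye(N)) > 0
W = rng.standard_normal((N, N))                # assumed entry distribution
S = np.where(mask, (W + W.T) / 2, 0.0)
S /= max(abs(np.linalg.eigvals(S)))            # normalize spectral radius

# A and B share the eigenvectors of S, with i.i.d. standard normal eigenvalues.
lam, V = np.linalg.eigh(S)
A = V @ np.diag(rng.standard_normal(N)) @ V.T
B = V @ np.diag(rng.standard_normal(N)) @ V.T
```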
For the implementation of the optimization problems, we approximate the transfer functions with a finite FIR horizon $T$. Further, for the robust SLS problem (18), instead of using bisection to determine the best value of $\gamma$, we fix $\gamma$ to a constant, as empirically the value of $\gamma$ does not significantly affect the cost achieved by the controllers.
V-B Importance of Stability Constraints
In this experiment, we demonstrate the importance of the robust SLS-based stability constraints in synthesizing stable distributed controllers. We vary the number of allowed filter taps and apply both the naive projection (13) and robust SLS (18) methods to the randomly generated problem instances. For the naive projection method, we report the percentage of resulting controllers that are stable. For robust SLS, we report the percentage of optimization problems (18) that are feasible, as feasible solutions of optimization problem (18) are guaranteed to be stabilizing. The results are shown in Figure 1.
First, we observe that, as expected, a higher number of filter taps results in a higher probability of synthesizing stable responses for both methods. However, the naive projection method has nonzero probability of resulting in an unstable controller even with a large number of filter taps, i.e., even when the projection error between the stable optimal system responses and the projected responses is small. On the other hand, the percentage of stable solutions resulting from the robust SLS problem increases monotonically with the number of filter taps, which is expected for a principled way of synthesizing stable controllers. Robust SLS also generally achieves a higher percentage of certifiably stable responses than naive projection, except in the extremely sparse regime with very few filter taps. This suggests that the robustness constraint might be too restrictive for synthesizing extremely sparse responses. Combined with the low computation cost of naive projection, this suggests a potential benefit of applying both methods in the sparse regime.
V-C Truncation Performance
In this experiment, we compare the performance of robust SLS and robust projection for different numbers of filter taps on the same randomly generated problem instances. We show the median (solid lines) and lower and upper percentiles (shaded regions) of the costs achieved by both methods in Figure 2.

First, we note that the median costs decrease monotonically for both methods as the number of hops increases. This shows that the optimization problems can leverage the increased expressivity of the graph filters to achieve better performance, which matches our intuition. Second, we note that the robust SLS-based method achieves a lower cost than robust projection for all numbers of filter taps considered. We also note that for small numbers of hops, the upper boundary of the shaded region, which represents the upper percentile of the cost, is infinite, indicating that a nontrivial fraction of the robust synthesis problems are infeasible. This again suggests a need to develop more flexible methods in the sparse regime.
VI Conclusion
In this work, we introduced the notion of graph symmetric systems and showed that for linear quadratic problems, the optimal system response for graph symmetric systems can be written as a (potentially dense) graph filter. We then proposed three methods to approximate the optimal responses with localized responses and validated their performance in numerical simulation. Directions of future work include relaxing the GSS constraints, applying the results on separable norms to enable distributed computation, and understanding how this can better inform GNN-based controllers with nonlinear activation functions.
-A Proof for Theorem 2
We show that for any optimal system response $\mathbf{\Phi}_x^\star, \mathbf{\Phi}_u^\star$ of the optimization problem (11) that is not diagonalizable by $V$, i.e., for which

$V^\top \mathbf{\Phi}_x^\star V, \qquad V^\top \mathbf{\Phi}_u^\star V$

are not diagonal, we can construct a simultaneously diagonalizable system response $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$ that is equally optimal. In particular, we construct such a system response as follows:

$\tilde{\mathbf{\Phi}}_x = V \operatorname{diag}\!\left( V^\top \mathbf{\Phi}_x^\star V \right) V^\top, \qquad (21)$

$\tilde{\mathbf{\Phi}}_u = V \operatorname{diag}\!\left( V^\top \mathbf{\Phi}_u^\star V \right) V^\top. \qquad (22)$
We first show that $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$ are feasible solutions of (11). From the achievability condition on $\mathbf{\Phi}_x^\star, \mathbf{\Phi}_u^\star$, we have that

$(zI - A)\, \mathbf{\Phi}_x^\star - B\, \mathbf{\Phi}_u^\star = I. \qquad (23)$

Using the simultaneous diagonalizability of $A$ and $B$, we have

$(zI - \Lambda_A)\, V^\top \mathbf{\Phi}_x^\star V - \Lambda_B\, V^\top \mathbf{\Phi}_u^\star V = I. \qquad (24)$

Since the matrices $zI$, $\Lambda_A$ and $\Lambda_B$ are diagonal, we have

$(zI - \Lambda_A) \operatorname{diag}\!\left( V^\top \mathbf{\Phi}_x^\star V \right) - \Lambda_B \operatorname{diag}\!\left( V^\top \mathbf{\Phi}_u^\star V \right) = I, \qquad (25)$

where $\operatorname{diag}(\cdot)$ projects a matrix onto its diagonal elements. Therefore, $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$ is also feasible.
Now, we show that $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$ gives a cost at least as good as that of $\mathbf{\Phi}_x^\star, \mathbf{\Phi}_u^\star$. By the simultaneous diagonalizability of the matrices $Q$ and $R$, and the fact that the $\mathcal{H}_2$-norm is invariant under unitary transformations, we have that

$\left\| \begin{bmatrix} Q^{1/2}\, \mathbf{\Phi}_x^\star \\ R^{1/2}\, \mathbf{\Phi}_u^\star \end{bmatrix} \right\|_{\mathcal{H}_2}^2 = \left\| \begin{bmatrix} \Lambda_Q^{1/2}\, V^\top \mathbf{\Phi}_x^\star V \\ \Lambda_R^{1/2}\, V^\top \mathbf{\Phi}_u^\star V \end{bmatrix} \right\|_{\mathcal{H}_2}^2. \qquad (26)$

Denoting the $i$-th eigenvalue of $Q$ and $R$ by $\lambda_{Q,i}$ and $\lambda_{R,i}$, respectively, we have the inequality

$\left\| \begin{bmatrix} \Lambda_Q^{1/2}\, V^\top \mathbf{\Phi}_x^\star V \\ \Lambda_R^{1/2}\, V^\top \mathbf{\Phi}_u^\star V \end{bmatrix} \right\|_{\mathcal{H}_2}^2 \ge \sum_{i=1}^{N} \lambda_{Q,i} \left\| \left[ V^\top \mathbf{\Phi}_x^\star V \right]_{ii} \right\|_{\mathcal{H}_2}^2 + \lambda_{R,i} \left\| \left[ V^\top \mathbf{\Phi}_u^\star V \right]_{ii} \right\|_{\mathcal{H}_2}^2, \qquad (27)$

where the right-hand side equals the cost achieved by $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$, which follows from the definition of $\tilde{\mathbf{\Phi}}_x$ and $\tilde{\mathbf{\Phi}}_u$ in equations (21) and (22). Reversing the steps in (26), we see that $\tilde{\mathbf{\Phi}}_x, \tilde{\mathbf{\Phi}}_u$ achieves a cost at least as good as that of $\mathbf{\Phi}_x^\star, \mathbf{\Phi}_u^\star$. We can thus conclude that there always exists an optimal simultaneously diagonalizable system response to the LQR problem (11). The controller $\mathbf{K} = \tilde{\mathbf{\Phi}}_u \tilde{\mathbf{\Phi}}_x^{-1}$ is thus optimal and simultaneously diagonalizable by $V$.
-B GSS and Controllers Satisfy Quadratic Invariance
Here we show that optimal control problems over graph symmetric systems and controllers satisfy quadratic invariance \citep{rotkowitz2005characterization}. Before proceeding, we remark that the analysis of the LQR optimal control problem over GSSs does not require quadratic invariance. In particular, in Theorem 2 we analyze the unconstrained optimal control problem and show that the resulting unconstrained optimal controller satisfies a corresponding notion of graph symmetry. However, in the interest of completeness, we show that if such a constraint were imposed on the controller during synthesis, the resulting problem would satisfy quadratic invariance.
To that end, the corresponding constrained controller synthesis problem can be stated as

$\min_{\mathbf{K}} \; J(\mathbf{K}) \quad \text{subject to} \quad \mathbf{K} \text{ internally stabilizing}, \; \mathbf{K} \in \mathcal{S},$

where $\mathcal{S} := \{ \mathbf{K} : V^\top \mathbf{K} V \text{ is diagonal} \}$ is the set of graph symmetric controllers. Denoting the plant input-output transfer function as

$\mathbf{P} = (zI - A)^{-1} B,$

we have the following proposition.
Proposition 7.
The set $\mathcal{S}$ of graph symmetric controllers is quadratically invariant under $\mathbf{P}$ if system (2) is graph symmetric.
Proof.
The proof follows directly from the definition of quadratic invariance [rotkowitz2005characterization, Def. 2] by straightforward calculation. First, we note that

$\mathbf{P} = (zI - A)^{-1} B = V (zI - \Lambda_A)^{-1} \Lambda_B V^\top$

is diagonalizable by $V$. For any controller $\mathbf{K} \in \mathcal{S}$, it then follows immediately that

$\mathbf{K}\, \mathbf{P}\, \mathbf{K} \in \mathcal{S},$

as the product of simultaneously diagonalizable matrices is also simultaneously diagonalizable, proving the claim. ∎