This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Time-invariant prefix-free source coding for MIMO LQG control

Travis C. Cuvelier, Takashi Tanaka, and Robert W. Heath, Jr This work was supported in part by the National Science Foundation under Grant No. ECCS-1711702 and and CAREER Award #1944318. T. Cuvelier is with the Department of Electrical and Computer Engineering, The University of Texas at Austin, TX, 78712 USA e-mail: [email protected]. Tanaka is with the Department of Aerospace Engineering and Engineering Mechanics, The University of Texas at Austin, TX, 78712 USA e-mail: [email protected]. Heath is with the Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27606 USA e-mail: [email protected].
Abstract

In this work we consider discrete-time multiple-input multiple-output (MIMO) linear-quadratic-Gaussian (LQG) control where the feedback consists of variable length binary codewords. To simplify the decoder architecture, we enforce a strict prefix constraint on the codewords. We develop a data compression architecture that provably achieves a near minimum time-average expected bitrate for a fixed constraint on the LQG performance. The architecture conforms to the strict prefix constraint and does not require time-varying lossless source coding, in contrast to the prior art.

I Introduction

In this work, we consider discrete-time MIMO LQG control where feedback occurs via variable-length packets (codewords) of bits that we assume are conveyed reliably (without error) from a sensor/encoder to a decoder/controller. For some constraint on LQG control performance, we seek to design a sensor/encoder and decoder/controller that minimize the time-averaged expected bitrate of the binary channel.

The motivation for this formulation is communication-efficient remote control over wireless communications. In particular, we imagine a scenario where a remote sensor platform measures some dynamical system and conveys the measurements to the controller over a noiseless binary channel. At the physical and medium access control layers, reliable control of autonomous agents over wireless has motivated the development of ultra low latency reliable communication (ULLRC). While progress has been made on the development ULLRC, the fundamental scarcity of physical layer resources motivates approaches that minimize the use of these resources. In contrast to the work on ULLRC, we aim higher on the protocol stack; namely we propose to develop control and data compression (source coding) strategies to achieve satisfactory control performance with minimal communication bitrate. The minimum average length, in bits, of the binary packets the sensor encodes are an effective surrogate for the minimum amount of physical layer resources (time/bandwidth/power) required to achieve satisfactory performance for a fixed reliability (cf. e.g. [1, Section 2.6]). We impose a prefix constraint on the codewords that ensures that other users sharing the same communication medium can identify the end of each transmitted message and thus identify the channel as free-to-use. In contrast to the prior art, this work imposes a time-invariant (TI) prefix constraint on the codewords, which simplifies the decoding architecture architecture; the end of codewords can be detected via comparison with a TI list [2].

In the prior literature, achievability results for the problem of interest follow from asymptotically bounding the output entropy of a quantizer [3][4][5]. While these bounds can nearly be achieved with zero-delay lossless source coding, they generally require that the lossless code be adapted, at every timestep, to the probability mass function of the quantizer output (cf. [6, Section IV.A]). This adds a great deal of computational complexity to the encoder and decoder.

In this work, we propose a data compression architecture for MIMO plants based on [4] and [6] that attains the desired LQG performance, satisfies a TI prefix constraint, and does not require a time-varying lossless source codec. Starting from the architecture of [4], we prove that the quantizer output has a limiting distribution. We propose to losslessly encode the quantizations with a fixed prefix-free code adapted the limiting distribution. We prove that under this modification, the codewords satisfy a TI prefix constraint without an increase in codeword length and without the complexity of time-varying lossless source coding. The architecture nearly achieves a known lower bound on the minimum expected bitrate. In particular, we use results from systems theory to extend the scalar TI achievability approach in [6] to the MIMO setting.

I-A Related Work

Our work follows from the architecture for LQG control with minimum bitrate prefix-free variable-length coding in [3]. For scalar systems, [3] derived a lower bound on the bitrate of a prefix-free source codec inserted into an LQG feedback channel in terms of Massey’s directed information (DI) [7]. This motivated a rate-distortion problem for the tradeoff between DI and LQG control performance. Using entropy-coded dithered quantization (ECDQ), [3] proved that the DI lower bound was nearly achievable. In ECDQ, uses a shared sequence of IID uniform random variables to whiten the error induced by quantization process. The quantization are then encoded via an entropy source code (e.g. a Shannon-Fano-Elias code). The rate-distortion formulation was generalized to MIMO plants in [8]. Likewise, the achievability approach was generalized in [4]. New analytical lower bounds (for MIMO systems) for the bitrate/control performance tradeoff were derived in [5]. It was further demonstrated that such bounds were nearly achievable without the use of a dither signal in the high communication rate/strict control cost regime [5]. The achievablilty approaches in [3], [4], and [5] implicitly assume that quantizations are encoded using a time-varying lossless source code [6]. More recently, [9] and [2] refined the lower bound analysis in [3], clarifying that the bound still applies even when the encoder and decoder share a dither signal.

The use of time-varying source codes complicates compression architectures; generally speaking, the codebooks used to encode quantizations must be updated at every timestep. Furthermore, in network settings, a source code’s prefix properties may be used by receivers to detect the end of transmissions. If a TI prefix code is used, the set of prefixes against which received codewords are compared is constant over time [2]. In the time-varying case, there are no such guarantees. In [6] it was shown that when a dither signal is available, there exists a TI quantization and coding scheme for scalar plants that nearly achieves the DI lower bound. In this work, we generalize this to the MIMO setting; namely we prove the existence of a quantizer and TI source codec for MIMO plants that nearly achieves the directed information lower bound [8]. We believe this to be the first such “time-invariant” achievability result for MIMO plants in the literature.

Work on fixed bitrate and event-driven communication strategies for control are also relevant to this work. In [10], numerical experiments demonstrated that Lloyd-Max style quantization could be used to attain a control performance competitive with known variable-rate upper bounds. However, the quantizer in [10] is time-varying, and many experimental realizations had time-average control costs that greatly exceeded the variable-length bounds [10]. More recently, [11] have proposed the the mean time to quantizer saturation as a metric for the analysis of fixed-length coding in LQG feedback control. Periodic coding strategies were devised, and their performances and escaped times were analyzed theoretically and with an experiment [11]. More recently [12] considered tracking a scalar Markov source under an event-based communication paradigm. Under this formulation, an encoder monitoring a plant can send binary codewords (without prefix constraints) at arbitrary (continuous-time) instants to a synchronized decoder [12]. For a constraint on estimator error, the minimum bitrate communication policy was deduced. In this work, as in [4], [5], [10], we consider a sampled discrete-time control and communication model. While event-driven schemes like [12] may be able to achieve smaller bitrates than these works, the models for control and communication in [4], [5], [10] are more amenable to real-world sampled data systems with finite sensing and communication bandwidths.

I-B Notation

We denote constant scalars and vectors in lowercase xx, scalar and vector random variables in boldface 𝒙\boldsymbol{{x}}, and matrices by capital letters XX. We let [x]i[{x}]_{i} denote the ithi^{\mathrm{th}} element of x{x}, ImI_{m} denote the m×m\mathbb{R}^{m\times m} identity matrix, 0m×m0_{m\times m} the respective zero matrix. Let X2\lVert X\rVert_{2} denote the largest singular value of XX. We write P(S)D for “symmetric positive (semi)definite”, and let 𝕊+m\mathbb{S}^{m}_{+} denote the set of m×mm\times m PSD matrices. We let \succ, \succeq denote the standard partial order on the PSD cone, e.g. if A,BmA,B\in\mathbb{R}^{m}, we write ABA\succ B if ABA-B is PD, likewise ABA\succeq B if ABA-B is PSD. For time domain sequences, let {𝒙t}\{\boldsymbol{{x}}_{t}\} denote (𝒙0,𝒙1,)(\boldsymbol{{x}}_{0},\boldsymbol{{x}}_{1},\dots), 𝒙ab\boldsymbol{{x}}_{a}^{b} denote (𝒙a,,𝒙b)(\boldsymbol{{x}}_{a},\dots,\boldsymbol{{x}}_{b}) if bab\geq a, 𝒙ab=\boldsymbol{{x}}_{a}^{b}=\emptyset otherwise. Let 𝒙b=𝒙0b\boldsymbol{{x}}^{b}=\boldsymbol{{x}}_{0}^{b}. Denote the set of finite-length binary strings by {0,1}\{0,1\}^{*}. For a{0,1}a\in\{0,1\}^{*}, let (a)\ell(a) be the length of aa in bits. Let Δm\Delta\mathbb{Z}^{m} denote the set of mm-tuples whose elements are integer multiples of Δ\Delta. Define the set m(Δ)={xm:xΔ2}{\mathcal{B}}^{m}(\Delta)=\{{x}\in\mathbb{R}^{m}:\lVert{x}\rVert_{\infty}\leq\frac{\Delta}{2}\}. For a set 𝒮\mathcal{S}, define the indicator function of x𝒮{x}\in\mathcal{S} as 1x𝒮1_{{x}\in\mathcal{S}}. For a topological space 𝕋\mathbb{T}, let 𝔹(𝕋)\mathbb{B}(\mathbb{T}) denote the standard Borel σ\sigma-algebra of 𝕋\mathbb{T}. For Euclidean spaces, let λ\lambda denote the Lebesgue measure. We use PMF for “probability mass function”, PDF for “probability density function” (with respect to λ\lambda), 𝒂𝒃\boldsymbol{{a}}\perp\boldsymbol{{b}} for “𝒂\boldsymbol{{a}} and 𝒃\boldsymbol{{b}} are independent”, and HH for entropy.

II System Model and Problem Formulation

We consider the system model depicted in Fig. 1. The system to be controlled is a TI, multidimensional, linear dynamical system (i.e., a MIMO plant) plant controlled via a feedback model where communication occurs over an ideal (delay and error free) binary channel. The plant is fully observable to an encoder/sensor block, which conveys a variable-length binary codeword 𝒂t{0,1}\boldsymbol{{a}}_{t}\in\{0,1\}^{*} over the channel to a combined decoder/controller. Upon receipt of the codeword, the decoder/controller designs the control input. Denote the state vector 𝒙tm\boldsymbol{{x}}_{t}\in\mathbb{R}^{m}, the control input 𝒖tu\boldsymbol{{u}}_{t}\in\mathbb{R}^{u}, and let 𝒘t𝒩(0,W)\boldsymbol{{w}}_{t}\sim\mathcal{N}({0},{W}) denote processes noise assumed to be IID over time. We assume W0m×m{W}\succ{0}_{m\times m}, i.e., the process noise covariance is full rank. We assume assume that 𝒙0𝒩(0,X0)\boldsymbol{{x}}_{0}\sim\mathcal{N}({0},{X}_{0}) for some X00X_{0}\succeq 0. For Am×m{A}\in\mathbb{R}^{m\times m} the system matrix and Bm×u{B}\in\mathbb{R}^{m\times u} the feedback gain matrix, for t0t\geq 0 the plant dynamics are given by 𝒙t+1=A𝒙t+B𝒖t+𝒘t\boldsymbol{{x}}_{t+1}={A}\boldsymbol{{x}}_{t}+{B}\boldsymbol{{u}}_{t}+\boldsymbol{{w}}_{t}. To ensure finite control cost is attainable, we assume (A,B)({A},{B}) are stabilizable. We assume that the encoder and decoder share access to a random uniform dither signal {𝒅t}\{\boldsymbol{{d}}_{t}\}. We assume that, for some Δ>0\Delta>0 to be specified, the random vectors 𝒅tm\boldsymbol{{d}}_{t}\in\mathbb{R}^{m} have components that are independently uniformly distributed on [Δ/2,Δ/2][-\Delta/2,\Delta/2], and that the sequence {𝒅t}\{\boldsymbol{{d}}_{t}\} is IID over time. We assume that {𝒘t}\{\boldsymbol{{w}}_{t}\}, {𝒅t}\{\boldsymbol{{d}}_{t}\}, and 𝒙0\boldsymbol{{x}}_{0} are mutually independent. In real-world systems, this shared randomness can be effectively accomplished using synchronized pseudorandom number generators. We assume that the encoder/sensor and the decoder/controller may be randomized given their inputs. The encoder/sensor policy in Fig. 1 is a sequence of causally conditioned Borel measurable kernels denoted E[𝒂0||𝒅0,𝒙0]={E[𝒂t|𝒂0t1,𝒅0t,𝒙0t]}t\mathbb{P}_{\mathrm{E}}[\boldsymbol{{a}}_{0}^{\infty}||\boldsymbol{{d}}_{0}^{\infty},\boldsymbol{{x}}_{0}^{\infty}]=\left\{\mathbb{P}_{\mathrm{E}}[\boldsymbol{{a}}_{t}|\boldsymbol{{a}}_{0}^{t-1},\boldsymbol{{d}}_{0}^{t},\boldsymbol{{x}}_{0}^{t}]\right\}_{t}, and that corresponding decoder/controller policy is given by C[𝒖||𝒂,𝒅]={C[𝒖t|𝒂t,𝒅t,𝒖t1]}t\mathbb{P}_{\mathrm{C}}[\boldsymbol{{u}}^{\infty}||\boldsymbol{{a}}^{\infty},\boldsymbol{{d}}^{\infty}]=\left\{\mathbb{P}_{\mathrm{C}}[\boldsymbol{{u}}_{t}|\boldsymbol{{a}}^{t},\boldsymbol{{d}}^{t},\boldsymbol{{u}}^{t-1}]\right\}_{t}. Note that under the assumed dynamics, 𝒙t\boldsymbol{{x}}^{t} is a deterministic function of 𝒙\boldsymbol{{x}}, 𝒂t1\boldsymbol{{a}}^{t-1}, 𝒖t1\boldsymbol{{u}}^{t-1}, and 𝒘t1\boldsymbol{{w}}^{t-1}. We encode conditional independence assumptions in the system model by a factorization of the one-step transition kernels for 𝒂t\boldsymbol{{a}}_{t}, 𝒅t\boldsymbol{{d}}_{t}, 𝒖t\boldsymbol{{u}}_{t}, and 𝒘t\boldsymbol{{w}}_{t}. For t0t\geq 0, we assume the kernels factorize via

[𝒂t+1,𝒖t+1|𝒂t,𝒅t+1,𝒖t,𝒘t,𝒙0]=E[𝒂t+1|𝒂t,𝒅t+1,𝒙t+1]C[𝒖t+1|𝒂t+1,𝒅t+1,𝒖t],\mathbb{P}[\boldsymbol{{a}}_{t+1},\boldsymbol{{u}}_{t+1}|\boldsymbol{{a}}^{t},\boldsymbol{{d}}^{t+1},\boldsymbol{{u}}^{t},\boldsymbol{{w}}^{t},\boldsymbol{{x}}_{0}]=\\ \mathbb{P}_{\mathrm{E}}[\boldsymbol{{a}}_{t+1}|\boldsymbol{{a}}^{t},\boldsymbol{{d}}^{t+1},\boldsymbol{{x}}^{t+1}]\mathbb{P}_{\mathrm{C}}[\boldsymbol{{u}}_{t+1}|\boldsymbol{{a}}^{t+1},\boldsymbol{{d}}^{t+1},\boldsymbol{{u}}^{t}], (1a)
[𝒂t+1,𝒅t+1,𝒖t+1,𝒘t+1|𝒂t,𝒅t,𝒖t,𝒘t,𝒙0]=[𝒂t+1,𝒖t+1|𝒂t,𝒅t,𝒖t,𝒘t,𝒙0][𝒅t+1][𝒘t+1],\mathbb{P}[\boldsymbol{{a}}_{t+1},\boldsymbol{{d}}_{t+1},\boldsymbol{{u}}_{t+1},\boldsymbol{{w}}_{t+1}|\boldsymbol{{a}}^{t},\boldsymbol{{d}}^{t},\boldsymbol{{u}}^{t},\boldsymbol{{w}}^{t},\boldsymbol{{x}}_{0}]=\\ \mathbb{P}[\boldsymbol{{a}}_{t+1},\boldsymbol{{u}}_{t+1}|\boldsymbol{{a}}^{t},\boldsymbol{{d}}^{t},\boldsymbol{{u}}^{t},\boldsymbol{{w}}^{t},\boldsymbol{{x}}_{0}]\mathbb{P}[\boldsymbol{{d}}_{t+1}]\mathbb{P}[\boldsymbol{{w}}_{t+1}], (1b)

and that we have initially [𝒂0,𝒅0,𝒖,𝒘0|𝒙0]\mathbb{P}[\boldsymbol{{a}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{u}},\boldsymbol{{w}}_{0}|\boldsymbol{{x}}_{0}] == [𝒘0][𝒅0]E[𝒂0|𝒙0,𝒅0]\mathbb{P}[\boldsymbol{{w}}_{0}]\mathbb{P}[\boldsymbol{{d}}_{0}]\mathbb{P}_{\mathrm{E}}[\boldsymbol{{a}}_{0}|\boldsymbol{{x}}_{0},\boldsymbol{{d}}_{0}] C[𝒖0|𝒂0,𝒅0]\mathbb{P}_{\mathrm{C}}[\boldsymbol{{u}}_{0}|\boldsymbol{{a}}_{0},\boldsymbol{{d}}_{0}]. Implications of these factorization are discussed in the caption of Fig. 1.

Refer to caption
Figure 1: The encoder can select the codeword 𝒂t\boldsymbol{{a}}_{t} randomly given “the information it knows at time tt”. When 𝒂t\boldsymbol{{a}}_{t} arrives at the decoder, the decoder can randomly generate the control 𝒖t\boldsymbol{{u}}_{t} given 𝒂t\boldsymbol{{a}}_{t} as well as its own prior knowledge. The encoder and decoder share access to 𝒅t\boldsymbol{{d}}_{t}, which is IID and generated “independently” of all past system variables.

We require the codewords to conform to a TI prefix constraint, namely that for all i,ji,j and distinct a1,a2{0,1}a_{1},a_{2}\in\{0,1\}^{*} with both [𝒂i=a1]>0\mathbb{P}[\boldsymbol{{a}}_{i}=a_{1}]>0 and [𝒂j=a2]>0\mathbb{P}[\boldsymbol{{a}}_{j}=a_{2}]>0, a1a_{1} is not a prefix of a2a_{2} and vice-versa. Thus, not only must the support of the random variable 𝒂i\boldsymbol{{a}}_{i} be a set of prefix-free binary codewords, we also require the union of the supports of the {𝒂i}\{\boldsymbol{{a}}_{i}\} be prefix-free. This is a stronger prefix constraint than was considered in [4]. The TI constraint may enable for more computationally efficient communication resource sharing; the end of a codeword can be unambiguously identified via comparing received messages against the TI union of supports. We are interested in the following optimization, over policies E,C\mathbb{P}_{\mathrm{E}},\mathbb{P}_{\mathrm{C}} that conform to the prefix constraint:

(γ)={infE,C 1Tt=0T1𝔼[(𝒂t)]s.t. 1Tt=0T1𝔼[𝒙t+1Q2+𝒖tΦ2]γ,\mathcal{L}(\gamma)=\left\{\begin{aligned} &\underset{\mathbb{P}_{\mathrm{E}},\mathbb{P}_{\mathrm{C}}}{\inf}\text{ }\frac{1}{T}\sum\nolimits_{t=0}^{T-1}\mathbb{E}[\ell(\boldsymbol{{a}}_{t})]\\ &\text{s.t. }\frac{1}{T}\sum\nolimits_{t=0}^{T-1}\mathbb{E}[\lVert\boldsymbol{{x}}_{t+1}\rVert_{{Q}}^{2}+\lVert\boldsymbol{{u}}_{t}\rVert_{{\Phi}}^{2}]\leq\gamma,\end{aligned}\right. (2)

where Q0{Q}\succeq{0}, Φ0{\Phi}\succ{0}, and γ\gamma is the maximum tolerable LQG cost. Let SS be a stabilizing solution to the discrete algebraic Riccati equation (DARE) ATSASATSB(BTSB+Φ)1BTSA+Q=0A^{\mathrm{T}}SA-S-A^{\mathrm{T}}SB(B^{\mathrm{T}}SB+\Phi)^{-1}B^{\mathrm{T}}SA+Q=0, K=(BTSB+Φ)1BTSAK=-(B^{\mathrm{T}}SB+\Phi)^{-1}B^{\mathrm{T}}SA, and Θ=KT(BTSB+Φ)K\Theta=K^{\mathrm{T}}(B^{\mathrm{T}}SB+\Phi)K. Consider the optimization

(γ)\displaystyle\mathcal{R}(\gamma) =\displaystyle= {infP,Π,m×mP,Π012(log2detΠ+log2detW) s.t. Tr(ΘP)+Tr(WS)γ PAPAT+W   [PΠPATAPAPAT+W]0\displaystyle\left\{\begin{aligned} &\underset{\begin{subarray}{c}P,\Pi,\in\mathbb{R}^{m\times m}\\ P,\Pi\succeq 0\end{subarray}}{\inf}\frac{1}{2}(-\log_{2}{\det{\Pi}}+\log_{2}{\det{W}})\\ &\text{ }\text{s.t. }\mathrm{Tr}(\Theta P)+\mathrm{Tr}(WS)\leq\gamma\text{, }\\ &\text{ }P\preceq APA^{\mathrm{T}}+W\text{, }\\ &\text{ }\text{ }\text{ }\begin{bmatrix}P-\Pi&PA^{\mathrm{T}}\\ AP&APA^{\mathrm{T}}+W\end{bmatrix}\succeq 0\end{aligned}\right. (3)

A consequence of [2] (cf. [6] and [8]) is the lower bound (γ)(γ)\mathcal{R}(\gamma)\leq\mathcal{L}(\gamma). In the sequel, we will analyze the bitrate of the proposed achievability scheme in terms of (γ)\mathcal{R}(\gamma).

III Achievability Architecture

Variable Description/Key Equations
P^𝕊+m\hat{P}\in\mathbb{S}_{+}^{m} Minimizing PP of (3), asymptotic KF estimator error
P^+𝕊+m\hat{P}_{+}\in\mathbb{S}_{+}^{m} P^+=AP^AT+W\hat{P}_{+}=A\hat{P}A^{\mathrm{T}}+W, asymptotic KF prediction error
Cm×mC\in\mathbb{R}^{m\times m} Measurement matrix, P^1P^+1=CTC12Δ2\hat{P}^{-1}-\hat{P}_{+}^{-1}=C^{\mathrm{T}}C\frac{12}{\Delta^{2}}
Δ\Delta\in\mathbb{R} Quantizer/dither sensitivity, P^1P^+1=CTC12Δ2\hat{P}^{-1}-\hat{P}_{+}^{-1}=C^{\mathrm{T}}C\frac{12}{\Delta^{2}}
Ku×mK\in\mathbb{R}^{u\times m} Certainty equivalent feedback control gain
𝒙¯t|t1m\overline{\boldsymbol{{x}}}_{t|t-1}\in\mathbb{R}^{m} TI KF prediction of 𝐱t\mathbf{x}_{t}
𝒙¯t|tm\overline{\boldsymbol{{x}}}_{t|t}\in\mathbb{R}^{m} TI KF estimate of 𝐱t\mathbf{x}_{t}. 𝐮t=K𝒙¯t|t\mathbf{u}_{t}=K\overline{\boldsymbol{{x}}}_{t|t}
𝒆¯tm\overline{\boldsymbol{{e}}}_{t}\in\mathbb{R}^{m} KF predict error 𝒆¯t=𝒙¯t|t1𝐱t\overline{\boldsymbol{{e}}}_{t}=\overline{\boldsymbol{{x}}}_{t|t-1}-\mathbf{x}_{t}, 𝔼[𝒆¯t𝒆¯tT]P^+\mathbb{E}[\overline{\boldsymbol{{e}}}_{t}\overline{\boldsymbol{{e}}}_{t}^{\mathrm{T}}]\rightarrow\hat{P}_{+}
𝒅tm\boldsymbol{{d}}_{t}\in\mathbb{R}^{m} Dither sequence, IID elements Uniform(Δ)\sim\text{Uniform}(\Delta)
𝒒tΔm\boldsymbol{{q}}_{t}\in\Delta\mathbb{Z}^{m} Quantization 𝒒t=QΔ(C𝒆t+𝒅t)\boldsymbol{{q}}_{t}=Q_{\Delta}(C\boldsymbol{{e}}_{t}+\boldsymbol{{d}}_{t}), encoded into 𝒂t\boldsymbol{{a}}_{t}
𝒒~tm\boldsymbol{{\tilde{q}}}_{t}\in\mathbb{R}^{m} Reconstruction 𝒒~t=𝒒t𝒅t\boldsymbol{{\tilde{q}}}_{t}=\boldsymbol{{q}}_{t}-\boldsymbol{{d}}_{t}, error 𝒗t=𝒒~tC𝐞t\boldsymbol{{v}}_{t}=\boldsymbol{{\tilde{q}}}_{t}-C\mathbf{e}_{t}
𝒗tm\boldsymbol{{v}}_{t}\in\mathbb{R}^{m} 𝒗t𝐞t\boldsymbol{{v}}_{t}\perp\mathbf{e}_{t}, 𝒗t𝐱t\boldsymbol{{v}}_{t}\perp\mathbf{x}_{t}, 𝔼[𝒗t𝒗t+kT]=ImΔ212𝟙k=0\mathbb{E}[\boldsymbol{{v}}_{t}\boldsymbol{{v}}_{t+k}^{\mathrm{T}}]=I_{m}\frac{\Delta^{2}}{12}\mathbb{1}_{k=0}
𝒚tm\boldsymbol{{y}}_{t}\in\mathbb{R}^{m} Effective measurement at decoder 𝒚t=C𝐱t+𝐯t\boldsymbol{{y}}_{t}=C\mathbf{x}_{t}+\mathbf{v}_{t}
TABLE I:

The achievability approach we proposed is demonstrated in Fig. 2. It is almost identically to the time-varying MIMO achievability approach in [6, Section IV.B] with the exception that the lossless Shannon-Fano-Elias source codec used to encode the quantizations is TI. We first describe the signals in Fig. 2 before summarizing relevant results from [6] that simplify the analysis. Let P^\hat{P} denote the minimizing PP from (3), let P^+=AP^AT+W\hat{P}_{+}=A\hat{P}A^{\mathrm{T}}+W, and let Δ\Delta\in\mathbb{R}, Δ>0\Delta>0, and Cm×mC\in\mathbb{R}^{m\times m} be such that P^1P^+1=CTC12Δ2\hat{P}^{-1}-\hat{P}_{+}^{-1}=C^{\mathrm{T}}C\frac{12}{\Delta^{2}}. In this way, P^\hat{P} (P^+\hat{P}_{+} ) is interpreted as the asymptotic (prediction) error covariance attained by a Kalman filter tracking the source process {𝒙t}\{\boldsymbol{{x}}_{t}\} under a measurement model 𝒚t=C𝒙t+𝒗t\boldsymbol{{y}}_{t}=C\boldsymbol{{x}}_{t}+\boldsymbol{{v}}_{t}, where 𝔼[𝒗t𝒗t+kT]=ImΔ212𝟙k=0\mathbb{E}[\boldsymbol{{v}}_{t}\boldsymbol{{v}}_{t+k}^{\mathrm{T}}]=I_{m}\frac{\Delta^{2}}{12}\mathbb{1}_{k=0}, 𝔼[𝒗t𝒙0T]=𝔼[𝒗t𝒘t+kT]=0m×m\mathbb{E}[\boldsymbol{{v}}_{t}\boldsymbol{{x}}_{0}^{\mathrm{T}}]=\mathbb{E}[\boldsymbol{{v}}_{t}\boldsymbol{{w}}_{t+k}^{\mathrm{T}}]=0_{m\times m}. We will show that this measurement model is attained by the decoder in Fig. 2 using dithered quantization.

Refer to caption
Figure 2: The achievability architecture. Descriptions of the variables may be found in Table I.

Define a uniform elementwise quantizer via QΔ:mΔmQ_{\Delta}\colon\mathbb{R}^{m}\rightarrow\Delta\mathbb{Z}^{m} via [QΔ(x)]i=kΔ, if [x]i[kΔΔ/2,kΔ+Δ/2),[Q_{\Delta}({x})]_{i}=k\Delta,\text{ if }[{x}]_{i}\in[k\Delta-\Delta/2,k\Delta+\Delta/2), e.g. each element of the vector x{x} is rounded to the nearest multiple of Δ\Delta. Both the encoder and decoder in Fig. 2 operate identical time invariant Kalman filters that compute recursive estimates of 𝒙t\boldsymbol{{x}}_{t} given measurements received by the decoder. Let 𝒙¯t|t1\overline{\boldsymbol{{x}}}_{t|t-1} denote the prior estimate at time tt and 𝒙¯t|t\boldsymbol{{\overline{x}}}_{t|t} denote the posterior. Set 𝒙¯0|1=0\boldsymbol{{\overline{x}}}_{0|-1}=0. At every time tt the encoder and decoder produces the Kalman filter prediction error

𝒆t=𝒙t𝒙¯t|t1,\displaystyle\boldsymbol{{e}}_{t}=\boldsymbol{{x}}_{t}-\overline{\boldsymbol{{x}}}_{t|t-1}, (4)

and the linear measurement innovation 𝒆t\boldsymbol{{e}}_{t}. It then produces the discrete, dithered quantization qt=QΔ(C𝒆t+𝒅t)q_{t}=Q_{\Delta}(C\boldsymbol{{e}}_{t}+\boldsymbol{{d}}_{t}). It encodes the discrete quantization losslessly into the codeword 𝒂t\boldsymbol{{a}}_{t} which it transmits to the decoder. Upon receiving 𝒂t\boldsymbol{{a}}_{t}, the decoder can recover 𝒒t\boldsymbol{{q}}_{t} exactly. It then produces the reconstruction 𝒒~t=𝒒t𝒅t\boldsymbol{{\tilde{q}}}_{t}=\boldsymbol{{q}}_{t}-\boldsymbol{{d}}_{t}. Let 𝒗t=𝒒~tC𝒆t\boldsymbol{{v}}_{t}=\boldsymbol{{\tilde{q}}}_{t}-C\boldsymbol{{e}}_{t} denote the reconstruction error. It then produces the centered measurement 𝒚t=𝒒~t+C𝒙t|t1\boldsymbol{{y}}_{t}=\boldsymbol{{\tilde{q}}}_{t}+C\boldsymbol{{x}}_{t|t-1} (equivalently 𝒚t=C𝒙t+𝒗t\boldsymbol{{y}}_{t}=C\boldsymbol{{x}}_{t}+\boldsymbol{{v}}_{t}). By [4, Lemma 1], the elements [𝒗t]i[\boldsymbol{{v}}_{t}]_{i} are mutually independent and uniform on [Δ/2,Δ/2][-\Delta/2,\Delta/2] and furthermore 𝒗t𝒙t\boldsymbol{{v}}_{t}\perp\boldsymbol{{x}}_{t}. Denote the TI Kalman filter gain J=P^+CT(CP^+CT+Im×mΔ212)1J=\hat{P}_{+}{C}^{\mathrm{T}}({C}\hat{P}_{+}{C}^{\mathrm{T}}+I_{m\times m}\frac{\Delta^{2}}{12})^{-1}. The decoder updates its estimate via

𝒙¯t|t=𝒙¯t|t1+J(𝒚tC𝒙¯t|t1),\displaystyle\boldsymbol{{\overline{x}}}_{t|t}=\boldsymbol{{\overline{x}}}_{t|t-1}+J(\boldsymbol{{y}}_{t}-C\boldsymbol{{\overline{x}}}_{t|t-1}), (5)

and applies the certainty equivalent control input 𝒖t=K𝒙¯t|t\boldsymbol{{u}}_{t}=K\boldsymbol{{\overline{x}}}_{t|t}. It then computes the predict estimate 𝒙¯t|t1=A𝒙¯t1|t1+B𝒖t1\boldsymbol{{\overline{x}}}_{t|t-1}={A}\boldsymbol{{\overline{x}}}_{t-1|t-1}+{B}\boldsymbol{{u}}_{t-1}. Since the encoder knows the quantizations {𝒒t}\{\boldsymbol{{q}}_{t}\} and the dither {𝒅t}\{\boldsymbol{{d}}_{t}\}, it can recover the sequence of measurements {𝒚t}\{\boldsymbol{{y}}_{t}\}. The encoder can thus compute the sequence of TI Kalman filter estimates {𝒙¯t|t1,𝒙¯t|t}\{\boldsymbol{{\overline{x}}}_{t|t-1},\boldsymbol{{\overline{x}}}_{t|t}\}. While we have yet to define how 𝒒t\boldsymbol{{q}}_{t} is encoded into 𝒂t\boldsymbol{{a}}_{t}, we have the following.

Lemma 1

So long as the decoder recovers 𝐪t\boldsymbol{{q}}_{t} exactly at every tt, we have that 1) 𝐯t𝐞t,𝐯t1𝐰t,𝐱0\boldsymbol{{v}}_{t}\perp\boldsymbol{{e}}^{t},\boldsymbol{{v}}^{t-1}\boldsymbol{{w}}^{t},\boldsymbol{{x}}_{0}, 2) limtH(𝐪)t(γ)+m(1+12log2(2πe12))\lim_{t\rightarrow\infty}H(\boldsymbol{{q}})_{t}\leq\mathcal{R}(\gamma)+m\left(1+\frac{1}{2}\log_{2}\left(\frac{2\pi e}{12}\right)\right), 3) limTi=0T𝔼[𝐱t22+𝐮t22]γ\lim_{T\rightarrow\infty}\sum_{i=0}^{T}\mathbb{E}[\lVert\boldsymbol{{x}}_{t}\rVert_{2}^{2}+\lVert\boldsymbol{{u}}_{t}\rVert_{2}^{2}]\leq\gamma.

The first statement follows from [4, Lemma 1] and the system model and the subsequent are via [6, Theorem IV.5].

The TI prefix constraint can be satisfied by encoding each 𝒒t\boldsymbol{{q}}_{t} with the same lossless prefix code at all time. Let 𝒒\boldsymbol{{q}} be a discrete random variable on Δm\Delta\mathbb{Z}^{m} (the same alphabet as 𝒒t\boldsymbol{{q}}_{t}). Assume the support of 𝒒\boldsymbol{{q}} is such that [𝒒=k]=0\mathbb{P}[\boldsymbol{{q}}=k]=0 only if [𝒒t=k]=0\mathbb{P}[\boldsymbol{{q}}_{t}=k]=0. Define F¯𝒒(q)=[𝒒<q]+[𝒒=q]2\overline{F}_{\boldsymbol{{q}}}(q)=\mathbb{P}[\boldsymbol{{q}}<q]+\frac{\mathbb{P}[\boldsymbol{{q}}=q]}{2}, and define the encoding function C𝒒F(q):Δm{0,1}C^{\mathrm{F}}_{\boldsymbol{{q}}}(q):\Delta\mathbb{Z}^{m}\rightarrow\{0,1\}^{*} such that C𝒒F(q)C^{\mathrm{F}}_{\boldsymbol{{q}}}(q) is the binary expansion of F¯𝒒(q)\overline{F}_{\boldsymbol{{q}}}(q) truncated to log2(𝒒[q])+1\lceil-\log_{2}(\mathbb{P}_{\boldsymbol{{q}}}[q])\rceil+1 bits. This is a standard, lossless, prefix-free Shannon-Fano-Elias code for the random variable 𝒒\boldsymbol{{q}} [13]. If we use C𝒒FC^{\mathrm{F}}_{\boldsymbol{{q}}} to encode 𝒒t\boldsymbol{{q}}_{t} (e.g. 𝒂t=C𝒒F(𝒒t)\boldsymbol{{a}}_{t}=C^{\mathrm{F}}_{\boldsymbol{{q}}}(\boldsymbol{{q}}_{t})) the codeword length satisfies

𝔼[(𝒂t)]2+H(𝒒t)+DKL(𝒒t||𝒒).\displaystyle\mathbb{E}[\ell(\boldsymbol{{a}}_{t})]\leq 2+H(\boldsymbol{{q}}_{t})+D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}}). (6)

Note that if for some kk, [𝒒=k]=0\mathbb{P}[\boldsymbol{{q}}=k]=0 while [𝒒t=k]>0\mathbb{P}[\boldsymbol{{q}}_{t}=k]>0, then DKL(𝒒t||𝒒)=D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})=\infty. The main result of this paper is the following lemma, proved in Section IV.

Lemma 2

There exists a random variable 𝐪\boldsymbol{{q}} such that DKL(𝐪t||𝐪)D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}}) is finite for all tt and limtDKL(𝐪t||𝐪)=0\lim_{t\rightarrow\infty}D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})=0.

This lemma allows us to bound the bitrate for when a TI prefix-free code is used in Fig. 2.

Theorem 3

Let η=(1+12log2(2πe12))\eta=\left(1+\frac{1}{2}\log_{2}\left(\frac{2\pi e}{12}\right)\right). Setting 𝐚t=C𝐪F(𝐪t)\boldsymbol{{a}}_{t}=C^{\mathrm{F}}_{\boldsymbol{{q}}}(\boldsymbol{{q}}_{t}) for all tt ensures that the system in Fig. 2 attains a control cost less than γ\gamma, satisfies the prefix constraint in Section II, and attains a communication cost that satisfies

limsupT1Ti=0T1𝔼[(𝒂t)]\displaystyle\lim\sup_{T\rightarrow\infty}\frac{1}{T}\sum\limits_{i=0}^{T-1}\mathbb{E}[\ell(\boldsymbol{{a}}_{t})] \displaystyle\leq (γ)+2+mη\displaystyle\mathcal{R}(\gamma)+2+m\eta (7)
Proof:

C𝒒FC^{\mathrm{F}}_{\boldsymbol{{q}}} is a prefix free code fixed over all tt. Furthermore, since DKL(𝒒t||𝒒)D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}}) is finite for all tt, it is lossless on the support of {𝒒t}\boldsymbol{{q}}_{t}\}. The bound on codeword length (7) follows via applying (6) to each term on the left-hand side and then using the Cesáro mean (cf. Lemma 1). ∎

In Section IV, we use ergodic theory to prove Lemma 2.

IV Proof of Lemma 2

We proceed as in [6] by analyzing the long-term behavior of the Kalman innovation and dither signals, e.g. (𝒆t,𝒅t)(\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}). At every tt, the quantization error 𝒗t\boldsymbol{{v}}_{t} is a measurable function of (𝒆t,𝒅t)(\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}) given by 𝒗t=𝒒tC𝒆t\boldsymbol{{v}}_{t}=\boldsymbol{{q}}_{t}-{C}\boldsymbol{{e}}_{t}, or, equivalently, 𝒗t=QΔ(C𝒆t+𝒅t)𝒅tC𝒆t\boldsymbol{{v}}_{t}=Q_{\Delta}({C}\boldsymbol{{e}}_{t}+\boldsymbol{{d}}_{t})-\boldsymbol{{d}}_{t}-{C}\boldsymbol{{e}}_{t}. Let L=AJL=AJ and R=ALC{R}={A}-{L}{C}. The following recursion for the 𝒆t\boldsymbol{{e}}_{t} (cf. (4)) can be derived

𝒆t+1\displaystyle\boldsymbol{{e}}_{t+1} =\displaystyle= R𝒆tL𝒗t+𝒘t\displaystyle R\boldsymbol{{e}}_{t}-L\boldsymbol{{v}}_{t}+\boldsymbol{{w}}_{t} (8)
=\displaystyle= M(𝒆t,𝒅t)+𝒘t.\displaystyle M(\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t})+\boldsymbol{{w}}_{t}. (9)

By construction of LL, (ALC)(A-LC) is stable, e.g. its eigenvalues lie strictly within the complex unit circle [14]. Define the state space 𝔻m=m×m(Δ)\mathbb{D}^{m}=\mathbb{R}^{m}\times\mathcal{B}^{m}(\Delta). For r,μm{r},{\mu}\in\mathbb{R}^{m}, Σ𝕊+m×m\Sigma\in\mathbb{S}_{+}^{m\times m}, denote the PDF of a multivariate Gaussian random variable with mean μ{\mu} and covariance Σ\Sigma evaluated at r{r} by N(r;μ,Σ)N({r};{\mu},\Sigma), i.e. define the function N(r;μ,Σ):m×m×𝕊+mN({r};{\mu},\Sigma):\mathbb{R}^{m}\times\mathbb{R}^{m}\times\mathbb{S}_{+}^{m}\rightarrow\mathbb{R} via N(r;μ,Σ)=1(2π)mdetΣe12(rμ)TΣ1(rμ)N({r};{\mu},\Sigma)=\frac{1}{\sqrt{(2\pi)^{m}\det{\Sigma}}}e^{-\frac{1}{2}({r}-{\mu})^{\mathrm{T}}\Sigma^{-1}({r}-{\mu})}. Define M:(x,y)𝔻mmM\colon(x,y)\in\mathbb{D}^{m}\rightarrow\mathbb{R}^{m} via M(x,y)=RxL(QΔ(Cx+y)yCx)M({x},{y})=R{x}-L(Q_{\Delta}(C{x}+{y})-{y}-C{x}). By definition, the marginal PDF of the dither is f𝒅t+1(d)=1Δm1dm(Δ)f_{\boldsymbol{{d}}_{t+1}}({d})=\frac{1}{\Delta^{m}}1_{{d}\in\mathcal{B}^{m}(\Delta)} for all tt. Since 𝒘t(𝒆t,𝒅t)\boldsymbol{{w}}_{t}\perp(\boldsymbol{{e}}^{t},\boldsymbol{{d}}^{t}), and 𝒅t+1(𝒆t+1,𝒘t\boldsymbol{{d}}_{t+1}\perp(\boldsymbol{{e}}^{t+1},\boldsymbol{{w}}^{t}), via (8) we have that {𝒆t,𝒅t}\{\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}\} is a time-homogeneous first order Markov chain.

The PDFs of {𝒆t,𝒅t}\{\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}\} factorize via f𝒆t+1,𝒅t+1|𝒆t,𝒅t=f𝒅t+1f𝒆t+1|𝒆t,𝒅tf_{\boldsymbol{{e}}_{t+1},\boldsymbol{{d}}_{t+1}|\boldsymbol{{e}}^{t},\boldsymbol{{d}}^{t}}=f_{\boldsymbol{{d}}_{t+1}}f_{\boldsymbol{{e}}_{t+1}|\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}}. Via (8),

f𝒆t+1|𝒆t,𝒅t(e|ep,dp)\displaystyle f_{\boldsymbol{{e}}_{t+1}|\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}}({e}|{e}_{p},{d}_{p}) =\displaystyle= N(e;M(ep,dp),W)\displaystyle N({e};M({e}_{p},{d}_{p}),W) (10)
f𝒆t+1|𝒆t,𝒗t(e|ep,vp)\displaystyle f_{\boldsymbol{{e}}_{t+1}|\boldsymbol{{e}}_{t},\boldsymbol{{v}}_{t}}({e}|{e}_{p},{v}_{p}) =\displaystyle= N(e;RepLvp,W).\displaystyle N({e};R{e}_{p}-L{v}_{p},W). (11)

Let ft+1|t=f𝒆t+1,𝒅t+1|𝒆t,𝒅tf_{t+1|t}=f_{\boldsymbol{{e}}_{t+1},\boldsymbol{{d}}_{t+1}|\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}}, and likewise ft+n|t=f𝒆t+n,𝒅t+n|𝒆t,𝒅tf_{t+n|t}=f_{\boldsymbol{{e}}_{t+n},\boldsymbol{{d}}_{t+n}|\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}}. From the foregoing, we have

ft+1|t(e,d|ep,dp)=1dm(Δ)N(e;M(ep,dp),W)Δm\displaystyle f_{t+1|t}({e},{d}|{e}_{p},{d}_{p})=\frac{1_{{d}\in\mathcal{B}^{m}(\Delta)}N({e};M({e}_{p},{d}_{p}),W)}{\Delta^{m}} (12)

Applying the Chapman-Komogorov equations to (12), it can be seen that the nn-step transition kernels satisfy

ft+n|t(e,d|ep,dp)=f𝒆t+n|𝒆t,𝒅t(e|ep,dp)1dm(Δ)Δm.\displaystyle f_{t+n|t}({e},{d}|{e}_{p},{d}_{p})=f_{\boldsymbol{{e}}_{t+n}|\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}}({e}|{e}_{p},{d}_{p})\frac{1_{{d}\in\mathcal{B}^{m}(\Delta)}}{\Delta^{m}}. (13)

We now state the generalization of [6, Lemma A.3].

Lemma 4

Let Σn=i=0n1RiW(RT)i\Sigma_{n}=\sum\limits_{i=0}^{n-1}R^{i}W(R^{\mathrm{T}})^{i} and

μn(e0,d0,v1n1)=Rn1M(e0,d0)i=0n2RiLvn1i.\displaystyle{\mu}_{n}({e}_{0},{d}_{0},{v}_{1}^{n-1})=R^{n-1}M({e}_{0},{d}_{0})-\sum\limits_{i=0}^{n-2}R^{i}L{v}_{n-1-i}.

For all n1n\geq 1 we have

f𝒆n|𝒆0,𝒅0,𝒗1n1(en|e0,d0,v1n1)=N(en;μn(e0,d0,v1n1),Σn).f_{\boldsymbol{{e}}_{n}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{n-1}}({e}_{n}|{e}_{0},{d}_{0},{v}_{1}^{n-1})=\\ N({e}_{n};{\mu}_{n}({e}_{0},{d}_{0},{v}_{1}^{n-1}),\Sigma_{n}). (14)
Proof:

The proof follows by induction on nn. By (10), (14) holds for n=1n=1. Assume that the relation (14) holds for some n=k1n=k-1. To avoid clutter, when obvious we suppress the arguments to PMFs, writing, e.g. f𝒆k|𝒆0,𝒅0,𝒗1k1f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}} in place of f𝒆k|𝒆0,𝒅0,𝒗1k1(ek|e0,d0,v1k1)f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}({e}_{k}|{e}_{0},{d}_{0},{v}_{1}^{k-1}). Via Bayes’ Theorem, f𝒆k|𝒆0,𝒅0,𝒗1k1=f𝒆k,𝒗k1|𝒆0,𝒅0,𝒗1k2f𝒗k1|𝒆0,𝒅0,𝒗1k2f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}=\frac{f_{\boldsymbol{{e}}_{k},\boldsymbol{{v}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}}}{f_{\boldsymbol{{v}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}}}. Since 𝒗k1\boldsymbol{{v}}_{k-1} is a measurable function of 𝒆k1\boldsymbol{{e}}_{k-1} and 𝒅k1\boldsymbol{{d}}_{k-1}, 𝒗k1\boldsymbol{{v}}_{k-1} is conditionally independent of (𝒗1k2,𝒆0,𝒅0)(\boldsymbol{{v}}_{1}^{k-2},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}) given 𝒆k1\boldsymbol{{e}}_{k-1}. By Lemma 1, 𝒗k1\boldsymbol{{v}}_{k-1} is (pairwise) independent of 𝒆k1\boldsymbol{{e}}_{k-1}. This further implies that 𝒗k1\boldsymbol{{v}}_{k-1} is independent of (𝒗1k2,𝒆0,𝒅0)(\boldsymbol{{v}}_{1}^{k-2},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}). Thus we have

f𝒗k1\displaystyle f_{\boldsymbol{{v}}_{k-1}} =\displaystyle= f𝒗k1|𝒆k1,𝒆0,𝒅0,𝒗1k2\displaystyle f_{\boldsymbol{{v}}_{k-1}|\boldsymbol{{e}}_{k-1},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}} (15)
=\displaystyle= f𝒗k1|𝒆0,𝒅0,𝒗1k2\displaystyle f_{\boldsymbol{{v}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}} (16)

Using (15), it can be seen that f𝒆k,𝒗k1|𝒆0,𝒅0,𝒗1k2=f𝒗k1f𝒆k|𝒆k1,𝒆0,𝒅0,𝒗1k1f𝒆k1|𝒆0,𝒅0,𝒗1k2𝑑ek1f_{\boldsymbol{{e}}_{k},\boldsymbol{{v}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}}=f_{\boldsymbol{{v}}_{k-1}}\int_{\mathbb{R}}f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{k-1},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}f_{\boldsymbol{{e}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}}d{e}_{k-1}. Thus,

f𝒆k|𝒆0,𝒅0,𝒗1k1=f𝒆k|𝒆k1,𝒆0,𝒅0,𝒗1k1f𝒆k1|𝒆0,𝒅0,𝒗1k2𝑑ek1.f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}=\\ \int_{\mathbb{R}}f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{k-1},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}f_{\boldsymbol{{e}}_{k-1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-2}}d{e}_{k-1}. (17)

As 𝒘k1(𝒅0k1,𝒆0k1,𝒘0t1,𝒙0)\boldsymbol{{w}}_{k-1}\perp(\boldsymbol{{d}}_{0}^{k-1},\boldsymbol{{e}}_{0}^{k-1},\boldsymbol{{w}}_{0}^{t-1},\boldsymbol{{x}}_{0}), we have, via (8) that 𝒆k\boldsymbol{{e}}_{k} is conditionally independent of (𝒆0,𝒅0,𝒗0k2)(\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{0}^{k-2}) given (𝒆k1,𝒗k1)(\boldsymbol{{e}}_{k-1},\boldsymbol{{v}}_{k-1}). Thus f𝒆k|𝒆k1,𝒆0,𝒅0,𝒗1k1=f𝒆k|𝒆k1,𝒗k1f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{k-1},\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}=f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{k-1},\boldsymbol{{v}}_{k-1}}. Using (11) and assuming (14) holds for n=k1n=k-1, the integration in (17) is given by f𝒆k|𝒆0,𝒅0,𝒗1k1=N(ek;Rek1Lvk1,W)N(et1;μk1,Σk1)𝑑ek1f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}=\int_{\mathbb{R}}N({e}_{k};R{e}_{k-1}-L{v}_{k-1},W)N({e}_{t-1};{\mu}_{k-1},\Sigma_{k-1})d{e}_{k-1}. This is the convolution of two multivariate Gaussian PDFs.

Computing the convolution, we have f𝒆k|𝒆0,𝒅0,𝒗1k1=N(ek;Rμk1Lvk1,RΣk1RT+W)f_{\boldsymbol{{e}}_{k}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1}}=N({e}_{k};R\mu_{k-1}-L{v}_{k-1},R\Sigma_{k-1}R^{\mathrm{T}}+W), e.g. given (𝒆0,𝒅0,𝒗1k1)=(e0,d0,v1k1)(\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{k-1})=({e}_{0},{d}_{0},{v}_{1}^{k-1}), 𝒆k\boldsymbol{{e}}_{k} is Gaussian with mean Rμk1Lvk1R{\mu}_{k-1}-L{v}_{k-1} and covariance RΣk1RT+WR\Sigma_{k-1}R^{\mathrm{T}}+W. As μk=Rμk1Lvk1\mu_{k}=R\mu_{k-1}-L{v}_{k-1} and Σk=RΣk1RT+W\Sigma_{k}=R\Sigma_{k-1}R^{\mathrm{T}}+W we have the proof. ∎

The next lemmas describe properties of {Σn}\{\Sigma_{n}\} and the sequence of functions μn(e0,d0,v1n2):m×(m(Δ))n1m\mu_{n}(e_{0},d_{0},v_{1}^{n-2}):\mathbb{R}^{m}\times(\mathcal{B}^{m}(\Delta))^{n-1}\rightarrow\mathbb{R}^{m}. We recall a result from system theory.

Lemma 5 (Gelfand’s Theorem and a Corollary [15])

Let Qm×mQ\in\mathbb{R}^{m\times m} and denote the spectral radius of QQ (e.g. the largest of the absolute values of QQ’s eigenvalues) by ρmax(Q)\rho_{\max}(Q). Gelfand’s Theorem states that limn(Qn2)1n=ρmax(Q)\lim_{n\rightarrow\infty}\left(\lVert Q^{n}\rVert_{2}\right)^{\frac{1}{n}}=\rho_{\max}(Q). Let γ=(ρmax(Q)+1)/2\gamma=(\rho_{\max}(Q)+1)/2. If ρmax(Q)<1\rho_{\max}(Q)<1 then γ<1\gamma<1 and by Gelfand’s Theorem \exists ii such that \forall jij\geq i Qj2γj\lVert Q^{j}\rVert_{2}\leq\gamma^{j}.

Lemma 6

We have ΣnW0m×m\Sigma_{n}\succeq W\succ 0_{m\times m}. Furthermore, there exists a constant cc such that for all nn, Σn2c\lVert\Sigma_{n}\rVert_{2}\leq c.

Proof:

The first statement is immediate from the assumption that W0m×mW\succ 0_{m\times m} and the formula for Σn\Sigma_{n} in Lemma 4. Note that since Σn1Σn\Sigma_{n-1}\preceq\Sigma_{n}, we have that {Σn2}\{\lVert\Sigma_{n}\rVert_{2}\} is monotonically increasing. Recall that ρmax(R)<1\rho_{\max}(R)<1 [14]. Using Lemma 5, it is seen that for some c<c<\infty, limri=0rRi22W2=c\lim_{r\rightarrow\infty}\sum\limits_{i=0}^{r}\lVert R^{i}\rVert_{2}^{2}\lVert W\rVert_{2}=c. The triangle inequality and the submultiplicity of the matrix 2-norm gives Σn2c\lVert\Sigma_{n}\rVert_{2}\leq c. ∎

Lemma 7

There exist constants α\alpha and β\beta such that for any nn and choice of v1n2(m(Δ))n2{v}_{1}^{n-2}\in(\mathcal{B}^{m}(\Delta))^{n-2} we have μn(e0,d0,v1n1)2αM(e0,d0)2+β\lVert\mu_{n}\left({e}_{0},{d}_{0},{v}_{1}^{n-1}\right)\rVert_{2}\leq\alpha\lVert M(e_{0},d_{0})\rVert_{2}+\beta.

Proof:

The proof is analogous to that of Lemma 6. Lemma 5 is used to bound Rn12\lVert R^{n-1}\rVert_{2} and i=0n2Ri2\sum_{i=0}^{n-2}\lVert R^{i}\rVert_{2}. We also use the fact that for vjm(Δ){v}_{j}\in\mathcal{B}^{m}(\Delta), vj2Δm2\lVert{v}_{j}\rVert_{2}\leq\frac{\Delta\sqrt{m}}{2}. ∎

For α\alpha, β\beta, and cc as defined in Lemmas 6 and 7, define a subset of m×m×m\mathbb{R}^{m}\times\mathbb{R}^{m\times m} via (e0,d0,m,L,R,Δ)={(μ,Σ)m×m×m:μ2αM(e0,d0)+β,ΣW,Σ2c}\mathcal{M}(e_{0},d_{0},m,L,R,\Delta)=\{({\mu},\Sigma)\in\mathbb{R}^{m}\times\mathbb{R}^{m\times m}:\lVert{\mu}\rVert_{2}\leq\alpha\lVert M(e_{0},d_{0})\rVert+\beta,\Sigma\succeq W,\lVert\Sigma\rVert_{2}\leq c\}. We have that (e0,d0,m,L,R,Δ)\mathcal{M}(e_{0},d_{0},m,L,R,\Delta) is compact (closed and bounded). This helps prove the main result of this section.

Lemma 8

There exists an invariant PDF gg_{\infty} such that

𝔻ft+1|t(e,d|x,y)g(x,y)𝑑x𝑑y=g(e,d),\displaystyle\int_{\mathbb{D}}f_{t+1|t}({e},{d}|{x},{y})g_{\infty}({x},{y})d{x}d{y}=g_{\infty}({e},{d}), (18)

and g(x,y)>0g_{\infty}({x},{y})>0 for all (x,y)𝔻({x},{y})\in\mathbb{D}.

Proof:

A set F𝔹(𝔻)F\in\mathbb{B}(\mathbb{D}) is called weakly transient with respect to the Markov kernel (12) if there exists a sequence of positive integers n1<n2<n_{1}<n_{2}<\dots such that

i=1𝒆ni,𝒅ni|𝒆0,𝒅0[F|𝒆0=e0,𝒅0=d0]<\displaystyle\sum_{i=1}^{\infty}\mathbb{P}_{\boldsymbol{{e}}_{n_{i}},\boldsymbol{{d}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}[F|\boldsymbol{{e}}_{0}={e}_{0},\boldsymbol{{d}}_{0}={d}_{0}]<\infty (19)

holds for λ\lambda almost-every (e0,d0)𝔻({e}_{0},{d}_{0})\in\mathbb{D}. A consequence of [16, Theorem 5] is that there exists an invariant PDF gg_{\infty} satisfying (18) and g(a,b)>0g_{\infty}(a,b)>0 for all (a,b)𝔻(a,b)\in\mathbb{D} if and only if every weakly transient set FF has λ(F)=0\lambda(F)=0. To prove the lemma, we proceed via contradiction. We prove that any set F𝔹(𝔻)F\in\mathbb{B}(\mathbb{D}) with λ(F)>0\lambda(F)>0 is not weakly transient. Let F𝔻F\subset\mathbb{D} have λ(F)>0\lambda(F)>0. Since λ(F)>0\lambda(F)>0, FF must contain an open ball, which in turn must contain a closed rectangle of positive Lebesgue measure, e.g. there exists a δ>0\delta>0 and point (eF,dF)F({e}_{F},{d}_{F})\in F such that the set H={(x,y)m×m:xeFδ2,ydFδ2}H=\{(x,y)\in\mathbb{R}^{m}\times\mathbb{R}^{m}:\lVert x-{e}_{F}\rVert_{\infty}\leq\frac{\delta}{2},\lVert y-d_{F}\rVert_{\infty}\leq\frac{\delta}{2}\} has HFH\subset F. Note λ(H)=δ2m\lambda(H)=\delta^{2m}. Define the “e-section” of HH as the subset He={xm:xeFδ2}H_{e}=\{x\in\mathbb{R}^{m}:\lVert x-{e}_{F}\rVert_{\infty}\leq\frac{\delta}{2}\}. As HFH\subset F, the conditional probability satisfies

𝒆ni,𝒅ni|𝒆0,𝒅0[F|e0,d0]\displaystyle\mathbb{P}_{\boldsymbol{{e}}_{n_{i}},\boldsymbol{{d}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}[F|{e}_{0},{d}_{0}] \displaystyle\geq Hft+ni|t(x,y|e0,d0)𝑑x𝑑y\displaystyle\int_{H}f_{t+n_{i}|t}(x,y|{e}_{0},{d}_{0})dxdy
\displaystyle\geq δ2minf(x,y)Hft+ni|t(x,y|e0,d0),\displaystyle\delta^{2m}\inf_{({x},{y})\in H}f_{t+n_{i}|t}({x},{y}|{e}_{0},{d}_{0}),
\displaystyle\geq δ2mΔminfxHef𝒆ni|𝒆0,𝒅0(x|e0,d0).\displaystyle\frac{\delta^{2m}}{\Delta^{m}}\inf_{{x}\in H_{e}}f_{\boldsymbol{{e}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}({x}|{e}_{0},{d}_{0}).

Since the “minimum is less than the average”, we have

infxHef𝒆ni|𝒆0,𝒅0(x|e0,d0)infxHev1ni1(m(Δ))n1f𝒆ni|𝒆0,𝒅0,𝒗1ni1(x|e0,d0,v1ni1).\inf_{{x}\in H_{e}}f_{\boldsymbol{{e}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}({x}|{e}_{0},{d}_{0})\geq\\ \inf_{\begin{subarray}{c}{x}\in H_{e}\\ {v}_{1}^{n_{i}-1}\in(\mathcal{B}^{m}(\Delta))^{n-1}\end{subarray}}f_{\boldsymbol{{e}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{n_{i}-1}}({x}|{e}_{0},{d}_{0},{v}_{1}^{n_{i}-1}). (21)

Using the explicit formula for f𝒆ni|𝒆0,𝒅0,𝒗1ni1f_{\boldsymbol{{e}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{n_{i}-1}} given by (14) in Lemma 4, the right side of (21) can be written in terms of Σni\Sigma_{{n}_{i}} and μni(e0,d0,v1ni)\mu_{n_{i}}(e_{0},d_{0},v_{1}^{{n}_{i}}). However, for a fixed (e0,d0)(e_{0},d_{0}), Lemma 7 proves a compact set that contains μni(e0,d0,v1ni)\mu_{n_{i}}(e_{0},d_{0},v_{1}^{n_{i}}) for any choice of nin_{i} and v1niv_{1}^{n_{i}}. This set is given by 𝒞μ={μm:μ2αM(e0,d0)+β}\mathcal{C}_{{\mu}}=\{{\mu}\in\mathbb{R}^{m}:\lVert\mu\rVert_{2}\leq\alpha\lVert M(e_{0},d_{0})\rVert+\beta\}. Likewise, for any nin_{i}, Σni\Sigma_{n_{i}} falls within the compact set 𝒞Σ={Σm×m:ΣW,Σ2c}\mathcal{C}_{\Sigma}=\{{\Sigma}\in\mathbb{R}^{m\times m}:\Sigma\succeq W,\lVert\Sigma\rVert_{2}\leq c\} Thus, we have, for any nin_{i}

infxHev1ni1(m(Δ))n1f𝒆ni|𝒆0,𝒅0,𝒗1ni1(x|e0,d0,v1ni1)infxHeμ𝒞μΣ𝒞ΣN(x;μ,Σ).\inf_{\begin{subarray}{c}{x}\in H_{e}\\ {v}_{1}^{n_{i}-1}\in(\mathcal{B}^{m}(\Delta))^{n-1}\end{subarray}}f_{\boldsymbol{{e}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0},\boldsymbol{{v}}_{1}^{n_{i}-1}}({x}|{e}_{0},{d}_{0},{v}_{1}^{n_{i}-1})\geq\\ \inf_{\begin{subarray}{c}{x}\in H_{e}\text{, }{\mu}\in\mathcal{C}_{{\mu}}\text{, }\Sigma\in\mathcal{C}_{\Sigma}\end{subarray}}N(x;{\mu},\Sigma). (22)

Note that the optimization on the right of (22) does not depend on nin_{i}. At every point (x,μ,Σ)(x,\mu,\Sigma) where Σ0m×m\Sigma\succ 0_{m\times m}, the positive function N(x;μ,Σ):m×m×m×mN(x;\mu,\Sigma):\mathbb{R}^{m}\times\mathbb{R}^{m}\times\mathbb{R}^{m\times m}\rightarrow\mathbb{R} is continuous. Note that the subset He×𝒞μ×𝒞Σm×m×m×mH_{e}\times\mathcal{C}_{\mu}\times\mathcal{C}_{\Sigma}\subset\mathbb{R}^{m}\times\mathbb{R}^{m}\times\mathbb{R}^{m\times m} is closed and bounded, thus compact, and that Σ𝒞ΣΣ0m×m\Sigma\in\mathcal{C}_{\Sigma}\rightarrow\Sigma\succ 0_{m\times m}. Thus we have that there exists ϵ>0\epsilon>0 such that infxHe,μ𝒞μ, Σ𝒞ΣN(x;μ,Σ)ϵ\inf_{\begin{subarray}{c}{x}\in H_{e},{\mu}\in\mathcal{C}_{{\mu}},\text{ }\Sigma\in\mathcal{C}_{\Sigma}\end{subarray}}N(x;{\mu},\Sigma)\geq\epsilon. Thus, following the inequalities from (IV) through (22) gives, for any nin_{i} that 𝒆ni,𝒅ni|𝒆0,𝒅0[F|e0,d0]>ϵ\mathbb{P}_{\boldsymbol{{e}}_{n_{i}},\boldsymbol{{d}}_{n_{i}}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}[F|{e}_{0},{d}_{0}]>\epsilon. Thus for any FF with λ(F)>0\lambda(F)>0 the terms in the summation (19) are bounded from below by ϵ\epsilon. For any subsequence {ni}\{n_{i}\} of \mathbb{N} the summation in (19) diverges, which implies FF is not weakly transient. Thus, if FF is weakly transient, it must have λ(F)=0\lambda(F)=0, which via [16, Theorem 5] proves the lemma. ∎

The next ensures that the sequence {𝒆t,𝒅t}\{\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}\} converges to gg_{\infty}.

Lemma 9

Let (𝐞,𝐝)g(\boldsymbol{{e}}_{\infty},\boldsymbol{{d}}_{\infty})\sim g_{\infty}, and (𝐞0,𝐝0)(\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}) be drawn from a continuous distribution on 𝔻\mathbb{D}. The sequence of random variables {(𝐞t,𝐝t)}\{(\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t})\} converges weakly to (𝐞,𝐝)(\boldsymbol{{e}}_{\infty},\boldsymbol{{d}}_{\infty}).

Proof:

We can prove this result using [17, Theorem 4] adapted to the special case when a first-order time-homogeneous Markov chain admits an invariant PDF (18) (in other words. when the chain admits an invariant measure that is absolutely continuous with respect to Lebesgue measure). A Markov chain {𝒈t}\{\boldsymbol{{g}}_{t}\} on a state-space 𝔾\mathbb{G} is said to be ϕ\phi-irreducible if there exists a nonzero σ\sigma-finite measure ϕ\phi such that for all measurable 𝒢𝔾\mathcal{G}\subset\mathbb{G} with ϕ(𝒢)>0\phi(\mathcal{G})>0 and all initial conditions 𝒛0=z0\boldsymbol{{z}}_{0}={z}_{0} with z0𝔻z_{0}\in\mathbb{D} we can find an integer nn such that 𝒛n|𝒛0[𝒛n𝒜|𝒛0=z0]>0\mathbb{P}_{\boldsymbol{{z}}_{n}|\boldsymbol{{z}}_{0}}[\boldsymbol{{z}}_{n}\in\mathcal{A}|\boldsymbol{{z}}_{0}={z}_{0}]>0. A Markov chain {𝒈t}\{\boldsymbol{{g}}_{t}\} on 𝔾\mathbb{G} is aperiodic if there does not exist d1d\geq 1 and disjoint nonempty measurable subsets 𝒢0,𝒢1,𝒢d1\mathcal{G}_{0},\mathcal{G}_{1},\dots\mathcal{G}_{d-1} such that when gn1𝒢i{g}_{n-1}\in\mathcal{G}_{i}, then 𝒈n|𝒈n1[𝒈n𝒢i+1 mod d|𝒛n1=zn1]=1\mathbb{P}_{\boldsymbol{{g}}_{n}|\boldsymbol{{g}}_{n-1}}[\boldsymbol{{g}}_{n}\in\mathcal{G}_{i+1\text{ mod }d}|\boldsymbol{{z}}_{n-1}={z}_{n-1}]=1. A consequence of [17, Theorem 4] is that if the Markov chain {𝒈t}\{\boldsymbol{{g}}_{t}\} admits an invariant PDF gg_{\infty}, is ϕ\phi-irreducible, aperiodic, and 𝒈0\boldsymbol{{g}}_{0} is a continuous random variable then the 𝒈i\boldsymbol{{g}}_{i} converge weakly to the measure induced by the invariant PDF. Consider the Markov chain {𝒆t,𝒅t}\{\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}\} on 𝔻\mathbb{D}, and recall that the Lebesgue measure λ\lambda on 𝔻\mathbb{D} is countably generated. Take 𝒜𝔻\mathcal{A}\subset\mathbb{D} with λ(𝒜)>0\lambda(\mathcal{A})>0, and let (e0,d0)𝔻({e}_{0},{d}_{0})\in\mathbb{D} be arbitrary. We have

𝒆1,𝒅1|𝒆0,𝒅0[(𝒆1,𝒅1)𝒜|(𝒆0,𝒅0)=(e0,d0)]=𝒜ft+1|t(e,d|e0,d0)𝑑d𝑑e.\mathbb{P}_{\boldsymbol{{e}}_{1},\boldsymbol{{d}}_{1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}[(\boldsymbol{{e}}_{1},\boldsymbol{{d}}_{1})\in\mathcal{A}|(\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0})=({e}_{0},{d}_{0})]=\\ \int_{\mathcal{A}}f_{t+1|t}({e},{d}|{e}_{0},{d}_{0})ddde. (23)

It is immediate from the defition of ft+1|tf_{t+1|t} in (12) and the fact that 𝒜𝔻\mathcal{A}\subset\mathbb{D} that 𝒆1,𝒅1|𝒆0,𝒅0[(𝒆1,𝒅1)𝒜|(𝒆0,𝒅0)(e0,d0)]0\mathbb{P}_{\boldsymbol{{e}}_{1},\boldsymbol{{d}}_{1}|\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0}}[(\boldsymbol{{e}}_{1},\boldsymbol{{d}}_{1})\in\mathcal{A}|(\boldsymbol{{e}}_{0},\boldsymbol{{d}}_{0})({e}_{0},{d}_{0})]\geq 0. Thus the Markov chain {𝒆t,𝒅t}\{\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t}\} is λ\lambda-irreducible.

Via the same logic, it can be shown by contradiction that the chain is aperiodic. Assume that the chain is periodic, e.g. that for d2d\geq 2, there exist disjoint nonempty measurable subsets 𝒜0,𝒜1,,𝒜d1𝔻\mathcal{A}_{0},\mathcal{A}_{1},\dots,\mathcal{A}_{d-1}\subset\mathbb{D} satisfying such that if (en1,dn1)𝒜i({e}_{n-1},{d}_{n-1})\in\mathcal{A}_{i} then 𝒆n,𝒅n|𝒆n1,𝒅n1[(𝒆n,𝒅n)𝒜i+1 mod d|𝒛n1=zn1]=1\mathbb{P}_{\boldsymbol{{e}}_{n},\boldsymbol{{d}}_{n}|\boldsymbol{{e}}_{n-1},\boldsymbol{{d}}_{n-1}}[(\boldsymbol{{e}}_{n},\boldsymbol{{d}}_{n})\in\mathcal{A}_{i+1\text{ mod }d}|\boldsymbol{{z}}_{n-1}={z}_{n-1}]=1. Assume that 𝒜i\mathcal{A}_{i} has λ(𝒜i)>0\lambda(\mathcal{A}_{i})>0 (since the sets {𝒜k}\{\mathcal{A}_{k}\} partition 𝔻\mathbb{D}, at least one of the sets must have positive Lebesgue measure). Take (et,dt)𝒜i(e_{t},d_{t})\in\mathcal{A}_{i}, and let j=i+1 mod dj=i+1\text{ mod }d. By the assumption of periodicity

t+1|t[(𝒆t+1,𝒅t+1)𝒜j|(𝒆t,𝒅t)=(et,dt)]=1.\mathbb{P}_{t+1|t}[(\boldsymbol{{e}}_{t+1},\boldsymbol{{d}}_{t+1})\in\mathcal{A}_{j}|(\boldsymbol{{e}}_{t},\boldsymbol{{d}}_{t})=({e}_{t},{d}_{t})]=1. (24)

If 𝒜i\mathcal{A}_{i} has λ(𝒜i)>0\lambda(\mathcal{A}_{i})>0, for any (et,dt)(e_{t},d_{t}) our proof of irreducibility guarantees that \exists ϵ>0\epsilon>0 such that [(𝒆t+1,𝒅t+1)𝒜i|𝒆t=et,𝒅t=dt]ϵ\mathbb{P}[(\boldsymbol{{e}}_{t+1},\boldsymbol{{d}}_{t+1})\in\mathcal{A}_{i}|\boldsymbol{{e}}_{t}={e}_{t},\boldsymbol{{d}}_{t}={d}_{t}]\geq\epsilon. As we assumed 𝒜i𝒜j=\mathcal{A}_{i}\cap\mathcal{A}_{j}=\emptyset, this contradicts (24). Thus the Markov chain {𝒆i,𝒅i}\{\boldsymbol{{e}}_{i},\boldsymbol{{d}}_{i}\} is aperiodic, and by [17, Theorem 4] we have the proof. ∎

The next lemma characterizes the invariant PDF gg_{\infty}.

Corollary 10

Assume (𝐞,𝐝)g(\boldsymbol{{e}},\boldsymbol{{d}})\sim g_{\infty}. Denote the marginal PDF of 𝐞\boldsymbol{{e}} via g𝐞(e)=Δmg(e,δ)𝑑δg_{\boldsymbol{{e}}_{\infty}}(e)=\int_{\mathcal{B}^{m}_{\Delta}}g_{\infty}(e,\delta)d\delta. We have that 𝐞𝐝\boldsymbol{{e}}\perp\boldsymbol{{d}} and 𝐝Uniform[Δ/2,Δ/2]\boldsymbol{{d}}\sim\text{Uniform}[-\Delta/2,\Delta/2]. This implies that, g:𝔻g_{\infty}\colon\mathbb{D}\rightarrow\mathbb{R} factorizes via g(e,d)=g𝐞(e)/Δmg_{\infty}(e,d)=g_{\boldsymbol{{e}}_{\infty}}(e)/\Delta^{m} for (e,d)𝔻(e,d)\in\mathbb{D}.

Proof:

Let 𝒜1\mathcal{A}_{1} be an open ball in m\mathbb{R}^{m} and 𝒜2\mathcal{A}_{2} be an open ball inside m(Δ)\mathcal{B}^{m}(\Delta). Open intervals like 𝒜1×𝒜2\mathcal{A}_{1}\times\mathcal{A}_{2} form a π\pi system that generates the σ\sigma-algebra 𝔹(𝔻)\mathbb{B}(\mathbb{D}). Using the definition of the transition kernel (12) and the invariant PDF (18), it can be shown that if 𝒫=𝒜1×𝒜2\mathcal{P}=\mathcal{A}_{1}\times\mathcal{A}_{2} then, [(𝒆,𝒅)𝒫]=𝒜1g𝒆(e)𝑑eλ(𝒜2)Δm\mathbb{P}[(\boldsymbol{{e}},\boldsymbol{{d}})\in\mathcal{P}]=\int_{\mathcal{A_{1}}}g_{\boldsymbol{{e}}_{\infty}}(e)de\frac{\lambda(\mathcal{A}_{2})}{\Delta^{m}}. By Dynkin’s πλ\pi-\lambda theorem, this proves that 𝒆𝒅\boldsymbol{{e}}\perp\boldsymbol{{d}} (see e.g., [18, Prop. 2.13]). ∎

Let (𝒆,𝒅)g(\boldsymbol{{e}},\boldsymbol{{d}})\sim g_{\infty}, and define 𝒒=QΔ(C𝒆+𝒅)\boldsymbol{{q}}=Q_{\Delta}(C\boldsymbol{{e}}+\boldsymbol{{d}}). We now prove that 𝒒t\boldsymbol{{q}}_{t} converges to 𝒒\boldsymbol{{q}} in the KL sense. This is not obvious a priori as 𝒒t\boldsymbol{{q}}_{t} and 𝒒\boldsymbol{{q}} have countably infinite support.

Lemma 11

DKL(𝒒t||𝒒)<D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})<\infty and limtDKL(𝐪t||𝐪)=0\underset{t\rightarrow\infty}{\lim}D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})=0.

Proof:

Using the properties of the invariant measure and a data processing inequality for f-divergences [19, Theorem 2.2 (6)] one can prove that the sequence of relative entropies is monotonically decreasing, i.e., that DKL(𝒒t+1||𝒒)DKL(𝒒t||𝒒)D_{\mathrm{KL}}(\boldsymbol{{q}}_{t+1}||\boldsymbol{{q}})\leq D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}}) for all tt. Since DKL(𝒒t||𝒒)0D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})\geq 0, the limit as tt\rightarrow\infty exists. From applying the data processing inequality for KL divergences several times (cf. [19, Theorem 2.2 (6) ]), and the fact that both 𝒅t𝒆t\boldsymbol{{d}}_{t}\perp\boldsymbol{{e}}_{t} and by Corollary 10 that both 𝒅𝒆\boldsymbol{{d}}\perp\boldsymbol{{e}} and 𝒅t\boldsymbol{{d}}_{t} and 𝒅\boldsymbol{{d}} are identically distributed, we have that DKL(𝒒t||𝒒)DKL(𝒆t||𝒆)D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})\leq D_{\mathrm{KL}}(\boldsymbol{{e}}_{t}||\boldsymbol{{e}}). Details can be found in the proof of [6, Lemma IV.10]. The rest of the proof follows from bounding DKL(𝒆t||𝒆)D_{\mathrm{KL}}(\boldsymbol{{e}}_{t}||\boldsymbol{{e}}).

Let {𝝂t}\{\boldsymbol{{\nu}}_{t}\} denote an IID sequence of random variables uniformly distributed on m(Δ)\mathcal{B}^{m}(\Delta). Likewise, let {𝝎t}\{\boldsymbol{{\omega}}_{t}\} be IID with 𝝎t𝒩(0,W)\boldsymbol{{\omega}}_{t}\sim\mathcal{N}(0,W), and let 𝝀𝒩(0,X0)\boldsymbol{{\lambda}}\sim\mathcal{N}(0,X_{0}). Assume {𝝎t}\{\boldsymbol{{\omega}}_{t}\}, {𝝂t}\{\boldsymbol{{\nu}}_{t}\}, and 𝝀\boldsymbol{{\lambda}} are mutually independent. Let “𝒂=𝐷𝒃\boldsymbol{{a}}\overset{D}{=}\boldsymbol{{b}}” denote “𝒂\boldsymbol{{a}} and 𝒃\boldsymbol{{b}} are identically distributed”. By unwrapping the recursive definition of {𝒆t}\{\boldsymbol{{e}}_{t}\} in (8), we have 𝒆t=𝐷Rt𝝀+i=0t1Ri(𝝎iL𝝂i)\boldsymbol{{e}}_{t}\overset{D}{=}R^{t}\boldsymbol{{\lambda}}+\sum_{i=0}^{t-1}R^{i}(\boldsymbol{{\omega}}_{i}-L\boldsymbol{{\nu}}_{i}). Likewise, by definition of 𝒆\boldsymbol{{e}}, we have that both 𝒆=𝐷limtRt𝝀+i=0t1Ri(𝝎iL𝝂i)\boldsymbol{{e}}\overset{D}{=}\lim_{t\rightarrow\infty}R^{t}\boldsymbol{{\lambda}}+\sum_{i=0}^{t-1}R^{i}(\boldsymbol{{\omega}}_{i}-L\boldsymbol{{\nu}}_{i}) and 𝒆=𝐷limti=0t1Ri(𝝎iL𝝂i)\boldsymbol{{e}}\overset{D}{=}\lim_{t\rightarrow\infty}\sum_{i=0}^{t-1}R^{i}(\boldsymbol{{\omega}}_{i}-L\boldsymbol{{\nu}}_{i}) following from the weak convergence guaranteed by Lemma 9. Define the random variables 𝒈t=i=0t1Ri𝝎i\boldsymbol{{g}}_{\leq t}=\sum_{i=0}^{t-1}R^{i}\boldsymbol{{\omega}}_{i}, 𝒖t=i=0t1RiL𝝂i\boldsymbol{{u}}_{\leq t}=-\sum_{i=0}^{t-1}R^{i}L\boldsymbol{{\nu}}_{i}, and 𝒔>t=limri=trRi(𝝎iL𝝂i)\boldsymbol{{s}}_{>t}=\lim_{r\rightarrow\infty}\sum_{i=t}^{r}R^{i}(\boldsymbol{{\omega}}_{i}-L\boldsymbol{{\nu}}_{i}). Note that the limit in the definition of 𝒔>t\boldsymbol{{s}}_{>t} is well defined via Lemma 5 in concert with Kolmogorov’s two-series theorem. By definition, 𝒆t=𝐷Rt𝝀+𝒈t+𝒖t\boldsymbol{{e}}_{t}\overset{D}{=}R^{t}\boldsymbol{{\lambda}}+\boldsymbol{{g}}_{\leq t}+\boldsymbol{{u}}_{\leq t} and 𝒆=𝐷𝒈t+𝒖t+𝒔>t\boldsymbol{{e}}\overset{D}{=}\boldsymbol{{g}}_{\leq t}+\boldsymbol{{u}}_{\leq t}+\boldsymbol{{s}}_{>t}. Note that 𝒈t𝒩(0,i=0t1RWRT)\boldsymbol{{g}}_{\leq t}\sim\mathcal{N}(0,\sum\limits_{i=0}^{t-1}RWR^{\mathrm{T}}). We have

DKL(𝒆t||𝒆)\displaystyle D_{\mathrm{KL}}(\boldsymbol{{e}}_{t}||\boldsymbol{{e}}) \displaystyle\leq DKL(Rt𝝀+𝒈t||𝒈t+𝒔>t)\displaystyle D_{\mathrm{KL}}(R^{t}\boldsymbol{{\lambda}}+\boldsymbol{{g}}_{\leq t}||\boldsymbol{{g}}_{\leq t}+\boldsymbol{{s}}_{>t}) (25)
\displaystyle\leq DKL(Rt𝝀+𝒈t||𝒈t+𝒔>t|𝒔>t),\displaystyle D_{\mathrm{KL}}(R^{t}\boldsymbol{{\lambda}}+\boldsymbol{{g}}_{\leq t}||\boldsymbol{{g}}_{\leq t}+\boldsymbol{{s}}_{>t}|\boldsymbol{{s}}_{>t}), (26)

where (25) follows from the data processing inequality for f-divergences and (26) follows since conditioning increases KL divergence (see [19, Theorem 2.2 (5)]). Given 𝒔>t=s\boldsymbol{{s}}_{>t}=s, (26) simplifies to a KL divergence between multivariate Gaussians. Let St=i=0t1RWRTS_{t}=\sum_{i=0}^{t-1}RWR^{\mathrm{T}} and S¯t=St+RtX0(RT)t\overline{S}_{t}=S_{t}+R^{t}X_{0}(R^{\mathrm{T}})^{t}. Since 𝝀𝒈t\boldsymbol{{\lambda}}\perp\boldsymbol{{g}}_{\leq t} by construction, Rt𝝀+𝒈t𝒩(0,S¯t)R^{t}\boldsymbol{{\lambda}}+\boldsymbol{{g}}_{\leq t}\sim\mathcal{N}(0,\overline{S}_{t}). Recall also that 𝒈t𝒔>t\boldsymbol{{g}}_{\leq t}\perp\boldsymbol{{s}}_{>t} by construction. We have, then that DKL(Rt𝝀+𝒈t||𝒈t+𝒔>t|𝒔>t=s)=DKL(𝒩(0,S¯t)||𝒩(s,St))D_{\mathrm{KL}}(R^{t}\boldsymbol{{\lambda}}+\boldsymbol{{g}}_{\leq t}||\boldsymbol{{g}}_{\leq t}+\boldsymbol{{s}}_{>t}|\boldsymbol{{s}}_{>t}=s)=D_{\mathrm{KL}}(\mathcal{N}(0,\overline{S}_{t})||\mathcal{N}(s,S_{t})). Let ft:mf_{t}:\mathbb{R}^{m}\rightarrow\mathbb{R} be defined via

ft(s)=Tr(St1S¯t)+sTSt1s+ln(detSt/detS¯t).\displaystyle f_{t}(s)=\text{Tr}(S_{t}^{-1}\overline{S}_{t})+{s}^{\mathrm{T}}S_{t}^{-1}{s}+\ln{\left(\det{S_{t}}/\det{\overline{S}_{t}}\right)}. (27)

We have that DKL(𝒩(0,S¯t)||𝒩(s,St))=12(ft(s)m)D_{\mathrm{KL}}(\mathcal{N}(0,\overline{S}_{t})||\mathcal{N}(s,S_{t}))=\frac{1}{2}(f_{t}(s)-m), where the divergence is expressed in nats. Thus, via (26),

DKL(𝒒t||𝒒)\displaystyle D_{\mathrm{KL}}(\boldsymbol{{q}}_{t}||\boldsymbol{{q}}) \displaystyle\leq 𝔼[(ft(𝒔>t)m)]2.\displaystyle\frac{\mathbb{E}\left[(f_{t}(\boldsymbol{{s}}_{>t})-m)\right]}{2}. (28)

Since 𝒔>t2\boldsymbol{{s}}_{>t}\in\mathcal{L}^{2} and St,S¯tWS_{t},\overline{S}_{t}\succ W, it is clear that the right-hand side of (28) is finite for all tt. We now demonstrate that limt𝔼[(ft(𝒔>t)m)]=0\lim_{t\rightarrow\infty}\mathbb{E}\left[(f_{t}(\boldsymbol{{s}}_{>t})-m)\right]=0, attacking the terms in (27) one-at-a-time. As S¯tStW0m×m\overline{S}_{t}\succeq{S}_{t}\succeq W\succ 0_{m\times m}, the determinants detSt\det{{S}_{t}} and detS¯t\det{\overline{S}_{t}} are bounded strictly away from 0. Since ρmax(R)<1\rho_{\max}(R)<1, we have that Q=limtStQ=\lim_{t\rightarrow\infty}S_{t} is well defined, and that limtS¯t=Q\lim_{t\rightarrow\infty}\overline{S}_{t}=Q. Thus

limtln(detSt/detS¯t)=0 and limtTr(St1S¯t)=m.\displaystyle\lim_{t\rightarrow\infty}\ln\left(\det{S_{t}}/\det{\overline{S}_{t}}\right)=0\text{ and }\lim_{t\rightarrow\infty}\text{Tr}(S_{t}^{-1}\overline{S}_{t})=m.

We now prove limt𝔼[𝒔TSt1𝒔]=0\lim_{t\rightarrow\infty}\mathbb{E}[\boldsymbol{{s}}^{\mathrm{T}}S_{t}^{-1}\boldsymbol{{s}}]=0. Fix tt and let 𝒑t;r=i=trRi(𝝎iL𝝂i)\boldsymbol{{p}}_{t;r}=\sum_{i=t}^{r}R^{i}(\boldsymbol{{\omega}}_{i}-L\boldsymbol{{\nu}}_{i}). For any tt, limr𝒑t;r𝒑t;rT=𝒔>t𝒔>tT\lim_{r\rightarrow\infty}\boldsymbol{{p}}_{t;r}\boldsymbol{{p}}_{t;r}^{\mathrm{T}}=\boldsymbol{{s}}_{>t}\boldsymbol{{s}}_{>t}^{\mathrm{T}}. These limits are well defined by Kolmogorov’s two-series theorm. Let U=W+LΔ212LTU=W+L\frac{\Delta^{2}}{12}L^{\mathrm{T}}. We have

𝔼[𝒔>tTS¯t1𝒔>t]\displaystyle\mathbb{E}[\boldsymbol{{s}}^{\mathrm{T}}_{>t}\overline{S}^{-1}_{t}\boldsymbol{{s}}_{>t}] =\displaystyle= 𝔼[Tr(S¯t1𝒔>t𝒔>tT)]\displaystyle\mathbb{E}[\text{Tr}\left(\overline{S}^{-1}_{t}\boldsymbol{{s}}_{>t}\boldsymbol{{s}}^{\mathrm{T}}_{>t}\right)] (29)
\displaystyle\leq Tr(S¯t1liminfr 𝔼[𝒑t;r𝒑t;rT]).\displaystyle\text{Tr}\left(\overline{S}^{-1}_{t}\underset{r\rightarrow\infty}{\lim\inf}\text{ }\mathbb{E}\left[\boldsymbol{{p}}_{t;r}\boldsymbol{{p}}_{t;r}^{\mathrm{T}}\right]\right). (30)

where (30) follows from Fatou’s lemma, and linearity. We have 𝔼[𝒑t;r𝒑t;rT]=i=trRiU(RT)i\mathbb{E}\left[\boldsymbol{{p}}_{t;r}\boldsymbol{{p}}_{t;r}^{\mathrm{T}}\right]=\sum_{i=t}^{r}R^{i}U(R^{\mathrm{T}})^{i}, thus

limr𝔼[𝒑t;r𝒑t;rT]=Rt(limri=0rRiU(RT)i)(RT)t,\lim_{r\rightarrow\infty}\mathbb{E}\left[\boldsymbol{{p}}_{t;r}\boldsymbol{{p}}_{t;r}^{\mathrm{T}}\right]=R^{t}\left(\lim_{r\rightarrow\infty}\sum_{i=0}^{r}R^{i}U(R^{\mathrm{T}})^{i}\right)(R^{\mathrm{T}})^{t}, (31)

where the limit exists via Lemma 5. Let N=(limri=0rRi(W+LΔ212LT)(RT)i)N=\left(\lim_{r\rightarrow\infty}\sum_{i=0}^{r}R^{i}\left(W+L\frac{\Delta^{2}}{12}L^{\mathrm{T}}\right)(R^{\mathrm{T}})^{i}\right). We have established that 𝔼[𝒔>tTS¯t1𝒔>t]Tr(S¯t1RtN(RT)t)\mathbb{E}[\boldsymbol{{s}}^{\mathrm{T}}_{>t}\overline{S}^{-1}_{t}\boldsymbol{{s}}_{>t}]\leq\text{Tr}(\overline{S}^{-1}_{t}R^{t}N(R^{\mathrm{T}})^{t}). Thus since ρmax(R)<1\rho_{\max}(R)<1 taking the limit as tt\rightarrow\infty proves limt𝔼[𝒔>tTS¯t1𝒔>t]=0\underset{t\rightarrow\infty}{\lim}\mathbb{E}[\boldsymbol{{s}}^{\mathrm{T}}_{>t}\overline{S}^{-1}_{t}\boldsymbol{{s}}_{>t}]=0. Thus, in conclusion limtD(𝒒t||𝒒)limt𝔼[(ft(𝒔>t)m)]20\lim_{t\rightarrow\infty}D(\boldsymbol{{q}}_{t}||\boldsymbol{{q}})\leq\lim_{t\rightarrow\infty}\frac{\mathbb{E}\left[(f_{t}(\boldsymbol{{s}}_{>t})-m)\right]}{2}\leq 0. ∎

V Conclusion

For mm dimensional plants, we have proven the existence of a time-invariant data compression architecture and controller that can achieves any feasible LQG control performance given a feedback bitrate within approximately 2+1.26m2+1.26m bits of a known lower bound. Using the “Shannon-type” source codes described in [6, Section IV.A], it can be shown that this overhead can be reduced to 1+1.26m1+1.26m bits. The difference between the lower bound and the rate achieved by the dither free, but time-varying, achievability architecture in [5] is logarithmic in the plant dimension (i.e. 𝒪(log(m))\mathcal{O}(\log(m))). This follows from the use of more sophisticated lattice quantizers [5]. An opportunity for future work is to demonstrate TI achievability without the use of dithering and with more sophisticated quantization.

While we have proved that the minimum bitrate can nearly be achieved with a time-invariant source codec, there still remains the problem of developing a practical implementation of this source coding scheme. Numerical experiments would lend credence to these theoretical results. While the time-varying achievability approaches in [4] [5] require the precise construction of codebooks in accordance with the PMF of the quantizer output, the long-term analyses of Section IV could be useful in proving bounds on the performance of adaptive compression schemes based on, for example, [20].

References

  • [1] H. Jung, A. Pedram, T. Cuvelier, and T. Tanaka, “Optimized data rate allocation for dynamic sensor fusion over resource constrained communication networks,” Int. J. Robust and Nonlinear Control, 2021.
  • [2] T. C. Cuvelier, T. Tanaka, and R. W. Heath, “A lower-bound for variable-length source coding in linear-quadratic-gaussian control with shared randomness,” IEEE Contr. Syst. Lett., pp. 1–1, 2022.
  • [3] E. I. Silva, M. S. Derpich, and J. Ostergaard, “A framework for control system design subject to average data-rate constraints,” IEEE Trans. Automat. Cont., vol. 56, no. 8, pp. 1886–1899, 2011.
  • [4] T. Tanaka, K. H. Johansson, T. Oechtering, H. Sandberg, and M. Skoglund, “Rate of prefix-free codes in LQG control systems,” in Proc. IEEE ISIT, 2016, pp. 2399–2403.
  • [5] V. Kostina and B. Hassibi, “Rate-cost tradeoffs in control,” IEEE Trans. Automat. Cont., vol. 64, no. 11, pp. 4525–4540, 2019.
  • [6] T. Cuvelier, T. Tanaka, and R. Heath, “Prefix-free coding for LQG control,” arXiv:2204.00588, 2022.
  • [7] J. Massey, “Causality, feedback and directed information,” in Proc. IEEE ISIT, 1990, pp. 303–305.
  • [8] T. Tanaka, P. M. Esfahani, and S. K. Mitter, “LQG control with minimum directed information: Semidefinite programming approach,” IEEE Trans. Automat. Contr., vol. 63, no. 1, pp. 37–52, 2018.
  • [9] M. S. Derpich and J. Østergaard, “Comments on ‘A Framework for Control System Design Subject to Average Data-Rate Constraints’,” arXiv:2103.12897, 2021.
  • [10] A. Khina, Y. Nakahira, Y. Su, and B. Hassibi, “Algorithms for optimal control with fixed-rate feedback,” in Proc. IEEE CDC, 2017, pp. 6015–6020.
  • [11] B. Amini and R. R. Bitmead, “LQG control performance with low-bitrate periodic coding,” IEEE Control of Netw. Syst., vol. 9, no. 1, pp. 320–333, 2022.
  • [12] N. Guo and V. Kostina, “Optimal causal rate-constrained sampling for a class of continuous Markov processes,” IEEE Transactions on Information Theory, vol. 67, no. 12, pp. 7876–7890, 2021.
  • [13] T. M. Cover and J. A. Thomas, Elements of Information Theory.   Wiley-Interscience, 1991.
  • [14] S. Chan, G. Goodwin, and K. Sin, “Convergence properties of the Riccati difference equation in optimal filtering of nonstabilizable systems,” IEEE Trans. Automat. Contr., vol. 29, no. 2, pp. 110–118, 1984.
  • [15] G. Dullerud and F. Paganini, A Course in Robust Control Theory.   Springer, 2000.
  • [16] Y. Ito, “Invariant measures for markov processes,” Transactions of the American Mathematical Society, vol. 110, no. 1, pp. 152–184, 1964.
  • [17] G. O. Roberts and J. S. Rosenthal, “General state space Markov chains and MCMC algorithms,” Probability Surveys, vol. 1, no. none, pp. 20 – 71, 2004.
  • [18] G. Žitković, “Theory of probability (lecture notes),” Fall 2013. [Online]. Available: https://web.ma.utexas.edu/users/gordanz/lecture_notes_page.html
  • [19] Y. Polyanskiy and Y. Wu, “Lecture notes on information theory,” 2019. [Online]. Available: {http://people.lids.mit.edu/yp/homepage/data/itlectures_v5.pdf}
  • [20] D. Bontemps, “Universal coding on infinite alphabets: Exponentially decreasing envelopes,” IEEE Trans. Inf. Theory, vol. 57, no. 3, pp. 1466–1478, 2011.