Matrix difference equation

## Summary

A matrix difference equation is a difference equation in which the value of a vector (or sometimes, a matrix) of variables at one point in time is related to its own value at one or more previous points in time, using matrices.[1][2] The order of the equation is the maximum time gap between any two indicated values of the variable vector. For example,

${\displaystyle \mathbf {x} _{t}=\mathbf {A} \mathbf {x} _{t-1}+\mathbf {B} \mathbf {x} _{t-2}}$

is a second-order matrix difference equation, in which x is an n × 1 vector of variables and A and B are n × n matrices. This equation is homogeneous because it contains no additive constant vector term. The same equation might also be written as

${\displaystyle \mathbf {x} _{t+2}=\mathbf {A} \mathbf {x} _{t+1}+\mathbf {B} \mathbf {x} _{t}}$

or as

${\displaystyle \mathbf {x} _{n}=\mathbf {A} \mathbf {x} _{n-1}+\mathbf {B} \mathbf {x} _{n-2}}$

The most commonly encountered matrix difference equations are first-order.

## Nonhomogeneous first-order case and the steady state

An example of a nonhomogeneous first-order matrix difference equation is

${\displaystyle \mathbf {x} _{t}=\mathbf {A} \mathbf {x} _{t-1}+\mathbf {b} }$

with additive constant vector b. The steady state of this system is a value x* of the vector x which, if reached, would not be deviated from subsequently. x* is found by setting xt = xt−1 = x* in the difference equation and solving for x* to obtain

${\displaystyle \mathbf {x} ^{*}=[\mathbf {I} -\mathbf {A} ]^{-1}\mathbf {b} }$

where I is the n × n identity matrix, and where it is assumed that [I − A] is invertible. Then the nonhomogeneous equation can be rewritten in homogeneous form in terms of deviations from the steady state:

${\displaystyle \left[\mathbf {x} _{t}-\mathbf {x} ^{*}\right]=\mathbf {A} \left[\mathbf {x} _{t-1}-\mathbf {x} ^{*}\right]}$
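The steady-state formula can be checked numerically. The following is a minimal sketch, assuming NumPy; the matrix `A`, the vector `b`, and the starting point are illustrative values chosen here, not taken from any source. It solves the linear system [I − A]x* = b directly and confirms that iterating the difference equation approaches the same point.

```python
import numpy as np

# Illustrative values (assumptions made for this sketch).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
b = np.array([1.0, 2.0])

# Steady state: solve (I - A) x* = b without forming the inverse explicitly.
x_star = np.linalg.solve(np.eye(2) - A, b)

# Iterate x_t = A x_{t-1} + b from an arbitrary starting point.
x = np.array([10.0, -4.0])
deviation_before = x - x_star
x = A @ x + b
deviation_after = x - x_star   # should equal A @ deviation_before
for _ in range(200):
    x = A @ x + b

print("steady state:", x_star)
print("iterate after many steps:", x)
```

The variables `deviation_before` and `deviation_after` illustrate the homogeneous deviation form: one step of the nonhomogeneous equation multiplies the deviation from x* by A.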

## Stability of the first-order case

The first-order matrix difference equation [xt − x*] = A[xt−1 − x*] is stable (that is, xt converges asymptotically to the steady state x*) if and only if all eigenvalues of the transition matrix A, whether real or complex, have absolute value less than 1.
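The eigenvalue condition can be tested directly. This is a sketch assuming NumPy; the helper name `is_stable` and the three example matrices are hypothetical, introduced only for illustration.

```python
import numpy as np

def is_stable(A):
    """Return True when every eigenvalue of A has modulus strictly below 1."""
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    return bool(spectral_radius < 1.0)

# Real eigenvalues, both inside the unit circle.
real_case = np.array([[0.5, 0.1],
                      [0.2, 0.3]])

# A rotation scaled by 0.9: a complex-conjugate pair of modulus 0.9.
theta = 0.7
complex_case = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                               [np.sin(theta),  np.cos(theta)]])

# One eigenvalue (1.1) outside the unit circle.
unstable_case = np.array([[1.1, 0.0],
                          [0.0, 0.2]])

for name, M in [("real", real_case), ("complex", complex_case), ("unstable", unstable_case)]:
    print(name, is_stable(M))
```

For complex eigenvalues, "absolute value" means the modulus, which is why `np.abs` is applied to the output of `np.linalg.eigvals`.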

## Solution of the first-order case

Assume that the equation has been put in the homogeneous form yt = Ayt−1. Then we can iterate and substitute repeatedly from the initial condition y0, which is the initial value of the vector y and which must be known in order to find the solution:

${\displaystyle {\begin{aligned}\mathbf {y} _{1}&=\mathbf {A} \mathbf {y} _{0}\\\mathbf {y} _{2}&=\mathbf {A} \mathbf {y} _{1}=\mathbf {A} ^{2}\mathbf {y} _{0}\\\mathbf {y} _{3}&=\mathbf {A} \mathbf {y} _{2}=\mathbf {A} ^{3}\mathbf {y} _{0}\end{aligned}}}$

and so forth, so that by mathematical induction the solution in terms of t is

${\displaystyle \mathbf {y} _{t}=\mathbf {A} ^{t}\mathbf {y} _{0}}$

Further, if A is diagonalizable, we can rewrite A in terms of its eigenvalues and eigenvectors, giving the solution as

${\displaystyle \mathbf {y} _{t}=\mathbf {P} \mathbf {D} ^{t}\mathbf {P} ^{-1}\mathbf {y} _{0},}$

where P is an n × n matrix whose columns are the eigenvectors of A (assuming the eigenvalues are all distinct) and D is an n × n diagonal matrix whose diagonal elements are the eigenvalues of A. This solution motivates the above stability result: the matrix power A^t shrinks to the zero matrix over time if and only if the eigenvalues of A are all less than unity in absolute value.
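The three routes to the solution (repeated substitution, the matrix power, and the eigendecomposition) can be compared numerically. The sketch below assumes NumPy and uses an illustrative diagonalizable matrix with real, distinct eigenvalues; the values of `A`, `y0`, and `t` are assumptions made for the example.

```python
import numpy as np

# Illustrative values (assumptions made for this sketch).
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
y0 = np.array([1.0, -1.0])
t = 10

# Route 1: repeated substitution y_s = A y_{s-1}.
y_iter = y0.copy()
for _ in range(t):
    y_iter = A @ y_iter

# Route 2: y_t = A^t y_0.
y_power = np.linalg.matrix_power(A, t) @ y0

# Route 3: y_t = P D^t P^{-1} y_0, with eigenvectors in the columns of P.
eigvals, P = np.linalg.eig(A)
y_eig = P @ np.diag(eigvals ** t) @ np.linalg.inv(P) @ y0

print(y_iter, y_power, y_eig)
```

When A has complex eigenvalues, `np.linalg.eig` returns complex arrays and the third route yields a complex vector whose imaginary part is zero up to rounding error.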

## Extracting the dynamics of a single scalar variable from a first-order matrix system

Starting from the n-dimensional system yt = Ayt−1, we can extract the dynamics of one of the state variables, say y1. The above solution equation for yt shows that the solution for y1,t is in terms of the n eigenvalues of A. Therefore the equation describing the evolution of y1 by itself must have a solution involving those same eigenvalues. This description intuitively motivates the equation of evolution of y1, which is

${\displaystyle y_{1,t}=a_{1}y_{1,t-1}+a_{2}y_{1,t-2}+\dots +a_{n}y_{1,t-n}}$

where the parameters ai are from the characteristic equation of the matrix A:

${\displaystyle \lambda ^{n}-a_{1}\lambda ^{n-1}-a_{2}\lambda ^{n-2}-\dots -a_{n}\lambda ^{0}=0.}$

Thus each individual scalar variable of an n-dimensional first-order linear system evolves according to a univariate nth-order difference equation, which has the same stability property (stable or unstable) as does the matrix difference equation.
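The scalar recurrence can be verified on a simulated path. This is a sketch assuming NumPy and an illustrative 3 × 3 matrix chosen here. `np.poly` applied to a square matrix returns the characteristic polynomial coefficients in the form λ^n + c1 λ^(n−1) + … + cn, so under the sign convention of the characteristic equation above the parameters are ai = −ci.

```python
import numpy as np

# Illustrative 3x3 system (values are assumptions made for this sketch).
A = np.array([[0.5, 0.1, 0.0],
              [0.2, 0.3, 0.1],
              [0.0, 0.4, 0.2]])
n = A.shape[0]

# np.poly(A) = [1, c_1, ..., c_n]; the article's a_i equals -c_i.
a = -np.poly(A)[1:]

# Simulate the full system and record only the first component y_{1,t}.
y = np.array([1.0, 2.0, -1.0])
path = [y[0]]
for _ in range(12):
    y = A @ y
    path.append(y[0])
path = np.array(path)

# Residual of y_{1,t} = a_1 y_{1,t-1} + ... + a_n y_{1,t-n} for each t >= n.
residuals = np.array([
    path[t] - sum(a[i] * path[t - 1 - i] for i in range(n))
    for t in range(n, len(path))
])
print(residuals)
```

The residuals vanish up to rounding error because, by the Cayley–Hamilton theorem, A satisfies its own characteristic equation, so every component of yt obeys the same nth-order recurrence.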

## Solution and stability of higher-order cases

Matrix difference equations of higher order—that is, with a time lag longer than one period—can be solved, and their stability analyzed, by converting them into first-order form using a block matrix (matrix of matrices). For example, suppose we have the second-order equation

${\displaystyle \mathbf {x} _{t}=\mathbf {A} \mathbf {x} _{t-1}+\mathbf {B} \mathbf {x} _{t-2}}$

with the variable vector x being n × 1 and A and B being n × n. This can be stacked in the form

${\displaystyle {\begin{bmatrix}\mathbf {x} _{t}\\\mathbf {x} _{t-1}\\\end{bmatrix}}={\begin{bmatrix}\mathbf {A} &\mathbf {B} \\\mathbf {I} &\mathbf {0} \\\end{bmatrix}}{\begin{bmatrix}\mathbf {x} _{t-1}\\\mathbf {x} _{t-2}\end{bmatrix}}}$

where I is the n × n identity matrix and 0 is the n × n zero matrix. Then denoting the 2n × 1 stacked vector of current and once-lagged variables as zt and the 2n × 2n block matrix as L, we have as before the solution

${\displaystyle \mathbf {z} _{t}=\mathbf {L} ^{t}\mathbf {z} _{0}}$

Also as before, this stacked equation, and thus the original second-order equation, is stable if and only if all eigenvalues of the matrix L are smaller than unity in absolute value.
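The stacking construction can be sketched as follows, assuming NumPy. The matrices `A` and `B` and the two initial vectors are illustrative assumptions. The block matrix L is assembled with `np.block`, and the stacked solution is compared with direct iteration of the second-order equation.

```python
import numpy as np

# Illustrative values (assumptions made for this sketch).
A = np.array([[0.4, 0.1],
              [0.0, 0.3]])
B = np.array([[0.2, 0.0],
              [0.1, 0.1]])
n = A.shape[0]

# Block matrix L = [[A, B], [I, 0]] of size 2n x 2n.
L = np.block([[A, B],
              [np.eye(n), np.zeros((n, n))]])

# Initial conditions: z_0 stacks x_0 on top of x_{-1}.
x_m1 = np.array([1.0, 0.0])   # x_{-1}
x_0 = np.array([0.5, 2.0])    # x_0
z0 = np.concatenate([x_0, x_m1])

t = 8
z_t = np.linalg.matrix_power(L, t) @ z0

# Direct iteration of x_s = A x_{s-1} + B x_{s-2} for comparison.
prev, curr = x_m1, x_0
for _ in range(t):
    prev, curr = curr, A @ curr + B @ prev

spectral_radius = np.max(np.abs(np.linalg.eigvals(L)))
print(z_t[:n], curr, spectral_radius)
```

The top n entries of `z_t` are xt and the bottom n entries are xt−1, matching the stacked vector defined above.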

## Nonlinear matrix difference equations: Riccati equations

In linear-quadratic-Gaussian control, there arises a nonlinear matrix equation for the reverse evolution of a current-and-future-cost matrix, denoted below as H. This equation is called a discrete dynamic Riccati equation, and it arises when a variable vector evolving according to a linear matrix difference equation is controlled by manipulating an exogenous vector in order to optimize a quadratic cost function. This Riccati equation assumes the following, or a similar, form:

${\displaystyle \mathbf {H} _{t-1}=\mathbf {K} +\mathbf {A} '\mathbf {H} _{t}\mathbf {A} -\mathbf {A} '\mathbf {H} _{t}\mathbf {C} \left[\mathbf {C} '\mathbf {H} _{t}\mathbf {C} +\mathbf {R} \right]^{-1}\mathbf {C} '\mathbf {H} _{t}\mathbf {A} }$

where H, K, and A are n × n, C is n × k, R is k × k, n is the number of elements in the vector to be controlled, and k is the number of elements in the control vector. The parameter matrices A and C are from the linear equation, and the parameter matrices K and R are from the quadratic cost function.

In general this equation cannot be solved analytically for Ht in terms of t; rather, the sequence of values for Ht is found by iterating the Riccati equation. However, it has been shown[3] that this Riccati equation can be solved analytically if R = 0 and n = k + 1, by reducing it to a scalar rational difference equation; moreover, for any k and n if the transition matrix A is nonsingular then the Riccati equation can be solved analytically in terms of the eigenvalues of a matrix, although these may need to be found numerically.[4]

In most contexts the evolution of H backwards through time is stable, meaning that H converges to a particular fixed matrix H* which may be irrational even if all the other matrices are rational. See also Stochastic control § Discrete time.
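Iterating the Riccati equation backwards can be sketched as below, assuming NumPy. The system (a discretized double integrator with one control), the cost matrices, the terminal condition H = K, and the iteration count are all illustrative assumptions; the function name `riccati_step` is hypothetical.

```python
import numpy as np

def riccati_step(H, A, C, K, R):
    """One backward step of the Riccati equation: map H_t to H_{t-1}."""
    CHA = C.T @ H @ A
    # Solve [C'HC + R] G = C'HA rather than forming the inverse explicitly.
    G = np.linalg.solve(C.T @ H @ C + R, CHA)
    return K + A.T @ H @ A - A.T @ H @ C @ G

# Illustrative problem (assumptions): n = 2 states, k = 1 control.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
C = np.array([[0.0],
              [0.1]])
K = np.eye(2)            # state cost
R = np.array([[1.0]])    # control cost

# Start from the terminal condition and iterate backwards in time.
H = K.copy()
for _ in range(1000):
    H = riccati_step(H, A, C, K, R)

H_star = H
print(H_star)
```

For this example the iterates settle on a symmetric positive-definite matrix that one further step leaves unchanged, which is the fixed matrix H* described above.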

A related Riccati equation[5] is

${\displaystyle \mathbf {X} _{t+1}=-\left[\mathbf {E} +\mathbf {B} \mathbf {X} _{t}\right]\left[\mathbf {C} +\mathbf {A} \mathbf {X} _{t}\right]^{-1}}$

in which the matrices X, A, B, C, and E are all n × n. This equation can be solved explicitly. Suppose $\mathbf{X}_t = \mathbf{N}_t\mathbf{D}_t^{-1}$, which certainly holds for t = 0 with N0 = X0 and with D0 = I. Then using this in the difference equation yields

${\displaystyle {\begin{aligned}\mathbf {X} _{t+1}&=-\left[\mathbf {E} +\mathbf {B} \mathbf {N} _{t}\mathbf {D} _{t}^{-1}\right]\mathbf {D} _{t}\mathbf {D} _{t}^{-1}\left[\mathbf {C} +\mathbf {A} \mathbf {N} _{t}\mathbf {D} _{t}^{-1}\right]^{-1}\\&=-\left[\mathbf {E} \mathbf {D} _{t}+\mathbf {B} \mathbf {N} _{t}\right]\left[\left[\mathbf {C} +\mathbf {A} \mathbf {N} _{t}\mathbf {D} _{t}^{-1}\right]\mathbf {D} _{t}\right]^{-1}\\&=-\left[\mathbf {E} \mathbf {D} _{t}+\mathbf {B} \mathbf {N} _{t}\right]\left[\mathbf {C} \mathbf {D} _{t}+\mathbf {A} \mathbf {N} _{t}\right]^{-1}\\&=\mathbf {N} _{t+1}\mathbf {D} _{t+1}^{-1}\end{aligned}}}$

so by induction the form $\mathbf{X}_t = \mathbf{N}_t\mathbf{D}_t^{-1}$ holds for all t. Then the evolution of N and D can be written as

${\displaystyle {\begin{bmatrix}\mathbf {N} _{t+1}\\\mathbf {D} _{t+1}\end{bmatrix}}={\begin{bmatrix}-\mathbf {B} &-\mathbf {E} \\\mathbf {A} &\mathbf {C} \end{bmatrix}}{\begin{bmatrix}\mathbf {N} _{t}\\\mathbf {D} _{t}\end{bmatrix}}\equiv \mathbf {J} {\begin{bmatrix}\mathbf {N} _{t}\\\mathbf {D} _{t}\end{bmatrix}}}$

Thus by induction

${\displaystyle {\begin{bmatrix}\mathbf {N} _{t}\\\mathbf {D} _{t}\end{bmatrix}}=\mathbf {J} ^{t}{\begin{bmatrix}\mathbf {N} _{0}\\\mathbf {D} _{0}\end{bmatrix}}}$
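The explicit solution can be compared with direct iteration. This is a sketch assuming NumPy; all matrix values are illustrative assumptions, chosen so that C + AXt stays invertible along the path. The block matrix J is built as defined above, and Xt is recovered from the stacked solution as Nt times the inverse of Dt.

```python
import numpy as np

# Illustrative n x n matrices (values are assumptions made for this sketch).
A = np.array([[0.1, 0.0],
              [0.2, 0.1]])
B = np.array([[0.3, 0.1],
              [0.0, 0.2]])
C = np.array([[2.0, 0.0],
              [0.0, 2.0]])
E = np.array([[0.1, 0.0],
              [0.1, 0.1]])
X0 = np.array([[0.5, 0.1],
               [0.0, 0.3]])
n = X0.shape[0]
t = 5

def step(X):
    """One step of X_{t+1} = -[E + B X_t][C + A X_t]^{-1}."""
    return -(E + B @ X) @ np.linalg.inv(C + A @ X)

# Direct iteration of the nonlinear equation.
X_iter = X0.copy()
for _ in range(t):
    X_iter = step(X_iter)

# Linear route: stack N_0 = X_0 over D_0 = I and apply J^t.
J = np.block([[-B, -E],
              [A, C]])
stacked_t = np.linalg.matrix_power(J, t) @ np.vstack([X0, np.eye(n)])
N_t, D_t = stacked_t[:n], stacked_t[n:]
X_closed = N_t @ np.linalg.inv(D_t)

print(X_iter)
print(X_closed)
```

The nonlinear recursion in X is thus replaced by a linear recursion in the stacked pair (N, D), at the cost of one matrix inversion at the date of interest.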