# Schur complement

## Summary

In linear algebra and the theory of matrices, the Schur complement of a block matrix is defined as follows.

Suppose p, q are nonnegative integers, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let

${\displaystyle M=\left[{\begin{matrix}A&B\\C&D\end{matrix}}\right]}$
so that M is a (p + q) × (p + q) matrix.

If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by

${\displaystyle M/D:=A-BD^{-1}C.}$
If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by
${\displaystyle M/A:=D-CA^{-1}B.}$
In the case that A or D is singular, substituting a generalized inverse for the inverses in M/A and M/D yields the generalized Schur complement.
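For concreteness, both Schur complements can be computed directly with NumPy; the block entries below are arbitrary illustrative values:

```python
import numpy as np

# Arbitrary illustrative blocks with p = 2, q = 1.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])   # p x p
B = np.array([[2.0],
              [0.0]])        # p x q
C = np.array([[1.0, 1.0]])   # q x p
D = np.array([[5.0]])        # q x q

M = np.block([[A, B], [C, D]])   # (p + q) x (p + q)

# Schur complement of D in M:  M/D = A - B D^{-1} C  (a p x p matrix)
M_over_D = A - B @ np.linalg.solve(D, C)

# Schur complement of A in M:  M/A = D - C A^{-1} B  (a q x q matrix)
M_over_A = D - C @ np.linalg.solve(A, B)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the usual numerically preferable choice.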

The Schur complement is named after Issai Schur who used it to prove Schur's lemma, although it had been used previously.[1] Emilie Virginia Haynsworth was the first to call it the Schur complement.[2] The Schur complement is a key tool in the fields of numerical analysis, statistics, and matrix analysis.

## Background

The Schur complement arises when performing a block Gaussian elimination on the matrix M. In order to eliminate the elements below the block diagonal, one multiplies the matrix M by a block lower triangular matrix on the right as follows:

{\displaystyle {\begin{aligned}&M={\begin{bmatrix}A&B\\C&D\end{bmatrix}}\quad \to \quad {\begin{bmatrix}A&B\\C&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\-D^{-1}C&I_{q}\end{bmatrix}}={\begin{bmatrix}A-BD^{-1}C&B\\0&D\end{bmatrix}},\end{aligned}}}

where ${\displaystyle I_{p}}$ denotes the p × p identity matrix. As a result, the Schur complement ${\displaystyle M/D=A-BD^{-1}C}$ appears in the upper-left p × p block.

Continuing the elimination process beyond this point (i.e., performing a block Gauss–Jordan elimination),

{\displaystyle {\begin{aligned}&{\begin{bmatrix}A-BD^{-1}C&B\\0&D\end{bmatrix}}\quad \to \quad {\begin{bmatrix}I_{p}&-BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&B\\0&D\end{bmatrix}}={\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}},\end{aligned}}}

This leads to an LDU decomposition of M:

{\displaystyle {\begin{aligned}M&={\begin{bmatrix}A&B\\C&D\end{bmatrix}}={\begin{bmatrix}I_{p}&BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\D^{-1}C&I_{q}\end{bmatrix}}.\end{aligned}}}

Thus, the inverse of M may be expressed in terms of ${\displaystyle D^{-1}}$ and the inverse of the Schur complement, assuming it exists, as
{\displaystyle {\begin{aligned}M^{-1}={\begin{bmatrix}A&B\\C&D\end{bmatrix}}^{-1}={}&\left({\begin{bmatrix}I_{p}&BD^{-1}\\0&I_{q}\end{bmatrix}}{\begin{bmatrix}A-BD^{-1}C&0\\0&D\end{bmatrix}}{\begin{bmatrix}I_{p}&0\\D^{-1}C&I_{q}\end{bmatrix}}\right)^{-1}\\={}&{\begin{bmatrix}I_{p}&0\\-D^{-1}C&I_{q}\end{bmatrix}}{\begin{bmatrix}\left(A-BD^{-1}C\right)^{-1}&0\\0&D^{-1}\end{bmatrix}}{\begin{bmatrix}I_{p}&-BD^{-1}\\0&I_{q}\end{bmatrix}}\\[4pt]={}&{\begin{bmatrix}\left(A-BD^{-1}C\right)^{-1}&-\left(A-BD^{-1}C\right)^{-1}BD^{-1}\\-D^{-1}C\left(A-BD^{-1}C\right)^{-1}&D^{-1}+D^{-1}C\left(A-BD^{-1}C\right)^{-1}BD^{-1}\end{bmatrix}}\\[4pt]={}&{\begin{bmatrix}\left(M/D\right)^{-1}&-\left(M/D\right)^{-1}BD^{-1}\\-D^{-1}C\left(M/D\right)^{-1}&D^{-1}+D^{-1}C\left(M/D\right)^{-1}BD^{-1}\end{bmatrix}}.\end{aligned}}}

The above relationship comes from the elimination operations that involve ${\displaystyle D^{-1}}$ and M/D. An equivalent derivation can be done with the roles of A and D interchanged. By equating the expressions for ${\displaystyle M^{-1}}$ obtained in these two different ways, one can establish the matrix inversion lemma, which relates the two Schur complements of M: M/D and M/A (see "Derivation from LDU decomposition" in Woodbury matrix identity § Alternative proofs).
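The block factorization and the resulting block-inverse formula above can be checked numerically. The following NumPy sketch uses randomly generated blocks, with D diagonally shifted to keep it safely invertible:

```python
import numpy as np

# Random illustrative blocks; D is shifted to keep it safely invertible.
rng = np.random.default_rng(0)
p, q = 3, 2
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 5 * np.eye(q)

M = np.block([[A, B], [C, D]])
Dinv = np.linalg.inv(D)
S = A - B @ Dinv @ C                     # M/D

# Factorization M = U * diag(M/D, D) * L from the text
U = np.block([[np.eye(p), B @ Dinv],
              [np.zeros((q, p)), np.eye(q)]])
L = np.block([[np.eye(p), np.zeros((p, q))],
              [Dinv @ C, np.eye(q)]])
middle = np.block([[S, np.zeros((p, q))],
                   [np.zeros((q, p)), D]])
factorization_ok = np.allclose(M, U @ middle @ L)

# Block inverse built only from (M/D)^{-1} and D^{-1}
Sinv = np.linalg.inv(S)
Minv = np.block([[Sinv, -Sinv @ B @ Dinv],
                 [-Dinv @ C @ Sinv, Dinv + Dinv @ C @ Sinv @ B @ Dinv]])
inverse_ok = np.allclose(Minv, np.linalg.inv(M))
```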

## Properties

• If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
${\displaystyle M^{-1}={\frac {1}{AD-BC}}\left[{\begin{matrix}D&-B\\-C&A\end{matrix}}\right]}$
provided that AD − BC is non-zero.
• In general, if A is invertible, then
{\displaystyle {\begin{aligned}M&={\begin{bmatrix}A&B\\C&D\end{bmatrix}}={\begin{bmatrix}I_{p}&0\\CA^{-1}&I_{q}\end{bmatrix}}{\begin{bmatrix}A&0\\0&D-CA^{-1}B\end{bmatrix}}{\begin{bmatrix}I_{p}&A^{-1}B\\0&I_{q}\end{bmatrix}},\\[4pt]M^{-1}&={\begin{bmatrix}A^{-1}+A^{-1}B(M/A)^{-1}CA^{-1}&-A^{-1}B(M/A)^{-1}\\-(M/A)^{-1}CA^{-1}&(M/A)^{-1}\end{bmatrix}}\end{aligned}}}
whenever this inverse exists.
• (Schur's formula) When A (respectively D) is invertible, the determinant of M is given by
${\displaystyle \det(M)=\det(A)\det \left(D-CA^{-1}B\right)}$ , respectively
${\displaystyle \det(M)=\det(D)\det \left(A-BD^{-1}C\right)}$ ,
which generalizes the determinant formula for 2 × 2 matrices.
• (Guttman rank additivity formula) If D is invertible, then the rank of M is given by
${\displaystyle \operatorname {rank} (M)=\operatorname {rank} (D)+\operatorname {rank} \left(A-BD^{-1}C\right)}$
• (Haynsworth inertia additivity formula) If A is invertible, then the inertia of the block matrix M is equal to the inertia of A plus the inertia of M/A.
• (Quotient identity) ${\displaystyle A/B=((A/C)/(B/C))}$ whenever all the Schur complements involved are defined, where B is a principal submatrix of A and C is a principal submatrix of B.[3]
• The Schur complement of a Laplacian matrix is also a Laplacian matrix.[4]
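Several of these properties are easy to verify numerically. A NumPy sketch (random illustrative blocks, diagonally shifted so that A and D are invertible) checking Schur's determinant formula and the Guttman rank additivity formula:

```python
import numpy as np

# Random illustrative blocks; diagonal shifts keep A and D invertible.
rng = np.random.default_rng(1)
p, q = 3, 2
A = rng.standard_normal((p, p)) + 4 * np.eye(p)
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 4 * np.eye(q)
M = np.block([[A, B], [C, D]])

# Schur's determinant formula: det(M) = det(A) det(D - C A^{-1} B)
det_ok = np.isclose(np.linalg.det(M),
                    np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.solve(A, B)))

# Guttman rank additivity: rank(M) = rank(D) + rank(A - B D^{-1} C)
rank_lhs = np.linalg.matrix_rank(M)
rank_rhs = np.linalg.matrix_rank(D) + np.linalg.matrix_rank(A - B @ np.linalg.solve(D, C))
```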

## Application to solving linear equations

The Schur complement arises naturally in solving a system of linear equations such as[5]

${\displaystyle {\begin{bmatrix}A&B\\C&D\end{bmatrix}}{\begin{bmatrix}x\\y\end{bmatrix}}={\begin{bmatrix}u\\v\end{bmatrix}}}$ .

Assuming that the submatrix ${\displaystyle A}$ is invertible, we can eliminate ${\displaystyle x}$ from the equations: solving the first equation for ${\displaystyle x}$ gives

${\displaystyle x=A^{-1}(u-By)}$ .

Substituting this expression into the second equation yields

${\displaystyle \left(D-CA^{-1}B\right)y=v-CA^{-1}u}$ .

We refer to this as the reduced equation obtained by eliminating ${\displaystyle x}$  from the original equation. The matrix appearing in the reduced equation is called the Schur complement of the first block ${\displaystyle A}$  in ${\displaystyle M}$ :

${\displaystyle S\ {\overset {\underset {\mathrm {def} }{}}{=}}\ D-CA^{-1}B}$ .

Solving the reduced equation, we obtain

${\displaystyle y=S^{-1}\left(v-CA^{-1}u\right)}$ .

Substituting this into the first equation yields

${\displaystyle x=\left(A^{-1}+A^{-1}BS^{-1}CA^{-1}\right)u-A^{-1}BS^{-1}v}$ .

We can express the above two equations as:

${\displaystyle {\begin{bmatrix}x\\y\end{bmatrix}}={\begin{bmatrix}A^{-1}+A^{-1}BS^{-1}CA^{-1}&-A^{-1}BS^{-1}\\-S^{-1}CA^{-1}&S^{-1}\end{bmatrix}}{\begin{bmatrix}u\\v\end{bmatrix}}}$ .

Therefore, a formulation for the inverse of a block matrix is:

${\displaystyle {\begin{bmatrix}A&B\\C&D\end{bmatrix}}^{-1}={\begin{bmatrix}A^{-1}+A^{-1}BS^{-1}CA^{-1}&-A^{-1}BS^{-1}\\-S^{-1}CA^{-1}&S^{-1}\end{bmatrix}}={\begin{bmatrix}I_{p}&-A^{-1}B\\&I_{q}\end{bmatrix}}{\begin{bmatrix}A^{-1}&\\&S^{-1}\end{bmatrix}}{\begin{bmatrix}I_{p}&\\-CA^{-1}&I_{q}\end{bmatrix}}}$ .

In particular, we see that the Schur complement ${\displaystyle S}$ is the inverse of the ${\displaystyle (2,2)}$ block entry of the inverse of ${\displaystyle M}$.

In practice, one needs ${\displaystyle A}$  to be well-conditioned in order for this algorithm to be numerically accurate.
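The elimination procedure above can be sketched in a few lines of NumPy. The blocks and right-hand sides below are random illustrative data, with diagonal shifts so that A and the Schur complement are well-conditioned for this example:

```python
import numpy as np

# Random illustrative blocks and right-hand sides.
rng = np.random.default_rng(2)
p, q = 4, 3
A = rng.standard_normal((p, p)) + 5 * np.eye(p)
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 5 * np.eye(q)
u = rng.standard_normal(p)
v = rng.standard_normal(q)

# Reduced equation: (D - C A^{-1} B) y = v - C A^{-1} u
S = D - C @ np.linalg.solve(A, B)
y = np.linalg.solve(S, v - C @ np.linalg.solve(A, u))
# Back-substitution: x = A^{-1}(u - B y)
x = np.linalg.solve(A, u - B @ y)

# Cross-check against solving the full (p + q) x (p + q) system directly
M = np.block([[A, B], [C, D]])
xy_direct = np.linalg.solve(M, np.concatenate([u, v]))
```

Only systems of size p and size q are solved here, which is the point of the reduction when one block is much smaller than the other or has exploitable structure.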

In electrical engineering this is often referred to as node elimination or Kron reduction.

## Applications to probability theory and statistics

Suppose the random column vectors X, Y live in ${\textstyle \mathbb {R} ^{n}}$ and ${\textstyle \mathbb {R} ^{m}}$ respectively, and the vector (X, Y) in ${\textstyle \mathbb {R} ^{n+m}}$ has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix

${\displaystyle \Sigma =\left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right],}$

where ${\textstyle A\in \mathbb {R} ^{n\times n}}$  is the covariance matrix of X, ${\textstyle C\in \mathbb {R} ^{m\times m}}$  is the covariance matrix of Y and ${\textstyle B\in \mathbb {R} ^{n\times m}}$  is the covariance matrix between X and Y.

Then the conditional covariance of X given Y is the Schur complement of C in ${\textstyle \Sigma }$, and the conditional mean is an affine function of Y:[6]

{\displaystyle {\begin{aligned}\operatorname {Cov} (X\mid Y)&=A-BC^{-1}B^{\mathsf {T}}\\\operatorname {E} (X\mid Y)&=\operatorname {E} (X)+BC^{-1}(Y-\operatorname {E} (Y))\end{aligned}}}

If we take the matrix ${\displaystyle \Sigma }$  above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in ${\displaystyle \Sigma }$  also has a Wishart distribution.[citation needed]
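A small numerical illustration of the conditioning formulas, using arbitrary illustrative covariance values (with X in R^2 and Y in R^1, and means taken as zero for simplicity):

```python
import numpy as np

# Illustrative joint covariance of (X, Y) with X in R^2, Y in R^1.
A = np.array([[2.0, 0.3],
              [0.3, 1.5]])   # Cov(X)
B = np.array([[0.5],
              [0.2]])        # Cov(X, Y)
C = np.array([[1.0]])        # Cov(Y)
Sigma = np.block([[A, B], [B.T, C]])

# Conditional covariance Cov(X | Y): Schur complement of C in Sigma
cond_cov = A - B @ np.linalg.solve(C, B.T)

# Conditional mean E(X | Y = y) = B C^{-1} y  (zero unconditional means)
y = np.array([1.0])
cond_mean = (B @ np.linalg.solve(C, y[:, None])).ravel()
```

Note that the conditional covariance does not depend on the observed value y, a special feature of the Gaussian case.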

## Conditions for positive definiteness and semi-definiteness

Let X be a symmetric matrix of real numbers given by

${\displaystyle X=\left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right].}$

Then
• If A is invertible, then X is positive definite if and only if A and its complement X/A are both positive definite:
• ${\displaystyle X\succ 0\Leftrightarrow A\succ 0,X/A=C-B^{\mathsf {T}}A^{-1}B\succ 0.}$ [7]
• If C is invertible, then X is positive definite if and only if C and its complement X/C are both positive definite:
• ${\displaystyle X\succ 0\Leftrightarrow C\succ 0,X/C=A-BC^{-1}B^{\mathsf {T}}\succ 0.}$
• If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:
• ${\displaystyle {\text{If }}A\succ 0,{\text{ then }}X\succeq 0\Leftrightarrow X/A=C-B^{\mathsf {T}}A^{-1}B\succeq 0.}$ [7]
• If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
• ${\displaystyle {\text{If }}C\succ 0,{\text{ then }}X\succeq 0\Leftrightarrow X/C=A-BC^{-1}B^{\mathsf {T}}\succeq 0.}$

The first and third statements can be derived[5] by considering the minimizer of the quantity

${\displaystyle u^{\mathsf {T}}Au+2v^{\mathsf {T}}B^{\mathsf {T}}u+v^{\mathsf {T}}Cv,\,}$

as a function of v (for fixed u).

Furthermore, since

${\displaystyle \left[{\begin{matrix}A&B\\B^{\mathsf {T}}&C\end{matrix}}\right]\succ 0\Longleftrightarrow \left[{\begin{matrix}C&B^{\mathsf {T}}\\B&A\end{matrix}}\right]\succ 0}$

and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third) statement.
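The first equivalence can be checked numerically on a small example; `is_pd` below is a simple eigenvalue-based positive-definiteness test introduced here for illustration:

```python
import numpy as np

def is_pd(X):
    # Simple positive-definiteness test for small symmetric matrices.
    return bool(np.all(np.linalg.eigvalsh(X) > 0))

# Illustrative symmetric blocks.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
B = np.array([[0.3],
              [0.1]])
C = np.array([[1.0]])
X = np.block([[A, B], [B.T, C]])

# X > 0  <=>  A > 0  and  X/A = C - B^T A^{-1} B > 0
X_over_A = C - B.T @ np.linalg.solve(A, B)
equivalence_holds = is_pd(X) == (is_pd(A) and is_pd(X_over_A))
```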

There is also a necessary and sufficient condition for the positive semi-definiteness of X in terms of a generalized Schur complement.[1] Precisely,

• ${\displaystyle X\succeq 0\Leftrightarrow A\succeq 0,C-B^{\mathsf {T}}A^{g}B\succeq 0,\left(I-AA^{g}\right)B=0\,}$  and
• ${\displaystyle X\succeq 0\Leftrightarrow C\succeq 0,A-BC^{g}B^{\mathsf {T}}\succeq 0,\left(I-CC^{g}\right)B^{\mathsf {T}}=0,}$

where ${\displaystyle A^{g}}$  denotes a generalized inverse of ${\displaystyle A}$ .
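A sketch of the generalized condition using the Moore–Penrose pseudoinverse as one admissible choice of generalized inverse; the rank-deficient block A below is an illustrative example where the ordinary inverse does not exist:

```python
import numpy as np

# A rank-deficient PSD block A (so A^{-1} does not exist); the Moore-Penrose
# pseudoinverse serves as one admissible generalized inverse A^g.
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # rank 1, positive semi-definite
B = np.array([[0.5],
              [0.5]])
C = np.array([[1.0]])
X = np.block([[A, B], [B.T, C]])

Ag = np.linalg.pinv(A)

def is_psd(Y, tol=1e-12):
    # Eigenvalue-based PSD test with a small numerical tolerance.
    return bool(np.all(np.linalg.eigvalsh(Y) >= -tol))

range_condition = np.allclose((np.eye(2) - A @ Ag) @ B, 0)  # (I - A A^g) B = 0
gen_schur = C - B.T @ Ag @ B                                # C - B^T A^g B
equivalence_holds = is_psd(X) == (is_psd(A) and range_condition
                                  and is_psd(gen_schur))
```

The range condition (I − A A^g) B = 0 is what replaces invertibility of A: it ensures the columns of B lie in the column space of A.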