## Summary

In linear algebra, the trace of a square matrix A, denoted tr(A), is defined to be the sum of elements on the main diagonal (from the upper left to the lower right) of A. The trace is only defined for a square matrix (n × n).

It can be proved that the trace of a matrix is the sum of its (complex) eigenvalues (counted with multiplicities). It can also be proved that tr(A) = tr(C⁻¹AC) for any invertible matrix C of the same size as A. The trace is therefore the same for similar matrices and does not depend on a choice of basis, so one can define the trace of a linear operator mapping a finite-dimensional vector space into itself.

The trace is related to the derivative of the determinant (see Jacobi's formula).

## Definition

The trace of an n × n square matrix A is defined as

$\operatorname {tr} (\mathbf {A} )=\sum _{i=1}^{n}a_{ii}=a_{11}+a_{22}+\dots +a_{nn}$

where $a_{ii}$ denotes the entry in the ith row and ith column of A. The entries of A can be real numbers or (more generally) complex numbers. The trace is not defined for non-square matrices.

Expressions like tr(exp(A)), where A is a square matrix, occur so often in some fields (e.g. multivariate statistical theory), that a shorthand notation has become common:

$\operatorname {tre} (A):=\operatorname {tr} (\exp(A)).$

tre is sometimes referred to as the exponential trace function; it is used in the Golden–Thompson inequality.

## Example

Let A be a matrix, with

$\mathbf {A} ={\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}}={\begin{pmatrix}1&0&3\\11&5&2\\6&12&-5\end{pmatrix}}$

Then

$\operatorname {tr} (\mathbf {A} )=\sum _{i=1}^{3}a_{ii}=a_{11}+a_{22}+a_{33}=1+5+(-5)=1$
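The computation above can be reproduced with a few lines of code. This is a minimal sketch using plain Python lists of rows; the function name `trace` is an illustrative choice, not part of any particular library.

```python
def trace(matrix):
    """Return the sum of the main-diagonal entries of a square matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("the trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


# The 3 x 3 matrix from the example above.
A = [[1, 0, 3],
     [11, 5, 2],
     [6, 12, -5]]

print(trace(A))  # 1 + 5 + (-5) = 1
```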

## Properties

### Basic properties

The trace is a linear mapping. That is,

${\begin{aligned}\operatorname {tr} (\mathbf {A} +\mathbf {B} )&=\operatorname {tr} (\mathbf {A} )+\operatorname {tr} (\mathbf {B} )\\\operatorname {tr} (c\mathbf {A} )&=c\operatorname {tr} (\mathbf {A} )\end{aligned}}$

for all square matrices A and B of the same size, and all scalars c.

A matrix and its transpose have the same trace:

$\operatorname {tr} (\mathbf {A} )=\operatorname {tr} \left(\mathbf {A} ^{\mathsf {T}}\right).$

This follows immediately from the fact that transposing a square matrix does not affect elements along the main diagonal.
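The basic properties can be checked numerically on small integer matrices. The sketch below is only a spot check on one pair of matrices, not a proof; the helper names (`add`, `scale`, `transpose`) are illustrative.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


A = [[1, 2], [3, 4]]
B = [[5, -6], [7, 8]]
c = 3

additive = trace(add(A, B)) == trace(A) + trace(B)        # tr(A + B) = tr(A) + tr(B)
homogeneous = trace(scale(c, A)) == c * trace(A)           # tr(cA) = c tr(A)
transpose_invariant = trace(transpose(A)) == trace(A)      # tr(A^T) = tr(A)

print(additive, homogeneous, transpose_invariant)  # True True True
```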

### Trace of a product

The trace of a square matrix which is the product of two real matrices can be rewritten as the sum of entry-wise products of their elements, i.e. as the sum of all elements of their Hadamard product. Phrased directly, if A and B are two m × n real matrices, then:

$\operatorname {tr} \left(\mathbf {A} ^{\mathsf {T}}\mathbf {B} \right)=\operatorname {tr} \left(\mathbf {A} \mathbf {B} ^{\mathsf {T}}\right)=\operatorname {tr} \left(\mathbf {B} ^{\mathsf {T}}\mathbf {A} \right)=\operatorname {tr} \left(\mathbf {B} \mathbf {A} ^{\mathsf {T}}\right)=\sum _{i=1}^{m}\sum _{j=1}^{n}a_{ij}b_{ij}\;.$

If one views any m × n real matrix as a vector of length mn (an operation called vectorization), then the above operation on A and B coincides with the standard dot product. According to the above expression, $\operatorname {tr} \left(\mathbf {A} ^{\mathsf {T}}\mathbf {A} \right)$ is a sum of squares and hence is nonnegative, equal to zero if and only if A is zero. Furthermore, as noted in the above formula, $\operatorname {tr} \left(\mathbf {A} ^{\mathsf {T}}\mathbf {B} \right)=\operatorname {tr} \left(\mathbf {B} ^{\mathsf {T}}\mathbf {A} \right)$. These demonstrate the positive-definiteness and symmetry required of an inner product; it is common to call $\operatorname {tr} \left(\mathbf {A} ^{\mathsf {T}}\mathbf {B} \right)$ the Frobenius inner product of A and B. This is a natural inner product on the vector space of all real matrices of fixed dimensions. The norm derived from this inner product is called the Frobenius norm, and it satisfies a submultiplicative property, as can be proven with the Cauchy–Schwarz inequality:

$0\leq \left[\operatorname {tr} (\mathbf {A} \mathbf {B} )\right]^{2}\leq \operatorname {tr} \left(\mathbf {A} ^{2}\right)\operatorname {tr} \left(\mathbf {B} ^{2}\right)\leq \left[\operatorname {tr} (\mathbf {A} )\right]^{2}\left[\operatorname {tr} (\mathbf {B} )\right]^{2}\ .$

if A and B are real positive semi-definite matrices of the same size. The Frobenius inner product and norm arise frequently in matrix calculus and statistics.
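To make the identification of the trace with an entry-wise sum concrete, the following sketch computes the Frobenius inner product of two 2 × 3 matrices in three ways: as tr(AᵀB), as tr(ABᵀ), and as the dot product of the vectorized matrices. The matrices and helper names are illustrative assumptions.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def frobenius_inner(a, b):
    """Sum of entry-wise products, i.e. the dot product of the vectorized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


A = [[1, 2, 3], [4, 5, 6]]     # 2 x 3
B = [[7, 8, 9], [1, 0, -1]]    # 2 x 3

via_AtB = trace(matmul(transpose(A), B))   # trace of a 3 x 3 product
via_ABt = trace(matmul(A, transpose(B)))   # trace of a 2 x 2 product
entrywise = frobenius_inner(A, B)

print(via_AtB, via_ABt, entrywise)         # 48 48 48
print(frobenius_inner(A, A))               # squared Frobenius norm of A: 91
```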

The Frobenius inner product may be extended to a hermitian inner product on the complex vector space of all complex matrices of a fixed size, by replacing B by its complex conjugate.

The symmetry of the Frobenius inner product may be phrased more directly as follows: the matrices in the trace of a product can be switched without changing the result. If A and B are m × n and n × m real or complex matrices, respectively, then[note 1]

$\operatorname {tr} (\mathbf {A} \mathbf {B} )=\operatorname {tr} (\mathbf {B} \mathbf {A} )$

This is notable both for the fact that AB does not usually equal BA, and also since the trace of either does not usually equal tr(A)tr(B).[note 2] The similarity-invariance of the trace, meaning that tr(A) = tr(P⁻¹AP) for any square matrix A and any invertible matrix P of the same dimensions, is a fundamental consequence. This is proved by

$\operatorname {tr} \left(\mathbf {P} ^{-1}(\mathbf {A} \mathbf {P} )\right)=\operatorname {tr} \left((\mathbf {A} \mathbf {P} )\mathbf {P} ^{-1}\right)=\operatorname {tr} (\mathbf {A} ).$

Similarity invariance is the crucial property of the trace in order to discuss traces of linear transformations as below.

Additionally, for real column vectors $\mathbf {a} \in \mathbb {R} ^{n}$  and $\mathbf {b} \in \mathbb {R} ^{n}$ , the trace of the outer product is equivalent to the inner product:

$\operatorname {tr} \left(\mathbf {b} \mathbf {a} ^{\textsf {T}}\right)=\mathbf {a} ^{\textsf {T}}\mathbf {b}$
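The identities in this subsection can be spot-checked with exact integer arithmetic. The sketch below uses a 2 × 3 and a 3 × 2 matrix (so AB and BA have different sizes), an integer matrix P whose inverse is also an integer matrix, and two vectors of length 3. All concrete values are illustrative assumptions.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# tr(AB) = tr(BA) even though AB is 2 x 2 and BA is 3 x 3.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, -1], [0, 3]]
tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))

# Similarity invariance: tr(P^-1 M P) = tr(M).
M = [[2, 1], [0, 3]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]          # inverse of P, checked by hand: P P_inv = I
tr_similar = trace(matmul(P_inv, matmul(M, P)))

# Trace of the outer product b a^T equals the inner product a^T b.
a = [1, 2, 3]
b = [4, 5, 6]
outer = [[b_i * a_j for a_j in a] for b_i in b]
inner = sum(a_i * b_i for a_i, b_i in zip(a, b))

print(tr_AB, tr_BA)            # 18 18
print(tr_similar, trace(M))    # 5 5
print(trace(outer), inner)     # 32 32
```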

### Cyclic property

More generally, the trace is invariant under cyclic permutations, that is,

$\operatorname {tr} (\mathbf {A} \mathbf {B} \mathbf {C} \mathbf {D} )=\operatorname {tr} (\mathbf {B} \mathbf {C} \mathbf {D} \mathbf {A} )=\operatorname {tr} (\mathbf {C} \mathbf {D} \mathbf {A} \mathbf {B} )=\operatorname {tr} (\mathbf {D} \mathbf {A} \mathbf {B} \mathbf {C} ).$

This is known as the cyclic property.

Arbitrary permutations are not allowed: in general,

$\operatorname {tr} (\mathbf {A} \mathbf {B} \mathbf {C} )\neq \operatorname {tr} (\mathbf {A} \mathbf {C} \mathbf {B} ).$

However, if products of three symmetric matrices are considered, any permutation is allowed, since:

$\operatorname {tr} (\mathbf {A} \mathbf {B} \mathbf {C} )=\operatorname {tr} \left(\left(\mathbf {A} \mathbf {B} \mathbf {C} \right)^{\mathsf {T}}\right)=\operatorname {tr} (\mathbf {C} \mathbf {B} \mathbf {A} )=\operatorname {tr} (\mathbf {A} \mathbf {C} \mathbf {B} ),$

where the first equality is because the traces of a matrix and its transpose are equal. Note that this is not true in general for more than three factors.
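A small example shows both halves of the statement: cyclic shifts preserve the trace, while swapping two factors can change it. The matrices below are chosen for illustration; the symmetric triple at the end is likewise an arbitrary example.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace_of_product(*factors):
    """Trace of the product of the given square matrices, multiplied left to right."""
    product = factors[0]
    for factor in factors[1:]:
        product = matmul(product, factor)
    return sum(product[i][i] for i in range(len(product)))


A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
C = [[1, 0], [0, 0]]

cyclic = [trace_of_product(A, B, C),
          trace_of_product(B, C, A),
          trace_of_product(C, A, B)]
swapped = trace_of_product(A, C, B)

print(cyclic)    # [1, 1, 1]
print(swapped)   # 0, so tr(ABC) != tr(ACB) here

# For symmetric factors, any ordering of three matrices gives the same trace.
S1 = [[1, 2], [2, 0]]
S2 = [[0, 1], [1, 3]]
S3 = [[2, -1], [-1, 1]]
print(trace_of_product(S1, S2, S3) == trace_of_product(S1, S3, S2))  # True
```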

### Trace of a Kronecker product

The trace of the Kronecker product of two matrices is the product of their traces:

$\operatorname {tr} (\mathbf {A} \otimes \mathbf {B} )=\operatorname {tr} (\mathbf {A} )\operatorname {tr} (\mathbf {B} ).$
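As a quick check, the sketch below builds the Kronecker product of two 2 × 2 matrices directly from its block definition (block (i, j) of the result is a_ij times B) and compares traces. The helper `kron` is a hand-written illustration, not a library call.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]

K = kron(A, B)                      # 4 x 4 matrix
print(trace(K))                     # 35
print(trace(A) * trace(B))          # 5 * 7 = 35
```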

### Characterization of the trace

The following three properties:

${\begin{aligned}\operatorname {tr} (\mathbf {A} +\mathbf {B} )&=\operatorname {tr} (\mathbf {A} )+\operatorname {tr} (\mathbf {B} ),\\\operatorname {tr} (c\mathbf {A} )&=c\operatorname {tr} (\mathbf {A} ),\\\operatorname {tr} (\mathbf {A} \mathbf {B} )&=\operatorname {tr} (\mathbf {B} \mathbf {A} ),\end{aligned}}$

characterize the trace up to a scalar multiple in the sense that follows: If $f$  is a linear functional on the space of square matrices that satisfies $f(xy)=f(yx),$  then $f$  and $\operatorname {tr}$  are proportional.[note 3]

### Trace of product of symmetric and skew-symmetric matrix

If A is symmetric and B is skew-symmetric, then

$\operatorname {tr} (\mathbf {A} \mathbf {B} )=0.$
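
One way to see this, sketched here using only the transpose and product identities from the earlier subsections: since $\mathbf {A} ^{\mathsf {T}}=\mathbf {A}$ and $\mathbf {B} ^{\mathsf {T}}=-\mathbf {B}$,

$\operatorname {tr} (\mathbf {A} \mathbf {B} )=\operatorname {tr} \left((\mathbf {A} \mathbf {B} )^{\mathsf {T}}\right)=\operatorname {tr} \left(\mathbf {B} ^{\mathsf {T}}\mathbf {A} ^{\mathsf {T}}\right)=-\operatorname {tr} (\mathbf {B} \mathbf {A} )=-\operatorname {tr} (\mathbf {A} \mathbf {B} ),$

so $2\operatorname {tr} (\mathbf {A} \mathbf {B} )=0$ and hence $\operatorname {tr} (\mathbf {A} \mathbf {B} )=0$. The last step assumes scalars in which 2 is invertible, as is the case for real or complex matrices.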

### Trace as the sum of eigenvalues

For any n × n real or complex matrix A, one has

$\operatorname {tr} (\mathbf {A} )=\sum _{i=1}^{n}\lambda _{i}$

where $\lambda _{1},\dots ,\lambda _{n}$ are the eigenvalues of A counted with multiplicity. This holds true even if A is a real matrix and some (or all) of the eigenvalues are complex numbers. This may be regarded as a consequence of the existence of the Jordan canonical form, together with the similarity-invariance of the trace discussed above.
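The case of a real matrix with non-real eigenvalues can be illustrated in the 2 × 2 setting, where the eigenvalues are the roots of a quadratic. The sketch below is limited to 2 × 2 matrices and solves the characteristic polynomial with the quadratic formula; the function name and the example matrix are illustrative assumptions.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of det(m - x I) = x^2 - (a + d) x + (a d - b c) for a 2 x 2 matrix."""
    (a, b), (c, d) = m
    p = -(a + d)
    q = a * d - b * c
    root = cmath.sqrt(p * p - 4 * q)
    return (-p + root) / 2, (-p - root) / 2


A = [[1, -2], [1, 3]]               # real entries, trace 4
lam1, lam2 = eigenvalues_2x2(A)

print(lam1, lam2)                   # a complex-conjugate pair, 2 + i and 2 - i
print(lam1 + lam2)                  # the imaginary parts cancel, leaving the trace 4
```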

### Trace of commutator

When both A and B are n × n matrices, the trace of the (ring-theoretic) commutator of A and B vanishes: tr([A, B]) = 0, because tr(AB) = tr(BA) and tr is linear. One can state this as "the trace is a map of Lie algebras ${\mathfrak {gl}}_{n}\to k$ from operators to scalars", as the commutator of scalars is trivial (the scalars form an Abelian Lie algebra). In particular, using similarity invariance, it follows that the identity matrix is never similar to the commutator of any pair of matrices.

Conversely, any square matrix with zero trace is a linear combination of commutators of pairs of matrices.[note 4] Moreover, any square matrix with zero trace is unitarily equivalent to a square matrix with diagonal consisting of all zeros.

### Traces of special kinds of matrices

• The trace of the n × n identity matrix is the dimension of the space, namely n.
$\operatorname {tr} \left(\mathbf {I} _{n}\right)=n$

This leads to generalizations of dimension using trace.
• The trace of a Hermitian matrix is real, because the elements on the diagonal are real.
• The trace of a permutation matrix is the number of fixed points of the corresponding permutation, because the diagonal term $a_{ii}$ is 1 if the ith point is fixed and 0 otherwise.
• The trace of a projection matrix is the dimension of the target space.
${\begin{aligned}\mathbf {P} _{\mathbf {X} }&=\mathbf {X} \left(\mathbf {X} ^{\mathsf {T}}\mathbf {X} \right)^{-1}\mathbf {X} ^{\mathsf {T}}\\[3pt]\Longrightarrow \operatorname {tr} \left(\mathbf {P} _{\mathbf {X} }\right)&=\operatorname {rank} (\mathbf {X} ).\end{aligned}}$

The matrix $\mathbf {P} _{\mathbf {X} }$ is idempotent.
• The trace of a nilpotent matrix is zero.
When the characteristic of the base field is zero, the converse also holds: if $\operatorname {tr} \left(\mathbf {A} ^{k}\right)=0$ for all positive integers k, then A is nilpotent.
When the characteristic n > 0 is positive, the identity in n dimensions is a counterexample, as $\operatorname {tr} \left(\mathbf {I} _{n}^{k}\right)=\operatorname {tr} \left(\mathbf {I} _{n}\right)=n\equiv 0$ , but the identity is not nilpotent.
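
Several items in the list above can be verified directly on small examples. The sketch below covers the identity matrix, a permutation matrix, and a nilpotent matrix; the permutation is encoded as a list `perm` with `perm[i]` the image of `i`, which is an encoding chosen here for illustration.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def permutation_matrix(perm):
    """Matrix with a 1 in position (i, perm[i]) and 0 elsewhere."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]


# Identity: the trace is the dimension.
print(trace(identity(4)))                      # 4

# Permutation: the trace counts fixed points (here the points 1 and 3).
perm = [2, 1, 0, 3]
fixed_points = sum(1 for i, image in enumerate(perm) if i == image)
print(trace(permutation_matrix(perm)), fixed_points)   # 2 2

# Nilpotent: N^3 = 0, and every power of N has trace zero.
N = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
N2 = matmul(N, N)
N3 = matmul(N2, N)
print(trace(N), trace(N2), trace(N3))          # 0 0 0
```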

## Trace of a linear operator

In general, given some linear map f : V → V (where V is a finite-dimensional vector space), we can define the trace of this map by considering the trace of a matrix representation of f, that is, choosing a basis for V and describing f as a matrix relative to this basis, and taking the trace of this square matrix. The result will not depend on the basis chosen, since different bases will give rise to similar matrices, allowing for the possibility of a basis-independent definition for the trace of a linear map.

Such a definition can be given using the canonical isomorphism between the space End(V) of linear maps on V and V ⊗ V*, where V* is the dual space of V. Let v be in V and let f be in V*. Then the trace of the indecomposable element v ⊗ f is defined to be f(v); the trace of a general element is defined by linearity. Using an explicit basis for V and the corresponding dual basis for V*, one can show that this gives the same definition of the trace as given above.

### Eigenvalue relationships

If A is a linear operator represented by a square matrix with real or complex entries and if $\lambda _{1},\dots ,\lambda _{n}$ are the eigenvalues of A (listed according to their algebraic multiplicities), then

$\operatorname {tr} (\mathbf {A} )=\sum _{i}\lambda _{i}$

This follows from the fact that A is always similar to its Jordan form, an upper triangular matrix having $\lambda _{1},\dots ,\lambda _{n}$ on the main diagonal. In contrast, the determinant of A is the product of its eigenvalues; that is,

$\det(\mathbf {A} )=\prod _{i}\lambda _{i}.$

More generally,

$\operatorname {tr} \left(\mathbf {A} ^{k}\right)=\sum _{i}\lambda _{i}^{k}.$
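
These three relations can be checked exactly on a small symmetric matrix with integer eigenvalues. In the sketch below, the eigenvalues 1 and 3 of the example matrix are supplied by hand (with eigenvectors to confirm them) rather than computed by a solver; all names are illustrative.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A = [[2, 1], [1, 2]]

# Hand-supplied eigenpairs (eigenvalue, eigenvector); confirm A v = lambda v.
eigenpairs = [(1, [1, -1]), (3, [1, 1])]
confirmed = all(
    [sum(x * y for x, y in zip(row, v)) for row in A] == [lam * x for x in v]
    for lam, v in eigenpairs
)
eigenvalues = [lam for lam, _ in eigenpairs]

# det(A) is the product of the eigenvalues.
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# tr(A^k) is the sum of the k-th powers of the eigenvalues, for k = 1..5.
power = [[1, 0], [0, 1]]
power_traces = []
for k in range(1, 6):
    power = matmul(power, A)
    power_traces.append((trace(power), sum(lam ** k for lam in eigenvalues)))

print(confirmed)        # True
print(det_A)            # 3 = 1 * 3
print(power_traces)     # each pair has two equal entries, starting with (4, 4), (10, 10)
```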

### Derivatives

The trace corresponds to the derivative of the determinant: it is the Lie algebra analog of the (Lie group) map of the determinant. This is made precise in Jacobi's formula for the derivative of the determinant.

As a particular case, at the identity, the derivative of the determinant actually amounts to the trace: $\operatorname {tr} =\det '_{\mathbf {I} }$. From this (or from the connection between the trace and the eigenvalues), one can derive a connection between the trace function, the exponential map between a Lie algebra and its Lie group (or concretely, the matrix exponential function), and the determinant:

$\det(\exp(\mathbf {A} ))=\exp(\operatorname {tr} (\mathbf {A} )).$

For example, consider the one-parameter family of linear transformations given by rotation through angle θ,

$\mathbf {R} _{\theta }={\begin{pmatrix}\cos \theta &-\sin \theta \\\sin \theta &\cos \theta \end{pmatrix}}.$

These transformations all have determinant 1, so they preserve area. The derivative of this family at θ = 0, the identity rotation, is the antisymmetric matrix

$A={\begin{pmatrix}0&-1\\1&0\end{pmatrix}}$

which clearly has trace zero, indicating that this matrix represents an infinitesimal transformation which preserves area.
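
Both the identity det(exp(A)) = exp(tr(A)) and the rotation example can be checked numerically. The sketch below approximates the matrix exponential by a truncated power series, which is adequate for small matrices with modest entries but is not a robust general-purpose method; the function `expm_taylor`, the number of terms, and the test matrix are assumptions made for illustration.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def expm_taylor(a, terms=40):
    """Approximate exp(a) by the truncated series sum of a^k / k! for k < terms."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds a^k / k!, starting at k = 0
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# det(exp(A)) versus exp(tr(A)) for a 2 x 2 matrix with trace 0.3.
A = [[0.5, 1.0], [0.25, -0.2]]
lhs = det2(expm_taylor(A))
rhs = math.exp(A[0][0] + A[1][1])
print(abs(lhs - rhs) < 1e-9)                     # True

# Exponentiating theta times the trace-zero generator gives the rotation R_theta,
# whose determinant is exp(0) = 1.
theta = 0.7
R = expm_taylor([[0.0, -theta], [theta, 0.0]])
print(abs(R[0][0] - math.cos(theta)) < 1e-9, abs(det2(R) - 1.0) < 1e-9)  # True True
```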

A related characterization of the trace applies to linear vector fields. Given a matrix A, define a vector field F on $\mathbb {R} ^{n}$ by F(x) = Ax. The components of this vector field are linear functions (given by the rows of A). Its divergence div F is a constant function, whose value is equal to tr(A).

By the divergence theorem, one can interpret this in terms of flows: if F(x) represents the velocity of a fluid at location x and U is a region in $\mathbb {R} ^{n}$, the net flow of the fluid out of U is given by tr(A) · vol(U), where vol(U) is the volume of U.
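
The claim that div F is constant and equal to tr(A) can be checked with central finite differences, which recover the partial derivatives of a linear field up to rounding error. The step size `h`, the sample points, and the matrix below are illustrative assumptions.

```python
def linear_field(a):
    """Return the vector field F(x) = A x for a matrix A given as a list of rows."""
    def field(x):
        return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]
    return field


def divergence(field, x, h=1e-3):
    """Central-difference estimate of the sum over i of dF_i/dx_i at the point x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (field(forward)[i] - field(backward)[i]) / (2 * h)
    return total


A = [[2.0, -1.0, 0.5],
     [0.0, 3.0, 4.0],
     [1.0, 1.0, -1.5]]
F = linear_field(A)
trace_A = sum(A[i][i] for i in range(3))        # 2 + 3 - 1.5 = 3.5

# The estimate agrees with tr(A) at different points, as the divergence is constant.
div_origin = divergence(F, [0.0, 0.0, 0.0])
div_other = divergence(F, [1.0, -2.0, 5.0])
print(trace_A, round(div_origin, 6), round(div_other, 6))
```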

The trace is a linear operator, hence it commutes with the derivative:

$d\operatorname {tr} (\mathbf {X} )=\operatorname {tr} (d\mathbf {X} ).$

## Applications

The trace of a 2 × 2 complex matrix is used to classify Möbius transformations. First, the matrix is normalized to make its determinant equal to one. Then, if the square of the trace is 4, the corresponding transformation is parabolic. If the square is in the interval [0,4), it is elliptic. Finally, if the square is greater than 4, the transformation is loxodromic. See classification of Möbius transformations.
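
The classification rule described above can be sketched as a small function. Two points here go beyond the text and are assumptions of this sketch: comparisons use a tolerance `eps` because the arithmetic is floating-point, and every value of the squared trace other than 4 or a real number in [0, 4) is reported as loxodromic. Normalizing to determinant one is done implicitly, since dividing the matrix by a square root of its determinant turns the squared trace into (a + d)² / (ad − bc). The identity matrix, which is not usually classified, is not treated specially.

```python
def classify_mobius(a, b, c, d, eps=1e-12):
    """Classify z -> (a z + b) / (c z + d) by the squared trace of its normalized matrix."""
    det = complex(a) * complex(d) - complex(b) * complex(c)
    if abs(det) < eps:
        raise ValueError("the matrix must be invertible")
    t = complex(a) + complex(d)
    trace_squared = t * t / det       # squared trace after scaling to determinant one
    if abs(trace_squared - 4) < eps:
        return "parabolic"
    if abs(trace_squared.imag) < eps and -eps <= trace_squared.real < 4:
        return "elliptic"
    return "loxodromic"


print(classify_mobius(1, 1, 0, 1))    # z -> z + 1: parabolic
print(classify_mobius(0, -1, 1, 0))   # z -> -1/z: elliptic
print(classify_mobius(2, 0, 0, 1))    # z -> 2z: loxodromic
```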

The trace is used to define characters of group representations. Two representations A, B : G → GL(V) of a group G are equivalent (up to change of basis on V) if tr(A(g)) = tr(B(g)) for all g ∈ G.

The trace also plays a central role in the distribution of quadratic forms.

## Lie algebra

The trace is a map of Lie algebras $\operatorname {tr} :{\mathfrak {gl}}_{n}\to K$  from the Lie algebra ${\mathfrak {gl}}_{n}$  of linear operators on an n-dimensional space (n × n matrices with entries in $K$ ) to the Lie algebra K of scalars; as K is Abelian (the Lie bracket vanishes), the fact that this is a map of Lie algebras is exactly the statement that the trace of a bracket vanishes:

$\operatorname {tr} ([\mathbf {A} ,\mathbf {B} ])=0{\text{ for each }}\mathbf {A} ,\mathbf {B} \in {\mathfrak {gl}}_{n}.$

The kernel of this map consists of the matrices whose trace is zero, which are often said to be traceless or trace free. These matrices form the simple Lie algebra ${\mathfrak {sl}}_{n}$ , which is the Lie algebra of the special linear group of matrices with determinant 1. The special linear group consists of the matrices which do not change volume, while the special linear Lie algebra consists of the matrices which do not alter the volume of infinitesimal sets.

In fact, there is an internal direct sum decomposition ${\mathfrak {gl}}_{n}={\mathfrak {sl}}_{n}\oplus K$  of operators/matrices into traceless operators/matrices and scalar operators/matrices. The projection map onto scalar operators can be expressed in terms of the trace, concretely as:

$\mathbf {A} \mapsto {\frac {1}{n}}\operatorname {tr} (\mathbf {A} )\mathbf {I} .$

Formally, one can compose the trace (the counit map) with the unit map $K\to {\mathfrak {gl}}_{n}$  of "inclusion of scalars" to obtain a map ${\mathfrak {gl}}_{n}\to {\mathfrak {gl}}_{n}$  mapping onto scalars, and multiplying by n. Dividing by n makes this a projection, yielding the formula above.
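
The decomposition into a scalar part and a traceless part can be computed exactly with rational arithmetic. The sketch below assumes rational entries and uses Python's `fractions.Fraction`; the function names `scalar_part` and `traceless_part` are illustrative.

```python
from fractions import Fraction


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def scalar_part(m):
    """Projection onto scalar matrices: (1/n) tr(m) times the identity."""
    n = len(m)
    t = Fraction(trace(m), n)
    return [[t if i == j else Fraction(0) for j in range(n)] for i in range(n)]


def traceless_part(m):
    """What remains after subtracting the scalar part; its trace is zero."""
    s = scalar_part(m)
    n = len(m)
    return [[Fraction(m[i][j]) - s[i][j] for j in range(n)] for i in range(n)]


A = [[1, 2], [3, 4]]
S = scalar_part(A)        # (5/2) times the identity
T = traceless_part(A)     # [[-3/2, 2], [3, 3/2]]

print(trace(T))                              # 0
print(scalar_part(S) == S)                   # True: applying the projection twice changes nothing
print([[S[i][j] + T[i][j] for j in range(2)] for i in range(2)] == A)  # True
```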

In terms of short exact sequences, one has

$0\to {\mathfrak {sl}}_{n}\to {\mathfrak {gl}}_{n}{\overset {\operatorname {tr} }{\to }}K\to 0$

which is analogous to
$1\to \operatorname {SL} _{n}\to \operatorname {GL} _{n}{\overset {\det }{\to }}K^{*}\to 1$

(where $K^{*}=K\setminus \{0\}$ ) for Lie groups. However, the trace splits naturally (via $1/n$  times scalars) so ${\mathfrak {gl}}_{n}={\mathfrak {sl}}_{n}\oplus K$ , but the splitting of the determinant would be as the nth root times scalars, and this does not in general define a function, so the determinant does not split and the general linear group does not decompose:
$\operatorname {GL} _{n}\neq \operatorname {SL} _{n}\times K^{*}.$

### Bilinear forms

The bilinear form (where X, Y are square matrices)

$B(\mathbf {X} ,\mathbf {Y} )=\operatorname {tr} (\operatorname {ad} (\mathbf {X} )\operatorname {ad} (\mathbf {Y} ))\quad {\text{where }}\operatorname {ad} (\mathbf {X} )\mathbf {Y} =[\mathbf {X} ,\mathbf {Y} ]=\mathbf {X} \mathbf {Y} -\mathbf {Y} \mathbf {X}$

is called the Killing form, which is used for the classification of Lie algebras.

The trace defines a bilinear form:

$(\mathbf {X} ,\mathbf {Y} )\mapsto \operatorname {tr} (\mathbf {X} \mathbf {Y} ).$

The form is symmetric, non-degenerate[note 5] and associative in the sense that:

$\operatorname {tr} (\mathbf {X} [\mathbf {Y} ,\mathbf {Z} ])=\operatorname {tr} ([\mathbf {X} ,\mathbf {Y} ]\mathbf {Z} ).$
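
A short derivation of this associativity, sketched from linearity and the cyclic property of the trace: expanding both commutators gives

$\operatorname {tr} (\mathbf {X} [\mathbf {Y} ,\mathbf {Z} ])=\operatorname {tr} (\mathbf {X} \mathbf {Y} \mathbf {Z} )-\operatorname {tr} (\mathbf {X} \mathbf {Z} \mathbf {Y} ),\qquad \operatorname {tr} ([\mathbf {X} ,\mathbf {Y} ]\mathbf {Z} )=\operatorname {tr} (\mathbf {X} \mathbf {Y} \mathbf {Z} )-\operatorname {tr} (\mathbf {Y} \mathbf {X} \mathbf {Z} ),$

and the two right-hand sides agree because $\operatorname {tr} (\mathbf {X} \mathbf {Z} \mathbf {Y} )=\operatorname {tr} (\mathbf {Y} \mathbf {X} \mathbf {Z} )$ by a cyclic permutation of the factors.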

For a complex simple Lie algebra (such as ${\mathfrak {sl}}_{n}$), every such bilinear form is proportional to each other; in particular, to the Killing form.

Two matrices X and Y are said to be trace orthogonal if

$\operatorname {tr} (\mathbf {X} \mathbf {Y} )=0.$

## Generalizations

The concept of trace of a matrix is generalized to the trace class of compact operators on Hilbert spaces, and the analog of the Frobenius norm is called the Hilbert–Schmidt norm.

If K is trace-class, then for any orthonormal basis $(\varphi _{n})_{n}$ , the trace is given by

$\operatorname {tr} (K)=\sum _{n}\left\langle \varphi _{n},K\varphi _{n}\right\rangle ,$

and is finite and independent of the orthonormal basis.

The partial trace is another generalization of the trace that is operator-valued. The trace of a linear operator Z which lives on a product space A ⊗ B is equal to the partial traces over A and B:

$\operatorname {tr} (Z)=\operatorname {tr} _{A}\left(\operatorname {tr} _{B}(Z)\right)=\operatorname {tr} _{B}\left(\operatorname {tr} _{A}(Z)\right).$
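
For finite-dimensional spaces the partial traces can be computed by summing over one pair of indices. The sketch below assumes a particular index convention, which the text does not fix: the basis of A ⊗ B is ordered so that the row and column indices of Z are `i * dim_b + k`, with `i` indexing A and `k` indexing B. The function names are illustrative.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def partial_trace_B(z, dim_a, dim_b):
    """Trace out the B factor, leaving a dim_a x dim_a matrix acting on A."""
    return [[sum(z[i * dim_b + k][j * dim_b + k] for k in range(dim_b))
             for j in range(dim_a)] for i in range(dim_a)]


def partial_trace_A(z, dim_a, dim_b):
    """Trace out the A factor, leaving a dim_b x dim_b matrix acting on B."""
    return [[sum(z[i * dim_b + k][i * dim_b + l] for i in range(dim_a))
             for l in range(dim_b)] for k in range(dim_b)]


# A 4 x 4 operator on a product of two 2-dimensional spaces.
Z = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

Z_A = partial_trace_B(Z, 2, 2)     # [[7, 11], [23, 27]]
Z_B = partial_trace_A(Z, 2, 2)     # [[12, 14], [20, 22]]

print(trace(Z), trace(Z_A), trace(Z_B))   # 34 34 34
```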

For more properties and a generalization of the partial trace, see traced monoidal categories.

If A is a general associative algebra over a field k, then a trace on A is often defined to be any map tr : A → k which vanishes on commutators: tr([a, b]) = 0 for all a, b ∈ A. Such a trace is not uniquely defined; it can always at least be modified by multiplication by a nonzero scalar.

A supertrace is the generalization of a trace to the setting of superalgebras.

The operation of tensor contraction generalizes the trace to arbitrary tensors.

## Traces in the language of tensor products

Given a vector space V, there is a natural bilinear map V × V* → F given by sending (v, φ) to the scalar φ(v). The universal property of the tensor product V ⊗ V* automatically implies that this bilinear map is induced by a linear functional on V ⊗ V*.

Similarly, there is a natural bilinear map V × V* → Hom(V, V) given by sending (v, φ) to the linear map w ↦ φ(w)v. The universal property of the tensor product, just as used previously, says that this bilinear map is induced by a linear map V ⊗ V* → Hom(V, V). If V is finite-dimensional, then this linear map is a linear isomorphism. This fundamental fact is a straightforward consequence of the existence of a (finite) basis of V, and can also be phrased as saying that any linear map V → V can be written as the sum of (finitely many) rank-one linear maps. Composing the inverse of the isomorphism with the linear functional obtained above results in a linear functional on Hom(V, V). This linear functional is exactly the same as the trace.

Using the definition of trace as the sum of diagonal elements, the matrix formula tr(AB) = tr(BA) is straightforward to prove, and was given above. In the present perspective, one is considering linear maps S and T, and viewing them as sums of rank-one maps, so that there are linear functionals $\varphi _{i}$ and $\psi _{j}$ and nonzero vectors $v_{i}$ and $w_{j}$ such that $S(u)=\sum _{i}\varphi _{i}(u)v_{i}$ and $T(u)=\sum _{j}\psi _{j}(u)w_{j}$ for any u in V. Then

$(S\circ T)(u)=\sum _{i}\varphi _{i}\left(\sum _{j}\psi _{j}(u)w_{j}\right)v_{i}=\sum _{i}\sum _{j}\psi _{j}(u)\varphi _{i}(w_{j})v_{i}$

for any u in V. The rank-one linear map $u\mapsto \psi _{j}(u)\varphi _{i}(w_{j})v_{i}$ has trace $\psi _{j}(v_{i})\varphi _{i}(w_{j})$ and so

$\operatorname {tr} (S\circ T)=\sum _{i}\sum _{j}\psi _{j}(v_{i})\varphi _{i}(w_{j})=\sum _{j}\sum _{i}\varphi _{i}(w_{j})\psi _{j}(v_{i}).$

Following the same procedure with S and T reversed, one finds exactly the same formula, proving that tr(S ∘ T) equals tr(T ∘ S).

The above proof can be regarded as being based upon tensor products, given that the fundamental identity of End(V) with V ⊗ V* is equivalent to the expressibility of any linear map as the sum of rank-one linear maps. As such, the proof may be written in the notation of tensor products. Then one may consider the multilinear map V × V* × V × V* → V ⊗ V* given by sending (v, φ, w, ψ) to φ(w)v ⊗ ψ. Further composition with the trace map then results in φ(w)ψ(v), and this is unchanged if one were to have started with (w, ψ, v, φ) instead. One may also consider the bilinear map End(V) × End(V) → End(V) given by sending (f, g) to the composition f ∘ g, which is then induced by a linear map End(V) ⊗ End(V) → End(V). It can be seen that this coincides with the linear map V ⊗ V* ⊗ V ⊗ V* → V ⊗ V*. The established symmetry upon composition with the trace map then establishes the equality of the two traces.

For any finite-dimensional vector space V, there is a natural linear map F → V ⊗ V*; in the language of linear maps, it assigns to a scalar c the linear map $c\cdot \operatorname {id} _{V}$. Sometimes this is called the coevaluation map, and the trace V ⊗ V* → F is called the evaluation map. These structures can be axiomatized to define categorical traces in the abstract setting of category theory.