Let X be an n×n real or complex matrix. The exponential of X, denoted by e^{X} or exp(X), is the n×n matrix given by the power series
$e^{X}=\sum _{k=0}^{\infty }{1 \over k!}X^{k}$
where $X^{0}$ is defined to be the identity matrix $I$ with the same dimensions as $X$.^{[1]}
The above series always converges, so the exponential of X is well-defined. If X is a 1×1 matrix the matrix exponential of X is a 1×1 matrix whose single element is the ordinary exponential of the single element of X.
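As a minimal sketch of the definition, the partial sums of the series can be accumulated term by term. The function name `expm_series` and the cutoff `terms` are illustrative choices, not part of the definition; plain truncation is only reasonable when the norm of X is small, and it is not how numerical libraries evaluate the exponential.

```python
import numpy as np

def expm_series(X, terms=30):
    """Approximate exp(X) by the first `terms` terms of its power series."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    term = np.eye(n)           # X^0 / 0! = I
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ X / k    # X^k / k!, built from the previous term
        total = total + term
    return total

# 1x1 case: reduces to the ordinary exponential of the single entry.
one_by_one = expm_series([[2.0]])

# Zero matrix: every term beyond k = 0 vanishes, so the result is I.
exp_zero = expm_series(np.zeros((3, 3)))
```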
Properties
Elementary properties
Let X and Y be n×n complex matrices and let a and b be arbitrary complex numbers. We denote the n×n identity matrix by I and the zero matrix by 0. The matrix exponential satisfies the following properties.^{[2]}
We begin with the properties that are immediate consequences of the definition as a power series:
e^{0} = I
exp(X^{T}) = (exp X)^{T}, where X^{T} denotes the transpose of X.
The next key result is this one: if $XY=YX$, then $e^{X}e^{Y}=e^{X+Y}$.
The proof of this identity is the same as the standard power-series argument for the corresponding identity for the exponential of real numbers. That is to say, as long as $X$ and $Y$ commute, it makes no difference to the argument whether $X$ and $Y$ are numbers or matrices. This identity typically does not hold if $X$ and $Y$ do not commute (see the Golden–Thompson inequality below).
Consequences of the preceding identity are the following:
e^{aX}e^{bX} = e^{(a + b)X}
e^{X}e^{−X} = I
Using the above results, we can easily verify the following claims. If X is symmetric then e^{X} is also symmetric, and if X is skew-symmetric then e^{X} is orthogonal. If X is Hermitian then e^{X} is also Hermitian, and if X is skew-Hermitian then e^{X} is unitary.
One of the reasons for the importance of the matrix exponential is that it can be used to solve systems of linear ordinary differential equations. The solution of
${\frac {d}{dt}}y(t)=Ay(t),\quad y(0)=y_{0},$
where A is a constant matrix, is given by
$y(t)=e^{At}y_{0}.\,$
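The solution formula can be checked numerically on a small example. The sketch below assumes a diagonalizable A and evaluates e^{At} through an eigendecomposition; the helper name `expm_eig`, the matrix A (a harmonic oscillator), and the test time are illustrative assumptions.

```python
import numpy as np

def expm_eig(A, t=1.0):
    """exp(tA) via A = V diag(w) V^{-1}; assumes A is diagonalizable."""
    w, V = np.linalg.eig(A)
    # V * exp(w t) scales column j of V by exp(w_j t), i.e. V @ diag(exp(w t)).
    return (V * np.exp(w * t)) @ np.linalg.inv(V)

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
y0 = np.array([1.0, 0.0])

def y(t):
    # A and y0 are real, so the imaginary part is only rounding noise.
    return (expm_eig(A, t) @ y0).real

# Check y'(t) = A y(t) with a central finite difference.
t, h = 0.7, 1e-6
derivative = (y(t + h) - y(t - h)) / (2 * h)
residual = derivative - A @ y(t)
```

For this A the exact solution is y(t) = (cos t, −sin t), which gives an independent check of the result.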
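The matrix exponential can also be used to solve the inhomogeneous equation
${\frac {d}{dt}}y(t)=Ay(t)+b(t),\quad y(0)=y_{0}.$
The determinant of the matrix exponential
By Jacobi's formula, for any complex square matrix A the following trace identity holds:
$\det \left(e^{A}\right)=e^{\operatorname {tr} (A)}.$
In addition to providing a computational tool, this formula demonstrates that a matrix exponential is always an invertible matrix. This follows from the fact that the right-hand side of the above equation is always non-zero, and so det(e^{A}) ≠ 0, which implies that e^{A} must be invertible.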
In the real-valued case, the formula also shows that the map
$\exp \colon M_{n}(\mathbb {R} )\to \mathrm {GL} (n,\mathbb {R} )$
is not surjective, in contrast to the complex case. This follows from the fact that, for real-valued matrices, the right-hand side of the formula is always positive, while there exist invertible matrices with a negative determinant.
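The trace identity det(e^{A}) = e^{tr(A)} is easy to test numerically. The following sketch uses a truncated power series for the exponential; the helper `expm_series` and the particular matrix are assumptions made for illustration.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for exp(X); adequate for small-norm X."""
    n = X.shape[0]
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

A = np.array([[0.5, 1.0, 0.0],
              [-1.0, 0.2, 0.3],
              [0.0, -0.4, -1.0]])

det_of_exp = np.linalg.det(expm_series(A))
exp_of_trace = np.exp(np.trace(A))
```

Both quantities agree up to rounding, and both are positive even though the trace of this A is negative.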
Real symmetric matrices
The matrix exponential of a real symmetric matrix is positive definite. Let $S$ be an n×n real symmetric matrix and $x\in \mathbb {R} ^{n}$ a column vector. Using the elementary properties of the matrix exponential and of symmetric matrices, we have:
$x^{T}e^{S}x=x^{T}e^{S/2}e^{S/2}x=x^{T}(e^{S/2})^{T}e^{S/2}x=(e^{S/2}x)^{T}e^{S/2}x=\lVert e^{S/2}x\rVert ^{2}\geq 0.$
Since $e^{S/2}$ is invertible, the equality only holds for $x=0$, and we have $x^{T}e^{S}x>0$ for all non-zero $x$. Hence $e^{S}$ is positive definite.
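The argument can be illustrated with a short numerical sketch. It assumes the exponential of a symmetric matrix is computed from its spectral decomposition; the helper name `expm_symmetric`, the matrix S (chosen negative definite on purpose), and the vector x are illustrative.

```python
import numpy as np

def expm_symmetric(S):
    """exp(S) for real symmetric S via S = Q diag(w) Q^T."""
    w, Q = np.linalg.eigh(S)
    return (Q * np.exp(w)) @ Q.T

# S is negative definite, yet exp(S) is still positive definite.
S = np.array([[-2.0, 1.0, 0.0],
              [1.0, -3.0, 0.5],
              [0.0, 0.5, -1.0]])
x = np.array([1.0, -2.0, 0.5])

exp_S = expm_symmetric(S)
exp_half = expm_symmetric(S / 2)

quad_form = x @ exp_S @ x                       # x^T e^S x
norm_sq = np.linalg.norm(exp_half @ x) ** 2     # ||e^{S/2} x||^2
min_eig = np.linalg.eigvalsh(exp_S).min()       # smallest eigenvalue of e^S
```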
The exponential of sums
For any real numbers (scalars) x and y we know that the exponential function satisfies e^{x+y} = e^{x}e^{y}. The same is true for commuting matrices. If matrices X and Y commute (meaning that XY = YX), then
$e^{X+Y}=e^{X}e^{Y}.$
However, for matrices that do not commute the above equality does not necessarily hold.
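A small counterexample makes the failure concrete. The two matrices below are an illustrative choice: each is nilpotent, so its exponential is I plus the matrix itself, while their sum squares to the identity, so its exponential is cosh(1) I + sinh(1)(X + Y).

```python
import numpy as np

X = np.array([[0.0, 1.0],
              [0.0, 0.0]])
Y = np.array([[0.0, 0.0],
              [1.0, 0.0]])

commutator = X @ Y - Y @ X          # nonzero, so X and Y do not commute

# X @ X = Y @ Y = 0, so both series stop after the linear term.
exp_X = np.eye(2) + X
exp_Y = np.eye(2) + Y
product = exp_X @ exp_Y

# (X + Y) @ (X + Y) = I, so the series splits into cosh and sinh parts.
exp_sum = np.cosh(1.0) * np.eye(2) + np.sinh(1.0) * (X + Y)
```

Here `product` equals [[2, 1], [1, 1]], whereas every entry of `exp_sum` is either cosh(1) or sinh(1).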
The Lie product formula
Even if X and Y do not commute, the exponential e^{X + Y} can be computed by the Lie product formula^{[4]}
$e^{X+Y}=\lim _{k\to \infty }\left(e^{{\frac {1}{k}}X}e^{{\frac {1}{k}}Y}\right)^{k}.$
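The Lie product formula, e^{X+Y} = lim (e^{X/k} e^{Y/k})^{k} as k → ∞, can be observed numerically. The sketch below uses a truncated power series as a stand-in for the exponential; the helper names and the non-commuting test matrices are assumptions chosen for illustration.

```python
import numpy as np

def expm_series(X, terms=30):
    """Truncated power series for exp(X); adequate for small-norm X."""
    n = X.shape[0]
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

def lie_product(X, Y, k):
    """k-th approximant (exp(X/k) exp(Y/k))^k of exp(X + Y)."""
    step = expm_series(X / k) @ expm_series(Y / k)
    return np.linalg.matrix_power(step, k)

X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])
target = expm_series(X + Y)

# Frobenius-norm error of the approximant for increasing k.
errors = [np.linalg.norm(lie_product(X, Y, k) - target)
          for k in (1, 10, 100, 1000)]
```

For these matrices the error shrinks roughly in proportion to 1/k, which is the expected first-order behaviour of this splitting.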
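If X and Y are sufficiently small (but not necessarily commuting) matrices, then $e^{X}e^{Y}=e^{Z}$, where Z may be computed as a series in commutators of X and Y by means of the Baker–Campbell–Hausdorff formula:
$Z=X+Y+{\frac {1}{2}}[X,Y]+{\frac {1}{12}}[X,[X,Y]]-{\frac {1}{12}}[Y,[X,Y]]+\cdots ,$
where the remaining terms are all iterated commutators involving X and Y. If X and Y commute, then all the commutators are zero and we have simply Z = X + Y.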
Inequalities for exponentials of Hermitian matrices
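For Hermitian matrices there is a notable theorem related to the trace of matrix exponentials. If A and B are Hermitian matrices, then the Golden–Thompson inequality states that
$\operatorname {tr} \exp(A+B)\leq \operatorname {tr} \left[\exp(A)\exp(B)\right].$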
There is no requirement of commutativity. There are counterexamples to show that the Golden–Thompson inequality cannot be extended to three matrices; in any event, tr(exp(A)exp(B)exp(C)) is not guaranteed to be real for Hermitian A, B, C. However, Lieb proved^{[7]}^{[8]} that it can be generalized to three matrices if we modify the expression as follows:
$\operatorname {tr} \exp(A+B+C)\leq \int _{0}^{\infty }\mathrm {d} t\,\operatorname {tr} \left[e^{A}\left(e^{-B}+t\right)^{-1}e^{C}\left(e^{-B}+t\right)^{-1}\right].$
The exponential of a matrix is always an invertible matrix. The inverse matrix of e^{X} is given by e^{−X}. This is analogous to the fact that the exponential of a complex number is always nonzero. The matrix exponential then gives us a map
$\exp \colon M_{n}(\mathbb {C} )\to \mathrm {GL} (n,\mathbb {C} )$
from the space of all n×n matrices to the general linear group of degree n, i.e. the group of all n×n invertible matrices. In fact, this map is surjective, which means that every invertible matrix can be written as the exponential of some other matrix^{[9]} (for this, it is essential to consider the field C of complex numbers and not R).
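For a matrix X(t) that depends on a parameter t, the derivative of the exponential is
${\frac {d}{dt}}e^{X(t)}=\int _{0}^{1}e^{\alpha X(t)}{\frac {dX(t)}{dt}}e^{(1-\alpha )X(t)}\,d\alpha .$
Taking the above expression e^{X(t)} outside the integral sign and expanding the integrand with the help of the Hadamard lemma, one can obtain the following useful expression for the derivative of the matrix exponential,^{[11]}
$\left({\frac {d}{dt}}e^{X(t)}\right)e^{-X(t)}={\frac {d}{dt}}X(t)+{\frac {1}{2!}}\left[X(t),{\frac {d}{dt}}X(t)\right]+{\frac {1}{3!}}\left[X(t),\left[X(t),{\frac {d}{dt}}X(t)\right]\right]+\cdots$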
The coefficients in the expression above are different from what appears in the exponential. For a closed form, see derivative of the exponential map.
Computing the matrix exponential
Finding reliable and accurate methods to compute the matrix exponential is difficult, and this is still a topic of considerable current research in mathematics and numerical analysis. Matlab, GNU Octave, and SciPy all use the Padé approximant.^{[12]}^{[13]}^{[14]} In this section, we discuss methods that are applicable in principle to any matrix, and which can be carried out explicitly for small matrices.^{[15]} Subsequent sections describe methods suitable for numerical evaluation on large matrices.
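If a matrix is diagonal, $A=\operatorname {diag} (a_{1},\ldots ,a_{n})$, then its exponential can be obtained by exponentiating each entry on the main diagonal: $e^{A}=\operatorname {diag} (e^{a_{1}},\ldots ,e^{a_{n}})$. This result also allows one to exponentiate diagonalizable matrices: if $A=UDU^{-1}$ with D diagonal, then $e^{A}=Ue^{D}U^{-1}$.
Application of Sylvester's formula yields the same result. (To see this, note that addition and multiplication, hence also exponentiation, of diagonal matrices is equivalent to element-wise addition and multiplication, and hence exponentiation; in particular, the "one-dimensional" exponentiation acts element-wise in the diagonal case.)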
A matrix N is nilpotent if N^{q} = 0 for some positive integer q. In this case, the matrix exponential e^{N} can be computed directly from the series expansion, as the series terminates after a finite number of terms:
$e^{N}=I+N+{\frac {1}{2}}N^{2}+{\frac {1}{6}}N^{3}+\cdots +{\frac {1}{(q-1)!}}N^{q-1}.$
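Because the series is finite for a nilpotent matrix, it can be summed exactly. The following sketch assumes the input really is nilpotent; the function name `expm_nilpotent` and the strictly upper triangular example are illustrative.

```python
import math
import numpy as np

def expm_nilpotent(N):
    """exp(N) for nilpotent N: sum I + N + N^2/2! + ... until a power vanishes."""
    n = N.shape[0]
    total = np.eye(n)
    power = np.eye(n)
    for k in range(1, n + 1):       # N^n = 0 for any nilpotent n x n matrix
        power = power @ N
        if not power.any():         # all remaining terms are zero
            break
        total = total + power / math.factorial(k)
    return total

# Strictly upper triangular, hence nilpotent with N^3 = 0.
N = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])
exp_N = expm_nilpotent(N)           # equals I + N + N^2 / 2
```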
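By the Jordan–Chevalley decomposition, any n×n matrix X with complex entries can be expressed as X = A + N, where A is diagonalizable, N is nilpotent, and A commutes with N. This means that we can compute the exponential of X by reducing to the previous two cases: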
$e^{X}=e^{A+N}=e^{A}e^{N}.\,$
Note that we need the commutativity of A and N for the last step to work.
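The simplest instance of this reduction is a single Jordan block, where A = λI and N is the nilpotent shift. The sketch below is an illustration under that assumption; the value of `lam` and the helper `expm_series`, used only as an independent reference, are not taken from the text.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for exp(X), used here only as a reference."""
    n = X.shape[0]
    term = np.eye(n)
    total = np.eye(n)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

lam = 0.5
N = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])     # nilpotent part, N^3 = 0
A = lam * np.eye(3)                 # diagonalizable part, commutes with N
X = A + N                           # a 3x3 Jordan block

exp_A = np.exp(lam) * np.eye(3)
exp_N = np.eye(3) + N + N @ N / 2.0
exp_X = exp_A @ exp_N               # valid because A N = N A

reference = expm_series(X)
```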
Using the Jordan canonical form
A closely related method is, if the field is algebraically closed, to work with the Jordan form of X. Suppose that X = PJP^{−1} where J is the Jordan form of X. Then
$e^{X}=Pe^{J}P^{-1}.$
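For a simple rotation in which the perpendicular unit vectors a and b specify a plane,^{[16]} the rotation matrix R can be expressed in terms of a similar exponential function involving a generator G and angle θ.^{[17]}^{[18]}
$G=ba^{T}-ab^{T},\qquad P=-G^{2}=aa^{T}+bb^{T},\qquad P^{2}=P,\qquad PG=G=GP,$
$R(\theta )=e^{G\theta }=I+G\sin(\theta )+G^{2}(1-\cos(\theta ))=I-P+P\cos(\theta )+G\sin(\theta ).$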
The formula for the exponential results from reducing the powers of G in the series expansion (using G^{3} = −G) and identifying the respective series coefficients of G^{2} and G with 1 − cos(θ) and sin(θ) respectively. The second expression here for e^{Gθ} is the same as the expression for R(θ) in the article containing the derivation of the generator, R(θ) = e^{Gθ}.
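In two dimensions, if $a=\left[{\begin{smallmatrix}1\\0\end{smallmatrix}}\right]$ and $b=\left[{\begin{smallmatrix}0\\1\end{smallmatrix}}\right]$, then $G=\left[{\begin{smallmatrix}0&-1\\1&0\end{smallmatrix}}\right]$, $G^{2}=\left[{\begin{smallmatrix}-1&0\\0&-1\end{smallmatrix}}\right]$, and
$R(\theta )={\begin{bmatrix}1&0\\0&1\end{bmatrix}}+{\begin{bmatrix}0&-1\\1&0\end{bmatrix}}\sin(\theta )+{\begin{bmatrix}-1&0\\0&-1\end{bmatrix}}(1-\cos(\theta ))={\begin{bmatrix}\cos(\theta )&-\sin(\theta )\\\sin(\theta )&\cos(\theta )\end{bmatrix}}$
reduces to the standard matrix for a plane rotation.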
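The generator construction can be sketched numerically, assuming the generator G = b aᵀ − a bᵀ and the closed form R(θ) = I + G sin(θ) + G²(1 − cos(θ)). The function names and the three-dimensional vectors a and b below are hypothetical choices for illustration.

```python
import numpy as np

def plane_generator(a, b):
    """G = b a^T - a b^T for perpendicular unit vectors a and b."""
    return np.outer(b, a) - np.outer(a, b)

def plane_rotation(a, b, theta):
    """R(theta) = I + G sin(theta) + G^2 (1 - cos(theta))."""
    G = plane_generator(a, b)
    return np.eye(len(a)) + np.sin(theta) * G + (1.0 - np.cos(theta)) * (G @ G)

theta = np.pi / 6

# Two dimensions: recovers the standard plane rotation matrix.
R2 = plane_rotation(np.array([1.0, 0.0]), np.array([0.0, 1.0]), theta)

# Three dimensions: a hypothetical plane spanned by a and b.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 2.0]) / np.sqrt(5.0)
G3 = plane_generator(a, b)
R3 = plane_rotation(a, b, theta)
P = -G3 @ G3                    # projector onto the ab-plane
normal = np.cross(a, b)         # direction left fixed by the rotation
```

In the three-dimensional case, R3 maps a to a cos(θ) + b sin(θ) and leaves the normal direction unchanged.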
The matrix P = −G^{2} projects a vector onto the ab-plane and the rotation only affects this part of the vector. An example illustrating this is a rotation of 30° = π/6 in the plane spanned by a and b,
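Evaluation by Laurent series
By virtue of the Cayley–Hamilton theorem, the matrix exponential is expressible as a polynomial of order n−1.
If P and Q_{t} are nonzero polynomials in one variable such that P(A) = 0, and if the meromorphic function
$f(z)={\frac {e^{tz}-Q_{t}(z)}{P(z)}}$
is entire, then
$e^{tA}=Q_{t}(A).$
To prove this, multiply the first of the two above equalities by P(z) and replace z by A.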
Such a polynomial Q_{t}(z) can be found as follows (see Sylvester's formula). Letting a be a root of P, Q_{a,t}(z) is obtained from the product of P by the principal part of the Laurent series of f at a; it is proportional to the relevant Frobenius covariant. Then the sum S_{t} of the Q_{a,t}, where a runs over all the roots of P, can be taken as a particular Q_{t}. All the other Q_{t} are obtained by adding a multiple of P to S_{t}(z). In particular, S_{t}(z), the Lagrange–Sylvester polynomial, is the only Q_{t} whose degree is less than that of P.
Example: Consider the case of an arbitrary 2×2 matrix,
$A:={\begin{bmatrix}a&b\\c&d\end{bmatrix}}.$
The exponential matrix e^{tA}, by virtue of the Cayley–Hamilton theorem, must be of the form
$e^{tA}=s_{0}(t)\,I+s_{1}(t)\,A$.
(For any complex number z and any C-algebra B, we denote again by z the product of z by the unit of B.)
Thus, as indicated above, the matrix A having been decomposed into the sum of two mutually commuting pieces, the traceful piece and the traceless piece,
$A=sI+(A-sI)~,$
the matrix exponential reduces to a plain product of the exponentials of the two respective pieces. This is a formula often used in physics, as it amounts to the analog of Euler's formula for Pauli spin matrices, that is, rotations of the doublet representation of the group SU(2).
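The traceful/traceless split leads to a closed form for any 2×2 matrix. The sketch below assumes s = tr(A)/2 and q² = −det(A − sI), so that (A − sI)² = q²I and e^{tA} = e^{st}[cosh(qt) I + (sinh(qt)/q)(A − sI)]; the function name `expm_2x2` and the tolerance used for the q → 0 limit are illustrative.

```python
import cmath
import numpy as np

def expm_2x2(A, t=1.0):
    """Closed-form exp(tA) for a 2x2 matrix via A = s I + (A - s I)."""
    A = np.asarray(A, dtype=complex)
    s = np.trace(A) / 2.0
    traceless = A - s * np.eye(2)
    # Cayley-Hamilton for a traceless 2x2 matrix: traceless^2 = q^2 I.
    q = cmath.sqrt(-np.linalg.det(traceless))
    if abs(q) < 1e-12:
        ratio = t                       # limit of sinh(q t) / q as q -> 0
    else:
        ratio = cmath.sinh(q * t) / q
    return cmath.exp(s * t) * (cmath.cosh(q * t) * np.eye(2) + ratio * traceless)

theta = np.pi / 6
rotation = expm_2x2([[0.0, -1.0], [1.0, 0.0]], theta)   # q = i gives cos and sin
diagonal = expm_2x2([[1.0, 0.0], [0.0, 3.0]])           # q = 1, s = 2
shear = expm_2x2([[0.0, 1.0], [0.0, 0.0]], 2.0)         # q = 0, nilpotent case
```

The result is returned as a complex array; for a real A the imaginary parts vanish up to rounding.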
The polynomial S_{t} can also be given the following "interpolation" characterization. Define e_{t}(z) ≡ e^{tz}, and n ≡ deg P. Then S_{t}(z) is the unique polynomial of degree less than n which satisfies S_{t}^{(k)}(a) = e_{t}^{(k)}(a) whenever k is less than the multiplicity of a as a root of P. We assume, as we obviously can, that P is the minimal polynomial of A. We further assume that A is a diagonalizable matrix. In particular, the roots of P are simple, and the "interpolation" characterization indicates that S_{t} is given by the Lagrange interpolation formula, so it is the Lagrange–Sylvester polynomial.
Evaluation by implementation of Sylvester's formula
A practical, expedited computation of the above reduces to the following rapid steps. Recall from above that an n×n matrix exp(tA) amounts to a linear combination of the first n−1 powers of A by the Cayley–Hamilton theorem. For diagonalizable matrices, as illustrated above, e.g. in the 2×2 case, Sylvester's formula yields exp(tA) = B_{α} exp(tα) + B_{β} exp(tβ), where the Bs are the Frobenius covariants of A.
It is easiest, however, to simply solve for these Bs directly, by evaluating this expression and its first derivative at t = 0, in terms of A and I, to find the same answer as above.
But this simple procedure also works for defective matrices, in a generalization due to Buchheim.^{[19]} This is illustrated here for a 4×4 example of a matrix which is not diagonalizable, and the Bs are not projection matrices.
with eigenvalues λ_{1} = 3/4 and λ_{2} = 1, each with a multiplicity of two.
Consider the exponential of each eigenvalue multiplied by t, exp(λ_{i}t). Multiply each exponentiated eigenvalue by the corresponding undetermined coefficient matrix B_{i}. If the eigenvalues have an algebraic multiplicity greater than 1, then repeat the process, but now multiplying by an extra factor of t for each repetition, to ensure linear independence. For two eigenvalues of multiplicity two, this gives
$e^{At}=B_{1_{1}}e^{\lambda _{1}t}+B_{1_{2}}te^{\lambda _{1}t}+B_{2_{1}}e^{\lambda _{2}t}+B_{2_{2}}te^{\lambda _{2}t}.$
(If one eigenvalue had a multiplicity of three, then there would be the three terms: $B_{i_{1}}e^{\lambda _{i}t},~B_{i_{2}}te^{\lambda _{i}t},~B_{i_{3}}t^{2}e^{\lambda _{i}t}$. By contrast, when all eigenvalues are distinct, the Bs are just the Frobenius covariants, and solving for them as below just amounts to the inversion of the Vandermonde matrix of these 4 eigenvalues.)
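To solve for all of the unknown matrices B in terms of the first three powers of A and the identity, one needs four equations, the above one providing one such at t = 0. Further, differentiate it with respect to t three times and evaluate each resulting equation at t = 0; together with the original equation, this gives four linear equations from which the four coefficient matrices Bs can be solved.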
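The procedure can be sketched as a small linear solve. The 4×4 matrix below is an assumption, chosen to be non-diagonalizable with eigenvalues 3/4 and 1, each of multiplicity two; the helper names are illustrative. Evaluating the k-th derivative of the ansatz at t = 0 gives A^{k} = λ₁^{k}B₁₁ + kλ₁^{k−1}B₁₂ + λ₂^{k}B₂₁ + kλ₂^{k−1}B₂₂ for k = 0, 1, 2, 3.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for exp(X), used here only as a reference."""
    term = np.eye(X.shape[0])
    total = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

# Assumed example: block upper triangular, defective, eigenvalues 1, 1, 3/4, 3/4.
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, -0.125],
              [0.0, 0.0, 0.5, 0.5]])
lam1, lam2 = 0.75, 1.0

def derivative_row(k):
    """k-th t-derivative at t = 0 of e^{l1 t}, t e^{l1 t}, e^{l2 t}, t e^{l2 t}."""
    row = []
    for lam in (lam1, lam2):
        row += [lam ** k, k * lam ** (k - 1) if k > 0 else 0.0]
    return row

M = np.array([derivative_row(k) for k in range(4)])
powers = np.stack([np.linalg.matrix_power(A, k) for k in range(4)])

# Solve M @ [B11, B12, B21, B22] = [I, A, A^2, A^3] for all entries at once.
B = np.linalg.solve(M, powers.reshape(4, -1)).reshape(4, 4, 4)

def exp_tA(t):
    basis = np.array([np.exp(lam1 * t), t * np.exp(lam1 * t),
                      np.exp(lam2 * t), t * np.exp(lam2 * t)])
    return np.tensordot(basis, B, axes=1)   # sum_j basis[j] * B[j]

reference = expm_series(A)                  # independent check at t = 1
```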
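The exponential of a 1×1 matrix is just the exponential of the one entry of the matrix, so exp(J_{1}(4)) = [e^{4}]. The exponential of J_{2}(16) can be calculated by the formula e^{(λI + N)} = e^{λ}e^{N} mentioned above; this yields^{[20]}
$\exp \left({\begin{bmatrix}16&1\\0&16\end{bmatrix}}\right)=e^{16}\exp \left({\begin{bmatrix}0&1\\0&0\end{bmatrix}}\right)=e^{16}\left({\begin{bmatrix}1&0\\0&1\end{bmatrix}}+{\begin{bmatrix}0&1\\0&0\end{bmatrix}}\right)={\begin{bmatrix}e^{16}&e^{16}\\0&e^{16}\end{bmatrix}}.$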
The second step is possible due to the fact that, if AB = BA, then e^{At}B = Be^{At}. So, calculating e^{At} leads to the solution to the system, by simply integrating the third step with respect to t.
From before, we already have the general solution to the homogeneous equation. Since the sum of the homogeneous and particular solutions gives the general solution to the inhomogeneous problem, we now only need to find the particular solution.
which could be further simplified to get the requisite particular solution determined through variation of parameters.
Note c = y_{p}(0). For more rigor, see the following generalization.
Inhomogeneous case generalization: variation of parameters
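For the inhomogeneous case, we can use integrating factors (a method akin to variation of parameters). Writing the equation as $y'(t)=Ay(t)+b(t)$, we seek a particular solution of the form y_{p}(t) = exp(tA) z(t):
$y_{p}'(t)=Ae^{tA}z(t)+e^{tA}z'(t)=Ay_{p}(t)+e^{tA}z'(t).$
For y_{p} to be a solution, we need $e^{tA}z'(t)=b(t)$, that is, $z(t)=\int _{0}^{t}e^{-uA}b(u)\,du+c$. Thus
$y_{p}(t)=e^{tA}\int _{0}^{t}e^{-uA}b(u)\,du+e^{tA}c=\int _{0}^{t}e^{(t-u)A}b(u)\,du+e^{tA}c,$
where c is determined by the initial conditions of the problem.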
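The resulting formula, y_p(t) = ∫₀ᵗ e^{(t−u)A} b(u) du with c = 0, can be evaluated by numerical quadrature. The sketch below assumes a real diagonalizable A and a constant forcing term; the system (equivalent to y″ + y = 1 with zero initial data), the helper names, and the step count are illustrative assumptions.

```python
import numpy as np

def expm_eig(A, t):
    """exp(tA) for real diagonalizable A via eigendecomposition."""
    w, V = np.linalg.eig(A)
    return ((V * np.exp(w * t)) @ np.linalg.inv(V)).real

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def b(u):
    """Inhomogeneous term; constant here for simplicity."""
    return np.array([0.0, 1.0])

def particular_solution(t, steps=2000):
    """y_p(t) = integral from 0 to t of exp((t - u) A) b(u) du, trapezoidal rule."""
    u = np.linspace(0.0, t, steps + 1)
    integrand = np.array([expm_eig(A, t - ui) @ b(ui) for ui in u])
    h = t / steps
    return h * (integrand[0] / 2 + integrand[1:-1].sum(axis=0) + integrand[-1] / 2)

y_p = particular_solution(2.0)
exact = np.array([1.0 - np.cos(2.0), np.sin(2.0)])
```

For this system the exact particular solution with y_p(0) = 0 is (1 − cos t, sin t), which the quadrature reproduces to within the discretization error.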
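The matrix exponential can also be used to solve higher-order scalar equations. The unique solution of
$P\left({\tfrac {d}{dt}}\right)y=f,\qquad y^{(k)}(t_{0})=y_{k}\quad (0\leq k<n)$
is
$y(t)=\int _{t_{0}}^{t}s_{n-1}(t-x)f(x)\,dx+\sum _{k=0}^{n-1}y_{k}\,s_{k}(t-t_{0}),$
where the notation is as follows:
$P\in \mathbb {C} [X]$ is a monic polynomial of degree n > 0,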
f is a continuous complex valued function defined on some open interval I,
$t_{0}$ is a point of I,
$y_{k}$ is a complex number, and
s_{k}(t) is the coefficient of $X^{k}$ in the polynomial denoted by $S_{t}\in \mathbb {C} [X]$ in Subsection Evaluation by Laurent series above.
To justify this claim, we transform our order n scalar equation into an order one vector equation by the usual reduction to a first order system. Our vector equation takes the form
The matrix exponential of another matrix (matrix-matrix exponential),^{[21]} is defined as
$X^{Y}=e^{\log(X)\cdot Y}$
$^{Y}X=e^{Y\cdot \log(X)}$
for any normal and non-singular n×n matrix X, and any complex n×n matrix Y.
For matrix-matrix exponentials, there is a distinction between the left exponential ^{Y}X and the right exponential X^{Y}, because the multiplication operator for matrix-to-matrix is not commutative. Moreover,
If X is normal and non-singular, then X^{Y} and ^{Y}X have the same set of eigenvalues.
If X is normal and non-singular, Y is normal, and XY = YX, then X^{Y} = ^{Y}X.
If X is normal and non-singular, and X, Y, Z commute with each other, then X^{Y+Z} = X^{Y}·X^{Z} and ^{Y+Z}X = ^{Y}X·^{Z}X.
Bhatia, R. (1997). Matrix Analysis. Graduate Texts in Mathematics. 169. Springer. ISBN 978-0-387-94846-1.
E. H. Lieb (1973). "Convex trace functions and the Wigner–Yanase–Dyson conjecture". Advances in Mathematics. 11 (3): 267–288. doi:10.1016/0001-8708(73)90011-X.
H. Epstein (1973). "Remarks on two theorems of E. Lieb". Communications in Mathematical Physics. 31 (4): 317–325. Bibcode:1973CMaPh..31..317E. doi:10.1007/BF01646492. S2CID 120096681.
Weyl, Hermann (1952). Space Time Matter. Dover. p. 142. ISBN 978-0-486-60267-7.
Bjorken, James D.; Drell, Sidney D. (1964). Relativistic Quantum Mechanics. McGraw-Hill. p. 22.
Rinehart, R. F. (1955). "The equivalence of definitions of a matric function". The American Mathematical Monthly. 62 (6): 395–414.
This can be generalized; in general, the exponential of J_{n}(a) is an upper triangular matrix with e^{a}/0! on the main diagonal, e^{a}/1! on the one above, e^{a}/2! on the next one, and so on.
Ignacio Barradas and Joel E. Cohen (1994). "Iterated Exponentiation, Matrix-Matrix Exponentiation, and Entropy" (PDF). Academic Press, Inc. Archived from the original (PDF) on 2009-06-26.
Hall, Brian C. (2015), Lie groups, Lie algebras, and representations: An elementary introduction, Graduate Texts in Mathematics, 222 (2nd ed.), Springer, ISBN 978-3-319-13466-6
Horn, Roger A.; Johnson, Charles R. (1991). Topics in Matrix Analysis. Cambridge University Press. ISBN 978-0-521-46713-1.
Suzuki, Masuo (1985). "Decomposition formulas of exponential operators and Lie exponentials with some applications to quantum mechanics and statistical physics". Journal of Mathematical Physics. 26 (4): 601–612. Bibcode:1985JMP....26..601S. doi:10.1063/1.526596.
Curtright, T L; Fairlie, D B; Zachos, C K (2014). "A compact formula for rotations as spin matrix polynomials". Symmetry, Integrability and Geometry: Methods and Applications. 10: 084. arXiv:1402.3541. Bibcode:2014SIGMA..10..084C. doi:10.3842/SIGMA.2014.084. S2CID 18776942.
Householder, Alston S. (2006). The Theory of Matrices in Numerical Analysis. Dover Books on Mathematics. ISBN 978-0-486-44972-2.
Van Kortryk, T. S. (2016). "Matrix exponentials, SU(N) group elements, and real polynomial roots". Journal of Mathematical Physics. 57 (2): 021701. arXiv:1508.05859. Bibcode:2016JMP....57b1701V. doi:10.1063/1.4938418. S2CID 119647937.