Misplaced Pages

Faddeev–LeVerrier algorithm

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Urbain Le Verrier (1811–1877)
The discoverer of Neptune.

In mathematics (linear algebra), the Faddeev–LeVerrier algorithm is a recursive method to calculate the coefficients of the characteristic polynomial p A ( λ ) = det ( λ I n A ) {\displaystyle p_{A}(\lambda )=\det(\lambda I_{n}-A)} of a square matrix, A, named after Dmitry Konstantinovich Faddeev and Urbain Le Verrier. Calculation of this polynomial yields the eigenvalues of A as its roots; as a matrix polynomial in the matrix A itself, it vanishes by the Cayley–Hamilton theorem. Computing the characteristic polynomial directly from the definition of the determinant is computationally cumbersome insofar as it introduces a new symbolic quantity λ {\displaystyle \lambda } ; by contrast, the Faddeev-Le Verrier algorithm works directly with coefficients of matrix A {\displaystyle A} .

The algorithm has been independently rediscovered several times in different forms. It was first published in 1840 by Urbain Le Verrier, subsequently redeveloped by P. Horst, Jean-Marie Souriau, in its present form here by Faddeev and Sominsky, and further by J. S. Frame, and others. (For historical points, see Householder. An elegant shortcut to the proof, bypassing Newton polynomials, was introduced by Hou. The bulk of the presentation here follows Gantmacher, p. 88.)

The Algorithm

The objective is to calculate the coefficients ck of the characteristic polynomial of the n×n matrix A,

p A ( λ ) det ( λ I n A ) = k = 0 n c k λ k   , {\displaystyle p_{A}(\lambda )\equiv \det(\lambda I_{n}-A)=\sum _{k=0}^{n}c_{k}\lambda ^{k}~,}

where, evidently, cn = 1 and c0 = (−1) det A.

The coefficients cn-i are determined by induction on i, using an auxiliary sequence of matrices

M 0 0 c n = 1 ( k = 0 ) M k A M k 1 + c n k + 1 I c n k = 1 k t r ( A M k ) k = 1 , , n   . {\displaystyle {\begin{aligned}M_{0}&\equiv 0&c_{n}&=1\qquad &(k=0)\\M_{k}&\equiv AM_{k-1}+c_{n-k+1}I\qquad \qquad &c_{n-k}&=-{\frac {1}{k}}\mathrm {tr} (AM_{k})\qquad &k=1,\ldots ,n~.\end{aligned}}}

Thus,

M 1 = I   , c n 1 = t r A = c n t r A ; {\displaystyle M_{1}=I~,\quad c_{n-1}=-\mathrm {tr} A=-c_{n}\mathrm {tr} A;}
M 2 = A I t r A , c n 2 = 1 2 ( t r A 2 ( t r A ) 2 ) = 1 2 ( c n t r A 2 + c n 1 t r A ) ; {\displaystyle M_{2}=A-I\mathrm {tr} A,\quad c_{n-2}=-{\frac {1}{2}}{\Bigl (}\mathrm {tr} A^{2}-(\mathrm {tr} A)^{2}{\Bigr )}=-{\frac {1}{2}}(c_{n}\mathrm {tr} A^{2}+c_{n-1}\mathrm {tr} A);}
M 3 = A 2 A t r A 1 2 ( t r A 2 ( t r A ) 2 ) I , {\displaystyle M_{3}=A^{2}-A\mathrm {tr} A-{\frac {1}{2}}{\Bigl (}\mathrm {tr} A^{2}-(\mathrm {tr} A)^{2}{\Bigr )}I,}
c n 3 = 1 6 ( ( tr A ) 3 3 tr ( A 2 ) ( tr A ) + 2 tr ( A 3 ) ) = 1 3 ( c n t r A 3 + c n 1 t r A 2 + c n 2 t r A ) ; {\displaystyle c_{n-3}=-{\tfrac {1}{6}}{\Bigl (}(\operatorname {tr} A)^{3}-3\operatorname {tr} (A^{2})(\operatorname {tr} A)+2\operatorname {tr} (A^{3}){\Bigr )}=-{\frac {1}{3}}(c_{n}\mathrm {tr} A^{3}+c_{n-1}\mathrm {tr} A^{2}+c_{n-2}\mathrm {tr} A);}

etc.,   ...;

M m = k = 1 m c n m + k A k 1   , {\displaystyle M_{m}=\sum _{k=1}^{m}c_{n-m+k}A^{k-1}~,}
c n m = 1 m ( c n t r A m + c n 1 t r A m 1 + . . . + c n m + 1 t r A ) = 1 m k = 1 m c n m + k t r A k   ; . . . {\displaystyle c_{n-m}=-{\frac {1}{m}}(c_{n}\mathrm {tr} A^{m}+c_{n-1}\mathrm {tr} A^{m-1}+...+c_{n-m+1}\mathrm {tr} A)=-{\frac {1}{m}}\sum _{k=1}^{m}c_{n-m+k}\mathrm {tr} A^{k}~;...}

Observe A = − Mn /c0 = (−1)Mn/detA terminates the recursion at λ. This could be used to obtain the inverse or the determinant of A.

Derivation

The proof relies on the modes of the adjugate matrix, Bk ≡ Mn−k, the auxiliary matrices encountered.   This matrix is defined by

( λ I A ) B = I   p A ( λ ) {\displaystyle (\lambda I-A)B=I~p_{A}(\lambda )}

and is thus proportional to the resolvent

B = ( λ I A ) 1 I   p A ( λ )   . {\displaystyle B=(\lambda I-A)^{-1}I~p_{A}(\lambda )~.}

It is evidently a matrix polynomial in λ of degree n−1. Thus,

B k = 0 n 1 λ k   B k = k = 0 n λ k   M n k , {\displaystyle B\equiv \sum _{k=0}^{n-1}\lambda ^{k}~B_{k}=\sum _{k=0}^{n}\lambda ^{k}~M_{n-k},}

where one may define the harmless M0≡0.

Inserting the explicit polynomial forms into the defining equation for the adjugate, above,

k = 0 n λ k + 1 M n k λ k ( A M n k + c k I ) = 0   . {\displaystyle \sum _{k=0}^{n}\lambda ^{k+1}M_{n-k}-\lambda ^{k}(AM_{n-k}+c_{k}I)=0~.}

Now, at the highest order, the first term vanishes by M0=0; whereas at the bottom order (constant in λ, from the defining equation of the adjugate, above),

M n A = B 0 A = c 0   , {\displaystyle M_{n}A=B_{0}A=c_{0}~,}

so that shifting the dummy indices of the first term yields

k = 1 n λ k ( M 1 + n k A M n k + c k I ) = 0   , {\displaystyle \sum _{k=1}^{n}\lambda ^{k}{\Big (}M_{1+n-k}-AM_{n-k}+c_{k}I{\Big )}=0~,}

which thus dictates the recursion

M m = A M m 1 + c n m + 1 I   , {\displaystyle \therefore \qquad M_{m}=AM_{m-1}+c_{n-m+1}I~,}

for m=1,...,n. Note that ascending index amounts to descending in powers of λ, but the polynomial coefficients c are yet to be determined in terms of the Ms and A.

This can be easiest achieved through the following auxiliary equation (Hou, 1998),

λ p A ( λ ) λ n p = tr A B   . {\displaystyle \lambda {\frac {\partial p_{A}(\lambda )}{\partial \lambda }}-np=\operatorname {tr} AB~.}

This is but the trace of the defining equation for B by dint of Jacobi's formula,

p A ( λ ) λ = p A ( λ ) m = 0 λ ( m + 1 ) tr A m = p A ( λ )   tr I λ I A tr B   . {\displaystyle {\frac {\partial p_{A}(\lambda )}{\partial \lambda }}=p_{A}(\lambda )\sum _{m=0}^{\infty }\lambda ^{-(m+1)}\operatorname {tr} A^{m}=p_{A}(\lambda )~\operatorname {tr} {\frac {I}{\lambda I-A}}\equiv \operatorname {tr} B~.}

Inserting the polynomial mode forms in this auxiliary equation yields

k = 1 n λ k ( k c k n c k tr A M n k ) = 0   , {\displaystyle \sum _{k=1}^{n}\lambda ^{k}{\Big (}kc_{k}-nc_{k}-\operatorname {tr} AM_{n-k}{\Big )}=0~,}

so that

m = 1 n 1 λ n m ( m c n m + tr A M m ) = 0   , {\displaystyle \sum _{m=1}^{n-1}\lambda ^{n-m}{\Big (}mc_{n-m}+\operatorname {tr} AM_{m}{\Big )}=0~,}

and finally

c n m = 1 m tr A M m   . {\displaystyle \therefore \qquad c_{n-m}=-{\frac {1}{m}}\operatorname {tr} AM_{m}~.}

This completes the recursion of the previous section, unfolding in descending powers of λ.

Further note in the algorithm that, more directly,

M m = A M m 1 1 m 1 ( tr A M m 1 ) I   , {\displaystyle M_{m}=AM_{m-1}-{\frac {1}{m-1}}(\operatorname {tr} AM_{m-1})I~,}

and, in comportance with the Cayley–Hamilton theorem,

adj ( A ) = ( 1 ) n 1 M n = ( 1 ) n 1 ( A n 1 + c n 1 A n 2 + . . . + c 2 A + c 1 I ) = ( 1 ) n 1 k = 1 n c k A k 1   . {\displaystyle \operatorname {adj} (A)=(-1)^{n-1}M_{n}=(-1)^{n-1}(A^{n-1}+c_{n-1}A^{n-2}+...+c_{2}A+c_{1}I)=(-1)^{n-1}\sum _{k=1}^{n}c_{k}A^{k-1}~.}

The final solution might be more conveniently expressed in terms of complete exponential Bell polynomials as

c n k = ( 1 ) n k k ! B k ( tr A , 1 !   tr A 2 , 2 !   tr A 3 , , ( 1 ) k 1 ( k 1 ) !   tr A k ) . {\displaystyle c_{n-k}={\frac {(-1)^{n-k}}{k!}}{\mathcal {B}}_{k}{\Bigl (}\operatorname {tr} A,-1!~\operatorname {tr} A^{2},2!~\operatorname {tr} A^{3},\ldots ,(-1)^{k-1}(k-1)!~\operatorname {tr} A^{k}{\Bigr )}.}

Example

A = [ 3 1 5 3 3 1 4 6 4 ] {\displaystyle {\displaystyle A=\left}}

M 0 = [ 0 0 0 0 0 0 0 0 0 ] c 3 = 1 M 1 = [ 1 0 0 0 1 0 0 0 1 ] A   M 1 = [ 3 1 5 3 3 1 4 6 4 ] c 2 = 1 1 10 = 10 M 2 = [ 7 1 5 3 7 1 4 6 6 ] A   M 2 = [ 2 26 14 8 12 12 6 14 2 ] c 1 = 1 2 ( 8 ) = 4 M 3 = [ 6 26 14 8 8 12 6 14 6 ] A   M 3 = [ 40 0 0 0 40 0 0 0 40 ] c 0 = 1 3 120 = 40 {\displaystyle {\displaystyle {\begin{aligned}M_{0}&=\left\quad &&&c_{3}&&&&&=&1\\M_{\mathbf {\color {blue}1} }&=\left&A~M_{1}&=\left&c_{2}&&&=-{\frac {1}{\mathbf {\color {blue}1} }}\mathbf {\color {red}10} &&=&-10\\M_{\mathbf {\color {blue}2} }&=\left\qquad &A~M_{2}&=\left\qquad &c_{1}&&&=-{\frac {1}{\mathbf {\color {blue}2} }}\mathbf {\color {red}(-8)} &&=&4\\M_{\mathbf {\color {blue}3} }&=\left\qquad &A~M_{3}&=\left\qquad &c_{0}&&&=-{\frac {1}{\mathbf {\color {blue}3} }}\mathbf {\color {red}120} &&=&-40\end{aligned}}}}

Furthermore, M 4 = A   M 3 + c 0   I = 0 {\displaystyle {\displaystyle M_{4}=A~M_{3}+c_{0}~I=0}} , which confirms the above calculations.

The characteristic polynomial of matrix A is thus p A ( λ ) = λ 3 10 λ 2 + 4 λ 40 {\displaystyle {\displaystyle p_{A}(\lambda )=\lambda ^{3}-10\lambda ^{2}+4\lambda -40}} ; the determinant of A is det ( A ) = ( 1 ) 3 c 0 = 40 {\displaystyle {\displaystyle \det(A)=(-1)^{3}c_{0}=40}} ; the trace is 10=−c2; and the inverse of A is

A 1 = 1 c 0   M 3 = 1 40 [ 6 26 14 8 8 12 6 14 6 ] = [ 0 . 15 0 . 65 0 . 35 0 . 20 0 . 20 0 . 30 0 . 15 0 . 35 0 . 15 ] {\displaystyle {\displaystyle A^{-1}=-{\frac {1}{c_{0}}}~M_{3}={\frac {1}{40}}\left=\left}} .

An equivalent but distinct expression

A compact determinant of an m×m-matrix solution for the above Jacobi's formula may alternatively determine the coefficients c,

c n m = ( 1 ) m m ! | tr A m 1 0 0 tr A 2 tr A m 2 0 tr A m 1 tr A m 2 1 tr A m tr A m 1 tr A |   . {\displaystyle c_{n-m}={\frac {(-1)^{m}}{m!}}{\begin{vmatrix}\operatorname {tr} A&m-1&0&\cdots &0\\\operatorname {tr} A^{2}&\operatorname {tr} A&m-2&\cdots &0\\\vdots &\vdots &&&\vdots \\\operatorname {tr} A^{m-1}&\operatorname {tr} A^{m-2}&\cdots &\cdots &1\\\operatorname {tr} A^{m}&\operatorname {tr} A^{m-1}&\cdots &\cdots &\operatorname {tr} A\end{vmatrix}}~.}

See also

References

  1. Urbain Le Verrier: Sur les variations séculaires des éléments des orbites pour les sept planètes principales, J. de Math. (1) 5, 230 (1840), Online
  2. Paul Horst: A method of determining the coefficients of a characteristic equation. Ann. Math. Stat. 6 83-84 (1935), doi:10.1214/aoms/1177732612
  3. Jean-Marie Souriau, Une méthode pour la décomposition spectrale et l'inversion des matrices, Comptes Rend. 227, 1010-1011 (1948).
  4. D. K. Faddeev, and I. S. Sominsky, Sbornik zadatch po vyshej algebra (Problems in higher algebra, Mir publishers, 1972), Moscow-Leningrad (1949). Problem 979.
  5. J. S. Frame: A simple recursion formula for inverting a matrix (abstract), Bull. Am. Math. Soc. 55 1045 (1949), doi:10.1090/S0002-9904-1949-09310-2
  6. Householder, Alston S. (2006). The Theory of Matrices in Numerical Analysis. Dover Books on Mathematics. ISBN 0486449726.
  7. Hou, S. H. (1998). "Classroom Note: A Simple Proof of the Leverrier--Faddeev Characteristic Polynomial Algorithm" SIAM review 40(3) 706-709, doi:10.1137/S003614459732076X .
  8. Gantmacher, F.R. (1960). The Theory of Matrices. NY: Chelsea Publishing. ISBN 0-8218-1376-5.
  9. Zadeh, Lotfi A. and Desoer, Charles A. (1963, 2008). Linear System Theory: The State Space Approach (Mc Graw-Hill; Dover Civil and Mechanical Engineering) ISBN 9780486466637, pp 303–305;
  10. Abdeljaoued, Jounaidi and Lombardi, Henri (2004). Méthodes matricielles - Introduction à la complexité algébrique, (Mathématiques et Applications, 42) Springer, ISBN 3540202471 .
  11. Brown, Lowell S. (1994). Quantum Field Theory, Cambridge University Press. ISBN 978-0-521-46946-3, p. 54; Also see, Curtright, T. L., Fairlie, D. B. and Alshal, H. (2012). "A Galileon Primer", arXiv:1212.6972, section 3.
  12. Reed, M.; Simon, B. (1978). Methods of Modern Mathematical Physics. Vol. 4 Analysis of Operators. USA: ACADEMIC PRESS, INC. pp. 323–333, 340, 343. ISBN 0-12-585004-2.

Barbaresco F. (2019) Souriau Exponential Map Algorithm for Machine Learning on Matrix Lie Groups. In: Nielsen F., Barbaresco F. (eds) Geometric Science of Information. GSI 2019. Lecture Notes in Computer Science, vol 11712. Springer, Cham. https://doi.org/10.1007/978-3-030-26980-7_10

Categories:
Faddeev–LeVerrier algorithm Add topic