“If you stand for nothing, Burr, what will you fall for?”
―Hamilton
Remark 15.24.
Given a square matrix \(A\) and polynomial \(g(x) = \sum_i a_i x^i\text{,}\) recall that \(g(A)\) refers to the square matrix \(\sum_i a_i A^i\text{.}\)
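For instance, with a matrix chosen here just for illustration, take \(g(x) = x^2 - 3x + 2\) and
\begin{equation*}
A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}, \qquad \text{so that} \qquad g(A) = A^2 - 3A + 2I_2 = \begin{pmatrix} 1 & 3 \\ 0 & 4 \end{pmatrix} - \begin{pmatrix} 3 & 3 \\ 0 & 6 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = 0.
\end{equation*}
Note in particular that the constant term \(a_0\) of \(g(x)\) contributes \(a_0 I_n\) to \(g(A)\text{,}\) since \(A^0 = I_n\text{.}\)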
Proposition 15.25. Ideals and \(g(x)\).
Given an \(n \times n\) matrix \(A\) with entries in a field \(F\text{,}\) the set \(I := \{g(x) \in F[x] \mid g(A) = 0 \}\) forms a non-zero ideal of \(F[x]\text{.}\)
Proof.
\(I\) is an ideal since the result of evaluating the sum \(g(x) + f(x)\) at \(x = A\) is \(g(A) + f(A)\text{,}\) and the result of evaluating the product \(g(x) f(x)\) at \(x = A\) is \(g(A) f(A)\text{;}\) hence \(I\) is closed under addition and under multiplication by arbitrary elements of \(F[x]\text{.}\)
To show it is non-zero, consider the matrices \(I_n, A, A^2, \dots, A^{n^2}\text{.}\) This is a collection of \(n^2 + 1\) matrices in the \(n^2\)-dimensional \(F\)-vector space \(\Mat_{n \times n}(F)\text{,}\) and hence they must be linearly dependent: there are \(a_0, \dots, a_{n^2} \in F\text{,}\) not all of which are \(0\text{,}\) such that \(\sum_i a_i A^i = 0\text{.}\) This proves that \(\sum_i a_i x^i\) is a non-zero element of \(I\text{.}\)
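For a concrete instance of such a relation, with a matrix chosen only for illustration, take \(A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\text{:}\) here already \(A^2 = 0\text{,}\) so \(x^2\) is a non-zero element of \(I\) of degree well below the bound \(n^2 = 4\) produced by the dimension count.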
Definition 15.26. Minimum Polynomial of a Matrix.
Let \(F\) be a field and let \(A \in \Mat_{n \times n}(F)\text{.}\) The minimum polynomial of \(A\text{,}\) denoted \(\mp_A(x)\text{,}\) is the unique monic generator of the ideal \(\{g(x) \in F[x] \, \mid \, g(A) = 0\}\text{.}\) Equivalently, \(\mp_A(x)\) is the monic polynomial of least degree such that \(\mp_A(A) = 0\text{.}\)
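For example, \(\mp_{I_n}(x) = x - 1\text{,}\) and for the illustrative diagonal matrix
\begin{equation*}
A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\end{equation*}
we have \((A - I_3)(A - 2I_3) = 0\text{,}\) while no monic polynomial of degree one can annihilate \(A\) (that would force \(A\) to be a scalar multiple of \(I_3\)), so \(\mp_A(x) = (x-1)(x-2)\text{.}\) Note that this has smaller degree than the characteristic polynomial \((x-1)^2(x-2)\text{.}\)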
Proposition 15.27.
Given an \(n \times n\) matrix \(A\) and polynomial \(g(x)\in F[x]\text{,}\) we have \(g(A) = 0\) if and only if \(g(x)\) annihilates the \(F[x]\)-module \(F^n_A\text{.}\)
In particular, \(\mp_A(x)\) is the unique monic generator of the annihilator ideal
\begin{equation*}
{\ann}_{F[x]}(F^n_A) := \{ g(x) \in F[x] \mid g(x) \cdot v = 0 \text{ for all $v \in F^n_A$}\}.
\end{equation*}
Proof.
If \(g(A) = 0\text{,}\) then for each \(v \in F^n\text{,}\) by definition of the action of \(F[x]\) on \(F^n_A\) we have
\begin{equation*}
g(x) \cdot v = g(A) v = 0
\end{equation*}
and so \(g(x)\) annihilates \(F^n_A\text{.}\) Conversely, if \(g(x)\) annihilates \(F^n_A\text{,}\) then \(g(A) \cdot v = 0\) for all \(v \in F^n\text{.}\) Taking \(v = e_i\) for each \(i\text{,}\) this says that each column of \(g(A)\) is \(0\) and hence \(g(A)\) is the zero matrix.
Definition 15.28. Minimum Polynomial.
More generally, let \(V\) be an \(F\)-vector space of dimension \(n\text{,}\) and let \(g: V \to V\) be a linear transformation. The minimum polynomial of \(g\text{,}\) denoted \(\mp_g(x)\text{,}\) is the unique monic polynomial generating the ideal \(\{f(x) \in F[x] \mid f(g) = 0\}\) or, equivalently, the annihilator ideal \({\ann}_{F[x]}(V_g)\text{.}\)
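For instance, let \(V\) be the \(\mathbb{Q}\)-vector space of polynomials in \(t\) of degree less than \(n\text{,}\) and let \(g = \frac{d}{dt}\) be differentiation. Each application of \(g\) lowers degree, so \(g^n = 0\text{,}\) while \(g^{n-1}(t^{n-1}) = (n-1)! \ne 0\text{;}\) hence \(\mp_g(x) = x^n\text{.}\)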
Theorem 15.29. Cayley-Hamilton Theorem.
Let \(F\) be a field, \(V\) a finite dimensional \(F\)-vector space, and \(g: V \to V\) an \(F\)-linear operator. Let \(f_1(x), \dots, f_s(x)\) be the invariant factors of \(g\text{.}\)
The product of the invariant factors of \(g\) equals the characteristic polynomial of \(g\text{:}\)
\begin{equation*}
\cp_g(x) = f_1(x) \cdots f_s(x).
\end{equation*}
The minimum polynomial of \(g\) is the largest invariant factor:
\begin{equation*}
\mp_g(x) = f_s(x).
\end{equation*}
(The Cayley-Hamilton Theorem) The minimum polynomial of \(g\) divides its characteristic polynomial. In particular, \(g\) satisfies its characteristic polynomial:
\begin{equation*}
\cp_g(g) = 0.
\end{equation*}
Proof.
The first assertion is a consequence of Corollary CITEX, since the product of the diagonal entries of the Smith Normal Form of \(xI_n - A\text{,}\) where \(A\) is a matrix representing \(g\) with respect to some basis, is equal to the determinant of \(xI_n - A\text{.}\) (Technically, we can conclude at first only that they are associates, but since each is monic, they must be equal.)
For the second, we use the isomorphism of \(F[x]\)-modules
\begin{equation*}
V_g \cong F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_s(x)).
\end{equation*}
Note that a polynomial \(p(x)\) annihilates \(F[x]/(f_i(x))\) if and only if \(f_i(x)\) divides \(p(x)\text{.}\) Since \(f_1 \mid \cdots \mid f_s\text{,}\) the annihilator of the \(F[x]\)-module \(F[x]/(f_1(x)) \oplus \cdots \oplus F[x]/(f_s(x))\) is generated by \(f_s(x)\text{.}\) Thus the annihilator of \(V_g\) is also generated by \(f_s(x)\text{,}\) and by the Proposition \(f_s(x)\) is the minimum polynomial of \(g\text{.}\)
The third assertion is an immediate consequence of the first two.
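To see how the three assertions fit together, suppose for illustration that an operator \(g\) has invariant factors \(f_1(x) = x - 1\) and \(f_2(x) = (x-1)(x-2)\text{,}\) so that \(V_g \cong F[x]/(x-1) \oplus F[x]/((x-1)(x-2))\text{.}\) Then
\begin{equation*}
\cp_g(x) = f_1(x) f_2(x) = (x-1)^2(x-2) \quad \text{and} \quad \mp_g(x) = f_2(x) = (x-1)(x-2),
\end{equation*}
and indeed \(\mp_g(x)\) divides \(\cp_g(x)\text{.}\)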
Example 15.30.
We apply the Cayley-Hamilton Theorem: \(\mp_A(x) \mid \cp_A(x)\text{.}\) The polynomial \(\cp_A(x)\) is easy to compute since the matrix \(A\) is upper-triangular:
\begin{equation*}
\cp_A(x) =\det(xI_4 - A) = (x-1)^4.
\end{equation*}
So \(\mp_A(x) = (x-1)^j\) for some \(j \leq 4\text{.}\) By brute force, we verify that \((A-I_4)^3 \ne 0\text{,}\) and thus it must be the case that \(\mp_A(x) = \cp_A(x) = (x-1)^4\text{.}\)
Example 15.31.
As in the previous example, \(\cp_A(x) = (x-1)^4\text{,}\) and so by the Cayley-Hamilton Theorem \(\mp_A(x) = (x-1)^j\) for some \(j \leq 4\text{.}\) This time we notice that \((A-I_4)^2 = 0\) and so, since \(A-I_4 \ne 0\text{,}\) \(\mp_A(x) = (x-1)^2\text{.}\)
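The brute-force step in these two examples amounts to computing successive powers of \(A - I_4\) until one vanishes. The following sketch shows one way to automate that check; the matrix \(J\) used here is only a stand-in (a single \(4 \times 4\) Jordan block with eigenvalue \(1\text{,}\) which behaves like the first example), not the matrix from the text.
\begin{verbatim}
import numpy as np

# Stand-in matrix: a single 4x4 Jordan block with eigenvalue 1,
# so cp_J(x) = (x - 1)^4.  Find the smallest j with (J - I_4)^j = 0.
J = np.array([[1., 1., 0., 0.],
              [0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [0., 0., 0., 1.]])

N = J - np.eye(4)
for j in range(1, 5):
    vanishes = np.allclose(np.linalg.matrix_power(N, j), 0)
    print(j, vanishes)   # prints 1 False, 2 False, 3 False, 4 True
# The smallest such j is 4, so mp_J(x) = (x - 1)^4 = cp_J(x).
\end{verbatim}
A matrix behaving like the second example would instead report that the square already vanishes, giving \(\mp_A(x) = (x-1)^2\text{.}\)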