Jordan Canonical Form

Section 15.3 Jordan Canonical Form

“Success doesn’t stop when you get there.”
―Michael Jordan

We now turn to the Jordan canonical form. To motivate it, let us do an example.

Example 15.32. Companion Matrix and Jordan Blocks.

Let us consider the companion matrix of \((x-2)^3 = x^3 -6x^2 + 12x -8\text{:}\)

\begin{equation*} A = \begin{bmatrix} 0 & 0 & 8 \\ 1 & 0 & -12 \\ 0 & 1 & 6 \end{bmatrix}=C((x-2)^3)\in \M_{3 \times 3}(\Q). \end{equation*}

We can interpret this matrix as arising from the linear transformation \(g\) on

\begin{equation*} V := \Q[x]/(x-2)^3 \end{equation*}

defined as multiplication by \(x\text{.}\) Recall that the ordered basis of \(V\) that gives the matrix \(A\) as:

\begin{equation*} B=\{\ov{1}, \ov{x}, \ov{x^2}\}. \end{equation*}

But notice that

\begin{equation*} B'=\{\ov{1}, \ov{x-2}, \ov{(x-2)^2}\} \end{equation*}

is also a basis of \(V\text{.}\) Let us calculate what the operator \(g\) does to this alternative basis. We could work this out by brute force, but a cleaner way is to first compute what the operator \(g -2 \cdot \id_V\) does. Since \(g - 2 \cdot \id_V\) is multiplication by \(x-2\text{,}\) it sends each basis element to the next one, except for the last one, which is sent to \(0\text{.}\) It follows that the matrix of this operator relative to the ordered basis \(B'\) is

\begin{equation*} [g - 2 \cdot \id_V]_{B'}^{B'} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ \end{bmatrix} \end{equation*}

and hence the matrix for \(g\) itself for this basis is

\begin{equation*} [g]_{B'}^{B'} = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 0 & 1 & 2 \\\end{bmatrix} =: J_3(2). \end{equation*}

This is what’s known as a Jordan Block.

Remark 15.33.

In this example, if we used the basis \(\{ \ov{(x-2)^2}, \ov{x-2}, \ov{1}\}\) instead we would have gotten the transpose of \(J_3(2)\text{.}\)

Definition 15.34. Jordan Block.

Given a field \(F\text{,}\) and integer \(n \geq 1\text{,}\) and an element \(r\in F\text{,}\) the Jordan block \(J_n(r)\) is the \(n \times n\) with entries in \(F\) such that its diagonal entries are all \(r\text{,}\) each entry just below the diagonal is a \(1\text{,}\) and all other entries are \(0\text{:}\)

\begin{equation*} J_n(r) = \begin{bmatrix} r & 0 & 0 & \cdots & 0 \\ 1 & r & 0 & \cdots & 0 \\ 0 & \ddots & \ddots && \vdots \\ \vdots && \ddots & \ddots & 0\\ 0 & \cdots & 0 & 1 & r \\ \end{bmatrix}. \end{equation*}

(More precisely, \(a_{i,i} = r\) for all \(1 \leq i \leq n\text{,}\) \(a_{i,i-1} = 1\) for all \(2 \leq i \leq n\text{,}\) and \(a_{i,j} = 0\) for all other \(i,j\text{.}\))

Remark 15.35.

Some people (including I think the authors or our text) defined a Jordan block to be the transpose of what I have defined it to be.

Theorem 15.36. Jordan Canonical Form.

Let \(F\) be a field, let \(V\) be a finite dimensional \(F\)-vector space, and let \(g: V \to V\) be a linear transformation satisfying the property that the characteristic polynomial \(\cp_g(x)\) of \(g\) factors completely in \(F[x]\) into linear factors. Then there is an ordered basis \(B\) for \(V\) such that

\begin{equation*} [g]_B^B = J_{e_1}(r_1) \oplus \dots \oplus J_{e_s}(r_s)= \begin{bmatrix} J_{e_1}(r_1) & 0 & 0 & \cdots & 0 \\ 0 & J_{e_2}(r_2) & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & J_{e_m}(r_m) \\ \end{bmatrix}, \end{equation*}

where \(m\text{,}\) the \(e_i\)’s, and the \(r_i\)’s are such that \((x - r_1)^{e_1}, \ldots, (x - r_m)^{e^m}\) are the elementary divisors of the \(F[x]\)-module \(V_g\text{.}\) Moreover, this matrix is unique up to ordering of the Jordan Blocks.

Proof.

The proof is similar to the proof the RCF theorem, using the idea of Example CITEX above, but starting with the FTFGMPIDEDF CITEX(instead of the FTFGMPIDIFF). Here are the details:

We consider the \(F[x]\)-module \(V_g\text{.}\) Since we assume \(\cp_g(x)\) factors completely, the only irreducible polynomials in its factorization are linear. Thus the invariant factors of \(V_g\) are products of polynomials of the form \((x-r)^e\) for various \(r \in F\) and integers \(e \geq 1\text{.}\) It follows that the elementary divisors have this form too. The FTFGMPIDEDF CITEX therefore gives an isomorphism of \(F[x]\)-modules

\begin{equation*} V_g\cong \frac{F[x]}{(x - r_1)^{e_1}} \oplus \frac{F[x]}{(x -r_2)^{e_2}} \oplus \cdots \oplus\frac{F[x]}{(x - r_s)^{e_s}}. \end{equation*}

Now pick ordered bases \(B_i=\{\ov{1}, \ov{x-r_i}, \ldots, \ov{(x-r_i)^{e_i-1}}\}\) for each of the summands and set \(B\) to be their “ordered union’’ just as we did for the proof of the Theorem on RCF. CITEX By the same argument as in Example CITEX applied to each summand individually, the matrix representing multiplication by \(x\) on each summand is \(J_{e_i}(r_i)\text{.}\) This gives the existence of the JCF.

The uniqueness follows from the uniqueness clause in the FTFGMPIDEDF CITEX.

Warning 15.37.

Not every operator has a Jordan Canonical Form: Theorem 15.36 only applies if \(\cp_g(x)\) factors completely, and, conversely, if an operator \(g\) is represented by any lower-triangular matrix, then its characteristic polynomial must be a product of linear polynomials. For algebraically closed fields, such as \(\C\text{,}\) every linear operator does indeed have a JCF.

Remark 15.38.

If we flip the order of each \(B_i\) in the proof, we would end up with the transpose of what I have defined the JCF to be. This is what our text does, and it defines the JCF as the transpose of what I have defined it to be.

Corollary 15.39. JCF Exists if CharPoly Factors Linearly.

If \(A\) is an \(n \times n\) matrix with entries in a field \(F\) and \(\cp_A(x)\) factors completely into linear factors, then \(A\) is similar to a matrix in Jordan Canonical Form, and this matrix is unique up to the ordering of the Jordan Blocks.

Example 15.40. Finding JCF.

Let us find the Jordan Canonical Form of

\begin{equation*} A = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ \end{bmatrix} \in \Mat_{3 \times 3}(\Q). \end{equation*}

We found the Rational Canonical Form of this matrix before. [provisional cross-reference: ] In the process we showed that we have an isomorphism of \(\Q[x]\)-module.

\begin{equation*} \Q^3_A \cong \Q[x]/(x-1) \oplus \Q[x]/(x^2-1). \end{equation*}

By the Sunzi Remainder Theorem

\begin{equation*} \Q[x]/(x-1) \oplus \Q[x]/(x^2-1) \cong \Q[x]/(x-1) \oplus \Q[x]/(x-1) \oplus \Q[x]/(x+1) \end{equation*}

and thus the elementary divisors of \(\Q^3_A\) are \(x-1, x-1, x+1\text{.}\) By Theorem 15.36 this shows that the JCF of \(A\) is

\begin{equation*} J_1(1) \oplus J_1(1) \oplus J_1(-1) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}. \end{equation*}

Definition 15.41. Diagonalizable.

Let \(V\) be a finite dimensional vector space over a field \(F\) and let \(g: V \to V\) be an \(F\)-linear operator. We say \(g\) is diagonalizable if there is a basis \(B\) for \(V\) such that the matrix \([g]_B^B\) is a diagonal matrix.

Theorem 15.42. Diagonalizable Characterization.

Let \(F\) be a field, let \(V\) be a finite dimensional vector space, and consider a linear transformation \(t: V \rightarrow V\text{.}\) The following are equivalent:

\(t\) is diagonalizable.
\(t\) has a Jordan canonical form \(A\) and \(A\) is a diagonal matrix.
\(t\) has a Jordan canonical form and all elementary divisors are of the form \(x-r\) with \(r \in F\text{.}\)
Each invariant factor of \(t\) is a product of linear polynomials with no repeated linear factors.
The minimal polynomial of \(t\) is a product of linear polynomials with no repeated linear factors.

Proof.

Note that a diagonal matrix is an example of a matrix in JCF. By the uniqueness of the JCF, (1) holds if and only if (2) holds. Moreover, the equivalence of (2) and (3) follows by definition. A matrix has a JCF if and only if its invariant factors factor completely. In this case, the elementary divisors are constructed by decomposing each invariant factor into powers of distinct linear polynomials. This gives that (3) holds if and only if (4) holds. Finally, since the minimal polynomial is one of the invariant factors and every other invariant factor divides it, we get the equivalence between (4) and (5).

Summary

Prev Top Next