11 Advanced Topics of Linear Maps
The diagonalization process we explored previously—expressing matrices as \(P^{-1}DP\) with eigenvector matrix \(P\) and diagonal eigenvalue matrix \(D\)—takes on deeper meaning when viewed through the lens of linear transformations.
This perspective reveals that the same linear transformation can look different depending on our coordinate system. By changing basis, we gain the flexibility to choose the most convenient representation for a given problem, bridging the gap between abstract properties and concrete matrix representations. This powerful viewpoint is essential in both theoretical contexts and applications ranging from computer graphics to quantum mechanics.
11.1 Vector Basis Representations
When working with a vector space \(V\) (often \(\mathbb{R}^n\)), we can represent vectors using different bases. Let’s explore how the coordinate representation of a vector changes when we switch between bases.
Consider a vector space \(V\) with two different bases:
- \(S_1 = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}\)
- \(S_2 = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}\)
For any vector \(\mathbf{v} \in V\), we can express it as a linear combination using either basis:
Using basis \(S_1\): \[\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots + c_n\mathbf{v}_n\]
The coordinate vector with respect to \(S_1\) is: \[[\mathbf{v}]_{S_1} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}\]
Similarly, using basis \(S_2\): \[\mathbf{v} = d_1\mathbf{w}_1 + d_2\mathbf{w}_2 + \ldots + d_n\mathbf{w}_n\]
The coordinate vector with respect to \(S_2\) is: \[[\mathbf{v}]_{S_2} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}\]
The fundamental question is: What is the relationship between \([\mathbf{v}]_{S_1}\) and \([\mathbf{v}]_{S_2}\)?
Theorem 11.1 Suppose that \(V\) is an \(n\)-dimensional vector space with bases \(S_1\) and \(S_2\). Then there exists a unique invertible matrix \(P=P_{S_1 \leftarrow S_2}\) such that for every vector \(\mathbf{v} \in V\) \[[\mathbf{v}]_{S_1} = P[\mathbf{v}]_{S_2}\]
Moreover, this change of basis matrix can be constructed as: \[P_{S_1 \leftarrow S_2} = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow\\ [\mathbf{w}_1]_{S_1} & [\mathbf{w}_2]_{S_1} & \cdots & [\mathbf{w}_n]_{S_1} \\ \downarrow & \downarrow & \cdots&\downarrow \end{bmatrix}. \tag{11.1}\]
Proof. Let \(\mathbf{v} \in V\) with representation in basis \(S_2\):
\[\mathbf{v} = d_1\mathbf{w}_1 + d_2\mathbf{w}_2 + \ldots + d_n\mathbf{w}_n \quad \text{where} \quad [\mathbf{v}]_{S_2} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}\]
Now we need to find \([\mathbf{v}]_{S_1}\). To do so, we use Theorem 6.1, which states that taking coordinates is a linear map, and Equation 3.1, which lets us express a linear combination in \(\mathbb{R}^n\) as the product of a matrix and a vector: \[\begin{align} [\mathbf{v}]_{S_1} & = [d_1\mathbf{w}_1 + \ldots + d_n\mathbf{w}_n]_{S_1}\\ & = d_1[\mathbf{w}_1]_{S_1} + \ldots + d_n[\mathbf{w}_n]_{S_1}\\ & = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow\\ [\mathbf{w}_1]_{S_1} & [\mathbf{w}_2]_{S_1} & \cdots & [\mathbf{w}_n]_{S_1} \\ \downarrow & \downarrow & \cdots&\downarrow \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} \end{align}\] Therefore: \[[\mathbf{v}]_{S_1} = P_{S_1 \leftarrow S_2} [\mathbf{v}]_{S_2},\] and \(P_{S_1\leftarrow S_2}\) is given by Equation 11.1. \(\square\)
If \(P=P_{S_1\leftarrow S_2}\) is the change of basis matrix from \(S_2\) to \(S_1\), then \(P^{-1}=P_{S_2\leftarrow S_1}\) is the change of basis from \(S_1\) to \(S_2\). We see this easily: from the equation \([\mathbf{v}]_{S_1} = P[\mathbf{v}]_{S_2}\), we deduce that \[[\mathbf{v}]_{S_2} = P^{-1}[\mathbf{v}]_{S_1}\]
11.1.1 Computational Problems
Theorem 11.1 provides not just a theoretical connection but a concrete algorithm for computing the change of basis matrix \(P\). The formula \([\mathbf{v}]_{S_1}=P[\mathbf{v}]_{S_2}\) looks simple, but it encodes all the steps needed to convert coordinates between bases. Pay close attention to it: it shows how to solve several types of coordinate conversion problems in one compact expression.
Find the change of basis matrix
Suppose \(S_1\) and \(S_2\) are given. Find the change of basis matrix from \(S_2\) to \(S_1\), \(P_{S_1\leftarrow S_2}\).
Example: Suppose \(S_1 = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\) and \(S_2 = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}\) are bases of \(\mathbb{R}^3\) and we are asked to find \(P_{S_1\leftarrow S_2}\).
From formula Equation 11.1, we know that we need to find \([\mathbf{w}_1]_{S_1}\), \([\mathbf{w}_2]_{S_1}\), and \([\mathbf{w}_3]_{S_1}\).
To find \([\mathbf{w}_i]_{S_1}\), we solve the vector equation: \(x_1 \mathbf{v}_1 + x_2 \mathbf{v}_2 + x_3 \mathbf{v}_3 = \mathbf{w}_i\). We look at the augmented system and we row reduce it: \[ \left[ \begin{array}{ccc|c} \uparrow & \uparrow & \uparrow & \uparrow \\ \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 & \mathbf{w}_i \\ \downarrow & \downarrow & \downarrow & \downarrow \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccc|c} 1&0&0&\uparrow \\ 0&1&0 & [\mathbf{w}_i]_{S_1} \\ 0&0&1 &\downarrow \end{array}\right]. \] Since \(\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\) is a basis, the first three columns of the reduced matrix form the identity and the last column gives us the coordinates. We can combine all of them:
\[ \left[ \begin{array}{ccc|ccc} \uparrow & \uparrow & \uparrow & \uparrow & \uparrow & \uparrow\\ \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 & \mathbf{w}_1 & \mathbf{w}_2 & \mathbf{w}_3 \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccc|ccc} 1&0&0&\uparrow &\uparrow & \uparrow\\ 0&1&0 & [\mathbf{w}_1]_{S_1} & [\mathbf{w}_2]_{S_1} & [\mathbf{w}_3]_{S_1} \\ 0&0&1 &\downarrow & \downarrow & \downarrow \end{array}\right] \] Therefore, in general, to find \(P_{S_1\leftarrow S_2}\) we row reduce the augmented matrix \([S_1|S_2]\) to get \([I|P_{S_1\leftarrow S_2}]\):
\[[S_1|S_2]\xrightarrow{\text{RREF}} [I|P_{S_1\leftarrow S_2}]\]
Where:
- \(S_1\) is the matrix with columns being the vectors of the first basis
- \(S_2\) is the matrix with columns being the vectors of the second basis
- \(P_{S_1\leftarrow S_2}\) is the change of basis matrix from \(S_2\) to \(S_1\)
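This row-reduction recipe is easy to check numerically. Below is a minimal sketch, assuming NumPy; the bases `S1` and `S2` (stored as matrix columns) are illustrative choices, and the key observation is that reducing \([S_1|S_2]\) to \([I|P]\) is the same as solving \(S_1 P = S_2\):

```python
import numpy as np

# Columns of S1 and S2 are the basis vectors (illustrative choices)
S1 = np.array([[1.0,  1.0,  1.0],
               [1.0, -1.0,  1.0],
               [1.0,  0.0, -2.0]])
S2 = np.array([[1.0, 1.0, 1.0],
               [0.0, 1.0, 1.0],
               [0.0, 0.0, 1.0]])

# Row reducing [S1 | S2] to [I | P] is equivalent to solving
# S1 @ P = S2, i.e. P = S1^{-1} S2, one column at a time.
P = np.linalg.solve(S1, S2)

# Sanity check: each w_j equals S1 times its coordinate column
assert np.allclose(S1 @ P, S2)
```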
Find coordinates
Suppose \(S_1\), \(S_2\) and \(P=P_{S_1\leftarrow S_2}\) are given. If we know the coordinates of a vector \(\mathbf{v}\in V\) with respect to one basis, find the coordinates with respect to the other basis.
This type of problem is simpler. We just need to pay attention to whether we multiply by \(P\) or by \(P^{-1}\). Let \(\mathbf{v}\in V\). If we know \([\mathbf{v}]_{S_2}\), then we find \([\mathbf{v}]_{S_1}\) using: \[[\mathbf{v}]_{S_1}=P[\mathbf{v}]_{S_2}\] On the other hand, if we know \([\mathbf{v}]_{S_1}\), then we find \([\mathbf{v}]_{S_2}\) using: \[[\mathbf{v}]_{S_2}=P^{-1}[\mathbf{v}]_{S_1}\]
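Both directions can be sketched in a few lines of NumPy; the matrix `P` and the coordinate vector below are illustrative values, and solving a linear system replaces the explicit computation of \(P^{-1}\):

```python
import numpy as np

# An illustrative change of basis matrix P = P_{S1 <- S2}
P = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

v_S2 = np.array([2.0, 0.0, -1.0])   # known coordinates [v]_{S2}
v_S1 = P @ v_S2                     # [v]_{S1} = P [v]_{S2}

# In the other direction, [v]_{S2} = P^{-1} [v]_{S1};
# solving a linear system avoids forming the inverse explicitly.
v_back = np.linalg.solve(P, v_S1)
assert np.allclose(v_back, v_S2)
```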
Find elements of a basis
Suppose \(S_1\) and \(P=P_{S_1\leftarrow S_2}\) are given. Find \(S_2\).
Example: Suppose that
- \(S_1 =\left\{ \begin{bmatrix}1\\1\\1\end{bmatrix}, \begin{bmatrix}1\\-1\\0\end{bmatrix}, \begin{bmatrix}1\\1\\-2\end{bmatrix} \right\}\)
- \(S_2 =\{\mathbf{w}_1,\mathbf{w}_2, \mathbf{w}_3\}\)

are bases of \(\mathbb{R}^3\) and that \[ P=P_{S_1\leftarrow S_2}=\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix} \] is the change of basis matrix from \(S_2\) to \(S_1\). Our task is to find the elements of \(S_2\).
From Equation 11.1 we know that \([\mathbf{w}_1]_{S_1}=(1,1,1)\). Then \[\mathbf{w}_1=1\begin{bmatrix}1\\1\\1\end{bmatrix}+1\begin{bmatrix}1\\-1\\0\end{bmatrix}+1\begin{bmatrix}1\\1\\-2 \end{bmatrix}=\begin{bmatrix}3\\1\\-1\end{bmatrix}\] Notice that we can write this as \[\begin{align}\mathbf{w}_1&=1\begin{bmatrix}1\\1\\1\end{bmatrix}+1\begin{bmatrix}1\\-1\\0\end{bmatrix}+1\begin{bmatrix}1\\1\\-2 \end{bmatrix}\\ &=\begin{bmatrix}1&1&1\\1&-1&1\\1&0&-2\end{bmatrix}\begin{bmatrix}1\\1\\1\end{bmatrix} =[S_1][\mathbf{w}_1]_{S_1} \end{align}\] where \([S_1]\) is the \(3\times 3\) matrix with columns being the vectors of \(S_1\).
We can find all the vectors at once \[[S_1]P=[S_1] \begin{bmatrix} \uparrow&\uparrow&\uparrow\\ [\mathbf{w}_1]_{S_1}&[\mathbf{w}_2]_{S_1}&[\mathbf{w}_3]_{S_1}\\ \downarrow&\downarrow&\downarrow\\ \end{bmatrix}= \begin{bmatrix} \uparrow &\uparrow &\uparrow\\ \mathbf{w}_1&\mathbf{w}_2&\mathbf{w}_3\\ \downarrow &\downarrow &\downarrow\\ \end{bmatrix}=[S_2]\] and this works in general.
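The identity \([S_2]=[S_1]P\) turns the recovery of \(S_2\) into a single matrix product. A short sketch, assuming NumPy, with the data of the example above:

```python
import numpy as np

# Data from the example: columns of S1 are the basis vectors
S1 = np.array([[1.0,  1.0,  1.0],
               [1.0, -1.0,  1.0],
               [1.0,  0.0, -2.0]])
P = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

# [S2] = [S1] P : the columns of the product are w1, w2, w3
S2 = S1 @ P
print(S2[:, 0])   # w1 = (3, 1, -1), matching the hand computation
```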
11.2 Matrix Basis Representation
In Theorem 4.1 we showed that a linear map \(T:\mathbb{R}^n\to\mathbb{R}^n\) can be written as \(T\mathbf{x}=A\mathbf{x}\), where \(A\) is the matrix \[A=\begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ T(\mathbf{e}_1) & T(\mathbf{e}_2) & \cdots & T(\mathbf{e}_n) \\ \downarrow & \downarrow & \cdots & \downarrow \\ \end{bmatrix}\] and \(\{\mathbf{e}_1,\dots,\mathbf{e}_n\}\) is the canonical basis for \(\mathbb{R}^n\).
In this section we show that a similar result works for general linear maps on general vector spaces with a fixed basis. The idea of the proof is the same but the result is presented in terms of coordinates.
Theorem 11.2 Suppose that \(V\) is an \(n\)-dimensional vector space with basis \(S=\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\). If \(T:V\to V\) is a linear map, there exists a unique matrix \(A\) such that for every \(\mathbf{v}\in V\), \[[T\mathbf{v}]_S=A[\mathbf{v}]_S\] Moreover, the matrix representation of \(T\) with respect to the basis \(S\) is given by \[A = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow\\ [T\mathbf{v}_1]_S & [T\mathbf{v}_2]_{S} & \cdots & [T\mathbf{v}_n]_{S} \\ \downarrow & \downarrow & \cdots&\downarrow \end{bmatrix}. \tag{11.2}\]
Proof. Let \(\mathbf{v} \in V\) with representation in basis \(S\): \[\mathbf{v} = x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \ldots + x_n\mathbf{v}_n \quad \text{where} \quad [\mathbf{v}]_{S} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}\] Since \(T\) is linear \[ \begin{align}T\mathbf{v} &=T(x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \ldots + x_n\mathbf{v}_n)\\ &= x_1T\mathbf{v}_1 + x_2T\mathbf{v}_2 + \ldots + x_nT\mathbf{v}_n. \end{align} \] Then we find coordinates using Theorem 6.1 (finding coordinates is a linear map) and Equation 3.1 to find the matrix representation: \[ \begin{align} [T\mathbf{v}]_S &= [x_1T\mathbf{v}_1 + x_2T\mathbf{v}_2 + \ldots + x_nT\mathbf{v}_n]_S\\ &=x_1[T\mathbf{v}_1]_S + x_2[T\mathbf{v}_2]_S + \ldots + x_n[T\mathbf{v}_n]_S\\ &= \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow\\ [T\mathbf{v}_1]_{S} & [T\mathbf{v}_2]_{S} & \cdots & [T\mathbf{v}_n]_{S} \\ \downarrow & \downarrow & \cdots&\downarrow \end{bmatrix} \begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}=A[\mathbf{v}]_S. \end{align} \] This concludes the proof \(\square\)
11.2.1 Computational Problems
As before, Theorem 11.2 provides not just a theoretical connection but a concrete algorithm for computing the matrix representation \(A\). The formula \([T\mathbf{v}]_{S}=A[\mathbf{v}]_{S}\) encodes all the steps of the computations below; pay close attention to it.
Find the matrix representation
Suppose that a linear map \(T:\mathbb{R}^n\to\mathbb{R}^n\) and a basis \(S=\{\mathbf{v}_1,\dots,\mathbf{v}_n\}\) of \(\mathbb{R}^n\) are given. Find the matrix representation \(A\).
Example: Suppose \(S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\) is a basis of \(\mathbb{R}^3\) and that \(T:\mathbb{R}^3\to\mathbb{R}^3\) is a linear map. We are asked to find the matrix representation of \(T\) with respect to \(S\), which we call \(A\).
From Theorem 11.2, we know that we need to find \([T\mathbf{v}_1]_{S}\), \([T\mathbf{v}_2]_{S}\), and \([T\mathbf{v}_3]_{S}\).
To find \([T\mathbf{v}_i]_{S}\), we solve the vector equation: \(x_1 \mathbf{v}_1 + x_2 \mathbf{v}_2 + x_3 \mathbf{v}_3 = T\mathbf{v}_i\). We look at the augmented system and we row reduce it: \[ \left[ \begin{array}{ccc|c} \uparrow & \uparrow & \uparrow & \uparrow \\ \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 & T\mathbf{v}_i \\ \downarrow & \downarrow & \downarrow & \downarrow \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccc|c} 1&0&0&\uparrow \\ 0&1&0 & [T\mathbf{v}_i]_{S} \\ 0&0&1 &\downarrow \end{array}\right]. \] Since \(\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}\) is a basis, the first three columns of the reduced matrix form the identity and the last column gives us the coordinates of \(T\mathbf{v}_i\). We can combine all of them:
\[ \left[ \begin{array}{ccc|ccc} \uparrow & \uparrow & \uparrow & \uparrow & \uparrow & \uparrow\\ \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 & T\mathbf{v}_1 & T\mathbf{v}_2 & T\mathbf{v}_3 \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccc|ccc} 1&0&0&\uparrow &\uparrow & \uparrow\\ 0&1&0 & [T\mathbf{v}_1]_{S} & [T\mathbf{v}_2]_{S} & [T\mathbf{v}_3]_{S} \\ 0&0&1 &\downarrow & \downarrow & \downarrow \end{array}\right] \] Therefore, in general, to find \(A\) we row reduce the augmented matrix \([S|T(S)]\) to get \([I|A]\):
\[[S|T(S)]\xrightarrow{\text{RREF}} [I|A]\]
Where:
- \(S\) is the matrix with columns being the vectors of the basis
- \(T(S)\) is the matrix with columns being \(T\) applied to the vectors of the basis
- \(A\) is the matrix representation of \(T\) with respect to \(S\)
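This recipe can also be sketched numerically. In the sketch below (assuming NumPy), the map \(T\mathbf{x}=M\mathbf{x}\) with the matrix `M` is an illustrative choice, and reducing \([S|T(S)]\) to \([I|A]\) is treated as solving \(S A = T(S)\):

```python
import numpy as np

# Illustrative map T(x) = M x and basis S (basis vectors as columns)
M = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])
S = np.array([[1.0,  1.0,  1.0],
              [1.0, -1.0,  1.0],
              [1.0,  0.0, -2.0]])

TS = M @ S                      # columns are T(v1), T(v2), T(v3)
# Row reducing [S | T(S)] to [I | A] is equivalent to solving S A = T(S)
A = np.linalg.solve(S, TS)

# Verify the defining property [T v]_S = A [v]_S on a test vector
c = np.array([1.0, 2.0, 3.0])   # some coordinates [v]_S
v = S @ c
assert np.allclose(np.linalg.solve(S, M @ v), A @ c)
```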
Find \(T\mathbf{v}\)
Suppose \(S\) is a basis of \(\mathbb{R}^n\) and that \(A\) is the matrix representation of a linear map \(T:\mathbb{R}^n\to\mathbb{R}^n\). If \(\mathbf{v}\in\mathbb{R}^n\), we need to find \(T\mathbf{v}\).
Example: Suppose that \[S =\left\{ \begin{bmatrix}1\\1\\1\end{bmatrix}, \begin{bmatrix}1\\-1\\0\end{bmatrix}, \begin{bmatrix}1\\1\\-2\end{bmatrix} \right\}\] is a basis of \(\mathbb{R}^3\) and that \(T:\mathbb{R}^3\to\mathbb{R}^3\) is a linear map with matrix representation with respect to \(S\) given by \[ A=\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix}.\] Let \(\mathbf{v}=(1,1,0)\). We need to find \(T(1,1,0)\).
From Theorem 11.2 we know that \([T\mathbf{v}]_S=A[\mathbf{v}]_S\). This formula outlines the steps that we need to take. We should not memorize them; we should read them off the formula. Starting from \(\mathbf{v}\):
- First, we need to find \([\mathbf{v}]_S\)
- Then, multiplying \([\mathbf{v}]_S\) by \(A\) we get \([T\mathbf{v}]_S\)
- Finally we find \(T\mathbf{v}\)
To find \([\mathbf{v}]_S\), we look at the augmented system and row reduce it \[\left[\begin{array}{ccc|c} 1&1&1&1\\1&-1&1&1\\1&0&-2&0 \end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{ccc|c} 1&0&0&2/3\\0&1&0&0\\0&0&1&1/3 \end{array}\right] \] and we get that \([(1,1,0)]_S=(2/3,0,1/3)\). Now we multiply the coordinates by \(A\) to find the coordinates of \(T\mathbf{v}\) \[ [T\mathbf{v}]_S=A[\mathbf{v}]_S =\begin{bmatrix}1&0&0\\1&1&0\\1&1&1\end{bmatrix} \begin{bmatrix}2/3\\0\\1/3\end{bmatrix}= \begin{bmatrix}2/3\\2/3\\1\end{bmatrix} \] Then \[T\mathbf{v}= \tfrac{2}{3} \begin{bmatrix}1\\1\\1\end{bmatrix}+ \tfrac{2}{3} \begin{bmatrix}1\\-1\\0\end{bmatrix}+ \begin{bmatrix}1\\1\\-2\end{bmatrix} =\begin{bmatrix}7/3\\1\\-4/3\end{bmatrix} \]
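The three steps can be replayed numerically. A minimal sketch, assuming NumPy, using the basis \(S\), the matrix \(A\), and the vector \(\mathbf{v}=(1,1,0)\) of this example:

```python
import numpy as np

# Basis S (as columns), representation A, and the vector v
S = np.array([[1.0,  1.0,  1.0],
              [1.0, -1.0,  1.0],
              [1.0,  0.0, -2.0]])
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
v = np.array([1.0, 1.0, 0.0])

v_S  = np.linalg.solve(S, v)    # step 1: coordinates [v]_S
Tv_S = A @ v_S                  # step 2: [Tv]_S = A [v]_S
Tv   = S @ Tv_S                 # step 3: back to standard coordinates

assert np.allclose(v_S, [2/3, 0, 1/3])
assert np.allclose(Tv, [7/3, 1, -4/3])
```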