3 Matrices
The following video from the Essence of Linear Algebra series by 3Blue1Brown is exceptionally good. Watch it carefully.
3.1 Matrices: Definition and Different Perspectives
A matrix is a rectangular array of numbers arranged in rows and columns. Formally, a matrix \(A\) with \(m\) rows and \(n\) columns is written as:
\[ A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}. \] The matrix \(A\) is said to have dimension \(m\times n\). The entry \(a_{i,j}\) represents the element at the \(i\)-th row and \(j\)-th column. This entry is also sometimes denoted as \(A_{i,j}\) or \((A)_{i,j}\).
For example, consider the following matrix:
\[M=\begin{bmatrix}8&6&0&6\\-4&-8&2&-7\\-8&4&-5&3\end{bmatrix}\]
This matrix \(M\) has 3 rows and 4 columns. We can interpret and view this matrix in several ways:
As an array of numbers: We can see \(M\) as a collection of numbers arranged in a rectangular grid with 3 rows and 4 columns. Each entry in the matrix is identified by its row and column index. For example, \(M_{2,3}\) (also written \((M)_{2,3}\)), the entry in the second row and third column, is 2.
As a collection of column vectors: We can view \(M\) as having 4 column vectors, each with 3 elements. The columns of \(M\) are:
\[\left [ \begin{bmatrix}8\\-4\\-8\end{bmatrix}, \begin{bmatrix}6\\-8\\4\end{bmatrix}, \begin{bmatrix}0\\2\\-5\end{bmatrix}, \begin{bmatrix}6\\-7\\3\end{bmatrix}\right ]\]
Each column vector can be treated as a separate entity, and matrix operations can be performed on these columns.
As a collection of row vectors: We can view \(M\) as having 3 row vectors, each with 4 elements. The rows of \(M\) are:
\[\biggl [ \begin{bmatrix}8&6&0&6\end{bmatrix}, \begin{bmatrix}-4&-8&2&-7\end{bmatrix}, \begin{bmatrix}-8&4&-5&3\end{bmatrix}\biggr ]\]
Each row vector can be treated as a separate entity, and matrix operations can be performed on these rows.
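These three viewpoints are easy to explore in code. Below is a minimal NumPy sketch (NumPy is covered in detail in Section 3.9); remember that NumPy indices start at 0, so the entry \(M_{2,3}\) is M[1, 2].

import numpy as np

# The matrix M as a 3x4 array of numbers
M = np.array([[ 8,  6,  0,  6],
              [-4, -8,  2, -7],
              [-8,  4, -5,  3]])

print(M[1, 2])   # entry M_{2,3}: prints 2
print(M[:, 0])   # first column: [ 8 -4 -8]
print(M[0, :])   # first row:    [8 6 0 6]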
3.2 Addition and Scalar Multiplication of Matrices
Given two matrices \(A\) and \(B\) of the same size \(m \times n\), the sum of \(A\) and \(B\), denoted as \(A + B\), is a new matrix \(C\) of size \(m \times n\) where each element \(c_{i,j}\) is the sum of the corresponding elements \(a_{i,j}\) and \(b_{i,j}\) from matrices \(A\) and \(B\), respectively. In other words: \[c_{i,j} = a_{i,j} + b_{i,j}\] for all \(1 \leq i \leq m\) and \(1 \leq j \leq n\). For example, if \(A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\) and \(B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}\), then: \[A + B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} = \begin{bmatrix} 1+5 & 2+6 \\ 3+7 & 4+8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}\] The scalar multiplication of a matrix \(A\) by a scalar \(k\), denoted as \(kA\), is a new matrix \(B\) of the same size as \(A\), where each element \(b_{i,j}\) is the product of the scalar \(k\) and the corresponding element \(a_{i,j}\) from matrix \(A\). In other words: \[b_{i,j} = k \cdot a_{i,j}\] for all \(1 \leq i \leq m\) and \(1 \leq j \leq n\). For example, if \(A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\) and \(k = 2\), then: \[2A = 2 \cdot \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 \cdot 1 & 2 \cdot 2 \\ 2 \cdot 3 & 2 \cdot 4 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}\]
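As a quick check, both computations can be reproduced in NumPy, where + and scalar * act element-wise on arrays; a minimal sketch:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A + B)  # [[ 6  8], [10 12]], the element-wise sum
print(2 * A)  # [[2 4], [6 8]], the scalar multiple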
3.2.1 Properties of addition and scalar product
Let \(A\), \(B\), and \(C\) be \(m\times n\) matrices and let \(k\) and \(l\) be scalars. Then we can easily check that:
- Commutativity of addition: \(A + B = B + A\)
- Associativity of addition: \((A + B) + C = A + (B + C)\)
- Existence of zero matrix: There exists a matrix \(O\) such that \(A + O = A\) for all matrices \(A\)
- Existence of additive inverse: For every matrix \(A\), there exists a matrix \(-A\) such that \(A + (-A) = O\)
- Distributivity of scalar multiplication over matrix addition: \(k(A + B) = kA + kB\)
- Distributivity of scalar multiplication over field addition: \((k + l)A = kA + lA\)
- Associativity of scalar multiplication: \((kl)A = k(lA)\)
- Existence of multiplicative identity: \(1A = A\) for all matrices \(A\)
Since we have already shown that if \(A\) and \(B\) are \(m\times n\) matrices and \(k\) is a scalar, then \(A+B\) and \(kA\) are also \(m\times n\) matrices, we can conclude that the set of all \(m\times n\) matrices forms a vector space. This set is commonly denoted using various notations, including: \(M_{m\times n}\), \(\mathbb{M}_{m\times n}\), and \(\mathbb{R}^{m\times n}\).
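These axioms can be spot-checked numerically. The sketch below verifies a few of them for random integer matrices (the seed and sizes are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 6, size=(2, 3))
B = rng.integers(-5, 6, size=(2, 3))
C = rng.integers(-5, 6, size=(2, 3))
k, l = 2, -3

print(np.array_equal(A + B, B + A))                # commutativity of addition
print(np.array_equal((A + B) + C, A + (B + C)))    # associativity of addition
print(np.array_equal(k * (A + B), k * A + k * B))  # distributivity over matrix addition
print(np.array_equal((k + l) * A, k * A + l * A))  # distributivity over scalar addition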
3.3 The Transpose of a Matrix
Given a matrix \(A\) with \(m\) rows and \(n\) columns, denoted as: \[ A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}, \] the transpose of matrix \(A\), denoted as \(A^T\) or \(A'\), is obtained by interchanging the rows and columns of \(A\). In other words, the first row of \(A\) becomes the first column of \(A^T\), the second row of \(A\) becomes the second column of \(A^T\), and so on. The resulting matrix \(A^T\) has \(n\) rows and \(m\) columns:
\[ A^T = \begin{bmatrix} a_{1,1} & a_{2,1} & \cdots & a_{m,1}\\ a_{1,2} & a_{2,2} & \cdots & a_{m,2}\\ \vdots & \vdots & \ddots & \vdots\\ a_{1,n} & a_{2,n} & \cdots & a_{m,n} \end{bmatrix}. \]
When \(A\) is represented by columns or by rows, we can easily determine the form of the transpose. \[ \text{If}\quad A = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_n \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix},\text{ then }\quad A^T=\begin{bmatrix} \leftarrow & \mathbf{c}_1^T &\rightarrow \\ \leftarrow & \mathbf{c}_2^T &\rightarrow\\ \vdots & \vdots &\vdots\\ \leftarrow & \mathbf{c}_n^T &\rightarrow \end{bmatrix}, \] and \[ \text{if}\quad A = \begin{bmatrix} \leftarrow & \mathbf{r}_1 &\rightarrow \\ \leftarrow & \mathbf{r}_2 &\rightarrow\\ \vdots & \vdots &\vdots\\ \leftarrow & \mathbf{r}_m &\rightarrow \end{bmatrix}, \text{ then }\quad A^T = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{r}_1^T & \mathbf{r}_2^T & \cdots & \mathbf{r}_m^T \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix}. \]
Notice that taking the transpose twice returns the original matrix. In other words, \[(A^T)^T=A.\] To check properties of the transpose, we use the definition \((A^T)_{i,j}=A_{j,i}\).
Exercise: Suppose that \(A\) and \(B\) are \(m\times n\) matrices and that \(c\in\mathbb{R}\). Prove that \((A+B)^T=A^T+B^T\) and that \((cA)^T=cA^T\).
\[((A+B)^T)_{i,j}=(A+B)_{j,i}=A_{j,i}+B_{j,i}=(A^T)_{i,j}+(B^T)_{i,j}=(A^T+B^T)_{i,j}.\] The second identity follows the same way: \[((cA)^T)_{i,j}=(cA)_{j,i}=c\,A_{j,i}=c\,(A^T)_{i,j}=(cA^T)_{i,j}.\]
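Both identities, together with \((A^T)^T=A\), are easy to confirm numerically; a small sketch:

import numpy as np

A = np.array([[8, 6, 0, 6], [-4, -8, 2, -7], [-8, 4, -5, 3]])
B = np.ones((3, 4), dtype=int)
c = 3

print(np.array_equal(A.T.T, A))              # (A^T)^T = A
print(np.array_equal((A + B).T, A.T + B.T))  # (A + B)^T = A^T + B^T
print(np.array_equal((c * A).T, c * A.T))    # (cA)^T = c A^T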
3.4 Matrix-Vector Multiplication
We now cover one of the most important operations in linear algebra: multiplying a matrix with a vector. Let \(A\) be an \(m\times n\) matrix and \(\mathbf{x}\in\mathbb{R}^n\). The product \(A\mathbf{x}\) will be a vector in \(\mathbb{R}^m\). This operation is crucial because it allows us to:
- Transform vectors in space (like rotations, reflections, or scaling)
- Solve systems of linear equations in a compact way
- Apply linear transformations in computer graphics, data science, and physics
We look at the \(m\times n\) matrix \(A\) in three ways: \[A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} =\begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_n \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix} =\begin{bmatrix} \leftarrow & \mathbf{r}_1 &\rightarrow \\ \leftarrow & \mathbf{r}_2 &\rightarrow \\ \vdots & \vdots &\vdots\\ \leftarrow & \mathbf{r}_m &\rightarrow \end{bmatrix}.\]
3.4.1 A Linear Combination of Columns
We define the product \(A\mathbf{x}\) as a linear combination of the columns of \(A\): \[ A\mathbf{x} = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_n \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix} \begin{bmatrix}x_1\\x_2\\\vdots\\x_n \end{bmatrix} = x_1\mathbf{c}_1+x_2\mathbf{c_2}+\cdots+x_n\mathbf{c}_n. \tag{3.1}\]
Example: Let \(A=\begin{bmatrix} 1 & 2 & 0 \\ -1 & 3 & 4 \end{bmatrix}\) be a \(2\times 3\) matrix and \(\mathbf{x}=\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}\) be a vector in \(\mathbb{R}^3\). Then \[ \begin{aligned} A\mathbf{x} &= \begin{bmatrix} 1 & 2 & 0 \\ -1 & 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \\ &=2\begin{bmatrix}1\\-1 \end{bmatrix} + (-1)\begin{bmatrix}2\\3 \end{bmatrix} +3\begin{bmatrix}0\\4 \end{bmatrix} =\begin{bmatrix}0\\7 \end{bmatrix}. \end{aligned} \]
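We can reproduce this example in NumPy and confirm that the built-in product A @ x agrees with the linear combination of columns; a sketch:

import numpy as np

A = np.array([[1, 2, 0],
              [-1, 3, 4]])
x = np.array([2, -1, 3])

# The product as a linear combination of the columns of A
combo = x[0] * A[:, 0] + x[1] * A[:, 1] + x[2] * A[:, 2]

print(A @ x)   # [0 7]
print(combo)   # [0 7], the same vector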
(Equation 3.1) is one of the most useful formulas:
- It allows us to write matrix-vector products as linear combinations,
- It allows us to write linear combinations as matrix-vector products.
3.4.2 The Component-wise Formula
From the definition given by (Equation 3.1), we can write \(A\mathbf{x}\) in terms of the \(a_{i,j}\)’s: \[\begin{aligned} A\mathbf{x} &= \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} \\ &= x_1\begin{bmatrix} a_{1,1}\\ a_{2,1}\\ \vdots\\ a_{m,1} \end{bmatrix} + x_2\begin{bmatrix} a_{1,2}\\ a_{2,2}\\ \vdots\\ a_{m,2} \end{bmatrix} +\cdots + x_n\begin{bmatrix} a_{1,n}\\ a_{2,n}\\ \vdots\\ a_{m,n} \end{bmatrix} \\ &= \begin{bmatrix} a_{1,1}x_1+a_{1,2}x_2+\cdots+a_{1,n}x_n\\ a_{2,1}x_1+a_{2,2}x_2+\cdots+a_{2,n}x_n\\ \vdots\\ a_{m,1}x_1+a_{m,2}x_2+\cdots+a_{m,n}x_n \end{bmatrix}. \end{aligned} \]
Therefore, the \(i\)-th component of \(A\mathbf{x}\) is: \[(A\mathbf{x})_i=a_{i,1}x_1+a_{i,2}x_2+\cdots+a_{i,n}x_n \tag{3.2}\]
3.4.3 The Row Dot Product Formula
From (Equation 3.2) we see that the \(i\)-th component of \(A\mathbf{x}\) can be written as a dot product: \[ (A\mathbf{x})_i=a_{i,1}x_1+a_{i,2}x_2+\cdots+a_{i,n}x_n =\begin{bmatrix} a_{i,1}\\ a_{i,2}\\\vdots \\ a_{i,n}\end{bmatrix}\cdot \begin{bmatrix} x_1\\ x_2\\\vdots \\ x_n\end{bmatrix} =\mathbf{r}_i^T\cdot\mathbf{x}.\] Recall that vectors in \(\mathbb{R}^n\) are represented by column vectors, and that the first vector above is the transpose of the \(i\)-th row of \(A\). Putting all the components together, we get: \[ A\mathbf{x}=\begin{bmatrix} \leftarrow & \mathbf{r}_1 &\rightarrow \\ \leftarrow & \mathbf{r}_2 &\rightarrow \\ \vdots & \vdots &\vdots\\ \leftarrow & \mathbf{r}_m &\rightarrow \end{bmatrix} \begin{bmatrix} x_1\\ x_2\\\vdots \\ x_n\end{bmatrix} =\begin{bmatrix} \mathbf{r}_1^T\cdot\mathbf{x} \\ \mathbf{r}_2^T\cdot\mathbf{x}\\ \vdots \\ \mathbf{r}_m^T\cdot\mathbf{x}\end{bmatrix}. \tag{3.3}\]
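The same product can be computed component by component, as in (Equation 3.3), by taking the dot product of each row of \(A\) with \(\mathbf{x}\); a sketch:

import numpy as np

A = np.array([[1, 2, 0],
              [-1, 3, 4]])
x = np.array([2, -1, 3])

# Each component of Ax is the dot product of a row of A with x
for i in range(A.shape[0]):
    print(np.dot(A[i, :], x))  # prints 0, then 7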
3.5 Matrix-Matrix Multiplication
Matrix-matrix multiplication, like matrix-vector multiplication, requires compatibility between the dimensions of the matrices involved. For the product \(AB\) to be defined, the number of columns in \(A\) must equal the number of rows in \(B\). When this condition is met, the matrices are said to be compatible for multiplication. Specifically, if \(A\) is an \(m \times n\) matrix and \(B\) is an \(n \times p\) matrix, their product \(AB\) will be an \(m \times p\) matrix.
3.5.1 The Product \(AB\): \(A\) Acts on the Columns of \(B\)
Let \(A\) be an \(m \times n\) matrix and \(B\) an \(n \times p\) matrix. Since the number of columns in \(A\) matches the number of rows in \(B\), the matrices are compatible for multiplication. To define the product \(AB\), we express \(B\) in terms of its column vectors and let \(A\) act on each column individually. Specifically,
\[ AB = A \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix} = \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix}. \tag{3.4}\]
Notice that \(AB\) consists of \(p\) columns, where each column \(A\mathbf{b}_j \in \mathbb{R}^m\). Therefore, \(AB\) is an \(m \times p\) matrix.
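A sketch of (Equation 3.4) in NumPy: assembling \(AB\) one column at a time, with \(A\) acting on each column of \(B\), gives the same matrix as A @ B.

import numpy as np

A = np.array([[1, 2], [3, 4]])         # 2x2
B = np.array([[5, 6, 7], [8, 9, 10]])  # 2x3

# Stack the columns A @ b_j side by side
cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

print(np.array_equal(cols, A @ B))  # True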
3.5.2 The Component Formula
We use the previous formula to compute the individual entries of \(AB\). Consider \((AB)_{i,j}\), the element of \(AB\) in the \(i\)-th row and \(j\)-th column. Since the \(j\)-th column of \(AB\) is \(A\mathbf{b}_j\), it follows from (Equation 3.2) that \[(AB)_{i,j}=(A\mathbf{b}_j)_i=\sum_{k=1}^n a_{i,k}b_{k,j}. \tag{3.5}\]
3.5.3 The Row-Column Dot Product Formula
From (Equation 3.5), it follows that \((AB)_{i,j}\) is the dot product of the \(i\)-th row of \(A\) and the \(j\)-th column of \(B\). Writing \(A\) in terms of its rows and \(B\) in terms of its columns, we have:
\[ AB = \begin{bmatrix} \leftarrow & \mathbf{r}_1 & \rightarrow \\ \leftarrow & \mathbf{r}_2 & \rightarrow \\ \vdots & \vdots & \vdots \\ \leftarrow & \mathbf{r}_m & \rightarrow \end{bmatrix} \begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix} = \begin{bmatrix} \mathbf{r}_1^T \cdot \mathbf{b}_1 & \mathbf{r}_1^T \cdot \mathbf{b}_2 & \cdots & \mathbf{r}_1^T \cdot \mathbf{b}_p \\ \mathbf{r}_2^T \cdot \mathbf{b}_1 & \mathbf{r}_2^T \cdot \mathbf{b}_2 & \cdots & \mathbf{r}_2^T \cdot \mathbf{b}_p \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{r}_m^T \cdot \mathbf{b}_1 & \mathbf{r}_m^T \cdot \mathbf{b}_2 & \cdots & \mathbf{r}_m^T \cdot \mathbf{b}_p \end{bmatrix}. \tag{3.6}\]
Here, \(\mathbf{r}_i\) represents the \(i\)-th row of \(A\), and \(\mathbf{b}_j\) represents the \(j\)-th column of \(B\). Each entry of \(AB\), denoted \((AB)_{i,j}\), is the dot product \(\mathbf{r}_i^T \cdot \mathbf{b}_j\).
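Equivalently, each entry of \(AB\) can be computed as a row-column dot product, matching (Equation 3.5) and (Equation 3.6); a sketch:

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# (AB)_{i,j} as the dot product of row i of A with column j of B
entries = np.array([[np.dot(A[i, :], B[:, j]) for j in range(B.shape[1])]
                    for i in range(A.shape[0])])

print(entries)  # [[19 22], [43 50]]
print(A @ B)    # the same matrix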
3.6 The Transpose of the Product
An important property of matrix multiplication is that the transpose of a product is the product of the transposes in reverse order. This relation is fundamental in many areas of linear algebra, from proving theoretical results about linear transformations to solving practical problems in optimization and data analysis.
3.6.0.1 Theorem
Let \(A\) be an \(m \times n\) matrix and \(B\) an \(n \times p\) matrix. Then:
\[(AB)^T = B^T A^T.\]
Proof. The product \(AB\) is an \(m \times p\) matrix, so its transpose \((AB)^T\) is a \(p \times m\) matrix. Similarly, \(B^T\) is a \(p \times n\) matrix, and \(A^T\) is an \(n \times m\) matrix. Thus, the product \(B^T A^T\) also has dimensions \(p \times m\), matching those of \((AB)^T\).
To prove the equality, we verify that the entries of \((AB)^T\) and \(B^T A^T\) are identical. Consider the \((i, j)\)-th entry of \((AB)^T\):
\[( (AB)^T )_{i,j} = (AB)_{j,i}.\]
Using the definition of matrix multiplication, we expand \((AB)_{j,i}\):
\[(AB)_{j,i} = \sum_{k=1}^n A_{j,k} B_{k,i}.\]
Next, observe that \((B^T)_{i,k} = B_{k,i}\) and \((A^T)_{k,j} = A_{j,k}\). Substituting these into the sum, we get:
\[(AB)_{j,i} = \sum_{k=1}^n (B^T)_{i,k} (A^T)_{k,j}.\]
Therefore,
\[((AB)^T)_{i,j} = (B^T A^T)_{i,j}.\]
Since the \((i, j)\)-th entries of \((AB)^T\) and \(B^T A^T\) are equal for all \(i\) and \(j\), we conclude that:
\[(AB)^T = B^T A^T.\]
Alternatively, we can see the identity structurally. Write \(A\) and \(B\) in terms of their rows and columns: \[A=\begin{bmatrix} \leftarrow & \mathbf{r}_1 & \rightarrow \\ \leftarrow & \mathbf{r}_2 & \rightarrow \\ \vdots & \vdots & \vdots \\ \leftarrow & \mathbf{r}_m & \rightarrow \end{bmatrix} \quad\quad B=\begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_p \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix}\] Then \[B^T = \begin{bmatrix} \leftarrow &\mathbf{b}_1^T&\rightarrow \\ \leftarrow& \mathbf{b}_2^T&\rightarrow \\ \vdots &\vdots&\vdots \\\leftarrow & \mathbf{b}_p^T &\rightarrow \end{bmatrix}\quad\quad A^T =\begin{bmatrix} \uparrow & \uparrow & \cdots & \uparrow \\ \mathbf{r}_1^T & \mathbf{r}_2^T & \cdots & \mathbf{r}_m^T \\ \downarrow & \downarrow & \cdots & \downarrow \end{bmatrix}.\] Therefore it follows from the row-column dot product formula that \[(B^TA^T)_{i,j} = (\mathbf{b}_i^T)^T\cdot\mathbf{r}_j^T=\mathbf{b}_i\cdot\mathbf{r}_j^T =\mathbf{r}_j^T\cdot\mathbf{b}_i=(AB)_{j,i}=((AB)^T)_{i,j}.\]
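A quick numerical confirmation of the theorem on random integer matrices (sizes and seed are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-3, 4, size=(2, 3))
B = rng.integers(-3, 4, size=(3, 4))

print(np.array_equal((A @ B).T, B.T @ A.T))  # True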
3.7 Matrix Types and the Inverse
Matrices come in various types, each with unique properties that make them fundamental to linear algebra and its applications. Among these, the following play a central role.
Square Matrices
A square matrix is a matrix that has an equal number of rows and columns. The collection of all \(n \times n\) square matrices is denoted by \(M_n\). This set is closed under several operations: if \(A, B \in M_n\), their product \(AB\) also belongs to \(M_n\). Similarly, any power of \(A\), such as \(A^k\) for a positive integer \(k\), remains in \(M_n\), as does the transpose of \(A\).
Diagonal Matrices
A diagonal matrix is a square matrix in which all off-diagonal entries are zero. Formally, a matrix \(D \in M_n\) is diagonal if \((D)_{i,j} = 0\) for all \(i \neq j\). The only potentially nonzero entries are located along the main diagonal, from the top-left to the bottom-right. Diagonal matrices are significant because they are easy to work with: addition, multiplication, and finding powers are straightforward operations when the matrices are diagonal.
Upper and Lower Triangular Matrices
An upper triangular matrix is a square matrix in which all entries below the main diagonal are zero, meaning \((U)_{i,j} = 0\) for all \(i > j\). Similarly, a lower triangular matrix has all entries above the main diagonal equal to zero, i.e., \((L)_{i,j} = 0\) for all \(i < j\). These matrices are commonly used in matrix factorizations, and solving systems of linear equations efficiently. Both types are particularly important in numerical methods, as their structure reduces computational complexity in many algorithms.
Symmetric Matrices
A square matrix is symmetric if it is equal to its transpose, meaning \(A^T=A\) or, equivalently, \(A_{i,j}=A_{j,i}\) for all \(i,j\). Symmetric matrices play a crucial role in various fields due to their significant orthogonal properties and frequent appearance in science and engineering applications.
Identity Matrices
An identity matrix is a special type of diagonal matrix where all the diagonal entries are 1, and all off-diagonal entries are 0. It is denoted as \(I_n\) for an \(n \times n\) matrix. Formally, \((I_n)_{i,j} = 1\) if \(i = j\) and \((I_n)_{i,j} = 0\) if \(i \neq j\).
The identity matrix serves as the multiplicative identity in matrix multiplication. Specifically, if \(A\) is an \(n \times p\) matrix, then \(I_n A = A\). Similarly, if \(B\) is an \(m \times n\) matrix, then \(B I_n = B\).
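NumPy has standard helpers for building and checking these special types (np.diag, np.triu, np.tril, np.eye); a sketch:

import numpy as np

D = np.diag([1, 2, 3])              # diagonal matrix with diagonal 1, 2, 3
A = np.arange(1, 10).reshape(3, 3)  # [[1 2 3], [4 5 6], [7 8 9]]
U = np.triu(A)                      # upper triangular part of A
L = np.tril(A)                      # lower triangular part of A
S = A + A.T                         # A + A^T is always symmetric
I = np.eye(3)                       # 3x3 identity

print(np.array_equal(S, S.T))    # True: S is symmetric
print(np.array_equal(I @ A, A))  # True: I_3 A = A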
Inverse of a Matrix
The inverse of a matrix is a concept that applies to square matrices. A square matrix \(A\) is said to be invertible (or nonsingular) if there exists another matrix \(A^{-1}\) such that:
\[A A^{-1} = A^{-1} A = I_n,\]
where \(I_n\) is the identity matrix. The matrix \(A^{-1}\) is called the inverse of \(A\).
Not all square matrices have an inverse. In practical applications, matrix inverses are used to solve systems of linear equations, analyze transformations, and compute solutions in various scientific and engineering contexts. However, for large matrices, explicit inversion is computationally expensive, and alternative methods, such as iterative techniques, are often preferred.
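In NumPy, np.linalg.inv computes the inverse when it exists, and np.linalg.solve solves \(A\mathbf{x}=\mathbf{b}\) directly, which is generally cheaper and more accurate than forming the inverse; a sketch:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

A_inv = np.linalg.inv(A)
print(A_inv @ A)  # the 2x2 identity, up to rounding

# Prefer solve over inv when the goal is to solve a linear system
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))  # True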
3.8 Elementary Matrices: Row Operations
Elementary matrices are special square matrices that perform row operations through matrix multiplication. They play an important role in linear algebra, particularly in solving systems of linear equations, characterizing invertible matrices, and understanding and computing determinants. There are three types:
- Type 1: Switching two rows
- Type 2: Multiplying a row by a non-zero constant
- Type 3: Adding a multiple of a row to another row
Let’s illustrate each type with 3×3 elementary matrices acting on a generic 3×5 matrix:
Let \(A\) be a \(3\times 5\) matrix: \(A = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix}\)
3.8.1 Example 1: Interchanging Rows 1 and 2
\(\begin{aligned} E_1A &= \begin{bmatrix} 0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix} \\ &=\begin{bmatrix} a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix} \end{aligned}\)
3.8.2 Example 2: Multiplying Row 3 by 2
\(\begin{aligned} E_2A &= \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix} \\ &=\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ 2a_{3,1} & 2a_{3,2} & 2a_{3,3} & 2a_{3,4} & 2a_{3,5} \end{bmatrix} \end{aligned}\)
3.8.3 Example 3: Adding 3 Times Row 1 to Row 2
\(\begin{aligned} E_3A &= \begin{bmatrix} 1 & 0 & 0\\ 3 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & a_{2,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix} \\ &=\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & a_{1,5}\\ 3a_{1,1}+a_{2,1} & 3a_{1,2}+a_{2,2} & 3a_{1,3}+a_{2,3} & 3a_{1,4}+a_{2,4} & 3a_{1,5}+a_{2,5}\\ a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & a_{3,5} \end{bmatrix} \end{aligned}\)
Note that each elementary matrix is invertible, and its inverse performs the opposite operation:
- For \(E_1\): its own inverse (swapping the same rows again)
- For \(E_2\): multiply the third row by 1/2
- For \(E_3\): subtract 3 times row 1 from row 2
We saw in (Equation 3.1) that when we multiply a matrix \(A\) by a vector \(\mathbf{x}\), the product \(A\mathbf{x}\) is a linear combination of the columns of \(A\). Similarly, when we multiply by a row vector \(\mathbf{z}\) from the left, the product \(\mathbf{z}A\) is a linear combination of the rows of \(A\). This fundamental principle helps us understand elementary matrices: when we multiply a matrix \(A\) by an elementary matrix \(E\) on the left, each row of the product \(EA\) is a linear combination of the rows of \(A\), precisely implementing our desired row operation.
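A sketch verifying the three row operations and their inverses on a random 3x5 integer matrix:

import numpy as np

rng = np.random.default_rng(2)
A = rng.integers(-9, 10, size=(3, 5))

E1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])  # swap rows 1 and 2
E2 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 2]])  # multiply row 3 by 2
E3 = np.array([[1, 0, 0], [3, 1, 0], [0, 0, 1]])  # add 3 times row 1 to row 2

print(E1 @ A)                            # A with rows 1 and 2 swapped
print(E2 @ A)                            # A with row 3 doubled
print(np.array_equal(E1 @ (E1 @ A), A))  # True: E1 is its own inverse

E3_inv = np.array([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
print(np.array_equal(E3_inv @ (E3 @ A), A))  # True: subtracting 3 times row 1 undoes E3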
3.9 Matrices in NumPy
This section covers fundamental matrix operations using NumPy’s ndarray class. We’ll explore creation, indexing, and basic mathematical operations.
3.9.1 Setup
First, let’s import NumPy:

import numpy as np
3.9.2 Creating Matrices
NumPy provides several ways to create matrices using ndarrays:
# From a list of lists
A = np.array([[1, 2, 3],
              [4, 5, 6]])
print("Matrix A:")
print(A)

# Using special functions
zeros = np.zeros((2, 3))  # 2x3 matrix of zeros
ones = np.ones((3, 3))    # 3x3 matrix of ones
eye = np.eye(3)           # 3x3 identity matrix

print("\nZeros matrix:")
print(zeros)
print("\nIdentity matrix:")
print(eye)
Matrix A:
[[1 2 3]
[4 5 6]]
Zeros matrix:
[[0. 0. 0.]
[0. 0. 0.]]
Identity matrix:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
3.9.3 Matrix Properties and Shape
The shape attribute tells us the dimensions of the matrix:
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(f"Shape: {A.shape}")
print(f"Number of dimensions: {A.ndim}")
print(f"Size: {A.size}")
Shape: (2, 3)
Number of dimensions: 2
Size: 6
3.9.4 Indexing and Slicing
NumPy provides powerful ways to access matrix elements:
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Individual elements
print("First element:", A[0, 0])
print("Second row, third column:", A[1, 2])

# Extracting rows
print("\nFirst row:", A[0, :])
print("Second row:", A[1])  # the : is implicit

# Extracting columns
print("\nFirst column:", A[:, 0])
print("Second column:", A[:, 1])

# Slicing
print("\nSubmatrix (first two rows, second and third columns):")
print(A[0:2, 1:3])
First element: 1
Second row, third column: 6
First row: [1 2 3]
Second row: [4 5 6]
First column: [1 4 7]
Second column: [2 5 8]
Submatrix (first two rows, second and third columns):
[[2 3]
[5 6]]
3.9.5 Basic Operations
3.9.5.1 Addition and Subtraction
Matrix addition and subtraction work element-wise:
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print("Matrix A:")
print(A)
print("\nMatrix B:")
print(B)
print("\nA + B:")
print(A + B)
print("\nA - B:")
print(A - B)
Matrix A:
[[1 2]
[3 4]]
Matrix B:
[[5 6]
[7 8]]
A + B:
[[ 6 8]
[10 12]]
A - B:
[[-4 -4]
[-4 -4]]
3.9.5.2 Scalar Operations
Multiply or divide a matrix by a scalar:
A = np.array([[1, 2],
              [3, 4]])

print("Original matrix:")
print(A)
print("\nMultiply by 2:")
print(2 * A)
print("\nDivide by 2:")
print(A / 2)
Original matrix:
[[1 2]
[3 4]]
Multiply by 2:
[[2 4]
[6 8]]
Divide by 2:
[[0.5 1. ]
[1.5 2. ]]
3.9.5.3 Matrix Multiplication
NumPy provides several ways to perform matrix multiplication:
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print("Matrix multiplication (A @ B):")
print(A @ B)  # Preferred method (Python 3.5+)
print("\nElement-wise multiplication (A * B):")
print(A * B)  # Hadamard product
Matrix multiplication (A @ B):
[[19 22]
[43 50]]
Element-wise multiplication (A * B):
[[ 5 12]
[21 32]]
3.9.6 Common Matrix Operations
Here are some frequently used matrix operations:
A = np.array([[1, 2],
              [3, 4]])

print("Original matrix:")
print(A)
print("\nTranspose:")
print(A.T)
print("\nMatrix trace:")
print(np.trace(A))
print("\nMatrix determinant:")
print(np.linalg.det(A))
print("\nMatrix inverse:")
print(np.linalg.inv(A))
Original matrix:
[[1 2]
[3 4]]
Transpose:
[[1 3]
[2 4]]
Matrix trace:
5
Matrix determinant:
-2.0000000000000004
Matrix inverse:
[[-2. 1. ]
[ 1.5 -0.5]]
3.9.7 Important Notes
- Always check matrix dimensions when performing operations
- Use the appropriate multiplication operator:
  - @ or np.matmul() for matrix multiplication
  - * for element-wise multiplication
- Remember that indexing starts at 0, not 1
- When extracting rows or columns:
  - A single row: A[i] or A[i, :]
  - A single column: A[:, j]
3.10 Matrices in SymPy
This section covers fundamental matrix operations using SymPy’s Matrix class. We’ll explore creation, indexing, and both numeric and symbolic operations.
3.10.1 Setup
First, let’s import SymPy and set up symbolic variables:
from sympy import Matrix, Symbol, init_printing, pprint
import sympy as sp
# Setup pretty printing
init_printing()
# Define some symbolic variables
x = Symbol('x')
y = Symbol('y')
3.10.2 Creating Matrices
SymPy provides several ways to create matrices:
# From a list of lists
A = Matrix([[1, 2, 3],
            [4, 5, 6]])
print("Matrix A:")
pprint(A)

# Using special constructors
zeros = Matrix.zeros(2, 3)  # 2x3 matrix of zeros
ones = Matrix.ones(3, 3)    # 3x3 matrix of ones
eye = Matrix.eye(3)         # 3x3 identity matrix

print("\nZeros matrix:")
pprint(zeros)
print("\nIdentity matrix:")
pprint(eye)

# Symbolic matrix
symbolic = Matrix([[x, y],
                   [y, x]])
print("\nSymbolic matrix:")
pprint(symbolic)
Matrix A:
⎡1 2 3⎤
⎢ ⎥
⎣4 5 6⎦
Zeros matrix:
⎡0 0 0⎤
⎢ ⎥
⎣0 0 0⎦
Identity matrix:
⎡1 0 0⎤
⎢ ⎥
⎢0 1 0⎥
⎢ ⎥
⎣0 0 1⎦
Symbolic matrix:
⎡x y⎤
⎢ ⎥
⎣y x⎦
3.10.3 Matrix Properties and Shape
SymPy matrices have several useful properties:
A = Matrix([[1, 2, 3],
            [4, 5, 6]])

print(f"Shape: {A.shape}")
print(f"Number of rows: {A.rows}")
print(f"Number of columns: {A.cols}")
Shape: (2, 3)
Number of rows: 2
Number of columns: 3
3.10.4 Indexing and Slicing
SymPy uses different indexing methods than NumPy:
A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]])

# Individual elements (zero-based indexing)
print("First element:", A[0, 0])
print("Second row, third column:", A[1, 2])

# Extracting rows
print("\nFirst row:")
pprint(A.row(0))
print("\nSecond row:")
pprint(A.row(1))

# Extracting columns
print("\nFirst column:")
pprint(A.col(0))
print("\nSecond column:")
pprint(A.col(1))

# Extracting submatrices
print("\nSubmatrix:")
pprint(A[0:2, 1:3])
First element: 1
Second row, third column: 6
First row:
[1 2 3]
Second row:
[4 5 6]
First column:
⎡1⎤
⎢ ⎥
⎢4⎥
⎢ ⎥
⎣7⎦
Second column:
⎡2⎤
⎢ ⎥
⎢5⎥
⎢ ⎥
⎣8⎦
Submatrix:
⎡2 3⎤
⎢ ⎥
⎣5 6⎦
3.10.5 Basic Operations
3.10.5.1 Addition and Subtraction
Matrix addition and subtraction work both with numeric and symbolic matrices:
A = Matrix([[1, 2],
            [3, 4]])
B = Matrix([[5, 6],
            [7, 8]])

print("Matrix A:")
pprint(A)
print("\nMatrix B:")
pprint(B)
print("\nA + B:")
pprint(A + B)
print("\nA - B:")
pprint(A - B)

# Symbolic example
C = Matrix([[x, y],
            [y, x]])
print("\nSymbolic addition A + C:")
pprint(A + C)
Matrix A:
⎡1 2⎤
⎢ ⎥
⎣3 4⎦
Matrix B:
⎡5 6⎤
⎢ ⎥
⎣7 8⎦
A + B:
⎡6 8 ⎤
⎢ ⎥
⎣10 12⎦
A - B:
⎡-4 -4⎤
⎢ ⎥
⎣-4 -4⎦
Symbolic addition A + C:
⎡x + 1 y + 2⎤
⎢ ⎥
⎣y + 3 x + 4⎦
3.10.5.2 Scalar Operations
Multiply or divide a matrix by a scalar (numeric or symbolic):
A = Matrix([[1, 2],
            [3, 4]])

print("Original matrix:")
pprint(A)
print("\nMultiply by 2:")
pprint(2 * A)
print("\nMultiply by symbolic x:")
pprint(x * A)
Original matrix:
⎡1 2⎤
⎢ ⎥
⎣3 4⎦
Multiply by 2:
⎡2 4⎤
⎢ ⎥
⎣6 8⎦
Multiply by symbolic x:
⎡ x 2⋅x⎤
⎢ ⎥
⎣3⋅x 4⋅x⎦
3.10.5.3 Matrix Multiplication
SymPy matrix multiplication works with both numeric and symbolic matrices:
A = Matrix([[1, 2],
            [3, 4]])
B = Matrix([[5, 6],
            [7, 8]])

print("Matrix multiplication (A * B):")
pprint(A * B)
Matrix multiplication (A * B):
⎡19 22⎤
⎢ ⎥
⎣43 50⎦
3.10.6 Other Matrix Operations
SymPy provides powerful symbolic matrix operations:
A = Matrix([[1, 2],
            [3, 4]])

print("Original matrix:")
pprint(A)
print("\nTranspose:")
pprint(A.transpose())
print("\nMatrix trace:")
pprint(A.trace())
print("\nDeterminant:")
pprint(A.det())
print("\nMatrix inverse:")
pprint(A.inv())

# Symbolic example
S = Matrix([[x, y],
            [y, x]])
print("\n5th power of S:")
S**5
Original matrix:
⎡1 2⎤
⎢ ⎥
⎣3 4⎦
Transpose:
⎡1 3⎤
⎢ ⎥
⎣2 4⎦
Matrix trace:
5
Determinant:
-2
Matrix inverse:
⎡-2 1 ⎤
⎢ ⎥
⎣3/2 -1/2⎦
5th power of S:
\(\displaystyle \left[\begin{matrix}x^{5} + 10 x^{3} y^{2} + 5 x y^{4} & 5 x^{4} y + 10 x^{2} y^{3} + y^{5}\\5 x^{4} y + 10 x^{2} y^{3} + y^{5} & x^{5} + 10 x^{3} y^{2} + 5 x y^{4}\end{matrix}\right]\)
3.10.7 Important Notes
- SymPy matrices use * for matrix multiplication (in NumPy, * is element-wise and @ is matrix multiplication)
- Indexing is zero-based, similar to NumPy
- Matrix operations return new matrices; SymPy also provides an ImmutableMatrix class when immutability is needed
- Row and column extraction methods return Matrix objects
- SymPy can handle:
  - Symbolic computations
  - Exact fractions
  - Algebraic expressions