5  Systems of Linear Equations

Systems of linear equations form the cornerstone of linear algebra. They emerge naturally in mathematics, physics, engineering, and data science whenever we need to describe multiple linear relationships simultaneously.

5.1 Equivalent Representations

A key observation in linear algebra is that we can represent the same system in three different ways: as a system of equations, as a vector equation, and as a matrix equation. Each representation offers unique advantages. Consider the following system:

\[ \begin{aligned} 3x + 4y + z &= 9 \\ x - y + 2z &= 4 \\ 5x - 4y + z &= 3 \end{aligned} \]

As a Vector Equation

Here, we view the system as a sum of scaled vectors: \[ x\begin{bmatrix} 3 \\ 1 \\ 5 \end{bmatrix} + y\begin{bmatrix} 4 \\ -1 \\ -4 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 9 \\ 4 \\ 3 \end{bmatrix} \]

As a Matrix Equation

Here, we package all coefficients into a single matrix: \[ \begin{bmatrix} 3 & 4 & 1 \\ 1 & -1 & 2 \\ 5 & -4 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 9 \\ 4 \\ 3 \end{bmatrix} \]

The vector equation representation offers a geometric interpretation of linear systems. When we write the system as a vector equation, we’re asking whether a target vector can be obtained as a linear combination of given vectors. This connects systems of equations directly to fundamental concepts of linear algebra: linear combinations and spanning sets. We can visualize the solution process as asking whether the target vector lies in the span of our coefficient vectors, and if so, how we can reach it through scaling and adding these vectors.

Matrix equations package the coefficients efficiently. They are particularly useful in the square case, where the coefficient matrix is \(n\times n\). When this matrix is invertible, we can solve the system directly by multiplying both sides by the inverse matrix: if \(Ax = b\) and \(A\) is invertible, then \(x = A^{-1}b\). This not only gives us a theoretical way to express solutions but also connects to computational methods for solving systems. Moreover, whether a matrix is invertible tells us important information about the existence and uniqueness of solutions.
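To make the matrix-equation viewpoint concrete, here is a minimal SymPy sketch (SymPy is the library used later in this chapter) that solves the system above by computing \(A^{-1}\mathbf{b}\). The coefficient matrix is the one from this section; its determinant is nonzero, so it is invertible.

from sympy import Matrix

A = Matrix([[3, 4, 1], [1, -1, 2], [5, -4, 1]])    # coefficient matrix
b = Matrix([9, 4, 3])                               # right-hand side

x = A.inv() * b     # unique solution, since A is invertible
print(x)            # column vector with the values of x, y, z
print(A * x)        # reproduces b, confirming the solution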

These three representations—systems of equations, vector equations, and matrix equations—each illuminate different aspects of linear systems. Fluency in moving between these representations and understanding their connections is important for both theoretical understanding and practical problem-solving.

5.2 Characterization of the Solution Set

A fundamental result of linear algebra is that every system of linear equations must have exactly one of these three outcomes:

  1. No solution exists (inconsistent system)
  2. Exactly one solution exists (unique solution)
  3. Infinitely many solutions exist

Let’s visualize each case in \(\mathbb{R}^2\) using vector equations:

Case 1: No Solution

\[ x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

This system has no solution because:

  • The vectors \(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\) and \(\begin{bmatrix} 2 \\ 4 \end{bmatrix}\) span only a line
  • The target vector \(\begin{bmatrix} 1 \\ 1 \end{bmatrix}\) lies off this line
  • No combination of the vectors can reach the target

Case 2: Unique Solution

\[ x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

This system has exactly one solution: \(x=1\) and \(y=-1\).

Case 3: Infinite Solutions

\[ x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \end{bmatrix} \]

This system has infinitely many solutions because:

  • The vectors are parallel (one is a multiple of the other)
  • The target vector lies on their shared line
  • Solutions: \(x = 4+4t, y = -2t\) for any real \(t\)
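The three cases can also be checked computationally. The following sketch uses SymPy's linsolve on the three vector equations above; an empty set signals no solution, a single tuple a unique solution, and a parametrized tuple infinitely many solutions.

from sympy import Matrix, linsolve, symbols

x, y = symbols("x y")

# Case 1: parallel columns, target off their line
print(linsolve((Matrix([[1, 2], [2, 4]]), Matrix([1, 1])), x, y))    # EmptySet: no solution

# Case 2: independent columns
print(linsolve((Matrix([[1, 1], [2, 1]]), Matrix([0, 1])), x, y))    # unique solution x = 1, y = -1

# Case 3: parallel columns, target on their line
print(linsolve((Matrix([[1, 2], [2, 4]]), Matrix([4, 8])), x, y))    # one-parameter family of solutions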

5.3 Solution Methods

In Algebra 1 (typically 8th or 9th grade), students learn three approaches:

  1. Graphing (limited to \(\mathbb{R}^2\))
  2. Substitution
  3. Elimination

Let’s examine the algebraic methods using this system: \[ \begin{aligned} 2x + 3y &= 2 \\ x + y &= 7 \end{aligned} \]

Substitution Method

  1. Choose a variable and solve for it: \[ x = 7 - y \quad \text{(from second equation)} \]

  2. Substitute into the other equation: \[ \begin{aligned} 2(7-y) + 3y &= 2 \\ 14 - 2y + 3y &= 2 \\ 14 + y &= 2 \\ y &= -12 \end{aligned} \]

  3. Back-substitute: \[ x = 7 - (-12) = 19 \]

Elimination Method

  1. Set up the aligned system: \[ \begin{aligned} 2x + 3y &= 2 \\ x + y &= 7 \end{aligned} \]

  2. Multiply second equation by -2: \[ \begin{aligned} 2x + 3y &= 2 \\ -2x - 2y &= -14 \end{aligned} \]

  3. Add equations: \[ \begin{aligned} 2x + 3y &= 2 \\ -2x - 2y &= -14 \\ \hline y &= -12 \end{aligned} \]

  4. Back-substitute: \[ \begin{aligned} 2x + 3(-12) &= 2 \\ 2x - 36 &= 2 \\ 2x &= 38 \\ x &= 19 \end{aligned} \]
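As a quick check of the arithmetic, the same \(2\times 2\) system can be solved in one line with SymPy:

from sympy import symbols, solve, Eq

x, y = symbols("x y")
print(solve([Eq(2*x + 3*y, 2), Eq(x + y, 7)], [x, y]))    # {x: 19, y: -12}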

While both methods work, elimination systematically transforms the system into an “upper triangular” form, making solutions easier to find. Consider this upper triangular system:

\[ \begin{aligned} x_1 + 2x_2 + x_3 + x_4 &= 4 \\ x_3 + x_4 &= 1 \\ x_4 &= -2 \end{aligned} \]

From this form, we can easily find all solutions:

  1. \(x_4 = -2\)
  2. Substituting \(x_4\) in the second equation, we get \(x_3 = 3\)
  3. \(x_2\) is free (can be any real number)
  4. Substituting \(x_4\) and \(x_3\) in the first equation, we get \(x_1 = -2x_2 + 3\)

The general solution is: \[ \begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix} =\begin{bmatrix}-2x_2+3\\x_2\\3\\-2\end{bmatrix} =\begin{bmatrix}3\\0\\3\\-2\end{bmatrix}+ x_2\begin{bmatrix}-2\\1\\0\\0\end{bmatrix} \]

This represents all solutions, with \(x_2\) as the free parameter.
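We can confirm this parametric description with SymPy's linsolve, which expresses the determined variables in terms of the free variable \(x_2\):

from sympy import Matrix, linsolve, symbols

x1, x2, x3, x4 = symbols("x1 x2 x3 x4")

A = Matrix([[1, 2, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]])    # upper triangular coefficients
b = Matrix([4, 1, -2])
print(linsolve((A, b), x1, x2, x3, x4))    # x2 is free: (3 - 2*x2, x2, 3, -2)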

5.4 Gauss Elimination

The elimination method discussed previously naturally extends to a systematic algorithm known as Gaussian Elimination. This algorithm formalizes and generalizes the process of using row operations to transform a system of equations into “upper triangular” form, also called echelon form. The key insight is that through a careful sequence of row operations (adding multiples of one equation to another, swapping equations, or multiplying an equation by a nonzero scalar), we can systematically create zeros below the diagonal, one column at a time.

When a system is in echelon form, the pattern of coefficients forms a “staircase” shape, making it significantly easier to find solutions through back-substitution. This systematic approach works for systems of any size, making it a fundamental tool in linear algebra. The detailed algorithm and its implementation are presented in the accompanying slide presentation.

Why does Gaussian Elimination work?

The mathematical foundation of Gaussian elimination lies in matrix multiplication: each row operation is equivalent to left multiplication by an invertible matrix. The following theorem shows why these operations preserve the solution set of the original system.

Theorem 5.1 Let \(A\) be an \(m\times n\) matrix, \(\mathbf{b}\in\mathbb{R}^m\) and \(E\) be an invertible \(m\times m\) matrix. Then the following conditions are equivalent for a vector \(\mathbf{x}_0\in\mathbb{R}^n\):

  1. \(\mathbf{x}_0\) is a solution of the system \(A\mathbf{x}=\mathbf{b}\)
  2. \(\mathbf{x}_0\) is a solution of the system \((EA)\mathbf{x}=E\mathbf{b}\)

Proof. The equivalence follows because \(E\) is invertible:

  • If \(A\mathbf{x}_0=\mathbf{b}\), then multiplying both sides by \(E\) gives \((EA)\mathbf{x}_0=E\mathbf{b}\)
  • Conversely, if \((EA)\mathbf{x}_0=E\mathbf{b}\), then multiplying both sides by \(E^{-1}\) gives \(A\mathbf{x}_0=\mathbf{b}\). \(\square\)

Each elementary row operation corresponds to multiplication by a specific type of invertible matrix:

  1. Row swap \((R_i \leftrightarrow R_j)\): Multiply by a permutation matrix that differs from the identity matrix only in rows \(i\) and \(j\), where the 1’s are swapped. The inverse is itself since swapping twice returns to the original
  2. Row scaling \((cR_i)\): Multiply by the diagonal matrix with \(c\) in position \((i,i)\) and 1’s elsewhere. The inverse multiplies row \(i\) by \(\frac{1}{c}\)
  3. Row addition \((R_i + cR_j)\): Multiply by the matrix that differs from the identity only in position \((i,j)\) where there is a \(c\). This adds \(c\) times row \(j\) to row \(i\). The inverse has \(-c\) in position \((i,j)\), which subtracts \(c\) times row \(j\) from row \(i\)

Therefore, if we perform a sequence of row operations to transform \(A\mathbf{x}=\mathbf{b}\) into row echelon form, we’re effectively multiplying both sides by a product of invertible matrices. This preserves the solution set while making the system easier to solve.

This also explains why we use augmented matrices. The augmented matrix \([A|\mathbf{b}]\) has \(m\) rows and \(n+1\) columns, so it follows from Equation 3.4 that \[E[A|\mathbf{b}]=[EA|E\mathbf{b}].\] Row operations on the augmented matrix therefore transform \(A\) and \(\mathbf{b}\) simultaneously while preserving their relationship.
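The following sketch, using the coefficient matrix from Section 5.1 as an illustrative example, builds one elementary matrix of each type in SymPy and confirms both that it is invertible and that multiplying the augmented matrix by it acts on \(A\) and \(\mathbf{b}\) simultaneously.

from sympy import Matrix, eye, Rational

A = Matrix([[3, 4, 1], [1, -1, 2], [5, -4, 1]])
b = Matrix([9, 4, 3])
Ab = Matrix.hstack(A, b)                                            # augmented matrix [A | b]

E_swap  = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])                 # R1 <-> R2
E_scale = Matrix([[1, 0, 0], [0, Rational(1, 2), 0], [0, 0, 1]])    # (1/2) R2
E_add   = Matrix([[1, 0, 0], [0, 1, 0], [-5, 0, 1]])                # R3 -> R3 - 5 R1

for E in (E_swap, E_scale, E_add):
    print(E.inv() * E == eye(3))                    # True: E is invertible
    print(E * Ab == Matrix.hstack(E * A, E * b))    # True: E[A|b] = [EA|Eb]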

5.5 Homogeneous Systems

A linear system \(A\mathbf{x}=\mathbf{b}\) is homogeneous if \(\mathbf{b}=\mathbf{0}\). That is, a homogeneous system has the form: \[A\mathbf{x}=\mathbf{0}\] The zero vector \(\mathbf{x}=\mathbf{0}\) is always a solution to a homogeneous system. This leads to a fundamental property: a homogeneous system has either:

  1. Exactly one solution: \(\mathbf{x}=\mathbf{0}\)
  2. Infinitely many solutions

This dichotomy follows because if a homogeneous system has any nonzero solution \(\mathbf{v}\), then \(c\mathbf{v}\) is also a solution for any scalar \(c\), yielding infinitely many solutions.

Solving Homogeneous Systems

To solve a homogeneous system, we reduce \(A\) to row echelon form and analyze the pivots.

In row echelon form, a free column is any column that does not contain a pivot. When solving the system, variables corresponding to free columns can take any value, while the other variables are determined by these choices. Each free column thus contributes one free parameter, and hence one dimension, to the solution set.

Consider the system: \[\begin{bmatrix} 1 & 2 & 1 & 3\\ 2 & 4 & 0 & 2\\ 3 & 6 & -1 & 1 \end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\]

Row reduction gives: \[\begin{bmatrix} 1 & 2 & 0 & 1\\ 0 & 0 & 1 & 2\\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}\]

Notice that columns 2 and 4 are free because they have no pivots. Using back substitution, we find that the solution of the reduced system is:

  • \(x_4\) is free
  • \(x_3=-2x_4\),
  • \(x_2\) is free, and
  • \(x_1=-2x_2-x_4\).

Writing all values in terms of \(x_2\) and \(x_4\) we see that solutions are of the form: \[ \begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix} =\begin{bmatrix}-2x_2-x_4\\x_2\\-2x_4\\x_4\end{bmatrix} =x_2\begin{bmatrix}-2\\ 1\\ 0\\ 0\end{bmatrix} + x_4\begin{bmatrix}-1\\ 0\\ -2\\ 1\end{bmatrix}\]

Since \(x_2,x_4\) can be any real numbers, the system has infinitely many solutions.
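SymPy can carry out this computation for us: rref() returns the reduced row echelon form together with the pivot columns, and nullspace() returns exactly the two basic solutions found above.

from sympy import Matrix

A = Matrix([[1, 2, 1, 3], [2, 4, 0, 2], [3, 6, -1, 1]])
print(A.rref())         # (reduced form, (0, 2)): pivots in the first and third columns
print(A.nullspace())    # basis of the solution set: (-2, 1, 0, 0) and (-1, 0, -2, 1)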

Connection Between Solutions and Column Dependencies

Solutions to homogeneous systems reveal linear dependencies among the columns of the coefficient matrix. Consider:

\[A=\begin{bmatrix} 1 & 2 & 1 & 3\\ 2 & 4 & 0 & 2\\ 3 & 6 & -1 & 1 \end{bmatrix} =\begin{bmatrix} \uparrow & \uparrow & \uparrow & \uparrow \\ \mathbf{a}_1 & \mathbf{a}_2 & \mathbf{a}_3 & \mathbf{a}_4 \\ \downarrow & \downarrow & \downarrow & \downarrow \end{bmatrix}\]

where \(\mathbf{a}_1,\mathbf{a}_2,\mathbf{a}_3,\mathbf{a}_4\) are the columns of \(A\).

We just saw that the general solution to \(A\mathbf{x}=\mathbf{0}\) is: \[ \begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix} = x_2\begin{bmatrix}-2\\ 1\\ 0\\ 0\end{bmatrix} + x_4\begin{bmatrix}-1\\ 0\\ -2\\ 1\end{bmatrix}\]

Each basic solution reveals a column dependency (Equation 3.1):

  1. Setting \(x_2=1, x_4=0\) gives solution \((-2,1,0,0)\in\mathbb{R}^4\). This means: \[-2\mathbf{a}_1 + \mathbf{a}_2 = \mathbf{0}\quad\text{which implies}\quad \mathbf{a}_2 = 2\mathbf{a}_1.\]
  2. Setting \(x_2=0, x_4=1\) gives solution \((-1,0,-2,1)\in\mathbb{R}^4\). This means: \[-\mathbf{a}_1 - 2\mathbf{a}_3 + \mathbf{a}_4 = \mathbf{0}\quad\text{which implies}\quad \mathbf{a}_4 = \mathbf{a}_1 + 2\mathbf{a}_3.\]

These relationships, which can be easily verified by replacing the values of the columns, show that free columns (\(\mathbf{a}_2\) and \(\mathbf{a}_4\)) can be expressed as linear combinations of pivot columns (\(\mathbf{a}_1\) and \(\mathbf{a}_3\)).
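These two dependencies are easy to verify directly in SymPy by comparing columns:

from sympy import Matrix

A = Matrix([[1, 2, 1, 3], [2, 4, 0, 2], [3, 6, -1, 1]])
print(A[:, 1] == 2 * A[:, 0])               # True: a2 = 2 a1
print(A[:, 3] == A[:, 0] + 2 * A[:, 2])     # True: a4 = a1 + 2 a3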

Homogeneous Systems: Dependence and Independence

The previous example illustrates important principles that we expand in this section. Let \(A\) be an \(m\times n\) matrix with columns \(\mathbf{a}_1,\mathbf{a}_2,\dots,\mathbf{a}_n\in\mathbb{R}^m\). Three key ideas connect solutions and column dependencies:

  1. If \(\mathbf{x}\neq\mathbf{0}\) solves \(A\mathbf{x}=\mathbf{0}\), then \(t\mathbf{x}\) is also a solution for any \(t\in\mathbb{R}\), giving infinitely many solutions.
  2. From Equation 3.1, \(\mathbf{x}=(x_1,\dots,x_n)\in\mathbb{R}^n\) solves \(A\mathbf{x}=\mathbf{0}\) if and only if
    \[x_1\mathbf{a}_1+x_2\mathbf{a}_2+\cdots+x_n\mathbf{a}_n=\mathbf{0}\]
  3. When any \(x_i\neq 0\), we can solve for \(\mathbf{a}_i\), expressing \(\mathbf{a}_i\) as a linear combination of the other columns: \[\mathbf{a}_i = -\frac{x_1}{x_i}\mathbf{a}_1-\cdots-\frac{x_{i-1}}{x_i}\mathbf{a}_{i-1}-\frac{x_{i+1}}{x_i}\mathbf{a}_{i+1}-\cdots-\frac{x_n}{x_i}\mathbf{a}_n.\] Conversely, if we can write \(\mathbf{a}_i\) in terms of the other columns, there exist constants \(c_1,\dots,c_{i-1},c_{i+1},\dots,c_n\) such that \[\mathbf{a}_i=c_1\mathbf{a}_1+\cdots+c_{i-1}\mathbf{a}_{i-1}+c_{i+1}\mathbf{a}_{i+1}+\cdots+c_n\mathbf{a}_n.\] Subtracting \(\mathbf{a}_i\) from both sides, we obtain \[\mathbf{0}=c_1\mathbf{a}_1+\cdots+c_{i-1}\mathbf{a}_{i-1}-\mathbf{a}_i+c_{i+1}\mathbf{a}_{i+1}+\cdots+c_n\mathbf{a}_n,\] which gives a non-zero solution of the homogeneous system because the coefficient of \(\mathbf{a}_i\) is \(-1\neq 0\).

The following two theorems capture these important relations:

Theorem 5.2 Suppose that \(A\) is an \(m\times n\) matrix. The following are equivalent:

  1. The homogeneous system \(A\mathbf{x}=\mathbf{0}\) has infinitely many solutions
  2. The homogeneous system \(A\mathbf{x}=\mathbf{0}\) has a non-zero solution
  3. We can write a column of \(A\) as a linear combination of the other columns
  4. The row reduced echelon form of \(A\) has a free column

Theorem 5.3 Suppose that \(A\) is an \(m\times n\) matrix with columns \(\mathbf{a}_1,\mathbf{a}_2\dots,\mathbf{a}_n\in\mathbb{R}^m\). The following are equivalent:

  1. The homogeneous system \(A\mathbf{x}=\mathbf{0}\) has a unique solution
  2. Let \(\mathbf{x}\in\mathbb{R}^n\). If \(A\mathbf{x}=\mathbf{0}\), then \(\mathbf{x}=\mathbf{0}\)
  3. Let \(c_1,\dots,c_n\in\mathbb{R}\). If \(c_1\mathbf{a}_1+c_2\mathbf{a}_2+\cdots+c_n\mathbf{a}_n=\mathbf{0}\), then \(c_1=c_2=\cdots=c_n=0.\)
  4. We cannot write any column of \(A\) as a linear combination of the other columns
  5. The row reduced echelon form of \(A\) doesn’t have any free columns

Condition 3 of Theorem 5.3 motivates the following definition

Definition 5.1 Suppose that \(V\) is a vector space and that \(\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\in V\). The set \(S=\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\}\) is linearly independent if for every \(c_1,\dots,c_k\in\mathbb{R}\), if \(c_1\mathbf{v}_1+c_2\mathbf{v}_2+\cdots+c_k\mathbf{v}_k=\mathbf{0}\), then \(c_1=c_2=\cdots=c_k=0.\)

Notice that \(\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\}\) is linearly independent if and only if we cannot write any of these vectors as a linear combination of the other ones. However, checking this requires solving \(k\) vector equations.

From a computational point of view, the statement in the definition is much more practical, because it requires solving only one vector equation. We look at \[ c_1\mathbf{v}_1+c_2\mathbf{v}_2+\cdots+c_k\mathbf{v}_k=\mathbf{0}\] and solve it. If the only solution is \(c_1=c_2=\cdots=c_k=0\), the set is linearly independent.

If \(\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\}\) is not linearly independent, then it is linearly dependent. In this case we can write one of the vectors as a linear combination of the others.
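In practice, then, we test linear independence by placing the vectors as the columns of a matrix and solving the homogeneous system. A minimal sketch with two illustrative sets of vectors in \(\mathbb{R}^3\):

from sympy import Matrix

# Columns are the candidate vectors v1, v2, v3 (illustrative choices)
V = Matrix([[1, 0, 1], [2, 1, 1], [3, 1, 0]])
print(V.nullspace())    # [] : only the trivial solution, so the columns are linearly independent

W = Matrix([[1, 2, 3], [2, 4, 6], [3, 6, 9]])
print(W.nullspace())    # non-empty: the columns are linearly dependent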

5.6 Linear Systems and the Span of Columns

Let \(A\) be an \(m\times n\) matrix with columns \(\mathbf{a}_1,\mathbf{a}_2,\dots,\mathbf{a}_n\in\mathbb{R}^m\). In this section we address three questions:

  1. Given \(\mathbf{b}\in\mathbb{R}^m\), when does \(A\mathbf{x}=\mathbf{b}\) have at least one solution?
  2. If \(A\mathbf{x}=\mathbf{b}\) has a solution, when is that solution unique?
  3. Under what conditions does \(A\mathbf{x}=\mathbf{b}\) have a solution for every \(\mathbf{b}\in\mathbb{R}^m\)?

Let’s start with existence. Suppose that \(\mathbf{x}\) is a solution of \(A\mathbf{x}=\mathbf{b}\). By Equation 3.1: \[A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{b}\]

Therefore, \(\mathbf{b}\) must be a linear combination of the columns of \(A\). The set of all such linear combinations is called the span of the columns (see Section 1.3). Then we have

Theorem 5.4 Let \(A\) be an \(m\times n\) matrix and \(\mathbf{b}\in\mathbb{R}^m\). Then the following conditions are equivalent:

  1. \(A\mathbf{x}=\mathbf{b}\) has a solution
  2. \(\mathbf{b}\) can be written as a linear combination of the columns of \(A\)

Let’s examine some examples. Consider: \[A=\begin{bmatrix}1&2&1&3\\2&4&0&2\\3&6&-1&1\end{bmatrix},\quad\mathbf{b}_1=\begin{bmatrix}1\\1\\1\end{bmatrix},\quad\mathbf{b}_2=\begin{bmatrix}1\\1\\0\end{bmatrix}\]

First, let’s solve \(A\mathbf{x}=\mathbf{b}_1\). The row reduced echelon form of \([A|\mathbf{b}_1]\) is: \[ \left[\begin{array}{cccc|c}1&2&1&3&1\\2&4&0&2&1\\3&6&-1&1&1\end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{cccc|c}1&2&0&1&\frac{1}{2}\\0&0&1&2&\frac{1}{2}\\0&0&0&0&0\end{array}\right] \]

This system has solutions. Using back substitution:

  1. \(x_4\) is free
  2. \(x_3=\frac{1}{2}-2x_4\)
  3. \(x_2\) is free
  4. \(x_1=\frac{1}{2}-2x_2-x_4\)

The general solution is: \[ \begin{bmatrix}x_1\\ x_2\\ x_3\\ x_4\end{bmatrix} = \begin{bmatrix}\frac{1}{2}\\ 0\\ \frac{1}{2}\\ 0\end{bmatrix} +x_2 \begin{bmatrix}-2\\ 1\\ 0\\ 0\end{bmatrix} +x_4 \begin{bmatrix}-1\\ 0\\ -2\\ 1\end{bmatrix} \] Since \(x_2\) and \(x_4\) can take arbitrary values, there are infinitely many solutions. When \(x_2=x_4=0\), we get \((\frac{1}{2},0,\frac{1}{2},0)\), showing that: \[\mathbf{b}_1=\frac{1}{2}\mathbf{a}_1+\frac{1}{2}\mathbf{a}_3,\] which we can verify by replacing the values of the columns of \(A\) in the equation.
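The same general solution can be obtained directly with linsolve:

from sympy import Matrix, linsolve, symbols

x1, x2, x3, x4 = symbols("x1 x2 x3 x4")
A = Matrix([[1, 2, 1, 3], [2, 4, 0, 2], [3, 6, -1, 1]])
b1 = Matrix([1, 1, 1])
print(linsolve((A, b1), x1, x2, x3, x4))    # x2, x4 free: (1/2 - 2*x2 - x4, x2, 1/2 - 2*x4, x4)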

This example illustrates the following

Theorem 5.5 Let \(A\) be an \(m\times n\) matrix, \(\mathbf{b}\in\mathbb{R}^m\), and suppose that \(A\mathbf{x}=\mathbf{b}\) has at least one solution. Then the following conditions are equivalent:

  1. \(A\mathbf{x}=\mathbf{b}\) has a unique solution
  2. The matrix \(R\), the reduced row echelon form of \(A\), has a pivot position in every column.
  3. The matrix \(R\), the reduced row echelon form of \(A\), does not have a free column.

Now consider \(A\mathbf{x}=\mathbf{b}_2\). The row reduced echelon form of \([A|\mathbf{b}_2]\) is: \[ \left[\begin{array}{cccc|c}1&2&1&3&1\\2&4&0&2&1\\3&6&-1&1&0\end{array}\right] \xrightarrow{\text{RREF}} \left[\begin{array}{cccc|c}1&2&0&1&0\\0&0&1&2&0\\0&0&0&0&1\end{array}\right] \]

This system has no solution since the last equation becomes: \[0x_1+0x_2+0x_3+0x_4=1\]

For the general case, let \(\mathbf{b}_3=\begin{bmatrix}a\\b\\c\end{bmatrix}\) be an arbitrary vector in \(\mathbb{R}^3\). We can determine when solutions exist by reducing the augmented matrix \([A|\mathbf{b}_3]\) to echelon form. We can do this in SymPy using the echelon_form() method, which computes a row echelon form (not the reduced row echelon form).

from sympy import Matrix, symbols

a,b,c = symbols("a b c")
A = Matrix([[1,2,1,3],[2,4,0,2],[3,6,-1,1]])
B = Matrix.hstack(A,Matrix([a,b,c]))    # finds the augmented matrix
B.echelon_form()    # finds the echelon form

\(\displaystyle \left[\begin{matrix}1 & 2 & 1 & 3 & a\\0 & 0 & -2 & -4 & - 2 a + b\\0 & 0 & 0 & 0 & - 2 a + 4 b - 2 c\end{matrix}\right]\)

The last equation is: \[0x_1+0x_2+0x_3+0x_4=-2a+4b-2c.\] Then we conclude that \(A\mathbf{x}=\mathbf{b}_3\) has a solution if and only if \(-2a+4b-2c=0\), which explains why we have a solution for \(\mathbf{b}_1=(1,1,1)\) but not for \(\mathbf{b}_2=(1,1,0)\).

The last two examples illustrate the following

Theorem 5.6 Let \(A\) be an \(m\times n\) matrix and let \(R\) be the reduced row echelon form of \(A\). The following conditions are equivalent:

  1. \(A\mathbf{x}=\mathbf{b}\) has a solution for every \(\mathbf{b}\in\mathbb{R}^m\).
  2. \(R\) has a pivot position in every row.
  3. \(R\) does not have a zero row.
  4. Every \(\mathbf{b}\in\mathbb{R}^m\) can be written as a linear combination of the columns of \(A\)

The last condition motivates the following definition

Definition 5.2 Suppose that \(V\) is a vector space and that \(\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\in V\). The set \(S=\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_k\}\) spans \(V\) if for every \(\mathbf{v}\in V\), there exist \(c_1,\dots,c_k\in\mathbb{R}\) such that \[c_1\mathbf{v}_1+c_2\mathbf{v}_2+\cdots+c_k\mathbf{v}_k=\mathbf{v}\]
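By Theorem 5.6, the columns of an \(m\times n\) matrix span \(\mathbb{R}^m\) exactly when its reduced row echelon form has a pivot in every row, that is, when the rank equals \(m\). A short sketch (the second matrix is an illustrative example):

from sympy import Matrix

A = Matrix([[1, 2, 1, 3], [2, 4, 0, 2], [3, 6, -1, 1]])
print(A.rank())    # 2 < 3 rows: the columns do not span R^3

B = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
print(B.rank())    # 3 = number of rows: the columns span R^3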

5.7 Finding Inverses

Let \(A\) be an \(n\times n\) matrix. Recall that \(A\) is invertible if there exists an \(n\times n\) matrix \(A^{-1}\) such that: \[AA^{-1}=I_n\quad\text{and}\quad A^{-1}A=I_n\] where \(I_n\) is the \(n\times n\) identity matrix. If \(A\) and \(B\) are invertible \(n\times n\) matrices, then \(AB\) is also invertible and its inverse is the product of their inverses in reverse order: \[(AB)^{-1} = B^{-1}A^{-1} \tag{5.1}\] We can easily verify this: \[(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AI_nA^{-1} = AA^{-1} = I_n\] And similarly for \((B^{-1}A^{-1})(AB)\).

Suppose now that \(A\) is an invertible \(n\times n\) matrix. Then for any \(\mathbf{b}\in\mathbb{R}^n\), the system \(A\mathbf{x}=\mathbf{b}\) has a unique solution: \[\mathbf{x}=A^{-1}\mathbf{b}.\] From Theorem 5.6, the row reduced echelon form \(R\) of an invertible matrix \(A\) must have:

  • A pivot in every row
  • A pivot in every column, because \(R\) has an equal number of rows and columns
  • No free columns

This leads to a key result:

Theorem: If \(A\) is an invertible \(n\times n\) matrix, its row reduced echelon form \(R\) is the identity matrix \(I_n\).

This gives us an algorithm to find \(A^{-1}\):

  1. Form the augmented matrix \([A|I_n]\)
  2. Row reduce to get \([I_n|B]\) for some matrix \(B\)
  3. Then \(B = A^{-1}\)

We’ll explore this algorithm in detail during the lab.
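As a brief preview of the lab, here is a minimal sketch of the algorithm applied to the coefficient matrix from Section 5.1, using SymPy's rref() and comparing with the built-in inverse:

from sympy import Matrix, eye

A = Matrix([[3, 4, 1], [1, -1, 2], [5, -4, 1]])
M = Matrix.hstack(A, eye(3))    # augmented matrix [A | I3]
R, pivots = M.rref()            # row reduce; the left half becomes I3
B = R[:, 3:]                    # the right half is the inverse
print(B == A.inv())             # True
print(A * B == eye(3))          # True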

5.7.1 Writing Invertible Matrices as Products of Elementary Matrices

Let \(A\) be an invertible matrix. Then the row reduced echelon form of \(A\) is the identity. Each step of the row reduction process corresponds to left multiplication by an elementary matrix. Recall that elementary matrices are invertible and their inverses are also elementary matrices. Therefore, we can find elementary matrices \(E_1,\ldots,E_k\) such that: \[E_k\cdots E_2E_1A = I_n\] Solving for \(A\) we get that \[A=E_k^{-1}E_{k-1}^{-1}\cdots E_2^{-1}E_1^{-1}.\]

To illustrate, consider the case where \(E_4E_3E_2E_1A = I_n\). We can solve for \(A\):

  1. Start with \(E_4E_3E_2E_1A = I_n\)
  2. Multiply from the left by \(E_4^{-1}\):
    • \(E_3E_2E_1A = E_4^{-1}\)
  3. Multiply from the left by \(E_3^{-1}\):
    • \(E_2E_1A = E_3^{-1}E_4^{-1}\)
  4. Multiply from the left by \(E_2^{-1}\):
    • \(E_1A = E_2^{-1}E_3^{-1}E_4^{-1}\)
  5. Finally multiply from the left by \(E_1^{-1}\):
    • \(A = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}\)
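To see the factorization in action, here is a small sketch with an illustrative \(2\times 2\) matrix: we record the elementary matrices used to reduce \(A\) to the identity and then check that \(A\) is the product of their inverses in reverse order.

from sympy import Matrix, Rational, eye

A = Matrix([[1, 2], [3, 4]])                    # invertible, since det(A) = -2

E1 = Matrix([[1, 0], [-3, 1]])                  # R2 -> R2 - 3 R1
E2 = Matrix([[1, 0], [0, Rational(-1, 2)]])     # R2 -> (-1/2) R2
E3 = Matrix([[1, -2], [0, 1]])                  # R1 -> R1 - 2 R2

print(E3 * E2 * E1 * A == eye(2))               # True: the row reduction reaches I2
print(A == E1.inv() * E2.inv() * E3.inv())      # True: A = E1^(-1) E2^(-1) E3^(-1)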