5  Systems of Linear Equations

Systems of linear equations form the cornerstone of linear algebra. They emerge naturally in mathematics, physics, engineering, and data science whenever we need to describe multiple linear relationships simultaneously.

5.1 Equivalent Representations

A key observation in linear algebra is that we can represent the same system in three different ways. Each representation offers unique advantages. Consider:

$$\begin{aligned}3x+4y+z&=9\\x-y+2z&=4\\5x-4y+z&=3\end{aligned}$$

As a Vector Equation

Here, we view the system as a sum of scaled vectors: $$x\begin{bmatrix}3\\1\\5\end{bmatrix}+y\begin{bmatrix}4\\-1\\-4\end{bmatrix}+z\begin{bmatrix}1\\2\\1\end{bmatrix}=\begin{bmatrix}9\\4\\3\end{bmatrix}$$

As a Matrix Equation

Here, we package all coefficients into a single matrix: $$\begin{bmatrix}3&4&1\\1&-1&2\\5&-4&1\end{bmatrix}\begin{bmatrix}x\\y\\z\end{bmatrix}=\begin{bmatrix}9\\4\\3\end{bmatrix}$$

The vector equation representation offers a geometric interpretation of linear systems. When we write the system as a vector equation, we’re asking whether a target vector can be obtained as a linear combination of given vectors. This connects systems of equations directly to fundamental concepts of linear algebra: linear combinations and spanning sets. We can visualize the solution process as asking whether the target vector lies in the span of our coefficient vectors, and if so, how we can reach it through scaling and adding these vectors.

Matrix equations package the coefficients efficiently. They are particularly useful in the square case, where the coefficient matrix is $n\times n$. When this matrix is invertible, we can solve the system directly by multiplying both sides by the inverse matrix: if $Ax=b$ and $A$ is invertible, then $x=A^{-1}b$. This not only gives us a theoretical way to express solutions but also connects to computational methods for solving systems. Moreover, whether a matrix is invertible tells us important information about the existence and uniqueness of solutions.
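For instance, we can carry out $x=A^{-1}b$ for the example system above in SymPy (a minimal sketch; the variable names are our own):

from sympy import Matrix

A = Matrix([[3,4,1],[1,-1,2],[5,-4,1]])    # coefficient matrix of the example system
b = Matrix([9,4,3])                        # right-hand side
x = A.inv() * b                            # x = A^(-1) b, valid because A is invertible
print(x)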

These three representations—systems of equations, vector equations, and matrix equations—each illuminate different aspects of linear systems. Fluency in moving between these representations and understanding their connections is important for both theoretical understanding and practical problem-solving.

5.2 Characterization of the Solution Set

A fundamental result of linear algebra is that every system of linear equations must have exactly one of these three outcomes:

  1. No solution exists (inconsistent system)
  2. Exactly one solution exists (unique solution)
  3. Infinitely many solutions exist

Let’s visualize each case in $\mathbb{R}^2$ using vector equations:

Case 1: No Solution

$$x\begin{bmatrix}1\\2\end{bmatrix}+y\begin{bmatrix}2\\4\end{bmatrix}=\begin{bmatrix}1\\1\end{bmatrix}$$

This system has no solution because:

  • The vectors $\begin{bmatrix}1\\2\end{bmatrix}$ and $\begin{bmatrix}2\\4\end{bmatrix}$ span only a line
  • The target vector $\begin{bmatrix}1\\1\end{bmatrix}$ lies off this line
  • No combination of the vectors can reach the target

Case 2: Unique Solution

$$x\begin{bmatrix}1\\2\end{bmatrix}+y\begin{bmatrix}1\\1\end{bmatrix}=\begin{bmatrix}0\\1\end{bmatrix}$$

This system has exactly one solution: $x=1$ and $y=-1$.

Case 3: Infinite Solutions

$$x\begin{bmatrix}1\\2\end{bmatrix}+y\begin{bmatrix}2\\4\end{bmatrix}=\begin{bmatrix}4\\8\end{bmatrix}$$

This system has infinitely many solutions because:

  • The vectors are parallel (one is a multiple of the other)
  • The target vector lies on their shared line
  • Solutions: $x=4+4t,\; y=-2t$ for any real $t$
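All three cases can be checked computationally. Here is a small SymPy sketch using linsolve, with our own symbol names; it returns the empty set, a single tuple, and a parametrized family, respectively:

from sympy import Matrix, linsolve, symbols

x, y = symbols("x y")
# Case 1: the target (1,1) is off the line spanned by the columns
print(linsolve((Matrix([[1,2],[2,4]]), Matrix([1,1])), x, y))    # EmptySet
# Case 2: independent columns give a unique solution
print(linsolve((Matrix([[1,1],[2,1]]), Matrix([0,1])), x, y))    # {(1, -1)}
# Case 3: parallel columns and a target on their line
print(linsolve((Matrix([[1,2],[2,4]]), Matrix([4,8])), x, y))    # {(4 - 2*y, y)}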

5.3 Solution Methods

In Algebra 1 (typically 8th or 9th grade), students learn three approaches:

  1. Graphing (limited to $\mathbb{R}^2$)
  2. Substitution
  3. Elimination

Let’s examine the algebraic methods using this system: $$\begin{aligned}2x+3y&=2\\x+y&=7\end{aligned}$$

Substitution Method

  1. Choose a variable and solve for it: $x=7-y$ (from the second equation)

  2. Substitute into the other equation: $$\begin{aligned}2(7-y)+3y&=2\\14-2y+3y&=2\\14+y&=2\\y&=-12\end{aligned}$$

  3. Back-substitute: $x=7-(-12)=19$

Elimination Method

  1. Set up the aligned system: $$\begin{aligned}2x+3y&=2\\x+y&=7\end{aligned}$$

  2. Multiply the second equation by $-2$: $$\begin{aligned}2x+3y&=2\\-2x-2y&=-14\end{aligned}$$

  3. Add the equations: $(2x+3y)+(-2x-2y)=2+(-14)$, which gives $y=-12$

  4. Back-substitute: $2x+3(-12)=2 \;\Rightarrow\; 2x-36=2 \;\Rightarrow\; 2x=38 \;\Rightarrow\; x=19$
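Either way, SymPy’s solve confirms the result (a quick check with our own symbol names):

from sympy import symbols, solve

x, y = symbols("x y")
# the system 2x + 3y = 2, x + y = 7, written as expressions equal to zero
print(solve([2*x + 3*y - 2, x + y - 7], [x, y]))    # {x: 19, y: -12}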

While both methods work, elimination systematically transforms the system into an “upper triangular” form, making solutions easier to find. Consider this upper triangular system:

$$\begin{aligned}x_1+2x_2+x_3-x_4&=4\\x_3-x_4&=1\\x_4&=2\end{aligned}$$

From this form, we can easily find all solutions:

  1. $x_4=2$
  2. Substituting $x_4$ in the second equation, we get $x_3=3$
  3. $x_2$ is free (can be any real number)
  4. Substituting $x_4$ and $x_3$ in the first equation, we get $x_1=-2x_2+3$

The general solution is: $$\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_2+3\\x_2\\3\\2\end{bmatrix}=\begin{bmatrix}3\\0\\3\\2\end{bmatrix}+x_2\begin{bmatrix}-2\\1\\0\\0\end{bmatrix}$$

This represents all solutions, with $x_2$ as the free parameter.
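We can reproduce this parametrized solution with linsolve (a short sketch; the free variable $x_2$ appears in the output):

from sympy import Matrix, linsolve, symbols

x1, x2, x3, x4 = symbols("x1 x2 x3 x4")
A = Matrix([[1,2,1,-1],[0,0,1,-1],[0,0,0,1]])    # upper triangular coefficients
b = Matrix([4,1,2])
print(linsolve((A, b), x1, x2, x3, x4))          # {(3 - 2*x2, x2, 3, 2)}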

5.4 Gaussian Elimination

The elimination method discussed previously naturally extends to a systematic algorithm known as Gaussian Elimination. This algorithm formalizes and generalizes the process of using row operations to transform a system of equations into “upper triangular” form, also called echelon form. The key insight is that through a careful sequence of row operations (adding multiples of one equation to another, swapping equations, or multiplying an equation by a nonzero scalar), we can systematically create zeros below the diagonal, one column at a time.

When a system is in echelon form, the pattern of coefficients forms a “staircase” shape, making it significantly easier to find solutions through back-substitution. This systematic approach works for systems of any size, making it a fundamental tool in linear algebra. The detailed algorithm and its implementation are presented in the accompanying slide presentation.
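In SymPy, the rref() method performs this elimination and also reports the pivot columns. A minimal sketch on the augmented matrix of the system from Section 5.1:

from sympy import Matrix

M = Matrix([[3,4,1,9],[1,-1,2,4],[5,-4,1,3]])    # augmented matrix [A | b]
R, pivots = M.rref()     # reduced row echelon form and pivot column indices
print(R)
print(pivots)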

Why does Gaussian Elimination work?

The mathematical foundation of Gaussian elimination lies in matrix multiplication: each row operation is equivalent to left multiplication by an invertible matrix. The following theorem shows why these operations preserve the solution set of the original system.

Theorem 5.1 Let $A$ be an $m\times n$ matrix, $b\in\mathbb{R}^m$, and $E$ be an invertible $m\times m$ matrix. Then the following conditions are equivalent for a vector $x_0\in\mathbb{R}^n$:

  1. $x_0$ is a solution of the system $Ax=b$
  2. $x_0$ is a solution of the system $(EA)x=Eb$

Proof. The equivalence follows because $E$ is invertible:

  • If $Ax_0=b$, then multiplying both sides by $E$ gives $(EA)x_0=Eb$
  • Conversely, if $(EA)x_0=Eb$, then multiplying both sides by $E^{-1}$ gives $Ax_0=b$.

Each elementary row operation corresponds to multiplication by a specific type of invertible matrix (see the sketch after this list):

  1. Row swap ($R_i \leftrightarrow R_j$): Multiply by the permutation matrix that differs from the identity matrix only in rows $i$ and $j$, where the 1’s are swapped. This matrix is its own inverse, since swapping twice returns to the original
  2. Row scaling ($cR_i$): Multiply by the diagonal matrix with $c$ in position $i$ and 1’s elsewhere. The inverse multiplies row $i$ by $\frac{1}{c}$
  3. Row addition ($R_i+cR_j$): Multiply by the matrix that differs from the identity only in position $(i,j)$, where there is a $c$. This adds $c$ times row $j$ to row $i$. The inverse has $-c$ in position $(i,j)$, which subtracts $c$ times row $j$ from row $i$
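A minimal SymPy sketch of the three types in the $3\times 3$ case, written out by hand on an example matrix of our choosing:

from sympy import Matrix

A = Matrix([[1,2,1],[2,4,0],[3,6,-1]])      # an example 3x3 matrix

P = Matrix([[0,1,0],[1,0,0],[0,0,1]])       # swaps rows 1 and 2
D = Matrix([[1,0,0],[0,5,0],[0,0,1]])       # scales row 2 by 5
E = Matrix([[1,0,0],[0,1,0],[-3,0,1]])      # adds -3 times row 1 to row 3

print(P*A)                                  # A with rows 1 and 2 swapped
print(D*A)                                  # A with row 2 multiplied by 5
print(E*A)                                  # A with 3 times row 1 subtracted from row 3
print(P.inv(), D.inv(), E.inv())            # each inverse is elementary of the same type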

Therefore, if we perform a sequence of row operations to transform Ax=b into row echelon form, we’re effectively multiplying both sides by a product of invertible matrices. This preserves the solution set while making the system easier to solve.

This also explains why we use augmented matrices. Notice that $[A\,|\,b]$ has $m$ rows and $n+1$ columns, and block multiplication gives $E[A\,|\,b]=[EA\,|\,Eb]$. So row operations on the augmented matrix simultaneously transform both $A$ and $b$ while preserving their relationship.

5.5 Homogeneous Systems

A linear system $Ax=b$ is homogeneous if $b=0$. That is, a homogeneous system has the form: $$Ax=0$$ The zero vector $x=0$ is always a solution to a homogeneous system. This leads to a fundamental property: a homogeneous system has either:

  1. Exactly one solution: $x=0$
  2. Infinitely many solutions

This dichotomy follows because if a homogeneous system has any nonzero solution $v$, then $cv$ is also a solution for any scalar $c$, yielding infinitely many solutions.

Solving Homogeneous Systems

To solve a homogeneous system, we reduce A to row echelon form and analyze the pivots.

In row echelon form, a free column is any column that does not contain a pivot. When solving the system, variables corresponding to free columns can take any value, while the other variables are determined by these choices. Each free column thus generates a dimension of solutions.

Consider the system: $$\begin{bmatrix}1&2&1&3\\2&4&0&2\\3&6&-1&1\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}$$

Row reduction gives: $$\begin{bmatrix}1&2&0&1\\0&0&1&2\\0&0&0&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}0\\0\\0\end{bmatrix}$$

Notice that columns 2 and 4 are free because they have no pivots. Using back substitution, we see that the solution of the reduced system is:

  • $x_4$ is free,
  • $x_3=-2x_4$,
  • $x_2$ is free, and
  • $x_1=-2x_2-x_4$.

Writing all values in terms of $x_2$ and $x_4$, we see that solutions are of the form: $$\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}-2x_2-x_4\\x_2\\-2x_4\\x_4\end{bmatrix}=x_2\begin{bmatrix}-2\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-2\\1\end{bmatrix}$$

Since $x_2, x_4$ can be any real numbers, the system has infinitely many solutions.
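SymPy’s nullspace() returns exactly these two basic solutions, one per free column (a quick check):

from sympy import Matrix

A = Matrix([[1,2,1,3],[2,4,0,2],[3,6,-1,1]])
for v in A.nullspace():     # basis of the solution set of Ax = 0
    print(v.T)              # prints (-2, 1, 0, 0) and (-1, 0, -2, 1) as rows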

Connection Between Solutions and Column Dependencies

Solutions to homogeneous systems reveal linear dependencies among the columns of the coefficient matrix. Consider:

$$A=\begin{bmatrix}1&2&1&3\\2&4&0&2\\3&6&-1&1\end{bmatrix}=\begin{bmatrix}a_1&a_2&a_3&a_4\end{bmatrix}$$

where $a_1,a_2,a_3,a_4$ are the columns of $A$.

We just saw that the general solution to $Ax=0$ is: $$\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=x_2\begin{bmatrix}-2\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-2\\1\end{bmatrix}$$

Each basic solution reveals a column dependency:

  1. Setting $x_2=1, x_4=0$ gives the solution $(-2,1,0,0)\in\mathbb{R}^4$. This means: $-2a_1+a_2=0$, which implies $a_2=2a_1$.
  2. Setting $x_2=0, x_4=1$ gives the solution $(-1,0,-2,1)\in\mathbb{R}^4$. This means: $-a_1-2a_3+a_4=0$, which implies $a_4=a_1+2a_3$.

These relationships, which can easily be verified by substituting the values of the columns, show that the free columns ($a_2$ and $a_4$) can be expressed as linear combinations of the pivot columns ($a_1$ and $a_3$).
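Both relations are easy to verify in SymPy (columns extracted with col(), which uses 0-based indices):

from sympy import Matrix

A = Matrix([[1,2,1,3],[2,4,0,2],[3,6,-1,1]])
a1, a2, a3, a4 = (A.col(j) for j in range(4))
print(a2 == 2*a1)           # True: a2 = 2 a1
print(a4 == a1 + 2*a3)      # True: a4 = a1 + 2 a3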

Homogeneous Systems: Dependence and Independence

The previous example illustrates important principles that we expand in this section. Let $A$ be an $m\times n$ matrix with columns $a_1,a_2,\dots,a_n\in\mathbb{R}^m$. Three key ideas connect solutions and column dependencies:

  1. If $x\neq 0$ solves $Ax=0$, then $tx$ is also a solution for any $t\in\mathbb{R}$, giving infinitely many solutions.
  2. Since $Ax=x_1a_1+x_2a_2+\cdots+x_na_n$, the vector $x=(x_1,\dots,x_n)\in\mathbb{R}^n$ solves $Ax=0$ if and only if $$x_1a_1+x_2a_2+\cdots+x_na_n=0$$
  3. When some $x_i\neq 0$, we can solve for $a_i$, expressing $a_i$ as a linear combination of the other columns: $$a_i=-\frac{x_1}{x_i}a_1-\cdots-\frac{x_{i-1}}{x_i}a_{i-1}-\frac{x_{i+1}}{x_i}a_{i+1}-\cdots-\frac{x_n}{x_i}a_n.$$ Conversely, if we can write $a_i$ in terms of the other columns, there exist constants $c_1,\dots,c_{i-1},c_{i+1},\dots,c_n$ such that $a_i=c_1a_1+\cdots+c_{i-1}a_{i-1}+c_{i+1}a_{i+1}+\cdots+c_na_n$. Subtracting $a_i$ from both sides, we obtain $$0=c_1a_1+\cdots+c_{i-1}a_{i-1}-a_i+c_{i+1}a_{i+1}+\cdots+c_na_n,$$ which is a nonzero solution of the homogeneous system because the coefficient of $a_i$ is $-1$.

The following two theorems capture these important relations:

Theorem 5.2 Suppose that $A$ is an $m\times n$ matrix. The following are equivalent:

  1. The homogeneous system $Ax=0$ has infinitely many solutions
  2. The homogeneous system $Ax=0$ has a non-zero solution
  3. We can write a column of $A$ as a linear combination of the other columns
  4. The reduced row echelon form of $A$ has a free column

Theorem 5.3 Suppose that $A$ is an $m\times n$ matrix with columns $a_1,a_2,\dots,a_n\in\mathbb{R}^m$. The following are equivalent:

  1. The homogeneous system $Ax=0$ has a unique solution
  2. Let $x\in\mathbb{R}^n$. If $Ax=0$, then $x=0$
  3. Let $c_1,\dots,c_n\in\mathbb{R}$. If $c_1a_1+c_2a_2+\cdots+c_na_n=0$, then $c_1=c_2=\cdots=c_n=0$.
  4. We cannot write any column of $A$ as a linear combination of the other columns
  5. The reduced row echelon form of $A$ doesn’t have any free columns

Condition 3 of Theorem 5.3 motivates the following definition.

Definition 5.1 Suppose that $V$ is a vector space and that $v_1,v_2,\dots,v_k\in V$. The set $S=\{v_1,v_2,\dots,v_k\}$ is linearly independent if for every $c_1,\dots,c_k\in\mathbb{R}$, if $c_1v_1+c_2v_2+\cdots+c_kv_k=0$, then $c_1=c_2=\cdots=c_k=0$.

Notice that $\{v_1,v_2,\dots,v_k\}$ is linearly independent if and only if we cannot write any of these vectors as a linear combination of the other ones. However, checking this directly requires solving $k$ vector equations.

From a computational point of view, the statement in the definition is much more practical, since we only solve one vector equation. We look at $$c_1v_1+c_2v_2+\cdots+c_kv_k=0$$ and solve it. If the only solution is $c_1=c_2=\cdots=c_k=0$, the set is linearly independent.

If $\{v_1,v_2,\dots,v_k\}$ is not linearly independent, then it is linearly dependent. In this case we can write one of the vectors as a linear combination of the others.
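Computationally, checking independence amounts to asking whether the matrix with the vectors as columns has only the zero vector in its nullspace. A minimal SymPy sketch, with example vectors of our own:

from sympy import Matrix

v1, v2, v3 = Matrix([1,2,3]), Matrix([1,0,-1]), Matrix([3,2,1])
M = Matrix.hstack(v1, v2, v3)     # the vectors as columns
# an empty nullspace means the only solution is c1 = c2 = c3 = 0
print(M.nullspace() == [])        # False here: v3 = v1 + 2*v2, so the set is dependent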

5.6 Linear Systems and the Span of Columns

Let $A$ be an $m\times n$ matrix with columns $a_1,a_2,\dots,a_n\in\mathbb{R}^m$. In this section we address three questions:

  1. Given $b\in\mathbb{R}^m$, when does $Ax=b$ have at least one solution?
  2. If $Ax=b$ has a solution, when is that solution unique?
  3. Under what conditions does $Ax=b$ have solutions for every $b\in\mathbb{R}^m$?

Let’s start with existence. Suppose that $Ax=b$. Writing the matrix-vector product in terms of the columns of $A$: $$Ax=x_1a_1+x_2a_2+\cdots+x_na_n=b$$

Therefore, $b$ must be a linear combination of the columns of $A$. The set of all such linear combinations is called the span of the columns. Then we have:

Theorem 5.4 Let $A$ be an $m\times n$ matrix and $b\in\mathbb{R}^m$. Then the following conditions are equivalent:

  1. $Ax=b$ has a solution
  2. $b$ can be written as a linear combination of the columns of $A$

Let’s examine some examples. Consider: $$A=\begin{bmatrix}1&2&1&3\\2&4&0&2\\3&6&-1&1\end{bmatrix},\quad b_1=\begin{bmatrix}1\\1\\1\end{bmatrix},\quad b_2=\begin{bmatrix}1\\1\\0\end{bmatrix}$$

First, let’s solve $Ax=b_1$. The reduced row echelon form of $[A\,|\,b_1]$ is: $$\left[\begin{array}{cccc|c}1&2&1&3&1\\2&4&0&2&1\\3&6&-1&1&1\end{array}\right]\xrightarrow{\text{RREF}}\left[\begin{array}{cccc|c}1&2&0&1&\tfrac12\\0&0&1&2&\tfrac12\\0&0&0&0&0\end{array}\right]$$

This system has solutions. Using back substitution:

  1. $x_4$ is free
  2. $x_3=\frac12-2x_4$
  3. $x_2$ is free
  4. $x_1=\frac12-2x_2-x_4$

The general solution is: $$\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}=\begin{bmatrix}\tfrac12\\0\\\tfrac12\\0\end{bmatrix}+x_2\begin{bmatrix}-2\\1\\0\\0\end{bmatrix}+x_4\begin{bmatrix}-1\\0\\-2\\1\end{bmatrix}$$ Since $x_2$ and $x_4$ can take arbitrary real values, there are infinitely many solutions. When $x_2=x_4=0$, we get $(\frac12,0,\frac12,0)$, showing that: $$b_1=\tfrac12 a_1+\tfrac12 a_3,$$ which we can verify by substituting the values of the columns of $A$ into the equation.

This example illustrates the following

Theorem 5.5 Let $A$ be an $m\times n$ matrix, $b\in\mathbb{R}^m$, and suppose that $Ax=b$ has at least one solution. Then the following conditions are equivalent:

  1. $Ax=b$ has a unique solution
  2. The matrix $R$, the reduced row echelon form of $A$, has a pivot position in every column.
  3. The matrix $R$, the reduced row echelon form of $A$, does not have a free column.

Now consider $Ax=b_2$. The reduced row echelon form of $[A\,|\,b_2]$ is: $$\left[\begin{array}{cccc|c}1&2&1&3&1\\2&4&0&2&1\\3&6&-1&1&0\end{array}\right]\xrightarrow{\text{RREF}}\left[\begin{array}{cccc|c}1&2&0&1&0\\0&0&1&2&0\\0&0&0&0&1\end{array}\right]$$

This system has no solution, since the last equation becomes: $$0x_1+0x_2+0x_3+0x_4=1$$
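Both computations can be reproduced with linsolve; for $b_2$ it returns the empty set (a short sketch):

from sympy import Matrix, linsolve, symbols

x1, x2, x3, x4 = symbols("x1 x2 x3 x4")
A = Matrix([[1,2,1,3],[2,4,0,2],[3,6,-1,1]])
print(linsolve((A, Matrix([1,1,1])), x1, x2, x3, x4))    # infinitely many solutions
print(linsolve((A, Matrix([1,1,0])), x1, x2, x3, x4))    # EmptySet: no solution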

For the general case, let $b_3=\begin{bmatrix}a\\b\\c\end{bmatrix}$ be an arbitrary vector in $\mathbb{R}^3$. We can find solutions by reducing the augmented matrix $[A\,|\,b_3]$ to echelon form. We can do this in SymPy using the echelon_form() method, which finds a row echelon form (not the reduced row echelon form).

from sympy import Matrix, symbols

a, b, c = symbols("a b c")
A = Matrix([[1,2,1,3],[2,4,0,2],[3,6,-1,1]])
B = Matrix.hstack(A, Matrix([a, b, c]))    # builds the augmented matrix [A | b3]
B.echelon_form()                           # computes a row echelon form of [A | b3]

$$\left[\begin{array}{cccc|c}1&2&1&3&a\\0&0&-2&-4&-2a+b\\0&0&0&0&-2a+4b-2c\end{array}\right]$$

The last equation is: $$0x_1+0x_2+0x_3+0x_4=-2a+4b-2c.$$ Then we conclude that $Ax=b_3$ has a solution if and only if $-2a+4b-2c=0$, which explains why we have a solution for $b_1=(1,1,1)$ but not for $b_2=(1,1,0)$.

The last two examples illustrate the following

Theorem 5.6 Let $A$ be an $m\times n$ matrix and let $R$ be the reduced row echelon form of $A$. The following conditions are equivalent:

  1. $Ax=b$ has a solution for every $b\in\mathbb{R}^m$.
  2. $R$ has a pivot position in every row.
  3. $R$ does not have a zero row.
  4. Every $b\in\mathbb{R}^m$ can be written as a linear combination of the columns of $A$.

The last condition motivates the following definition

Definition 5.2 Suppose that $V$ is a vector space and that $v_1,v_2,\dots,v_k\in V$. The set $S=\{v_1,v_2,\dots,v_k\}$ spans $V$ if for every $v\in V$, there exist $c_1,\dots,c_k\in\mathbb{R}$ such that $$c_1v_1+c_2v_2+\cdots+c_kv_k=v$$

5.7 Finding Inverses

Let $A$ be an $n\times n$ matrix. Recall that $A$ is invertible if there exists an $n\times n$ matrix $A^{-1}$ such that: $$AA^{-1}=I_n\quad\text{and}\quad A^{-1}A=I_n$$ where $I_n$ is the $n\times n$ identity matrix. If $A$ and $B$ are invertible $n\times n$ matrices, then $AB$ is also invertible and its inverse is the product of their inverses in reverse order: $$(AB)^{-1}=B^{-1}A^{-1}\tag{5.1}$$ We can easily verify this: $$(AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AI_nA^{-1}=AA^{-1}=I_n$$ and similarly for $(B^{-1}A^{-1})(AB)$.

Suppose now that $A$ is an invertible $n\times n$ matrix. Then for any $b\in\mathbb{R}^n$, the system $Ax=b$ has the unique solution $x=A^{-1}b$. From Theorems 5.5 and 5.6, the reduced row echelon form $R$ of an invertible matrix $A$ must have:

  • A pivot in every row
  • A pivot in every column (because $R$ has an equal number of rows and columns)
  • No free columns

This leads to a key result:

Theorem: If $A$ is an invertible $n\times n$ matrix, its reduced row echelon form $R$ is the identity matrix $I_n$.

This gives us an algorithm to find $A^{-1}$:

  1. Form the augmented matrix $[A\,|\,I_n]$
  2. Row reduce to get $[I_n\,|\,B]$ for some matrix $B$
  3. Then $B=A^{-1}$

We’ll explore this algorithm in detail during the lab.
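As a preview, here is a minimal SymPy sketch of the algorithm, on a small invertible matrix of our own choosing:

from sympy import Matrix, eye

A = Matrix([[2,1],[5,3]])             # an invertible 2x2 example
M = Matrix.hstack(A, eye(2))          # the augmented matrix [A | I]
R, pivots = M.rref()                  # row reduces to [I | A^(-1)]
B = R[:, 2:]                          # the right block
print(B == A.inv())                   # True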

5.7.1 Writing Invertible Matrices as Products of Elementary Matrices

Let $A$ be an invertible matrix. Then the reduced row echelon form of $A$ is the identity. Each step of the row reduction process corresponds to left multiplication by an elementary matrix. Recall that elementary matrices are invertible and their inverses are also elementary matrices. Therefore, we can find elementary matrices $E_1,\dots,E_k$ such that: $$E_k\cdots E_2E_1A=I_n$$ Solving for $A$, we get $A=E_1^{-1}E_2^{-1}\cdots E_k^{-1}$.

To illustrate, consider the case where $E_4E_3E_2E_1A=I_n$. We can solve for $A$:

  1. Start with $E_4E_3E_2E_1A=I_n$
  2. Multiply from the left by $E_4^{-1}$:
    • $E_3E_2E_1A=E_4^{-1}$
  3. Multiply from the left by $E_3^{-1}$:
    • $E_2E_1A=E_3^{-1}E_4^{-1}$
  4. Multiply from the left by $E_2^{-1}$:
    • $E_1A=E_2^{-1}E_3^{-1}E_4^{-1}$
  5. Finally, multiply from the left by $E_1^{-1}$:
    • $A=E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}$
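The factorization can be checked concretely. Here is a small sketch with a $2\times 2$ matrix of our own and its elementary matrices written out by hand:

from sympy import Matrix, Rational, eye

A = Matrix([[1,2],[3,4]])
E1 = Matrix([[1,0],[-3,1]])               # R2 -> R2 - 3 R1
E2 = Matrix([[1,0],[0,Rational(-1,2)]])   # R2 -> (-1/2) R2
E3 = Matrix([[1,-2],[0,1]])               # R1 -> R1 - 2 R2
print(E3*E2*E1*A == eye(2))               # the row reduction reaches the identity
print(E1.inv()*E2.inv()*E3.inv() == A)    # A = E1^(-1) E2^(-1) E3^(-1)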