As we have seen before, a linear equation in one variable is an equation of the form ax + b = c, where a, b, and c are given numbers (with a nonzero) and x is the unknown.
It is possible to have several unknowns, but then we also need several equations. An equation in several unknowns is linear if each unknown is multiplied by a constant and those products add up to a constant. A set of linear equations is called a linear system.
The most important and typical case is that the number of equations equals the number of unknowns.
Let's illustrate the concepts with the following simple system. Suppose there are two variables, x and y, and we have the two equations

x + y = 3
2x + 3y = 7
We begin by observing that both equations are satisfied if x = 2 and y = 1. This is easy to check (and I know it because that's how I set up the system), but how would we know if nobody told us the solution?
Textbooks usually present two methods of solving a linear system, substitution and elimination. Substitution works by solving one equation for one variable and substituting the result (an expression in the other variables) into the other equations. It works well for small systems (and it can sometimes be applied successfully to nonlinear systems). Elimination, or Gaussian Elimination, is essentially equivalent, but it's more systematic, and it's the basis of virtually all computer methods of solving large linear systems. This page focuses on elimination.
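To make the substitution idea concrete, here is a minimal Python sketch applied to the system above; the code just retraces the algebra by hand.

```python
# The system:  x + y = 3  and  2x + 3y = 7.
# Solve the first equation for x:  x = 3 - y.
# Substitute that into the second:  2*(3 - y) + 3*y = 7,  i.e.  6 + y = 7.
y = 7 - 2 * 3      # so y = 1
x = 3 - y          # back into x = 3 - y, so x = 2
print(x, y)        # -> 2 1
```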
Consider again the above system. Let's do something that may appear mysterious at first, but that embodies the basic idea. Let's multiply by 2 on both sides of the first equation. This gives the new system:

2x + 2y = 6
2x + 3y = 7
The significance of this operation is that in the new system the variable x has the same coefficient, namely 2, in both equations.
Next, let's subtract 6 on both sides of the second equation. However, on the left side let's not call it 6. Let's call it 2x + 2y. We can do this, according to the first equation. The second equation becomes

(2x + 3y) - (2x + 2y) = 7 - 6,    i.e.,    y = 1.
Now that we know y = 1 we return to the first equation and substitute 1 for y. This gives the equation

x + 1 = 3,    and hence    x = 2.
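The same two steps can be retraced in a few lines of Python. This is only a sketch that mirrors the hand computation, with each equation stored as its list of coefficients followed by the right hand side.

```python
eq1 = [1, 1, 3]    #  x +  y = 3
eq2 = [2, 3, 7]    # 2x + 3y = 7

# Step 1: multiply the first equation by 2 on both sides.
eq1_doubled = [2 * c for c in eq1]                    # 2x + 2y = 6

# Step 2: subtract the doubled first equation from the second.
reduced = [a - b for a, b in zip(eq2, eq1_doubled)]   # 0x + 1y = 1

y = reduced[2] / reduced[1]          # y = 1
x = (eq1[2] - eq1[1] * y) / eq1[0]   # substitute back into x + y = 3
print(x, y)                          # -> 2.0 1.0
```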
The first step (multiplying by 2) and the second step (subtracting 2x + 2y = 6 from the second equation) are usually combined and described, somewhat imprecisely, as subtracting twice the first equation from the second equation.
So in the above example we subtracted a suitable multiple of the first equation from the second to get a new equation that has just one variable. We solved that equation, and then substituted its solution into the first equation, which at that stage also turned into a single equation with a single unknown. We finished by solving that single equation.
Remember that at the end of the process we check our answers by substitution. Clearly, substituting x = 2 and y = 1 in the original equations shows that those equations are satisfied.
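In code the check is a one-liner per equation; here is the idea in Python:

```python
x, y = 2, 1
print(x + y == 3)          # first equation   -> True
print(2 * x + 3 * y == 7)  # second equation  -> True
```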
If we had more than two equations, say n equations, we would pick a suitable equation (which may be the first, or any other equation) and a particular variable (which may be the first, or another) and subtract suitable multiples of our chosen equation from the other equations so that the chosen variable no longer shows up in the resulting n - 1 equations in n - 1 unknowns. At that stage, we have reduced our problem to a simpler problem of the same structure. We repeat the process until we get a single equation in a single unknown. The process so far is called Gaussian Elimination. We solve our single remaining equation, substitute its solution into an equation that has only two unknowns, solve that one, substitute the two now known variables into an equation with three unknowns, and in this manner work our way backwards until we have all the solutions. This second process is called Backward Substitution.
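The whole procedure translates naturally into a short program. The following Python sketch is a bare-bones version of Gaussian Elimination with Backward Substitution; it assumes every pivot coefficient it divides by is nonzero and leaves out the row exchanges and error checks a serious implementation would add.

```python
def gaussian_elimination(A, b):
    """Solve A x = b by elimination and backward substitution.

    A is a list of n rows of n coefficients, b the list of right hand sides.
    Assumes every pivot A[k][k] encountered is nonzero (no row exchanges).
    """
    n = len(A)
    A = [row[:] for row in A]   # work on copies so the caller's data stays intact
    b = b[:]

    # Elimination: use equation k to remove variable k from the later equations.
    for k in range(n):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]        # how many copies of row k to subtract
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]

    # Backward substitution: the last equation has one unknown, work upwards.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

print(gaussian_elimination([[1, 1], [2, 3]], [3, 7]))   # -> [2.0, 1.0]
```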
The method outlined here works in general. For work with paper and pencil it needs to be streamlined. Moreover, mistakes are almost inevitable. Mistakes are harmless if they are recognized immediately. Let me describe a procedure that's efficient and that lets you discover errors immediately.
To begin with, we don't need to write down the unknowns all the time. It is sufficient to write down their coefficients. This will save time and effort. To check for errors we keep track of one more number, the row sum of each equation. It is simply the sum of the coefficients and the right hand side of each equation. For example, the equation 2x + 3y = 7 has the row sum 2 + 3 + 7 = 12.
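In code, keeping track of the row sum costs next to nothing; here is a tiny Python illustration with an equation stored as its coefficients followed by the right hand side.

```python
row = [2, 3, 7]        # the equation 2x + 3y = 7
row_sum = sum(row)     # 2 + 3 + 7
print(row_sum)         # -> 12
```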
Let's illustrate the technique with an example. Consider the linear system

x + y + z = 6
2x + 3y + z = 11
3x + y + 2z = 11
The system and its processing are described in the following table:

Row    x    y    z   RHS    RS   obtained from
 1     1    1    1     6     9   given
 2     2    3    1    11    17   given
 3     3    1    2    11    17   given
 4     0    1   -1    -1    -1   row 2 minus 2 times row 1
 5     0   -2   -1    -7   -10   row 3 minus 3 times row 1
 6     0    3    0     6     9   row 4 minus row 5
Equations 1, 2, and 3 form the original system, augmented by the row sum (RS). RHS stands for the awkward phrase right hand side. (You could also write the right hand side on the left, or you could arrange your equations in columns and write the right hand sides on top or at the bottom, or on a separate sheet of paper, if you like.)
Equation 4 is obtained by subtracting twice the first equation from the second equation. You should check that you indeed get the equation described in row 4 of the above table. Fully written out it means

(2x + 3y + z) - 2(x + y + z) = 11 - 2·6,    i.e.,    y - z = -1.

The row sum behaves the same way: the RS entry of row 4 can be computed either as 17 - 2·9 = -1 or by adding up the new row, 0 + 1 + (-1) + (-1) = -1, and the two results must agree.
This is the key to many error checking techniques: compute the same thing twice, and if you make a mistake in one of your computations it will show up as a discrepancy between your two answers.
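Here is that double computation for the step that produced row 4, written as a small Python sketch using the rows of the table above (each row stored as coefficients, right hand side, and row sum):

```python
row1 = [1, 1, 1,  6,  9]    #  x +  y + z =  6, row sum  9
row2 = [2, 3, 1, 11, 17]    # 2x + 3y + z = 11, row sum 17

# Row 4 = row 2 minus twice row 1, computed entry by entry (row sum included).
row4 = [a - 2 * b for a, b in zip(row2, row1)]
print(row4)                          # -> [0, 1, -1, -1, -1]

# The check: the carried-along row sum must equal the row sum recomputed
# from the new coefficients and right hand side.
print(sum(row4[:-1]) == row4[-1])    # -> True
```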
Row 5 is obtained similarly: it equals row 3 minus three times row 1. The whole purpose of the exercise is that rows (or equations) 4 and 5 contain only two variables, namely y and z. The new system is

y - z = -1
-2y - z = -7
These two equations have the same coefficient of z (namely -1). If we subtract the second from the first, or in terms of the overall system, row 5 from row 4, we obtain the new equation

3y = 6,    i.e.,    y = 2.
The table does not describe how we find x and z. Knowing y we find an equation, e.g., row 4, that contains y and z. That equation becomes

2 - z = -1,    and hence    z = 3.

Substituting y = 2 and z = 3 into the first equation gives x + 2 + 3 = 6, and so x = 1. As always, we finish by checking that x = 1, y = 2, z = 3 satisfy all three original equations.
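Putting the backward substitution into code as well, here is the complete computation for this example as a short Python sketch; the comments refer to the rows of the table above.

```python
# After elimination:  row 4 is  y - z = -1,  row 5 is  -2y - z = -7,
# and row 6 (row 4 minus row 5) is  3y = 6.
y = 6 / 3          # row 6:  3y = 6,        so y = 2
z = y + 1          # row 4:  y - z = -1,    so z = 3
x = 6 - y - z      # row 1:  x + y + z = 6, so x = 1
print(x, y, z)     # -> 1.0 2.0 3.0
```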
Notes