Love Triangles

Before moving to a general approach to solving a linear system, let's think about a special case that remains almost as easy as the scalar case. Consider this matrix:
$$A=\begin{pmatrix}1&0&0\\ 2&3&0\\ 4&5&6\end{pmatrix}.$$
Now form a linear system with it:
$$\begin{pmatrix}1&0&0\\ 2&3&0\\ 4&5&6\end{pmatrix}\begin{pmatrix}x\\ y\\ z\end{pmatrix}=\begin{pmatrix}-1\\ 4\\ 0\end{pmatrix}.$$
Visually we can see an easy way to solve this problem. The first equation (first row) only depends on $x$, so it is just a scalar system: $x=-1/1=-1$. Next, once we have determined $x$, we use it in the second row, which now becomes a scalar equation in $y$: $2(-1)+3y=4$, or $y=6/3=2$. Now we can move on to the third row and solve for $z$: $4(-1)+5(2)+6z=0$, or $z=(0+4-10)/6=-1$.

So we have extended the simple scalar case of dividing the constant by the coefficient to solve an $N$-dimensional system, as long as the system is triangular. That is, the algorithm subtracts (weighted) values already solved from the current constant term and then divides by the current diagonal element.

When will this process fail? Under the same condition as the scalar system: when it calls for dividing by 0. The terms we divide by are the diagonal elements of the matrix. So if any of the $a_{ii}$ elements are 0 the triangular system is singular. If they are all different from 0 the algorithm will work, the system can be inverted, and the determinant of $A$ is not zero. In fact, you might remember that if $A$ is lower triangular its determinant is simply the product of its diagonal elements:
$$A \text{ triangular } \to |A|=\prod_{n=1}^{N} a_{nn}.$$
Remember that we are talking about computing the solution with numbers stored as floating point reals (FPRs). If any $a_{ii}=0$ exactly then we have a symbolically singular triangular matrix. But if $a_{ii}$ is not exactly zero yet is an FPR close enough to 0.0, we get overflow: a result that is too big to store given the precision of the FPRs. We could test for near 0 using Ox's feq() function.
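As a concrete check, here is a minimal Ox sketch that hard-codes these three steps for the example above (the general algorithm comes next):

#include <oxstd.h>
main() {
    decl A = <1,0,0; 2,3,0; 4,5,6>, b = <-1; 4; 0>, x = zeros(3,1);
    x[0] = b[0] / A[0][0];                                  // row 1 is a scalar equation in x
    x[1] = (b[1] - A[1][0]*x[0]) / A[1][1];                 // substitute x into row 2, solve for y
    x[2] = (b[2] - A[2][0]*x[0] - A[2][1]*x[1]) / A[2][2];  // substitute x and y into row 3
    println("x = ", x);                                     // should print -1, 2, -1
    }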
- Suppose A is an N×N lower triangular matrix and we wish to solve Ax=b for x.
- This requires a sequence of forward substitutions:
$$x_1=\frac{b_1}{a_{11}} \;\to\; x_2=\frac{b_2-a_{21}x_1}{a_{22}} \;\to\; \cdots \;\to\; x_N=\frac{b_N-S_N}{a_{NN}}.$$
- At row n, the term $S_n=\sum_{i=1}^{n-1}a_{ni}x_i$ is the sum of values of earlier rows substituted into the current nth equation. When n=1 the sum is simply $S_1=0$.
- The elements of x are computed by looping over the rows of A; at each row n the sum $S_n$ loops over the columns of the current row up to (but not including) the diagonal.
- Suppose A is an N×N upper triangular matrix and we wish to solve Ax=b for x.
- This requires a sequence of back substitutions, starting from the last row and working up (a code sketch for the lower triangular case follows this list):
$$\begin{pmatrix}a_{11}&a_{12}&\cdots&a_{1N}\\ 0&a_{22}&a_{23}&\cdots\\ 0&0&\ddots&\vdots\\ 0&\cdots&0&a_{NN}\end{pmatrix}x=b \;\Rightarrow\; x_N=\frac{b_N}{a_{NN}} \;\to\; x_{N-1}=\frac{b_{N-1}-a_{N-1,N}x_N}{a_{N-1,N-1}} \;\to\;\cdots\;\to\; x_1=\frac{b_1-\sum_{i=2}^{N}a_{1i}x_i}{a_{11}}.$$
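The loops described in these notes can be sketched in Ox as follows. This is only an illustration (the function name solveLower is made up here); the exercises below ask you to build a SolveL() that returns the result through an address instead. The upper triangular version is the mirror image: loop n from the last row up to the first and sum the columns to the right of the diagonal.

#include <oxstd.h>
solveLower(A, b) {
    decl n, i, S, x = zeros(rows(A), 1);
    for (n = 0; n < rows(A); ++n) {            // outer loop: rows of A
        S = 0.0;                               // S_n, the sum of already-solved terms
        for (i = 0; i < n; ++i)                // inner loop: columns before the diagonal
            S = S + A[n][i] * x[i];
        x[n] = (b[n] - S) / A[n][n];           // divide by the diagonal element
        }
    return x;
    }
main() {
    println("x = ", solveLower(<1,0,0; 2,3,0; 4,5,6>, <-1; 4; 0>));   // should print -1, 2, -1
    }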
Beethoven in his grave … Decomposing
- If |A|≠0 then:
- ∃L,U:A=LU and L is lower △, U is upper △, and L has 1s on its diagonal.
- Since the diagonal of L is normalized, $|A|=|LU|=|L|\,|U|=1\times|U|=\prod_{i=1}^{N}u_{ii}$.
- The matrices L and U for A can be found numerically using an algorithm that fails only if |A|=0. This will happen in the process at some row i when uii is too close to 0. At that point the L-U decomposition will stop because it has discovered that the determinant of A is 0. Thus, instead of computing the determinant in order to solve a system of equations (as taught in math econ courses), computer algorithms compute the determinant as a by-product of solving the system.
- The solution is numerically more accurate if permutations of the rows of A are allowed. That is, a permutation matrix P is the identity matrix I with some of its rows swapped. The L-U algorithm will then produce 3 matrices such that A=PLU. However, note that Ox implements this differently:
declu(A, aL, aU, aP)

Argument   Explanation
------------------------------------------------------------------------------------
A          the (square) matrix to decompose
aL         an address (pointer) where to put L
aU         an address (pointer) where to put U
aP         an address (pointer) where to put a permutation matrix P. On output P
           contains a 2 x N matrix containing integers. The first row is the order
           of rows of L*U to get back to the original rows of A. That is,
           A = (L*U)[ P[0][] ][]

Returns    1: if no error occurred, and the L, U, P matrices are reliable.
           2: if the decomposition could be unreliable (because A is near to singular).
           0: if the LU decomposition failed and the matrix A is (numerically) singular.
decl L, U, P, ok, x, A, b;
// compute or load A and b
ok = declu(A, &L, &U, &P);
if (ok > 0) {
    x = solvelu(L, U, P, b);
    println(" x = ", x);
    }
else
    println("A is singular, cannot solve for x");
invert(A)
           Returns the inverse of A, or 0 if the decomposition failed.

invert(A, aLogDet, aSign)

Argument   Explanation
------------------------------------------------------------------------------------
aLogDet    address to put log(|det(A)|), the logarithm of the absolute value of |A|
aSign      address to put a sign code for |A|:
              0: singular
              -1, -2: negative determinant
              +1, +2: positive determinant
              either -2 or +2 indicate the result is unreliable.
- Use the L-U decomposition to compute P, L and U for A. If this fails then A is not invertible: report that or handle this situation some other way.
- Rearrange the rows of b to undo the permutation: c=P′b. (Since A=PLU, the system Ax=b is equivalent to LUx=P′b.)
- Forward solve Ly=c for y.
- Backward solve Ux=y for x.
Non-square and Singular Systems: Singular Value Decomposition

Besides the L-U decomposition there are other procedures for square matrices (the QR decomposition, and the Cholesky decomposition for symmetric matrices discussed elsewhere). When it comes to non-square matrices other algorithms also become important. So suppose B is an N×M matrix and N≥M. We know that if N≠M the inverse of B is not defined, and the square triangular L-U decomposition is not defined either. However, there are notions of "generalized inverses" of non-square matrices which we do not discuss. And there is an important decomposition method we will define, the SVD or singular value decomposition:
$$B=UWV'.$$
Here U is an N×M matrix (the same dimensions as B), W is a square M×M diagonal matrix, and V is an M×M matrix (notice the transpose in the equation). Since W is diagonal it has M elements on the diagonal, denoted $w_i$. These are known as the singular values of B. Further, the matrices U and V are normalized so that $U'U=V'V=I_M$. This says that the columns of U can be constructed to be orthogonal to each other; the same holds for the columns of V.

In the special case of B square (N=M) we could write its inverse as
$$B^{-1}=V\,\mathrm{diag}\!\left(1/w_1,\ 1/w_2,\ \cdots,\ 1/w_N\right)U'.$$
This relies on the fact that the inverse of an orthogonal matrix is its transpose ($UU'=I$). This matrix will exist as long as none of the singular values are exactly 0. Otherwise it is not defined, because the diagonal matrix would contain an infinite value. So having computed the SVD of B you can look at the $w$'s and determine how "close" to singular it is.

When N>M you can use the SVD to compute the "least squares" solution to Bx=b. That is, there are more equations than unknowns and you want to find an x vector that is as close as possible to solving the system:
$$x=V\,\mathrm{diag}\!\left(1/w_1,\ 1/w_2,\ \cdots,\ 1/w_M\right)U'b.$$
You can see directly that if it is a square system this gets back to $x=B^{-1}b$. Otherwise, this x minimizes the Euclidean distance between Bx and b. In regression analysis this is another way to write the OLS estimates: $b\equiv y$, $\hat\beta_{ols}\equiv x$, and the SVD components are computed from X, so that $V\,\mathrm{diag}(1/w_i)\,U'$ equates to $(X'X)^{-1}X'$.
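To see the last claim explicitly, here is a short derivation (added for clarity) that uses $U'U=V'V=I_M$ and the fact that W is diagonal:
$$B'B=(UWV')'(UWV')=VWU'UWV'=VW^2V',$$
so, provided all the $w_i$ are non-zero,
$$(B'B)^{-1}B'=VW^{-2}V'\,VWU'=VW^{-1}U'=V\,\mathrm{diag}\!\left(1/w_1,\dots,1/w_M\right)U'.$$
With $X\equiv B$ and $y\equiv b$ this is exactly the OLS formula $(X'X)^{-1}X'y$.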
For the triangular algorithm above, a test such as feq(A[n][n],0.0) detects a (numerically) zero diagonal element, and if it is true the algorithm can stop and report a (numerically) zero determinant. (The real algorithm can do better than just stopping.)
We can generalize what we saw for the 3×3 matrix above.
Algorithm 3. Solving a Lower Triangular System.
Algorithm 4. Solving Upper Triangular Systems.
So we have simple algorithms to solve any triangular system. Now suppose A=LU, where L is lower △ and U is upper △. Then
$$Ly=b \Rightarrow y=L^{-1}b, \qquad Ux=y \Rightarrow x=U^{-1}y=U^{-1}L^{-1}b=A^{-1}b.$$
To approach the general problem, consider first the trivial case when N=1 (one equation in one unknown). Then we have scalars Ax=b and therefore x=b/A. It should be clear that any non-zero scalar A can be written as the product of two non-zero numbers, call them L and U. Indeed, in the scalar case we can simply normalize L=1 and define U=A. So when N=1 there is a decomposition of A into two triangular "matrices". If, on the other hand, A=0, then U=0 and it is not possible to factor the matrix and then solve, since this would involve dividing by 0.
Once we let N be greater than 1 it is not so obvious that A can be factored into triangular matrices, or how to go about doing it. However, the idea that elements of L can still be normalized might be intuitively clear. There are $N(N+1)/2$ potentially non-zero elements in a triangular matrix. Two such matrices have a total of $N(N+1)=N^2+N$ elements, but the original A matrix has only $N^2$ elements. So we have N more degrees of freedom (free values) than we need for the factorization. One way to reduce this freedom is to normalize N elements of the two triangles. Following the scalar example, we can normalize the diagonal of L to consist of 1s. Then together there are exactly $N^2$ free elements in L and U to just match the elements in A.
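As a small illustration (an example added here, not one worked in the text), a 2×2 non-singular matrix factors with a unit-diagonal L as
$$\begin{pmatrix}2&1\\ 4&5\end{pmatrix}=\begin{pmatrix}1&0\\ 2&1\end{pmatrix}\begin{pmatrix}2&1\\ 0&3\end{pmatrix},\qquad |A|=u_{11}u_{22}=2\times 3=6.$$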
Fortunately, for the case of N>1 it has been known for decades that if A is non-singular then we can indeed find matrices L and U for it. We will not discuss how the algorithm for computing the L and U works (it's tedious) but let's be clear what it does.
Algorithm 5. L-U Decomposition & solving Ax=b & Inversion
In Ox's implementation of this algorithm, the job of telling the user how the rows of A were rearranged is done with a vector of integers. The earlier section on fancy indexing explained how a vector can be used to index a matrix. For example, suppose B is a 2x5 matrix. Then B[<1;0>][] is the 2x5 matrix with the rows of B swapped. That syntax is how Ox reports back the permutation:
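For instance, here is a minimal sketch (the matrix is just an example; the determinant check also assumes the standard oxstd.h functions diagonal(), prodr(), and determinant()):

#include <oxstd.h>
main() {
    decl A = <0,2,1; 1,1,4; 2,0,3>, L, U, P, ok;
    ok = declu(A, &L, &U, &P);
    if (ok < 1)
        println("decomposition failed");
    else {
        println("P = ", P);                              // 2 x N matrix of integers; first row = row order
        println("(L*U)[P[0][]][] = ", (L*U)[P[0][]][]);  // reordering the rows of L*U recovers A
        println("prod of diag(U) = ", prodr(diagonal(U)),
                " determinant(A) = ", determinant(A));   // equal up to a sign flip caused by the row swaps
        }
    }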
The built-in Ox function for carrying out the LU decomposition is declu(), documented above. How does a program using declu() know if the decomposition was successful? By looking at the return value of declu(). If your program has called declu(A,&L,&U,&P) and it returns 1, then we can solve a system of the form Ax=b with back- and forward-substitution, which Ox's solvelu() will do for you, as in the code shown earlier.
One key implication of solving linear systems using decompositions is that it does not require computing A−1, although if A−1 is already available it can be used to do the job. Since the L-U decomposition depends only on A, it can be reused to solve for different b's. In fact, suppose you have 5 different b vectors for which to solve Ax=b. Then you can put the column vectors into a matrix B. The back/forward substitution process can be used to solve the matrix equation AX=B. Each column of X will correspond to the vector solution for the corresponding column of B.
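For example, here is a minimal sketch of solving for several right-hand sides at once (the matrices are illustrative, and it assumes solvelu() accepts a matrix of right-hand-side columns, which is worth confirming in the Ox documentation):

#include <oxstd.h>
main() {
    decl A = <4,1,0; 1,3,1; 0,1,2>, B = <1,0,2; 0,1,1; 1,1,0>, L, U, P, X;
    if (declu(A, &L, &U, &P) > 0) {
        X = solvelu(L, U, P, B);        // column j of X solves A x = (column j of B)
        println("A*X - B (should be approximately zero):", A*X - B);
        }
    else
        println("A is (numerically) singular");
    }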
Suppose A is N×N. Now think of applying the idea in the previous point where B=I, the identity matrix of order N. We can get A−1 by solving AX=I. So solving a single linear system by first computing A−1 involves N−1 unnecessary solutions. So, in symbolic linear algebra we imagine finding a matrix inverse to solve a system of equations, but a computer would typically find the inverse by solving N linear systems. There are many cases when A−1 is needed for its own sake, or when you have to solve for many vectors b (more than N different ones). Then it makes sense to compute A−1 and use it to solve the linear systems: once A−1 is stored, computing A−1b is no slower than a single back/forward solve. So use L-U to compute A−1, then store it for other calculations or use it for many b vectors. Built-in Ox routines such as invert() will call declu() and solvelu() for you; invert() is documented above.
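As an illustration, here is a minimal sketch using the invert(A, aLogDet, aSign) form documented earlier (the matrix is just an example, and it assumes this form also returns the inverse):

#include <oxstd.h>
main() {
    decl A = <4,1; 1,3>, Ainv, logdet, sgn, det;
    Ainv = invert(A, &logdet, &sgn);
    if (sgn == 0)
        println("A is singular");
    else {
        det = exp(logdet);              // recover |det(A)| from its logarithm
        if (sgn < 0) det = -det;        // restore the sign of the determinant
        println("inverse = ", Ainv, "determinant = ", det);
        if (sgn == 2 || sgn == -2)
            println("warning: the result may be unreliable");
        }
    }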
Algorithm 6. A General Algorithm to Solve Ax=b & Inversion of a Matrix
Exercises
- Write a nested-loop flowchart for solving a triangular system. The outer loop is over rows; the inner loop is over columns of the current row.
- Write a function that solves a triangular system. Test your function by sending it triangular systems that you can verify easily. (In this version just assume the matrix is non-singular.) Your function should satisfy this description:
SolveL(A, b, ax)

Argument   Explanation
------------------------------------------------------------------------------------
A          N × N lower triangular matrix
b          N × 1 constant vector
ax         address to place the resulting solution x*
Open the code below. It is a shell for this function; fill in code to carry out the algorithm above.
23-SolveL.ox

#include <oxstd.h>

SolveL(A,b,ax) {
    decl n, j, S;
    ax[0] = constant(.NaN,b);       //initialize x*
    for(n=0;n < rows(A);++n) {
        // put code here to solve ax[0][n]
        }
    }

main() {
    // Put calls to your function to test it
    }

- Once your SolveL() function works, modify it so that it implements an additional feature:
⋮ Returns the determinant of A, |A|. If A has a numerically zero determinant it returns exactly 0 as the determinant and does not complete the solution to x*.
- Once your SolveL() code works, try eliminating the inner loop by using vector multiplication and subranges of A and ax[0].
- Make a copy of your SolveL() function, rename it SolveU(), and change the code to solve an upper triangular system.
- Write your own MyInvert(A) function to replace the standard invert(A), using declu() and solvelu(). Your function returns the inverse if successful or the integer 0 if the inversion fails.
- Modify MyInvert(A) so that it does not use solvelu(). Instead, use your SolveL() and SolveU() coded earlier.