1. Crossroads (Robert Johnson, 1938): Finding roots of single equations
    Suppose we have a scalar non-linear function, $f:\Re\to\Re$, and we want to find a value $x^\star$ such that $f(x^\star)=0$. We say $x^\star$ is a root of the equation (or of the system). In general there can be no root, one root, or any number of roots, including a continuum of roots if the function is flat at zero. Note that this is different from the linear case, in which there must be either 0, 1, or a continuum of roots.

  1. Bracket and Bisect
  2. If we draw a picture, we can just see the root with our eyes. But if $f(x)$ is coded as a function in a language like Ox, the only way to know its value is to send it an argument and see the returned value. We will assume one very important thing: the function is continuous in $x$. Otherwise we know nothing about its shape except through sending values to it and seeing the results. That leads to the sequential method of finding the root described here:
    Bracket
    1. Pick a starting value $x_0$, which can be any point, and compute $f_0 = f(x_0)$. The result might look like this:
      Obviously $x_0$ is not a root. Where is a root? It could be to the left or right of $x_0$. It could be close or far away. There is no way to know with just this information.
    2. So let's try evaluating the function at a different point, $x_1$.
      We still don't know if there is a root or where it might be.
    3. A third function evaluation, $x_2$. We might as well keep going to the right, but $x_2$ could be anywhere:
      So far all the points are below the x axis but we don't know much more than that.
    4. One more try …
      Now something has happened! Between $x_2$ and $x_3$ the function must have crossed the x-axis at least once because of the intermediate value theorem you learn about in calculus.
    5. So now we see what the point of evaluating the function at different points has been: keep looking until you get values both above and below 0. Once we have that, we have a bracket around a root, and we know there is at least one root in between. Call these two points $x_l$ and $x_u$. The function values must have different signs, which is the same as the condition $f(x_l)f(x_u) < 0.$ Two points that satisfy this condition are said to bracket a root.

    Note that this process can go on indefinitely. We cannot stop evaluating the function at new points until the bracketing condition holds, and unless we know more about the function all we can do is keep trying points until it does. However, once we have captured (bracketed) one or more roots it is easy to find one of them.
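
    Below is a minimal Ox sketch of the bracket phase. The function f(), the starting point, and the step size are hypothetical choices made only for illustration; any continuous scalar function could be plugged in.

        #include "oxstd.h"

        /* A hypothetical test function; it has a real root between 2 and 3. */
        f(const x) {
            return x*x*x - 2*x - 5;
            }
        main() {
            decl xl = 0.0, xu, fl, fu, step = 1.0;
            fl = f(xl);
            xu = xl + step;  fu = f(xu);
            // Keep stepping to the right until the two most recent values have
            // opposite signs, i.e. f(xl)*f(xu) < 0, so a root is bracketed.
            // (As noted above, nothing guarantees this loop ever stops.)
            while (fl*fu > 0) {
                xl = xu;  fl = fu;          // the old right end becomes the new left end
                xu += step;  fu = f(xu);
                }
            println("Bracket found: [", xl, ",", xu, "]");
            }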

    Bisect
    1. We start the bisect phase of root finding once a bracket has been found. To make the picture more complete, we now fill in the graph of the function between the bracket points:
    2. Instead of stretching out to find a bracket we now stay between $x_2$ and $x_3$. The simplest choice is to compute the value of the function at the mid-point, $(x_2+x_3)/2$, hence the term bisecting.
      It turns out that for this function the value at the midpoint is still below zero, the same sign as $f(x_2)$. The midpoint and $x_3$ therefore form a new, tighter bracket: we move the left end of the bracket up from $x_2$ to the midpoint.
    3. With our new bracket we bisect again by computing the function at the midpoint of the new, smaller bracket.
      It turns out that this value is above zero, the opposite sign from the left end, so a new bracket is created by moving the right end down from $x_3$ to this new point. Now the bracket is $1/4$ the size of the original bracket and we still have a root captured between the endpoints.
    4. Unlike the bracketing process, the bisecting stage will end. Just keep bisecting and moving one end or the other so that the endpoints always have function values of opposite sign. Eventually the two end points will be extremely close to each other, with a root somewhere in between. Note that if we happened to bracket more than one root this process will converge on one of them, and there is no guarantee as to which one.

    Algorithm 11. Bracket and Bisect Root Finding

    Given $f:\Re\to\Re$, $x_0$ and tolerance $\epsilon$.
    1. Initialize. Calculate $f(\,x_0\,)$. Pick a point $x_1\gt x_0$ and evaluate $f(\,x_1\,)$.
    2. ↻ Bracket. Evaluate $f$ at points $x_2 \lt x_3 \lt \dots \lt x_B$ until two consecutive points bracket a root, that is, until $$f(\,x_{B-1}\,)\,f(\,x_B\,)\le 0.\nonumber$$ Set $x^L = x_{B-1}$ and $x^U=x_B$.
    3. → Bisect. Try $x^{\,\prime} = (x^L+x^U)/2$ and evaluate $f(\,x^{\,\prime}\,)$. If $f(\,x^L\,)\,f(\,x^{\,\prime}\,)\le 0$ set $x^U=x^{\,\prime}$; otherwise set $x^L=x^{\,\prime}$, ensuring the bracket still contains a root.
    4. If $x^U \mathop{=_\epsilon} x^L$ set $x^\star = {(x^L+x^U) / 2}$ and stop.
      Otherwise, return to →.
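
    Here is a sketch of Algorithm 11 in Ox, using the same hypothetical test function as above; the starting value, step size, and tolerance are illustrative choices, not part of the algorithm. The bisect loop keeps whichever half of the bracket still has a sign change at its ends.

        #include "oxstd.h"

        f(const x) {                      // hypothetical test function
            return x*x*x - 2*x - 5;
            }
        main() {
            decl eps = 1e-8, xL = 0.0, xU, xp, fL, fU, fp;

            // Bracket: move right until f(xL) and f(xU) have opposite signs.
            fL = f(xL);  xU = xL + 1.0;  fU = f(xU);
            while (fL*fU > 0) {
                xL = xU;  fL = fU;
                xU += 1.0;  fU = f(xU);
                }

            // Bisect: halve the bracket, keeping the half with the sign change.
            while (xU - xL > eps) {
                xp = (xL + xU)/2;  fp = f(xp);
                if (fL*fp <= 0)
                    { xU = xp;  fU = fp; }    // a root lies in the left half
                else
                    { xL = xp;  fL = fp; }    // a root lies in the right half
                }
            println("x* = ", (xL + xU)/2, "   f(x*) = ", f((xL + xU)/2));
            }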
  3. Roots of Polynomials
  4. NR 9.4
    A polynomial is a special case of a single non-linear equation, and we specialize even further to the case of one parameter or unknown. A polynomial of order $n$ in the unknown $x$ takes the form: $$g(x;a)\quad =\quad {\sum}_{i=0}^n a_i x^i.\tag{Poly}\label{Poly}$$ The $(n+1)$ vector of coefficients, $a = (a_0,\ a_1,\ \dots,\ a_n)$, defines the polynomial. As is well known, an $n$-th order polynomial has at most $n$ distinct roots $x_j$ satisfying $g(x_j; a) = 0$. Of course, not all the roots need be real; they can be complex and they can be repeated.

    Because polynomials are special and important one-dimensional functions, special algorithms are available to solve for all the roots of a polynomial at once.
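
    Before turning to Ox's built-in routines, here is a small hand-rolled sketch that connects the definition in (Poly) to code: it evaluates $g(x;a)$ with the coefficient vector ordered $(a_0,\ a_1,\ \dots,\ a_n)$ exactly as in the equation. The function name geval() and the example coefficients are made up for illustration; this is not the built-in polyeval() described below.

        #include "oxstd.h"

        /* Evaluate g(x;a) = a_0 + a_1 x + ... + a_n x^n, with the coefficient
           vector ordered as in equation (Poly).  Purely illustrative. */
        geval(const a, const x) {
            decl i, g = 0.0, xp = 1.0;
            for (i = 0; i < columns(a); ++i) {
                g += a[0][i] * xp;        // add a_i * x^i
                xp *= x;                  // update x^i to x^(i+1)
                }
            return g;
            }
        main() {
            // a = (-6, 1, 1) represents -6 + x + x^2, which is zero at x = 2 and x = -3.
            println("g(2)  = ", geval(<-6,1,1>, 2.0));
            println("g(-3) = ", geval(<-6,1,1>, -3.0));
            }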

    In Ox, you can use polyeval() to compute the value of a polynomial. The coefficients are ordered from the highest power down to the constant term, the reverse of the ordering in (Poly): for example, polyeval(<1,1,-6>,2.0) evaluates $1\times 2^2+1\times 2-6 = 0$, which says that $2$ is a root of $x^2+x-6$. And polyroots(a,&xroots) will compute the roots of the polynomial defined by the coefficient vector a. It returns the roots in xroots because its address is sent as the second argument. In particular, the output is a 2 × $n$ matrix, where $n$ is the order of the polynomial. The first row holds the real parts of the roots and the second row the imaginary parts, so the second row will be all zeros if every root is real. Finally, polyroots() returns an error code of 0 if the solutions are found and returns 1 if the algorithm failed to converge.

    Here is how to solve for the roots of the simple quadratic equation above:

    Code
    26-poly-roots.ox
     #include "oxstd.h"
     main(){
         decl x,ccode;
         ccode = polyroots(<1,1,-6>,&x);
         println("Convergence code: ",ccode,"\nSolutions: ",x);
         }
    
    Output:
    Convergence code: 0
    Solutions:
          -3.0000       2.0000
          0.00000      0.00000
    
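
    As a quick check, the real parts in the first row of the output can be plugged back into the quadratic. The short sketch below assumes, as the output above indicates, that <1,1,-6> represents $x^2+x-6$.

        #include "oxstd.h"
        main(){
            decl x, r;
            polyroots(<1,1,-6>, &x);
            r = x[0][];                   // first row: the real parts of the roots
            // plug the real roots back into x^2 + x - 6; both residuals should be ~0
            println("Residuals: ", r .* r + r - 6);
            }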

Exercises