1. All Down the Line (Rolling Stones, 1971): 1-D Line Optimization
    NR 10.1-10.2
    We start with the case of one-dimensional optimization. This procedure is a building block of algorithms that handle more than one dimension.
  1. Bracket
    1. Consider the information available at the start of a sequential optimization problem: a starting value $x_0$ and the ability to evaluate the function $f(x)$ at any point. We will also assume that the function is continuous, but otherwise we know nothing about its shape. So we evaluate $f(x_0)$:
      Where is a maximum? There is no way to know with just this information.
    2. So let's try evaluating the function at a different point, $x_1$. Given the information so far there is nothing special about this new point: it could be to the left or right of $x_0$, and it could be close by or far away. The only requirement is that it be different from $x_0$.
      We still don't know if there is a maximum anywhere. The function could be increasing between $x_0$ and $x_1$ or it might have turned around in between and be on the way back down.
    3. A third function evaluation, $x_2$. We might as well keep going to the right, but $x_2$ could be anywhere:
      Again nothing tells us where a maximum is. We know $x_0$ and $x_1$ can't be global maximizers because $f(x_0)$ and $f(x_1)$ are less than $f(x_2)$. That's about it.
    4. One more try …
      Now something has happened! The function has turned around: $f(x_3)$ is less than $f(x_2)$. What defines a turnaround? It is a function evaluation, in this case $f(x_2)$, that is greater than the function values on either side of it, $f(x_1)$ and $f(x_3)$: $$\eqalign{ f(x_2)\ \gt\ f(x_1) &\qquad\hbox{and}\qquad f(x_2)\ \gt\ f(x_3),\cr x_1\quad &\lt \quad x_2 \quad \lt\quad x_3.\cr}\nonumber $$ Three points that satisfy these conditions are said to bracket a maximum: if $f(x)$ is continuous, it must attain a highest point somewhere between $x_1$ and $x_3$.
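      For a concrete illustration (a hypothetical quadratic, not from the notes): take $f(x) = -(x-2)^2$ with $x_1 = 1$, $x_2 = 1.5$, $x_3 = 4$. Then $$f(x_2) = -0.25\ \gt\ f(x_1) = -1 \qquad\hbox{and}\qquad f(x_2) = -0.25\ \gt\ f(x_3) = -4,\nonumber$$ so these three points bracket the maximum at $x = 2$.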
    Note that the bracketing process can go on indefinitely. We cannot stop evaluating the function at new points until the bracketing condition holds, and unless we know more about the function we can only keep trying points until it does. However, once we have captured (bracketed) one or more maxima, it is easy to find one of them. A sketch of the bracket phase follows.
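    The sketch below is one minimal way to code the bracket phase in Python. The function name bracket_max, the initial step size, the growth factor, and the check on only the last three points (rather than keeping $x^L = x_0$ as in Algorithm 14 below) are all illustrative assumptions, not part of the notes.

      def bracket_max(f, x0, step=1.0, grow=2.0, max_iter=50):
          """Step to the right of x0 until three consecutive points x_lo < x_mid < x_hi
          satisfy f(x_mid) > f(x_lo) and f(x_mid) > f(x_hi), i.e. they bracket a maximum."""
          xs = [x0, x0 + step]              # the first two trial points
          fs = [f(xs[0]), f(xs[1])]
          for _ in range(max_iter):
              step *= grow                  # take ever larger steps to the right
              xs.append(xs[-1] + step)
              fs.append(f(xs[-1]))
              # bracketing condition: the middle of the last three points is the highest
              if fs[-2] > fs[-3] and fs[-2] > fs[-1]:
                  return xs[-3], xs[-2], xs[-1]
          raise RuntimeError("no bracket found; f may keep increasing to the right")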
  2. Bisect
    1. We start the bisect phase of line optimization where bracketing finished. Now we fill in the objective function to make the picture complete:
    2. Instead of stretching out to find a bracket we now stay between $x_1$ and $x_3$. The simplest choice is to compute the value of the function at the mid-point, $(x_1+x_3)/2$, hence bisecting.
      It turns out that for this function the value at the midpoint is greater than $f(x_2)$. We now have a new bracket formed by $x_2$ and $x_3$ with the new point in between, so we move the left end of the bracket up from $x_1$ to $x_2$.
    3. With our new bracket we then bisect again by computing the objective at the new midpoint $(x_2+x_3)/2$.
      It turns out that the value at this new point is less than the value at the current middle point, which means a new bracket is created by moving the right end from $x_3$ to the new point. The bracket is now roughly $1/4$ the size of the original one, and we still have the actual maximum captured between the endpoints.
    4. Unlike the bracketing process, the bisecting stage will end: just keep bisecting and moving one end or the other so that a bracket remains. Eventually the two endpoints will be extremely close together, with a maximum somewhere in between; a sketch of one bisection step follows. Note that if we happened to bracket two local maxima, this process will converge to one or the other, and there is not even a guarantee that it converges to the larger of the two.
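    Here is a minimal sketch of one bisection step, continuing the Python illustration above. It assumes the three points passed in already bracket a maximum; the function name bisect_step and the nudge for the degenerate case where the midpoint coincides with the middle point are illustrative assumptions. For clarity it re-evaluates $f$ at points already visited; a careful implementation would cache those values.

      def bisect_step(f, x_lo, x_mid, x_hi):
          """One bisect step: evaluate f at the midpoint of [x_lo, x_hi] and keep
          whichever three of the four points still bracket a maximum."""
          x_new = 0.5 * (x_lo + x_hi)
          if x_new == x_mid:                # degenerate case: nudge off the middle point
              x_new = 0.5 * (x_mid + x_hi)
          a, b = sorted((x_mid, x_new))     # the two interior points, in order
          if f(b) >= f(a):
              return a, b, x_hi             # raise the left end of the bracket
          else:
              return x_lo, a, b             # lower the right end of the bracket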

    Algorithm 14. Bracket and Bisect Line Maximization

    Given $f:\Re\to\Re$, $x_0$ and tolerance $\epsilon$.
    1. Initialize. Calculate $f(\,x_0\,)$. Pick a point $x_1\gt x_0$ and evaluate $f(\,x_1\,)$.
    2. ↻ Bracket. Evaluate $x_2 \lt x_3 \lt \dots \lt x_B$ until a local max must lie in $[x_0,x_B]$; that is, $${\max}_{0\lt b \lt B} f(\,x_b\,)\ge \max\{\ f(\,x_0\,)\,,\,f(\,x_B\,)\ \}.\nonumber$$ Set $x^L = x_0$, $x^m = {\arg\max}_{0\lt b\lt B}\, f(\,x_b\,)$ and $x^U=x_B$.
    3. → Bisect. Try $x^{\,\prime} = (x^L+x^U)/2$, so that $x^L \lt \min\{x^{\,\prime},x^m\} \lt {\max}\{x^{\,\prime},x^m\} \lt x^U$. Either set $x^L=\min\{x^{\,\prime},x^m\}$ and $x^m={\max}\{x^{\,\prime},x^m\}$, or set $x^U={\max}\{x^{\,\prime},x^m\}$ and $x^m=\min\{x^{\,\prime},x^m\}$, whichever choice keeps the larger function value at the middle point and so ensures the bracket still contains the optimum.
    4. If $x^U \mathop{=_\epsilon} x^L$, set $x^\star = (x^L+x^U)/2$ and stop.
      Otherwise, return to → (step 3).
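    Putting the two phases together, a driver for Algorithm 14 might look like the following sketch. It reuses the hypothetical bracket_max and bisect_step functions above; the tolerance eps plays the role of $\epsilon$.

      def line_maximize(f, x0, eps=1e-8):
          """Bracket a maximum starting from x0, then bisect until the bracket is
          narrower than eps; return the bracket midpoint as the estimate x_star."""
          x_lo, x_mid, x_hi = bracket_max(f, x0)
          while x_hi - x_lo > eps:
              x_lo, x_mid, x_hi = bisect_step(f, x_lo, x_mid, x_hi)
          return 0.5 * (x_lo + x_hi)

      # Example: the maximum of f(x) = -(x - 3)^2 is at x = 3.
      print(line_maximize(lambda x: -(x - 3) ** 2, x0=0.0))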

Exercises