[ Search |  Up Level |  Project home |  Index |  Class hierarchy ]

 FiveO.ox

A Guide to FiveO: Object Oriented Objective Optimization in Ox.

    CONTENTS
  1. Overview
  2. Notation
  3. Formal Definition
  4. Set up, initialize, solve
  5. Checkpointing
  6. Comparison with Standard Ox Components
    1. If you use maximize already
    2. If you use Modelbase already

  1. Overview
  2. Note: If you want to follow a simple example go to Get Started with FiveO

    FiveO is designed to implement two closely related mathematical programs, both of which involve choosing a N x 1 vector, denoted \(\psi\). Either

    Maximize a real-valued function \(f(\psi)\), so in this case \(f:\Re^N\to\Re\).
    or
    Find a root of the non-linear system \(f(\psi)\), so \(f:\Re^N\to\Re^M\), \(N\ge M\).
    The objective/system is coded by the user as a object derived from the Objective class. If \(f\) is an objective then the objective is by default a sum of ƒ.M‾ component functions: $$f(\psi)= \sum_{i=0,\cdots,M-1} f_i(\psi),$$ where M-1 is shorthand for M-1. This means the user can code, say, the sample log-likelihood vector or moment conditions vector. FiveO adds up the vector and in turn can compute the so-called gradient matrix while implementing the BHHH algorithm. If \(f(\psi)\) is a system, then notice that it does not look square: \(N\) parameters map to \(M\) equations. However, this is because \(\psi\) may, from the user's point of view, contain parameters that are not varying while finding the solution. It must be the case that the variable components of \(\psi\) must number M.

    The user can give the problem/system to be solved a name that means something, but in these notes we will call your problem MyObjective. In FiveO, MyObjective is an Ox class derived from one of the built-in classes derived from the base Objective. The objective ƒ() is the column sum of MyObjective::vfunc(), a user-provided function (method). The system would be the column itself, not the sum.

    Unlike non-OOP approaches, the parameter vector \(\psi\) is not represented (simply) as a vector of real numbers. Instead, individual parameters are members of MyObjective and are objects derived from the Parameter class. So \(\psi\) is internally stored as a list (an OxArray) of parameters, but in MyObjective each parameter is simply a variable, like x or y. Parameters can be constrained and related to each other. A ParameterBlock will hold a vector of parameters to be treated together, like the coefficient vector β in a regression.

    Optimization/root-finding is carried out by invoking one of several standard algorithms which are classes derived from the base Algorithm class. The algorithm operating on MyModel is denoted Alg (to avoid repeatedly writing the algorithm applied to the objective/system ...). Multiple objectives and nested objectives can be handled by creating new objects (new instances of MyObject or different classes). This is a major distinction between FiveO and DDP, which is designed to hold just one DP model at a time. FiveO is designed to optimize over a handful, dozens, or a few hundred parameters, and problems of this scale do not tax the memory capacity of computers. Thus, almost all members of Objective are automatic, specific to the new object. (On the other hand, DDP is designed to represent very large state spaces so as many members of that class are declared static as possible.

  3. Notation
  4. Individual parameters are lower case Roman (x, y, etc.) or lower case Roman with subscripts (x0, x1, etc.). For this guide, generic parameter will be denoted x and when ordering matters xi. Parameters are not just numbers. They are objects of a class derived from Parameter.

    Parameters, as objects, have several properties associated with them. The initial value of x, before Alg begins, is denoted x.0. Thus the first element of \(\psi\) is initially set to x0.0. The current value of x at some point in the algorithms operation is x.v. The open range of feasible values for x, for a parameter that an algorithm can vary, is an open interval denoted x.I, where x.I ⊆ ℜ. For example, if x is a Free parameter it can take on values in the range x.I = (-∞,+∞). A Positive parameter has interval (0,∞). A key feature of FiveO is that the bounds of x.I can be dynamic: they can depend on the current values of other elements of \(\psi\). This dependence must be forward: the range of a parameter can only depend on elements that come before it in \(\psi\). That is, for i > 0, the interval can be written explicitly as xi.I(x0.v, …, xi‾.v). Using x0.v emphasizes that the interval can depend on the current values of preceding values, and then xi.v ∈ xi.I. For standard algorithms to work, the intervals must always be smooth functions of their arguments.

    A parameter of class Determined will not be under the control of the algorithm. That is, a determined parameter is a point, or closed interval of the form x.I = [d,d].. The value d is determined by something else, usually a fixed constant or a function of other parameters in \(\psi\). For example, a determined parameter could be set equal to the average of two other parameters, z = (x+y)/2. FiveO would ensure that restriction on z holds. The algorithm knows only about x and y. Two or more parameters can form a ParameterBlock, denoted as upper case Roman (X, Y, etc.). This allows MyModel to refer to a single variable which is a vector not a scalar. Parameter blocks are, in effect, sub-vectors of the overall parameter vector.

    The parameter vector, denoted \(\psi\), generically takes the form $$\psi = \pmatrix{x_0& x_1 &\cdots& x_{N-1}},$$ where N‾ is shorthand for N-1. \(\psi\) is built up by adding parameters and parameter blocks to it. The number of dimensions, N, is also written \(\psi\).N. The parameter space, Ψ, is the Cartesian product of the intervals for the elements of \(\psi\), $$\Psi \equiv \times_{n=0}^{N-1}\psi_n.I.$$ MyObject may tell Alg not to vary some parameters in \(\psi\) directly. A Determined parameter does not have an interval, it has a point value, x.I ∈ ℜ. As with feasible intervals, x.I can be a function of the current value of preceding parameters. The feasible parameter space can only be defined implicitly: $$\Psi \equiv \{\psi = (x_0 \cdots x_{N-1}) : x_0 \in x_0.I, x_1\in x_1.I(x_0), \dots ,x_{N-1}\in x_{N-1}.I(x_0,\dots,x_{N-2}) \}.$$ By using predefined parameter classes and parameter blocks this complex and very flexible parameter space emerges naturally from the parameters added to MyModel.

    Alg, as a sequential algorithm, can be thought of as a function \(S: \Re^N\rightarrow \Re^N.\) S maps the starting vector \(\psi\).0, to a final value, \(\psi^S \approx \psi^\star\).

  5. Formal Definition
  6. The overall optimization problem for is defined as trying to find for a given \(f : \Re^N \rightarrow \Re\) the optimizing vector $$\psi^\star = \arg\max_{\psi \in \Psi}\quad f(\psi).$$ The output of the algorithm is an approximation of that problem: \(S(\psi.O) \equiv \psi^S \approx \psi^\star.\)

    The overall problem for system solving is defined as trying to find for a give \(f: \Re^N \rightarrow \Re^M\), \(N \ge M\) such that $$\psi^\star \in \Psi \& f(\psi^\star) = \overrightarrow{0}$$ and \(S(\psi.O) \equiv \psi^S \approx \psi^\star\). Here \(\overrightarrow{0}\) is a vector of zeros of length M.

  7. Set up, initialize, solve
  8. The user code for MyObject should include these elements
    Set Up: code that is not executed (part of the compilation stage)
    Declare MyObject as derived from the Objective class
    Include data members for parameters and blocks that appear in \(\psi\).
    Declare and define vfunc
    as a method (member function) to represent \(f()\).
    Initialize: code that runs before optimization begins
    Assign new objects derived from Parameter to the data members for \(\psi\) elements
    Add parameters to the objective by sending them to Parameters().
    Set \(\psi\).0:
    Solve: call one of the optimization or solution methods of Objective to solve for \(\psi^\star\) sequentially.

  9. Checkpointing, Restarting, and Strategies
  10. FiveO is designed to save (checkpoint) a problem in two different senses. First, it will checkpoint the objective itself: the current best parameter vector as it is found. By default the information is stored in a file named L.optobj, where L is the string label assigned to the objective (L).

    MyObject can save the current vector (whether best or not) using Save(). The format of the file is not simply a vector of numbers. It is a summary of the problem and the current state of the solution process. The .optobj file is designed to be loaded back into FiveO using Load(). Load() will check some aspects of the file for consistency, read in starting values, \(\psi\).O and ignore the other information. The parameter vector in the file can be edited to reset the values. MyObject will always hard code starting values for parameters when they are created. The code can be written so that these hard values are only used if the person wants to complete restart. Otherwise, the starting vector, \(\psi\).O, will be loaded from a file:

     if (!Load(fn)) Encode(); 
    If fn=-1 then Load() will return FALSE and do nothing. Then Encode(0) will use the hard-coded values for \(\psi\).0. On the other hand, if fn = "hello", Load() will try to load values from hello.oxobj, returning TRUE if successful. Finally, if fn=0 it will do the same but using the default file name, L.optobj.

    Objective checkpointing means execution can be stopped without losing the progress made so far. The current best vector does not checkpoint the state of the sequential algorithm. The algorithms in FiveO checkpoint their current state so that the problem can restarted exactly where it stopped in the algorithm. For example, in BFGS the checkpoint reloads the current value of the Hessian. And in NelderMead the checkpoint reloads the current simplex. Reading a checkpoint file is done by sending the argument UseCheckPoint to the algorithm's Iterate().

    Example Sequence of Optimization Starts and Restarts
    The objective myobj contains a 4x1 vector of parameters, \(x\). The following segments of code explain how the user would change their code while changing the objective and restarting the optimization.
    1. First Time: use hard-coded starting values
    2. The user has coded everything and now wants to try maximizing the objective. The hard-coded starting values are as good place as any to start:
          x = new Coefficients("x",zeros(4,1));
          myobj->Parameters(x);
          bfgs = new BFGS(myobj);
          bfgs -> Iterate();
      
    3. Load current vector and start from there
    4. Based on the initial attempt the user has made some changes to the objective. One reason might be that the starting value of 0 for \(x_2\) was way off and ended up at 15000. The other parameters are all below 10 in absolute value. SO the code was changed so that \(x_2\) in units of a thousand. This required a manual change in the .optobj file to start at 15 instead of 15000. Other changes made to the objective will shift the optimum in other ways as well. The algorithm should restart with the initial Hessian because changes to the objective mean the old one is not appropriate given the rescaling.
          x = new Coefficients("x",zeros(4,1));
          myobj->Parameters(x);
          myobj->Load();          // new, read from the default .optobj file
          bfgs = new BFGS(myobj);
          bfgs -> Iterate();
      
      The Load() could specify a different file to read from, but in this case the file that the previous attempt saved is used.

    5. Third Time: restart after an interruption
    6. The user is running this code on a system that limits run time and the second time ran out of time before convergence. Now the user wants to restart from the last iteration because nothing has changed. The current Hessian is appropriate and simply restarting at the current parameter vector will lose the information built up during the previous iterations.
          x = new Coefficients("x",zeros(4,1));
          myobj->Parameters(x);
          myobj->Load();
          bfgs = new BFGS(myobj);
          bfgs -> Iterate(UseCheckPoint);   //restart the algorithm where it left off
      
      The Load() still happens but reading the chkpt file wipes out those values. They would be the same.

  11. If you use maximize already.
  12. Differences between maximize and FiveO At a Glance.

    ItemIn Ox::maximizeIn FiveO
    The objective
    ƒ() : ℜN→ℜ
    MyObj(x,v,G,H){ ··· } 
    Coded as a standalone function with a particular declaration
    class MyObj : BlackBox {
            ···
            vfunc();
            }
        MyObj::vfunc() {···}
    Coded as a method with a particular name of a class derived from Objective.
    The parameter vector $$\psi \equiv \pmatrix{x_0& x_1 &\dots &x_{N-1}}$$ A single parameter x

    Initial value x0.

    Current value: x

     
        psi = new matrix(N,1);  
    A position in psi

    Starting values sent to a function.

    Extracted by the user: decl x = psi[i];.

        x = new Parameter("x",0.0);
        Parameters(x);
        ⋮
        CV(x);
    An instance of a class derived from Parameter added to the model
    Initial value set at creation or manually.
    CV(x) or x.v
    Related Parameters
    (e.g. Coefficients β)
    beta = x[2:8];
    yhat = x*beta;
    extracted from the parameter vector by the user
    beta = new Coefficients("beta",7,0);
    ⋮
    yhat = x*CV(beta);
    Accessed as a separate vector without extraction.. Value beta.v or CV(q)
    Open interval constraints
    For example, 0< q < 1
    prob = exp(x[0])/(1+exp(x[0]);
    User transforms parameters to keep them feasible
    prob = new Probability("p",0.3);
    Predefined classes such as Positive and Probability.
    User can create their own derived parameter classes.
    Restrictions Across Parameters
    p = x[0:4] | 1-sumc(x[0:4]);
    Up to the user to impose and track
     p = new Simplex("p",6);
    Dynamic interval bounds and pre-defined blocks.
    Algorithms
    MaxBFGS(f,…)
    Standalone functions.
    obj -> Quasi(BFGS,0,0);
    Method applied to the object.
    Objectives that are not a BlackBox
    \(f = \sum f_i(\psi_i)\)
    Up to the user to programEfficient and simple handling of Separable and Mixture objectives.
    Long execution timesCurrent best vector not saved automatically.Automatic Save of current best, which can be Loaded to restart.
    Integrated with CFMPI for parallel execution.

  13. If you use Modelbase already.
  14. The routines in maximize are low level ones that leave a great deal of work for the user. However, Ox provides a class that provides a higher level function, Modelbase. It would often be a better and more convenient way to develop an econometric application than a direct use of maximize . And its approach can be better than FiveO for coding econometric models to account for variable selection, parameter definitions that are primarily coefficients on variables.

    FiveO is not concerned with variables, observations and coefficients. Rather, it is concerned with parameterizing a model that involves optimization in a much more general sense than something related to linear econometric models. The fact that the objective may be related to data as well is not primary to FiveO.

    In Modelbase aspects of parameters are stored separately, at least from the user's perspective. Parameters can be fixed or free, but other kinds of constraints and relationships between parameters are left to the user. A key difference with FiveO is that parameters are represented as objects derived from built-in parameter types. Parameters are created using a class that captures the constraints on the parameter and its relation to other parameters. For example, with one line of code a user can add a vector of parameters that are guaranteed to be Increasing during optimization: x1 >x2 > ··· > xM.

Author:
© 2011-2023 Christopher Ferrall

 Global functions

Functions
 Explore Take a random walk in the parameter space of a model.

 MultiNomialChoice

Public methods
 Estimate

 Global

 Explore

Explore ( model , Ncalls , Chol , ... )
Take a random walk in the parameter space of a model.
Parameters:
model Object that has a Solve() method.
Ncalls integer, number of calls, default=0, no end to calls
Chol Cholesky argument to send to Iterate(). Set to 0 to use default identity matrix.
... Parameters or arrays of Parameters to wander over.

This routine creates a NoObjective objective, which calls method->Solve(). It creates a RandomSearch algorithm and then iterates on it. These objects are deleted if/when the number of calls reaches Ncalls.

 MultiNomialChoice

 Estimate

MultiNomialChoice :: Estimate ( )