This is useful for someone who already knows a great deal about dynamic programming and has some familiarity with Ox or similar languages. If you want a much more basic introduction, start back at the beginner's guide. You can also start with GetStarted for a demonstration of coding and return here.
MyModel
The discount factor: \(\delta\)
Single action variable: \(a\)
Vector of actions: \(\alpha = (a_0,a_1,\dots)\)
Number of values: \(a\).N
Size of a vector: \(\alpha\).N
Distinct vectors: \(\alpha\).D
Endogenous state space: \(\Theta\)
Utility: \(U()\)
Notation refers either to objects (i.e. the action variable a) or to properties of an object (i.e. its current value a.v). A generic element of a vector will usually use the Roman letter corresponding to the vector's Greek name and without subscript. A subscript is used when ordering is important. Properties are written with a dot (e.g. a.N): a.N is the property N associated with the object a. The binary . operator is how properties (members) are accessed in Ox: o.p retrieves from the object o the aspect or property p. A discrete variable n with n.N values would have the range n = 0, … , (n.N)‾, where N‾ denotes N-1. For example, two binary action variables generate α.D = 4 distinct action vectors:

0 0
1 0
0 1
1 1
In Ox, a class and a struct are both a class as usually defined. The only difference between a struct and a class is whether elements of an object are by default directly accessible from the outside (i.e. public) or not (private): yes in a struct, no in a class. DDP is designed for convenience not reliability, so every class is declared struct, but the term class is used in this documentation.

A class is a bundle of data and functions to operate on the data. The data are called members of the class and the functions are called methods, although Ox documentation also refers to these as data members and function members, respectively. Multiple copies of a class can be created while a program runs. Each copy is called an object or instance of the class. The key is that the methods work with the data of the object without needing to pass the data to them, as with non-OOP languages or constructs.
Members and methods of a class are either static or automatic. This distinction is extremely important in the design of DDP. Static members/methods are shared by all objects of a class, whereas automatic members/methods are specific to the instance. DDP conserves memory by storing as much information in static variables as possible. The word automatic never appears in code; it is implicit: if the tag static does not appear in the declaration of an item then it is automatic.
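A minimal Ox sketch of the distinction (Example, shared and own are hypothetical names, not part of DDP):

struct Example {
    static decl shared;   // static: one copy shared by all objects of the class
    decl own;             // automatic: each object created gets its own copy
    }

Every object created by new Example() sees the same shared but carries a distinct copy of own.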
MyModel and MyCode are metasyntactic variables like foo. A user builds a DDP model by adding components to it: states and actions and the functions related to them. DDP cannot know how many items will be added of each type. How can the model be ready to store whatever the user chooses? The answer: the user of DDP constructs a model as a class derived from one of the built-in models, called a DDP for Derived Dynamic Program. The user's class inherits the built-in properties of its base model. So DDP can solve the model and produce output for it even though it does not know the details of the user's model until the program starts executing.
DDP models are derived from the Bellman class, which in turn is derived from the base DP class. A user who wants to call their model MyModel would have something like this:

class MyModel : DPparent { ⋮ }

For example:

class Make : ExtremeValue { ⋮ }

Here MyModel is specifically Make and DPparent is specifically ExtremeValue.
Which Bellman-derived class MyModel is based on is called DPparent in these notes. MyModel:: is prefixed to items that the user provides or customizes. DPparent:: is prefixed to items related to the parent. Other predefined or default items either have no prefix or are prefixed by DP::. The convention of using MyModel avoids having to write repeatedly "the user's version of DP::x"; instead, MyModel::x suffices.
The user writes MyModel in Ox, and they must write an Ox program that executes some tasks in the proper order. For example, the code must always call the Initialize() method for DPparent before adding things to the model. The Ox code that executes these tasks is collectively called MyCode in these notes. Another way to think of it: MyModel is a translation of the pen-and-paper aspects of your model, and MyCode is the set of instructions to implement the model, solve it and use it.
A vector x of discrete variables has a length x.N, but it creates a space of possible values (the Cartesian product) whose size equals the product of the individual variable cardinalities. So x.D is the size of the Cartesian space of a vector x:
$$x.D \equiv {\prod}_{i= 0}^{x.N-1} x_i.N.$$
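For instance, if x = (x_0, x_1) with x_0.N = 2 and x_1.N = 3, then x.D = 2 × 3 = 6.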
A space in DDP, say \(\Theta\), is usually a subset of the space of possible vectors. The number of points in a space is then \(\Theta\).D.

a.v is the current value of a, which can only be one of the values 0 … (a.N)‾. MyCode does not need to reference .v directly; this is explained in detail below.
MyModel may need discrete values to correspond to another set of values (possibly not even integer values). Which values are mapped to may depend on parameters that change between solutions of the model. If MyModel creates an action or state variable x from a derived class, then it can also provide an Update() routine to reset and store the vector of actual values, x.actual, one for each of the values 0 … (x.N)‾ in the range of x.v. MyCode can get the current value of an object simply using x.v (and the actual value as x.actual[x.v]), but this is not recommended. MyCode should never modify these values, because DDP ensures they have the correct values at all times.
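As a hedged sketch of the idea: a derived action variable supplying its own Update(). The class Hours, its member hmax, and the use of the inherited members vals (assumed here to be the row vector of values 0 … N‾) and actual are illustrative assumptions, not part of DDP:

struct Hours : ActionVariable {
    static decl hmax;      // hypothetical parameter; may change between model solutions
    Update();
    }
Hours::Update() {
    // map the internal values 0 … N‾ into equally spaced hours on [0,hmax]
    actual = hmax*vals'/(N-1);
    }

(A constructor passing the label and N up to ActionVariable would also be needed; it is omitted to keep the sketch short.)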
One reason to avoid direct reference to .v is that MyCode will change as it develops. For example, early on some quantity x may not be an action or state variable but simply a fixed number, so MyModel can access its current value as simply x. However, as the model takes shape, x may be changed into a variable. But now x is a complicated object, not a number, and the user would have to go through MyCode changing x to x.v. It is also possible to want to change a number into a function x() that computes and returns a value.
CV()

CV(x) is a routine in niqlow that you can send almost anything to, and it will return the value of the object. It examines the argument x, and if it is a Quantity object with an element v, then CV(x) will return it. That is, CV(x) = x.v. If you send a plain number (an integer or double in Ox) as x, then CV(x) simply returns the value. If you send a function to CV(), then it will call the function and return its value. Thus, you do not need to change your code as some concept changes during programming if you use CV(x).

AV()

A discrete quantity x takes on the values 0 … (x.N)‾. However, in the model each of those discrete values may map into different values. The property actual contains these model-relevant values. AV(x) acts like CV() but it returns x.actual[x.v]. AV(x) ≡ CV(x) unless the object has different actual values. This is handled by calling x->myAV() if it exists. Discrete quantities have different myAV() functions, but the default is that myAV() = actual[v]. That is, the current value of the object is used as an index into a vector of actual values. The actual vector is reset whenever Update() is called. Variables are updated before value iteration methods begin, so the actual values can depend on estimated parameters. x->Update() is only called once for each variable on each model solution to reset .actual. If .actual were not a vector, x->Update() would have to be called every time x.v changed.
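A hedged sketch of the payoff from these routines: the same call works as a concept evolves from a number to a variable (the label "hrs" and the surrounding code are illustrative):

decl x = 40;                     // early in development: just a number
println( CV(x) );                // prints 40: plain values pass through CV()

x = new ActionVariable("hrs",3); // later: x becomes a variable
// CV(x) now returns x's current value (for action variables, a column of
// the feasible matrix, as explained below); AV(x) returns the actual values,
// identical to CV(x) unless an Update() routine installed a different mapping.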
The user's program (MyCode) will build the model up dynamically as the code executes, adding elements to MyModel that categorize them and specialize the environment. Then, when all the elements of the model have been defined, the code will call DPparent::CreateSpaces(), which will construct the action set and state space. After this, MyCode can solve the model and use it with tools described elsewhere.
class MyModel : DPparent {
    // declare static members to hold action and state variable objects
    // declare required and optional methods
    static Initialize();
    }
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    // define actions and states (create objects)
    // add them to the model
    DPparent::CreateSpaces(…);
    ⋮
    }
Note the calls to DPparent::Initialize() and DPparent::CreateSpaces(). It is not required that MyModel provide an Initialize() function. However, it is convenient to do this so that all the model creation steps occur together within a (static) method, and that method has direct access to elements of MyModel and, through inheritance, to all the methods and data members of the parent, all within the Initialize() method. What this means and why it is done is not easy to explain at this point; see below. The … inside () means that some other required arguments need to be sent or optional arguments can be sent, depending on the parent of MyModel.
MyCode builds the action vector \(\alpha\) by adding action variables to it using Actions(). At a minimum, an ActionVariable is defined by its Label and the number of distinct values it takes on, N. DDP tracks the values of a as 0 … N‾.
class MyModel : DPparent {
    ⋮
    static decl d;                        // NEW
    ⋮
    static Initialize();
    }
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    ⋮
    d = new ActionVariable("choice",2);   // NEW
    Actions(d);                           // NEW
    ⋮
    CreateSpaces();
    }
As with \(\alpha\), the state of the DP model \(\theta\) is built up by adding state variables to it. In the basic notation above, \(\theta\) is simply a point in a set, but in DDP it will be a vector of individual state variables. Unlike action variables, state variables evolve, and how they evolve affects what needs to be stored and computed for them. The transition \(P(\theta^\prime; \alpha,\theta)\) emerges from the individual transitions of the state variables added to the state. In DDP, state variables are classified as either autonomous or coevolving, depending on how they enter the state transition Ρ(). MyCode builds the state space by adding state variables and state blocks to MyModel.
If a state variable s is autonomous, then its transition is independent of all other transitions, and the transitions of all other variables are independent of s. This means the s transition enters the overall Ρ() independently. The transition for s can still depend on the current action and current state. The transition is specified by making the state variable an instance (object) of one of the built-in autonomous state variables and adding it to MyModel.

class MyModel : DPparent {
    ⋮
    static decl d;
    static decl m;                        //NEW
    ⋮
    static Initialize();
    }
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    ⋮
    d = new ActionVariable("choice",2);
    Actions(d);
    m = new LaggedAction("prevd",d);      //NEW
    EndogenousStates(m);                  //NEW
    ⋮
    CreateSpaces();
    }
Ρ() is generated automatically by the state variables and blocks added to the state vectors. MyModel always includes a time-keeping state block, the clock, which keeps track of today, the current state \(\theta\), and of tomorrow, the next state \(\theta^{\prime}\).
class MyModel : DPparent {
    ⋮
    static decl d;
    static decl m;
    ⋮
    static Initialize();
    }
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    SetClock(NormalAging,40);             //NEW
    ⋮
    d = new ActionVariable("choice",2);
    Actions(d);
    m = new LaggedAction("prevd",d);
    EndogenousStates(m);
    ⋮
    CreateSpaces();
    }
MyModel::Utility() should return utility as a vector over the feasible action matrix at the current state.

MyModel must provide a method named Utility(), because it replaces a virtual method, DP::U. It returns the utility as a vector: one element for each feasible action (each row of θ.A). You might expect MyModel::Utility() would require arguments to pass the values of state variables. However, using the object-oriented approach to representing the model means that the values are available, because DDP will set the values of members of MyModel before calling U().

class MyModel : DPparent {
    ⋮
    static decl d;
    static decl m;
    Utility();                            //NEW
    ⋮
    static Initialize();
    }
⋮
MyModel::Utility() {                      //NEW
    return CV(d) .== CV(m);               //NEW
    }                                     //NEW
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    SetClock(NormalAging,40);
    ⋮
    d = new ActionVariable("choice",2);
    Actions(d);
    m = new LaggedAction("prevd",d);
    EndogenousStates(m);
    ⋮
    CreateSpaces();
    }
MyModel should set the discount factor using SetDelta(). The default value is DP::delta = 0.95. MyModel::δ is either a fixed real value or a Parameter, which allows it to depend on outside variables and/or to be estimated within a nested solution algorithm. SetDelta() can be used to set the value, passing either a real number or a Parameter. Unlike some other elements of the model, the discount factor can be set or changed after CreateSpaces() has been called.

class MyModel : DPparent {
    ⋮
    static decl d;
    static decl m;
    Utility();
    ⋮
    static Initialize();
    }
⋮
MyModel::Utility() {
    return CV(d) .== CV(m);
    }
⋮
MyModel::Initialize() {
    DPparent::Initialize(new MyModel(),…);
    SetClock(NormalAging,40);
    ⋮
    d = new ActionVariable("choice",2);
    Actions(d);
    m = new LaggedAction("prevd",d);
    EndogenousStates(m);
    ⋮
    CreateSpaces();
    SetDelta(0.99);                       // NEW
    ⋮
    }
MyModel sorts state variables by their role in the transition Ρ(). Segregating state variables into different vectors can reduce memory and computing, since only the information required for restricted state variables is stored. These distinctions matter for how DDP solves MyModel, but from the point of view of MyModel a state variable is just a state variable, regardless of which category it is placed in. A generic state variable that is not associated with a particular vector is denoted s.
Any state vector can be left empty by MyModel except \(\theta\), which always has a Clock. If no state variable is added to a vector by MyModel, then DDP places in it a special Fixed state variable that takes on only the value 0. This has no effect on the size of the state space, but it greatly simplifies the internal coding of algorithms. Any state variable that MyCode places in the other state vectors could be in \(\theta\) (but not vice versa). State variables are added to \(\theta\) by sending them to the function EndogenousStates(). Like any DDP model, \(\theta\) is a semi-Markov process in which the transition to \(\theta^{\prime}\) depends potentially on all current state variables and the action \(\alpha\). MyModel can treat elements of \(\gamma\) like other states. Continuous states are placed in \(\zeta\) and can only affect U(), not \(P()\).
MyModel can account for limits on choice conditional on the endogenous state. The possible action space, Α, was defined above as the Cartesian product of the ranges of all action variables. However, in many cases MyModel may rule out certain actions as not logically possible at a particular state. Or some actions are ruled infeasible for convenience, to avoid calculations that are relatively unimportant to the overall goal of the model.
Feasibility is a property MyModel imposes on \(\alpha\). The model rules out some actions given the interpretation of \(\alpha\). In dynamic programming, the set of feasible actions can depend on the current state, \(\theta\). In typical math notation it would be natural to write this as \(A(\theta)\), where \(A()\) is now a matrix-valued function of the state. Instead, write feasibility as a property of the state. The feasible actions at \(\theta\) form a matrix property: \(\forall \theta \in \Theta\), \(\theta.A \subseteq A\). DDP does not allow exogenous states to affect the choice set. So MyModel must assign a variable that affects feasible actions to \(\theta\), even if its transition would otherwise qualify for exogenous or semi-exogenous status.
A different way to handle infeasible choices is to have MyModel::U() return numeric -∞ as the utility for any infeasible \(\alpha\). This option is always open for use in MyModel, but it does not reduce the size of the static optimization problem and is not as close to the standard notation.
By default all possible actions are feasible at all \(\theta\), because the built-in FeasibleActions() specifies this. If MyModel does not say otherwise, \(\theta.A \equiv A\) for all endogenous states. MyModel can restrict choices by providing a replacement for the virtual method Bellman::FeasibleActions(). MyModel::FeasibleActions returns a column vector indicating whether each action vector is feasible or not:

FeasibleActions() returns a vector of length A.D containing I{A.i ∈ θ.A}, i = 0 … (A.D)‾.
MyModel can define feasibility without knowing everything about the model. Indeed, another user may derive their model from yours, adding choice variables that you did not anticipate. Even so, your feasibility conditions can still be imposed regardless of the presence of other columns and rows of A.
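A hedged sketch of such a replacement, using d and m from the running example and a purely hypothetical restriction (d can equal 1 only when the previous choice m was 0):

MyModel::FeasibleActions() {
    // one element per row of A: 1 if the action vector is feasible, 0 otherwise
    return (CV(d) .== 0) .|| (CV(m) == 0);
    }

Because CV(d) returns the column of d's values over the rows of A, the dot-operators produce the required indicator column; CV(m) is a scalar at the current \(\theta\) and broadcasts. The method must also be declared inside the class (as FeasibleActions();), like Utility() above.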
Typically there is a small number of different feasible sets relative to the size of the state space. In this case, storing a matrix at each \(\theta\) is wasteful, so DDP stores a list (an Ox array) of the different feasible sets. Rather than storing \(\theta\).A, it stores only an index \(\theta\).j into the list of feasible sets. In DDP, the list of feasible matrices is simply A, and the index \(\theta\).j into the list at a state is Aind. MyModel accesses the current feasible matrix as Alpha::A[Aind]. The first matrix, Alpha::A[0], is always the possible matrix A. If MyModel does not specify feasible actions, then Aind = 0.
Note: the elements of the A list hold the actual values of actions and are updated at the start of each value solve. If an action variable does not have its own Update() routine defined, then the actual values are simply the default range 0 … (a.N)‾.
In DDP the key functions, U() and Ρ(), act on a single point in the state space at a time, so the current value of a state variable is placed in the .v property of the Ox variable representing it. On the other hand, both U() and Ρ() are 'vectorized' in actions: they must operate on the whole feasible matrix at once. Action variables have the .v property, but it is not used for them. Their current values are in a column of the \(\theta\).A matrix, which is A[Aind] in the code. The previously described functions \(CV()\) and \(AV()\) work with action variables as well, returning the column of the current/actual action matrix at \(\theta\).
For example, consider a model that has two choices: work hours and whether to volunteer or not. Then at some state \(\theta\).A may look like this, along with the return value of CV(work):

A[Aind]
 work vol | CV(work)
---------------------
   0   0  |    0
   1   0  |    1
   2   0  |    2
   0   1  |    0
   1   1  |    1
   2   1  |    2
MyModel can make values of a state variable terminal by calling MakeTerminal(). Some dynamic programs end if and when certain states are encountered. Such states are called terminal states. There are three features of a terminal state:

q.T is a subset of the possible values of q that terminate decision making.

MyModel makes values terminal by applying q->MakeTerminal() to it. Only endogenous state variables, those in \(\theta\), can have terminal values, since other state vectors are either IID or invariant.
The set of terminal states is defined as \(\overline{\Theta}\). A state is terminal if any of the endogenous state variables currently equal a terminal value.
$$\theta.T\quad =\quad I\{ \hbox{for some }k, \theta.v_k \in q_k.T \}.$$
The convention in DDP is that at a terminal state there is no choice, and MyModel
must provide a value for the state via U(). Because of this convention, MyModel::FeasibleActions()
is not called at a terminal state. Instead, for \(\theta \in \overline{\Theta}\), \(\theta\).A is automatically equal to the first row of \(A\).
Let \(\overline{V}(\theta)\), for \(\theta \in \overline{\Theta}\) be the exogenous value of arriving at a terminal state \(\theta\). MyModel::Utility()
returns this value. DDP sets it as \(V(\theta)\) directly.
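As a hedged sketch (the terminal rule itself is purely illustrative), the lagged choice m from the running example could be made terminal at the value 1 by adding one line to Initialize() before CreateSpaces():

m -> MakeTerminal(1);

Thereafter any \(\theta\) with m at that value belongs to \(\overline{\Theta}\): FeasibleActions() is not consulted there, and Utility() must return \(\overline{V}(\theta)\).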
Following the notion of possible versus feasible actions above, the Cartesian product of all possible values of the endogenous state variables is defined as the possible state space: $$ \Omega \quad\equiv\quad \prod_{q_k \in\theta} \bigl\{\, 0 \dots q_k.N-1 \,\bigr\}$$ The current value of a state is always equal to some row in \(\Omega\): \(\theta.v \in \Omega\). MyModel can trim the state space by providing a MyModel::Reachable() routine. It returns TRUE for states that can be reached from initial conditions and FALSE otherwise.
Just because a state is possible (\(\theta\in\Omega\)) does not mean the DP can ever get there. Of course, if the DP was solved and then started at exactly \(\theta\) it would be reached. But finite horizon dynamic programs must have initial conditions specified. And those initial conditions can mean that some states in \(\Omega\) will never be reached. And if they can't be reached then they do not need to be stored and included in the solutions to Bellman's Equation.
So in some models, especially those with a finite horizon, Ω contains many endogenous states that cannot be reached from possible initial states of the user's situation.
The term Reachable reflects this: as with feasibility of actions, reachability of a state is not a mechanical property. Rather, it depends on the model and how it will be used. Since, in DDP, an endogenous state \(\theta\) is not just a vector of numbers but rather an object with many properties attached to it, it is important for efficiency that DDP only create and process objects for reachable states.
The property \(\theta\).R equals 1 if MyModel
specifies that \(\theta\) is reachable. Otherwise \(\theta\).R = 0. The state space \(\Theta\) is the set of reachable states within the set of all possible states. It emerges from the property \(\theta\).R of each logically possible state:
$$\Theta\quad \equiv\quad \bigl\{ \theta \in \Omega\ :\ \theta.R = 1 \bigr\}$$
Storage Details
Note that R is a conceptual property only. During computation, objects are only created for states with \(\theta\).R = 1. So the actual test is whether a possible state contains an object or just the number 0 as a placeholder. Storing a 0 for possible but unreachable states still requires allocation of a single oxvalue, but it avoids storage of the vectors and matrices associated with each reachable state (such as the utility vector, the optimal choice probability matrix, etc.).
As shown above, MyModel must call DPparent::CreateSpaces(), which sets up the list of feasible action matrices and creates the state space \(\Theta\). It must traverse (loop over) the possible state space Ω at least once. It is during this traversal that it is determined whether each point \(\theta\) is reachable or not. CreateSpaces() uses two methods to determine reachability: inherent reachability of the included state variables, and the user-supplied routine that asserts reachability or not.
Some state variables generate inherently unreachable states. Above it was emphasized that reachability depends on the whole model, but including some kinds of state variables in your model will generate unreachable states in non-stationary environments (such as a finite decision horizon). As an example, a Counter state variable counts how many times an action or state has occurred in the past. If the counter starts at 0 in a finite horizon model, then the only reachable states are ones in which this state variable's current value is less than or equal to the value of the clock, t. This is true regardless of the other state variables added to the model, which actions are feasible, etc. It depends solely on the clock type, the initial conditions and the presence of this variable in the model.

So some trimming of the state space can be done automatically, based on the state variables in the endogenous vector and the model's Clock block. By design, state variable objects do not know directly about the model's clock, because all the discrete variable classes are defined and used by the DP class. So checking for reachability requires sending the clock to the state variable. The StateVariable class includes a virtual method named IsReachable(), which takes a single argument that will be the model's clock block. The default function returns TRUE. That is, by default, a state variable says no point in the state space is unreachable. However, the kinds of state variables that do generate unreachable states (such as a Counter) supply a replacement of IsReachable().
At each possible state in the space Ω, CreateSpaces() will call IsReachable() for each state variable in the endogenous state vector \(\theta\). If any of them returns FALSE, that point is marked as unreachable. (This can be turned off when creating the state variable using the optional Prune argument.) If no state variable claims the current state is inherently unreachable, then the user-supplied method is called, which can mark states unreachable for reasons not inherent to the clock and the state variables themselves.
This is the role of the user-supplied Reachable() method. MyCode must first call DPparent::Initialize(), then add variables to the model, then call CreateSpaces(). An object of MyModel is sent to DP::Initialize(), which will clone it for each reachable point in the state space. As with Bellman::FeasibleActions(), which must have that name because it is a virtual function, this function must have the name Reachable().
The base Bellman class has a method Reachable() which simply returns TRUE. So if MyModel does not provide its own version of Reachable(), all possible states are asserted as reachable. (However, remember that some state variables may have inherently unreachable states in finite horizon models, and these conditions will be checked regardless of whether MyModel provides its own Reachable().)
If MyModel provides a replacement for the virtual Reachable(), it must return TRUE or FALSE depending on whether the current values of the state variables form a reachable state. Inside CreateSpaces(), during the loop over the possible state space \(\Omega\), if no state variable asserts inherent unreachability of its current value, then MyModel::Reachable() is called. MyModel::Reachable() indicates \(\theta\) is reachable by returning TRUE. Otherwise it should return FALSE, to indicate that something about the current values of the endogenous state variables makes this point unreachable. So Reachable() returns 1 if \(\theta.R = 1\) and 0 if \(\theta.R = 0\).
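A hedged sketch using the running example: if the initial condition at t = 0 sets the lagged choice m to 0, then states at t = 0 with m = 1 can never occur. Assuming the current clock value is available as I::t (treat that as an assumption here), MyModel could assert this with:

MyModel::Reachable() {
    // at t=0 only the initial value m=0 can occur; at later t any value is possible
    return (I::t > 0) || (CV(m) == 0);
    }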
Recall that MyModel is a class derived from some DDP, denoted DPparent. A DDP is designed to represent both the overall model and an endogenous state \(\theta\). DDP creates a copy (an object) of MyModel for each reachable \(\theta\). It places them on a list named Theta (a static global variable).
To conserve memory, only a limited number of variables (aka data members) are specific to the objects for different \(\theta\)'s. These are what Ox calls automatic variables. Most data members of MyModel are static members, shared by all objects of type MyModel. These are properties of the overall model. To conserve space, the Ox variables (members) in MyModel that hold actions and states should be declared static (see the code above). Otherwise, if the variables are automatic, new storage for them is created at each point in \(\Theta\), even though DDP processes one \(\theta\) at a time. By storing elements as static and then updating their current value (the .v property), storage for large state spaces is reduced dramatically.
MyModel can require several solutions of a DP model that differ only by shifts in \(U()\) or \(P()\). Group variables are like random or fixed effects in econometrics: they are fixed and non-random from an agent's point of view, but from our point of view they vary across agents. Group variables are not involved in the creation of \(\Theta\), which is reused for the solution of the model for each \(\gamma\). Instead, the group space \(\Gamma\) is created from the Cartesian product of all possible values of the group variables.
$$\Gamma\quad\equiv\quad\prod_{k=0}^{\gamma.N-1}\ \bigl\{\,0\dots\, \gamma_k.N-1\,\bigr\}$$
Example

Let γ = (g,d), where g is gender, so g.N‾ = 1, and d is degree status, so d.N‾ = 2 (no high school degree, high school, some college). Then

Γ = {0,1}×{0,1,2} =
    g  d
    0  0
    1  0
    0  1
    1  1
    0  2
    1  2
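A hedged sketch of how MyCode might add these two group variables, assuming niqlow's FixedEffect class and GroupVariables() function (the labels and member names are illustrative):

g = new FixedEffect("gender",2);
deg = new FixedEffect("degree",3);
GroupVariables(g,deg);

DDP would then re-solve the model for each of the six points of Γ.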
A key reason the notation differs here is because it is used to describe a framework for designing a DDP and solving it efficiently. Most other notation is used to describe a specific model or to describe models generally without reference to restrictions that can be used for efficiency.
α → (a). [Here, vectorized actions do not retain dimensions of choice.]
P(α|θ) → P(a|x,θ). [Distinguished from the primitive Ρ() even when arguments are suppressed.]
α = (a_0 … a_(α.N‾)) → (d00…1 d10…0 … d11…1).
MyModel

Write MyModel in the notation used above (which combines the standard mathematical statement of DP and some peculiarities of DDP):
Write utility as a function of actions (α). Values depend on current values of state variables in all the vectors, but not ζ unless the model will be solved with reservation values.
Choose the built-in struct that MyModel is derived from, referred to as DPparent.
Declare MyModel and any other derived elements needed in the model.
Choose the #import or #include approach to using MyModel in an Ox program.
#import "MyModel" requires two separate files: MyModel.h and MyModel.ox.
#include "MyModel.ox" requires one file, MyModel.ox, which includes what would be in the header and ox files. You can still have a separate MyModel.h file, but you may need to use conditional define directives to avoid multiple inclusions.

Write the declarations (in MyModel.h if using #import). For each derived element, write a struct declaration.
Write the definitions of MyModel and its components. Put this material in MyModel.ox.
Write the program, which includes or imports the definitions of the elements, then builds up the model and solves it:
Call DPparent::Initialize() for the base of MyModel. Some parent classes require that a new clock variable be created first and sent to Initialize(); for these you do not call SetClock(), as it will be called by Initialize().
Create new instances of the action variables and the state variables in the model.
The items above are done once while the program runs. They can be repeated only after calling Delete(), which disposes of the elements of the previous model. The items below can be done repeatedly during the life of the program once the steps above are done.
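Putting these steps together, a minimal single-file sketch (hedged: the file name and the choice of solution method are illustrative, not prescribed by this checklist):

#include "MyModel.ox"
main() {
    MyModel::Initialize();           // calls DPparent::Initialize(), adds variables, calls CreateSpaces()
    decl vi = new ValueIteration();  // one of the DDP solution methods
    vi -> Solve();                   // solve Bellman's equation for the model
    delete vi;
    }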