1. Known Unknowns
    Donald Rumsfeld, 2003
    There is nothing random about the operation of a computer, except to the extent that cosmic rays and products of truly random radioactive decay might scramble memory. Yet computers appear to generate random events. Doesn't your phone shuffle your playlist? Still, there is nothing random about the operation of the microprocessor in the phone. If your computer/phone/tablet starts to act in a truly random fashion, it is time to get a new one.

    In the modern world we are surrounded by simulated randomness. One of the most important uses of computers in econometrics is to simulate randomness. What is meant by this is something quite specific. What is really important to simulate is independent sampling from a distribution. That is, computer algorithms have been developed to fake IID sampling.

  1. Probability and Random Variables: a Review
    A random experiment is a process leading to two or more possible outcomes, with uncertainty as to which outcome will occur. For example, rolling a die or flipping a coin are random experiments. If we list or describe every possible outcome that might occur in a random experiment we call that the sample space. To be complete, the sample space is not just the set of all outcomes; it also includes the probability that each outcome will appear in a random experiment. The easiest case is when each outcome is equally likely (like rolling a die or flipping a fair coin). But cases where outcomes are not equally likely are also easy to imagine.

    In words, a random variable is a numerical value associated with the outcome of a random experiment. A way to visualize a random variable is as a column on a sheet of paper or in a spreadsheet used to record values associated with an experiment. We distinguish between two types of random variables, discrete and continuous.

    Another useful way to think of a random variable is as the collection of possible measurements. Once a random event happens, the value of the random variable is realized. Based on its definition, a random variable is neither random nor a variable. It is a deterministic function of something that is random (the outcome). What is random and variable is the realized value the random variable takes on in a particular experiment.

    A discrete random variable takes on only a finite number of different values with positive probability. That is, if $Y$ is discrete then $P(Y=y)\gt 0$ only if $y \in \{y^1,y^2,\dots,y^M\}$.

    A continuous random variable can take on any value in one or more intervals of the real line. A (pure) continuous random variable takes on no particular value with positive probability.

    The cumulative distribution function (cdf) of a random variable $Y$ is the probability that $Y$ is at or below a given value. We often use $F(y)$ to denote the cdf and relate it to probability as: $$F(y) \equiv Prob( Y\le y).\nonumber$$ We write $Y \sim F(y)$ to mean $Y$ is a random variable with cdf $F(y)$. The cdf is just a way to summarize how the probability of events determines the pattern of a thing we can measure from an experiment. There are other ways to summarize this information. One reason not to start with the cdf is that it is not really the way most people's brains would visualize uncertain outcomes. The reason to start with the cdf is that the definition is the same for discrete and continuous random variables. However, because of those technical issues we are trying to avoid, we have to make a slight distinction between kinds of random variables when talking about the more intuitive notion of a probability or density function.

    Exhibit 45. Continuous and Discrete CDFs

    Left: The CDF of a (strange) continuous random variable. Steep portions of the curve are values with greater density (pdf). Flatter portions are low density (less likely) values.
    Right: The CDF of a discrete random variable. The jump points are values the random variable takes on. The size of the jump is the probability (pdf) of that value.
    The pdf of a discrete random variable $Y$ is the probability that $Y$ takes on the argument of the pdf. We often use $f(y)$ for the pdf. The pdf is defined for any number $y$ as: $$f(y)\ \equiv\ Prob(Y=y).\nonumber$$ The pdf of a continuous random variable $Y$ is not a probability. It is the derivative of the cdf: $$f(y) \equiv {d Prob(Y\le y)\over dy} = F^\prime(y).\nonumber$$

    Starting with the pdf you can derive the cdf from it: $$\eqalign{ \qquad\hbox{Discrete}\qquad\qquad \qquad &\qquad \qquad \qquad\hbox{Continuous}\qquad\cr F(y) = {\sum}_{k: y^k \le y} f(y^k)\qquad\qquad &\qquad \qquad F(y) = \int_{-\infty}^y f(z)dz.\cr}\nonumber$$
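
    For a concrete case, here is a small Ox sketch (not one of the chapter's numbered programs) that builds the cdf of a fair die by accumulating its pdf. Beyond basic matrix syntax it assumes only the standard Ox functions rows() and range():

        #include "oxstd.h"
        main() {
            decl f, F, k;
            f = <1;1;1;1;1;1> / 6.0;            // pdf of a fair die: each face has probability 1/6
            F = f;                              // F will hold the running sum (the cdf)
            for (k = 1; k < rows(f); ++k)
                F[k][0] = F[k-1][0] + f[k][0];  // F(y^k) = sum of f(y^j) for y^j <= y^k
            println("face ~ pdf ~ cdf");
            println(range(1, 6)' ~ f ~ F);      // ~ places the columns side by side
        }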

  2. Joint Probability and Distribution
    Let $A$ and $B$ be two events. $P(A,B)$ is the probability of $A$ and $B$ jointly occurring in the same random experiment: $$P(A,B) \equiv P( A \cap B).\nonumber$$ The definition says that "joint" is like intersecting two sets: the joint probability of two events is the probability that both happen in the same random experiment. If, in the abstract view, events are sets, then "joint" means the outcome is in both events, hence the intersection is called for.

    Now suppose $Y_1$ and $Y_2$ are two random variables defined on a sample space with probability $P()$. The joint cumulative distribution function (joint cdf) of two or more random variables is the bivariate function that gives the probability that both are below certain values: $$\hbox{for all } y_1, y_2,\quad F(y_1,y_2) \equiv Prob( Y_1\le y_1, Y_2\le y_2).\nonumber$$

    The joint pdf for discrete random variables is the bivariate function giving the probability the random variables equal certain values: $$\hbox{for all }y_1, y_2,\quad f(y_1,y_2) \equiv Prob( Y_1= y_1, Y_2= y_2).$$

    The joint pdf for continuous random variables is the bivariate function giving the rate of change in the cdf: $$\hbox{for all }y_1, y_2,\quad f(y_1,y_2) \equiv {\partial^2 Prob( Y_1\le y_1, Y_2\le y_2)\over \partial y_1\partial y_2} = {\partial^2 F(y_1,y_2) \over \partial y_1\partial y_2}.$$

    Definition 4. Marginal Distributions of Random Variables

    Let $Y_1$ and $Y_2$ be two random variables with joint cdf and pdf $F(y_1,y_2)$ and $f(y_1,y_2)$.
    If $Y_2$ is discrete and takes on $M$ different values, denoted $y^1_2,\dots,y^{M}_2$, then the marginal cdf of $Y_1$ is $$F_1(y_1) = Prob(Y_1\le y_1) = F(y_1,y_2^{M}).$$ If $Y_1$ is discrete the marginal pdf of $Y_1$ is $$f_1(y_1) = Prob(Y_1 = y_1) = {\sum_{k=1}^{M}}f(y_1,y^k_2).$$
    If $Y_2$ is continuous then the marginal cdf of $Y_1$ is $$F_1(y_1) = Prob(Y_1\le y_1) = F(y_1,\infty).$$
    If $Y_1$ is continuous then its marginal pdf is $$f_1(y_1) = {d\, Prob(Y_1 \le y_1)\over dy_1 }= \int_{-\infty}^{+\infty}\ f(y_1,y_2)dy_2.$$
    To get the marginal for $Y_1$ we fix its value as $y_1$ and add or integrate across all possible values of $Y_2$. The marginal functions for $Y_2$ are defined by switching 1 and 2 in the definitions according to whether $Y_1$ is discrete or continuous. Because the cdf is defined for all real numbers we could insert $y_2=\infty$ for the discrete case as well instead of $y_2^M$, because $F(y_1,y_2) = F(y_1,y_2^M)$ for all $y_2>y_2^M$. Think of the marginal as the distribution of the random variable without knowing anything about the other random variable. That is, the marginal cdf/pdf is the same function as the cdf/pdf of the random variable defined earlier but derived from the joint distribution.
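
    As a small illustration (with numbers chosen only for this example), suppose $Y_1$ and $Y_2$ each take on the values 0 and 1 with joint pdf $f(0,0)=0.4$, $f(0,1)=0.2$, $f(1,0)=0.1$ and $f(1,1)=0.3$. Summing over the values of $Y_2$ gives the marginal pdf of $Y_1$: $$f_1(0) = f(0,0)+f(0,1) = 0.6,\qquad f_1(1) = f(1,0)+f(1,1) = 0.4.\nonumber$$ Summing over the values of $Y_1$ in the same way gives the marginal pdf of $Y_2$: $f_2(0)=0.5$ and $f_2(1)=0.5$.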

    Suppose $Y_1$ and $Y_2$ are two random variables with joint cdf $F(y_1,y_2)$ and joint density $f(y_1,y_2)$. We say that $Y_1$ and $Y_2$ are statistically independent if: $$\hbox{for all } y_1 \hbox{ and } y_2,\quad f(y_1,y_2) = f_1(y_1) f_2(y_2).\tag{Indep}\label{Indep}$$

    Further, $Y_1,Y_2,\dots,Y_N$ are independent if their joint density can be written $$f(y_1,y_2,\dots,y_N)\quad =\quad f_1(y_1)f_2(y_2)\cdots f_N(y_N) \quad =\quad \prod_{i=1}^N f_i(y_i).\tag{Indep2}\label{Indep2}$$

    The cdf $F(\cdots)$ can also be used instead of the pdf to define independence (either one implies the other).
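
    Continuing the made-up example above, $Y_1$ and $Y_2$ are not independent because $f(0,0)=0.4$ while $f_1(0)f_2(0) = 0.6\times 0.5 = 0.3$. In contrast, for two fair dice rolled together the joint pdf is $f(y_1,y_2)=1/36$ for every pair of faces, which equals $f_1(y_1)f_2(y_2) = (1/6)(1/6)$, so the two rolls are independent.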

  3. Some Distributions
    The simplest possible continuous distribution is the standard uniform, sometimes denoted $U(0,1)$. We write $V\sim U(0,1)$ to mean
    1. $V$ is a continuous random variable that takes values strictly between 0 and 1.
    2. $V$ has a uniform (constant) pdf on (0,1). To make the total probability integrate to 1 the height of the density is 1:
    3. $f(v) = \cases{ 1 & for $0\le v\le 1$\cr0 & otherwise\cr}$
    4. The cdf of $V$, $F(v)=\int_{0}^{v}f(u)du$, is simply $v$:
    5. $F(v) = \cases{ 0 & for $v\le 0$\cr v & for $0\lt v \lt 1$\cr 1 & for $v\ge 1$\cr}$
    Equal density means that the probability $v$ lands in any interval $(a,b)$ inside $(0,1)$ is simply the size of the interval $b-a$. The uniform distribution is the key to simulating randomness. It turns out that simulating $U(0,1)$ is all you need to be able to do in order to simulate any distribution $F(x)$. The key insight that allows us to do this is not obvious.

    Exhibit 46. The Uniform PDF and CDF
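
    A quick way to see the equal-density property in action is to simulate it. The sketch below (not one of the chapter's numbered programs) uses ranu() from Table 13 later in the chapter, along with the standard Ox functions meanc() and double(), to estimate the probability that a uniform draw lands in $(0.25,0.65)$; the estimate should be close to $0.65-0.25=0.40$:

        #include "oxstd.h"
        main() {
            decl R = 100000, a = 0.25, b = 0.65, v;
            v = ranu(R, 1);                   // R pseudo-random U(0,1) draws
            // (v .> a) and (v .< b) are 0/1 matrices; their product marks draws inside (a,b)
            println("share of draws in (0.25,0.65): ",
                    double(meanc((v .> a) .* (v .< b))), "  vs b - a = ", b - a);
        }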

    Definition 5. The Standard and General Normal Distributions

    We write $Z\sim {\cal N}(0,1)$ to mean:
    1. $Z$ is a continuous random variable that takes on values between $(-\infty,+\infty)$.
    2. The pdf of a Z variable is usually denoted $\phi$ and equals $$\phi(z) = {1\over\sqrt{2\pi}}e^{-z^2/2}.\nonumber$$
    3. The CDF of a $Z$ random variable is denoted $\Phi$ and equals $$\Phi(z)= \int_{-\infty}^z \phi(v)dv.\nonumber$$ This integral has no closed form expression. This is the reason stats books contain tables of normal values.
    We write $X\sim {\cal N}(\mu,\sigma^2)$ to mean
    1. $X$ is a continuous random variable that takes on values between $(-\infty,+\infty)$.
    2. It has a pdf of the form $$f(x) = {1\over\sigma\sqrt{2\pi}}e^{-(x-\mu)^2/(2\sigma^2)}.\nonumber$$ We do not use a special symbol for the pdf/cdf of a general normal random variable, but we reserve $\phi(z)$ and $\Phi(z)$ for the special case ${\cal N}(0,1)$.
    3. As with the standard normal, the cdf of a normal random variable is an integral with no closed form.

    Exhibit 47. Bell Curves

    Left: The Standard Normal Density (THE Bell Curve) and CDF.
    Right: Three Different Normal Densities with different means and variances.
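
    The two panels are related by a shift and a scale: if $Z\sim {\cal N}(0,1)$ then $X = \mu + \sigma Z \sim {\cal N}(\mu,\sigma^2)$. The sketch below (not one of the chapter's numbered programs; it uses rann() from Table 13 along with the standard Ox functions meanc() and varc()) checks this by transforming standard normal draws and examining the sample mean and variance:

        #include "oxstd.h"
        main() {
            decl R = 100000, mu = 2.0, sigma = 3.0, z, x;
            z = rann(R, 1);          // pseudo-random draws from N(0,1)
            x = mu + sigma * z;      // X = mu + sigma*Z should be N(mu, sigma^2) = N(2, 9)
            println("sample mean (should be near 2): ", double(meanc(x)));
            println("sample variance (should be near 9): ", double(varc(x)));
        }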
  4. Pseudo-Randomness
    Many observations of a random variable are ordered in some natural way, with ordering in time being the most obvious one. A time series is a sequence of values of a random variable. A key aspect of time series is the notion that the values of the random variable are realized one at a time following the order. So $Y_0$ is realized, then $Y_1$, etc. Because of this the observations in a time series are often not independently distributed across time.

    For example, if we measure a person's weight at the end of each month the values are not independent draws from some distribution. As many people come to realize, there is a great deal of persistence in body weight. (For that matter there is even more persistence in body height and even more persistence in age!) Of course, economic time series display a great deal of persistence as well. Stock prices are persistent as is GDP, the unemployment rate, interest rates, etc.

    If the observations in a time series are independent then the marginal distribution of the next realization (at time $t+1$) is not a function of the values of the observations realized up until $t$. (Remember the joint distribution of independent random variables factors into the product of the marginals, so knowing the previous values does not influence the distribution of the next value.) If instead the observations are not independent then there may be information in the previous values that helps predict the future values (and thus is launched many a career on Wall Street).

    We can look for evidence of dependence (including persistence) in the values of a time series using autocorrelation, which is the correlation of the sequence with lagged values of the same sequence. How far apart the two terms being correlated are is called the order of the lag (or of the autocorrelation). So the correlation of $Y_t$ with $Y_{t-1}$ is the first-order autocorrelation.
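
    Here is a sketch of how the first-order sample autocorrelation can be computed in Ox for a simulated series (a uniform pseudo-random sequence, which should have essentially no autocorrelation). Beyond ranu() it assumes only the standard functions meanc(), sumc() and double():

        #include "oxstd.h"
        main() {
            decl T = 10000, y, y0, y1, ybar, num, den;
            y = ranu(T, 1);                  // a series with no true dependence across time
            y1 = y[1:T-1][0];                // y_t     for t = 1,...,T-1
            y0 = y[0:T-2][0];                // y_{t-1} for t = 1,...,T-1
            ybar = double(meanc(y));         // sample mean
            num = double(sumc((y1 - ybar) .* (y0 - ybar)));   // lag-1 cross products
            den = double(sumc((y - ybar) .^ 2));              // sum of squared deviations
            println("first-order sample autocorrelation: ", num / den);  // should be near 0
        }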

    We know that zero covariance does not imply independence, and so zero autocorrelation for all lags does not imply IID sampling. It is not a sufficient condition. That is because covariance (and correlation) are measures of linear dependence. There can be non-linear dependencies between the values in $y$ that are masked by a zero autocorrelation function. In many situations, when people talk about something being random they have in mind IID sampling and the implication of zero correlation.

    For example, the ball drawn in a lottery is random if its value is uncorrelated with the previous ball. It is not important that the draws are deeply random in some metaphysical sense, just unpredictable given previous draws and any other information someone might possess. So this is the way a complex calculating machine like a computer, using an algorithm developed by a much more clever but much less efficient calculator called a human, can mimic randomness. It is not mimicking randomness at all; it is mimicking independence by doing a great job of mimicking zero autocorrelation.

    Definition 6. Pseudo-Random Number Generator (pRNG)

    A pRNG is an algorithm that produces a list of numbers that have these features:
    1. The list is recursive: the algorithm uses where it is in the list to produce the next item (and then moves to that spot in the list). A pRNG has one input, called its seed. Setting the seed picks where to start.
    2. The list is circular: starting from any point, the algorithm eventually reaches a value that produces the starting point again. A good pRNG is a very long list; it starts repeating itself only after a lot of draws. The length of the list is called the pRNG's period.
    3. The numbers all lie between 0 and 1 and are evenly distributed. They look like the uniform distribution, so we will write $U \ {\buildrel pseudo \over \sim}\ U(0,1)$ if $U$ is a number produced using a pRNG.
    4. The sample autocorrelation function of the list is very close to 0 for lags $k\ge 1$ that are less than the period.
    So a pRNG will mimic a sequence of independent draws from the uniform distribution. What about other distributions? It turns out that pretty much any other distribution can be simulated using a uniform pRNG. This is based on a subtle but simple result. Suppose $X$ follows a distribution $F(x)$. Now compute a new random variable that is a function of $X$, namely $V=F(X)$. This can be confusing: the cdf $F(x)$ tells us the probability that $X$ takes on values at or below $x$, and now when we see a draw of $X$ we evaluate $F$ at that draw. What is the distribution of $V$? For $0\le v\le 1$, $$G(v) = Prob(V\le v) = Prob(F(X)\le v) = Prob(X \le F^{-1}(v)) = F(F^{-1}(v)) = v.\nonumber$$ Then if we take the derivative of the cdf we get the pdf: $$g(v) = G^{\,\prime}(v) = 1.\nonumber$$ In other words, the cdf evaluated at its own random variable is always uniformly distributed, regardless of the form of $F(x)$. That is subtle and you may not see the point of the calculation yet, but it means that we can simulate a draw from $F$ if we can simulate uniform random variables and if we can invert the distribution (compute $F^{-1}(v)$).
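
    A quick simulation check of this result (a sketch, assuming probn() is Ox's built-in standard normal cdf): transform standard normal draws by their own cdf and verify that the result looks uniform, with mean near $1/2$ and variance near $1/12$.

        #include "oxstd.h"
        main() {
            decl R = 100000, z, v;
            z = rann(R, 1);      // draws from the standard normal, so F = Phi
            v = probn(z);        // V = Phi(Z), the cdf evaluated at each draw
            // if V is U(0,1) its mean is 1/2 and its variance is 1/12 = 0.0833...
            println("mean of V: ", double(meanc(v)), "  variance of V: ", double(varc(v)));
        }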

    Algorithm 21. Generating a random draw from (almost) any CDF

    Given a CDF $F(x)$ and its mathematical inverse function, $F^{-1}(u)$.
    1. Draw a realized value from $U(0,1)$. Call this value $u$.
    2. Invert the mathematical function for the CDF of $X$ at the value of $u$, $$x = F^{-1}(u).\nonumber$$ Then $x\sim F$.
    Below is a CDF of a random variable $X$ that looks like the cumulative normal cdf, but it does not need to be that function. Like all CDFs, it takes on values between 0 and 1 along the vertical axis. The Uniform random number generator produces a number between 0 and 1 with equal probability everywhere in that range. The realized value is marked $u$, around 0.75 in this case. This value is inverted to find the value of $x$ for which $F(x) = u$. If we repeat this procedure over and over again, and the Uniform random number generator is working properly, the histogram of values of $x$ will pile up like the pdf $f(x)$. As the number of replications goes to infinity the histogram converges to $f(x)$ exactly.

    Exhibit 48. Inverting the CDF to Simulate a Random Experiment
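
    Here is a sketch of Algorithm 21 for a case where the inverse cdf has a closed form: the exponential distribution, chosen only for illustration, with $F(x)=1-e^{-\lambda x}$ for $x\ge 0$ and hence $F^{-1}(u) = -\ln(1-u)/\lambda$:

        #include "oxstd.h"
        main() {
            decl R = 100000, lambda = 2.0, u, x;
            u = ranu(R, 1);               // step 1: draw u from U(0,1)
            x = -log(1 - u) / lambda;     // step 2: x = F^{-1}(u) for F(x) = 1 - exp(-lambda*x)
            // the exponential distribution with lambda = 2 has mean 1/lambda = 0.5
            println("sample mean of the simulated draws: ", double(meanc(x)));
        }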

    What does this mean? The problem of simulating draws from any distribution has been reduced to a pair of much simpler problems:
    1. Come up with a way to simulate draws from the Uniform distribution.
    2. Have the ability to invert the CDF of the random variable you want to simulate.
    The inversion task requires a formula (if one exists) or a numerical algorithm for computing functions and inverses of functions. The procedures for all common random variable CDFs have been known for decades and are built into Ox and any other statistical package, mathematics library, etc. We will rely on these routines and not discuss how they work.

    The key to a Uniform RNG is not just to produce numbers between 0 and 1 that are equally likely, but also to produce sequences of numbers that look like independent draws from the uniform distribution. To assess this aspect of RNGs we need to understand the notion of autocorrelation.

    Algorithm 22. Simulate an IID Sample (continuous)

    Given $X\sim F(x)$.
    1. Set $r=1$ (the first observation). Draw $u^r \ {\buildrel pseudo \over \sim}\ U(0,1)$. This is one pseudo-random replication of an IID uniform random variable.
    2. Find the value $x^r$ such that $F(x^r)=u^r$.
    3. That is, compute $x^r = F^{-1}(u^r)$. Based on the theorem above, $x^r$ is a simulated draw from the distribution $F(x)$: $$x^r \ {\buildrel pseudo \over \sim}\ F(x).\nonumber$$ The inverse of the cdf can either be computed in closed form (when one exists) or $x^r$ can be computed numerically using computer algorithms.
    4. Increment $r$ and repeat. A pseudo IID sample is then $X_{r=1,\dots,R} \ {\buildrel pseudo \over \sim}\ F(x)$.
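
    A sketch of Algorithm 22 for the standard normal, assuming quann() is Ox's inverse of the standard normal cdf (the quantile function); the simulated sample should have mean near 0 and variance near 1:

        #include "oxstd.h"
        main() {
            decl R = 100000, u, x;
            u = ranu(R, 1);      // R pseudo-IID draws u^r from U(0,1)
            x = quann(u);        // x^r = Phi^{-1}(u^r), a pseudo-IID sample from N(0,1)
            println("sample mean (near 0): ", double(meanc(x)),
                    "  sample variance (near 1): ", double(varc(x)));
        }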
  5. Pseudo-Random Numbers in Ox
    Table 13. Basic Pseudo-Random Functions in Ox

    Function         Pseudo-Random Operation
    -------------------------------------------------------------
    ranu(r,c)        generate an r×c matrix of
                     uniform pseudo-random numbers
    rann(r,c)        generate an r×c matrix of standard-normal
                     pseudo-random numbers
    ranseed(iseed)   set and get the seed, or choose the random
                     number generator (depending on iseed)
    
    Many other pseudo-random number routines are available in the oxprob package (so you have to #include "oxprob.oxh"). This simple program produces the output below:
    29-rand.ox
        #include "oxstd.h"
        main() {
            println(ranu(3,2));     // a 3x2 matrix of uniform pseudo-random numbers
            println(ranu(1,4));     // four more numbers from the same sequence
            ranseed(-1);            // reset the seed to its initial value
            println(ranu(2,2));     // these repeat the first four numbers drawn above
            ranseed(33);            // set a specific seed
            println(ranu(2,2));     // numbers from the sequence started by seed 33
        }
    
    Output
          0.56444      0.76994
          0.41641      0.15881
         0.098209      0.37477
    
          0.56912      0.44078      0.47337      0.71055
    
          0.56444      0.76994
          0.41641      0.15881
    
    The first uniform random number generated by an Ox program is always 0.56444, unless your program has set the seed with ranseed() first. In fact, the first 10 draws are exactly what you see in the first two matrices. Then note that the next line sends -1 to ranseed(), which resets the seed to its initial value. Thus the next call to ranu() returns the first four numbers again.
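
    Setting the seed yourself is how you make a simulation replicable: the same seed produces the same pseudo-random sequence every time the program runs. A small sketch (the seed value is arbitrary):

        #include "oxstd.h"
        main() {
            decl seed = 2024, first, second;   // seed value chosen only for illustration
            ranseed(seed);                     // fix the state of the pRNG
            first = ranu(1, 3);
            ranseed(seed);                     // restore the same state ...
            second = ranu(1, 3);               // ... so the same three numbers come out again
            println(first, second);            // the two rows should be identical
        }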