1. Learning and Neural Networks
PRELIMINARY AND IDIOSYNCRATIC! This belongs in Part III, but was placed here to avoid changing hard links to sections.
    Back in the 1990s I purchased a software program (they weren't called apps) that still exists but you've probably never heard of: Dragon Naturally Speaking. It let me speak into a special headset and it would produce the text in a plain file. I thought this might speed up my glacial output of research papers and make responding to email much quicker. However, it did not work out of the box. First you had to train it by reading several paragraphs out loud so it could learn your voice. But even then it was terribly inaccurate, and I never actually used it to do anything.

    Paying perhaps $50 for bad software may seem silly now that your phone or smart speaker can understand you, or even translate a conversation into a different language in real time. But the basic method is really the same. The difference is simply computing power and much more sophisticated "training" of the system.

    This leads us to the world of neural networks (NN) and machine learning (ML). We will cover only the basics, with the purpose of giving you some conception of how packages/modules work rather than simply using them naively. ML is being used in economics now, sometimes as a gimmick and sometimes to do something better than other techniques. However, compared to the leap from the Naturally Speaking software of the 1990s to voice recognition today, computational economics continues to rely on algorithms we cover in this course.

    (Artificial) neural networks involve many of the techniques already covered in this course, including linear equations, binary responses, and optimization of an objective.

  1. Neurons
    An artificial neuron is really just a linear equation. In our earlier notation we might write it as: $$y=ax+b.$$ A few notes about jargon and minor differences in concepts! In neural networks the constant $b$ is called the bias; $x$ is an $N\times 1$ vector of inputs; the $1\times N$ vector $a$ does not contain coefficients: it contains weights. The $y$ is still considered output. From a matrix algebra point of view we could fold the bias into the weights by simply concatenating a 1 onto $x$ and placing $b$ at the end of the weight vector. Using Ox operators:

    $$y = (a~b)(x|1) = a^\star x^\star.$$
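
    To make this concrete, here is a minimal sketch in Ox of a single neuron evaluated both ways; the input values, weights, and bias below are made up for the example.

        #include <oxstd.h>

        main() {
            decl x = <0.5; 1.2; 2.0>;   // N x 1 input vector (made-up values)
            decl a = <0.1, 0.4, 0.3>;   // 1 x N weight vector
            decl b = 0.25;              // bias

            decl y = a*x + b;           // neuron output: y = ax + b

            // Equivalent form with the bias folded into the weights:
            // y = (a~b)(x|1), using Ox's concatenation operators.
            decl astar = a ~ b;         // 1 x (N+1)
            decl xstar = x | 1;         // (N+1) x 1
            println("y = ", y, "  y* = ", astar * xstar);
        }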

    That's all a neuron is: it responds to real-valued inputs and produces a single output. But the output $y$ is often not the end of the story. The output is transformed by an activation, $V(y)$. These transformations have names: $$\eqalign{ \hbox{Linear:}\quad V(y) &= y\cr \hbox{RectiLinear:}\quad V(y) &= I_{y\gt 0}\, y\cr \hbox{Sigmoid:}\quad V(y) &= {e^y \over 1+ e^y} = {1\over 1+e^{-y}}\cr }$$

    What makes NN powerful are these additional elements: neurons can be placed together in layers; their outputs become the inputs for another layer of neurons; and the weights in $a$ and the bias $b$ can be chosen through a process called training. Training is the process of choosing weights and biases to match the output of the network to external data. The data include observed inputs paired with the outputs the NN is supposed to produce. Once trained, the NN can be used on new inputs to predict new output. This is what the Naturally Speaking software needed to do: the sounds I made when reading the training text were the input. The NN was trained to take those input signals and produce the words on the page. After it was trained, the hope was that the NN could predict, from the sounds I made, the words I wanted in the document.
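
    Each of these activations is a one-line function in Ox; this is a sketch (the function names are mine, not from any package). Because the operators apply element-by-element, the same functions also work on vectors of outputs:

        #include <oxstd.h>

        linear(const y)      { return y; }                  // V(y) = y
        rectilinear(const y) { return (y .> 0) .* y; }      // V(y) = I_{y>0} y
        sigmoid(const y)     { return 1 ./ (1 + exp(-y)); } // V(y) = 1/(1+e^{-y})

        main() {
            decl y = <-2.0, 0.0, 2.0>;   // some made-up neuron outputs
            println("linear:      ", linear(y));
            println("rectilinear: ", rectilinear(y));
            println("sigmoid:     ", sigmoid(y));
        }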

  2. Dense Layer
    A layer is simply a set of neurons all connected to the same inputs. Each neuron is defined by its vector of weights $a$ (which includes the bias in our notation). A layer of $M$ neurons can then be represented as a matrix $A$ consisting of the stacked row vectors: $$y = Ax = \pmatrix{a_0 \cr a_1 \cr \vdots \cr a_{M-1}} x.$$ Now $y$ is a vector of outputs from the layer of neurons. The outputs are activated by an activation $V(y)$, where now $V$ applies the scalar activation defined above to each element of the $y$ vector. So a layer $L=L(A,V)$ is a matrix of weights $A$ and an activation $V$. A dense layer simply means every neuron in the layer receives the same full input $x$, as in the sketch below.
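
    A dense layer is then one matrix product followed by the elementwise activation. A minimal sketch in Ox, again with made-up weights, the bias stored in the last column of $A$, and a function name (DenseLayer) that is purely illustrative:

        #include <oxstd.h>

        sigmoid(const y) { return 1 ./ (1 + exp(-y)); }

        // Evaluate a dense layer L(A,V): A is M x (N+1) with each row (a~b),
        // x is N x 1, and the activation V applies element-by-element.
        DenseLayer(const A, const x) {
            return sigmoid( A * (x|1) );
        }

        main() {
            decl x = <0.5; 1.5>;              // N = 2 inputs (made up)
            decl A = <0.1, 0.4, 0.2;          // M = 3 neurons, rows are (a~b)
                      0.3, 0.1, 0.0;
                      0.2, 0.5, 0.1>;
            println("layer output: ", DenseLayer(A, x));  // 3 x 1 vector
        }
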
  3. Neural Network
    A neural network is a list of layers: $$NN = \{L_0,L_1,\dots,L_m\}.$$ Although neurons within a layer have the same dimensions, the dimensions can differ across layers: the output of layer $L_k$ becomes the input to layer $L_{k+1}$, so the number of weights in each neuron of $L_{k+1}$ must match the number of neurons in $L_k$.
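
    Putting the pieces together, evaluating a network just means feeding the output of each layer into the next. A minimal sketch, representing the network as an Ox array of weight matrices and using the sigmoid activation at every layer (both choices, and all the values, are illustrative):

        #include <oxstd.h>

        sigmoid(const y) { return 1 ./ (1 + exp(-y)); }

        // Feed x forward through an array of weight matrices.
        // Each A_k is M_k x (N_k+1); the bias sits in the last column,
        // so each layer computes V( A_k (x|1) ).
        FeedForward(const NN, const x0) {
            decl k, x = x0;
            for (k = 0; k < sizeof(NN); ++k)
                x = sigmoid( NN[k] * (x|1) );
            return x;
        }

        main() {
            // Two layers: 2 inputs -> 3 hidden neurons -> 1 output.
            decl A0 = <0.1, 0.4, 0.2; 0.3, 0.1, 0.0; 0.2, 0.5, 0.1>;  // 3 x 3
            decl A1 = <0.5, 0.25, 0.75, 0.1>;                         // 1 x 4
            println("network output: ", FeedForward({A0, A1}, <0.5; 1.5>));
        }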