C<sup>4</sup>E: Computation For Economists

Tell Me Why^♭The Beatles 1964: Some Motivation and Background

Real Economists do not use insert-user-friendly-package-here.
--Variations on Real Programmers Don't Use Pascal

Kids Don't Follow^♭The Replacements 1982


Why does economics have a programming gap?

Why computing continues to have almost no role in formal economic curricula is a mystery to me. One reason is that it is closely tied to ever-changing technology, so what seemed important to teach 20 years ago no longer is, and the same may be true of what seems important now. But in many ways the fundamental aspects of computing in economics are not changing any more quickly than other tools. And certainly the barriers to hands-on training have completely disappeared. (Readers of a certain age will remember getting a Dickensian portion of CPU minutes on the campus mainframe to run regressions.)

Another reason is the tradition of teaching the way you were taught. Most academic economists who do original programming in their research learned the methods on their own. It is then natural for them to leave computing out of their formal teaching, even at the graduate level in fields that require intensive programming. Students start with code given to them by advisors or older students and then tweak it (or, nowadays, download some Matlab code for a published paper and try to figure out what it does). This perpetuates a cottage-industry approach to computational economics. Unlike disciplines that work on large-scale, multiple-author computing problems (like cosmology), economists work in groups of two or three. Code is handed down and modified or extended through trial and error.

Older cohorts believed that real programmers use FORTRAN (or more recently C). One perfectly valid reason for using any language is that the programmer has built up language-specific capital. But a good reason to stay with one language, chosen in the past at a very different stage of computer development, is not a valid reason to adopt that language fresh rather than something else. Advisors transmit this view to their students. Since formal teaching of FORTRAN is non-existent, self-study is the solution.

Now consider what the novice student programmer confronts. Code available in economics, especially when written by self-taught programmers, is usually badly documented (let alone coherently indented). No data structure more complicated than a multi-dimensional array will be used. So the code and the coding style to be learned are far removed from the mathematics they implement. Any attempt to teach programming to economists with this starting point quickly bogs down in nested loops, endless assignment statements and black-box math libraries that may or may not be available to students located elsewhere.

So, these patterns combine to produce the strange fact that the typical economics student in 2013 understands no more, and perhaps even less, about numerical mathematics than their counterparts in the past, while the practice of economics relies on programming and numerical methods more every year. The result is a widening gap in programming skills among economics students at the point they start to do research.

Fixing a Hole^♭The Beatles 1967

That assumed reader is a fair description of the median student I encounter in my classes. That person is somewhat reluctant to admit they do not know the difference between compiled and interpreted languages, nor how zeros and ones can represent real numbers. However, when I ask a class if they learned to invert matrices using cofactor matrices this assumed student nods. They ploughed through those complex formulas in their math econ class and are ready to do it again. It seems I am always the first person to tell them the truth: no one computes the inverse of a matrix this way and the knowledge is useless. They are taught that way because forty years ago it was the only recourse a student might have to invert a matrix, and the math econ textbooks have put cofactor matrices and Cramer's rule in the canon. It seems to me that it would be better if the student knew that computers solve linear systems with matrix decompositions, even if they can't do it on paper for the 3x3 case.
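To make the contrast with cofactor formulas concrete, here is a minimal sketch in Python of Gaussian elimination with partial pivoting, the procedure that underlies the LU decomposition. It is an illustration only, not code from this book: the function name `solve_linear_system` and the 3x3 example system are my own assumptions, and real work would call a tested linear algebra library rather than hand-written loops. Note that the system is solved directly; the inverse is never formed.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    A is a list of n rows (each a list of n numbers); b is a list of n numbers.
    The inputs are copied, so the caller's data is not modified.
    (Hypothetical helper written for illustration only.)
    """
    n = len(A)
    # Work on an augmented copy [A | b].
    M = [[float(v) for v in row] + [float(b_i)] for row, b_i in zip(A, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(M[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (M[row][n] - tail) / M[row][row]
    return x


# An arbitrary 3x3 example whose exact solution is (2, 3, -1).
A = [[2.0, 1.0, -1.0],
     [-3.0, -1.0, 2.0],
     [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
x = solve_linear_system(A, b)
```

The work grows with the cube of the matrix dimension, whereas expanding cofactors recursively grows factorially, which is one reason no numerical library takes the textbook route.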

In economics the major text on computational methods is Numerical Methods in Economics by Kenneth Judd. The book is comprehensive in its coverage of algorithms for solving economic models. However, it does not discuss computer programming at all. The algorithms are step-by-step mathematical expressions in pseudo code, and an experienced programmer can easily implement them in their favorite language. But the inexperienced programmer will not know where to start. If they take their advisor's advice and start teaching themselves FORTRAN, they will soon discover the gap between elegant vector notation and the tedium of three levels of DO loops. Further, Judd's book is comprehensive and supported by research on numerical analysis, but its emphasis is not on practical issues. For example, in one sentence Judd mentions that optimization algorithms can be made to respect bounds on parameters to keep them feasible (such as not trying to compute log(-2)) by non-linear transformations. This book devotes a whole chapter to the implementation of this idea because it is essential for model building and estimation.
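The idea of keeping parameters feasible through a non-linear transformation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the objective x - log(x), the exponential map, the crude gradient-descent routine, and all function names are mine, not Judd's and not the implementation developed later in this book. The optimizer works on an unconstrained variable y, and the objective only ever sees x = exp(y), which is positive by construction.

```python
import math


def objective(x):
    """Toy objective defined only for x > 0; its minimum is at x = 1."""
    return x - math.log(x)


def to_feasible(y):
    """Map an unconstrained value y to a strictly positive parameter x."""
    return math.exp(y)


def transformed_objective(y):
    """The objective as seen by the optimizer, in terms of y."""
    return objective(to_feasible(y))


def minimize_1d(f, y0, step=0.1, tol=1e-9, max_iter=10_000):
    """Crude gradient descent using a central-difference derivative.

    A deliberately simple stand-in for a real optimizer.
    """
    y = y0
    h = 1e-6
    for _ in range(max_iter):
        grad = (f(y + h) - f(y - h)) / (2.0 * h)
        y_new = y - step * grad
        if abs(y_new - y) < tol:
            return y_new
        y = y_new
    return y


# The search runs over y, so log() is never asked for a non-positive argument.
y_star = minimize_1d(transformed_objective, y0=-3.0)
x_star = to_feasible(y_star)
```

The optimizer can propose any real y and the model never evaluates the log of a negative number. A parameter restricted to an interval could be handled the same way with a logistic map in place of the exponential.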

This book is part prequel to Judd and part companion to it. It discusses the process of creating, testing and describing a program. It also introduces topics, such as object-oriented programming, that help a programmer write a good program regardless of the algorithm it implements. And it covers high performance computing issues so that a student of economics can move their code from the laptop to the cluster or cloud. To be complete, there is a great deal of overlap with Judd when it concerns basic algorithms in digital mathematics.

Everyday I Write the Book^♭Elvis Costello 1985

But in the early 1990s I learned a lesson about programming languages and economics. When trying to solve a very large (for the time) problem I overwhelmed what a PC could do. But I was able to secure some precious hours of CPU time on a 'supercomputer'. Of course, as a DOS program, Gauss was not an option. So I realized I had backed myself into a corner. From then on I knew that I would avoid languages that were not available on multiple platforms. I bit the bullet and translated my code into Pascal, which served me well over 15 years, multiple machines, and architectures despite its minority-language status. Learning to use MPI (Message Passing Interface) for parallel execution in the 1990s was key. The MPI library was available in FORTRAN and C. But this was no problem, because a little bit of C programming allowed me to access it from within Pascal and my model building continued apace. And I was even able to write Pascal code that a few other people used in C and FORTRAN. The lesson I learned: the popularity of a computer language was less important than its portability and interoperability with other languages.

Around 1998 a grad student thought I should look at Ox. I did, and thought it was interesting, but I was wary of the corner another matrix-based language had put me in. Further, there was no hope that it could support the large-scale parallel execution I needed. But two years later I was planning to teach a short course on numerical methods at a different university. I had access to a computer lab and wanted to have assignments and demonstrations. But there was no hope of getting a licensed program such as Gauss installed. I remembered Ox, and the fact that it was free and easy to install made it a perfect solution. I used it and found it ideal for that purpose. I still viewed it as primarily for teaching and small-scale work. Only later did I start trying to use it for research. And once again the problem I was working on became much larger than a PC could handle. So with some effort I once again accessed MPI routines written in C from within Ox and was able to take advantage of high performance computing resources without translating my code.

The last piece of the whole story is object-oriented programming (OOP), an approach with which I was only vaguely familiar, since my formal training took place before OOP had become a standard approach. With a desire to create a package for solving dynamic programs available to others, I saw a big tradeoff. My usual approach, which was the same as that of most economists, was to write a program specific to the problem at hand. I had 'libraries of routines' to use, but there was no way to define the problem as the program executed. So if another person was to use my code they would have to fiddle with the knobs and switches of the code and re-compile the result. Soon it became clear that this was onerous and very hard to make flexible and general. The only people who might use it would have to be guided by me. Eventually, I came to see a way around this problem using OOP in Ox. That large-scale (and on-going) project forced me to think carefully about distribution, documentation and efficiency.

Why Ox? Why not insert-your-favorite-language?

Economists trained before the 1980s who wrote scientific programs but otherwise had no computer science training almost invariably used FORTRAN, which was synonymous with scientific programming. Students naturally adopt the languages used by their professors. So FORTRAN kept its momentum even after other languages and platforms, such as C, became just as good at scientific programming. Adoption of FORTRAN in economics has slowed markedly in recent years. Younger researchers would more likely have used C and would be able to find mathematical packages in C.

When PCs came on the scene in the 1980s, basic languages like FORTRAN and C were not readily available for them. One of the first PC-based languages was Gauss, which was quite popular in economics through the 1990s. However, Gauss was a commercial program that did not run on "mainframe" computers. By the 2000s, Matlab was starting to replace it, as was the open-source statistical platform R. Stata has introduced a matrix language in order to support more general programming than its original data set orientation. Lately the use of Python in scientific computing has been growing.

Each year one or two students ask me, "Why do you use Ox rather than X," where the value of X slowly evolves. No single choice could possibly suit every potential reader of this book. Most people who start to program do not survey all the available options, weigh the pros and cons and then pick the optimal choice. One reason is that they have no way to weigh the tradeoffs between features and capacities. So nearly everyone relies on trusted advice.
