1. Nash
  1. A Two-Person Two-Strategy Game
  2. The Edgeworth Box example focusses on competitive behaviour, which in economics means each agent takes prices as given when making their choices. Game theory is the opposite extreme of (pure) competition. In particular, each agent acts strategically. When making their decisions they attempt to account for the reaction of others, or at least attempt to guess what they will do.

    Below is a payoff matrix for the Battle of the Sexes game. Pat and Jordan plan to binge watch Netflix. They like to watch together but they have different preferred series. If they watch the same series they can watch together. The numbers in each box are the utility Pat and Jordan place on each outcome. We don't compare utility within the box since that is an inter-personal comparison. Instead we look at the payoffs for each player across boxes to think about what happens.

    In a simultaneous move game both have to chose at the same time (which is not realistic for Netflix, but go with me here). Jordan controls which column occurs and Pat controls the row.
    What to Binge Watch
    Jordan
    UP,UJ Breaking Bad (BB) Glee (G)
    Pat Breaking Bad (BB) 3,2 1,1
    Glee (G) 0,0 2,3
    An action $a$ (also called a strategy) is chosen by each player from their strategy set, which is the same for both, $\{G, BB\}$. Player $i's$ utility depends on both actions, so we can write $U_i(a_P,a_J)$, where $i = P, J$. When choosing their action Pat anticipates (or guesses) what Jordan will do. So Pat's action is a function of beliefs about Jordan's action and vice versa.

    A best response is an action that maximizes a player's utility given their beliefs about the other player's action. We can be fancy and write Pat's best response to Jordan as: $$a^\star_P (a_J) = \arg\max_{a\in\{G,BB\}}\quad U_P(a,a_J).\nonumber$$ And Jordan's best response to Pat as: $$a^\star_J (a_P) = \arg\max_{a\in\{G,BB\}}\quad U_J(a_P,a).\nonumber$$ So Pat's best response depends on the first number in each box given a column to decide which row is best. Jordan's best response depends on at the second number given a row to decide which column is best. Each simply finds the bigger of two numbers. For Pat:
    $a^\star_{P}(BB) = BB\qquad a^\star_{P}(G) = G$
    For Jordan:
    $a^\star_{J}(BB) = BB\qquad a^\star_{J}(G) = G$
    The best response vector puts together the two individual best responses. So for each box in the payoff matrix find the box that is a best response to the opposing strategy: $$a^\star(a_P,a_J) = \bigl(a^\star_P(a_J),a^\star_J(a_P)\bigl)\nonumber$$
    $a^\star(BB,BB) = (BB,BB)$
    $a^\star(G,BB) = (BB,G)$
    $a^\star(BB,G) = (G,BB)$
    $a^\star(G,G) = (G,G)$
  3. Nash Equilibrium in Pure Strategies
  4. A Nash Equilibrium is a fixed point in best responses. That is, first find a best response of player $i$ to the other player $j's$ strategy. If $j's$ best response is to that is the original strategy $i$ was responding to you have a fixed point and thus a Nash Equilibrium (NE). A Nash Equilibrium in the game above would satisfy this condition: $$a^\star\bigl(a^o_P,a^o_J\bigr) = \bigl(a^o_P,a^o_J\bigr)\nonumber$$ In the Battle of the Sexes there are two Nash equilibria: $$NE = \biggl\{ (BB,BB),\ (G,G) \biggr\}.\nonumber$$
  5. Complications
  6. Best responses may not be unique, so $a^\star()$ may be a set of strategies that are each best responses to a given strategy. This means any computational approach has to account for ties in payoffs otherwise some equilibria may be missed. In the example above there are Nash equilibria in pure strategies. But that is not always the case and in some cases the set up is unrealistic.

  7. Mixed Strategies
  8. A mixed strategy means that a player is playing different "pure" strategies with some probability. Your opponents are then forming best responses over beliefs about these probabilities. The NE consists of the probabilities of playing strategies that are best responses to each other. This has the theoretical attraction of smoothing the problem of best response.

    Return to the finite strategy game between two players, $i=0,1$ having $J$ strategies each. (It would not be hard to generalize to different numbers of strategies for each player.) So the example above is the case $K=2.$ Let $U^i$ be the $K\times K$ matrix of payoffs for player $i.$ We will code the matrices so that player $i$ is choosing the row $n$ and the opponent ($j$) is choosing the column $m$.

    A mixed strategy for player $i$ is a $K\times 1$ vector $p_i$ which contains elements $p_{ik},$ the probability that the other player believes player $i$ will choose strategy $k.$ Thus each element of $p_i$ has to be in the range $[0,1]$ and the numbers have to add up to $1$: $$ 0 \le p_{ik} \le 1, \hbox{ for all k} \qquad \hbox{and}\qquad \sum_{k=0}^{K-1} p_{ik} = 1.$$

    What happens with mixed strategies? The outcome is not determined. Instead each possible outcome has some probability of occuring. Players can no longer choose strategies to maximize utility because the payoffs/utilities in $U^i$ do not happen with certainty. In mixed strategies we model players as caring about expected utility, the average payoff they expect to get given that the other player is playing a mixed strategy.

    The probabiilty that player 0 plays strategy $n$ and player 1 plays strategy $m$ equals $p_{0n}p_{1m}.$ In general, the $K\times K$ matrix of probabilities that each outcome occurs is simply: $$P = p_0 p_1'.$$ That is, it is the "outer product" of the two probability matrix. This matrix of probabilities matches up to the outcomes in player 0's playoff matrix. That is the $(n,m)$ element of $P$ gives the probability that player 0 gets payoff $U^0_{nm}.$ That's because we put $p_0$ on the left of the outer product so each of its elements is constant in the rows of $P.$ From the perspective of player $1$ we have to transpose $P$ so that the probabilities match up with their payoffs $U^1.$ Thus, if both players are following mixed strategies we can compute each utlity weighed by the probability that the outcome occurs. This results in two $K\times K$ matrices: $$M_0(p_0,p_1) = U^0 .* [p_0p_1'] = U^0 .* P \qquad\qquad M_1(p_1,p_0) = U^1 .* [p_1p_0'] = U^1 .* P'.$$ Here "M" means is used for "mixed utility." And note that we define each of these values putting the player's own mixing vector as the first argument and the other player's mixing probability as the second argment. This makes it easier to generalize to multiple players when the second argument will be "all other players."

    A matrix of weighted utilities is not something to maxmize. To maximize over mixing probabilities we need an overall value for each player. We can compute expected utility by simply adding up all the elements of these matrices. For player 0: $$EU_0(p_0,p_1) = \sum_{n=0}^{K-1} \sum_{m=0}^{K-1} M_0(p_0,p_1)_{nm}.$$ And for player 1: $$EU_1(p_1,p_0) = \sum_{n=0}^{K-1} \sum_{m=0}^{K-1} M_1(p_1,p_0)_{nm}.$$ These two numbers reflect how each player ranks different mixing strategies (assuming they care about expected utility when outcomes are uncertain).

    We can now think about a best response in mixed strategies. For player 0 it means that they would choose the vector $p_0$ so as to maximize expected utility given that the other player is believed to play their strategies according to the probabilities in $p_1.$ The same is true of player 1 in reacting to beliefs about player 0, the probabilities in $p_0.$ There is one important and fairly complicated constraint on these responses. In pure strategies we knew a best response was just which strategy had the highest utility. Here the response are probabilities on strategies. So a player can choose any $p_i$ vector. It has to be a vector of probabilities that satisfy the condition above that they lie between 0 and 1 and sum to 1. That is the set of mixed strategies that a player can choose from. We will define that set as a "simplex": $$\Lambda(K) = \l\{ v : 0 \le v_{k} 1, \hbox{ for all k} \qquad \hbox{and}\qquad \sum_{k=0}^{K-1} v_{k} = 1\r\}.$$ In other words, $\Lambda(K)$ is the set of a vectors of length $K$ such that all the elements are between 0 and 1 and the elements add up to 1.

    >We can now define best responses in mixed strategies as: $$ p^\star_0(p_1) = \arg\max_{p_0\in \Lambda(K)} EU_0(p_0,p_1) \qquad\qquad p^\star_1(p_1) = \arg\max_{p_1\in \Lambda(K)} EU_1(p_1,p_0).$$ That is, player $i$ chooses $p_i$ to maximize their expected utility given that they believe the other player is mixing with probabilities $p_j.$

    Finally we can define as Nash Equilibrium in mixed strategies (for a two-person game) is a pair of probability vectors $p^\star = (p^\star_0,p^\star_1)$ that are "best responses to each other." That is, $$p^\star_0(p^\star_1) = p^\star_0 \qquad \hbox{and} \quad p^\star_1(p^\star_0) = p^\star_1.$$

    How could we ever go about computing $p^\star.$ One approach is to construct an objective function whose maximum satisfies the best response condtions. First,"stack" the mixed strategy vectors: $$p = \pmatrix{p_0 \cr p_1}.$$ This will be the parameter vector for the objective, $V(p).$ Next, define the utility that player $i$ receives if they play each of their pure strategies (ignoring $p_i$) while the other player is mixing: $$M_i(p_j) = U^i p_j.$$ This is $K\times 1$ vector. Next define the difference between this vector and the expected utility at the current parameter vector: $$D_0(p_0,p_1) = M_0(p_1) - EU_0(p_0,p_1)\qquad D_1(p_0,p_1) = M_1(p_0) - EU_1(p_0,p_1)$$ Next for each player define: $$ED_i(p_0,p_1) = \sum_{k=0}^{K-1} \max\l\{D_i(p_0,p_1),0\r\}^2.$$ And finally: $$V(p) = ED_0(p_0,p_1) +ED_1(p_0,p_1).$$ It is not at all obvious that if we maximize this function at $p^\star$ it will be NE in pure strategies. However, it turns out that that is exactly the case. The explanation is as follows: to be completed

  9. Sequential games
  10. Sequential games study behaviour when moves are made in a pre-defined sequence that creates a game tree instead of a payoff matrix. One special kind of Nash equilibrium in this case is a subgame perfect equilibrium. Repeated games are cases where the same payoffs are repeated multiple times which may or may not result in different outcomes from the one-shot game. Repeated games play an important role in industrial organization as well as evolutionary psychology.
  11. Computation
  12. Here the problem is to find best responses and then fixed points in the best response vector. This involves looping across possible strategies. The other types of games and equilibria involve different computations.

Exercises