Equilibrium selection

Summary

Equilibrium selection is a concept from game theory that addresses why the players of a game would select one equilibrium over another. The concept is especially relevant in evolutionary game theory, where different methods of equilibrium selection reflect different ideas about which equilibria remain stable and persistent for a player even in the face of deviations (and mutations) by the other players. This is important because there are various equilibrium concepts, and under many of them, such as the Nash equilibrium, many games have multiple equilibria.

Equilibrium Selection with Repeated Games

A stage game is an n-player game in which each player chooses from a finite set of actions and each action profile yields a payoff profile. A repeated game consists of a number of repetitions of a stage game played in discrete periods of time (Watson, 2013). In such a game, a player's reputation affects the actions and behavior of the other players: how a player behaves in preceding rounds determines the actions of their opponents in subsequent rounds. An example is the interaction between an employee and an employer in which the employee shirks their responsibility for a short-term gain and then loses the bonus, which the employer discontinues after observing the employee's behavior (Watson, 2013).

The dynamics of equilibrium selection in repeated games can be illustrated with a two-period game. Each action profile played in the first period initiates a new subgame. A subgame perfect equilibrium of the entire game must induce a Nash equilibrium in every subgame. Hence, in the last period of a repeated game, the players will choose a stage game Nash equilibrium. Equilibria that stipulate actions in earlier periods that are not Nash equilibrium strategies in the stage game can still be supported when the stage game has multiple Nash equilibria. This is achieved by establishing a reputation of "cooperation" in the preceding periods, which leads the opponent to select a more favorable Nash equilibrium strategy in the final period. If a player instead builds a reputation of deviating, the opponent can "punish" the player by choosing a less favorable equilibrium in the final period of the repeated game.
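The two-period argument can be sketched in code. The stage game below is hypothetical (it is not taken from Watson), the function names are illustrative, and the sketch assumes pure strategies and no discounting between periods. It finds the pure stage game Nash equilibria and then checks whether a first-period action profile can be supported by rewarding cooperation with one stage equilibrium and punishing deviation with another.

```python
from itertools import product

ACTIONS = ("A", "B", "C")

# Hypothetical stage game: PAYOFFS[(row, col)] = (player 1 payoff, player 2 payoff).
PAYOFFS = {
    ("A", "A"): (4, 4), ("A", "B"): (0, 5), ("A", "C"): (0, 0),
    ("B", "A"): (5, 0), ("B", "B"): (1, 1), ("B", "C"): (0, 0),
    ("C", "A"): (0, 0), ("C", "B"): (0, 0), ("C", "C"): (3, 3),
}

def best_response_payoff(profile, player):
    """Highest stage payoff `player` can reach by changing only their own action."""
    candidates = []
    for action in ACTIONS:
        deviated = list(profile)
        deviated[player] = action
        candidates.append(PAYOFFS[tuple(deviated)][player])
    return max(candidates)

def pure_nash_equilibria():
    """All pure action profiles in which every player is already best-responding."""
    return [
        profile
        for profile in product(ACTIONS, repeat=2)
        if all(PAYOFFS[profile][i] == best_response_payoff(profile, i) for i in (0, 1))
    ]

def supportable(first, reward, punish):
    """Can `first` be played in period 1 of the two-period game?

    Assumes `reward` and `punish` are stage Nash equilibria: `reward` is played
    in period 2 after cooperation, `punish` after any first-period deviation.
    """
    for player in (0, 1):
        gain = best_response_payoff(first, player) - PAYOFFS[first][player]
        loss = PAYOFFS[reward][player] - PAYOFFS[punish][player]
        if gain > loss:  # deviating today outweighs tomorrow's punishment
            return False
    return True

print(pure_nash_equilibria())                           # [('B', 'B'), ('C', 'C')]
print(supportable(("A", "A"), ("C", "C"), ("B", "B")))  # True
print(supportable(("A", "A"), ("B", "B"), ("B", "B")))  # False
```

In this hypothetical game, (A, A) is not a stage game Nash equilibrium, yet it can be played in the first period because the one-period gain from deviating (1) is smaller than the loss from being moved to the worse equilibrium in the final period (2). Without a credible difference between the final-period equilibria, the same profile cannot be supported.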

Focal point

Another concept that can help to select an equilibrium is the focal point. This concept was first introduced by Thomas Schelling, a Nobel Prize-winning game theorist, in his 1960 book The Strategy of Conflict (Schelling, 1960). When the participants are in a coordination game and do not have a chance to discuss their strategies beforehand, a focal point is a solution that stands out to them as the natural answer.

For example, in an experiment conducted in 1990 by Mehta et al. (1994), the researchers asked participants to answer a questionnaire containing prompts such as "name a year" or "name a city in England". The participants were asked to provide the first answer that came to their minds, and many provided their birth year or hometown.

However, when the participants had an incentive to coordinate (they were told they would be paid if they answered the question the same way as an anonymous partner), most of them chose 1990 (the year at the time) and London (the largest city in the UK). These were not the first answers that came to their minds, but they were the best bets after deliberating on how to match a partner without prior discussion. In this case, the year 1990 and the city of London are the focal points of the game, helping the players to select the best equilibrium in this coordination game.

Focal points can also be useful when the players are allowed to communicate with each other, as in a negotiation. When the negotiation is about to end, each player must make a last-minute decision about how aggressive to be and how far to trust their opponents, and a focal point can help them settle on an appropriate equilibrium (Hyde, 2017).

Symmetries

Various authors have studied games with symmetries (over players or actions), especially in the case of fully cooperative games. For example, the team game Hanabi has been considered, in which the different players and suits are symmetric. Relatedly, some authors have considered how equilibrium selection should relate across isomorphic games. Generally, it has been argued that a group of players should not choose strategies that arbitrarily break these symmetries, i.e., they should play symmetric strategies and aim for symmetric equilibria.[1] Similarly, they should play isomorphic games isomorphically.[2] For example, in Hanabi, hints corresponding to the different suits should have analogous (symmetric) meanings. In team games, the optimal symmetric strategy is also a (symmetric) Nash equilibrium.[3] While breaking symmetries may allow for higher utilities, it results in unnatural, inhuman strategies.[1] Every finite symmetric game (including non-team games) has a symmetric Nash equilibrium.[4]
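The trade-off between respecting symmetries and achieving higher utility can be illustrated with a minimal sketch. The two-action matching game below is a hypothetical toy example (it is not Hanabi and does not come from the cited papers), and all function names are illustrative. Two teammates each pick one of two interchangeable actions and earn 1 only if they match.

```python
from fractions import Fraction

ACTIONS = (0, 1)

def payoff(a, b):
    """Shared team payoff: 1 if the two players match, else 0."""
    return 1 if a == b else 0

def expected_payoff(p, q):
    """Expected team payoff when the players mix with distributions p and q."""
    return sum(p[a] * q[b] * payoff(a, b) for a in ACTIONS for b in ACTIONS)

def is_symmetric_equilibrium(p):
    """True if no pure action does strictly better against p than p itself."""
    value = expected_payoff(p, p)
    for a in ACTIONS:
        pure = {x: Fraction(int(x == a)) for x in ACTIONS}
        if expected_payoff(pure, p) > value:
            return False
    return True

def respects_action_symmetry(p):
    """True if p is unchanged when the two action labels are swapped."""
    return p[0] == p[1]

uniform = {0: Fraction(1, 2), 1: Fraction(1, 2)}   # treats both labels alike
always_zero = {0: Fraction(1), 1: Fraction(0)}     # arbitrary convention

for name, strategy in (("uniform", uniform), ("always_zero", always_zero)):
    print(name,
          respects_action_symmetry(strategy),
          is_symmetric_equilibrium(strategy),
          expected_payoff(strategy, strategy))
```

Under these assumptions, uniform mixing is the only strategy that does not depend on how the actions are labeled, and it is a Nash equilibrium with expected payoff 1/2. The convention "always play action 0" earns 1, but only by breaking the symmetry between the two actions in a way that teammates could not agree on without prior coordination.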

Examples of equilibrium selection concepts

Risk & Payoff dominance

Definition: When a game has multiple Nash equilibria (NE), an equilibrium may have one of the following two properties:

  • A Nash equilibrium is considered risk dominant if it has the largest basin of attraction (i.e. is less risky).
  • A Nash equilibrium is considered payoff dominant if it is Pareto superior to all other Nash equilibria in the game.

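Explanation: A risk dominant NE tends to be chosen when players want to avoid large losses under uncertainty about the other players' actions, while a payoff dominant NE is chosen when players aim for the best attainable payoff. Note that risk dominance and payoff dominance are both refinements of the Nash equilibrium: an equilibrium with either property is still a NE.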

Example: Take the normal form payoff matrix of a game as an example:

Table 1: Normal form of the example game (payoffs listed as player 1, player 2)

          L          R
U      10, 10      0, 9
D       9, 0       5, 5

There are two NEs in this game, (U, L) and (D, R). Here (U, L) is the payoff dominant NE, since it gives both players their highest payoff (10 each). However, given the uncertainty about an opponent's action, a player may prefer the more conservative strategy (D for player 1 and R for player 2), which avoids the "great loss" of receiving a payoff of 0 if the opponent deviates. Hence (D, R) is the risk dominant NE.
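The classification of the example game above can be checked with a short sketch. It uses the standard criterion for 2x2 games, under which the risk dominant equilibrium is the one with the larger product of the players' losses from unilateral deviation; the function names are illustrative. It also computes the belief threshold that determines the basins of attraction for player 1.

```python
from fractions import Fraction

ROWS, COLS = ("U", "D"), ("L", "R")

# Payoffs of the example game: PAYOFFS[(row, col)] = (player 1, player 2).
PAYOFFS = {
    ("U", "L"): (10, 10), ("U", "R"): (0, 9),
    ("D", "L"): (9, 0),   ("D", "R"): (5, 5),
}

def is_nash(row, col):
    """True if neither player gains by unilaterally switching actions."""
    u1, u2 = PAYOFFS[(row, col)]
    return (all(PAYOFFS[(r, col)][0] <= u1 for r in ROWS)
            and all(PAYOFFS[(row, c)][1] <= u2 for c in COLS))

def deviation_loss_product(row, col):
    """Product of both players' losses from deviating alone (2x2 games only)."""
    other_row = next(r for r in ROWS if r != row)
    other_col = next(c for c in COLS if c != col)
    u1, u2 = PAYOFFS[(row, col)]
    return (u1 - PAYOFFS[(other_row, col)][0]) * (u2 - PAYOFFS[(row, other_col)][1])

equilibria = [(r, c) for r in ROWS for c in COLS if is_nash(r, c)]

# Payoff dominant: at least as good for every player as every other equilibrium.
payoff_dominant = [
    e for e in equilibria
    if all(PAYOFFS[e][i] >= PAYOFFS[o][i] for o in equilibria for i in (0, 1))
]

# Risk dominant: larger product of deviation losses.
risk_dominant = max(equilibria, key=lambda e: deviation_loss_product(*e))

# Player 1 prefers U only if they believe player 2 plays L with probability >= q_star.
loss_at_ul = PAYOFFS[("U", "L")][0] - PAYOFFS[("D", "L")][0]
loss_at_dr = PAYOFFS[("D", "R")][0] - PAYOFFS[("U", "R")][0]
q_star = Fraction(loss_at_dr, loss_at_ul + loss_at_dr)

print(equilibria)       # [('U', 'L'), ('D', 'R')]
print(payoff_dominant)  # [('U', 'L')]
print(risk_dominant)    # ('D', 'R')
print(q_star)           # 5/6
```

The deviation loss products are 1 for (U, L) and 25 for (D, R). The threshold of 5/6 means that player 1 should play U only when nearly certain that player 2 plays L, so (D, R) has the larger basin of attraction.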

1/2 dominance

See also

References

  1. ^ a b Hu, Hengyuan; Lerer, Adam; Peysakhovich, Alex; Foerster, Jakob (2020). "'Other-Play' for Zero-Shot Coordination". Proceedings of the 37th International Conference on Machine Learning. arXiv:2003.02979.
  2. ^ Harsanyi, John C.; Selten, Reinhard (1988). A General Theory of Equilibrium Selection in Games. MIT Press.
  3. ^ Emmons, Scott; Oesterheld, Caspar; Critch, Andrew; Conitzer, Vincent; Russell, Stuart (2022). "For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria". Proceedings of the 39th International Conference on Machine Learning. pp. 5924–5943.
  4. ^ Nash, John (1951). "Non-cooperative games". Annals of Mathematics. 54 (2): 286–295. doi:10.2307/1969529.
  • Harsanyi, John C. and Selten, Reinhard, A General Theory of Equilibrium Selection in Games, MIT Press (1988)
  • Watson, J. (2013). Repeated Games and Reputation. In Strategy: An introduction to game theory (3rd ed., pp. 291–305). Norton & Company.
  • Hyde, T., 2017. Can Schelling's focal points help us understand high-stakes negotiations?. [online] Aeaweb.org. Available at: <https://www.aeaweb.org/research/can-schellings-focal-points-help-us-understand-high-stakes-negotiations> [Accessed 9 December 2021].
  • Mehta, J., Starmer, C., & Sugden, R. (1994). The Nature of Salience: An Experimental Investigation of Pure Coordination Games. The American Economic Review, 84(3), 658–673. JSTOR 2118074
  • Schelling, T. C. (1960). The strategy of conflict. Cambridge, Mass.