Iterated Prisoner’s Dilemma

Cite this article as: Praveen Shrestha, "Iterated Prisoner’s Dilemma," in Psychestudy, November 17, 2017, https://www.psychestudy.com/behavioral/learning-memory/iterated-prisoners-dilemma.

The prisoner’s dilemma is a bargaining game in which the best joint outcome is achieved only when both players cooperate, yet each player has an incentive not to.

When the game is played repeatedly between the same two players, it is referred to as the Iterated Prisoner’s Dilemma (IPD). Because the same two parties meet again and again, each gets the opportunity to adjust its strategy based on what happened in the previous rounds.

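In addition to the conditions of the general prisoner’s dilemma, the Iterated Prisoner’s Dilemma requires that 2R > T + S, where R is the reward for mutual cooperation, T is the temptation payoff for defecting against a cooperator, and S is the payoff for cooperating against a defector. This ensures that taking turns defecting and cooperating does not yield a higher reward than mutual cooperation.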
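The condition can be checked directly for any set of payoffs. The sketch below is only an illustration: the function name is made up for this example, P denotes the punishment for mutual defection, the ordering T > R > P > S is the usual statement of the general form, and the sample values 5, 3, 1 and 0 are the ones used in this article's payoff table.

```python
def is_iterated_prisoners_dilemma(T, R, P, S):
    """Return True when the payoffs satisfy both conditions.

    T = temptation, R = reward, P = punishment, S = payoff for
    cooperating against a defector (hypothetical helper for illustration).
    """
    general_form = T > R > P > S                 # ordinary prisoner's dilemma
    no_gain_from_alternating = 2 * R > T + S     # extra condition for the iterated game
    return general_form and no_gain_from_alternating


# Payoffs from the article's table: 2 * 3 = 6 is greater than 5 + 0 = 5.
print(is_iterated_prisoners_dilemma(T=5, R=3, P=1, S=0))  # True

# Raising the temptation to 7 breaks the condition: 6 is not greater than 7.
print(is_iterated_prisoners_dilemma(T=7, R=3, P=1, S=0))  # False
```

With T = 7, two players who take turns exploiting each other would average 3.5 points per round, more than the 3 points from mutual cooperation, which is exactly what the condition rules out.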

This table shows the payoffs to you for the various outcomes.

| What you do | Other player cooperates | Other player defects |
| --- | --- | --- |
| Cooperate | Fairly good. REWARD for mutual cooperation. 3 points | Very bad. You lose. 0 points |
| Defect | Very good. TEMPTATION to defect. 5 points | Fairly bad. PUNISHMENT for mutual defection. 1 point |
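As a rough illustration, the table can be written as a lookup and used to score a short sequence of rounds. The names `PAYOFF_TO_YOU` and `score` are made up for this sketch, and "C" and "D" are shorthand for cooperate and defect.

```python
# Payoff to you for (your move, other player's move), taken from the table.
PAYOFF_TO_YOU = {
    ("C", "C"): 3,  # REWARD for mutual cooperation
    ("C", "D"): 0,  # you cooperate, the other player defects
    ("D", "C"): 5,  # TEMPTATION to defect
    ("D", "D"): 1,  # PUNISHMENT for mutual defection
}


def score(your_moves, other_moves):
    """Total points you earn over a sequence of rounds."""
    return sum(PAYOFF_TO_YOU[pair] for pair in zip(your_moves, other_moves))


print(score("CCCC", "CCCC"))  # 12: mutual cooperation in all four rounds
print(score("DCDC", "CDCD"))  # 10: taking turns exploiting each other
print(score("DDDD", "DDDD"))  # 4: mutual defection in all four rounds
```

Over four rounds, steady mutual cooperation (12 points) beats taking turns (10 points), which is what the requirement 2R > T + S guarantees.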

 

The Winning Strategy

Robert Axelrod wrote about the Iterated Prisoner’s Dilemma in his book The Evolution of Cooperation (1984). In it, he reported on a tournament he organized in which the prisoner’s dilemma was repeated for a set number of rounds. In each round, participants had to choose a move again, taking into account the choices the opposing participant had made in previous encounters.

After studying the large number of strategy programs devised for the tournament, Axelrod found that altruistic strategies fared better than selfish strategies in the long run. He used this finding to illustrate a mechanism for the evolution of altruistic behavior.

The winning strategy was tit for tat, developed and entered into the tournament by Anatol Rapoport. The strategy is simple: the player cooperates on the first move and then does exactly what the opponent did on the previous move. Its weakness is that a single defection can be echoed back and forth between the players, leading to a cycle of defections. In that situation, a better strategy is “tit for tat with forgiveness”: instead of reciprocating the other player’s defection every time, the player sometimes chooses to cooperate anyway. Even though the probability of the other player doing the same is relatively low, this can help the players get out of the defection cycle.
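The behavior described above can be sketched in a few lines. This is a simplified illustration, not a reconstruction of the programs entered in Axelrod's tournament: the function names, the `forgive_prob` parameter, and the "suspicious" opponent that opens with a defection are all assumptions made for this example, and the payoffs are the ones from this article's table.

```python
import random

# Payoffs to (player A, player B), using the article's values.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own, opp):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp else opp[-1]


def suspicious_tit_for_tat(own, opp):
    """Hypothetical variant: open with a defection, then copy the opponent."""
    return "D" if not opp else opp[-1]


def make_forgiving_tit_for_tat(forgive_prob, rng):
    """Tit for tat that answers a defection with cooperation with
    probability forgive_prob (an assumed, illustrative parameter)."""
    def strategy(own, opp):
        if not opp or opp[-1] == "C":
            return "C"
        return "C" if rng.random() < forgive_prob else "D"
    return strategy


def play(strategy_a, strategy_b, rounds):
    """Play a match and return both move histories and both scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b), score_a, score_b


# Two tit-for-tat players cooperate throughout.
print(play(tit_for_tat, tit_for_tat, 6))             # ('CCCCCC', 'CCCCCC', 18, 18)

# One opening defection is echoed back and forth for the rest of the match.
print(play(suspicious_tit_for_tat, tit_for_tat, 6))  # ('DCDCDC', 'CDCDCD', 15, 15)

# A player that always forgives (forgive_prob=1.0) breaks the echo at once.
forgiving = make_forgiving_tit_for_tat(1.0, random.Random(0))
print(play(forgiving, suspicious_tit_for_tat, 6))    # ('CCCCCC', 'DCCCCC', 15, 20)
```

In the second match each player ends with 15 points instead of the 18 that steady cooperation would have given. A forgiveness probability of 1.0 is used only to keep the output deterministic; it also shows the cost of forgiving, since the forgiving player finishes behind its opponent.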

Based on the highest-scoring strategies, Axelrod identified several conditions that a strategy must meet in order to be successful.

  • The strategy must be nice, meaning the player will not defect before the opponent does. This is also referred to as an optimistic strategy.
  • A successful strategy cannot be a blind optimist, meaning one that always cooperates. The player must also retaliate in order to avoid being ruthlessly exploited by the other player.
  • Forgiving is another key aspect of successful strategies. This is based on the idea of “tit for tat with forgiveness”. Forgiving prevents a cycle of defection, allowing the participants to gain maximum points.
  • The player must not be envious, that is, must not focus on scoring more points than the opponent, so that the collective points are maximized.

Another strategy in the game of IPD is Pavlov. The basic idea is to repeat the previous move if it won, and to switch if it lost. Pavlov has the upper hand over other strategies when the other participant is applying the same strategy. In that case, given that both players cooperated on the first move, they would continue to cooperate for the rest of the game.
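A minimal sketch of Pavlov follows. It assumes one common reading of "won" and "lost": a round counts as won when the other player cooperated (you earned the reward or the temptation payoff) and as lost when the other player defected. The function names are made up for this example.

```python
def pavlov(own, opp):
    """Win-stay, lose-shift: cooperate first, repeat the last move after
    a win, switch to the other move after a loss."""
    if not own:
        return "C"
    won = opp[-1] == "C"  # assumed meaning of a "win": the opponent cooperated
    if won:
        return own[-1]
    return "D" if own[-1] == "C" else "C"


def continue_match(hist_a, hist_b, rounds):
    """Let two Pavlov players continue from the given move histories."""
    hist_a, hist_b = list(hist_a), list(hist_b)
    for _ in range(rounds):
        move_a = pavlov(hist_a, hist_b)
        move_b = pavlov(hist_b, hist_a)
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)


# Starting from scratch, two Pavlov players cooperate in every round.
print(continue_match("", "", 3))    # ('CCC', 'CCC')

# After a single defection by A, both return to cooperation within two rounds.
print(continue_match("D", "C", 4))  # ('DDCCC', 'CDCCC')
```

The second match shows why two Pavlov players do well together: the exploited player switches to defection, the resulting mutual defection counts as a loss for both, and both then switch back to cooperation.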

Players’ Goal in the Iterated Prisoner’s Dilemma

As a participant, your goal should not be to score higher than the player you are facing, but to achieve a higher overall score than the other players involved in the tournament. This generally means ending up with about the same score as the player you are facing.

In theory, if both participants in a prisoner’s dilemma cooperated every time, they would secure the maximum combined points. However, this is not very practical. According to Axelrod, players are likely to arrive at altruism as the moves are repeated, but the chance of a perfect score is relatively low.

Observation

The optimal strategy for a one-time prisoner’s dilemma is defection, regardless of who the opponent is. In the iterated prisoner’s dilemma, however, the optimal strategy depends on the strategies of the opponents.
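The one-time case can be checked against the payoff values used in this article. The sketch below uses made-up names (`PAYOFF_TO_YOU`, `best_reply`) and simply asks which move earns more against each possible move by the other player.

```python
# Payoff to you for (your move, other player's move), from the article's table.
PAYOFF_TO_YOU = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def best_reply(other_move):
    """Your highest-paying move in a single round against other_move."""
    return max("CD", key=lambda my_move: PAYOFF_TO_YOU[(my_move, other_move)])


print(best_reply("C"))  # D: 5 points beats 3
print(best_reply("D"))  # D: 1 point beats 0
```

Defection is the better reply in both cases, which is why it is optimal when the game is played only once. In the iterated game the opponent's later moves can depend on your earlier ones, so no single move is best in every situation.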
