Memory, Noise, and Relatedness Effect on Iterated Prisoner Dilemma Strategies Behaviour

Essam EL-Seidy; Mohamed Mamdouh Zayet; Hassan El-Hamouly; E. M. Roshdy

Paper Information
Paper Submission

Journal of Game Theory

p-ISSN: 2325-0046 e-ISSN: 2325-0054

2016; 5(1): 16-26

doi:10.5923/j.jgt.20160501.03

Memory, Noise, and Relatedness Effect on Iterated Prisoner Dilemma Strategies Behaviour

Abstract
Reference
Full-Text PDF
Full-text HTML

Essam EL-Seidy¹, Mohamed Mamdouh Zayet², Hassan El-Hamouly², E. M. Roshdy²

¹Department of Mathematics, Faculty of Science, Ain Shams University, Cairo, Egypt

²Department of Mathematics, Military Technical College, Cairo, Egypt

Correspondence to: Essam EL-Seidy, Department of Mathematics, Faculty of Science, Ain Shams University, Cairo, Egypt.

Email:

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

Game theory is an essential tool for a wide range of technologies. It has emerged as a powerful platform for business, economy, biology, military and many other fields. Recently, researchers have shown an increased interest in the behavior of strategies in a game under noise effect with short memory. The main challenge faced by many experiments is the great number of available strategies. So the study offers some important insights into two state automata. In the Iterated Prisoner's Dilemma (IPD) game, we display the interactions between the strategies which its reaction depends on the outcome of the round before the last one (strategies with memory two). With different average relatedness between players, we determine the payoffs matrices and illustrate the behavior of strategies under a probable error in implementation. We found that, Pavlov strategy is one of the superior strategies that can not be invaded.

Keywords: Iterated games, Prisoner's dilemma, Finite automata, Game dynamics, Transition matrix, Perturbed payoff

Cite this paper: Essam EL-Seidy, Mohamed Mamdouh Zayet, Hassan El-Hamouly, E. M. Roshdy, Memory, Noise, and Relatedness Effect on Iterated Prisoner Dilemma Strategies Behaviour, Journal of Game Theory, Vol. 5 No. 1, 2016, pp. 16-26. doi: 10.5923/j.jgt.20160501.03.

Article Outline

1. Introduction

2. Methodology

3. Results and Discussion

4. Conclusions

1. Introduction

There is a growing body of literature that recognizes the importance of game theory. Game theory is frequently prescribed for conflict situations. Rational decision making has been studied by many researchers using the theory of games. Studies of Von Neumann [21] and Nash [20] showed the importance of game theory.

Prisoner Dilemma (PD) could be considered one of the most frequently used example that arises in many applications [7, 24]. PD is a simple form of a two player-game. Each of the players has two choices, one of them is to cooperate C (deny) with the other player and the second is to defect D (confess). Both players are rewarded by R if both of them cooperate. In turn, they are punished by P if both of them defect [18, 22]. In the case of opposite decisions the co-operator is suckered by S and the defector is tempted by T.

The conditions of PD are T > R > P > S, and 2R > T + S. The second condition means that the player who chooses mutual cooperation receives a payoff higher than switching between cooperation and defection [16]. This game can be summerized in the following payoff matrix.

(1)

Game theory applications use groups of strategies for achieving goals. Surveys such that conducted by Rubinstein (1986) showed that the number of possible strategies grows exponentially with the number of rounds in the game [1, 6, 17, 18, 23]. Thus in Iterated Prisoner Dilemma (IPD) game, we have sixteen strategies (since, there exist four possible outcome R, S, T and P in each round). Each strategy action depend on the last round outcome. We denoted by S_k for these strategies of the two players; k = 0, 1, ......, 15. These strategies can be represented by a quadruple (u₁, u₂, u₃, u₄) of zeros and ones. Where u_i denotes the probability to cooperate after outcome R, S, T and P respectively (pure strategies). Thus S₁₅ = (1, 1, 1, 1) is the rule of cooperate in each round (AllC), S₀ = (0, 0, 0, 0) is the rule of defect in each round (AllD) and S₉ = (1, 0, 0, 1) for pavlov (Win Stay – Los Shift (WSLS)). These strategies can be pictured using two states automata [4, 10, 16]. Figure 1 represents the aoutomata of Tit For Tat (TFT).

Figure 1. Automata of TFT strategy

A huge number of strategies is considered a common disorder characterized by a complexity and time consuming. The two state automata [9] plays a crucial role in regulating the huge number of existing strategies. So, the procedures of this study were approved by a finite two state automata.

In this paper, we consider the interactions between the strategies which its reaction depends on the outcome of the round before the last one (strategies with memory two).

There is only a few information about the effects of players’ relationship. So this paper traces the development of strategies’ behavior between the related players by increasing the players' payoff by a given factor. This factor depends on the relation between players. Game between relatives were reported in the first studies of Maynard Smith for the Hawk–Dove game [11, 14, 15].

Noise is an important component in the game theory and plays a key role in infinitely repeated games [18, 25]. Recently, investigators have examined the effects of a noise on (PD) game with memory one (the reaction depends on the outcome of the last round). Questions have been raised about the effect of noise on memory two [12] with relative players. In the pages that follow, it will be argued that the effect of error in implementation will be studied.

In this paper, we will introduce a brief inkling of related players (PD) game. Then the payoff matrix will be computed with different values of relation factors under noise effect using memory two algorithm. Strategies behavior will be finally declared by analyzing the payoffs matrices.

2. Methodology

A case-study approach was adopted to help understanding how relative players cooperate. In recent years, two different approaches arises (personal fitness, inclusive ﬁtness) have attempted into account for the payoff calculations. Personal fitness was prepared according to the procedure used by Grafen [14]. The data of this research were normalized using inclusive ﬁtness approach [15]. Data management and analysis were performed using relatedness factor

Traditionally, inclusive ﬁtness has been assessed by increasing the player’s payoff by r times of the co-player payoff. Essam (2015) [9] identified the payoff matrix as follows:

(2)

Case studies have been long established in (PD) game to present a detailed analysis of strategies behavior. Up till now, the research has tended to focus on memory one (PD) game [11, 14, 15, 18, 25]. This method is particularly useful in the study if the recent previous state is known. Memory two was prepared according to the procedure used by Essam (2016) [12]. Illustration 1 demonstrate how states will be produced using a conflict between S₈ against S₁₁.

Illustration 1. Confliction between S₈ against S₁₁

The sixteen alternatives raised in Illustration 1 yield only three average payoffs (regimes) as in Table 1.

Table 1. Regimes of S₈ against S₁₁

In this investigation, there are several sources of error that may be occured. Error in implementation is the most frequent error. When error occurs, it may change the payoff of any rounds. Markov process is the main non-invasive method used to determine the transition matrix between available payoffs

and

The first step in this process is to identify a couple of four quadruples P = (p₁, p₂, p₃, p₄) and Q = (q₁, q₂, q₃, q₄). P and Q represents the probability of playing C after the four possible outcomes

and

respectively. Once the four quadruples were constructed, the transition matrix takes place as follows:

(3)

The previous matrix is of interest because it has a unique left eigenvector

for eigenvalue 1. The eigenvector Π is the unique stationary distribution. Recent evidence suggests that stationary distribution is the probability to play C after every outcome if the game is repeated to infinity. Previous studies have reported that the perturbed payoff for player P against Q can be calculated by the following equation.

(4)

Since p_i and q_i are zeros or ones, then in some cases, the stochastic matrix (2) will contain many zeros, and it is no longer irreducible. Therefore, the vector π is no longer uniquely deﬁned.

Existing research calculates perturbed payoffs by a more simple, practical and direct approach. In the paragraph that follows, every regime of S₈ against S₁₁ will be studied separately.

Thus the corresponding transition matrix from one regime to another will be as follows.

(5)

The triples

is the corresponding stationary distribution for perturbed S₈ against S₁₁. The perturbed payoff will be calculated by the following equation.

(6)

This algorithm will be repeated for every two strategies.

Domination is the main invasive method to determine the strongest stable strategy. Does

dominates

To answer this question, firstly four payoff values of

and

must be recalled. Then these payoffs are used to form a new 2×2 matrix.

(7)

Then if

and

with at least one inequality being strict this means that

dominates

[17]. This algorithm will be repeated for each two strategies.

3. Results and Discussion

This set of analysis examines the impact of each strategy under the effect of average relatedness. This analysis takes place for different values of T, R, P, S, and r (Axelrod's values (T = 5, R = 3, P = 1, S = 0), Chicken game values (T = 1, R = 0, P = -10, S = -1), (T = 1, R = 5, P = 3,S = 0)) and is significant at r = 0.0001, and r = 0.999. So Table 8, Table 9, and Table 10 present an overview of these three cases. Every case has two states as follows:

(1) The two states for Axelrod's values:

i. One interesting finding for small average relatedness (r = 0.0001) at the first column in Table 8 is that every strategy is invaded by at least two strategies except S₉. Strategy S₉ (WSLS) not only neither strategy can invade it but also it outcompetes the greatest number of strategies (exactly twelve). This means that stategy S₉ can be considered a strong strategy as in memory one. In addition, it is important to point out that S₀ (AllD) and S₈(Grim) are also strong strategies which defeat eleven strategies. However S₀ and S₈ are strong strategies, they can't abide in front of S₁₀ (TFT), which does well in front of defective strategies. On the other hand S₆ (idiot) is an inactive strategy, which is defeated by twelve strategies and doesn't invade any strategy. It's also apparent that cooperative strategies such as S₇, S₁₄, and S₁₅ are weak strategies which are invaded by eleven strategies at least. These results suggest that for small average relatedness defective strategies dominate cooperative ones.

ii. The second column of Table 8 illustrates strategies’ behavior for large average relatedness (r = 0.999). From this data, it can be seen that S₉ and S₁₅ are superior strategies, as they don't permit any other strategy to rival them. Also, they put at least thirteen strategies down. In contrast to superior strategies, there are dormitive strategies S₀, S₆, and S₇. These strategies are impermissible to beat any strategy. Also, there are twelve strategies that can crush S₆. Overall, these results indicate that defective results fail in front of cooperative ones at large average relatedness.

(2) Chicken game states:

Surprisingly, no differences were found in strategies’ behavior between small and large average relatedness for all strategies in this case. Despite this, little progress has been made in the behavior of some strategies such as S₁₁ (Tweedledee). Strategy S₁₁ attacks nine strategies and forbids all other strategies to beat it. However S₁₁ is a strong strategy, strategy S₉ shows better results. Strategy S₉ attacks five strategies more than S₁₁. This means that S₉ conquer all other strategies excep its strong competitor S₁₁. Turning now to incompetent strategies such S₀, S₁, S₆, and S₇. Like the two states of Axelrod's values, S₆ is the poorest strategy. Together these results provide important insights into the preponderance of cooperative strategies in the chicken game.

(3) States of (T = 1, R = 5, P = 3, S = 0):

In this case, there is no significant difference in strategies invasion between small and large r for all strategies except for S₁₀. There are five strategies offense S₁₀ for small r and six strategies for large r. Strategy S₁₀ is invaded by S₆ for large r only. This finding was unexpected because S₆is always an inactive strategy for the two previous cases. Moreover, S₆ incommoding other three strategies and no strategy can wreaks it. As before S₉is the dominant which rashes six strategies. The evidence presented in this paragraph suggests that cooperative strategies scent defective ones.

Table 2. Payoff Matrix of Axelrod's Values (T = 5, R = 3, P = 1, S = 0) and r = 0.0001

Table 3. Payoff Matrix of Axelrod's Values (T = 5, R = 3, P = 1, S = 0) and r = 0.999

Table 4. Payoff Matrix of Values (T = 1, R = 0, P = -10, S = -1) and r = 0.0001

Table 5. Payoff Matrix of Values (T = 1, R = 0, P = -10, S = -1) and r = 0.999

Table 6. Payoff Matrix of Values (T = 1, R = 5, P = 3, S = 0) and r = 0.0001

Table 7. Payoff Matrix of Values (T = 1, R = 5, P = 3, S = 0) and r = 0.999

Table 8. Dominating Strategies for Different Values of T, R, P, S, and r

Table 9. Number of Strategies which Outcompete (Lose (L)) S_i or Outcompeted (Win (W)) by S_i

Table 10. Difference between Number of Strategies that Outcompete S_i and Outcompeted by S_i

4. Conclusions

The present study was designed to determine the effect of relatedness on strategies’ behavior between players of IPD game with memory two under noise effect. Only strategies generated by a finite two-state automata are used. The state before the last one is used to generate the new state instead of the immediately previous one. The outcome of the state before the last one is used due to the delay or lack of knowledge of the immediately previous one. This generation of states is affected by an error in implementation. If there is a relation between the players, any player can donates a fraction of its payoff to the other player. This fraction r takes values from zero to one. The payoff matrix is constructed and analyzed for different values of T, R, P, S, and r.

Multiple regression analysis revealed that S₉ has excellent performance. What is surprising is that there is no strategy that can withstand in front of S₉ for all cases in Table 8. It is also apparent from Table 9 and Table 10 that S₉over press sixty-five opponents. So this study has identified that if any application can be modeled and played under the same conditions, it is advisable to use S₉ anyway.

One of the more significant findings to emerge from this study is to avoid playing with strategy S₆. Strategy S₆ renders in front of most strategies. Data from Table 9 reveal that S₆ can't be standing in front of forty-two strategies. Also, Table 10 illustrates that the difference between the numbers of strategies that outcompete S₆ andoutcompeted byS₆ is 35. In general, it seems that the player who plays with S₆ will be the loser.

References

[1]	D. Abreu, A. Rubinstein, The structure of Nash equilibrium in repeated games with finite automata, Economy. J. Econom. Soc. (1988) 1259–1281.
[2]	R.J. Aumann, Survey of repeated games, Essays Game Theory Math. Econ. Honor Oskar Morgenstern, Bibliogr. Institute, Mannheim/Wien/Zurich, (1981) 11–42.
[3]	R. Axelrod, The Evolution of Cooperation, Basic Books, New York. Econometrica. 39 (1984) 383–396.
[4]	R. Axelrod, W.D. Hamilton, The evolution of cooperation, Science (80-. ). 211 (1981) 1390–1396.
[5]	J.S. Banks, R.K. Sundaram, Repeated games, finite automata, and complexity, Games Econ. Behav. 2 (1990) 97–117.
[6]	K.G. Binmore, L. Samuelson, Evolutionary stability in repeated games played by finite automata, J. Econ. Theory. 57 (1992) 278–305.
[7]	R. Cressman, Evolutionary dynamics and extensive form games, MIT Press, 2003.
[8]	Z.K. E. El-Seidy, The perturbed payoff matrix of strictly alternating prisoner’s dilemma game, Far Fast J. Appl. Math. 2. 2 (1998) 125–146.
[9]	E.-S. Essam, The effect of noise and average relatedness between players in iterated games, Appl. Math. Comput. 269 (2015) 343–350.
[10]	E.-S. Essam, E.M. Elshobaky, K.M. Soliman, Two population three-player prisoner’s dilemma game, Appl. Math. Comput. 277 (2016) 44–53.
[11]	E.-S. Essam, S.H. Salah El Din, M.A. Ali, On The Behavior of Strategies in Iterated Games Between Relatives, (n.d.).
[12]	E.-S. Essam, M.M. Zayet, H. El-Hamouly, E.M. Roshdy, The Effect of Memory Change in Iterated Prisoner Dilemma Strategies Behaviour, J. Game Theory. 5 (2016) 1–8.
[13]	D. Fundenberg, E. Maskin, Evolution and cooperation in noisy repeated games, Am. Econ. Rev. (1990) 274–279.
[14]	A. Grafen, The hawk-dove game played between relatives, Anim. Behav. 27 (1979) 905–907.
[15]	W.G.S. Hines, J.M. Smith, Games between relatives, J. Theor. Biol. 79 (1979) 19–30.
[16]	L.A. Imhof, D. Fudenberg, M.A. Nowak, Tit-for-tat or win-stay, lose-shift?, J. Theor. Biol. 247 (2007) 574–580.
[17]	K. Lindgren, M.G. Nordahl, Evolutionary dynamics of spatial games, Phys. D Nonlinear Phenom. 75 (1994) 292–309.
[18]	K.S. Martin A. Nowak Essam El-Sedy, Automata, repeated games and noise, J. Math. Biol. 33 (1995) 703–722.
[19]	F. and Maskin, Evolution and repeated games, Mimeo. (2007).
[20]	J. Nash, Non-cooperative games, Ann. Math. (1951) 286–295.
[21]	J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ. (1944).
[22]	A. Rapoport, A.M. Chammah, Prisoner’s dilemma: A study in conflict and cooperation, University of Michigan press, 1965.
[23]	A. Rubinstein, Finite automata play the repeated prisoner’s dilemma, J. Econ. Theory. 39 (1986) 83–96.
[24]	L. Samuelson, Evolutionary games and equilibrium selection. Series on economic learning and social evolution, (1997).
[25]	B.M. Zagorsky, J.G. Reiter, K. Chatterjee, M.A. Nowak, Forgiver triumphs in alternating Prisoner’s Dilemma, (2013).

Paper Information

Journal Information

Memory, Noise, and Relatedness Effect on Iterated Prisoner Dilemma Strategies Behaviour

Article Outline

1. Introduction

2. Methodology

3. Results and Discussion

4. Conclusions

References