
Negative reciprocity: The coevolution of memes and genes

Daniel Friedman, Nirvikar Singh

1. Introduction

2. The underlying game

3. The viability problem

4. Group structure

5. Elements of the model

6. Results

7. Discussion

Acknowledgment

Appendix. 

A.1. Notation

A.2. Alternative loss functions

A.3. Alternative assumptions about status

A.4. Probabilities of cooperation

A.5. The individual optimum and the encounter function

References

Copyright

1. Introduction

Negative reciprocity is the act of harming those who wrong us. It is often accompanied by powerful emotions of anger that motivate us to harm culprits even at some cost to ourselves.

Negative reciprocity complements positive reciprocity, the helping of those who have helped us. The folk theorem of game theory, which applies to repeated interactions, explains positive reciprocity as an individually rational (indeed, a subgame perfect) way to support efficient exchange, as long as the discount factor exceeds the ratio of personal cost to social benefit. As explained below, negative reciprocity can further increase social value in two ways: it can support efficient exchange even when the discount factor is low, as for example when repeat interaction is sporadic, and it can deter opportunistic behavior that would undermine positive reciprocity. However, as far as existence is concerned, it is beside the point whether negative reciprocity is helpful or harmful to society. The crucial theoretical issue from an evolutionary perspective is whether vengeful traits convey a selective advantage. It would appear that the answer is “no”: we will show, in a stylized analysis that captures the essence of cooperation dilemmas, that negative reciprocity is weakly dominated by (i.e., never yields a higher payoff than) otherwise similar behavior that shirks on the personal cost. Therefore, it is a theoretical puzzle how negative reciprocity ever established itself in the repertoire of human motives, and how it sustains itself. Until the puzzle is solved, theory will offer no guidance on how negative reciprocity might be regulated to increase its social value and to reduce its devastation.

In this paper, we offer an evolutionary account of negative reciprocity in humans. Our definition restricts reciprocity, positive or negative, to social creatures that have the capacities to identify and recall the earlier behavior of specific individuals, and to reward or punish them contingent on earlier behavior. The account we offer also requires cultural transmission of codes of behavior. We do not explore the extent to which our model might apply to nonhuman species with these capacities.

Our account draws on the concepts of both selfish genes and cultural memes. Dawkins (1982) defines a meme as “the unit of information that is conveyed from one brain to another during cultural transmission.” Our concern is with memes that pertain to the group rather than to an individual, such as the routines and norms within a business corporation (Nelson & Winter, 1982). For general discussions of memes and social transmission, see Blackmore (1999, 2000) and Dawkins (1976), and comments on the former.

Our account of negative reciprocity starts with a standard normal form game that captures, simply and directly, the idea of a personal cost incurred to reap social gains. The game illustrates how a preference for negative reciprocity realigns incentives and supports a socially efficient equilibrium, but demonstrates that negative reciprocity is itself evolutionarily problematic.

After discussing earlier treatments of the problem in Section 3, we propose an evolutionary model with individual learning and evolution as well as meme selection for groups. In Section 4 we argue that groups can use low-cost sanctions (or simply status changes) to enforce a particular norm on the proper degree of negative reciprocity. Section 5 assembles the elements of a simple model, and Section 6 derives the main results. Actual behavior typically will fall short of the norm, but selection across groups will adjust the norm so that actual behavior maximizes the fitness of group members, and the free-rider problem is overcome. Following a concluding discussion, Appendix A shows that the main conclusions survive the relaxation of many simplifying assumptions.

2. The underlying game

We begin by demonstrating how a preference for negative reciprocity can convert a standard Prisoner's Dilemma (PD) problem to a simple coordination problem with a Pareto efficient equilibrium. The idea is that, given a motive for negative reciprocity, cooperative behavior is no longer dominated and can become part of a Nash Equilibrium (NE), even without repeat interaction. Our subsequent analysis builds on this game, which captures in simple terms the conflict between social efficiency and individual self-interest.

The basic underlying game is a symmetric two-player PD with a cooperator payoff of 1, a temptation payoff of 2, a sucker payoff of −1, and an all-defect payoff of 0 (Table 1). In other words, the benefits of full cooperation of 2 are evenly split and the benefit of one-sided cooperation of 1 is very unevenly split at (2, −1), relative to the no-cooperation payoff, which is normalized to (0, 0).

Table 1.

The underlying PD game: fitness without negative reciprocity

(ν=0)    C         D
C        1, 1      −1, 2
D        2, −1     0, 0

Payoffs so far are material, and describe both fitness and utility. The social dilemma is that there is a personal cost of one unit to choosing the cooperative strategy, but it produces a social gain, also of one unit. The game has a unique NE in which each player chooses the dominant strategy D and achieves fitness 0. The choices of specific payoffs are intended only to simplify the algebra and exposition; essentially the same results hold for other fitness payoffs satisfying the usual PD inequalities: temptation>cooperation>all defect>sucker; and temptation+sucker<2×cooperation.
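The claims about Table 1 can be checked mechanically. The following is a minimal sketch, not part of the original analysis; the names `PAYOFF`, `row_payoff`, `is_strictly_dominant` and `satisfies_pd_inequalities` are illustrative assumptions.

```python
# Fitness payoffs from Table 1, keyed by (row move, column move);
# each value is (row player's fitness, column player's fitness).
PAYOFF = {
    ("C", "C"): (1, 1),
    ("C", "D"): (-1, 2),
    ("D", "C"): (2, -1),
    ("D", "D"): (0, 0),
}


def row_payoff(own, other):
    """Fitness of the row player who plays `own` against `other`."""
    return PAYOFF[(own, other)][0]


def is_strictly_dominant(move):
    """True if `move` does strictly better than the alternative against both opponent moves."""
    alternative = "D" if move == "C" else "C"
    return all(
        row_payoff(move, other) > row_payoff(alternative, other)
        for other in ("C", "D")
    )


def satisfies_pd_inequalities(temptation, cooperation, all_defect, sucker):
    """The usual PD ordering plus temptation + sucker < 2 * cooperation."""
    return (
        temptation > cooperation > all_defect > sucker
        and temptation + sucker < 2 * cooperation
    )
```

With the payoffs of Table 1, `is_strictly_dominant("D")` holds and `satisfies_pd_inequalities(2, 1, 0, -1)` holds, which is the sense in which all-D is the unique NE of the game even though all-C yields higher fitness for both players.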

To this underlying game, we add a punishment technology and a punishment motive with parameter ν, which (as we shall soon see) is the incurred cost. We hypothesize that a player can inflict harm (fitness loss) h on the other player at personal fitness cost ch. The marginal cost c is a constant parameter between 0 and 1 that captures the technological opportunities for punishing others. We also hypothesize that inflicting harm h yields the player a utility bonus of ν ln h (but no fitness bonus) when he is the victim of the sucker payoff and no bonus in other circumstances. Thus, the motive is not spite (e.g., Hirshleifer, 1987, Levine, 1998), but rather is revenge for damage personally experienced, and so the action taken by the victim involves negative reciprocity. The motivational parameter ν is subject to evolutionary forces and is intended to capture an individual's temperament, e.g., his susceptibility to anger (Frank, 1988, Hirshleifer, 1987).

The objective function for the victim of a sucker payoff with motivational parameter ν is therefore ν ln h − ch − 1. The utility-maximizing degree of negative reciprocity h* to inflict on a defector is the unique solution of the first-order condition 0 = ν/h − c, so h* = ν/c is the inflicted damage. Hence, ch* = ν, and the motivational parameter also becomes the incurred cost. Utility in this case is ν ln(ν/c) − ν − 1, while fitness is just −ν − 1. The game now has the same fitness payoffs as before on the main diagonal, but the sucker payoff is reduced by the cost of negative reciprocity, and the temptation payoff is reduced by the amount of harm inflicted, as in Table 2.
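The victim's optimization can be illustrated numerically. This is a minimal sketch under the stated functional forms (utility ν ln h − ch − 1, fitness −ch − 1); the function names and the parameter values in the usage note are illustrative assumptions, not taken from the paper.

```python
import math


def victim_utility(h, nu, c):
    """Utility of a sucker-payoff victim who inflicts harm h > 0: nu*ln(h) - c*h - 1."""
    return nu * math.log(h) - c * h - 1.0


def optimal_harm(nu, c):
    """Solution of the first-order condition 0 = nu/h - c."""
    return nu / c


def victim_fitness(nu):
    """Fitness at the optimum: the sucker payoff -1 minus the incurred cost c*h* = nu."""
    return -nu - 1.0
```

For example, with the assumed values ν = 0.6 and c = 0.3, `optimal_harm` returns a harm of 2, the incurred cost ch* equals ν, and `victim_utility` is lower at nearby harm levels such as 1.9 or 2.1.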

Table 2.

The modified PD game: fitness with negative reciprocity

(ν>0)    C                  D
C        1, 1               −1−ν, 2−ν/c
D        2−ν/c, −1−ν        0, 0

For ν>c, the transformed game no longer has D as a dominant strategy. When population fraction s plays C, the expected fitness of C is W(C)=s−(1+ν)(1−s) and the expected fitness of D is W(D)=(2−ν/c)s. The two expressions are equal at s*=(1+1/ν)/(1+1/c). For s<s* the expected fitness is higher for D and we can expect cooperation to disappear as play converges to the inefficient (fitness 0) all-D equilibrium, as in the basic game. But for s>s* the expected fitness is higher for C and we can expect negative reciprocity to drive out defection, resulting in the Pareto efficient all-C equilibrium. Thus, for ν>c we have a coordination game which has two locally stable pure Nash equilibria and an unstable mixed NE at s*<1, as illustrated in Fig. 1. These statements are true under any plausible evolutionary dynamics, in particular, compatible or monotone dynamics (Friedman, 1991, Weibull, 1995).
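The threshold s* and the sign pattern of the fitness advantage can be checked with a few lines of code. The sketch below simply transcribes the expressions for W(C), W(D) and s* given above; the function names and the sample parameter values are illustrative assumptions.

```python
def fitness_C(s, nu):
    """Expected fitness of a cooperator when a fraction s of the population plays C."""
    return s - (1 + nu) * (1 - s)


def fitness_D(s, nu, c):
    """Expected fitness of a defector when a fraction s of the population plays C."""
    return (2 - nu / c) * s


def advantage(s, nu, c):
    """A(s) = W(C) - W(D); positive values favor cooperation."""
    return fitness_C(s, nu) - fitness_D(s, nu, c)


def threshold(nu, c):
    """Population share s* at which C and D earn the same expected fitness."""
    return (1 + 1 / nu) / (1 + 1 / c)
```

With the assumed values ν = 1 and c = 0.5, `threshold` gives s* = 2/3; `advantage` is negative below that share and positive above it. For ν ≤ c (for instance ν = 0.25, c = 0.5) the computed threshold is at least 1, so no interior tipping point exists.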



Fig. 1. The advantage of cooperating. The fitness advantage A(s)=W(C)−W(D) is graphed as a function of the population fraction s playing C for two values of the negative reciprocity parameter ν. The graph of A rotates counterclockwise as ν increases.


Note that efficient all-C behavior can also be sustained as a repeated game NE even in the original (ν=0) version if culprits can be detected and identified, and if all players have discount factors that exceed 0.5, by using standard tit-for-tat or similar punishment strategies. But it may well be the case that repeat meetings are infrequent or culprits are hard to track, so the discount factor is too small to sustain the efficient outcome. Thus, the anticipation of negative reciprocity can support efficient social outcomes that cannot be sustained by standard repeated game strategies.
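The 0.5 threshold can be recovered with a short worked calculation. The sketch below assumes a grim-trigger strategy (cooperate until the partner defects, then defect forever) as one representative of the "similar punishment strategies"; the symbols $\delta$, $V_{\text{cooperate}}$ and $V_{\text{defect}}$ are introduced here for illustration only.

```latex
% Payoffs are those of Table 1 (\nu = 0); \delta is the common discount factor.
\begin{align*}
V_{\text{cooperate}} &= 1 + \delta + \delta^{2} + \cdots = \frac{1}{1-\delta}, \\
V_{\text{defect}}    &= 2 + 0\cdot\delta + 0\cdot\delta^{2} + \cdots = 2, \\
V_{\text{cooperate}} \ge V_{\text{defect}}
  &\iff \frac{1}{1-\delta} \ge 2
   \iff \delta \ge \tfrac{1}{2}.
\end{align*}
```

The threshold equals the ratio of the personal cost of cooperating (1) to the social benefit it produces (2), in line with the folk-theorem condition stated in the Introduction.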

3. The viability problem

There is a gap in the argument so far. The motivational parameter ν is itself subject to evolutionary forces, albeit perhaps slower forces than those determining the prevalence s of cooperation. Recall that the expected fitness of a cooperator is W(C|s, ν)=2s−1−ν(1−s), which is a strictly decreasing function of ν for any fixed s<1. Only when there are no culprits left to punish at s=1 is the expected fitness independent of ν. Assuming that players occasionally encounter culprits (an assumption we shall develop later), player ν′ is fitter than player ν whenever 0<ν′<ν. Therefore, the parameter ν will be driven towards 0 under any plausible evolutionary dynamics. We have a variant of the classic free-rider or chiseling problem, and it seems that negative reciprocity is not viable.
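The free-rider logic can be made concrete with the fitness expression just given. This is a minimal sketch; the function names and the sample values are illustrative assumptions.

```python
def cooperator_fitness(s, nu):
    """W(C | s, nu) = 2s - 1 - nu(1 - s), as stated in the text."""
    return 2 * s - 1 - nu * (1 - s)


def gain_from_shirking(s, nu_high, nu_low):
    """Fitness advantage of the less vengeful cooperator (nu_low) over the more vengeful one."""
    return cooperator_fitness(s, nu_low) - cooperator_fitness(s, nu_high)
```

For instance, at s = 0.8 a cooperator with ν′ = 0.2 is fitter than one with ν = 0.5 by (0.5 − 0.2)(1 − 0.8) = 0.06, while at s = 1 the gain is exactly zero because no culprits remain to be punished.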

Existing literature offers several possible avenues for escaping the viability problem. Prominent among them is inclusive fitness (Haldane, 1955, Hamilton, 1964). The viability problem is attenuated for social creatures that interact with close genetic relatives, such as slime molds (index of relatedness r=1−ε) or ants and bees (r up to .75). But we are interested in humans, who often interact with others who are not necessarily closely related (say on average r=0 to .25). Hence, for our purposes this avenue is unpromising.

Friedman and Singh (1999, 2004) discuss a variety of other proposed avenues. Some, such as weakened notions of evolutionary stability, and mutation constraints that preclude intermediate levels of the trait or chain the trait to some adaptive trait, play no role in the subsequent analysis. Other proposed avenues, however, relate to our proposed solution. First, perhaps individuals with higher values of ν encounter D play less frequently (e.g., Frank, 1987). Harrington (1989) points out the importance of observability; we shall focus on observability at the group level rather than at the individual level. Second, the personal cost of negative reciprocity (c in our model) might be zero, or even negative if looting is possible or in some forms of repeated play (e.g., Guttman, 2003, Rosenthal, 1996). We shall focus on one-off encounters outside the group, where c is positive, but we also consider low-cost technologies for disciplining members within a group.

Third, one can impose some sort of group selection. The idea goes back at least to Darwin (1871): “A tribe including many members who…were always ready to aid one another, and to sacrifice themselves for the common good would be victorious over most other tribes; and this would be natural selection.” The idea has proved controversial (e.g., Alexander, 1987, Sober & Wilson, 1998, Trivers, 1985, Wynne-Edwards, 1962). Our focus on group traits is related to recent work on cultural group selection (e.g., Boyd et al., 2003, Gintis et al., 2003). Finally, one can consider higher-order punishment (punish those who do not punish D players, etc.; e.g., Henrich & Boyd, 2001) and third-party punishment (e.g., Nowak & Sigmund, 1998, Sugden, 1986; but see also Leimar & Hammerstein, 2000); neither solves the viability problem for encounters outside the group, but both reinforce our view of enforcement within the group.

4. Group structure

How do humans overcome the viability problem? Our core idea is that groups discipline their members. During the vast majority of its evolutionary history, Homo sapiens, like other social primates, presumably lived in small groups of individuals who interacted with other group members on a daily basis. Within the group, everyone knows everyone else, and several devices are available to enforce the all-C equilibrium. Tit-for-tat and related repeated game strategies work well because repeat interaction is reliable and frequent (e.g., Sethi & Somanathan, 2003); third-party and higher-order punishment strategies become feasible; and reputations for vengeful behavior can be established with one's fellow group members. While these devices for disciplining behavior are not perfect, they do suggest that D behavior will be relatively rare within well-functioning groups.

How about interactions with individuals in other groups? Depending on the setting, a member of a given group may encounter a specific nonmember only rarely, but, aggregating across all other groups and their members, such encounters could lead to significant fitness differences (Black-Michaud, 1975, Fehr & Henrich, 2003, O'Kelley & Carney, 1986). An individual who somehow could induce strangers to play C would do much better than one who (correctly or incorrectly) anticipates D play. Unfortunately, an individual in a cross-group encounter cannot reliably signal her true ν because outward signs can be mimicked at low cost, nor (due to the large numbers of sporadic personal encounters) can she easily establish a personal reputation for her true ν. It is much more plausible that her group can establish a reputation which would determine the outcome of the interaction. For example, if one of the authors met a stranger on a train in India, the stranger might try to ascertain the author's family village and his last name, as ways of assigning him to a group with a particular reputation. The questioner is likely to find such information more useful than personal details, which are easier to disguise.

Our concern here is with the social norms maintained by a group, and with their enforcement and evolution. All known human groups maintain social norms that prescribe appropriate behavior towards fellow group members, and typically prescribe different appropriate behavior towards individuals outside the group (Sober & Wilson, 1998). For example, Nisbett and Cohen's (1996) “culture of honor” prescribes that a person responds with violence or the threat of violence to any insult or perceived affront; Nisbett and Cohen studied the American South, but their findings align well with the anthropological literature on many other societies (e.g., Black-Michaud, 1975, Farb, 1978, Galaty & Bonte, 1991, Gilmore, 1991, Lowie, 1954, Peristiany, 1965). Pettigrew (1975) describes the culture of honor for North India's Jats (herders, originally from Central Asia, who have become settled farmers over time) as follows:

Relationships of extreme friendship and hostility between families were actively involved with the philosophy of life embodied in the concept of izzat—the complex of values regarding what was honourable.… That aspect of izzat according to which the relationships between families were supposed to be ordered emphasized the principle of equivalence in all things, i.e., not only equality in giving but also equality in negative reciprocity. Izzat was in fact the principle of reciprocity of gifts, plus the rule of an eye for an eye and a tooth for a tooth…Izzat enjoined aid to those who had helped one. It also enjoined that revenge be exacted for personal insults and damage to person or property. (p. 58)

How might a group enforce a social norm like izzat? The vengeance technology already introduced could, of course, be used to punish norm violators within the group. But groups have at least two other, lower-cost punishment technologies not available to individuals. First, members may choose to interact less frequently with norm violators, i.e., partial shunning. Norm violation may lead group members to regard the violator as less reliable, and therefore they will often prefer (and believe it to be in their material interest) to choose an alternative partner. Shunning reduces the overall fitness in the group because some opportunities for mutual gains are not fully realized. But the cost falls mainly on the violator, because the shunner can find the next best alternative partner.

Second, and for an even lower cost, the group may lower the status of a norm violator. Of course, status generally depends on individual traits of all sorts, including age, sex, height, strength, birth order, and parental status. But it is reasonable to postulate that, other things equal, an individual will have higher status when his behavior better upholds the group's norms (again see Nisbett & Cohen 1996). Status matters because it affects resource allocation. Groups allocate many resources; depending on the context, these might include marriage partners, home sites, and access to fishing holes or plots of land. Status is a device for selecting among the numerous allocation equilibria: the higher status individuals get the first choice on available home sites, desirable marriage partners tend to prefer higher status suitors, etc. (e.g., see MacDonald, 1994, on Jewish society in 13th century Spain, or Nisbett & Cohen, 1996, on the American South, past and present). The model introduced below uses a single parameter, a, to measure the sensitivity of fitness to status combined with the sensitivity of status to behavior.

Enforcement could affect the fitness of nondeviators as well as deviators. Indeed, since status is relative, a decrease in one individual's status will increase the status of others and hence increase their fitness. Catanzaro (1992) makes precisely this point regarding the Sicilian Mafia: “…the men who usurped honor did so at the expense of others who stood to lose it to the same degree… Ultimately, honor has been described as a system of stratification [by Davis, 1980]…” (pp. 46–47).

The combination of a group's relevant social norms and their enforcement devices is referred to below as the group's meme. The meme pertains to the group rather than to its individual members. For example, the membership of a street gang might turn over while its meme (e.g., its dress style, graffiti logos, or combat codes of conduct) remains constant. Conversely, the group's meme could evolve with constant membership via mechanisms ranging from imitating more successful groups to conquest.

How do group memes evolve? We will assume that a given meme becomes more prevalent when it brings higher average fitness to its group members than do alternative memes. Such monotone dynamics are consistent with many specific mechanisms of meme preservation and transmission, which can include various kinds of communication and reinforcement behavior within the group (see, e.g., Boyd & Richerson, 1990, Durham, 1991, Nisbett & Cohen, 1996, Weingart et al., 1997). We do not assume, like Wilson (1980), that genes always hold memes on a “short leash” that allows only minor short-run deviations from genetic fitness, but simply that the short leash is a reasonable approximation in the present case, group norms concerning negative reciprocity.

5. Elements of the model

We now specify elements of a model in which group memes for negative reciprocity coevolve with individual characteristics. A complete specification of a group's meme would include prescriptions for proper behavior towards culprits and cooperators within the group, and possibly different behavior towards culprits and cooperators outside the group, together with enforcement devices. We have already noted that the group has many available devices for ensuring good levels of cooperation within the group, and cooperation outside the group is not at issue. Our focus is the prescription for outgroup culprits and the enforcement of the prescription.

Hence, we summarize the relevant memes using two parameters: νn for the group's normative level of negative reciprocity outside the group, and a for the rigor with which the group enforces that norm. For example, Izzat applied to the basic game calls for h=2, since the culprit causes a loss of 2 (relative to the cooperative outcome of 1) and therefore rather strict enforcement of the norm νn=2c is enjoined.

Enforcement is modeled by a loss function ρ(x), where x=νn−ν is the deviation of an individual's vengeful behavior from the group norm. The group imposes an expected fitness loss ρ on a deviator by lowering that individual's status or reputation within the group. The idea is that the deviation sometimes will be observed by another member of the group and gossip will spread the news. The simplest possible quadratic specification is ρ(x; a)=x²/(2a), where enforcement is more rigorous the smaller the parameter a>0. Recall that norm enforcement may also affect the fitness of nondeviators. Let R denote the fitness increment (zero or negative) an individual receives due to the deviations of other group members from the normative level νn. In the special case of enforcement by changes in relative status, R will exactly offset the loss associated with the enforcement function, ρ.

The other side of the coevolution model specifies the individual traits. Each individual is characterized by two parameters: his actual negative reciprocity level ν, and the maximum possible value νmax that any meme could induce. The capacity for feeling anger and expressing it by damaging others as summarized in νmax may well be genetically transmitted, but the actual ν of an individual is best regarded as developmentally labile.

A few remarks are in order about fitness, monotone dynamics and time scales. We shall assume that individual levels of ν adjust rapidly within [0, νmax]; the idea is that people learn and accommodate to the group's meme within a relatively short period, possibly only weeks or months. Memes also adjust, but in the medium run of years to decades. By definition, νmax is innate, but it, too, can change in the long run, over several generations. Thus, for simplicity we assume that, at any given time scale, only a single (scalar) variable is adapting. With the assumption of monotone dynamics, the direction of change is immediate from the definition of fitness: values of ν that bring higher fitness become more prevalent in the population at the expense of values that bring lower fitness.

The last element of our model incorporates the idea that external reputation is carried by the group as a whole, and defines the frequency f with which an individual encounters culprits. Consider a group of individuals with average negative reciprocity level ν̄>c. Outsiders on average have an unbiased estimate of ν̄ (they make no systematic errors in perceiving an individual's group affiliation or the group's reputation), but have no other credible information regarding any specific group member. It is intuitive that a group with a reputation for higher levels of negative reciprocity will deter more outsiders from choosing D and thus its members will experience lower f. Appendix A confirms this intuition, and derives a smooth decreasing encounter function f(ν̄). Here we take the function f as exogenous and note that it will be shifted by changes in the group's environment, including the composition of neighboring groups: this is therefore a partial equilibrium approach. A convenient parameterization is f(ν̄)=exp(−ν̄/b).

The next section derives the uniform level νo that is optimal for the group given the encounter function f(ν̄). Derivation of νo is conceptually and technically straightforward, but its relevance is not immediately obvious, due to the basic viability problem. We will show that νn mediates a close connection of νo to the individual optimum and hence to the group average ν̄. Appendix A begins by listing the definitions of the key variables.

6. Results

Here we work with the simple parameterizations of the fitness loss function ρ(x) and the encounter function f(ν̄) introduced in the previous section, leaving generalizations to the Appendix. Recall that a proportion f(ν̄) of encounters with outsiders are defections, yielding direct payoff −1 together with losses ν due to costly negative reciprocity and ρ due to deviating from the group norm. Encounters with cooperators [proportion 1−f(ν̄)] yield fitness payoff 1, so the individual's expected fitness is

W(ν | ν̄, νn) = R + [1 − f(ν̄)] − f(ν̄)[1 + ν + ρ(νn − ν)],

where R is the base-level fitness including the (positive) effect on one's status from other group members' deviations from the norm νn. This expression does not allow for the possibility that the individual ever plays D, but this omission is harmless (see Appendix A). The intuition is that the vengeance parameter affects own fitness when one cooperates but not when one defects, because defectors are never suckers. (More formally, terms that capture the own-effects of playing D are independent of ν, and hence have no impact in our derivations.) Also, recall from the previous section that in the pure status case, R cancels the mean contribution of ρ. Hence, in this case the group's average fitness is simply

Wg(ν̄) = 1 − f(ν̄)(2 + ν̄).

The first result shows that short-run learning dynamics will drive ν and hence ν̄ toward some individually optimal level ν*. Dynamics are assumed to operate at a time scale where νn and a are constant: indeed, this defines the concept of the short run.

Proposition 1.

In short-run equilibrium, ν=ν̄=ν*=[νn−a], truncated to the interval [0, νmax], maximizing individual fitness for the given meme νn and a.

The argument proceeds as follows. Recall that a ν-cooperator meeting a defector will receive fitness loss [1+ν+ρ(νn−ν)]: the sucker payoff plus the cost of imposing negative reciprocity plus the social loss from violating the norm. The same individual will receive a fitness gain of 1 in encounters with cooperators. For given ν̄ and νn, short-run selection will drive ν towards values that increase individual expected fitness W(ν| ν̄,νn) or equivalently, that decrease ν+ρ(νn−ν). The first-order condition is 1=ρ′(νn−ν)=(νn−ν)/a, with solution ν*=νn−a. It is easy to see that W is single peaked at ν*, so short-run dynamics (under our monotonicity assumption) push the individual's parameter towards this optimum, which will be attained as long as the value is within the allowable range; otherwise ν* is truncated below at 0 and above at νmax. Since learning dynamics are rapid, we obtain the desired conclusion that ν* is a good approximation of an individual ν and an even better approximation of the average ν̄.
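The individual optimum can be confirmed by brute force. The sketch below assumes the quadratic loss ρ(x; a)=x²/(2a), the exponential encounter function f(ν̄)=exp(−ν̄/b), and an expected fitness of the form R + 1 − f(ν̄)[2 + ν + ρ(νn−ν)] with the group average ν̄ held fixed; all function names, the `base` argument standing in for R, and the sample values are illustrative assumptions.

```python
import math


def rho(x, a):
    """Quadratic status loss for deviating by x from the group norm."""
    return x * x / (2.0 * a)


def encounter(nu_bar, b):
    """Share of outside encounters that are defections, f = exp(-nu_bar / b)."""
    return math.exp(-nu_bar / b)


def individual_fitness(nu, nu_bar, nu_norm, a, b, base=0.0):
    """Expected fitness of a cooperator: base + 1 - f * (2 + nu + rho(nu_norm - nu))."""
    f = encounter(nu_bar, b)
    return base + 1.0 - f * (2.0 + nu + rho(nu_norm - nu, a))


def individual_optimum(nu_norm, a, nu_max):
    """Closed form from the first-order condition, truncated to [0, nu_max]."""
    return min(max(nu_norm - a, 0.0), nu_max)


def grid_argmax(nu_bar, nu_norm, a, b, nu_max, steps=1000):
    """Brute-force maximizer of individual fitness over an evenly spaced grid on [0, nu_max]."""
    grid = [nu_max * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda nu: individual_fitness(nu, nu_bar, nu_norm, a, b))
```

With the assumed values νn = 3, a = 1, b = 5, νmax = 5 and ν̄ = 2, both the closed form and the grid search give ν* = 2. Because f(ν̄) enters only as a positive multiplier, the maximizer does not depend on ν̄ or b; a norm below a (e.g., νn = 0.5) is truncated to 0 and a very demanding norm (e.g., νn = 10) is truncated to νmax.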

Of course, the individual optimum ν* does not necessarily maximize the group's fitness Wg(ν̄)=1−f(ν̄)(2+ν̄). The group optimum νo is the value that maximizes this expression on (0, νmax). Inserting f(ν)=exp(−ν/b), the first-order condition reduces to 2+ν=−f/f′=b, so νo is b−2, truncated to (0, νmax). While the solution here is particularly simple, Appendix A shows that similar conclusions hold quite generally.

What then is the relation between the group optimum νo and the individual optimum ν*? Assume for the moment that both are interior, so ν*=νn−a and νo=b−2. Our second result is that medium run meme selection aligns them as follows:

Proposition 2.

Coevolution of memes and individual learning drives actual behavior ν* toward the group optimum νo in the medium run, and interior equilibrium is achieved at νn=a+b−2.

This second result is easily established in the present setting. The group meme, embodied in the parameters a and νn, is subject to selective pressures in the medium run, and Wg is again a single-peaked function. Any group whose memes bring ν*=νn−a closer to νo=b−2 has a selective advantage. Again, any monotone dynamics will work for this statement. So in the interior case considered, we get the expression claimed.

Our final result is a corollary of Proposition 2, taking into account the long-run evolution of the individual's capacity νmax. If the constraint ν or ν*≤νmax binds in the medium run, then there is a selective advantage to individuals with higher genetic capacity for negative reciprocity and for group memes that encourage its expression. (Durham, 1991, provides examples of such coevolution, such as lactose tolerance in herding communities.) Thus, there is no truncation in the long run and the algebraic expressions can be rewritten as in the following result.

Proposition 3.

Coevolution of memes and genes produces the socially optimal negative reciprocity level in long run evolutionary equilibrium, i.e., νo=ν*, but the supporting meme, νn=νo+a, exaggerates the optimal level.

There can be shifts in the environment (as captured in the parameter b) and in the punishment technology (as captured in c). These shifts will affect the encounter function f and hence the group optimum νo. Our results suggest that memes will adjust to these shifts under selective pressure in the medium run (and genes will adjust if necessary in the long run) so that individual behavior ν* will track the new group optimum. The coevolution of the meme (νn and a) with the gene (νmax) allows actual behavior to track optimal behavior as the environment changes.

Appendix A shows that this conclusion holds under conditions far more general than the simple parametric model used here. The derivation starts with consistent estimates of the probabilities that two strangers will choose C or D given imperfect observation of each other's ν parameters. It then identifies regions in the perceived characteristic space where the individual will choose C or D as in Fig. 2. Here, individual fitness is given by a sum of integrals over the choice regions. The encounter function f and the first-order condition 1=ρ′(νn−ν) turn out to arise naturally in this setting.



Fig. 2. The decision rule. The appropriate choice of C or D is given by the sign of the advantage function A(p, u), where p is the probability that the partner will choose C and u is an unbiased estimate of her negative reciprocity parameter. The A=0 locus shifts up with increases in the decision-maker's direct (v) or full (α) negative reciprocity cost.


Two other technical questions are dealt with in Friedman and Singh (2004). First, how can νmax>c get started from an initial value of νmax=0? The key idea is that small values of ν turn out to have selective advantage within the group because they are complementary with positive reciprocity. Second, how can high-ν̄ groups protect their reputation against faked membership by individuals who actually are members of low-ν̄ groups? Our idea is that the high-ν̄ groups enjoin punishment of such individuals whenever they are detected. The same paper also contains an extended literature survey.

We close this section with some interpretive remarks. In the model everyone has the same vengeance parameter ν and makes the same choices in equilibrium. In reality, members of a given group have different life experiences, temperaments, tastes and abilities, so there will always be behavioral heterogeneity; see Friedman and Singh (2003) for a model incorporating observational as well as behavioral errors (but no group structure). Even ignoring such heterogeneity, one might wonder about the status impact when everyone falls short of the group norm νn by the same amount a. In equilibrium, of course, there is no net effect on status because the shortfall by others has impact R that exactly offsets the impact ρ of one's own shortfall. Actually, it seems to us a realistic and appealing feature of the model that actual behavior ν falls short of the group's vision of proper behavior νn.

7. Discussion

Our argument can be summarized briefly. A capability for negative reciprocity is a significant part of the human emotional repertoire. We model its important role in sustaining cooperation while highlighting a free-rider problem: fitness benefits of negative reciprocity are shared, whereas the costs are borne individually. In our model, the countervailing force that sustains negative reciprocity is a group norm together with low-powered (and low-cost) group enforcement thereof. Such memes coevolve with personal tastes and capacities to produce the optimal level of negative reciprocity.

One could object to our account on several grounds. First, it is too simple. The underlying social dilemma was modeled as a specific PD game. It is straightforward to adapt the model to other parameterizations of PD, but this evades the real point. In reality, the stakes and complexity of social interactions vary considerably, and actual memes are more complex and variable than in our model. Ours is the usual response: insight is clearest with an appropriate simple model, and for specific applications the model can be extended as necessary, to deal with specific essential complexities. A similar response can be made to the issue of tackling n-person rather than dyadic social dilemmas: the essential logic of our analysis appears to extend to the more general case.

One could also object that the model is too complicated, especially if the main goal is to explain cooperation. Norms of cooperative behavior and their enforcement could be modeled directly. The same apparatus should suffice: preferences that offer a utility gain (but not a fitness gain) for positive reciprocity together with a social norm from which deviations lead to fitness loss. Negative reciprocity thus seems redundant. Our response is twofold. First, our primary goal is to explain negative reciprocity, not cooperation per se. Second, since culprits are rare and cooperators are ubiquitous in successful society, the fitness cost of a meme that relies entirely on positive reciprocation might be excessive. Our suggestion, therefore, is that social norms of negative reciprocity, in taking advantage of biological capacities in that direction, are able to reduce the burden on direct social norms of positive reciprocity in sustaining cooperative behavior. Thus, the existence of direct social norms of positive reciprocity does not make negative reciprocity redundant.

A third objection to our account is that it is too powerful: all sorts of behavior, including behavior that has never been seen and never will, could be described as coevolutionary equilibria. We concede this point, but have been unable to find a simpler account that convincingly explains the viability of preferences for negative reciprocity. Of course, one needs additional principles to get a reasonably sharp theory, and here we have relied on anthropological observations of phenomena such as “cultures of honor.” There are indeed many ways to capture the potential gains to cooperation. Social insects, for example, rely on close genetic kinship. Likewise, bipedalism is not the only (or even necessarily the best) form of locomotion: it is worth studying because it is the one humans use. We claim nothing more (nor less) than this for our focus on negative reciprocity as a means of reaping the gains of cooperation.

How well does our model apply in different societies? Others may be in a position to assess the model's application to hunter–gatherer bands or to villagers. Here the parameter b would reflect directly the uncooperative tendencies of people from neighboring bands or villages, and c the opportunities to identify, track down and inflict harm on them. Indeed, suppose the parameter b is a function of the average vengefulness of these neighboring groups. To the extent that these groups are similar to the focal group, then, in a general equilibrium, b=ψ(ν), where ψ can still depend on environmental factors. In this case, the equilibrium value of νo that was derived in Proposition 2 now reduces to the solution to νo=ψ(νo)−2.

In highly structured societies, some important acts of negative reciprocity are performed by specialists, such as courts and police, rather than by aggrieved parties. This may lower the marginal cost c of negative reciprocity, but it is still costly to lodge a complaint, to testify, etc., and many situations (e.g., office politics) are not well suited for specialists. Thus, our model still applies to more complex societies, but it is incomplete in that it takes as given the institutional mechanisms that alter the technology parameter c.

What are the empirical implications and applications of our model? One can easily imagine laboratory experiments that would distinguish a taste for negative reciprocity from the egalitarian preferences hypothesized by recent writers. Fehr and Gächter (2000) have collected results that generally confirm strong tastes for negative reciprocity. The comparative statics of the model are also clear in principle, and testable with anthropological data: norms of negative reciprocity and actual vengeful behavior should vary systematically with the hostility of the environment, the technology for harming culprits, and the technology for enforcing group norms. If the model is on the right track, there is reason to hope that extremely dysfunctional vengeful behavior might improve over time, as the relevant memes evolve.

Acknowledgements

The first author is grateful to CES and the University of Munich for hospitality while writing the first fragments in May 1997. We have benefited greatly from the comments of Ted Bergstrom, Sam Bowles, Robert Boyd, Herb Gintis, Jack Hirshleifer, Peter Richerson, Donald Wittman, and seminar audiences at JAFEE2000, Indiana, Purdue, UCLA, and UCSC. Two anonymous referees and the editors of this journal helped improve the final version. Remaining shortcomings are our responsibility.

Appendix.

A.1. Notation

νn: Group's normative negative reciprocity level
ρ(x), x=νn−ν: Fitness loss ρ imposed on deviator by group, for deviation x
a: Tolerance parameter when ρ(x; a)=x²/(2a)
νmax: Maximum possible taste for negative reciprocity
ν∈[0, νmax]: Actual negative reciprocity cost an individual prefers
ν̄∈[0, νmax]: Group average of ν
f(ν̄): Frequency with which an individual encounters culprits
b: Environmental hostility parameter when f(ν̄)=exp(−ν̄/b)

A.2. Alternative loss functions

Consider the case ρ=exp(k|νn−ν|)−1, where k is a positive parameter that measures the severity of norm enforcement. The kink in ρ at 0 implies a first-order loss for first-order small deviations. The first-order condition ρ′(νn−ν)=1 is now k exp[k(νn−ν)]=1, with solution ν*=νn+(ln k)/k. If k≤1 then ν*≤νn and the solution is still of the form ν=νn−a, so the previous analysis carries over to this case. If k>1, we have a corner solution, given by ν*=νn, which is a limiting case of νn−a as a approaches 0. In the medium-run equilibrium in this case, νn=νo, that is, the memes that support this group-optimal equilibrium include the actual optimum value νo. Thus, the analysis proceeds as in the main text, with a treated as 0.

Asymmetry can be introduced by setting ρ=0 for ν>νn, or by using different values of k for positive and negative deviations. Since ν*≤νn is the relevant range for solutions, such asymmetries will have no effect on the subsequent analysis.
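The closed-form solution for the exponential loss function can be checked numerically. The sketch below is illustrative only: the function names, the grid search, and the parameter values (νn=5, k=0.5 and k=2, νmax=10) are assumptions and do not come from the paper. It relies on the fact that, with ν̄ and R taken as given, maximizing individual fitness is the same as minimizing the full cost ν+ρ(νn−ν).

```python
import math

def exp_loss(nu, nu_n, k):
    """Exponential loss rho = exp(k*|nu_n - nu|) - 1 from Section A.2."""
    return math.exp(k * abs(nu_n - nu)) - 1.0

def optimal_nu_closed_form(nu_n, k):
    """nu* = nu_n + ln(k)/k for k <= 1, and the corner nu* = nu_n for k > 1.
    Not clamped at 0, so very small k can return a negative value."""
    if k <= 1.0:
        return nu_n + math.log(k) / k
    return nu_n

def optimal_nu_grid(nu_n, k, nu_max=10.0, steps=10000):
    """Brute-force minimiser of the full cost nu + rho(nu_n - nu) on a grid
    over [0, nu_max]; the grid resolution is an illustrative choice."""
    grid = [nu_max * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda nu: nu + exp_loss(nu, nu_n, k))

if __name__ == "__main__":
    for k in (0.5, 2.0):
        print(k, optimal_nu_closed_form(5.0, k), optimal_nu_grid(5.0, k))
```

For k≤1 the grid minimiser should agree with νn+(ln k)/k up to the grid spacing, and for k>1 it should sit at the corner νn.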

A.3. Alternative assumptions about status

Recall the expression for individual fitness W(ν|ν̄, νn)=1−f(ν̄)(2+ν+ρ(νn−ν))+R. Suppose now that status is not completely relative, so that R only partially cancels out ρ(νn−ν). We can model this by introducing a parameter t∈[0, 1] that measures the net loss of average fitness due to deviations from the norm. Group average fitness becomes Wg(ν̄)=1−f(ν̄)(2+ν̄+tρ(νn−ν̄)). With f and ρ as specified in the main text, the first-order condition for the medium-run equilibrium is now [1−t(νn−ν)/a]exp(−ν/b)=−[2+ν+t(νn−ν)²/(2a)](−1/b)exp(−ν/b). Canceling the exponential terms, multiplying through by b, and substituting a for νn−ν yields b(1−t)=2+νn−a+at/2, or νn=a(1−t/2)+b(1−t)−2.

If t=0, we have the case analyzed in the text. At the other extreme, t=1, only absolute status matters. In that case, νn=(a/2)−2, independent of the parameter b. In general, greater weight on absolute rather than relative status (i.e., a higher t) decreases the equilibrium norm νn, since the derivative dνn/dt=(−a/2)−b is negative. The comparative statics for νn with respect to a and b are qualitatively the same for all values of t in the unit interval, i.e., νn increases as either a or b increases. In words, if enforcement is less stringent (higher a) or the environment is more hostile (higher b), then the norm of negative reciprocity in the medium-run equilibrium will be higher.
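As a quick check on the algebra, the closed-form norm can be substituted back into the first-order condition. The sketch below assumes the quadratic loss ρ(x)=x²/(2a) and the exponential encounter function from the main text; the function names and the parameter values (a=6, b=5) are purely illustrative.

```python
def equilibrium_norm(a, b, t):
    """Closed form from Section A.3: nu_n = a(1 - t/2) + b(1 - t) - 2."""
    return a * (1.0 - t / 2.0) + b * (1.0 - t) - 2.0

def foc_residual(nu_n, a, b, t):
    """Left side minus right side of the first-order condition after the
    common factor exp(-nu/b) is cancelled, evaluated at nu = nu_n - a."""
    nu = nu_n - a                      # individual optimum under quadratic loss
    dev = nu_n - nu                    # equals a by construction
    lhs = 1.0 - t * dev / a
    rhs = (2.0 + nu + t * dev ** 2 / (2.0 * a)) / b
    return lhs - rhs

if __name__ == "__main__":
    a, b = 6.0, 5.0                    # illustrative values, not from the paper
    for t in (0.0, 0.3, 1.0):
        nu_n = equilibrium_norm(a, b, t)
        print(t, nu_n, foc_residual(nu_n, a, b, t))
```

With these values the norm falls from 9 at t=0 to 1 at t=1, in line with the stated sign of dνn/dt, and the residual should be zero up to rounding.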

A.4. Probabilities of cooperation

To derive key constructs from more general assumptions, we first solve the decision problem faced by an individual encountering a new partner, or “stranger.” The encounter function f and the characterization of the individual optimum will emerge endogenously. Let i=1 index the given individual and i=2 index the stranger. Their true degrees of vengefulness (ν1, ν2) are imperfectly perceived by the other person; 1's perception of 2's ν is ν̂2=ν2+e2, and similarly (replacing 2 by 1) for 2's perception of 1. It is common knowledge that the perception errors (e1, e2) have mean zero and joint cumulative distribution function G(e1, e2).

The expected payoffs to cooperation Wi(C|⋯) and to defection Wi(D|⋯) can be expressed in terms of i's perceptions of j=3−i and i's own characteristics as follows. Let pi∈I=[0, 1] be j's estimate of the probability that i will play C; for the moment it is arbitrary, but we shall derive it shortly. Let αi=νi+ρ(νn(i)−νi) be the full cost of negative reciprocity to i, taking into account the loss ρ that his group imposes when he deviates from the norm νn(i). Let εi denote the induced estimation error of αi.

Then Wi(C)=(1)pj+(−1−αi)(1−pj)=−(1+αi)+pj(2+αi), and Wi(D)=(2−νj/c)pj+(0)(1−pj)=pj(2−νj/c). Each person i chooses C when the perceived advantage Ai(pj, νj, αi)=Wi(C)−Wi(D) is positive and chooses D when Ai is negative.
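The payoff expressions and the resulting decision rule can be written down directly. This is a minimal sketch: the function names and the numerical values are illustrative assumptions, and the tie A=0 is resolved in favor of D only because the text leaves that case unspecified.

```python
def payoff_cooperate(p_partner, alpha_own):
    """W(C) = (1)p + (-1 - alpha)(1 - p), with p the chance the partner plays C."""
    return 1.0 * p_partner + (-1.0 - alpha_own) * (1.0 - p_partner)

def payoff_defect(p_partner, nu_partner, c):
    """W(D) = (2 - nu_partner/c) p + (0)(1 - p)."""
    return (2.0 - nu_partner / c) * p_partner

def advantage(p_partner, nu_partner, alpha_own, c):
    """A = W(C) - W(D), which simplifies to -(1 + alpha) + p(alpha + nu_partner/c)."""
    return payoff_cooperate(p_partner, alpha_own) - payoff_defect(p_partner, nu_partner, c)

def choose(p_partner, nu_partner, alpha_own, c):
    """Return "C" when A > 0, otherwise "D" (tie-breaking is an assumption)."""
    return "C" if advantage(p_partner, nu_partner, alpha_own, c) > 0 else "D"

if __name__ == "__main__":
    # Illustrative values: p = 0.5, partner's nu = 3, own full cost alpha = 3.
    print(advantage(0.5, 3.0, 3.0, 0.5), choose(0.5, 3.0, 3.0, 0.5))
    print(advantage(0.5, 3.0, 3.0, 1.0), choose(0.5, 3.0, 3.0, 1.0))
```

In this example, lowering c from 1 to 0.5 makes the partner's vengefulness more damaging to a defector, which is enough to switch the choice from D to C.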

Now we need some second-order reasoning. Write j's perception of i's perceived advantage as Ai(pj, νj+ej, αi+εi). The error εi reflects the fact that j knows i's negative reciprocity cost αi imperfectly, and the error ej is included because j realizes that i knows j's own ν imperfectly. (The error ej was dropped from the Wi(D) expression above because it has mean zero, but now we need to keep track of it because covariances can be relevant.) The probability pj is still arbitrary, but now we have the machinery in place to enforce consistency.

The construction of consistent (i.e., Bayesian Nash equilibrium) probability estimates uses the best response map B to take (p1, p2) into updated choice probabilities (q1, q2), and looks for a fixed point. The idea is that the tentative choice probabilities plugged into the decision function A imply new choice probabilities, and the probabilities are internally consistent at a fixed point. Formally, the first component of B(p1, p2) is q1=m[A1(p2, ν2+e2, α1+ε1)|G(e1, e2)], where εi is the estimation error in αi and the expression m[a(x)|F(x)] denotes the measure (i.e., the probability mass) of the set of x's such that a(x)≥0, given that x has distribution function F. The second component of B is q2=m[A2(p1, ν1+e1, α2+ε2)|G(e1, e2)].

One can show that the mapping B: (p1, p2)→(q1, q2) of the positive unit square I² into itself satisfies the assumptions of the Brouwer theorem and therefore has a fixed point. This conclusion holds for any particular choice of (ν1, ν2); indeed, the mapping B depends smoothly on (ν1, ν2) if G has a density function. Therefore, one can assign (not necessarily uniquely) fixed-point probability estimates (p1, p2) as a function of (ν1, ν2). Thus, we have the mapping we sought, call it P: [0, νmax]²→I², (ν1, ν2)↦(p1, p2). One can verify (although it is not necessary for our purposes) that P is the assessment component of a Bayesian Nash equilibrium.

In practice, a nice way to implement P is to begin with initial estimates p1=p2=0.5 and to iterate using the B map (for the actual values of the ν's) until convergence. The intuition is not that people actually do the iteration or the calculation, but rather that a stable convention emerges on how likely you (as member of a group with a particular value of ν) are to encounter C play from a stranger with given apparent ν.
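The iteration just described can be sketched in a few lines. Several simplifications here are assumptions made only for illustration and are not taken from the paper: the perception errors e1 and e2 are independent Gaussians with standard deviation sigma, the full costs α1 and α2 are observed without error, and all parameter values are arbitrary. Under those assumptions the probability mass of the set where A≥0 has a closed form in terms of the normal distribution function.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_cooperate(p_other, nu_other, alpha_own, c, sigma):
    """P(A >= 0) with A = -(1 + alpha) + p (alpha + (nu_other + e)/c)
    and e ~ N(0, sigma^2). If p_other is 0 then A < 0 for every e."""
    if p_other <= 0.0:
        return 0.0
    threshold = c * ((1.0 + alpha_own) / p_other - alpha_own) - nu_other
    return 1.0 - normal_cdf(threshold / sigma)

def best_response(p1, p2, nu1, nu2, alpha1, alpha2, c, sigma):
    """The map B: (p1, p2) -> (q1, q2) under the assumptions above."""
    q1 = prob_cooperate(p2, nu2, alpha1, c, sigma)
    q2 = prob_cooperate(p1, nu1, alpha2, c, sigma)
    return q1, q2

def iterate_to_fixed_point(nu1, nu2, alpha1, alpha2, c, sigma,
                           tol=1e-10, max_iter=10000):
    """Start from p1 = p2 = 0.5 and apply B until successive iterates agree."""
    p1 = p2 = 0.5
    for _ in range(max_iter):
        q1, q2 = best_response(p1, p2, nu1, nu2, alpha1, alpha2, c, sigma)
        if max(abs(q1 - p1), abs(q2 - p2)) < tol:
            return q1, q2
        p1, p2 = q1, q2
    return p1, p2

if __name__ == "__main__":
    print(iterate_to_fixed_point(3.0, 3.0, 3.0, 3.0, c=0.5, sigma=0.5))
    print(iterate_to_fixed_point(3.0, 3.0, 3.0, 3.0, c=1.0, sigma=0.5))
```

In this simplified setting (0, 0) is always a fixed point, so which fixed point the iteration reaches depends on the parameters: with the illustrative values above, c=0.5 leads to near-certain cooperation while c=1 collapses to mutual defection. This is consistent with the remark that the fixed-point assignment need not be unique.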

A.5. The individual optimum and the encounter function

The next task is to derive general expressions for fitness functions and to characterize the individual optimum. We focus on a particular individual (i=1 in the last subsection) whose negative reciprocity parameter ν is to be shaped by the learning process. Others' perceptions of him have mean ν̄ and remain constant during this process; the interpretation in the text was that the others perceive his group affiliation but have no other credible information about him.

The individual faces an environment defined by a distribution function F(u) for strangers' negative reciprocity parameters ν2=u. The distribution F(u), together with the mapping P derived above, induces a distribution function H(p, u|ν̄) where p denotes the first component p1 of P(ν̄, u). The distribution H summarizes the fitness-relevant data for the individual: the probability p that the stranger will play C and her (correlated) negative reciprocity parameter u. Monotonicity properties of the mapping P imply an ordering by ν̄ of the distributions H via first-order stochastic dominance.

Consider the possible values of (p, u) in the rectangle I×[0, νmax], as in Fig. 2 of the text. Simplifying the notation of the previous subsection, the individual's decision function is A1(p, u, α1(ν))=A(p, u, α)=−(1+α)+p(u/c+α).

The locus A(p, u, α)=0, which is the graph of the relation p=(1+α)/(u/c+α), separates the rectangle into two regions, denoted [C] and [D] to indicate the individual's choice. The measure (or probability mass, using the distribution H) of these regions gives the overall probabilities of C and D play by an individual whose imperfectly perceived negative reciprocity parameter is ν̄.

The individual's fitness is the expectation (with respect to the distribution H) of the fitness payoff to C or D over the possible new partners. It is given by the Stieltjes integral

(1)
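A form of Eq. (1) consistent with this description, written out from the surrounding definitions as a reconstruction rather than as the original typeset expression, is:

```latex
W(\nu \mid \bar{\nu})
  = \int_{[C]} W(C)\, dH(p, u \mid \bar{\nu})
  + \int_{[D]} W(D)\, dH(p, u \mid \bar{\nu})
```

Here [C] and [D] are the two choice regions separated by the locus A=0, and W(C) and W(D) are the payoffs defined in Section A.4.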

The key calculation is the fitness gradient. Taking the derivative in Eq. (1) with respect to ν we obtain

(2)
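Differentiating an integral over a region whose boundary moves with ν produces three terms. The sketch below is a reconstruction consistent with the discussion that follows; the boundary weight $v_\nu$ (the speed of the locus in its normal direction, weighted by the density of H) is notation assumed here for illustration.

```latex
\frac{dW}{d\nu}
  = \int_{[C]} \frac{\partial W(C)}{\partial \nu}\, dH(p, u \mid \bar{\nu})
  + \int_{[D]} \frac{\partial W(D)}{\partial \nu}\, dH(p, u \mid \bar{\nu})
  + \int_{A=0} \bigl[ W(C) - W(D) \bigr]\, v_\nu \, ds
```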

The last term in Eq. (2) is a line integral over the locus A=0. It comes from the relevant generalization of the fundamental theorem of calculus (or a special case of Stokes' Theorem) because the locus moves when ν changes. Conveniently, it is zero because W(C)=W(D) precisely on the locus A=0 where C and D are equally fit.

Recall that W(D)=p(2−u/c) depends on the stranger's negative reciprocity parameter u but is independent of the individual's own value of ν, so the middle term in Eq. (2) also vanishes. That leaves only the first term, whose integrand is the derivative of W(C)=−(1+α(ν))+p(2+α(ν)) with respect to ν. Hence

(3)
where the encounter function f used in the text is now seen to be precisely the probability that the individual is the victim of the sucker payoff. This probability is independent of ν, so the shape of the payoff function W depends only on the group's enforcement function ρ.
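Carrying out that differentiation with α(ν)=ν+ρ(νn−ν), so that dα/dν=1−ρ′(νn−ν) and ∂W(C)/∂α=−(1−p), gives the following form for Eq. (3). It is a reconstruction from the definitions above rather than the original typeset equation.

```latex
\frac{dW}{d\nu}
  = -\bigl[ 1 - \rho'(\nu_n - \nu) \bigr]\, f(\bar{\nu}),
\qquad
f(\bar{\nu}) = \int_{[C]} (1 - p)\, dH(p, u \mid \bar{\nu})
```

The integral defining f is the probability mass of the event that the individual plays C while the stranger plays D, and the gradient vanishes exactly when ρ′(νn−ν)=1.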

It is now clear that the simple argument in the text applies directly, since it was based on the same first-order condition ρ′(νn−ν)=1 that emerges here. We conclude as in the main text that individuals will adapt monotonically towards a point ν* somewhat below the group norm νn, with the size of the gap depending on norm enforcement.

Presumably, there is some family of joint distributions H that gives rise to the exponential family f(ν̄) used in the text, but its description remains an open question. A deeper open question is to characterize the distribution H from parameters of a general equilibrium model whose state variable is the distribution of memes across all groups. Analytical work with such models involves nonlinear partial differential equations and is well beyond the scope of the present paper. Numerical simulations as in Boyd et al. (2003) and numerous other studies could also provide some insight.

References

Alexander RD. The biology of moral systems. New York: Aldine de Gruyter; 1987.

Black-Michaud J. Cohesive force: feud in the Mediterranean. Oxford: Blackwell; 1975.

Blackmore S. The meme machine. Oxford: Oxford University Press; 1999.

Blackmore S. The power of memes. Scientific American. 2000;October:64–73.

Boyd R, Gintis H, Bowles S, Richerson P. The evolution of altruistic punishment. Proceedings of the National Academy of Sciences. 2003;100(6):3531–3535.

Boyd R, Richerson PJ. Group selection among alternative evolutionarily stable strategies. Journal of Theoretical Biology. 1990;145:331–342.

Catanzaro R. Men of respect: a social history of the Sicilian Mafia. New York: The Free Press; 1992.

Darwin C. The descent of man and selection in relation to sex. New York: Appleton; 1871.

Davis J. Antropologia della Societa Mediterranee: Un'analisi Comparata. Turin: Rosenberg & Sellier; 1980.

Dawkins R. The selfish gene. New York: Oxford University Press; 1976.

Dawkins R. The extended phenotype: the gene as the unit of selection. San Francisco: Freeman; 1982.

Durham WH. Coevolution: genes, culture, and human diversity. Stanford, CA: Stanford University Press; 1991.

Farb P. Man's rise to civilization: the cultural ascent of the Indians of North America. New York: Penguin; 1978.

Fehr E, Gächter S. Fairness and retaliation: the economics of reciprocity. Journal of Economic Perspectives. 2000;14(3):159–182.

Fehr E, Henrich J. Is strong reciprocity a maladaptation? In: Hammerstein P, editor. Genetic and cultural evolution of cooperation. Cambridge, MA: MIT Press; 2003. p. 55–82.

Frank R. If Homo Economicus could choose his own utility function, would he want one with a conscience? American Economic Review. 1987;77:593–604.

Frank R. Passions within reason: the strategic role of the emotions. New York: WW Norton; 1988.

Friedman D. Evolutionary games in economics. Econometrica. 1991;59:637–666.

Friedman D, Singh N. On the viability of vengeance. UC Santa Cruz working paper; 1999. Available at: http://econ.ucsc.edu/faculty/workpapers.html.

Friedman D, Singh N. Equilibrium vengeance. UC Santa Cruz working paper; 2003. Available at: http://leeps.ucsc.edu/leeps/projects/misc/EqVenge/EqVenge.

Friedman D, Singh N. Vengeance evolves in small groups. In: Huck S, editor. Festschrift in honor of Werner Güth. 2004. In press.

Galaty JG, Bonte P, editors. Herders, warriors and traders: pastoralism in Africa. Boulder, CO: Westview Press; 1991.

Gilmore DD. Manhood in the making: cultural concepts of masculinity. New Haven: Yale University Press; 1991.

Gintis H, Bowles S, Boyd R, Fehr E. Explaining altruistic behavior in humans. Evolution and Human Behavior. 2003;24:153–172.

Guttman JM. Repeated interaction and the evolution of preferences for reciprocity. Economic Journal. 2003;113:631–656.

Haldane JBS. Population genetics. New Biology. 1955;18:34–51.

Hamilton WD. The genetical evolution of social behaviour. Journal of Theoretical Biology. 1964;7:1–52.

Harrington JE. If Homo economicus could choose his own utility function, would he want one with a conscience?: comment. American Economic Review. 1989;79:588–593.

Henrich J, Boyd R. Why people punish defectors: weak conformist transmission can stabilize costly enforcement of norms in cooperative dilemmas. Journal of Theoretical Biology. 2001;208:79–89.

Hirshleifer J. On the emotions as guarantors of threats and promises. In: Dupré J, editor. The latest on the best: essays in evolution and optimality. Cambridge, MA: MIT Press; 1987. p. 307–326.

Leimar O, Hammerstein P. Evolution of cooperation through indirect reciprocity. Proceedings of the Royal Society of London B. 2000;268:745–753.

Levine DK. Modeling altruism and spitefulness in experiments. Review of Economic Dynamics. 1998;1:593–622.

Lowie RH. Indians of the Plains. New York: McGraw-Hill; 1954.

MacDonald KB. A people that shall dwell alone: Judaism as a group evolutionary strategy. Westport, CT: Praeger; 1994.

Nelson RR, Winter SG. An evolutionary theory of economic change. Cambridge, MA: Belknap Press of Harvard University Press; 1982.

Nisbett RE, Cohen D. Culture of honor: the psychology of violence in the south. Boulder, CO: Westview Press; 1996.

Nowak MA, Sigmund K. Evolution of indirect reciprocity by image scoring. Nature. 1998;393:573–577.

O'Kelley CG, Carney LS. Women and men in society. New York: D. Van Nostrand; 1986.

Peristiany JG, editor. Honor and shame: the values of Mediterranean society. London: Weidenfeld and Nicolson; 1965.

Pettigrew J. Robber noblemen: a study of the political system of the Sikh Jats. London: Routledge & Kegan Paul; 1975.

Rosenthal RW. Trust and social efficiencies. Boston University manuscript; 1996.

Sethi R, Somanathan R. Understanding reciprocity. Journal of Economic Behavior and Organization. 2003;50:1–27.

Sober E, Wilson DS. Unto others: the evolution and psychology of unselfish behavior. Cambridge, MA: Harvard University Press; 1998.

Sugden R. The economics of rights, co-operation and welfare. New York: Blackwell; 1986.

Trivers R. Social evolution. Menlo Park, CA: Benjamin/Cummings; 1985.

Weibull JW. Evolutionary game theory. Cambridge, MA: MIT Press; 1995.

Weingart P, Boyd R, Durham WH, Richerson PJ. Units of culture, types of transmission. In: Weingart P, Mitchell SD, Richerson PJ, Maasen S, editors. Human by nature: between biology and the social sciences. Mahwah, NJ: Lawrence Erlbaum; 1997.

Wilson EO. Sociobiology. Cambridge, MA: Harvard University Press; 1980.

Wynne-Edwards VC. Animal dispersion in relation to social behavior. Edinburgh: Oliver and Boyd; 1962.

Department of Economics, Social Sciences 1, University of California, Santa Cruz, CA 95064, USA

Corresponding author. Tel.: +1-831-459-4093; fax: +1-831-459-5900

PII: S1090-5138(04)00010-8

doi:10.1016/j.evolhumbehav.2004.03.002



2007:11:13