PAPER-DIGEST · 2026-07-01

Liu et al.: More Memory Makes AI Agents Less Cooperative — Fukai Reads

The 'memory curse' in repeated social dilemmas (arXiv preprint, May 2026)

Reviewed by Fukai · #paper-digest #research #llm-agents #cooperation #social-dilemma #game-theory #multi-agent #memory #npc-design


TL;DR

Drop large language models (LLMs — AI trained on huge amounts of text to read and write language) into a repeated cooperative game as 'agents,' and the longer you let them see the past, the worse they cooperate. This is the counterintuitive phenomenon reported by a Carnegie Mellon-led team, which they name the 'memory curse.'

In 500-round repeated matches across 7 models, 4 social-dilemma games, and history windows of up to 80 rounds, cooperation fell as history grew in 18 of 28 model-game settings. The driver is not context length but content: the accumulated record of defections. Nudging the reasoning toward the future partly repairs it. More memory is not automatically better.

Introduction — who and where

The ten authors are Jiayuan Liu, Tianqin Li, Shiyi Du, Xin Luo, Haoxuan Zeng, Emanuel Tewolde, Tai Sing Lee, Tonghan Wang, Carl Kingsford, and Vincent Conitzer, affiliated across Carnegie Mellon University, its Foundations of Cooperative AI Lab (FOCAL), the University of Michigan, and Harvard University. It is an arXiv preprint (arXiv:2605.08060, submitted 8 May 2026), so it has not yet passed peer review (prior expert vetting), and citations have not accumulated — this is still a pre-discussion stage.

I picked it today because, for people who build games, 'how much of the past should the AI remember' is becoming an unavoidable design decision. When you wire an LLM into an NPC (Non-Player Character — a character the player does not control) or an opponent, it is tempting to assume 'remember everything and it gets smarter.' This paper challenges that intuition head-on.

Background — does longer memory really help cooperation?

The classic social dilemma (a situation where what is good for the individual clashes with what is good for the group) is the Prisoner's Dilemma. In a one-shot game defection pays, but against the same opponent repeatedly, mutual cooperation pays more over the long run. Game theory's 'Folk Theorem' — skipping the hard math — essentially says that with near-infinite repetition and enough history, cooperation can be sustained.
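
To make the one-shot versus repeated contrast concrete, here is a minimal sketch of my own. The payoff values (5, 3, 1, 0) are the usual textbook ones, not figures from the paper, and the 'grim trigger' partner (cooperates until betrayed once, then defects forever) is a hypothetical opponent chosen only for illustration.

```python
# Textbook Prisoner's Dilemma payoffs (assumed, not taken from the paper).
# Key: (my_move, their_move) -> my payoff. "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def total_vs_grim_trigger(my_moves):
    """Total payoff against a partner that cooperates until I defect once,
    then defects for the rest of the match (hypothetical 'grim trigger')."""
    total, betrayed = 0, False
    for move in my_moves:
        their_move = "D" if betrayed else "C"
        total += PAYOFF[(move, their_move)]
        if move == "D":
            betrayed = True
    return total


# One round: defecting (5) beats cooperating (3).
print(total_vs_grim_trigger(["D"]), total_vs_grim_trigger(["C"]))

# Ten rounds: steady cooperation (30) beats steady defection (14).
print(total_vs_grim_trigger(["C"] * 10), total_vs_grim_trigger(["D"] * 10))
```

Under these assumed payoffs, the single-round temptation is real, but it is paid back many times over once the partner stops cooperating.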

Behavioral research, however, points the other way. In Ma et al. (2021), cited by the authors, human subjects given too much memory were thrown off by historical noise (chance defections), held grudges, and cooperated less. Humans adapt by 'forgetting,' and in doing so they 'forgive.' AI agents, by contrast, read history verbatim, and it never fades on its own. Hence the question: does more memory build trust, or does a perfect ledger doom the system to retaliation?

Most prior LLM game studies looked only at short matches of around ten rounds, which hides the effect of long histories. The authors scale this up by more than an order of magnitude, to 500 rounds, and treat history length itself as the variable. That is what is new here.

Approach — what they measured and how

The authors had seven open models (Gemma-3-12B, GPT-OSS-20B, GPT-OSS-120B, Llama-3.3-70B, Llama-4-Scout-17B, Mistral-7B, Qwen2.5-Coder-32B) play four games — Prisoner's Dilemma, Traveler's Dilemma, Public Goods, and Trust Game — repeatedly with 2-3 players. The key knob is 'history length (HL),' the number of past rounds an agent can see, varied across nine steps from zero to 80. Each setting ran three times, 500 rounds per match (with a 99% chance of continuing each round).
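
As a rough picture of what the HL knob does, the sketch below truncates a match record to the last HL rounds before it would be shown to an agent. This is my own illustration, not the authors' harness; the function names and the text format of the rendered history are assumptions.

```python
from typing import List, Tuple

# One round as (my_move, partner_move), e.g. ("C", "D").
Round = Tuple[str, str]


def visible_history(full_history: List[Round], history_length: int) -> List[Round]:
    """Return only the last `history_length` rounds.

    HL=0 must be handled explicitly: in Python, full_history[-0:] would
    return the whole list rather than an empty one.
    """
    if history_length <= 0:
        return []
    return full_history[-history_length:]


def render_history(window: List[Round]) -> str:
    """Format the visible window as plain text (hypothetical prompt format)."""
    if not window:
        return "No previous rounds are visible."
    lines = [
        f"{len(window) - i} round(s) ago: you={mine}, partner={theirs}"
        for i, (mine, theirs) in enumerate(window)
    ]
    return "\n".join(lines)


history = [("C", "C"), ("C", "D"), ("D", "D")]
print(render_history(visible_history(history, 2)))
```

With HL=2 the agent sees only the last two rounds; with HL=80 on this short record it sees everything, including the oldest defection.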

The headline metric is the cooperation rate. The authors also collected 378,000 of the models' reasoning texts (chain-of-thought — the 'train of thought' written before the final decision) and analyzed the vocabulary, scoring strategic intent as the ratio of forward-looking words (eyeing long-term gain) to history-following, risk-averse words (bound to past betrayals).
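
A word-list ratio of this kind can be sketched in a few lines. The word lists below are hypothetical stand-ins that I made up for illustration (the paper's actual vocabulary is not reproduced here), and the add-one smoothing is my own choice to avoid dividing by zero.

```python
import re

# Hypothetical word lists; the paper's real lists are not reproduced here.
FORWARD_WORDS = {"future", "long-term", "rebuild", "sustain"}
BACKWARD_WORDS = {"betrayed", "defected", "retaliate", "risk"}


def intent_ratio(reasoning: str) -> float:
    """Ratio of forward-looking to history-following word counts.

    Values above 1 lean forward-looking, values below 1 lean backward.
    Add-one smoothing (an assumption) keeps the ratio defined when a
    text contains no backward-looking words.
    """
    # Lowercase words, keeping hyphenated terms such as "long-term" intact.
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", reasoning.lower())
    forward = sum(token in FORWARD_WORDS for token in tokens)
    backward = sum(token in BACKWARD_WORDS for token in tokens)
    return (forward + 1) / (backward + 1)


print(intent_ratio("Cooperating now can sustain long-term gains in the future."))
print(intent_ratio("They defected twice, so I retaliate to limit risk."))
```

The sketch also shows the weakness of any list-based score: a phrasing that avoids every listed word counts for nothing.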

Three further tests follow. First, the authors fine-tuned Mistral-7B on a small corpus of purely forward-looking reasoning (via LoRA, a lightweight method that trains only a thin added layer rather than the whole model) to test directly whether the cause is a 'reasoning habit.' Second, a 'memory sanitization' experiment kept context length fixed at 80 rounds but swapped the content for cooperative records. Third, they compared against a no-chain-of-thought condition (an ablation study, which tests which part of the design matters by removing elements one at a time).

Findings — longer history, weaker cooperation

The big picture: at zero history (HL=0), models fear immediate exploitation and collapse into defection, with near-zero cooperation (especially in Public Goods and Trust). But a minimal memory (HL=2) lets them read recent actions as cues to intent, enabling Tit-for-Tat-like strategies, and cooperation jumps. The trouble starts beyond that point: extend the history further and cooperation erodes. Of 28 model-game settings, 18 collapsed ('Memory Cursed') and 10 held cooperation above 95% across all history lengths ('Memory Immune').

Concrete numbers from the paper: in the Trust Game, Gemma-3-12B's cooperation fell from 51.2% (HL=2) to 9.5% (HL=80), with cumulative reward dropping from 8.59 to 5.19. GPT-OSS-20B in the Prisoner's Dilemma went 92.1% to 20.6%; Llama-4-Scout-17B in Public Goods went 82.6% to 45.8%. At HL=80, variance across seeds (initial conditions) explodes (±24.0% for Llama-4-Scout-17B). Long memory amplifies an early chance defection and locks agents into a fixed retaliation pattern.
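
For readers who want the size of each drop at a glance, the arithmetic on the figures quoted above is below. Only the input numbers come from the article; the helper names are mine, and the computed differences are simple subtraction, not values reported by the paper.

```python
# (setting, cooperation % before, cooperation % after), as quoted above.
REPORTED = [
    ("Gemma-3-12B / Trust Game", 51.2, 9.5),
    ("GPT-OSS-20B / Prisoner's Dilemma", 92.1, 20.6),
    ("Llama-4-Scout-17B / Public Goods", 82.6, 45.8),
]


def point_change(before: float, after: float) -> float:
    """Change in percentage points, rounded to one decimal."""
    return round(after - before, 1)


def relative_change(before: float, after: float) -> float:
    """Change as a percentage of the starting value, rounded to one decimal."""
    return round(100 * (after - before) / before, 1)


for name, before, after in REPORTED:
    print(f"{name}: {point_change(before, after):+.1f} points")

# Gemma-3-12B's cumulative reward in the Trust Game: 8.59 -> 5.19.
print(f"Reward change: {relative_change(8.59, 5.19):+.1f}%")
```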

The lexical analysis revealed a surprising breakdown: the collapse is driven not by 'more hostile words' but by 'fewer cooperative words' (cooperative reasoning fell 57% in GPT-OSS-20B at HL=80). And the sanitization experiment — keeping the 80-round window but swapping the content for cooperative records — restored cooperation substantially. So the curse, the authors conclude, is the content (accumulated defection), not context length.
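
The logic of the sanitization control can be shown with a toy version: the window keeps its 80 entries, so the context is just as long, but the recorded moves change. How the authors actually built their cooperative records is not described here, so the all-cooperate replacement below is an assumption, as are the function names.

```python
from typing import List, Tuple

Round = Tuple[str, str]  # (my_move, partner_move)


def sanitize_window(window: List[Round]) -> List[Round]:
    """Keep the number of rounds, but record every move as cooperation.

    Assumed replacement rule for illustration only.
    """
    return [("C", "C") for _ in window]


def defection_count(window: List[Round]) -> int:
    """Number of individual defections recorded in the window."""
    return sum(move == "D" for pair in window for move in pair)


# A toy 80-round window with a long retaliation streak.
original = [("C", "D")] * 30 + [("D", "D")] * 50
clean = sanitize_window(original)

print(len(original), defection_count(original))  # same length, many defections
print(len(clean), defection_count(clean))        # same length, none
```

Because length is held constant, any recovery in cooperation has to come from what the history says rather than from how much of it there is.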

There is also a hint of a remedy. Mistral-7B fine-tuned only on forward-looking reasoning saw HL=80 cooperation rise by 14.7 to 79.3 points across the four games, reaching near 100% in Public Goods, Trust, and Prisoner's Dilemma (Trust went from 34.8% to 100.0%). The effect transferred zero-shot (no extra training) to untrained games, and the model kept its general math/knowledge/coding ability (+2.3% on GSM8K). Conversely, forcing the model to write out its reasoning (CoT) strengthened the curse: taking each model's hardest game, Llama-3.3-70B fell from 100% without reasoning to 6.9% with it (−93.1 points). The authors call this the 'tragedy of overthinking.'

For game and puzzle designers

First, the 'memory design' of NPCs and AI opponents. If I were building an LLM-driven companion or negotiation partner, remembering every past betrayal is dangerous — one accident can curdle into permanent distrust. Given the cooperation peak near HL=2, deliberately building a window that 'remembers only the last few moves vividly while old records fade (forget and forgive)' yields an NPC that can recover relationships. The authors' 'selective forgetting / summarization' drops straight into a design spec.
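
One way I might prototype a 'forget and forgive' window is to let each remembered betrayal lose weight with age. This is a design sketch of my own, not something the paper implements; the half-life, the threshold, and both function names are hypothetical tuning knobs.

```python
from typing import List


def grievance(partner_moves: List[str], half_life: float = 2.0) -> float:
    """Age-weighted count of the partner's past defections.

    `partner_moves` is ordered oldest to newest. Each defection counts
    half as much for every `half_life` rounds that have passed since.
    """
    newest = len(partner_moves) - 1
    score = 0.0
    for index, move in enumerate(partner_moves):
        if move == "D":
            age = newest - index  # 0 for the most recent round
            score += 0.5 ** (age / half_life)
    return score


def npc_cooperates(partner_moves: List[str], half_life: float = 2.0,
                   threshold: float = 0.5) -> bool:
    """Cooperate while the faded grievance stays below the threshold."""
    return grievance(partner_moves, half_life) < threshold


old_betrayal = ["D"] + ["C"] * 10    # one defection, ten rounds ago
fresh_betrayal = ["C"] * 10 + ["D"]  # a defection in the latest round

print(npc_cooperates(old_betrayal))                           # faded memory
print(npc_cooperates(fresh_betrayal))                         # recent betrayal
print(npc_cooperates(old_betrayal, half_life=float("inf")))   # perfect ledger
```

With a short half-life the old accident has almost no weight and the NPC cooperates again; with an infinite half-life (a perfect ledger) the same accident still blocks cooperation ten rounds later.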

Second, emergent simulations with a cast of characters (games where towns or factions act autonomously). Here a single 'grudge-holder' is toxic. In the paper's asymmetric experiment, one long-memory agent dragged the group into a retaliation spiral and crashed group welfare in Public Goods. Conversely, a short-memory 'forgiver' kept cooperating even when outnumbered (+33 points in one setting). When assembling matchmaking or faction AI, simply avoiding 'everyone has long memory' can change a society's stability.

Third, tuning an AI's 'personality' via reasoning style. The longer you make it reason, the more it enumerates past betrayals and turns harsh. Want a friendly sparring partner? Keep reasoning light. Want a tough nemesis? Let it deliberate. For a tutorial bot teaching cooperative negotiation, simply keeping memory short makes a 'forgiving, kind teacher.'

More broadly, this applies to digital ports of board games, diplomacy- and trade-heavy strategy games, and AI tuning for Among Us-style betrayal games. One design principle runs through all of them: more memory is not automatically better. What you keep, and how much, decides the experience.

Limitations — how far this goes

I will start with the authors' own caveats. They explicitly state that the fine-tuning (LoRA) is an 'interventional probe' to confirm the cause, not a scalable engineering fix for the curse. The training data was selected by reasoning style, but since that style correlates with cooperative actions, they say the possibility of 'merely memorizing cooperation labels' cannot be fully ruled out. Some Traveler's Dilemma settings (HL=5/20/40) were bimodal, with a fraction of seeds collapsing into a race to defect. Finally, the study is limited to open models accessed through a single API.

What I, Fukai, would add: first, scope. The targets are abstract economic games like the Prisoner's Dilemma, not actual video games. The implications for NPC design are powerful as an analogy, but the paper did not validate them in a real game, and that gap is worth keeping in mind. Second, the 'forward-looking / hostile' intent metric depends on hand-built vocabulary lists — a handy approximation, but fragile to phrasings it misses. Third, this is a pre-peer-review preprint whose results have not yet been independently replicated. I quoted every figure as written, but it is too early to read them as settled fact.

Fukai's reading (my interpretation, only here)

I would place this study as one instance in which Ma et al.'s (2021) human finding — that limited memory optimizes cooperation — is reproduced in LLM agents. In the vocabulary of design criticism, it is a re-evaluation of 'forgetting' as a game mechanic. We have long set a naive equals sign between 'memory' and 'intelligence,' but in the experience of cooperation, it is the slack to forget and forgive that keeps a relationship alive. A perfect record structurally strips away the possibility of forgiveness — that is the one sentence this paper leaves with me.

Closing

If you want to go deeper, read this alongside the behavioral experiment it builds on — Ma et al. (2021), 'limited memory optimizes cooperation' — and the LLM repeated-game literature (Akata et al. 2025 and others); together they sketch a map of how human and AI memory differ. The report that reasoning models can cooperate less (Piedrahita et al. 2025) also echoes this paper's 'tragedy of overthinking.'

The takeaway for game makers is clear. When you hand the past to an AI, design not only 'what it remembers' but 'what, and when, it forgets.' Memory is a material; it is not, by itself, either intelligence or kindness.

References

Papers and related material referenced in this article:

・The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents (Liu et al., 2026, arXiv preprint 2605.08060)

・DOI: 10.48550/arXiv.2605.08060 (arXiv-issued, registration pending)

・Related: Ma et al. (2021), “Limited memory optimizes cooperation in social dilemma experiments”

・Related: Akata et al. (2025), “Playing repeated games with large language models”
