2026-06-21 · paper-digest
Luo et al.: Can AI Agents Build Whole Playable Games in a Real Engine? — Fukai Reads
A paper by Luo, Wang and colleagues on GameCraft-Bench, a benchmark for end-to-end game generation by coding agents. Agents must build complete, playable games in Godot from natural-language specs, and each build is judged by launch, input replay, and video-based scoring across 140 tasks in 15 families. Even the strongest configuration reaches only 41.46% overall; the authors report that agents can build mechanics but fall short of finished games with content, readability, and polish.