2026-06-19 · design-roundup
Where 'Solvable' and 'Fun' Diverge — PuzzleJAX Hands 500+ PuzzleScript Games to the Machines (arXiv, Aug 2025)
One article today: "PuzzleJAX: A Benchmark for Reasoning and Learning" (arXiv preprint, August 2025) by researchers at NYU, the University of Malta, the University of the Witwatersrand (South Africa), and Microsoft (Sam Earle, Graham Todd, Ahmed Khalifa, Julian Togelius and others). They reimplement PuzzleScript, the puzzle-authoring language Stephen Lavelle (increpare) released in 2013, to run on the GPU, then hand 500+ human-authored games to tree search, reinforcement learning, and large language models. Read as a designer, the core is one observation: 'solvable by a machine' and 'interesting to a human' are not the same thing. Tree search brute-forces the simple games but stalls as soon as the rules get richer; LLMs score 0% on most games. The authors even note that PuzzleScript's own creator hesitated to embed an auto-solver in the IDE, which reads as a caution against measuring difficulty by search.
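For readers who have never watched a solver chew through a puzzle, here is a minimal sketch of what "brute-forcing a simple game" means. It is not PuzzleJAX code and does not use PuzzleScript's rule syntax; the level, the `solve` function, and the crate-pushing rules are all hypothetical, hand-written for illustration. It runs a plain breadth-first search over (player position, crate positions) states and reports how many states it expanded, which is the kind of number a search-based difficulty measure would lean on.

```python
from collections import deque

# Hypothetical toy level, not taken from the paper:
# '#' wall, '@' player, '$' crate, '.' target, ' ' floor.
LEVEL = [
    "######",
    "#@$ .#",
    "######",
]

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def parse(level):
    """Turn the ASCII level into sets of coordinates."""
    walls, crates, targets, player = set(), set(), set(), None
    for r, row in enumerate(level):
        for c, ch in enumerate(row):
            if ch == "#":
                walls.add((r, c))
            elif ch == "@":
                player = (r, c)
            elif ch == "$":
                crates.add((r, c))
            elif ch == ".":
                targets.add((r, c))
    return walls, frozenset(crates), frozenset(targets), player


def solve(level):
    """Breadth-first search; returns (shortest move string or None, states expanded)."""
    walls, crates, targets, player = parse(level)
    start = (player, crates)
    queue = deque([(start, "")])
    seen = {start}
    expanded = 0
    while queue:
        (pos, boxes), path = queue.popleft()
        expanded += 1
        if boxes == targets:  # every crate sits on a target
            return path, expanded
        for name, (dr, dc) in MOVES.items():
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt in walls:
                continue
            new_boxes = boxes
            if nxt in boxes:  # walking into a crate pushes it one cell
                beyond = (nxt[0] + dr, nxt[1] + dc)
                if beyond in walls or beyond in boxes:
                    continue  # blocked push, move is illegal
                new_boxes = (boxes - {nxt}) | {beyond}
            state = (nxt, new_boxes)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + name))
    return None, expanded  # search space exhausted, no solution


if __name__ == "__main__":
    path, expanded = solve(LEVEL)
    print("solution:", path, "| states expanded:", expanded)
```

The expanded-state count grows quickly as crates and open floor are added, which is one way to picture why exhaustive search stalls on richer games. Note what the number does not capture: it says how much work the machine did, not whether a person would find the level worth playing.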