2026-06-22 · paper-digest
Monti et al.: Measuring AI's Planning Power on a Single-Corridor Sokoban — Fukai Reads
This paper by Monti and colleagues describes SokoBench, a benchmark that uses Sokoban to measure the long-horizon planning ability of reasoning models. Every puzzle is a straight corridor containing a single box, so difficulty varies along only one axis: corridor length. The results show that even state-of-the-art reasoning models break down once a puzzle requires more than 25 to 30 moves of lookahead. The authors attribute the failure to accumulated counting errors.
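
To make the single-axis setup concrete, here is a minimal Python sketch of what a corridor puzzle and a plan checker might look like. It is not the paper's code: the function names `make_corridor` and `solves`, the layout (player and box at the left end, goal at the right end), and the rule that any illegal move invalidates the plan are all assumptions made for illustration. Only the level symbols (`#` wall, `@` player, `$` box, `.` goal) follow the common Sokoban text notation.

```python
def make_corridor(length: int) -> str:
    """Build a one-row corridor with `length` interior cells (assumed layout).

    The player starts at the left end next to the box, and the goal
    sits at the right end, so the only difficulty knob is `length`.
    """
    if length < 3:
        raise ValueError("need at least three cells: player, box, goal")
    interior = "@$" + " " * (length - 3) + "."
    wall = "#" * (length + 2)
    return "\n".join([wall, "#" + interior + "#", wall])


def solves(level: str, moves: str) -> bool:
    """Return True if `moves` (a string of 'L'/'R') pushes the box onto the goal.

    Assumption: any illegal move (unknown symbol, walking into a wall,
    pushing the box into a wall) makes the whole plan invalid.
    """
    row = level.split("\n")[1]  # the corridor itself is the middle row
    player, box, goal = row.index("@"), row.index("$"), row.index(".")
    for m in moves:
        if m not in ("L", "R"):
            return False
        step = 1 if m == "R" else -1
        target = player + step
        if row[target] == "#":
            return False  # player walked into a wall
        if target == box:
            if row[box + step] == "#":
                return False  # box pushed into a wall
            box += step
        player = target
    return box == goal


level = make_corridor(30)
print(solves(level, "R" * 28))  # exact count of pushes
print(solves(level, "R" * 27))  # one move short
print(solves(level, "R" * 29))  # one move too many
```

Under these assumptions the only correct plan for a corridor of 30 cells is exactly 28 moves to the right, and a plan that is off by one in either direction fails. That is the kind of task where a small counting slip, accumulated over a long corridor, is enough to break an otherwise trivial solution.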