REVIEW · 2022-11-03

The Entropy Centre

Test chambers where you rewind time aboard a dying space station

Steam store page ↗

Read next

FEATURED ESSAY · 2026-06-22

Monti et al.: Measuring AI's Planning Power on a Single-Corridor Sokoban — Fukai Reads

A paper by Monti and colleagues on SokoBench, a benchmark that uses Sokoban to measure the long-horizon planning of reasoning models. The benchmark contains only straight corridors with a single box, which reduces difficulty to a single axis: corridor length. On these puzzles, even state-of-the-art reasoning models break down once a solution needs more than 25-30 moves of lookahead. The authors attribute the failure to accumulated counting errors.
