commit 8f6bcecff9c0de70911f505506bc65fc3c588dc7
parent b3077d446a1b2e14aa526b86af97732322f74c40
Author: Brian Graham <brian@buildingbetterteams.de>
Date: Tue, 7 Apr 2026 07:01:05 +0200
Document bot false positives and unbuildable game limitation
- Start detection order bug: clicks canvas before checking start buttons
- piece_locks and game_over can false-positive on static start screens
- Some agents build working Vite/webpack projects but don't compile them
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat:
1 file changed, 3 insertions(+), 0 deletions(-)
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -119,6 +119,9 @@ Short URL IDs: 8-char SHA256 hash for `/r/` and `/c/` routes with redirect pages
### Eval
- [ ] Quality scoring too coarse (binary pass/fail on 3 checks = 0/33/67/100%)
- [ ] Gameplay bot does NOT test: wall kicks, lock delay (sliding at collision line), T-spins, hold piece, ghost piece, next piece preview, level/speed progression, DAS. Known limitation for methodology page.
+- [ ] Gameplay bot start detection clicks the canvas before checking for start buttons, causing a false "started" result on start screens. Reorder to check buttons first.
+- [ ] Gameplay bot false positives: the `piece_locks` and `game_over` checks can pass on static start screens when the grid reader misidentifies UI chrome as game state.
+- [ ] Some agents produce working games that require a build step (Vite/webpack) but never run the build, so the submitted artifact is source code, not a playable game. The eval scores these 0 even though the game works once built.
- [ ] Memory leak detection via Playwright heap snapshots
- [ ] Frame rate measurement during gameplay
- [ ] Dead code detection (knip)
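
The two gameplay bot items above (start detection order and false positives on static start screens) could be addressed along the following lines. This is a minimal sketch, not the bot's actual code: `StartScreen`, `StartAction`, `chooseStartAction`, `Grid`, and `gridChanged` are hypothetical names, and it assumes the bot can list visible start-button selectors and sample the grid reader's output several times.

```typescript
// Hypothetical snapshot of what the bot sees before the game starts.
type StartScreen = {
  startButtons: string[]; // selectors of visible start/play buttons (assumed input)
  hasCanvas: boolean;     // whether a <canvas> element is present
};

type StartAction =
  | { kind: "click-button"; selector: string }
  | { kind: "click-canvas" }
  | { kind: "none" };

// Check start buttons first; fall back to a canvas click only when
// no button is visible. This is the reordering the checklist asks for.
function chooseStartAction(screen: StartScreen): StartAction {
  const [firstButton] = screen.startButtons;
  if (firstButton !== undefined) {
    return { kind: "click-button", selector: firstButton };
  }
  if (screen.hasCanvas) {
    return { kind: "click-canvas" };
  }
  return { kind: "none" };
}

// Hypothetical grid-reader output: one number per cell.
type Grid = number[][];

// Guard against static screens: report a change only if at least one
// later sample differs from the first. A start screen whose UI chrome is
// misread as cells yields identical samples and is rejected.
function gridChanged(samples: Grid[]): boolean {
  if (samples.length < 2) return false;
  const first = JSON.stringify(samples[0]);
  return samples.some((grid) => JSON.stringify(grid) !== first);
}
```

Under these assumptions, the bot would count `piece_locks` or `game_over` only after `gridChanged` returns true for samples taken once the start action has run, so a screen that never changes cannot pass either check.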