commit e7d3751a6c70ea21ce4ababeea6480153f01ddc8
parent 8f6bcecff9c0de70911f505506bc65fc3c588dc7
Author: Brian Graham <brian@buildingbetterteams.de>
Date: Tue, 7 Apr 2026 07:07:49 +0200
Add limitation: UI bugs masking working gameplay logic

Games with CSS issues, broken start buttons, or overlays can score 0
even when the underlying game logic works correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat:
1 file changed, 1 insertion(+), 0 deletions(-)
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -122,6 +122,7 @@ Short URL IDs: 8-char SHA256 hash for `/r/` and `/c/` routes with redirect pages
 - [ ] Gameplay bot start detection checks canvas click before start buttons, causing false "started" on start screens. Reorder to check buttons first.
 - [ ] Gameplay bot false positives: piece_locks and game_over can pass on static start screens when grid reader misidentifies UI chrome as game state.
 - [ ] Some agents build working games that require a build step (Vite/webpack) but don't run the build, so the artifact is source code not a playable game. The eval scores 0 but the game "works" if you build it.
+- [ ] Games with minor UI bugs (CSS z-index, overflow, missing start button handler) can mask fully working gameplay logic. The bot scores 0 because it can't access the game, even though the code is correct. A "start game" button that doesn't work prevents testing all other mechanics.
 - [ ] Memory leak detection via Playwright heap snapshots
 - [ ] Frame rate measurement during gameplay
 - [ ] Dead code detection (knip)