commit 17a4bada036386de83204177a3af6db3546666c3
parent bfd97a203969e55db8b901dd4e9779c27b575264
Author: Brian Graham <brian@buildingbetterteams.de>
Date: Tue, 7 Apr 2026 07:27:33 +0200
Add spec for gameplay bot rewrite (falling piece detection)
Start detection based on detecting a falling piece instead of pixel
changes. Conditional phase execution to prevent false positives.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat:
1 file changed, 66 insertions(+), 0 deletions(-)
diff --git a/tasks/tetris/eval/gameplay-bot/NEXT_SESSION_SPEC.md b/tasks/tetris/eval/gameplay-bot/NEXT_SESSION_SPEC.md
@@ -0,0 +1,66 @@
+# Gameplay Bot Rewrite Spec
+
+## Problem
+The bot reports false positives because it assumes the game has started when it has not.
+Current start detection clicks the canvas and checks whether any pixel changed.
+That check also fires on title screens, hover effects, and animations.
+
+## New Start Detection
+
+Universal signal: **a piece is falling**. After each trigger attempt,
+run a falling piece detector instead of screenshot comparison.
+
+### Trigger sequence (try each, check for falling piece after each):
+1. Wait 3s (auto-start)
+2. Click canvas center
+3. Press Enter
+4. Press Space
+5. Click body at various positions
+6. Press various keys (arrow down, Z, etc.)
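+
+A minimal sketch of the "try each, check after each" loop, assuming each trigger
+is a plain callable and the detector is passed in as a function. The names
+`try_triggers`, `Trigger`, and `piece_is_falling` are hypothetical and not part
+of the existing bot.
+
+```python
+from typing import Callable, Optional, Sequence, Tuple
+
+# Hypothetical shape: (label, action). The action performs one trigger,
+# e.g. waiting 3s, clicking the canvas center, or pressing Enter.
+Trigger = Tuple[str, Callable[[], None]]
+
+
+def try_triggers(
+    triggers: Sequence[Trigger],
+    piece_is_falling: Callable[[], bool],
+) -> Optional[str]:
+    """Run triggers in order and stop at the first one that starts the game.
+
+    Returns the label of the trigger after which a falling piece was
+    detected, or None if no trigger produced one.
+    """
+    for label, action in triggers:
+        action()
+        if piece_is_falling():
+            return label
+    return None
+```
+
+Returning the label rather than a bare boolean lets the report say which
+trigger started the game.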
+
+### Falling piece detector:
+- Take 3 screenshots ~1s apart
+- Find a rectangular cluster of colored pixels (~4 cells) that moved downward
+- Bounding box is small but not always square: a tetromino spans 2x2 (O), 3x2 or 2x3 (T, S, Z, J, L), or 4x1 or 1x4 (I) cells
+- May have rounded edges, glows, shadows -- look for the bounding box
+- Works for canvas, DOM, SVG, WebGL -- any rendering approach
+- If piece already at bottom, detect new piece spawning at top instead
+- Consider: games might render pieces as individual DOM divs, SVG rects,
+ canvas fills, or WebGL quads
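+
+One way the detector could work, as a rough sketch only. It assumes each
+screenshot has already been reduced to a 2D grid where 0 is background and any
+other value is a colored pixel, that the falling piece does not touch the
+stack, and that no input moves it sideways while sampling. `cluster_boxes` and
+`piece_is_falling` are hypothetical names. The "new piece spawning at top"
+fallback is not covered here.
+
+```python
+from collections import deque
+from typing import List, Sequence, Tuple
+
+Frame = Sequence[Sequence[int]]   # 0 = background, nonzero = colored pixel
+Box = Tuple[int, int, int, int]   # (top, left, height, width)
+
+
+def cluster_boxes(frame: Frame) -> List[Box]:
+    """Bounding boxes of 4-connected clusters of non-background pixels."""
+    rows, cols = len(frame), len(frame[0])
+    seen = [[False] * cols for _ in range(rows)]
+    boxes: List[Box] = []
+    for r in range(rows):
+        for c in range(cols):
+            if frame[r][c] == 0 or seen[r][c]:
+                continue
+            # Breadth-first flood fill from this unvisited colored pixel.
+            seen[r][c] = True
+            queue = deque([(r, c)])
+            top, left, bottom, right = r, c, r, c
+            while queue:
+                y, x = queue.popleft()
+                top, bottom = min(top, y), max(bottom, y)
+                left, right = min(left, x), max(right, x)
+                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
+                    inside = 0 <= ny < rows and 0 <= nx < cols
+                    if inside and frame[ny][nx] != 0 and not seen[ny][nx]:
+                        seen[ny][nx] = True
+                        queue.append((ny, nx))
+            boxes.append((top, left, bottom - top + 1, right - left + 1))
+    return boxes
+
+
+def piece_is_falling(frames: Sequence[Frame]) -> bool:
+    """True if some cluster keeps its column and size but moves down
+    between every pair of consecutive frames."""
+    if len(frames) < 2:
+        return False
+    candidates = cluster_boxes(frames[0])
+    for later in frames[1:]:
+        later_boxes = cluster_boxes(later)
+        candidates = [
+            new
+            for old in candidates
+            for new in later_boxes
+            if new[1:] == old[1:] and new[0] > old[0]
+        ]
+    return bool(candidates)
+```
+
+Static content such as a title screen or the locked stack keeps the same top
+edge across frames, so it never qualifies as a candidate.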
+
+### If no falling piece after all triggers:
+- Game did not start
+- All downstream tests: "skipped: game did not start"
+- Zero false positives
+
+## Conditional Phase Execution
+
+Each phase depends on the previous succeeding:
+
+1. **Load + calibrate**: always runs
+2. **Start detection**: try triggers, confirm falling piece
+3. **Mechanics test**: only if game started (piece detected)
+4. **Gameplay (play to win)**: only if mechanics worked
+5. **Game over**: only if pieces can be placed. Must stack pieces to top
+ and verify via grid reader (filled cells in top rows), NOT screenshot comparison
+6. **Endurance**: only if gameplay phase succeeded
+
+Failed prerequisites -> "skipped: [prerequisite] failed" on all downstream tests.
+No more false positives from static screens.
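+
+A minimal sketch of the gating, assuming the phases form a simple linear chain
+and each phase is a callable that returns True on success. The spec's actual
+dependencies are slightly finer-grained (for example, game over depends on
+pieces being placeable). `run_phases` and `Phase` are hypothetical names.
+
+```python
+from typing import Callable, Dict, List, Optional, Tuple
+
+# Hypothetical shape: (phase name, function returning True on success).
+Phase = Tuple[str, Callable[[], bool]]
+
+
+def run_phases(phases: List[Phase]) -> Dict[str, str]:
+    """Run phases in order; once one fails, skip everything downstream."""
+    results: Dict[str, str] = {}
+    failed: Optional[str] = None
+    for name, run in phases:
+        if failed is not None:
+            # Downstream phases are never executed, only reported.
+            results[name] = f"skipped: {failed} failed"
+            continue
+        if run():
+            results[name] = "passed"
+        else:
+            results[name] = "failed"
+            failed = name
+    return results
+```
+
+Because skipped phases are never executed, a static screen cannot produce a
+pass in any later phase.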
+
+## Game Over Fix
+
+Current: screenshot comparison (nothing changed = game over).
+This produces false positives on static start screens, where nothing changes either.
+
+New:
+1. Actually place pieces (hard drop repeatedly)
+2. Verify via grid reader that filled cells reach top rows
+3. Then check if inputs stop having effect (piece doesn't spawn)
+4. Optionally look for "game over" text in DOM
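+
+Steps 2 and 3 could be combined roughly as follows, assuming the grid reader
+returns a 2D grid of booleans with row 0 at the top. The names `Grid`,
+`stack_reached_top`, and `looks_like_game_over`, and the default of two top
+rows, are illustrative assumptions.
+
+```python
+from typing import Sequence
+
+# Hypothetical grid-reader output: True = filled cell, row 0 = top row.
+Grid = Sequence[Sequence[bool]]
+
+
+def stack_reached_top(grid: Grid, top_rows: int = 2) -> bool:
+    """True if any cell in the top rows is filled."""
+    return any(cell for row in grid[:top_rows] for cell in row)
+
+
+def looks_like_game_over(before: Grid, after: Grid, top_rows: int = 2) -> bool:
+    """before/after are grids read before and after sending an input.
+
+    Game over is only reported when the stack has reached the top AND
+    the input had no visible effect on the grid.
+    """
+    unchanged = [list(row) for row in before] == [list(row) for row in after]
+    return stack_reached_top(after, top_rows) and unchanged
+```
+
+Requiring both conditions is what keeps a static start screen from counting:
+its grid is unchanged, but no filled cells reach the top rows.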
+
+## Notes
+- Games might auto-start (no button needed)
+- Start buttons might be canvas-rendered (no DOM button to find)
+- Some games have splash screens with animations (pixel change != game start)
+- The key insight: a FALLING PIECE is the only universal signal that gameplay began