# Two-Tier Refactor Spec: Driver + Bot

## Problem Statement

The gameplay bot is ~3500 lines across 6 files, with two distinct concerns tangled
together: understanding the webpage (finding grids, clicking buttons, reading pixels,
sending keystrokes) and playing Tetris (phase orchestration, AI decisions, test
derivation, bug detection). The boundary between them is blurred:

- `calibrate.ts` handles grid detection, start mechanism detection, control detection,
  overlay detection, interactivity verification, screenshot sampling, visual change
  detection, and page surveying -- all in one 1300-line file.
- `tests.ts` does phase orchestration, BUT ALSO calls `readGrid` directly during
  mechanics tests, reads score elements, detects game over text, measures drop
  intervals, detects next piece previews, and reads level displays.
- `player.ts` calls both `readGrid` and `page.keyboard.press` directly, coupling
  AI logic to the Playwright API.
- `grid-reader.ts` is the cleanest module but still exports low-level grid analysis
  utilities (bounding boxes, cell counts, piece identification) that the bot calls
  directly instead of going through an abstraction.

The result: any change to how the page is read ripples through all files. You cannot
test the AI player without a live Playwright page. You cannot swap the grid reader
without touching the test orchestrator.

## Proposed Architecture

```
        +------------------+
        |    index.ts      |  Entry point: HTTP server, Playwright test,
        |                  |  report output. Unchanged.
        +--------+---------+
                 |
                 v
        +------------------+
        |     bot.ts       |  Layer 2: "The Brain"
        |                  |  Phase orchestration, AI decisions, test
        |                  |  derivation, competitive play, bug detection.
        |                  |  Calls only the Driver interface.
        +--------+---------+
                 |
                 v
        +------------------+
        |    driver.ts     |  Layer 1: "The Eyes and Hands"
        |                  |  Abstracts the webpage. Exposes a clean API.
        |                  |  Handles grid reading, start detection,
        |                  |  control detection, keyboard input.
        +--------+---------+
                 |
       +---------+---------+
       |                   |
       v                   v
 +-----------+      +------------+
 | types.ts  |      | player.ts  |  Pure Tetris logic: AI heuristics,
 |           |      |            |  board simulation, placement finding.
 +-----------+      |            |  NO Playwright imports. NO page access.
                    +------------+
```

### What goes where

**driver.ts** -- "I can see and interact with this webpage"
- Grid detection (finding the grid on the page)
- Grid reading (10x20 boolean matrix from canvas/DOM/SVG)
- Start mechanism detection (the 5-phase cascade)
- Control detection (which keys the game responds to)
- Score/level/lines reading
- Keyboard input (move, rotate, drop)
- Screenshot capture
- Interactivity verification
- Page surveying (pre-test data collection)
- Background color sampling
- Visual change detection
- Next piece preview detection
- Game over text detection
- Re-calibration

**bot.ts** -- "I know Tetris rules and test logic"
- Phase orchestration (the 8 conditional phases)
- Test derivation from session data (the 24 tests)
- Score/timing/event tracking (GameSession bookkeeping)
- Competitive play with bug detection
- Line clear detection logic (watching grid state transitions)
- Game over triggering strategy (stack pieces to fill grid)
- Endurance testing
- Report assembly (BotReport construction)

**player.ts** -- "I know where to put pieces" (pure computation, no I/O)
- 4-heuristic scoring (aggregate height, lines, holes, bumpiness)
- Piece definitions (rotations, dimensions)
- Board simulation (drop piece, clear lines)
- Best placement finding
- No `Page` import, no `readGrid` call, no `keyboard.press`

**types.ts** -- unchanged, all interfaces stay

**grid-reader.ts** -- absorbed into driver.ts (see migration plan)

**index.ts** -- unchanged except it calls bot.ts
instead of tests.ts

---

## Driver Interface

```typescript
import type { Page } from "@playwright/test";
import type {
  Grid,
  GridBounds,
  RendererType,
  Controls,
  StartMechanism,
  SurveyData,
  PieceType,
} from "./types";

// ---------------------------------------------------------------------------
// Configuration returned by calibration, passed through subsequent calls.
// Replaces CalibrationResult for internal use within the Driver.
// ---------------------------------------------------------------------------

export interface DriverCalibration {
  renderer: RendererType;
  gridDetected: boolean;
  gridBounds: GridBounds | null;
  cellWidth: number;
  cellHeight: number;
  controls: Controls;
  startMechanism: StartMechanism;
  scoreElementSelector: string | null;
  levelElementSelector: string | null;
  backgroundColor: [number, number, number] | null;
  consoleErrors: string[];
  gridConfidence: number;
  startButton?: {
    selector: string;
    text: string;
    disappeared: boolean;
    position: { x: number; y: number };
  };
}

// ---------------------------------------------------------------------------
// Grid snapshot: the grid state plus derived information the bot needs.
// ---------------------------------------------------------------------------

export interface GridSnapshot {
  /** The 10x20 boolean grid. null if reading failed. */
  grid: Grid | null;
  /** Total filled cells. 0 if grid is null. */
  filledCount: number;
  /** Filled cells in the bottom N rows. */
  filledInBottom(rows: number): number;
  /** Whether any cell in the top N rows is filled. */
  hasFilledInTop(rows: number): boolean;
  /** Number of fully complete rows. */
  completeRows: number;
  /** Active piece cells (diff against settled grid). null if undetectable.
   */
  activePieceCells: [number, number][] | null;
  /** Identified piece type from active piece cells. null if no active piece. */
  activePieceType: PieceType | null;
}

// ---------------------------------------------------------------------------
// The Driver interface. This is what the Bot sees.
// ---------------------------------------------------------------------------

export interface TetrisDriver {
  // -- Lifecycle --

  /**
   * Navigate to the game URL, wait for load, begin console error collection.
   * Returns `loaded: false` if the page failed to load.
   */
  loadPage(url: string): Promise<{ loaded: boolean; detail: string; errorsOnLoad: number }>;

  /**
   * Survey the page structure before any interaction.
   * Returns information about overlays, canvas elements, DOM grids, visible text.
   */
  surveyPage(): Promise<SurveyData>;

  /**
   * Run full calibration: grid detection, start mechanism detection,
   * control detection, score element detection, grid confidence measurement.
   * Includes re-calibration fallback if initial detection fails.
   * Never throws.
   */
  calibrate(): Promise<DriverCalibration>;

  /**
   * Re-run calibration after the game state may have changed
   * (e.g., after starting, grid might appear that wasn't there before).
   * Keeps the current calibration if re-calibration finds nothing better.
   */
  recalibrate(): Promise<DriverCalibration>;

  /**
   * Get the current calibration. Throws if calibrate() hasn't been called.
   */
  getCalibration(): DriverCalibration;

  // -- Grid Reading --

  /**
   * Read the current grid state. Returns a GridSnapshot with the raw grid
   * and derived metrics. If settled grid is provided, active piece detection
   * is diffed against it.
   *
   * Returns a snapshot with grid: null if reading fails.
   */
  readGrid(settledGrid?: Grid | null): Promise<GridSnapshot>;

  /**
   * Compare two grids. Returns true if they differ, false if they are equal.
   */
  gridsAreDifferent(a: Grid | null, b: Grid | null): boolean;

  // -- Input --

  /**
   * Press a game control key. Uses the controls detected during calibration.
   */
  pressKey(action: "left" | "right" | "down" | "rotate" | "drop"): Promise<void>;

  /**
   * Press an arbitrary key (for testing CCW rotation with 'z', etc.).
   */
  pressRawKey(key: string): Promise<void>;

  /**
   * Wait for a specified duration (milliseconds).
   */
  wait(ms: number): Promise<void>;

  // -- Score/Level/Lines Reading --

  /**
   * Read the current score from the detected score element.
   * Returns null if no score element was found or reading fails.
   */
  readScore(): Promise<number | null>;

  /**
   * Read the current level from the page.
   * Returns null if no level display found or reading fails.
   */
  readLevel(): Promise<number | null>;

  // -- Page State Queries --

  /**
   * Check if "Game Over" (or equivalent) text is visible on the page.
   * Returns the matched text, or null if not found.
   */
  detectGameOverText(): Promise<string | null>;

  /**
   * Check if a restart button/prompt is visible.
   */
  detectRestartOption(): Promise<boolean>;

  /**
   * Check if a next piece preview display exists.
   */
  detectNextPiecePreview(): Promise<boolean>;

  /**
   * Get all console errors collected since loadPage() was called.
   */
  getConsoleErrors(): string[];

  // -- Screenshots --

  /**
   * Take a screenshot. Returns raw PNG buffer.
   */
  screenshot(): Promise<Buffer>;

  /**
   * Measure the auto-drop interval (time between gravity-driven grid changes
   * with no input). Returns average interval in ms, or 0 if unmeasurable.
   */
  measureDropInterval(): Promise<number>;
}
```

### Method-to-Source Mapping

Each Driver method maps to existing code as follows:

| Driver Method | Current Source | Current Function(s) |
|---|---|---|
| `loadPage()` | tests.ts:277-303 | `loadAndCheckPage()`, `loadGamePage()` |
| `surveyPage()` | calibrate.ts:1300-1393 | `surveyPage()` |
| `calibrate()` | calibrate.ts:24-94 | `calibrate()`, `detectGrid()`, `detectStartMechanism()`, `detectControls()`, `detectScoreElement()`, `measureGridConfidence()` |
| `recalibrate()` | tests.ts:152-163 | inline re-calibration after start |
| `readGrid()` | grid-reader.ts:15-38, 46-118, 142-364 | `readGrid()`, `readCanvasGrid()`, `readDomGrid()`, plus `countFilled()`, `countFilledInBottomRows()`, `hasFilledInTopRows()`, `countCompleteRows()`, `detectActivePieceCells()`, `identifyPieceType()` |
| `gridsAreDifferent()` | grid-reader.ts:400-410 | `gridsAreDifferent()` |
| `pressKey()` | player.ts:251-277 | inline `page.keyboard.press()` calls using `cal.controls` |
| `pressRawKey()` | tests.ts:841-842 | inline `page.keyboard.press("z")` |
| `wait()` | everywhere | `page.waitForTimeout()` |
| `readScore()` | tests.ts:490-497, 529-538, 743-749 | inline score element reading |
| `readLevel()` | tests.ts:1597-1630 | `readLevelFromPage()` |
| `detectGameOverText()` | tests.ts:929-940 | inline `page.evaluate()` for game over text |
| `detectRestartOption()` | tests.ts:943-955 | inline `page.evaluate()` for restart buttons |
| `detectNextPiecePreview()` | tests.ts:1669-1717 | `detectNextPiecePreview()` |
| `getConsoleErrors()` | tests.ts:94-98 | `consoleErrors` array |
| `screenshot()` | player.ts:370-371 | `page.screenshot()` |
| `measureDropInterval()` | tests.ts:1636-1664 | `measureDropInterval()` |

### How the Driver handles different renderers

The Driver encapsulates renderer differences entirely.
The Bot never knows or cares
whether the game uses canvas, DOM, SVG, or WebGL.

```
readGrid() internally:
  if renderer === "canvas" && gridBounds:
    -> readCanvasGrid() via page.evaluate(getImageData)
  if renderer === "dom":
    -> readDomGrid() via page.evaluate(DOM traversal)
  if renderer === "svg":
    -> future: readSvgGrid()
  fallback:
    -> try canvas if bounds exist, then try DOM
```

The `GridSnapshot` returned to the Bot is always the same shape regardless of renderer.

### Re-calibration

The Driver maintains mutable internal state:

```typescript
class PlaywrightDriver implements TetrisDriver {
  private page: Page;
  private cal: DriverCalibration | null = null;
  private consoleErrors: string[] = [];
  // ... TetrisDriver methods omitted
}
```

`recalibrate()` re-runs grid detection and start detection, but preserves
the existing calibration if the new one is worse (e.g., grid detection fails
on re-calibration but worked initially). This handles:

- Games where the grid appears only after clicking "Start"
- Games where the grid is rebuilt on game restart (new DOM elements)
- Games where the canvas resizes after initialization

### Error handling

| Scenario | Driver behavior |
|---|---|
| Grid read returns null | `readGrid()` returns `GridSnapshot` with `grid: null`, `filledCount: 0` |
| Grid read throws | Same as null -- caught internally, never thrown to Bot |
| No score element found | `readScore()` returns `null` |
| Score element disappeared | `readScore()` returns `null` (caught internally) |
| Console error during play | Accumulated in `consoleErrors`, accessible via `getConsoleErrors()` |
| Page navigation fails | `loadPage()` returns `{ loaded: false, detail: "..." }` |
| Canvas getImageData all zeros (no GPU) | Grid validation rejects (>60% filled), returns null |
| Calibration finds nothing | Returns calibration with `gridDetected: false`, `startMechanism: "unknown"` |

The Driver never throws on page or game failures; those are all represented in
return values. The one exception is `getCalibration()`, which throws if
`calibrate()` has not been called yet (a caller error, not a page failure).

---

## Bot Interface

### How the Bot calls the Driver

The Bot receives a `TetrisDriver` instance. It never imports `Page` or
anything from Playwright. It never calls `page.evaluate()`, `page.keyboard`,
or `page.screenshot()` directly.

```typescript
// bot.ts
import type { TetrisDriver, DriverCalibration, GridSnapshot } from "./driver";
import type {
  TestResult,
  GameplayStats,
  GameSession,
  CompetitivePlayResult,
  SurveyData,
  BotReport,
  Grid,
} from "./types";
import { findBestPlacement } from "./player";

export async function runAllTests(
  driver: TetrisDriver,
  serverUrl: string
): Promise<{
  testResults: TestResult[];
  calibration: DriverCalibration;
  gameplay: GameplayStats;
  session: GameSession;
  survey: SurveyData;
  competitivePlay: CompetitivePlayResult | null;
}> {
  // Phase 1: Load
  const loadResult = await driver.loadPage(serverUrl);
  // ...

  // Phase 2: Calibrate
  const cal = await driver.calibrate();
  // ...

  // Phase 3-8: Use only driver.readGrid(), driver.pressKey(), etc.
}
```

### Phase execution flow using Driver methods

**Phase 1: Page Load**
```
driver.loadPage(url) -> { loaded, detail, errorsOnLoad }
driver.wait(3000)
```

**Phase 2: Calibrate + Start**
```
survey = driver.surveyPage()
cal = driver.calibrate()
  // Internally: detectStartMechanism(), detectGrid(), etc.
if cal.startMechanism === "unknown" || !cal.gridDetected:
  cal = driver.recalibrate()
```

**Phase 3: Basic Mechanics**
```
// Auto-drop test
snap0 = driver.readGrid()
driver.wait(5000)
snap1 = driver.readGrid()
gridChanged = driver.gridsAreDifferent(snap0.grid, snap1.grid)

// Movement tests
for dir in [left, right, down]:
  snapBefore = driver.readGrid()
  driver.pressKey(dir)
  driver.wait(300)
  snapAfter = driver.readGrid()
  // compare

// Rotation test
snapBefore = driver.readGrid()
driver.pressKey("rotate")
driver.wait(300)
snapAfter = driver.readGrid()
// compare bounding boxes of active piece cells

// Hard drop test
driver.pressKey("drop")
driver.wait(500)
snapAfter = driver.readGrid()
// check bottom rows
```

**Phase 4: Piece Lifecycle**
```
// Already tested during Phase 3 mechanics
// Piece locks: bottom cells persist across reads
// New piece spawns: top rows have cells after drop
// Multiple pieces: piecesLocked counter >= 3
```

**Phase 5: Gameplay**
```
driver.loadPage(url)
cal = driver.calibrate()
initialScore = driver.readScore()
// Play loop (60 pieces / 45s):
while pieces < 60 && elapsed < 45s:
  snap = driver.readGrid(settledGrid)
  if snap.activePieceCells:
    placement = findBestPlacement(settledGrid, snap.activePieceType)
    // Execute placement using driver.pressKey()
    for i in 0..placement.rotations:
      driver.pressKey("rotate")
      driver.wait(50)
    // Move to column
    driver.pressKey("left" or "right") * N
    driver.pressKey("drop")
    driver.wait(100)
    settledGrid = (await driver.readGrid()).grid
  driver.wait(60)
finalScore = driver.readScore()
```

**Phase 6: Game Over**
```
driver.loadPage(url)
driver.calibrate()
// Hard drop 40 times, checking grid after every 5
for i in 0..40:
  driver.pressKey("drop")
  driver.wait(150)
  if i % 5 === 0:
    snap = driver.readGrid()
    if snap.hasFilledInTop(4):
      driver.pressKey("drop")
      driver.wait(300)
      snap2 = driver.readGrid()
      if !driver.gridsAreDifferent(snap.grid, snap2.grid):
        // Game over detected
        gameOverText = driver.detectGameOverText()
```

**Phase 7: Endurance**
```
driver.loadPage(url)
driver.calibrate()
// Play for 30 seconds using same play loop as Phase 5
```

**Phase 8: Competitive Play**
```
driver.loadPage(url)
driver.calibrate()
initialDropInterval = driver.measureDropInterval()
initialLevel = driver.readLevel()
// Play for 60 seconds with detailed tracking
// Every 5th poll: driver.readScore()
// Every 10th poll: driver.readLevel()
// Periodic: driver.pressRawKey("z") for CCW test
// Periodic: soft drop test via driver.pressKey("down")
finalDropInterval = driver.measureDropInterval()
nextPieceVisible = driver.detectNextPiecePreview()
gameOverText = driver.detectGameOverText()
restartAvailable = driver.detectRestartOption()
```

### Test derivation

`deriveTestResults()` stays in bot.ts. It receives the `GameSession` data
that the Bot accumulated during phases, and produces the 24-entry `TestResult[]` array.
It does not need the Driver at all -- it operates on pure data.

The function signature is unchanged:

```typescript
function deriveTestResults(
  session: GameSession,
  cal: DriverCalibration,
  loadResult: LoadResult,
  consoleErrors: string[],
  gameplay: GameplayStats,
  phaseState: PhaseState,
  competitivePlay: CompetitivePlayResult | null
): TestResult[]
```

### Where the AI player logic lives

`player.ts` becomes a pure computation module.
It keeps:

- `PIECES` definitions
- `findBestPlacement()` (exported)
- `findBestPlacementGeneric()`
- `simulateDropPiece()`
- `clearLines()`
- `aggregateHeight()`, `countHoles()`, `bumpiness()`
- `stripActivePiece()` (exported)
- `Placement` interface (exported)

It loses:

- `playGame()` -- moves to bot.ts (it orchestrates grid reads + AI + key presses)
- `hardDrop()` -- replaced by `driver.pressKey("drop")`
- `playRandomMove()` -- moves to bot.ts
- `playRandomForDuration()` -- moves to bot.ts
- `tryFillRow()` -- moves to bot.ts
- `stackToGameOver()` -- moves to bot.ts
- `executePlacement()` -- moves to bot.ts (it calls driver.pressKey)
- `countTotalFilled()` -- redundant with GridSnapshot.filledCount

After refactor, `player.ts` has zero Playwright imports.

---

## Migration Plan

### New files created

| File | Purpose | Est. lines |
|---|---|---|
| `driver.ts` | TetrisDriver interface + PlaywrightDriver implementation | ~900 |
| `bot.ts` | Phase orchestration, play loops, test derivation | ~1100 |

### Files modified

| File | Change | Est. lines |
|---|---|---|
| `player.ts` | Remove all Playwright-dependent functions, keep pure AI logic | ~350 -> ~250 |
| `types.ts` | Add `DriverCalibration`, `GridSnapshot` interfaces (or keep in driver.ts). Minor additions. | ~205 -> ~220 |
| `index.ts` | Change import from `tests.ts` to `bot.ts`, instantiate `PlaywrightDriver`, pass to `runAllTests`. | ~260 -> ~270 |

### Files deleted

| File | Reason |
|---|---|
| `calibrate.ts` | Absorbed into `driver.ts` |
| `grid-reader.ts` | Absorbed into `driver.ts` |
| `tests.ts` | Replaced by `bot.ts` |

### What stays

- `types.ts` -- interfaces stay the same, report format unchanged
- `index.ts` -- HTTP server, Playwright test structure, report writing all stay
- `SPEC.md` -- unchanged
- `COMPETITIVE_PLAY_SPEC.md` -- unchanged
- Report format (`BotReport`) -- identical JSON output

### Incremental migration (4 phases)

**Phase A: Create driver.ts with the interface + implementation (no callers yet)**

1. Create `driver.ts` with `TetrisDriver` interface and `PlaywrightDriver` class.
2. Move into it from `calibrate.ts`:
   - `detectStartMechanism()` and its sub-functions (`tryKeyboardTriggers`, `tryDomButtons`, `tryCanvasClicks`)
   - `detectGrid()`
   - `detectControls()`
   - `detectScoreElement()`
   - `measureGridConfidence()`
   - `surveyPage()`
   - `sampleScreenshot()`
   - `detectVisualChange()`
   - `verifyInteractivity()`
   - `clusterPoints()`
   - `recalibrateWithRetry()`
3. Move into it from `grid-reader.ts`:
   - `readGrid()`, `readCanvasGrid()`, `readDomGrid()`
   - `sampleBackgroundColor()`
   - `validateGridBounds()`
   - `gridsAreDifferent()`
   - `countFilled()`, `countFilledInBottomRows()`, `hasFilledInTopRows()`
   - `countCompleteRows()`, `isRowComplete()`
   - `getColumnHeights()`
   - `detectActivePieceCells()`, `identifyPieceType()`
4. Move into it from `tests.ts`:
   - `readLevelFromPage()`
   - `measureDropInterval()`
   - `detectNextPiecePreview()`
   - `extractScoreFromText()` (internal helper)
5. Wrap everything behind `PlaywrightDriver` methods.
6. Export both the interface and the class.
7. At this point, old code still works -- `calibrate.ts`, `grid-reader.ts`, and `tests.ts` are unchanged.
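
The "wrap everything behind `PlaywrightDriver` methods" step can be illustrated with a minimal sketch of how `pressKey()` might hide the keyboard behind a logical action. The names `Action`, `KeyboardLike`, `resolveKey`, and `KeyInputDriver` are hypothetical, and the `Controls` shape (one key name per action) is an assumption; the real type lives in `types.ts` and may differ.

```typescript
// Hypothetical shapes for illustration; the real `Controls` type lives in types.ts.
type Action = "left" | "right" | "down" | "rotate" | "drop";
type Controls = Record<Action, string>;

// Structural stand-in for Playwright's `page.keyboard` (only `press` is used).
interface KeyboardLike {
  press(key: string): Promise<void>;
}

// Pure lookup: which physical key is sent for a logical action.
function resolveKey(controls: Controls, action: Action): string {
  return controls[action];
}

class KeyInputDriver {
  constructor(
    private readonly keyboard: KeyboardLike,
    private readonly controls: Controls,
  ) {}

  // Mirrors TetrisDriver.pressKey(): callers name an action, never a raw key.
  async pressKey(action: Action): Promise<void> {
    try {
      await this.keyboard.press(resolveKey(this.controls, action));
    } catch {
      // Swallow input failures so nothing propagates to the Bot.
    }
  }
}
```

Because `KeyboardLike` is structural, `page.keyboard` from Playwright satisfies it, and so does a recording fake, which keeps this piece testable without a browser.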

**Commit A**: "Add driver.ts: TetrisDriver interface and PlaywrightDriver implementation"

**Phase B: Create bot.ts (calls driver.ts, replaces tests.ts)**

1. Create `bot.ts` with the new `runAllTests()` that accepts `TetrisDriver`.
2. Move into it from `tests.ts`:
   - `runAllTests()` (rewritten to call Driver instead of Playwright directly)
   - `runBasicMechanicsPhase()`
   - `runGameplayPhase()`
   - `runGameOverPhase()`
   - `runEndurancePhase()`
   - `runCompetitivePlayPhase()`
   - `deriveTestResults()`
   - `ALL_TEST_NAMES`
   - `emptyCalibration()` (adapted to return `DriverCalibration`)
   - `loadAndCheckPage()` (replaced by `driver.loadPage()`)
   - `boundingBox()` helper
   - `countFilledInTopRows()` helper (local in tests.ts, replaced by GridSnapshot method)
3. Move into it from `player.ts`:
   - `playGame()` (rewritten to call Driver)
   - `executePlacement()` (rewritten to call Driver)
   - `playRandomMove()` (rewritten to call Driver)
   - `playRandomForDuration()` (rewritten to call Driver)
   - `tryFillRow()` (rewritten to call Driver)
   - `stackToGameOver()` (rewritten to call Driver)
4. bot.ts imports `findBestPlacement`, `stripActivePiece`, `Placement` from `player.ts`
   and everything else from `driver.ts`.

**Commit B**: "Add bot.ts: phase orchestration using TetrisDriver"

**Phase C: Rewire index.ts, slim player.ts**

1. Update `index.ts`:
   - Import `PlaywrightDriver` from `./driver`
   - Import `runAllTests` from `./bot` (not `./tests`)
   - In the test body: `const driver = new PlaywrightDriver(page); const results = await runAllTests(driver, serverUrl);`
2. Remove from `player.ts`:
   - `playGame()`, `hardDrop()`, `executePlacement()`, `playRandomMove()`, `playRandomForDuration()`, `tryFillRow()`, `stackToGameOver()`
   - `import type { Page }` and `import { readGrid, ... }` from grid-reader
   - `countTotalFilled()` (redundant)
3. `player.ts` now exports only:
   - `findBestPlacement()` (accepts `Grid` and `PieceType`, returns `Placement | null`)
   - `stripActivePiece()` (accepts `Grid` and cells, returns `Grid`)
   - `Placement` interface

**Commit C**: "Rewire index.ts to use bot.ts + driver.ts, slim player.ts"

**Phase D: Delete old files**

1. Delete `calibrate.ts`
2. Delete `grid-reader.ts`
3. Delete `tests.ts`
4. Verify all imports resolve
5. Run the full eval pipeline against a known artifact to confirm identical report output

**Commit D**: "Remove old calibrate.ts, grid-reader.ts, tests.ts"

### Backwards compatibility

The report format (`BotReport`) does not change. The JSON output is byte-identical
for the same game input. The summary score calculation is unchanged. The test names
are unchanged. The competitive play data structure is unchanged.

The only external-facing change is the internal file structure. Nothing downstream
(the scoring pipeline, the dashboard, the harness) needs to change.

---

## File Structure After Refactor

```
gameplay-bot/
  types.ts                    ~220 lines    Interfaces (unchanged)
  driver.ts                   ~900 lines    TetrisDriver interface + PlaywrightDriver class
  player.ts                   ~250 lines    Pure AI: heuristics, simulation, placement finding
  bot.ts                      ~1100 lines   Phases, play loops, test derivation, competitive play
  index.ts                    ~270 lines    Playwright test entry, HTTP server, report output
  SPEC.md                                   Unchanged
  COMPETITIVE_PLAY_SPEC.md                  Unchanged
  REFACTOR_SPEC.md                          This document
```

Total: ~2740 lines (down from ~3500 because of deduplication and removing
redundant helpers that now live behind the Driver).

### Import/dependency graph

```
index.ts
  -> driver.ts (PlaywrightDriver constructor)
  -> bot.ts (runAllTests)
  -> types.ts (BotReport)

bot.ts
  -> driver.ts (TetrisDriver interface, DriverCalibration, GridSnapshot)
  -> player.ts (findBestPlacement, stripActivePiece, Placement)
  -> types.ts (all data interfaces)

driver.ts
  -> types.ts (Grid, GridBounds, RendererType, Controls, etc.)
  -> @playwright/test (Page)

player.ts
  -> types.ts (Grid, PieceType)
  (NO @playwright/test import)
```

Key constraint: `bot.ts` does NOT import `@playwright/test`. It depends on the
`TetrisDriver` interface, not the implementation. This means the Bot can be tested
with a mock driver that returns canned grid states -- no browser needed.

---

## Edge Cases

### Games that need re-calibration mid-session

**Scenario**: Grid appears only after clicking "Start". On page load, there is no
canvas and no DOM grid -- just a splash screen.

**Current behavior**: `calibrate()` runs on the splash screen, finds nothing.
Then `tests.ts` tries start mechanisms, and after starting, re-runs `calibrate()`.

**Driver behavior**: `calibrate()` includes start detection. If it starts the game
but finds no grid, it waits and re-scans. `recalibrate()` is also available for the
Bot to call explicitly after any phase reload.

**Bot flow**:
```
cal = driver.calibrate()
if cal.gridDetected === false && cal.startMechanism !== "unknown":
  // Game started but grid not found yet -- wait and retry
  driver.wait(500)
  cal = driver.recalibrate()
```

### Games where the Driver cannot read the grid at all

**Scenario**: Canvas game without GPU access. `getImageData()` returns all zeros.

**Driver behavior**: `readGrid()` returns `GridSnapshot { grid: null }` every time.
The Bot sees grid failures accumulate.
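
As a rough illustration of that null-grid contract, the sketch below builds a snapshot whose derived accessors stay safe to call when the read failed. It assumes `Grid` is a row-major `boolean[][]` with row 0 at the top, which this spec does not state. `GridSnapshotSketch` and `snapshotFromGrid` are hypothetical names, and the active-piece fields of `GridSnapshot` are left out for brevity.

```typescript
// Assumption: Grid is row-major (20 rows of 10 booleans), row 0 at the top.
type Grid = boolean[][];

// Subset of GridSnapshot; activePieceCells / activePieceType are omitted here.
interface GridSnapshotSketch {
  grid: Grid | null;
  filledCount: number;
  filledInBottom(rows: number): number;
  hasFilledInTop(rows: number): boolean;
  completeRows: number;
}

// Count filled cells across the given rows.
function countFilled(rows: Grid): number {
  return rows.reduce((sum, row) => sum + row.filter(Boolean).length, 0);
}

// Build a snapshot; a null grid yields zeroed metrics instead of throwing.
function snapshotFromGrid(grid: Grid | null): GridSnapshotSketch {
  const rows: Grid = grid ?? [];
  return {
    grid,
    filledCount: countFilled(rows),
    // Guard n <= 0: slice(-0) would otherwise return every row.
    filledInBottom: (n) => (n > 0 ? countFilled(rows.slice(-n)) : 0),
    hasFilledInTop: (n) =>
      rows.slice(0, Math.max(0, n)).some((row) => row.some(Boolean)),
    completeRows: rows.filter((row) => row.length > 0 && row.every(Boolean)).length,
  };
}
```

With this shape the Bot can call `snap.hasFilledInTop(4)` unconditionally and only check `snap.grid === null` where it needs to count read failures.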

**Bot flow**: Phase 3 (mechanics) detects that `gridReadSuccess === 0`. The Bot
marks all grid-dependent tests as failed with detail "grid reader unavailable".
It does NOT fall back to screenshot-only testing (per the "NO FALSE POSITIVES" rule).
Competitive play is skipped.

### Games that pause themselves

**Scenario**: Player accidentally triggers a pause menu (Escape key, or a pause
button that overlaps with the game area).

**Driver behavior**: `readGrid()` may return null (if an overlay covers the grid)
or return a static grid (same state on every read). The Driver does not know about
pausing -- it just reports what it sees.

**Bot flow**: The play loop in bot.ts already handles stale grids. If the grid
hasn't changed for 8 seconds, it tries pressing the drop key (which may unpause).
If grid reads start returning null, the Bot counts consecutive failures. After 10
consecutive null reads, it falls back to random key presses for a brief period,
then re-reads.

The Bot could also try pressing Escape or P to dismiss a pause screen:
```
if consecutiveUnchanged > 80: // 80 polls * 60ms = ~5 seconds
  driver.pressRawKey("Escape")
  driver.wait(500)
  driver.pressRawKey("p")
  driver.wait(500)
```

### Games with overlays that block gameplay

**Scenario**: A modal overlay (tutorial, cookie consent, "enter your name" dialog)
appears on top of the game, blocking input.

**Driver behavior**: `surveyPage()` detects overlays (positioned elements covering >50%
of viewport). The start mechanism detection already tries clicking overlays
and pressing Escape to dismiss them.

**Bot flow**: If the game started but mechanics tests show no response to input
(`movementsObserved === 0`), the Bot can request a recalibrate, which may re-run
start detection and dismiss a new overlay.
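
The recovery rules described for paused games and blocking overlays could be gathered into one pure decision function, sketched below. The thresholds (10 consecutive null reads, more than 80 unchanged polls, `movementsObserved === 0`) come from the text above. The `StallState` shape, the `RecoveryAction` names, and the priority order between the rules are assumptions made for illustration only.

```typescript
// Hypothetical bookkeeping the Bot might keep while polling; names are illustrative.
interface StallState {
  consecutiveNullReads: number; // readGrid() returned grid: null
  consecutiveUnchanged: number; // polls where the grid did not change
  gameStarted: boolean;
  movementsObserved: number; // inputs that visibly moved a piece
}

type RecoveryAction = "none" | "random-keys" | "try-unpause" | "recalibrate";

// Assumed priority: unreadable grid first, then frozen grid, then unresponsive input.
function nextRecoveryAction(s: StallState): RecoveryAction {
  if (s.consecutiveNullReads >= 10) return "random-keys";
  if (s.consecutiveUnchanged > 80) return "try-unpause"; // Escape, then "p"
  if (s.gameStarted && s.movementsObserved === 0) return "recalibrate";
  return "none";
}
```

Keeping the decision separate from the key presses means it needs no Driver at all; the play loop would map each action onto `driver.pressRawKey()`, `driver.pressKey()`, or `driver.recalibrate()`.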

### Games in different languages

**Scenario**: The game UI is in Spanish, Japanese, or any non-English language.
"Start", "Game Over", "Score" have different text.

**Driver behavior**: Start mechanism detection is already fully language-agnostic
(visual change detection + interactivity verification, no text matching). Score
element detection falls back from labeled text ("Score: 0") to structural heuristics
(leaf element containing a standalone number). Game over text detection checks
multiple languages ("game over", "fin del juego", etc.) or falls back to
grid-state-based detection (grid frozen after filling to top).

**Bot flow**: The Bot does not do any text matching. It delegates all text-based
detection to the Driver. Tests like `game_over` use `driver.detectGameOverText()`
which is the Driver's responsibility. The Bot adds a grid-based game over check
(frozen grid after stacking) as a secondary signal that doesn't depend on language.

The `detectGameOverText()` method could be extended with more languages:
```typescript
// Inside driver.ts
const gameOverPatterns = [
  "game over", "gameover", "you lose", "try again",
  "play again", "restart", "fin del juego", "juego terminado",
  "ゲームオーバー", "游戏结束"
];
```

But the primary game over detection in bot.ts (Phase 6) does not depend on text --
it watches the grid freeze after filling to the top.

---

## What This Spec Does NOT Cover

- WebGL grid reading (not implemented yet, out of scope)
- New tests beyond the existing 24
- Changes to the report format or scoring
- Dashboard changes
- Harness changes
- Performance optimization of grid reading
- Testability improvements beyond the Driver/Bot split (e.g., mock Driver tests)

These are natural follow-ups after the refactor lands, but they are separate work items.