loop-benchmarking

Controlled experiments across agentic coding configurations. Same task, one variable, what actually works.
git clone https://git.shiptheloop.com/loop-benchmarking.git

commit 7df3ddd793a69cba93ded966d634045f4810a5fc
parent 2644610c24ac12d9ef707571aec1e31a934389a8
Author: Brian Graham <brian@buildingbetterteams.de>
Date:   Fri, 10 Apr 2026 21:06:34 +0200

Correct attribution: Pierre Dellacherie's 4-heuristic Tetris AI

The algorithm is from Pierre Dellacherie (2003), not LeeYiyuan.
Weights are from Colin Fahey's GA optimization. LeeYiyuan/tetrisai
is the reference implementation we adapted code from.

Updated methodology page, SPEC.md, player.ts, bot.ts, CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat:
M CLAUDE.md | 2 +-
M dashboard/src/pages/methodology.astro | 8 +++++---
M tasks/tetris/eval/gameplay-bot-v2/bot.ts | 3 ++-
M tasks/tetris/eval/gameplay-bot/SPEC.md | 4 ++--
M tasks/tetris/eval/gameplay-bot/player.ts | 13 +++++++------
5 files changed, 17 insertions(+), 13 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
@@ -85,7 +85,7 @@ All evaluation is deterministic code. No LLM grading.
 2. `quality.sh` - ESLint, typecheck, bundle size
 3. `code-analysis.py` - 14 code quality metrics
 4. `transcript-analysis.py` - agent behavior from conversation log
-5. `gameplay-bot/` - two-phase: mechanics test then play-to-win (adapted from MIT-licensed LeeYiyuan/tetrisai and mikhail-vlasenko/Tetris-AI)
+5. `gameplay-bot/` - two-phase: mechanics test then play-to-win. Uses Pierre Dellacherie's 4-heuristic Tetris AI (2003) with Colin Fahey's GA-optimized weights. Reference implementations: LeeYiyuan/tetrisai and mikhail-vlasenko/Tetris-AI (both MIT)
 6. `sonarqube-scan.py` - automated code quality scan (requires SonarQube at localhost:9000)
 
 ## Dashboard
diff --git a/dashboard/src/pages/methodology.astro b/dashboard/src/pages/methodology.astro
@@ -254,12 +254,14 @@ import Base from "../layouts/Base.astro";
 <h3>AI play</h3>
 </div>
 <p>
- A 4-heuristic algorithm evaluates all possible placements for each piece:
+ Pierre Dellacherie's 4-heuristic algorithm (2003) evaluates all possible placements for each piece:
 </p>
 <pre><code>score = -0.51 * height + 0.76 * lines - 0.36 * holes - 0.18 * bumpiness</code></pre>
 <p class="muted">
- Algorithm from <a href="https://github.com/LeeYiyuan/tetrisai" target="_blank" rel="noopener">LeeYiyuan/tetrisai</a> (MIT license).
- The goal is not to play well -- it is to exercise all game mechanics.
+ Weights from genetic algorithm optimization by Colin Fahey. Reference implementation:
+ <a href="https://github.com/LeeYiyuan/tetrisai" target="_blank" rel="noopener">LeeYiyuan/tetrisai</a> (MIT license).
+ The bot is a strong Tetris player -- the original algorithm can clear thousands of lines without losing.
+ We use it to exercise game mechanics and trigger events like multi-line clears for bug detection.
 </p>
 </div>
 
diff --git a/tasks/tetris/eval/gameplay-bot-v2/bot.ts b/tasks/tetris/eval/gameplay-bot-v2/bot.ts
@@ -17,7 +17,8 @@ import type {
 } from "./types";
 
 // ---------------------------------------------------------------------------
-// AI Player Logic (from LeeYiyuan/tetrisai, MIT License)
+// Pierre Dellacherie's 4-heuristic Tetris AI (2003) with Colin Fahey's
+// GA-optimized weights. Reference implementation: LeeYiyuan/tetrisai (MIT)
 // ---------------------------------------------------------------------------
 
 const W_HEIGHT = -0.510066;
diff --git a/tasks/tetris/eval/gameplay-bot/SPEC.md b/tasks/tetris/eval/gameplay-bot/SPEC.md
@@ -9,7 +9,7 @@ SVG, WebGL). The bot must work with all of them.
 The bot does NOT use Claude or any LLM for grading. All evaluation is
 deterministic code. The bot plays the game using a known-good AI algorithm
 (4-heuristic genetic
-optimization from LeeYiyuan/tetrisai, MIT License) and records what happens.
+optimization, reference implementation: LeeYiyuan/tetrisai, MIT License) and records what happens.
 
 ## Architecture
 
@@ -328,7 +328,7 @@ not game state). Validate aspect ratio (height ~= 2 * width).
 
 ## AI Player
 
-4-heuristic evaluation from LeeYiyuan/tetrisai (MIT License):
+Pierre Dellacherie's 4-heuristic evaluation (2003) with Colin Fahey's GA-optimized weights, reference implementation: LeeYiyuan/tetrisai (MIT License):
 - Aggregate height: sum of column heights (weight: -0.510066)
 - Lines cleared: number of complete rows (weight: 0.760666)
 - Holes: empty cells below filled cells (weight: -0.35663)
diff --git a/tasks/tetris/eval/gameplay-bot/player.ts b/tasks/tetris/eval/gameplay-bot/player.ts
@@ -1,12 +1,13 @@
-// Heuristic evaluation adapted from LeeYiyuan/tetrisai (MIT License)
-// Weights are from genetic algorithm optimization in that project.
-// Piece definitions and simulation logic also adapted from that codebase.
+// Pierre Dellacherie's 4-heuristic Tetris AI (2003).
+// Weights from Colin Fahey's genetic algorithm optimization.
+// Reference implementation: LeeYiyuan/tetrisai (MIT License) -- piece
+// definitions and simulation logic adapted from that codebase.
 
 import type { Page } from "@playwright/test";
 import type { Grid, CalibrationResult, PieceType } from "./types";
 import { readGrid, detectActivePieceCells, identifyPieceType, gridsAreDifferent } from "./grid-reader";
 
-// Genetically optimized weights from LeeYiyuan/tetrisai
+// Genetically-optimized weights (Fahey)
 const W_HEIGHT = -0.510066;
 const W_LINES = 0.760666;
 const W_HOLES = -0.35663;
@@ -19,7 +20,7 @@ const GRID_COLS = 10;
  * Standard Tetris piece definitions.
  * Each piece has 4 rotation states.
  * Each rotation state is a list of [row, col] offsets from the piece origin.
- * Adapted from LeeYiyuan/tetrisai piece.js
+ * Adapted from LeeYiyuan/tetrisai piece.js (reference implementation)
  */
 const PIECES: Record<string, [number, number][][]> = {
   I: [
@@ -390,7 +391,7 @@ export async function stackToGameOver(
 }
 
 // --- Heuristic evaluation functions ---
-// Adapted from LeeYiyuan/tetrisai (MIT License)
+// Pierre Dellacherie 4-heuristic, reference: LeeYiyuan/tetrisai (MIT License)
 
 /**
  * Find the best column and rotation for a given piece type using the
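
The scoring formula quoted in the methodology.astro hunk can be sketched in a few lines of TypeScript. This is an illustration only, not the repository's player.ts: the boolean `Grid` type and every function name below are hypothetical. The definitions of aggregate height, lines, and holes follow the SPEC.md hunk. The bumpiness definition (sum of absolute height differences between adjacent columns) is an assumption, since the diff does not spell it out. The bumpiness weight appears in the diff only in rounded form (-0.18), so the sketch uses that value.

```typescript
// Illustrative sketch only; names and the Grid shape are hypothetical.
// A grid is a list of rows, row 0 at the top; true means a filled cell.
type Grid = boolean[][];

// Weights as shown in the diff. Bumpiness appears there only rounded (-0.18).
const W_HEIGHT = -0.510066;
const W_LINES = 0.760666;
const W_HOLES = -0.35663;
const W_BUMPINESS = -0.18;

// Height of each column: rows from its topmost filled cell down to the floor.
function columnHeights(grid: Grid): number[] {
  const rows = grid.length;
  const cols = rows > 0 ? grid[0].length : 0;
  const heights: number[] = new Array(cols).fill(0);
  for (let c = 0; c < cols; c++) {
    for (let r = 0; r < rows; r++) {
      if (grid[r][c]) {
        heights[c] = rows - r;
        break;
      }
    }
  }
  return heights;
}

// Number of rows in which every cell is filled.
function completeLines(grid: Grid): number {
  return grid.filter((row) => row.length > 0 && row.every(Boolean)).length;
}

// Empty cells that have at least one filled cell above them in the same column.
function countHoles(grid: Grid): number {
  const cols = grid.length > 0 ? grid[0].length : 0;
  let holes = 0;
  for (let c = 0; c < cols; c++) {
    let seenFilled = false;
    for (let r = 0; r < grid.length; r++) {
      if (grid[r][c]) seenFilled = true;
      else if (seenFilled) holes++;
    }
  }
  return holes;
}

// Assumed definition: sum of |h[c] - h[c+1]| over adjacent column pairs.
function bumpiness(heights: number[]): number {
  let total = 0;
  for (let c = 0; c + 1 < heights.length; c++) {
    total += Math.abs(heights[c] - heights[c + 1]);
  }
  return total;
}

// Weighted sum of the four heuristics; higher is better.
function evaluate(grid: Grid): number {
  const heights = columnHeights(grid);
  const aggregateHeight = heights.reduce((sum, h) => sum + h, 0);
  return (
    W_HEIGHT * aggregateHeight +
    W_LINES * completeLines(grid) +
    W_HOLES * countHoles(grid) +
    W_BUMPINESS * bumpiness(heights)
  );
}

// Small 4x4 example board: one complete row and one hole in column 0.
const sample: Grid = [
  [false, false, false, false],
  [true, false, false, false],
  [false, false, true, false],
  [true, true, true, true],
];
console.log(evaluate(sample).toFixed(6));
```

A bot built on this idea would simulate each candidate column and rotation for the current piece, call `evaluate` on the resulting grid, and keep the placement with the highest score.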
