paper_type.json (294B)
{
  "paper_type": "empirical",
  "reason": "Runs controlled experiments comparing an interpretability agent against baselines, reporting quantitative success rates (91% vs 39%) across 33 model organisms and 7 architectures to demonstrate that finetuning creates detectable activation traces."
}