paper_type.json (260B)
{
  "paper_type": "empirical",
  "reason": "Runs controlled experiments on SWE-Bench with quantitative measurements (file path accuracy, n-gram overlap, verbatim match rates) to demonstrate that model performance reflects memorization rather than reasoning."
}