paper_type.json (224B)
{
  "paper_type": "benchmark-creation",
  "reason": "SWE-Bench Pro's primary contribution is the introduction of a new 1,865-problem benchmark with human verification, not the empirical findings from running models on it."
}