paper_type.json (306B)
{
  "paper_type": "empirical",
  "reason": "Primary contribution is experimental evaluation of multiple LLMs on a test generation task with detailed quantitative findings about model performance, context effects, and repair strategies, rather than introducing TestBench as a reusable benchmark resource."
}