paper_type.json (290B)
{
  "paper_type": "empirical",
  "reason": "Runs experiments comparing LLMs (GPT-4, GPT-3.5, Code Llama, WizardCoder) on existing benchmarks (HumanEval, MBPP, LeetCode) and reports quantitative findings about code efficiency, correctness-efficiency correlation, and prompting strategies."
}