paper_type.json (287B)
{
  "paper_type": "empirical",
  "reason": "Systematically evaluates Codex performance on security assertion generation across 2,268 prompt configurations and 10 benchmarks, reporting quantitative accuracy results and identifying prompt engineering as the dominant performance factor."
}