paper_type.json (256B)
1 { 2 "paper_type": "empirical", 3 "reason": "Runs experiments on standard benchmarks (GSM8K, HumanEval, HotpotQA) and reports quantitative results on token usage, latency, and accuracy metrics with empirical analysis of skill selection phase transitions." 4 }