paper_type.json (275B)
{
  "paper_type": "empirical",
  "reason": "Introduces AT-GRPO algorithm and validates it with quantitative experiments on existing benchmarks (long-horizon planning, coding, math), with the primary contribution being the experimental findings of performance improvements."
}