paper_type.json (248B)
{
  "paper_type": "empirical",
  "reason": "Proposes SmoothQuant method and validates with quantitative experiments (1.56× speedup, 2× memory reduction) across diverse LLM sizes, with integrated implementation in PyTorch and FasterTransformer."
}