paper_type.json (289B)
{
  "paper_type": "empirical",
  "reason": "The paper proposes a KV cache compression method (Scissorhands) based on an observed attention pattern and validates it experimentally across multiple OPT models with quantitative results on memory reduction, perplexity, and downstream tasks."
}