paper_type.json (241B)
{
  "paper_type": "benchmark-creation",
  "reason": "Introduces The Stack v2, a large-scale code dataset (4× larger, spanning 619 languages), with StarCoder 2 models serving as baseline implementations to demonstrate the dataset's value."
}