paper_type.json (269B)
{
  "paper_type": "benchmark-creation",
  "reason": "Introduces the Remote Labor Index (RLI), a new benchmark dataset of 240 real-world freelance projects designed to measure AI automation of remote work; empirical evaluation of frontier models serves as validation."
}