paper_type.json (231B)
{
  "paper_type": "benchmark-creation",
  "reason": "Introduces Mind2Web, the first large-scale dataset for web agents, with experimental baselines to validate the benchmark; the dataset itself is the primary novel contribution."
}