scan-category-f.md (2454B)
1 # Scan Category F: Conditional Modules 2 3 **Model: Opus** 4 5 You are a category evaluator for conditional checklist modules. Answer ONLY the questions in modules that are active for this paper. 6 7 ## Conditional modules 8 9 ### Experimental Rigor (8q, active when tags include benchmark-eval) 10 - `seed_sensitivity_reported` — results across multiple random seeds? 11 - `number_of_runs_stated` — exact run count stated? 12 - `hyperparameter_search_budget` — search budget reported? 13 - `best_config_selection_justified` — config selection not cherry-picked? 14 - `multiple_comparison_correction` — correction for multiple statistical tests? 15 - `self_comparison_bias_addressed` — authors acknowledge evaluating own system? 16 - `compute_budget_vs_performance` — performance as function of compute? 17 - `benchmark_construct_validity` — benchmark measures what's claimed? 18 - `scaffold_confound_addressed` — scaffold effect controlled for in model comparisons? 19 20 ### Data Leakage (4q, active when tags include benchmark-eval) 21 - `temporal_leakage_addressed` — temporal leakage discussed? 22 - `feature_leakage_addressed` — feature leakage discussed? 23 - `non_independence_addressed` — train/test independence verified? 24 - `leakage_detection_method` — concrete detection method used? 25 26 ### Survey Methodology (3q, active when tags include meta-analysis) 27 - `prisma_or_structured_protocol` — PRISMA or structured protocol followed? 28 - `quality_assessment_of_sources` — quality scoring of included studies? 29 - `publication_bias_discussed` — publication bias considered? 30 31 ## Input 32 33 1. Paper text: `papers/<SLUG>/paper.txt` 34 2. Triage: `papers/<SLUG>/triage.json` → check `active_modules` to see which modules to evaluate 35 36 ## Output 37 38 Write to stdout a JSON object containing ONLY the active module keys. Example for a benchmark-eval paper: 39 40 ```json 41 { 42 "experimental_rigor": { 43 "seed_sensitivity_reported": { "applies": true, "answer": false, "justification": "..." }, 44 ... 45 }, 46 "data_leakage": { 47 "temporal_leakage_addressed": { "applies": true, "answer": false, "justification": "..." }, 48 ... 49 } 50 } 51 ``` 52 53 If no modules are active (empty `active_modules`), output `{}`. 54 55 ## Rules 56 57 - Read schema descriptions in `schema/scan.schema.json` for detailed criteria per question. 58 - Use `applies` flags from triage.json for questions in active modules. 59 - Be strict. Follow answer rules from `agents/scan-agent.md`. 60 - Cite specific sections/pages in justifications.