scan-category-a.md (1878B)
# Scan Category A: Artifacts + Setup Transparency

**Model: Opus**

You are a category evaluator. Answer ONLY the questions in your assigned categories.

## Your categories (9 questions)

### Artifacts (4q)
- `code_released` — source code released (GitHub, Zenodo)?
- `data_released` — dataset released or publicly available?
- `environment_specified` — environment/dependency specs provided?
- `reproduction_instructions` — step-by-step reproduction instructions?

### Setup Transparency (5q)
- `model_versions_specified` — exact model versions (not just "GPT-4")?
- `prompts_provided` — actual prompt text provided (not just descriptions)?
- `hyperparameters_reported` — temperature, learning rate, etc.?
- `scaffolding_described` — agentic scaffolding described in detail?
- `data_preprocessing_documented` — preprocessing/filtering steps documented?

## Input

1. Paper text: `papers/<SLUG>/paper.txt`
2. Triage applicability flags: `papers/<SLUG>/triage.json` → `applicability.artifacts` and `applicability.setup_transparency`

## Output

Write to stdout a JSON object with this structure:

```json
{
  "artifacts": {
    "code_released": { "applies": true, "answer": true, "justification": "..." },
    ...
  },
  "setup_transparency": {
    "model_versions_specified": { "applies": true, "answer": false, "justification": "..." },
    ...
  }
}
```

## Rules

- Read the schema descriptions in `schema/scan.schema.json` for detailed evaluation criteria per question.
- Use the `applies` flag from triage.json. If triage says `applies: false`, set `applies: false, answer: false` with justification.
- If triage says `applies: true`, search the paper and determine the answer.
- Follow all answer rules from `agents/scan-agent.md`: be strict, don't be generous, absence of evidence is `answer: false`.
- Cite specific sections/pages in justifications.
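
## Example: pre-filling the output from triage flags

The rules above on `applies` can be sketched as a small helper that builds the output skeleton before the paper is read. This is a minimal illustration, not part of the pipeline: the names `QUESTIONS` and `build_skeleton` are hypothetical, and the shape of the triage flags is an assumption (either one boolean per category or a mapping from question id to boolean), since the actual layout of `triage.json` is not shown here. Treating a missing flag as `false` and the placeholder justification strings are also assumptions.

```python
import json

# Question ids per category, copied from the category lists in this document.
QUESTIONS = {
    "artifacts": [
        "code_released",
        "data_released",
        "environment_specified",
        "reproduction_instructions",
    ],
    "setup_transparency": [
        "model_versions_specified",
        "prompts_provided",
        "hyperparameters_reported",
        "scaffolding_described",
        "data_preprocessing_documented",
    ],
}


def build_skeleton(applicability):
    """Pre-fill the output object from triage applicability flags.

    Assumption: applicability[category] is either a single boolean for the
    whole category or a dict mapping question id -> boolean. Missing flags
    are treated as False (also an assumption).
    """
    result = {}
    for category, question_ids in QUESTIONS.items():
        flags = applicability.get(category, False)
        result[category] = {}
        for qid in question_ids:
            if isinstance(flags, dict):
                applies = bool(flags.get(qid, False))
            else:
                applies = bool(flags)
            # answer starts as False: absence of evidence is answer: false.
            entry = {"applies": applies, "answer": False}
            if applies:
                entry["justification"] = "TODO: cite the section/page supporting the answer"
            else:
                entry["justification"] = "Marked not applicable by triage."
            result[category][qid] = entry
    return result


# Hypothetical triage flags, for illustration only.
example_applicability = {
    "artifacts": True,
    "setup_transparency": {
        "model_versions_specified": True,
        "prompts_provided": False,
    },
}
skeleton = build_skeleton(example_applicability)
print(json.dumps(skeleton, indent=2))
```

Starting every entry at `answer: false` means an answer only flips to `true` once the evaluator finds and cites concrete evidence in the paper, which matches the strictness rule.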