{"action":{"additionalProperties":false,"description":"Action space for the ad fraud investigation agent.\n\nThree action types:\n- investigate: Spend budget to reveal information about an ad\n- verdict: Approve, reject, or escalate an ad\n- link_accounts: Flag two ads as part of the same fraud network","properties":{"metadata":{"additionalProperties":true,"description":"Additional metadata for the action","title":"Metadata","type":"object"},"action_type":{"enum":["investigate","verdict","link_accounts"],"title":"Action Type","type":"string"},"ad_id":{"description":"Target ad identifier (e.g. 'ad_001')","title":"Ad Id","type":"string"},"investigation_target":{"anyOf":[{"enum":["advertiser_history","landing_page","payment_method","targeting_overlap","campaign_structure","policy_classifier"],"type":"string"},{"type":"null"}],"default":null,"description":"What to investigate (required for action_type='investigate')","title":"Investigation Target"},"verdict":{"anyOf":[{"enum":["approve","reject","escalate"],"type":"string"},{"type":"null"}],"default":null,"description":"Verdict decision (required for action_type='verdict')","title":"Verdict"},"confidence":{"anyOf":[{"maximum":1.0,"minimum":0.0,"type":"number"},{"type":"null"}],"default":null,"description":"Agent's confidence in verdict (0.0-1.0)","title":"Confidence"},"rationale":{"anyOf":[{"maxLength":2000,"type":"string"},{"type":"null"}],"default":null,"description":"Optional natural-language reason for the verdict (consumed by the Auditor)","title":"Rationale"},"linked_ad_id":{"anyOf":[{"type":"string"},{"type":"null"}],"default":null,"description":"Other ad in suspected fraud ring (required for action_type='link_accounts')","title":"Linked Ad Id"},"link_reason":{"anyOf":[{"type":"string"},{"type":"null"}],"default":null,"description":"Why the agent believes these ads are connected","title":"Link Reason"}},"required":["action_type","ad_id"],"title":"AdReviewAction","type":"object"},"observation":{"additionalProperties":false,"description":"Observation returned after each Investigator step.\n\nText-heavy by design so LLM agents can reason about the content naturally.\nStructured data is in queue_status for programmatic access.","properties":{"done":{"default":false,"description":"Whether the episode has terminated","title":"Done","type":"boolean"},"reward":{"anyOf":[{"type":"boolean"},{"type":"integer"},{"type":"number"},{"type":"null"}],"default":null,"description":"Reward signal from the last action","title":"Reward"},"metadata":{"additionalProperties":true,"description":"Additional metadata for the observation","title":"Metadata","type":"object"},"queue_summary":{"default":"","description":"Natural language overview of the ad queue","title":"Queue Summary","type":"string"},"current_ad_info":{"default":"","description":"Details of the ad currently in focus","title":"Current Ad Info","type":"string"},"investigation_findings":{"default":"","description":"Accumulated investigation results","title":"Investigation Findings","type":"string"},"verdict_history_summary":{"default":"","description":"Summary of verdicts rendered so far","title":"Verdict History Summary","type":"string"},"feedback":{"default":"","description":"Natural language feedback on the last action taken","title":"Feedback","type":"string"},"available_ads":{"description":"Ad IDs still pending review","items":{"type":"string"},"title":"Available Ads","type":"array"},"queue_status":{"additionalProperties":true,"description":"Structured status: total_ads, reviewed, pending, budget, step","title":"Queue Status","type":"object"},"queue_may_grow":{"default":false,"description":"True when running inside the Referee — Fraudster can still add ads","title":"Queue May Grow","type":"boolean"},"evidence_ledger":{"additionalProperties":{"additionalProperties":true,"type":"object"},"description":"Per-ad structured evidence accumulated across investigations. Surface fields (category, country, account_age_days) are always present once an ad has been touched; investigation-only fields (payment_id, registrar, domain, targeting_fingerprint, advertiser_id) appear only after the corresponding `investigate` target has been pulled. Cross-ad collisions on a SUBSET of these fields indicate fraud rings — the policy must learn which fields are discriminative (payment_id collisions matter, country collisions usually don't).","title":"Evidence Ledger","type":"object"},"queue_digest":{"description":"One-row-per-pending-ad summary surfaced WITHOUT requiring an investigation. Each row carries a curated subset of fields from the ad + advertiser_profile: a small set of potentially-discriminative columns (payment_type, registrar, domain) the Investigator can use as a pre-investigation ring-detection hint, plus a handful of decoy columns (category, country, account_age_days) that are intentionally non-discriminative so the policy must learn which collisions matter. Capped to ~12 ads to keep the prompt budget bounded.","items":{"additionalProperties":true,"type":"object"},"title":"Queue Digest","type":"array"},"decided_ads":{"description":"Per-decided-ad summary: verdict + confidence + a curated mix of discriminative (payment_id, registrar, domain, targeting_fingerprint) and decoy (category, country, account_age_days) signals from the evidence ledger. Gives the Investigator memory of past decisions for link_accounts.","items":{"additionalProperties":true,"type":"object"},"title":"Decided Ads","type":"array"}},"title":"AdReviewObservation","type":"object"},"state":{"additionalProperties":true,"description":"Base class for environment state.\n\nRepresents internal environment state, separate from observations.","properties":{"episode_id":{"anyOf":[{"type":"string"},{"type":"null"}],"default":null,"description":"Unique identifier for the current episode","title":"Episode Id"},"step_count":{"default":0,"description":"Number of steps taken in the current episode","minimum":0,"title":"Step Count","type":"integer"}},"title":"State","type":"object"}}