# EvidenceDataset

Canonical: https://www.usepancake.com/e/evidence-dataset

**Definition:** An EvidenceDataset is Pancake's unit of backtest input: a content-hashed (rows_sha256) table of prediction-market rows — market reference, decision time, entry price, resolution time, and resolved outcome, plus declared feature columns — validated for schema, lookahead, monotonicity, and value ranges at ingest.

The name is deliberate: a backtest result is a claim, and the EvidenceDataset is the evidence the claim cites. Pinning that evidence — by content hash, not by reference to a mutable table — is what makes the claim checkable later. The rows_sha256 digest appears in every result, so "what data was this run on" has exactly one answer.
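To make the idea of pinning by content concrete, here is a minimal sketch of a content hash over rows. It assumes a canonical JSON serialization (sorted keys, fixed separators, one row per line); the source does not specify how Pancake canonicalizes rows for rows_sha256, so the function body and the column names below are illustrative only.

```python
import hashlib
import json


def rows_sha256(rows):
    """Hash a list of row dicts by content (hypothetical canonicalization)."""
    h = hashlib.sha256()
    for row in rows:
        # Sorted keys and fixed separators make the bytes independent of
        # dict key order, so equal rows always serialize identically.
        line = json.dumps(row, sort_keys=True, separators=(",", ":"))
        h.update(line.encode("utf-8"))
        h.update(b"\n")  # row delimiter
    return h.hexdigest()


# Illustrative rows; the column names are assumptions, not Pancake's schema.
rows = [
    {"market_ref": "example-market-1", "entry_price": 0.42, "outcome": 1},
    {"market_ref": "example-market-2", "entry_price": 0.77, "outcome": 0},
]
print(rows_sha256(rows))
```

Changing any value, or the order of the rows, changes the digest, which is what lets a result cite its data unambiguously.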

Validation happens at ingest, and a structural failure aborts the ingest rather than degrading it. The checks:

- schema_match: declared columns present with correct types
- lookahead: decision_time strictly before resolution_time on every row
- monotonicity: no reversed timestamps, no negative prices
- range: values within declared bounds
- required columns: the five semantic roles present exactly once

A dataset that passes is structurally sound; there is no partial acceptance.
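The checks above can be sketched as a single fail-fast pass over the rows. This is an illustration under stated assumptions, not Pancake's validator: the column names, the `ValidationError` class, the `[0, 1]` price bounds, and the binary outcome are all hypothetical, and the five semantic roles are modeled as five fixed column names.

```python
from datetime import datetime

# Hypothetical mapping of the five semantic roles to column names and types.
REQUIRED = {
    "market_ref": str,
    "decision_time": datetime,
    "entry_price": float,
    "resolution_time": datetime,
    "outcome": int,
}


class ValidationError(ValueError):
    """Raised on the first structural failure; nothing is partially accepted."""


def validate(rows, price_bounds=(0.0, 1.0)):
    lo, hi = price_bounds
    for i, row in enumerate(rows):
        # schema_match and required columns: every role present, correct type.
        for col, typ in REQUIRED.items():
            if col not in row:
                raise ValidationError(f"row {i}: missing column {col!r}")
            if not isinstance(row[col], typ):
                raise ValidationError(f"row {i}: {col!r} is not {typ.__name__}")
        # lookahead: the decision must strictly precede the resolution.
        # In this sketch the same comparison also rejects reversed timestamps.
        if not row["decision_time"] < row["resolution_time"]:
            raise ValidationError(f"row {i}: decision_time not before resolution_time")
        # monotonicity (price part): no negative prices.
        if row["entry_price"] < 0:
            raise ValidationError(f"row {i}: negative entry_price")
        # range: values within the declared bounds.
        if not lo <= row["entry_price"] <= hi:
            raise ValidationError(f"row {i}: entry_price outside [{lo}, {hi}]")
        if row["outcome"] not in (0, 1):
            raise ValidationError(f"row {i}: outcome must be 0 or 1")
    return True
```

Raising on the first failure, rather than collecting warnings and continuing, mirrors the abort-rather-than-degrade behavior described above.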

Datasets enter the system in two ways. The canonical pool holds prediction-market datasets that Pancake maintains: assembled from venue records, validated, hashed, and searchable by any MCP-capable agent via the search_datasets tool. Custom rows can be uploaded via create_evidence_dataset and receive identical validation. Every dataset records provenance, so a result built on pool data remains auditable back to its upstream source.
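For an agent, searching the pool means issuing an MCP tool call. The sketch below builds the JSON-RPC 2.0 `tools/call` message that the Model Context Protocol defines for invoking a tool by name. The tool name search_datasets comes from the text above; the `query` argument is a hypothetical placeholder, since the tool's real input schema is whatever the server advertises through `tools/list`.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# "query" is an assumed argument name, used here only for illustration.
payload = tools_call_request(1, "search_datasets", {"query": "election markets"})
print(payload)
```

In practice an MCP client library handles this framing and the transport; the point is only that the dataset pool is reached through an ordinary tool call.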

One distinction keeps the concept honest: structural validation is not real-world verification. The engine confirms the rows are well-formed and lookahead-clean; it cannot confirm that an agent's feature column reflects reality. That residual trust is named explicitly in every result's agent-supplied evidence block rather than laundered into a "verified" badge — the reader sees exactly which inputs were accepted as declared.

In the reproducibility chain, the EvidenceDataset is the first link: rows_sha256 and spec_hash pin the inputs, the deterministic engine pins the computation, and the SHA-256 result_hash pins the output. Re-running the same rows and the same spec through the same engine produces the same result_hash on any machine.
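The chain can be illustrated with a toy example. Everything here is an assumption made for the sketch: the `max_entry_price` spec field, the trading rule, the metric names, and the way the three digests are derived. Only the shape is taken from the text: hashed inputs, a deterministic computation, and a hashed output.

```python
import hashlib
import json


def sha256_hex(obj):
    """Digest of a canonical JSON serialization (hypothetical canonical form)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def toy_engine(rows, spec):
    """Deterministic toy backtest: buy one unit when the price is below a cap."""
    pnl, trades = 0.0, 0
    for row in rows:  # fixed iteration order, no randomness, no clock reads
        if row["entry_price"] < spec["max_entry_price"]:
            pnl += row["outcome"] - row["entry_price"]
            trades += 1
    return {"trades": trades, "pnl": round(pnl, 6)}


def run(rows, spec):
    result = {
        "rows_sha256": sha256_hex(rows),    # pins the data
        "spec_hash": sha256_hex(spec),      # pins the strategy spec
        "metrics": toy_engine(rows, spec),  # deterministic computation
    }
    # The output digest covers both input digests and the metrics.
    result["result_hash"] = sha256_hex(result)
    return result


rows = [
    {"entry_price": 0.40, "outcome": 1},
    {"entry_price": 0.90, "outcome": 0},
]
spec = {"max_entry_price": 0.5}
print(run(rows, spec)["result_hash"] == run(rows, spec)["result_hash"])
```

Because the result hash covers both input digests, changing either the rows or the spec yields a different result_hash even when the metrics happen to coincide.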

## Related

- [/e/agent-supplied-evidence — the trust layer](https://www.usepancake.com/e/agent-supplied-evidence)
- [Q&A — how to get historical Polymarket data](https://www.usepancake.com/q/how-to-get-historical-polymarket-data)
- [Methodology — evidence validation](https://www.usepancake.com/methodology)

---

Markdown twin of https://www.usepancake.com/e/evidence-dataset — same content as the HTML page, generated from the same source data. More machine surfaces: https://www.usepancake.com/llms.txt