How do you get historical Polymarket data?

For raw access, Polymarket's Gamma API serves market metadata — questions, outcomes, end dates, resolution status — and the CLOB API serves market pricing data. Both are publicly documented. Assembling backtest-grade history from them takes real work: joining price snapshots to resolution records, aligning timestamps, and being disciplined about what was knowable when.

That assembly step is where most DIY Polymarket backtests silently break. A row that uses a price observed after the decision time, or a resolution joined to the wrong outcome token, produces results that look fine and mean nothing.
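
As a minimal sketch of guarding against both failure modes, the Python below picks the last price observed at or before the decision time and looks up the resolution by the exact token that was priced. The names here (Snapshot, entry_price_at, resolved_payout) and the data shapes are hypothetical illustrations, not part of any Polymarket or Pancake API.

```python
from bisect import bisect_right
from typing import Dict, NamedTuple, Optional, Sequence


class Snapshot(NamedTuple):
    ts: int       # observation time, Unix seconds (assumed UTC)
    price: float  # price of one outcome token, expected in [0, 1]


def entry_price_at(snapshots: Sequence[Snapshot], decision_ts: int) -> Optional[Snapshot]:
    """Return the latest snapshot observed at or before decision_ts.

    snapshots must be sorted by ts ascending. Returns None when nothing
    was observable yet, rather than reaching forward in time.
    """
    times = [s.ts for s in snapshots]
    i = bisect_right(times, decision_ts)  # count of snapshots with ts <= decision_ts
    return snapshots[i - 1] if i > 0 else None


def resolved_payout(resolution: Dict[str, float], token_id: str) -> float:
    """Look up the payout for the exact token that was priced.

    resolution maps token_id -> payout (assumed 1.0 for the winning token,
    0.0 otherwise). A missing token raises KeyError instead of defaulting,
    so a mismatched join fails loudly.
    """
    return resolution[token_id]


snaps = [Snapshot(100, 0.40), Snapshot(200, 0.55), Snapshot(300, 0.90)]
entry = entry_price_at(snaps, decision_ts=250)  # uses the ts=200 snapshot, never ts=300
```

Returning None for a decision time that predates every snapshot, and raising on an unknown token, turns the two silent errors into visible ones.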

Pancake's canonical dataset pool packages this work as evidence datasets. Each row carries a market link, decision time, entry price, resolution time, and resolved outcome. Datasets are validated at ingest (schema, lookahead, monotonicity, ranges) and content-hashed (rows_sha256). An agent finds them with the search_datasets MCP tool and runs a backtest against them directly. Custom rows can also be uploaded via create_evidence_dataset and get the same validation.
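
To make the four check categories and the content hash concrete, here is a rough Python sketch of what ingest-style validation could look like. Everything in it is an assumption for illustration: the field names, the reading of "monotonicity" as non-decreasing decision times, the binary outcome encoding, and the canonical-JSON hashing scheme. It does not describe how Pancake actually validates rows or computes rows_sha256.

```python
import hashlib
import json

# Hypothetical field names mirroring the per-row contents described above.
REQUIRED = ("market_url", "decision_time", "entry_price",
            "resolution_time", "resolved_outcome")


def validate_rows(rows):
    """Return a list of (row_index, problem) tuples; empty means all checks passed."""
    problems = []
    prev_decision = None
    for i, row in enumerate(rows):
        missing = [k for k in REQUIRED if k not in row]
        if missing:  # schema check; skip the remaining checks for this row
            problems.append((i, f"schema: missing {missing}"))
            continue
        if not row["decision_time"] < row["resolution_time"]:
            problems.append((i, "lookahead: decision_time must precede resolution_time"))
        if not 0.0 <= row["entry_price"] <= 1.0:
            problems.append((i, "range: entry_price outside [0, 1]"))
        if row["resolved_outcome"] not in (0, 1):
            problems.append((i, "range: resolved_outcome must be 0 or 1"))
        # Assumed meaning of "monotonicity": rows ordered by decision_time.
        if prev_decision is not None and row["decision_time"] < prev_decision:
            problems.append((i, "monotonicity: decision_time went backwards"))
        prev_decision = row["decision_time"]
    return problems


def rows_digest(rows):
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing a canonical encoding means the digest depends only on row content and order, not on key order within a row, so two parties holding the same rows can confirm they are backtesting against identical data.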

Every dataset records provenance — source URLs and transformations — so a result built on pool data remains auditable back to the upstream source.