What is a verifiable backtest?

Most published backtests are trust-me artifacts: a chart, a Sharpe ratio, a claim. Nothing binds the number to the data or the code that produced it, so the reader's only options are belief or disbelief. This is why backtest claims in trading forums and AI-generated strategy pitches are routinely — and rightly — dismissed.

A verifiable backtest closes the gap with four properties. Hashed inputs: the strategy spec and the evidence dataset are canonicalized and content-hashed, so the exact inputs are pinned. Deterministic open engine: the execution engine is public and byte-stable, so anyone can re-run it. Reproducible output: re-running the same spec, dataset, and engine version yields an identical result hash. Explicit scope: the result states what was verified, what was accepted as supplied, and what was not modeled — no implied guarantees.
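The four properties can be sketched in a few lines of Python. This is a minimal illustration, not the batter engine: the canonical form (sorted-key compact JSON), the toy trading rule, the helper names `canonical_bytes`, `sha256_hex`, `run_backtest`, and `verifiable_run`, and the sample data are all assumptions made for the example. Prices are integer cents so that no floating-point rounding can differ between machines.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Canonical form (assumed): JSON with sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def sha256_hex(obj) -> str:
    """Content hash of the canonical form."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


def run_backtest(spec: dict, rows: list[dict]) -> dict:
    """Toy deterministic engine: buy on an up close when flat, sell on a down close."""
    position = 0
    cash = 0
    trades = 0
    for prev, cur in zip(rows, rows[1:]):
        if cur["close"] > prev["close"] and position == 0:
            position = spec["size"]
            cash -= cur["close"] * position
            trades += 1
        elif cur["close"] < prev["close"] and position > 0:
            cash += cur["close"] * position
            position = 0
            trades += 1
    # Mark any open position to the last close.
    pnl = cash + position * rows[-1]["close"]
    return {"pnl_cents": pnl, "trades": trades}


def verifiable_run(spec: dict, rows: list[dict], engine_version: str) -> dict:
    """Pin the inputs, run the engine, and hash everything that defines the result."""
    record = {
        "spec_hash": sha256_hex(spec),
        "rows_sha256": sha256_hex(rows),
        "engine_version": engine_version,
        "result": run_backtest(spec, rows),
    }
    # The result hash covers inputs, engine version, and output together.
    record["result_hash"] = sha256_hex(record)
    return record


# Hypothetical spec and dataset, for illustration only.
spec = {"rule": "buy_on_up_close", "size": 1}
rows = [{"close": c} for c in (10000, 10100, 10050, 10200, 10300)]

record = verifiable_run(spec, rows, "0.0.0-example")
```

Re-running `verifiable_run` with the same spec, rows, and engine version returns the same `result_hash`, while changing a single price changes `rows_sha256` and therefore the result hash. The explicit-scope property is not computed here; it is a statement attached to the result.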

Pancake implements this end to end: spec_hash and rows_sha256 pin the inputs, the Apache-2.0 batter engine (Python 3.12+, PCG64-seeded) produces a SHA-256 result_hash that is stable across platforms, and the verification boundary 3-tuple (what was verified, what was accepted as supplied, what was not modeled) appears in every result. The result lives at a permanent URL that any reader or agent can fetch as HTML or JSON.
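From the reader's side, checking a published result might look like the sketch below. The field names spec_hash, rows_sha256, and result_hash come from the description above; everything else is an assumption for illustration, including the record layout, the `boundary` key and its three sub-keys, the hashing of sorted-key JSON, and the helpers `digest` and `check_record`. The actual Pancake JSON schema and canonicalization may differ.

```python
import hashlib
import json

# Hypothetical key names for the verification boundary 3-tuple.
BOUNDARY_KEYS = ("verified", "accepted_as_supplied", "not_modeled")


def digest(obj) -> str:
    """SHA-256 over sorted-key compact JSON (an assumed canonical form)."""
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def check_record(record: dict, spec: dict, rows: list, rerun_result: dict) -> list[str]:
    """Compare a published record with a local re-run; an empty list means it checks out."""
    problems = []
    if record.get("spec_hash") != digest(spec):
        problems.append("spec_hash mismatch")
    if record.get("rows_sha256") != digest(rows):
        problems.append("rows_sha256 mismatch")
    if record.get("result_hash") != digest(rerun_result):
        problems.append("result_hash mismatch")
    boundary = record.get("boundary", {})
    for key in BOUNDARY_KEYS:
        if key not in boundary:
            problems.append(f"boundary missing {key}")
    return problems


# Hypothetical published record and local inputs.
spec = {"rule": "example", "size": 1}
rows = [{"close": 10000}, {"close": 10100}]
result = {"pnl_cents": 0, "trades": 1}
published = {
    "spec_hash": digest(spec),
    "rows_sha256": digest(rows),
    "result_hash": digest(result),
    "boundary": {
        "verified": ["engine output"],
        "accepted_as_supplied": ["price rows"],
        "not_modeled": ["slippage"],
    },
}

ok = check_record(published, spec, rows, result)
tampered = check_record(published, spec, rows + [{"close": 1}], result)
```

Here `ok` is an empty list, and `tampered` reports only the dataset hash, because the extra row changes `rows_sha256` while the spec and result are untouched.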

Verifiable does not mean profitable, predictive, or risk-free. It means the computation is honest and checkable — the floor for taking any backtest claim seriously, especially one authored by an AI agent.