Can Claude backtest trading strategies?
Yes. Claude (and any MCP-capable agent) can run real backtests by connecting to the Pancake MCP server at mcp.usepancake.com. Claude gathers evidence, declares the strategy spec, and calls run_evidence_backtest; the deterministic batter engine computes the metrics and publishes a verified result — so the numbers come from an auditable engine, not from the model.
Claude cannot reliably compute a backtest in its head — token-by-token arithmetic over hundreds of trades is exactly where language models make silent errors. What Claude can do well is everything around the computation: find and assemble evidence, formalize a trading idea into a structured spec, and interpret results.
The Pancake MCP server splits the work accordingly. Claude (via claude.ai connectors, Claude Desktop, or Claude Code) connects to mcp.usepancake.com and drives the loop with tool calls: search_strategies and search_datasets to check what exists, create_evidence_dataset to register evidence rows, run_evidence_backtest to execute, and get_backtest_result to read the outcome. Execution happens server-side in batter, the open-source deterministic engine.
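The order of tool calls described above can be sketched as follows. This is an illustration only: the tool names come from the text and the `tools/call` envelope is the standard MCP JSON-RPC request shape, but every argument name (`query`, `rows`, `spec`, `dataset_id`, `backtest_id`), the simplified response dictionaries, and the `fake_send` transport are hypothetical stand-ins, not the server's documented schema.

```python
from typing import Any, Callable

Send = Callable[[dict], dict]


def tool_call(request_id: int, name: str, arguments: dict[str, Any]) -> dict:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def drive_backtest(send: Send, query: str, rows: list[dict], spec: dict) -> dict:
    """Issue the five calls in the order the text describes."""
    # 1-2. Check what already exists (argument names are assumptions).
    send(tool_call(1, "search_strategies", {"query": query}))
    send(tool_call(2, "search_datasets", {"query": query}))
    # 3. Register the evidence rows the agent gathered.
    dataset = send(tool_call(3, "create_evidence_dataset", {"rows": rows}))
    # 4. Ask the engine to execute the declared spec against that dataset.
    run = send(tool_call(4, "run_evidence_backtest",
                         {"dataset_id": dataset["dataset_id"], "spec": spec}))
    # 5. Read the outcome; the metrics are computed server-side.
    return send(tool_call(5, "get_backtest_result",
                          {"backtest_id": run["backtest_id"]}))


# Stand-in transport so the sketch runs offline. A real client would send
# each request to the MCP server and unwrap the tool result it returns.
sent: list[dict] = []


def fake_send(request: dict) -> dict:
    sent.append(request)
    canned = {
        "create_evidence_dataset": {"dataset_id": "ds-demo"},
        "run_evidence_backtest": {"backtest_id": "bt-demo"},
        "get_backtest_result": {"status": "placeholder"},
    }
    return canned.get(request["params"]["name"], {})


result = drive_backtest(
    fake_send,
    query="momentum",
    rows=[{"date": "2024-01-02", "signal": 1}],
    spec={"entry": "signal == 1"},
)
print([r["params"]["name"] for r in sent])
```

The sketch computes no metric on the client side: the agent only assembles inputs and passes identifiers between calls, which is the division of labor the paragraph describes.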
The result is published at /<handle>/<strategy_slug>/v/<version_n> with a SHA-256 result hash and a verification boundary that states what the engine verified, what the agent supplied, and what was not modeled. Claude can cite that URL in its answer, and a skeptical reader can check it without trusting either Claude or the user.
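As a rough sketch of how a reader might check such a result, the snippet below builds the published path from its three parts and recomputes a SHA-256 digest over a result payload. The text does not say which bytes the engine hashes, so the canonical-JSON encoding used here (sorted keys, compact separators, UTF-8) is purely an assumption, and `result_path`, `result_hash`, and `matches_published` are hypothetical helper names.

```python
import hashlib
import json


def result_path(handle: str, strategy_slug: str, version_n: int) -> str:
    """Path pattern from the text: /<handle>/<strategy_slug>/v/<version_n>."""
    return f"/{handle}/{strategy_slug}/v/{version_n}"


def result_hash(result: dict) -> str:
    """SHA-256 hex digest of the result.

    Assumption: the payload is serialized as canonical JSON before hashing.
    The real engine may hash a different byte representation.
    """
    canonical = json.dumps(result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def matches_published(result: dict, published_hash: str) -> bool:
    """True if the recomputed digest equals the published one."""
    return result_hash(result) == published_hash.lower()


# Made-up payload, used only to exercise the helpers.
demo = {"trades": 120, "metric": "placeholder"}
digest = result_hash(demo)
print(result_path("some-handle", "some-strategy", 1))
print(matches_published(demo, digest))
```

Because the serialization sorts keys, two payloads with the same content but different key order produce the same digest, a property any reproducible check of this kind needs.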
The same surface covers paper trading: create_paper_deployment puts a validated strategy on live market data with simulated fills, and get_paper_deployments reports how it is performing.