ai-stock-agent

Why your backtest looks like a winner

The most dangerous flaw in a backtest isn't a bug in the code. It's the companies that aren't in the data — because they died. We paid to put them back in, and several strategies died with them.

June 2026 · 6 min read

The flaw that ships with free data

Pull prices from a typical free API and you get the stocks that trade today. It feels complete — thousands of tickers, decades of history. But it is history written by the winners: a company that delisted in 2014 or went bankrupt in 2009 usually is not in the list at all.

Any backtest built on that universe is quietly conditioned on survival. It doesn't test 'what would buying cheap stocks have returned?' It tests 'what would buying cheap stocks that we now know survived have returned?' Those are very different questions, and the second one almost always has the better answer.

Why the bias always flatters you

Survivorship bias is not random noise that sometimes helps and sometimes hurts. It is a thumb on the scale that pushes in the same direction every time. The stocks most likely to vanish from the data are disproportionately the catastrophic losers, so leaving them out mechanically raises the average return of whatever is left.
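The arithmetic is easy to see on a toy universe. The sketch below uses made-up tickers and returns (none of these numbers come from our panel) and simply compares the average over every stock with the average over the ones a survivor-only feed would still list.

```python
from statistics import mean

# Hypothetical one-period returns for a toy universe of eight stocks.
# A return of -1.0 means the stock went to zero and was later delisted.
returns = {
    "AAA": 0.30,
    "BBB": 0.12,
    "CCC": 0.05,
    "DDD": -0.10,
    "EEE": -0.25,
    "FFF": -0.60,
    "GGG": -1.00,  # bankrupt: absent from a survivor-only feed
    "HHH": -1.00,  # bankrupt: absent from a survivor-only feed
}
delisted = {"GGG", "HHH"}

# Average over everything that was actually investable.
full_mean = mean(returns.values())

# Average over only the names that still trade today.
survivor_mean = mean(r for t, r in returns.items() if t not in delisted)

print(f"full universe:  {full_mean:+.2%}")      # -31.00%
print(f"survivors only: {survivor_mean:+.2%}")  # -8.00%
```

Nothing about the surviving stocks changed between the two lines. The only difference is which rows were allowed into the average.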

The distortion is largest for exactly the strategies that sound most tempting. 'Buy beaten-down cheap stocks' looks brilliant on survivor data, because the beaten-down stocks that went to zero, the ones that would have destroyed the strategy, are simply not there to be bought. The backtest can only pick from the losers that recovered.
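A minimal sketch of that mechanism, again on invented data: the same 'buy the three cheapest stocks' rule is run once on the full toy panel and once on the survivor-only subset. The `Stock` fields, the price-to-book values and the returns are all assumptions chosen for illustration, not our actual strategy or results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Stock:
    ticker: str
    price_to_book: float   # valuation on the formation date (lower = cheaper)
    forward_return: float  # return over the holding period
    survived: bool         # still listed today?


# Hypothetical panel: the cheapest names include two that went to zero.
panel = [
    Stock("A", 0.3, -1.00, False),
    Stock("B", 0.4, 0.80, True),
    Stock("C", 0.5, -1.00, False),
    Stock("D", 0.6, 0.50, True),
    Stock("E", 1.5, 0.10, True),
    Stock("F", 2.0, 0.08, True),
    Stock("G", 3.0, 0.12, True),
    Stock("H", 4.0, 0.05, True),
]


def cheapest_n_return(universe, n):
    """Equal-weighted return of the n cheapest stocks by price-to-book."""
    picks = sorted(universe, key=lambda s: s.price_to_book)[:n]
    return mean(s.forward_return for s in picks)


# Honest panel: the rule buys A, B and C, two of which are wiped out.
honest = cheapest_n_return(panel, 3)

# Survivor-only panel: A and C are invisible, so the rule buys B, D and E.
flattered = cheapest_n_return([s for s in panel if s.survived], 3)

print(f"honest panel:   {honest:+.2%}")     # -40.00%
print(f"survivor panel: {flattered:+.2%}")  # +46.67%
```

The rule is identical in both runs. On the survivor panel it never gets the chance to buy the stocks that would have sunk it.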

What fixing it actually requires

The fix is point-in-time data: for every historical date, the exact set of stocks that existed and were tradeable on that date — including the ones that later died — with delisting returns accounted for. You cannot reconstruct this from a current ticker list; it has to come from an archive that recorded the membership as it was.
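In code, the idea reduces to two small operations: ask the archive who was tradeable on a given date, and make sure a position that dies during the holding period books its final loss. The sketch below is one possible shape for that, not our production pipeline; `Listing`, `universe_as_of`, `holding_return` and the three toy tickers are hypothetical names introduced here, and it assumes `delisted` is the first date on which the stock can no longer be traded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Listing:
    ticker: str
    listed: date
    delisted: Optional[date]       # None means still trading
    delisting_return: float = 0.0  # final return on exit, e.g. -1.0 for a wipe-out


def universe_as_of(listings, as_of):
    """Tickers tradeable on `as_of`, including ones that died later."""
    return sorted(
        entry.ticker
        for entry in listings
        if entry.listed <= as_of
        and (entry.delisted is None or as_of < entry.delisted)
    )


def holding_return(listing, price_return, period_end):
    """Period return, chaining in the delisting return if the stock died in the period."""
    if listing.delisted is not None and listing.delisted <= period_end:
        return (1 + price_return) * (1 + listing.delisting_return) - 1
    return price_return


# Hypothetical archive records.
listings = [
    Listing("LIVE", date(2001, 1, 1), None),
    Listing("DEAD", date(2003, 6, 1), date(2014, 3, 31), delisting_return=-1.0),
    Listing("NEW", date(2018, 1, 1), None),
]

print(universe_as_of(listings, date(2010, 1, 4)))  # ['DEAD', 'LIVE']
print(universe_as_of(listings, date(2020, 1, 6)))  # ['LIVE', 'NEW']

# DEAD fell 20% on its last quoted prices, then holders lost the rest.
print(holding_return(listings[1], -0.20, date(2014, 12, 31)))  # -1.0
```

A current ticker list can only ever answer the 2020 question. The 2010 answer, with `DEAD` in it, exists only if someone recorded the listing and delisting dates at the time.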

For this project we bought the full paid archive of Japanese equity data and rebuilt the panel that way, survivorship-free across a decade, delisted names included. It was the single most important data decision we made — partly so that 'you just didn't have the right data' could never be the excuse for our negative results.

What changed when the dead came back

Several strategies that looked attractive on survivor-only data stopped working on the honest panel. The pattern was consistent with the mechanism: ideas that implicitly bet on troubled companies recovering lost the most, because the honest data contains the troubled companies that never did.

This is also why our headline results can look 'worse' than the backtests circulating online for similar ideas. They are not worse — they are net of a flattery the other numbers still contain.

How to spot it in the wild

When you see a backtest, ask one question first: where did the universe come from? If the answer is a current index membership or a free price API, assume the results are inflated — moderately for large-cap momentum-style ideas, severely for anything that buys cheap, small or distressed stocks.

Related tells: returns that begin exactly when the data provider's history begins, no mention of delistings, and 'we tested all stocks' without saying as of when. Honest research names its data source and says the words point-in-time. If those words are missing, the thumb is probably on the scale.
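One of these tells can be checked mechanically if you have the price data in hand. In a survivor-only feed, every ticker's history tends to run right up to the dataset's final date, because nothing in it ever stopped trading. The helper below is a rough heuristic under that assumption, with a hypothetical name and toy inputs: a result of zero over a long history is a warning sign, while a non-zero result does not by itself prove the data is point-in-time.

```python
from datetime import date


def fraction_ending_early(last_obs, data_end):
    """Share of tickers whose price history stops before the dataset's final date."""
    if not last_obs:
        raise ValueError("empty universe")
    early = sum(1 for d in last_obs.values() if d < data_end)
    return early / len(last_obs)


data_end = date(2024, 12, 30)

# Hypothetical survivor-only feed: every series runs to the last day.
survivor_feed = {"AAA": data_end, "BBB": data_end}

# Hypothetical archive: two series stop years before the end.
archive = {
    "AAA": data_end,
    "BBB": data_end,
    "CCC": date(2014, 3, 31),
    "DDD": date(2009, 9, 15),
}

print(fraction_ending_early(survivor_feed, data_end))  # 0.0
print(fraction_ending_early(archive, data_end))        # 0.5
```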

The boring conclusion that survives

A cap-weighted index never has this problem — it holds whatever exists, automatically adding the new and shedding the dead, which is part of why it is so hard to beat honestly. Our research keeps returning to the same verdict from another angle: most claimed edges are artifacts of measurement, and the index is what remains when you measure honestly.
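A two-period toy example shows why. The sketch below assumes a simplified index with no fees, floats or rebalancing rules, and all tickers, market caps and returns are invented. The point is only that the failing stock's loss is booked in the period it happens, and membership in the next period is whatever still exists plus whatever has newly listed.

```python
def cap_weighted_return(caps, returns):
    """One-period index return, each member weighted by start-of-period market cap."""
    total = sum(caps.values())
    return sum(caps[t] / total * returns[t] for t in caps)


# Period 1: three members; SICK goes to zero during the period.
caps_1 = {"BIG": 800.0, "MID": 150.0, "SICK": 50.0}
rets_1 = {"BIG": 0.05, "MID": 0.10, "SICK": -1.00}
r1 = cap_weighted_return(caps_1, rets_1)

# Period 2: caps drift with returns, the dead name drops out, a new listing joins.
caps_2 = {t: caps_1[t] * (1 + rets_1[t]) for t in caps_1 if rets_1[t] > -1.0}
caps_2["NEW"] = 100.0
rets_2 = {"BIG": 0.02, "MID": -0.03, "NEW": 0.20}
r2 = cap_weighted_return(caps_2, rets_2)

print(f"period 1: {r1:+.2%}")  # +0.50%, SICK's wipe-out included
print(f"period 2: {r2:+.2%}")  # +2.88%
print(sorted(caps_2))          # ['BIG', 'MID', 'NEW']
```

The index paid for `SICK` in period 1 and then moved on. A survivor-only backtest would have shown period 1 without that loss.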

Tags: Survivorship bias, Backtest, Out-of-sample, Index fund, Value (factor)

This is a research write-up of historical simulations, not investment advice. Past distributions do not guarantee future outcomes. Investment decisions are your own.