Evidence
The evidence — every test, fully disclosed
Each lane below is published whether it passed or failed. None beat the index after cost.

FINDING 01 · FOUNDATION
How far can this data actually be trusted?
Before any verdict: what we bought, and what biases we removed.
The data behind the verdict
Not a hot take — built on serious, survivorship-free data, with every test published.
- 10 yrs of survivorship-free JP data, incl. delisted names
- 5,300+ stocks in the test universe
- 99 yrs of S&P 500 history (1927–2026)
- 148,000+ sector-day observations (short-ratio lane)
- 1,159 rolling backtest cohorts (contribution study)
- 15+ strategies, every one disclosed, pass or fail
FINDING 02 · LESSONS
What did this research learn the expensive way?
Only the lessons that repeated across dozens of tests.
Why NISA is hard to beat
Durable lessons from the closed research lanes. Our own methodology and conclusions — published whether they flatter us or not.
1. The signal had no edge — random tied or beat it.
A liquidity-matched random basket (OOS terminal 1.130) tied or beat the breakout signal (1.098). With no edge over random selection, swapping in another single factor does not help.
2. Turnover, cost and tax are a structural opponent.
NISA turns over ~0%; active books turn over 66–99%. At a 1.5% round-trip cost, ~0.8 monthly turnover bleeds ~1.2%/month (~20% over 18 months). NISA gains are also tax-free, a hurdle the backtest never charged.
3. Multi-regime confirmed: still no edge, deeper drawdowns.
Re-run across 2016-2026 (incl. 2018 Q4, 2020 COVID, 2022): breakout beats NISA in 0/5 regimes, beats its random control in 0/5, and its drawdowns are deeper than NISA's in every selloff. The drawdown-protection hope is rejected.
4. A real edge, but long-only can't diversify.
The margin (supply-demand) signal is the first to beat random (5/5) and liquidity-matched random (4/5): a genuine small edge. But it still loses to NISA standalone and has a 0.94 correlation with a broad NISA core, so a 10% satellite lowers Sharpe (0.67 → 0.60). A long-only equity factor can't diversify an equity core.
5. A premium that only shows up after 2023 is the rally, not an edge.
Cap-weighted, the value+quality tilt beat the index by +4.5pp/yr overall, but that splits into −0.5pp pre-2023 and +15.3pp after. The semis sector tilt splits the same way. Once you cut pre/post-2023, most 'premiums' turn out to be the TSE-reform / AI rally. A both-halves test is the honesty guard, and any gross excess that survives it is then eaten by turnover cost.
6. By the time a real change is visible in the filings, it's priced.
Selecting the biggest durable margin step-ups underperformed by 9.8pp/yr (−46% DD); 'quiet accumulation' volume events underperformed by 14pp/yr (−62% DD). Both lose in both halves: the rerate happens before the signal is observable, so chasing the trajectory buys the top and loads cyclical, volatile names that revert.
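The cost arithmetic in lesson 2 can be checked with a short sketch. It assumes that "turnover" means the fraction of the book replaced each month and that the drag compounds monthly. The function and parameter names are illustrative and are not taken from the research code.

```python
def monthly_cost_drag(round_trip_cost: float, monthly_turnover: float) -> float:
    """Fraction of the book lost to trading costs in one month."""
    return round_trip_cost * monthly_turnover


def cumulative_drag(monthly_drag: float, months: int) -> float:
    """Total fraction lost after compounding the monthly drag."""
    return 1.0 - (1.0 - monthly_drag) ** months


# Inputs quoted in lesson 2: 1.5% round-trip cost, ~0.8 turnover, 18 months.
drag = monthly_cost_drag(round_trip_cost=0.015, monthly_turnover=0.8)
total = cumulative_drag(drag, months=18)

print(f"monthly drag: {drag:.2%}")
print(f"18-month drag: {total:.1%}")
```

Under these assumptions the monthly drag is 1.2% and the compounded 18-month drag lands just under 20%, in line with the rounded figures quoted in the lesson.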
Research and paper-simulation only. Not investment advice. No live trading or edge claim. — docs/lesson_2026-06-02_why_nisa_is_hard_to_beat.md
FINDING 03 · FAIL
Did the breakout strategy beat plain NISA?
A frozen strategy, tested out-of-sample with nowhere to hide.
NISA OOS validation
Out-of-sample test of the frozen breakout screen against NISA benchmarks. We publish this whether it passes or fails.
- Breakout portfolio: 0.883×
- NISA equal-weight: 1.592×
- NISA large-cap: 1.8×
- OOS lift vs NISA: 0.99×
- ≥5x tail events: ≈6
- Multibagger capture: 52.2%
Conclusion
- Frozen breakout screen is not tradeable.
- Live radar disabled.
- NISA remains the evidence-backed default.
Nuance
The screen did catch some multibagger candidates, but portfolio construction did not beat NISA after costs.
Now tested out-of-sample across multiple regimes (2018–2026, incl. 2018 Q4, 2020 COVID, 2022): the breakout portfolio still loses to NISA after costs. The FAIL is definitive, not a single-regime artifact.
OOS (2022–2026): breakout −3.4%/yr vs NISA-EW +13.5% / large-cap +17.4%, barely above its random control. (2018-01-31 → 2026-03-31)
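The terminal multiples and the per-year figures above are linked by the usual compound-growth conversion, sketched below. The length of the OOS window in years is not stated explicitly, so `ASSUMED_OOS_YEARS` is a placeholder assumption and should be replaced with the actual window length.

```python
def annualized_return(terminal_multiple: float, years: float) -> float:
    """Convert a terminal wealth multiple into a compound annual growth rate."""
    if terminal_multiple <= 0 or years <= 0:
        raise ValueError("terminal_multiple and years must be positive")
    return terminal_multiple ** (1.0 / years) - 1.0


# ASSUMPTION: window length in years; not given in the text above.
ASSUMED_OOS_YEARS = 3.67

# Terminal multiples as published in the NISA OOS validation panel.
multiples = {
    "Breakout portfolio": 0.883,
    "NISA equal-weight": 1.592,
    "NISA large-cap": 1.8,
}

annualized = {
    name: annualized_return(multiple, ASSUMED_OOS_YEARS)
    for name, multiple in multiples.items()
}
for name, rate in annualized.items():
    print(f"{name}: {rate:+.1%} per year")
```

With the assumed window, the three multiples annualize to roughly −3%, +13.5% and +17.4% per year, close to the quoted figures. Small gaps are expected because the published multiples are rounded and the window length here is a guess.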
FINDING 04 · FAIL
Fine, what if you blend in just a 5% satellite?
Testing the last hope: maybe it helps as a core-satellite blend.
Core-Satellite Overlay
Tests whether a NISA core plus a small breakout satellite beats plain NISA. Published whether it passes or fails.
- NISA-EW core: 1.354×
- Best overlay: 1.341×
- Best excess vs NISA: −1pp
5% satellite, equal-weight
- Top-N concentration made it worse, not better.
- The satellite worsens the blend as its allocation rises.
Conclusion
- Breakout is not useful as a NISA satellite.
- No paper skill / no live radar.
Breakout research lane closed — standalone and satellite both failed. See docs/breakout_research_closure.md.
FINDING 05 · FAIL
Is post-earnings drift (PEAD) still alive?
One of the oldest documented anomalies, retested on today's market.
PEAD / earnings surprise
Out-of-sample test of post-earnings drift vs NISA. Published whether it passes, fails, or cannot be decided yet.
- OOS events: 7,840
- Revision α (t+21d): −1.39pp
- Surprise α (t+21d): −2.236pp
- Combined α (t+21d): −1.354pp
- Guidance-based PEAD produced no positive OOS drift alpha vs the NISA/EW proxy.
- Revision and surprise signals are negative after costs; combined is only marginal at t+63d and not robust.
- No paper skill / no live radar.
Rejected at monthly resolution. The daily t+5d announcement window, the one sub-angle left open, was then measured on paid daily data (21,014 liquid events, 2016–2026) and is also negative after cost (see docs/pead_research_closure.md, docs/event_window_study_closure.md). The only remaining untested input is analyst-consensus surprise, which J-Quants does not carry.
No edge or profitability is claimed.
FINDING 06 · MEASURED
Which signals actually moved the results?
Not vibes: decomposed, measured attribution.
Signal attribution
Which evidence lanes supported or opposed each decision — recorded now, scored only once outcomes close.
Coverage
- Decisions: 3730
- With attribution: 3730
- Ready to score: 3359
Lanes represented (decisions with a known stance)
- Agent decision: 3730
- Technical: 3730
- Risk: 3730
- Event awareness: unknown
- Thesis: unknown
- World Monitor: unknown
Stances recorded
- Supports buy: 3320
- Supports hold: 3119
- Supports sell: 1021
- Caution: 3730
Each decision links to its Research Outcomes window; stances mature into evidence only as those windows close.
Attribution is not causation: a lane supporting a decision does not mean it caused the result.
Read-only attribution coverage. No edge or profitability is claimed.
FINDING 07 · RULED OUT
Couldn't this all just be bad data?
We bought the full paid archive and re-ran everything to kill that excuse.
Paid-data archive — the flows everyone watches turned out to be nothing
We bought the full J-Quants dataset and tested the flows JP investors actually follow. The two most-cited are coincident or noise. (The two with real structure get their own panels below.)
Investor flows (投資部門別) — all 12 categories as timing signals
NOT PREDICTIVE
We tested every investor type, not just foreigners. Individuals really are contrarian traders (−0.73 same-week correlation) and foreigners/prop chase momentum (+0.40), but none of it predicts next week: every category's forward correlation is ~0 and every hit-rate ~50%. 'Follow the smart money, fade dumb retail' is coincident behaviour, already in the price, not a forward signal.
520 weeks · individuals same-week −0.73, foreigners +0.40 · every category's next-week corr ≈ 0 (max |r| = 0.06)
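The distinction between a coincident relationship and a forward signal comes down to where the lag sits. The sketch below compares the same-week correlation with the one-week-ahead correlation and hit rate for a single investor category. It assumes the flows are compared against weekly index returns and that a "hit" means the sign of this week's net flow matches the sign of next week's return. Both are assumptions about the method, and the toy series is hypothetical.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flow_signal_report(net_flow: list[float], index_return: list[float]) -> dict[str, float]:
    """Coincident vs one-week-ahead relationship for one investor category."""
    # Flow in week t against the return in week t + 1.
    lagged_pairs = list(zip(net_flow[:-1], index_return[1:]))
    hits = sum(1 for flow, ret in lagged_pairs if (flow > 0) == (ret > 0))
    return {
        "same_week_corr": pearson(net_flow, index_return),
        "next_week_corr": pearson(net_flow[:-1], index_return[1:]),
        "next_week_hit_rate": hits / len(lagged_pairs),
    }


# Hypothetical toy data: a perfectly contrarian trader in a market
# whose weekly returns carry no one-week persistence.
toy_return = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
toy_flow = [-r for r in toy_return]

report = flow_signal_report(toy_flow, toy_return)
print(report)
```

In the toy case the same-week correlation is strongly negative while the next-week correlation is zero and the hit rate is 50%, which is the pattern the panel describes: a clear coincident relationship with no forward information.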
Earnings-announcement premium — is return concentrated on earnings?
REAL, but already in the index
A rare positive: stocks really do earn more around their earnings (+27 bps/day, 67% of names), so the global 'announcement premium' replicates in Japan. But it's a RISK premium (announcement days are 1.35× more volatile, direction is a 51% coin flip), and a buy-and-hold / NISA index holder already captures it by holding through every earnings date. Isolating it means trading around every announcement, and that turnover eats it. Not an edge over the index.
+27 bps on announcement days vs +4 bps baseline · positive in 67% of stocks · 166k events
Large disclosed shorts (空売り残高報告) — are big short-sellers informed?
weak-stock marker, not a timing edge
Every short position ≥0.5% is disclosed by name (Barclays, Citadel…). Stocks carrying big institutional shorts do underperform (~−1.9pp/60d), but a short INCREASE (−1.6) and a DECREASE (−2.1) underperform about equally, so it's a 'shorted stocks are weak stocks' selection marker, not informed timing on the change. These names are hard to borrow and the signal is negative: an avoidance flag, not a retail edge.
1.29M disclosures, 3,593 stocks · shorted names −1.9pp/60d vs TOPIX · increase −1.6 ≈ decrease −2.1
Sector short-ratio (空売り比率) — as a contrarian reversal
NO REVERSAL
Extreme short-selling does not precede a bounce. High-short sectors mildly UNDERperform low-short ones; the faint tilt is continuation, not reversal, and it flips sign across eras, which marks it as noise. The spike is coincident sentiment, already in the price.
high − low short-ratio sectors: −0.49pp over 20d · 148k observations
Research/paper, local cached data, no API spend. Published whether it helps or not. Not investment advice.
FINDING 08 · NOT TRADEABLE
Is a margin restriction a tradeable sell signal?
The signal is real. There's still a reason you can't trade it.
Margin restriction (信用規制) — what happens after a stock is restricted
Stocks flagged for overheated speculative margin buying. A real, stable signal — but not one you can trade.
Restricted names spiked a median +35% in the prior month, then gave back ~4% vs TOPIX over the next quarter (3,989 events, both halves).
| Horizon | Excess vs TOPIX | % positive |
|---|---|---|
| +20 days | −1.7pp | 35% |
| +60 days | −3.8pp | 32% |
The restriction cools the speculative rally — but these are exactly the un-shortable, squeeze-prone names, and the signal is negative, so there's no long edge. It's an avoidance rule: don't chase a stock that just spiked into a margin restriction.
Entry = a name newly under a hard restriction vs the prior weekly snapshot; excess vs TOPIX from the publication date. Aggregate forward tell only, not tradeable. Not advice.
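The entry rule and the two table columns can be sketched as a small event study. This is a minimal illustration under stated assumptions: snapshots are represented as sets of ticker codes, prices are daily closes indexed by trading day, and every name and number in the toy data is hypothetical.

```python
def newly_restricted(prior_snapshot: set[str], current_snapshot: set[str]) -> set[str]:
    """Names under restriction now that were not in the prior weekly snapshot."""
    return current_snapshot - prior_snapshot


def excess_return(prices: list[float], bench: list[float], start: int, horizon: int) -> float:
    """Stock return minus benchmark return over `horizon` trading days from `start`."""
    stock = prices[start + horizon] / prices[start] - 1.0
    index = bench[start + horizon] / bench[start] - 1.0
    return stock - index


def summarize_events(excess: list[float]) -> tuple[float, float]:
    """Mean excess return and the share of events with positive excess."""
    mean = sum(excess) / len(excess)
    share_positive = sum(1 for e in excess if e > 0) / len(excess)
    return mean, share_positive


# Hypothetical snapshots: only "C" is a new entry.
entries = newly_restricted({"A", "B"}, {"B", "C"})

# Hypothetical (stock, benchmark) price paths, 2-day horizon for brevity.
toy_events = [
    ([100.0, 97.0, 95.0], [100.0, 100.0, 101.0]),   # stock -5%, index +1%
    ([50.0, 51.0, 52.0], [100.0, 100.0, 100.0]),    # stock +4%, index flat
    ([200.0, 190.0, 188.0], [100.0, 99.0, 98.0]),   # stock -6%, index -2%
]
excess = [excess_return(p, b, start=0, horizon=2) for p, b in toy_events]
mean_excess, share_positive = summarize_events(excess)
print(f"mean excess {mean_excess:+.1%}, positive in {share_positive:.0%} of events")
```

Running the same summary at 20 and 60 trading days would yield the two rows of the table, given the real event list and price history.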
FINDING 09 · MARGINAL
Can you beat the index by chasing hot sectors?
TOPIX-17 momentum, tested: what showed up wasn't alpha.
Sector rotation (TOPIX-17 momentum) — does chasing the hot sectors beat TOPIX?
Each month rotate into the top-5 trending sectors, equal-weight, cost-charged, vs cap-weight TOPIX.
Rotation edges TOPIX by +0.9pp/yr net and is positive in both halves — but with deeper drawdowns and an identical Calmar (0.40), so it's just higher beta.
| Strategy | CAGR | Max DD | Pre-2023 | Post-2023 |
|---|---|---|---|---|
| Rotation top-5 (net) | 10.4% | −26% | +2.4% | +22.9% |
| TOPIX (cap-wt) | 9.4% | −23% | +1.7% | +21.6% |
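The two checks applied to this table, the both-halves test and the Calmar comparison, are simple enough to reproduce from the rounded values. The sketch below assumes Calmar is defined as CAGR divided by the magnitude of the maximum drawdown; the dictionary keys are illustrative names.

```python
def calmar(cagr_pct: float, max_drawdown_pct: float) -> float:
    """Calmar ratio: annualized return over the magnitude of the worst drawdown."""
    return cagr_pct / abs(max_drawdown_pct)


# Values copied from the table above, in percent.
rotation = {"cagr": 10.4, "max_dd": -26.0, "pre_23": 2.4, "post_23": 22.9}
topix = {"cagr": 9.4, "max_dd": -23.0, "pre_23": 1.7, "post_23": 21.6}

# Both-halves test: the excess must be positive before AND after the 2023 cut.
excess_pre = rotation["pre_23"] - topix["pre_23"]
excess_post = rotation["post_23"] - topix["post_23"]
passes_both_halves = excess_pre > 0 and excess_post > 0

calmar_rotation = calmar(rotation["cagr"], rotation["max_dd"])
calmar_topix = calmar(topix["cagr"], topix["max_dd"])

print(f"excess pre: {excess_pre:+.1f}pp, post: {excess_post:+.1f}pp, both halves: {passes_both_halves}")
print(f"Calmar rotation: {calmar_rotation:.2f}, TOPIX: {calmar_topix:.2f}")
```

From the rounded table values the rotation passes the both-halves test, but the two Calmar ratios come out at about 0.40 and 0.41, so the extra return is matched by proportionally deeper drawdowns.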
The least-bad active result on the paid archive — yet the extra 0.9pp/yr comes with deeper drawdowns and no Calmar improvement. And a robustness check settles it: across 9 lookback × top-K settings, only 1 beat TOPIX (the +0.9pp cell shown); the other 8 lost, mean −0.6pp/yr. The 'edge' is a single lucky parameter — higher beta, not skill, and not a durable edge.
Official TOPIX-17 cap-weighted sector indices, monthly top-5 by 12m momentum, net of a 0.4% round-trip cost on turnover, dividends excluded on both sides. Not advice.
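One monthly rebalance of such a rotation might look like the sketch below. It is a simplified illustration, not the study's code: it assumes the round-trip cost is charged on one-way turnover (the fraction of the book replaced), and the sector labels and returns are hypothetical.

```python
def top_k_by_momentum(trailing_return: dict[str, float], k: int = 5) -> list[str]:
    """The k sectors with the highest trailing (e.g. 12-month) return."""
    ranked = sorted(trailing_return, key=trailing_return.get, reverse=True)
    return ranked[:k]


def equal_weights(names: list[str]) -> dict[str, float]:
    """Equal-weight allocation across the selected sectors."""
    return {name: 1.0 / len(names) for name in names}


def turnover(old: dict[str, float], new: dict[str, float]) -> float:
    """One-way turnover: half the sum of absolute weight changes."""
    names = set(old) | set(new)
    return 0.5 * sum(abs(new.get(n, 0.0) - old.get(n, 0.0)) for n in names)


def net_month_return(
    weights: dict[str, float],
    sector_return: dict[str, float],
    prev_weights: dict[str, float],
    round_trip_cost: float = 0.004,
) -> float:
    """Gross weighted return minus the cost of the rebalance trades."""
    gross = sum(w * sector_return[name] for name, w in weights.items())
    return gross - round_trip_cost * turnover(prev_weights, weights)


# Hypothetical trailing 12-month returns for seven sectors.
trailing = {"A": 0.30, "B": 0.10, "C": 0.25, "D": -0.05, "E": 0.15, "F": 0.20, "G": 0.05}
# Hypothetical returns over the following month.
next_month = {"A": 0.02, "B": 0.01, "C": -0.01, "D": 0.00, "E": 0.03, "F": 0.00, "G": 0.01}

previous = equal_weights(["A", "C", "F", "E", "G"])
selected = top_k_by_momentum(trailing, k=5)
current = equal_weights(selected)

traded = turnover(previous, current)
net = net_month_return(current, next_month, previous)
print(selected, f"turnover {traded:.0%}", f"net {net:+.2%}")
```

Repeating this step each month and compounding the net returns gives the strategy's equity curve. The robustness check described above corresponds to re-running it across different lookback windows and values of `k`.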