Evidence

The evidence — every test, fully disclosed

Each lane below is published whether it passed or failed. None beat the index after cost.

FINDING 01 (FOUNDATION): How far can this data actually be trusted?
Before any verdict: what we bought, and what biases we removed.

The data behind the verdict

Not a hot take — built on serious, survivorship-free data, with every test published.

10 yrs of survivorship-free JP data, incl. delisted names
5,300+ stocks in the test universe
99 yrs of S&P 500 history (1927–2026)
148,000+ sector-day observations (short-ratio lane)
1,159 rolling backtest cohorts (contribution study)
15+ strategies, every one disclosed, pass or fail

FINDING 02 (LESSONS): What did this research learn the expensive way?
Only the lessons that repeated across dozens of tests.

Why NISA is hard to beat

Durable lessons from the closed research lanes. Our own methodology and conclusions — published whether they flatter us or not.

Status: HOLD

1. The signal had no edge — random tied or beat it.
A liquidity-matched random basket (OOS terminal 1.130) tied or beat the breakout signal (1.098). With no edge over random selection, swapping in another single factor does not help.
2. Turnover, cost and tax are a structural opponent.
NISA turns over ~0%; active books 66-99%. At 1.5% round-trip, ~0.8 turnover bleeds ~1.2%/month (~20% over 18 months). NISA gains are also tax-free — a hurdle the backtest never charged.
3. Multi-regime confirmed: still no edge, deeper drawdowns.
Re-run across 2016-2026 (incl. 2018 Q4, 2020 COVID, 2022): breakout beats NISA in 0/5 regimes, beats its random control in 0/5, and its drawdowns are deeper than NISA's in every selloff. The drawdown-protection hope is rejected.
4. A real edge, but long-only can't diversify.
The margin (supply-demand) signal is the first to beat random (5/5) and liquidity-matched random (4/5), a genuine small edge. But it still loses to NISA standalone and has a 0.94 correlation with a broad NISA core, so a 10% satellite lowers Sharpe (0.67 → 0.60). A long-only equity factor can't diversify an equity core.
5. A premium that only shows up after 2023 is the rally, not an edge.
Cap-weighted, the value+quality tilt beat the index +4.5pp/yr overall — but split −0.5pp pre-2023 / +15.3pp after. The semis sector tilt splits the same way. Once you cut pre/post-2023, most 'premiums' turn into the TSE-reform / AI rally. A both-halves test is the honesty guard, and gross excess is then eaten by turnover cost.
6. By the time a real change is visible in the filings, it's priced.
Selecting the biggest durable margin step-ups underperformed by 9.8pp/yr (−46% DD); 'quiet accumulation' volume events underperformed by 14pp/yr (−62% DD). Both lose in both halves: the rerate happens before the signal is observable, so chasing the trajectory buys the top and loads cyclical, volatile names that revert.
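The cost arithmetic in lesson 2 can be reproduced in a few lines. This is a minimal sketch, assuming the ~0.8 turnover is a monthly figure and that the drag compounds month over month; the function names are illustrative and are not taken from the research code.

```python
def monthly_cost_drag(round_trip_cost: float, monthly_turnover: float) -> float:
    """Fraction of the book lost to trading costs in one month."""
    return round_trip_cost * monthly_turnover


def cumulative_drag(monthly_drag: float, months: int) -> float:
    """Compounded loss after paying `monthly_drag` for `months` months."""
    return 1.0 - (1.0 - monthly_drag) ** months


# Figures from lesson 2: 1.5% round-trip cost, ~0.8 turnover (assumed monthly).
drag = monthly_cost_drag(round_trip_cost=0.015, monthly_turnover=0.8)
total = cumulative_drag(drag, months=18)

print(f"monthly drag:  {drag:.2%}")   # about 1.2% per month
print(f"18-month drag: {total:.1%}")  # a little under 20%
```

A book that turns over ~0% pays none of this, which is why the drag behaves like a structural handicap rather than a one-off fee.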

Research and paper-simulation only. Not investment advice. No live trading or edge claim. — docs/lesson_2026-06-02_why_nisa_is_hard_to_beat.md

FINDING 03 (FAIL): Did the breakout strategy beat plain NISA?
A frozen strategy, tested out-of-sample with nowhere to hide.

NISA OOS validation

Out-of-sample test of the frozen breakout screen against NISA benchmarks. We publish this whether it passes or fails.

Verdict: FAIL

Breakout portfolio: 0.883×
NISA equal-weight: 1.592×
NISA large-cap: 1.8×
OOS lift vs NISA: 0.99×
≥5x tail events: ≈6
Multibagger capture: 52.2%

Conclusion

• Frozen breakout screen is not tradeable.
• Live radar disabled.
• NISA remains the evidence-backed default.

Nuance

The screen did catch some multibagger candidates, but portfolio construction did not beat NISA after costs.

Now tested out-of-sample across multiple regimes (2018–2026, incl. 2018 Q4, 2020 COVID, 2022): the breakout portfolio still loses to NISA after costs. The FAIL is definitive, not a single-regime artifact.

OOS (2022–2026): breakout −3.4%/yr vs NISA-EW +13.5% / large-cap +17.4%, barely above its random control. (2018-01-31 → 2026-03-31)
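Terminal multiples and annualised returns are two views of the same result, so they can be cross-checked. The sketch below assumes the multiples in the verdict panel (0.883×, 1.592×, 1.8×) and the annualised figures above (−3.4%, +13.5%, +17.4%) describe the same OOS window; the window length is not stated here, so it is backed out rather than assumed.

```python
import math


def implied_years(terminal_multiple: float, annual_return: float) -> float:
    """Holding period that turns `annual_return` into `terminal_multiple`."""
    return math.log(terminal_multiple) / math.log(1.0 + annual_return)


def cagr(terminal_multiple: float, years: float) -> float:
    """Annualised return implied by a terminal multiple over `years`."""
    return terminal_multiple ** (1.0 / years) - 1.0


# (terminal multiple, annualised return) pairs quoted in this finding.
lanes = {
    "breakout": (0.883, -0.034),
    "NISA equal-weight": (1.592, 0.135),
    "NISA large-cap": (1.8, 0.174),
}

for name, (multiple, annual) in lanes.items():
    print(f"{name}: implied window = {implied_years(multiple, annual):.2f} years")
```

All three pairs imply a window of roughly three and a half to three and three-quarter years, so the panel and the annualised figures are at least mutually consistent.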

FINDING 04 (FAIL): Fine, what if you blend in just a 5% satellite?
Testing the last hope: maybe it helps as a core-satellite blend.

Core-Satellite Overlay

Tests whether a NISA core plus a small breakout satellite beats plain NISA. Published whether it passes or fails.

Verdict: FAIL

NISA-EW core: 1.354×
Best overlay: 1.341×
Best excess vs NISA: −1pp

• Top-N concentration made it worse, not better.
• The satellite worsens the blend as its allocation rises.

Conclusion

• Breakout is not useful as a NISA satellite.
• No paper skill / no live radar.

Breakout research lane closed — standalone and satellite both failed. See docs/breakout_research_closure.md.
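Whether a satellite helps a core depends on correlation as much as on return. The sketch below uses the standard two-asset Sharpe formula with purely hypothetical inputs (none of these return or volatility numbers come from the research) to show why a satellite that is highly correlated with the core and has a lower Sharpe drags the blend down.

```python
import math


def blended_sharpe(core_ret: float, core_vol: float,
                   sat_ret: float, sat_vol: float,
                   corr: float, sat_weight: float) -> float:
    """Sharpe ratio of a (1 - w) core + w satellite blend.

    Returns are excess returns over the risk-free rate, vols are annualised.
    """
    w = sat_weight
    ret = (1.0 - w) * core_ret + w * sat_ret
    var = (((1.0 - w) * core_vol) ** 2
           + (w * sat_vol) ** 2
           + 2.0 * (1.0 - w) * w * corr * core_vol * sat_vol)
    return ret / math.sqrt(var)


# Hypothetical inputs for illustration only.
core_ret, core_vol = 0.10, 0.15   # core Sharpe about 0.67
sat_ret, sat_vol = 0.08, 0.25     # weaker, more volatile satellite

core_only = core_ret / core_vol
high_corr = blended_sharpe(core_ret, core_vol, sat_ret, sat_vol, corr=0.94, sat_weight=0.10)
low_corr = blended_sharpe(core_ret, core_vol, sat_ret, sat_vol, corr=-0.30, sat_weight=0.10)

print(f"core only:       {core_only:.2f}")
print(f"10% sat, r=0.94: {high_corr:.2f}")   # below the core alone
print(f"10% sat, r=-0.3: {low_corr:.2f}")    # above the core alone
```

With the same satellite, only the low-correlation case improves on the core, which is the diversification a long-only equity factor cannot supply.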

FINDING 05 (FAIL): Is post-earnings drift (PEAD) still alive?
One of the oldest documented anomalies, retested on today's market.

PEAD / earnings surprise

Out-of-sample test of post-earnings drift vs NISA. Published whether it passes, fails, or cannot be decided yet.

Verdict: FAIL

OOS events: 7,840
Revision α (t+21d): −1.39pp
Surprise α (t+21d): −2.236pp
Combined α (t+21d): −1.354pp

• Guidance-based PEAD produced no positive OOS drift alpha vs the NISA/EW proxy.
• Revision and surprise signals are negative after costs; combined is only marginal at t+63d and not robust.
• No paper skill / no live radar.
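An after-cost drift alpha of the kind reported above can be sketched as the mean event return in excess of the benchmark over the same horizon, less a trading cost. This is an illustrative simplification with made-up numbers; the study's actual event definitions, horizon handling and cost model are not reproduced here.

```python
def net_event_alpha(event_returns: list[float],
                    benchmark_returns: list[float],
                    round_trip_cost: float) -> float:
    """Mean excess return per event after subtracting one round-trip cost."""
    if not event_returns:
        raise ValueError("need at least one event")
    if len(event_returns) != len(benchmark_returns):
        raise ValueError("each event needs a matching benchmark return")
    excess = [e - b for e, b in zip(event_returns, benchmark_returns)]
    return sum(excess) / len(excess) - round_trip_cost


# Hypothetical t+21d returns for three events and the benchmark over the same windows.
events = [0.03, -0.01, 0.02]
bench = [0.02, 0.00, 0.01]

alpha = net_event_alpha(events, bench, round_trip_cost=0.015)
print(f"net alpha per event: {alpha:.2%}")
```

In this toy case the gross excess is positive but smaller than the cost, so the net alpha is negative, the same shape as the verdict above.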

Rejected on monthly resolution, and the daily t+5d announcement window — the one sub-angle left open — was then measured on paid daily data (21,014 liquid events, 2016–2026) and is also negative after cost (see docs/pead_research_closure.md, docs/event_window_study_closure.md). The only remaining untested input is analyst-consensus surprise, which J-Quants does not carry.

No edge or profitability is claimed.

FINDING 06 (MEASURED): Which signals actually moved the results?
Not vibes: decomposed, measured attribution.

Signal attribution

Which evidence lanes supported or opposed each decision — recorded now, scored only once outcomes close.

Coverage

Decisions: 3730
With attribution: 3730
Ready to score: 3359

Lanes represented (decisions with a known stance)

Agent decision: 3730
Technical: 3730
Risk: 3730
Event awareness: unknown
Thesis: unknown
World Monitor: unknown

Stances recorded

Supports buy: 3320
Supports hold: 3119
Supports sell: 1021
Caution: 3730

No closed outcomes yet — scoring is on hold. Win rates are not computed until outcome windows close.

Each decision links to its Research Outcomes window; stances mature into evidence only as those windows close.

Attribution is not causation: a lane supporting a decision does not mean it caused the result.

Read-only attribution coverage. No edge or profitability is claimed.

FINDING 07 (RULED OUT): Couldn't this all just be bad data?
We bought the full paid archive and re-ran everything to kill that excuse.

Paid-data archive — the flows everyone watches turned out to be nothing

We bought the full J-Quants dataset and tested the flows JP investors actually follow. The two most-cited are coincident or noise. (The two with real structure get their own panels below.)

Investor flows (投資部門別) — all 12 categories as timing signals
NOT PREDICTIVE
We tested every investor type, not just foreigners. Individuals really are contrarian traders (−0.73 same-week) and foreigners/prop chase momentum (+0.40) — but none of it predicts next week: every category's forward correlation is ~0 and every hit-rate ~50%. 'Follow the smart money, fade dumb retail' is coincident behaviour, already in the price, not a forward signal.
520 weeks · individuals same-week −0.73, foreigners +0.40 · every category's next-week corr ≈ 0 (max |r| = 0.06)
Earnings-announcement premium — is return concentrated on earnings?
REAL, but already in the index
A rare positive: stocks really do earn more around their earnings (+27 bps/day, 67% of names), so the global 'announcement premium' replicates in Japan. But it's a RISK premium (announcement days are 1.35× more volatile, direction is a 51% coin flip), and a buy-and-hold / NISA index holder already captures it by holding through every earnings announcement. Isolating it means trading around every announcement, and that turnover eats it. Not an edge over the index.
+27 bps on announcement days vs +4 bps baseline · positive in 67% of stocks · 166k events
Large disclosed shorts (空売り残高報告) — are big short-sellers informed?
weak-stock marker, not a timing edge
Every short position ≥0.5% is disclosed by name (Barclays, Citadel…). Stocks carrying big institutional shorts do underperform (~−1.9pp/60d) — but a short INCREASE (−1.6) and a DECREASE (−2.1) underperform about equally, so it's a 'shorted stocks are weak stocks' selection marker, not informed timing on the change. They're hard-to-borrow and the signal is negative — an avoidance flag, not a retail edge.
1.29M disclosures, 3,593 stocks · shorted names −1.9pp/60d vs TOPIX · increase −1.6 ≈ decrease −2.1
Sector short-ratio (空売り比率) — as a contrarian reversal
NO REVERSAL
Extreme short-selling does not precede a bounce. High-short sectors mildly UNDERperform low-short ones; the faint tilt is continuation, not reversal, and it flips sign across eras, so it is noise. The spike is coincident sentiment, already in the price.
high − low short-ratio sectors: −0.49pp over 20d · 148k observations
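The distinction between coincident and predictive that runs through these panels comes down to which return a flow is correlated against: the same week's or the following week's. A minimal sketch with made-up weekly data follows; the helper names are illustrative, not the study's own.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Plain Pearson correlation of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(vx * vy)


def same_and_next_week_corr(flows: list[float], returns: list[float]) -> tuple[float, float]:
    """Correlation of flow with this week's return and with next week's return."""
    same_week = pearson(flows, returns)
    next_week = pearson(flows[:-1], returns[1:])  # flow in week t vs return in week t+1
    return same_week, next_week


# Hypothetical data: a perfectly contrarian trader who sells into every up week.
weekly_returns = [1.0, -2.0, 0.5, 3.0, -1.0, 0.0]
weekly_flows = [-r for r in weekly_returns]

same, nxt = same_and_next_week_corr(weekly_flows, weekly_returns)
print(f"same-week corr: {same:+.2f}")
print(f"next-week corr: {nxt:+.2f}")
```

A strong same-week number describes behaviour; only the next-week number says anything about timing, and that is the one reported as ≈ 0 for every investor category.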

Research/paper, local cached data, no API spend. Published whether it helps or not. Not investment advice.

FINDING 08 (NOT TRADEABLE): Is a margin restriction a tradeable sell signal?
The signal is real. There's still a reason you can't trade it.

Margin restriction (信用規制) — what happens after a stock is restricted

Stocks flagged for overheated speculative margin buying. A real, stable signal — but not one you can trade.

REAL, NOT TRADEABLE

Restricted names spiked a median +35% in the prior month, then gave back ~4% vs TOPIX over the next quarter (3,989 events, both halves).

Horizon	Excess vs TOPIX	% positive
+20 days	−1.7pp	35%
+60 days	−3.8pp	32%

The restriction cools the speculative rally — but these are exactly the un-shortable, squeeze-prone names, and the signal is negative, so there's no long edge. It's an avoidance rule: don't chase a stock that just spiked into a margin restriction.

Entry = a name newly under a hard restriction vs the prior weekly snapshot; excess vs TOPIX from the publication date. Aggregate forward tell only, not tradeable. Not advice.
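The entry rule described above (a name present in this week's restriction snapshot but absent from the prior one) is a set difference, and the excess is a simple return spread. A small sketch, using hypothetical ticker codes and returns:

```python
def newly_restricted(previous: set[str], current: set[str]) -> set[str]:
    """Names under restriction this week that were not restricted last week."""
    return current - previous


def excess_return(stock_return: float, benchmark_return: float) -> float:
    """Forward return of the stock minus the benchmark over the same horizon."""
    return stock_return - benchmark_return


# Hypothetical weekly snapshots of restricted names.
prev_week = {"1111", "2222"}
this_week = {"2222", "3333", "4444"}

entries = newly_restricted(prev_week, this_week)
print(sorted(entries))  # ['3333', '4444']

# Hypothetical +60d returns: stock -2%, benchmark +2%.
print(f"{excess_return(-0.02, 0.02):+.1%}")  # -4.0%
```

Names that stay restricted from one snapshot to the next ("2222" here) do not generate a new event, which keeps each restriction from being counted more than once.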

FINDING 09 (MARGINAL): Can you beat the index by chasing hot sectors?
TOPIX-17 momentum, tested. What showed up wasn't alpha.

Sector rotation (TOPIX-17 momentum) — does chasing the hot sectors beat TOPIX?

Each month rotate into the top-5 trending sectors, equal-weight, cost-charged, vs cap-weight TOPIX.

MARGINAL — beta, not alpha

Rotation edges TOPIX by +0.9pp/yr net and is positive in both halves — but with deeper drawdowns and an identical Calmar (0.40), so it's just higher beta.

Strategy	CAGR	max DD	pre-23	post-23
Rotation top-5 (net)	10.4%	−26%	+2.4%	+22.9%
TOPIX (cap-wt)	9.4%	−23%	+1.7%	+21.6%

The least-bad active result on the paid archive — yet the extra 0.9pp/yr comes with deeper drawdowns and no Calmar improvement. And a robustness check settles it: across 9 lookback × top-K settings, only 1 beat TOPIX (the +0.9pp cell shown); the other 8 lost, mean −0.6pp/yr. The 'edge' is a single lucky parameter — higher beta, not skill, and not a durable edge.
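Two of the checks behind this verdict are easy to reproduce from the table: Calmar (CAGR divided by the absolute maximum drawdown) and the both-halves guard from the lessons section. The sketch below applies them to the rounded figures quoted on this page; the function names are illustrative.

```python
def calmar(cagr: float, max_drawdown: float) -> float:
    """Annualised return per unit of worst peak-to-trough loss."""
    return cagr / abs(max_drawdown)


def positive_in_both_halves(excess_pre: float, excess_post: float) -> bool:
    """Both-halves guard: the excess must be positive before and after the split."""
    return excess_pre > 0 and excess_post > 0


# Rounded figures from the table above.
rotation = calmar(cagr=0.104, max_drawdown=-0.26)
topix = calmar(cagr=0.094, max_drawdown=-0.23)
print(f"Calmar rotation: {rotation:.2f}, TOPIX: {topix:.2f}")

# Rotation excess vs TOPIX, pre-2023 and post-2023.
print(positive_in_both_halves(0.024 - 0.017, 0.229 - 0.216))  # True

# Value+quality tilt from lesson 5: -0.5pp pre-2023, +15.3pp after.
print(positive_in_both_halves(-0.005, 0.153))  # False
```

On the rounded inputs the two Calmar ratios come out at about 0.40 and 0.41, so the extra return buys no improvement in return per unit of drawdown, even though rotation clears the both-halves guard.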

Official TOPIX-17 cap-weighted sector indices, monthly top-5 by 12m momentum, net of a 0.4% round-trip cost on turnover, dividends excluded on both sides. Not advice.
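The monthly selection step described above can be sketched as a rank-and-cut on trailing returns, followed by equal weighting and a cost charge on turnover. Sector labels and return figures below are placeholders, and the rebalancing and turnover bookkeeping of the real test is not reproduced.

```python
def top_k_by_momentum(trailing_returns: dict[str, float], k: int = 5) -> list[str]:
    """Names of the k sectors with the highest trailing (e.g. 12-month) return."""
    ranked = sorted(trailing_returns, key=trailing_returns.get, reverse=True)
    return ranked[:k]


def equal_weights(names: list[str]) -> dict[str, float]:
    """Equal-weight allocation across the selected sectors."""
    return {name: 1.0 / len(names) for name in names}


def net_return(gross_return: float, turnover: float, round_trip_cost: float) -> float:
    """Gross monthly return less the cost charged on the fraction of the book traded."""
    return gross_return - turnover * round_trip_cost


# Placeholder 12-month trailing returns for seven hypothetical sectors.
trailing = {
    "Sector A": 0.10, "Sector B": 0.30, "Sector C": -0.05, "Sector D": 0.20,
    "Sector E": 0.15, "Sector F": 0.02, "Sector G": 0.25,
}

picks = top_k_by_momentum(trailing, k=5)
print(picks)  # ['Sector B', 'Sector G', 'Sector D', 'Sector E', 'Sector A']
print(equal_weights(picks))

# 0.4% round-trip cost as stated above; 50% monthly turnover is a made-up input.
print(f"{net_return(0.02, turnover=0.5, round_trip_cost=0.004):.2%}")
```

The robustness check amounts to re-running this selection over a grid of lookback and k values and comparing each cell with TOPIX, which is how the single winning cell was exposed.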
