Research · answer-check

Agent backtest

Replays the real multi-agent pipeline over a past window, using only the data available on each decision day, then grades it against the prices that actually followed. Because the future is already known, there is no waiting.

decision support

vs NISA S&P (Japan resident, all costs)

verdict: NISA S&P WINS (by >0.5pp)

Should the remaining NISA quota go to this AI agent or to tsumitate (regular-installment) S&P 500? Compares the agent's mean return, net of US slippage, FX, JP broker fees, and 20.315% capital gains tax, against SPY total return (tax-free inside NISA).

Research output, not financial advice. Numbers are model approximations on a small sample.

Sample is small — these numbers are observations, not proof. Do not change real allocation based on this panel until significance tier reaches MEANINGFUL (n ≥ 30) and verdict is consistent across regimes.

mean returns (n=3)

agent gross (raw equity)	+0.50%
agent net of US slippage + JP costs (FX + broker)	-1.10%
agent NISA-OUTSIDE (after 20.315% tax)	-1.17%
agent NISA-INSIDE (tax-free)	-1.10%
NISA S&P (SPY total return, tax-free)	+1.27%

Δ inside NISA (agent − SPY)

-2.37pp

Δ outside NISA (after tax)

-2.44pp

assumptions

JP capital gains tax: 20.315%
FX round-trip: 30bps
JP broker round-trip: 99bps
benchmark: SPY total return (NISA-tax-free)
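
The inside/outside NISA means above can be reproduced from the three per-window net agent returns listed in the scenario table (-2.28%, +1.03%, -2.04%). The sketch below is a minimal illustration, assuming the 20.315% tax is charged per window and only on positive net returns; that rule is an assumption, although it does reproduce the -1.17% figure. The function and variable names are illustrative, not the panel's own code.

```python
from statistics import mean

JP_CAPITAL_GAINS_TAX = 0.20315  # rate from the assumptions list


def after_tax(net_return_pct: float, tax_rate: float = JP_CAPITAL_GAINS_TAX) -> float:
    """Tax gains only; losses pass through unchanged (assumed: no loss offset)."""
    if net_return_pct > 0:
        return net_return_pct * (1.0 - tax_rate)
    return net_return_pct


# Per-window agent returns, net of slippage, FX and broker costs (in %).
agent_net = [-2.28, 1.03, -2.04]
spy_mean = 1.27  # mean SPY total return reported by the panel (in %)

inside_nisa = mean(agent_net)
outside_nisa = mean(after_tax(r) for r in agent_net)

print(f"inside NISA:  {inside_nisa:+.2f}%  (vs SPY {inside_nisa - spy_mean:+.2f}pp)")
print(f"outside NISA: {outside_nisa:+.2f}%  (vs SPY {outside_nisa - spy_mean:+.2f}pp)")
```

Only the one winning window is taxed, which is why the outside-NISA mean is just 0.07pp worse than the inside-NISA mean.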

decision support — interactive

Scenario simulator: NISA S&P vs AI agent mix × leverage

Move the sliders to see how a mix of (NISA S&P 500) + (AI agent at chosen leverage, inside/outside NISA) would have played out across the same 3 backtest windows. Pure math from existing data — no LLM re-run.

Research output, not financial advice. Leverage math is window-level approximation; ruin = AI portion hit ≥100% drawdown.

Same small-sample caveat: do not change real NISA allocation based on this slider until significance tier reaches MEANINGFUL (n ≥ 30).

AI allocation: 0%

0% (all NISA) to 100% (all AI)

leverage on AI: 1.0x

1x to 4x

AI: inside NISA / outside

window	regime	NISA S&P	AI (levered)	combined	Δ vs NISA	ruin?
2026-03-17 → 2026-03-30	DOWN	-5.22%	-2.28%	-5.22%	+0.00pp	—
2026-04-06 → 2026-04-17	UP	+6.99%	+1.03%	+6.99%	+0.00pp	—
2026-05-06 → 2026-05-19	SIDE	+2.06%	-2.04%	+2.06%	+0.00pp	—
mean (n=3)		+1.27%	-1.10%	+1.27%	+0.00pp	0/3

mean combined return

+1.27%

mean Δ vs NISA S&P

+0.00pp

worst window: +0.00pp

ruin (AI portion ≥100% DD)

0/3
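
The slider arithmetic can be sketched as a weighted mix per window. This is a minimal sketch under stated assumptions: leverage scales the AI window return linearly with no financing cost, and a levered loss of 100% or more is capped at -100% and flagged as ruin. The function name `combined_return` is hypothetical.

```python
def combined_return(nisa_pct, ai_pct, ai_weight, leverage=1.0):
    """Return (combined %, ruined?) for one window.

    ai_weight is the AI allocation in [0, 1]; the rest stays in NISA S&P.
    Assumption: levered AI return = leverage * window return, floored at -100%.
    """
    levered_ai = ai_pct * leverage
    ruined = levered_ai <= -100.0
    if ruined:
        levered_ai = -100.0
    combined = (1.0 - ai_weight) * nisa_pct + ai_weight * levered_ai
    return combined, ruined


# (NISA S&P %, AI net %) for the three windows in the table above.
windows = [(-5.22, -2.28), (6.99, 1.03), (2.06, -2.04)]

for ai_weight in (0.0, 0.5):
    results = [combined_return(n, a, ai_weight) for n, a in windows]
    mean_combined = sum(c for c, _ in results) / len(results)
    ruins = sum(1 for _, r in results if r)
    print(f"AI {ai_weight:.0%}: mean combined {mean_combined:+.2f}%, ruin {ruins}/{len(results)}")
```

With the allocation at 0%, the combined column equals the NISA S&P column and every Δ is +0.00pp, which is the state the table above shows.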

SCOPE: Backtest is a DEGRADED variant of the live agent, not a copy. All 4 analyst lanes (bull/bear/technical/fundamental) still run, but: fundamental gets ALL-NONE inputs (lane functionally dead); bull/bear see real technicals + null fundamentals + empty headlines (partially degraded); only technical_analyst is fully fed. The trader effectively decides on a TECHNICAL-HEAVY brief set. Universe is 15 mega-cap tech (Mag7 + AI), highly correlated. Fees model: ~5bps one-way slippage approximation. Macro context (World Monitor) is fed to bull/bear/trader point-in-time: each decision day pulls WM's ?as_of= reconstruction (macro regime + 1d/1w asset bias rebuilt from daily history as of that day). Approximations: news sentiment uses a VIX proxy (no historical news archive), and WM's 10-min horizon / sizing / signal-quality are omitted in as_of mode. Results DO NOT generalize to the live agent or to broader markets.

Aggregate across LLM strategy windows (demo excluded)

sample tier: ANECDOTE (n<5) (n=4)

Sample is too small for statistical significance. These numbers are observations, NOT proof of edge or skill. Read them as 'in this handful of windows so far'.

windows

4

mean α (gross)

-14.55pp

95% CI: [-38.10, +2.35]

mean α (net of fees)

-14.66pp

median α

-7.89pp

α std

24.18pp

α > 0 hit rate

25%

agent return / benchmark return

+2.20% / +16.75%

best window: +6.72pp (ollama-20260317-20260330)

worst window: -49.13pp (ollama-stride21-20250102-20260106)
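
The aggregate figures follow from the four per-window gross α values reported in the window panels (+6.72, -10.77, -5.01, -49.13pp). The sketch below recomputes them with Python's standard library, assuming the reported std is the sample standard deviation (n − 1 denominator). The 95% CI is not reproduced because the panel does not state how it is computed.

```python
from statistics import mean, median, stdev

# Gross alpha per LLM window, in percentage points (from the window panels).
alphas = {
    "2026-03-17 → 2026-03-30": 6.72,
    "2026-04-06 → 2026-04-17": -10.77,
    "2026-05-06 → 2026-05-19": -5.01,
    "2025-01-02 → 2026-01-06": -49.13,
}

values = list(alphas.values())
mean_alpha = mean(values)
median_alpha = median(values)
alpha_std = stdev(values)  # sample standard deviation (assumed)
hit_rate = sum(1 for v in values if v > 0) / len(values)

print(f"mean α   {mean_alpha:+.2f}pp")
print(f"median α {median_alpha:+.2f}pp")
print(f"α std    {alpha_std:.2f}pp")
print(f"hit rate {hit_rate:.0%}")
```

The single long window dominates the mean, which is why the median (-7.89pp) sits so far from it.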

Conviction calibration (pooled decisions across all LLM windows)

Mean t+7d forward price move grouped by conviction (0-1). Calibrated agent: BUY high should beat BUY low; SELL high should fall more than SELL low.

action	low (0-0.33)	mid (0.33-0.67)	high (0.67-1)
BUY	— (n=0)	+2.26% (n=8)	+6.55% (n=26)
SELL	— (n=0)	+10.99% (n=3)	+3.11% (n=51)
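
The grouping behind this table can be sketched as follows. The records are hypothetical toy values, not the backtest's decisions, and the bucket edges are assumed to be half-open (a conviction of exactly 0.33 counts as mid, 0.67 as high); the panel does not say how boundaries are treated.

```python
from collections import defaultdict
from statistics import mean


def bucket(conviction: float) -> str:
    """Map a conviction in [0, 1] to low / mid / high (edge handling assumed)."""
    if conviction < 0.33:
        return "low"
    if conviction < 0.67:
        return "mid"
    return "high"


def calibration_table(decisions):
    """Group (action, conviction, t+7d move %) records; return mean move and count."""
    groups = defaultdict(list)
    for action, conviction, forward_pct in decisions:
        groups[(action, bucket(conviction))].append(forward_pct)
    return {key: (mean(vals), len(vals)) for key, vals in groups.items()}


# Hypothetical toy records for illustration only.
toy = [
    ("BUY", 0.50, 1.0),
    ("BUY", 0.90, 4.0),
    ("BUY", 0.80, 2.0),
    ("SELL", 0.70, -1.5),
]
table = calibration_table(toy)
for (action, tier), (avg, n) in sorted(table.items()):
    print(f"{action:4s} {tier:4s} {avg:+.2f}% (n={n})")
```

In the real table, BUY high does beat BUY mid, but both SELL buckets show positive forward moves, meaning sold names rose on average.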

by regime (benchmark return)

regime	n	α	hit	agent / bench
DOWN (bench < −3%)	1	+6.72pp	100%	-0.86% / -7.59%
SIDEWAYS (±3%)	0	—	—	— / —
UP (bench > +3%)	3	-21.64pp	0%	+3.23% / +24.86%
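
The regime split uses the benchmark return with a ±3% band. A minimal sketch of the classification and the per-regime mean α, assuming that a benchmark return of exactly ±3% counts as SIDEWAYS (the table does not specify the boundary case):

```python
from statistics import mean


def classify_regime(bench_return_pct: float, band: float = 3.0) -> str:
    """DOWN below -band, UP above +band, otherwise SIDEWAYS."""
    if bench_return_pct < -band:
        return "DOWN"
    if bench_return_pct > band:
        return "UP"
    return "SIDEWAYS"


# (benchmark return %, gross alpha pp) for the four LLM windows.
windows = [(-7.59, 6.72), (13.78, -10.77), (4.36, -5.01), (56.45, -49.13)]

by_regime = {}
for bench, alpha in windows:
    by_regime.setdefault(classify_regime(bench), []).append(alpha)

for regime, alphas in by_regime.items():
    hit = sum(1 for a in alphas if a > 0) / len(alphas)
    print(f"{regime}: n={len(alphas)} mean α {mean(alphas):+.2f}pp hit {hit:.0%}")
```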

phase 2 — pick quality only

Conviction filter sweep — BUY pick quality alone

best threshold (mean α): T=0 → -10.93pp

For each window, hypothetical return = Σ (BUY.target_weight × t+7d) for BUYs with conviction ≥ threshold T. Cash earns 0% on the rest. SELL/HOLD ignored. Answers: 'were the high-conviction BUYs actually good picks vs the bench?' — NOT a full portfolio replay.

Research output, not financial advice. Single-period model with strong simplifications.

Sample is small. Even if pick quality looks positive, it does NOT mean the full agent or a real portfolio would beat NISA S&P. Only the BUY picks are evaluated here, in isolation, for ~7 trading days.

window	regime	bench	full agent return	full agent α	T=0	T=0.33	T=0.67	T=0.8
2026-03-17 → 2026-03-30	DOWN	-7.59%	-0.86%	+6.73pp	+8.47pp (n=1)	+8.47pp (n=1)	+8.47pp (n=1)	+7.59pp (n=0)
2026-04-06 → 2026-04-17	UP	+13.78%	+3.01%	-10.77pp	-0.26pp (n=15)	-0.26pp (n=15)	-0.79pp (n=14)	-13.78pp (n=0)
2026-05-06 → 2026-05-19	UP	+4.36%	-0.66%	-5.01pp	+4.51pp (n=10)	+4.51pp (n=10)	+1.72pp (n=5)	-4.36pp (n=0)
2025-01-02 → 2026-01-06	UP	+56.45%	+7.32%	-49.13pp	-56.45pp (n=0)	-56.45pp (n=0)	-56.45pp (n=0)	-56.45pp (n=0)
mean filtered α					-10.93pp	-10.93pp	-11.76pp	-16.75pp

model assumptions

approximation: single-period BUY-picker sim (each kept BUY held ~7d, cash 0%, SELL/HOLD ignored). NOT a full broker replay — answers 'were the high-conviction BUYs actually good picks?' only.
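
The sweep formula can be sketched directly: keep BUYs with conviction ≥ T, sum target_weight × t+7d move, treat everything else as 0% cash, and subtract the benchmark. The pick records below are hypothetical toy values, and the function name `filtered_alpha` is illustrative. The last lines recompute the table's T=0 mean from the four per-window values shown.

```python
from statistics import mean


def filtered_alpha(buys, bench_pct, threshold):
    """buys: iterable of (conviction, target_weight, t+7d move %).

    Returns (alpha in pp, number of BUYs kept). With no BUY kept the
    hypothetical return is 0% (all cash), so alpha is simply -bench.
    """
    kept = [(w, r) for conviction, w, r in buys if conviction >= threshold]
    hypothetical_return = sum(w * r for w, r in kept)
    return hypothetical_return - bench_pct, len(kept)


# Hypothetical toy picks for illustration only.
toy_buys = [(0.90, 0.10, 5.0), (0.50, 0.20, -2.0)]
for t in (0.0, 0.33, 0.67, 0.8):
    alpha, n = filtered_alpha(toy_buys, bench_pct=1.0, threshold=t)
    print(f"T={t}: {alpha:+.2f}pp (n={n})")

# Mean filtered alpha at T=0 from the table's four windows.
t0_alphas = [8.47, -0.26, 4.51, -56.45]
print(f"mean filtered α at T=0: {mean(t0_alphas):+.2f}pp")
```

The all-cash case explains the n=0 cells: a window with no qualifying BUY scores exactly the negative of its benchmark return, so the long 2025-01-02 → 2026-01-06 window contributes -56.45pp at every threshold.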

γ ablation

ablation: lanes_off

conclusion: SIGNAL NEUTRAL (|Δ| ≤ 0.5pp)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-3.02pp

ablated · mean α

-2.94pp

Δ (ablated − vanilla)

+0.08pp

window	regime	bench	vanilla equity	ablated equity	vanilla α	ablated α	Δ α
2026-03-17 → 2026-03-30	DOWN	-7.59%	-0.86%	-0.94%	+6.72pp	+6.65pp	-0.07pp
2026-04-06 → 2026-04-17	UP	+13.78%	+3.01%	+3.94%	-10.77pp	-9.84pp	+0.92pp
2026-05-06 → 2026-05-19	UP	+4.36%	-0.66%	-1.27%	-5.01pp	-5.63pp	-0.61pp
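
The conclusion label follows from the sign convention and the 0.5pp neutral band. A minimal sketch, assuming the band applies to the difference of mean α values and that anything outside it is labelled by sign; the function name `ablation_verdict` is hypothetical.

```python
from statistics import mean


def ablation_verdict(vanilla_alphas, ablated_alphas, neutral_band=0.5):
    """Return (delta in pp, label), where delta = mean(ablated) - mean(vanilla).

    Negative delta: ablation hurt alpha, so the removed signal was valuable.
    """
    delta = mean(ablated_alphas) - mean(vanilla_alphas)
    if abs(delta) <= neutral_band:
        label = "SIGNAL NEUTRAL"
    elif delta < 0:
        label = "SIGNAL VALUABLE"
    else:
        label = "SIGNAL HARMFUL"
    return delta, label


# Per-window alpha (pp) from the lanes_off table above.
vanilla = [6.72, -10.77, -5.01]
ablated = [6.65, -9.84, -5.63]

delta, label = ablation_verdict(vanilla, ablated)
print(f"Δ = {delta:+.2f}pp -> {label}")
```

With one to three windows per ablation, a single window can flip the label, so these verdicts carry the same small-sample caveat as the rest of the page.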

γ ablation

ablation: model_deepseek_r1_14b

conclusion: SIGNAL VALUABLE (mean α dropped without it)

vanilla · mean α

-5.01pp

ablated · mean α

-6.00pp

Δ (ablated − vanilla)

-0.99pp

window	regime	bench	vanilla equity	ablated equity	vanilla α	ablated α	Δ α	blocked
2026-05-06 → 2026-05-19	UP	+4.36%	-0.66%	-1.65%	-5.01pp	-6.00pp	-0.99pp	0

γ ablation

ablation: model_qwen7b

conclusion: SIGNAL VALUABLE (mean α dropped without it)

vanilla · mean α

-5.01pp

ablated · mean α

-6.25pp

Δ (ablated − vanilla)

-1.24pp

window	regime	bench	vanilla equity	ablated equity	vanilla α	ablated α	Δ α	blocked
2026-05-06 → 2026-05-19	UP	+4.36%	-0.66%	-1.89%	-5.01pp	-6.25pp	-1.24pp	0

γ ablation

SELL signal removed (vanilla vs --no-sell)

conclusion: SIGNAL VALUABLE (mean α dropped without it)

vanilla · mean α

-3.02pp

ablated · mean α

-4.14pp

Δ (ablated − vanilla)

-1.13pp

window	regime	bench	vanilla equity	ablated equity	vanilla α	ablated α	Δ α	blocked
2026-03-17 → 2026-03-30	DOWN	-7.59%	-0.86%	-4.29%	+6.72pp	+3.30pp	-3.42pp	9
2026-04-06 → 2026-04-17	UP	+13.78%	+3.01%	+3.82%	-10.77pp	-9.96pp	+0.80pp	2
2026-05-06 → 2026-05-19	UP	+4.36%	-0.66%	-1.42%	-5.01pp	-5.78pp	-0.76pp	0

Total blocked: 11

γ ablation

ablation: picks_only

conclusion: SIGNAL HARMFUL (mean α improved without it)

vanilla · mean α

-3.02pp

ablated · mean α

-0.53pp

Δ (ablated − vanilla)

+2.49pp

window	regime	bench	vanilla equity	ablated equity	vanilla α	ablated α	Δ α	blocked
2026-03-17 → 2026-03-30	DOWN	-10.67%	-0.86%	-1.17%	+6.72pp	+9.51pp	+2.79pp	2
2026-04-06 → 2026-04-17	UP	+15.58%	+3.01%	+5.12%	-10.77pp	-10.46pp	+0.30pp	2
2026-05-06 → 2026-05-19	SIDEWAYS	-1.77%	-0.66%	-2.42%	-5.01pp	-0.65pp	+4.37pp	2

Total blocked: 6

How to read it

Portfolio result: the agent runs as a paper trader with $100k starting cash; equity is marked to market each day. The dashed line is the equal-weight buy-and-hold benchmark over the same window.
α (alpha) = agent_return − benchmark_return, in percentage points. Positive = beat the basket; negative = left money on the table.
Each row of the matrix is the agent's decision (BUY / HOLD / SELL); each column is the average price move 1, 3, 7 days after.
No leakage: prices truncated to the decision day; fundamentals + news disabled (no clean historical-as-of feed).
A handful of windows is a handful of decisions — indicative only, before fees/slippage, NOT proof of edge.
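
The portfolio figures in each window panel reduce to a few lines of arithmetic. The sketch below derives the return and α for the 2025-01-02 → 2026-01-06 window from its start and end equity, and shows one common way to compute max drawdown from a daily equity curve. The equity curve is hypothetical, and the peak-to-trough definition of drawdown is an assumption about what the panels report.

```python
def pct_return(start_equity: float, end_equity: float) -> float:
    """Simple return in percent."""
    return (end_equity / start_equity - 1.0) * 100.0


def alpha_pp(agent_return_pct: float, benchmark_return_pct: float) -> float:
    """Alpha in percentage points: agent return minus benchmark return."""
    return agent_return_pct - benchmark_return_pct


def max_drawdown_pct(equity_curve) -> float:
    """Largest peak-to-trough decline in percent (assumed definition)."""
    peak = equity_curve[0]
    worst = 0.0
    for equity in equity_curve:
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak * 100.0)
    return worst


agent = pct_return(100_000, 107_321)   # paper portfolio, $100k start
alpha = alpha_pp(agent, 56.45)         # benchmark: equal-weight buy-and-hold
print(f"agent {agent:+.2f}%  α {alpha:+.2f}pp")

# Hypothetical daily equity marks, for illustration only.
curve = [100_000, 101_000, 99_485, 100_500]
print(f"max drawdown {max_drawdown_pct(curve):.2f}%")
```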

ollama · 2025-01-02 → 2026-01-06 · Decisions: 184 · closed: 184

Portfolio result (simulated paper, $100k start)

$100,000 → $107,321

+7.32%

max drawdown 0.61% · 13 trading days

α vs buy-and-hold: -49.13pp · net α (~5bps/trade): -49.23pp · benchmark: +56.45%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): n/a · SELL direction: n/a

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (6 decisions)	—	—	—
HOLD (159 decisions)	—	—	—
SELL (19 decisions)	—	—	—

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 159 · SELL 19

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled
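
The price truncation can be sketched as a simple date filter: the agent sees only rows dated on or before the decision day, while the grader later reads the rows that follow. The price history below is hypothetical, the function names are illustrative, and horizons are assumed to count trading rows rather than calendar days.

```python
from datetime import date


def visible_prices(history, decision_day):
    """Rows the agent may see: everything dated on or before the decision day."""
    return [(d, p) for d, p in history if d <= decision_day]


def forward_move_pct(history, decision_day, horizon):
    """Close-to-close move from the decision day to `horizon` trading rows later."""
    days = [d for d, _ in history]
    i = days.index(decision_day)
    if i + horizon >= len(history):
        return None  # not enough later data to grade this horizon
    start, end = history[i][1], history[i + horizon][1]
    return (end / start - 1.0) * 100.0


# Hypothetical closes for one ticker, for illustration only.
history = [
    (date(2026, 5, 6), 100.0),
    (date(2026, 5, 7), 102.0),
    (date(2026, 5, 8), 101.0),
    (date(2026, 5, 11), 105.0),
    (date(2026, 5, 12), 110.0),
]
decision_day = date(2026, 5, 7)

seen = visible_prices(history, decision_day)            # agent input
graded = forward_move_pct(history, decision_day, 3)     # grader only
print(f"agent sees {len(seen)} rows; t+3d move {graded:+.2f}%")
```

Keeping the truncation and the grading in separate functions makes it harder for future prices to reach the agent's input by accident.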

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: picks_only · 2026-05-06 → 2026-05-19 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $97,582

-2.42%

max drawdown 4.40% · 10 trading days

α vs buy-and-hold: -0.65pp · net α (~5bps/trade): -0.75pp · benchmark: -1.77%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.86pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (10 decisions)	-0.74%	-1.82%	+0.10%
HOLD (128 decisions)	+0.59%	+1.32%	+3.96%
SELL (10 decisions)	-0.10%	+1.56%	+2.05%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 10 · HOLD 128 · SELL 10

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.
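
The per-window verdict fields can be derived from the t+7d column of the answer-check table. A minimal sketch: BUY vs HOLD is the difference of the two t+7d averages, and SELL direction depends on the sign of the SELL t+7d average. The 0.5pp flat band is an assumption, chosen only because it is consistent with the panels on this page (-0.14% and +0.26% are labelled flat, -0.73% is labelled right); the function names are hypothetical.

```python
def buy_vs_hold_pp(buy_t7_pct, hold_t7_pct):
    """BUY minus HOLD average t+7d move, or None when either side has no data."""
    if buy_t7_pct is None or hold_t7_pct is None:
        return None
    return buy_t7_pct - hold_t7_pct


def sell_direction(sell_t7_pct, flat_band=0.5):
    """'right' if sold names fell, 'wrong' if they rose, 'flat' inside the band."""
    if sell_t7_pct is None:
        return "n/a"
    if abs(sell_t7_pct) <= flat_band:
        return "flat"
    return "right" if sell_t7_pct < 0 else "wrong"


# t+7d averages from the picks_only 2026-05-06 → 2026-05-19 panel above.
edge = buy_vs_hold_pp(0.10, 3.96)
print(f"BUY vs HOLD (t+7d): {edge:+.2f}pp, SELL direction: {sell_direction(2.05)}")
```

Because the panels compute from unrounded values, recomputing from the displayed percentages can differ from the displayed verdict by 0.01pp.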

ollama · ablation: picks_only · 2026-04-06 → 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $105,123

+5.12%

max drawdown 0.16% · 10 trading days

α vs buy-and-hold: -10.46pp · net α (~5bps/trade): -10.58pp · benchmark: +15.58%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.14pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (11 decisions)	+1.53%	+3.64%	+6.55%
HOLD (127 decisions)	+1.58%	+4.43%	+9.69%
SELL (10 decisions)	-0.61%	+3.70%	+11.02%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 11 · HOLD 127 · SELL 10

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: picks_only · 2026-03-17 → 2026-03-30 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $98,835

-1.17%

max drawdown 1.49% · 10 trading days

α vs buy-and-hold: +9.51pp · net α (~5bps/trade): +9.45pp · benchmark: -10.67%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -4.97pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (1 decision)	+1.91%	+5.28%	-6.36%
HOLD (132 decisions)	-0.68%	-1.82%	-1.39%
SELL (15 decisions)	-3.68%	-2.55%	-2.97%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 1 · HOLD 132 · SELL 15

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: model_deepseek_r1_14b · 2026-05-06 → 2026-05-19 · Decisions: 111 · closed: 111

Portfolio result (simulated paper, $100k start)

$100,000 → $98,353

-1.65%

max drawdown 1.65% · 10 trading days

α vs buy-and-hold: -6.00pp · net α (~5bps/trade): -6.05pp · benchmark: +4.36%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -5.81pp · SELL direction: n/a

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (6 decisions)	+2.05%	-1.98%	-1.49%
HOLD (105 decisions)	+0.60%	+1.79%	+4.32%
SELL (0 decisions)	—	—	—

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 105 · SELL 0

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: model_qwen7b · 2026-05-06 → 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 → $98,106

-1.89%

max drawdown 4.34% · 10 trading days

α vs buy-and-hold: -6.25pp · net α (~5bps/trade): -6.74pp · benchmark: +4.36%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.50pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (64 decisions)	+0.81%	+0.53%	+2.97%
HOLD (85 decisions)	+0.16%	+1.73%	+4.46%
SELL (1 decision)	+0.19%	+1.14%	+2.15%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 64 · HOLD 85 · SELL 1

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-05-06 → 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 → $99,344

-0.66%

max drawdown 3.35% · 10 trading days

α vs buy-and-hold: -5.01pp · net α (~5bps/trade): -5.11pp · benchmark: +4.36%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks beat simply holding over this window.

BUY vs HOLD (t+7d): +2.02pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (10 decisions)	+0.12%	+0.97%	+5.91%
HOLD (133 decisions)	+0.44%	+1.28%	+3.89%
SELL (7 decisions)	+0.76%	+0.29%	-0.73%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 10 · HOLD 133 · SELL 7

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-04-06 → 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $103,011

+3.01%

max drawdown 0.00% · 10 trading days

α vs buy-and-hold: -10.77pp · net α (~5bps/trade): -10.91pp · benchmark: +13.78%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.48pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (15 decisions)	+1.05%	+2.47%	+6.06%
HOLD (124 decisions)	+1.48%	+4.30%	+9.55%
SELL (9 decisions)	+1.38%	+7.76%	+15.38%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 15 · HOLD 124 · SELL 9

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-03-17 → 2026-03-30 · Decisions: 149 · closed: 149

Portfolio result (simulated paper, $100k start)

$100,000 → $99,136

-0.86%

max drawdown 0.86% · 10 trading days

α vs buy-and-hold: +6.72pp · net α (~5bps/trade): +6.62pp · benchmark: -7.59%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks beat simply holding over this window.

BUY vs HOLD (t+7d): +7.93pp · SELL direction: flat (little movement)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (1 decision)	+7.26%	-1.65%	+5.91%
HOLD (121 decisions)	-1.03%	-2.04%	-2.02%
SELL (27 decisions)	-1.16%	-0.86%	-0.14%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 1 · HOLD 121 · SELL 27

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-05-06 → 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 → $98,583

-1.42%

max drawdown 1.70% · 10 trading days

α vs buy-and-hold: -5.78pp · net α (~5bps/trade): -5.85pp · benchmark: +4.36%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -2.65pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (6 decisions)	-0.30%	-2.07%	+1.18%
HOLD (135 decisions)	+0.35%	+1.15%	+3.82%
SELL (9 decisions)	+2.25%	+4.42%	+5.38%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 135 · SELL 9

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-04-06 → 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $103,815

+3.82%

max drawdown 0.39% · 10 trading days

α vs buy-and-hold: -9.96pp · net α (~5bps/trade): -10.04pp · benchmark: +13.78%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.53pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (9 decisions)	+1.92%	+3.93%	+7.98%
HOLD (135 decisions)	+1.44%	+4.35%	+9.51%
SELL (4 decisions)	+0.17%	+4.46%	+14.26%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 9 · HOLD 135 · SELL 4

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

demo · 2026-05-15 → 2026-05-19 · Decisions: 45 · closed: 45

Portfolio result (simulated paper, $100k start)

$100,000 → $100,000

+0.00%

max drawdown 0.00% · 3 trading days

α vs buy-and-hold: +3.97pp · net α (~5bps/trade): +3.97pp · benchmark: -3.97%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): n/a · SELL direction: n/a

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (0 decisions)	—	—	—
HOLD (45 decisions)	-1.54%	+0.98%	+8.52%
SELL (0 decisions)	—	—	—

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 0 · HOLD 45 · SELL 0

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-03-17 → 2026-03-30 · Decisions: 141 · closed: 141

Portfolio result (simulated paper, $100k start)

$100,000 → $95,715

-4.29%

max drawdown 4.29% · 10 trading days

α vs buy-and-hold: +3.30pp · net α (~5bps/trade): +3.21pp · benchmark: -7.59%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -9.18pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (3 decisions)	+2.66%	-4.50%	-10.08%
HOLD (117 decisions)	-0.57%	-1.07%	-0.90%
SELL (21 decisions)	-2.67%	-3.65%	-3.63%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 3 · HOLD 117 · SELL 21

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-04-06 → 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 → $103,935

+3.94%

max drawdown 0.43% · 10 trading days

α vs buy-and-hold: -9.84pp · net α (~5bps/trade): -10.00pp · benchmark: +13.78%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.11pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (15 decisions)	+0.95%	+3.43%	+6.48%
HOLD (119 decisions)	+1.61%	+4.37%	+9.59%
SELL (14 decisions)	+0.44%	+4.87%	+12.48%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 15 · HOLD 119 · SELL 14

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-03-17 → 2026-03-30 · Decisions: 149 · closed: 149

Portfolio result (simulated paper, $100k start)

$100,000 → $99,064

-0.94%

max drawdown 0.94% · 10 trading days

α vs buy-and-hold: +6.65pp · net α (~5bps/trade): +6.53pp · benchmark: -7.59%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -14.04pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (3 decisions)	-2.66%	-6.62%	-15.09%
HOLD (115 decisions)	-0.80%	-1.51%	-1.05%
SELL (31 decisions)	-1.45%	-2.22%	-1.54%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 3 · HOLD 115 · SELL 31

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-05-06 → 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 → $98,729

-1.27%

max drawdown 2.59% · 10 trading days

α vs buy-and-hold: -5.63pp · net α (~5bps/trade): -5.82pp · benchmark: +4.36%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.82pp · SELL direction: flat (little movement)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (16 decisions)	-0.21%	+1.28%	+2.76%
HOLD (114 decisions)	+0.51%	+1.54%	+4.58%
SELL (20 decisions)	+0.53%	-0.66%	+0.26%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 16 · HOLD 114 · SELL 20

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

demo · 2026-05-13 → 2026-05-19 · Decisions: 75 · closed: 75

Portfolio result (simulated paper, $100k start)

$100,000 → $100,000

+0.00%

max drawdown 0.00% · 5 trading days

α vs buy-and-hold: +1.32pp · net α (~5bps/trade): +1.32pp · benchmark: -1.32%

— agent · - - buy-and-hold

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): n/a · SELL direction: n/a

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (0 decisions)	—	—	—
HOLD (75 decisions)	-0.37%	-0.03%	+6.49%
SELL (0 decisions)	—	—	—

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 0 · HOLD 75 · SELL 0

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-05-13 → 2026-05-19 · Decisions: 75 · closed: 75

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -2.96pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

	t+1d	t+3d	t+7d
BUY (8 decisions)	-1.16%	-2.36%	+4.05%
HOLD (56 decisions)	-0.47%	+0.35%	+7.01%
SELL (11 decisions)	+0.74%	-0.26%	+5.63%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 8 · HOLD 56 · SELL 11

Leakage controls

prices: truncated ≤ decision day
fundamentals: disabled
news: disabled

Caveat

Small-sample, price/technical+debate only, before fees/slippage. Indicative, NOT an edge/profitability claim.