ai-stock-agent / dashboard

Research · answer-check

Agent backtest

The real multi-agent pipeline replayed over a PAST window, using only the data available on each decision day, then graded against the prices that actually followed. The future is already known, so there is no waiting.

vs NISA S&P (Japan resident, all costs)

verdict: NISA S&P WINS (by >0.5pp)

Should the remaining NISA quota go to this AI agent or to tsumitate (recurring-purchase) S&P 500? Compares the agent's mean return, net of US slippage, FX, JP broker fees, and 20.315% capital gains tax, against SPY total return (held inside NISA, so tax-free).

Research output, not financial advice. Numbers are model approximations on a small sample.

Sample is small — these numbers are observations, not proof. Do not change real allocation based on this panel until significance tier reaches MEANINGFUL (n ≥ 30) and verdict is consistent across regimes.

mean returns (n=3)

agent gross (raw equity): +0.50%
agent net of US slippage + JP costs (FX + broker): -1.10%
agent NISA-OUTSIDE (after 20.315% tax): -1.17%
agent NISA-INSIDE (tax-free): -1.10%
SPY total return (NISA, tax-free): +1.27%

Δ inside NISA (agent − SPY)

-2.37pp

Δ outside NISA (after tax)

-2.44pp

assumptions

  • JP capital gains tax: 20.315%
  • FX round-trip: 30bps
  • JP broker round-trip: 99bps
  • benchmark: SPY total return (NISA-tax-free)
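
As a rough check on the panel above, the sketch below recomputes the inside/outside-NISA means and the two Δ values. It assumes the per-window net agent returns are the `AI (levered)` values and the SPY returns are the `NISA S&P` values from the scenario table on this page. It also assumes the 20.315% tax is charged only on windows that end with a gain, with no loss offset. That tax treatment is inferred from the displayed numbers, not a documented rule of the dashboard.

```python
from statistics import mean

JP_CAPITAL_GAINS_TAX = 0.20315  # from the assumptions list

# Per-window returns in percent, copied from the scenario table on this page.
agent_net = [-2.28, 1.03, -2.04]  # agent net of slippage, FX and broker costs
spy_total = [-5.22, 6.99, 2.06]   # SPY total return, tax-free inside NISA


def after_tax(ret_pct: float, tax: float = JP_CAPITAL_GAINS_TAX) -> float:
    """Tax gains only; losses pass through unchanged (assumed: no loss offset)."""
    return ret_pct * (1 - tax) if ret_pct > 0 else ret_pct


inside = mean(agent_net)                          # agent held inside NISA
outside = mean(after_tax(r) for r in agent_net)   # agent held outside NISA
spy = mean(spy_total)

delta_inside = inside - spy
delta_outside = outside - spy

print(f"agent NISA-INSIDE  {inside:+.2f}%")
print(f"agent NISA-OUTSIDE {outside:+.2f}%")
print(f"SPY (NISA)         {spy:+.2f}%")
print(f"delta inside  {delta_inside:+.2f}pp")
print(f"delta outside {delta_outside:+.2f}pp")
```

Under these assumptions the sketch lands on the same rounded figures as the panel, which suggests (but does not prove) that the dashboard taxes winning windows only.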

Scenario simulator: NISA S&P vs AI agent mix × leverage

Move the sliders to see how a mix of (NISA S&P 500) + (AI agent at chosen leverage, inside/outside NISA) would have played out across the same 3 backtest windows. Pure math from existing data — no LLM re-run.

Research output, not financial advice. Leverage math is window-level approximation; ruin = AI portion hit ≥100% drawdown.

Same small-sample caveat: do not change real NISA allocation based on this slider until significance tier reaches MEANINGFUL (n ≥ 30).
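
The slider arithmetic described above can be sketched as follows. This is a minimal illustration, not the dashboard's code: it assumes the levered AI return is simply leverage times the window return (one reading of "window-level approximation"), that the AI portion cannot lose more than 100%, and that the `AI (levered)` column in the table below is the 1x, net-of-cost return. The function name `simulate` is hypothetical.

```python
from statistics import mean

# (window, NISA S&P %, AI agent % at 1x) copied from the scenario table.
WINDOWS = [
    ("2026-03-17 to 2026-03-30", -5.22, -2.28),
    ("2026-04-06 to 2026-04-17", 6.99, 1.03),
    ("2026-05-06 to 2026-05-19", 2.06, -2.04),
]


def simulate(ai_share: float, leverage: float):
    """Return rows of (window, combined %, delta vs NISA in pp, ruined)."""
    rows = []
    for window, nisa, ai in WINDOWS:
        levered = leverage * ai         # assumed: return scales linearly
        ruined = levered <= -100.0      # AI portion hit a 100% drawdown
        levered = max(levered, -100.0)  # cannot lose more than the stake
        combined = (1 - ai_share) * nisa + ai_share * levered
        rows.append((window, combined, combined - nisa, ruined))
    return rows


def summarize(rows):
    """Mean combined return, mean delta, worst delta, ruin count."""
    combined = [r[1] for r in rows]
    deltas = [r[2] for r in rows]
    return mean(combined), mean(deltas), min(deltas), sum(r[3] for r in rows)


print(summarize(simulate(ai_share=0.0, leverage=1.0)))  # all NISA
print(summarize(simulate(ai_share=0.5, leverage=2.0)))  # half AI at 2x
```

With `ai_share=0.0` the combined column equals the NISA S&P column and every Δ is zero, which is the state shown in the table below.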

AI share slider: 0% (all NISA) to 100% (all AI)
leverage slider: 1x to 4x

window                    regime  NISA S&P  AI (levered)  combined  Δ vs NISA  ruin?
2026-03-17 to 2026-03-30  DOWN    -5.22%    -2.28%        -5.22%    +0.00pp
2026-04-06 to 2026-04-17  UP      +6.99%    +1.03%        +6.99%    +0.00pp
2026-05-06 to 2026-05-19  SIDE    +2.06%    -2.04%        +2.06%    +0.00pp
mean (n=3)                        +1.27%    -1.10%        +1.27%    +0.00pp    0/3

mean combined return

+1.27%

mean Δ vs NISA S&P

+0.00pp

worst window: +0.00pp

ruin (AI portion ≥100% DD)

0/3

SCOPE: Backtest is a DEGRADED variant of the live agent, not a copy.

  • Lanes: all 4 analyst lanes (bull/bear/technical/fundamental) still run, but fundamental gets ALL-NONE inputs (lane functionally dead); bull/bear see real technicals + null fundamentals + empty headlines (partially degraded); only technical_analyst is fully fed. The trader effectively decides on a TECHNICAL-HEAVY brief set.
  • Universe: 15 mega-cap tech (Mag7 + AI), highly correlated.
  • Fees model: ~5bps one-way slippage approximation.
  • Macro context: World Monitor (WM) is fed to bull/bear/trader point-in-time; each decision day pulls WM's ?as_of= reconstruction (macro regime + 1d/1w asset bias rebuilt from daily history as of that day).
  • Approximations: news sentiment uses a VIX proxy (no historical news archive), and WM's 10-min horizon / sizing / signal-quality are omitted in as_of mode.

Results DO NOT generalize to the live agent or to broader markets.

Aggregate across LLM strategy windows (demo excluded)

sample tier: ANECDOTE (n<5) (n=4)

Sample is too small for statistical significance. These numbers are observations, NOT proof of edge or skill. Read them as 'in this handful of windows so far'.

windows

4

mean α (gross)

-14.55pp

95% CI: [-38.10, +2.35]

mean α (net of fees)

-14.66pp

median α

-7.89pp

α std

24.18pp

α > 0 hit rate

25%

agent return / benchmark return

+2.20% / +16.75%

best window: +6.72pp (ollama-20260317-20260330)

worst window: -49.13pp (ollama-stride21-20250102-20260106)
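
The headline aggregates above can be reproduced from the four per-window α values listed in the conviction filter sweep table further down this page. The sketch assumes the displayed std is the sample (n−1) standard deviation, which is what matches the 24.18pp figure; the 95% CI is not reproduced here because the page does not say how it is computed.

```python
from statistics import mean, median, stdev

# Per-window values in percent / percentage points, from the tables on this page.
agent = [-0.86, 3.01, -0.66, 7.32]
bench = [-7.59, 13.78, 4.36, 56.45]
alpha = [6.72, -10.77, -5.01, -49.13]  # agent return minus benchmark return

mean_alpha = mean(alpha)
median_alpha = median(alpha)
std_alpha = stdev(alpha)  # sample standard deviation (assumed)
hit_rate = sum(a > 0 for a in alpha) / len(alpha)

print(f"mean alpha   {mean_alpha:+.2f}pp")
print(f"median alpha {median_alpha:+.2f}pp")
print(f"alpha std    {std_alpha:.2f}pp")
print(f"hit rate     {hit_rate:.0%}")
print(f"agent / bench {mean(agent):+.2f}% / {mean(bench):+.2f}%")
```

One outlier window (the year-long one at -49.13pp) drives most of the mean and the spread, which is why the median sits well above the mean.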

Conviction calibration (pooled decisions across all LLM windows)

Mean t+7d forward price move grouped by conviction (0-1). Calibrated agent: BUY high should beat BUY low; SELL high should fall more than SELL low.

conviction  low (0-0.33)  mid (0.33-0.67)  high (0.67-1)
BUY         (n=0)         +2.26% (n=8)     +6.55% (n=26)
SELL        (n=0)         +10.99% (n=3)    +3.11% (n=51)
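
A grouping like the one in this panel could be computed roughly as below. The record layout `(action, conviction, fwd_7d_pct)`, the function names, and the sample decisions are all hypothetical; the bucket edges follow the column labels, with each lower edge treated as inclusive (an assumption, since the page does not say which side the edges fall on).

```python
from collections import defaultdict
from statistics import mean


def bucket(conviction: float) -> str:
    """Map a 0-1 conviction score to the low/mid/high column."""
    if conviction < 0.33:
        return "low"
    if conviction < 0.67:
        return "mid"
    return "high"


def calibration_table(decisions):
    """Group t+7d moves by (action, bucket); return {key: (mean %, n)}."""
    groups = defaultdict(list)
    for action, conviction, fwd_7d_pct in decisions:
        groups[(action, bucket(conviction))].append(fwd_7d_pct)
    return {key: (mean(vals), len(vals)) for key, vals in groups.items()}


# Hypothetical pooled decisions, for illustration only.
sample = [
    ("BUY", 0.50, 2.0),
    ("BUY", 0.50, 4.0),
    ("BUY", 0.90, 6.0),
    ("SELL", 0.80, -1.0),
]
table = calibration_table(sample)
for key in sorted(table):
    avg, n = table[key]
    print(key, f"{avg:+.2f}% (n={n})")
```

Reading the real panel with the same rule: BUY high (+6.55%) is above BUY mid (+2.26%), which is the ordering a calibrated agent should show, while both SELL cells are positive, meaning the sold names rose on average.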

by regime (benchmark return)

regime              n  α         hit   agent / bench
DOWN (bench < −3%)  1  +6.72pp   100%  -0.86% / -7.59%
SIDEWAYS (±3%)      0
UP (bench > +3%)    3  -21.64pp  0%    +3.23% / +24.86%
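
The regime split above follows directly from the ±3% thresholds in the row labels. A minimal sketch, using the four window figures from this page; the function names are hypothetical, and a benchmark return of exactly ±3% is assigned to SIDEWAYS here by assumption.

```python
from collections import defaultdict
from statistics import mean


def classify_regime(bench_pct: float, band: float = 3.0) -> str:
    """DOWN below -band, UP above +band, otherwise SIDEWAYS."""
    if bench_pct < -band:
        return "DOWN"
    if bench_pct > band:
        return "UP"
    return "SIDEWAYS"


# (benchmark %, alpha pp) per window, copied from the tables on this page.
windows = [(-7.59, 6.72), (13.78, -10.77), (4.36, -5.01), (56.45, -49.13)]

by_regime = defaultdict(list)
for bench_pct, alpha_pp in windows:
    by_regime[classify_regime(bench_pct)].append(alpha_pp)

summary = {
    regime: (len(a), mean(a), sum(x > 0 for x in a) / len(a))
    for regime, a in by_regime.items()
}
for regime, (n, avg, hit) in summary.items():
    print(f"{regime:9s} n={n} alpha={avg:+.2f}pp hit={hit:.0%}")
```

With a single DOWN window and no SIDEWAYS window, the split mostly shows that the agent lagged in all three rising windows.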

phase 2 — pick quality only

Conviction filter sweep — BUY pick quality alone

best threshold (mean α): T=0 (-10.93pp)

For each window, hypothetical return = Σ (BUY.target_weight × t+7d) for BUYs with conviction ≥ threshold T. Cash earns 0% on the rest. SELL/HOLD ignored. Answers: 'were the high-conviction BUYs actually good picks vs the bench?' — NOT a full portfolio replay.

Research output, not financial advice. Single-period model with strong simplifications.

Sample is small. Even if pick quality looks positive, it does NOT mean the full agent or a real portfolio would beat NISA S&P. Only the BUY picks are evaluated here, in isolation, for ~7 trading days.
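
The sweep formula above (sum of BUY target weight times t+7d move for BUYs at or above the threshold, cash at 0%, minus the benchmark) can be sketched as follows. The tuple layout, the function name `filtered_alpha`, and the two sample picks are hypothetical; only the benchmark values are taken from this page.

```python
def filtered_alpha(buys, bench_pct: float, threshold: float):
    """Return (alpha in pp, number of BUYs kept) for one window.

    buys: list of (conviction, target_weight, fwd_7d_pct) tuples.
    """
    kept = [(w, r) for c, w, r in buys if c >= threshold]
    hypothetical = sum(w * r for w, r in kept)  # uninvested cash earns 0%
    return hypothetical - bench_pct, len(kept)


# Hypothetical picks: 50% weight at conviction 0.9, 25% weight at 0.5.
picks = [(0.9, 0.5, 4.0), (0.5, 0.25, -2.0)]
for t in (0.0, 0.33, 0.67, 0.8):
    alpha_pp, n = filtered_alpha(picks, bench_pct=1.0, threshold=t)
    print(f"T={t}: {alpha_pp:+.2f}pp (n={n})")

# When no BUY passes the filter, the return is 0% and alpha is minus the bench.
print(filtered_alpha([], bench_pct=-7.59, threshold=0.8))
```

The empty-filter case explains the n=0 cells in the table below: with nothing bought, α is just the negated benchmark return, so a falling benchmark yields a positive α and a rising one a negative α.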

window                    regime  bench    full agent return  full agent α  T=0             T=0.33          T=0.67          T=0.8
2026-03-17 to 2026-03-30  DOWN    -7.59%   -0.86%             +6.73pp       +8.47pp (n=1)   +8.47pp (n=1)   +8.47pp (n=1)   +7.59pp (n=0)
2026-04-06 to 2026-04-17  UP      +13.78%  +3.01%             -10.77pp      -0.26pp (n=15)  -0.26pp (n=15)  -0.79pp (n=14)  -13.78pp (n=0)
2026-05-06 to 2026-05-19  UP      +4.36%   -0.66%             -5.01pp       +4.51pp (n=10)  +4.51pp (n=10)  +1.72pp (n=5)   -4.36pp (n=0)
2025-01-02 to 2026-01-06  UP      +56.45%  +7.32%             -49.13pp      -56.45pp (n=0)  -56.45pp (n=0)  -56.45pp (n=0)  -56.45pp (n=0)
mean filtered α                                                             -10.93pp        -10.93pp        -11.76pp        -16.75pp

model assumptions

approximation: single-period BUY-picker sim (each kept BUY held ~7d, cash 0%, SELL/HOLD ignored). NOT a full broker replay — answers 'were the high-conviction BUYs actually good picks?' only.

γ ablation

ablation: lanes_off

conclusion: SIGNAL NEUTRAL (|Δ| ≤ 0.5pp)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-3.02pp

ablated · mean α

-2.94pp

Δ (ablated − vanilla)

+0.08pp

window                    regime  bench    vanilla equity  ablated equity  vanilla α  ablated α  Δ α      blocked
2026-03-17 to 2026-03-30  DOWN    -7.59%   -0.86%          -0.94%          +6.72pp    +6.65pp    -0.07pp  0
2026-04-06 to 2026-04-17  UP      +13.78%  +3.01%          +3.94%          -10.77pp   -9.84pp    +0.92pp  0
2026-05-06 to 2026-05-19  UP      +4.36%   -0.66%          -1.27%          -5.01pp    -5.63pp    -0.61pp  0
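
The Δ and conclusion labels used in these ablation panels can be sketched as below. The 0.5pp neutral band comes from the "SIGNAL NEUTRAL (|Δ| ≤ 0.5pp)" label; treating the band as symmetric, and the function name `ablation_conclusion`, are assumptions. The inputs are the per-window α values from the lanes_off table above and the --no-sell table further down this page.

```python
from statistics import mean

NEUTRAL_BAND_PP = 0.5  # from the "SIGNAL NEUTRAL (|Δ| ≤ 0.5pp)" label


def ablation_conclusion(vanilla, ablated, band: float = NEUTRAL_BAND_PP):
    """Return (delta in pp, label), where delta = mean ablated - mean vanilla."""
    delta = mean(ablated) - mean(vanilla)
    if abs(delta) <= band:
        label = "SIGNAL NEUTRAL"
    elif delta < 0:
        label = "SIGNAL VALUABLE"  # alpha dropped without the signal
    else:
        label = "SIGNAL HARMFUL"   # alpha improved without the signal
    return delta, label


vanilla = [6.72, -10.77, -5.01]
lanes_off = [6.65, -9.84, -5.63]
no_sell = [3.30, -9.96, -5.78]

for name, ablated in (("lanes_off", lanes_off), ("no_sell", no_sell)):
    delta, label = ablation_conclusion(vanilla, ablated)
    print(f"{name}: {delta:+.2f}pp -> {label}")
```

Because the per-window inputs are already rounded, the recomputed means can differ from the displayed ones in the last digit.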

γ ablation

ablation: model_deepseek_r1_14b

conclusion: SIGNAL VALUABLE (mean α dropped without it)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-5.01pp

ablated · mean α

-6.00pp

Δ (ablated − vanilla)

-0.99pp

window                    regime  bench    vanilla equity  ablated equity  vanilla α  ablated α  Δ α      blocked
2026-05-06 to 2026-05-19  UP      +4.36%   -0.66%          -1.65%          -5.01pp    -6.00pp    -0.99pp  0

γ ablation

ablation: model_qwen7b

conclusion: SIGNAL VALUABLE (mean α dropped without it)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-5.01pp

ablated · mean α

-6.25pp

Δ (ablated − vanilla)

-1.24pp

window                    regime  bench    vanilla equity  ablated equity  vanilla α  ablated α  Δ α      blocked
2026-05-06 to 2026-05-19  UP      +4.36%   -0.66%          -1.89%          -5.01pp    -6.25pp    -1.24pp  0

γ ablation

SELL signal removed (vanilla vs --no-sell)

conclusion: SIGNAL VALUABLE (mean α dropped without it)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-3.02pp

ablated · mean α

-4.14pp

Δ (ablated − vanilla)

-1.13pp

window                    regime  bench    vanilla equity  ablated equity  vanilla α  ablated α  Δ α      blocked
2026-03-17 to 2026-03-30  DOWN    -7.59%   -0.86%          -4.29%          +6.72pp    +3.30pp    -3.42pp  9
2026-04-06 to 2026-04-17  UP      +13.78%  +3.01%          +3.82%          -10.77pp   -9.96pp    +0.80pp  2
2026-05-06 to 2026-05-19  UP      +4.36%   -0.66%          -1.42%          -5.01pp    -5.78pp    -0.76pp  0

Total blocked: 11

γ ablation

ablation: picks_only

conclusion: SIGNAL HARMFUL (mean α improved without it)

Same agent, same windows — but the ablated run no-ops every SELL order at the broker (agent reasoning unchanged). The α difference is the SELL signal's contribution. Sign convention: negative Δ = ablation hurt α = signal was valuable.

vanilla · mean α

-3.02pp

ablated · mean α

-0.53pp

Δ (ablated − vanilla)

+2.49pp

window                    regime    bench    vanilla equity  ablated equity  vanilla α  ablated α  Δ α      blocked
2026-03-17 to 2026-03-30  DOWN      -10.67%  -0.86%          -1.17%          +6.72pp    +9.51pp    +2.79pp  2
2026-04-06 to 2026-04-17  UP        +15.58%  +3.01%          +5.12%          -10.77pp   -10.46pp   +0.30pp  2
2026-05-06 to 2026-05-19  SIDEWAYS  -1.77%   -0.66%          -2.42%          -5.01pp    -0.65pp    +4.37pp  2

Total blocked: 6

How to read it

  • Portfolio result: the agent runs as a paper trader with $100k starting cash; equity is marked-to-market each day. The dashed line is the equal-weight buy-and-hold benchmark over the same window.
  • α (alpha) = agent_return − benchmark_return, in percentage points. Positive = beat the basket; negative = left money on the table.
  • Each row of the matrix is the agent's decision (BUY / HOLD / SELL); each column is the average price move 1, 3, 7 days after.
  • No leakage: prices truncated to the decision day; fundamentals + news disabled (no clean historical-as-of feed).
  • A handful of windows is a handful of decisions — indicative only, before fees/slippage, NOT proof of edge.
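
The quantities defined in the list above can be sketched in a few lines. The α formula is the one stated in the list. The equal-weight benchmark is read here as the plain average of per-ticker window returns (equal initial weights, no rebalancing), and max drawdown as the largest peak-to-trough fall in daily equity; both readings, the function names, and the sample prices are assumptions for illustration.

```python
from statistics import mean


def pct_return(start: float, end: float) -> float:
    """Simple return in percent."""
    return (end / start - 1) * 100


def alpha_pp(agent_return_pct: float, benchmark_return_pct: float) -> float:
    """alpha = agent_return - benchmark_return, in percentage points."""
    return agent_return_pct - benchmark_return_pct


def equal_weight_benchmark(start_prices: dict, end_prices: dict) -> float:
    """Average of per-ticker returns over the window (assumed definition)."""
    return mean(pct_return(start_prices[t], end_prices[t]) for t in start_prices)


def max_drawdown_pct(equity: list) -> float:
    """Largest peak-to-trough decline of a daily equity series, in percent."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak * 100)
    return worst


# Year-long window from this page: $100,000 -> $107,321, benchmark +56.45%.
agent = pct_return(100_000, 107_321)
print(f"agent {agent:+.2f}%  alpha {alpha_pp(agent, 56.45):+.2f}pp")

# Hypothetical two-ticker basket and equity curve.
print(equal_weight_benchmark({"A": 100, "B": 50}, {"A": 110, "B": 45}))
print(max_drawdown_pct([100_000, 110_000, 99_000, 105_000]))
```

A low max drawdown next to a large negative α, as in the year-long window, usually means the agent sat mostly in cash while the basket rallied.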
ollama · 2025-01-02 to 2026-01-06 · Decisions: 184 · closed: 184

Portfolio result (simulated paper, $100k start)

$100,000 $107,321

+7.32%

max drawdown 0.61% · 13 trading days

α vs buy-and-hold: -49.13pp · net α (~5bps/trade): -49.23pp · benchmark: +56.45%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): (no value) · SELL direction: n/a

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     6
HOLD    159
SELL    19

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 159 · SELL 19

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: picks_only · 2026-05-06 to 2026-05-19 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $97,582

-2.42%

max drawdown 4.40% · 10 trading days

α vs buy-and-hold: -0.65pp · net α (~5bps/trade): -0.75pp · benchmark: -1.77%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.86pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     10         -0.74%  -1.82%  +0.10%
HOLD    128        +0.59%  +1.32%  +3.96%
SELL    10         -0.10%  +1.56%  +2.05%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 10 · HOLD 128 · SELL 10

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.
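
The verdict line in the panel above can be derived from the t+7d column of its answer-check table. A minimal sketch: the BUY vs HOLD spread is the BUY average minus the HOLD average, and the SELL direction depends on whether sold names fell or rose afterwards. The 0.5pp "flat" band and the function names are hypothetical; the band was chosen only because it is consistent with the labels shown on this page.

```python
FLAT_BAND_PP = 0.5  # hypothetical threshold for "flat", not from the source


def buy_vs_hold(buy_7d_pct: float, hold_7d_pct: float) -> float:
    """Spread in pp between average BUY and average HOLD t+7d moves."""
    return buy_7d_pct - hold_7d_pct


def sell_direction(sell_7d_pct, band: float = FLAT_BAND_PP) -> str:
    """Label the average t+7d move of sold names."""
    if sell_7d_pct is None:       # no SELL decisions in the window
        return "n/a"
    if abs(sell_7d_pct) < band:
        return "flat"
    return "right" if sell_7d_pct < 0 else "wrong"


# t+7d averages from the picks_only panel above (2026-05-06 to 2026-05-19).
spread = buy_vs_hold(0.10, 3.96)
print(f"BUY vs HOLD (t+7d): {spread:+.2f}pp")
print("SELL direction:", sell_direction(2.05))
```

A negative spread means the names the agent bought did worse over seven days than the names it left alone.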

ollama · ablation: picks_only · 2026-04-06 to 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $105,123

+5.12%

max drawdown 0.16% · 10 trading days

α vs buy-and-hold: -10.46pp · net α (~5bps/trade): -10.58pp · benchmark: +15.58%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.14pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     11         +1.53%  +3.64%  +6.55%
HOLD    127        +1.58%  +4.43%  +9.69%
SELL    10         -0.61%  +3.70%  +11.02%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 11 · HOLD 127 · SELL 10

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: picks_only · 2026-03-17 to 2026-03-30 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $98,835

-1.17%

max drawdown 1.49% · 10 trading days

α vs buy-and-hold: +9.51pp · net α (~5bps/trade): +9.45pp · benchmark: -10.67%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -4.97pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     1          +1.91%  +5.28%  -6.36%
HOLD    132        -0.68%  -1.82%  -1.39%
SELL    15         -3.68%  -2.55%  -2.97%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 1 · HOLD 132 · SELL 15

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: model_deepseek_r1_14b · 2026-05-06 to 2026-05-19 · Decisions: 111 · closed: 111

Portfolio result (simulated paper, $100k start)

$100,000 $98,353

-1.65%

max drawdown 1.65% · 10 trading days

α vs buy-and-hold: -6.00pp · net α (~5bps/trade): -6.05pp · benchmark: +4.36%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -5.81pp · SELL direction: n/a

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     6          +2.05%  -1.98%  -1.49%
HOLD    105        +0.60%  +1.79%  +4.32%
SELL    0

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 105 · SELL 0

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: model_qwen7b · 2026-05-06 to 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 $98,106

-1.89%

max drawdown 4.34% · 10 trading days

α vs buy-and-hold: -6.25pp · net α (~5bps/trade): -6.74pp · benchmark: +4.36%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.50pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     64         +0.81%  +0.53%  +2.97%
HOLD    85         +0.16%  +1.73%  +4.46%
SELL    1          +0.19%  +1.14%  +2.15%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 64 · HOLD 85 · SELL 1

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-05-06 to 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 $99,344

-0.66%

max drawdown 3.35% · 10 trading days

α vs buy-and-hold: -5.01pp · net α (~5bps/trade): -5.11pp · benchmark: +4.36%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks beat simply holding over this window.

BUY vs HOLD (t+7d): +2.02pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     10         +0.12%  +0.97%  +5.91%
HOLD    133        +0.44%  +1.28%  +3.89%
SELL    7          +0.76%  +0.29%  -0.73%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 10 · HOLD 133 · SELL 7

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-04-06 to 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $103,011

+3.01%

max drawdown 0.00% · 10 trading days

α vs buy-and-hold: -10.77pp · net α (~5bps/trade): -10.91pp · benchmark: +13.78%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.48pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     15         +1.05%  +2.47%  +6.06%
HOLD    124        +1.48%  +4.30%  +9.55%
SELL    9          +1.38%  +7.76%  +15.38%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 15 · HOLD 124 · SELL 9

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-03-17 to 2026-03-30 · Decisions: 149 · closed: 149

Portfolio result (simulated paper, $100k start)

$100,000 $99,136

-0.86%

max drawdown 0.86% · 10 trading days

α vs buy-and-hold: +6.72pp · net α (~5bps/trade): +6.62pp · benchmark: -7.59%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks beat simply holding over this window.

BUY vs HOLD (t+7d): +7.93pp · SELL direction: flat (little movement)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     1          +7.26%  -1.65%  +5.91%
HOLD    121        -1.03%  -2.04%  -2.02%
SELL    27         -1.16%  -0.86%  -0.14%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 1 · HOLD 121 · SELL 27

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-05-06 to 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 $98,583

-1.42%

max drawdown 1.70% · 10 trading days

α vs buy-and-hold: -5.78pp · net α (~5bps/trade): -5.85pp · benchmark: +4.36%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -2.65pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     6          -0.30%  -2.07%  +1.18%
HOLD    135        +0.35%  +1.15%  +3.82%
SELL    9          +2.25%  +4.42%  +5.38%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 6 · HOLD 135 · SELL 9

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-04-06 to 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $103,815

+3.82%

max drawdown 0.39% · 10 trading days

α vs buy-and-hold: -9.96pp · net α (~5bps/trade): -10.04pp · benchmark: +13.78%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.53pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     9          +1.92%  +3.93%  +7.98%
HOLD    135        +1.44%  +4.35%  +9.51%
SELL    4          +0.17%  +4.46%  +14.26%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 9 · HOLD 135 · SELL 4

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

demo · 2026-05-15 to 2026-05-19 · Decisions: 45 · closed: 45

Portfolio result (simulated paper, $100k start)

$100,000 $100,000

+0.00%

max drawdown 0.00% · 3 trading days

α vs buy-and-hold: +3.97pp · net α (~5bps/trade): +3.97pp · benchmark: -3.97%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): (no value) · SELL direction: n/a

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     0
HOLD    45         -1.54%  +0.98%  +8.52%
SELL    0

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 0 · HOLD 45 · SELL 0

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: no_sell · 2026-03-17 to 2026-03-30 · Decisions: 141 · closed: 141

Portfolio result (simulated paper, $100k start)

$100,000 $95,715

-4.29%

max drawdown 4.29% · 10 trading days

α vs buy-and-hold: +3.30pp · net α (~5bps/trade): +3.21pp · benchmark: -7.59%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -9.18pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     3          +2.66%  -4.50%  -10.08%
HOLD    117        -0.57%  -1.07%  -0.90%
SELL    21         -2.67%  -3.65%  -3.63%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 3 · HOLD 117 · SELL 21

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-04-06 to 2026-04-17 · Decisions: 148 · closed: 148

Portfolio result (simulated paper, $100k start)

$100,000 $103,935

+3.94%

max drawdown 0.43% · 10 trading days

α vs buy-and-hold: -9.84pp · net α (~5bps/trade): -10.00pp · benchmark: +13.78%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -3.11pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     15         +0.95%  +3.43%  +6.48%
HOLD    119        +1.61%  +4.37%  +9.59%
SELL    14         +0.44%  +4.87%  +12.48%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 15 · HOLD 119 · SELL 14

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-03-17 to 2026-03-30 · Decisions: 149 · closed: 149

Portfolio result (simulated paper, $100k start)

$100,000 $99,064

-0.94%

max drawdown 0.94% · 10 trading days

α vs buy-and-hold: +6.65pp · net α (~5bps/trade): +6.53pp · benchmark: -7.59%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -14.04pp · SELL direction: right (sold names fell)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     3          -2.66%  -6.62%  -15.09%
HOLD    115        -0.80%  -1.51%  -1.05%
SELL    31         -1.45%  -2.22%  -1.54%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 3 · HOLD 115 · SELL 31

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · ablation: lanes_off · 2026-05-06 to 2026-05-19 · Decisions: 150 · closed: 150

Portfolio result (simulated paper, $100k start)

$100,000 $98,729

-1.27%

max drawdown 2.59% · 10 trading days

α vs buy-and-hold: -5.63pp · net α (~5bps/trade): -5.82pp · benchmark: +4.36%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -1.82pp · SELL direction: flat (little movement)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     16         -0.21%  +1.28%  +2.76%
HOLD    114        +0.51%  +1.54%  +4.58%
SELL    20         +0.53%  -0.66%  +0.26%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 16 · HOLD 114 · SELL 20

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

demo · 2026-05-13 to 2026-05-19 · Decisions: 75 · closed: 75

Portfolio result (simulated paper, $100k start)

$100,000 $100,000

+0.00%

max drawdown 0.00% · 5 trading days

α vs buy-and-hold: +1.32pp · net α (~5bps/trade): +1.32pp · benchmark: -1.32%

[chart: equity curve, agent (solid) vs buy-and-hold (dashed)]

Verdict (this window, t+7d)

Not enough BUY/HOLD decisions to compare in this window.

BUY vs HOLD (t+7d): (no value) · SELL direction: n/a

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     0
HOLD    75         -0.37%  -0.03%  +6.49%
SELL    0

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 0 · HOLD 75 · SELL 0

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Simulated paper portfolio, price/technical+debate only, before fees/slippage. Short window = indicative, NOT an edge/profitability claim.

ollama · 2026-05-13 to 2026-05-19 · Decisions: 75 · closed: 75

Verdict (this window, t+7d)

The agent's BUY picks underperformed simply holding over this window.

BUY vs HOLD (t+7d): -2.96pp · SELL direction: wrong (sold names rose)

Answer-check — avg forward return by action × horizon

action  decisions  t+1d    t+3d    t+7d
BUY     8          -1.16%  -2.36%  +4.05%
HOLD    56         -0.47%  +0.35%  +7.01%
SELL    11         +0.74%  -0.26%  +5.63%

Raw close-to-close price move from the decision day. Not P&L, before fees/slippage.

Decision mix

BUY 8 · HOLD 56 · SELL 11

Leakage controls

  • prices: truncated ≤ decision day
  • fundamentals: disabled
  • news: disabled

Caveat

Small-sample, price/technical+debate only, before fees/slippage. Indicative, NOT an edge/profitability claim.