Journal

Strategy decisions, what prompted them, and the data behind them. Updated when something changes.

May 18, 2026
The ORB Bots Were Trading a Different Strategy Than the Backtest

I found the root cause of the ORB underperformance today. It took digging through 14 years of bar-by-bar data and comparing it line by line against the live bot logs to see it, but once I saw it, it was obvious.

The backtest enters a trade when a completed 5-minute bar closes above the ORB high. The live bot was entering the moment any live price tick crossed the ORB boundary — checked every 5 seconds. Those are not the same thing.

In a trending market they produce similar results. Price breaks out, keeps going, the bar closes above the boundary, and both versions enter at roughly the same point. But in a choppy market — which April and May 2026 have been — price spikes above the ORB high for a few seconds and then reverses back inside the range before the bar closes. The backtest would skip that entirely. The live bot entered it, then got stopped out as price reversed. Every time. That's the whipsaw pattern I'd been watching for weeks without understanding the mechanical cause.
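The difference between the two entry rules is easy to see in a few lines. This is a minimal sketch, not the bot's actual code: the `Bar` type, the function names, and the prices are hypothetical, and the old tick-level check is approximated by comparing the bar's high to the ORB high (if any tick crossed the boundary during the bar, the bar's high is above it).

```python
from dataclasses import dataclass


@dataclass
class Bar:
    """One completed 5-minute bar (only the fields this sketch needs)."""
    high: float
    close: float


def touch_entry(bar: Bar, orb_high: float) -> bool:
    """Old live behaviour: any tick above the ORB high triggers a long entry."""
    return bar.high > orb_high


def close_entry(bar: Bar, orb_high: float) -> bool:
    """Backtest behaviour: only a bar that closes above the ORB high triggers."""
    return bar.close > orb_high


orb_high = 5300.00                             # hypothetical ORB high
trend_bar = Bar(high=5306.25, close=5305.00)   # breaks out and holds
chop_bar = Bar(high=5301.50, close=5298.75)    # spikes, then falls back inside

for name, bar in [("trend", trend_bar), ("chop", chop_bar)]:
    print(name, "touch:", touch_entry(bar, orb_high), "close:", close_entry(bar, orb_high))
```

On the trending bar both rules agree. On the choppy bar only the touch rule enters, which is the whipsaw trade the backtest never took.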

I quantified it across the full 14-year dataset. The bar-close model produces a 42.6% win rate and $34,422 net over 14 years. The bar-touch model, which is what the live bots were actually running, produces a 26.4% win rate and $12,490. The gap is $21,932 over 14 years, or about $1,567 per year per contract, entirely from this one implementation mismatch.

The fix is straightforward: after the ORB locks at 9:35, the bot now fetches the close price of each completed 5-minute bar from Schwab's history API before deciding whether to enter. It only enters when a bar closes above the ORB high or below the ORB low. Entry price is set at the ORB boundary plus slippage, not wherever the live price happens to be when the bar is checked. Results are cached per bar, so there's one API call per 5-minute interval, not one every 5 seconds.
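A rough sketch of that flow, under stated assumptions: `fetch_close` stands in for the real history API call (here it is a fake that just counts calls), every class and function name is hypothetical, and slippage is assumed to be applied against the trade (added for longs, subtracted for shorts).

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Optional, Tuple

BAR = timedelta(minutes=5)


def last_completed_bar_start(now: datetime) -> datetime:
    """Start time of the most recent fully completed 5-minute bar."""
    floored = now.replace(minute=now.minute - now.minute % 5, second=0, microsecond=0)
    return floored - BAR


class BarCloseCache:
    """Fetches each bar's close once, then serves it from memory."""

    def __init__(self, fetch_close: Callable[[datetime], float]) -> None:
        self._fetch = fetch_close
        self._closes: Dict[datetime, float] = {}

    def close_for(self, bar_start: datetime) -> float:
        if bar_start not in self._closes:
            self._closes[bar_start] = self._fetch(bar_start)
        return self._closes[bar_start]


def entry_signal(close: float, orb_high: float, orb_low: float,
                 slippage: float) -> Optional[Tuple[str, float]]:
    """Return (side, entry_price) on a bar-close breakout, else None."""
    if close > orb_high:
        return ("long", orb_high + slippage)
    if close < orb_low:
        return ("short", orb_low - slippage)
    return None


calls: List[datetime] = []


def fake_fetch(bar_start: datetime) -> float:
    """Stand-in for the history API; records each call it receives."""
    calls.append(bar_start)
    return 5305.0


cache = BarCloseCache(fake_fetch)
start = datetime(2026, 5, 19, 9, 40, 0)
for i in range(60):  # 60 polls, 5 seconds apart, all during the 9:40 bar
    now = start + timedelta(seconds=5 * i)
    close = cache.close_for(last_completed_bar_start(now))

signal = entry_signal(close, orb_high=5300.0, orb_low=5280.0, slippage=0.25)
print(len(calls), signal)
```

All sixty polls resolve to the same completed bar (9:35 to 9:40), so the fake API is hit once, and the entry price comes from the ORB boundary, not from the polled price.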

I also went back and corrected the paper trade history for all three bots. The old logs reflected the broken implementation. Using the corrected bar-close model and filling in the post-April data from Schwab's history API, the corrected results are: ES 40W/32L ($1,390 net), NQ 28W/25L ($473 net), RTY 14W/25L ($408 net). Combined $2,271 over the live paper trading period versus roughly $573 in the broken logs — a $1,698 gap entirely from the entry logic mismatch.

The bots have been running the corrected logic since this afternoon. First real test is tomorrow morning. I'm depositing into IBKR later this week and plan to go live with all three micros next Monday, assuming the paper results this week look right.

| Entry model | 14-yr trades | Win rate | Net P&L | Annual |
| --- | --- | --- | --- | --- |
| Bar close above ORB (backtest / fixed) | 3,168 | 42.6% | $34,422 | $2,459 |
| Any tick above ORB (old live bot) | 3,443 | 26.4% | $12,490 | $892 |

May 11, 2026
Back to Calls — and a Better Set of Parameters

The put premiums had gotten small. VIX was sitting in the 16-17 range, the market was calm, and the credits I was collecting on puts were a fraction of what they were in March and April when volatility was elevated. The strategy was still working — win rate fine, no losses — but the reward for the risk had shrunk. That's what prompted me to stop and ask whether the setup I was running was still the right one for this environment.

The tariff situation that dominated March and April seems to be settling. The overnight gaps that were almost exclusively upward — the reason I switched to puts in the first place — haven't materialized the same way in recent weeks. That's a different environment than the one that justified the puts position.

I ran a full parameter sweep this morning: every combination of strategy (calls vs puts), delta (6 through 16), spread width ($5 through $25), profit target (50% vs 80%), and stop loss (none, 2×, 3×) — tested over the last two years of SPX data with walk-forward validation. 240 combinations.
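For a sense of how the grid is enumerated, here is a minimal sketch. The specific delta and width grid points are assumptions: the entry only gives the ranges (6 through 16, $5 through $25), and the values below were picked so the count comes out to 240. The backtest and walk-forward steps are not shown.

```python
from itertools import product

strategies = ["call_credit", "put_credit"]
deltas = [6, 8, 10, 16]          # assumed grid points within "6 through 16"
widths = [5, 10, 15, 20, 25]     # assumed grid points within "$5 through $25"
profit_targets = [0.50, 0.80]    # 50% vs 80% of credit
stop_multiples = [None, 2.0, 3.0]  # no stop, 2x credit, 3x credit

grid = [
    {"strategy": s, "delta": d, "width": w, "target": t, "stop": x}
    for s, d, w, t, x in product(strategies, deltas, widths, profit_targets, stop_multiples)
]

print(len(grid))  # 2 * 4 * 5 * 2 * 3 = 240
```

Each dictionary in `grid` would then be handed to the backtester, with the walk-forward split applied per combination.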

The result was unambiguous. Call credits have outperformed put credits by a wide margin over the last two years — not just in total P&L, but in risk-adjusted terms. The best call credit setup (10-delta, $10 wide, 80% target, no stop) produced a Sharpe of 14.95 versus 4.0 for the equivalent put credit setup. That's not a close call.

The puts were the right decision in March and April. Switching to them during a specific regime of upward gaps and tariff headline risk saved roughly $2,839 compared to staying on calls. But the two-year data makes clear that calls are the baseline. Puts were a temporary adjustment to an unusual regime, not a permanent stance.

I also bumped the delta from 8 to 10. The sweep showed 10-delta consistently outperforms 8-delta on calls across every width tested — slightly more premium, marginally higher loss frequency, better overall Sharpe. That was the meaningful change. Spread width stays at $5 — the $10 wide numbers look better on paper but once you account for the 2× capital requirement, the risk-adjusted difference is minimal. Not worth the upgrade.

Current setup: call credit, 10-delta, $5 wide, 80% profit target, 3:30 PM entry. No stop loss — the sweep confirmed stops don't help on calls, they just reduce wins without cutting the rare large loss in any meaningful way.

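| Strategy | Delta | Width | Target | Stop | Sharpe (2yr) |
| --- | --- | --- | --- | --- | --- |
| Call credit | 10 | $10 | 80% | None | 14.95 |
| Call credit | 10 | $5 | 80% | None | 12.49 ← current |
| Call credit | 8 | $10 | 80% | None | 10.84 |
| Call credit | 8 | $5 | 80% | None | 9.13 |
| Put credit | 10 | $10 | 80% | None | 4.0 |
| Put credit | 10 | $5 | 80% | None | 3.61 |
| Put credit | 8 | $5 | 80% | None | 2.74 |
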
May 1, 2026
Three Losses, Two Problems, One Decision

Between March 31 and April 14, 2026, I took three losses on the live 1-DTE strategy. The March 31 loss was a full max loss — SPX gapped up overnight straight through my short call strike before the market opened. After that I added the gap stop: a kill switch that fires on expiry morning if SPX opens more than 80 points past the short strike, closing early to limit damage.

It worked. April 8 and April 14 both triggered the gap stop and came in at -$385 and -$405 — well below the ~$450 max loss each would have been without it. The gap stop was doing exactly what it was built for.

But the gap stop limits losses. It doesn't prevent them. Three losses in six weeks, all from the same source: overnight upward gaps. Tariff deals announced over weekends. Trade war ceasefires. SPX opening 80-120 points higher every time, landing on or through my short call strike. The risk management was working, but the environment was structurally wrong for call credits.

I ran a replay of March and April using real SPX closes across every eligible day. Simulated calls came out at -$2,918. Simulated puts finished at -$79. My actual live account lost $439, less than the simulation because the gap stop caught two of the three losses, but the point stands: the dominant overnight risk was upward gaps and I was on the wrong side. I switched to puts. One line in the config file.

Then I looked at the account balance and found the second problem. Buying power requirement for one contract: $464. Account balance: $464. Technically tradeable. No buffer. One more max-loss event away from either missing entries or adding capital under pressure — neither of which is a decision you want to make in the middle of a drawdown.

The $500 buffer rule came out of this directly. Never deploy capital that you can't afford to lose without it affecting the next trade. $464 BPR plus $500 sitting untouched means a full max-loss event leaves the system intact and able to continue. Below that threshold, the psychological cost of a loss starts bleeding into execution quality. The backtest assumes you always take the next trade. Real trading doesn't work that way when the cushion is gone. The strategy's edge is in consistency, and anything that disrupts consistency is a real risk even if it never appears in the P&L numbers.
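As a pre-trade check, the rule reduces to one comparison. A minimal sketch, using the $464 BPR and $500 buffer from above; the function name and the `contracts` parameter are hypothetical.

```python
def can_enter(balance: float, bpr: float = 464.0, buffer: float = 500.0,
              contracts: int = 1) -> bool:
    """True only if the account covers buying power plus the untouched buffer."""
    return balance >= contracts * bpr + buffer


print(can_enter(464.00))  # technically tradeable, but no buffer
print(can_enter(964.00))  # BPR plus the full $500 buffer
```

With a $464 balance the check fails even though the broker would accept the order, which is exactly the situation the rule is meant to block.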

| Scenario | Mar–Apr 2026 P&L | Notes |
| --- | --- | --- |
| Call credit, simulation (every day, no gap stop) | −$2,918 | 22 trades, 6 losses |
| Call credit, live account (actual) | −$439 | Gap stop fired 2×, 2 sessions offline |
| Put credit, simulation (same period) | −$79 | Aligned with gap direction |

April 24, 2026
Sitting Out on Purpose

Friday afternoon, April 24. The gap stop was live. The strategy was switched to puts. The statistical case for entering was fine. I sat it out anyway.

The tariff news cycle had been producing binary weekend announcements all month: deals announced Sunday night, breakdowns announced Saturday morning, nothing predictable, all of it moving SPX by 100-150 points at the open. The gap stop is designed to limit damage from exactly these events. In theory, entering was fine. In practice, holding a position into a weekend where the primary risk driver was a single person's social media activity felt like a different kind of risk than the one the backtest modeled.

There's a version of this that's just fear, and a version that's reasonable judgment about unquantifiable event risk. The honest answer is it was probably some of both. The backtest doesn't have a category for "geopolitical uncertainty driven by an individual." It has historical gap frequencies. When the mechanism generating the gaps is new and the frequency is accelerating, historical frequencies understate current risk.

I came back May 1. The strategy entered, ran cleanly, no gap event. The pause didn't cost much. But the reason for documenting it isn't the outcome — it's the decision process. Knowing when the model's assumptions no longer match the environment I'm actually in is part of running a systematic strategy. Sometimes that means sitting out. The system doesn't penalize you for skipping a day. The edge is still there the next morning.

April 10, 2026
Four Losses in Two Days. I Ran Every Test I Could Think Of. Then Did Nothing.

The ORB bots got hurt badly the week of April 7th. ES took four consecutive stop-losses. RTY was no better. The tariff news cycle was creating intraday reversals so fast that both long and short breakout attempts were stopping out on the same day. One trade on April 9th got stopped out in five seconds flat.

The instinct when something like that happens is to fix it — find a filter, a rule, something that would have kept me out of those trades. I tested everything: VWAP alignment, time caps at 30 and 60 minutes, skipping the first few bars after the open, gap size filters. Eight configurations total, tested across 14 years of data on three instruments, with walk-forward validation split into two independent halves.

None of them helped consistently. A filter that improved ES would hurt RTY. Something that worked in the first 7-year half failed in the second. The reason is statistical: the ORB strategy runs a 21% win rate by design, because the reward-to-risk ratio is 8-to-1. Four consecutive losses sounds catastrophic but it sits comfortably within the expected distribution of outcomes. The 14-year backtest contains hundreds of weeks like that one. The annual return absorbs them without flinching.
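The arithmetic behind that claim is short. This is a back-of-the-envelope sketch that assumes trades are independent, every win pays the full 8R, and every loss costs exactly 1R; real fills will not be that clean.

```python
win_rate = 0.21        # stated win rate
reward_to_risk = 8.0   # stated reward-to-risk ratio

# Chance that any given run of four trades is four losses in a row.
p_four_losses = (1 - win_rate) ** 4

# Expected result per trade, in units of risk (R).
expectancy_r = win_rate * reward_to_risk - (1 - win_rate)

print(f"P(4 straight losses) = {p_four_losses:.1%}")
print(f"Expectancy per trade = {expectancy_r:.2f}R")
```

Under those assumptions a given block of four trades is all losses about 39% of the time, while the expectancy per trade stays positive at roughly 0.89R. A four-loss streak is the normal texture of a low-win-rate system, not evidence that it broke.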

The whipsaw losses weren't a signal that the system was broken. They were a signal that April 2026 was an unusually volatile, news-driven month — and the backtest already priced in months like that. The right decision was to leave the system alone. That's a harder call than it sounds when you're watching losses print in real time.

April 1, 2026
Every Single Loss Had the Same Fingerprint

March 31, 2026. SPX gapped up hard at the open — right through my short call strike. The position that had been a routine profit-target candidate the afternoon before was now a max loss. That's how it works with overnight gaps: nothing you do during market hours matters, because the damage is already done before the open.

It stung. But more than the loss, it raised a question I hadn't fully answered: is this random, or is there a pattern? I had a strategy with a 99.1% backtest win rate, which means the losses are rare enough that you can actually study each one individually. So I did. I pulled every loss from the 7.5-year backtest and looked for what they had in common.

They all had overnight gaps. Every one. Not some of them. Not most of them. Every losing trade in the dataset began with SPX opening significantly past where it closed the day before — jumping directly into or through my short strike before the market even opened. No amount of intraday monitoring could have saved those positions, because the damage happened while the market was closed.

Once I knew what I was looking at, the fix was straightforward. Gaps larger than 80 points occurred roughly 1.3 times per year historically, and when they did, the spread had about a 50% chance of going to max loss. Expected cost of ignoring this: roughly $930 per event. I added a kill switch that fires only on expiry morning — if SPX opens more than 80 points past my short strike, the bot closes immediately. Small, targeted, high-value filter.
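A minimal sketch of that kill switch as described: it only runs on expiry morning and compares the SPX open to the short strike. The function and parameter names are hypothetical, and the put-side branch is an assumption that mirrors the call-side rule.

```python
def gap_stop_triggered(spx_open: float, short_strike: float, side: str,
                       is_expiry_morning: bool, threshold: float = 80.0) -> bool:
    """True if the position should be closed immediately at the open."""
    if not is_expiry_morning:
        return False  # the switch is only armed on expiry morning
    if side == "call":
        return spx_open - short_strike > threshold  # opened above the short call
    if side == "put":
        return short_strike - spx_open > threshold  # assumed mirror rule for puts
    raise ValueError(f"unknown side: {side!r}")


print(gap_stop_triggered(5481.0, 5400.0, "call", is_expiry_morning=True))
print(gap_stop_triggered(5480.0, 5400.0, "call", is_expiry_morning=True))
```

The comparison is strict, so an open exactly 80 points past the strike does not fire; whether the boundary should be inclusive is a design choice the entry doesn't specify.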

What the analysis also clarified: the loss distribution is not random noise. Losses come from a specific, identifiable event type. That's actually good news — it means the risk is manageable in a way that purely random losses wouldn't be. The bad news is that event frequency can change with regime. Historically 1.3 per year. In spring 2026, several hit within weeks of each other. The strategy's edge was intact. What changed was how much buffer I needed on the sideline to survive long enough to let it play out.

| Gap Size (overnight) | Frequency (historical) | Loss Rate | Expected Cost / Event |
| --- | --- | --- | --- |
| < 40 pts | Common | ~0% | |
| 40–80 pts | ~3× / year | ~15% | ~$140 |
| > 80 pts | ~1.3× / year | ~50% | ~$930 |

March 31, 2026
The Scanner That Never Fired (And What That Tells You)

I was going through the strategy config looking for anything that was adding complexity without earning its place. The Kalman IV Z-score scanner had been in there since I built the first version — the idea was to only enter when implied volatility had spiked meaningfully above its recent baseline, capturing richer premium when the environment was favorable. I pulled the logs. It hadn't fired in 247 consecutive trading days. Not once.

The threshold required roughly a 4-5 point intraday VIX spike to fire — those are rare events, not normal trading days. The scanner was supposed to be a quality filter. In practice it was a gate that almost never opened.

I tested lower thresholds. Win rate dropped from 94.7% to between 50% and 80%, depending on configuration. The scanner was selecting for high-IV moments, but high-IV moments during the trading day often mean the market is moving fast, and a fast-moving market is exactly when short options carry the most risk. I was inadvertently filtering toward the most dangerous entry conditions.

Removed it. The unconditional 3:30 PM entry matched or beat every gated version. Same decision applied to the afternoon entry window itself: the data from a 25,000-trade study showed afternoon entries (1 PM and later) outperform morning entries significantly. I enter at 3:30 PM and hold to expiration the next day. No gates, no conditions, no scanner.

Sometimes the most statistically honest conclusion is that a signal has no value. A feature that adds complexity without adding edge is a liability — it's a thing that can break, a parameter that can drift out of calibration, a decision point where you can second-guess yourself. I'd rather have a system with five moving parts that all earn their place than a system with ten where half are decorative.

December 2025
Why I'm Here: How I Actually Got Here

It didn't start with a trading idea. It started with reading.

I went through everything I could find with real numbers behind it. AQR's research on the Volatility Risk Premium documented across every major equity index from 1996 to 2016. CBOE's published data on SPX options market structure and 0-DTE/1-DTE usage. SSRN academic papers on short-dated options pricing and variance risk premium. ERN's 10+ year live track record selling SPX puts daily. Spintwig's published backtests across hundreds of strategy configurations. Zarattini, Aziz, and Barbon's 2024 paper on intraday momentum breakouts and what actually makes ORB strategies work. Not looking for tips. Looking for documented edges — strategies where someone had done the rigorous work and shown a structural reason the opportunity should persist.

The VRP kept coming up across every serious source: implied volatility consistently runs above realized volatility, and the sellers of that gap have earned a Sharpe of roughly 0.6 across two decades of data. That's not noise. It's a structural feature of how markets price fear. I gathered a wide list of candidates — options premium strategies, opening range breakouts, mean reversion setups — and started filtering for what was actually reachable with a small account.

Most institutional approaches aren't. You need capital scale, infrastructure, or instruments retail traders can't access. So the first filter was: what can I run with a few hundred dollars and still model honestly? The second filter was: what has a structural reason to work, not just a historical pattern that might be noise?

Then I started testing. I have a physics background — my default mode is to build a model, test it against data, and let the results tell me what to think. I threw ideas at backtests like darts: different instruments, different entry rules, different exit conditions. Then parameter sweeps on the most promising setups, walk-forward validation to check for overfitting, Monte Carlo to stress-test the distribution of outcomes. The goal wasn't to find the backtest that looked the best. It was to find the one where the edge was real, robust, and survived out-of-sample.

The pipeline for each strategy is the same: read the research → build the model → run the backtests and sweeps → validate with walk-forward and Monte Carlo → build a paper bot → run it on real market data with realistic fills → if the paper results hold up, deploy real capital.

The 1-DTE SPX credit spread was the first strategy to graduate that pipeline. It passed every test I put it through — 7.5 years of data, 98.5% win rate, Sharpe 9.13, max drawdown $759. The paper results held. I put cash in. The ORB futures bots — built on the Zarattini breakout framework and optimized against 14 years of ES, NQ, and RTY data — are in the paper phase right now. They go live when the results are clean for a few consecutive weeks.

That's the whole thing. Not a single strategy. A process for finding strategies that actually work, and a discipline for not deploying capital until they've earned it.

The math works as long as the outliers stay outliers. The journal entries that follow are about the times they didn't, what I did about it, and what I learned.

| Metric | Value |
| --- | --- |
| Strategy | SPX 1-DTE call credit spread, 10-delta, $5 wide |
| Backtest period | 7.5 years of SPX data |
| Win rate | 98.5% |
| Sharpe ratio | 9.13 |
| Avg P&L / trade | $23.63 |
| Annual (estimated, 1ct) | ~$5,680 |
| Max drawdown | $759 |
| Buying power required | $464 / contract |