
What Weather Routing Actually Buys

2026-05-25 · weather-routing

Download the full methodology (PDF) · Interactive report with route maps

A note on currents (why this matters)

ERA5 carries no ocean currents, so an initial wind/wave-only run understated value on the current-dominated routes — riding the favorable current / dodging the counter-current is a major real-world lever. We therefore add CMEMS GLORYS surface currents to the replay path (they apply to SOG and fuel), and Agulhas + N.Atlantic were re-run current-aware. As predicted, the missing value was structural (β), not tactical (α): on Agulhas the structural term rose from β ≈ 0.1% (blind) to β = +1.1% (current-aware), while the tactical term was unchanged (α ≈ 0.9%, an upper bound). Currents are quasi-stationary, so they enter the forecast-free β — strengthening the “structural > tactical” conclusion, not overturning it. All figures below are current-aware unless noted.

TL;DR

Weather-routing vendors quote “we save X %”. This study asks which layer the benefit lives in and against what baseline — and the honest answer is much smaller than the headline. We pre-registered a four-stage ROI estimator and ran it over a basket of four regimes (a storm+current corridor, a North Atlantic crossing, a trade-wind negative control, and a short coastal leg), each as a paired counterfactual under the same ERA5-replay weather.

The gain decomposes into geometry + β + α, and α (the only forecast-dependent part) is < 1 % of fuel. All current-aware, % of planned-route fuel, bootstrap 95 % CI:

route	Geometry G	β (seasonal + current)	α (forecast, hindcast UB)	total
Agulhas	11.9 %	+1.1 % [0.46, 1.89]	0.9 % [0.03, 1.85]	14.0 %
N.Atlantic	5.5 %	−1.4 % [−2.25, −0.64]	0.7 % [0.30, 1.19]	4.8 %

Geometry (a one-time route-design correction any shortest-path tool delivers) and β (the climatologically knowable corridor + riding the quasi-stationary current — free from a routeing/current chart) dominate. The tactical α — the marginal value of reacting to this departure’s forecast, which is all a real-time service adds — is under 1 % of fuel, and that is a hindcast upper bound (the operational figure is lower).
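As a quick arithmetic check on the table above, the three layers can be summed and compared with the stated totals. This is a minimal sketch: the figures are copied from the table, while the function names and the rounding tolerance are illustrative assumptions, not part of the study's code.

```python
# Figures copied from the table above (% of planned-route fuel, current-aware).
ROUTES = {
    "Agulhas":    {"G": 11.9, "beta": 1.1,  "alpha": 0.9, "total": 14.0},
    "N.Atlantic": {"G": 5.5,  "beta": -1.4, "alpha": 0.7, "total": 4.8},
}

def check_decomposition(row, tol=0.15):
    """Return (G + beta + alpha, whether it matches the stated total within rounding).

    tol is an assumed allowance for the one-decimal rounding of each layer.
    """
    layers = row["G"] + row["beta"] + row["alpha"]
    return layers, abs(layers - row["total"]) <= tol

def tactical_share_pct(row):
    """alpha expressed as a percentage of the stated total gain."""
    return 100.0 * row["alpha"] / row["total"]

for name, row in ROUTES.items():
    layers, consistent = check_decomposition(row)
    print(f"{name}: G + beta + alpha = {layers:.1f} % "
          f"(stated {row['total']} %, consistent={consistent}); "
          f"alpha is {tactical_share_pct(row):.1f} % of the total gain")
```

The share calculation makes the point of the paragraph concrete: the forecast-dependent layer is a small fraction of a headline that is itself mostly geometry.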

Two ROI numbers, because the baseline is the whole game. Against a ship’s planned route the optimiser appears to save 14.0 % (Agulhas) / 4.8 % (N.Atlantic). Against the constraint-respecting geodesic — the honest no-weather-routing baseline — the physics-ceiling ROI is small or negative: Agulhas +2.3 % [0.9, 3.9], N.Atlantic −0.7 % [−1.6, −0.05]. Even with currents, the optimiser barely beats (Agulhas) or loses to (N.Atlantic) the straight shortest path; the headline is mostly route-design + structure.
The negative control passes. On the benign trade-wind route the optimiser’s weather term is −0.2 % (CI [−0.31, −0.02], ≤0) — it does not manufacture a saving where there’s no weather to exploit. Unbiased.
North Atlantic is the sharpest result. Even current-aware, the optimiser’s storm/current-dodging swings (a corridor 3–6× the model grid — genuine tactical routing) add more distance than they save, so the optimised route is worse than the straight geodesic (physics-ceiling negative). Real-time spot optimisation adds little-to-negative value here, in hindcast (= the upper bound).
Everything here is an upper bound. Replay gives the optimiser the true future; the operational (forecast-error) figure is strictly lower (~44 % of benefit lost to forecast error in the literature). Surface currents are now included (CMEMS GLORYS); they entered β (structural), confirming the conclusion.

The honest reframing the evidence supports: weather routing = a cheap, robust seasonal-structural core (route design G + climatology/current β) + a small, tail-concentrated tactical premium α that real-time services sell — and the marketed headline conflates all of it against a flattering baseline.

1. The four-stage ROI estimator (why two ROI numbers)

A ship’s planned route departs from the great circle for reasons unrelated to weather: hard law (coast-distance minima, TSS) and soft law (the master’s prudence). Crediting the optimiser for undoing those is double-counting. So we report two numbers per voyage:

operational ROI = optimised vs the planned RTZ (soft-law corridor) — the deliverable number.
physics-ceiling ROI = optimised vs the constrained geodesic (shortest path subject to land + hard law) — the honest ceiling, route-design slack removed.
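The two definitions differ only in the baseline, which a short sketch makes explicit. The function names and tonnages below are hypothetical, and normalising each ROI by its own baseline's fuel is an assumption (the study may normalise both by planned-route fuel).

```python
def roi_percent(baseline_fuel_t, optimised_fuel_t):
    """Saving as % of the baseline's fuel; negative means the optimised route burns more."""
    if baseline_fuel_t <= 0:
        raise ValueError("baseline fuel must be positive")
    return 100.0 * (baseline_fuel_t - optimised_fuel_t) / baseline_fuel_t

def two_roi(planned_fuel_t, geodesic_fuel_t, optimised_fuel_t):
    """Report the same optimised voyage against both baselines."""
    return {
        "operational": roi_percent(planned_fuel_t, optimised_fuel_t),
        "physics_ceiling": roi_percent(geodesic_fuel_t, optimised_fuel_t),
    }

# Hypothetical tonnages, chosen only to show how the baseline moves the number.
example = two_roi(planned_fuel_t=200.0, geodesic_fuel_t=180.0, optimised_fuel_t=175.0)
print(example)
```

With these made-up inputs the same optimised voyage reads as a 12.5 % saving against the planned route and under 3 % against the geodesic.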

The constrained geodesic is built by a land-masked A* (using the optimiser’s own land mask, so both respect the same constraints — constraint symmetry) + a shore-aware string-pull. Every saving is split exactly into a distance term and a per-mile weather term; only the weather term is genuine weather-routing ROI.
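The land-masked search can be illustrated with a generic A* over a boolean grid. This is a sketch under loud simplifications, not the study's engine: it uses a flat grid with unit cell spacing instead of lat/lon great-circle costs, omits the shore-aware string-pull, encodes no hard-law constraints, and lets diagonal steps cut land corners. The name `astar_land_masked` and the toy mask are hypothetical.

```python
import heapq
import math

def astar_land_masked(land, start, goal):
    """Shortest 8-connected path over water cells; land[r][c] is True for land.

    Returns the list of (row, col) cells from start to goal, or None if unreachable.
    """
    rows, cols = len(land), len(land[0])
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

    def h(cell):
        # Straight-line distance never overestimates the 8-connected step cost.
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    best = {start: 0.0}
    parent = {start: None}
    frontier = [(h(start), start)]
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in moves:
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if land[nxt[0]][nxt[1]]:
                continue  # the mask is the only constraint in this sketch
            cost = best[cell] + math.hypot(dr, dc)
            if cost < best.get(nxt, math.inf):
                best[nxt] = cost
                parent[nxt] = cell
                heapq.heappush(frontier, (cost + h(nxt), nxt))
    return None

# Toy mask: a 3x3 land block the path has to round on one side.
LAND = [
    [False, False, False, False, False],
    [False, True,  True,  True,  False],
    [False, True,  True,  True,  False],
    [False, True,  True,  True,  False],
    [False, False, False, False, False],
]
path = astar_land_masked(LAND, (2, 0), (2, 4))
print(path)
```

Reusing one mask for both the baseline and the optimiser is what gives the constraint symmetry described above; the sketch only shows the baseline half.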

2. Headline — two ROI numbers, the weather term, and the distance/weather split

All current-aware (CMEMS GLORYS); bootstrap 95 % CIs, 10 000 resamples, seed 20260524, paired by departure.

route	regime	operational ROI (vs planned)	physics-ceiling ROI (vs geodesic)	weather term (% fuel)	distance term
Agulhas	storm + Agulhas current	14.0 % [12.3, 15.7]	+2.3 % [0.9, 3.9]	6.9 % [5.0, 8.8]	7.1 %
N.Atlantic	depressions + Gulf Stream	4.8 % [3.7, 6.0]	−0.7 % [−1.6, −0.05]	4.2 % [2.9, 5.6]	0.6 %
Trade-wind (neg. control)	benign trades, no current	−0.8 % [−1.1, −0.5]	−0.8 % (= op; planned ≡ geodesic)	−0.2 % [−0.31, −0.02]	−0.6 %
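A paired percentile bootstrap of the kind behind these intervals can be sketched with the standard library alone. The resample count and seed are the ones quoted above; everything else is an assumption: the function name, the choice of a ratio-of-sums statistic, the percentile method, and the twelve hypothetical fuel pairs.

```python
import random

def paired_bootstrap_ci(baseline, optimised, n_resamples=10_000, seed=20260524, level=0.95):
    """Percentile CI for ROI = 100 * (sum(baseline) - sum(optimised)) / sum(baseline).

    Departures are resampled as pairs, so every draw keeps one voyage's baseline
    and optimised fuel together.
    """
    if len(baseline) != len(optimised) or not baseline:
        raise ValueError("need equal-length, non-empty paired samples")
    rng = random.Random(seed)
    n = len(baseline)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        b = sum(baseline[i] for i in idx)
        o = sum(optimised[i] for i in idx)
        stats.append(100.0 * (b - o) / b)
    stats.sort()
    lo = stats[int((1 - level) / 2 * (n_resamples - 1))]
    hi = stats[int((1 + level) / 2 * (n_resamples - 1))]
    point = 100.0 * (sum(baseline) - sum(optimised)) / sum(baseline)
    return point, lo, hi

# Hypothetical fuel (tonnes) for twelve monthly departures; not the study's data.
base = [190, 186, 188, 192, 198, 205, 207, 203, 196, 191, 188, 187]
opt = [186, 183, 184, 187, 191, 195, 196, 194, 190, 187, 185, 184]
point, lo, hi = paired_bootstrap_ci(base, opt)
print(f"ROI {point:.1f} % [{lo:.1f}, {hi:.1f}]")
```

Resampling by pair matters because both arms of a departure see the same replayed weather; resampling them independently would discard that pairing and widen the interval.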

Even with currents, the physics-ceiling ROI is small or negative. On Agulhas the geodesic rounds the Cape through the favorable current, so the optimiser’s edge over it is only +2.3 %. On N.Atlantic the optimiser loses to the straight geodesic (−0.7 %): its storm/current-dodging detours add distance that does not pay off. The weather term (genuine wind/wave + current routing, vs the planned route) is 4–7 %; about half of the 14 % Agulhas headline is the 7.1 % distance term, i.e. route-shortening that a planned-route baseline books as “weather skill”.
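One way to split a saving exactly into a distance term and a per-mile weather term is sketched here. The particular convention (shortening valued at the baseline burn rate, burn-rate change valued over the optimised distance) is an assumption, as are the function name and the tonnages; the study's own split may weight the terms differently.

```python
def split_saving(base_fuel_t, base_dist_nm, opt_fuel_t, opt_dist_nm):
    """Return (distance_term_pct, weather_term_pct), both as % of baseline fuel.

    Identity used: F_b - F_o = (d_b - d_o) * r_b + d_o * (r_b - r_o),
    where r = fuel / distance is the per-mile burn rate.
    """
    base_rate = base_fuel_t / base_dist_nm
    opt_rate = opt_fuel_t / opt_dist_nm
    distance_term_t = (base_dist_nm - opt_dist_nm) * base_rate
    weather_term_t = opt_dist_nm * (base_rate - opt_rate)
    to_pct = 100.0 / base_fuel_t
    return distance_term_t * to_pct, weather_term_t * to_pct

# Hypothetical voyage: 200 t over 4000 nm planned, 180 t over 3800 nm optimised.
distance_pct, weather_pct = split_saving(200.0, 4000.0, 180.0, 3800.0)
print(f"distance {distance_pct:.1f} % + weather {weather_pct:.1f} % "
      f"= {distance_pct + weather_pct:.1f} %")
```

Because the two terms sum to the total saving by construction, any part of a headline that lands in the distance term cannot also be claimed as weather skill.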

3. Layer 2 — structural vs tactical (corridor-width test)

Current-aware route spread (geodesic-Fréchet), within- vs across-season:

route	within-season	across-season	ratio	route band / grid Δ
Agulhas	89.6 nm	96.0 nm	0.93	1.4× (near-single corridor)
N.Atlantic	123.5 nm	134.5 nm	0.92	3.1× (wide, weather-dependent)
Trade-wind	36.1 nm	33.2 nm	1.09	1.0× (one corridor)

Within-season ≈ across-season everywhere (ratio 0.92–1.09): the optimised route barely depends on the season; it is a near-invariant structure. Agulhas and the trade control sit close to the grid limit (structurally predictable from climatology, no forecast needed); N.Atlantic swings wider (3× the grid), which is real tactical routing, but those swings don’t pay off (§4).
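A route-spread measure of this kind can be sketched with a discrete Fréchet distance over great-circle (haversine) point distances. The study's exact "geodesic-Fréchet" definition and its within/across pooling are not spelled out here, so the mean-pairwise pooling, the function names and the Earth radius constant below are all assumptions.

```python
import math
from itertools import combinations

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles (assumed constant)

def haversine_nm(p, q):
    """Great-circle distance in nm between (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))

def discrete_frechet_nm(route_a, route_b):
    """Discrete Fréchet distance between two waypoint lists (dynamic programme)."""
    n, m = len(route_a), len(route_b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = haversine_nm(route_a[i], route_b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]

def mean_pairwise_spread_nm(routes):
    """Average Fréchet distance over all unordered pairs of routes."""
    pairs = list(combinations(routes, 2))
    return sum(discrete_frechet_nm(a, b) for a, b in pairs) / len(pairs)

def within_across_ratio(routes_by_season):
    """Mean within-season spread divided by the spread of all routes pooled."""
    within = [mean_pairwise_spread_nm(r) for r in routes_by_season.values() if len(r) > 1]
    pooled = [r for routes in routes_by_season.values() for r in routes]
    return (sum(within) / len(within)) / mean_pairwise_spread_nm(pooled)

# Two hypothetical tracks one degree of latitude apart.
south = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
north = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
print(f"{discrete_frechet_nm(south, north):.1f} nm")
```

A ratio near 1 from `within_across_ratio` means routes from different seasons differ from each other no more than routes within one season do, which is the invariance the table reports.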

4. Layer 3 — the tactical premium (the money number)

α is exactly the tactical premium: fly each season’s medoid optimised route through the month’s actual weather and currents, and compare to the spot-optimal. Current-aware:

- Agulhas: α = 0.9 % of fuel [0.03, 1.85], tail-concentrated: austral winter (JJA) ~2.5 %, calm shoulder months near zero or slightly negative. The forecast-reactive value is real but lives only in the stormy tail.
- N.Atlantic: α = 0.7 % of fuel [0.30, 1.19], roughly flat across seasons: even on the canonical storm route, reacting to each departure’s specific weather buys under 1 % of fuel, in hindcast.

In both cases α is small, and it is a hindcast upper bound: with real forecast error the operationally realisable tactical value is lower still. The tactical layer, where real-time services live, is not where the fuel is.
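The medoid-versus-spot comparison can be sketched in two small functions. The route distance used here (largest pointwise offset between equal-length tracks) is a toy stand-in for whatever route metric the study uses, and the function names, tracks and tonnages are hypothetical.

```python
def medoid_index(routes, distance):
    """Index of the route whose summed distance to all the others is smallest."""
    totals = [sum(distance(a, b) for b in routes) for a in routes]
    return min(range(len(routes)), key=totals.__getitem__)

def max_offset(a, b):
    """Toy route distance: largest pointwise gap between equal-length tracks."""
    return max(abs(x - y) for x, y in zip(a, b))

def tactical_premium_pct(medoid_fuel_t, spot_fuel_t, planned_fuel_t):
    """alpha: extra fuel from flying the fixed medoid route instead of each
    departure's spot-optimal route, as % of planned-route fuel."""
    extra = sum(m - s for m, s in zip(medoid_fuel_t, spot_fuel_t))
    return 100.0 * extra / sum(planned_fuel_t)

# Hypothetical season: three optimised tracks as cross-track offsets (nm).
season_routes = [[0, 10, 20, 10, 0], [0, 12, 25, 12, 0], [0, 30, 60, 30, 0]]
chosen = medoid_index(season_routes, max_offset)

# Hypothetical replayed fuel (tonnes) for the same three departures.
alpha = tactical_premium_pct(
    medoid_fuel_t=[171.0, 168.0, 175.0],
    spot_fuel_t=[170.0, 168.0, 172.0],
    planned_fuel_t=[200.0, 195.0, 205.0],
)
print(chosen, round(alpha, 2))
```

The medoid is an actual route from the set rather than an averaged track, so it is guaranteed to be sailable under the same constraints as the routes it summarises.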

5. Layer 1 — seasonal dispersion & the driver

Baseline fuel on the fixed route, and how well a weather-severity index explains it:

- Agulhas: mean 193.2 t, CV 4.1 %, driver R²(fuel ~ severity) 0.88.
- N.Atlantic: mean 196.7 t, CV 10.5 % (big winter depressions), driver R² 0.97.
- Trade-wind: mean 126.2 t, CV 2.3 % (benign), driver R² 0.65.

Routing’s value scales with the severity the route is exposed to, which answers “when does routing earn its keep”: in the stormy tail, not the median voyage.
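The two dispersion statistics are simple to compute; a minimal sketch follows. It assumes the sample standard deviation for the CV and a single-predictor linear fit for the driver R² (where R² equals the squared Pearson correlation); the severity index values and fuel figures are hypothetical.

```python
import statistics

def cv_pct(fuel_t):
    """Coefficient of variation in %, using the sample standard deviation."""
    return 100.0 * statistics.stdev(fuel_t) / statistics.mean(fuel_t)

def driver_r2(severity, fuel_t):
    """R² of a one-predictor linear fit fuel ~ severity (squared Pearson r)."""
    mx, my = statistics.mean(severity), statistics.mean(fuel_t)
    sxy = sum((x - mx) * (y - my) for x, y in zip(severity, fuel_t))
    sxx = sum((x - mx) ** 2 for x in severity)
    syy = sum((y - my) ** 2 for y in fuel_t)
    return sxy * sxy / (sxx * syy)

# Hypothetical monthly severity index and baseline fuel (tonnes).
severity = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
fuel = [181.0, 186.0, 188.0, 196.0, 199.0, 206.0]
print(f"CV {cv_pct(fuel):.1f} %, R² {driver_r2(severity, fuel):.2f}")
```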

6. Production stability — convergence (#10), waypoint stress test, resolution

Convergence: after a distance-scaled time-step fix, the isochrone engine converged 12/12 on every basket route (48/48 isochrone runs), vs 2/8 before the fix on the long Agulhas route. Reliable convergence is the precondition for any ROI claim — now met across regimes and route lengths.
Waypoint stress test (#10): on Agulhas (Jul departure), the baseline machinery handled 6→200 waypoints with zero self-intersections, zero land crossings, bounded RSS (1.9→2.7 GB), stable wall-time; the isochrone produced identical output (169.24 t) for every corridor waypoint count 6→50 — it robustly finds its own path regardless of input density (the 50-waypoint schema cap is a sound guard). Engines are stable in production as waypoints multiply.
Resolution: isochrone fuel/distance are identical at 0.5° and 0.25° (all 12 months) — the isochrone is wavefront-based, not grid-based, so grid_resolution_deg doesn’t perturb it (grid engines would be sensitive, but are too slow to replay-bench).

7. Charter weather evidence (#12)

Beaufort/Douglas passage distributions + % passage time above the charter exclusion thresholds (Bft 5 / Douglas 4):

- N.Atlantic: 87.5 % of passage time above exclusion (Bft 7–9 reached, Douglas 5–7 common): a genuinely heavy-weather route.
- Trade-wind: 90.5 %, but for a different reason: it sits steadily at Douglas 4 (73 % of the time). The trades are benign for routing (low variability, nothing to dodge) yet mostly above a Douglas-3 good-weather threshold.

The exclusion metric measures absolute sea state, not variability, a distinction a charterer should understand. One-pagers: beaufort_douglas_{natl,trade_satl}.pdf (+ Agulhas).
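The exclusion percentage is a time-weighted count, sketched below. How the two thresholds combine is an assumption: this version counts a sample as excluded when it is at or above either limit. The function name and the passage samples are hypothetical.

```python
def pct_time_above_exclusion(samples, bft_limit=5, douglas_limit=4):
    """samples: iterable of (hours, beaufort, douglas) along the passage.

    Assumes a sample is excluded when Beaufort >= bft_limit OR
    Douglas >= douglas_limit; a charter party may define this differently.
    """
    samples = list(samples)
    total_h = sum(h for h, _, _ in samples)
    if total_h <= 0:
        raise ValueError("passage has no duration")
    excluded_h = sum(h for h, bft, dgl in samples
                     if bft >= bft_limit or dgl >= douglas_limit)
    return 100.0 * excluded_h / total_h

# Hypothetical passage: steady moderate wind, sea state mostly Douglas 4.
passage = [(6, 4, 3), (6, 5, 3), (12, 3, 4)]
print(pct_time_above_exclusion(passage))
```

The toy passage shows the trade-wind effect in miniature: half the time is excluded on sea state alone even though the wind never builds past Bft 5.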

8. Honest scope and the follow-on

This is a hindcast study (perfect foresight = an upper bound) of wind/wave routing, with CMEMS GLORYS surface currents added on the Agulhas and N.Atlantic replays (ERA5 itself carries no currents), on baseline + isochrone (the Cython A* engines are forecast-only; astar/dijkstra appear replay-compatible but are too slow to bench). A method caveat: the geodesic/planned baselines use voyage_calculate while the optimised route uses the optimiser path, so a small code-path component may sit in the physics-ceiling number; the operational number and the weather term are computed within consistent paths and are the robust headline. The operational (forecast-error) ROI, which is the real number, is lower and needs an archived forecast aligned to historical dates, which we do not have, so it is specced, not run (we do not fabricate forecasts). An out-of-sample multi-year climatology for β is likewise a specced follow-on.

What the confirmatory evidence supports, against the pre-registered protocol: the defensible weather-routing ROI is a few percent of fuel, lives mostly in route design + season, and is materially smaller than the headline once measured against the right baseline — with the tactical, real-time layer adding little-to-negative value in hindcast on the routes tested. The decision rule is met for production trust on convergence/stability (#10 ✓) and on the negative control (unbiased ✓); it counsels against marketing real-time spot optimisation as the source of the saving.

Method note. The estimator runs each voyage in-process through a paired ERA5 + CMEMS-GLORYS replay (the same realized weather and surface currents for every baseline), benchmarked against both the planned route and a constraint-respecting geodesic, with bootstrap confidence intervals over twelve monthly departures and a pre-registered protocol fixed before any run. Hindcast figures are upper bounds on the operationally realisable, forecast-limited value; the operational (forecast-error) arm and an out-of-sample multi-year climatology are the specced next steps. Full methodology in the PDF; interactive route maps and the decomposition in the report.