Daily briefing
All value bets across every open market, ranked by edge. Status tells you whether to bet now or wait for fresher models.
Model freshness — current time: UTC
Analyse market
Paste a Polymarket URL to load buckets automatically, then run the forecast analysis.
Step 1 — Load from Polymarket
Paste a URL above — city, date and buckets will all fill in automatically.
Step 2 — Confirm details
Step 3 — Buckets
Temperature bucket · Market yes %
Results tracker
Paper trades logged automatically from the briefing. Results resolve overnight — check here each morning.
📋 Paper trades
Auto-logged from briefing · resolves overnight from Polymarket
Manual trade log — for real bets or manual paper trades
Weather Edge — User Guide
The definitive reference: how the app works, how to use it, and the betting strategy behind it.
1 · What this app does and why it works

Polymarket runs daily weather markets where traders bet on whether the temperature in a given city will land in a specific range. The market price reflects collective human judgment about the probability. Weather Edge replaces that human judgment with something better: a 31-member numerical weather prediction ensemble that produces a genuine probability distribution across outcomes.

The edge comes from a structural asymmetry that will not close: processing 31 ensemble forecasts and comparing them to market prices requires System 2 thinking — slow, deliberate, computational. The people pricing these markets use System 1 — fast, intuitive, heuristic. System 1 is incapable of running the calculation. This is not a temporary inefficiency. It is permanent, because it is cognitive.

The app automates the entire System 2 process: fetch models → compute probabilities → compare to market → size the bet → log and track. Your job is to check it twice a day and press the button.

2 · The four data sources
GFS ensemble (31 members)

The American Global Forecast System run 31 times with slightly different initial conditions. Each run produces a complete temperature forecast. We count what fraction of the 31 runs land in each bucket after rounding to whole degrees — matching Wunderground's resolution, which is how markets resolve. This is the primary betting signal. If 14 of 31 members show a daily high of 19°C, model probability = 45%.
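
The counting step can be sketched in a few lines of Python. This is a minimal illustration rather than the app's actual code: the function names, the bucket format, and the round-half-up rule are assumptions (the guide only says members are rounded to whole degrees), and the member values are made up.

```python
import math

def round_half_up(x):
    """Round to the nearest whole degree, with .5 going up (assumed rule)."""
    return math.floor(x + 0.5)

def bucket_probabilities(member_highs, buckets):
    """Fraction of ensemble members whose rounded high lands in each bucket.

    buckets maps a label to an inclusive (low, high) range in whole degrees;
    None on either side means the bucket is open-ended.
    """
    rounded = [round_half_up(t) for t in member_highs]
    probs = {}
    for label, (low, high) in buckets.items():
        hits = sum(
            1 for r in rounded
            if (low is None or r >= low) and (high is None or r <= high)
        )
        probs[label] = hits / len(rounded)
    return probs

# 31 hypothetical member highs: 10 round to 18, 14 round to 19, 7 round to 20
members = [18.4] * 10 + [19.2] * 14 + [19.6] * 7
buckets = {"18 or below": (None, 18), "19": (19, 19), "20 or above": (20, None)}
probs = bucket_probabilities(members, buckets)
```

With these inputs, probs["19"] is 14/31, the 45% figure quoted above.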

ECMWF IFS (deterministic + ensemble)

The European Centre for Medium-Range Weather Forecasts — generally considered the world's most accurate NWP system. Used as a cross-check against GFS. When both agree, confidence is high. When they diverge by more than 1.5°, caution is warranted. The briefing runs GFS-only for speed. The full Analyse tab fetches both.

GFS deterministic

The single best-estimate GFS forecast (as opposed to the ensemble). Used as a sanity check — it should be close to the ensemble mean. Large divergence between deterministic and ensemble mean is a flag.
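
A rough sketch of the two sanity checks described so far. The 1.5° ECMWF threshold comes from the guide; the threshold for the deterministic check and the choice to compare ECMWF against the GFS ensemble mean are assumptions, since the guide does not specify either.

```python
from statistics import mean

ECMWF_MAX_DIVERGENCE = 1.5  # degrees, from the guide
DET_MAX_DIVERGENCE = 1.5    # assumption: the guide gives no number for this flag

def cross_checks(member_highs, gfs_deterministic, ecmwf_high=None):
    """Return the ensemble mean and a list of warning flags."""
    ens_mean = mean(member_highs)
    flags = []
    if abs(gfs_deterministic - ens_mean) > DET_MAX_DIVERGENCE:
        flags.append("GFS deterministic far from ensemble mean")
    if ecmwf_high is not None and abs(ecmwf_high - ens_mean) > ECMWF_MAX_DIVERGENCE:
        flags.append("ECMWF diverges from GFS")
    return ens_mean, flags

# Hypothetical values: ECMWF sits 2.0 degrees above the GFS ensemble mean
ens_mean, flags = cross_checks([18.0, 19.0, 20.0], gfs_deterministic=19.2, ecmwf_high=21.0)
```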

METAR (live station)

Current observed temperature at the exact airport station the market resolves against. Most valuable for same-day markets. Requires a free CheckWX API key in your Cloudflare environment variables.

3 · Your daily workflow

The 80/20 answer: set one alarm for 07:05 BST. That single session captures the majority of available edge.

Primary — 07:05 BST daily

GFS 00Z and ECMWF 00Z both available ~06:00 BST. Both models fresh simultaneously. Markets dormant all night: maximum anchoring gap. European markets not yet repriced. US traders asleep. Structurally the best window of the 24-hour cycle. Run briefing, click Log all BET NOW, done in 15 minutes.

Secondary — 19:00 BST daily

GFS 12Z and ECMWF 12Z both available ~18:00 BST. Good for US markets — evening repricing often incomplete. Worth doing; adds roughly 25% more opportunities. Best window for Asian cities (Tokyo, Singapore, HK) whose trading day is ending.

Opportunistic — ~00:30 BST (if you are up)

GFS 18Z available ~midnight BST. ECMWF won't update until ~06:00. The app fires BET NOW on GFS alone if spread ≤1.5° and edge ≥12pp — shown with a blue GFS only badge. Do not set an alarm for this. US markets have the best anchoring here.

4 · Model freshness — the three-tier status system
Tier 1 — Both GFS and ECMWF fresh

Maximum confidence. BET NOW fires when edge ≥10pp, spread tight, members sufficient. Standard case at 07:05 and 19:00 BST.

Tier 2 — GFS fresh, ECMWF stale (blue GFS only badge)

BET NOW fires only if GFS spread ≤1.5° AND edge ≥12pp. A tight spread means 31 members are self-consistently clustering — the ensemble is its own cross-check. Applies at the midnight window. Slightly lower confidence than Tier 1; consider reducing Kelly by one star rating.

Tier 3 — GFS stale

Always WAIT regardless of ECMWF freshness. GFS ensemble is the primary signal — stale primary = no reliable edge. Check again after the next GFS update.

GFS runs 00Z/06Z/12Z/18Z + ~5h lag. ECMWF runs 00Z/12Z + ~5h lag. In BST (UTC+1): GFS available ~06:00/12:00/18:00/00:00. ECMWF ~06:00/18:00.
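
The availability arithmetic is easy to get wrong by an hour, so here is a small sketch that applies the ~5h lag and converts to BST (UTC+1). The function names are hypothetical and the lag is the guide's approximation, not a guaranteed publication time.

```python
from datetime import date, datetime, timedelta, timezone

LAG = timedelta(hours=5)                   # approximate lag quoted in the guide
BST = timezone(timedelta(hours=1), "BST")  # British Summer Time = UTC+1
GFS_RUNS = (0, 6, 12, 18)                  # run hours, UTC
ECMWF_RUNS = (0, 12)

def available_at(run_day, run_hour_utc, tz=BST):
    """Local time at which a given run is expected to be available."""
    run = datetime(run_day.year, run_day.month, run_day.day,
                   run_hour_utc, tzinfo=timezone.utc)
    return (run + LAG).astimezone(tz)

def latest_run(now, run_hours):
    """Most recent run (as a UTC datetime) already available at `now`."""
    now_utc = now.astimezone(timezone.utc)
    runs = []
    for offset in (1, 0):  # yesterday's runs, then today's
        day = now_utc.date() - timedelta(days=offset)
        for hour in run_hours:
            run = datetime(day.year, day.month, day.day, hour, tzinfo=timezone.utc)
            if run + LAG <= now_utc:
                runs.append(run)
    return max(runs)

morning = datetime(2024, 6, 15, 7, 5, tzinfo=BST)  # an example 07:05 BST session
gfs = latest_run(morning, GFS_RUNS)
ecmwf = latest_run(morning, ECMWF_RUNS)
gfs_bst_hours = [available_at(date(2024, 6, 15), h).hour for h in GFS_RUNS]
```

Under these assumptions the 18Z GFS run lands at about midnight BST, which is the opportunistic window described in section 3.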

5 · Reading the daily briefing table

Each row is the single best opportunity in that market — the bucket with the largest model-vs-market divergence.

Dir — Direction

YES — model thinks this bucket is more likely than market implies. Buy YES shares. NO — market has overpriced this bucket. Buy NO shares. Both directions are equally valid; the edge calculation is symmetric.

Model% — Model probability

Fraction of GFS ensemble members landing in this bucket after rounding to whole degrees. 14 of 31 = 45%. Real probability from real forecast data. Briefing uses GFS only; Analyse tab adds ECMWF for a combined figure.

Mkt% — Market implied probability

Current YES price as a percentage. Note: this is not the true fair-value probability — Polymarket takes ~2% fees and there is a bid-ask spread. Edge figures are overstated by roughly 2-4pp. Never bet on edges below 8pp gross.

Edge — The opportunity

Model% minus Mkt%. Green (+) = underpriced, bet YES. Red (−) = overpriced, bet NO. Edge is a point estimate with considerable uncertainty — with 31 members, model probabilities are granular to ~3pp. Do not treat edge figures as precise.
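
As a worked example of the edge arithmetic, assuming probabilities are held as fractions between 0 and 1. The function names are hypothetical, and the 3pp cost is simply the midpoint of the 2-4pp overstatement mentioned under Mkt%, not a measured figure.

```python
def edge_pp(model_prob, market_yes_price):
    """Gross edge in percentage points: Model% minus Mkt%."""
    return (model_prob - market_yes_price) * 100.0

def direction(edge):
    """YES when the bucket looks underpriced, NO when it looks overpriced."""
    if edge > 0:
        return "YES"
    if edge < 0:
        return "NO"
    return "NONE"

def net_edge_pp(edge, cost_pp=3.0):
    """Absolute edge after subtracting an assumed cost for fees and spread."""
    return max(abs(edge) - cost_pp, 0.0)

london = edge_pp(0.45, 0.06)   # model 45%, market 6%
overpriced = edge_pp(0.08, 0.20)
```

Here london is a gross edge of about +39pp (bet YES), and overpriced is about -12pp (bet NO).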

Kelly — Suggested bet size

Quarter-Kelly from your bankroll, scaled by lead time and confidence stars. Stars apply a multiplier (★★★ = 100%, ★★☆ = 50%, ★☆☆ = 25%). A dash means no active market — skip. Always treat Kelly as a ceiling, not a target. Until you have 30+ calibrated trades, consider halving the displayed figure for real bets.

Spread — GFS internal uncertainty

Standard deviation of the 31 GFS members. ±0.8° = confident. ±2.5° = uncertain — consider halving Kelly. Tight spread is also the condition enabling BET NOW without ECMWF at the midnight window.
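
A minimal sketch of the spread calculation. Whether the app uses the population or sample standard deviation is not stated; population is assumed here, and the 2.5° cut-off for halving Kelly is read directly from the example above rather than from any documented rule.

```python
from statistics import pstdev

def gfs_spread(member_highs):
    """Population standard deviation of the member highs, in degrees."""
    return pstdev(member_highs)

def spread_kelly_factor(spread, wide=2.5):
    """Halve the stake when the spread is at or above the 'uncertain' level."""
    return 0.5 if spread >= wide else 1.0

spread = gfs_spread([18.0, 19.0, 20.0])  # three made-up members for brevity
```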

Stars — Combined confidence rating

★★★ all conditions favourable: 25+ members, models agree within 1.5°, spread tight, edge above 12pp. ★★☆ one condition marginal. ★☆☆ multiple conditions weak — speculative only. Stars feed into Kelly multiplier automatically.
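
One way to read the star rule is to count how many of the four conditions fail. The sketch below does that; the mapping (no failures = 3 stars, one = 2, more = 1) and the 1.5° definition of a tight spread are interpretations, not documented behaviour.

```python
TIGHT_SPREAD = 1.5  # assumption: the guide does not define "tight" for stars

STAR_MULTIPLIER = {3: 1.0, 2: 0.5, 1: 0.25}  # from the Kelly description

def star_rating(n_members, model_gap, spread, edge):
    """Return 1, 2 or 3 stars from the four conditions listed in the guide."""
    checks = [
        n_members >= 25,         # enough ensemble members
        model_gap <= 1.5,        # GFS and ECMWF agree within 1.5 degrees
        spread <= TIGHT_SPREAD,  # ensemble spread is tight
        abs(edge) > 12.0,        # edge above 12pp
    ]
    failed = checks.count(False)
    if failed == 0:
        return 3
    return 2 if failed == 1 else 1

stars = star_rating(n_members=31, model_gap=0.4, spread=0.9, edge=39.0)
```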

6 · The status signals

BET NOW: Edge ≥10pp · GFS fresh · spread acceptable · members sufficient. If any condition is not met, the reason is shown inline.

BET NOW (GFS only): Tier 2. GFS fresh, ECMWF stale, but spread ≤1.5° and edge ≥12pp. Actionable with slightly lower confidence.

WAIT: Edge exists but one or more conditions unmet. The specific reason is shown. Check again after the next model run.

PASS: Edge below threshold. Not worth trading after fees.
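
The status logic can be summarised as a small decision function. This is a sketch of the rules as written in sections 4 and 6, not the app's code. The 10pp, 12pp and 1.5° thresholds come from the guide; MAX_SPREAD and MIN_MEMBERS are assumptions, because "spread acceptable" and "members sufficient" are not given numbers.

```python
MIN_EDGE_PP = 10.0
GFS_ONLY_MIN_EDGE_PP = 12.0
GFS_ONLY_MAX_SPREAD = 1.5
MAX_SPREAD = 2.5   # assumption: what counts as an "acceptable" spread in Tier 1
MIN_MEMBERS = 25   # assumption: borrowed from the three-star condition

def status(edge, gfs_fresh, ecmwf_fresh, spread, n_members):
    """Return (status, reason) for one briefing row."""
    edge = abs(edge)
    if edge < MIN_EDGE_PP:
        return "PASS", "edge below threshold"
    if not gfs_fresh:
        return "WAIT", "GFS stale"             # Tier 3
    if n_members < MIN_MEMBERS:
        return "WAIT", "too few members"
    if ecmwf_fresh:                            # Tier 1
        if spread > MAX_SPREAD:
            return "WAIT", "spread too wide"
        return "BET NOW", "both models fresh"
    # Tier 2: GFS fresh, ECMWF stale
    if spread <= GFS_ONLY_MAX_SPREAD and edge >= GFS_ONLY_MIN_EDGE_PP:
        return "BET NOW", "GFS only"
    return "WAIT", "ECMWF stale and GFS-only gate not met"

row = status(edge=39.0, gfs_fresh=True, ecmwf_fresh=True, spread=0.9, n_members=31)
```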

7 · Betting strategy — from conventional edge to Kahneman/Taleb

The people pricing these markets are not irrational. They are human. Human brains run decision-making shortcuts that create predictable, systematic, exploitable errors. Two intellectual frameworks, one from Nobel laureate Daniel Kahneman and one from Nassim Taleb, map precisely onto the opportunities this app finds.

Daniel Kahneman — Thinking, Fast and Slow

Human decision-making runs on two systems. System 1 is fast, automatic, and intuitive. It handles the large majority of everyday decisions, including pricing a temperature market at 8am. System 2 is slow and deliberate, the kind of thinking required to process 31 ensemble runs and calculate probability distributions. System 1 can't do that. It will never do that. This structural gap is the permanent, non-arbitrageable core of your edge.

Nassim Taleb — The Black Swan, Fooled by Randomness

Markets built on human intuition systematically underprice tail events — the extreme outcomes that feel unlikely because they rarely come to mind easily. This mispricing is structural. It persists as long as humans price markets. The tail buckets in temperature markets face a double discount: statistically underpriced (Taleb) and psychologically avoided (Kahneman). That is where the highest-payout opportunities live.

WYSIATI — What You See Is All There Is. The market prices what it can see: yesterday's weather, the BBC headline, the season. It cannot see 31 ensemble runs, the spread across members, or the ECMWF divergence. You can. That is the entire edge in one sentence.

Anchoring. The first prices set on a market are highly sticky. Even when new model data arrives, the market under-adjusts from the opening anchor. The largest edges appear in the 1–2 hours after model updates, before the market has repriced. This is why timing matters.

Availability bias. After a cold spell, cold feels probable. After a hot week, warmth feels inevitable. The ensemble has no memory of last week — it uses only current atmospheric state. After unusual weather in one direction, the opposite tail is systematically underpriced.

Loss aversion. People avoid long-shot tail bets emotionally. Losing a 10:1 bet feels worse than the arithmetic loss. This suppresses tail bucket prices beyond statistical analysis alone, creating a structural double discount on extreme outcomes.

Overconfidence: the rule for you. Simple algorithms frequently match or outperform expert judgment in noisy probabilistic environments, and Kahneman reviews decades of evidence for this. You are the algorithm. Never override the model based on personal weather intuition. The moment you do, you have become the market you are trying to beat.
The four opportunity types
Type A — Daily edge (WYSIATI + Anchoring) · Live now

The standard BET NOW. Models fresh, GFS spread tight, edge clear, days 1–2. The market has anchored on yesterday's prices and the new model data has not yet been absorbed. Your bread and butter — this is what the daily briefing finds automatically.

Example: At 07:10 BST, GFS ensemble shows 19°C as 45% likely in London tomorrow. Market prices it at 6%. Edge +39pp, spread ±0.9°, ★★★. The anchor was set last night when 19°C felt like a stretch. The model now says otherwise. Act within 2 hours before the market reprices.

Type B — Recency play (Availability bias) · Live now

After a sustained run of unusual weather, the market overweights continuation and underweights reversion. The ensemble has no memory — it sees only current atmospheric conditions. When the model starts showing reversion that the market has not yet priced, check the opposite tail buckets manually.

Example: London has had five consecutive days above 28°C. The market is pricing warmth continuation heavily. The GFS ensemble suddenly shows a cold front arriving: 14°C or below is now 35% likely but priced at only 8%. The market is stuck on last week's weather. After any sustained unusual run, the Type B opportunity is the first thing to check.

Type C — Barbell bet (Loss aversion + Fat tails) · Live now

A tail bucket trading at a very low price (3–8%) where the ensemble shows meaningful support. The market's loss aversion suppresses the price below even the already-low statistical probability. Small stake, high payout, hold to resolution. The strategy is a barbell: many small tail bets alongside the standard plays. Provided the edge is real, the law of large numbers works in your favour across many such bets.

Example: The 22°C+ bucket in London on a June day is priced at 4%. Six of 31 GFS members show it — model probability 19%. Edge +15pp. Kelly suggests $3. Place it, log it, do not check it obsessively. You are not predicting the outcome — you are exploiting structural underpricing of tails across many bets over time.

Type D — Base rate signal (Base rate neglect) · Next update

The market prices today's forecast but systematically ignores the long-run historical frequency of outcomes. Open-Meteo's historical ERA5 dataset covers 80+ years. When current market odds diverge significantly from the empirical base rate for that city and month, that is a base rate neglect play — independent of what today's model says. When GFS ensemble, ECMWF, and the historical base rate all agree the market is wrong in the same direction, that is the highest-confidence opportunity in the entire framework.

Example (coming): A June day above 25°C in London has occurred 12% of the time historically across 80 years of ERA5 data. The market prices it at 5% due to a recent cold week. Even without a strong model signal, the base rate alone implies 7pp underpricing. The app will show this as a third column in the bucket table — Historical% alongside Model% and Mkt%.
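
Since this feature is not built yet, the following is only a sketch of what the base-rate comparison could look like. The function name is hypothetical and the history is synthetic data constructed to reproduce the 12% figure in the example. No real ERA5 values or API calls are involved.

```python
def base_rate(historical_highs, threshold):
    """Fraction of past days whose daily high exceeded `threshold`."""
    return sum(1 for t in historical_highs if t > threshold) / len(historical_highs)

# Synthetic history: 12 of 100 June days above 25 degrees
history = [26.0] * 12 + [21.0] * 88
historical_prob = base_rate(history, threshold=25.0)
market_price = 0.05
underpricing_pp = (historical_prob - market_price) * 100.0
```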

8 · Kelly sizing — using it correctly

Kelly criterion gives the mathematically optimal fraction of bankroll to stake given an edge. Full Kelly requires that your probability estimate is correct. Ours is not yet verified by calibration data.

Quarter-Kelly baseline. 25% of full Kelly. Accounts for model estimation uncertainty — our probability estimates could be off by 10-15pp given 31 members and no historical calibration yet.

Lead-time scaling. Days 1–2: 100%. Days 3–4: 75%. Days 5–6: 50%. Day 7: 25%. NWP skill degrades with lead time. These steps are approximations of a continuous decay in forecast skill.

Star multiplier. ★★★ = 100%, ★★☆ = 50%, ★☆☆ = 25% of the scaled Kelly. Translates confidence directly into stake size.

5% bankroll cap. Hard ceiling per trade. Prevents ruin on data errors.

The honest caveat. All multipliers are derived from intuition, not empirical calibration. Once you have 50+ resolved trades and the reliability diagram shows the model is well-calibrated, consider moving to half-Kelly. Until then, treat displayed Kelly as a ceiling and start with half that amount in your first real-money bets.
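
Putting the pieces together: a sketch of the sizing chain, using the standard Kelly fraction for a binary share bought at price c that pays 1 on a win, f = (p - c) / (1 - c). The guide does not state the app's exact formula, so treat this as an illustration under that assumption. Function names are hypothetical, and returning zero beyond day 7 is also an assumption.

```python
STAR_MULTIPLIER = {3: 1.0, 2: 0.5, 1: 0.25}
KELLY_FRACTION = 0.25   # quarter-Kelly baseline
BANKROLL_CAP = 0.05     # hard 5% ceiling per trade

def full_kelly(p_win, price):
    """Kelly fraction for a share bought at `price` that pays 1 if it wins."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return max((p_win - price) / (1.0 - price), 0.0)

def lead_time_factor(days_ahead):
    """Step scaling from the guide; zero beyond day 7 is an assumption."""
    if days_ahead <= 2:
        return 1.0
    if days_ahead <= 4:
        return 0.75
    if days_ahead <= 6:
        return 0.5
    if days_ahead == 7:
        return 0.25
    return 0.0

def suggested_stake(bankroll, p_win, price, days_ahead, stars):
    """Quarter-Kelly, scaled by lead time and stars, then capped at 5%."""
    fraction = (KELLY_FRACTION * full_kelly(p_win, price)
                * lead_time_factor(days_ahead) * STAR_MULTIPLIER[stars])
    return bankroll * min(fraction, BANKROLL_CAP)

capped = suggested_stake(1000.0, p_win=0.45, price=0.06, days_ahead=1, stars=3)
scaled = suggested_stake(1000.0, p_win=0.30, price=0.20, days_ahead=3, stars=2)
```

For a NO bet, pass the probability and price of the NO side (1 minus the YES values, ignoring the bid-ask spread). In the first example the 5% cap binds, so the stake is 50 on a bankroll of 1000.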
9 · Reading the reliability diagram

The reliability diagram is the single most important diagnostic in the tracker. It answers the fundamental question: when the model says 40% probability, does it actually win 40% of the time? It appears once you have 15 resolved trades.

How to read it. X axis = model predicted probability. Y axis = actual win rate. The dashed diagonal = perfect calibration. Dots are sized by trade count in that probability bucket — bigger dots are more statistically reliable. Numbers above each dot show the count.

Points above the diagonal — underconfident. Model predicts 40% but you win 55% of the time. The model is being too conservative — true edge is larger than calculated. Once confirmed across 30+ trades, you can be more aggressive with Kelly sizing.

Points below the diagonal — overconfident. Model predicts 60% but you win 45% of the time. The model overstates certainty — Kelly stakes are too large. Reduce Kelly until the diagram corrects. This is the more dangerous pattern.

Systematic directional bias. If YES bets cluster below the diagonal but NO bets sit on it, GFS is running warm — overestimating high-temperature outcomes. The city analysis panel will flag this separately and suggest a stake adjustment.
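
The points on the diagram can be computed by grouping trades into probability bands. A minimal sketch, assuming each trade is stored as a (predicted probability of the side you bet, won) pair and that ten equal-width bands are used; the tracker's real binning is not documented here.

```python
def reliability_bins(trades, n_bins=10):
    """Return (mean predicted prob, actual win rate, count) per non-empty band."""
    bins = [[] for _ in range(n_bins)]
    for prob, won in trades:
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the top band
        bins[index].append((prob, won))
    points = []
    for band in bins:
        if band:
            mean_prob = sum(p for p, _ in band) / len(band)
            win_rate = sum(1 for _, w in band if w) / len(band)
            points.append((mean_prob, win_rate, len(band)))
    return points

# Made-up trades: four in the 40-50% band, one in the 60-70% band
trades = [(0.42, True), (0.45, False), (0.48, True), (0.41, False), (0.65, True)]
points = reliability_bins(trades)
```

A point whose win rate sits above its mean predicted probability lies above the diagonal (underconfident); below it, overconfident.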
Brier score

Mean squared error of probability forecasts. Shown as the fourth stat tile once 15 trades are resolved. 0 = perfect. 0.25 = uninformative coin flip. A well-calibrated weather model achieves 0.15–0.20 for day-1 forecasts. Below 0.15 is excellent. Above 0.22 suggests either poor calibration or you are betting markets where the model has no real edge.
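
For reference, the Brier score is short enough to compute by hand. The sketch below uses the same assumed (predicted probability, won) trade format; the three example trades are invented.

```python
def brier_score(trades):
    """Mean squared difference between predicted probability and outcome (1 or 0)."""
    return sum((prob - (1.0 if won else 0.0)) ** 2 for prob, won in trades) / len(trades)

coin_flip = brier_score([(0.5, True), (0.5, False)])
perfect = brier_score([(1.0, True), (0.0, False)])
example = brier_score([(0.45, True), (0.19, False), (0.35, False)])
```

Always predicting 50% gives exactly 0.25, which is why that value marks the uninformative baseline.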

Important caveat. 15 trades renders the diagram but is far too few for conclusions — most buckets will have 1–2 data points. Treat as directional only until 50+ trades spanning varied weather conditions. A heatwave producing 10 consecutive trades in one probability band is not 10 independent calibration points due to autocorrelation.

10 · City bias analysis

The city analysis panel in the tracker diagnoses whether GFS has a systematic warm or cold bias for each location. Cards appear at 5+ resolved trades per city; bias diagnosis unlocks progressively.

The YES/NO split. The key diagnostic. If GFS runs warm, YES bets on high-temperature buckets lose more than expected while NO bets win more. A city showing 50% overall hit rate but YES 38% / NO 62% has a warm GFS bias that aggregate stats would miss entirely.

Confidence gates. Below 10 trades: data shown, no diagnosis. 10–19 trades: tentative signal, no adjustment. 20–29 trades: emerging pattern, 10% stake reduction on biased direction. 30+ trades: 25% reduction. These thresholds are conservative by design — sequential trades during a single weather regime are not independent observations.
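
A sketch of the YES/NO split and the confidence gates. The gate thresholds and reductions are taken from the paragraph above; the function names, gate labels, and trade format are assumptions, and the sketch does not decide whether a bias actually exists, only how much reduction each gate would allow.

```python
def yes_no_split(trades):
    """Hit rate per direction from (direction, won) pairs; None if no trades."""
    rates = {}
    for side in ("YES", "NO"):
        results = [won for d, won in trades if d == side]
        rates[side] = sum(results) / len(results) if results else None
    return rates

def bias_gate(n_trades):
    """Return (label, stake reduction on the biased direction) for a city."""
    if n_trades < 10:
        return "data only", 0.0
    if n_trades < 20:
        return "tentative", 0.0
    if n_trades < 30:
        return "emerging", 0.10
    return "confirmed", 0.25

# Made-up city: 13 YES bets (5 wins) and 13 NO bets (8 wins), 50% overall
trades = ([("YES", True)] * 5 + [("YES", False)] * 8
          + [("NO", True)] * 8 + [("NO", False)] * 5)
rates = yes_no_split(trades)
gate = bias_gate(len(trades))
```

This reproduces the pattern described above: a 50% overall hit rate hiding roughly YES 38% / NO 62%.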

The sparkline. The coloured squares at the right of each city card show the last 10 outcomes (green = win, red = loss). This tells you whether a bias pattern is recent or historical — a streak of recent reds during a heatwave may be regime noise rather than systematic model error.

Non-stationarity. GFS bias varies by season and is reset by model upgrades. A warm bias measured in June may not hold in October. Treat city bias as a rolling signal, not a fixed correction.

11 · Known limitations — what a statistician would say

This app generates systematic hypotheses and collects calibration data. It is not yet a validated betting system. Be honest about these limitations:

31 members is not a large ensemble. Probability estimates are granular to ~3pp. True probability could differ from our estimate by 10pp in either direction. Edge figures are point estimates, not precise quantities.

Members are not independent. Generated by perturbing a single initial state. Effective sample size for capturing true atmospheric uncertainty is considerably less than 31.
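
To put rough numbers on these two points, the binomial standard error sqrt(p(1-p)/n) gives a best-case uncertainty if the members were independent. The effective sample size of 15 below is purely illustrative, not an estimate of the real value.

```python
import math

def binomial_se(p, n):
    """Standard error of a proportion estimated from n independent draws."""
    return math.sqrt(p * (1.0 - p) / n)

p = 14 / 31                          # the 45% example
step_pp = 100 / 31                   # smallest possible change in Model%
se_31 = binomial_se(p, 31) * 100     # if all 31 members were independent
se_15 = binomial_se(p, 15) * 100     # with a hypothetical effective size of 15
```

Each member moves Model% by about 3.2pp, and even under full independence one standard error at 45% is roughly 9pp. With fewer effective members it widens to about 13pp in this illustration.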

Edge is overstated by ~2-4pp. We compare to the displayed market price, not the true breakeven price after fees and spread.

Kelly multipliers are arbitrary. Not derived from empirical data on this model. Reasonable starting points, nothing more.

No multiple testing correction. The briefing scans hundreds of markets per session — some will show spurious edge by chance. High-confidence rows (★★★, tight spread) are more likely to reflect real edge.

Selection bias in hit rate. We only bet when edge exceeds a threshold. Our observed hit rate is conditional on having detected edge, which correlates with model bias. Hit rate is not an unbiased estimator of model accuracy.

One season is not calibration. 50 trades in June tells you about summer. Almost nothing about winter or market regime changes. Stay humble across seasons.
12 · What comes next
Historical base rates (ERA5). Open-Meteo's historical API covers 80+ years of daily temperatures. For each market the app will fetch the empirical frequency for that city, month, and bucket — a third signal implementing the Type D opportunity. Will appear as Historical% in the bucket table.

Automatic bias compensation. Once city bias is confirmed at 30+ trades, Kelly will automatically reduce stakes on the biased direction rather than requiring manual application.

Regime-conditional calibration. Separate reliability diagrams for anticyclonic vs cyclonic conditions, short vs long lead time. Pooling all trades hides the structure that matters for betting decisions.

Bootstrap confidence intervals. Error bars on hit rate, edge, and Brier score. Every number currently presented as a point estimate should carry uncertainty bounds.
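
A sketch of how a percentile bootstrap interval for hit rate could be computed. This is not a description of the planned implementation: the function name and parameters are hypothetical, and a plain bootstrap treats trades as independent, which section 9 warns they are not, so the interval is likely too narrow.

```python
import random

def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(outcomes)
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high

# Made-up record: 12 wins and 8 losses, observed hit rate 60%
outcomes = [1] * 12 + [0] * 8
low, high = bootstrap_ci(outcomes)
```

With only 20 trades the interval is wide, which is the point of showing error bars alongside every headline number.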