Math field is 'cooked' — AI will solve research-level mathematics (first open hard math problem imminently)
Predictor: Alex Wissner-Gross · ep#238 "Meta Buys Moltbook, GPT 5.4, and Fruitfly Brain Upload | Moonshots Live at The Abundance Summit 238"
Prediction text
Math field is 'cooked' — AI will solve research-level mathematics (first open hard math problem imminently) | math is cooked. We're we're seeing I think 38% capability... And there are even rumors even in the past 24 to 48 hours that the next tier up the so-called open problems benchmark that 5.4 is reportedly rumored to be on the verge of solving the first open hard math problem.
Verbatim quote
math is cooked. We're we're seeing I think 38% capability... And there are even rumors even in the past 24 to 48 hours that the next tier up the so-called open problems benchmark that 5.4 is reportedly rumored to be on the verge of solving the first open hard math problem.
Resolution evidence
GPT-5.4 claimed 38% on Frontier Math Tier 4; DeepMind AlphaProof trajectory to IMO gold 2024-2025. Math not fully 'cooked' but progressing rapidly.
Predictor: Alex Wissner-Gross
Calibration plot (stated vs observed)
Evidence about this node from Alex Wissner-Gross is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
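A minimal sketch of the κ weighting described above. The floor (0.10) and cap (1.00) come from the text; everything else is an assumption, since the internals of /api/intake are not shown here. In particular, the sketch assumes evidence arrives as a log-likelihood ratio and that "multiplied by κ" means scaling that ratio before adding it to the prior log-odds. The names `clamp_kappa` and `weighted_update` are hypothetical.

```python
import math

KAPPA_FLOOR = 0.10  # from the page: effectively silenced
KAPPA_CAP = 1.00    # from the page: full weight


def clamp_kappa(kappa: float) -> float:
    """Keep kappa inside the [floor, cap] range stated on the page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weighted_update(prior: float, log_lr: float, kappa: float) -> float:
    """Update a probability in log-odds space with kappa-scaled evidence.

    Assumption: evidence is a log-likelihood ratio. The real endpoint
    may represent and combine evidence differently.
    """
    logit = math.log(prior / (1.0 - prior))
    logit += clamp_kappa(kappa) * log_lr
    return 1.0 / (1.0 + math.exp(-logit))


# Same evidence, two predictors with different calibration weights.
full = weighted_update(0.5, math.log(3), kappa=1.0)
muted = weighted_update(0.5, math.log(3), kappa=0.02)  # clamped up to 0.10
```

Under this assumption, a low κ moves the posterior only slightly away from the prior, which matches the "less weight" behaviour the page describes.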
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
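For illustration only, here is one way the missing-reference-class case could be handled. The page does not say how outside-view blending works when a reference class is linked, so the linear blend, the `weight` parameter, and the function name `blend_outside_view` are all assumptions. The only behaviour taken from the text is the fallback: with no reference class, the inside-view posterior is used unchanged.

```python
from typing import Optional


def blend_outside_view(inside: float,
                       base_rate: Optional[float],
                       weight: float = 0.3) -> float:
    """Blend an inside-view posterior with a reference-class base rate.

    Assumption: a simple linear blend with a hypothetical weight.
    If no reference class is linked (base_rate is None), return the
    inside-view posterior as-is, as the page describes for this node.
    """
    if base_rate is None:
        return inside
    return (1.0 - weight) * inside + weight * base_rate


# This node: no reference class, so the value passes through untouched.
unblended = blend_outside_view(0.64456, None)
```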
Probability over time
Milestone chain
- 2026-07-15 → 2026-08-31 · pending · AI achieves IMO Gold (top-30 score) on 2026 problems
  - How: AI lab announces system achieving IMO Gold Medal score (typically 25/42+ on 2026 problems), with results published by ImoGrandChallenge or peer evaluator
  - Source: DeepMind AlphaProof, OpenAI math results announcements
  - Confidence: 75%
  - Notes: DeepMind already achieved Silver in 2024; Gold by 2026 is the natural progression. ImoGrandChallenge.org tracks this.
- 2026-06-01 → 2026-12-31 · pending · FrontierMath benchmark passes 50% by frontier model
  - How: Top model on FrontierMath benchmark crosses 50% accuracy (current best as of late 2025 was ~25-35%)
  - Source: https://epoch.ai/frontiermath — Epoch AI's FrontierMath leaderboard. Anthropic/OpenAI/DeepMind blog announcements.
  - Confidence: 65%
  - Notes: FrontierMath is the canonical 'research-level math' benchmark — 200 problems by working mathematicians, designed to be hard.
- 2026-06-01 → 2027-03-31 · pending · First open math problem solved by AI publicly announced
  - How: Frontier AI lab announces solution to a previously-open math problem (Erdős, Millennium, or peer-reviewed open conjecture) with verification by mathematicians
  - Source: Lab blog posts, arXiv preprints with mathematician co-authors, Quanta Magazine coverage
  - Confidence: 45%
  - Notes: Wissner-Gross referenced 'rumors' of GPT-5.4 solving an open problem in late 2025. If true, this would already be a HIT.
- 2026-09-01 → 2027-12-31 · pending · Mathematician community publishes paper acknowledging AI as research collaborator
  - How: Peer-reviewed math paper lists AI system as essential research collaborator (not just tool) with acknowledgment from working mathematicians (e.g., Terence Tao writeup style)
  - Source: arXiv math papers, Annals of Mathematics, Tao's blog (terrytao.wordpress.com)
  - Confidence: 60%
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
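The clamp-and-sample idea can be sketched on a toy network. This is not the site's implementation: the three-node chain a → b → c, the prior on `a`, and the 0.70 table entry are made-up illustration values (only the 0.72 / 0.05 pair echoes the conditional tables shown on this page), and `gibbs_clamped` is a hypothetical name. The sketch fixes `c` to a resolved value, Gibbs-samples the two free nodes from their full conditionals, and returns their estimated marginals.

```python
import random


def prob(p_true, value):
    """P(X = value) for a binary X with P(X = True) = p_true."""
    return p_true if value else 1.0 - p_true


def draw(w_true, w_false, rng):
    """Sample True with probability proportional to w_true."""
    return rng.random() < w_true / (w_true + w_false)


def gibbs_clamped(p_a, p_b_given_a, p_c_given_b, c_value,
                  n_samples=20000, burn_in=1000, seed=0):
    """Estimate P(a | c) and P(b | c) on the chain a -> b -> c."""
    rng = random.Random(seed)
    a, b = True, True  # arbitrary starting state
    hits_a = hits_b = 0
    for step in range(burn_in + n_samples):
        # a | b is proportional to P(a) * P(b | a)
        a = draw(p_a * prob(p_b_given_a[True], b),
                 (1.0 - p_a) * prob(p_b_given_a[False], b), rng)
        # b | a, c is proportional to P(b | a) * P(c | b); c stays clamped
        b = draw(p_b_given_a[a] * prob(p_c_given_b[True], c_value),
                 (1.0 - p_b_given_a[a]) * prob(p_c_given_b[False], c_value), rng)
        if step >= burn_in:
            hits_a += a
            hits_b += b
    return hits_a / n_samples, hits_b / n_samples


cpt_b = {True: 0.72, False: 0.05}  # P(b=T | a)
cpt_c = {True: 0.70, False: 0.05}  # P(c=T | b), illustrative
post_a, post_b = gibbs_clamped(0.7, cpt_b, cpt_c, c_value=True)
```

With these illustrative numbers the prior marginal of `b` is about 0.52 and its exact posterior given `c = True` is about 0.94, so clamping the child shifts the upstream marginals sharply. Ranking nodes by the size of that shift is the kind of output the text describes.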
Evidence chain
Raw metadata
{
"source": "backfill_resolution_history.py",
"status": "partial",
"bayesian_v2": false,
"outcome_prob": 0.5,
"evidence_kind": "resolution_terminal",
"posterior_prob": 0.5,
"delta_to_outcome": -0.14456000000000002,
"inside_posterior": 0.64456,
"validation_notes": "GPT-5.4 claimed 38% on Frontier Math Tier 4; DeepMind AlphaProof trajectory to IMO gold 2024-2025. Math not fully 'cooked' but progressing rapidly.",
"validation_status": "hit",
"pre_resolution_prob": 0.64456,
"resolution_evidence": "GPT-5.4 claimed 38% on Frontier Math Tier 4; DeepMind AlphaProof trajectory to IMO gold 2024-2025. Math not fully 'cooked' but progressing rapidly.",
"does_not_update_current_prob": true
}
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | 234_012 Anthropic revenue will cross OpenAI revenue in middle of 202 — Peter Diamandis | 67.1% | 0.720 | 0.050 | -0.111 |
| prereq | SEM_042 2025 will be the definitive year that agentic systems finall — Kevin Weil | 73.8% | 0.720 | 0.050 | -0.069 |
| prereq | SEM_012 Nvidia quadrupled chip production output while only doubling — Jensen Huang | 75.0% | 0.720 | 0.050 | -0.059 |
| prereq | SEM_008 Training runs costing $10 billion for a single model will co — Dario Amodei | 76.9% | 0.720 | 0.050 | -0.047 |
| killer | TK03 AI Regulatory Moratorium (EU/US Capability Freeze) | 10.0% | 0.050 | 0.720 | +0.046 |
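To make the conditional columns concrete: if a single parent were the only influence on this node, the law of total probability would give the child's marginal as P(c) = P(c|s=T)·P(s) + P(c|s=F)·(1 − P(s)). The sketch below applies that formula to two rows of the table. It is an illustration under that single-parent assumption; the page does not state how the Δ implied column is computed, and this does not attempt to reproduce it. The function name `single_parent_marginal` is hypothetical.

```python
def single_parent_marginal(p_parent: float,
                           p_c_given_true: float,
                           p_c_given_false: float) -> float:
    """Marginal P(c) when one binary parent s is the only influence."""
    return p_c_given_true * p_parent + p_c_given_false * (1.0 - p_parent)


# prereq row 234_012: parent at 67.1%, P(c|s=T)=0.720, P(c|s=F)=0.050
via_prereq = single_parent_marginal(0.671, 0.720, 0.050)

# killer row TK03: parent at 10.0%, P(c|s=T)=0.050, P(c|s=F)=0.720
via_killer = single_parent_marginal(0.100, 0.050, 0.720)
```

Note the mirrored tables: a prereq edge raises the child when the parent is true, while a killer edge lowers it.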
Top outgoing (children)
Predictions THIS node influences
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | 239_004 xAI/Grok will catch up and exceed competitors on coding by m — Elon Musk | 40.2% | 0.500 | 0.050 | -0.084 |
| prereq | 232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis | 35.5% | 0.700 | 0.050 | +0.083 |
| prereq | 235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203 — Ray Kurzweil | 39.2% | 0.750 | 0.050 | +0.075 |
| prereq | 241_038 Chinese AI strategy is edge computing focused vs US AGI/ASI — Eric Schmidt | 43.3% | 0.600 | 0.050 | -0.055 |
| prereq | 241_043 ASI will arrive within 2 years to 5 years to this next decad — Peter Diamandis | 35.9% | 0.650 | 0.050 | +0.049 |
Ticker exposure
Beneficiaries (23)
Adverse (6)
Prerequisites (8)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | SEM_008 | Training runs costing $10 billion for a single model will commence sometime in 2025. | AI | — |
| prereq | 238_009 | Recursive self-improvement is already happening now (no longer three years out) | AI | — |
| prereq | 234_012 | Anthropic revenue will cross OpenAI revenue in middle of 2026 | Markets/Stocks | — |
| prereq | SEM_012 | Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) across engineering. | AI/Manufacturing | — |
| prereq | SEM_042 | 2025 will be the definitive year that agentic systems finally hit the mainstream. | AI/Agents | — |
| killer | TK14 | Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates) | — | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
Dependents (8)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | 235_030 | Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033. | Biotech/Longevity | — |
| prereq | 232_055 | We're exiting the industrial age permanently as recursive self-improvement unfolds. | AI | — |
| prereq | 241_043 | ASI will arrive within 2 years to 5 years to this next decade | AI | — |
| prereq | 231_013 | Math is cooked (will be solved), physics cooked, biology char broiled. | AI | — |
| prereq | 241_038 | Chinese AI strategy is edge computing focused vs US AGI/ASI centered | AI | — |
| prereq | 241_025 | Elon Musk predicts launch per hour cadence to populate satellite constellations | Space | — |
| prereq | CMQ_002 | By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention. | AI | — |
| prereq | 239_004 | xAI/Grok will catch up and exceed competitors on coding by mid-2026 | AI | — |
Expected milestones (1)
| Expected by | Description | Status |
|---|---|---|
| 2026-06-30 | [Capability 2026-06] OpenClaw agents by statistical chance. [238_020] Math field is 'cooked' — AI will solve research-level mathematics (first open ha | pending |
Validations (1)
| Observed at | Status | By | Notes |
|---|---|---|---|
| 2026-04-29 | hit | thesis_timeline_v1.0_import | GPT-5.4 claimed 38% on Frontier Math Tier 4; DeepMind AlphaProof trajectory to IMO gold 2024-2025. Math not fully 'cooked' but progressing rapidly. |
Linked documents (10)
| Sim | Source | Title | Market prob | Polarity | Reviewed | Published |
|---|---|---|---|---|---|---|
| 0.676 | arxiv | Benchmarks in Leipzig | — | mentions | pending | 2026-06-04 |
| 0.669 | manifold | Will research-level math become a sport akin to chess before 2035? | 12% | mentions | pending | 2026-05-09 |
| 0.623 | arxiv | Knowing What to Solve Before How: Preplan Empowered LLM Mathematical Reasoning | — | mentions | pending | 2026-05-28 |
| 0.592 | manifold | Will I solve an Erdos problem? | 6% | mentions | pending | 2026-04-27 |
| 0.580 | github_release | facebookresearch/hydra v1.0.3 | — | mentions | pending | 2020-09-23 |
| 0.570 | manifold | Will a particular friend of mine crack anyone while at math camp? | 8% | mentions | pending | 2026-04-24 |
| 0.565 | manifold | What will my mathcounts state score be? | — | mentions | pending | 2026-04-26 |
| 0.555 | polymarket | Boston Red Sox vs. New York Yankees | 43% | mentions | pending | 2026-05-30 |
| 0.554 | polymarket | Chicago Cubs vs. Chicago White Sox | 55% | mentions | pending | 2026-05-09 |
| 0.553 | polymarket | Boston Red Sox vs. New York Yankees | 49% | mentions | pending | 2026-05-31 |
Raw metadata
{
"nia": false,
"qty": "38% on Frontier Math Tier 4",
"url": "https://www.youtube.com/watch?v=d__HRChE2ZE",
"mode": "PREDICTION",
"role": "Host",
"caveats": "rumors",
"context": "now with GPT 5.4 turned up to maximum reasoning capability, we're seeing finally, and this was a prediction I think in our prediction episode, math is cooked. We're we're seeing I think 38% capability... And there are even rumors even in the past 24 to 48 hours that the next tier up the so-called open problems benchmark that 5.4 is reportedly rumored to be on the verge of solving the first open hard math problem.",
"to_year": 2026,
"verbatim": "math is cooked. We're we're seeing I think 38% capability... And there are even rumors even in the past 24 to 48 hours that the next tier up the so-called open problems benchmark that 5.4 is reportedly rumored to be on the verge of solving the first open hard math problem.",
"conv_cues": "math is cooked; on the verge",
"direction": "HAPPEN",
"from_year": 2026,
"timeframe": "Imminent",
"conv_level": "HIGH",
"milestones": [
{
"kind": "prereq",
"label": "Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) a",
"status": "hit",
"weight": 0.5,
"ordinal": -5,
"source_id": "SEM_012",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
},
{
"kind": "prereq",
"label": "Training runs costing $10 billion for a single model will commence sometime in 2025.",
"status": "hit",
"weight": 0.5,
"ordinal": -4,
"source_id": "SEM_008",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
},
{
"kind": "prereq",
"label": "Anthropic revenue will cross OpenAI revenue in middle of 2026",
"status": "hit",
"weight": 0.5,
"ordinal": -3,
"source_id": "234_012",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
},
{
"kind": "prereq",
"label": "2025 will be the definitive year that agentic systems finally hit the mainstream.",
"status": "hit",
"weight": 0.5,
"ordinal": -2,
"source_id": "SEM_042",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
},
{
"kind": "prereq",
"label": "Recursive self-improvement is already happening now (no longer three years out)",
"status": "hit",
"weight": 0.5,
"ordinal": -1,
"source_id": "238_009",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
},
{
"kind": "event",
"label": "Math field is 'cooked' — AI will solve research-level mathematics (first open hard math problem imminently)",
"status": "partial",
"weight": 1,
"ordinal": 0,
"source_id": "238_020",
"expected_date": "2026-05-01",
"observed_date": "2026-05-01"
},
{
"kind": "cascade",
"label": "xAI/Grok will catch up and exceed competitors on coding by mid-2026",
"status": "pending",
"weight": 0.5,
"ordinal": 1,
"source_id": "239_004",
"expected_date": "2026-06-20",
"observed_date": null
},
{
"kind": "llm_pre_event",
"label": "AI achieves IMO Gold (top-30 score) on 2026 problems",
"notes": "DeepMind already achieved Silver in 2024; Gold by 2026 is the natural progression. ImoGrandChallenge.org tracks this.",
"source": "DeepMind AlphaProof, OpenAI math results announcements",
"status": "pending",
"weight": 0.4,
"ordinal": 2,
"source_id": null,
"confidence": 0.75,
"expected_date": "2026-08-07",
"research_origin": "training",
"expected_date_range": {
"to": "2026-08-31",
"from": "2026-07-15"
},
"measurement_criterion": "AI lab announces system achieving IMO Gold Medal score (typically 25/42+ on 2026 problems), w
... (truncated)