INF_072 · prediction · AI · AGI-scaling-50-50

There is approximately a 50/50 chance that simply scaling existing methods (transformer architecture + more data + more compute) will be enough to reach AGI, although current systems are "nowhere near" human-level AGI.

Predictor: Demis Hassabis

Prior probability

50.0%

Current probability

40.4%

evolves via intake + LBP

Conviction

3/5

Signal quality

Resolution

pending

Window

2030-01-01 – 2042-09-30

Edges in / out

7 / 0

Tickers exposed

Prediction text

There is approximately a 50/50 chance that simply scaling existing methods (transformer architecture + more data + more compute) will be enough to reach AGI, although current systems are "nowhere near" human-level AGI.

Key catalyst: Next DeepMind model-release capability evaluation

Watch events: GDPval-scale benchmarks; novel-architecture release from DeepMind

Resolution evidence

Status: pending

Hassabis's position has been consistent across interviews from 2024 to 2026. The 2024 Nobel Prize for AlphaFold reinforces his thesis that further scientific breakthroughs are required.

Predictor: Demis Hassabis

κ + Brier as of 2026-05-22


κ (discount)

0.583

Brier

0.0064

excellent

Hits / Misses

1 / 0

of 1 resolved

Hit rate

100.0%
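As a point of reference, the Brier score is the mean squared difference between stated probabilities and 0/1 outcomes. The sketch below is a minimal illustration, not the site's calibration code. The stated probability of 0.92 is an inference: with a single resolved hit, it is the value that yields a Brier score of 0.0064. It is not shown on this page.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and binary outcomes.

    `forecasts` is a list of (stated_probability, outcome) pairs,
    where outcome is 1 for a hit and 0 for a miss.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)


# One resolved prediction that hit; 0.92 is an inferred, hypothetical value.
resolved = [(0.92, 1)]
score = brier_score(resolved)
print(f"Brier = {score:.4f}")
```

Lower is better: a perfect forecaster scores 0, and always answering 50% scores 0.25.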

Calibration plot (stated vs observed)

Evidence about this node from Demis Hassabis is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
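The floor and cap described above amount to clamping κ before it scales the evidence. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and "evidence" is modeled as a single scalar weight, which may not match what /api/intake actually multiplies.

```python
KAPPA_FLOOR = 0.10  # effectively silenced
KAPPA_CAP = 1.00    # full weight


def clamp_kappa(kappa):
    """Restrict the discount to the [0.10, 1.00] range stated on the page."""
    return min(KAPPA_CAP, max(KAPPA_FLOOR, kappa))


def discounted_weight(raw_weight, kappa):
    """Scale a scalar evidence weight by the clamped discount (assumed form)."""
    return raw_weight * clamp_kappa(kappa)


# With the current discount for this predictor:
print(discounted_weight(1.0, 0.583))
```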

Reference class: agi_breakthrough_5y

Linked via embedding similarity 0.662


Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)

Base rate

20.0%

1/5 historical

Inside weight

—

Outside weight

—

no pull

inside 40.4% → blend 40.4% (Δ 0.0pp)

Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
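The blend can be sketched as a weighted average in log-odds space. This form is an assumption, not a documented formula. It is chosen because it matches the evidence-chain entry logged on this page (inside=0.500, blend=0.275, w_in=0.30) with the 20.0% base rate, whereas a linear average of probabilities would give 0.29. The linear interpolation of w_in between the two stated endpoints is also an assumption, and the function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def inside_weight(trf, w_min=0.3):
    """Assumed linear schedule: w_in = 1.0 at TRF=0, w_min at TRF=1."""
    return 1.0 - (1.0 - w_min) * trf


def blend(inside, base_rate, w_in):
    """Weighted average of inside and outside views in log-odds space."""
    return sigmoid(w_in * logit(inside) + (1.0 - w_in) * logit(base_rate))


# Inputs taken from the logged reference_class_assigned entry.
print(round(blend(0.500, 0.20, 0.30), 3))
```

At w_in=1.0 the blend returns the inside view unchanged, which is consistent with the "no pull" display (inside 40.4%, blend 40.4%).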

Probability over time

6 prob_history rows

[Chart: probability over time. Legend: intake v2 · milestone miss sweep · lbp propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 40.4%]

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.

Leading chain: 1 fired ✓ · 9 pending

2025-12-05 · hit · Hassabis publicly reaffirms 50/50 odds on transformer-scaling-to-AGI thesis at end-of-2025/2026 forums
How: Hassabis Axios/CNBC/podcast statement confirms 50% AGI-by-2030 stance and ongoing belief one-or-two big ideas remain
Source: deep_research_enriched · conf 95%
2026-05-01 → 2026-12-31 · pending · Next major DeepMind Gemini release (Gemini 3.x or successor) demonstrates measurable agentic / tool-use gains
How: DeepMind blog announcement + independent benchmark (METR, GPQA, SWE-Bench) showing 20%+ improvement on long-horizon tasks vs Gemini 2.x
Source: deep_research_enriched · conf 80%
2027-09-30 · pending · Scenario fires: AGI fast: drop-in remote worker by 2027-09
2027-01-01 → 2030-12-31 · pending · Frontier lab demonstrates 'drop-in remote worker' for white-collar task (METR end-to-end >80% pass rate)
How: Public benchmark (METR, OSWorld, GAIA) shows AI agent completing multi-day knowledge-work tasks at human-equivalent quality
Source: deep_research_enriched · conf 55%
2028-01-01 → 2032-12-31 · pending · AGI declaration (or industry consensus) confirms transformer-scaling alone insufficient; new architecture required
How: Top-3 frontier lab publishes paper / public statement saying scaling alone hit ceiling and new architecture (world-models, neuro-symbolic, etc.) required
Source: deep_research_enriched · conf 45%
2032-01-17 · pending · Q1 window check-in (25%)
2029-01-01 → 2038-03-05 · pending · AGI achieved or unambiguously on track per Hassabis 50% bet (community / METR / Survey-of-Researchers consensus)
How: AI Impacts survey or equivalent expert poll shows ≥50% probability AGI achieved under transformer-scaling-only path
Source: deep_research_enriched · conf 50%
2034-02-01 · pending · Q2 window check-in (50%)
2036-02-17 · pending · Q3 window check-in (75%)
2036-12-31 · pending · Scenario fires: AGI delayed: capability plateau or AI winter
2038-03-05 · pending · There is approximately a 50/50 chance that simply scaling existing methodologies (transformer architecture + more data + more compute) will
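The quartile check-in dates in the chain are consistent with truncating 25%, 50%, and 75% of the day count between 2030-01-01 and the final milestone date 2038-03-05 (not the 2042-09-30 window end shown at the top of the page). The sketch below reproduces them under that assumption. The function name is hypothetical, and this is not necessarily how scripts/ops/derive_milestones.py computes them.

```python
from datetime import date, timedelta


def window_checkpoints(start, end, fractions=(0.25, 0.50, 0.75)):
    """Dates at the given fractions of the window, truncated to whole days."""
    total_days = (end - start).days
    return [start + timedelta(days=int(total_days * f)) for f in fractions]


checkpoints = window_checkpoints(date(2030, 1, 1), date(2038, 3, 5))
for label, day in zip(("Q1", "Q2", "Q3"), checkpoints):
    print(label, day.isoformat())
```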

No downstream cascades — this prediction is a leaf in the dependency graph.

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample. Surfaces the predictions whose marginals shift most under that assumption.

(live posterior: 40%)

Each run takes about 30 s and is suited to stress-testing "if X resolves, what else moves?"
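For a single edge, the effect of a clamp can be computed exactly with Bayes' rule, which is the value a Gibbs run restricted to that edge would approximate. The sketch below uses the TK03 values listed under "Top incoming (parents)" on this page (their prob 10.0%, P(c|s=T)=0.050, P(c|s=F)=0.500). The function name is hypothetical, and the site's sampler over the full network is not shown here.

```python
def parent_posterior(p_parent, p_c_given_true, p_c_given_false, clamp_true):
    """P(parent = T | this node clamped TRUE or FALSE), single edge only."""
    if clamp_true:
        like_true, like_false = p_c_given_true, p_c_given_false
    else:
        like_true, like_false = 1.0 - p_c_given_true, 1.0 - p_c_given_false
    numerator = p_parent * like_true
    evidence = numerator + (1.0 - p_parent) * like_false
    return numerator / evidence


# TK03 edge: how its marginal would shift under each clamp.
if_true = parent_posterior(0.10, 0.050, 0.500, clamp_true=True)
if_false = parent_posterior(0.10, 0.050, 0.500, clamp_true=False)
print(f"clamp TRUE: {if_true:.3f}, clamp FALSE: {if_false:.3f}")
```

Under these values, clamping this prediction TRUE pushes the killer's probability well below its 10.0% prior, and clamping it FALSE pushes it above.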

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

LBP · 2026-05-10T02:00:02Z · 40.4% · +1.8pp

Network propagation: 38.6% → 40.4%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

LBP · 2026-05-03T02:00:01Z · 38.6% · +3.4pp

Network propagation: 35.2% → 38.6%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

LBP · 2026-04-30T16:39:51Z · 35.2% · +7.8pp

Network propagation: 27.5% → 35.2%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

legacy v1 · 2026-04-30T16:13:50Z · 27.5% · -7.8pp

reference_class_assigned bayesian_v2 inside=0.500 blend=0.275 w_in=0.30 agi_breakthrough_5y

LBP · 2026-04-30T02:18:57Z · 35.2% · +7.8pp

Network propagation: 27.5% → 35.2%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

legacy v1 · 2026-04-30T01:56:50Z · 27.5% · -22.5pp

reference_class_assigned bayesian_v2 inside=0.500 blend=0.275 w_in=0.30 agi_breakthrough_5y

Network propagation neighbors

Top edges sorted by latest LBP cross-impact


Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.500	+0.051
killer	TK02 AI Compute Supply Shock (TSMC/Taiwan Disruption)	12.0%	0.050	0.500	+0.042
killer	TK01 AGI Capability Plateau (2026-27 Training Stall)	15.0%	0.050	0.500	+0.028
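The Δ implied column is consistent with marginalizing each edge by the law of total probability and subtracting this node's current probability of 40.4%. That reading is inferred from the numbers in the table, not from documentation, and the names below are hypothetical.

```python
CURRENT_PROB = 0.404  # this node's current probability


def implied_marginal(p_parent, p_c_given_true, p_c_given_false):
    """P(c) implied by one parent: P(s)P(c|s=T) + (1 - P(s))P(c|s=F)."""
    return p_parent * p_c_given_true + (1.0 - p_parent) * p_c_given_false


# Parent probabilities from the table; all three edges share the same CPT.
parents = {"TK03": 0.10, "TK02": 0.12, "TK01": 0.15}
deltas = {
    name: implied_marginal(p, 0.050, 0.500) - CURRENT_PROB
    for name, p in parents.items()
}
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")
```

The results agree with the listed values of +0.051, +0.042, and +0.028 to within display rounding.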

Top outgoing (children)

Predictions THIS node influences

No outgoing edges.

Ticker exposure

19 ticker(s) linked

Beneficiaries (13)

SITM VRT ARGAN FLNC FSLR HTHIY HUBB PWR ETN SBGSY SMNEY GEV CMI

Prerequisites (7)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
correlate	S_ASI_SLOW_2040PLUS	ASI slow: post-2040 / soft takeoff	asi_recursive_self_improvement	—
correlate	S_AGI_MID_2029	AGI mid: Kurzweil 2029 path	agi_general_capability	—
correlate	S_AGI_FAST_2027	AGI fast: drop-in remote worker by 2027-09	agi_general_capability	—
correlate	S_AGI_WINTER_2036PLUS	AGI delayed: capability plateau or AI winter	agi_general_capability	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK02	AI Compute Supply Shock (TSMC/Taiwan Disruption)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (0)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
No dependents

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.754	arxiv	Pathways to AGI	—	mentions	pending	2026-05-07
0.664	arxiv	On the Optimizer Dependence of Neural Scaling Laws	—	mentions	pending	2026-05-28
0.662	github_release	huggingface/transformers v5.10.1	—	mentions	pending	2026-06-03
0.659	arxiv	GIM: Evaluating models via tasks that integrate multiple cognitive domains	—	mentions	pending	2026-05-18
0.655	arxiv	On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods	—	mentions	pending	2026-05-13
0.653	arxiv	InfoFlow: A Framework for Multi-Layer Transformer Analysis	—	mentions	pending	2026-05-18
0.653	arxiv	The Expressive Power of Low Precision Softmax Transformers with (Summarized) Chain-of-Thought	—	mentions	pending	2026-05-18
0.649	github_release	google-deepmind/alphafold v2.2.0	—	mentions	pending	2022-03-10
0.648	arxiv	Kernel Renormalization in Bayesian Deep Neural Networks: the Equivalent Wishart Ansatz in the Proportional Regime	—	mentions	pending	2026-05-28
0.648	arxiv	The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It	—	mentions	pending	2026-05-05

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "50%",
  "mode": "PROBABILITY",
  "role": "Cited-CEO",
  "context": "Hassabis position moderates both the aggressive (Altman/Musk) and conservative (Kurzweil 2029) camps. Couples with CMQ_010 (AGI requires AlphaFold-class breakthroughs).",
  "to_year": 2042,
  "conv_cues": "50/50 framing; CEO FIRST_PERSON",
  "direction": "HAPPEN",
  "from_year": 2030,
  "timeframe": "2030-2042",
  "conv_level": "MEDIUM",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "Hassabis publicly reaffirms 50/50 odds on transformer-scaling-to-AGI thesis at end-of-2025/2026 forums",
      "source": "deep_research_enriched",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -10,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.axios.com/2025/12/05/ai-deepmind-gemini-agi",
      "expected_date": "2025-12-05",
      "observed_date": "2025-12-05",
      "research_origin": "deep_research",
      "measurement_criterion": "Hassabis Axios/CNBC/podcast statement confirms 50% AGI-by-2030 stance and ongoing belief one-or-two big ideas remain"
    },
    {
      "kind": "llm_pre_event",
      "label": "Next major DeepMind Gemini release (Gemini 3.x or successor) demonstrates measurable agentic / tool-use gains",
      "source": "deep_research_enriched",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -9,
      "source_id": null,
      "confidence": 0.8,
      "source_url": "https://www.metaintro.com/blog/ai-scaling-debate",
      "expected_date": "2026-08-31",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2026-12-31",
        "from": "2026-05-01"
      },
      "measurement_criterion": "DeepMind blog announcement + independent benchmark (METR, GPQA, SWE-Bench) showing 20%+ improvement on long-horizon tasks vs Gemini 2.x"
    },
    {
      "kind": "scenario_signal",
      "label": "Scenario fires: AGI fast: drop-in remote worker by 2027-09",
      "status": "pending",
      "weight": 0.7,
      "ordinal": -8,
      "source_id": "S_AGI_FAST_2027",
      "expected_date": "2027-09-30",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Frontier lab demonstrates 'drop-in remote worker' for white-collar task (METR end-to-end >80% pass rate)",
      "source": "deep_research_enriched",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.55,
      "source_url": "https://forum.effectivealtruism.org/posts/YvFjpAKkJNErkiFTN/google-deepmind-ceo-demis-hassabis-on-what-s-still-needed",
      "expected_date": "2028-12-31",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2030-12-31",
        "from": "2027-01-01"
      },
      "measurement_criterion": "Public benchmark (METR, OSWorld, GAIA) shows AI agent completing multi-day knowledge-work tasks at human-equivalent quality"
    },
    {
      "kind": "llm_post_event",
      "label": "AGI declaration (or industry consensus) confirms transformer-scaling alone insufficient — new architecture required",
      "source": "deep_research_enriched",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -6,
      "source_id": null,
      "confidence": 0.45,
      "source_url": "https://kantrowitz.medium.com/demis-hassabis-and-sergey-brin-on-ai-scaling-agi-timeline-robotics-simulation-theory-ef3f7a740eeb",
      "expected_date": "2030-07-02",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2032-12-31",
        "from": "2028-01-01"
      },
      "measurement_criterion": "Top-3 frontier lab publishes paper / public statement saying scaling alone hit ceiling and new architecture (world-models, neuro-symbolic, etc.) required"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -5,
      "source_id":
... (truncated)