AGI is plausible within 10 years, BUT alignment and safety must be solved BEFORE reaching AGI — not concurrently, and not after.
Predictor: Demis Hassabis
Prediction text
AGI is plausible within 10 years, BUT alignment and safety must be solved BEFORE reaching AGI: not concurrently, and not after.
Key catalyst: Interpretability breakthroughs; alignment-vs-capability progress ratio
Watch events: Ratio of alignment research funding to capability research funding; pre-release safety evaluations at frontier labs.
Resolution evidence
Alignment research (Constitutional AI, RLHF, interpretability, SAE probing) is scaling with capabilities, but a meaningful gap remains.
Calibration plot (stated vs observed)
Evidence about this node from Demis Hassabis is multiplied by κ in /api/intake. A lower κ means less weight; κ is floored at 0.10 (effectively silenced) and capped at 1.00 (full weight).
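The weighting rule can be sketched as below. This is a minimal illustration, not the /api/intake implementation: it assumes each piece of evidence is a single numeric value (for example a log-odds update), and the names `clamp_kappa` and `weight_evidence` are hypothetical. Only the floor (0.10) and cap (1.00) come from the text above.

```python
KAPPA_FLOOR = 0.10  # from the text: effectively silenced
KAPPA_CAP = 1.00    # from the text: full weight


def clamp_kappa(kappa: float) -> float:
    """Clamp a calibration weight into [KAPPA_FLOOR, KAPPA_CAP]."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weight_evidence(evidence: float, kappa: float) -> float:
    """Scale one numeric piece of evidence by the clamped kappa (hypothetical helper)."""
    return clamp_kappa(kappa) * evidence


if __name__ == "__main__":
    # Raw kappa values below the floor or above the cap are pulled back into range.
    for raw in (0.02, 0.55, 1.40):
        print(raw, clamp_kappa(raw), weight_evidence(2.0, raw))
```

Under these assumptions, a predictor with a raw κ of 0.02 still contributes a tenth of the stated evidence instead of vanishing entirely, and no predictor can contribute more than the evidence as stated.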
Reference class: agi_breakthrough_5y
Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)
Tetlock-style outside view: at TRF=1 (just predicted), the outside view dominates (w_in=0.3); at TRF=0 (deadline), the inside view takes full weight (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
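A minimal sketch of such a blend follows. The source gives only the two endpoints (w_in=0.3 at TRF=1 and w_in=1.0 at TRF=0), so three things here are assumptions: TRF is read as the fraction of the prediction window still remaining, w_in is interpolated linearly between the endpoints, and the blend is a plain weighted average of the inside probability and the base rate. The function names are hypothetical.

```python
W_IN_AT_START = 0.3     # stated in the text for TRF = 1
W_IN_AT_DEADLINE = 1.0  # stated in the text for TRF = 0


def inside_weight(trf: float) -> float:
    """Inside-view weight for a given TRF, assuming linear interpolation."""
    if not 0.0 <= trf <= 1.0:
        raise ValueError("trf must be in [0, 1]")
    return W_IN_AT_DEADLINE - (W_IN_AT_DEADLINE - W_IN_AT_START) * trf


def blend(p_inside: float, base_rate: float, trf: float) -> float:
    """Weighted average of inside view and base rate (assumed blend form)."""
    w_in = inside_weight(trf)
    return w_in * p_inside + (1.0 - w_in) * base_rate


if __name__ == "__main__":
    # An inside view of 0.80 against a base rate of 0.10, early and late in the window.
    for trf in (1.0, 0.5, 0.0):
        print(trf, round(inside_weight(trf), 3), round(blend(0.80, 0.10, trf), 3))
```

With these assumptions, a confident inside view is pulled strongly toward the base rate right after the prediction is made and is left untouched at the deadline.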
Probability over time
Milestone chain
- 2026-06-30 → 2028-12-31 (pending): First major frontier lab adopts pause/safety case requirement before next training run
  How: OpenAI, Anthropic, Google DeepMind, or xAI publicly commits to (and verifies via 3rd party) a 'safety case' deliverable approved by external auditor before scaling beyond next-gen frontier model
  Source: Anthropic RSP; OpenAI Preparedness Framework
  Confidence: 50%
- 2027-12-08 (pending): Q1 window check-in (25%)
- 2026-01-01 → 2029-12-31 (pending): Documented 'misaligned' frontier-model incident triggers government-mandated rollback or shutdown
  How: EU AI Act enforcement, US EO action, or analogous regulator orders a frontier model to be paused/withdrawn following documented misalignment behavior (deception, blackmail, autonomous resource acquisition)
  Source: Anthropic Agentic Misalignment paper 2025; EU AI Act Aug 2026
  Confidence: 45%
- 2027-01-01 → 2029-12-31 (pending): Mechanistic interpretability scales to reliably explain >=80% of frontier-model behavior on safety-relevant evals
  How: Peer-reviewed paper from Anthropic/DeepMind/Apollo demonstrating circuit-level mechanistic explanation accounting for >=80% of variance in frontier model decisions on Anthropic's deception/sandbagging eval suite
  Source: Anthropic interpretability roadmap (target 2027)
  Confidence: 35%
- 2027-06-30 → 2030-12-31 (pending): International AI safety treaty or compute governance framework signed by US/EU/China
  How: Binding multilateral agreement (G7+, UN-AI, or bilateral US-China) covering compute thresholds, training disclosure, and incident reporting; ratified by signatories
  Source: AI Seoul/Bletchley Process; US-China AI dialogue
  Confidence: 25%
- 2029-11-14 (pending): Q2 window check-in (50%)
- 2028-01-01 → 2031-12-31 (pending): Capability vs. alignment progress widely declared mismatched in major-lab safety reports
  How: At least 2 of OpenAI/Anthropic/DeepMind annual safety reports explicitly state alignment progress is failing to keep pace with capability scaling, with quantitative gap metrics published
  Source: FLI 'No alignment or control strategy' 2025 indicator
  Confidence: 55%
- 2031-10-21 (pending): Q3 window check-in (75%)
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sampler. The run returns the predictions whose marginals shift most and takes about 30 s; it is useful for stress-testing "if X resolves, what else moves?"
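The clamp-and-resample idea can be illustrated on a toy network. The page does not say what model sits behind the button, so this sketch assumes a small pairwise binary model (each node is 0 or 1, with a log-odds bias and symmetric coupling weights). The node names, weights, and the functions `gibbs_marginals` and `biggest_shifts` are all hypothetical.

```python
import math
import random


def gibbs_marginals(biases, couplings, clamp=None, sweeps=4000, burn_in=500, seed=0):
    """Estimate P(node = 1) for each node by Gibbs sampling.

    biases:    {node: log-odds bias}
    couplings: {(a, b): weight}, treated as symmetric
    clamp:     {node: 0 or 1}, values held fixed for the whole run
    """
    rng = random.Random(seed)
    clamp = clamp or {}
    neighbors = {n: {} for n in biases}
    for (a, b), w in couplings.items():
        neighbors[a][b] = w
        neighbors[b][a] = w
    # Clamped nodes start at their fixed value, the rest at a random value.
    state = {n: clamp[n] if n in clamp else int(rng.random() < 0.5) for n in biases}
    counts = {n: 0 for n in biases}
    for sweep in range(sweeps):
        for n in biases:
            if n in clamp:
                continue  # never resample a clamped node
            logit = biases[n] + sum(w * state[m] for m, w in neighbors[n].items())
            p_one = 1.0 / (1.0 + math.exp(-logit))
            state[n] = 1 if rng.random() < p_one else 0
        if sweep >= burn_in:
            for n in biases:
                counts[n] += state[n]
    kept = sweeps - burn_in
    return {n: counts[n] / kept for n in biases}


def biggest_shifts(biases, couplings, node, value):
    """Rank the other nodes by how far their marginal moves when `node` is clamped."""
    base = gibbs_marginals(biases, couplings)
    clamped = gibbs_marginals(biases, couplings, clamp={node: value})
    shifts = {n: clamped[n] - base[n] for n in biases if n != node}
    return sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True)


if __name__ == "__main__":
    # Toy network: A is strongly tied to B and only weakly tied to C.
    biases = {"A": -1.0, "B": -2.0, "C": 0.0}
    couplings = {("A", "B"): 4.0, ("A", "C"): 0.3}
    for name, shift in biggest_shifts(biases, couplings, "A", 1):
        print(name, round(shift, 3))
```

Comparing a clamped run against an unclamped baseline is what makes the shift visible: a node strongly coupled to the clamped one moves a lot, while a weakly coupled node barely moves.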
Evidence chain
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Prerequisites (6)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| correlate | S_AGI_MID_2029 | AGI mid: Kurzweil 2029 path | agi_general_capability | — |
| correlate | S_ASI_MID_2034 | ASI mid: Schmidt 'ASI in 6 years' | asi_recursive_self_improvement | — |
| correlate | S_AGI_SLOW_2031 | AGI slow: Schmidt/Hassabis 5-10 year path | agi_general_capability | — |
| correlate | S_AGI_WINTER_2036PLUS | AGI delayed: capability plateau or AI winter | agi_general_capability | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Linked documents (5)
| Sim | Source | Title | Market prob | Polarity | Reviewed | Published |
|---|---|---|---|---|---|---|
| 0.725 | manifold | Will ASI be achieved less than a year after continual learning? | 31% | mentions | pending | 2026-05-28 |
| 0.688 | manifold | If ASI is achieved before Manifest 2027, will Manifest 2027 occur? | 77% | mentions | pending | 2026-05-06 |
| 0.642 | manifold | When will the Jacobian challenge be solved in Lean? | — | mentions | pending | 2026-05-29 |
| 0.629 | manifold | To what extent will Developmental Cognitive Interpretability be successful [Read Updated Description] | — | mentions | pending | 2026-05-29 |
| 0.564 | manifold | Will my resin casted pepperoni mold within one year from today? | 81% | mentions | pending | 2026-05-26 |
Raw metadata
{
  "nia": false,
  "qty": "AGI conditional on alignment",
  "mode": "THESIS",
  "role": "Cited-CEO",
  "caveats": "Conditional — tied to alignment progress tracking AGI progress.",
  "context": "Hassabis safety-sequencing thesis; overhyped timelines risk damaging public trust in AI research.",
  "to_year": 2036,
  "conv_cues": "plausible; conditional framing",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "next 10 years",
  "conv_level": "MEDIUM",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "First major frontier lab adopts pause/safety case requirement before next training run",
      "source": "Anthropic RSP; OpenAI Preparedness Framework",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -9,
      "source_id": null,
      "confidence": 0.5,
      "source_url": "https://www.anthropic.com/news/anthropics-responsible-scaling-policy",
      "expected_date": "2027-09-30",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2028-12-31",
        "from": "2026-06-30"
      },
      "measurement_criterion": "OpenAI, Anthropic, Google DeepMind, or xAI publicly commits to (and verifies via 3rd party) a 'safety case' deliverable approved by external auditor before scaling beyond next-gen frontier model"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -8,
      "source_id": null,
      "expected_date": "2027-12-08",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Documented 'misaligned' frontier-model incident triggers government-mandated rollback or shutdown",
      "source": "Anthropic Agentic Misalignment paper 2025; EU AI Act Aug 2026",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.45,
      "expected_date": "2028-01-01",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2029-12-31",
        "from": "2026-01-01"
      },
      "measurement_criterion": "EU AI Act enforcement, US EO action, or analogous regulator orders a frontier model to be paused/withdrawn following documented misalignment behavior (deception, blackmail, autonomous resource acquisition)"
    },
    {
      "kind": "llm_pre_event",
      "label": "Mechanistic interpretability scales to reliably explain >=80% of frontier-model behavior on safety-relevant evals",
      "source": "Anthropic interpretability roadmap (target 2027)",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -6,
      "source_id": null,
      "confidence": 0.35,
      "expected_date": "2028-07-01",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2029-12-31",
        "from": "2027-01-01"
      },
      "measurement_criterion": "Peer-reviewed paper from Anthropic/DeepMind/Apollo demonstrating circuit-level mechanistic explanation accounting for >=80% of variance in frontier model decisions on Anthropic's deception/sandbagging eval suite"
    },
    {
      "kind": "scenario_signal",
      "label": "Scenario fires: AGI mid: Kurzweil 2029 path",
      "status": "pending",
      "weight": 0.7,
      "ordinal": -5,
      "source_id": "S_AGI_MID_2029",
      "expected_date": "2029-03-31",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "International AI safety treaty or compute governance framework signed by US/EU/China",
      "source": "AI Seoul/Bletchley Process; US-China AI dialogue",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
      "confidence": 0.25,
      "expected_date": "2029-03-31",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2030-12-31",
        "from": "2027-06-30"
      },
      "measurement_criterion": "Binding multilateral agreement (G7+, UN-AI, or bilateral US-China) covering c
... (truncated)