CMQ_004 · prediction · AI · AGI-capability-roadmap

AGI-like models matching or outperforming human experts across most professional domains could arrive as early as 2026-2027.

Predictor: Dario Amodei

Prior probability: 60.0%
Current probability: 27.6% (evolves via intake + LBP)
Conviction: 5/5
Signal quality: A
Resolution: pending
Window: 2026-01-01 – 2027-11-30
Edges in / out: 11 / 5
Tickers exposed: 13

Prediction text

AGI-like models matching or outperforming human experts across most professional domains could arrive as early as 2026-2027.

Key catalyst: Claude expert-level benchmarks; Anthropic ARR trajectory

Watch events: Claude 4.x/5.x model releases; agentic task completion benchmarks; enterprise white-collar substitution metrics.

Resolution evidence

Status: pending

Anthropic crossed $30B ARR in April 2026, overtaking OpenAI; the Claude capability trajectory (Opus 4.6/4.7 family) supports an aggressive timeline.

Predictor: Dario Amodei

κ + Brier as of 2026-05-22
κ (discount): 0.688
Brier: 0.0363 (excellent)
Hits / Misses: 1 / 0 (of 3 resolved)
Hit rate: 33.3%
Calibration plot (stated vs observed)

Evidence about this node from Dario Amodei is multiplied by κ in /api/intake. A lower κ means less weight: κ floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
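As a rough illustration of the κ weighting, the sketch below clamps κ to the stated 0.10 to 1.00 range and scales a log-likelihood ratio by it. The function names are hypothetical, and reading "multiplied by κ" as scaling the log-likelihood ratio is an assumption, although it agrees with the `llr`, `kappa`, and `adjusted_llr` values recorded in this node's evidence chain.

```python
KAPPA_FLOOR = 0.10  # stated floor: evidence is effectively silenced
KAPPA_CAP = 1.00    # stated cap: evidence carries full weight


def clamp_kappa(kappa: float) -> float:
    """Clamp a predictor discount into the [0.10, 1.00] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def discount_llr(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the clamped discount (assumed form)."""
    return clamp_kappa(kappa) * llr


# Values copied from this node's latest evidence entry.
adjusted = discount_llr(-0.4054651081081644, 0.6875)
print(round(adjusted, 6))  # -0.278757
```

With κ = 0.6875 the Q1 check-in miss keeps roughly two thirds of its raw strength; a predictor at the 0.10 floor would contribute a tenth of it.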

Reference class: agi_breakthrough_5y

Linked via embedding similarity 0.629

Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)

Base rate: 20.0% (1/5 historical)
Inside weight: 0.450 (TRF=0.79)
Outside weight: 0.550 (pulling toward base rate)
inside 39.1% → blend 27.6% (-11.4pp)

Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
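The blend can be sketched as follows. Two details are assumptions inferred from the numbers on this page rather than stated by it: that w_in ramps linearly between the two endpoints (w_in = 1.0 - 0.7 × TRF), and that the inside view and base rate are mixed in logit space. Under those assumptions the sketch reproduces this node's inside weight of about 0.450 and blended probability of about 27.6%. All function names are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def inside_weight(trf: float) -> float:
    """Assumed linear ramp: 0.3 at TRF=1 (just predicted), 1.0 at TRF=0 (deadline)."""
    return 1.0 - 0.7 * trf


def blend(inside_p: float, base_rate: float, trf: float) -> float:
    """Weighted average of inside view and base rate in logit space (assumed)."""
    w_in = inside_weight(trf)
    z = w_in * logit(inside_p) + (1.0 - w_in) * logit(base_rate)
    return sigmoid(z)


# Values copied from this node's metadata.
trf = 0.7852047391286945
w_in = inside_weight(trf)
blended = blend(0.3907040059674444, 0.2, trf)
print(round(w_in, 3), round(blended, 3))  # 0.45 0.276
```

A plain probability-space average of the same inputs would give roughly 28.6% instead, which is why the logit-space reading is the one used here.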

Probability over time

7 prob_history rows
Chart: probability over time (y-axis 0% to 100%; x-axis 2026-04-30 to 2026-05-30); prior 60%, current = 27.6%.
Legend: intake v2, milestone miss sweep, lbp propagation, reference class assigned, legacy v1, prior_prob (analyst seed).

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.
Leading chain: 2 fired ✓ · 1 overdue ⏱ · 5 pending
  1. 2026-01-31 · hit · Anthropic publicly forecasts AGI / powerful AI within 2026-2027 horizon
    How: Anthropic CEO publicly states (essay, interview, blog post, congressional testimony) that AGI-level systems will arrive by 2026 or 2027
    Source: https://www.darioamodei.com/essay/machines-of-loving-grace (conf 99%)
    Notes: HIT: Amodei's 'Adolescence of Technology' (Jan 2026) and Davos 2026 appearance both reaffirm the 2026-2027 powerful-AI window.
  2. 2026-05-17 · overdue · Q1 window check-in (25%)
  3. 2026-10-01 · pending · Q2 window check-in (50%)
  4. 2026-04-01 → 2027-09-30 · pending · Anthropic ARR exceeds $30B annualized run-rate
    How: Anthropic publicly reports or analyst consensus shows ARR >=$30B annualized, a scale signal for AGI-tier deployment
    Source: https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/ (conf 70%)
    Notes: Anthropic at $1T valuation 2026 implies ARR-multiple consistent with $20-40B trajectory.
  5. 2026-06-01 → 2027-09-30 · pending · Frontier model achieves expert-level performance on broad professional certification benchmark
    How: Single frontier model (Claude / GPT / Gemini) scores at 90th-percentile-or-higher on >=5 distinct professional licensing exams (USMLE, bar, CPA, FE/PE engineering, CFA Level 3) within same release
    Source: https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting (conf 80%)
  6. 2027-02-14 · pending · Q3 window check-in (75%)
  7. 2026-06-01 → 2027-12-31 · pending · AI fully automates >=80% of typical software-engineering tasks at Fortune 500
    How: Independent industry survey (GitHub / McKinsey / Gartner) reports >=80% of routine SWE tasks at Fortune 500 are AI-completed end-to-end with human review only
    Source: https://eu.36kr.com/en/p/3648851352018565 (conf 50%)
    Notes: Cascade: Amodei's '6-12 months until SWEs replaced' (Davos 2026) is aggressive; 80% task-level automation more plausible by 2027.

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample, which surfaces the predictions whose marginals shift most under that assumption (live posterior: 28%). Each run takes about 30 seconds and is suited to stress-testing "if X resolves, what else moves?"

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first
metadata_milestone_miss_sweep · 2026-05-30T22:15:00Z · 27.6% · -18.2pp
metadata_milestone_miss_sweep bayesian_v2 n=1 inside=0.391 blend=0.276 LLR=-0.279 κ=0.69 w_in=0.45 agi_breakthrough_5y
Raw metadata
{
  "trf": 0.7852047391286945,
  "kappa": 0.6875,
  "base_rate": 0.2,
  "predictor": "Dario Amodei",
  "total_llr": -0.4054651081081644,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": -0.16559666538750165,
  "bayes_factor": "1.3:1 against",
  "blend_reason": "blend 45% inside / 54% outside (TRF=0.785, base_rate=0.200 from agi_breakthrough_5y)",
  "inside_prior": 0.458695179819821,
  "kappa_source": "predictor_table",
  "n_milestones": 1,
  "blend_applied": true,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.6875,
      "label": "Q1 window check-in (25%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.278757261824363,
      "expected_date": "2026-05-17",
      "measurement_criterion": null
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.45035668260991385,
  "outside_weight": 0.5496433173900861,
  "posterior_prob": 0.2764608989788214,
  "posterior_logit": -0.4443539272118646,
  "predictor_brier": 0.0363,
  "inside_posterior": 0.3907040059674444,
  "blended_posterior": 0.2764608989788214,
  "reference_class_id": "agi_breakthrough_5y",
  "total_adjusted_llr": -0.278757261824363,
  "predictor_n_resolved": 3
}
LBP · 2026-05-10T02:00:02Z · 45.9% · +1.4pp
Network propagation: 44.5% → 45.9%
6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29
LBP · 2026-05-03T02:00:01Z · 44.5% · +2.9pp
Network propagation: 41.6% → 44.5%
6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9
LBP · 2026-04-30T16:39:51Z · 41.6% · +7.2pp
Network propagation: 34.5% → 41.6%
5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3
legacy v1 · 2026-04-30T16:13:50Z · 34.5% · -8.8pp
reference_class_assigned bayesian_v2 inside=0.600 blend=0.345 w_in=0.41 agi_breakthrough_5y
LBP · 2026-04-30T02:18:57Z · 43.3% · +8.9pp
Network propagation: 34.4% → 43.3%
5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef
legacy v1 · 2026-04-30T01:56:50Z · 34.4% · -25.6pp
reference_class_assigned bayesian_v2 inside=0.600 blend=0.344 w_in=0.41 agi_breakthrough_5y
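The inside-view arithmetic behind the latest entry (the 2026-05-30 milestone miss sweep) can be checked with a short sketch. It assumes a standard log-odds Bayesian update, in which the κ-adjusted log-likelihood ratio is added to the prior logit; that reading is inferred from the `prior_logit`, `total_adjusted_llr`, and `posterior_logit` fields rather than stated on the page. It also assumes the reported `bayes_factor` is the exponential of the adjusted LLR's magnitude.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


# Values copied from the raw metadata of the milestone miss sweep entry.
inside_prior = 0.458695179819821
total_adjusted_llr = -0.278757261824363  # kappa * raw LLR

# Log-odds update: posterior logit = prior logit + adjusted LLR (assumed form).
posterior_logit = logit(inside_prior) + total_adjusted_llr
inside_posterior = sigmoid(posterior_logit)

# A negative LLR counts against the prediction; report odds as "x:1 against".
bayes_factor = math.exp(abs(total_adjusted_llr))

print(round(posterior_logit, 4), round(inside_posterior, 3), round(bayes_factor, 1))
# -0.4444 0.391 1.3
```

The result, about 39.1%, is the inside posterior only; the 27.6% shown for this entry is that value after blending toward the 20% reference-class base rate.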

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind · Node (predictor) · Their prob · P(c|s=T) · P(c|s=F) · Δ implied
killer · TK03, AI Regulatory Moratorium (EU/US Capability Freeze) · 10.0% · 0.050 · 0.600 · +0.269
killer · TK01, AGI Capability Plateau (2026-27 Training Stall) · 15.0% · 0.050 · 0.600 · +0.241
prereq · 238_009, Recursive self-improvement is already happening now (no long… (Alex Wissner-Gross) · 78.1% · 0.600 · 0.050 · +0.198
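One reading of the Δ implied column, inferred from the displayed numbers and not documented on this page, is the law of total probability: the parent's probability and the two conditionals imply a probability for this node, and Δ is that value minus the node's current 27.6%. The sketch below uses a hypothetical helper, `implied_prob`, and reproduces the two killer rows to three decimals. The prereq row comes out near +0.203 from the displayed 78.1%, not the listed +0.198, so the dashboard may be using unrounded or differently timed inputs.

```python
def implied_prob(p_source: float, p_c_given_true: float, p_c_given_false: float) -> float:
    """P(c) = P(c|s=T) * P(s) + P(c|s=F) * (1 - P(s))."""
    return p_source * p_c_given_true + (1.0 - p_source) * p_c_given_false


# This node's current probability, from posterior_prob in the raw metadata.
current = 0.2764608989788214

# (parent probability, P(c|s=T), P(c|s=F)) for the two killer edges.
edges = {
    "TK03": (0.10, 0.05, 0.60),
    "TK01": (0.15, 0.05, 0.60),
}

deltas = {node: implied_prob(*params) - current for node, params in edges.items()}
for node, delta in deltas.items():
    print(node, f"{delta:+.3f}")
# TK03 +0.269
# TK01 +0.241
```

The killer edges carry a positive Δ because both killers are currently unlikely (10% and 15%), and this node is far more probable when a killer does not fire (0.600) than when it does (0.050).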

Top outgoing (children)

Predictions THIS node influences

Kind · Node (predictor) · Their prob · P(c|s=T) · P(c|s=F) · Δ implied
prereq · 242_001, Elon's Terafab will build 1 terawatt of AI compute per year,… (Elon Musk) · 43.9% · 0.550 · 0.050 · -0.206
prereq · 241_043, ASI will arrive within 2 years to 5 years to this next decad… (Peter Diamandis) · 35.9% · 0.650 · 0.050 · -0.089
prereq · 235_030, Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203… (Ray Kurzweil) · 39.2% · 0.750 · 0.050 · -0.086
prereq · CMQ_002, By 2028, AI systems will reach 'independent researcher' leve… (Sam Altman) · 31.4% · 0.550 · 0.050 · -0.081
prereq · SEM_034, True artificial general intelligence will be achieved betwee… (Demis Hassabis) · 28.7% · 0.550 · 0.050 · -0.054

Ticker exposure

13 ticker(s) linked

Beneficiaries (13)

BBAI, NVDA, GTLB, SOUN, AI, META, MSFT, ORCL, TCEHY, AMZN, BABA, GOOGL, IBM

Prerequisites (11)

Predictions that must hit first
Type · Pred · Title · Domain · Lag
prereq · 238_009 · Recursive self-improvement is already happening now (no longer three years out) · AI · -
correlate · S_ASI_SLOW_2040PLUS · ASI slow: post-2040 / soft takeoff · asi_recursive_self_improvement · -
correlate · S_AGI_MID_2029 · AGI mid: Kurzweil 2029 path · agi_general_capability · -
correlate · S_ASI_MID_2034 · ASI mid: Schmidt 'ASI in 6 years' · asi_recursive_self_improvement · -
correlate · S_AGI_FAST_2027 · AGI fast: drop-in remote worker by 2027-09 · agi_general_capability · -
correlate · S_AGI_SLOW_2031 · AGI slow: Schmidt/Hassabis 5-10 year path · agi_general_capability · -
correlate · S_HUMANOID_MASS_2033 · Humanoid R4: 10M+ cumulative by Dec 2033 · humanoid_deployment · -
correlate · S_AGI_WINTER_2036PLUS · AGI delayed: capability plateau or AI winter · agi_general_capability · -
correlate · S_ASI_FAST_2031 · ASI fast: RSI within 5y of AGI · asi_recursive_self_improvement · -
killer · TK01 · AGI Capability Plateau (2026-27 Training Stall) · - · -
killer · TK03 · AI Regulatory Moratorium (EU/US Capability Freeze) · - · -

Dependents (5)

Predictions enabled by this
Type · Pred · Title · Domain · Lag
prereq · 235_030 · Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033. · Biotech/Longevity · -
prereq · 241_043 · ASI will arrive within 2 years to 5 years to this next decade · AI · -
prereq · CMQ_002 · By 2028, AI systems will reach 'independent researcher' level, driving autonomous scientific discoveries without human intervention. · AI · -
prereq · SEM_034 · True artificial general intelligence will be achieved between 2032 and 2042: 'first we solve AI, then use AI to solve everything else'. · AI/AGI · -
prereq · 242_001 · Elon's Terafab will build 1 terawatt of AI compute per year, 50x current global production · AI · -

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT
Sim · Source · Title · Market prob · Polarity · Reviewed · Published
0.647 · manifold · Which prediction market CEOs will I meet before 2028? · - · mentions · pending · 2026-05-02
0.633 · manifold · When will the first prediction market ETF launch? · - · mentions · pending · 2026-04-29
0.631 · manifold · Any new Claude Model Before May 16th? · 4% · mentions · pending · 2026-05-05
0.630 · manifold · Will a prediction market-related court case reach the Supreme Court before 2029? · 65% · mentions · pending · 2026-05-27
0.624 · manifold · Will Claude become a Pokèmon Master by the end of August 2026? · 58% · mentions · pending · 2026-05-12
0.616 · manifold · Global Average Temperature May 2026 per LOTI v4 vs 1951-1980 base period (NASA Gistemp) · - · mentions · pending · 2026-04-25
0.615 · manifold · Will Manifold successfully predict the next pandemic at least 2 weeks in advance? · 53% · mentions · pending · 2026-05-18
0.608 · manifold · Claude 5 released by End of September 2026? · 78% · mentions · pending · 2026-05-18
0.607 · manifold · Global Average Temperature June 2026 per LOTI v4 vs 1951-1980 base period (NASA Gistemp) · - · mentions · pending · 2026-06-01
0.605 · manifold · Best METR 80% Time Horizon before August 2026 · - · mentions · pending · 2026-06-04

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook
{
  "nia": false,
  "qty": "expert-level AGI",
  "mode": "FORECAST",
  "role": "Cited-CEO",
  "context": "Amodei's 'Machines of Loving Grace' timeline; more aggressive than Altman on near-term expert-level AI.",
  "to_year": 2027,
  "conv_cues": "could arrive as early as; CEO",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "2026-2027",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "Anthropic publicly forecasts AGI / powerful AI within 2026-2027 horizon",
      "notes": "HIT — Amodei's 'Adolescence of Technology' (Jan 2026) and Davos 2026 appearance both reaffirm 2026-2027 powerful-AI window.",
      "source": "https://www.darioamodei.com/essay/machines-of-loving-grace",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.99,
      "source_url": "https://www.darioamodei.com/essay/machines-of-loving-grace",
      "expected_date": "2026-01-31",
      "observed_date": "2026-01-31",
      "research_origin": "deep_research",
      "measurement_criterion": "Anthropic CEO publicly states (essay, interview, blog post, congressional testimony) that AGI-level systems will arrive by 2026 or 2027"
    },
    {
      "kind": "prereq",
      "label": "Recursive self-improvement is already happening now (no longer three years out)",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -7,
      "source_id": "238_009",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -6,
      "source_id": null,
      "expected_date": "2026-05-17",
      "observed_date": null,
      "miss_emitted_at": "2026-05-30T22:15:00.756418+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -5,
      "source_id": null,
      "expected_date": "2026-10-01",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Anthropic ARR exceeds $30B annualized run-rate",
      "notes": "Anthropic at $1T valuation 2026 implies ARR-multiple consistent with $20-40B trajectory.",
      "source": "https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
      "confidence": 0.7,
      "source_url": "https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/",
      "expected_date": "2026-12-30",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2027-09-30",
        "from": "2026-04-01"
      },
      "measurement_criterion": "Anthropic publicly reports or analyst consensus shows ARR >=$30B annualized — scale signal for AGI-tier deployment"
    },
    {
      "kind": "llm_pre_event",
      "label": "Frontier model achieves expert-level performance on broad professional certification benchmark",
      "source": "https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -3,
      "source_id": null,
      "confidence": 0.8,
      "source_url": "https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting",
      "expected_date": "2027-01-30",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2027-09-30",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Single frontier model (Claude / GPT / Gemini) scores at 90th-percentile-or-higher on >=5 distinct professional licensing exams (USMLE, bar, CPA, FE/PE engineering, CFA Level 3) within same release"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q3 window check-in (75%)",
      "status": "pending",
      "weight": 0.05,
      "ordina
... (truncated)