CMQ_004 · prediction · AI · AGI-capability-roadmap

AGI-like models matching or outperforming human experts across most professional domains could arrive as early as 2026-2027.

Predictor: Dario Amodei

Prior probability

60.0%

Current probability

27.6%

evolves via intake + LBP

Conviction

5/5

Signal quality

Resolution

pending

Window

2026-01-01 – 2027-11-30

Edges in / out

11 / 5

Tickers exposed

Prediction text

AGI-like models matching or outperforming human experts across most professional domains could arrive as early as 2026-2027. | Claude expert-level benchmarks; Anthropic ARR trajectory

Key catalyst: Claude expert-level benchmarks; Anthropic ARR trajectory

Watch events: Claude 4.x/5.x model releases; agentic task completion benchmarks; enterprise white-collar substitution metrics.

Resolution evidence

Status: pending

Anthropic crossed $30B ARR April 2026 overtaking OpenAI; Claude capability trajectory (Opus 4.6/4.7 family) supports aggressive timeline.

Predictor: Dario Amodei

κ + Brier as of 2026-05-22

κ (discount)

0.688

Brier

0.0363

excellent

Hits / Misses

1 / 0

of 3 resolved

Hit rate

33.3%

Calibration plot (stated vs observed)

Evidence about this node from Dario Amodei is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
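As a minimal sketch of the discount described above, the snippet below clamps κ to the stated floor and cap and scales a log-likelihood ratio by it. Applying κ to the LLR (rather than to a probability) is an assumption, although it agrees with the `llr` and `adjusted_llr` values in the raw metadata further down. The function names are hypothetical and are not taken from /api/intake.

```python
KAPPA_FLOOR = 0.10  # floor stated on the page
KAPPA_CAP = 1.00    # cap stated on the page


def clamp_kappa(kappa: float) -> float:
    """Keep a predictor's discount factor inside [0.10, 1.00]."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def discount_llr(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the clamped discount (hypothetical helper)."""
    return clamp_kappa(kappa) * llr


# Values copied from the Q1 check-in contribution in the raw metadata.
adjusted = discount_llr(-0.4054651081081644, 0.6875)
print(round(adjusted, 6))  # -0.278757
```

With κ = 0.6875, a raw LLR of about -0.405 shrinks to about -0.279, which is the adjusted figure shown in the evidence chain.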

Reference class: agi_breakthrough_5y

Linked via embedding similarity 0.629

Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)

Base rate

20.0%

1/5 historical

Inside weight

0.450

TRF=0.79

Outside weight

0.550

pulling toward base rate

inside 39.1% → blend 27.6% (Δ -11.4pp)

Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.

Probability over time

7 prob_history rows

Legend: intake v2 · milestone miss sweep · LBP propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 27.6%

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.

Leading chain: 2 fired ✓ · 1 overdue ⏱ · 5 pending

2026-01-31 · hit · Anthropic publicly forecasts AGI / powerful AI within 2026-2027 horizon
How: Anthropic CEO publicly states (essay, interview, blog post, congressional testimony) that AGI-level systems will arrive by 2026 or 2027
Source: https://www.darioamodei.com/essay/machines-of-loving-grace · conf 99%
Notes: HIT — Amodei's 'Adolescence of Technology' (Jan 2026) and Davos 2026 appearance both reaffirm 2026-2027 powerful-AI window.
2026-04-29 · hit · Recursive self-improvement is already happening now (no longer three years out)
2026-05-17 · overdue · Q1 window check-in (25%)
2026-10-01 · pending · Q2 window check-in (50%)
2026-04-01 → 2027-09-30 · pending · Anthropic ARR exceeds $30B annualized run-rate
How: Anthropic publicly reports or analyst consensus shows ARR >=$30B annualized — scale signal for AGI-tier deployment
Source: https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/ · conf 70%
Notes: Anthropic at $1T valuation 2026 implies ARR-multiple consistent with $20-40B trajectory.
2026-06-01 → 2027-09-30 · pending · Frontier model achieves expert-level performance on broad professional certification benchmark
How: Single frontier model (Claude / GPT / Gemini) scores at 90th-percentile-or-higher on >=5 distinct professional licensing exams (USMLE, bar, CPA, FE/PE engineering, CFA Level 3) within same release
Source: https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting · conf 80%
2027-02-14 · pending · Q3 window check-in (75%)
2026-06-01 → 2027-12-31 · pending · AI fully automates >=80% of typical software-engineering tasks at Fortune 500
How: Independent industry survey (GitHub / McKinsey / Gartner) reports >=80% of routine SWE tasks at Fortune 500 are AI-completed end-to-end with human review only
Source: https://eu.36kr.com/en/p/3648851352018565 · conf 50%
Notes: Cascade — Amodei's '6-12 months until SWEs replaced' (Davos 2026) is aggressive; 80% task-level automation more plausible by 2027.
2027-07-01 · pending · AGI-like models matching or outperforming human experts across most professional domains could arrive as early as 2026-2027.
2028-09-07 · pending · By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.
2030-11-04 · pending · Elon's Terafab will build 1 terawatt of AI compute per year, 50x current global production
2033-07-30 · pending · Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.
2033-08-10 · pending · ASI will arrive within 2 years to 5 years to this next decade
2039-09-22 · pending · True artificial general intelligence will be achieved between 2032 and 2042 — 'first we solve AI, then use AI to solve everything else'.
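The three quartile check-in dates in the chain above are consistent with splitting the span from 2026-01-01 to the 2027-07-01 resolution milestone into quarters and truncating to whole days. The sketch below shows that reading. It is inferred from the listed dates only and is not taken from scripts/ops/derive_milestones.py. The function name and the truncation rule are assumptions.

```python
from datetime import date, timedelta


def quartile_checkpoints(start: date, resolution: date) -> dict[str, date]:
    """Place check-ins at 25%, 50% and 75% of the span, truncated to whole days."""
    total_days = (resolution - start).days
    return {
        f"Q{i} window check-in ({25 * i}%)": start + timedelta(days=int(total_days * i / 4))
        for i in (1, 2, 3)
    }


# Anchor dates inferred from this page: window start and the resolution milestone.
checkpoints = quartile_checkpoints(date(2026, 1, 1), date(2027, 7, 1))
for label, day in checkpoints.items():
    print(day.isoformat(), label)
```

Under these assumptions the sketch yields 2026-05-17, 2026-10-01 and 2027-02-14, the same dates the chain lists for Q1 through Q3. The anchor is the 2027-07-01 milestone rather than the 2027-11-30 window end shown in the header.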

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample (about 30 s per run). The run surfaces the predictions whose marginals shift most under that assumption, which makes it useful for stress-testing "if X resolves, what else moves?"

(live posterior: 28%)
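To illustrate what a clamped Gibbs run does, here is a toy sketch on a hypothetical three-node chain A → B → C. The structure, the probabilities, and every name are made up for illustration. The site's actual sampler and network are not shown on this page. Each sweep resamples every unclamped node from its conditional given its neighbours, and the marginals are then compared with an unclamped baseline.

```python
import random

# Hypothetical toy chain A -> B -> C; every probability here is made up.
P_A = 0.30
P_B_GIVEN_A = {True: 0.60, False: 0.05}
P_C_GIVEN_B = {True: 0.55, False: 0.05}


def lik(p_true: float, value: bool) -> float:
    """Probability of observing `value` for a node whose P(True) is p_true."""
    return p_true if value else 1.0 - p_true


def gibbs_marginals(clamp_b=None, sweeps=20000, burn_in=500, seed=0):
    """Estimate P(node=True) for A, B, C, optionally with B clamped."""
    rng = random.Random(seed)
    a, c = False, False
    b = clamp_b if clamp_b is not None else False
    hits = {"A": 0, "B": 0, "C": 0}
    for sweep in range(burn_in + sweeps):
        # A given B: proportional to P(A) * P(B | A).
        w_true = P_A * lik(P_B_GIVEN_A[True], b)
        w_false = (1.0 - P_A) * lik(P_B_GIVEN_A[False], b)
        a = rng.random() < w_true / (w_true + w_false)
        # B given A and C: proportional to P(B | A) * P(C | B); skipped when clamped.
        if clamp_b is None:
            w_true = P_B_GIVEN_A[a] * lik(P_C_GIVEN_B[True], c)
            w_false = (1.0 - P_B_GIVEN_A[a]) * lik(P_C_GIVEN_B[False], c)
            b = rng.random() < w_true / (w_true + w_false)
        # C given B: read straight from the conditional table.
        c = rng.random() < P_C_GIVEN_B[b]
        if sweep >= burn_in:
            hits["A"] += a
            hits["B"] += b
            hits["C"] += c
    return {name: count / sweeps for name, count in hits.items()}


baseline = gibbs_marginals()
clamped_true = gibbs_marginals(clamp_b=True)
shifts = {n: clamped_true[n] - baseline[n] for n in ("A", "C")}
ranked = sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True)
print(ranked)  # nodes ordered by how far their marginal moved
```

In this toy case the exact answers are easy to check by hand: clamping B to TRUE moves P(A) from 0.30 to about 0.84 and P(C) from about 0.16 to 0.55, so A ranks first.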

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

metadata_milestone_miss_sweep · 2026-05-30T22:15:00Z · 27.6% · -18.2pp

metadata_milestone_miss_sweep bayesian_v2 n=1 inside=0.391 blend=0.276 LLR=-0.279 κ=0.69 w_in=0.45 agi_breakthrough_5y

Raw metadata

{
  "trf": 0.7852047391286945,
  "kappa": 0.6875,
  "base_rate": 0.2,
  "predictor": "Dario Amodei",
  "total_llr": -0.4054651081081644,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": -0.16559666538750165,
  "bayes_factor": "1.3:1 against",
  "blend_reason": "blend 45% inside / 54% outside (TRF=0.785, base_rate=0.200 from agi_breakthrough_5y)",
  "inside_prior": 0.458695179819821,
  "kappa_source": "predictor_table",
  "n_milestones": 1,
  "blend_applied": true,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.6875,
      "label": "Q1 window check-in (25%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.278757261824363,
      "expected_date": "2026-05-17",
      "measurement_criterion": null
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.45035668260991385,
  "outside_weight": 0.5496433173900861,
  "posterior_prob": 0.2764608989788214,
  "posterior_logit": -0.4443539272118646,
  "predictor_brier": 0.0363,
  "inside_posterior": 0.3907040059674444,
  "blended_posterior": 0.2764608989788214,
  "reference_class_id": "agi_breakthrough_5y",
  "total_adjusted_llr": -0.278757261824363,
  "predictor_n_resolved": 3
}
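The fields in this record fit a standard log-odds update: the posterior logit is the prior logit plus the sum of κ-discounted LLRs. The sketch below recomputes `inside_posterior` from `inside_prior` and the single contribution. The additive form is inferred from the numbers in the record, and the function names are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def update_inside_view(prior_prob: float, contributions: list[dict]) -> tuple[float, float]:
    """Add discounted LLRs to the prior log-odds; return (posterior, total adjusted LLR)."""
    total_adjusted = sum(c["kappa"] * c["llr"] for c in contributions)
    return sigmoid(logit(prior_prob) + total_adjusted), total_adjusted


# Values copied from the record above.
contributions = [{"llr": -0.4054651081081644, "kappa": 0.6875}]
inside_posterior, total_adjusted = update_inside_view(0.458695179819821, contributions)

print(round(inside_posterior, 4))                 # 0.3907
print(f"{math.exp(-total_adjusted):.1f}:1 against")  # 1.3:1 against
```

The result is the inside view only. The displayed 27.6% appears after this value is blended with the 20% base rate of the reference class.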

LBP · 2026-05-10T02:00:02Z · 45.9% · +1.4pp

Network propagation: 44.5% → 45.9%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

LBP · 2026-05-03T02:00:01Z · 44.5% · +2.9pp

Network propagation: 41.6% → 44.5%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

LBP · 2026-04-30T16:39:51Z · 41.6% · +7.2pp

Network propagation: 34.5% → 41.6%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

legacy v1 · 2026-04-30T16:13:50Z · 34.5% · -8.8pp

reference_class_assigned bayesian_v2 inside=0.600 blend=0.345 w_in=0.41 agi_breakthrough_5y

LBP · 2026-04-30T02:18:57Z · 43.3% · +8.9pp

Network propagation: 34.4% → 43.3%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

legacy v1 · 2026-04-30T01:56:50Z · 34.4% · -25.6pp

reference_class_assigned bayesian_v2 inside=0.600 blend=0.344 w_in=0.41 agi_breakthrough_5y

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.600	+0.269
killer	TK01 AGI Capability Plateau (2026-27 Training Stall)	15.0%	0.050	0.600	+0.241
prereq	238_009 Recursive self-improvement is already happening now (no long — Alex Wissner-Gross	78.1%	0.600	0.050	+0.198

Top outgoing (children)

Predictions THIS node influences

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
prereq	242_001 Elon's Terafab will build 1 terawatt of AI compute per year, — Elon Musk	43.9%	0.550	0.050	-0.206
prereq	241_043 ASI will arrive within 2 years to 5 years to this next decad — Peter Diamandis	35.9%	0.650	0.050	-0.089
prereq	235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203 — Ray Kurzweil	39.2%	0.750	0.050	-0.086
prereq	CMQ_002 By 2028, AI systems will reach 'independent researcher' leve — Sam Altman	31.4%	0.550	0.050	-0.081
prereq	SEM_034 True artificial general intelligence will be achieved betwee — Demis Hassabis	28.7%	0.550	0.050	-0.054

Ticker exposure

13 ticker(s) linked

Beneficiaries (13)

BBAI NVDA GTLB SOUN AI META MSFT ORCL TCEHY AMZN BABA GOOGL IBM

Prerequisites (11)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
prereq	238_009	Recursive self-improvement is already happening now (no longer three years out)	AI	—
correlate	S_ASI_SLOW_2040PLUS	ASI slow: post-2040 / soft takeoff	asi_recursive_self_improvement	—
correlate	S_AGI_MID_2029	AGI mid: Kurzweil 2029 path	agi_general_capability	—
correlate	S_ASI_MID_2034	ASI mid: Schmidt 'ASI in 6 years'	asi_recursive_self_improvement	—
correlate	S_AGI_FAST_2027	AGI fast: drop-in remote worker by 2027-09	agi_general_capability	—
correlate	S_AGI_SLOW_2031	AGI slow: Schmidt/Hassabis 5-10 year path	agi_general_capability	—
correlate	S_HUMANOID_MASS_2033	Humanoid R4: 10M+ cumulative by Dec 2033	humanoid_deployment	—
correlate	S_AGI_WINTER_2036PLUS	AGI delayed: capability plateau or AI winter	agi_general_capability	—
correlate	S_ASI_FAST_2031	ASI fast: RSI within 5y of AGI	asi_recursive_self_improvement	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (5)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
prereq	235_030	Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.	Biotech/Longevity	—
prereq	241_043	ASI will arrive within 2 years to 5 years to this next decade	AI	—
prereq	CMQ_002	By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.	AI	—
prereq	SEM_034	True artificial general intelligence will be achieved between 2032 and 2042 — 'first we solve AI, then use AI to solve everything else'.	AI/AGI	—
prereq	242_001	Elon's Terafab will build 1 terawatt of AI compute per year, 50x current global production	AI	—

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.647	manifold	Which prediction market CEOs will I meet before 2028?	—	mentions	pending	2026-05-02
0.633	manifold	When will the first prediction market ETF launch?	—	mentions	pending	2026-04-29
0.631	manifold	Any new Claude Model Before May 16th?	4%	mentions	pending	2026-05-05
0.630	manifold	Will a prediction market-related court case reach the Supreme Court before 2029?	65%	mentions	pending	2026-05-27
0.624	manifold	Will Claude become a Pokèmon Master by the end of August 2026?	58%	mentions	pending	2026-05-12
0.616	manifold	Global Average Temperature May 2026 per LOTI v4 vs 1951-1980 base period (NASA Gistemp)	—	mentions	pending	2026-04-25
0.615	manifold	Will Manifold successfully predict the next pandemic at least 2 weeks in advance?	53%	mentions	pending	2026-05-18
0.608	manifold	Claude 5 released by End of September 2026?	78%	mentions	pending	2026-05-18
0.607	manifold	Global Average Temperature June 2026 per LOTI v4 vs 1951-1980 base period (NASA Gistemp)	—	mentions	pending	2026-06-01
0.605	manifold	Best METR 80% Time Horizon before August 2026	—	mentions	pending	2026-06-04
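For readers unfamiliar with the Sim column, the sketch below shows how a cosine-similarity ranking of this kind can be computed. The three-dimensional vectors, the 0.6 cut-off, and the function names are all hypothetical. The page does not say which embedding model or threshold the site uses.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(node_vec, docs, min_sim=0.6, limit=10):
    """Score every document against the node and keep the closest ones."""
    scored = [(cosine_similarity(node_vec, vec), title) for title, vec in docs.items()]
    kept = [pair for pair in scored if pair[0] >= min_sim]
    return sorted(kept, reverse=True)[:limit]


# Made-up embeddings purely for illustration.
node = [0.9, 0.1, 0.3]
docs = {
    "doc_a": [0.8, 0.2, 0.4],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.5, 0.5, 0.5],
}
ranked_docs = rank_documents(node, docs)
for sim, title in ranked_docs:
    print(f"{sim:.3f}", title)
```

A high similarity score only says that two texts sit close together in embedding space, which is why every row in the table above is still marked as pending review.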

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "expert-level AGI",
  "mode": "FORECAST",
  "role": "Cited-CEO",
  "context": "Amodei's 'Machines of Loving Grace' timeline; more aggressive than Altman on near-term expert-level AI.",
  "to_year": 2027,
  "conv_cues": "could arrive as early as; CEO",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "2026-2027",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "Anthropic publicly forecasts AGI / powerful AI within 2026-2027 horizon",
      "notes": "HIT — Amodei's 'Adolescence of Technology' (Jan 2026) and Davos 2026 appearance both reaffirm 2026-2027 powerful-AI window.",
      "source": "https://www.darioamodei.com/essay/machines-of-loving-grace",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.99,
      "source_url": "https://www.darioamodei.com/essay/machines-of-loving-grace",
      "expected_date": "2026-01-31",
      "observed_date": "2026-01-31",
      "research_origin": "deep_research",
      "measurement_criterion": "Anthropic CEO publicly states (essay, interview, blog post, congressional testimony) that AGI-level systems will arrive by 2026 or 2027"
    },
    {
      "kind": "prereq",
      "label": "Recursive self-improvement is already happening now (no longer three years out)",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -7,
      "source_id": "238_009",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -6,
      "source_id": null,
      "expected_date": "2026-05-17",
      "observed_date": null,
      "miss_emitted_at": "2026-05-30T22:15:00.756418+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -5,
      "source_id": null,
      "expected_date": "2026-10-01",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Anthropic ARR exceeds $30B annualized run-rate",
      "notes": "Anthropic at $1T valuation 2026 implies ARR-multiple consistent with $20-40B trajectory.",
      "source": "https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
      "confidence": 0.7,
      "source_url": "https://www.eweek.com/news/anthropic-1-trillion-valuation-neuron/",
      "expected_date": "2026-12-30",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2027-09-30",
        "from": "2026-04-01"
      },
      "measurement_criterion": "Anthropic publicly reports or analyst consensus shows ARR >=$30B annualized — scale signal for AGI-tier deployment"
    },
    {
      "kind": "llm_pre_event",
      "label": "Frontier model achieves expert-level performance on broad professional certification benchmark",
      "source": "https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -3,
      "source_id": null,
      "confidence": 0.8,
      "source_url": "https://blog.redwoodresearch.org/p/whats-up-with-anthropic-predicting",
      "expected_date": "2027-01-30",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2027-09-30",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Single frontier model (Claude / GPT / Gemini) scores at 90th-percentile-or-higher on >=5 distinct professional licensing exams (USMLE, bar, CPA, FE/PE engineering, CFA Level 3) within same release"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q3 window check-in (75%)",
      "status": "pending",
      "weight": 0.05,
      "ordina
... (truncated)