CMQ_001 · prediction · AI · AGI-capability-roadmap

By 2026, AI will reach 'intern-level' capability — millions of virtual interns performing supervised, economically useful tasks.

Predictor: Sam Altman

Prior probability

72.0%

Current probability

44.8%

evolves via intake + LBP

Conviction

5/5

Signal quality

Resolution

pending

Window

2026-01-01 – 2026-09-30

Edges in / out

5 / 14

Tickers exposed

Prediction text

By 2026, AI will reach 'intern-level' capability — millions of virtual interns performing supervised, economically useful tasks. | OpenAI GPT-5/6-class agent releases

Key catalyst: OpenAI GPT-5/6-class agent releases

Watch events: OpenAI agent product releases; GDPval-style benchmarks; labor-substitution evidence from early enterprise deployments.

Resolution evidence

Status: pending

Consistent with OpenAI agent releases (Operator, Deep Research, ChatGPT agents) approaching intern-level task execution through 2025-2026.

Predictor: Sam Altman

κ + Brier as of 2026-05-22

κ (discount)

0.583

Brier

0.0625

excellent

Hits / Misses

0 / 0

of 1 resolved

Hit rate

0.0%

Calibration plot (stated vs observed)
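For reference, a Brier score is the mean squared difference between stated probabilities and 0/1 outcomes. The sketch below is illustrative only: the function name and the sample forecast are assumptions, not values read from the calibration table.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical single resolved forecast: 75% stated, event occurred.
example = brier_score([0.75], [1])
```

With only one resolved prediction, a score of 0.0625 would arise from a 75% forecast that hit or a 25% forecast that missed. The page does not say which applies here.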

Evidence about this node from Sam Altman is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
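The κ discount can be sketched as a scaled log-likelihood-ratio update in logit space. This is a minimal reconstruction, assuming the update is additive in logits; the function names are hypothetical, and the inputs are taken from the 2026-05-30 evidence record further down this page, whose logged values it reproduces.

```python
import math

KAPPA_FLOOR, KAPPA_CAP = 0.10, 1.00  # bounds stated on the page


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def discounted_update(prior_prob, llrs, kappa):
    """Scale each LLR by the clamped kappa, then add to the prior logit."""
    kappa = min(max(kappa, KAPPA_FLOOR), KAPPA_CAP)
    adjusted = [kappa * llr for llr in llrs]
    posterior_logit = logit(prior_prob) + sum(adjusted)
    return sigmoid(posterior_logit), adjusted


# inside_prior, llr and kappa as logged in the 2026-05-30 record
posterior, adjusted = discounted_update(
    0.6376800141997431, [-0.4054651081081644], 0.5833
)
```

Under these assumptions, `adjusted[0]` comes out near -0.2365 and `posterior` near 0.5815, in line with the logged `adjusted_llr` and `inside_posterior`.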

Reference class: agi_breakthrough_5y

Linked

Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)

Base rate

20.0%

1/5 historical

Inside weight

0.686

TRF=0.45

Outside weight

0.314

pulling toward base rate

inside 58.1% → blend 44.8% (Δ -13.4pp)

Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
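The blend can be reproduced with a short sketch. Three things are assumptions inferred from the logged numbers rather than from documented behavior: TRF is read as the fraction of the window still remaining, w_in is interpolated linearly between the two stated endpoints, and the inside view and base rate are mixed in logit space. The function names and the exact deadline instant (start of 2026-09-30 UTC) are likewise assumed.

```python
import math
from datetime import datetime, timezone


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def time_remaining_fraction(now, start, deadline):
    """1.0 at the start of the window, 0.0 at the deadline (assumed meaning of TRF)."""
    frac = (deadline - now) / (deadline - start)
    return min(max(frac, 0.0), 1.0)


def inside_weight(trf, w_at_start=0.3, w_at_deadline=1.0):
    """Linear interpolation between the two endpoints stated on the page."""
    return w_at_deadline - (w_at_deadline - w_at_start) * trf


def blend(inside_prob, base_rate, w_in):
    """Weighted average in logit space (assumption; matches the logged blend)."""
    return sigmoid(w_in * logit(inside_prob) + (1.0 - w_in) * logit(base_rate))


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
deadline = datetime(2026, 9, 30, tzinfo=timezone.utc)
now = datetime(2026, 5, 30, 22, 15, 0, tzinfo=timezone.utc)

trf = time_remaining_fraction(now, start, deadline)
w_in = inside_weight(trf)
blended = blend(0.5814676264348703, 0.20, w_in)
```

With these inputs the sketch gives TRF near 0.449, w_in near 0.686, and a blend near 44.8%. A plain probability-space average would give about 46.2% instead, which is why the logit-space form is assumed here.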

Probability over time

9 prob_history rows

Chart legend: intake v2 · milestone miss sweep · lbp propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 44.8%

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.

Leading chain: 1 fired ✓ · 3 overdue ⏱

Date	Status	Milestone
2026-02-14	overdue	Q1 window check-in (25%)
2026-03-30	overdue	Q2 window check-in (50%)
2026-04-29	hit	Recursive self-improvement is already happening now (no longer three years out)
2026-05-13	overdue	Q3 window check-in (75%)
2026-06-26	pending	By 2026, AI will reach 'intern-level' capability — millions of virtual interns performing supervised, economically useful tasks.
2027-06-18	pending	Baby AGI agents will need and develop an 'immune system' for prompt injection and cybersecurity threats in real time.
2028-06-22	pending	AI learning will improve via closed-loop reinforcement learning cycle making results keep increasing.
2029-12-10	pending	Elon plans to produce tens of millions of robots per year in just a few years.
2030-09-15	pending	$100 trillion companies within 5 years (3 years from now, per Diamandis interpretation of Musk)
2033-07-30	pending	Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.
2037-06-24	pending	Mass drivers on the moon will shoot AI satellites into deep space; self-sustaining lunar city will follow.
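The three check-in dates in the chain fall at even quarters between the window start (2026-01-01) and the event's expected date (2026-06-26). The sketch below reproduces them and adds a possible overdue test. The function names are hypothetical, and the overdue rule (no observation plus a 7-day grace period, after the `grace_days` field in the raw metadata) is an assumption about how scripts/ops/derive_milestones.py behaves, not a description of it.

```python
from datetime import date, timedelta


def quartile_checkins(start, event_date):
    """Dates at 25%, 50% and 75% of the span from start to the expected event date."""
    span_days = (event_date - start).days
    return [start + timedelta(days=round(span_days * q)) for q in (0.25, 0.50, 0.75)]


def is_overdue(expected, observed, today, grace_days=7):
    """Assumed rule: unobserved and past the expected date plus the grace period."""
    return observed is None and today > expected + timedelta(days=grace_days)


checkins = quartile_checkins(date(2026, 1, 1), date(2026, 6, 26))
q3_overdue = is_overdue(date(2026, 5, 13), None, date(2026, 5, 30))
```

Here `checkins` lands on 2026-02-14, 2026-03-30 and 2026-05-13, the same dates listed in the chain.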

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample. Surfaces the predictions whose marginals shift most under that assumption.

(live posterior: 45%)

Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
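To illustrate what clamping and Gibbs sampling mean here, the toy sketch below runs a sampler on a three-node chain (one parent, this prediction, one child) with this prediction held fixed. It is not the site's implementation: the real network has many more nodes, and the function name, chain structure and sample counts are assumptions. The probabilities are borrowed from the neighbor tables on this page (238_009 as parent, 241_057 as child).

```python
import random


def gibbs_clamped(p_parent, p_node_given_parent, p_child_given_node, clamp_node,
                  n_samples=20000, burn_in=500, seed=0):
    """Toy Gibbs sampler on parent -> node -> child with the node clamped."""
    rng = random.Random(seed)
    parent, child = True, True  # arbitrary initial state
    parent_hits = child_hits = 0
    for step in range(burn_in + n_samples):
        # Resample parent from P(parent | node), proportional to P(parent) * P(node | parent).
        like_true = p_node_given_parent[True] if clamp_node else 1 - p_node_given_parent[True]
        like_false = p_node_given_parent[False] if clamp_node else 1 - p_node_given_parent[False]
        w_true = p_parent * like_true
        w_false = (1 - p_parent) * like_false
        parent = rng.random() < w_true / (w_true + w_false)
        # Resample child from P(child | node); the node is its only parent in this toy chain.
        child = rng.random() < p_child_given_node[clamp_node]
        if step >= burn_in:
            parent_hits += parent
            child_hits += child
    return parent_hits / n_samples, child_hits / n_samples


# Clamp this prediction TRUE and read off the shifted marginals.
parent_if_true, child_if_true = gibbs_clamped(
    0.781, {True: 0.72, False: 0.05}, {True: 0.55, False: 0.05}, clamp_node=True
)
```

In this toy chain the clamped node separates parent from child, so the sampled marginals should approach the exact values (about 0.98 for the parent and 0.55 for the child). In a larger graph with loops, the same resampling step would be repeated over every unclamped node.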

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

metadata_milestone_miss_sweep · 2026-05-30T22:15:00Z · 44.8% · -19.0pp

metadata_milestone_miss_sweep bayesian_v2 n=1 inside=0.581 blend=0.448 LLR=-0.237 κ=0.58 w_in=0.69 agi_breakthrough_5y

Raw metadata

{
  "trf": 0.4487974555581937,
  "kappa": 0.5833,
  "base_rate": 0.2,
  "predictor": "Sam Altman",
  "total_llr": -0.4054651081081644,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": 0.5653088523306947,
  "bayes_factor": "1.3:1 against",
  "blend_reason": "blend 68% inside / 31% outside (TRF=0.449, base_rate=0.200 from agi_breakthrough_5y)",
  "inside_prior": 0.6376800141997431,
  "kappa_source": "predictor_table",
  "n_milestones": 1,
  "blend_applied": true,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q3 window check-in (75%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2026-05-13",
      "measurement_criterion": null
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.6858417811092643,
  "outside_weight": 0.3141582188907357,
  "posterior_prob": 0.4476895519386729,
  "posterior_logit": 0.3288010547712024,
  "predictor_brier": 0.0625,
  "inside_posterior": 0.5814676264348703,
  "blended_posterior": 0.4476895519386729,
  "reference_class_id": "agi_breakthrough_5y",
  "total_adjusted_llr": -0.2365077975594923,
  "predictor_n_resolved": 1
}

LBP · 2026-05-24T02:00:02Z · 63.8% · -5.1pp

Network propagation: 68.9% → 63.8%

4-iter LBP, residual 0.01000 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 806b02f8

intake_event_update · 2026-05-21T23:15:16Z · 68.9% · +9.2pp

intake:7afeeb9a-f217-4dd2-b910-24ff14bdfc39 bayesian_v2 inside=0.689 blend=0.689 LLR=0.404 κ=0.58 no_blend

Raw metadata

{
  "trf": 0.481731855298543,
  "kappa": 0.5833,
  "base_rate": null,
  "predictor": "Sam Altman",
  "total_llr": 0.6931471805599453,
  "bayesian_v2": true,
  "prior_logit": 0.3904375358084699,
  "bayes_factor": "1.5:1 favoring",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.5963880226965899,
  "kappa_source": "predictor_table",
  "blend_applied": false,
  "contributions": [
    {
      "llr": 0.6931471805599453,
      "kappa": 0.5833,
      "label": "Weeks-long autonomous task capability is well beyond 'intern-level'.",
      "adjusted_llr": 0.4043127504206161
    }
  ],
  "evidence_kind": "intake_event_update",
  "inside_source": "history_v2",
  "inside_weight": 1,
  "outside_weight": 0,
  "posterior_prob": 0.6888503979684238,
  "evidence_origin": "daily_intake",
  "llm_suggestions": [
    {
      "polarity": "corroborates",
      "status_change": "unchanged",
      "evidence_strength": "moderate",
      "delta_prob_suggestion": 0.05
    }
  ],
  "posterior_logit": 0.7947502862290861,
  "predictor_brier": 0.0625,
  "evidence_doc_ids": [],
  "inside_posterior": 0.6888503979684238,
  "blended_posterior": 0.6888503979684238,
  "reference_class_id": null,
  "total_adjusted_llr": 0.4043127504206161,
  "predictor_n_resolved": 1
}

LBP · 2026-05-10T02:00:02Z · 59.6% · -1.7pp

Network propagation: 61.4% → 59.6%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

LBP · 2026-05-03T02:00:01Z · 61.4% · -3.1pp

Network propagation: 64.4% → 61.4%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

metadata_milestone_miss_sweep · 2026-05-02T22:07:21Z · 64.4% · -10.0pp

metadata_milestone_miss_sweep bayesian_v2 n=2 inside=0.644 blend=0.644 LLR=-0.473 κ=0.58 no_blend

Raw metadata

{
  "trf": 0.5517581791161151,
  "kappa": 0.5833,
  "base_rate": null,
  "predictor": "Sam Altman",
  "total_llr": -0.8109302162163288,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": 1.0675603953538413,
  "bayes_factor": "1.6:1 against",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.7441326937015715,
  "kappa_source": "predictor_table",
  "n_milestones": 2,
  "blend_applied": false,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q1 window check-in (25%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2026-02-14",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q2 window check-in (50%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2026-03-30",
      "measurement_criterion": null
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.6137692746187193,
  "outside_weight": 0.38623072538128067,
  "posterior_prob": 0.6444072531106255,
  "posterior_logit": 0.5945448002348567,
  "predictor_brier": 0.0625,
  "inside_posterior": 0.6444072531106255,
  "blended_posterior": 0.6444072531106255,
  "reference_class_id": null,
  "total_adjusted_llr": -0.4730155951189846,
  "predictor_n_resolved": 1
}

legacy v1 · 2026-04-30T19:17:54Z · 74.4% · +8.4pp

intake:99aa73db-75b1-4b1e-8470-a11f87b23937 bayesian_v2 inside=0.744 blend=0.744 LLR=0.404 κ=0.58 no_blend

LBP · 2026-04-30T16:39:51Z · 66.0% · -2.1pp

Network propagation: 68.1% → 66.0%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

LBP · 2026-04-30T02:18:57Z · 68.1% · -3.9pp

Network propagation: 72.0% → 68.1%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.720	+0.205
killer	TK01 AGI Capability Plateau (2026-27 Training Stall)	15.0%	0.050	0.720	+0.172
prereq	238_009 Recursive self-improvement is already happening now (no long — Alex Wissner-Gross	78.1%	0.720	0.050	+0.120
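One reading of the Δ implied column is the gap between the belief implied by a single edge (marginalizing over the source node) and this node's current probability. The sketch below is an assumption, not the site's documented formula, and the function name is hypothetical. It reproduces the two killer rows from the displayed values; the other rows do not reproduce exactly from the rounded figures shown, so they may rest on a different snapshot or an additional step.

```python
def implied_delta(p_source, p_c_given_true, p_c_given_false, p_current):
    """Edge-implied probability of the target minus its current probability."""
    implied = p_source * p_c_given_true + (1.0 - p_source) * p_c_given_false
    return implied - p_current


# Killer rows from the incoming table; current probability taken as 0.4477.
tk03 = implied_delta(0.10, 0.050, 0.720, 0.4477)
tk01 = implied_delta(0.15, 0.050, 0.720, 0.4477)
```

Under this reading a low-probability killer implies a higher belief for this node than it currently holds, which would explain the positive sign on both rows.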

Top outgoing (children)

Predictions THIS node influences

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
prereq	241_057 Elon Musk believes robot building robot is imminent — Elon Musk	44.8%	0.550	0.050	-0.143
prereq	242_001 Elon's Terafab will build 1 terawatt of AI compute per year, — Elon Musk	43.9%	0.550	0.050	-0.134
prereq	233_021 AI learning will improve via closed-loop reinforcement learn — Joe Liemandt	38.7%	0.450	0.050	-0.133
prereq	237_023 Baby AGI agents will need and develop an 'immune system' for — Alex Wissner-Gross	40.7%	0.500	0.050	-0.128
prereq	238_052 $100 trillion companies within 5 years (3 years from now, pe — Elon Musk	41.7%	0.550	0.050	-0.112

Ticker exposure

13 ticker(s) linked

Beneficiaries (13)

BBAI NVDA GTLB SOUN AI META MSFT ORCL TCEHY AMZN BABA GOOGL IBM

Prerequisites (5)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
prereq	238_009	Recursive self-improvement is already happening now (no longer three years out)	AI	—
correlate	S_AGI_MID_2029	AGI mid: Kurzweil 2029 path	agi_general_capability	—
correlate	S_AI_PAUSE_2026	Major-country AI pause beginning 2026	ai_regulatory_pause	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (14)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
prereq	235_030	Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.	Biotech/Longevity	—
prereq	241_043	ASI will arrive within 2 years to 5 years to this next decade	AI	—
prereq	CMQ_002	By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.	AI	—
prereq	232_047	Mass drivers on the moon will shoot AI satellites into deep space; self-sustaining lunar city will follow.	Space	—
prereq	239_009	People will be on Mars within 10 years	Space	—
prereq	241_057	Elon Musk believes robot building robot is imminent	Robotics	—
prereq	242_001	Elon's Terafab will build 1 terawatt of AI compute per year, 50x current global production	AI	—
prereq	SEM_034	True artificial general intelligence will be achieved between 2032 and 2042 — 'first we solve AI, then use AI to solve everything else'.	AI/AGI	—
prereq	238_052	$100 trillion companies within 5 years (3 years from now, per Diamandis interpretation of Musk)	Markets/Stocks	—
prereq	239_008	Moon base will exist in 10 years	Space	—
prereq	237_023	Baby AGI agents will need and develop an 'immune system' for prompt injection and cybersecurity threats in real time.	AI	—
prereq	239_010	Mass driver on the moon within 10 years	Space	—
prereq	233_021	AI learning will improve via closed-loop reinforcement learning cycle making results keep increasing.	AI	—
prereq	230_022	Elon plans to produce tens of millions of robots per year in just a few years.	Robotics	—

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.750	codex_research_pack	METR - Measuring AI Ability to Complete Long Tasks	—	corroborates	pending	2025-03-19
0.750	codex_research_pack	OECD - Exploring Possible AI Trajectories Through 2030	—	corroborates	pending	2026-04-26
0.720	manifold	Will OpenAI release a GPT version > 5.5 before June 2026?	11%	mentions	pending	2026-05-15
0.685	manifold	at the end of 2026, will an AI be able to generate a full high-quality tv ep to a prompt?	35%	mentions	pending	2026-05-31
0.650	manifold	Will "How Go Players Disempower Themselves to AI" make the top fifty posts in LessWrong's 2026 Annual Review?	34%	mentions	pending	2026-05-02
0.635	manifold	Will OpenAI de-deploy GPT-5.5 before 2027 for safety, security, cyber-risk, or other threat-related reasons?	8%	mentions	pending	2026-05-04
0.624	github_release	openai/openai-python v2.23.0	—	mentions	pending	2026-02-24
0.611	manifold	GPT 5.6 released by…?	—	mentions	pending	2026-05-18
0.606	manifold	When will Indonesia announce the IMO 2026 Team?	—	mentions	pending	2026-04-27
0.603	manifold	What will I get at IMO 2026?	—	mentions	pending	2026-05-03
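The Sim column is described as a cosine similarity. A minimal sketch of that measure follows, assuming each prediction and document is represented by a numeric embedding vector; the vectors below are made up for illustration, and the embedding model the site uses is not stated.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical embeddings for a prediction and a candidate document.
score = cosine_similarity([1.0, 0.0, 1.0], [1.0, 1.0, 0.0])
```

Scores near 1 indicate closely aligned text. Several of the lower-ranked links above (for example the IMO 2026 markets at about 0.60) suggest that similarity alone does not guarantee topical relevance, which is presumably why each row carries a Reviewed status.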

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "intern-level",
  "mode": "FORECAST",
  "role": "Cited-CEO",
  "context": "Altman's capability roadmap: 2026 intern-level; 2028 independent researcher; 2030 surpass peak human expert in all cognitive domains. Framed AGI as a 'phase shift' rather than a finish line.",
  "to_year": 2026,
  "conv_cues": "will reach; specific year; CEO FIRST_PERSON",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "by 2026",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -4,
      "source_id": null,
      "expected_date": "2026-02-14",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -3,
      "source_id": null,
      "expected_date": "2026-03-30",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "prereq",
      "label": "Recursive self-improvement is already happening now (no longer three years out)",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -2,
      "source_id": "238_009",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q3 window check-in (75%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -1,
      "source_id": null,
      "expected_date": "2026-05-13",
      "observed_date": null,
      "miss_emitted_at": "2026-05-30T22:15:00.756418+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "event",
      "label": "By 2026, AI will reach 'intern-level' capability — millions of virtual interns performing supervised, economically useful tasks.",
      "status": "pending",
      "weight": 1,
      "ordinal": 0,
      "source_id": "CMQ_001",
      "expected_date": "2026-06-26",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "Baby AGI agents will need and develop an 'immune system' for prompt injection and cybersecurity threats in real time.",
      "status": "pending",
      "weight": 0.5,
      "ordinal": 1,
      "source_id": "237_023",
      "expected_date": "2027-06-18",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "AI learning will improve via closed-loop reinforcement learning cycle making results keep increasing.",
      "status": "pending",
      "weight": 0.5,
      "ordinal": 2,
      "source_id": "233_021",
      "expected_date": "2028-06-22",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "Elon plans to produce tens of millions of robots per year in just a few years.",
      "status": "pending",
      "weight": 0.5,
      "ordinal": 3,
      "source_id": "230_022",
      "expected_date": "2029-12-10",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "$100 trillion companies within 5 years (3 years from now, per Diamandis interpretation of Musk)",
      "status": "pending",
      "weight": 0.5,
      "ordinal": 4,
      "source_id": "238_052",
      "expected_date": "2030-09-15",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.",
      "status": "pending",
      "weight": 0.5,
      "ordinal": 5,
      "source_id": "235_030",
      "expected_date": "2033-07-30",
      "observed_date": null
    },
    {
      "kind": "cascade",
      "label": "Mass drivers on the moon will shoot AI satellites into deep space; self-sustaining lunar city will follow.",
      "status": "pending",
      "weight": 0.5,
     
... (truncated)