234_019 · prediction · AI · AI-scaling

Expect big AI capability improvements via recursive self-improvement over next few weeks

Predictor: Alex Wissner-Gross · ep#234 "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced"

Prior probability

50.0%

Current probability

39.4%

evolves via intake + LBP

Conviction

4/5

Signal quality

Resolution

pending

Window

2026-01-01 – 2026-11-30

Edges in / out

11 / 5

Tickers exposed

Prediction text

Expect big AI capability improvements via recursive self-improvement over next few weeks | So expect big things over the next few weeks. We're capability jumps in weeks not quarters.

Verbatim quote

From episode "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced"

So expect big things over the next few weeks. We're capability jumps in weeks not quarters.

Predictor: Alex Wissner-Gross

κ + Brier as of 2026-05-22

κ (discount)

0.844

Brier

0.0341

excellent

Hits / Misses

6 / 1

of 11 resolved

Hit rate

54.5%

Calibration plot (stated vs observed)

Evidence about this node from Alex Wissner-Gross is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
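
A minimal sketch of the weighting described above, assuming κ acts as a plain multiplier on a log-likelihood ratio after being clamped to the stated [0.10, 1.00] range. The function names are hypothetical and this is not the actual /api/intake code.

```python
KAPPA_FLOOR = 0.10  # from the text: "floors at 0.10"
KAPPA_CAP = 1.00    # from the text: "caps at 1.00"


def clamp_kappa(kappa: float) -> float:
    """Clamp a predictor discount into the documented [0.10, 1.00] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def discount_evidence(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the clamped discount (assumed form)."""
    return clamp_kappa(kappa) * llr


if __name__ == "__main__":
    # kappa = 0.844 for this predictor; the LLR of -0.4 is purely illustrative.
    print(discount_evidence(-0.4, 0.844))
```

With κ = 0.844, evidence attributed to this predictor keeps most of its weight; a predictor at the 0.10 floor would contribute a tenth of the same evidence.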

Reference class: agi_breakthrough_5y

Linked via embedding similarity 0.621

Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)

Base rate

20.0%

1/5 historical

Inside weight

—

Outside weight

—

no pull

inside 39.4% → blend 39.4% (Δ 0.0pp)

Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
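
The two endpoints above are consistent with a linear schedule, w_in = 1 - 0.7·TRF, which also matches the `trf` and `inside_weight` values in this node's raw metadata. The sketch below assumes that schedule and assumes the blend is a weighted average in log-odds space. Both are inferred from the values on this page rather than documented, and the function names are illustrative.

```python
import math


def inside_weight(trf: float) -> float:
    """Linear schedule: 0.3 at TRF=1, 1.0 at TRF=0 (interpolation is assumed)."""
    trf = min(1.0, max(0.0, trf))
    return 1.0 - 0.7 * trf


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def blend(inside: float, base_rate: float, trf: float) -> float:
    """Blend the inside view with the base rate in log-odds space (assumed)."""
    w_in = inside_weight(trf)
    z = w_in * logit(inside) + (1.0 - w_in) * logit(base_rate)
    return sigmoid(z)


if __name__ == "__main__":
    # Values taken from the milestone-miss-sweep record on this page.
    print(inside_weight(0.6338685427014515))
    print(blend(0.11972961695514925, 0.2, 0.6338685427014515))
```

Blending in log-odds rather than raw probability keeps the result inside (0, 1) and lets the weights act on evidence strength instead of on percentage points.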

Probability over time

9 prob_history rows

Chart legend: intake v2 · milestone miss sweep · LBP propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 39.4%

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.

Leading chain: 4 fired ✓ · 5 overdue ⏱ · 1 pending

2026-02-15 · overdue · OpenAI's GPT-5.5 trained at Stargate Abilene with internal AI assistance
How: OpenAI publicly confirms GPT-5.5 was trained at the flagship Stargate Abilene site with significant AI-assisted training/debugging contributions, citing recursive self-improvement loop language
Source: https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/ (OpenAI infrastructure announcement) · conf 85%
2026-02-28 · overdue · Anthropic publicly states most code now written by AI
How: Anthropic executive (CEO/CTO) publicly confirms majority of internal code generation is AI-driven, indicating recursive feedback loop in production
Source: https://eu.36kr.com/en/p/3680937611062920 (Silicon Valley Article on AI Singularity) · conf 80%
2026-02-01 → 2026-03-31 · overdue · Major frontier model release cluster (Feb-Mar 2026)
How: Google, Anthropic, OpenAI, xAI, and Alibaba all release significant model updates within 60-day window with measurable benchmark improvements >5pp on capability suites
Source: https://blog.mean.ceo/new-ai-model-releases-news-april-2026/ (AI model release roundup) · conf 92%
2026-03-05 · overdue · GPT-5.4 sets new computer-use benchmark record
How: OpenAI releases GPT-5.4 with publicly reported benchmark improvement >5pp on at least one major capability benchmark (OSWorld, SWE-Bench, MMLU)
Source: https://kersai.com/ai-breakthroughs-in-2026-march-update/ (March 2026 AI breakthroughs) · conf 90%
2026-04-15 · overdue · Recursive Superintelligence startup funding signal
How: Self-improving AI startup raises >=$500M funding round within 6 months of founding, validating market belief in recursive self-improvement
Source: https://the-decoder.com/self-improving-ai-startup-recursive-superintelligence-pulls-in-500-million-just-four-months-after-founding/ (The Decoder) · conf 90%
2026-04-29 · hit · Nvidia became the world's first $5 trillion company (late 2025), operating a near-monopoly on advanced AI chips.
2026-04-29 · hit · Nvidia Data Center revenue +66% YoY, contributing ~90% of $57B fiscal Q3 revenue; >$4.5T market cap entirely underpinned by AI silicon.
2026-04-29 · hit · Nvidia's Arizona-based TSMC factory successfully fabricated cutting-edge semiconductors on US soil for first time in decades (October 2025).
2026-04-29 · hit · Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) across engineering.
2026-06-25 · pending · Nvidia agreed to remit 15% of China chip-sale revenue directly to US government in exchange for reversing specific AI chip export bans.
2026-07-29 · pending · Expect big AI capability improvements via recursive self-improvement over next few weeks
2026-08-01 → 2026-12-31 · pending · Cascade: AI capability gains compress to <60-day cycles
How: Major frontier-model release cadence drops below 60 days between SOTA-setting releases from any single lab, indicating accelerating self-improvement loop
Source: Pattern extrapolation from observed Dec 2025 to Feb 2026 OpenAI cadence · conf 55%
2028-06-25 · pending · We're exiting the industrial age permanently as recursive self-improvement unfolds.
2030-09-27 · pending · Most large companies' business models will be disrupted in 2-5 years
2063-06-21 · pending · Peter's 14-year-old son Milan will never get a driver's license.

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample (~30s per run). The run surfaces the predictions whose marginals shift most under that assumption, which makes it useful for stress-testing "if X resolves, what else moves?"

(live posterior: 39%)

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

LBP · 2026-05-24T02:00:02Z · 39.4% · +2.0pp

Network propagation: 37.4% → 39.4%

4-iter LBP, residual 0.01000 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 806b02f8

LBP · 2026-05-17T02:00:01Z · 37.4% · +3.9pp

Network propagation: 33.5% → 37.4%

5-iter LBP, residual 0.00689 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e607fa96

LBP · 2026-05-10T02:00:02Z · 33.5% · +7.3pp

Network propagation: 26.2% → 33.5%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

LBP · 2026-05-03T02:00:01Z · 26.2% · +11.1pp

Network propagation: 15.1% → 26.2%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

metadata_milestone_miss_sweep · 2026-05-02T22:07:21Z · 15.1% · -22.6pp

metadata_milestone_miss_sweep bayesian_v2 n=5 inside=0.120 blend=0.151 LLR=-1.495 κ=0.84 w_in=0.56 agi_breakthrough_5y

Raw metadata

{
  "trf": 0.6338685427014515,
  "kappa": 0.8438,
  "base_rate": 0.2,
  "predictor": "Alex Wissner-Gross",
  "total_llr": -2.027325540540822,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": -0.4998786328260134,
  "bayes_factor": "4.5:1 against",
  "blend_reason": "blend 55% inside / 44% outside (TRF=0.634, base_rate=0.200 from agi_breakthrough_5y)",
  "inside_prior": 0.37756919095844854,
  "kappa_source": "predictor_table",
  "n_milestones": 5,
  "blend_applied": true,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.7172299999999999,
      "label": "OpenAI's GPT-5.5 trained at Stargate Abilene with internal AI assistance",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.85,
      "source_url": "https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/",
      "adjusted_llr": -0.2908117394884187,
      "expected_date": "2026-02-15",
      "measurement_criterion": "OpenAI publicly confirms GPT-5.5 was trained at the flagship Stargate Abilene site with significant AI-assisted training/debugging contributions, citing recursive self-improvement loop language"
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.6750400000000001,
      "label": "Anthropic publicly states most code now written by AI",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.8,
      "source_url": "https://eu.36kr.com/en/p/3680937611062920",
      "adjusted_llr": -0.27370516657733535,
      "expected_date": "2026-02-28",
      "measurement_criterion": "Anthropic executive (CEO/CTO) publicly confirms majority of internal code generation is AI-driven, indicating recursive feedback loop in production"
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.776296,
      "label": "Major frontier model release cluster (Feb-Mar 2026)",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.92,
      "source_url": "https://blog.mean.ceo/new-ai-model-releases-news-april-2026/",
      "adjusted_llr": -0.3147609415639356,
      "expected_date": "2026-03-02",
      "measurement_criterion": "Google, Anthropic, OpenAI, xAI, and Alibaba all release significant model updates within 60-day window with measurable benchmark improvements >5pp on capability suites"
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.75942,
      "label": "GPT-5.4 sets new computer-use benchmark record",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.9,
      "source_url": "https://kersai.com/ai-breakthroughs-in-2026-march-update/",
      "adjusted_llr": -0.3079183123995022,
      "expected_date": "2026-03-05",
      "measurement_criterion": "OpenAI releases GPT-5.4 with publicly reported benchmark improvement >5pp on at least one major capability benchmark (OSWorld, SWE-Bench, MMLU)"
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.75942,
      "label": "Recursive Superintelligence startup funding signal",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.9,
      "source_url": "https://the-decoder.com/self-improving-ai-startup-recursive-superintelligence-pulls-in-500-million-just-four-months-after-founding/",
      "adjusted_llr": -0.3079183123995022,
      "expected_date": "2026-04-15",
      "measurement_criterion": "Self-improving AI startup raises >=$500M funding round within 6 months of founding, validating market belief in recursive self-improvement"
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.5562920201089838,
  "outside_weight": 0.4437079798910162,
  "posterior_prob": 0.15123998288627608,
  "posterior_logit": -1.9949931052547074,
  "predictor_brier": 0.03413,
  "inside_posterior": 0.11972961695514925,
  "blended_posterior": 0.15123998288627608,
  "reference_class_id": "agi_breakthrough_5y",
  "total_adjusted_llr": -1.495114472428694,
  "predictor_n_resolved": 11
}
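
The fields in this record appear to fit together as a log-odds update: each missed milestone's `llr` is scaled by the predictor κ times that milestone's `confidence`, the scaled values are summed, and the sum is added to `prior_logit`. The sketch below recomputes the record's totals under that reading. The formula is inferred from the values shown here, not taken from the pipeline's source, and the variable names are illustrative.

```python
import math

# Values copied from the raw metadata record above.
KAPPA = 0.8438                       # "kappa"
PRIOR_LOGIT = -0.4998786328260134    # "prior_logit"
LLR_PER_MISS = -0.4054651081081644   # "llr", identical for all five milestones
CONFIDENCES = [0.85, 0.80, 0.92, 0.90, 0.90]  # "confidence" per milestone


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Assumed per-milestone discount: predictor kappa times milestone confidence.
adjusted = [LLR_PER_MISS * KAPPA * c for c in CONFIDENCES]

total_adjusted_llr = sum(adjusted)
posterior_logit = PRIOR_LOGIT + total_adjusted_llr
inside_posterior = sigmoid(posterior_logit)
odds_against = math.exp(-total_adjusted_llr)  # Bayes factor against the claim

print(f"total_adjusted_llr = {total_adjusted_llr:.6f}")
print(f"posterior_logit    = {posterior_logit:.6f}")
print(f"inside_posterior   = {inside_posterior:.6f}")
print(f"bayes_factor       = {odds_against:.1f}:1 against")
```

Under these assumptions the recomputed totals agree with the record's `total_adjusted_llr`, `posterior_logit`, `inside_posterior`, and `bayes_factor` fields. The final `blended_posterior` additionally mixes in the reference-class base rate.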

LBP · 2026-04-30T16:39:51Z · 37.8% · +3.5pp

Network propagation: 34.3% → 37.8%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

legacy v1 · 2026-04-30T16:13:50Z · 34.3% · -4.0pp

reference_class_assigned bayesian_v2 inside=0.500 blend=0.343 w_in=0.53 agi_breakthrough_5y

LBP · 2026-04-30T02:18:57Z · 38.3% · +4.1pp

Network propagation: 34.2% → 38.3%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

legacy v1 · 2026-04-30T01:56:50Z · 34.2% · -15.8pp

reference_class_assigned bayesian_v2 inside=0.500 blend=0.342 w_in=0.53 agi_breakthrough_5y

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.500	+0.061
killer	TK09 Energy Grid Cap (Data Center Power Wall)	35.0%	0.050	0.500	-0.052
killer	TK02 AI Compute Supply Shock (TSMC/Taiwan Disruption)	12.0%	0.050	0.500	+0.052
prereq	SEM_015 Nvidia agreed to remit 15% of China chip-sale revenue direct — Jensen Huang	66.3%	0.500	0.050	-0.042
prereq	SEM_027 Nvidia Data Center revenue +66% YoY, contributing ~90% of $5 — Joseph Moore	68.3%	0.500	0.050	-0.042

Top outgoing (children)

Predictions THIS node influences

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
prereq	247_023 AI will be able to do everything a white collar worker does — Dave Blundin	40.8%	0.720	0.050	-0.086
prereq	244_019 Peter's son won't need a driver's license in 2 years — Peter Diamandis	48.4%	0.920	0.050	-0.081
prereq	242_031 Most large companies' business models will be disrupted in 2 — Peter Diamandis	36.1%	0.650	0.050	-0.068
prereq	230_020 Peter's 14-year-old son Milan will never get a driver's lice — Peter Diamandis	34.7%	0.650	0.050	-0.054
prereq	232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis	35.5%	0.700	0.050	-0.041
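
One way to read the two conditional columns: given a probability for the source node s, an edge's P(c|s=T) and P(c|s=F) imply a marginal for the child c by the law of total probability. The sketch below shows only that textbook calculation for a single edge. It does not attempt to reproduce the Δ implied column, whose exact formula is not given on this page, and the function name is hypothetical.

```python
def implied_child_prob(p_source: float,
                       p_c_given_true: float,
                       p_c_given_false: float) -> float:
    """Marginal P(c) from one edge via the law of total probability."""
    return p_c_given_true * p_source + p_c_given_false * (1.0 - p_source)


if __name__ == "__main__":
    # Edge to 247_023 with this node at its current 39.4%.
    print(implied_child_prob(0.394, 0.720, 0.050))
```

A child with several parents would receive one such message per edge, which is why the network-level figures come from LBP rather than from any single row.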

Ticker exposure

37 ticker(s) linked

Beneficiaries (24)

MU WULF IREN EQIX ALAB APLD ASMIY ASML PLAB NVDA NBIS CRWV AAPL AMT AMZN DELL GOOGL IRM LNVGY META MSFT ORCL SFTBY STX

Adverse (6)

ACN GEN CHGG IBM WNS LRN

Prerequisites (11)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
prereq	SEM_011	Nvidia became the world's first $5 trillion company (late 2025), operating a near-monopoly on advanced AI chips.	Capital Markets	—
prereq	SEM_027	Nvidia Data Center revenue +66% YoY, contributing ~90% of $57B fiscal Q3 revenue; >$4.5T market cap entirely underpinned by AI silicon.	Capital Markets	—
prereq	SEM_014	Nvidia's Arizona-based TSMC factory successfully fabricated cutting-edge semiconductors on US soil for first time in decades (October 2025).	Manufacturing	—
prereq	SEM_012	Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) across engineering.	AI/Manufacturing	—
prereq	SEM_015	Nvidia agreed to remit 15% of China chip-sale revenue directly to US government in exchange for reversing specific AI chip export bans.	Policy/Semis	—
correlate	S_ASI_SLOW_2040PLUS	ASI slow: post-2040 / soft takeoff	asi_recursive_self_improvement	—
killer	TK09	Energy Grid Cap (Data Center Power Wall)	—	—
killer	TK05	Rate Regime Persistence (10y > 5% through 2028)	—	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK02	AI Compute Supply Shock (TSMC/Taiwan Disruption)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (5)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
prereq	244_019	Peter's son won't need a driver's license in 2 years	Auto/Transport	—
prereq	247_023	AI will be able to do everything a white collar worker does imminently	AI	—
prereq	232_055	We're exiting the industrial age permanently as recursive self-improvement unfolds.	AI	—
prereq	242_031	Most large companies' business models will be disrupted in 2-5 years	Markets/Stocks	—
prereq	230_020	Peter's 14-year-old son Milan will never get a driver's license.	Auto/Transport	—

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.758	manifold	Will AI continue to improve?	84%	mentions	pending	2026-06-01
0.726	manifold	Is ai going to be self repairing?	10%	mentions	pending	2026-05-25
0.635	arxiv	Three-Stage Learning Unlocks Strong Performance in Simple Models for Long-Term Time Series Forecasting	—	mentions	pending	2026-05-13
0.630	manifold	I go through the scaling book this week?	32%	mentions	pending	2026-05-04
0.621	manifold	What goals will I achieve this week?	—	mentions	pending	2026-05-10
0.607	manifold	Which of these will I achieve?	—	mentions	pending	2026-04-24
0.591	manifold	What goals will I achieve this week?	—	mentions	pending	2026-05-18
0.591	manifold	What goals will I achieve this week?	—	mentions	pending	2026-05-25
0.591	manifold	What goals will I achieve this week?	—	mentions	pending	2026-06-01
0.589	manifold	Will I speak to a human in the next week?	72%	mentions	pending	2026-05-04

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "orders of magnitude improvement in capability density by parameter",
  "url": "https://www.youtube.com/watch?v=dmtvGKuRE64",
  "mode": "PREDICTION",
  "role": "Host",
  "context": "we're starting even over the past week or two, we're getting into the era when you can get smarter, better, faster models by asking a previous model just emit the weights, the parameters directly for a successor model and you can get orders of magnitude improvement in terms of capability density by by parameter. So expect big things over the next few weeks.",
  "to_year": 2026,
  "verbatim": "So expect big things over the next few weeks. We're capability jumps in weeks not quarters.",
  "conv_cues": "expect big things",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "Next few weeks (March-April 2026)",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "OpenAI's GPT-5.5 trained at Stargate Abilene with internal AI assistance",
      "source": "https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/ — OpenAI infrastructure announcement",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -10,
      "source_id": null,
      "confidence": 0.85,
      "source_url": "https://openai.com/index/building-the-compute-infrastructure-for-the-intelligence-age/",
      "expected_date": "2026-02-15",
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep",
      "research_origin": "deep_research",
      "measurement_criterion": "OpenAI publicly confirms GPT-5.5 was trained at the flagship Stargate Abilene site with significant AI-assisted training/debugging contributions, citing recursive self-improvement loop language"
    },
    {
      "kind": "llm_pre_event",
      "label": "Anthropic publicly states most code now written by AI",
      "source": "https://eu.36kr.com/en/p/3680937611062920 — Silicon Valley Article on AI Singularity",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -9,
      "source_id": null,
      "confidence": 0.8,
      "source_url": "https://eu.36kr.com/en/p/3680937611062920",
      "expected_date": "2026-02-28",
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep",
      "research_origin": "deep_research",
      "measurement_criterion": "Anthropic executive (CEO/CTO) publicly confirms majority of internal code generation is AI-driven, indicating recursive feedback loop in production"
    },
    {
      "kind": "llm_pre_event",
      "label": "Major frontier model release cluster (Feb-Mar 2026)",
      "source": "https://blog.mean.ceo/new-ai-model-releases-news-april-2026/ — AI model release roundup",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.92,
      "source_url": "https://blog.mean.ceo/new-ai-model-releases-news-april-2026/",
      "expected_date": "2026-03-02",
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2026-03-31",
        "from": "2026-02-01"
      },
      "measurement_criterion": "Google, Anthropic, OpenAI, xAI, and Alibaba all release significant model updates within 60-day window with measurable benchmark improvements >5pp on capability suites"
    },
    {
      "kind": "llm_pre_event",
      "label": "GPT-5.4 sets new computer-use benchmark record",
      "source": "https://kersai.com/ai-breakthroughs-in-2026-march-update/ — March 2026 AI breakthroughs",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.9,
      "source_url": "https://kersai.com/ai-breakthroughs-in-2026-march-update/",
      "expected_date": "2026-03-05",
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:
... (truncated)