240_014 · prediction · AI · AI-timing

Cost of reasoning models has dropped 1,000x in 16 months

Predictor: Sam Altman · ep#240 "NVIDIA's $1 Trillion Prediction, Anthropic Beats OpenAI, Tesla vs. TSMC & The CS Job Collapse" · source

Prior probability: 60.0%
Current probability: 50.3% (evolves via intake + LBP)
Conviction: 4/5
Signal quality: B
Resolution: pending
Window: 2024-01-01 – 2026-10-31
Edges in / out: 3 / 5
Tickers exposed: 33

Prediction text

Cost of reasoning models has dropped 1,000x in 16 months | To get the same answer to a hard problem from that first model to 5.4 has been a reduction in cost of about 1,000x.

Verbatim quote

From episode "NVIDIA's $1 Trillion Prediction, Anthropic Beats OpenAI, Tesla vs. TSMC & The CS Job Collapse"
To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.

Predictor: Sam Altman

κ + Brier as of 2026-05-22
κ (discount): 0.583
Brier: 0.0625 (excellent)
Hits / Misses: 0 / 0 of 1 resolved
Hit rate: 0.0%
Calibration plot (stated vs observed)

Evidence about this node from Sam Altman is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
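
The κ weighting described above can be sketched as follows. This is a minimal illustration, not the actual /api/intake code: the names `clamp_kappa`, `weight_evidence`, and `brier_score` are hypothetical, and the assumption that κ multiplies a log-likelihood ratio is inferred from the `llr` and `adjusted_llr` fields in this page's raw metadata. The Brier helper uses the standard definition (mean squared difference between stated probability and the 0/1 outcome); the 0.75 forecast in the usage lines is an illustrative value that happens to give 0.0625, not a figure from the page.

```python
KAPPA_FLOOR = 0.10  # stated floor: evidence is effectively silenced
KAPPA_CAP = 1.00    # stated cap: evidence gets full weight


def clamp_kappa(kappa: float) -> float:
    """Keep the discount inside the stated [0.10, 1.00] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weight_evidence(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the clamped discount (assumed form)."""
    return llr * clamp_kappa(kappa)


def brier_score(forecasts, outcomes) -> float:
    """Mean squared error between stated probabilities and 0/1 outcomes."""
    pairs = list(zip(forecasts, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


# llr and kappa copied from the first contribution in the raw metadata.
adjusted = weight_evidence(-0.4054651081081644, 0.5833)
print(adjusted)

# Hypothetical single resolved forecast: stated 0.75, outcome 1.
print(brier_score([0.75], [1]))
```

With κ = 0.5833 the scaled value agrees with the `adjusted_llr` recorded for the quartile checkpoints, which is what suggests the multiplicative form.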

Reference class

Not linked

This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.

Probability over time

7 prob_history rows
[Chart: probability over time, y-axis 0% to 100%, x-axis 2026-04-30 to 2026-05-24. Legend: intake v2, milestone miss sweep, lbp propagation, reference class assigned, legacy v1, prior_prob (analyst seed). Prior 60%; current = 50.3%.]

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.
Leading chain: 3 fired ✓ · 4 overdue ⏱
  1. 2024-08-24 · overdue · Q1 window check-in (25%)
  2. 2025-04-18 · overdue · Q2 window check-in (50%)
  3. 2025-12-11 · overdue · Q3 window check-in (75%)
  4. 2026-01-31 · hit · GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)
    How: Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)
    Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ · 1000x cost collapse · conf 95%
  5. 2026-02-15 · hit · DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent
    How: Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model
    Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ · 20-50x cheaper · conf 92%
  6. 2026-03-15 · hit · $18B allocated to foundation model APIs in 2025 (paradox confirmation)
    How: 2025 industry totals confirm >=$18B spent on foundation model APIs (vs $4B training infra), confirming the cost-down/usage-up paradox
    Source: https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ · Inference paradox · conf 90%
  7. 2026-03-31 · overdue · DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro
    How: DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance
    Source: https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro · DeepSeek V4 · conf 85%
  8. 2026-06-01 → 2026-12-31 · pending · Epoch AI publishes inference price-trend data showing further drops 2026
    How: Epoch AI publishes 2026 inference price-trend update showing reasoning-model cost-per-token down >=50% YoY in 2026
    Source: https://epoch.ai/data-insights/llm-inference-price-trends · Epoch AI trends · conf 80%
  9. 2026-09-01 → 2027-03-31 · pending · Cascade: Enterprise inference spend exceeds $50B 2026 despite per-token drops
    How: 2026 full-year foundation-model-API spend >=$50B globally despite continuing per-token price decline
    Source: Cascade from $18B 2025 base + reasoning-model token explosion · conf 65%
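
As a quick arithmetic check on the price-based criteria in the chain above, the fold reduction implied by two prices is simply their ratio. The sketch below applies that to the two prices quoted in milestone 4 and to the "<=2% the cost" threshold in milestone 7; the helper name `fold_reduction` is hypothetical. Taken alone, $20/M versus $0.40/M is a ratio of about 50, so the ">=1000x" label in that criterion presumably rests on a baseline or definition beyond those two quoted figures.

```python
def fold_reduction(old_price: float, new_price: float) -> float:
    """How many times cheaper new_price is than old_price."""
    if new_price <= 0:
        raise ValueError("new_price must be positive")
    return old_price / new_price


# Milestone 4: prices quoted in the measurement criterion ($ per million tokens).
m4_fold = fold_reduction(20.0, 0.40)

# Price that a full 1000x drop from $20/M would require.
price_for_1000x = 20.0 / 1000

# Milestone 7: "<=2% the cost" expressed as a fold reduction.
m7_fold = fold_reduction(1.0, 0.02)

print(f"milestone 4 ratio: about {m4_fold:.0f}x")
print(f"price needed for 1000x from $20/M: ${price_for_1000x:.2f}/M")
print(f"milestone 7 ratio: about {m7_fold:.0f}x")
```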

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample, which returns the predictions whose marginals shift most under that assumption (live posterior: 50%). Each run takes ~30s; ideal for stress-testing "if X resolves, what else moves?"

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first
LBP · 2026-05-24T02:00:02Z · 50.3% · +1.3pp
Network propagation: 49.0% → 50.3%
4-iter LBP, residual 0.01000 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 806b02f8
LBP · 2026-05-17T02:00:01Z · 49.0% · +2.6pp
Network propagation: 46.5% → 49.0%
5-iter LBP, residual 0.00689 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e607fa96
LBP · 2026-05-10T02:00:02Z · 46.5% · +5.1pp
Network propagation: 41.4% → 46.5%
6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29
LBP · 2026-05-03T02:00:01Z · 41.4% · +9.4pp
Network propagation: 32.1% → 41.4%
6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9
metadata_milestone_miss_sweep · 2026-05-02T22:07:21Z · 32.1% · -21.9pp
metadata_milestone_miss_sweep bayesian_v2 n=4 inside=0.321 blend=0.321 LLR=-0.911 κ=0.58 no_blend
Raw metadata
{
  "trf": 0.17512400843286588,
  "kappa": 0.5833,
  "base_rate": null,
  "predictor": "Sam Altman",
  "total_llr": -1.6218604324326575,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": 0.15932169579091693,
  "bayes_factor": "2.5:1 against",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.5397463846206305,
  "kappa_source": "predictor_table",
  "n_milestones": 4,
  "blend_applied": false,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q1 window check-in (25%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2024-08-24",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q2 window check-in (50%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2025-04-18",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q3 window check-in (75%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2025-12-11",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.495805,
      "label": "DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.85,
      "source_url": "https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro",
      "adjusted_llr": -0.20103162792556845,
      "expected_date": "2026-03-31",
      "measurement_criterion": "DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance"
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.8774131940969938,
  "outside_weight": 0.12258680590300619,
  "posterior_prob": 0.3205526249296874,
  "posterior_logit": -0.7512333248131284,
  "predictor_brier": 0.0625,
  "inside_posterior": 0.3205526249296874,
  "blended_posterior": 0.3205526249296874,
  "reference_class_id": null,
  "total_adjusted_llr": -0.9105550206040454,
  "predictor_n_resolved": 1
}
LBP · 2026-04-30T16:39:51Z · 54.0% · -2.1pp
Network propagation: 56.0% → 54.0%
5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3
LBP · 2026-04-30T02:18:57Z · 56.0% · -4.0pp
Network propagation: 60.0% → 56.0%
5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef
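
The milestone miss sweep entry in this chain (54.0% → 32.1%) can be re-derived from its own raw metadata. The sketch below is a reconstruction under stated assumptions, not the sweep's actual code: it assumes the update is additive in log-odds, and that each contribution's effective κ is the predictor κ multiplied by the milestone's confidence when one is present (0.5833 × 0.85 = 0.495805, which matches the fourth contribution). The variable and function names are hypothetical; the numeric inputs are copied from the JSON.

```python
import math

PRIOR_LOGIT = 0.15932169579091693  # "prior_logit"
PREDICTOR_KAPPA = 0.5833           # "kappa"
MISS_LLR = -0.4054651081081644     # "llr" of every contribution (about ln(2/3))

# (label, confidence) pairs; confidence is None for quartile checkpoints.
contributions = [
    ("Q1 window check-in (25%)", None),
    ("Q2 window check-in (50%)", None),
    ("Q3 window check-in (75%)", None),
    ("DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro", 0.85),
]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def effective_kappa(confidence) -> float:
    """Assumed rule: predictor kappa, further scaled by milestone confidence."""
    return PREDICTOR_KAPPA * (1.0 if confidence is None else confidence)


total_adjusted_llr = sum(MISS_LLR * effective_kappa(c) for _, c in contributions)
posterior_logit = PRIOR_LOGIT + total_adjusted_llr
posterior_prob = sigmoid(posterior_logit)
odds_against = math.exp(-total_adjusted_llr)

print(f"inside prior      : {sigmoid(PRIOR_LOGIT):.4f}")
print(f"total adjusted LLR: {total_adjusted_llr:.4f}")
print(f"posterior         : {posterior_prob:.4f}")
print(f"Bayes factor      : {odds_against:.1f}:1 against")
```

Under these assumptions the results agree with the recorded `total_adjusted_llr`, `posterior_logit`, and `posterior_prob`, and the exponentiated total matches the "2.5:1 against" Bayes factor. Because no reference class is linked, no outside-view blend is applied, so the inside posterior is the final value.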

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind · Node · Their prob · P(c|s=T) · P(c|s=F) · Δ implied
killer · TK03 AI Regulatory Moratorium (EU/US Capability Freeze) · 10.0% · 0.050 · 0.600 · +0.042
killer · TK01 AGI Capability Plateau (2026-27 Training Stall) · 15.0% · 0.050 · 0.600 · +0.014
killer · TK14 Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates) · 20.0% · 0.050 · 0.600 · -0.013
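
One reading that is consistent with the incoming rows above: "Δ implied" is the child probability implied by the edge (law of total probability over the parent's state) minus this node's current probability of 50.3%. This is inferred from the displayed numbers, not taken from documentation, and the function name `implied_delta` is hypothetical. The same reading matches the outgoing rows only approximately, so treat it as a sketch.

```python
def implied_delta(p_source: float, p_c_true: float, p_c_false: float,
                  current: float) -> float:
    """Edge-implied child probability minus the child's current probability."""
    implied = p_c_true * p_source + p_c_false * (1.0 - p_source)
    return implied - current


CURRENT_PROB = 0.503  # this node's current probability

# node id -> (their prob, P(c|s=T), P(c|s=F)), copied from the incoming rows
incoming = {
    "TK03": (0.10, 0.05, 0.60),
    "TK01": (0.15, 0.05, 0.60),
    "TK14": (0.20, 0.05, 0.60),
}

deltas = {
    node: implied_delta(p, t, f, CURRENT_PROB)
    for node, (p, t, f) in incoming.items()
}

for node, delta in deltas.items():
    print(f"{node}: {delta:+.4f}")
```

Since a killer edge has a low P(c|s=T), the implied value falls as the killer's own probability rises, which is why TK14 (at 20%) is the only incoming edge pulling this node down.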

Top outgoing (children)

Predictions THIS node influences

Kind · Node · Predictor · Their prob · P(c|s=T) · P(c|s=F) · Δ implied
prereq · 232_055 We're exiting the industrial age permanently as recursive se… · Peter Diamandis · 35.5% · 0.700 · 0.050 · +0.027
prereq · 235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203… · Ray Kurzweil · 39.2% · 0.750 · 0.050 · +0.015
prereq · 231_013 Math is cooked (will be solved), physics cooked, biology cha… · Alex Wissner-Gross · 35.4% · 0.620 · 0.050 · -0.013
prereq · CMQ_002 By 2028, AI systems will reach 'independent researcher' leve… · Sam Altman · 31.4% · 0.550 · 0.050 · -0.009
prereq · 241_043 ASI will arrive within 2 years to 5 years to this next decad… · Peter Diamandis · 35.9% · 0.650 · 0.050 · -0.002

Ticker exposure

33 ticker(s) linked

Beneficiaries (23)

SOUN, CRWV, SITM, NVDA, ARM, GTLB, BBAI, TSM, APLD, CEVA, AI, MSFT, MRVL, SFTBY, ORCL, QCOM, AVGO, BABA, AMD, GOOGL, IBM, AMZN, META

Adverse (6)

WNS, CHGG, CTSH, IBM, INFY, ACN

Prerequisites (3)

Predictions that must hit first
Type · Pred · Title · Domain · Lag
killer · TK14 · Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates)
killer · TK01 · AGI Capability Plateau (2026-27 Training Stall)
killer · TK03 · AI Regulatory Moratorium (EU/US Capability Freeze)

Dependents (5)

Predictions enabled by this
Type · Pred · Title · Domain · Lag
prereq · 235_030 · Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033. · Biotech/Longevity
prereq · 232_055 · We're exiting the industrial age permanently as recursive self-improvement unfolds. · AI
prereq · 241_043 · ASI will arrive within 2 years to 5 years to this next decade · AI
prereq · 231_013 · Math is cooked (will be solved), physics cooked, biology char broiled. · AI
prereq · CMQ_002 · By 2028, AI systems will reach 'independent researcher' level, driving autonomous scientific discoveries without human intervention. · AI

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook
{
  "nia": false,
  "qty": "1000x",
  "url": "https://www.youtube.com/watch?v=uOGHXAfvK8w",
  "mode": "CITED_PREDICTION",
  "role": "Cited-Executive",
  "context": "our first reasoning model was called 01 came out like 16 months ago. Uh and our latest model where we now integrated reasoning is 5.4. To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
  "to_year": 2026,
  "cited_by": "Peter Diamandis",
  "verbatim": "To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
  "conv_cues": "has been",
  "direction": "DOWN",
  "from_year": 2024,
  "timeframe": "Past 16 months / ongoing",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -7,
      "source_id": null,
      "expected_date": "2024-08-24",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -6,
      "source_id": null,
      "expected_date": "2025-04-18",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q3 window check-in (75%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -5,
      "source_id": null,
      "expected_date": "2025-12-11",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "llm_pre_event",
      "label": "GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)",
      "source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 1000x cost collapse",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
      "expected_date": "2026-01-31",
      "observed_date": "2026-01-31",
      "research_origin": "deep_research",
      "measurement_criterion": "Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)"
    },
    {
      "kind": "llm_pre_event",
      "label": "DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent",
      "source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 20-50x cheaper",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -3,
      "source_id": null,
      "confidence": 0.92,
      "source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
      "expected_date": "2026-02-15",
      "observed_date": "2026-02-15",
      "research_origin": "deep_research",
      "measurement_criterion": "Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model"
    },
    {
      "kind": "llm_pre_event",
      "label": "$18B allocated to foundation model APIs in 2025 (paradox confirmation)",
      "source": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ — Inference paradox",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -2,
      "source_id": null,
      "confidence": 0.9,
      "source_url": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/",
      "expected_date": "2026-03-15",
      "observ
... (truncated)