240_014 · prediction · AI · AI-timing

Cost of reasoning models has dropped 1,000x in 16 months

Predictor: Sam Altman · ep#240 "NVIDIA's $1 Trillion Prediction, Anthropic Beats OpenAI, Tesla vs. TSMC & The CS Job Collapse" · source

Prior probability

60.0%

Current probability

50.3%

evolves via intake + LBP

Conviction

4/5

Signal quality

Resolution

pending

Window

2024-01-01 – 2026-10-31

Edges in / out

3 / 5

Tickers exposed

Prediction text

Cost of reasoning models has dropped 1,000x in 16 months | To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about 1,000x.

Verbatim quote

From episode "NVIDIA's $1 Trillion Prediction, Anthropic Beats OpenAI, Tesla vs. TSMC & The CS Job Collapse"

To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about 1,000x.

Predictor: Sam Altman

κ + Brier as of 2026-05-22


κ (discount)

0.583

Brier

0.0625

excellent

Hits / Misses

0 / 0

of 1 resolved

Hit rate

0.0%

Calibration plot (stated vs observed)

Evidence about this node from Sam Altman is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
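
The κ discount described above can be sketched as a clamp followed by a multiplication. This is a minimal illustration, not the code behind /api/intake: the function names are hypothetical, and applying κ to the log-likelihood ratio (rather than some other evidence quantity) is an assumption based on the `llr` and `adjusted_llr` fields in the raw metadata further down this page.

```python
# Hypothetical sketch of the kappa discount; names are illustrative only.
KAPPA_FLOOR = 0.10  # "effectively silenced"
KAPPA_CAP = 1.00    # "full weight"


def clamp_kappa(kappa: float) -> float:
    """Keep kappa inside the [0.10, 1.00] band stated on the page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def discount_llr(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the predictor's clamped kappa."""
    return clamp_kappa(kappa) * llr


# Values copied from the first contribution in the raw metadata on this page.
raw_llr = -0.4054651081081644
adjusted = discount_llr(raw_llr, 0.5833)
print(f"adjusted_llr = {adjusted:.6f}")
```

With κ = 0.5833 the result agrees with the `adjusted_llr` reported for the quartile checkpoints, which suggests the discount is a plain product on the log-odds scale.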

Reference class

Not linked

This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.

Probability over time

7 prob_history rows

Legend: intake v2 · milestone miss sweep · lbp propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 50.3%

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.

Leading chain: 3 fired ✓ · 4 overdue ⏱

2024-08-24 · overdue · Q1 window check-in (25%)
2025-04-18 · overdue · Q2 window check-in (50%)
2025-12-11 · overdue · Q3 window check-in (75%)
2026-01-31 · hit · GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)
How: Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)
Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ — 1000x cost collapse · conf 95%
2026-02-15 · hit · DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent
How: Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model
Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ — 20-50x cheaper · conf 92%
2026-03-15 · hit · $18B allocated to foundation model APIs in 2025 (paradox confirmation)
How: 2025 industry totals confirm >=$18B spent on foundation model APIs (vs $4B training infra) — confirms cost-down/usage-up paradox
Source: https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ — Inference paradox · conf 90%
2026-03-31 · overdue · DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro
How: DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance
Source: https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro — DeepSeek V4 · conf 85%
2026-08-05 · pending · Cost of reasoning models has dropped 1,000x in 16 months
2026-06-01 → 2026-12-31 · pending · Epoch AI publishes inference price-trend data showing further drops 2026
How: Epoch AI publishes 2026 inference price-trend update showing reasoning-model cost-per-token down >=50% YoY in 2026
Source: https://epoch.ai/data-insights/llm-inference-price-trends — Epoch AI trends · conf 80%
2026-09-01 → 2027-03-31 · pending · Cascade: Enterprise inference spend exceeds $50B 2026 despite per-token drops
How: 2026 full-year foundation-model-API spend >=$50B globally despite continuing per-token price decline
Source: Cascade from $18B 2025 base + reasoning-model token explosion · conf 65%
2027-06-26 · pending · Math is cooked (will be solved), physics cooked, biology char broiled.
2028-06-25 · pending · We're exiting the industrial age permanently as recursive self-improvement unfolds.
2028-09-07 · pending · By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.
2033-07-30 · pending · Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.
2033-08-10 · pending · ASI will arrive within 2 years to 5 years to this next decade
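
The three quartile check-in dates in the chain above can be reproduced with simple date arithmetic. The sketch below is an inference from the displayed dates, not the logic of scripts/ops/derive_milestones.py: it assumes the checkpoints sit at 25%, 50% and 75% of the span from the window start (2024-01-01) to the pending resolution date shown in the chain (2026-08-05), with fractional days truncated. The function name is hypothetical.

```python
from datetime import date, timedelta


def quartile_checkpoints(start: date, end: date,
                         fractions=(0.25, 0.50, 0.75)) -> list[date]:
    """Place check-ins at fixed fractions of the span, truncating to whole days."""
    total_days = (end - start).days
    return [start + timedelta(days=int(total_days * f)) for f in fractions]


# Assumed endpoints: window start and the chain's pending resolution date.
checkpoints = quartile_checkpoints(date(2024, 1, 1), date(2026, 8, 5))
for fraction, day in zip((25, 50, 75), checkpoints):
    print(f"{fraction}% check-in: {day.isoformat()}")
```

Using the window end of 2026-10-31 instead would not give the listed dates, which is why the resolution date is used as the right-hand endpoint here.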

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample. Surfaces the predictions whose marginals shift most under that assumption.

(live posterior: 50%)

Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

LBP · 2026-05-24T02:00:02Z · 50.3% · +1.3pp

Network propagation: 49.0% → 50.3%

4-iter LBP, residual 0.01000 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 806b02f8

LBP · 2026-05-17T02:00:01Z · 49.0% · +2.6pp

Network propagation: 46.5% → 49.0%

5-iter LBP, residual 0.00689 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e607fa96

LBP · 2026-05-10T02:00:02Z · 46.5% · +5.1pp

Network propagation: 41.4% → 46.5%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

LBP · 2026-05-03T02:00:01Z · 41.4% · +9.4pp

Network propagation: 32.1% → 41.4%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

metadata_milestone_miss_sweep · 2026-05-02T22:07:21Z · 32.1% · -21.9pp

metadata_milestone_miss_sweep bayesian_v2 n=4 inside=0.321 blend=0.321 LLR=-0.911 κ=0.58 no_blend

Raw metadata

{
  "trf": 0.17512400843286588,
  "kappa": 0.5833,
  "base_rate": null,
  "predictor": "Sam Altman",
  "total_llr": -1.6218604324326575,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": 0.15932169579091693,
  "bayes_factor": "2.5:1 against",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.5397463846206305,
  "kappa_source": "predictor_table",
  "n_milestones": 4,
  "blend_applied": false,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q1 window check-in (25%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2024-08-24",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q2 window check-in (50%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2025-04-18",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "quartile_checkpoint",
      "kappa": 0.5833,
      "label": "Q3 window check-in (75%)",
      "weight": 0.05,
      "strength": "weak",
      "confidence": null,
      "source_url": null,
      "adjusted_llr": -0.2365077975594923,
      "expected_date": "2025-12-11",
      "measurement_criterion": null
    },
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.495805,
      "label": "DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.85,
      "source_url": "https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro",
      "adjusted_llr": -0.20103162792556845,
      "expected_date": "2026-03-31",
      "measurement_criterion": "DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance"
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.8774131940969938,
  "outside_weight": 0.12258680590300619,
  "posterior_prob": 0.3205526249296874,
  "posterior_logit": -0.7512333248131284,
  "predictor_brier": 0.0625,
  "inside_posterior": 0.3205526249296874,
  "blended_posterior": 0.3205526249296874,
  "reference_class_id": null,
  "total_adjusted_llr": -0.9105550206040454,
  "predictor_n_resolved": 1
}
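
The numbers in the metadata above are internally consistent with a log-odds update, which the following sketch reproduces. It is a reconstruction from the reported fields, not the pipeline's own code. Two things are assumptions read off the values: the effective κ for an `llm_pre_event` contribution equals the predictor κ times the milestone confidence (0.5833 × 0.85 = 0.495805), and the `weight` field does not enter this arithmetic. Helper names are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


PREDICTOR_KAPPA = 0.5833
RAW_LLR = -0.4054651081081644  # reported for every missed milestone; equals ln(2/3)

# (kind, confidence) for the four contributions listed in the metadata.
misses = [
    ("quartile_checkpoint", None),
    ("quartile_checkpoint", None),
    ("quartile_checkpoint", None),
    ("llm_pre_event", 0.85),
]


def effective_kappa(confidence):
    """Assumption: confidence, when present, further scales the predictor kappa."""
    return PREDICTOR_KAPPA if confidence is None else PREDICTOR_KAPPA * confidence


total_adjusted_llr = sum(RAW_LLR * effective_kappa(conf) for _, conf in misses)

prior_logit = logit(0.5397463846206305)           # inside_prior
posterior_logit = prior_logit + total_adjusted_llr
posterior_prob = sigmoid(posterior_logit)
bayes_factor_against = math.exp(-total_adjusted_llr)

print(f"total_adjusted_llr = {total_adjusted_llr:.4f}")
print(f"posterior_prob     = {posterior_prob:.4f}")
print(f"bayes factor       = {bayes_factor_against:.1f}:1 against")
```

Because no reference class is linked, `blend_applied` is false and the inside posterior is used as the final posterior, so the sketch stops at the sigmoid step.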

LBP · 2026-04-30T16:39:51Z · 54.0% · -2.1pp

Network propagation: 56.0% → 54.0%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

LBP · 2026-04-30T02:18:57Z · 56.0% · -4.0pp

Network propagation: 60.0% → 56.0%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

Network propagation neighbors

Top edges sorted by latest LBP cross-impact


Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.600	+0.042
killer	TK01 AGI Capability Plateau (2026-27 Training Stall)	15.0%	0.050	0.600	+0.014
killer	TK14 Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates)	20.0%	0.050	0.600	-0.013
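
The Δ implied column in the table above appears consistent with the law of total probability: the parent's probability mixes the two conditionals, and the current belief (50.3%) is subtracted. The sketch below checks that reading against the three rows. It is an interpretation of the displayed figures, not the LBP implementation, and the function name is hypothetical.

```python
def implied_child_prob(parent_prob: float,
                       p_child_if_true: float,
                       p_child_if_false: float) -> float:
    """P(c) = P(c|s=T) * P(s) + P(c|s=F) * (1 - P(s))."""
    return p_child_if_true * parent_prob + p_child_if_false * (1.0 - parent_prob)


CURRENT_PROB = 0.503  # this node's current probability, as displayed

# Parent probabilities from the table; all three rows share the same conditionals.
parents = {"TK03": 0.10, "TK01": 0.15, "TK14": 0.20}

deltas = {
    name: implied_child_prob(p, 0.050, 0.600) - CURRENT_PROB
    for name, p in parents.items()
}

for name, delta in deltas.items():
    print(f"{name}: delta implied = {delta:+.4f}")
```

The results agree with the parents table to the displayed precision. The same arithmetic applied to the outgoing table lands about 0.005 away from the listed values, so the match should be treated as approximate rather than as a specification of the column.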

Top outgoing (children)

Predictions THIS node influences

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
prereq	232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis	35.5%	0.700	0.050	+0.027
prereq	235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203 — Ray Kurzweil	39.2%	0.750	0.050	+0.015
prereq	231_013 Math is cooked (will be solved), physics cooked, biology cha — Alex Wissner-Gross	35.4%	0.620	0.050	-0.013
prereq	CMQ_002 By 2028, AI systems will reach 'independent researcher' leve — Sam Altman	31.4%	0.550	0.050	-0.009
prereq	241_043 ASI will arrive within 2 years to 5 years to this next decad — Peter Diamandis	35.9%	0.650	0.050	-0.002

Ticker exposure

33 ticker(s) linked

Beneficiaries (23)

SOUN CRWV SITM NVDA ARM GTLB BBAI TSM APLD CEVA AI MSFT MRVL SFTBY ORCL QCOM AVGO BABA AMD GOOGL IBM AMZN META

Adverse (6)

WNS CHGG CTSH IBM INFY ACN

Prerequisites (3)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
killer	TK14	Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates)	—	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (5)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
prereq	235_030	Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.	Biotech/Longevity	—
prereq	232_055	We're exiting the industrial age permanently as recursive self-improvement unfolds.	AI	—
prereq	241_043	ASI will arrive within 2 years to 5 years to this next decade	AI	—
prereq	231_013	Math is cooked (will be solved), physics cooked, biology char broiled.	AI	—
prereq	CMQ_002	By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.	AI	—

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.691	arxiv	Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost	—	mentions	pending	2026-05-07
0.679	arxiv	An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models	—	mentions	pending	2026-05-31
0.663	arxiv	Thinking Economically: A Hierarchical Framework for Adaptive-Complexity Reasoning in LLMs	—	mentions	pending	2026-05-31
0.662	arxiv	Evaluating Reasoning Models for Queries with Presuppositions	—	mentions	pending	2026-05-04
0.658	arxiv	Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery	—	mentions	pending	2026-06-01
0.653	arxiv	A Primer in Post-Training Reasoning Data: What We Know About How It Works	—	mentions	pending	2026-06-01
0.647	arxiv	Policy-Guided Stepwise Model Routing for Cost-Effective Reasoning	—	mentions	pending	2026-05-07
0.644	arxiv	Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation	—	mentions	pending	2026-06-04
0.619	arxiv	Rethinking Stepwise Model Routing: A Cost-Efficient Table Reasoning Perspective	—	mentions	pending	2026-05-28
0.604	arxiv	ReasoningFlow: Discourse Structures for Understanding LLM Reasoning Traces	—	mentions	pending	2026-06-03

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "1000x",
  "url": "https://www.youtube.com/watch?v=uOGHXAfvK8w",
  "mode": "CITED_PREDICTION",
  "role": "Cited-Executive",
  "context": "our first reasoning model was called 01 came out like 16 months ago. Uh and our latest model where we now integrated reasoning is 5.4. To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
  "to_year": 2026,
  "cited_by": "Peter Diamandis",
  "verbatim": "To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
  "conv_cues": "has been",
  "direction": "DOWN",
  "from_year": 2024,
  "timeframe": "Past 16 months / ongoing",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -7,
      "source_id": null,
      "expected_date": "2024-08-24",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -6,
      "source_id": null,
      "expected_date": "2025-04-18",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q3 window check-in (75%)",
      "status": "overdue",
      "weight": 0.05,
      "ordinal": -5,
      "source_id": null,
      "expected_date": "2025-12-11",
      "observed_date": null,
      "miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
      "miss_emitted_by": "metadata_milestone_sweep"
    },
    {
      "kind": "llm_pre_event",
      "label": "GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)",
      "source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 1000x cost collapse",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
      "expected_date": "2026-01-31",
      "observed_date": "2026-01-31",
      "research_origin": "deep_research",
      "measurement_criterion": "Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)"
    },
    {
      "kind": "llm_pre_event",
      "label": "DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent",
      "source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 20-50x cheaper",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -3,
      "source_id": null,
      "confidence": 0.92,
      "source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
      "expected_date": "2026-02-15",
      "observed_date": "2026-02-15",
      "research_origin": "deep_research",
      "measurement_criterion": "Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model"
    },
    {
      "kind": "llm_pre_event",
      "label": "$18B allocated to foundation model APIs in 2025 (paradox confirmation)",
      "source": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ — Inference paradox",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -2,
      "source_id": null,
      "confidence": 0.9,
      "source_url": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/",
      "expected_date": "2026-03-15",
      "observ
... (truncated)