Cost of reasoning models has dropped 1,000x in 16 months
Predictor: Sam Altman · ep#240 "NVIDIA's $1 Trillion Prediction, Anthropic Beats OpenAI, Tesla vs. TSMC & The CS Job Collapse" · source
Prediction text
Cost of reasoning models has dropped 1,000x in 16 months | To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about 1,000x.
Verbatim quote
To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about 1,000x.
Predictor: Sam Altman
Calibration plot (stated vs observed)
Evidence about this node from Sam Altman is multiplied by κ in /api/intake. A lower κ gives the evidence less weight; κ floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
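The κ weighting described above can be sketched as follows. This is a minimal illustration, not the code behind /api/intake: the function names are hypothetical, and multiplying κ by a milestone's confidence is an assumption inferred from the Raw metadata on this page, where the fourth contribution lists kappa 0.495805 (0.5833 × 0.85).

```python
from typing import Optional

KAPPA_FLOOR = 0.10  # stated floor: evidence is effectively silenced
KAPPA_CAP = 1.00    # stated cap: evidence counts at full weight


def clamp_kappa(kappa: float) -> float:
    """Keep kappa inside the stated [0.10, 1.00] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def adjust_llr(llr: float, kappa: float, confidence: Optional[float] = None) -> float:
    """Scale a log-likelihood ratio by the predictor's kappa.

    Assumption: when a milestone carries a confidence value, it further
    scales kappa. The page does not document this step.
    """
    effective = clamp_kappa(kappa)
    if confidence is not None:
        effective *= confidence
    return llr * effective


# Values taken from the "contributions" entries in the Raw metadata.
checkpoint = adjust_llr(-0.4054651081081644, 0.5833)
pre_event = adjust_llr(-0.4054651081081644, 0.5833, confidence=0.85)
print(checkpoint, pre_event)
```

Under these assumptions, the two results land on the adjusted_llr values the metadata lists for the quartile check-ins and for the DeepSeek V4 Pro milestone respectively.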
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
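As a rough sketch of what an unblended update looks like, the posterior can be reproduced by adding the κ-adjusted log-likelihood ratios to the prior logit and mapping the result back to a probability. The function names below are hypothetical; the numbers are copied from the Raw metadata on this page (prior_logit and the four adjusted_llr values), and the additive-logit form is an assumption that happens to match the listed posterior.

```python
import math
from typing import Iterable, Tuple


def sigmoid(x: float) -> float:
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_posterior(prior_logit: float, adjusted_llrs: Iterable[float]) -> Tuple[float, float]:
    """Add kappa-adjusted LLRs to the prior logit; no outside-view blend."""
    posterior_logit = prior_logit + sum(adjusted_llrs)
    return posterior_logit, sigmoid(posterior_logit)


prior_logit = 0.15932169579091693
adjusted_llrs = [
    -0.2365077975594923,   # Q1 window check-in
    -0.2365077975594923,   # Q2 window check-in
    -0.2365077975594923,   # Q3 window check-in
    -0.20103162792556845,  # DeepSeek V4 Pro milestone
]

posterior_logit, posterior_prob = update_posterior(prior_logit, adjusted_llrs)
print(f"prior {sigmoid(prior_logit):.4f} -> posterior {posterior_prob:.4f}")
```

Because no reference class is linked, the blended posterior in the metadata equals the inside posterior, which is what this sketch computes.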
Probability over time
Milestone chain
- 2024-08-24 · overdue · Q1 window check-in (25%)
- 2025-04-18 · overdue · Q2 window check-in (50%)
- 2025-12-11 · overdue · Q3 window check-in (75%)
- 2026-01-31 · hit · GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)
  How: Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)
  Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ (1000x cost collapse)
  conf 95%
- 2026-02-15 · hit · DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent
  How: Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model
  Source: https://www.gpunex.com/blog/ai-inference-economics-2026/ (20-50x cheaper)
  conf 92%
- 2026-03-15 · hit · $18B allocated to foundation model APIs in 2025 (paradox confirmation)
  How: 2025 industry totals confirm >=$18B spent on foundation model APIs (vs $4B training infra), confirming the cost-down/usage-up paradox
  Source: https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ (Inference paradox)
  conf 90%
- 2026-03-31 · overdue · DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro
  How: DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance
  Source: https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro (DeepSeek V4)
  conf 85%
- 2026-06-01 → 2026-12-31 · pending · Epoch AI publishes inference price-trend data showing further drops 2026
  How: Epoch AI publishes 2026 inference price-trend update showing reasoning-model cost-per-token down >=50% YoY in 2026
  Source: https://epoch.ai/data-insights/llm-inference-price-trends (Epoch AI trends)
  conf 80%
- 2026-09-01 → 2027-03-31 · pending · Cascade: Enterprise inference spend exceeds $50B 2026 despite per-token drops
  How: 2026 full-year foundation-model-API spend >=$50B globally despite continuing per-token price decline
  Source: Cascade from $18B 2025 base + reasoning-model token explosion
  conf 65%
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
Evidence chain
Raw metadata
{
"trf": 0.17512400843286588,
"kappa": 0.5833,
"base_rate": null,
"predictor": "Sam Altman",
"total_llr": -1.6218604324326575,
"grace_days": 7,
"bayesian_v2": true,
"prior_logit": 0.15932169579091693,
"bayes_factor": "2.5:1 against",
"blend_reason": "no reference_class linked",
"inside_prior": 0.5397463846206305,
"kappa_source": "predictor_table",
"n_milestones": 4,
"blend_applied": false,
"contributions": [
{
"llr": -0.4054651081081644,
"kind": "quartile_checkpoint",
"kappa": 0.5833,
"label": "Q1 window check-in (25%)",
"weight": 0.05,
"strength": "weak",
"confidence": null,
"source_url": null,
"adjusted_llr": -0.2365077975594923,
"expected_date": "2024-08-24",
"measurement_criterion": null
},
{
"llr": -0.4054651081081644,
"kind": "quartile_checkpoint",
"kappa": 0.5833,
"label": "Q2 window check-in (50%)",
"weight": 0.05,
"strength": "weak",
"confidence": null,
"source_url": null,
"adjusted_llr": -0.2365077975594923,
"expected_date": "2025-04-18",
"measurement_criterion": null
},
{
"llr": -0.4054651081081644,
"kind": "quartile_checkpoint",
"kappa": 0.5833,
"label": "Q3 window check-in (75%)",
"weight": 0.05,
"strength": "weak",
"confidence": null,
"source_url": null,
"adjusted_llr": -0.2365077975594923,
"expected_date": "2025-12-11",
"measurement_criterion": null
},
{
"llr": -0.4054651081081644,
"kind": "llm_pre_event",
"kappa": 0.495805,
"label": "DeepSeek V4 Pro launches at 98% less than GPT-5.5 Pro",
"weight": 0.4,
"strength": "weak",
"confidence": 0.85,
"source_url": "https://decrypt.co/365455/deepseek-v4-launch-pro-version-costs-less-gpt-5-pro",
"adjusted_llr": -0.20103162792556845,
"expected_date": "2026-03-31",
"measurement_criterion": "DeepSeek launches V4 Pro at <=2% the cost of GPT-5.5 Pro for equivalent reasoning benchmark performance"
}
],
"evidence_kind": "metadata_milestone_miss_sweep",
"inside_source": "history_v2",
"inside_weight": 0.8774131940969938,
"outside_weight": 0.12258680590300619,
"posterior_prob": 0.3205526249296874,
"posterior_logit": -0.7512333248131284,
"predictor_brier": 0.0625,
"inside_posterior": 0.3205526249296874,
"blended_posterior": 0.3205526249296874,
"reference_class_id": null,
"total_adjusted_llr": -0.9105550206040454,
"predictor_n_resolved": 1
}
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
Top outgoing (children)
Predictions THIS node influences
| Kind | Node | Their prob | P(c|s=T) | P(c|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | 232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis | 35.5% | 0.700 | 0.050 | +0.027 |
| prereq | 235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203 — Ray Kurzweil | 39.2% | 0.750 | 0.050 | +0.015 |
| prereq | 231_013 Math is cooked (will be solved), physics cooked, biology cha — Alex Wissner-Gross | 35.4% | 0.620 | 0.050 | -0.013 |
| prereq | CMQ_002 By 2028, AI systems will reach 'independent researcher' leve — Sam Altman | 31.4% | 0.550 | 0.050 | -0.009 |
| prereq | 241_043 ASI will arrive within 2 years to 5 years to this next decad — Peter Diamandis | 35.9% | 0.650 | 0.050 | -0.002 |
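One plausible reading of the table above is that each edge implies a child probability through the law of total probability over this node's state, and that "Δ implied" is the gap between that implied value and the child's current probability. The page does not define the column, so both the formula and the function names below are assumptions. Note also that plugging in this node's listed posterior (about 0.32) does not reproduce the Δ values shown; a source probability near 0.51 comes close, which may mean the column was computed from an earlier state of the node.

```python
def implied_child_prob(p_source: float, p_c_given_true: float, p_c_given_false: float) -> float:
    """Law of total probability: P(c) = P(s) P(c|s=T) + (1 - P(s)) P(c|s=F)."""
    return p_source * p_c_given_true + (1.0 - p_source) * p_c_given_false


def implied_delta(p_source: float, p_c_given_true: float,
                  p_c_given_false: float, child_prob: float) -> float:
    """Assumed meaning of 'Δ implied': implied child probability minus current."""
    return implied_child_prob(p_source, p_c_given_true, p_c_given_false) - child_prob


# First table row (232_055): P(c|s=T)=0.700, P(c|s=F)=0.050, their prob 35.5%.
# p_source=0.51 is a hypothetical value chosen for illustration only.
delta = implied_delta(0.51, 0.700, 0.050, 0.355)
print(f"{delta:+.4f}")
```

The same two functions can be applied row by row with whatever source probability the propagation step actually uses.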
Ticker exposure
Beneficiaries (23)
Adverse (6)
Prerequisites (3)
Dependents (5)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | 235_030 | Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033. | Biotech/Longevity | — |
| prereq | 232_055 | We're exiting the industrial age permanently as recursive self-improvement unfolds. | AI | — |
| prereq | 241_043 | ASI will arrive within 2 years to 5 years to this next decade | AI | — |
| prereq | 231_013 | Math is cooked (will be solved), physics cooked, biology char broiled. | AI | — |
| prereq | CMQ_002 | By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention. | AI | — |
Linked documents (10)
Raw metadata
{
"nia": false,
"qty": "1000x",
"url": "https://www.youtube.com/watch?v=uOGHXAfvK8w",
"mode": "CITED_PREDICTION",
"role": "Cited-Executive",
"context": "our first reasoning model was called 01 came out like 16 months ago. Uh and our latest model where we now integrated reasoning is 5.4. To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
"to_year": 2026,
"cited_by": "Peter Diamandis",
"verbatim": "To get the same answer to a hard problem from that first model to 5.4 for has been a reduction in cost of about a,000x.",
"conv_cues": "has been",
"direction": "DOWN",
"from_year": 2024,
"timeframe": "Past 16 months / ongoing",
"conv_level": "HIGH",
"milestones": [
{
"kind": "quartile_checkpoint",
"label": "Q1 window check-in (25%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -7,
"source_id": null,
"expected_date": "2024-08-24",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "quartile_checkpoint",
"label": "Q2 window check-in (50%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -6,
"source_id": null,
"expected_date": "2025-04-18",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "quartile_checkpoint",
"label": "Q3 window check-in (75%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -5,
"source_id": null,
"expected_date": "2025-12-11",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "llm_pre_event",
"label": "GPT-4-class inference cost drops to $0.40/M tokens (1000x reduction)",
"source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 1000x cost collapse",
"status": "hit",
"weight": 0.4,
"ordinal": -4,
"source_id": null,
"confidence": 0.95,
"source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
"expected_date": "2026-01-31",
"observed_date": "2026-01-31",
"research_origin": "deep_research",
"measurement_criterion": "Industry pricing data confirms GPT-4-class equivalent performance available at <=$0.40/M tokens, vs $20/M in late 2022 (>=1000x drop)"
},
{
"kind": "llm_pre_event",
"label": "DeepSeek R1 runs 20-50x cheaper than OpenAI equivalent",
"source": "https://www.gpunex.com/blog/ai-inference-economics-2026/ — 20-50x cheaper",
"status": "hit",
"weight": 0.4,
"ordinal": -3,
"source_id": null,
"confidence": 0.92,
"source_url": "https://www.gpunex.com/blog/ai-inference-economics-2026/",
"expected_date": "2026-02-15",
"observed_date": "2026-02-15",
"research_origin": "deep_research",
"measurement_criterion": "Sam Altman or OpenAI executive publicly acknowledges DeepSeek R1 runs at 20-50x cheaper inference cost than OpenAI equivalent reasoning model"
},
{
"kind": "llm_pre_event",
"label": "$18B allocated to foundation model APIs in 2025 (paradox confirmation)",
"source": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/ — Inference paradox",
"status": "hit",
"weight": 0.4,
"ordinal": -2,
"source_id": null,
"confidence": 0.9,
"source_url": "https://www.arturmarkus.com/the-inference-cost-paradox-why-generative-ai-spending-surged-320-in-2025-despite-per-token-costs-dropping-1-000x-and-what-it-means-for-your-ai-budget-in-2026/",
"expected_date": "2026-03-15",
"observ
... (truncated)