238_024 · prediction · AI · AI-timing

AI token speed will jump from ~50 tokens/sec to ~1,000 tokens/sec (Cerebras)

Predictor: Emad Mostaque · ep#238 "Meta Buys Moltbook, GPT 5.4, and Fruitfly Brain Upload | Moonshots Live at The Abundance Summit 238" · source

Prior probability
50.0%
Current probability
37.2%
evolves via intake + LBP
Conviction
4/5
Signal quality
C
Resolution
pending
Window
2026-04-30 – 2027-09-30
Edges in / out
4 / 0
Tickers exposed
33

Prediction text

AI token speed will jump from ~50 tokens/sec to ~1,000 tokens/sec (Cerebras) | it's like 50 tokens a second or something like when we use GPT 5.4 Pro extended... You're going from 50 tokens a second of this level of knowledge to 1,000. So in codeex now if you use 5.3 fast it's a thousand tokens a second

Verbatim quote

From episode "Meta Buys Moltbook, GPT 5.4, and Fruitfly Brain Upload | Moonshots Live at The Abundance Summit 238"
it's like 50 tokens a second or something like when we use GPT 5.4 Pro extended... You're going from 50 tokens a second of this level of knowledge to 1,000. So in codeex now if you use 5.3 fast it's a thousand tokens a second

Predictor: Emad Mostaque

κ + Brier as of 2026-05-22
κ (discount)
0.722
Brier
0.0073
excellent
Hits / Misses
3 / 0
of 4 resolved
Hit rate
75.0%
Calibration plot (stated vs observed)
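
For reference, a Brier score is the mean squared difference between the stated probability and the 0/1 outcome, so lower is better and 0 is perfect. The sketch below uses hypothetical forecasts; the predictor's actual stated probabilities are not shown on this page, so it does not attempt to reproduce the 0.0073 figure.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical example: two confident forecasts that both resolved TRUE.
example = brier_score([0.9, 0.8], [1, 1])  # (0.01 + 0.04) / 2
```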

Evidence about this node from Emad Mostaque is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
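
A minimal sketch of the κ discount described above. The floor (0.10) and cap (1.00) come from this page; expressing evidence as a log-likelihood ratio added to the prior log-odds is an assumption, since the internals of /api/intake are not shown. The function names are hypothetical.

```python
import math


def effective_kappa(kappa, floor=0.10, cap=1.00):
    """Clamp kappa to the [floor, cap] range stated on the page."""
    return min(max(kappa, floor), cap)


def discounted_update(prior, log_lr, kappa):
    """Shift the prior log-odds by kappa times the evidence.

    Assumption: evidence arrives as a log-likelihood ratio (log_lr).
    """
    logit = math.log(prior / (1.0 - prior)) + effective_kappa(kappa) * log_lr
    return 1.0 / (1.0 + math.exp(-logit))


# With kappa = 0.722, the same evidence moves the estimate less than at kappa = 1.0.
full = discounted_update(0.5, 1.0, 1.0)
discounted = discounted_update(0.5, 1.0, 0.722)
```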

Reference class

Not linked

This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.

Probability over time

5 prob_history rows
[Chart: probability over time. Y axis 0% to 100%; X axis 2026-04-30 to 2026-05-17; prior 50%, current = 37.2%. Legend: intake v2, milestone miss sweep, lbp propagation, reference class assigned, legacy v1, prior_prob (analyst seed).]

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status/dates update from linked nodes; re-derive nightly via scripts/ops/derive_milestones.py.
Leading chain: 3 fired ✓ · 5 pending
  1. 2025-10-01 · hit · OpenAI-Cerebras 750 MW partnership announced
    How: OpenAI press release confirms multi-year deployment of 750 MW of Cerebras wafer-scale systems
    Source: https://openai.com/index/cerebras-partnership/ · conf 97%
  2. 2026-02-01 · hit · Cerebras CS-3 delivers 1,800+ tokens/sec on Llama 3.3 70B
    How: Independent benchmarks confirm Cerebras CS-3 at 1,800+ tokens/sec on Llama 3.3 70B (~10-20x GPU baseline of 50-200 tok/s)
    Source: https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream · conf 95%
    Notes: HIT: exceeds the 1,000 tok/s prediction by ~80%.
  3. 2026-03-01 · hit · GPT-OSS-120B running at 3,000 tokens/sec on Cerebras
    How: OpenAI/Cerebras confirms gpt-oss-120B running at 3,000 tok/sec
    Source: https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream · conf 95%
    Notes: HIT: 3x the 1,000 tok/s prediction. 60x baseline GPT-5 throughput.
  4. 2026-03-01 → 2026-09-30 · pending · AWS-Cerebras inference cloud collaboration GA
    How: AWS/Cerebras collaboration announced March 2026 reaches GA inference availability for enterprise customers
    Source: https://press.aboutamazon.com/aws/2026/3/aws-and-cerebras-collaboration-aims-to-set-a-new-standard-for-ai-inference-speed-and-performance-in-the-cloud · conf 80%
  5. 2026-07-15 · pending · Q1 window check-in (25%)
  6. 2026-06-01 → 2026-12-31 · pending · ChatGPT user-facing 1,000+ tok/s mode rollout
    How: OpenAI rolls out user-facing inference mode (Codex, ChatGPT high-speed) at 1,000+ tok/s on Cerebras infrastructure
    Source: https://www.startuphub.ai/ai-news/ai-video/2026/openais-10-billion-cerebras-deal-signals-the-true-ai-battleground-is-inference-speed/ · conf 85%
    Notes: Mostaque's specific 'Codex 5.3 fast at 1,000 tok/s' claim, central to the prediction.
  7. 2026-09-29 · pending · Q2 window check-in (50%)
  8. 2026-12-14 · pending · Q3 window check-in (75%)
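
The three window check-ins in the chain are spaced exactly 76 days apart, starting from 2026-04-30. That matches the quartiles of a 304-day span, not the full displayed window ending 2027-09-30; the page does not say why. The sketch below reproduces the dates under that inferred span. The function name and the 304-day figure are assumptions, not taken from scripts/ops/derive_milestones.py.

```python
from datetime import date, timedelta


def window_checkpoints(start, span_days, fractions=(0.25, 0.50, 0.75)):
    """Return check-in dates at the given fractions of a span starting at `start`."""
    return [start + timedelta(days=round(span_days * f)) for f in fractions]


# Inferred span: 304 days from the window start (assumption, see note above).
checkins = window_checkpoints(date(2026, 4, 30), 304)
for label, d in zip(("Q1 (25%)", "Q2 (50%)", "Q3 (75%)"), checkins):
    print(label, d.isoformat())
```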

No downstream cascades — this prediction is a leaf in the dependency graph.

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample. Surfaces the predictions whose marginals shift most under that assumption.
(live posterior: 37%)

Each run takes ~30s; ideal for stress-testing "if X resolves, what else moves?"
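
A minimal sketch of clamping plus Gibbs sampling on a toy three-node network, to illustrate the idea only; it is not the cockpit's sampler. Node s is the prereq parent (30.0%, with P(c|s=T) = 0.500 and P(c|s=F) = 0.050 as listed for this node's prereq edge), c is this prediction, and d is a hypothetical sibling child of s with made-up conditionals.

```python
import random


def gibbs_clamped_marginal(clamp_c=True, n_samples=20000, burn_in=1000, seed=0):
    """Estimate P(d | c = clamp_c) by Gibbs sampling over the free nodes s and d."""
    rng = random.Random(seed)
    p_s = 0.30                               # prior on the prereq parent
    p_c_given_s = {True: 0.50, False: 0.05}  # from the prereq edge
    p_d_given_s = {True: 0.80, False: 0.20}  # hypothetical sibling node
    c = clamp_c                              # clamped, never resampled
    s, d = False, False
    d_hits = 0
    for i in range(burn_in + n_samples):
        # Resample s from its full conditional given c and d.
        weight = {}
        for v in (True, False):
            prior = p_s if v else 1.0 - p_s
            like_c = p_c_given_s[v] if c else 1.0 - p_c_given_s[v]
            like_d = p_d_given_s[v] if d else 1.0 - p_d_given_s[v]
            weight[v] = prior * like_c * like_d
        s = rng.random() < weight[True] / (weight[True] + weight[False])
        # Resample d given s (d is independent of c once s is known).
        d = rng.random() < p_d_given_s[s]
        if i >= burn_in:
            d_hits += d
    return d_hits / n_samples


# Unclamped baseline for d is 0.3*0.8 + 0.7*0.2 = 0.38.
shifted = gibbs_clamped_marginal(clamp_c=True)
```

Clamping c to TRUE raises the belief in s, which in turn pulls d's marginal up from its 0.38 baseline; the exact value in this toy network is about 0.686.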

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first
LBP · 2026-05-17T02:00:01Z · 37.2% · -1.1pp
  Network propagation: 38.3% → 37.2%
  5-iter LBP, residual 0.00689 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e607fa96
LBP · 2026-05-10T02:00:02Z · 38.3% · -2.2pp
  Network propagation: 40.5% → 38.3%
  6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29
LBP · 2026-05-03T02:00:01Z · 40.5% · -4.5pp
  Network propagation: 45.0% → 40.5%
  6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9
LBP · 2026-04-30T16:39:51Z · 45.0% · -1.7pp
  Network propagation: 46.7% → 45.0%
  5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3
LBP · 2026-04-30T02:18:57Z · 46.7% · -3.3pp
  Network propagation: 50.0% → 46.7%
  5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef
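
The five LBP rows chain together: each run starts from the value the previous run ended on, and the per-run deltas sum to the total move from the 50.0% prior to the current 37.2%. A small sketch that checks this from the values listed above (the tuple layout is just for illustration):

```python
# (timestamp, before %, after %), oldest first, copied from the evidence chain.
history = [
    ("2026-04-30T02:18:57Z", 50.0, 46.7),
    ("2026-04-30T16:39:51Z", 46.7, 45.0),
    ("2026-05-03T02:00:01Z", 45.0, 40.5),
    ("2026-05-10T02:00:02Z", 40.5, 38.3),
    ("2026-05-17T02:00:01Z", 38.3, 37.2),
]


def deltas_pp(rows):
    """Per-run change in percentage points, rounded to one decimal."""
    return [round(after - before, 1) for _, before, after in rows]


def is_contiguous(rows):
    """True when every run starts where the previous one ended."""
    return all(rows[i][2] == rows[i + 1][1] for i in range(len(rows) - 1))


total_pp = round(history[-1][2] - history[0][1], 1)  # net move since the prior
```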

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind · Node · Title · Their prob · P(c|s=T) · P(c|s=F) · Δ implied
prereq · S_AGI_FAST_2027 · AGI fast: drop-in remote worker by 2027-09 · 30.0% · 0.500 · 0.050 · -0.187
killer · TK03 · AI Regulatory Moratorium (EU/US Capability Freeze) · 10.0% · 0.050 · 0.500 · +0.083
killer · TK01 · AGI Capability Plateau (2026-27 Training Stall) · 15.0% · 0.050 · 0.500 · +0.061
killer · TK14 · Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates) · 20.0% · 0.050 · 0.500 · +0.038
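
The Δ implied column is consistent with a simple reading: marginalize this node's probability over each parent, then subtract the current 37.2%. The page does not document the formula, so treat the sketch below as an inference from the displayed numbers; it reproduces all four rows to within rounding.

```python
CURRENT = 0.372  # this node's current probability


def implied_marginal(p_s, p_c_if_true, p_c_if_false):
    """P(c) implied by one parent s: law of total probability."""
    return p_s * p_c_if_true + (1.0 - p_s) * p_c_if_false


def delta_implied(p_s, p_c_if_true, p_c_if_false, current=CURRENT):
    """Gap between the parent-implied marginal and the current belief."""
    return implied_marginal(p_s, p_c_if_true, p_c_if_false) - current


# (node, their prob, P(c|s=T), P(c|s=F)) from the incoming-edge rows.
edges = [
    ("S_AGI_FAST_2027", 0.30, 0.500, 0.050),
    ("TK03", 0.10, 0.050, 0.500),
    ("TK01", 0.15, 0.050, 0.500),
    ("TK14", 0.20, 0.050, 0.500),
]
for node, p_s, p_t, p_f in edges:
    print(f"{node}: {delta_implied(p_s, p_t, p_f):+.3f}")
```

The prereq pulls the belief down because the parent sits at only 30% and the node is unlikely (0.050) without it. Each killer implies a value above 37.2% because the killers are themselves unlikely to fire.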

Top outgoing (children)

Predictions THIS node influences

No outgoing edges.

Ticker exposure

33 ticker(s) linked

Beneficiaries (23)

SOUN · CRWV · SITM · NVDA · ARM · GTLB · BBAI · TSM · APLD · CEVA · AI · MSFT · MRVL · SFTBY · ORCL · QCOM · AVGO · BABA · AMD · GOOGL · IBM · AMZN · META

Adverse (6)

WNS · CHGG · CTSH · IBM · INFY · ACN

Prerequisites (4)

Predictions that must hit first
Type · Pred · Title · Domain · Lag
prereq · S_AGI_FAST_2027 · AGI fast: drop-in remote worker by 2027-09 · agi_general_capability · -
killer · TK14 · Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates) · - · -
killer · TK01 · AGI Capability Plateau (2026-27 Training Stall) · - · -
killer · TK03 · AI Regulatory Moratorium (EU/US Capability Freeze) · - · -

Dependents (0)

Predictions enabled by this
Type · Pred · Title · Domain · Lag
No dependents

Linked documents (10)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook
{
  "nia": false,
  "qty": "50 -> 1,000 tokens/sec (20x)",
  "url": "https://www.youtube.com/watch?v=d__HRChE2ZE",
  "mode": "PREDICTION",
  "role": "Host",
  "context": "OpenAI also just did a deal with Cerebris. So when you're using it right now, it looks like when you're dealing with again a human on the other side, it's like 50 tokens a second or something... You're going from 50 tokens a second of this level of knowledge to 1,000.",
  "verbatim": "it's like 50 tokens a second or something like when we use GPT 5.4 Pro extended... You're going from 50 tokens a second of this level of knowledge to 1,000. So in codeex now if you use 5.3 fast it's a thousand tokens a second",
  "conv_cues": "you're going from",
  "direction": "UP",
  "timeframe": "Imminent",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "OpenAI-Cerebras 750 MW partnership announced",
      "source": "https://openai.com/index/cerebras-partnership/",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.97,
      "source_url": "https://openai.com/index/cerebras-partnership/",
      "expected_date": "2025-10-01",
      "observed_date": "2025-10-01",
      "research_origin": "deep_research",
      "measurement_criterion": "OpenAI press release confirms multi-year deployment of 750 MW of Cerebras wafer-scale systems"
    },
    {
      "kind": "llm_pre_event",
      "label": "Cerebras CS-3 delivers 1,800+ tokens/sec on Llama 3.3 70B",
      "notes": "HIT — exceeds the 1,000 tok/s prediction by ~80%.",
      "source": "https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream",
      "expected_date": "2026-02-01",
      "observed_date": "2026-02-01",
      "research_origin": "deep_research",
      "measurement_criterion": "Independent benchmarks confirm Cerebras CS-3 at 1,800+ tokens/sec on Llama 3.3 70B (~10-20x GPU baseline of 50-200 tok/s)"
    },
    {
      "kind": "llm_pre_event",
      "label": "GPT-OSS-120B running at 3,000 tokens/sec on Cerebras",
      "notes": "HIT — 3x the 1,000 tok/s prediction. 60x baseline GPT-5 throughput.",
      "source": "https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -6,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.cerebras.ai/blog/openai-partners-with-cerebras-to-bring-high-speed-inference-to-the-mainstream",
      "expected_date": "2026-03-01",
      "observed_date": "2026-03-01",
      "research_origin": "deep_research",
      "measurement_criterion": "OpenAI/Cerebras confirms gpt-oss-120B running at 3,000 tok/sec"
    },
    {
      "kind": "llm_post_event",
      "label": "AWS-Cerebras inference cloud collaboration GA",
      "source": "https://press.aboutamazon.com/aws/2026/3/aws-and-cerebras-collaboration-aims-to-set-a-new-standard-for-ai-inference-speed-and-performance-in-the-cloud",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -5,
      "source_id": null,
      "confidence": 0.8,
      "source_url": "https://press.aboutamazon.com/aws/2026/3/aws-and-cerebras-collaboration-aims-to-set-a-new-standard-for-ai-inference-speed-and-performance-in-the-cloud",
      "expected_date": "2026-06-15",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2026-09-30",
        "from": "2026-03-01"
      },
      "measurement_criterion": "AWS/Cerebras collaboration announced March 2026 reaches GA inference availability for enterprise customers"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)
... (truncated)