'Software 3.0' LLM infrastructure will operate like public utilities — requiring massive upfront capex (training compute, specialized hardware), specialized networking protocols for synchrony across hundreds of thousands of GPUs, and flawless uninterru...
Predictor: Andrej Karpathy
Prediction text
'Software 3.0' LLM infrastructure will operate like public utilities, requiring massive upfront capex (training compute, specialized hardware), specialized networking protocols for synchrony across hundreds of thousands of GPUs, and flawless uninterrupted uptime; foundational software infrastructure will pivot from text-versioned Git to binary-weight / real-time inference platforms.
Key catalyst: Next-gen kernel/hardware co-design releases
Watch events: Claude / ChatGPT uptime SLAs; weights-format standardization; kernel-engineering open-source growth
Resolution evidence
ChatGPT/Claude/Gemini 99.9% uptime demands; proliferation of Modal / Replicate / Anyscale / Anthropic on Amazon Bedrock. Weights-as-binary tooling (LoRA, SafeTensors, ggml) mainstream.
Calibration plot (stated vs observed)
Evidence about this node from Andrej Karpathy is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
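The weighting rule above can be sketched in a few lines. This is a minimal illustration, not the actual /api/intake implementation: it assumes evidence arrives as a log-likelihood ratio that is added to the node's prior in log-odds space, and the names `clamp_kappa` and `update_probability` are hypothetical.

```python
import math

KAPPA_FLOOR = 0.10  # floor stated on this page (effectively silenced)
KAPPA_CAP = 1.00    # cap stated on this page (full weight)


def clamp_kappa(kappa: float) -> float:
    """Keep the predictor weight inside [KAPPA_FLOOR, KAPPA_CAP]."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def update_probability(prior: float, raw_llr: float, kappa: float) -> float:
    """Assumed update: scale the evidence LLR by the clamped kappa, then add it in log-odds space."""
    weighted_llr = clamp_kappa(kappa) * raw_llr
    return sigmoid(logit(prior) + weighted_llr)


if __name__ == "__main__":
    # Hypothetical prior and evidence; 0.6875 is the predictor_kappa shown in the raw metadata.
    print(update_probability(prior=0.50, raw_llr=1.0, kappa=0.6875))
```

Under these assumptions, a predictor with a lower κ moves the node's probability less for the same piece of evidence, and a κ below the floor behaves exactly like κ = 0.10.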
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
Probability over time
Milestone chain
- 2026-06-20 (pending): Q1 window check-in (25%)
- 2026-07-01 → 2026-12-31 (pending): NVIDIA + Google Cloud announce Vera Rubin NVL72 rack-scale availability for H2 2026 cloud deployment. How: Google Cloud publicly opens Vera Rubin NVL72 rack-scale compute to customers in 2H 2026. Source: https://cloud.google.com/blog/products/compute/google-cloud-ai-infrastructure-at-nvidia-gtc-2026 (conf 85%)
- 2026-12-07 (pending): Q2 window check-in (50%)
- 2026-06-01 → 2027-06-30 (pending): AMD MI400 series (CDNA Next) ships at 40 PFLOPS FP4 / 432 GB HBM4 / 19.6 TB/s. How: AMD publicly ships MI400 with stated peak specs (40 PFLOPS FP4, 432 GB HBM4, 19.6 TB/s). Source: https://www.tomshardware.com/tech-industry/semiconductors/nvidia-enterprise-roadmap-rubin-rubin-ultra-feynman-and-silicon-photonics (conf 65%)
- 2026-06-01 → 2027-12-31 (pending): FullFlat optical NVLink network topology demonstrated for inter-node uniform bandwidth. How: NVIDIA / hyperscaler publishes data-center deployment using FullFlat or optical NVLink topology for LLM training/inference. Source: https://arxiv.org/html/2506.15006v2 (Scaling Intelligence, FullFlat optical topology) (conf 60%)
- 2026-06-01 → 2027-12-31 (pending): Hardware-software co-design tooling (Nemotron-Flash / Liquid AI) hits production for inference. How: NVIDIA Nemotron-Flash or Liquid AI co-design framework adopted by ≥1 large model provider in production. Source: https://developer.nvidia.com/blog/how-nvidia-extreme-hardware-software-co-design-delivered-a-large-inference-boost-for-sarvam-ais-sovereign-models/ (conf 65%)
- 2027-05-26 (pending): Q3 window check-in (75%)
- 2026-09-01 → 2028-10-31 (pending): Hyperscaler announces dedicated 'AI utility' offering with SLA-backed inference uptime ≥99.99%. How: AWS / Azure / GCP launches inference platform with public utility-grade SLA on uptime + token throughput. Source: https://intuitionlabs.ai/articles/llm-inference-hardware-enterprise-guide (conf 50%)
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
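As a rough illustration of what clamping a prediction and running a Gibbs sample involves, the sketch below uses a toy binary pairwise model. Everything in it is an assumption made for illustration: the node names, the `bias` and `coupling` parameters, and the functions `gibbs_marginals` and `biggest_shifts` are hypothetical and do not describe this site's actual network or sampler.

```python
import math
import random


def gibbs_marginals(bias, coupling, clamp=None, sweeps=4000, burn_in=500, seed=0):
    """Estimate P(node = 1) for every node of a small binary pairwise model.

    bias[n] is node n's log-odds in isolation; coupling[(i, j)] is added to a
    node's log-odds whenever its neighbour is 1. `clamp` maps node -> fixed value.
    """
    rng = random.Random(seed)
    clamp = clamp or {}
    nodes = list(bias)
    neighbours = {n: [] for n in nodes}
    for (i, j), weight in coupling.items():
        neighbours[i].append((j, weight))
        neighbours[j].append((i, weight))

    # Clamped nodes start at, and stay at, their fixed value.
    state = {n: clamp.get(n, rng.randint(0, 1)) for n in nodes}
    counts = {n: 0 for n in nodes}
    for sweep in range(sweeps):
        for n in nodes:
            if n in clamp:
                continue
            # Resample node n from its conditional given its neighbours.
            log_odds = bias[n] + sum(w * state[m] for m, w in neighbours[n])
            p_one = 1.0 / (1.0 + math.exp(-log_odds))
            state[n] = 1 if rng.random() < p_one else 0
        if sweep >= burn_in:
            for n in nodes:
                counts[n] += state[n]
    kept = sweeps - burn_in
    return {n: counts[n] / kept for n in nodes}


def biggest_shifts(bias, coupling, node, value, top=3):
    """Clamp `node` to `value` and rank the other nodes by how far their marginal moves."""
    base = gibbs_marginals(bias, coupling)
    clamped = gibbs_marginals(bias, coupling, clamp={node: value})
    shifts = {n: clamped[n] - base[n] for n in bias if n != node}
    return sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical three-node network: "a" and "b" are coupled, "c" is isolated.
    bias = {"a": 0.0, "b": -2.0, "c": 0.0}
    coupling = {("a", "b"): 3.0}
    print(biggest_shifts(bias, coupling, node="a", value=0))
```

In this toy network, clamping "a" to 0 removes the support that "b" gets from it, so "b" shows the largest shift while the unconnected "c" moves only by sampling noise.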
Evidence chain
Raw metadata
{
"actor": "system:auto_corroborate",
"method": "lbp",
"cred_avg": 0.79,
"polarity": "corroborates",
"doc_count": 5,
"applied_llr": 0.3765,
"evidence_kind": "auto_consensus",
"evidence_origin": "auto_corroborate",
"predictor_kappa": 0.6875,
"n_distinct_sources": 4
}
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Ticker exposure
Beneficiaries (9)
Prerequisites (3)
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Expected milestones (2)
| Expected by | Description | Status |
|---|---|---|
| 2026-12-31 | Vera Rubin partner availability target for H2 2026 | pending |
| 2027-03-31 | Thinking Machines/NVIDIA 1 GW Vera Rubin deployment target early 2027 | pending |
Validations (1)
| Observed at | Status | By | Notes |
|---|---|---|---|
| 2026-04-29 | partial | thesis_timeline_v1.0_import | ChatGPT/Claude/Gemini 99.9% uptime demands; proliferation of Modal / Replicate / Anyscale / Anthropic on Amazon Bedrock. Weights-as-binary tooling (LoRA, SafeTensors, ggml) mainstream. |
Linked documents (10)
Raw metadata
{
"nia": false,
"mode": "FORECAST",
"role": "Cited-Other",
"context": "Maximum hardware utilization will depend on engineers 'tuning the hardware itself' — Tensor Core manipulation, custom kernel engineering to extract fractional-percent performance gains.",
"to_year": 2028,
"conv_cues": "explicit paradigm framing; utility analogy",
"direction": "HAPPEN",
"from_year": 2026,
"timeframe": "2026-2028",
"conv_level": "HIGH",
"milestones": [
{
"kind": "quartile_checkpoint",
"label": "Q1 window check-in (25%)",
"status": "pending",
"weight": 0.05,
"ordinal": -8,
"source_id": null,
"expected_date": "2026-06-20",
"observed_date": null
},
{
"kind": "llm_pre_event",
"label": "NVIDIA + Google Cloud announce Vera Rubin NVL72 rack-scale availability for H2 2026 cloud deployment",
"source": "https://cloud.google.com/blog/products/compute/google-cloud-ai-infrastructure-at-nvidia-gtc-2026",
"status": "pending",
"weight": 0.4,
"ordinal": -7,
"source_id": null,
"confidence": 0.85,
"expected_date": "2026-09-30",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2026-12-31",
"from": "2026-07-01"
},
"measurement_criterion": "Google Cloud publicly opens Vera Rubin NVL72 rack-scale compute to customers in 2H 2026"
},
{
"kind": "quartile_checkpoint",
"label": "Q2 window check-in (50%)",
"status": "pending",
"weight": 0.05,
"ordinal": -6,
"source_id": null,
"expected_date": "2026-12-07",
"observed_date": null
},
{
"kind": "llm_pre_event",
"label": "AMD MI400 series (CDNA Next) ships at 40 PFLOPS FP4 / 432 GB HBM4 / 19.6 TB/s",
"source": "https://www.tomshardware.com/tech-industry/semiconductors/nvidia-enterprise-roadmap-rubin-rubin-ultra-feynman-and-silicon-photonics",
"status": "pending",
"weight": 0.4,
"ordinal": -5,
"source_id": null,
"confidence": 0.65,
"expected_date": "2026-12-15",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2027-06-30",
"from": "2026-06-01"
},
"measurement_criterion": "AMD publicly ships MI400 with stated peak specs (40 PFLOPS FP4, 432GB HBM4, 19.6 TB/s)"
},
{
"kind": "llm_pre_event",
"label": "FullFlat optical NVLink network topology demonstrated for inter-node uniform bandwidth",
"source": "https://arxiv.org/html/2506.15006v2 — Scaling Intelligence, FullFlat optical topology",
"status": "pending",
"weight": 0.4,
"ordinal": -4,
"source_id": null,
"confidence": 0.6,
"expected_date": "2027-03-17",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2027-12-31",
"from": "2026-06-01"
},
"measurement_criterion": "NVIDIA / hyperscaler publishes data-center deployment using FullFlat or optical NVLink topology for LLM training/inference"
},
{
"kind": "llm_pre_event",
"label": "Hardware-software co-design tooling (Nemotron-Flash / Liquid AI) hits production for inference",
"source": "https://developer.nvidia.com/blog/how-nvidia-extreme-hardware-software-co-design-delivered-a-large-inference-boost-for-sarvam-ais-sovereign-models/",
"status": "pending",
"weight": 0.4,
"ordinal": -3,
"source_id": null,
"confidence": 0.65,
"expected_date": "2027-03-17",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2027-12-31",
"from": "2026-06-01"
},
"measurement_criterion": "NVIDIA Nemotron-Flash or Liquid AI co-design framework adopted by ≥1 large model provider in production"
},
{
"kind": "quartile_checkpoint",
"label": "Q3 window check-in (75%)",
"status": "pending",
"weight": 0.05,
"ordinal": -2,
"sou
... (truncated)