Next major revolutions in foundation models will come from small language models
Predictor: Alex Wissner-Gross · ep#234 "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced" · source
Prediction text
Next major revolutions in foundation models will come from small language models | I strongly suspect that the next major revolutions, o1-level revolutions, in foundation models will come from the small side, because it's so much more accessible and so much easier for researchers to make progress
Verbatim quote
I I I strongly suspect that the next major revolutions in in like 01 level revolutions in in foundation models will come from the small side because it's so much more accessible and so much easier for researchers to make progress
Calibration plot (stated vs observed)
Evidence about this node from Alex Wissner-Gross is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
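The κ weighting described above can be sketched as follows. This is a minimal illustration, not the `/api/intake` implementation: the floor (0.10) and cap (1.00) come from the text, while the choice to scale evidence in log-odds space, and the names `clamp_kappa` and `weighted_update`, are assumptions made for the example.

```python
import math

KAPPA_FLOOR = 0.10  # from the page: effectively silenced
KAPPA_CAP = 1.00    # from the page: full weight


def clamp_kappa(kappa: float) -> float:
    """Keep kappa inside the [floor, cap] range stated on the page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weighted_update(prior: float, likelihood_ratio: float, kappa: float) -> float:
    """ASSUMPTION: kappa multiplies the evidence in log-odds space.

    With kappa = 1 this is an ordinary Bayesian update in odds form;
    with kappa at the floor the evidence moves the prior only slightly.
    """
    k = clamp_kappa(kappa)
    log_odds = math.log(prior / (1.0 - prior)) + k * math.log(likelihood_ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    for kappa in (0.02, 0.5, 1.7):
        posterior = weighted_update(prior=0.5, likelihood_ratio=4.0, kappa=kappa)
        print(f"kappa={kappa:<4} clamped={clamp_kappa(kappa):.2f} posterior={posterior:.3f}")
```

Under this assumption, a predictor with a poor calibration record still nudges the node, but never by more than a tenth of the evidence's full log-odds weight.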
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
Probability over time
Milestone chain
- 2026-09-01 → 2027-09-30 · pending · An open SLM (<=15B params) matches GPT-4-class on MMLU/HellaSwag standard benchmarks · How: Public benchmark (MMLU >=86, GPQA >=50) achieved by a model with <=15B activated params, peer-reproduced · Source: Phi-4-mini 67% MMLU at 3.8B (Microsoft 2025-2026) · conf 75%
- 2026-06-01 → 2028-06-30 · pending · First 'O1-class' reasoning architectural breakthrough published from SLM-side research · How: Peer-reviewed or arXiv paper from non-frontier-lab origin demonstrates novel reasoning paradigm at <=20B params with clear lift over prior SOTA · Source: Wissner-Gross thesis on accessibility-driven research · conf 60%
- 2027-01-01 → 2028-12-31 · pending · Academic research output on SLMs surpasses LLM-scaling output · How: ArXiv cs.CL submissions tagged for compact/efficient/SLM models exceed those tagged for >100B-scale work, per Semantic Scholar trend · Source: Wissner-Gross accessibility argument + observed 2024-2026 trend · conf 65%
- 2027-06-01 → 2029-12-31 · pending · On-device SLMs reach 100% smartphone shipment penetration · How: All major mobile OEMs (Apple, Samsung, Google) ship flagship + mid-tier with on-device SLM by default per IDC tracker · Source: Apple Intelligence + Galaxy AI + Gemini Nano deployment trajectory · conf 80%
- 2028-11-10 · pending · Q1 window check-in (25%)
- 2028-01-01 → 2030-06-30 · pending · SLM enterprise adoption crosses LLM API spend (industry inflection) · How: Gartner, IDC or analogous tracker reports majority of enterprise inference spend (>50%) on locally-hosted SLMs · Source: Gartner 3x SLM-vs-LLM use forecast by 2027 · conf 55%
- 2031-05-25 · pending · Q2 window check-in (50%)
- 2033-12-05 · pending · Q3 window check-in (75%)
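The three check-in entries in the chain are labelled as the 25%, 50% and 75% points of a window. A minimal sketch of how such dates could be derived, assuming the window is a plain start/end pair split linearly; the page does not state the window bounds, so `quartile_checkins` and the dates in the example are hypothetical.

```python
from datetime import date, timedelta


def quartile_checkins(start: date, end: date) -> dict[str, date]:
    """Return the 25%, 50% and 75% points of the [start, end] window.

    ASSUMPTION: check-ins are spaced linearly in calendar days.
    """
    span_days = (end - start).days
    return {
        f"Q{i} ({25 * i}%)": start + timedelta(days=round(span_days * i / 4))
        for i in (1, 2, 3)
    }


if __name__ == "__main__":
    # Hypothetical 400-day window, chosen only so the quarters are easy to read.
    window_start = date(2026, 1, 1)
    window_end = window_start + timedelta(days=400)
    for label, when in quartile_checkins(window_start, window_end).items():
        print(label, when.isoformat())
```

The listed check-ins (2028-11-10, 2031-05-25, 2033-12-05) are 926 and 925 days apart, which is consistent with even quarters of a window of roughly ten years.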
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Clamping this prediction to a resolved outcome and running a Gibbs sampling pass returns the predictions whose marginals shift most. Each run takes about 30 s; it is suited to stress-testing "if X resolves, what else moves?"
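The clamp-and-sample idea can be sketched on a toy network. Everything below is illustrative: the three-node chain A → B → C, its probabilities, and the function names are assumptions for the example, not the tracker's graph or code. Clamped nodes are held fixed while each free node is resampled from its conditional given the rest of the state.

```python
import random

# Hypothetical three-node chain A -> B -> C; all numbers are illustrative.
PRIOR_A = 0.3
P_B_GIVEN_A = {True: 0.05, False: 0.5}  # P(B=T | A)
P_C_GIVEN_B = {True: 0.8, False: 0.2}   # P(C=T | B)
NODES = ("A", "B", "C")


def bern(p: float, x: bool) -> float:
    """Probability of outcome x under a Bernoulli(p) variable."""
    return p if x else 1.0 - p


def joint(state: dict) -> float:
    """Joint probability of a full assignment to A, B and C."""
    a, b, c = state["A"], state["B"], state["C"]
    return bern(PRIOR_A, a) * bern(P_B_GIVEN_A[a], b) * bern(P_C_GIVEN_B[b], c)


def gibbs_marginals(clamped: dict, sweeps: int = 5000, burn_in: int = 500, seed: int = 0) -> dict:
    """Estimate P(node=T) for every unclamped node by Gibbs sampling."""
    rng = random.Random(seed)
    state = {name: rng.random() < 0.5 for name in NODES}
    state.update(clamped)
    free = [name for name in NODES if name not in clamped]
    counts = {name: 0 for name in free}
    for sweep in range(sweeps):
        for name in free:
            # Resample one node from its conditional given all the others.
            state[name] = True
            w_true = joint(state)
            state[name] = False
            w_false = joint(state)
            state[name] = rng.random() < w_true / (w_true + w_false)
        if sweep >= burn_in:
            for name in free:
                counts[name] += state[name]
    kept = sweeps - burn_in
    return {name: counts[name] / kept for name in free}


def biggest_shifts(clamped: dict) -> list:
    """Rank free nodes by how far clamping moves their marginal."""
    baseline = gibbs_marginals({})
    after = gibbs_marginals(clamped)
    shifts = [(name, after[name] - baseline[name]) for name in after]
    return sorted(shifts, key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    for name, shift in biggest_shifts({"A": True}):
        print(f"{name}: {shift:+.3f}")
```

In this toy chain, clamping A to true moves B more than C, because C only feels the change through B.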
Evidence chain
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | S_ASI_SLOW_2040PLUS ASI slow: post-2040 / soft takeoff | 60.0% | 0.500 | 0.050 | -0.092 |
| killer | TK09 Energy Grid Cap (Data Center Power Wall) | 35.0% | 0.050 | 0.500 | -0.070 |
| killer | TK05 Rate Regime Persistence (10y > 5% through 2028) | 30.0% | 0.050 | 0.500 | -0.047 |
| killer | TK03 AI Regulatory Moratorium (EU/US Capability Freeze) | 10.0% | 0.050 | 0.500 | +0.043 |
| killer | TK02 AI Compute Supply Shock (TSMC/Taiwan Disruption) | 12.0% | 0.050 | 0.500 | +0.034 |
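The columns of the incoming-edges table can be related by the law of total probability, P(c) = P(c|s=T)·P(s) + P(c|s=F)·(1 − P(s)). The sketch below applies it to the five rows. The page does not say how "Δ implied" is computed; the displayed values are consistent with subtracting a node probability of about 0.412 from each implied marginal, but that baseline is inferred from the rows and is an assumption, as are the function names.

```python
# Rows copied from the incoming-edges table:
# (node, their prob, P(c|s=T), P(c|s=F), displayed delta)
ROWS = [
    ("S_ASI_SLOW_2040PLUS", 0.60, 0.500, 0.050, -0.092),
    ("TK09", 0.35, 0.050, 0.500, -0.070),
    ("TK05", 0.30, 0.050, 0.500, -0.047),
    ("TK03", 0.10, 0.050, 0.500, +0.043),
    ("TK02", 0.12, 0.050, 0.500, +0.034),
]

# ASSUMPTION: inferred from the rows above, not displayed on the page.
ASSUMED_NODE_PROB = 0.412


def implied_marginal(p_source: float, p_c_if_true: float, p_c_if_false: float) -> float:
    """Marginal P(c) implied by one parent via the law of total probability."""
    return p_c_if_true * p_source + p_c_if_false * (1.0 - p_source)


def implied_delta(p_source: float, p_c_if_true: float, p_c_if_false: float) -> float:
    """Implied marginal minus the assumed current probability of this node."""
    return implied_marginal(p_source, p_c_if_true, p_c_if_false) - ASSUMED_NODE_PROB


if __name__ == "__main__":
    for node, p_s, p_t, p_f, shown in ROWS:
        marginal = implied_marginal(p_s, p_t, p_f)
        delta = implied_delta(p_s, p_t, p_f)
        print(f"{node:<20} marginal={marginal:.4f} delta={delta:+.4f} shown={shown:+.3f}")
```

Read this way, a likely killer (TK09 at 35%) pulls the node down, while an unlikely one (TK03 at 10%) implies a marginal above the assumed baseline, which would explain the positive deltas on the last two rows.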
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Ticker exposure
Beneficiaries (24)
Adverse (6)
Prerequisites (6)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | S_ASI_SLOW_2040PLUS | ASI slow: post-2040 / soft takeoff | asi_recursive_self_improvement | — |
| killer | TK09 | Energy Grid Cap (Data Center Power Wall) | — | — |
| killer | TK05 | Rate Regime Persistence (10y > 5% through 2028) | — | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK02 | AI Compute Supply Shock (TSMC/Taiwan Disruption) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Linked documents (10)
Raw metadata
{
  "nia": false,
  "url": "https://www.youtube.com/watch?v=dmtvGKuRE64",
  "mode": "PREDICTION",
  "role": "Host",
  "context": "I I I strongly suspect that the next major revolutions in in like 01 level revolutions in in foundation models will come from the small side because it's so much more accessible and so much easier for researchers to make progress",
  "verbatim": "I I I strongly suspect that the next major revolutions in in like 01 level revolutions in in foundation models will come from the small side because it's so much more accessible and so much easier for researchers to make progress",
  "conv_cues": "strongly suspect",
  "direction": "HAPPEN",
  "timeframe": "Unspecified future",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "An open SLM (<=15B params) matches GPT-4-class on MMLU/HellaSwag standard benchmarks",
      "source": "Phi-4-mini 67% MMLU at 3.8B (Microsoft 2025-2026)",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.75,
      "source_url": "https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/",
      "expected_date": "2027-03-17",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2027-09-30",
        "from": "2026-09-01"
      },
      "measurement_criterion": "Public benchmark (MMLU >=86, GPQA >=50) achieved by a model with <=15B activated params, peer-reproduced"
    },
    {
      "kind": "llm_pre_event",
      "label": "First 'O1-class' reasoning architectural breakthrough published from SLM-side research",
      "source": "Wissner-Gross thesis on accessibility-driven research",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.6,
      "expected_date": "2027-06-16",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2028-06-30",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Peer-reviewed or arXiv paper from non-frontier-lab origin demonstrates novel reasoning paradigm at <=20B params with clear lift over prior SOTA"
    },
    {
      "kind": "llm_pre_event",
      "label": "Academic research output on SLMs surpasses LLM-scaling output",
      "source": "Wissner-Gross accessibility argument + observed 2024-2026 trend",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -6,
      "source_id": null,
      "confidence": 0.65,
      "expected_date": "2028-01-01",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2028-12-31",
        "from": "2027-01-01"
      },
      "measurement_criterion": "ArXiv cs.CL submissions tagged for compact/efficient/SLM models exceed those tagged for >100B-scale work, per Semantic Scholar trend"
    },
    {
      "kind": "llm_pre_event",
      "label": "On-device SLMs reach 100% smartphone shipment penetration",
      "source": "Apple Intelligence + Galaxy AI + Gemini Nano deployment trajectory",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -5,
      "source_id": null,
      "confidence": 0.8,
      "expected_date": "2028-09-15",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2029-12-31",
        "from": "2027-06-01"
      },
      "measurement_criterion": "All major mobile OEMs (Apple, Samsung, Google) ship flagship + mid-tier with on-device SLM by default per IDC tracker"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -4,
      "source_id": null,
      "expected_date": "2028-11-10",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "SLM enterprise adoption crosses LLM API spend (industry inflection)",
      "source": "Gartner 3x SLM-vs-LLM use forecast by 2027",
      "status": "pending",
      "weight": 0.4,
... (truncated)