AI models will distill down to a few million parameters as end-state
Predictor: Alex Wissner-Gross · EP #242 "Elon Enters the Chip Race, the S&P 500 Repricing, and Human Drivers Will Become Illegal"
Verbatim quote
at the end of the the distillation rainbow we get like the the distilled black hole of a model or a neutron star or something, the ultimate phase change where it's maybe like a few million parameters
Calibration plot (stated vs observed)
Evidence about this node from Alex Wissner-Gross is multiplied by κ in /api/intake. A lower κ means less weight: κ is floored at 0.10 (effectively silenced) and capped at 1.00 (full weight).
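A minimal sketch of that weighting, assuming κ is clamped to [0.10, 1.00] and then multiplied into a scalar evidence weight. The helper name `apply_kappa` and the notion of a single scalar weight are illustrative assumptions; the page does not show how /api/intake derives κ or exactly which quantity it scales.

```python
KAPPA_FLOOR = 0.10  # stated floor: the source is effectively silenced
KAPPA_CAP = 1.00    # stated cap: evidence counts at full weight


def apply_kappa(evidence_weight: float, kappa: float) -> float:
    """Scale an evidence weight by kappa clamped to [KAPPA_FLOOR, KAPPA_CAP].

    Hypothetical helper: only the floor, the cap, and the multiplication
    come from the page; everything else is assumed for illustration.
    """
    clamped = min(max(kappa, KAPPA_FLOOR), KAPPA_CAP)
    return evidence_weight * clamped


if __name__ == "__main__":
    # A raw kappa below the floor or above the cap is pulled back into range.
    for raw_kappa in (0.0, 0.5, 2.0):
        print(raw_kappa, apply_kappa(1.0, raw_kappa))
```

Under this reading, a poorly calibrated predictor never drops to zero influence; the floor keeps one tenth of the evidence weight.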
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
Probability over time
Milestone chain
- 2025-11-15 · hit · Densing Law published in Nature MI: capability-per-parameter doubles every 3.5 months
  How: Peer-reviewed paper published in Nature Machine Intelligence or comparable venue establishes empirical 'densing law' with doubling time <=4 months for capability-per-parameter at fixed benchmark performance
  Source: https://www.nature.com/articles/s42256-025-01137-0
  Confidence: 90%
- 2026-04-01 · hit · Distilled small model achieves frontier-class performance with 5x-50x cost reduction
  How: Public model release (DeepSeek, Phi, Gemma, Llama, Qwen class) <=8B parameters achieves >=85% of GPT-4-class MMLU score, with documented training-cost reduction >=5x vs frontier
  Source: https://www.aitechboss.com/ai-model-distillation-2026/
  Confidence: 90%
- 2026-11-02 · pending · Q1 window check-in (25%)
- 2026-06-01 → 2027-06-30 · pending · Sub-1B-parameter model achieves GPT-3.5-class performance on standard benchmarks
  How: Public model release <1B params achieves >=70% MMLU AND >=80% HumanEval pass@1, validated by Hugging Face leaderboard or independent evaluation
  Source: https://medium.com/@hs5492349/the-model-optimization-revolution-how-pruning-distillation-and-peft-are-reshaping-ai-in-2025-c9f79a9e7c2b
  Confidence: 70%
- 2027-05-07 · pending · Q2 window check-in (50%)
- 2026-12-01 → 2028-05-14 · pending · Sub-100M-parameter model achieves competent task-specific performance
  How: Public release of <100M parameter model achieving >=90% performance vs frontier model on a useful narrow task (medical Q&A, legal contract analysis, code completion in single language)
  Source: https://datanorth.ai/blog/model-distillation-how-to-cut-inference-costs-without-losing-quality
  Confidence: 70%
- 2027-11-09 · pending · Q3 window check-in (75%)
- 2027-06-01 → 2028-12-31 · pending · Sub-10M-parameter task-specific model demonstrates near-frontier on bounded domain
  How: Research paper or commercial product demonstrates <10M parameter model achieving >=95% of frontier performance on a narrow, well-bounded benchmark (e.g., medical-coding, named-entity recognition)
  Source: https://www.trendflash.net/posts/the-rise-of-small-models-why-lightweight-ai-is-overtaking-giants-in-real-world-use
  Confidence: 50%
- 2028-01-01 → 2029-12-31 · pending · Cascade: 'Few million parameters' end-state model class deployed on consumer edge devices
  How: Commercial deployment of <10M parameter inference models running >=1B inferences/month on smartphones / wearables / IoT, per Apple/Google/Meta disclosure
  Source: https://redis.io/blog/model-distillation-llm-guide/
  Confidence: 40%
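In the raw metadata further down, each ranged milestone also carries a single `expected_date`. For the two ranged entries visible there, that date coincides with the midpoint of the range, which the sketch below checks. The midpoint rule is an inference from those two entries rather than something the page states, and `midpoint` is a hypothetical helper name.

```python
from datetime import date


def midpoint(start: date, end: date) -> date:
    """Return the calendar date halfway between start and end (floored to whole days)."""
    return start + (end - start) // 2


# (range start, range end, expected_date as listed in the raw metadata)
ranged_milestones = [
    (date(2026, 6, 1), date(2027, 6, 30), date(2026, 12, 15)),   # sub-1B milestone
    (date(2026, 12, 1), date(2028, 5, 14), date(2027, 8, 23)),   # sub-100M milestone
]

if __name__ == "__main__":
    for start, end, listed in ranged_milestones:
        mid = midpoint(start, end)
        print(f"{start} -> {end}: midpoint {mid}, matches listed date: {mid == listed}")
```

If the same rule holds for the entries cut off by the truncation, their expected dates could be recovered from the ranges in the list above, but that remains unverified here.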
What if this resolves?
Clamping this prediction to a resolved outcome and running a Gibbs sample returns the predictions whose marginals shift most. Each run takes about 30 s, which makes it suited to stress-testing "if X resolves, what else moves?"
Evidence chain
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | S_AGI_MID_2029 AGI mid: Kurzweil 2029 path | 35.0% | 0.450 | 0.050 | -0.157 |
| killer | TK03 AI Regulatory Moratorium (EU/US Capability Freeze) | 10.0% | 0.050 | 0.450 | +0.063 |
| killer | TK02 AI Compute Supply Shock (TSMC/Taiwan Disruption) | 12.0% | 0.050 | 0.450 | +0.055 |
| killer | TK01 AGI Capability Plateau (2026-27 Training Stall) | 15.0% | 0.050 | 0.450 | +0.043 |
| killer | TK09 Energy Grid Cap (Data Center Power Wall) | 35.0% | 0.050 | 0.450 | -0.037 |
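The page does not say how Δ implied is computed. One reading that matches all five rows to three decimals is the law of total probability minus this node's current belief: Δ = P(c|s=T)·P(s) + P(c|s=F)·(1 − P(s)) − P(c). The sketch below applies that reading. The current belief `P_CHILD = 0.347` is an assumption back-solved from the table, not a figure shown on the page, and `implied_delta` is a hypothetical helper name.

```python
def implied_delta(p_parent: float, p_c_given_true: float,
                  p_c_given_false: float, p_child: float) -> float:
    """Marginal belief implied by one parent edge, minus the child's current belief."""
    implied = p_parent * p_c_given_true + (1.0 - p_parent) * p_c_given_false
    return implied - p_child


# ASSUMPTION: back-solved from the table rows; the page does not state this value.
P_CHILD = 0.347

# node id -> (their prob, P(c|s=T), P(c|s=F)), copied from the table above
parents = {
    "S_AGI_MID_2029": (0.35, 0.450, 0.050),
    "TK03": (0.10, 0.050, 0.450),
    "TK02": (0.12, 0.050, 0.450),
    "TK01": (0.15, 0.050, 0.450),
    "TK09": (0.35, 0.050, 0.450),
}

if __name__ == "__main__":
    for node, (p, given_true, given_false) in parents.items():
        print(f"{node}: {implied_delta(p, given_true, given_false, P_CHILD):+.3f}")
```

Read this way, the sign of each Δ says whether that edge alone would pull the node's belief above or below its current value: the unlikely killers push it up, while the 35% prerequisite and the 35% TK09 killer pull it down.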
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Ticker exposure
Beneficiaries (24)
Adverse (6)
Prerequisites (6)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | S_AGI_MID_2029 | AGI mid: Kurzweil 2029 path | agi_general_capability | — |
| killer | TK09 | Energy Grid Cap (Data Center Power Wall) | — | — |
| killer | TK05 | Rate Regime Persistence (10y > 5% through 2028) | — | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK02 | AI Compute Supply Shock (TSMC/Taiwan Disruption) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Linked documents (10)
Raw metadata
{
  "nia": false,
  "qty": "few million parameters",
  "url": "https://www.youtube.com/watch?v=wMLcIWLlcWg",
  "mode": "SPECULATION",
  "role": "Host",
  "context": "the ultimate phase change where it's maybe like a few million parameters",
  "verbatim": "at the end of the the distillation rainbow we get like the the distilled black hole of a model or a neutron star or something, the ultimate phase change where it's maybe like a few million parameters",
  "conv_cues": "maybe",
  "direction": "HAPPEN",
  "timeframe": "unspecified future",
  "conv_level": "MEDIUM",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "Densing Law published in Nature MI: capability-per-parameter doubles every 3.5 months",
      "source": "https://www.nature.com/articles/s42256-025-01137-0",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.9,
      "expected_date": "2025-11-15",
      "observed_date": "2025-11-15",
      "research_origin": "deep_research",
      "measurement_criterion": "Peer-reviewed paper published in Nature Machine Intelligence or comparable venue establishes empirical 'densing law' with doubling time <=4 months for capability-per-parameter at fixed benchmark performance"
    },
    {
      "kind": "llm_pre_event",
      "label": "Distilled small model achieves frontier-class performance with 5x-50x cost reduction",
      "source": "https://www.aitechboss.com/ai-model-distillation-2026/",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.9,
      "expected_date": "2026-04-01",
      "observed_date": "2026-04-01",
      "research_origin": "deep_research",
      "measurement_criterion": "Public model release (DeepSeek, Phi, Gemma, Llama, Qwen class) <=8B parameters achieves >=85% of GPT-4-class MMLU score, with documented training-cost reduction >=5x vs frontier"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q1 window check-in (25%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -6,
      "source_id": null,
      "expected_date": "2026-11-02",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Sub-1B-parameter model achieves GPT-3.5-class performance on standard benchmarks",
      "source": "https://medium.com/@hs5492349/the-model-optimization-revolution-how-pruning-distillation-and-peft-are-reshaping-ai-in-2025-c9f79a9e7c2b",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -5,
      "source_id": null,
      "confidence": 0.7,
      "expected_date": "2026-12-15",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2027-06-30",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Public model release <1B params achieves >=70% MMLU AND >=80% HumanEval pass@1, validated by Hugging Face leaderboard or independent evaluation"
    },
    {
      "kind": "quartile_checkpoint",
      "label": "Q2 window check-in (50%)",
      "status": "pending",
      "weight": 0.05,
      "ordinal": -4,
      "source_id": null,
      "expected_date": "2027-05-07",
      "observed_date": null
    },
    {
      "kind": "llm_pre_event",
      "label": "Sub-100M-parameter model achieves competent task-specific performance",
      "source": "https://datanorth.ai/blog/model-distillation-how-to-cut-inference-costs-without-losing-quality",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -3,
      "source_id": null,
      "confidence": 0.7,
      "expected_date": "2027-08-23",
      "research_origin": "deep_research",
      "expected_date_range": {
        "to": "2028-05-14",
        "from": "2026-12-01"
      },
      "measurement_criterion": "Public release of <100M parameter model achieving >=90% performance vs frontier model on a useful narrow task (medical Q&A, legal contract analysis, code completion in single language)"
    },
    {
      "kind": "quar
... (truncated)