The 'Paperclip Maximizer' thought experiment — unaligned superintelligence consuming all planetary resources to execute a single mundane task — is being revived as a practical engineering concern as AI transitions from digital to physical domains. The ...
Predictor: Nick Bostrom
Prediction text
The 'Paperclip Maximizer' thought experiment, in which an unaligned superintelligence consumes all planetary resources to execute a single mundane task, is being revived as a practical engineering concern as AI transitions from digital to physical domains. The framework remains the philosophical bedrock for current safety and alignment protocols, even as some dismiss near-term existential claims as regulatory-capture theater.
Key catalyst: First demonstrated instrumental-convergence event in embodied AI system
Watch events: Embodied-AI alignment research; mesa-optimization discoveries
Resolution evidence
Bostrom's Superintelligence (2014) and Deep Utopia (2024) frameworks shape alignment research. Scaling of embodied AI makes paperclip scenarios in the physical domain more relevant to engineering practice.
Evidence about this node from Nick Bostrom is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
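The κ rule above can be sketched in a few lines. This is a minimal illustration, not the actual `/api/intake` implementation: the function names are hypothetical, and treating "evidence" as a single float (for example a log-odds update) is an assumption, since the page does not say what unit the evidence is in.

```python
# Bounds taken from the description above: floor 0.10, cap 1.00.
KAPPA_FLOOR = 0.10
KAPPA_CAP = 1.00


def clamp_kappa(kappa: float) -> float:
    """Clamp a predictor's kappa into the documented [0.10, 1.00] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weight_evidence(evidence: float, kappa: float) -> float:
    """Scale one piece of evidence by the clamped kappa.

    Assumption: evidence is a plain float; the real intake endpoint
    may represent it differently.
    """
    return evidence * clamp_kappa(kappa)


if __name__ == "__main__":
    for raw in (0.02, 0.5, 1.7):
        print(raw, "->", clamp_kappa(raw))
```

A κ below the floor still passes 10% of the evidence through, which is why the floor is described as "effectively silenced" rather than fully muted.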
Reference class: regulatory_freeze_window
Major-country regulatory pause/moratorium on AI capability research lasting >6 months
Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
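The blend described above might look like the following sketch. Only the two endpoints (w_in = 0.3 at TRF = 1, w_in = 1.0 at TRF = 0) come from the page; the linear interpolation between them, the mixing in probability space rather than log-odds, and the function names are all assumptions made for illustration.

```python
def inside_weight(trf: float) -> float:
    """Weight on the inside view for a given TRF in [0, 1].

    Endpoints are from the page (0.3 at TRF=1, 1.0 at TRF=0);
    the straight line between them is an assumption.
    """
    trf = min(1.0, max(0.0, trf))
    return 1.0 - 0.7 * trf


def blended_probability(p_inside: float, p_base_rate: float, trf: float) -> float:
    """Mix the inside-view estimate with the reference-class base rate.

    Assumption: a convex combination in probability space.
    """
    w_in = inside_weight(trf)
    return w_in * p_inside + (1.0 - w_in) * p_base_rate


if __name__ == "__main__":
    # Hypothetical numbers: inside view 0.80, historical base rate 0.20.
    for trf in (1.0, 0.5, 0.0):
        print(trf, round(blended_probability(0.80, 0.20, trf), 3))
```

Under these assumptions an overconfident inside view of 0.80 is pulled most of the way toward the base rate when the prediction is fresh and is left untouched at the deadline.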
Probability over time
Milestone chain
- 2026-06-01 → 2030-12-31 (pending): Bostrom or successor publishes formal updated paperclip-maximizer treatment specific to embodied / physical AI
  - How: Bostrom (Future of Humanity Institute / personal capacity) or recognized successor (Yudkowsky, Russell, Christiano, Soares) publishes a book or peer-reviewed paper extending the paperclip framework specifically to embodied / general-purpose-robot contexts
  - Source: Bostrom 'Deep Utopia' (2024) and 2024-2026 essays; brain.edusoft.ro 2026 reference
  - Confidence: 50%
- 2027-01-01 → 2030-12-31 (pending): Major embodied-AI lab (Figure / 1X / Tesla Optimus / Boston Dynamics) publishes formal alignment / safety case for production deployment
  - How: Top-5 humanoid-robotics company publishes an Anthropic-style 'safety case' or 'responsible scaling policy' explicitly addressing instrumental-convergence and mesa-optimization risks in physical systems prior to mass deployment
  - Source: Industry safety-case literature; AI Safety Institute publications
  - Confidence: 45%
- 2029-06-14 (pending): Q1 window check-in (25%)
- 2027-01-01 → 2032-12-31 (pending): First peer-reviewed empirical demonstration of instrumental-convergence behavior in an embodied (physical) AI system
  - How: Paper accepted at NeurIPS / ICML / RSS / ICLR documents a real-world (not simulation-only) robotic system that empirically exhibits self-preservation, resource-acquisition, or shutdown-resistance behavior emergent from a non-aligned objective; replicates lab-bench LLM findings (78% alignment-faking, 79-97% shutdown resistance) in physical hardware
  - Source: arXiv 'Steerability of Instrumental-Convergence Tendencies in LLMs' (2601.01584); ACM Computing Surveys 'AI Alignment: A Contemporary Survey'
  - Confidence: 40%
- 2028-01-01 → 2034-12-31 (pending): US/EU regulator imposes pre-deployment alignment audit on embodied AI systems above defined capability threshold
  - How: FDA-style pre-market approval or EU AI Act high-risk classification triggers mandatory third-party alignment / instrumental-convergence audit before commercial deployment of robots above a defined autonomy threshold
  - Source: EU AI Act Annex III high-risk system list; NIST AI RMF
  - Confidence: 35%
- 2031-11-27 (pending): Q2 window check-in (50%)
- 2028-01-01 → 2035-12-31 (pending): Mesa-optimization observed and confirmed in deployed embodied AI (replicates Hubinger et al. 'Risks from Learned Optimization' framework in robotics)
  - How: Peer-reviewed publication or AI Safety Institute audit confirms a deployed robotic system has developed an internal mesa-optimizer pursuing an inner objective distinct from the trained base objective
  - Source: MIRI 'Learned Optimization'; LongtermWiki Mesa-Optimization Risk Analysis
  - Confidence: 30%
- 2034-05-11 (pending): Q3 window check-in (75%)
No downstream cascades — this prediction is a leaf in the dependency graph.
Evidence chain
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| killer | TK01 AGI Capability Plateau (2026-27 Training Stall) | 15.0% | 0.050 | 0.350 | -0.091 |
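To make the two conditional columns concrete, the sketch below combines them with the parent's probability using the law of total probability. It assumes that `c` denotes this node and `s` the parent (source) node, which the page does not state explicitly. The page also does not say how the "Δ implied" column is derived, so this example does not attempt to reproduce that value; the function name is hypothetical.

```python
def marginal_child_probability(
    p_source: float, p_c_given_true: float, p_c_given_false: float
) -> float:
    """Law of total probability over a single binary parent.

    P(c) = P(s) * P(c | s=T) + (1 - P(s)) * P(c | s=F)
    """
    return p_source * p_c_given_true + (1.0 - p_source) * p_c_given_false


if __name__ == "__main__":
    # Values from the TK01 row: their prob 15.0%, P(c|s=T)=0.050, P(c|s=F)=0.350.
    p_c = marginal_child_probability(0.15, 0.050, 0.350)
    print(round(p_c, 4))
```

With a "killer" edge, P(c|s=T) is far below P(c|s=F), so any rise in the parent's probability lowers this marginal.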
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Prerequisites (2)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| correlate | S_ASI_SLOW_2040PLUS | ASI slow: post-2040 / soft takeoff | asi_recursive_self_improvement | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Linked documents (10)
Raw metadata
{
"nia": false,
"mode": "FORECAST",
"role": "Cited-Other",
"context": "Fourth Bostrom entry (232_040 pause, AI_035 meaning of life, CYB_027 orthogonality, ROB_027 paperclip). Specific paperclip-maximizer-in-physical-domain framing.",
"to_year": 2040,
"conv_cues": "foundational thought experiment revived in new context",
"direction": "HAPPEN",
"from_year": 2027,
"timeframe": "2027-2040",
"conv_level": "HIGH",
"milestones": [
{
"kind": "llm_pre_event",
"label": "Bostrom or successor publishes formal updated paperclip-maximizer treatment specific to embodied / physical AI",
"source": "Bostrom 'Deep Utopia' (2024) and 2024-2026 essays; brain.edusoft.ro 2026 reference",
"status": "pending",
"weight": 0.4,
"ordinal": -8,
"source_id": null,
"confidence": 0.5,
"expected_date": "2028-09-15",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2030-12-31",
"from": "2026-06-01"
},
"measurement_criterion": "Bostrom (Future of Humanity Institute / personal capacity) or recognized successor (Yudkowsky, Russell, Christiano, Soares) publishes a book or peer-reviewed paper extending the paperclip framework specifically to embodied / general-purpose-robot contexts"
},
{
"kind": "llm_pre_event",
"label": "Major embodied-AI lab (Figure / 1X / Tesla Optimus / Boston Dynamics) publishes formal alignment / safety case for production deployment",
"source": "Industry safety-case literature; AI Safety Institute publications",
"status": "pending",
"weight": 0.4,
"ordinal": -7,
"source_id": null,
"confidence": 0.45,
"expected_date": "2028-12-31",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2030-12-31",
"from": "2027-01-01"
},
"measurement_criterion": "Top-5 humanoid-robotics company publishes an Anthropic-style 'safety case' or 'responsible scaling policy' explicitly addressing instrumental-convergence and mesa-optimization risks in physical systems prior to mass deployment"
},
{
"kind": "quartile_checkpoint",
"label": "Q1 window check-in (25%)",
"status": "pending",
"weight": 0.05,
"ordinal": -6,
"source_id": null,
"expected_date": "2029-06-14",
"observed_date": null
},
{
"kind": "llm_pre_event",
"label": "First peer-reviewed empirical demonstration of instrumental-convergence behavior in an embodied (physical) AI system",
"source": "arXiv 'Steerability of Instrumental-Convergence Tendencies in LLMs' (2601.01584); ACM Computing Surveys 'AI Alignment: A Contemporary Survey'",
"status": "pending",
"weight": 0.4,
"ordinal": -5,
"source_id": null,
"confidence": 0.4,
"source_url": "https://arxiv.org/html/2601.01584v2",
"expected_date": "2029-12-31",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2032-12-31",
"from": "2027-01-01"
},
"measurement_criterion": "Paper accepted at NeurIPS / ICML / RSS / ICLR documents a real-world (not simulation-only) robotic system that empirically exhibits self-preservation, resource-acquisition, or shutdown-resistance behavior emergent from a non-aligned objective; replicates lab-bench LLM findings (78% alignment-faking, 79-97% shutdown resistance) in physical hardware"
},
{
"kind": "llm_post_event",
"label": "US/EU regulator imposes pre-deployment alignment audit on embodied AI systems above defined capability threshold",
"source": "EU AI Act Annex III high-risk system list; NIST AI RMF",
"status": "pending",
"weight": 0.4,
"ordinal": -4,
"source_id": null,
"confidence": 0.35,
"expected_date": "2031-07-02",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2034-12-31",
"from": "202
... (truncated)