True autonomous agents are 'not anywhere close' — AGI and reliable long-horizon agents will require a full decade (2034 or beyond) to develop the holistic contextual reasoning and robust world models needed for unconstrained physical and digital enviro...
Predictor: Andrej Karpathy
Prediction text
True autonomous agents are 'not anywhere close': AGI and reliable long-horizon agents will require a full decade (2034 or beyond) to develop the holistic contextual reasoning and robust world models needed for unconstrained physical and digital environments.
Key catalyst: Long-horizon agent benchmark breakthrough
Watch events: Long-horizon agent benchmarks; world-model saturation
Resolution evidence
Karpathy "Software 3.0" framing (INF_026) validated; specific decade-out AGI timeline provides bearish anchor against Altman 2030 / Aschenbrenner 2027.
Calibration plot (stated vs observed)
Evidence about this node from Andrej Karpathy is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
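The κ weighting described above can be sketched as follows. This is a minimal illustration, assuming evidence arrives as a signed log-odds shift; the function names are hypothetical, and the actual /api/intake implementation is not shown on this page.

```python
# Bounds taken from the description above: kappa floors at 0.10 and caps at 1.00.
KAPPA_FLOOR = 0.10
KAPPA_CAP = 1.00


def clamp_kappa(kappa: float) -> float:
    """Restrict kappa to the [0.10, 1.00] range stated on the page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weight_evidence(evidence: float, kappa: float) -> float:
    """Scale one piece of evidence by the predictor's clamped kappa.

    ASSUMPTION: `evidence` is a signed log-odds shift; the page does not
    say which unit the real intake endpoint uses.
    """
    return evidence * clamp_kappa(kappa)
```

Under these assumptions, a predictor with a raw κ of 0.05 still contributes at the 0.10 floor, and a raw κ of 1.3 is capped at full weight.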
Reference class: agi_breakthrough_5y
Major capability discontinuity (e.g. AGI by named target year, 5-year horizon)
Tetlock-style outside view: at TRF=1 (just predicted), outside view dominates (w_in=0.3). At TRF=0 (deadline), inside view dominates (w_in=1.0). The blend regularizes overconfident inside views toward the historical base rate.
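The blend above can be sketched in a few lines. The page gives only the two endpoints (w_in = 0.3 at TRF = 1 and w_in = 1.0 at TRF = 0), so the linear interpolation between them is an assumption, as is the reading of TRF as the fraction of time remaining before the deadline. The function names are hypothetical.

```python
W_IN_AT_PREDICTION = 0.3  # stated on the page for TRF = 1
W_IN_AT_DEADLINE = 1.0    # stated on the page for TRF = 0


def inside_weight(trf: float) -> float:
    """Weight on the inside view for a given TRF in [0, 1].

    ASSUMPTION: linear interpolation between the two stated endpoints.
    """
    trf = max(0.0, min(1.0, trf))
    return W_IN_AT_DEADLINE - (W_IN_AT_DEADLINE - W_IN_AT_PREDICTION) * trf


def blended_probability(p_inside: float, base_rate: float, trf: float) -> float:
    """Convex blend of the inside-view estimate and the reference-class base rate."""
    w_in = inside_weight(trf)
    return w_in * p_inside + (1.0 - w_in) * base_rate
```

For example, an inside view of 0.80 against a base rate of 0.10 blends to 0.31 at TRF = 1 and returns to 0.80 at TRF = 0, which is the regularizing effect the text describes.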
Probability over time
Milestone chain
- 2026-04-01 → 2028-06-30 | pending | Karpathy publishes update on his decade-away framing or releases follow-up agentic research (Eureka Labs, nanochat successor)
  - How: Karpathy public talk, blog, or repo release where he addresses his decade timeline, allowing direct comparison to Oct 2025 Dwarkesh interview
  - Source: https://fortune.com/2026/03/17/andrej-karpathy-loop-autonomous-ai-agents-future/
  - conf 85%
- 2026-06-01 → 2029-06-30 | pending | Continual-learning architecture demonstrated in production agent (memory persists across sessions, no model retraining)
  - How: Frontier lab paper or product demo shows agent that genuinely accumulates and uses session-spanning learning without full retraining cycle
  - Source: https://www.predictiveanalyticsworld.com/machinelearningtimes/agi-is-still-a-decade-away-todays-ai-agents-are-slop-openai-cofounder-andrej-karpathy/13949/
  - conf 50%
- 2026-06-01 → 2029-12-31 | pending | Frontier agentic benchmark (OSWorld / GAIA / AgentBench long-horizon) crosses 70% pass-rate
  - How: Public leaderboard for recognized long-horizon agentic benchmark shows >=70% (vs 2025 SOTA typically 30-50% on hardest variants)
  - Source: https://www.remio.ai/post/why-andrej-karpathy-says-ai-agents-are-a-decade-from-reality
  - conf 60%
- 2027-01-01 → 2031-12-31 | pending | Frontier lab (OpenAI, Anthropic, DeepMind) ships agent product with persistent multi-day autonomous task completion at >85% reliability
  - How: Lab product release with marketing claim of multi-day autonomous task completion plus published evaluation showing >85% success on long-horizon real-world task suite
  - Source: https://www.landera.ai/guide/karpathy-paradox
  - conf 55%
- 2028-01-01 → 2032-12-31 | pending | Multi-modal robotic agent demonstrates >24-hour unscripted task continuity in real-world environment
  - How: Public demo or peer-reviewed result of embodied agent (Figure, 1X, Tesla Optimus, Google RT-X) operating >24 hours in real-world workspace without human intervention
  - Source: https://www.flowhunt.io/blog/the-decade-of-ai-agents-andrej-karpathy-agi-timeline/
  - conf 50%
- 2031-01-01 → 2036-12-31 | pending | Cascade: BLS occupational displacement attributable to autonomous agents exceeds 5% of US workforce
  - How: BLS, OECD, or major economic study attributes >=5% of US labor displacement specifically to autonomous AI agents (not just AI augmentation broadly)
  - Source: https://medium.com/generative-ai-revolution-ai-native-transformation/openai-cofounder-warned-of-an-ai-agent-crisis-agentic-engineering-is-the-way-forward-6b746b9f0946
  - conf 40%
- 2034-03-06 | pending | Q1 window check-in (25%)
- 2034-05-09 | pending | Q2 window check-in (50%)
- 2034-07-12 | pending | Q3 window check-in (75%)
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
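To illustrate what clamping a node and running a Gibbs sample involves, here is a minimal sketch on a toy three-node network. The page does not describe the real model, so the pairwise binary form, the biases, the couplings, and all names below are assumptions chosen for illustration only.

```python
import math
import random


def gibbs_marginals(bias, coupling, clamp=None, sweeps=4000, burn_in=500, seed=0):
    """Estimate P(node = 1) for each node of a small binary pairwise model.

    bias[i] is node i's log-odds when every neighbour is 0.
    coupling[(i, j)] is the symmetric log-odds shift between nodes i and j.
    clamp maps a node index to a fixed 0/1 value (the "what if it resolves" case).
    """
    rng = random.Random(seed)
    clamp = clamp or {}
    n = len(bias)
    state = [clamp.get(i, 0) for i in range(n)]
    counts = [0] * n
    for sweep in range(sweeps):
        for i in range(n):
            if i in clamp:
                continue  # clamped nodes are never resampled
            logit = bias[i]
            for (a, b), w in coupling.items():
                if a == i:
                    logit += w * state[b]
                elif b == i:
                    logit += w * state[a]
            p_one = 1.0 / (1.0 + math.exp(-logit))
            state[i] = 1 if rng.random() < p_one else 0
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += state[i]
    kept = sweeps - burn_in
    return [c / kept for c in counts]


def largest_shifts(bias, coupling, clamp):
    """Rank unclamped nodes by how far their marginal moves under the clamp."""
    base = gibbs_marginals(bias, coupling)
    clamped = gibbs_marginals(bias, coupling, clamp=clamp)
    shifts = [(i, clamped[i] - base[i]) for i in range(len(bias)) if i not in clamp]
    return sorted(shifts, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical toy network: node 0 supports node 1, node 1 suppresses node 2.
TOY_BIAS = [-1.0, -0.5, 0.0]
TOY_COUPLING = {(0, 1): 2.0, (1, 2): -1.5}
```

Calling `largest_shifts(TOY_BIAS, TOY_COUPLING, clamp={0: 1})` returns the unclamped nodes ordered by the size of their marginal shift, which mirrors the "predictions whose marginals shift most" output described above.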
Evidence chain
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c|s=T) | P(c|s=F) | Δ implied |
|---|---|---|---|---|---|
| killer | TK06 China-Taiwan Military Conflict | 8.0% | 0.050 | 0.450 | +0.037 |
| killer | TK03 AI Regulatory Moratorium (EU/US Capability Freeze) | 10.0% | 0.050 | 0.450 | +0.029 |
| killer | TK11 Autonomous Regulatory Block (Level 4 Halt) | 10.0% | 0.050 | 0.450 | +0.029 |
| killer | TK01 AGI Capability Plateau (2026-27 Training Stall) | 15.0% | 0.050 | 0.450 | +0.009 |
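The Δ implied column appears consistent with marginalizing this node over each killer: P(c) = P(c|s=T)·P(s) + P(c|s=F)·(1 − P(s)), minus a current node probability of about 0.381. That baseline is inferred from the table rows and is not stated on the page, so treat the sketch below as a check on the arithmetic rather than a description of the actual propagation code.

```python
# Rows copied from the table above: (node, P(s), P(c|s=T), P(c|s=F)).
PARENTS = [
    ("TK06", 0.08, 0.050, 0.450),
    ("TK03", 0.10, 0.050, 0.450),
    ("TK11", 0.10, 0.050, 0.450),
    ("TK01", 0.15, 0.050, 0.450),
]

# ASSUMPTION: inferred so that the computed deltas match the table; not in the source.
ASSUMED_BASELINE = 0.381


def implied_marginal(p_s, p_c_given_true, p_c_given_false):
    """Law of total probability over a single binary parent."""
    return p_c_given_true * p_s + p_c_given_false * (1.0 - p_s)


def implied_deltas(rows, baseline=ASSUMED_BASELINE):
    """Implied marginal minus the assumed baseline, rounded to three decimals."""
    return {
        name: round(implied_marginal(p_s, p_true, p_false) - baseline, 3)
        for name, p_s, p_true, p_false in rows
    }


if __name__ == "__main__":
    for name, delta in implied_deltas(PARENTS).items():
        print(f"{name}: {delta:+.3f}")
```

For TK06, for instance, 0.050 × 0.08 + 0.450 × 0.92 = 0.418, and 0.418 − 0.381 = 0.037, matching the listed +0.037.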
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Ticker exposure
Adverse (4)
Prerequisites (8)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| correlate | S_ASI_SLOW_2040PLUS | ASI slow: post-2040 / soft takeoff | asi_recursive_self_improvement | — |
| correlate | S_AGI_MID_2029 | AGI mid: Kurzweil 2029 path | agi_general_capability | — |
| correlate | S_AGI_SLOW_2031 | AGI slow: Schmidt/Hassabis 5-10 year path | agi_general_capability | — |
| correlate | S_AGI_WINTER_2036PLUS | AGI delayed: capability plateau or AI winter | agi_general_capability | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
| killer | TK11 | Autonomous Regulatory Block (Level 4 Halt) | — | — |
| killer | TK06 | China-Taiwan Military Conflict | — | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Linked documents (10)
Raw metadata
{
  "nia": false,
  "mode": "FORECAST",
  "role": "Cited-Other",
  "context": "Karpathy's structural-caution anchor provides the bearish extreme in the AGI-timeline debate. Couples with Hassabis jagged-intelligence (AI_005) and CMQ_047 (agents close-the-loop without human).",
  "to_year": 2034,
  "conv_cues": "decade horizon; ex-senior-researcher framing",
  "direction": "HAPPEN",
  "from_year": 2034,
  "timeframe": "2034+",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "Karpathy publishes update on his decade-away framing or releases follow-up agentic research (Eureka Labs, nanochat successor)",
      "source": "https://fortune.com/2026/03/17/andrej-karpathy-loop-autonomous-ai-agents-future/",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -10,
      "source_id": null,
      "confidence": 0.85,
      "expected_date": "2027-05-16",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2028-06-30",
        "from": "2026-04-01"
      },
      "measurement_criterion": "Karpathy public talk, blog, or repo release where he addresses his decade timeline, allowing direct comparison to Oct 2025 Dwarkesh interview"
    },
    {
      "kind": "llm_pre_event",
      "label": "Continual-learning architecture demonstrated in production agent (memory persists across sessions, no model retraining)",
      "source": "https://www.predictiveanalyticsworld.com/machinelearningtimes/agi-is-still-a-decade-away-todays-ai-agents-are-slop-openai-cofounder-andrej-karpathy/13949/",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -9,
      "source_id": null,
      "confidence": 0.5,
      "expected_date": "2027-12-15",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2029-06-30",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Frontier lab paper or product demo shows agent that genuinely accumulates and uses session-spanning learning without full retraining cycle"
    },
    {
      "kind": "llm_pre_event",
      "label": "Frontier agentic benchmark (OSWorld / GAIA / AgentBench long-horizon) crosses 70% pass-rate",
      "source": "https://www.remio.ai/post/why-andrej-karpathy-says-ai-agents-are-a-decade-from-reality",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -8,
      "source_id": null,
      "confidence": 0.6,
      "expected_date": "2028-03-16",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2029-12-31",
        "from": "2026-06-01"
      },
      "measurement_criterion": "Public leaderboard for recognized long-horizon agentic benchmark shows >=70% (vs 2025 SOTA typically 30-50% on hardest variants)"
    },
    {
      "kind": "llm_pre_event",
      "label": "Frontier lab (OpenAI, Anthropic, DeepMind) ships agent product with persistent multi-day autonomous task completion at >85% reliability",
      "source": "https://www.landera.ai/guide/karpathy-paradox",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.55,
      "expected_date": "2029-07-01",
      "research_origin": "training",
      "expected_date_range": {
        "to": "2031-12-31",
        "from": "2027-01-01"
      },
      "measurement_criterion": "Lab product release with marketing claim of multi-day autonomous task completion plus published evaluation showing >85% success on long-horizon real-world task suite"
    },
    {
      "kind": "llm_pre_event",
      "label": "Multi-modal robotic agent demonstrates >24-hour unscripted task continuity in real-world environment",
      "source": "https://www.flowhunt.io/blog/the-decade-of-ai-agents-andrej-karpathy-agi-timeline/",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -6,
      "source_id": null,
      "confidence": 0.5,
      "expected_date": "2030-07-02",
      "research_origin": "training",
      "expected_date_range": {
"
... (truncated)