OpenAI Codex lead predicts current coding agents will seem primitive in 10 weeks
Predictor: OpenAI Codex Lead · ep#234 "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced" · source
Prediction text
OpenAI Codex lead predicts current coding agents will seem primitive in 10 weeks | I'm beyond excited for the next 10 weeks will bring. I think the current state of coding agents will be remembered as being so primitive it'll be funny in comparison.
Watch events: OpenAI next funding round; IPO timing; revenue disclosures
Verbatim quote
I'm beyond excited for the next 10 weeks will bring. I think the current state of coding agents will be remembered as being so primitive it'll be funny in comparison.
Predictor: OpenAI Codex Lead
Evidence about this node from OpenAI Codex Lead is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
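The κ weighting described above can be sketched in a few lines. This is a minimal illustration, not the actual /api/intake code: the function names are hypothetical, and the way κ combines with a milestone's confidence is an assumption inferred from the raw metadata on this page, where predictor κ 0.5 and confidence 0.85 appear alongside a contribution κ of 0.425.

```python
KAPPA_FLOOR = 0.10  # floor stated on the page: effectively silenced
KAPPA_CAP = 1.00    # cap stated on the page: full weight


def clamp_kappa(kappa: float) -> float:
    """Keep kappa inside the [0.10, 1.00] range described above."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def adjusted_llr(llr: float, predictor_kappa: float, confidence: float = 1.0) -> float:
    """Scale a log-likelihood ratio by the predictor's kappa.

    Assumption: the effective kappa is clamp(predictor_kappa) * confidence.
    The page only says evidence is "multiplied by kappa".
    """
    effective_kappa = clamp_kappa(predictor_kappa) * confidence
    return effective_kappa * llr


# Values taken from the raw metadata for the overdue 1M-token milestone.
scaled = adjusted_llr(-0.4054651081081644, predictor_kappa=0.5, confidence=0.85)
print(scaled)  # close to the listed adjusted_llr of -0.17232267094596987
```

Under this reading, halving κ halves the pull of any single piece of evidence on the node's log-odds, which is why a κ of 0.10 is described as effectively silencing a predictor.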
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
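With no reference class linked, the update reduces to adding the adjusted log-likelihood ratio to the prior log-odds. The sketch below reproduces that arithmetic using the `prior_logit` and `total_adjusted_llr` values from this page's raw metadata. The helper names are hypothetical, and the additive log-odds form is an assumption that happens to match the listed `posterior_logit` and `posterior_prob`.

```python
import math


def sigmoid(x: float) -> float:
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def posterior_from_logit(prior_logit: float, total_adjusted_llr: float) -> float:
    """Inside-view update only: add evidence in log-odds space, then convert back."""
    return sigmoid(prior_logit + total_adjusted_llr)


# Values copied from the raw metadata on this page.
prior_logit = -0.18367569873006
total_adjusted_llr = -0.17232267094596987

posterior = posterior_from_logit(prior_logit, total_adjusted_llr)
print(round(posterior, 4))  # 0.4119, matching posterior_prob in the metadata
```

Because `blend_applied` is false, this inside-view posterior is also what the metadata reports as `blended_posterior`.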
Probability over time
Milestone chain
- 2026-04-24 · hit · OpenAI ships GPT-5.5 with agentic-coding leadership benchmarks · How: OpenAI publicly releases a new Codex model that scores >=80% on Terminal-Bench 2.0 and >=70% on Expert-SWE long-horizon benchmark · Source: OpenAI: Introducing GPT-5.5 (April 24, 2026) · conf 95% · Notes: GPT-5.5 hit 82.7% Terminal-Bench 2.0, 73.1% Expert-SWE, 84.9% GDPval — directly validates the 'primitive in 10 weeks' thesis.
- 2026-04-24 · hit · Codex gains long-horizon scheduling and self-wakeup capability · How: OpenAI Codex documentation announces ability to schedule future work and resume tasks autonomously across days/weeks · Source: OpenAI Codex: Codex for (almost) everything · conf 90%
- 2026-05-15 · overdue · GPT-5.5 1M-token context window enables full-codebase agentic refactors · How: OpenAI API ships 1M-token context for coding model and at least one published case study of agent autonomously modifying a >100k LOC repository · Source: DigitalApplied: GPT-5.5 Complete Guide — Thinking, Pro & 1M Context · conf 85%
- 2026-05-01 → 2026-08-31 · pending · Competing labs ship coding agents matching or exceeding GPT-5.5 by mid-2026 · How: At least two of (Anthropic, Google DeepMind, xAI) release a coding-specialized model with public Terminal-Bench 2.0 score >=80% within four months of GPT-5.5 · Source: LM Council Benchmarks April 2026 · conf 70%
- 2026-06-01 → 2026-09-30 · pending · Pre-2026 coding agents publicly characterized as obsolete by GPT-5.5 era developers · How: Major dev-tools blog (Cursor, GitHub, Replit, Anthropic) publishes retrospective explicitly calling 2025-era coding agents 'primitive' or equivalent · Source: Author's prediction (verbatim quote) · conf 60%
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
Evidence chain
Raw metadata
{
  "trf": 0.5051911152205567,
  "kappa": 0.5,
  "base_rate": null,
  "predictor": "OpenAI Codex Lead",
  "total_llr": -0.4054651081081644,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": -0.18367569873006,
  "bayes_factor": "1.2:1 against",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.45420973759066463,
  "kappa_source": "predictor_table",
  "n_milestones": 1,
  "blend_applied": false,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.425,
      "label": "GPT-5.5 1M-token context window enables full-codebase agentic refactors",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.85,
      "source_url": "https://www.digitalapplied.com/blog/gpt-5-5-complete-guide-thinking-pro-1m-context",
      "adjusted_llr": -0.17232267094596987,
      "expected_date": "2026-05-15",
      "measurement_criterion": "OpenAI API ships 1M-token context for coding model and at least one published case study of agent autonomously modifying a >100k LOC repository"
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.6463662193456103,
  "outside_weight": 0.35363378065438966,
  "posterior_prob": 0.41192859177871083,
  "posterior_logit": -0.3559983696760299,
  "predictor_brier": null,
  "inside_posterior": 0.41192859177871083,
  "blended_posterior": 0.41192859177871083,
  "reference_class_id": null,
  "total_adjusted_llr": -0.17232267094596987,
  "predictor_n_resolved": 0
}
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| killer | TK03 AI Regulatory Moratorium (EU/US Capability Freeze) | 10.0% | 0.050 | 0.550 | +0.088 |
| killer | TK02 AI Compute Supply Shock (TSMC/Taiwan Disruption) | 12.0% | 0.050 | 0.550 | +0.078 |
| prereq | SEM_014 Nvidia's Arizona-based TSMC factory successfully fabricated — Jensen Huang | 86.1% | 0.550 | 0.050 | +0.064 |
| killer | TK01 AGI Capability Plateau (2026-27 Training Stall) | 15.0% | 0.050 | 0.550 | +0.063 |
| prereq | SEM_011 Nvidia became the world's first $5 trillion company (late 20 — Jensen Huang | 85.5% | 0.550 | 0.050 | +0.063 |
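One possible reading of the Δ implied column is the law of total probability: take the marginal this node would have given the parent's probability and the two conditionals, then subtract the node's current posterior. This is an assumption, not a documented formula for this page, and the function names are hypothetical. The reading reproduces the three killer rows to three decimals. The prereq rows come out a few thousandths higher (about +0.069 against the listed +0.064 for SEM_014), so the page's actual computation likely differs in detail.

```python
def implied_marginal(p_parent: float, p_child_if_true: float, p_child_if_false: float) -> float:
    """P(c) = P(s) * P(c|s=T) + (1 - P(s)) * P(c|s=F)."""
    return p_parent * p_child_if_true + (1.0 - p_parent) * p_child_if_false


def implied_delta(p_parent: float, p_child_if_true: float,
                  p_child_if_false: float, current: float) -> float:
    """Gap between the edge-implied marginal and the node's current belief."""
    return implied_marginal(p_parent, p_child_if_true, p_child_if_false) - current


# posterior_prob from this node's raw metadata.
CURRENT_POSTERIOR = 0.41192859177871083

# (parent probability, P(c|s=T), P(c|s=F)) copied from the killer rows above.
killer_rows = {
    "TK03": (0.10, 0.050, 0.550),
    "TK02": (0.12, 0.050, 0.550),
    "TK01": (0.15, 0.050, 0.550),
}

for node, (p, if_true, if_false) in killer_rows.items():
    print(node, f"{implied_delta(p, if_true, if_false, CURRENT_POSTERIOR):+.3f}")
# TK03 +0.088
# TK02 +0.078
# TK01 +0.063
```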
Top outgoing (children)
Predictions THIS node influences
| Kind | Node | Their prob | P(c\|s=T) | P(c\|s=F) | Δ implied |
|---|---|---|---|---|---|
| prereq | 247_023 AI will be able to do everything a white collar worker does — Dave Blundin | 40.8% | 0.720 | 0.050 | -0.072 |
| prereq | 244_019 Peter's son won't need a driver's license in 2 years — Peter Diamandis | 48.4% | 0.920 | 0.050 | -0.064 |
| prereq | 242_031 Most large companies' business models will be disrupted in 2 — Peter Diamandis | 36.1% | 0.650 | 0.050 | -0.056 |
| prereq | 230_020 Peter's 14-year-old son Milan will never get a driver's lice — Peter Diamandis | 34.7% | 0.650 | 0.050 | -0.041 |
| prereq | 232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis | 35.5% | 0.700 | 0.050 | -0.028 |
Ticker exposure
Beneficiaries (24)
Adverse (6)
Prerequisites (10)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | SEM_011 | Nvidia became the world's first $5 trillion company (late 2025), operating a near-monopoly on advanced AI chips. | Capital Markets | — |
| prereq | SEM_027 | Nvidia Data Center revenue +66% YoY, contributing ~90% of $57B fiscal Q3 revenue; >$4.5T market cap entirely underpinned by AI silicon. | Capital Markets | — |
| prereq | SEM_014 | Nvidia's Arizona-based TSMC factory successfully fabricated cutting-edge semiconductors on US soil for first time in decades (October 2025). | Manufacturing | — |
| prereq | SEM_012 | Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) across engineering. | AI/Manufacturing | — |
| prereq | SEM_015 | Nvidia agreed to remit 15% of China chip-sale revenue directly to US government in exchange for reversing specific AI chip export bans. | Policy/Semis | — |
| killer | TK09 | Energy Grid Cap (Data Center Power Wall) | — | — |
| killer | TK05 | Rate Regime Persistence (10y > 5% through 2028) | — | — |
| killer | TK01 | AGI Capability Plateau (2026-27 Training Stall) | — | — |
| killer | TK02 | AI Compute Supply Shock (TSMC/Taiwan Disruption) | — | — |
| killer | TK03 | AI Regulatory Moratorium (EU/US Capability Freeze) | — | — |
Dependents (5)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| prereq | 244_019 | Peter's son won't need a driver's license in 2 years | Auto/Transport | — |
| prereq | 247_023 | AI will be able to do everything a white collar worker does imminently | AI | — |
| prereq | 232_055 | We're exiting the industrial age permanently as recursive self-improvement unfolds. | AI | — |
| prereq | 242_031 | Most large companies' business models will be disrupted in 2-5 years | Markets/Stocks | — |
| prereq | 230_020 | Peter's 14-year-old son Milan will never get a driver's license. | Auto/Transport | — |
Linked documents (10)
| Sim | Source | Title | Market prob | Polarity | Reviewed | Published |
|---|---|---|---|---|---|---|
| 0.664 | github_release | openai/openai-python v2.9.0 | — | mentions | pending | 2025-12-04 |
| 0.655 | manifold | which will happen first? (Codeforces Rating) | — | mentions | pending | 2026-05-24 |
| 0.654 | github_release | openai/openai-python v2.15.0 | — | mentions | pending | 2026-01-09 |
| 0.653 | github_release | openai/openai-python v2.12.0 | — | mentions | pending | 2025-12-15 |
| 0.651 | github_release | openai/openai-python v2.36.0 | — | mentions | pending | 2026-05-07 |
| 0.650 | github_release | openai/openai-python v2.23.0 | — | mentions | pending | 2026-02-24 |
| 0.648 | github_release | openai/openai-python v2.25.0 | — | mentions | pending | 2026-03-05 |
| 0.648 | github_release | openai/openai-python v2.13.0 | — | mentions | pending | 2025-12-16 |
| 0.647 | github_release | openai/openai-python v2.33.0 | — | mentions | pending | 2026-04-28 |
| 0.643 | github_release | openai/openai-python v2.16.0 | — | mentions | pending | 2026-01-27 |
Raw metadata
{
  "nia": false,
  "qty": "10 weeks",
  "url": "https://www.youtube.com/watch?v=dmtvGKuRE64",
  "mode": "CITED_PREDICTION",
  "role": "Cited-Executive",
  "context": "OpenAI codeex lead predicts rapid evolution of AI agents within 10 weeks. Quote, I'm beyond excited for the next 10 weeks will bring. I think the current state of coding agents will be remembered as being so primitive it'll be funny in comparison.",
  "to_year": 2026,
  "cited_by": "Peter Diamandis",
  "verbatim": "I'm beyond excited for the next 10 weeks will bring. I think the current state of coding agents will be remembered as being so primitive it'll be funny in comparison.",
  "conv_cues": "beyond excited; I think",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "By mid-May 2026",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "OpenAI ships GPT-5.5 with agentic-coding leadership benchmarks",
      "notes": "GPT-5.5 hit 82.7% Terminal-Bench 2.0, 73.1% Expert-SWE, 84.9% GDPval — directly validates the 'primitive in 10 weeks' thesis.",
      "source": "OpenAI: Introducing GPT-5.5 (April 24, 2026)",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -10,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://openai.com/index/introducing-gpt-5-5/",
      "expected_date": "2026-04-24",
      "observed_date": "2026-04-24",
      "research_origin": "deep_research",
      "measurement_criterion": "OpenAI publicly releases a new Codex model that scores >=80% on Terminal-Bench 2.0 and >=70% on Expert-SWE long-horizon benchmark"
    },
    {
      "kind": "prereq",
      "label": "Nvidia became the world's first $5 trillion company (late 2025), operating a near-monopoly on advanced AI chips.",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -9,
      "source_id": "SEM_011",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "Nvidia Data Center revenue +66% YoY, contributing ~90% of $57B fiscal Q3 revenue; >$4.5T market cap entirely underpinned by AI silicon.",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -8,
      "source_id": "SEM_027",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "Nvidia's Arizona-based TSMC factory successfully fabricated cutting-edge semiconductors on US soil for first time in decades (October 2025).",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -7,
      "source_id": "SEM_014",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) a",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -6,
      "source_id": "SEM_012",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "llm_pre_event",
      "label": "Codex gains long-horizon scheduling and self-wakeup capability",
      "source": "OpenAI Codex: Codex for (almost) everything",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -5,
      "source_id": null,
      "confidence": 0.9,
      "source_url": "https://openai.com/index/codex-for-almost-everything/",
      "expected_date": "2026-04-30",
      "observed_date": "2026-04-24",
      "research_origin": "deep_research",
      "measurement_criterion": "OpenAI Codex documentation announces ability to schedule future work and resume tasks autonomously across days/weeks"
    },
    {
      "kind": "llm_pre_event",
      "label": "GPT-5.5 1M-token context window enables full-codebase agentic refactors",
      "source": "DigitalApplied: GPT-5.5 Complete Guide — Thinking, Pro & 1M Context",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -4,
      "source_id": null,
"
... (truncated)