Non-coders and engineers alike must build for 'where the models are going, not where they are today' — 'this is the worst the models will ever be'. Next-generation models will 'eat your scaffolding for breakfast'; manual software configuration, standar...
Predictor: Kevin Weil
Prediction text
Non-coders and engineers alike must build for 'where the models are going, not where they are today': 'this is the worst the models will ever be'. Next-generation models will 'eat your scaffolding for breakfast'; manual software configuration, standard API integrations, and static codebase management are entirely replaced by living system orchestrators that generate and deploy applications autonomously. The labor market for mid-level developers collapses rapidly.
Key catalyst: Next major coding-assistant capability leap
Watch events: Mid-level developer hiring statistics; agentic-dev-platform revenue
Resolution evidence
Claude Code, Cursor Agent, Lovable, Bolt demonstrate scaffolding-replacement; mid-level developer hiring slowing 2024-2026.
Calibration plot (stated vs observed)
Evidence about this node from Kevin Weil is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
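A minimal sketch of the κ weighting described above, assuming the adjustment is a plain multiplication of a log-likelihood ratio by κ after clamping κ to the stated range. The names `clamp_kappa` and `adjust_llr` are hypothetical and are not taken from `/api/intake`; the sample inputs are the Q1 check-in `llr` and `kappa` values from this page's raw metadata.

```python
KAPPA_FLOOR = 0.10  # stated floor: evidence is effectively silenced
KAPPA_CAP = 1.00    # stated cap: evidence counts at full weight


def clamp_kappa(kappa: float) -> float:
    """Restrict kappa to the [0.10, 1.00] range described on this page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def adjust_llr(llr: float, kappa: float) -> float:
    """Scale a log-likelihood ratio by the clamped kappa (assumed rule)."""
    return clamp_kappa(kappa) * llr


# Q1 window check-in values copied from the raw metadata on this page.
adjusted = adjust_llr(-0.4054651081081644, 0.6875)
```

Under this assumption, `adjusted` agrees with the `adjusted_llr` reported for that check-in (about -0.2788).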
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
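As a worked check of the unblended update, the sketch below assumes the posterior is obtained by adding the κ-adjusted log-likelihood ratios to the prior logit and then applying the logistic function. That rule is inferred from the numbers in the raw metadata (`prior_logit`, `adjusted_llr`, `posterior_prob`), not from documented behavior, and the function names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def posterior_from_logit(prior_logit: float, adjusted_llrs: list[float]):
    """Add adjusted LLRs to the prior logit; return (logit, probability)."""
    posterior_logit = prior_logit + sum(adjusted_llrs)
    return posterior_logit, sigmoid(posterior_logit)


# Values copied from the raw metadata on this page.
prior_logit = 0.5701739326829037
adjusted_llrs = [-0.278757261824363, -0.278757261824363]  # Q1 and Q2 misses

posterior_logit, posterior_prob = posterior_from_logit(prior_logit, adjusted_llrs)

# Odds ratio against the prediction implied by the summed evidence.
bayes_factor_against = math.exp(-sum(adjusted_llrs))
```

With these inputs the sketch lands on the `posterior_logit` and `posterior_prob` listed in the raw metadata (roughly 0.0127 and 0.503), and the exponentiated total is about 1.7, consistent with the reported `bayes_factor` of "1.7:1 against".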
Probability over time
Milestone chain
- 2025-08-02 (overdue): Q1 window check-in (25%)
- 2026-03-04 (overdue): Q2 window check-in (50%)
- 2026-04-23 (hit): GPT-5.5 ships with frontier agentic-coding benchmarks
  - How: OpenAI ships GPT-5.5 scoring 82.7% on Terminal-Bench 2.0, 73.1% Expert-SWE, 84.9% GDPval, validating 'this is the worst the models will ever be'
  - Source: Artificial Analysis, 'OpenAI's GPT-5.5 is the new leading AI model'
  - Confidence: 99%
  - Notes: HIT. GPT-5.5 shipped with massive jump on long-horizon agentic coding metrics.
- 2026-04-15 (hit): Gemini 3.1 Pro doubles ARC-AGI-2 over predecessor
  - How: Gemini 3.1 Pro scores 77.1% on ARC-AGI-2 (double predecessor), 78.80% on SWE-bench Verified, validating non-linear improvement curve
  - Source: LM Council Benchmarks April 2026
  - Confidence: 95%
- 2026-10-04 (pending): Q3 window check-in (75%)
- 2026-06-01 → 2027-06-30 (pending): Mid-level developer headcount declines at major tech firms
  - How: Public reporting confirms net mid-level (L4/L5 equivalent) software engineer headcount decline at >=2 of (Google, Meta, Microsoft, Amazon) attributed to AI agent productivity
  - Source: AI Forces Over 50,000 Layoffs 2025, National CIO Review
  - Confidence: 70%
- 2026-06-01 → 2027-12-31 (pending): Claude Code or equivalent crosses 50% Fortune 100 deployment
  - How: GitHub Copilot or Claude Code reports >=50% Fortune 100 enterprise deployment with autonomous multi-file refactor as primary use case
  - Source: Microsoft / Anthropic enterprise disclosures
  - Confidence: 65%
- 2026-09-01 → 2027-12-31 (pending): Hand-coded scaffolding pattern becomes obsolete in industry parlance
  - How: Stack Overflow Developer Survey or equivalent shows >=60% of devs report agent-driven workflows as primary mode, with hand-built scaffolding/boilerplate cited as legacy practice
  - Source: Trend extrapolation from vibe coding adoption + GPT-5.5 capability jump
  - Confidence: 55%
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
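The clamp-and-resample idea can be illustrated with a toy sketch. Everything here is an assumption for illustration: the three-node binary network, its weights and biases, and the function name `gibbs_marginals` are hypothetical and do not reflect the actual graph or sampler behind this page. The sketch estimates marginals with and without node 0 clamped to 1, then ranks the other nodes by how far their marginals move.

```python
import math
import random


def gibbs_marginals(weights, biases, clamp, sweeps=4000, burn_in=500, seed=0):
    """Estimate P(node = 1) for each node, holding clamped nodes fixed."""
    rng = random.Random(seed)
    n = len(biases)
    state = [clamp[i] if i in clamp else rng.randint(0, 1) for i in range(n)]
    counts = [0] * n
    for sweep in range(sweeps):
        for i in range(n):
            if i in clamp:
                continue  # clamped nodes are never resampled
            # Conditional log-odds of node i given the current neighbors.
            field = biases[i] + sum(
                weights[i][j] * state[j] for j in range(n) if j != i
            )
            p_on = 1.0 / (1.0 + math.exp(-field))
            state[i] = 1 if rng.random() < p_on else 0
        if sweep >= burn_in:
            for i in range(n):
                counts[i] += state[i]
    kept = sweeps - burn_in
    return [c / kept for c in counts]


# Toy network (hypothetical): node 0 is coupled to node 1; node 2 is isolated.
weights = [
    [0.0, 3.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
biases = [-2.0, -1.0, 0.0]

baseline = gibbs_marginals(weights, biases, clamp={})
clamped = gibbs_marginals(weights, biases, clamp={0: 1})

# Rank the unclamped nodes by absolute marginal shift, largest first.
shifts = sorted(
    ((abs(clamped[i] - baseline[i]), i) for i in range(len(biases)) if i != 0),
    reverse=True,
)
```

In this toy setup the coupled node should head the `shifts` list while the isolated node moves only by sampling noise, which is the kind of ranking the stress test is described as returning.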
Evidence chain
Raw metadata
{
"trf": 0.5549161103469683,
"kappa": 0.6875,
"base_rate": null,
"predictor": "Kevin Weil",
"total_llr": -0.8109302162163288,
"grace_days": 7,
"bayesian_v2": true,
"prior_logit": 0.5701739326829037,
"bayes_factor": "1.7:1 against",
"blend_reason": "no reference_class linked",
"inside_prior": 0.6388033082389746,
"kappa_source": "predictor_table",
"n_milestones": 2,
"blend_applied": false,
"contributions": [
{
"llr": -0.4054651081081644,
"kind": "quartile_checkpoint",
"kappa": 0.6875,
"label": "Q1 window check-in (25%)",
"weight": 0.05,
"strength": "weak",
"confidence": null,
"source_url": null,
"adjusted_llr": -0.278757261824363,
"expected_date": "2025-08-02",
"measurement_criterion": null
},
{
"llr": -0.4054651081081644,
"kind": "quartile_checkpoint",
"kappa": 0.6875,
"label": "Q2 window check-in (50%)",
"weight": 0.05,
"strength": "weak",
"confidence": null,
"source_url": null,
"adjusted_llr": -0.278757261824363,
"expected_date": "2026-03-04",
"measurement_criterion": null
}
],
"evidence_kind": "metadata_milestone_miss_sweep",
"inside_source": "history_v2",
"inside_weight": 0.6115587227571222,
"outside_weight": 0.3884412772428778,
"posterior_prob": 0.5031648099924518,
"posterior_logit": 0.012659409034177727,
"predictor_brier": 0.02,
"inside_posterior": 0.5031648099924518,
"blended_posterior": 0.5031648099924518,
"reference_class_id": null,
"total_adjusted_llr": -0.557514523648726,
"predictor_n_resolved": 3
}
Network propagation neighbors
Top incoming (parents)
Edges that influence THIS node's belief
Top outgoing (children)
Predictions THIS node influences
No outgoing edges.
Prerequisites (2)
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Validations (1)
| Observed at | Status | By | Notes |
|---|---|---|---|
| 2026-04-29 | partial | thesis_timeline_v1.0_import | Claude Code, Cursor Agent, Lovable, Bolt demonstrate scaffolding-replacement; mid-level developer hiring slowing 2024-2026. |
Linked documents (5)
| Sim | Source | Title | Market prob | Polarity | Reviewed | Published |
|---|---|---|---|---|---|---|
| 0.679 | arxiv | ProgramBench: Can Language Models Rebuild Programs From Scratch? | — | mentions | pending | 2026-05-05 |
| 0.643 | arxiv | Scaffold, Not Vocabulary? A Controlled, Two-Tier, Pre-Registered Study of a Popperian Code-Generation Skill | — | mentions | pending | 2026-06-04 |
| 0.637 | arxiv | Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution | — | mentions | pending | 2026-06-04 |
| 0.599 | github_release | facebookresearch/ProgramBench v1.0.0 | — | mentions | pending | 2026-05-05 |
| 0.580 | github_release | facebookresearch/spdl v0.1.4 | — | mentions | pending | 2025-09-10 |
Raw metadata
{
"nia": false,
"mode": "FORECAST",
"role": "Cited-Other",
"context": "Fourth Weil entry (SPC_022 AI-science 2026, ROB_002 99% code, ROB_003 25yr science in 5, AUT_019 GPU spike). Specific scaffolding-eaten coinage + mid-level developer collapse framing.",
"to_year": 2027,
"conv_cues": "coined phrasing; specific future-infrastructure framing",
"direction": "HAPPEN",
"from_year": 2025,
"timeframe": "late-2025 to 2027",
"conv_level": "HIGH",
"milestones": [
{
"kind": "quartile_checkpoint",
"label": "Q1 window check-in (25%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -8,
"source_id": null,
"expected_date": "2025-08-02",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "quartile_checkpoint",
"label": "Q2 window check-in (50%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -7,
"source_id": null,
"expected_date": "2026-03-04",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "llm_pre_event",
"label": "GPT-5.5 ships with frontier agentic-coding benchmarks",
"notes": "HIT — GPT-5.5 shipped with massive jump on long-horizon agentic coding metrics.",
"source": "Artificial Analysis — 'OpenAI's GPT-5.5 is the new leading AI model'",
"status": "hit",
"weight": 0.4,
"ordinal": -6,
"source_id": null,
"confidence": 0.99,
"source_url": "https://artificialanalysis.ai/articles/openai-gpt5-5-is-the-new-leading-AI-model",
"expected_date": "2026-04-23",
"observed_date": "2026-04-23",
"research_origin": "deep_research",
"measurement_criterion": "OpenAI ships GPT-5.5 scoring 82.7% on Terminal-Bench 2.0, 73.1% Expert-SWE, 84.9% GDPval — validating 'this is the worst the models will ever be'"
},
{
"kind": "llm_pre_event",
"label": "Gemini 3.1 Pro doubles ARC-AGI-2 over predecessor",
"source": "LM Council Benchmarks April 2026",
"status": "hit",
"weight": 0.4,
"ordinal": -5,
"source_id": null,
"confidence": 0.95,
"source_url": "https://lmcouncil.ai/benchmarks",
"expected_date": "2026-04-30",
"observed_date": "2026-04-15",
"research_origin": "deep_research",
"measurement_criterion": "Gemini 3.1 Pro scores 77.1% on ARC-AGI-2 (double predecessor), 78.80% on SWE-bench Verified, validating non-linear improvement curve"
},
{
"kind": "quartile_checkpoint",
"label": "Q3 window check-in (75%)",
"status": "pending",
"weight": 0.05,
"ordinal": -4,
"source_id": null,
"expected_date": "2026-10-04",
"observed_date": null
},
{
"kind": "llm_pre_event",
"label": "Mid-level developer headcount declines at major tech firms",
"source": "AI Forces Over 50,000 Layoffs 2025 — National CIO Review",
"status": "pending",
"weight": 0.4,
"ordinal": -3,
"source_id": null,
"confidence": 0.7,
"source_url": "https://nationalcioreview.com/articles-insights/extra-bytes/ai-forces-over-50000-layoffs-in-2025-at-leading-technology-firms/",
"expected_date": "2026-12-15",
"research_origin": "deep_research",
"expected_date_range": {
"to": "2027-06-30",
"from": "2026-06-01"
},
"measurement_criterion": "Public reporting confirms net mid-level (L4/L5 equivalent) software engineer headcount decline at >=2 of (Google, Meta, Microsoft, Amazon) attributed to AI agent productivity"
},
{
"kind": "llm_pre_event",
"label": "Claude Code or equivalent crosses 50% Fortune 100 deployment",
"source": "Microsoft / Anthropic enterprise disclosures",
"status": "pending",
"weight": 0.4,
"ordinal": -2,
... (truncated)