234_026 · prediction · AI · AI-timing

AI rent-a-human service feature of scoring humor/visuals will be gone in months

Predictor: Alex Wissner-Gross · ep#234 "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced"

Prior probability

50.0%

Current probability

40.2%

evolves via intake + LBP

Conviction

4/5

Signal quality

Resolution

pending

Window

2026-01-01 – 2026-11-30

Edges in / out

7 / 5

Tickers exposed

Prediction text

AI rent-a-human service feature of scoring humor/visuals will be gone in months | You know, is this entertaining? Is this funny? Is this image clear? Does it have six fingers? You know, all that stuff is really really good for this service. I I think that's going to be gone in in months if it's not gone already.

Verbatim quote

From episode "Anthropic vs. The Pentagon, Claude Outpaces ChatGPT, and Consulting Gets Replaced"

You know, is this entertaining? Is this funny? Is this image clear? Does it have six fingers? You know, all that stuff is really really good for this service. I I think that's going to be gone in in months if it's not gone already.

Predictor: Alex Wissner-Gross

κ + Brier as of 2026-05-22

κ (discount)

0.844

Brier

0.0341

excellent

Hits / Misses

6 / 1

of 11 resolved

Hit rate

54.5%
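
The Brier score and hit rate shown in this card can be illustrated with a minimal sketch. It assumes the standard definition of the Brier score (mean squared error between stated probability and the 0/1 outcome); the page does not confirm the exact formula used, and the `stated` and `observed` lists are hypothetical data, not this predictor's record.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def hit_rate(hits, resolved):
    """Fraction of resolved predictions that were scored as hits."""
    return hits / resolved


# Hypothetical stated probabilities and outcomes (1 = happened, 0 = did not).
stated = [0.9, 0.8, 0.2, 0.7]
observed = [1, 1, 0, 1]

print(round(brier_score(stated, observed), 4))  # 0.045
print(f"{hit_rate(6, 11):.1%}")                 # 54.5%
```

Lower Brier scores are better; 0.0 is a perfect forecaster. The second line reproduces the displayed hit rate from 6 hits over 11 resolved predictions.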

Calibration plot (stated vs observed)

Evidence about this node from Alex Wissner-Gross is multiplied by κ in /api/intake. Lower κ means less weight; κ is floored at 0.10 (effectively silenced) and capped at 1.00 (full weight).
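
A minimal sketch of the κ discount described above, assuming the "evidence" being scaled is a log-likelihood ratio (as the `adjusted_llr` field in the evidence chain suggests). The function and constant names are hypothetical; the actual /api/intake code is not shown on this page.

```python
KAPPA_FLOOR = 0.10  # effectively silenced
KAPPA_CAP = 1.00    # full weight


def clamp_kappa(kappa):
    """Keep the discount factor inside the [floor, cap] range."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def discount_evidence(llr, kappa):
    """Scale a log-likelihood ratio by the clamped predictor discount."""
    return clamp_kappa(kappa) * llr


print(clamp_kappa(0.02))  # 0.1
print(clamp_kappa(1.7))   # 1.0
print(discount_evidence(-0.4054651081081644, 0.8438))
```

Scaling in log-odds space means κ = 1.00 passes the evidence through unchanged, while κ at the floor shrinks any single update to a tenth of its raw strength.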

Reference class

Not linked

This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.

Probability over time

6 prob_history rows

Chart legend: intake v2 · milestone miss sweep · lbp propagation · reference class assigned · legacy v1 · prior_prob (analyst seed) · current = 40.2%

Milestone chain

Pre-event signals (upstream prereqs + window checkpoints) → resolution event → downstream cascades. Status and dates update from linked nodes and are re-derived nightly via scripts/ops/derive_milestones.py.

Leading chain: 5 fired ✓ · 1 overdue ⏱ · 1 pending

2026-02-07 · hit · RentAHuman service launches February 2026 — humans rented by AI
How: RentAHuman.ai goes live with >=100K signups within 30 days; AI agents hire humans for tasks AI cannot do (validates that scoring/visuals tasks are flipping in direction)
Source: https://www.gizmochina.com/2026/02/07/humans-for-hire-this-website-lets-ai-rent-humans-for-work/ — Gizmochina launch coverage · conf 95%
2026-04-29 · hit · Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) a
2026-04-29 · hit · Training runs costing $10 billion for a single model will commence sometime in 2025.
2026-04-29 · hit · 2025 will be the definitive year that agentic systems finally hit the mainstream.
2026-04-29 · hit · Recursive self-improvement is already happening now (no longer three years out)
2026-04-29 · overdue · OSWorld leader exceeds human baseline by 10pp
How: Top OSWorld-Verified model achieves >=82% (vs 72.4% human baseline) — proves AI can score visual/UI tasks at superhuman level, eroding human-grading need
Source: https://benchlm.ai/benchmarks/osWorldVerified — Holo3 leaderboard · conf 85%
2026-04-01 → 2026-09-30 · pending · Multimodal vision/humor capability matches human-grader pass rate
How: Frontier multimodal model achieves human-grader-parity (>=95% agreement) on visual creativity / humor scoring benchmarks
Source: Stanford AI Index 2026 multimodal capability extrapolation · conf 55%
2026-07-31 · pending · AI rent-a-human service feature of scoring humor/visuals will be gone in months
2026-06-01 → 2026-12-31 · pending · Major rent-a-human platform deprecates humor/visual scoring tasks
How: RentAHuman.ai or competitor publicly removes 'humor scoring' or 'visual creativity scoring' SKU from task catalog due to AI substitution
Source: Pattern extrapolation from rapid multimodal capability gains · conf 50%
2026-09-01 → 2027-06-30 · pending · Cascade: Human-as-graders gig economy contracts >30% YoY
How: RLHF/human-grader gig labor demand (Scale AI, Surge, etc.) declines >=30% YoY measured via posted task volume
Source: Scale AI / Surge AI public hiring patterns · conf 50%
2027-06-26 · pending · Math is cooked (will be solved), physics cooked, biology char broiled.
2028-06-25 · pending · We're exiting the industrial age permanently as recursive self-improvement unfolds.
2028-09-07 · pending · By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.
2033-07-30 · pending · Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.
2033-08-10 · pending · ASI will arrive within 2 years to 5 years to this next decade

What if this resolves?

Clamp this prediction TRUE or FALSE and run a counterfactual Gibbs sample (about 30 s per run). The result surfaces the predictions whose marginals shift most under that assumption, which is useful for stress-testing "if X resolves, what else moves?"

(live posterior: 40%)

Evidence chain

Every probability update with full Bayesian provenance — chronological, latest first

LBP · 2026-05-17T02:00:01Z · 40.2% · +1.2pp

Network propagation: 39.0% → 40.2%

5-iter LBP, residual 0.00689 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e607fa96

LBP · 2026-05-10T02:00:02Z · 39.0% · +2.5pp

Network propagation: 36.5% → 39.0%

6-iter LBP, residual 0.00584 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run e5c18d29

metadata_milestone_miss_sweep · 2026-05-07T22:13:01Z · 36.5% · -7.0pp

metadata_milestone_miss_sweep bayesian_v2 n=1 inside=0.365 blend=0.365 LLR=-0.291 κ=0.84 no_blend

Raw metadata

{
  "trf": 0.6188417233545699,
  "kappa": 0.8438,
  "base_rate": null,
  "predictor": "Alex Wissner-Gross",
  "total_llr": -0.4054651081081644,
  "grace_days": 7,
  "bayesian_v2": true,
  "prior_logit": -0.2626418454937682,
  "bayes_factor": "1.3:1 against",
  "blend_reason": "no reference_class linked",
  "inside_prior": 0.43471439531203065,
  "kappa_source": "predictor_table",
  "n_milestones": 1,
  "blend_applied": false,
  "contributions": [
    {
      "llr": -0.4054651081081644,
      "kind": "llm_pre_event",
      "kappa": 0.7172299999999999,
      "label": "OSWorld leader exceeds human baseline by 10pp",
      "weight": 0.4,
      "strength": "weak",
      "confidence": 0.85,
      "source_url": "https://benchlm.ai/benchmarks/osWorldVerified",
      "adjusted_llr": -0.2908117394884187,
      "expected_date": "2026-04-29",
      "measurement_criterion": "Top OSWorld-Verified model achieves >=82% (vs 72.4% human baseline) — proves AI can score visual/UI tasks at superhuman level, eroding human-grading need"
    }
  ],
  "evidence_kind": "metadata_milestone_miss_sweep",
  "inside_source": "history_v2",
  "inside_weight": 0.566810793651801,
  "outside_weight": 0.433189206348199,
  "posterior_prob": 0.36506352268234904,
  "posterior_logit": -0.5534535849821869,
  "predictor_brier": 0.03413,
  "inside_posterior": 0.36506352268234904,
  "blended_posterior": 0.36506352268234904,
  "reference_class_id": null,
  "total_adjusted_llr": -0.2908117394884187,
  "predictor_n_resolved": 11
}
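
The logged fields above are consistent with a log-odds update, which the following sketch reproduces. It assumes that each contribution's effective κ is the predictor κ multiplied by the milestone confidence (0.8438 × 0.85 ≈ 0.71723, matching the logged value) and that the adjusted LLR is added to the prior logit. The function names are hypothetical, and the `weight` and `trf` fields are left out because the logged numbers can be reproduced without them.

```python
import math


def logit(p):
    """Probability to log-odds."""
    return math.log(p / (1 - p))


def sigmoid(x):
    """Log-odds to probability."""
    return 1 / (1 + math.exp(-x))


def update(prior, predictor_kappa, contributions):
    """Add kappa-discounted LLRs to the prior logit; return (posterior, total adjusted LLR)."""
    total_adjusted = 0.0
    for c in contributions:
        # Assumption: effective kappa = predictor kappa x milestone confidence.
        effective_kappa = predictor_kappa * c["confidence"]
        total_adjusted += c["llr"] * effective_kappa
    return sigmoid(logit(prior) + total_adjusted), total_adjusted


# Values copied from the raw metadata of the 2026-05-07 sweep.
inside_prior = 0.43471439531203065
contributions = [{"llr": -0.4054651081081644, "confidence": 0.85}]

post, adj = update(inside_prior, 0.8438, contributions)
print(f"{adj:.4f}")                       # -0.2908
print(f"{post:.4f}")                      # 0.3651
print(f"{math.exp(-adj):.1f}:1 against")  # 1.3:1 against
```

Because no reference class is linked, no outside-view blend is applied, so the inside posterior is also the final posterior for this row.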

LBP · 2026-05-03T02:00:01Z · 43.5% · -1.4pp

Network propagation: 44.8% → 43.5%

6-iter LBP, residual 0.00677 · damping 0.5, w_intrinsic 0.5 · method lbp_v3 · run 1a683ac9

LBP · 2026-04-30T16:39:51Z · 44.8% · -2.0pp

Network propagation: 46.8% → 44.8%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v2 · run 0c8a4ea3

LBP · 2026-04-30T02:18:57Z · 46.8% · -3.2pp

Network propagation: 50.0% → 46.8%

5-iter LBP, residual 0.00825 · damping 0.5, w_intrinsic 0.5 · method lbp_v1 · run 592311ef

Network propagation neighbors

Top edges sorted by latest LBP cross-impact

Top incoming (parents)

Edges that influence THIS node's belief

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
killer	TK03 AI Regulatory Moratorium (EU/US Capability Freeze)	10.0%	0.050	0.500	+0.053
killer	TK01 AGI Capability Plateau (2026-27 Training Stall)	15.0%	0.050	0.500	+0.030
prereq	SEM_042 2025 will be the definitive year that agentic systems finall — Kevin Weil	73.8%	0.500	0.050	-0.025
prereq	SEM_012 Nvidia quadrupled chip production output while only doubling — Jensen Huang	75.0%	0.500	0.050	-0.018
prereq	SEM_008 Training runs costing $10 billion for a single model will co — Dario Amodei	76.9%	0.500	0.050	-0.010
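
One possible reading of the Δ implied column is sketched below. This is an assumption, not the documented method: it marginalizes the child over a single parent with the law of total probability and subtracts the child's current probability (40.2%). The function name is hypothetical. On the two killer rows this lands within about 0.001 of the listed values; the prereq rows differ by several thousandths, so the production LBP calculation evidently involves more than this single-edge formula.

```python
def implied_child_prob(parent_prob, p_child_given_true, p_child_given_false):
    """P(child) implied by one parent via the law of total probability."""
    return parent_prob * p_child_given_true + (1 - parent_prob) * p_child_given_false


current = 0.402  # this node's current probability

# (node, parent probability, P(c|s=T), P(c|s=F)) copied from the killer rows above.
rows = [
    ("TK03", 0.10, 0.050, 0.500),
    ("TK01", 0.15, 0.050, 0.500),
]

deltas = {}
for name, parent_prob, p_true, p_false in rows:
    deltas[name] = implied_child_prob(parent_prob, p_true, p_false) - current
    print(name, f"{deltas[name]:+.4f}")
```

A positive delta means the parent alone would imply a higher probability for this node than it currently holds. For a killer edge that happens when the killer itself is judged unlikely.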

Top outgoing (children)

Predictions THIS node influences

Kind	Node	Their prob	P(c|s=T)	P(c|s=F)	Δ implied
prereq	231_013 Math is cooked (will be solved), physics cooked, biology cha — Alex Wissner-Gross	35.4%	0.620	0.050	-0.071
prereq	241_043 ASI will arrive within 2 years to 5 years to this next decad — Peter Diamandis	35.9%	0.650	0.050	-0.064
prereq	CMQ_002 By 2028, AI systems will reach 'independent researcher' leve — Sam Altman	31.4%	0.550	0.050	-0.060
prereq	235_030 Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 203 — Ray Kurzweil	39.2%	0.750	0.050	-0.057
prereq	232_055 We're exiting the industrial age permanently as recursive se — Peter Diamandis	35.5%	0.700	0.050	-0.039

Ticker exposure

33 ticker(s) linked

Beneficiaries (23)

SOUN CRWV SITM NVDA ARM GTLB BBAI TSM APLD CEVA AI MSFT MRVL SFTBY ORCL QCOM AVGO BABA AMD GOOGL IBM AMZN META

Adverse (6)

WNS CHGG CTSH IBM INFY ACN

Prerequisites (7)

Predictions that must hit first

Type	Pred	Title	Domain	Lag
prereq	238_009	Recursive self-improvement is already happening now (no longer three years out)	AI	—
prereq	SEM_008	Training runs costing $10 billion for a single model will commence sometime in 2025.	AI	—
prereq	SEM_042	2025 will be the definitive year that agentic systems finally hit the mainstream.	AI/Agents	—
prereq	SEM_012	Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) across engineering.	AI/Manufacturing	—
killer	TK14	Superbubble Pop (S&P 500 -40%, Moonshot Capital Evaporates)	—	—
killer	TK01	AGI Capability Plateau (2026-27 Training Stall)	—	—
killer	TK03	AI Regulatory Moratorium (EU/US Capability Freeze)	—	—

Dependents (5)

Predictions enabled by this

Type	Pred	Title	Domain	Lag
prereq	235_030	Ray Kurzweil predicts Longevity Escape Velocity (LEV) by 2033.	Biotech/Longevity	—
prereq	232_055	We're exiting the industrial age permanently as recursive self-improvement unfolds.	AI	—
prereq	241_043	ASI will arrive within 2 years to 5 years to this next decade	AI	—
prereq	231_013	Math is cooked (will be solved), physics cooked, biology char broiled.	AI	—
prereq	CMQ_002	By 2028, AI systems will reach 'independent researcher' level — driving autonomous scientific discoveries without human intervention.	AI	—

Linked documents (7)

Auto-generated by cosine similarity from Polymarket / Manifold / EDGAR / GDELT

Sim	Source	Title	Market prob	Polarity	Reviewed	Published
0.590	manifold	Will The Gameoverse Pilot reach 25 million views by the end of June?	37%	mentions	pending	2026-05-12
0.581	manifold	How many views will all plzdontkillus content receive in July?	—	mentions	pending	2026-05-31
0.573	polymarket	Will "In the Grey" score at least 60 on the Rotten Tomatoes Tomatometer?	8%	mentions	pending	2026-05-04
0.572	manifold	What format will 'Jet Lag: The Game - Season Nineteen' take? (Read Description)	—	mentions	pending	2026-05-07
0.561	polymarket	Will the White House Press Secretary say "CDC" or "WHO" during the next White House Press Briefing?	0%	mentions	pending	2026-05-11
0.557	manifold	Confess... something you do to hype yourself up when you need it. (Bilt Rent Free Jun 1, 2026)	—	mentions	pending	2026-05-29
0.555	manifold	The Boys Finale IMDB score ≤ 7.0 at end of May?	30%	mentions	pending	2026-05-09

Raw metadata

From Thesis_Timeline_v1.0_FINAL workbook

{
  "nia": false,
  "qty": "months",
  "url": "https://www.youtube.com/watch?v=dmtvGKuRE64",
  "mode": "PREDICTION",
  "role": "Host",
  "context": "You know, is this entertaining? Is this funny? Is this image clear? Does it have six fingers? You know, all that stuff is really really good for this service. I I think that's going to be gone in in months if it's not gone already.",
  "to_year": 2026,
  "verbatim": "You know, is this entertaining? Is this funny? Is this image clear? Does it have six fingers? You know, all that stuff is really really good for this service. I I think that's going to be gone in in months if it's not gone already.",
  "conv_cues": "going to be gone",
  "direction": "HAPPEN",
  "from_year": 2026,
  "timeframe": "Within months",
  "conv_level": "HIGH",
  "milestones": [
    {
      "kind": "llm_pre_event",
      "label": "RentAHuman service launches February 2026 — humans rented by AI",
      "source": "https://www.gizmochina.com/2026/02/07/humans-for-hire-this-website-lets-ai-rent-humans-for-work/ — Gizmochina launch coverage",
      "status": "hit",
      "weight": 0.4,
      "ordinal": -7,
      "source_id": null,
      "confidence": 0.95,
      "source_url": "https://www.gizmochina.com/2026/02/07/humans-for-hire-this-website-lets-ai-rent-humans-for-work/",
      "expected_date": "2026-02-07",
      "observed_date": "2026-02-07",
      "research_origin": "deep_research",
      "measurement_criterion": "RentAHuman.ai goes live with >=100K signups within 30 days; AI agents hire humans for tasks AI cannot do (validates that scoring/visuals tasks are flipping in direction)"
    },
    {
      "kind": "prereq",
      "label": "Nvidia quadrupled chip production output while only doubling human headcount — achieved by deploying AI coding tools (Cursor, Claude Code) a",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -6,
      "source_id": "SEM_012",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "Training runs costing $10 billion for a single model will commence sometime in 2025.",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -5,
      "source_id": "SEM_008",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "2025 will be the definitive year that agentic systems finally hit the mainstream.",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -4,
      "source_id": "SEM_042",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "prereq",
      "label": "Recursive self-improvement is already happening now (no longer three years out)",
      "status": "hit",
      "weight": 0.5,
      "ordinal": -3,
      "source_id": "238_009",
      "expected_date": "2026-04-29",
      "observed_date": "2026-04-29"
    },
    {
      "kind": "llm_pre_event",
      "label": "OSWorld leader exceeds human baseline by 10pp",
      "source": "https://benchlm.ai/benchmarks/osWorldVerified — Holo3 leaderboard",
      "status": "overdue",
      "weight": 0.4,
      "ordinal": -2,
      "source_id": null,
      "confidence": 0.85,
      "source_url": "https://benchlm.ai/benchmarks/osWorldVerified",
      "expected_date": "2026-04-29",
      "miss_emitted_at": "2026-05-07T22:13:01.009021+00:00",
      "miss_emitted_by": "metadata_milestone_sweep",
      "research_origin": "deep_research",
      "measurement_criterion": "Top OSWorld-Verified model achieves >=82% (vs 72.4% human baseline) — proves AI can score visual/UI tasks at superhuman level, eroding human-grading need"
    },
    {
      "kind": "llm_pre_event",
      "label": "Multimodal vision/humor capability matches human-grader pass rate",
      "source": "Stanford AI Index 2026 multimodal capability extrapolation",
      "status": "pending",
      "weight": 0.4,
      "ordinal": -1,
      "source_id": null,
      "confidence": 0.55,
      "
... (truncated)