XPT 2022 tournament assigned a mere 2.3% probability to AI achieving gold-medal performance at the International Mathematical Olympiad by 2025; the milestone was reached, forcing a systemic re-evaluation within the forecasting community. Historical tr...
Predictor: Superforecaster Community
Prediction text
The XPT 2022 tournament assigned a mere 2.3% probability to AI achieving gold-medal performance at the International Mathematical Olympiad by 2025. The milestone was in fact reached, forcing a systemic re-evaluation within the forecasting community. Historical track-record lesson: elite human superforecasters severely underestimated near-term progress in algorithmic systems before 2023.
Key catalyst: Next AI-benchmark milestone exceeding SF prior
Watch events: Next SF-community AI-capability calibration event
Resolution evidence
Google DeepMind's AlphaProof and AlphaGeometry 2 reached IMO silver-medal standard in 2024, and its Gemini Deep Think model reached gold-medal standard in 2025; the results forced an XPT recalibration.
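The size of the miss can be made concrete with standard scoring rules. The sketch below scores the 2.3% stated probability against the realized outcome (1) using the Brier score and log loss. Both metrics and the function names are illustrative assumptions; this page does not report either value.

```python
import math


def brier_score(stated_prob: float, outcome: int) -> float:
    """Squared error between a stated probability and a 0/1 outcome."""
    return (stated_prob - outcome) ** 2


def log_loss(stated_prob: float, outcome: int) -> float:
    """Negative log-likelihood of a 0/1 outcome under the stated probability."""
    return -(outcome * math.log(stated_prob)
             + (1 - outcome) * math.log(1 - stated_prob))


# Stated probability from the prediction text; outcome 1 because the event occurred.
stated, outcome = 0.023, 1
print(f"Brier score: {brier_score(stated, outcome):.3f}")  # 0.955 (worst possible is 1.0)
print(f"Log loss:    {log_loss(stated, outcome):.3f}")     # 3.772 nats
```

For comparison, an uninformative 50% forecast would score 0.25 on the Brier scale, so a 2.3% forecast on an event that occurred is close to the maximum penalty.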
Calibration plot (stated vs observed)
Evidence about this node from Superforecaster Community is multiplied by κ in /api/intake. Lower κ = less weight; floors at 0.10 (effectively silenced) and caps at 1.00 (full weight).
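A minimal sketch of the κ weighting described above, assuming the evidence arrives as a single scalar (for example a log-odds shift). The names `clamp_kappa` and `weight_evidence` are hypothetical; the actual /api/intake implementation is not shown on this page.

```python
KAPPA_FLOOR = 0.10  # effectively silenced
KAPPA_CAP = 1.00    # full weight


def clamp_kappa(kappa: float) -> float:
    """Restrict kappa to the [0.10, 1.00] range stated on this page."""
    return max(KAPPA_FLOOR, min(KAPPA_CAP, kappa))


def weight_evidence(evidence: float, kappa: float) -> float:
    """Scale a scalar evidence value by the clamped kappa (assumed form)."""
    return clamp_kappa(kappa) * evidence


print(clamp_kappa(0.03))          # below the floor, raised to 0.1
print(clamp_kappa(1.7))           # above the cap, lowered to 1.0
print(weight_evidence(2.0, 0.5))  # 1.0
```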
Reference class
This node isn't linked to a reference class. The Bayesian update applies without outside-view blending.
Probability over time
Milestone chain
- 2023-01-30 (overdue): Q1 window check-in (25%)
- 2024-02-29 (overdue): Q2 window check-in (50%)
- 2025-03-30 (overdue): Q3 window check-in (75%)
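The three check-in dates are consistent with placing checkpoints at 25%, 50%, and 75% of a window, truncated to whole days. The sketch below assumes the window runs from 2022-01-01 (the first day of `active_start_month`) to 2026-04-29 (the final event's `expected_date`). Both endpoints and the function name `quartile_checkpoints` are assumptions; the page does not document how the dates were derived.

```python
from datetime import date, timedelta


def quartile_checkpoints(start: date, end: date) -> list[date]:
    """Dates at 25%, 50%, and 75% of the window, truncated to whole days."""
    total_days = (end - start).days
    return [start + timedelta(days=total_days * pct // 100) for pct in (25, 50, 75)]


# Assumed window endpoints (see lead-in); not confirmed by the page.
for checkpoint in quartile_checkpoints(date(2022, 1, 1), date(2026, 4, 29)):
    print(checkpoint.isoformat())
# 2023-01-30
# 2024-02-29
# 2025-03-30
```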
No downstream cascades — this prediction is a leaf in the dependency graph.
What if this resolves?
Click a button to clamp this prediction and run a Gibbs sample. Returns the predictions whose marginals shift most. ~30s per run; ideal for stress-testing "if X resolves, what else moves?"
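The clamp-and-sample idea can be sketched on a toy model. The example below assumes a pairwise binary model in which each node's conditional probability is a logistic function of its bias plus the weighted states of its neighbors. It clamps one node, runs Gibbs sampling with and without the clamp, and ranks the remaining nodes by how far their marginals move. The node names, weights, and function names are all hypothetical and are not taken from this system's graph.

```python
import math
import random


def build_neighbors(edges):
    """Turn {(a, b): weight} into a symmetric adjacency map."""
    nbrs = {}
    for (a, b), w in edges.items():
        nbrs.setdefault(a, {})[b] = w
        nbrs.setdefault(b, {})[a] = w
    return nbrs


def gibbs_marginals(bias, edges, clamp=None, sweeps=4000, burn_in=500, seed=0):
    """Estimate P(node = 1) by Gibbs sampling; clamped nodes stay fixed."""
    rng = random.Random(seed)
    clamp = dict(clamp or {})
    nbrs = build_neighbors(edges)
    state = {n: clamp.get(n, rng.randint(0, 1)) for n in bias}
    hits = {n: 0 for n in bias}
    for sweep in range(burn_in + sweeps):
        for n in bias:
            if n in clamp:
                continue
            # Conditional P(n = 1 | rest) = sigmoid(bias + weighted neighbor states).
            field = bias[n] + sum(w * state[m] for m, w in nbrs.get(n, {}).items())
            state[n] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-field)) else 0
        if sweep >= burn_in:
            for n in bias:
                hits[n] += state[n]
    return {n: hits[n] / sweeps for n in bias}


def biggest_shifts(bias, edges, clamp):
    """Rank unclamped nodes by absolute change in marginal after clamping."""
    base = gibbs_marginals(bias, edges)
    fixed = gibbs_marginals(bias, edges, clamp)
    shifts = {n: fixed[n] - base[n] for n in bias if n not in clamp}
    return sorted(shifts.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy graph: A and B are positively coupled, C is independent of both.
bias = {"A": 0.0, "B": -1.0, "C": 0.3}
edges = {("A", "B"): 2.0}
ranked = biggest_shifts(bias, edges, clamp={"A": 1})
for node, shift in ranked:
    print(f"{node}: {shift:+.3f}")
```

In this toy graph, clamping A to 1 should raise B's marginal noticeably, while C moves only by sampling noise, so B is ranked first.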
Evidence chain
Raw metadata
{
"source": "backfill_resolution_history.py",
"status": "hit",
"bayesian_v2": false,
"outcome_prob": 1,
"evidence_kind": "resolution_terminal",
"posterior_prob": 1,
"delta_to_outcome": 0,
"inside_posterior": 1,
"validation_notes": "DeepMind AlphaProof + AlphaGeometry 2 achieved IMO silver 2024 + gold 2025; empirically forced XPT recalibration.",
"validation_status": "hit",
"pre_resolution_prob": 1,
"resolution_evidence": "DeepMind AlphaProof + AlphaGeometry 2 achieved IMO silver 2024 + gold 2025; empirically forced XPT recalibration.",
"does_not_update_current_prob": true
}
Network propagation neighbors
No propagation data yet. Run inference/.venv/bin/python scripts/ops/run_loopy_belief_propagation.py on the droplet, or wait for the Sunday 02:00 UTC weekly cron.
Prerequisites (1)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| correlate | S_AGI_MID_2029 | AGI mid: Kurzweil 2029 path | agi_general_capability | — |
Dependents (0)
| Type | Pred | Title | Domain | Lag |
|---|---|---|---|---|
| No dependents | | | | |
Validations (1)
| Observed at | Status | By | Notes |
|---|---|---|---|
| 2026-04-29 | hit | thesis_timeline_v1.0_import | Google DeepMind's AlphaProof and AlphaGeometry 2 reached IMO silver-medal standard in 2024, and its Gemini Deep Think model reached gold-medal standard in 2025; the results forced an XPT recalibration. |
Linked documents (10)
Raw metadata
{
"nia": false,
"qty": "2.3% SF assigned; achieved",
"mode": "OBSERVATION",
"role": "Cited-Other",
"context": "Third Superforecaster Community entry. Critical calibration-lesson anchor. Documents the 2.3% -> achieved pattern justifying ongoing compression.",
"to_year": 2025,
"conv_cues": "empirical tournament result; specific probability",
"direction": "HAPPEN",
"from_year": 2022,
"timeframe": "2022-2025",
"conv_level": "HIGH",
"milestones": [
{
"kind": "quartile_checkpoint",
"label": "Q1 window check-in (25%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -3,
"source_id": null,
"expected_date": "2023-01-30",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "quartile_checkpoint",
"label": "Q2 window check-in (50%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -2,
"source_id": null,
"expected_date": "2024-02-29",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "quartile_checkpoint",
"label": "Q3 window check-in (75%)",
"status": "overdue",
"weight": 0.05,
"ordinal": -1,
"source_id": null,
"expected_date": "2025-03-30",
"observed_date": null,
"miss_emitted_at": "2026-05-02T22:07:21.384228+00:00",
"miss_emitted_by": "metadata_milestone_sweep"
},
{
"kind": "event",
"label": "XPT 2022 tournament assigned mere 2.3% probability to AI achieving gold-medal performance in International Mathematical Olympiad by 2025 — a",
"status": "hit",
"weight": 1,
"ordinal": 0,
"source_id": "FUT_024",
"expected_date": "2026-04-29",
"observed_date": "2026-04-29"
}
],
"repeat_eps": 1,
"affiliation": "XPT 2022 / 80000 Hours",
"attribution": "CITED",
"granularity": "YEAR_RANGE",
"resolved_at": "2026-04-29T22:23:18.376027+00:00",
"source_refs": "3",
"target_date": "2025-07-15T00:00:00",
"display_date": "2026-04-29",
"episode_date": "2026-04-22T00:00:00",
"key_catalyst": "Next AI-benchmark milestone exceeding SF prior",
"parse_method": "YEAR_RANGE observation",
"domain_bucket": "AI",
"episode_title": "The Convergence Architecture: High-Fidelity Macro-Forecasting for the 2026-2031 Global System",
"flag_repeated": false,
"in_5yr_window": false,
"source_report": "Futurist Predictions for 2026-2031.md (2026-04-22)",
"appears_in_eps": "FUT-RPT",
"futurist_phase": "Phase 1 (2026)",
"is_macro_claim": false,
"total_mentions": 1,
"priority_weight": 3,
"report_evidence": "Section: Calibrating Algorithmic Velocity.",
"active_end_month": "2025-12",
"recent_statement": "XPT 2022 + DeepMind IMO 2025 results.",
"watch_events_raw": "Next SF-community AI-capability calibration event",
"months_from_today": -9,
"probability_layer": "Higher (in-flight)",
"active_start_month": "2022-01",
"december_dispersal": {
"reason": "december_dispersal: domain=AI → 09/2025",
"new_date": "2025-09-30",
"old_date": "2025-12-31",
"applied_at": "2026-04-30T16:28:34.304992+00:00"
},
"flag_nia_bracketed": false,
"resolved_at_source": "validations_observed_at",
"track_record_grade": "A",
"track_record_notes": "Critical calibration event for SF community; forced methodology update.",
"contradicting_notes": "Observation, not forecast — 100% retrospective validation.",
"flag_near_term_2027": false,
"flag_high_conviction": false,
"milestones_derived_at": "2026-05-02T03:08:50.857001+00:00",
"reference_class_match": {
"decision": "keyword_filtered",
"computed_at": "2026-04-30T01:49:13.796883+00:00",
"best_id_unfiltered": "regulatory_freeze_window",
"best_similarity_unfiltered": 0.579
},
"validation_status_raw": "CONFIRMED",
"co