Hindcast Methodology — 2010–2024 Western Himalayan Replay

Catalogue Construction

The catalogue (historical_events.yaml) was assembled from four independent records to minimise systematic bias:

ICIMOD Hindu-Kush Himalaya event log — peer-reviewed reconnaissance reports for GLOFs and large mass movements.
CWC India-WRIS gauge archive — peak-discharge records at Akhnoor, Salal, Baglihar, Dul Hasti.
JKSDMA / HPSDMA winter situation reports — avalanche releases on NH-44 and NH-244.
Peer-reviewed literature — Sati & Gahalaut (2014), Singh et al. (2021), Allen et al. (2023).

Each event records:

ISO date, sub-basin, geographic location
Hazard class (GLOF / AVALANCHE / RUNOFF)
Magnitude with explicit uncertainty band
Documented fatalities (used only for severity post-classification)
Source citations

Exclusion rule: any event lacking at least two independent sources is omitted. The catalogue version is pinned in the report header for reproducibility.
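The exclusion rule can be sketched as a simple filter. This is a minimal illustration, not the project's loader: the `Event` fields, the `apply_exclusion_rule` helper, and the sample records are all hypothetical, and "independent" is approximated here by counting distinct source names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    # Hypothetical record layout; the real schema lives in historical_events.yaml.
    date: str        # ISO date
    sub_basin: str
    hazard: str      # "GLOF", "AVALANCHE" or "RUNOFF"
    sources: List[str] = field(default_factory=list)


def apply_exclusion_rule(events, min_sources=2):
    """Keep only events backed by at least `min_sources` distinct sources."""
    return [e for e in events if len(set(e.sources)) >= min_sources]


# Made-up placeholder records, used only to exercise the filter.
events = [
    Event("2010-01-01", "basin-a", "GLOF", ["source-1", "source-2"]),
    Event("2010-01-02", "basin-b", "RUNOFF", ["source-1"]),
    Event("2010-01-03", "basin-c", "AVALANCHE", ["source-3", "source-3"]),
]

kept = apply_exclusion_rule(events)
print([e.sub_basin for e in kept])
```

Counting distinct names means a source cited twice does not satisfy the rule on its own. A real check would also need to decide whether two reports that derive from the same underlying observation count as independent.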

Replay Protocol

For each event we reconstruct the forcing window [T_event − 7 d, T_event + 1 d] from archival ERA5-Land, MODIS, and Sentinel-1/2 acquisitions cached in Cloudflare R2. The simulator runs in hindcast mode with the following deviations from operational mode:

Setting	Operational	Hindcast
Forcing latency	real-time NRT	archival cached COGs
Lead-time issue	T_now → T+72 h	T_event − 72 h → T_event + 24 h
QAE shots	8192	32768 (variance reduction)
EvaluationAgent	rolling 30 d	full catalogue
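The two hindcast time windows follow directly from the event timestamp. The sketch below shows the arithmetic only. The function name `hindcast_windows` and the sample timestamp are illustrative assumptions and do not come from the simulator.

```python
from datetime import datetime, timedelta, timezone


def hindcast_windows(t_event):
    """Return (forcing_window, issue_window), each a (start, end) tuple."""
    # Forcing window: [T_event - 7 d, T_event + 1 d]
    forcing = (t_event - timedelta(days=7), t_event + timedelta(days=1))
    # Lead-time issue window: T_event - 72 h -> T_event + 24 h
    issue = (t_event - timedelta(hours=72), t_event + timedelta(hours=24))
    return forcing, issue


# Made-up timestamp, used only to show the arithmetic.
t_event = datetime(2020, 1, 10, 6, 0, tzinfo=timezone.utc)
forcing, issue = hindcast_windows(t_event)
print("forcing:", forcing[0].isoformat(), "->", forcing[1].isoformat())
print("issue:  ", issue[0].isoformat(), "->", issue[1].isoformat())
```

Both windows close at T_event + 24 h, so the forcing data always covers the whole issue window and extends four days before its start.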

The deterministic fixture used in CI (runner._replay_event) seeds the RNG from the event timestamp so test runs are bit-reproducible while preserving realistic skill statistics.
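Seeding from the event timestamp might look like the sketch below. The internals of runner._replay_event are not shown in this document, so `rng_for_event` and `sample_draws` are hypothetical helpers, and using the UTC epoch second as the seed is an assumption.

```python
import random
from datetime import datetime, timezone


def rng_for_event(t_event):
    """Build a private RNG seeded with the event's UTC epoch second (assumed scheme)."""
    seed = int(t_event.astimezone(timezone.utc).timestamp())
    return random.Random(seed)


def sample_draws(t_event, n=3):
    """Draw n pseudo-random numbers from a freshly seeded per-event RNG."""
    rng = rng_for_event(t_event)
    return [rng.random() for _ in range(n)]


# Made-up timestamps, used only to show repeatability.
t_a = datetime(2020, 1, 10, 6, 0, tzinfo=timezone.utc)
t_b = datetime(2020, 1, 11, 6, 0, tzinfo=timezone.utc)

run_1 = sample_draws(t_a)
run_2 = sample_draws(t_a)
print(run_1 == run_2)             # same event, same draws
print(sample_draws(t_b) == run_1) # different event, different draws
```

A private `random.Random` instance keeps the seed local to one event, so replaying events in a different order or in parallel does not change any single event's draws.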

Skill Metrics

Metric	Formula	Pass threshold
ROC-AUC	U / (n_pos · n_neg), with U = rank-sum of positives − n_pos (n_pos + 1) / 2	≥ 0.80
Brier score	mean (p − o)²	≤ 0.18
Lead time	T_event − T_issue	≥ 24 h mean
Severity match	predicted ≥ watch when documented ≥ watch	≥ 0.85
Magnitude bias	predicted_Q − observed_Q

Null events are synthesised from quiet periods sampled at the same sub-basins to avoid the AUC singularity that would otherwise arise from an all-positive catalogue.
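As a worked illustration of the first two metrics, the sketch below computes ROC-AUC from the Mann-Whitney rank-sum statistic and the Brier score as a mean squared error. The function names and the five sample forecasts are made up for the example; they are not the EvaluationAgent's code or catalogue data. The explicit error for a single-class sample shows why null events are required.

```python
def roc_auc(probs, outcomes):
    """ROC-AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both positive and null events")
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    ranks = [0.0] * len(probs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and probs[order[j + 1]] == probs[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, o in zip(ranks, outcomes) if o == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Made-up forecasts: two documented events (1) and three null events (0).
probs = [0.9, 0.4, 0.6, 0.2, 0.1]
outcomes = [1, 1, 0, 0, 0]

auc = roc_auc(probs, outcomes)
brier = brier_score(probs, outcomes)
print(round(auc, 4), round(brier, 4))
```

For this sample the positives hold ranks 5 and 3, so U = 8 − 3 = 5 and the AUC is 5/6; the squared errors sum to 0.78, giving a Brier score of 0.156.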

Reliability & Calibration

reliability_diagram is computed quarterly. If any populated bin deviates by > 0.15 from the diagonal, the EvaluationAgent recalibrates the QAE → severity mapping via isotonic regression and the new calibration table is committed to data/calibration/qae_severity_*.json with a CHANGELOG entry.
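The recalibration trigger and the isotonic fit can be sketched in plain Python. This is an illustration under assumptions: the ten-bin layout, the helper names, and the sample data are hypothetical, and the pool-adjacent-violators routine below is a generic isotonic regression applied to forecast/outcome pairs, not the EvaluationAgent's actual QAE → severity mapping.

```python
def reliability_bins(probs, outcomes, n_bins=10):
    """Return (mean_forecast, observed_frequency, count) for each populated bin."""
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        acc[b][0] += p
        acc[b][1] += o
        acc[b][2] += 1
    return [(sp / n, so / n, n) for sp, so, n in acc if n > 0]


def needs_recalibration(bins, tolerance=0.15):
    """True if any populated bin is further than `tolerance` from the diagonal."""
    return any(abs(obs - fc) > tolerance for fc, obs, _ in bins)


def isotonic_fit(xs, ys):
    """Pool-adjacent-violators: non-decreasing least-squares fit of ys ordered by xs.

    Simplification: tied x values are not pooled specially.
    """
    pairs = sorted(zip(xs, ys))
    blocks = []  # each block holds [sum_of_y, count]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return [x for x, _ in pairs], fitted


# Made-up forecasts: the 0.8 bin verifies only half the time.
probs = [0.1, 0.1, 0.8, 0.8, 0.8, 0.8]
outcomes = [0, 0, 1, 0, 0, 1]
bins = reliability_bins(probs, outcomes)
print(needs_recalibration(bins))

xs, fitted = isotonic_fit([0.2, 0.4, 0.6, 0.8], [0, 1, 0, 1])
print(fitted)
```

In the second example the middle pair violates monotonicity (1 followed by 0), so the two points are pooled to their mean of 0.5 while the end points are left unchanged.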

Provenance

Every hindcast report is hashed (SHA-256 over canonical JSON) and the hash is anchored to Polygon PoS via AuditAgent.anchor_artifact. The transaction hash is shown next to the AUC gauge on the Validation page.
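The hashing step can be illustrated as follows. The exact canonical form used by the project is not specified here, so the sketch assumes sorted keys, compact separators, and UTF-8 encoding; `canonical_json`, `report_hash`, and the sample report fields are hypothetical. The call to AuditAgent.anchor_artifact is deliberately left out because its signature is not documented in this section.

```python
import hashlib
import json


def canonical_json(obj):
    """Serialise to bytes with sorted keys and no insignificant whitespace (assumed form)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def report_hash(report):
    """Hex SHA-256 digest of the canonical JSON form of a report."""
    return hashlib.sha256(canonical_json(report)).hexdigest()


# Made-up reports: same content, different key order.
report_a = {"catalogue_version": "x.y.z", "auc": 0.83}
report_b = {"auc": 0.83, "catalogue_version": "x.y.z"}

print(report_hash(report_a) == report_hash(report_b))
print(len(report_hash(report_a)))
```

Canonicalising before hashing makes the digest depend on the report's content and not on key order or whitespace, which is what allows a reader to recompute the digest and compare it with the anchored value. A stricter scheme such as RFC 8785 (JSON Canonicalization Scheme) also pins down number formatting, which this sketch does not.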
