When the Report Cannot Tell the Story: Building Incident Programs That Capture as They Respond
Two weeks after a payments outage took a regional bank offline for ninety-three minutes, the post-incident report landed on the CIO’s desk. It ran forty pages. It named the failed service, the ticket numbers, the restoration steps, and the engineers who paged in. It did not answer the question the board had actually asked, which was why the on-call team had spent the first forty-one minutes chasing a downstream symptom rather than the upstream cause.