
Automating BGP Troubleshooting with Kentik AI Advisor

In this demo, we use Kentik AI Advisor to troubleshoot a real-world BGP misconfiguration that brings down a peering session with a transit provider. You’ll see how AI Advisor works both as a dedicated page and as an in-portal overlay, using natural language to identify the affected interface, correlate SNMP and syslog data, and pinpoint a maximum-prefix issue as the root cause. Then we accelerate and standardize the workflow with custom network context and AI-powered runbooks, so every engineer can troubleshoot BGP alerts like an expert.

Ep 24: Governing AI in the age of agentic systems and Model Context Protocol

On this episode of Masters of Data, we unpack David's new white paper on AI governance for agentic systems. He explains Model Context Protocol (MCP) as "APIs for agents": how AI systems talk and execute tasks. The catch? Autonomous agents are insider threats that move fast and can cause serious damage. David introduces the Model Control Plane (MoCop), a twelve-pillar framework designed to prevent your AI from going rogue. We cover his roadmap for security leaders to build real controls and telemetry. His advice: treat agents like interns with root access. Get ahead of this before your agents do.

AWS re:Invent 2025 - AI-First Incident Management in Slack

Jacky Leybman from PagerDuty and Kaninie Knight from Slack share how their integration streamlines incident response and real-time collaboration. This session highlights practical workflows and measurable gains, such as faster triage and lower MTTR, achieved by bringing on-call operations directly into Slack.

AWS re:Invent 2025 - Smarter Incident Response with Logz.io and PagerDuty

In this session, Jacky Leybman from PagerDuty and David Lotan Bolotnikoff from Logz.io showcase how the two platforms combine generative AI with rich historical context to automate root cause analysis and accelerate incident response. By correlating real-time telemetry with prior incidents and runbooks, teams reduce manual toil and MTTR while maintaining human-in-the-loop oversight and transparent reasoning.