Beyond Automation: The Rise of Agentic Networks

Agentic AI is the next evolution in network management, moving beyond simple automation to intelligent systems that can reason, plan, and act autonomously. Justin Ryburn, Kentik Field CTO, highlights how this shift automates expertise, enables proactive problem-solving, and empowers human engineers for strategic innovation.

Bridging the Network Cost Gap: Why Operators Need Real-Time, Traffic-Based Cost Intelligence

Jezzibell Gilmore’s latest blog dives into the critical challenge network operators face: bridging the gap between massive traffic growth and understanding its actual cost. Learn why real-time, traffic-based cost intelligence is no longer optional for maintaining margins and driving revenue in today’s complex network landscape.

Subsea Cables Parted in Red Sea Again

This past weekend saw the latest round of submarine cable cuts to impact internet connectivity between Europe and Asia. And once again they took place in the Red Sea, a historically troublesome area for subsea cables. In this post, I review some of the impacts that we observed, both the loss of transit in affected countries and the increased latencies between public cloud regions, using Kentik’s Cloud Latency Map.

Introducing Kentik Traffic Costs: Real-Time Network Cost Intelligence

Introducing Kentik Traffic Costs, an industry-first automated workflow delivering instant cost estimates for network traffic slices. Learn how this exciting new feature gives network, financial, and sales teams actionable insights to optimize spend, improve margins, and drive revenue.

Why (Enriched) Flow Data Belongs in Every Network Operator's Daily Toolbox

Flow data has always held immense potential, but was often inaccessible because it lacked context and speed. Kentik removes that friction by automatically enriching flow with human-readable context, making it a daily driver for everyone, not just specialists.

The Starlink Outage and Its Impact on Community Gateways

Last month, Starlink suffered its largest outage in years, arguably its biggest since becoming a major internet provider. In addition to the millions of individual customers around the world, the outage disconnected the Community Gateways, customers of Starlink’s new transit service. In this post, we delve into the outage and its impact on these far-flung networks.

Data Center VXLAN Overlay Visibility at Scale

VXLAN overlays bring flexibility to modern data centers, but they also hide what operators most need to see: true host-to-host and service-to-service traffic. Kentik restores that visibility by decoding VXLAN from sFlow, exposing both overlay endpoints and underlay paths in a single view without the cost and complexity of pervasive packet capture. The result is faster troubleshooting, smarter capacity planning, and confident operations at scale.

Cloudflare's DNS Downtime: Why BGP Hijacks Were Never to Blame

On July 14, Cloudflare’s popular public DNS service (known as 1.1.1.1) suffered an outage lasting over two hours. As rumors swirled about the cause, we were the first to push back on the theory that a BGP hijack had caused the outage. In fact, the hijack was a consequence, not the cause. How did we know this so early when other internet watchers did not? We explain in this post.

The Network Impact on Job Completion Time in AI Model Training

In large-scale AI model training, network performance is no longer a supporting actor — it’s center stage. Job Completion Time (JCT), the key metric for measuring training efficiency, is heavily influenced by the network interconnecting thousands of GPUs. In this post, learn why JCT matters, how microbursts and GPU synchronization delays inflate it, and how platforms like Kentik give network engineers the visibility and intelligence they need to keep training jobs on schedule.