Operations | Monitoring | ITSM | DevOps | Cloud

SentinelOne outage: July 10 incident went unacknowledged

On July 10, 2025, SentinelOne, a leading cybersecurity platform, experienced a widespread outage that disrupted access to its admin consoles across multiple regions. The incident impacted users in Europe, North America, and beyond, preventing security teams from accessing critical management features. Despite the scale of the disruption, SentinelOne issued no official public acknowledgment or status update.

Kentik Cause Analysis in 60 Seconds

In a world where network traffic can suddenly spike, manually sifting through flow data is often a daunting task. Kentik AI's new Cause Analysis simplifies troubleshooting by quickly identifying changes in traffic by application, IP, ASN, or service. With just a few clicks, Cause Analysis helps you compare time periods, understand traffic shifts, and detect changes in your network. Kentik: Take the hard work out of running your network.

Beyond AI hype: put reliability at the forefront

Reliability is a constant for every technology, whether it’s cloud, microservices, or AI. Full transcript: Just a few years ago, everybody was screaming about microservices, "That's the wave of the future," and now everybody's looking at AI. No matter what the hot topic in technology is, your reliability should still be at the forefront of everything that you're doing.

Are you running AI the smart way?

Data locality: AI models often rely on large datasets. Locating compute close to the data reduces transfer times and improves training performance.
Latency sensitivity: Real-time AI applications, like recommendation systems or edge analytics, depend on low-latency environments. This can be more easily tuned in private or hybrid setups.
Hardware specialization: Some AI workloads benefit from custom hardware like GPUs or TPUs. Private cloud allows more control over this, while public cloud offers broader access but less customization.

Is on-prem the top choice to run AI?

Subscribe. Fuel your curiosity. In this episode, we break down what we’ve learned from teams running AI at scale, and why on-premises infrastructure is making a strong comeback. We’re seeing a shift: performance, cost control, data sovereignty, and platform flexibility are driving conversations about on-prem strategies for AI. There are no one-size-fits-all answers, but if you’re building or scaling AI, this might help you think a few steps ahead.

Out-of-the-box Alerting for Frontend Observability in Grafana Cloud

Get alerted on frontend issues the moment they happen — no setup headaches required. In this short demo, Elliot Kirk from Grafana Labs introduces out-of-the-box alerting for frontend observability. Whether you're tracking error counts or web vitals, this new feature makes it easy to stay ahead of performance issues. With just a few clicks, you can:
Enable prebuilt alerts for your apps
Visualize and edit alerts directly in the UI
Customize thresholds and durations
Set up notifications and stay in the loop
Launch alerting with every new app setup

Semantic Caching: What We Measured, Why It Matters

Semantic caching promises to make AI systems faster and cheaper by reducing duplicate calls to large language models (LLMs). But what happens when it doesn’t work as expected? We built a test environment to find out, evaluating how semantically similar queries behaved against a caching layer. When the cache hit, response times were fast. When it missed, things got expensive: a single semantic cache miss increased latency by more than 2.5x.
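The idea can be sketched in a few lines: embed each query, and reuse a cached response when a new query's similarity to a cached one exceeds a threshold; otherwise fall through to the expensive LLM call. This is a minimal illustration, not the system measured above — the bag-of-words "embedding", the 0.8 threshold, and the `SemanticCache` class are all assumptions for the sketch; a real deployment would use a sentence-embedding model and a vector store.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- stands in for a real
    # sentence-embedding model, for illustration only.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # similarity required for a hit
        self.entries = []           # list of (embedding, response)

    def lookup(self, query: str):
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best and cosine(q, best[0]) >= self.threshold:
            return best[1]          # hit: cheap, fast cached response
        return None                 # miss: caller pays for the LLM call

    def store(self, query: str, response: str):
        self.entries.append((embed(query), response))

cache = SemanticCache(threshold=0.8)
cache.store("how do I reset my password", "Use the account settings page.")
print(cache.lookup("how do i reset my password"))  # near-duplicate: hit
print(cache.lookup("what is the refund policy"))   # unrelated: miss -> None
```

The threshold is the whole trade-off the article measures: set it too high and near-duplicates miss (latency spikes), too low and unrelated queries get wrong cached answers.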

Autoscaling Made Easy with Rancher Cluster API

Kubernetes has revolutionized application deployment and management. However, manually adjusting cluster sizes to meet fluctuating workloads, without constantly under- or over-provisioning resources, quickly drains platform teams’ time and energy. While traditional cloud provider autoscaling tools are functional, they often fall short when it comes to truly dynamic, Kubernetes-aware scaling, especially in a world with diverse infrastructure.