What Is DevSecOps? A Guide To Secure DevOps Workflows

Security used to be something teams added at the end of a release cycle. Engineering pushed code fast. Security teams reviewed it later. But this flow only worked when the software moved slowly. Modern cloud environments broke the old security model. Containers, microservices, APIs, and infrastructure as code now change too fast for security to sit outside delivery workflows.

Lessons From The FinOps In Full Bloom Podcast: 6 Cloud Insights I Didn't Expect

Every time I step on set with a guest for FinOps In Full Bloom, I’m anticipating the lightbulb moments I know will pop up during the podcast. These are the conversations that reveal how curiosity and collaboration can spark real transformation in the cloud.

What is DEX? And Why DEX is Important

Digital Employee Experience (DEX) refers to how employees interact with the digital tools, systems, and technologies they use at work, and how those interactions affect their productivity, satisfaction, and overall work experience. DEX encompasses the quality of the digital interactions and services that employees encounter while using workplace technologies. It includes various factors such as application performance, network connectivity, device usability, and overall user satisfaction.

How Istio Ambient Mode Delivers Real World Solutions

For years, platform teams have known what a service mesh can provide: strong workload identity, authorization, mutual TLS authentication and encryption, fine-grained traffic control, and deep observability across distributed systems. In theory, Istio checked all the boxes. In practice, though, many teams hit a wall. Across industries like financial services, media, retail, and SaaS, organizations told a similar story. They wanted mTLS between services to meet regulatory or security requirements.

Transforming Symfony monolith to multi-apps: a step-by-step guide

This blog post is based on a talk by Florent Huck, Developer Advocate at Upsun, at SymfonyCon 2023. We utilized AI tools for transcription and to enhance the structure and clarity of the content. The journey from a single monolithic application to a multi-application architecture doesn't have to be daunting. At a recent developer conference, Florent from Upsun's Developer Relations team shared a practical step-by-step guide on how to refactor a monolith into multiple applications using Upsun.

The Observability Stack is Collapsing: Why Context-First Data is the Only Path to AI-Powered Root Cause Analysis

By Bill Balnave, VP of Customer Success at Mezmo

The core promise of modern observability is simple: cut Mean Time To Resolution (MTTR). Yet, despite a boom in tooling and investment over the last four years, the data tells a sobering story: our industry is actually getting worse at finding and resolving issues. Dashboards, once our trusted guide, have become the starting point for a chaotic "dashboard hunt" that rarely leads to the definitive root cause.

A Year in Internet Analysis: 2025

This year-end wrap-up covers topics from BGP security (including ASPA and excessive AS-SETs) and geopolitics (Ukraine’s IPv4 exodus, the Iran internet shutdown, and Red Sea cable cuts) to the year’s most significant outages (TikTok, the Spain/Portugal blackout, and cloud failures at AWS, Azure, and Cloudflare). Plus, we explore Starlink’s new Community Gateways, and revisit the evolving landscape of AS ranking and OTT service tracking.

Text-to-Alert: Generating Netdata Alerts from Natural Language

Netdata has an incredibly powerful alerting engine. But this can sometimes be a double-edged sword: the flexibility to build incredibly specific, intelligent alerts is immense, but mastering its syntax can feel like learning a new language. We’ve heard this from so many of you. You tell us that configuring alerts is often the steepest part of the learning curve, a task that falls to the one “Netdata expert” on the team who has spent the time digging through the documentation.

Day 2 with Cilium: Small configurations that keep large clusters boring

Operating Cilium at a small scale is straightforward. You install the Helm chart, choose a routing mode, and apply a few network policies. Day 1 is about getting packets to flow. Day 2 is about keeping them boring. At Datadog, we run Cilium across hundreds of Kubernetes clusters, tens of thousands of nodes, and hundreds of thousands of pods in multiple clouds. When operating at this scale, small configuration choices stop being minor details and start becoming risk multipliers.

Python memory profiling: Common pitfalls and how to avoid them

Continuous profiling has established itself as a core observability practice, so much so that we’ve referred to it as the fourth pillar of observability. But despite the capabilities and growing adoption of continuous profiling, it can still be confusing to approach profiling as a newcomer and correctly apply it to different troubleshooting scenarios.