Balancing Data Locality, Data Sovereignty, and Data Replication

Modern distributed systems must simultaneously respect where data must live, where it should live for performance, and where it needs to live for resilience. Data sovereignty and residency requirements increasingly affect technical design decisions, not only in regulated industries, but in any global product that must navigate regional expectations, latency constraints, cost structures, and operational realities.

Datadog Data Observability enables you to detect data quality and pipeline issues early

See our latest episode of This Month in Datadog for a spotlight on Datadog Data Observability, which enables you to detect data quality and pipeline issues early and to remediate those issues with end-to-end lineage. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.

Architecting Log Management for Privacy and Scale without the Headache

As companies grow, they inevitably hit a wall: observability data explodes while privacy requirements become stricter. For years, engineers have faced a painful tradeoff—either ship petabytes of sensitive data to a central cloud (incurring egress costs and compliance risks) or manage a complex self-hosted stack that is painful to scale.

Scaling Kubernetes workloads on custom metrics

The 2025 State of Containers and Serverless report found that 64% of organizations use the Kubernetes Horizontal Pod Autoscaler (HPA) to manage Kubernetes workload capacity. But only 20% of those deployments scale on custom metrics. The other four-fifths of organizations rely on resource metrics—CPU and memory utilized by their pods—to trigger autoscaling activity.
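As background on what scaling on a metric involves, the Kubernetes documentation describes the HPA's core calculation as the current replica count multiplied by the ratio of the observed metric value to its target, rounded up. The sketch below illustrates that formula only. The function name and the queue-depth numbers are hypothetical, the 10% tolerance reflects the documented default, and the real controller also applies min/max replica bounds, stabilization windows, and pod-readiness handling that are left out here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Return the replica count suggested by the documented HPA ratio formula.

    Hypothetical helper for illustration; it is not part of any Kubernetes client.
    """
    ratio = current_value / target_value
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Round up so capacity is never under-provisioned by the calculation.
    return math.ceil(current_replicas * ratio)


# Illustrative custom metric: messages waiting per pod, with a target of 100.
print(desired_replicas(4, 150, 100))  # metric is 1.5x the target, so scale out
print(desired_replicas(4, 105, 100))  # within tolerance, so no change
```

The same arithmetic applies whether the metric is CPU utilization or a custom signal such as queue depth. Only the source of `current_value` differs.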

How to design cloud environments for AI-powered threat analysis

Cloud environments generate high volumes of security signals every day. With each one, you have to determine if it’s benign, a clear false positive, or something worth investigating. The challenge is that you must make these calls continuously, often without knowing whether any single event is part of a larger attack. Spending too much time investigating benign activity reduces your ability to detect threats elsewhere, and missing a legitimate threat has clear consequences.

Monitor your application and network load balancer logs

Load balancers are the primary entry points to distributed applications. By strategically directing the flow of incoming web traffic to specific endpoints, load balancers help optimize throughput and ensure the horizontal scalability of applications. In modern systems, load balancers often do more than their name suggests: Beyond basic load distribution, they analyze requests and route traffic based on a wide range of variables, such as client identity.

Captur: Observability-First Mobile ML Inference for Better Customer Confidence

Captur builds a mobile SDK that brings real-time image recognition and actionable feedback directly into customers’ apps, running complex machine learning models entirely on device without cloud inference. This architecture delivers privacy and performance, but also creates unique challenges when it comes to observability and debugging, especially as crashes can originate from the host app rather than the SDK itself.

Understanding Karpenter architecture for Kubernetes autoscaling

Karpenter is a fast, flexible Kubernetes autoscaler designed to improve cluster performance and cost efficiency. When the cluster doesn’t have capacity to schedule a pod, Karpenter requests additional compute from the cloud provider, specifying a right-sized instance that matches the preferences you’ve set (for example, instance family).
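To make the idea of a "right-sized instance that matches the preferences you've set" concrete, here is a simplified sketch of that kind of selection: filter a catalog by allowed instance family and by the pending pod's resource requests, then take the cheapest match. This is not Karpenter's actual algorithm or API. The `InstanceType` class, the `pick_instance` helper, and every name and price in the catalog are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class InstanceType:
    name: str
    family: str
    cpu: float           # vCPUs offered
    memory_gib: float    # memory offered, in GiB
    hourly_price: float  # made-up price, used only for ranking


def pick_instance(
    cpu_needed: float,
    memory_needed_gib: float,
    allowed_families: Set[str],
    catalog: Iterable[InstanceType],
) -> Optional[InstanceType]:
    """Return the cheapest instance that fits the request, or None if none fits."""
    candidates = [
        i for i in catalog
        if i.family in allowed_families
        and i.cpu >= cpu_needed
        and i.memory_gib >= memory_needed_gib
    ]
    return min(candidates, key=lambda i: i.hourly_price, default=None)


# Hypothetical catalog; these are not real cloud instance types.
CATALOG = [
    InstanceType("a.small", "a", 2, 4, 0.05),
    InstanceType("a.large", "a", 8, 16, 0.20),
    InstanceType("b.medium", "b", 4, 8, 0.08),
]

# A pending pod requesting 3 vCPUs and 6 GiB, with both families allowed.
choice = pick_instance(3, 6, {"a", "b"}, CATALOG)
print(choice.name if choice else "no fit")
```

Restricting `allowed_families` to `{"a"}` in this example would force the larger `a.large` instance, which shows how a family preference can trade cost for consistency.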