
Unified observability for Alibaba Cloud with Datadog

Alibaba Cloud is a major cloud provider in APAC, offering industry-leading foundational AI models in addition to compute, managed databases, object storage, and Kubernetes through its Container Service for Kubernetes (ACK). Teams choose Alibaba Cloud for its infrastructure availability across Asia Pacific and its managed services. For SREs and platform engineers, that often means running Alibaba Cloud alongside AWS, Google Cloud, or Microsoft Azure.

Deploy Datadog Kubernetes Autoscaling at scale

Every Kubernetes environment accumulates waste over time. Teams overprovision CPU and memory requests to avoid performance risk, run idle replicas to preserve headroom, and leave Horizontal Pod Autoscalers (HPAs) untouched long after workload behavior has changed. Some of this waste can be addressed at the node level, where Datadog Cluster Autoscaling helps teams rightsize capacity.

Monitor Azure Managed Redis with Datadog

Azure Managed Redis is Microsoft’s fully managed, enterprise-tier in-memory data store. It is designed for the low-latency caching, session storage, and real-time data needs of modern applications, including AI workloads that depend on fast vector and embedding lookups. Because user-facing applications often query Redis directly, even small regressions in latency, hit rate, or memory pressure can degrade the user experience.

Monitor JavaScript framework routing with Datadog RUM

Modern web applications rely on frameworks like Next.js, Vue, and Angular to handle routing and rendering. In these architectures, navigation happens within the application rather than through full page loads, which makes it difficult for traditional browser instrumentation to capture what users actually experience. As a result, teams often see misleading view names, missing navigations, and errors that are either misattributed or not captured at all, especially during hydration or lazy loading.

Instrument LangGraph agents with Datadog: a practical guide

AI agents tend to function as black boxes, and it can be difficult to trace and understand agent workflows end-to-end in order to characterize performance. By tracing full agent runs with LLM Observability, Datadog AI Agent Monitoring enables you to visualize workflows with flame graphs and quickly spot sources of failures and latency.

The importance of taking the initiative (a chat with Chris Yates) | The Simple Talk Podcast

Taking the initiative. Prioritizing relationships. Doing the work nobody else wants to do. These are just some of the elements that contributed to Chris Yates’ rise from a developer to a DBA and, eventually, a Senior Vice President. As he explains to Steve Jones, “you are the CEO of your own brand.” Also in the episode: discover Chris’ thoughts on AI, the importance of community, and the one thing he’d now do differently if he were to start from scratch.

Safe Database Change at Scale with Flyway Enterprise | The Tony and Tonie show Ep45

AI-assisted coding may speed up delivery, but it can also increase the risk around database changes. Here’s how Flyway helps teams stay in control. Tony and Tonie discuss how Flyway Enterprise helps teams build control into the database change process: immediate change visibility, continuous risk reduction, and secure, traceable deployment from commit to production.

The Compliance Gap in Test Data Management

Compliance Without Compromise: Test Data Management That Finally Fits. You know you shouldn't have sensitive production data in test environments. But every time you look at fixing it, the options feel impossible: enterprise tools that cost six figures and take months to implement, or DIY scripts that sort of work until they don't. So, it stays on the backlog.

The options within Test Data Management - Enterprise, DIY or Redgate

Compliance Without Compromise: Test Data Management That Finally Fits. You know you shouldn't have sensitive production data in test environments. But every time you look at fixing it, the options feel impossible: enterprise tools that cost six figures and take months to implement, or DIY scripts that sort of work until they don't. So, it stays on the backlog.