Operations | Monitoring | ITSM | DevOps | Cloud

Continuous Profiling Explained: Master Performance in Production

Backend systems rarely fail in obvious ways. More often, they degrade over time. CPU usage slowly increases, request latency creeps up, and costs rise without a clear explanation. Metrics tell you something is wrong, traces show where requests go, but neither explains why your code behaves the way it does under real load. Continuous profiling fills that gap. Atatus continuous profiling runs automatically in production with minimal overhead.

Bindplane + Oodle.ai: AI-Native Observability Meets AI-Driven Telemetry Pipelines

Today, we’re excited to announce a new integration between Bindplane and Oodle.ai — combining an AI-driven, OpenTelemetry-native telemetry pipeline with an AI-native observability platform built for extreme scale. With Bindplane acting as the control plane for telemetry and Oodle.ai providing AI-powered analysis across logs, metrics, and traces, you get a single, intelligent, vendor-neutral pipeline from raw telemetry to actionable insight.

Optimizing BESS Operations: Real-Time Monitoring & Predictive Maintenance with InfluxDB 3

For IT and OT engineers managing Battery Energy Storage Systems (BESS) and other distributed energy resources (DER), the challenge isn’t just dealing with energy. It’s a data problem: managing the massive stream of real-time telemetry these systems generate. For example, a BESS site produces a constant stream of time-series data from BMS, PCS, SCADA, EMS, and more, and operating it means ingesting, correlating, and acting on that data in real time. And this challenge changes with scope.

OpAMP Explained: Why OpenTelemetry Needed an Agent Management Protocol (and How We Use It)

OpenTelemetry makes it easy to produce and transmit any type of telemetry. In production environments, this often means deploying the OpenTelemetry Collector as an intermediary to process, enrich, and route telemetry data. As systems scale, so does this infrastructure—sometimes to hundreds or thousands of Collectors spread across environments.

Deploy your Spring Boot application to production

In a previous article, we covered how easy it is to create Spring Boot containers with Rockcraft. So the next logical step is to deploy and operate your application in a production environment. The Juju ecosystem is the key to making this process straightforward. In this article we walk through the steps required to deploy a Spring Boot application to production using Juju and Kubernetes.

4 foundations you need to scale AI in engineering

As a baseline, engineering leaders need their teams to adopt AI tools to increase velocity and ship faster. Most organizations have already rolled out AI coding assistants or are evaluating them, but there's a big difference between buying a tool and successfully scaling it across an engineering organization. If you layer AI on top of a chaotic codebase or a disorganized service catalog, you accelerate the creation of legacy code.

How to Automate Tier 1 IT Tickets Without Breaking ITSM Processes

Tier 1 ticket automation is one of the most tempting (and, to be brutally honest, most mishandled) initiatives in IT service management. On paper, it seems simple: automate the high-volume requests, reduce handle time, and give your service desk some breathing room. In practice, though, many teams end up with brittle scripts and automations that quietly drift outside ITSM guardrails.

Risk Appetite, CRQ and Exposure Management: Closing the Loop on Cyber Risk

Executives today operate in a constant state of pressure. Regulatory demands grow faster than budgets, customers expect proof of resilience and every system outage becomes a business event. When each function manages risk in isolation, leaders spend more time reacting than advancing strategy. The real issue is coherence. Most organizations still rely on partial instruments: dashboards filled with red and amber, but no clarity on which risks matter or what an outage would actually cost.

Why Observability Budgets Keep Growing Even When IT Is Asked to Cut Costs

Observability is the surprising budget line that isn’t shrinking. 96% of IT leaders expect observability budgets to hold steady or grow over the next 12 months. And 62% expect those budgets to increase regardless of broader IT budget cuts. Why? Because as infrastructure becomes more distributed and harder to manage, observability has shifted from a “nice to have” to a control point for cost, performance, and risk.

Breaking the Iron Triangle: How AI-powered investigations change the economics of uptime

In engineering, there's a concept known as the Iron Triangle. With three sides (cost, quality, time), it's a framework intended to help you prioritize different aspects of project management. Want fast, high-quality features? It'll cost you. Need to keep costs down while maintaining quality? That'll take time. And if you're trying to move fast and cheap? Well, good luck with quality. For years, this has been the brutal reality of running services on the web.