#055 - From Enterprise Java to Kubernetes and AI-Driven Infrastructure with Dan Hicks (Boomi)

Dan breaks down the fundamental similarities and stark differences between application development and platform engineering. He shares the unexpected hurdles he faced during his transition, from complex networking and CoreDNS latency to the harsh realities exposed by chaos testing in cloud environments.

Telemetry Talks ep 3: OpenTelemetry with VictoriaMetrics observability signals

In this episode of Telemetry Talks, we explore the three OpenTelemetry observability signals (metrics, logs, and traces) and how VictoriaMetrics handles each of them with high performance, cost efficiency, and seamless integration. We briefly explain what each signal is, discuss common misconceptions, and share guidance on which signal to start with if you're new to observability. Together with our guests, both engineers at VictoriaMetrics, we walk through integrating VictoriaMetrics with the OpenTelemetry demo, showcase Grafana dashboards, and check the playgrounds for all three signals to see them in action.

Node Groups: Organize Your Infrastructure Into Reusable Views

When you’re managing a handful of nodes, the flat list in the nodes tab works fine. When you’re managing hundreds or thousands, it becomes a wall of hostnames. You end up applying the same filters repeatedly: all the production database servers, all the nodes in eu-west, all the Kubernetes workers in the staging cluster. The filters work, but they don’t persist, and there’s no way to share them with the rest of your team. Node groups solve this.

Unified Logging for a Single Source of Truth

In Star Trek, the Borg are a cybernetic alien organism that forcibly assimilates other beings and technologies into its hivemind, called “The Collective.” Each assimilated being or technology becomes part of the unified consciousness, with the villainous Borg Queen as its leader. As the only independent thinker, the Borg Queen leads this rapidly adapting Collective.

The reality check: why manual debugging setups are a hidden factory

The first 70% of a debugging cycle is usually spent on "plumbing", the undocumented toil of syncing databases, matching service versions, and aligning networking to mimic a production failure. This manual setup is a hidden factory that consumes senior engineering capacity and delays recovery. True velocity is found by eliminating the infrastructure variables that make bugs hard to reproduce.

Agno Monitoring & Observability with OpenTelemetry and SigNoz

Learn how to implement end-to-end monitoring and observability for Agno-based AI systems using OpenTelemetry and SigNoz. In this video, we walk through instrumenting your Agno workflows, collecting traces, metrics, and logs, and visualizing everything in SigNoz to gain real-time visibility into performance, failures, and bottlenecks. You'll see how to move from basic logging to production-grade observability—so you can debug faster, optimize latency, and confidently run AI systems at scale.

Secure and Compliant DevOps in an AI-Enabled World

Is Your DevOps Strategy Ready for the AI Era? AI is accelerating modern software delivery—but it’s also raising the stakes for security, compliance, and auditability. As AI-driven change increases, many organizations are discovering that incomplete DevOps practices are creating new risk. Based on insights from 800+ global IT professionals, the 2026 State of DevOps Report reveals why vendor‑backed, enterprise‑grade DevOps platforms are becoming critical for managing AI‑driven risk and meeting evolving regulatory demands.

How to Measure & Improve Engineering Ops (with Cortex)

Is your engineering org actually getting better, or just shipping more? In this overview, we dive into how leadership and platform teams use Cortex to move beyond manual audits and spreadsheets. Learn how to transform "tribal knowledge" into a data-driven culture of engineering excellence by centralizing visibility and automating operational standards. Key highlights include Mission Control: using the Service Catalog to map dependencies and ownership without the Slack-pinging or Wiki-hunting.

2026 CMA investigation: What it means for the cloud industry

The UK’s Competition and Markets Authority (CMA) has set out its latest actions under the Digital Markets Competition Regime (DMCR), following its multi-year Cloud Services Market Investigation. While the regulator has since expanded its focus into business software ecosystems, we must not lose sight of the core issue: the entrenched dominance within the UK's cloud infrastructure.

Introducing kosli evaluate: Rego Policy Evaluation for Your Compliance Data

If you’re evaluating compliance controls against your Kosli trail data today, there’s a good chance you’ve written some glue code to make it work. A script that pulls trail data from the API. Another that downloads attestations one by one. Something that mangles the JSON into a shape that your chosen compliance engine can evaluate. And then there is the engine itself, whether it’s OPA, a custom Python script, or something else, which has to be installed and configured in your pipeline.