Operations | Monitoring | ITSM | DevOps | Cloud

Get Kafka-Nated Ep 10: From MSK to Diskless Kafka w/ Kyle McCullough

Get Kafka-Nated Ep. 10, Wednesday, November 5th 2025. Guest Focus: Co-Founder & CTO at OpsHelm, former Head of Infrastructure Engineering at ProdPerfect and Lead Engineer at Vivid Seats. Kyle McCullough joins host Hugh Evans to explore what it takes to build real-time, multi-cloud streaming infrastructure at scale. As Co-Founder and CTO of OpsHelm, Kyle shares how his team processes hundreds of terabytes of cloud events daily, maintaining sub-second visibility while reducing streaming costs by 78% after migrating from MSK and NATS to Aiven Diskless Kafka.

Navigating the path from startup speed to enterprise scale | Braintrust by Cortex

(00:20) The founding journey: from zero to one hundred customers
(01:13) Day one: Building the first version of the service catalog
(04:25) Why speed is a startup's only superpower
(09:54) The mindset shift to enterprise-grade reliability and scale
(13:06) How quality becomes a competitive advantage
(14:46) High-leverage early decisions: writing tests and supporting on-prem
(17:38) Balancing speed and quality in the age of AI
(21:21) How AI will shift, not replace, engineering roles
(26:53) Advice for engineering leaders working with founders

Hyperview DCIM 5.1 Software Release

This release is all about helping you move faster, see more, and manage your infrastructure with greater ease. From real-time polling and smarter layout tools to expanded support for DC power and new visual enhancements in rack views, this update is packed with practical improvements. Plus, with French language support and key bug fixes, it’s more accessible and reliable than ever.

Hyperview DCIM 5.2 Software Release

This release focuses on giving you more control over your infrastructure connections and ensuring your monitoring tools run smoother than ever. From enhanced circuit management and expanded search capabilities to optimized data collectors and advanced Modbus support, this update delivers practical improvements that make your day-to-day operations more efficient.

Connecting the dots: Solving IT asset visibility with Dataprime

In large tech organizations, keeping track of every laptop, desktop, and endpoint is one of the IT department’s toughest challenges. Each device needs to be accounted for, properly assigned, and compliant with the organization’s policies, all while teams, offices, and contractors constantly change.

Autonomous Self-Healing Capabilities for Cloud-Native Infrastructure and Operations

Modern cloud-native infrastructure was adopted to increase agility and scale, but as it grows in scale and complexity, engineering teams are now drowning in operational noise. Industry research (The State of Observability for 2024) reveals that 88% of technology leaders report rising stack complexity, while 81% say manual troubleshooting actively detracts from innovation.

How to Visualize Time Series Data with InfluxDB 3 & Apache Superset

Learn how to visualize time series data from InfluxDB 3 Core using the popular open source tool Apache Superset. This tutorial walks you through setting up both systems with Docker, writing sample IoT data, and creating your first visualization. For more information about Apache Superset, this article may be helpful.

What's New in Calico - Fall 2025 Release

As organizations scale Kubernetes and hybrid infrastructures, many are realizing that more tools don’t mean better security. A recent Microsoft report found that organizations with 16+ point solutions see 2.8x more data security incidents than those with fewer tools. Yet platform teams are still expected to deliver resilience and performance across containers, VMs, and bare metal, often while juggling fragmented tools that introduce risk, downtime, and complexity.

How Prometheus Exporters Work With OpenTelemetry

Running distributed systems means you need clear visibility into how your services behave. Prometheus has been the standard for metrics for a long time, and OpenTelemetry is now giving teams a more consistent way to collect telemetry across their stack. In many setups, you'll have both: existing Prometheus instrumentation that's already in place, and new components instrumented with OpenTelemetry.