
Your AI App Is Lying to You - Here's How to Fix That

You shipped your AI app. But do you have all the answers? Do you actually know which model ran, how many tokens it consumed, or why it stopped? This is what LLM observability gives you, and most AI engineers are skipping it entirely. I built an SOS detection app and used OpenTelemetry to get full visibility into every single call. Token usage, model version, finish reason, and cost per call are all in one place, standardised across any provider. Check out the OpenTelemetry GenAI docs; there is a lot more you can track than you think.
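As a rough illustration of the per-call data described above, here is a minimal Python sketch that collects those fields into a flat attribute dictionary. The `gen_ai.*` keys follow the attribute names in the OpenTelemetry GenAI semantic conventions; the function name, the price table, the model name `example-model`, and the `llm.cost_usd` key are all hypothetical and not taken from the video.

```python
# Attribute names starting with "gen_ai." follow the OpenTelemetry GenAI
# semantic conventions. The price table and the "llm.cost_usd" key are
# hypothetical, used here only to illustrate a cost-per-call figure.
PRICE_PER_1K_TOKENS_USD = {"example-model": {"input": 0.001, "output": 0.002}}


def genai_span_attributes(request_model, response_model, input_tokens,
                          output_tokens, finish_reason):
    """Build a flat attribute dict describing one LLM call."""
    attrs = {
        "gen_ai.request.model": request_model,
        "gen_ai.response.model": response_model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.response.finish_reasons": [finish_reason],
    }
    # Cost is derived locally from token counts; skip it for unknown models.
    price = PRICE_PER_1K_TOKENS_USD.get(response_model)
    if price is not None:
        cost = (input_tokens * price["input"]
                + output_tokens * price["output"]) / 1000
        attrs["llm.cost_usd"] = round(cost, 6)
    return attrs
```

With the OpenTelemetry Python API installed, a dictionary like this could be passed to `span.set_attributes(...)` on the span that wraps the model call, so every provider reports the same fields.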

The Bug Hiding in Your Production Traffic

Your logs showed 500 errors. The traces showed the dependency graph. Neither showed the actual bug: a DEL control character getting appended to the query string. This is how I found it. In this video I walk through Speedscale BYOC (bring your own cloud): capture real production traffic, store it in your own Elasticsearch cluster inside your VPC, pull it down locally with a single script, and reproduce the exact bug using proxymock. The data never leaves your environment.
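To make the class of bug concrete, the following Python sketch flags control characters, including DEL (0x7F), in a URL's decoded query string. It is not how the video finds the bug (that is done with captured traffic and proxymock); the function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit, unquote


def control_chars_in_query(url):
    """Return (index, codepoint) pairs for control characters in the decoded query."""
    query = unquote(urlsplit(url).query)
    # ASCII control characters are 0x00-0x1F plus DEL (0x7F).
    return [(i, hex(ord(ch))) for i, ch in enumerate(query)
            if ord(ch) < 0x20 or ord(ch) == 0x7F]
```

Running a check like this over captured requests would surface a stray `%7F` that is invisible in most log viewers.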

Autonomous Error Remediation in Cursor with Lightrun MCP

Lightrun's Gidi Freud demonstrates how your AI coding agent can now investigate and fix production errors autonomously. Watch how Cursor, guided by Lightrun's Error Remediation skill, picks up a Sentry error, instruments the live service with a runtime snapshot, captures real evidence, and opens a validated PR for approval.

AI Dev Tools: What 100K Engineers at Google Really Taught Us

A conversation about AI developer productivity, agentic workflows, and the lessons learned running engineering tools for 100,000+ software engineers at Google. John Montgomery, CCO at GitKraken, sits down with Asim Hussain, co-founder of Alterion AI and former Google VP of Engineering Productivity, to get real about what AI actually changes for engineering teams in 2025.

Atlassian Transforms Product Development with AI

What used to take months now takes weeks, and it's changing what it means to build great products. At Atlassian, product managers and designers are using Rovo and Jira Product Discovery to move faster at every stage of the development lifecycle. From running deep research across all their tools and documents, to capturing ideas, surfacing insights, and prioritizing what to build next, AI is transforming how product decisions get made.

Federated Search | From Silos to Insight | Azure Blob Schema Discovery with Splunk's Crawler

This walk-through shows how Splunk's Cloud can discover schema and partition keys for Microsoft Azure Blob Storage datasets and create searchable Splunk-managed tables. Once the data is mapped, analysts can use Splunk Federated Search to query Azure Blob data where it lives, bringing cloud-resident logs into security, observability, and operational workflows without re-ingesting the data.

The Observability Journey: Getty Images and Cribl

I recently sat down with Simon Overbey and Lovepreet Singh, the engineering manager and systems engineer (respectively) at Getty Images, to talk about their experiences implementing Cribl. After getting a rundown of the pre-Cribl environment (described above), I asked to jump straight to the end: the net benefits. If the "before" was a terrifying tidal wave of cost and complexity, what did the "after" look like?

How to deploy Canonical Managed Kubeflow on Microsoft Azure?

Learn how to deploy Canonical Managed Kubeflow on Microsoft Azure step by step. Canonical's Managed Kubeflow on Azure gives enterprise and startup AI teams a fully operational, open source MLOps platform in under an hour. It is managed 24/7 by Canonical's engineers. This means you can focus entirely on building models rather than running infrastructure.

Massive Open Source Success: A Step-By-Step Guide | Ubuntu Summit 26.04

Not all open source projects gain traction -- but a few become movements. In this talk, Nariman, Founder of Puter, shares what actually separates the two, based on his experience of growing Puter to 40K+ stars, gaining hundreds of contributors, and over 500K installations. He breaks down how to gain momentum from a project's foundation, attract contributors, and design projects that capture the imagination.