The latest News and Information on DevOps, CI/CD, Automation and related technologies.

What AI Has Never Seen: The Context Gap in Code Generation

Your AI coding assistant has read the entire internet. It knows every programming language, every framework, every best practice documented in Stack Overflow answers and GitHub repositories. In seconds, it can generate a REST API handler that looks perfect: clean code, proper error handling, all the right patterns. But here’s what it has never seen: your production traffic. Data from a real API request. Someone filling out a form with malformed or incomplete data.

#052 - The "Short Long Path": Mastering Abstraction, Culture, and Kubernetes Scale with Shemer M...

In this episode, Itiel joins forces with Shemer, Director of Platform Solutions at the gaming giant Playtika, and Scott Rosenberg, Lead Architect at TeraSky, to discuss the realities of platform engineering at a massive scale. The trio dissects Playtika’s multi-year journey from a legacy, homegrown Kubespray infrastructure to a modern, holistic platform built on Spectro Cloud, all while running strictly on-premise to support 25+ games and high-volume traffic.

Your Test Data Environment: Build vs Buy - a conversation we need to have

After three decades of working with databases, one thing I’ve seen over and over is this: we don’t treat our development and test environments with the same respect we do our production systems. Not because people don’t care. Far from it. It’s usually because teams are under pressure, everyone’s juggling multiple priorities, and the quickest path forward often wins the day.

2-day vs. 4-day on-call rotations: Which one fits your team

Teams that find a weekly rotation too long and a daily rotation too short often end up choosing between 2-day and 4-day rotations. This guide compares the two across three key criteria. For each criterion, we discuss how it plays out in 2-day and 4-day rotations and recommend which to choose when. We also include a comparison table for a quick overview. Let’s dive in!

Scalable AI governance: why your policy needs a platform, not just a PDF

Most IT teams don’t lack AI policies. They lack policies that survive a Git push. In many organizations, AI governance is a paper tiger. There are comprehensive documents outlining data usage, approved models, and risk management. On an auditor's desk, these policies look complete. But inside the workflow, the reality is different. AI tools are being embedded directly into IDEs, CI pipelines, and internal automation scripts.

What mid-market IT teams wish they knew before deploying AI agents

AI agents are quickly shifting from experimentation into day-to-day operations. That shift is showing up in the data. McKinsey’s latest State of AI research highlights both broader AI use and the growing focus on “agentic AI,” even as many organizations still struggle to scale safely. For mid-market IT teams, agents can feel like the unlock: automate repetitive workflows, reduce backlog pressure, and deliver more output without expanding headcount.

Building Trust in the Machine: A Guide to Architecting Agentic AI for SRE

The promise of Artificial Intelligence in Site Reliability Engineering (SRE) is seductive: an autonomous system that never sleeps, instantly detects anomalies, and fixes broken infrastructure while humans focus on high-value work. However, the gap between a demo-ready chatbot and a production-grade Autonomous AI SRE is vast. In complex, noisy environments like Kubernetes, a “naive” implementation of Large Language Models (LLMs) is not just ineffective; it can be dangerous.

AI Tags: Why Cloud Tagging Breaks Down For AI Workloads (And What To Use Instead)

Tags have long been the backbone of cloud cost visibility and governance. They help teams understand who owns what, where spend comes from, and how infrastructure maps back to the value the business delivers. However, AI workloads have altered that model and exposed the limitations of traditional AI tags in the process. In fact, many of the most expensive AI operations don’t run on taggable cloud resources at all.

AI meets SQL Server 2025 on Ubuntu

Since 2016, when Microsoft announced its intention to make Linux a first-class citizen in its ecosystem, Canonical and Microsoft have been working hand in hand to make that vision a reality. Ubuntu was among the first distributions to support the preview of SQL Server on Linux. Ubuntu was the first distribution offered in the launch of Windows Subsystem for Linux (WSL), and it remains the default to this day. Ubuntu was also the first Linux distribution to support Azure’s Confidential VMs.