
The latest News and Information on Containers, Kubernetes, Docker and related technologies.

How to run self-hosted AI on your own infrastructure with Konstruct

Civo Platform Engineer M R Rishi demonstrates how to go from zero to self-hosted AI in minutes using Konstruct. While most teams are stuck managing thousands of configuration values across multiple models and tools, Rishi shows how Konstruct eliminates that complexity with GPU cluster provisioning, GitOps catalog deployments, and production-ready infrastructure on day zero.

Secret Manager Integration: One Source of Truth for Humans and Agents.

Production secrets should live in one place and stay there, whether your next deployment is triggered by a developer or an AI agent. The Secret Manager integration connects AWS Secrets Manager, AWS SSM, or GCP Secret Manager to Qovery so secrets are referenced, never copied, and enterprise governance holds regardless of who deploys. Alessandro leads product at Qovery. He drives the changelog, roadmap, and product strategy - turning customer feedback into platform capabilities.
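The "referenced, never copied" idea in the item above can be sketched in a few lines. This is a minimal illustration, not Qovery's API: `SecretRef`, `resolve_env`, `FAKE_STORE`, and `fake_fetch` are hypothetical names, and an in-memory dictionary stands in for the external secret manager.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SecretRef:
    """A pointer to a secret in an external manager; it never holds the value."""
    provider: str  # illustrative label, e.g. "aws-secrets-manager"
    name: str      # the secret's name inside that provider


def resolve_env(env: Dict[str, object], fetch: Callable[[SecretRef], str]) -> Dict[str, str]:
    """Build the runtime environment, fetching referenced secrets at deploy time."""
    resolved = {}
    for key, value in env.items():
        resolved[key] = fetch(value) if isinstance(value, SecretRef) else str(value)
    return resolved


# Hypothetical in-memory stand-in for the external secret manager.
FAKE_STORE = {("aws-secrets-manager", "prod/db-password"): "s3cr3t"}


def fake_fetch(ref: SecretRef) -> str:
    """Look up the current value for a reference (assumed helper, for illustration)."""
    return FAKE_STORE[(ref.provider, ref.name)]


# The deployment config stores only the reference, so the same config applies
# whether a developer or an agent triggers the deploy.
config = {
    "DB_HOST": "db.internal",
    "DB_PASSWORD": SecretRef("aws-secrets-manager", "prod/db-password"),
}
runtime_env = resolve_env(config, fake_fetch)
```

Because the config holds a pointer rather than a value, rotating the secret in the store changes what the next deploy receives without any edit to the config itself.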

The Two-Sided Scheduling Problem: Reaching the Next Layer of Cloud Savings

You’ve deployed Karpenter or Cluster Autoscaler and tightened your resource requests, but while you saw an initial dip in your cloud bill, your savings have flatlined. Organizations that thought they had the fundamentals of cloud cost under control are now seeing stagnation. The problem isn’t that they need another FinOps tool or better visibility. The problem is that the current state of enterprise cloud cost optimization strategy is fundamentally reactive.

The Inference Paradox: How Split-Brain LLMs Are Killing Your GPU ROI

During the Toronto KCD (Kubernetes Community Days), I attended an insightful talk on AI resource optimization that highlighted a staggering Gartner study: “AI infrastructure is adding $401 billion in new spending this year alone. Yet, real-world audits tell a much darker story, revealing that average GPU utilization in the enterprise is stuck at a dismal 5%”. While many people in the audience were shocked by that number, the data didn’t come as a surprise to us.

A field guide to the agents in your cluster

You know every service in your cluster by name. You know which team owns each one, what it talks to, how it scales, where its logs go. The agents are a different story. That’s not a criticism, it’s an observation, and it’s one we keep running into. Every company we talk to is shipping agents of some kind, at scales ranging from tens to thousands. Customer service bots that field tier-one tickets. Internal copilots that draft emails, summarise meetings, and write the boring half of every PR.

Five Principles of an Accountable AI Agent Network: How to Evaluate Any Governance Platform

The first post in this series argued that AI agent governance hasn’t kept pace with deployment. The second laid out the five pillars of accountability and what they require. The third walked through why network policies, API gateways, MCP/A2A protocols, DIY security patterns, and role-based access control (RBAC) each leave critical accountability gaps. So what does good look like? The five pillars define what AI agent accountability requires.

Kubeflow MLOps tutorial: from notebook development to production inference

In this video, our engineering team takes you through a full end-to-end Kubeflow implementation, step by step – from data exploration to production inference. Follow the journey of a house price prediction use case and see how modern MLOps components work together: Kubeflow architectures and starter repositories; notebook-based development workflows; data exploration and model development; MLflow for experiment tracking; Katib for hyperparameter optimization; Kubeflow Pipelines for automated preprocessing and training; and KServe for scalable model inference.

Coding Agents Write the Code. Who Verifies It Works? We Built the Answer.

Coding agents are good at reading a spec and producing code. But producing code is one step in a longer process. The real loop is Spec -> Code -> Deploy -> Test -> Verify -> Ship. Agents stop at step two. Romaric founded Qovery to make Kubernetes accessible to every engineering team. He writes about platform strategy, developer experience, and the future of cloud infrastructure.
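The six-step loop named in the item above can be modelled as an ordered pipeline to make the gap concrete. This is only a sketch under assumptions: the stage names come from the text, while `run_loop` and the handler dictionary are hypothetical and do not represent any Qovery interface.

```python
from typing import Callable, Dict, List

# Stage order taken from the text: Spec -> Code -> Deploy -> Test -> Verify -> Ship.
STAGES: List[str] = ["spec", "code", "deploy", "test", "verify", "ship"]


def run_loop(handlers: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run stages in order and return the ones that completed.

    The loop stops at the first stage that has no handler or whose handler fails.
    """
    completed: List[str] = []
    for stage in STAGES:
        handler = handlers.get(stage)
        if handler is None or not handler():
            break
        completed.append(stage)
    return completed


# A coding agent that only reads the spec and writes code covers two stages.
agent_only = {"spec": lambda: True, "code": lambda: True}
completed = run_loop(agent_only)
remaining = STAGES[len(completed):]
```

With only the first two handlers supplied, `remaining` holds the deploy, test, verify, and ship stages that something other than the agent still has to cover.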