The latest News and Information on DevOps, CI/CD, Automation and related technologies.

A Practical Guide to Monitoring Ubuntu Servers

Running Ubuntu servers without proper monitoring can lead to unexpected issues. For DevOps engineers and SREs, effective tracking is crucial for maintaining system health and performance. This guide covers everything you need to know about monitoring Ubuntu servers, from the basics to advanced strategies, helping you keep your systems running smoothly, whether you manage a single server or a large fleet.

Apache Logs Explained: A Guide for Effective Troubleshooting

Apache logs are a critical tool for monitoring your web server, but they can often feel overwhelming. For DevOps teams, understanding these logs is essential for diagnosing issues and maintaining system reliability. In this guide, we'll explore the setup and analysis of Apache logs, offering practical tips to help you make sense of them and use them effectively for troubleshooting and optimization.

Easily Query Multiple Metrics in Prometheus

In monitoring setups, working with a single metric rarely tells the complete story. The real power of Prometheus lies in its ability to query multiple metrics simultaneously, creating connections between different data points that reveal the true state of your systems. This guide will walk you through everything you need to know about crafting effective multi-metric queries in Prometheus – from basic concepts to advanced techniques that will help you monitor and troubleshoot your infrastructure.

AI on constrained embedded devices with Ubuntu Core

As AI moves closer to production, running inference on embedded devices is becoming essential but remains challenging. In this session, Gabriel Aguiar Noury (Product Manager at Canonical) explores how snaps and Ubuntu Core simplify the packaging and deployment of AI models on resource-constrained devices, while maintaining security and updatability.

Getting started with Jenkins dashboards

Jenkins is an open-source automation server widely used for continuous integration and continuous delivery (CI/CD), enabling developers to automate the building, testing, and deployment of software projects. It benefits from a good visualization layer, since dashboards provide real-time visibility into pipeline performance, build statuses, test results, and deployment progress.

Kubernetes v1.33: An Insider Perspective

I was lucky enough to serve on the v1.33 Release Team as a Comms Shadow, and it was truly awe-inspiring to see the inner workings of the world’s biggest open-source project. There is a lot to cover around the structure, governance, processes, and maintenance of the Kubernetes project, but in this blog post, I want to focus on the exciting new features that v1.33 brings and what they mean for all of us. Check out the official Kubernetes release blog for more details!

How AI Is Reframing the Software Development Operation Rules

There's a revolution in the making in software development, and it's about working smarter: teams scale without lag and produce consistently high-performing systems, with artificial intelligence stepping up as a game-changing partner. This shift isn't mere automation; it's intelligence. AI is bringing sense to complexity, allowing teams to cut through noise and ship quality code, and it makes DevOps pipelines stronger than ever. Let's look at how AI is taking drudgery off the table and redefining operational excellence for the tech stack.

Eliminating Flaky Tests with Traffic Replay

Few things can derail developer productivity and undermine your pipeline like a flaky test. Testing is the backbone of a good development process, ensuring that your code is as accurate and usable as possible. When a test points to a defect, the impact can be significant. That signal rests on an assumption, however: that what the test reports is accurate.