Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

AI in 2025: is it an agentic year?

2024 was the GenAI year. With new and more performant LLMs and a higher number of projects rolled out to production, adoption of GenAI doubled compared to the previous year (source: Gartner). In the same report, organizations answered that they are using AI in more than one part of their business, with 65% of respondents mentioning they use GenAI in one function.

Understanding AWS SNS Pricing: Features, Benefits, And Cost-Saving Strategies

A reliable notifications system can send highly scalable, multi-protocol messages — via email, SMS, or apps — all from one platform. For example, you can send timely cost anomaly alerts directly to your developers on Slack to alert them to potential overspending before it becomes a board meeting emergency. So, what does this have to do with Amazon SNS pricing? Let’s start at the beginning to better understand what you’re paying for when you get that AWS SNS bill.

How to Filter Docker Logs with Grep

Managing logs in Docker can quickly become overwhelming, especially when dealing with multiple containers. If you’ve ever tried to sift through a sea of log entries looking for a specific error or debugging message, you know the struggle. Fortunately, you can pipe docker logs output through grep to filter logs efficiently. This guide breaks down how to use docker logs grep it effectively, including practical examples to help you debug and monitor your containerized applications like a pro.

Ubuntu System Logs: How to Find and Use Them

System logs play a crucial role in debugging and monitoring in Ubuntu. When a service misbehaves or an unexpected crash happens, logs hold the answers. They’re also great for keeping an eye on system performance. Knowing how to access, read, and manage these logs can save you hours of troubleshooting. This guide covers everything you need to know about Ubuntu system logs—from where they’re stored to how to analyze them efficiently.

Distributed Tracing 101: Definition, Working and Implementation

Modern applications rely on microservices, making it tough to track issues across services. Distributed tracing helps by mapping a request’s journey and pinpointing latency, failures, and dependencies. Unlike traditional monitoring, tracing connects the dots between services, offering deeper visibility. But implementing it isn’t easy—it brings high data volumes, performance overhead, and complexity.

AWS CSPM Explained: How to Secure Your Cloud the Right Way

As organizations expand their AWS footprint, maintaining visibility and control over configurations can be challenging. Misconfigurations, unnoticed vulnerabilities, and compliance gaps can create serious security risks. AWS Cloud Security Posture Management (CSPM) helps teams navigate these challenges by automating security checks, ensuring compliance, and providing continuous monitoring. Here’s what you need to know about AWS CSPM and why it’s essential for securing your cloud environment.

Monitoring Kubernetes Resource Usage with kubectl top

Efficient resource utilization is key to running Kubernetes workloads smoothly. Whether you're troubleshooting performance issues, optimizing resource requests and limits, or keeping an eye on cluster health, the kubectl top command is an essential tool. It provides real-time CPU and memory usage metrics for nodes and pods, helping you make informed decisions about scaling and resource allocation.