The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Kubernetes Liveness Probes: A Practical Guide

Have you ever wondered how you can help Kubernetes manage your pods in the most efficient way? Kubernetes can do a decent job “out of the box,” but it can be optimized just like any other system. One such optimization in the Kubernetes world is introducing liveness probes, and in this post, you’ll learn everything about them.

Choosing the Right AWS Messaging Service for Your Application

With the dawn of microservices and serverless, event-driven architectures have become the way to go when building a new system in the cloud. This approach has allowed for greater scalability, as the system can easily adapt and respond to changes in traffic or demand without having to overhaul the entire architecture. Additionally, the event-driven approach means your application is mainly concerned with routing event data to the right services.

How to Monitor Redis with Prometheus

The current popularity of Redis is well deserved; it’s one of the best caching engines available and it addresses numerous use cases, including distributed locking, geospatial indexing, rate limiting, and more. Redis is so widely used today that many major cloud providers, including the Big 3, offer it as one of their managed services. In this article, we’ll look at how to monitor Redis performance using Prometheus, the similarly popular open-source monitoring system.

Optimize Kubernetes workload resourcing with StormForge and Datadog

StormForge Optimize Live is a machine learning-powered performance and resource optimization solution for Kubernetes workloads. Optimize Live ingests and analyzes production observability data and recommends specific actions to optimize CPU and memory utilization. You can take these actions manually or set them to occur automatically, making it easier to maintain a high level of application performance while minimizing cloud costs.

Ping Management Pack compatibility with SCOM 2022

The Opslogix Ping Management Pack is a powerful Management Pack designed to monitor the availability and performance of network devices and services. It is a must-have Management Pack for SCOM, which is used by many IT professionals to monitor their infrastructure. With the release of SCOM 2022, many users are wondering whether the Opslogix Ping Management Pack is compatible with the new version.

The SRE Report 2023: Forecasts and the Current Economy

As questions and challenges loom over the tech industry and the larger economy, now is a perfect time for us to take a step back and learn from the past. As reliability engineers, we regularly use Service Level Objectives (SLOs) to understand the performance, reliability, and trends of our systems to help inform and prioritize our decision making.

Announcing the General Availability of Playwright Test Support

Back in October of 2022, we unveiled the beta of @playwright/test. We’re happy to announce that Playwright Test (PWT) is now generally available! We’ve worked hard to make Checkly the best way to run your Playwright tests, and we’ve also decided to make Playwright, which is experiencing a surge in usage and popularity, the default and recommended web testing framework to use with Checkly.

Webinar Recap: Taming Data Complexity at Scale

As a Senior Product Manager at Mezmo, I understand the challenges businesses face in managing data complexity and the higher costs that come with it. The explosion of data in the digital age has made it difficult for IT operations teams to control this data and deliver it across teams to serve a range of use cases, from troubleshooting issues in development to responding quickly to security threats and beyond.

Implementing Distributed Tracing in a Java application

Monitoring and troubleshooting distributed systems like those built with microservices is challenging. Traditional monitoring tools struggle with distributed systems as they were made for a single component. Distributed tracing solves this problem by tracking a transaction across components. In this article, we will implement distributed tracing for a Java Spring Boot application with three microservices.
Sponsored Post

5 Advanced DevSecOps Techniques to Try in 2023

If you're here, you know the basic DevSecOps practices like incorporating proper encryption techniques and embracing the principle of least privilege. You may be entering the realm of advanced DevSecOps maturity, where you function as a highly efficient, collaborative team, with developers embracing secure coding and automated security testing best practices.