Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Optimize Equipment with Data-Driven Analytics

We want machines in good working order, making products of superior quality. This isn’t news. But what is newsworthy is that routine maintenance can still lead to more downtime than necessary. Not all maintenance programs are created equally. Keeping capital equipment running doesn’t exist inside a vacuum of chance. Outside the fraction of unavoidable catastrophes, there’s much power in the decision-making process.

How to Tackle Spiraling Observability Costs

As today’s businesses increasingly rely on their digital services to drive revenue, the tolerance for software bugs, slow web experiences, crashed apps, and other digital service interruptions is next to zero. Developers and engineers bear the immense burden of quickly resolving production issues before they impact customer experience.

Mapping hostnames to locations with Icinga Director

Recently I came across the Maps module build and maintained by our community. The module displays host objects and annotations on openstreetmap using the JavaScript library leaflet.js. The module reads the coordinates for each host from custom variables and is able to group multiple hosts on the same location. There is already a guide on our blog that describes how you can use the module with human readable locations instead of numeric geolocations.

Grafana Tempo 2.2 release: TraceQL structural operators are here!

Get excited about Grafana Tempo 2.2! Not only is this release on time, but it is also chock full of TraceQL features and performance improvements. I was honestly a little shocked by how much we have accomplished in the last three months when summarizing the changelog.

What Is APM and How Can It Help Your Services/Applications?

APM is one of those buzzwords that is slowly becoming a necessity. Most people are still unsure what APM means and how it can help their services. But what is it? What does it stand for? And how can it help your services or digital products? This blog will answer your questions—and more.

Using Grafana and Graphite to monitor server load

Since server outages can lead to a loss of customers, reputation, and other troubles and it is important to get information on the status of the server on time. MetricFire's Hosted Grafana and Graphite will help you monitor server load in a timely and efficient manner. Servers generate a large number of metrics and it is essential to not only track their values but also to observe their changes over time. There is also a possibility to correlate app statistics with server load metrics.

Sponsored Post

3 Reasons to Prioritize Observability as part of Application Integration Strategy

Most companies in today's business landscape that deal with large amounts of data want to integrate their applications so that they can pass data between them seamlessly and easily. Being able to ensure that you can see exactly what is happening at every stage of the process is key, and this is where approaching the process with observability in mind can make a real difference. Deciding at the outset that observability is something that you want to be baked into the process means that you can plan and execute with that in mind.

Sponsored Post

Best practices for tracing and debugging microservices

Tracing and debugging microservices is one of the biggest challenges this popular software development architecture comes with - probably the most difficult one. Due to the distributed architecture, it's not as straightforward as debugging traditional monolithic applications. Instead of using direct debugging methods, you'll need to rely on logging and monitoring tools, coding practices, specific databases, and other indirect solutions to successfully debug microservices.

Sponsored Post

Best Practices for SaaS and Network Incident Management

Computer and network systems have (obviously) become vital to business operations. Occasionally, there are SaaS or network incidents and these systems do not operate as needed. Enterprises want to minimize the potential damage and get their systems back online ASAP. Integrated incident management and a strong End User Experience Management (EUEM) platform that provides synthetic and real-user monitoring is a foundation for meeting that objective.

AIOps And Supercloud Adoption

For the last couple of years, I have been highlighting yearly predictions at the beginning of each year. For 2023, I would like to expand on one of the trends that I briefly mentioned in my 2022 predictions: superclouds. Since then, there have been a lot of discussions, vendors staking their claims, and clarity around superclouds. In this article, I will look at how observability, AIOps and automation are going to be key tenets for supercloud abstraction and adoption in 2023.