
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Introducing Honeycomb Service Map: A Dynamic, Interactive, and Actionable View of Your Entire Environment

Today, we're announcing the launch of Honeycomb Service Map. This isn't your grandparent's version of a service map. This feature reimagines what you want to know or investigate when you look at a visualization of how your services communicate with one another.

HAProxy Logging Configuration Explained: How to Enable and View Log Files

HAProxy is generally the frontend layer of your application, so it plays a critical role: all traffic lands on this layer first. You need to make sure everything at this layer is working all the time, because any issue can directly impact your business. Visibility into this layer is therefore crucial. It comes from two sources: the metrics HAProxy emits and the logs it generates while handling requests.

The Roblox Outage

Just before Halloween 2021, Roblox engineers experienced a horror story: a service outage that also took down critical monitoring systems. It seemed like the issue was a hardware problem, but it wasn’t. Users were frustrated, and the clock was ticking. After three full days of downtime, service was finally restored on Halloween day. While the incident itself was an IT nightmare, Roblox’s detailed technical post-mortem several months later was an excellent way to bounce back.

Monitoring your Network SNMP devices using Hosted Graphite

When you design an architecture to monitor your digital assets, whether software applications or hardware devices, you need to use different strategies depending on your monitoring target. The factors to consider vary and include the method of retrieving monitoring data, the frequency of data collection, and how you want to surface the metrics and insights you find to stakeholders. In this article, we mainly discuss how to monitor your network SNMP devices using Hosted Graphite.

How to monitor server load

We often hear the term "load" used to describe the state of a server or a device. But what does it really mean? System load is a measure of the amount of computational work that a system performs. An overloaded system, by definition, isn't able to complete all its tasks on schedule, which affects the performance and productivity of the system. And while "load" often gets conflated with CPU usage, there's a lot more to it.
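
To make the distinction concrete, here is a minimal Python sketch of one way to read the system load average and relate it to the number of CPUs. It is an illustration, not a method taken from the article. The helper names `load_per_core` and `is_overloaded` are hypothetical, and the threshold of 1.0 per core is a common rule of thumb assumed here, not a fixed definition of overload. `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_core(load_average: float, cpu_count: int) -> float:
    """Return the load average divided by the number of logical CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def is_overloaded(load_average: float, cpu_count: int, threshold: float = 1.0) -> bool:
    """Flag a system whose per-core load exceeds the threshold.

    The default threshold of 1.0 is an assumed rule of thumb.
    """
    return load_per_core(load_average, cpu_count) > threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages (Unix only).
    one_min, five_min, fifteen_min = os.getloadavg()
    # os.cpu_count() can return None, so fall back to a single CPU.
    cpus = os.cpu_count() or 1
    for label, value in (("1 min", one_min), ("5 min", five_min), ("15 min", fifteen_min)):
        status = "overloaded" if is_overloaded(value, cpus) else "ok"
        print(f"{label}: load {value:.2f}, per core {load_per_core(value, cpus):.2f} ({status})")
```

The load average counts tasks that are running or waiting to run, and on Linux it also counts tasks blocked in uninterruptible I/O, so a high value does not necessarily mean the CPUs themselves are busy.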