The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Facebook outage highlights need for DNS monitoring

If you were one of the billions of frustrated Facebook users who weren’t able to access their accounts Monday, rest assured that the downtime is a thing of the past and the mega-social media platform is back online. End users can now relax knowing that the brush fire has been extinguished. Remarkably, the nearly seven-hour outage could not be attributed to the deluge of recent high-profile attacks on government, enterprise, and educational servers throughout the world.

Rollbar Pro Tips: Merging Items

Item merging allows you to combine multiple items into one 'group' for easier management and more accurate metrics. All past and future occurrences of any merged items will automatically be combined. Rollbar is the leading continuous code improvement platform that proactively discovers, predicts, and remediates errors with real-time AI-assisted workflows. With Rollbar, developers continually improve their code and constantly innovate rather than spending time monitoring, investigating, and debugging.

5 things to look for in a third-party monitoring tool

A key finding from Redgate’s recent State of Database Monitoring Survey of over 2,500 IT professionals was that 79% of respondents reported using either a third-party or in-house monitoring tool. It’s notable because it’s an increase of 10 percentage points from the same survey last year – and the satisfaction rate with third-party monitoring tools also saw an increase of 18 percentage points to 86%.

Observing container environments with Cloud Operations

Did you know GKE isn’t the only place you can run containers in Google Cloud? In this episode of Engineering for Reliability, we show three options for running containers, as well as how to instrument each one for observability with Cloud Operations. Watch to learn how Cloud Operations can help visualize metrics and analyze logs emitted by container workloads running on GKE, on Cloud Run, and on an Anthos cluster!

Facebook Outage: The Case for Configuration & Change Management

In the age of cloud, digital transformation, application modernization, and the mobile economy, the network is the lifeblood behind enabling excellent customer experiences. Network Operations (NetOps) and IT Operations (ITOps) teams are constantly aware that a disruption in core network systems performance can have a massive impact on their business.

Is service catalog the modern CMDB?

SquaredUp recently launched a PowerShell tile that lets you visualize data returned from a PowerShell script. This has opened virtually infinite doors to the sources you can get data from. PowerShell can work with crazy text formats, obscure databases, and endpoints that are open on the internet. If you can access it, PowerShell can work with it. And SquaredUp lets you leverage that power so you can get the information you need and visualize it in a format that makes sense.

What is Digital Experience Monitoring?

Businesses globally have been steadily shifting to digital for at least a decade. With the coronavirus pandemic, digital transformation has now shifted into fifth gear. Digital experience is the key to business success. As of 2020, there were almost 30 billion end users connected to the internet. Digital revenue has increased dramatically, and digital will surely drive retail sales up.

10 SQL Server Performance Tuning Best Practices

There are a large number of best practices around SQL Server performance tuning – I could easily write a whole book on the topic, especially when you consider the number of different database settings, SQL Server settings, coding practices, SQL wait types, and so on that can affect performance.

A snapshot of my daily work

Today I show you a snapshot of my daily work. It is especially interesting this time, because it’s a not-so-simple problem to solve. It’s not difficult per se, but it requires a fair understanding of the Icinga Web 2 framework and how it communicates with the web server. Disclaimer: What I’m going to show is not a feature preview or anything. It’s more of a proof of concept, and it may stay that way forever and not be continued further.

Honeycomb Differentiators Series: SLOs That Tell the Whole Story

In the recent past, most engineering teams had a vague notion of what Service Level Agreements (SLAs) and Service Level Objectives (SLOs) were: mainly things that their more business-focused colleagues talked about at length during contract negotiations. The success or failure of SLAs was tallied via magic calculations (what is “available” anyway?!) at the end of the month or quarter, and adjustments were made in the form of credits or celebrations in the break room.