Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Reimagine All You Have Learned: APM and the Skills Gap

APM tools have historically been siloed in application development, with only the most mission-critical applications having their APM instrumentation extended into production because of complexity and cost. In the modern world of application monitoring, the requirements for Dev and Ops need to be tightly integrated.

Monitor Alcide kAudit logs with Datadog

Kubernetes audit logs contain detailed information about every request to the Kubernetes API server and are critical to detecting misconfigurations and vulnerabilities in your clusters. But because even a small Kubernetes environment can rapidly generate lots of audit logs, it’s very difficult to manually analyze them.

TL;DR InfluxDB Tech Tips - How to Extract Values, Visualize Scalars, and Perform Custom Aggregations with Flux and InfluxDB

In this post, we learn how to use the reduce(), findColumn(), and findRecord() Flux functions to perform custom aggregations with InfluxDB. This TL;DR assumes that you have either registered for an InfluxDB Cloud account – registering for a free account is the easiest way to get started with InfluxDB – or installed InfluxDB 2.0 OSS. In order to easily demonstrate how these functions work, let’s use the array.from() function to build an ad hoc table to use in the query.
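The accumulator pattern behind a reduce-style custom aggregation can be sketched outside of Flux. The following is a rough Python analogue, not Flux code: the list of dictionaries stands in for an ad hoc table such as one built with array.from(), and `step`, `find_column`, and `find_record` are hypothetical helpers that only mimic the idea of reduce(), findColumn(), and findRecord().

```python
from functools import reduce

# Ad hoc "table": each dict is one row with a _value column (illustrative data).
rows = [{"_value": 1.0}, {"_value": 2.0}, {"_value": 3.0}, {"_value": 4.0}]


def step(accumulator, r):
    """Fold one row into the running record, like a reduce function would."""
    return {
        "sum": accumulator["sum"] + r["_value"],
        "count": accumulator["count"] + 1,
    }


# The identity record is the starting accumulator before any row is seen.
identity = {"sum": 0.0, "count": 0}
totals = reduce(step, rows, identity)

# A custom aggregation derived from the accumulated record.
mean = totals["sum"] / totals["count"]


def find_column(table, column):
    """Hypothetical helper: pull one column out of the table as a plain list."""
    return [row[column] for row in table]


def find_record(table, idx):
    """Hypothetical helper: pull the row at position idx out of the table."""
    return table[idx]


values = find_column(rows, "_value")
first = find_record(rows, 0)
```

The point of the sketch is that a single pass can carry several running values at once (here a sum and a count), which is what makes reduce-style aggregation more flexible than a single built-in aggregate.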

How To Succeed When Adopting A Multi Cloud Environment

Today, the vast majority of companies work with multiple cloud providers. But moving IT operations to the cloud has significant consequences that they need to deal with. Discover how Broadcom helps customers manage critical workloads in multi-cloud environments, simplifying and accelerating the deployment of new business services.

How Automation Helps The Site Reliability Engineer

Automation has been with us for decades now, and after years of experience and experimentation we are arriving at a best practice known as site reliability engineering. Site reliability engineering seeks to manage the risk imposed by multiple agile changes in order to protect business revenues and sustain positive customer experiences.

Why Admins HATE Their Backups

Many of us hate our backup environments. That’s because backups kind of suck, even with a backup product as great as IBM Spectrum Protect. As I said in another post, it’s the thing that everyone needs, but no one cares about, and most definitely can make your life crappy. Ask any backup admin, and I know they’ll agree. Go ahead; I’ll wait. Yep, they said the same thing, didn’t they?

Find Where N+1 Database Queries Affect Your Application

One of Scout’s key features is its ability to quickly highlight N+1 queries in your application that you might not have been aware of, and then show you the exact line of code you need to look at to fix them. In this video, we will use a Ruby on Rails application as an example, but the same concepts apply to other popular web frameworks.
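For readers unfamiliar with the term, an N+1 query pattern can be illustrated with a minimal sketch. This example uses Python and an in-memory SQLite database rather than Rails, and the `authors` and `posts` tables, the `run` wrapper, and the function names are all hypothetical. It is not how Scout detects the problem; it only shows what the problem looks like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO posts VALUES (1, 1, 'First'), (2, 2, 'Second'), (3, 1, 'Third');
""")

executed = []  # every SQL statement issued through run() is recorded here


def run(sql, params=()):
    """Execute a query, record it, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the posts, then one extra query per post for its author."""
    posts = run("SELECT id, author_id, title FROM posts ORDER BY id")
    return [
        (title, run("SELECT name FROM authors WHERE id = ?", (author_id,))[0][0])
        for _, author_id, title in posts
    ]


def titles_joined():
    """The same result fetched with a single JOIN."""
    return run(
        "SELECT p.title, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id ORDER BY p.id"
    )


executed.clear()
slow = titles_n_plus_one()
n_plus_one_queries = len(executed)  # 1 query for posts + 1 per post

executed.clear()
fast = titles_joined()
joined_queries = len(executed)  # a single query
```

Both functions return the same rows, but the first issues a query count that grows with the number of posts, which is why the pattern tends to go unnoticed on small development datasets and becomes costly in production.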