How to filter metrics by label?

It is easy to get lost in the mountain of metrics and the near-infinite number of dimensions when working with an infrastructure monitoring tool. Being able to filter metrics by label, and to visualize only what is relevant to the current scope of monitoring and troubleshooting, is crucial to the success of SREs, sysadmins, and DevOps professionals.
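As a rough illustration of the idea, the sketch below filters a handful of in-memory metric samples by label. The `Metric` class, the `filter_by_labels` helper, and the sample names and labels are all hypothetical and not taken from any particular monitoring tool, which would normally expose its own query language for this.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Metric:
    """Hypothetical metric sample: a name, a set of labels, and a value."""
    name: str
    labels: Dict[str, str]
    value: float


def filter_by_labels(metrics: List[Metric], **wanted: str) -> List[Metric]:
    """Keep only the metrics whose labels match every key/value in `wanted`."""
    return [
        m for m in metrics
        if all(m.labels.get(key) == val for key, val in wanted.items())
    ]


# Illustrative data only.
metrics = [
    Metric("cpu_usage", {"env": "prod", "host": "web-1"}, 0.71),
    Metric("cpu_usage", {"env": "staging", "host": "web-2"}, 0.35),
    Metric("mem_usage", {"env": "prod", "host": "web-1"}, 0.52),
]

prod_metrics = filter_by_labels(metrics, env="prod")
prod_web1_cpu = [m for m in filter_by_labels(metrics, env="prod", host="web-1")
                 if m.name == "cpu_usage"]
```

Passing more than one label narrows the result further, since a metric must match every label given.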

So You've Troubleshooted the Alert. Now What?

Welcome to the companion post to So You Received an Alert. Now What? Last time, we broke down the process between receiving the Uptime.com check alert and figuring out what broke. Today, we’re going to show you how to communicate your efforts so that everyone (your end users, coworkers, and bosses) knows what’s going on. Your first step is to update your Status Page, your central hub for incident management and communication.

Tutorial: How to Use ChaosSearch with Grafana for Observability

In my last blog post, Building a Cost-Effective Full Observability Solution Around Open APIs and CNCF Projects, I introduced using ChaosSearch in combination with the most popular open source front ends and back ends in the application observability space. In case you missed it, the TL;DR version is that you can use a variety of open source projects and open API-based components to build the best-of-breed observability stack of your choice rather than relying on expensive, all-in-one solutions.

Understanding the Different IT Security Certifications

Data security is more important than ever. High-profile cyber attacks in 2021, like the Colonial Pipeline Breach, caused major services to grind to a standstill. Ransomware is still on the rise, and there’s a fear that cybercriminals have the ability to break through 93% of company networks.

External Services Monitoring for Python

Python web applications are taking over more and more of the internet (source). However, with great Pythonic power comes great responsibility — ensuring that your web applications consistently deliver in terms of performance and reliability. It is one thing to build and ship an application and another to continually monitor and maintain it on the internet.

Five Reasons Why Python Is Popular

One of my first projects as a consultant was to create a web application for a small tax software company in Omaha, Nebraska. They were looking to improve their online presence by offering customers the ability to automatically obtain the license for the application. Their website would allow the customer, potentially within minutes, to gain access to their software. They hired me to develop a process with an interface to their existing system to generate a license code, store it somewhere, and then email it.

What Is a Firewall?

A firewall is a cybersecurity tool used to prevent unauthorized access to your private device or network. The term can refer to any software or hardware that checks the data and traffic coming into and going out of a network to ensure they comply with cybersecurity rules. Firewalls can also include what is known as an intrusion prevention system (IPS), which additionally blocks malicious traffic while allowing legitimate and authorized traffic access to a network.
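To make the idea of checking traffic against rules concrete, here is a minimal sketch of first-match rule evaluation. The `Rule` class, the `decide` function, the default-deny policy, and the example addresses (drawn from documentation-only IP ranges) are assumptions for illustration; real firewalls inspect far more than a source address and a port.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: an action for traffic from a source range to a port."""
    action: str   # "allow" or "deny"
    source: str   # source range in CIDR notation
    port: int     # destination port


def decide(rules: List[Rule], src_ip: str, port: int, default: str = "deny") -> str:
    """Return the action of the first matching rule, or `default` if none match."""
    for rule in rules:
        if port == rule.port and ip_address(src_ip) in ip_network(rule.source):
            return rule.action
    return default


# Illustrative rule set: block SSH from one range, allow HTTPS from anywhere.
rules = [
    Rule("deny", "203.0.113.0/24", 22),
    Rule("allow", "0.0.0.0/0", 443),
]

ssh_verdict = decide(rules, "203.0.113.5", 22)
https_verdict = decide(rules, "198.51.100.7", 443)
other_verdict = decide(rules, "198.51.100.7", 8080)
```

Traffic that matches no rule falls through to the default action, which in this sketch is to deny it.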

Where Are My App's Traces? Understanding the Black Magic of Instrumentation

Many developers don’t know what instrumentation really is, and those who do don’t really understand the black magic that takes an application and makes it emit telemetry, especially when automatic instrumentation is involved. On top of that, each programming language has its own tricks. I wanted to unwrap this loaded topic on my podcast, OpenObservability Talks. For this topic I invited Eden Federman, CTO of Keyval, a company focused on making observability simpler.

Building a Performant iOS Profiler

Profilers measure the performance of a program at runtime by adding instrumentation to collect information about the frequency and duration of function calls. They are crucial tools for understanding the real-world performance characteristics of code and are often the first step in optimizing a program. Apple and Google have first-party profiling tools, but they are only usable for local debugging during development.
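The instrumentation idea can be sketched in a few lines of Python, used here purely for illustration rather than as the iOS implementation the article goes on to describe. The `profiled` decorator and the two dictionaries are hypothetical names; the sketch records how often each wrapped function is called and how long the calls take in total.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-function call frequency and cumulative duration, keyed by function name.
call_counts = defaultdict(int)
total_seconds = defaultdict(float)


def profiled(func):
    """Wrap `func` so every call is counted and timed."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if the function raised.
            call_counts[func.__name__] += 1
            total_seconds[func.__name__] += time.perf_counter() - start
    return wrapper


@profiled
def square(n):
    return n * n


results = [square(i) for i in range(3)]
```

After the loop, `call_counts["square"]` holds the number of calls and `total_seconds["square"]` the time spent in them. Wrapping every function this way adds overhead to each call, which is one reason production profilers often sample instead.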

What Is the 3-2-1 Backup Rule?

The 3-2-1 backup rule is a strategy to ensure your data is recoverable in case of data loss incidents. It recommends having at least three copies of your data, stored on two different types of media, with one copy kept offsite. The rule was conceptualized by US photographer Peter Krogh. After initially impacting the photography world, Krogh’s idea was quickly adopted by other technology disciplines. It’s a great way to evaluate and manage data risks.
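One way to use the rule for evaluating a backup setup is a small check like the sketch below. It assumes the usual reading of 3-2-1 (three copies counting the original, two media types, one copy offsite); the `BackupCopy` class, the `satisfies_3_2_1` function, and the media names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    """Hypothetical description of one copy of the data (the original counts too)."""
    media: str      # e.g. "internal-disk", "nas", "cloud"
    offsite: bool   # True if stored away from the primary location


def satisfies_3_2_1(copies: List[BackupCopy]) -> bool:
    """Check for at least 3 copies, on at least 2 media types, with at least 1 offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Illustrative setups only.
good_setup = [
    BackupCopy("internal-disk", offsite=False),
    BackupCopy("nas", offsite=False),
    BackupCopy("cloud", offsite=True),
]
all_local = [
    BackupCopy("internal-disk", offsite=False),
    BackupCopy("nas", offsite=False),
    BackupCopy("external-disk", offsite=False),
]

good_ok = satisfies_3_2_1(good_setup)
local_ok = satisfies_3_2_1(all_local)
```

The second setup fails only because no copy is offsite, which is the kind of gap the rule is meant to expose.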