
Lessons Learned from Managing Kafka Costs

You have probably seen ads where someone claims that their app can save you money by finding subscriptions you forgot about. I have a hard time imagining someone with hundreds of dollars of expenses they forgot about, but I have had the occasional one that was missed. The problem is that people are inefficient when it comes to managing “stuff”. That is why there are so many places to store “stuff”.

Database Observability Provides the Features Customers Need for Effective Monitoring

I began working with database customers back in the day at VividCortex, before it was purchased by SolarWinds. Since then, I’ve had the opportunity to work with many of our database solution customers as an account manager, and I now lead our DPM renewals initiative. In these roles, I’ve helped our customers transition from VividCortex to Database Performance Monitor (DPM) and now migrate to Database Observability.

Lessons in Incident Response I Learned While Waiting Tables

Before I stumbled into the tech industry (a story for another day), I spent several years in the customer service world as a server and front-of-house manager in restaurants. It was in these jobs that I first honed some critical skills that would later lead me on the path to incident response.

Network Latency & How To Improve Latency

Cloud-based services have changed how individuals and businesses get things done. That doesn’t mean it’s all positive: there are tradeoffs and compromises that come with cloud services and the internet. One major tradeoff is speed. For instance, a commonly cited figure is that if your website fails to load within three seconds, 40% of your visitors will abandon your site. That’s a serious dent for anyone doing business online. The culprit here is latency.

What you can't do with Kubernetes network policies (unless you use Calico): Policies to all namespaces or pods

Continuing from my previous post in the series, What you can’t do with Kubernetes network policies (unless you use Calico), this post focuses on use case number five: default policies that are applied to all namespaces or pods.

Getting started with IT operations automation

Tech companies face a daunting challenge: a staggering 90% of their IT teams are stuck doing mundane, repetitive tasks, leaving only 10% to focus on strategic innovation. Companies know that automation is the solution to these repetitive, low-level incident response actions; however, many need support to begin automating.

The ultimate guide to incident management KPIs and metrics

IT incident management aims to swiftly identify, address, and resolve IT disruptions to restore normal service operations. Tracking IT incident management key performance indicators (KPIs) is a vital step toward minimizing disruptions for customers and users. But there are many KPIs and metrics to choose from, and it’s not easy to identify the ones that can drive meaningful improvements in incident management.