The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Monitoring Cloud Database Costs with OpenTelemetry and Honeycomb

In the last few years, the usage of databases that charge by request, query, or insert—rather than by provisioned compute infrastructure (e.g., CPU, RAM, etc.)—has grown significantly. They’re popular for a lot of the same reasons that serverless compute functions are, as the cost will scale with your usage. No one is using your site? No problem: you’re not charged.

Modern Canadian MSSP drives next-gen MDR with Logz.io and Tines

Today’s Managed Security Service Providers (MSSPs) are trying to grow their business quickly, improving margins and onboarding customers with high-quality tool sets that scale with the business. This means reducing cost, improving onboarding time and building the next generation of Managed Detection and Response (MDR) to deal with threats that are increasing in volume and sophistication.

How to mute alerts during maintenance windows or scheduled backups?

The health management API in Netdata allows teams to eliminate unnecessary alerting during scheduled maintenance, testing, auto-scaling events, and instance reboots. For SREs, it is crucial to filter out expected events during maintenance windows so that critical issues in the infrastructure can be pinpointed quickly. Every minute counts during troubleshooting, and any distraction that might hijack the process should be suppressed.
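As a rough illustration of muting alerts around a maintenance window, the sketch below builds requests for Netdata's health management endpoint. It assumes the agent listens on `http://localhost:19999`, that the endpoint is `/api/v1/manage/health` with a `cmd` query parameter (for example `SILENCE ALL` and `RESET`), and that the API key is sent in an `X-Auth-Token` header. The helper name `build_health_command` and the token value are hypothetical, so check the Netdata documentation for your agent version before relying on any of this.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_health_command(base_url: str, token: str, cmd: str, **selectors: str) -> Request:
    """Build (but do not send) a request for the health management endpoint.

    Assumed, not confirmed by the article: the endpoint path, the `cmd`
    parameter, and the X-Auth-Token header. `selectors` are optional
    filters such as context="disk.io" that narrow which alarms are affected.
    """
    params = {"cmd": cmd, **selectors}
    # quote_via=quote encodes spaces as %20 rather than '+'
    query = urlencode(params, quote_via=quote)
    url = f"{base_url.rstrip('/')}/api/v1/manage/health?{query}"
    return Request(url, headers={"X-Auth-Token": token})


# Hypothetical maintenance-window flow: silence before, reset after.
silence = build_health_command("http://localhost:19999", "YOUR_API_KEY", "SILENCE ALL")
reset = build_health_command("http://localhost:19999", "YOUR_API_KEY", "RESET")
```

Passing either request to `urllib.request.urlopen` would send it to the agent. Building the request separately from sending it keeps the sketch free of side effects and makes it easy to inspect the URL first.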

AIOps (artificial intelligence for IT operations)

Artificial intelligence for IT operations (AIOps) is an umbrella term for the use of big data analytics, machine learning (ML) and other artificial intelligence (AI) technologies to automate the identification and resolution of common IT issues. The systems, services and applications in a large enterprise produce immense volumes of log and performance data. AIOps uses this data to monitor assets and gain visibility into dependencies within and outside of IT systems.

What is Jaeger Distributed Tracing?

Distributed tracing is the ability to follow a request through a software system from beginning to end. While that may sound trivial, in modern distributed architectures a single request can easily spawn multiple child requests to different microservices. These, in turn, trigger further sub-requests, resulting in a complex web of transactions to service a single originating request.
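The request tree described above can be pictured with a toy model. This is not Jaeger's API. It is a minimal sketch in which every unit of work is a hypothetical `Span` that shares its `trace_id` with the originating request and records its parent's `span_id`, which is the general idea tracing systems use to stitch sub-requests back together.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in one request tree
    span_id: str              # unique per unit of work
    parent_id: Optional[str]  # None for the originating request
    operation: str


def start_span(operation: str, parent: Optional[Span] = None) -> Span:
    """Start a root span, or a child span that inherits the parent's trace_id."""
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    parent_id = parent.span_id if parent else None
    return Span(trace_id, uuid.uuid4().hex[:16], parent_id, operation)


def depth(span: Span, by_id: Dict[str, Span]) -> int:
    """Count how many parent links separate a span from the root request."""
    hops = 0
    while span.parent_id is not None:
        span = by_id[span.parent_id]
        hops += 1
    return hops


# Hypothetical request: a checkout call fans out to a payment service,
# which in turn issues its own database query.
root = start_span("GET /checkout")
payment = start_span("payment-service.charge", parent=root)
query = start_span("db.query", parent=payment)
spans = {s.span_id: s for s in (root, payment, query)}
```

Because all three spans carry the same `trace_id`, a backend can group them into one trace, and the `parent_id` links let it reconstruct the tree of child requests.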

A New Era of Sentry

Today we are releasing Dynamic Sampling, available to all new customers and opt-in for existing customers. This goes beyond a new feature, however: it is an overhaul of the way we package Sentry’s Performance Monitoring product. We are saying goodbye to the days of static, magic-number sampling configured within the SDK and moving to a world of flexibility.

High Five! Splunk Honored With Five TrustRadius Best Software Awards

Customers have spoken, and we’re feeling the love. Splunk has just been honored with no fewer than five “Best Software” Awards from TrustRadius! Based exclusively on customer reviews, Splunk Enterprise Security (ES) took home the top spot in three categories: Best Software for Enterprise, Best Software for Mid-Sized Businesses, and Best Software for Small Businesses.

User Experience for Observability

Modern software applications involve multiple layers of code and services, working together to meet increasingly demanding user requirements. To achieve this, systems have become distributed, providing improved scalability and fault tolerance at the cost of added complexity. However, this shift has brought new challenges for basic troubleshooting and for the performance monitoring needed to keep systems healthy. It’s for these reasons that observability is trending.