Monthly Archive

The Top 5 Security Logging Best Practices to Follow Now

Mar 31, 2025 By David Bunting In ChaosSearch

Security logging is a critical part of modern cybersecurity, providing the foundation for detecting, analyzing, and responding to potential threats. As highlighted by OWASP, security logging and monitoring failures can lead to undetected security breaches. With the average cost of a data breach adding up to $4.45 million, most organizations can't afford to miss a security incident.

Read Post

ChaosSearch

DX Operational Observability: Troubleshoot WebHook Notification Channels with WebHook Data Collector

Mar 31, 2025 By Jörg Mertin In Broadcom

The power of AIOps and Observability relies on the ability to ingest, normalize, and correlate the large volumes and huge variety of data available to IT operations teams. With its support for both Broadcom and third-party data, DX Operational Observability (DX O2) gives these teams unmatched observability and insights. With so much data coming to DX O2, monitoring operators need to be notified when important events occur. Without notifications, important alerts may be overlooked.

Read Post

Broadcom

Introduction to Private Locations in Splunk Synthetic Monitoring

Mar 31, 2025 By Splunk In Splunk

In this tutorial, we’ll demonstrate how to create and use private locations in Splunk Synthetic Monitoring to test internal or pre-production applications within a Kubernetes environment. You'll learn exactly what private locations and private runners are, common use cases, and step-by-step instructions on how to deploy a private runner using Helm. Finally, you'll see how to set up a simple browser test to run synthetics against a service available only within a Kubernetes cluster.

View Video

Splunk

Troubleshoot microservice-based apps faster with Splunk Observability Cloud

Mar 31, 2025 By Splunk In Splunk

When something goes wrong with your microservice-based apps, Splunk Observability Cloud offers a unified Observability platform to make debugging processes easier and faster. By using features like the Service Map to identify the cause of the error and Related Logs in Log Observer to pinpoint its location, you can get back up and running quickly, limiting the impact to your bottom line and keeping your customers happy.

View Video

Splunk

Optimizing SQL (and DataFrames) in DataFusion: Part 1

Mar 31, 2025 By Andrew Lamb In InfluxData

Sometimes Query Optimizers are seen as a sort of black magic, “the most challenging problem in computer science,” according to Father Pavlo, or some behind-the-scenes player. However, Query Optimizers are no more complicated in theory or practice than other parts of a database system, as we will argue in a series of posts.

Read Post

InfluxData

Optimizing Kubernetes node resources: How to avoid exhaustion and improve performance

Mar 31, 2025 By Grace Nalini In Site24x7

Kubernetes automates the deployment and management of containerized applications relatively efficiently. However, resource exhaustion at a node remains a critical issue. When a node runs low on resources such as CPU, memory, or storage, its workloads may suffer from failures, degraded performance, and eviction.

Read Post

Site24x7

How SNMP traps help prevent network failures: A use case analysis

Mar 31, 2025 By Rama Venkatesan In Site24x7

You're likely well aware of how damaging network downtime can be to an enterprise's revenue, reputation, and overall operational efficiency. But what if you could spot potential issues before they turn into major problems? That's how Simple Network Management Protocol (SNMP) traps help enterprises stay ahead of failures and keep networks running smoothly. SNMP traps are an essential tool for network observability in enterprises looking to maximize uptime, optimize costs, and enhance resilience.

Read Post

Site24x7

The Rise of Shadow AI & the Tech Debt Tsunami

Mar 31, 2025 By Jade Lassery In logz.io

Recently, Logz.io co-founder and CTO Asaf Yigal teamed up with DevOps legend John Willis for an engaging webinar exploring the exciting—and occasionally intimidating—world of Shadow AI and the “tech debt tsunami” on the horizon. This lively session dove into how generative AI (GenAI) is reshaping software development, DevOps practices, and infrastructure management, along with some friendly advice on how organizations can navigate these changes without getting swept away.

Read Post

logz.io

Why Intelligent Traffic Steering is Critical for Performance and Cost Optimization

Mar 31, 2025 By Madan Gopal N In Catchpoint

In today’s world of globally distributed applications, user experience is everything. Whether your platform runs across multiple cloud providers or uses a Multi CDN with numerous points of presence (PoPs), efficiently routing user traffic can make or break performance. That's where intelligent traffic steering becomes not just a nice-to-have, but a must-have.

Read Post

Catchpoint

Top 6 Reasons Why You Need a Status Page Aggregator

Mar 31, 2025 By Hrishikesh Barua In IncidentHub

Your business depends on the reliability of the third-party services you use. Monitoring the status pages of these services is the best way of keeping track of their outages and maintenance windows. Although some status pages let you subscribe to alerts, there is no standard way of doing this. Service providers can change their status page providers, disable subscriptions, or not support the same notification options.

Read Post

IncidentHub

Enabling Design System Observability Using Honeycomb

Mar 31, 2025 By Grady Salzman In Honeycomb

At Honeycomb, we’re actively growing our design system, Lattice, to ensure accessibility, optimize performance, and establish consistent design patterns across our product. One metric we use to measure Lattice is the adoption of components across the product. Adoption is about understanding how, where, and why they’re being used.

Read Post

Honeycomb

Monitor the performance of queues and topics with Azure Service Bus

Mar 31, 2025 By Nicholas Thomson In Datadog

Azure Service Bus is a fully managed enterprise message broker that enables asynchronous messaging between distributed applications. It is designed to decouple application components, allowing them to communicate reliably, securely, and at scale.

Read Post

Datadog

Enrich your existing Datadog telemetry with custom metadata using Reference Tables

Mar 31, 2025 By Jinwu Li In Datadog

As your applications scale and generate more telemetry, it becomes increasingly difficult to sift through the data and analyze it against cost, business functions, and security measures. Logs, events, and other telemetry on their own may not include enough meaningful context or readable details, leading to slower troubleshooting, inefficient business processes, and higher costs.

Read Post

Datadog

Remediate Kubernetes incidents faster using private actions in your apps and workflows

Mar 31, 2025 By Aneesh Kethini In Datadog

The Datadog Action Catalog provides more than 1,400 actions to help you accelerate remediation across your infrastructure directly within Datadog. With actions, you can use Workflow Automation to configure workflows that automatically address issues as they happen and build custom apps in App Builder that empower anyone in your organization to act when incidents occur.

Read Post

Datadog

How to prevent performance bottlenecks in Google Compute Engine: CPU spikes, RAM waste, and network overload

Mar 31, 2025 By Vasil Kaftandzhiev In Grafana

Cloud computing is all about efficiency. You need to get the most out of your resources without overspending or causing performance issues. For example, if you’re running virtual machines in Google Compute Engine, you need to size your instances correctly, optimize your workloads, and monitor your network traffic to prevent unexpected failures. However, when resources aren’t properly managed, things can quickly spiral out of control.

Read Post

Grafana

Simplifying public sector observability with OpenTelemetry and Elastic

Mar 31, 2025 By Darren Meiss In Elastic

Public sector organizations today face unique challenges in maintaining and optimizing their IT infrastructure and prioritizing efficiency and interoperability. With a mix of modern cloud and legacy systems, ensuring consistent performance, reliability, and security is paramount. To effectively observe across these environments, government agencies need observability tools that are open, flexible, and scalable. OpenTelemetry (OTel) is fast becoming a pivotal part of that flexible toolset.

Read Post

Elastic

Is It Time to Switch Your Network Monitoring Tool? How to Know & Choose the Right Upgrade

Mar 31, 2025 By Andrii Kernitskyi In Obkio

A while ago, your company chose a network monitoring tool that worked perfectly — back when most employees worked in the office, networks were centralized, applications ran on-premise, and "the cloud" was just a buzzword.

Read Post

Obkio

Optimizing Item Search: How Rollbar Engineered Faster, More Capable Search

Mar 31, 2025 By Rollbar In Rollbar

Searching through error data efficiently is critical for developers using monitoring tools. At Rollbar, we recently completed a significant overhaul of our Item Search backend. The previous system faced performance limitations and constraints on search capabilities. This post details the technical challenges, the architectural changes we implemented, and the resulting performance gains.

Read Post

Rollbar

Coroot v1.9: Kubernetes-Native Database Monitoring Made Easy

Mar 31, 2025 By Nikolay Sivko In Coroot

From day one, we built Coroot to work beyond just Kubernetes. Many teams still run databases and other stateful services on dedicated VMs or bare-metal servers. But that’s starting to change. More and more teams no longer see Kubernetes as a platform just for stateless apps. Powerful Kubernetes operators now handle day-2 operations like failover, backups, and disaster recovery—making it easier than ever to run databases on Kubernetes. And the number of teams choosing this path keeps growing.

Read Post

Coroot

Finding UX Friction (...Before It Becomes a Problem)

Mar 31, 2025 By Germain UX Team In Germain UX

Make it smooth. Reduce friction. Keep users moving. That’s solid advice. No one enjoys filling out a form with 10 unnecessary fields or dealing with a checkout process that feels like a maze. But you can’t fix friction if you don’t know where it’s happening. Big companies like Amazon, Netflix, and Airbnb don’t just guess where users are struggling. They track the right UX metrics, run experiments, and fine-tune their products constantly.

Read Post

Germain UX

From surface-level to strategic: Benefits of network traffic analysis

Mar 30, 2025 By Rama Venkatesan In Site24x7

Enterprises are experiencing fluctuations in workforce dynamics amid a surge of new technologies while also tackling the growing prevalence of cyberthreats. They are increasingly turning to cloud technologies, which are scalable and flexible, to adapt to these changes.

Read Post

Site24x7

How to Add a Health Check Endpoint to Your Next.js Application

Mar 29, 2025 By Leo Baecker In Hyperping

As your application grows, ensuring it remains operational is crucial. Let's explore how you can add a health check endpoint to your Next.js app, providing you with peace of mind and proactive monitoring capabilities.

Read Post

Hyperping

Understanding the Meaning of a Waterfall Chart #coding #chromedevtools #programming

Mar 29, 2025 By Request Metrics In Request Metrics

Decode website loading sequences with Todd Gardner's essential guide to waterfall charts in this Concepts of Web Performance tutorial. Perfect for entry-level web developers struggling with slow websites, this video demystifies those intimidating colored bars you've seen in Chrome DevTools, WebPageTest, and monitoring tools like Request Metrics. Learn to interpret the crucial elements of waterfall charts—from request queuing and waiting times to content downloading phases—all visualized on a timeline measured in milliseconds. Discover how to identify two major performance bottlenecks.

View Video

Request Metrics

Monitoring

Elevating Strategic DEX Management with AI Sentiment Analytics

Mar 28, 2025 By Dave Wagner In Nexthink

Nexthink has long been the leader in Digital Employee Experience (DEX) management, in large part due to Nexthink Employee Engagement, a powerful way for IT to communicate timely information, fix issues collaboratively, and understand employees’ experience with technology. Its hyper-targeted campaigns, to which employees actively respond, give IT leaders and teams the sentiment context they need to be confident they are addressing the technology issues employees consider important.

Read Post

Nexthink

MySQL Logs: Your Guide for Database Performance

Mar 28, 2025 By Faiz Shaikh In Last9

MySQL logs are basically your database's diary – they record everything happening behind the scenes. Think of them as the black box of your database operations. You've got error logs showing you when things go sideways, query logs documenting every question asked of your database, and binary logs tracking changes like they're gossip in a small town.

Read Post

Last9

New Relic vs Datadog: The Complete Comparison

Mar 28, 2025 By Anjali Udasi In Last9

Choosing between New Relic and Datadog? Here's what you need to know. Let's break it down. If you're comparing New Relic and Datadog for observability, you might also find this guide on microservices monitoring tools helpful.

Read Post

Last9

Python Loguru: The Logging Cheat Code You Need in Your Life

Mar 28, 2025 By Preeti Dewani In Last9

Debugging is rarely anyone's idea of a good time. You're cruising along, building something cool, when suddenly your code breaks and you're stuck digging through console outputs that look like they were written by a robot having an existential crisis. Enter Loguru – the Python logging library that feels like it was built for humans, not machines.

Read Post

Last9

System Center 2025: Migration Insights and Expert Discussion on SCOM, SCORCH, SCSM, and Beyond

Mar 28, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

The future of IT operations is here! Join us for an exclusive expert panel discussion on Microsoft System Center 2025 updates and migration strategies, where industry leaders will explore the latest advancements and strategies for optimizing enterprise IT environments.

View Video

NiCE IT Mgmt

Visualizing Snyk data with SquaredUp

Mar 28, 2025 By SquaredUp In Squared Up

A brief introduction to getting started with the SquaredUp data source for Snyk. Learn how easy it is to visualize and monitor your Snyk security data in real-time, helping you stay on top of vulnerabilities and improve your overall security posture.

View Video

Squared Up

Visualizing PostgreSQL data with SquaredUp

Mar 28, 2025 By SquaredUp In Squared Up

Visualizing the data in your PostgreSQL data source couldn't be easier with SquaredUp. In this video, we’ll walk you through how to quickly connect your PostgreSQL database, create stunning, interactive dashboards, and gain deeper insights from your data. Happy dashboarding!

View Video

Squared Up

5 Critical Network Security Threats for 2025

Mar 28, 2025 By ManageEngine Site24x7 In Site24x7

In this video, we break down the top 5 critical network security threats and show you how Site24x7’s comprehensive security features can help you: detect misconfigurations before ransomware strikes; identify insider threats with intelligent traffic analysis; secure IoT devices with automated compliance checks; prevent privilege escalation by monitoring configuration changes; and protect against supply chain attacks with SDN and SD-WAN monitoring. Don’t wait for a security breach to take action! Start monitoring your network today with Site24x7.

View Video

Site24x7

Pro tips for ManageEngine Site24x7 Capacity Planning

Mar 28, 2025 By ManageEngine Site24x7 In Site24x7

Pro Tips for Smarter Capacity Planning in Site24x7!

View Video

Site24x7

Monitoring

Ruby on Rails Health Check Endpoint: Ensuring Uptime and Peace of Mind

Mar 28, 2025 By Leo Baecker In Hyperping

As your customer base grows, your responsibility to provide a functioning service at all times increases too. Let's explore how you can add a few lines of code to your Ruby on Rails application and gain significant peace of mind!

Read Post

Hyperping

Last9 MCP Server: Talk to your agent and fix production issues in your local environment

Mar 28, 2025 By Last9 - High Cardinality Monitoring In Last9

Get started at: https://last9.io/mcp/

Read about why we launched an MCP server: https://last9.io/blog/launching-last9-mcp-server

GitHub: https://github.com/last9/last9-mcp-server

View Video

Last9

Last9 MCP Server: Talk to your agent and fix production exceptions in your local environment

Mar 28, 2025 By Last9 - High Cardinality Monitoring In Last9

Demo of using the `get_exceptions` Last9 MCP tool.

View Video

Last9

Install OpenTelemetry Collector Contrib (Part 1) #opentelemetry #collector

Mar 28, 2025 By Bindplane In ObservIQ

Check out the full @bindplane OTel deep dive workshop.

View Video

ObservIQ

Configuring the OpenTelemetry Collector's default config.yaml (Part 2) #opentelemetry #collector

Mar 28, 2025 By Bindplane In ObservIQ

Check out the full @bindplane OTel deep dive workshop.

View Video

ObservIQ

Identifying Sequential Chain Performance Issues in Waterfall Charts #chromedevtools #coding

Mar 28, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

How to Monitor Login Pages for Performance and Security

Mar 28, 2025 By Dotcom-Monitor In Dotcom-Monitor

Login pages are the front door to your website or application, and just like any front door, they need to be secure and easy to open. If your login page is slow or vulnerable to attacks, it can frustrate users and expose sensitive information. Whether you’re managing a small e-commerce site or a large enterprise application, monitoring your login pages for performance and security is crucial.

Read Post

Dotcom-Monitor

JavaScript needs Debug IDs

Mar 28, 2025 By Abhijeet Prasad In Sentry

This represents a big step to getting JavaScript debug ids more formally recognized across the JavaScript ecosystem. It also shows how Sentry is developing the maturity to lead open standards and engage in consensus building. Want to learn more? Read on!

Read Post

Sentry

From Chaos to Clarity With Victorialogs - Tech Talks #3

Mar 28, 2025 By VictoriaMetrics In VictoriaMetrics

In the third episode, we will guide you through efficiently ingesting and optimizing log pipelines. We'll provide actionable advice on streamlining your processes, enhancing performance, and, most importantly, extracting valuable insights from your data to improve your operations, troubleshoot issues, and gain a competitive edge.

View Video

VictoriaMetrics

Debugging Applications with Tracing

Mar 28, 2025 By Sentry In Sentry

Tracing lets you view the path of a request through the different parts of your application. In this video, Cody shows us how you can use Tracing and Spans to debug everything from performance slowdowns in your applications to authentication problems.

View Video

Sentry

System Center 2025 Unveiled: Migration Insights and Expert Discussion

Mar 27, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Read Post

NiCE IT Mgmt

Everything you need to know about HAProxy log format

Mar 27, 2025 By Sumo Logic In Sumo Logic

HAProxy is one of today’s fastest and most widely used load balancing solutions. If you’re already using HAProxy or considering using it in your environment, understanding HAProxy logging is essential. Let’s discuss why HAProxy logging is vital to the load balancer implementation, the logging HAProxy offers, and how to manage and configure HAProxy logs to suit your unique needs.

Read Post

Sumo Logic

The Future of Dynamic Observability with Sumo Logic -- Customer Brown Bag -- March 27th, 2025

Mar 27, 2025 By Sumo Logic, Inc. In Sumo Logic

Join us as Sr. Dir. Technical Marketer, Adam White, and Sr. Product Marketing Manager, Hadijah Creary, go beyond the usual technical deep dive—focusing on the mindset, industry trends, and thought leadership shaping modern observability and the future of dynamic observability with Sumo Logic.

View Video

Sumo Logic

How to use data source variables in Grafana dashboards

Mar 27, 2025 By Grafana In Grafana

Data source variables let you change where Grafana looks for data without having to create duplicate dashboards. So for example, if you have multiple different Prometheus databases, you can have one dashboard and use a data source variable to choose which Prometheus that dashboard uses. We'll look at how to set these up in this video. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more. We also have plans for every use case.

View Video

Grafana

How JavaScript Execution Can Cause Browser Performance Issues #coding #chromedevtools #programming

Mar 27, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Starlink Enters Transit Market With Community Gateways

Mar 27, 2025 By Doug Madory In Kentik

Starlink moves beyond being strictly a direct-to-consumer service provider with the recent activations of its Community Gateways. In recent months, Starlink has become a transit provider to a small but growing number of service providers in remote parts of the world as its unique and groundbreaking service continues to evolve.

Read Post

Kentik

How to get started with error budgets to meet SLOs for improved service reliability

Mar 27, 2025 By Ramkumar Ramaswamy In Site24x7

As modern IT systems grow in complexity, IT operations teams have to work harder to ensure reliability. "What gets measured gets managed" is a management mantra that emphasizes the role of metrics in management. To ensure everything works well, operations teams need service-level objectives (SLOs). This industry term measures how an application meets the agreed-upon quality and reliability standards, serving as a bellwether of good software.
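The arithmetic behind an error budget can be sketched in a few lines of Python. This is a minimal illustration, not taken from the post itself: it assumes an availability SLO expressed as a fraction (for example 0.999) over a rolling 30-day window, and the function names `error_budget_minutes` and `budget_remaining` are hypothetical.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime allowance, in minutes, for an availability SLO."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    # Whatever the SLO does not promise is the budget that may be "spent".
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows about 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))    # 43.2
    # After 10.8 minutes of downtime, three quarters of the budget is left.
    print(round(budget_remaining(0.999, 10.8), 2))  # 0.75
```

A negative result from `budget_remaining` would indicate that the window's budget has been overspent, which teams commonly treat as a signal to prioritize reliability work over new releases.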

Read Post

Site24x7

From failure to fix: Diagnose Kubernetes Node and Pod problems with Site24x7

Mar 27, 2025 By Grace Nalini In Site24x7

Picture a busy Monday morning. You are working on leftover projects from the previous week, assuming everything is fine with your applications because you received no support tickets over the weekend. Suddenly, in the middle of the day, you get a flood of reports from users complaining about slow responses in your application and error pages piling up. You and your team scramble to figure out the issue.

Read Post

Site24x7

Don't Let Downtime Define You: 10 Status Page Templates [2025]

Mar 27, 2025 By Leo Baecker In Hyperping

In today's always-on world, your website or application is the lifeblood of your business. Downtime isn't just an inconvenience; it's a threat to your reputation, customer loyalty, and bottom line. As we highlighted in our recent article on MTTR, quickly resolving incidents is crucial. But equally important is how you communicate those incidents to your users. That's where status page templates come in.

Read Post

Hyperping

Bulk Edit Your Monitors

Mar 27, 2025 By Leo Baecker In Hyperping

Today we're introducing bulk editing for monitors, making it easier than ever to update multiple monitors simultaneously. This powerful new feature helps you efficiently manage your monitoring setup at scale.

Read Post

Hyperping

An Easy and Comprehensive Guide to Prometheus API

Mar 27, 2025 By Faiz Shaikh In Last9

Monitoring is the backbone of any reliable DevOps setup. And if you’re working with monitoring, you’ve likely used Prometheus. This open-source powerhouse has redefined how we track system performance, but are you making the most of its API? Prometheus is the go-to solution for monitoring container-based environments, particularly in Kubernetes. Its pull-based model and flexible query language provide deep visibility into your systems.
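As a rough illustration of what working with the API looks like, the sketch below builds the URL for an instant query against Prometheus's HTTP API (`/api/v1/query`, with `query` and optional `time` parameters). The base URL `http://localhost:9090`, the metric name, and the helper name `instant_query_url` are assumptions made for the example; the code only constructs the URL and sends no request.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def instant_query_url(base_url: str, promql: str,
                      time: Optional[str] = None) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (RFC 3339 or Unix seconds).
        params["time"] = time
    # urlencode percent-escapes braces, quotes, and brackets in the PromQL text.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server address and metric name, for illustration only.
    url = instant_query_url("http://localhost:9090",
                            "rate(http_requests_total[5m])")
    print(url)
    # Decoding the query string recovers the original expression.
    print(parse_qs(urlparse(url).query)["query"][0])
```

Letting `urlencode` handle the escaping avoids the hand-built query strings that tend to break as soon as an expression contains label matchers such as `{job="api"}`.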

Read Post

Last9

21 PromQL Tricks Every Developer Should Know

Mar 27, 2025 By Preeti Dewani In Last9

So you've got Prometheus up and running, but now you're scratching your head looking at those queries. PromQL (Prometheus Query Language) looks simple on the surface, but it packs some serious power once you know how to wield it. Whether you're debugging production issues at 2 AM or building dashboards that actually tell you something useful, these PromQL tricks will upgrade your monitoring game.

Read Post

Last9

Meet Ted Young, OpenTelemetry co-founder and the newest Grafanista

Mar 27, 2025 By Trevor Jones In Grafana

In just a few short years, OpenTelemetry has become the second largest CNCF project behind Kubernetes and is well on its way to becoming an industry standard for collecting and exporting telemetry data. And with KubeCon + CloudNativeCon Europe 2025 just around the corner, there’s no one better to talk to about the state of OpenTelemetry than Ted Young. Ted is the co-founder of OpenTelemetry and serves on the OpenTelemetry Governance Committee.

Read Post

Grafana

License to observe: Why observability solutions need agents

Mar 27, 2025 By Dominik Süß In Grafana

Note: The original version of this blog post was published in ;login: on February 24, 2025. When architecting the flow of observability data such as logs, metrics, traces, or profiles, you’ve likely noticed that most solutions ask you to deploy an agent or collector. Understandably, you might be hesitant to deploy yet another application just so you can get your data into your storage system of choice.

Read Post

Grafana

Stopping the Finger Pointing: Speed Mean Time to Innocence with AppNeta

Mar 27, 2025 By Alec Pinkham In Broadcom

When network issues arise, it doesn’t take long for fingers to start pointing, often in the direction of network operations teams. In such moments, being forced to rely on guesswork or speculative theories is the last thing any team wants. Making matters worse, if answers are found but take too long to arrive, the reputational damage, not to mention the negative repercussions of the actual outage, is already done.

Read Post

Broadcom

Boosting the Availability of Revenue-Generating Financial Services

Mar 27, 2025 By Alec Pinkham In Broadcom

In any industry, network downtime and performance issues can have a significant cost. But when it comes to financial services, the impact is even more profound, particularly for revenue-generating applications. Financial service firms and their customers rely constantly on these applications. When these applications experience slowdowns or outages, the impact can extend beyond revenue loss and lead to customer dissatisfaction, reduced employee productivity, and potential reputational damage.

Read Post

Broadcom

Read more about Boosting the Availability of Revenue-Generating Financial Services

Prevent Silent Failures and Monitor Any Process with AppSignal Wrap

Mar 27, 2025 By Connor James In AppSignal

Silent failures — like missed cron jobs, database crashes, or backup issues — can cause real damage if they go unnoticed. Traditional monitoring often focuses on requests and server metrics but misses crucial background processes. This creates a significant monitoring blind spot where critical elements of your application can fail without immediate detection. To help eliminate this blind spot, we've introduced AppSignal Wrap.

Read Post

AppSignal

Read more about Prevent Silent Failures and Monitor Any Process with AppSignal Wrap

What does cloud monitoring really mean today?

Mar 27, 2025 By Netdata In netdata

In this short talk, Netdata Founder & CEO Costa Tsaousis breaks down the difference between monitoring in the cloud and monitoring the cloud.

View Video

netdata

Read more about What does cloud monitoring really mean today?

Optimizing Every Layer: From Cloud to On-Premises

Mar 27, 2025 By Virtana Insight In Virtana

As digital infrastructures become more complex, businesses need an agile, unified platform that spans traditional on-premises systems to modern cloud-native environments. At Virtana, our latest feature updates across Global View, Container Observability, and Infrastructure Observability are designed to empower you to optimize every layer of your IT ecosystem.

Read Post

Virtana

Read more about Optimizing Every Layer: From Cloud to On-Premises

Playwright in Production

Mar 27, 2025 By Checkly In Checkly

Join Jonathan Canales and Filip Hric (@filip_hric) in this webinar, "Playwright in Production," as they dive into practical use cases for enhancing your workflow with Playwright automations.

View Video

Checkly

Read more about Playwright in Production

Use Playwright's "addLocatorHandler" to close unpredictable UI elements!

Mar 27, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he explains how to automatically close and remove unpredictable UI elements like cookie banners to prevent your end-to-end tests from failing.

View Video

Checkly

Read more about Use Playwright's "addLocatorHandler" to close unpredictable UI elements!

Do you have to be an SRE to get value from the 2025 SRE Report? #sre #devops #IT

Mar 27, 2025 By Catchpoint In Catchpoint

The answer is no! Check out the 2025 SRE Report for the latest trends and insights: https://www.catchpoint.com/asset/2025-sre-report


View Video

Catchpoint

Read more about Do you have to be an SRE to get value from the 2025 SRE Report? #sre #devops #IT

An aerial view of your Azure DevOps, GitHub Actions, and Jenkins pipeline landscape

Mar 27, 2025 By John Hayes In Squared Up

Most engineering teams today have a multiplicity of tools to meet all of the different challenges they face. Some people characterize this as a problem and describe it as 'tool sprawl'. At SquaredUp, we just see it as a fact of life that no tool can excel at every job and engineers will want to choose the best tool for each task. Many companies have multiple toolchains spread across different teams and departments.

Read Post

Squared Up

Read more about An aerial view of your Azure DevOps, GitHub Actions, and Jenkins pipeline landscape

Sponsored Post

Monitoring for operations of SAP S/4HANA Cloud, public edition

Mar 26, 2025 By Robert MacDonald In Avantra

"Do I need to monitor SAP S/4HANA Cloud, public edition?" is the question many SAP customers are asking right now as projects are going live. As an SaaS product run by SAP, customers get access only through a public website, and SAP are responsible for the availability of that website and the hardware resources. The places where traditional monitoring focussed either aren't relevant, aren't visible, or superficially aren't the customer's problem anymore. Does that mean there is no need to monitor anything in SAP S/4HANA Cloud, public edition?

Read Post

Avantra

Read more about Monitoring for operations of SAP S/4HANA Cloud, public edition

Retail digital performance event recap: Key insights from IBM & Catchpoint

Mar 26, 2025 By Howard Beader In Catchpoint

We hosted the first IBM and Catchpoint Retail Digital Performance event on Wednesday, March 19, 2025. The sessions offered practical, thought-provoking insights on speed, resilience, and user-centric design—giving attendees fresh strategies to improve digital experiences at scale.

Read Post

Catchpoint

Read more about Retail digital performance event recap: Key insights from IBM & Catchpoint

Fine-Grained Authorization for Saved Searches

Mar 26, 2025 By Rishita Rai In Splunk

Splunk is excited to provide fine-grained authorization for Knowledge Objects (KOs), starting with Saved Searches. Saved Searches are the most used KOs, and admins spend the most time delegating access to users for Saved Searches.

Read Post

Splunk

Read more about Fine-Grained Authorization for Saved Searches

Dashboard updates: Fewer clicks, more control, faster widget building

Mar 26, 2025 By Alexandra Cota In Sentry

You're reviewing your production metrics when suddenly an error spike appears on your dashboard. Your immediate thought isn't "how do I build a new view to investigate this?" but rather "how do I find out the cause quickly?" This is exactly what happened to one of our engineering teams last month when they spotted an unusual pattern in their API response times. Instead of running ad-hoc queries from scratch, they turned to a custom dashboard they had built after a past incident.

Read Post

Sentry

Read more about Dashboard updates: Fewer clicks, more control, faster widget building

Preventing Alert Storms with InfluxDB 3's Processing Engine Cache

Mar 26, 2025 By Paul Dix In InfluxData

A common problem in monitoring and alerting systems is not just alerting on what you’re seeing but preventing alert storms from overwhelming operators. When a system generates multiple notifications for the same incident, it leads to alert fatigue and can mask other important issues. For time series data, alert fatigue can result in missed anomalies, delayed responses to critical trends, and difficulty distinguishing real performance degradations from noise.

Read Post

InfluxData

Read more about Preventing Alert Storms with InfluxDB 3's Processing Engine Cache

Better CloudWatch Metrics in Honeycomb with the OpenTelemetry Collector

Mar 26, 2025 By Davin Taddeo In Honeycomb

CloudWatch metrics can be a very useful source of information for a number of AWS services that don’t produce telemetry as well as instrumented code. There are also a number of useful metrics for non-web-request based functions, like metrics on concurrent database requests. We use them at Honeycomb to get statistics on load balancers and RDS instances. The Amazon Data Firehose is able to export directly to Honeycomb as well, which makes getting the data into Honeycomb straightforward.

Read Post

Honeycomb

Read more about Better CloudWatch Metrics in Honeycomb with the OpenTelemetry Collector

Top 6 EC2 rightsizing recommendations that you can't ignore

Mar 26, 2025 By CloudSpend In ManageEngine

Imagine a day at work where you realize that your team’s youngest developer has failed to kill a compute instance; the bill spikes and the budget is breached. Rightsizing recommendations would come to the rescue and play a crucial role in such situations by identifying underutilized, overutilized, or mismanaged resources and suggesting corrective actions.

Read Post

ManageEngine

Read more about Top 6 EC2 rightsizing recommendations that you can't ignore

How to Receive IncidentHub Alerts in your Webhook

Mar 26, 2025 By Hrishikesh Barua In IncidentHub

IncidentHub has many integrations to receive alerts. You can choose from Slack, Webhook, Email, Discord, PagerDuty, and more. In this article, we will explore how to receive IncidentHub alerts in your webhooks.

Read Post

IncidentHub

Read more about How to Receive IncidentHub Alerts in your Webhook

Top 10 Changes and Key Improvements in Apache Kafka 4.0.0

Mar 26, 2025 By Navdeep Sidhu In meshIQ

In this post, we summarize the major changes in the recently released Apache Kafka 4.0.0. We will look at the most notable features compared to previous versions and explain what these changes mean in real production environments and what improvements they can bring to your streaming infrastructure.

Read Post

meshIQ

Read more about Top 10 Changes and Key Improvements in Apache Kafka 4.0.0

Debugging performance issues in Azure Service Bus

Mar 26, 2025 By Mahalashmi Narayanan In Site24x7

Azure Service Bus is a critical messaging service for building scalable cloud applications, but performance bottlenecks can lead to delayed message processing, throttling, or even dropped messages. It is essential to identify and resolve these issues to maintain smooth application workflows and prevent downtime. This blog explores common Azure Service Bus performance problems, provides step-by-step debugging strategies, and highlights how proactive monitoring can prevent recurring issues.

Read Post

Site24x7

Read more about Debugging performance issues in Azure Service Bus

Utilizing browser emulation and automation languages in digital experience monitoring

Mar 26, 2025 By Bela Susan Thomas In Site24x7

With multiple factors affecting the performance of online businesses, offering glitch-free transactions has become a necessity. A key component of delivering great user experience is effective digital experience monitoring (DEM), which involves closely tracking performance across different devices, browsers, and locations.

Read Post

Site24x7

Read more about Utilizing browser emulation and automation languages in digital experience monitoring

Dynatrace vs Elastic stack - A Detailed Comparison for 2025

Mar 26, 2025 By Pavithra Parthiban In Atatus

Organizations looking for monitoring and observability solutions often compare ELK (Elasticsearch, Logstash, and Kibana) and Dynatrace. While both tools serve the purpose of log management and monitoring, their approaches, features, and use cases differ significantly. This article provides an in-depth ELK Stack vs Dynatrace comparison, helping users understand which tool best suits their needs.

Read Post

Atatus

Read more about Dynatrace vs Elastic stack - A Detailed Comparison for 2025

Top 7 Microservices Monitoring Tools to Consider in 2025

Mar 26, 2025 By Anjali Udasi In Last9

Let's talk about keeping those microservices in check. If you're running a distributed system (and who isn't these days?), you know the drill – more services mean more potential failure points. We've got the lowdown on the best microservices monitoring tools that'll have your back in 2025.

Read Post

Last9

Read more about Top 7 Microservices Monitoring Tools to Consider in 2025

RabbitMQ Logs: Monitoring, Troubleshooting & Configuration

Mar 26, 2025 By Prathamesh Sonpatki In Last9

If your RabbitMQ queues keep growing and you have no idea why, or if messages aren’t getting picked up like they should, logs can save you a lot of guesswork. They’re basically a detailed record of what’s happening behind the scenes. This guide breaks down where to find RabbitMQ logs, how to set them up, and what to look for when things start acting up. Consider it your go-to cheat sheet for keeping RabbitMQ running smoothly.

Read Post

Last9

Read more about RabbitMQ Logs: Monitoring, Troubleshooting & Configuration

Ubuntu Crash Logs: Find, Fix, and Prevent System Failures

Mar 26, 2025 By Preeti Dewani In Last9

If your system keeps crashing and you have no clue why, Ubuntu’s crash logs might have the answers. Whether you’re running a production server or just trying to keep your personal setup stable, these logs tell you exactly what went wrong. Instead of sifting through endless system logs, Ubuntu gives you focused crash reports—kind of like a security camera that only records when something breaks. Let’s break down where to find these logs and how to make sense of them.

Read Post

Last9

Read more about Ubuntu Crash Logs: Find, Fix, and Prevent System Failures

Grafana 11.6 release: new data visualization features, LBAC for metrics data sources, alerting updates, and more

Mar 26, 2025 By Grafana Labs Team In Grafana

Our engineering team is hard at work on Grafana 12, the next major release of the open source data visualization platform that we’re launching at GrafanaCON this May, but in the meantime, Grafana 11.6 is officially here — and there’s a lot to be excited about. The latest minor release delivers a number of new dashboarding features, including one-click data links and actions, along with other notable updates related to security, alerting, and more.

Read Post

Grafana

Read more about Grafana 11.6 release: new data visualization features, LBAC for metrics data sources, alerting updates, and more

How we structure on-call rotations at Datadog

Mar 26, 2025 By Laura de Vesine In Datadog

A well-structured on-call rotation helps you ensure the reliability of your services and meet your customers’ expectations by designating staff to respond to emerging issues. But the pressures of on-call work—such as long shifts, overnight hours, and dynamic situations—can compromise the well-being of your team members. This makes it harder for them to maximize service uptime during their on-call shifts and can limit the velocity of the feature work they do outside of their on-call duty.

Read Post

Datadog

Read more about How we structure on-call rotations at Datadog

How to create an effective paging strategy

Mar 26, 2025 By Addie Beach In Datadog

Empowered engineers and effective tools are the foundation of incident management, and having a solid on-call process can help facilitate both. In practice, however, many paging approaches have the opposite effect, often overwhelming responders and increasing burnout. To create an effective paging strategy, organizations should focus responder attention on the most important issues and help facilitate a sense of ownership over them.

Read Post

Datadog

Read more about How to create an effective paging strategy

Grafana Loki: The do's and don'ts of scaling Loki cache

Mar 26, 2025 By Grafana In Grafana

In this Loki community call, Poyzan (Staff Engineer on the Loki team) returns this time with Paul Rogers (Staff Engineer on the Loki team) to talk all about Loki cache! This topic was heavily discussed in our "How to run Loki at scale on Kubernetes" call.

View Video

Grafana

Read more about Grafana Loki: The do's and don'ts of scaling Loki cache

New in 11.6: LBAC for Metrics Data Sources | Demo | Grafana Labs

Mar 26, 2025 By Grafana In Grafana

Learn how LBAC for metrics data sources, a feature in Grafana 11.6, allows fine-grained access control to data sources by filtering metrics based on labels.

View Video

Grafana

Read more about New in 11.6: LBAC for Metrics Data Sources | Demo | Grafana Labs

Exploring the Resource Loading Process in an HTML Document #coding #webdevelopertools #programming

Mar 26, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

Read more about Exploring the Resource Loading Process in an HTML Document #coding #webdevelopertools #programming

System Center 2025 Unveiled | Migrating SCOM SCORCH SCSM and Beyond

Mar 26, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

The future of IT operations is here! Join us for an exclusive expert panel discussion on Microsoft System Center 2025, where industry leaders will explore the latest advancements and strategies for optimizing enterprise IT environments.

View Video

NiCE IT Mgmt

Read more about System Center 2025 Unveiled | Migrating SCOM SCORCH SCSM and Beyond

Web Optimization for 2025: Tools & Methods to Boost Performance

Mar 26, 2025 By Catchpoint In Catchpoint

Every second counts. Web performance isn’t just a technical task—it’s a business imperative. Today’s users expect fast, seamless, and reliable digital experiences. In 2025, these expectations have never been higher. In this webinar, you’ll hear from experts on advanced web optimization methods, tools, and strategies to help you enhance performance, deliver exceptional user experiences, and implement continuous optimization to stay ahead in 2025.

View Video

Catchpoint

Monitoring

Read more about Web Optimization for 2025: Tools & Methods to Boost Performance

Global View: Optimizing Every Layer with Innovative New Capabilities for On-Premise

Mar 26, 2025 By Meeta Lalwani In Virtana

Managing a hybrid IT environment is more complex than ever, requiring real-time visibility, automation, and intelligent cost control across cloud and on-premises infrastructure. Virtana’s latest innovations help organizations streamline operations, optimize costs, and enhance security, validating that every layer of IT—whether in the cloud or on-prem—operates at peak efficiency.

Read Post

Virtana

Read more about Global View: Optimizing Every Layer with Innovative New Capabilities for On-Premise

Container Observability: Optimizing Every Layer with Innovative New Capabilities for Kubernetes & Windows

Mar 26, 2025 By David McNerney In Virtana

Managing containerized workloads and Windows environments requires more than just basic monitoring—it demands deep observability to prevent performance bottlenecks, optimize costs, and accelerate troubleshooting. Virtana’s latest Container Observability enhancements provide IT teams with greater control, visibility, and analytics across Kubernetes and Windows-based workloads.

Read Post

Virtana

Read more about Container Observability: Optimizing Every Layer with Innovative New Capabilities for Kubernetes & Windows

Infrastructure Observability: Optimizing Every Layer with Innovative New Capabilities

Mar 26, 2025 By Marc Bachmeier In Virtana

Modern IT environments are complex, spanning on-premises, cloud, and hybrid infrastructures. Without deep observability at every layer, performance bottlenecks, inefficiencies, and troubleshooting challenges can drain resources and impact business outcomes. Virtana’s latest Infrastructure Observability enhancements are designed to eliminate blind spots, automate performance tuning, and simplify IT operations.

Read Post

Virtana

Read more about Infrastructure Observability: Optimizing Every Layer with Innovative New Capabilities

Unlock the Power of Global Dashboard Parameters

Mar 26, 2025 By Winston Bowden In Lumigo

In the world of observability, data is only as valuable as your ability to interact with it. That’s why we’re introducing Global Dashboard Parameters, a new Lumigo feature that empowers you to filter and explore your data across traces, logs, and metrics more dynamically than ever before.

Read Post

Lumigo

Read more about Unlock the Power of Global Dashboard Parameters

Auvik Execs react to earning 5-Star Rating in 2025 CRN Partner Program Guide

Mar 26, 2025 By Auvik In Auvik

We're over the moon that we earned a 5-star rating in this year's @CRN Partner Program Guide! So we asked Auvik execs to share what the rating means to them and what sets Auvik's Partner Program apart... here's what they said!

View Video

Auvik

Read more about Auvik Execs react to earning 5-Star Rating in 2025 CRN Partner Program Guide

Easiest Way to Monitor Your Java Application Using OpenTelemetry

Mar 25, 2025 By Benjamin Pitts In MetricFire

When you're running a Java application, the JVM is doing a ton of work behind the scenes, but unless you're actively collecting its internal metrics, you're essentially flying blind. Fortunately, the JMX Prometheus Receiver paired with the JMX Java Exporter Agent offers one of the simplest and most effective ways to expose JVM performance data.

Read Post

MetricFire

Read more about Easiest Way to Monitor Your Java Application Using OpenTelemetry

New Relic vs DataDog - Features, Pricing, and Performance Compared (2025)

Mar 25, 2025 By Ankit Anand In SigNoz

New Relic vs DataDog: Both tools are popular for application and infrastructure monitoring, offering a wide range of features. This post compares New Relic and DataDog on key aspects like APM, log management, infrastructure monitoring, and OpenTelemetry support. Info: I instrumented a sample Spring Boot Application and sent data to Datadog and New Relic to evaluate my experience. Some takeaways are subjective and based on personal preference.

Read Post

SigNoz

Read more about New Relic vs DataDog - Features, Pricing, and Performance Compared (2025)

7 Open-Source Log Management Tools that You Can Consider in 2025

Mar 25, 2025 By Favour Daniel In SigNoz

Open-source log management tools provide cost-effective, customizable approaches for collecting and analyzing log data. They help teams quickly identify patterns, spot anomalies, and resolve issues. With numerous options available, it's important to understand their strengths and limitations. This article examines the top open-source log management tools in 2025, focusing on their capabilities, performance, and best use cases.

Read Post

SigNoz

Read more about 7 Open-Source Log Management Tools that You Can Consider in 2025

Why your business can't afford to skip website monitoring

Mar 25, 2025 By Mattias Geniar In Oh Dear

Your website is your business’ storefront, sales team, customer service department, and potentially even your primary revenue channel. Just like you’d protect the physical presence of these aspects of your business with a security system, you need to protect the online aspects too. That means keeping an eye on your website with monitoring.

Read Post

Oh Dear

Read more about Why your business can't afford to skip website monitoring

Observability Pipeline: An Easy-to-Follow Guide for Engineers

Mar 25, 2025 By Anjali Udasi In Last9

You've got systems spitting out more logs, metrics, and traces than you can handle. Your monitoring costs are through the roof. And somehow, when something breaks at 3 AM, you still can't find the exact data you need. Sound familiar? Welcome to the observability pipeline conversation—no jargon, no fluff.

Read Post

Last9

Read more about Observability Pipeline: An Easy-to-Follow Guide for Engineers

Zero Code Instrumentation: The Missing Link in Observability

Mar 25, 2025 By Anjali Udasi In Last9

Have you ever struggled with systems that fail to tell you what went wrong? The kind where you’re digging through logs at 2 AM while alerts keep piling up. In DevOps, clear visibility into your applications isn’t a luxury—it’s essential. This is where instrumentation without code changes can help. It simplifies observability, reducing the manual effort needed to track down issues. If you haven’t explored it yet, you might be making troubleshooting harder than it needs to be.

Read Post

Last9

Read more about Zero Code Instrumentation: The Missing Link in Observability

End-to-End Monitoring: Your Guide to System Visibility

Mar 25, 2025 By Faiz Shaikh In Last9

Have you ever dealt with an outage in the middle of the night with no clear cause? Or struggled to understand why your application suddenly slowed down? End-to-end monitoring helps you connect the dots, ensuring you’re not left guessing when things go wrong.

Read Post

Last9

Read more about End-to-End Monitoring: Your Guide to System Visibility

The state of observability in 2025: a deep dive on our third annual Observability Survey

Mar 25, 2025 By Trevor Jones In Grafana

Across companies of all shapes and sizes, observability practices are maturing and getting attention at the highest levels. At the same time, cost and complexity continue to hinder efforts as teams look to emerging tools to help simplify their processes in hopes of better outcomes. With so much in flux, we went into our third annual Observability Survey hoping to get a window into the ways the community is approaching observability and where it wants it to go next.

Read Post

Grafana

Read more about The state of observability in 2025: a deep dive on our third annual Observability Survey

Keeping Compliance Headache-Free: Automating Network Audits for Security and Efficiency

Mar 25, 2025 By ScienceLogic In ScienceLogic

Regulatory compliance is a moving target, and keeping up with evolving security policies and industry regulations can feel like a never-ending battle. Manual network audits? They’re slow, error-prone, and a major time sink. But skipping them isn’t an option—compliance failures can lead to security breaches, hefty fines, and reputational damage. So, how can IT teams ensure they stay ahead without burning out? The answer: automation and real-time observability.

Read Post

ScienceLogic

Read more about Keeping Compliance Headache-Free: Automating Network Audits for Security and Efficiency

Grafana 12, k6 1.0, Mimir 3.0 + One-of-a-Kind Experiences | Here's What's Coming to GrafanaCON 2025

Mar 25, 2025 By Grafana In Grafana

From Grafana 12 and k6 1.0 to our inaugural Science Fair and a "Night at the Museum" reception, the GrafanaCON 2025 agenda includes a lot to be excited about.

View Video

Grafana

Read more about Grafana 12, k6 1.0, Mimir 3.0 + One-of-a-Kind Experiences | Here's What's Coming to GrafanaCON 2025

The Biggest Trends Shaping Observability in 2025: Highlights from Grafana Labs' Observability Survey

Mar 25, 2025 By Grafana In Grafana

The Grafana Labs 3rd annual Observability Survey has landed and we're excited to launch a limited video series that breaks down the findings from over 1200 observability practitioners and leaders around the world. In this video, CTO Tom Wilkie breaks down the 4 biggest trends shaping observability in 2025 across open source, executive buy-in, AI, and cost vs. value. Stay tuned for more video explainers!

View Video

Grafana

Read more about The Biggest Trends Shaping Observability in 2025: Highlights from Grafana Labs' Observability Survey

Cribl Copilot Brings Natural Language Queries to Cribl Search

Mar 25, 2025 By Cribl In Cribl

Join us today as we look at generating Cribl Search queries directly from natural language inputs to reduce errors and accelerate your time to value.

View Video

Cribl

Read more about Cribl Copilot Brings Natural Language Queries to Cribl Search

Connected Devices: Unlocking the next frontier of Internet Performance Monitoring

Mar 25, 2025 By Howard Beader In Catchpoint

While incidents like last year’s CrowdStrike outage tend to dominate headlines, far more often, the real battle for Internet Resilience isn’t fought on a global stage. It’s waged in the shadows of financial districts, within overloaded cloud data centers, or a rural ISP’s overtaxed peering points. Traditional monitoring tools, designed for broad strokes, miss these hyper-specific failures.

Read Post

Catchpoint

Read more about Connected Devices: Unlocking the next frontier of Internet Performance Monitoring

Anomaly Alerts Now in Open Beta: Smarter Monitoring, Fewer False Alarms

Mar 25, 2025 By Rachel Wang In Sentry

A few weeks ago, we introduced anomaly alerts to early adopters. Today, we’re excited to announce that anomaly alerts are officially in open beta and available to all Sentry users on the Trial, Business, or Enterprise plans.

Read Post

Sentry

Read more about Anomaly Alerts Now in Open Beta: Smarter Monitoring, Fewer False Alarms

Understanding observability metrics: Types, golden signals, and best practices

Mar 25, 2025 By Elastic Observability Team In Elastic

Observability metrics provide insights into the performance, behavior, and health of applications, systems, and infrastructure — enabling observability practices, which is how a system’s internal state is understood by examining its data. As organizations continue to collect more and more data, observability metrics are a key telemetry signal for observability.

Read Post

Elastic

Read more about Understanding observability metrics: Types, golden signals, and best practices

How to Set Up Real-Time SMS/WhatsApp Alerts with InfluxDB 3 Processing Engine

Mar 25, 2025 By Suyash Joshi In InfluxData

In Industrial IoT for real-time monitoring, timely alerts are crucial. While Slack and email notifications are common, they can be easily missed or buried in a flood of other notifications. SMS and WhatsApp on the other hand, offer a level of immediacy and directness that’s hard to ignore.

Read Post

InfluxData

Read more about How to Set Up Real-Time SMS/WhatsApp Alerts with InfluxDB 3 Processing Engine

How does a prompt turn into a response?

Mar 25, 2025 By solarwindsinc In SolarWinds

Connect with SolarWinds.

View Video

SolarWinds

Read more about How does a prompt turn into a response?

Kubernetes Cluster Metrics 101 - ManageEngine Site24x7

Mar 25, 2025 By ManageEngine Site24x7 In Site24x7

Managing Kubernetes clusters without visibility is like flying blind. With Site24x7, you get real-time insights into critical Kubernetes metrics to keep your clusters running smoothly.

View Video

Site24x7

Read more about Kubernetes Cluster Metrics 101 - ManageEngine Site24x7

Decoding AI-led event correlation for mastering modern IT management

Mar 25, 2025 By Ramkumar Ramaswamy In Site24x7

"The whole is more than the sum of its parts," said Aristotle. This quote fits the amazing world of modern IT, where several intricate, interwoven, and intensely dynamic ecosystems come together. Today, every component, from applications and microservices to networks and databases, interacts dynamically. To ensure seamless operations, IT teams are expected to decode the language of these interactions: events and incidents.

Read Post

Site24x7

Read more about Decoding AI-led event correlation for mastering modern IT management

Leveraging AI for enhanced network monitoring in healthcare: A guide for CXOs

Mar 25, 2025 By Rama Venkatesan In Site24x7

During emergencies and illnesses, people expect intuitive healthcare services. When multiple tests and reports are involved, patients anticipate that the results will be available to their doctors instantly for quick diagnoses. Waiting for a paper copy of each test result is not feasible.

Read Post

Site24x7

Read more about Leveraging AI for enhanced network monitoring in healthcare: A guide for CXOs

Waterfall Charts - Concepts of Web Performance

Mar 25, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

Read more about Waterfall Charts - Concepts of Web Performance

What is a Branch in Git and How to Use It - Ultimate Guide

Mar 25, 2025 By Vaishnavi In Atatus

Developing a website or software isn't easy. One team of developers will be building a new feature, another will be testing whether that feature works as expected, another might be fixing bugs, and so on. Managing these different versions of the same code base can be tricky. This is where the Git concept of a branch comes in: a branch is a pointer to a snapshot of your changes. When we talk about branches in Git, these are the major questions that come to mind.

Read Post

Atatus

Read more about What is a Branch in Git and How to Use It - Ultimate Guide

A Guide to Logging in React Native

Mar 25, 2025 By Matthew C. In Sentry

Basic console logging is a good starting point for debugging and understanding an app. For larger, more complex apps, it’s helpful to include additional information and persist logs. In this guide, you’ll learn how to create and view logs in React Native and how to create and save custom logs to a file. We’ll focus on JavaScript logs.

Read Post

Sentry

Read more about A Guide to Logging in React Native

How IoT and Dual Dash Cams Keep Drivers in Focus

Mar 25, 2025 By OpsMatters In OpsMatters

Picture this: you're managing a fleet of delivery trucks, and one of your drivers is out on a long haul. You can't ride along to make sure they're driving safely, but what if you could keep an eye on them anyway? That's where IoT and dual dash cams step in. These aren't just regular cameras: they're smart, connected, and built to keep drivers in focus, both literally and figuratively. In today's fast-paced world, where safety and efficiency are everything, these tools are a total game-changer.

Read Post

OpsMatters

Read more about How IoT and Dual Dash Cams Keep Drivers in Focus

Fine Tuning (RAG) or Retrieval Augmented Generation when dealing with multi-domain datasets?

Mar 24, 2025 By Shailesh Manjrekar In Fabrix

In the world of large language models (LLMs), two approaches have dominated how we adapt AI to specific use cases: Retrieval-Augmented Generation (RAG) and Fine-Tuning. But the landscape is rapidly evolving with advanced techniques like MoE, LoRA, and GRPO. Let’s explore how these approaches compare and combine to create more powerful AI systems.

Read Post

Fabrix

Read more about Fine Tuning (RAG) or Retrieval Augmented Generation when dealing with multi-domain datasets?

CLI Tool for Monitoring for Key System Metrics - Here's How It Works!

Mar 24, 2025 By Elliot Langston In MetricFire

At MetricFire, we’re always looking for ways to make monitoring more efficient and accessible. That’s why we’re excited to introduce the MetricFire HG-CLI, our new command-line tool designed to make setting up server monitoring faster and easier than ever. Just like our Hosted Graphite service, the HG-CLI is built on open-source flexibility while focusing on simplicity, eliminating the hassle of manual configurations and streamlining the onboarding process for teams of all sizes.

Read Post

MetricFire

Read more about CLI Tool for Monitoring for Key System Metrics - Here's How It Works!

Revolutionize Product Development with Feedback-Driven Customer Advisory Boards

Mar 24, 2025 By Barbara Janczer In Splunk

In a rapidly evolving business landscape, understanding and responding to customer needs is not just an advantage — it's a necessity. At Splunk, we've taken a bold step by applying a product manager mindset to our Customer Advisory Board (CAB) program, transforming it into a dynamic platform for both customers and our product teams.

Read Post

Splunk

Read more about Revolutionize Product Development with Feedback-Driven Customer Advisory Boards

What Is CDN? The Complete Guide for DevOps Engineers

Mar 24, 2025 By Anjali Udasi In Last9

Has your site ever slowed down under peak traffic? For DevOps teams managing scalability, Content Delivery Networks (CDNs) are a critical component of modern infrastructure. This guide explains what CDNs are, how they work, and why they’re essential for performance and reliability.

Read Post

Last9

Read more about What Is CDN? The Complete Guide for DevOps Engineers

No-Jargon Guide to Application Dependency Mapping

Mar 24, 2025 By Faiz Shaikh In Last9

Your systems are complex—multiple services talking to each other, third-party APIs doing their thing, and databases working overtime. Without a clear map of what's connecting to what, you're flying blind. That's where application dependency mapping comes in.

Read Post

Last9

Read more about No-Jargon Guide to Application Dependency Mapping

An In-Depth Metricbeat Guide for DevOps Teams

Mar 24, 2025 By Preeti Dewani In Last9

Metricbeat is a powerful tool that can transform how you monitor your systems and services. If you're working in DevOps or as an SRE, this guide will help you understand and implement Metricbeat effectively in your environment.

Read Post

Last9

Read more about An In-Depth Metricbeat Guide for DevOps Teams

10+ Best SaaS Monitoring Tools: Ensure Optimal Performance for Your Applications

Mar 24, 2025 By Colin Bartlett In StatusGator

With the SaaS market valued at approximately $250 billion in 2025 and projected to reach $299 billion by the end of the year, businesses are increasingly relying on SaaS monitoring tools to ensure the optimal performance of their cloud applications. Ensuring the availability, security, and performance of these applications is vital to maintaining business continuity. That’s where SaaS monitoring tools come into play.

Read Post

StatusGator

Read more about 10+ Best SaaS Monitoring Tools: Ensure Optimal Performance for Your Applications

Key Differences Between Docker and Kubernetes: A Comprehensive Guide

Mar 24, 2025 By Wendy Howard In eG Innovations

As microservices-based architectures have taken off, Docker and Kubernetes have risen as two leading platforms for container operations. While Docker helped popularize the container model, Kubernetes has evolved into a versatile solution for orchestrating production container workloads at a massive scale. However, their similarities obscure important distinctions in how each approaches container management. This post sheds light on the functional differences between Docker and Kubernetes.

Read Post

eG Innovations

Read more about Key Differences Between Docker and Kubernetes: A Comprehensive Guide

DX Operational Observability and Native Integration of Synthetics: Enable Synthetics for Proactive Issue Identification and Remediation

Mar 24, 2025 By Jörg Mertin In Broadcom

With application synthetic monitoring capabilities, DX Operational Observability (DX O2) can monitor websites and other services by probing the target from various globally distributed monitoring stations. These capabilities, which support SaaS and on-premises deployments, help teams shift from reactive to proactive management, elevate user experience for monitoring, and raise observability to a new level.

Read Post

Broadcom

Read more about DX Operational Observability and Native Integration of Synthetics: Enable Synthetics for Proactive Issue Identification and Remediation

AI for Software Engineering is Just Another "Paradigm Shift"

Mar 24, 2025 By Ken Rimple In Honeycomb

Take a drink if you hate that term. Tim O’Reilly has written a post about the huge shift AI is making with our relationship to software engineering (read: The End of Programming as we Know It). Scary title. But the content is less so. Why?

Read Post

Honeycomb

Read more about AI for Software Engineering is Just Another "Paradigm Shift"

Uptime Monitoring from Sentry

Mar 24, 2025 By Sentry In Sentry

Uptime monitoring is live in Sentry! Uptime monitors are automatically created based on the most common URL for your application, or you can configure them manually. Uptime monitoring uses tracing to get better views into the health of your services. Everyone gets 1 uptime monitor for free; go check them out in the Alerts section in Sentry.

View Video

Sentry

Monitoring

Read more about Uptime Monitoring from Sentry

Debugging Applications With Sentry

Mar 24, 2025 By Sentry In Sentry

Sentry is all about bringing all the context around your application's problems into one place, so you can debug issues faster and get applications back up and running. In this end-to-end demo video, Cody takes you through a common workflow, including Sentry's AI-powered Autofix, Stack Traces, and Session Replays, and dives into Traces and Spans for debugging.

View Video

Sentry

Read more about Debugging Applications With Sentry

How Obkio Works: A Technical Overview of Obkio's Network Performance Monitoring Tool

Mar 24, 2025 By Obkio In Obkio

With its innovative, performance-focused approach, Obkio’s Network Performance Monitoring tool sends up to 95% fewer unnecessary alerts than traditional NPM solutions. In this short video, discover how Obkio’s powerful, easy-to-deploy network monitoring platform helps you quickly diagnose network and application issues for all types of users and networks.

View Video

Obkio

Read more about How Obkio Works: A Technical Overview of Obkio's Network Performance Monitoring Tool

8 Common Zoom Network Issues & How to Fix Them

Mar 24, 2025 By Andrii Kernitskyi In Obkio

Zoom has become a lifeline for remote work, virtual meetings, and online collaboration. But even the best tools can crumble when network performance takes a hit. For remote users, nothing is more frustrating than a Zoom call that freezes, lags, or drops mid-conversation. The truth is, most Zoom performance issues (or AWS issues, because Zoom is supported by AWS) aren’t caused by the platform itself; they’re rooted in network problems.

Read Post

Obkio

Read more about 8 Common Zoom Network Issues & How to Fix Them

Continuous compliance monitoring in dynamic network environments

Mar 23, 2025 By Rama Venkatesan In Site24x7

With hybrid cloud models and multi-cloud infrastructures, network administrators often find that managing compliance requires constant ingenuity that’s as fluid and unpredictable as the technologies they’re using. For CXOs, it’s a ticking time bomb. One wrong turn or a misstep in managing compliance could lead to penalties, legal nightmares, and a reputation that takes years to rebuild. So, the real question is: How do you keep up with the tech landscape and stay compliant?

Read Post

Site24x7

Read more about Continuous compliance monitoring in dynamic network environments

What is Git Checkout Remote Branch? Benefits, Best Practices & More

Mar 23, 2025 By Janani In Atatus

Git is a terrific tool that many developers use to keep track of their projects’ versions. Although there are many different version control systems, Git is by far the most used, and for good reasons: its focus on distributed development and the ease with which branches can be used.

Read Post

Atatus

Read more about What is Git Checkout Remote Branch? Benefits, Best Practices & More

Copy as Markdown

Mar 22, 2025 By Sentry In Sentry

Pro tip: press Cmd+Option+C (or Ctrl+Alt+C) in a Sentry issue to copy it as Markdown, then share the Markdown with others or... paste it into an LLM to ask questions about your issues!

View Video

Sentry

Monitoring

Read more about Copy as Markdown

Deferring Script Execution Until DOM Content Loaded #coding #webdevelopertools #programming

Mar 22, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Read more about Deferring Script Execution Until DOM Content Loaded #coding #webdevelopertools #programming

Sponsored Post

SCOM, PRTG, and Beyond: Navigating the IT Monitoring Landscape

Mar 21, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

This whitepaper highlights the role of IT monitoring in complex environments by exploring SCOM, PRTG, and other leading tools. It provides an in-depth comparison of these monitoring tools, focusing on capabilities, strengths, and limitations. By leveraging insights from various monitoring tools, organizations can optimize performance, enhance system reliability, and streamline operations. This whitepaper aims to guide IT professionals in selecting the most suitable monitoring tool for their specific needs, ensuring proactive management and peak IT infrastructure performance.

Read Post

NiCE IT Mgmt

Read more about SCOM, PRTG, and Beyond: Navigating the IT Monitoring Landscape

Top 5 dashboards for DevOps leaders

Mar 21, 2025 By John Hayes In Squared Up

If you are a DevOps manager you will be keenly aware that the role involves managing multiple toolchains across different clouds, platforms and environments. You also need to report on KPIs, DORA metrics, governance, security and a lot more. At SquaredUp, we understand these demands and have developed a suite of plugins and ready-to-run dashboards to help you reduce toil as well as pull all of your key analytics together within a single pane of glass.

Read Post

Squared Up

Read more about Top 5 dashboards for DevOps leaders

Zendesk outage: A case for proactive monitoring and faster incident response

Mar 21, 2025 By Kshantha Sagar In Catchpoint

On March 20, 2025, starting at 15:43 UTC, Zendesk users globally encountered 503 “Service Unavailable” errors and other 5xx server-side issues, disrupting access to critical support tools and communication channels. While immediate mitigations stabilized core services, intermittent issues continued for over 24 hours, underscoring the complexity of multi-pod infrastructure failures.

Read Post

Catchpoint

Read more about Zendesk outage: A case for proactive monitoring and faster incident response

Introducing status page accuracy ratings

Mar 21, 2025 By Colin Bartlett In StatusGator

At StatusGator, we know that not all status pages are created equal. Some providers promptly acknowledge incidents, while others lag behind or, worse, never report issues at all. To help users quickly assess the reliability of provider status pages, we’re introducing status page accuracy ratings.

Read Post

StatusGator

Read more about Introducing status page accuracy ratings

Why IT Teams Are Switching from SolarWinds to LogicMonitor

Mar 21, 2025 By LogicMonitor In LogicMonitor

On February 7, 2025, SolarWinds announced that they will be acquired by Turn/River for $4.4 billion and go private as soon as Q2 2025. This development has left customers questioning what’s next. Acquisitions often promise innovation, but Turn/River’s track record with similar purchases, like Paessler PRTG, has raised concerns.

Read Post

LogicMonitor

Read more about Why IT Teams Are Switching from SolarWinds to LogicMonitor

Internet Connectivity Plays a Critical Role: Make it a Part of Your Observability Picture

Mar 21, 2025 By Alec Pinkham In Broadcom

In today’s digital age, businesses and customers alike are increasingly reliant on internet connectivity for day-to-day operations, communications, and transactions. Now more than ever, organizations depend on ISPs and cloud providers to deliver critical applications and services, making uninterrupted connectivity essential for success.

Read Post

Broadcom

Read more about Internet Connectivity Plays a Critical Role: Make it a Part of Your Observability Picture

How to Monitor JVM with OpenTelemetry and MetricFire

Mar 21, 2025 By Benjamin Pitts In MetricFire

When you're running a Java application, the JVM is doing a ton of work behind the scenes, but unless you're monitoring those internals, it's hard to know how your app is really performing. JVM metrics give you a window into the heart of the runtime: how much memory you're using, how often garbage collection is kicking in, how many threads are active, and where potential bottlenecks might be hiding.

Read Post

MetricFire

Read more about How to Monitor JVM with OpenTelemetry and MetricFire

Optimizing JavaScript Loading with 'defer' #coding #chromedevtools #programming #webdevelopertools

Mar 21, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Read more about Optimizing JavaScript Loading with 'defer' #coding #chromedevtools #programming #webdevelopertools

Release v2.3: Crash Handling, Extreme Cardinality Protection, Nodes Ephemerality & more

Mar 21, 2025 By Netdata In netdata

The Netdata Team is thrilled to introduce Netdata v2.3, packed with significant enhancements to monitoring reliability and scalability.

View Video

netdata

Read more about Release v2.3: Crash Handling, Extreme Cardinality Protection, Nodes Ephemerality & more

Using Azure Blob Storage for InfluxDB 3 Core and Enterprise

Mar 20, 2025 By Heather Downing In InfluxData

InfluxDB 3 Core and Enterprise introduce a powerful new diskless architecture that lets you store your time series data in cloud object storage while running the database engine locally. This approach offers significant advantages: you get the performance of a local database combined with the durability, scalability, and cost-effectiveness of cloud storage. In this tutorial, I’ll show you how to set up InfluxDB 3 Core or Enterprise with Azure Blob Storage as your object store.

Read Post

InfluxData

Read more about Using Azure Blob Storage for InfluxDB 3 Core and Enterprise

How to use text box variables in Grafana dashboards

Mar 20, 2025 By Grafana In Grafana

Text box variables let users type whatever they want -- great for text filtering and searching! In this video we'll look at how to use text box variables in Grafana dashboards. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more. We also have plans for every use case.

View Video

Grafana

Read more about How to use text box variables in Grafana dashboards

Grafana k6 Studio is Now Generally Available! | Demo | Grafana Labs | Performance Testing

Mar 20, 2025 By Grafana In Grafana

Watch a deep-dive demo of Grafana k6 Studio, an open source desktop application that helps you create k6 test scripts quickly and easily via a visual interface.

View Video

Grafana

Read more about Grafana k6 Studio is Now Generally Available! | Demo | Grafana Labs | Performance Testing

VictoriaMetrics Observability is always on! Join us for our first virtual Meet Up in 2025!

Mar 20, 2025 By VictoriaMetrics In VictoriaMetrics

Preliminary Agenda: Warm up; VictoriaMetrics Roadmap Update (Roman)

View Video

VictoriaMetrics

Read more about VictoriaMetrics Observability is always on! Join us for our first virtual Meet Up in 2025!

How SSL Certificate Monitoring Ensures Brand Trust and Credibility

Mar 20, 2025 By Simon Rodgers In WebSitePulse

See that little padlock icon to the left of our URL in the address bar? That shows the website is protected by an SSL certificate. It's a great way to tell potential customers that your brand is trustworthy. But if you don't keep an eye on the status of your SSL certificates, there can be serious consequences for your website and your reputation. In this post, we'll explore how SSL certificate monitoring works, how it affects brand trust and credibility, and how to do it right.

Read Post

WebSitePulse

Read more about How SSL Certificate Monitoring Ensures Brand Trust and Credibility

Top tips: Shine a spotlight on your shadow IT

Mar 20, 2025 By General In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world today and list ways to explore these trends. This week, we’re going over four ways to minimize shadow IT within your organization. IT is the backbone of every modern enterprise, but managing it effectively requires full visibility into all users, devices, and activity—both inside and outside your infrastructure.

Read Post

ManageEngine

Read more about Top tips: Shine a spotlight on your shadow IT

Seamless Issue Management with AppSignal: How to Quickly Assign, Track, and Resolve Incidents

Mar 20, 2025 By Connor James In AppSignal

When an incident occurs, you need to assign a clear owner for a swift resolution. You can now more easily assign issues, filter by severity, and track their progress in AppSignal — all from one centralized place. In this post, we'll walk through improvements we've made to the assigned issues page to help your team collaborate effectively and improve app performance, one issue at a time.

Read Post

AppSignal

Read more about Seamless Issue Management with AppSignal: How to Quickly Assign, Track, and Resolve Incidents

Tiered Observability: How To Prioritize and Mature Observability Investments

Mar 20, 2025 By Mike Simon In Splunk

You may be surprised that delivering observability is a journey and isn’t about observing everything at once; it’s about driving outcomes like proactive detection, faster troubleshooting, and aligning with business priorities. If you’ve followed this series, you’ve already taken steps to...

Read Post

Splunk

Read more about Tiered Observability: How To Prioritize and Mature Observability Investments

10 top Cisco Meraki monitoring tools

Mar 20, 2025 By Colin Bartlett In StatusGator

As IT infrastructures grow more complex, having the right Meraki monitoring tools is essential for maintaining network health, performance, and security. Cisco Meraki offers cloud-managed solutions, but some organizations need additional monitoring software to gain deeper insights, improve efficiency, and proactively address issues. In this article, we’ll explore the top Meraki monitoring tools that help IT teams manage and optimize their networks effectively.

Read Post

StatusGator

Read more about 10 top Cisco Meraki monitoring tools

Getting started with Zendesk dashboards

Mar 20, 2025 By Dan Watts In Squared Up

Zendesk is one of the most popular customer service platforms, known for its ease of use, robust ticketing system, and powerful automation capabilities. While Zendesk comes with native reporting and dashboards, they can be limited in terms of customization and data correlation across different sources. Additionally, building complex visualizations in Zendesk often requires more advanced knowledge of their reporting tools. This is where SquaredUp comes in!

Read Post

Squared Up

Read more about Getting started with Zendesk dashboards

Building AI-Ready Infrastructure for the Enterprise

Mar 20, 2025 By Kate Guarente-Smith In Honeycomb

Last week marked the inaugural HumanX conference, a convening of leaders, technologists, policy makers, and media, all brought together to discuss the state of AI and its potential impact on the future of software, business, and society.

Read Post

Honeycomb

Read more about Building AI-Ready Infrastructure for the Enterprise

Proactive Monitoring: How Engineers Use CloudWatch to Save Customers Money

Mar 20, 2025 By Benjamin Pitts In MetricFire

At MetricFire, we love talking with engineers about their tech stacks, SRE challenges, and how they approach infrastructure monitoring. Recently, we had a great chat with Yoimer Roman from a Latin American cloud consulting company that helps clients make smarter business decisions by leveraging AWS CloudWatch monitoring. Yoimer wears many hats: mentoring his team on all things AWS, designing custom cloud environments, and bridging the gap between technical challenges and non-technical stakeholders.

Read Post

MetricFire

Read more about Proactive Monitoring: How Engineers Use CloudWatch to Save Customers Money

State of Observability in Communications and Media

Mar 20, 2025 By Splunk In Splunk

We surveyed ITOps and engineering professionals worldwide to learn how communications and media organizations build leading observability practices. In our webinar, “The State of Observability in Communications and Media,” we explore three priorities for today’s organizations, and what it takes to claim your spot on the observability leaderboard. Join us to discuss the implications of insights including...

View Video

Splunk

Read more about State of Observability in Communications and Media

The Reason Loading JavaScript Takes So Long #coding #webdevelopertools #chromedevtools #programming

Mar 20, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Read more about The Reason Loading JavaScript Takes So Long #coding #webdevelopertools #chromedevtools #programming

How to redact secrets from logs with Grafana Alloy and Loki

Mar 20, 2025 By Romain Gaillard In Grafana

In any observability stack, logs are essential for uncovering insights, troubleshooting issues, and ensuring system health. However, managing the security of logged data presents its own challenges, especially when it comes to preventing sensitive information, like API keys and credentials, from slipping into logs. Secrets can originate from a variety of sources, and it’s often challenging to predict which applications or services might inadvertently expose sensitive information.

Read Post

Grafana

Read more about How to redact secrets from logs with Grafana Alloy and Loki

An open source app for easily building performance tests: Grafana k6 Studio is generally available

Mar 20, 2025 By Llandy Riveron Del Risco In Grafana

Here at Grafana Labs, we have an ongoing commitment to providing solutions that increase productivity without sacrificing ease of use. Last year, in line with that effort, we introduced experimental and public preview releases of Grafana k6 Studio, an open source desktop application that helps you create k6 test scripts quickly and easily via a visual interface. Today, we’re excited to share the general availability of k6 Studio v1.0.

Read Post

Grafana

Read more about An open source app for easily building performance tests: Grafana k6 Studio is generally available

Getting started with Azure dashboards

Mar 20, 2025 By Sameer Mhaisekar In Squared Up

Azure is the cloud service provider of choice for a variety of reasons – such as its ease of use, its wide variety of services, the strong community around it and its integration with other Microsoft services. While Azure comes with native data visualization solutions such as dashboards and workbooks, they require a significant amount of Azure knowledge to create and maintain.

Read Post

Squared Up

Read more about Getting started with Azure dashboards

Microlesson: Using Mo Copilot

Mar 20, 2025 By Sumo Logic, Inc. In Sumo Logic

This video demonstrates how to access and use the Sumo Logic Mo Copilot to write log search queries in plain English.

View Video

Sumo Logic

Read more about Microlesson: Using Mo Copilot

Dashboarding your K6 load tests in SquaredUp

Mar 20, 2025 By John Hayes In Squared Up

Load testing is an extremely valuable practice for assessing how your application will actually perform in production. Whether you're expecting a handful of concurrent users or anticipate thousands, it's important to have an idea of the kind of loads that will be placed on your systems and be aware of where bottlenecks or saturation may occur.

Read Post

Squared Up

Read more about Dashboarding your K6 load tests in SquaredUp

Future-Proof Your Oracle Monitoring with NiCE Oracle Management Pack 5.4

Mar 19, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

As IT complexity rises and hybrid cloud adoption accelerates, organizations need trusted, future-ready Oracle monitoring without the overhead.

Read Post

NiCE IT Mgmt

Read more about Future-Proof Your Oracle Monitoring with NiCE Oracle Management Pack 5.4

North Korea Downed by Faulty ROA

Mar 19, 2025 By Doug Madory In Kentik

On March 18, 2025, North Korea’s BGP routes became RPKI-invalid due to the publication of a faulty ROA. In this post, we’ll visualize the impact on the propagation of affected routes and what North Korea (and you too!) can do to avoid problems with the optional maxLength attribute.

Read Post

Kentik

Read more about North Korea Downed by Faulty ROA

How state, local, and education organizations can manage logs flexibly and efficiently using Datadog Observability Pipelines

Mar 19, 2025 By Abe Rosloff In Datadog

State, local, and education (SLED) organizations need their logs to provide clear, structured insights into system performance, user behavior, and security risks. But often, the picture becomes scattered and chaotic instead, with critical log data buried in noise and gaps that make logs difficult to interpret.

Read Post

Datadog

Read more about How state, local, and education organizations can manage logs flexibly and efficiently using Datadog Observability Pipelines

AWS TimeStream for InfluxDB with Read Replica

Mar 19, 2025 By InfluxData In InfluxData

Overview and demo on how to use AWS TimeStream for InfluxDB 2.x with Read Replica.

View Video

InfluxData

Read more about AWS TimeStream for InfluxDB with Read Replica

Your Observability Questions, Answered

Mar 19, 2025 By Anjali Udasi In Last9

Monitoring used to be simple—set up some dashboards, configure alerts, and call it a day. But with microservices and cloud-native systems, things aren’t so straightforward anymore. Keeping track of everything can feel like an endless game of whack-a-mole. That’s where observability comes in. If you’re just getting started or looking to refine your approach, this guide answers the most common (and important) questions.

Read Post

Last9

Read more about Your Observability Questions, Answered

Supply Chain Security: Leveraging NDR to Combat Cyberthreats

Mar 19, 2025 By Filip Cerny In Flowmon

Supply chains are crucial to business operations. It’s essential to verify that the connections required for them to operate don’t provide an opaque pathway for cybercriminals to exploit. This makes supply chain security a critical concern for organizations everywhere. The criminals determined to breach security and establish a persistent presence on networks are increasingly targeting vulnerabilities in supply chains. Through a single entry point, they can compromise multiple organizations.

Read Post

Flowmon

Read more about Supply Chain Security: Leveraging NDR to Combat Cyberthreats

Boosting in-app purchase success rates: Five proven strategies for seamless transactions

Mar 19, 2025 By Sindu Priyadharshini V In Site24x7

In-app purchases (IAPs) are the lifeblood of mobile app monetization, but getting users to complete a transaction isn’t always easy. A slow checkout page, a failed payment request, or even a minor delay in loading the purchase screen can make users abandon their purchase altogether. So, how do you optimize the app conversion rate and ensure that a user has a successful transaction every time?

Read Post

Site24x7

Read more about Boosting in-app purchase success rates: Five proven strategies for seamless transactions

What happens when networks aren't monitored? Key risks and consequences

Mar 19, 2025 By Rama Venkatesan In Site24x7

In today's hyper-disruptive risk climate, most businesses are under-prepared. With cyberattacks threatening organizations every day, even the most experienced risk professionals face growing uncertainty. In this climate, can you really afford not to monitor your networks? Failing to monitor your network isn't just a technical oversight; it's a strategic vulnerability.

Read Post

Site24x7

Read more about What happens when networks aren't monitored? Key risks and consequences

Datadog vs Zabbix - Which Monitoring Tool is Right for You?

Mar 19, 2025 By Pavithra Parthiban In Atatus

When it comes to infrastructure and application monitoring, Datadog and Zabbix are two widely recognized tools, each catering to different needs. While Datadog is a cloud-based observability platform offering end-to-end monitoring, Zabbix is an open-source monitoring solution known for its flexibility in tracking network devices and server performance. But which one should you choose?

Read Post

Atatus

Read more about Datadog vs Zabbix - Which Monitoring Tool is Right for You?

Looking for a PRTG Alternative? Here's Why You Should Consider Icinga

Mar 19, 2025 By Feu Mourek In Icinga

If you’re reading this, chances are high you’re looking for a PRTG alternative and considering switching from Paessler PRTG to Icinga. Maybe it’s the rising costs of PRTG, or maybe you want a monitoring solution that gives you more flexibility and control. Whatever your reason, I want to give you an honest, technical perspective on what that switch entails. I’m not here to tell you PRTG is bad – far from it.

Read Post

Icinga

Read more about Looking for a PRTG Alternative? Here's Why You Should Consider Icinga

Common database performance monitoring pitfalls and how to avoid them

Mar 19, 2025 By Site24x7 In ManageEngine

Databases are fundamental to almost all applications, facilitating everything from financial dealings to social media engagements. Nonetheless, efficient database performance monitoring frequently resembles maneuvering through a labyrinth, with concealed traps that may result in diminished performance or expensive downtime. In this article, we will examine frequent mistakes in database monitoring and offer helpful advice to avoid them.

Read Post

ManageEngine

Read more about Common database performance monitoring pitfalls and how to avoid them

Introducing Coralogix's AI Center: Real-time AI Observability

Mar 19, 2025 By Coralogix In Coralogix

Traditional observability wasn't built for AI. The reason? AI operates in shades of grey, where outcomes are non-deterministic. That's why we built the AI Center, bringing real-time AI observability to thousands of enterprises worldwide. As part of our AI Center, we built an evaluation engine designed to oversee and detect the specific issues that are most common when building AI agents. Teams can choose the evaluators they want to oversee each agent and receive live alerts and reports on specific quality, security, and compliance issues.

View Video

Coralogix

Read more about Introducing Coralogix's AI Center: Real-time AI Observability

Proactive Monitoring: How DinoCloud Uses CloudWatch to Save Clients Money

Mar 19, 2025 By Benjamin Pitts In MetricFire

At MetricFire, we love talking with engineers about their tech stacks, SRE challenges, and how they approach infrastructure monitoring. Recently, we had a great chat with Yoimer Roman from DinoCloud, a Latin American company that helps clients make smarter business decisions by leveraging AWS CloudWatch monitoring. Yoimer wears many hats: mentoring his team on all things AWS, designing custom cloud environments, and bridging the gap between technical challenges and non-technical stakeholders.

Read Post

MetricFire

Read more about Proactive Monitoring: How DinoCloud Uses CloudWatch to Save Clients Money

New In Playwright 1.51 - Can AI Fix Failing Tests With The New Error Prompt?

Mar 19, 2025 By Checkly In Checkly

In this episode, Stefan Judis, Playwright ambassador, explores the new 'Copy as prompt' feature in Playwright 1.51. This feature allows you to copy a pre-filled LLM prompt with all the context of a failing test case. Does this mean that AIs can take over and magically fix all the failing tests? Let's find out!

View Video

Checkly

Read more about New In Playwright 1.51 - Can AI Fix Failing Tests With The New Error Prompt?

Server Monitoring Explained: How to Outwit Downtime Before it Strikes

Mar 19, 2025 By Rebecca Grassing In Auvik

Server monitoring is the practice of continuously tracking server health, performance, and resource usage to catch issues before they cause downtime. When a server crashes, it can mean lost revenue, frustrated users, and a mad scramble to fix the problem. The right server monitoring tool helps your IT team stay ahead by providing real-time alerts and visibility into critical metrics. In this guide, we’ll break down how server monitoring works, why it matters, and what to look for in a solution.

Read Post

Auvik

Read more about Server Monitoring Explained: How to Outwit Downtime Before it Strikes

Optimizing Script Placement for Web Performance

Mar 19, 2025 By Request Metrics In Request Metrics

Master the art of loading JavaScript efficiently in this essential Concepts of Web Performance tutorial with Todd Gardner from Request Metrics. Perfect for entry-level web developers struggling with slow websites, this video breaks down the critical differences between standard blocking scripts, async, and defer attributes that dramatically impact your site's performance. Learn when and why to use each loading technique, understand how JavaScript execution blocks HTML parsing and CSS rendering through clear waterfall and flame chart visualizations, and discover why defer is usually your best option for most scenarios.

View Video

Request Metrics

Read more about Optimizing Script Placement for Web Performance

Modernizing Data Centers for AI: Bridging Observability, Cost Control, and Intelligent Automation

Mar 19, 2025 By LogicMonitor In LogicMonitor

Attend our webinar on April 3 to see our latest innovations live. IT operations are more complex than ever, with modern data centers spanning on-premises, containers, multi-cloud environments, and AI-powered infrastructure. The rapid expansion of data sources has created an overwhelming volume of information, making manual monitoring across multiple tools impractical. Visibility gaps slow down troubleshooting and delay critical decisions, impacting business performance.

Read Post

LogicMonitor

Read more about Modernizing Data Centers for AI: Bridging Observability, Cost Control, and Intelligent Automation

Best Logging Practices: 14 Do's and Don'ts for Better Logging

Mar 19, 2025 By Sematext In Sematext

Ever found yourself drowning in a sea of log data, struggling to make sense of the overwhelming noise? Or perhaps faced a major system breakdown, only to find that your logs didn’t provide the answers you needed, leaving you in the dark? Effective logging is a critical yet often overlooked aspect of software development and operations, highlighting why logging is important – it’s the foundation upon which observability, troubleshooting, and system maintenance are built.

Read Post

Sematext

Read more about Best Logging Practices: 14 Do's and Don'ts for Better Logging

Mastering MySQL connection pooling: Why monitoring matters

Mar 18, 2025 By Grace Nalini In Site24x7

Because you've navigated here, it's clear you know the significance of managing your databases. We all agree that maintaining the speed and responsiveness of our applications depends upon how we manage our database connections. In this blog post, we will focus on MySQL databases. MySQL connection pooling is revolutionary because it speeds up queries, conserves resources, and allows applications to handle high traffic effortlessly.

Read Post

Site24x7

Read more about Mastering MySQL connection pooling: Why monitoring matters

Managing Network Change to Minimize Unnecessary Drama

Mar 18, 2025 By ScienceLogic In ScienceLogic

In today’s fast-paced IT world, keeping your network rock-solid is more crucial than ever. Businesses depend on their networks to keep things running smoothly, but with all the complexity and rapid changes, risks are always lurking around the corner. Nailing network changes is key to cutting downtime, staying compliant, and keeping services up and running. By tapping into automation and smart observability, IT teams can boost efficiency and keep disruptions at bay.

Read Post

ScienceLogic

Read more about Managing Network Change to Minimize Unnecessary Drama

Why observability is crucial for your Kubernetes deployments: A fireside chat with ManageEngine and DevOps Toolkit

Mar 18, 2025 By General In ManageEngine

Kubernetes is at the heart of modern cloud-native applications, but achieving effective observability is no easy feat. Managing workloads, ensuring performance efficiency, and keeping costs under control demand the right strategies and tools. If you’re grappling with Kubernetes complexity, struggling with monitoring blind spots, or seeking to optimize your deployments, we have the perfect event for you.

Read Post

ManageEngine

Read more about Why observability is crucial for your Kubernetes deployments: A fireside chat with ManageEngine and DevOps Toolkit

Grafana Cloud updates: Fleet Management is now GA, a unified app for IRM, and more

Mar 18, 2025 By Kristin Knapp In Grafana

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up of the latest and greatest Grafana Cloud updates. You can also read about all the features we add to Grafana Cloud in our What’s New in Grafana Cloud documentation.

Read Post

Grafana

Read more about Grafana Cloud updates: Fleet Management is now GA, a unified app for IRM, and more

Elasticsearch in the aviation industry: A game-changer for data management

Mar 18, 2025 By Adam La Roche In Elastic

Digital customer experience is no longer a luxury but a necessity for European airlines. It drives customer satisfaction, enhances operational efficiency, and creates a sustainable competitive advantage. As the industry continues to evolve, airlines that prioritise investment in cutting-edge digital technologies and platforms will be better positioned to thrive in a dynamic and demanding market.

Read Post

Elastic

Read more about Elasticsearch in the aviation industry: A game-changer for data management

What is Outbound Packet Loss & How to Detect It

Mar 18, 2025 By Alyssa Lamberti In Obkio

Imagine you're on an important Zoom call, and suddenly, your voice starts cutting out, or your video freezes mid-sentence. Frustrating, right? One of the sneaky culprits behind this issue is outbound packet loss, when data packets leaving your network never make it to their destination. Outbound packet loss can wreak havoc on voice calls, video meetings, online gaming, and cloud apps, making everything feel laggy or unresponsive.

Read Post

Obkio

Read more about What is Outbound Packet Loss & How to Detect It

7 Cisco Meraki alternatives: the best MDM solutions for IT teams

Mar 18, 2025 By Colin Bartlett In StatusGator

Are you searching for a Cisco Meraki alternative? Or perhaps you need a mobile device management (MDM) solution that seamlessly integrates with your IT infrastructure. Whether you’re an IT team or a managed service provider (MSP), choosing the right MDM software is crucial for efficiently managing mobile devices, securing endpoints, and maintaining compliance.

Read Post

StatusGator

Read more about 7 Cisco Meraki alternatives: the best MDM solutions for IT teams

Prometheus Alerting 101: Rules, Recording Rules, and Alertmanager

Mar 18, 2025 By Phuong Le In VictoriaMetrics

This discussion is part of the basic monitoring series, an effort to eliminate confusion in monitoring for both beginners and experienced users.

Read Post

VictoriaMetrics

Read more about Prometheus Alerting 101: Rules, Recording Rules, and Alertmanager

PlayStation, Xbox, Switch, PC, or Mobile - wherever you've got bugs to crush, Sentry can help

Mar 18, 2025 By Sasha Blumenfeld In Sentry

Whether it's a boss fight freeze or a sudden disconnect in multiplayer, crashes break immersion and make your players mad. Debugging these issues across multiple platforms—each with its own error-reporting system—only makes things harder.

Read Post

Sentry

Read more about PlayStation, Xbox, Switch, PC, or Mobile - wherever you've got bugs to crush, Sentry can help

Stackify Retrace Use Cases - Quality Assurance

Mar 18, 2025 By James Michaelis In Stackify

High-tech companies that use their own solutions project confidence to their customers that those solutions truly work. Many teams across Stackify use Retrace internally, and my time in customer support gave me great insights into how our customers relied on Retrace to ensure applications consistently delivered a great user experience.

Read Post

Stackify

Read more about Stackify Retrace Use Cases - Quality Assurance

What is Internet Stack Map?

Mar 18, 2025 By Catchpoint In Catchpoint

To understand, optimize and ensure application reliability, you must look beyond just the code and the cloud. Internet Performance Monitoring gives you visibility into the Internet stack, from DNS latency to ISP performance to API response times. Catchpoint Internet Stack Map is the world's first live visual dashboard, providing true end-to-end monitoring for everything impacting applications and user experience.

View Video

Catchpoint

Monitoring

Read more about What is Internet Stack Map?

Introducing Catchpoint Internet Stack Map

Mar 18, 2025 By Catchpoint In Catchpoint

Discover how Catchpoint’s Internet Stack Map can transform your digital service monitoring!

View Video

Catchpoint

Monitoring

Read more about Introducing Catchpoint Internet Stack Map

Two separate research teams. Two different reports. Same conclusions?

Mar 18, 2025 By Catchpoint In Catchpoint

Watch to see the similarities between the DORA and SRE Report: https://catch.pt/43Vfgps

View Video

Catchpoint

Monitoring

Read more about Two separate research teams. Two different reports. Same conclusions?

Lightrun Named to Fast Company's Annual List of the World's Most Innovative Companies of 2025

Mar 18, 2025 By Eran Kinsbruner In Lightrun

(March 18, 2025) — Lightrun is proud to have been named to Fast Company’s prestigious list of the World’s Most Innovative Companies of 2025. This year’s list shines a spotlight on businesses that are shaping industry and culture through their innovations to set new standards and achieve remarkable milestones in all sectors of the economy. Alongside the World’s 50 Most Innovative Companies, Fast Company recognizes 609 organizations across 58 sectors and regions.

Read Post

Lightrun

Read more about Lightrun Named to Fast Company's Annual List of the World's Most Innovative Companies of 2025

Log File Analysis: A Guide for DevOps Engineers

Mar 18, 2025 By Faiz Shaikh In Last9

Ever found yourself buried in endless log files, trying to piece together what went wrong? For DevOps engineers, log analysis isn’t just about debugging—it’s a crucial skill for maintaining reliable systems and catching issues before they escalate. In this guide, we’ll cover everything you need to know about log file analysis, from the fundamentals to the best tools available today.

Read Post

Last9

Read more about Log File Analysis: A Guide for DevOps Engineers

OpenTelemetry Backends: A Practical Implementation Guide

Mar 18, 2025 By Prathamesh Sonpatki In Last9

If you’ve ever found yourself sifting through logs, metrics, and traces without a clear answer to why your app crashed at 2 AM, you’re not alone. Troubleshooting without the right tools can feel like chasing shadows. That’s where the right OpenTelemetry backend makes all the difference—bringing everything together and turning scattered data into a clear picture.

Read Post

Last9

Read more about OpenTelemetry Backends: A Practical Implementation Guide

Website Logging: Everything You Need to Get Started

Mar 18, 2025 By Anjali Udasi In Last9

If you're new to DevOps, you’ve likely noticed that website logging plays a bigger role than it seems at first. It’s not just a routine task—it’s how you keep systems stable, troubleshoot issues, and understand what’s happening under the hood. A good logging setup captures what went wrong, when, and why—helping you fix problems faster instead of guessing.

Read Post

Last9

Read more about Website Logging: Everything You Need to Get Started

Godot Updates

Mar 18, 2025 By Sentry In Sentry

Having trouble with bugs in your Godot game? Sentry's Godot SDK helps you track down crash reports, stack traces, and runtime errors. In this video, Stefan will show you how the SDK works in practice. You'll see how it helps whether you're working with GDScript or C#, providing a place to see your errors, so you can fix them and keep your players happy.

View Video

Sentry

Monitoring

Read more about Godot Updates

The challenge with prompt engineering

Mar 18, 2025 By SolarWinds In SolarWinds

Connect with SolarWinds.

View Video

SolarWinds

Read more about The challenge with prompt engineering

Proactive monitoring pays. (Here's the proof.)

Mar 18, 2025 By Mia Martello In Martello Technologies

We’ve always known the proactive monitoring and advanced analytics provided by Martello’s Vantage DX can save organizations time and money while getting more from their investments in Microsoft Teams. We recently set out to prove that by building a research-based cost model with the help of our friends, the expert consultants at Enable UC. The results of that study were even more compelling than we expected.

Read Post

Martello Technologies

Read more about Proactive monitoring pays. (Here's the proof.)

AWS ALB vs ELB: Which load balancer is right for you?

Mar 18, 2025 By David Girvin In Sumo Logic

Load balancers play a key role in Amazon Web Services (AWS) systems by maintaining traffic distribution, detecting server issues, and redirecting client requests to available servers without any downtime. But choosing the right AWS load balancer can be daunting, as it’s essential for optimizing your application performance and scalability. Depending on your use case, you may find that an Elastic Load Balancer (ELB) or Application Load Balancer (ALB) better suits your needs.

Read Post

Sumo Logic

Read more about AWS ALB vs ELB: Which load balancer is right for you?

Loading JavaScript on your Website - Concepts of Web Performance

Mar 18, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Read more about Loading JavaScript on your Website - Concepts of Web Performance

7 SaaS Compliance Pitfalls and How Proactive Managed IT Can Prevent Them

Mar 18, 2025 By Ben Botti In Auvik

MSPs and IT teams are trusted to maintain the security and compliance of sensitive data while also being on the hook for end-user experience. In and of itself, this is a tricky balancing act. When you add SaaS to the mix, compliance gets even more complex. SaaS compliance with regulations like GDPR, HIPAA, and CCPA is more complicated than traditional “castle and moat” style on-prem networks, where data resides squarely within an organization’s control.

Read Post

Auvik

Read more about 7 SaaS Compliance Pitfalls and How Proactive Managed IT Can Prevent Them

Less War, More Room: Breaking Down Operational Silos

Mar 18, 2025 By Last9 - High Cardinality Monitoring In Last9

Our Dev Evangelist, Prathamesh Sonpatki, shared insights on alert fatigue at a @ClickHouseDB meetup—sparking great conversations on observability, OpenTelemetry, and the need for a telemetry data platform.

View Video

Last9

Read more about Less War, More Room: Breaking Down Operational Silos

Top 7 Real User Monitoring (RUM) Tools and Software for Better User Experience

Mar 18, 2025 By Janani In Atatus

As a software-based company, the most critical thing you can do is maintain control over your users' digital experiences and satisfaction levels. However, this is impossible without a monitoring plan and technologies that allow you to see how customers interact with your application or website from their perspective. These tools provide you with the information you need to determine how well your web app or website is operating and to avoid slow pages or screens that drive customers to your competitors.

Read Post

Atatus

Read more about Top 7 Real User Monitoring (RUM) Tools and Software for Better User Experience

The latest in Kubernetes Monitoring: new features to track persistent storage, simplify alerting, and more

Mar 17, 2025 By Kristin Knapp In Grafana

Monitoring is an essential part of any Kubernetes deployment, helping organizations optimize cluster health, streamline troubleshooting, and control their costs. In Grafana Cloud, we offer all these capabilities (and more) in our out-of-the-box Kubernetes Monitoring solution. Since introducing Kubernetes Monitoring in 2022, we’ve been steadily adding new features, improving the UI, and making it even easier to gain insights into the state of your Kubernetes fleet.

Read Post

Grafana

Read more about The latest in Kubernetes Monitoring: new features to track persistent storage, simplify alerting, and more

Achieving Business Continuity with Managed IT Services and Cloud Security Solutions

Mar 17, 2025 By Arpit Sharma In Motadata

The digital world is evolving rapidly, and businesses must always stay up and running. Any disruption—from cyberattacks, hardware failures, or natural disasters—can cause financial losses and harm a company’s reputation. This is why business continuity is essential. Managed IT services and cloud security solutions help businesses stay operational even during unexpected events.

Read Post

Motadata

Read more about Achieving Business Continuity with Managed IT Services and Cloud Security Solutions

So, What's the Difference Between Observability and Monitoring?

Mar 17, 2025 By Martin Thwaites In Honeycomb

Observability and monitoring are not about gathering different data; they differ in their purpose but share the same data. Monitoring is focused on notification based on predefined questions, whether that’s through dashboards people watch or push-based alerts to notification systems like SMS or purpose-built platforms like PagerDuty.

Read Post

Honeycomb

Read more about So, What's the Difference Between Observability and Monitoring?

The AI Revolution is Here - Are You Ready for the Hidden Threats?

Mar 17, 2025 By Teneo In Teneo

In a recent webinar, Gartner unveiled its Top 10 Strategic Technology Trends for 2025*, which all focus on the concept of ‘Responsible Innovation’. They break this down across three pivotal themes: AI Imperatives and Risks, New Frontiers of Computing, and Human-Machine Synergy.

Read Post

Teneo

Read more about The AI Revolution is Here - Are You Ready for the Hidden Threats?

The Rise of BYOAI: How Shadow AI is Reshaping the Workplace and the Security Risks You Can't Ignore

Mar 17, 2025 By Teneo In Teneo

The Tech Show 2025, held on March 12-13, was a testament to the rapid integration of artificial intelligence (AI) across various vendors. A significant number of companies showcased their latest AI advancements, underscoring the technology’s pivotal role in shaping the future. From startups to established tech giants, exhibitors demonstrated AI’s transformative potential.

Read Post

Teneo

Read more about The Rise of BYOAI: How Shadow AI is Reshaping the Workplace and the Security Risks You Can't Ignore

Best Remote Support Software: Top Tools, Features, and Comparisons

Mar 17, 2025 By Staff Contributor In SolarWinds

The best remote support tool is secure, user-friendly, and provides five-star customer support. IT professionals seeking software for their organization should consider pricing and licensing restrictions, compatibility with their existing infrastructure, and compliance with industry regulations. As remote work continues to rise, we expect to see the use of remote support programs expand beyond IT help desks and customer support teams.

Read Post

SolarWinds

Read more about Best Remote Support Software: Top Tools, Features, and Comparisons

InfluxDB 3 Core and Enterprise Are Now in Beta

Mar 17, 2025 By Paul Dix In InfluxData

Today we’re excited to announce that InfluxDB 3 Core, our new open source product licensed under MIT/Apache 2, and InfluxDB 3 Enterprise are now in beta. InfluxDB 3 Core is a high-speed, recent-data engine that collects and processes data in real-time, while persisting it to local disk or object storage. InfluxDB 3 Enterprise is a commercial product that builds on Core’s foundation, adding high availability, read replicas, enhanced security, and data compaction for faster queries.

Read Post

InfluxData

Read more about InfluxDB 3 Core and Enterprise Are Now in Beta

Python Logging Format: Best Practices for Monitoring and Troubleshooting

Mar 17, 2025 By Wendy Howard In eG Innovations

Effective logging is essential for any Python application, especially those powering critical backend services. Logs capture diagnostic information about a system’s performance and behavior, enabling better observability and uninterrupted monitoring—both critical as distributed systems grow in complexity. Luckily, Python’s built-in logging module streamlines log management with customizable formats that enhance readability.

Read Post

eG Innovations

Read more about Python Logging Format: Best Practices for Monitoring and Troubleshooting

AI-Powered IT Resilience: Faster Recovery, Lower Costs

Mar 17, 2025 By Raja Shekar Mulpuri In HEAL Software

According to industry benchmarks, unplanned downtime costs enterprises an average of $5,600 per minute. For industries like fintech, e-commerce, and SaaS, where customer experience is a competitive differentiator, prolonged outages translate into customer churn, SLA penalties, and reputational damage.

Read Post

HEAL Software

Read more about AI-Powered IT Resilience: Faster Recovery, Lower Costs

5 Actions you can take to improve digital performance

Mar 17, 2025 By Leo Vasiliou In Catchpoint

Slow is officially the new down. That’s a major finding of The SRE Report 2025, with 53% of study respondents agreeing with this expression, and 44% stating “performance should be tracked against a service-level objective.”

Read Post

Catchpoint

Read more about 5 Actions you can take to improve digital performance

Full-Stack Observability: What It Is [Minus the Fluff]

Mar 17, 2025 By Anjali Udasi In Last9

You've heard the term thrown around in meetups and Slack channels, but what exactly is full-stack observability? Simply put, you can see, understand, and quickly act on everything happening across your entire tech stack—from frontend user interactions to backend services, cloud infrastructure, and third-party integrations. Full-stack observability isn't just another tech buzzword. It's the difference between being blindsided by outages and catching issues before your users tweet about them.

Read Post

Last9

Read more about Full-Stack Observability: What It Is [Minus the Fluff]

Distributed Tracing: An Advanced Guide for DevOps & SREs

Mar 17, 2025 By Anjali Udasi In Last9

In the microservices world, tracking down performance issues feels like solving a mystery with pieces scattered across dozens of systems. When users report slowness, your team needs answers fast, not hours of guesswork. Distributed tracing has emerged as the solution, but implementing it effectively requires more than just understanding the basics. This guide takes you beyond the fundamentals to show you how DevOps teams and SREs can build truly effective tracing strategies.

Read Post

Last9

Read more about Distributed Tracing: An Advanced Guide for DevOps & SREs

#InfluxDB 3 Open Source in Beta!

Mar 17, 2025 By InfluxData In InfluxData

InfluxData PM Peter Barnett breaks down the key improvements since alpha and what’s next on the road to GA. InfluxDB 3 Core: A high-speed, open source recent-data engine (MIT/Apache 2) for real-time data collection, processing, and storage. InfluxDB 3 Enterprise: Built on Core, with high availability, read replicas, enhanced security, and a free tier for at-home use.

View Video

InfluxData

Read more about #InfluxDB 3 Open Source in Beta!

Modernizing Government IT: Observability, Security & Cost Optimization with Datadog

Mar 17, 2025 By Datadog In Datadog

Government IT leaders face the monumental challenge of modernizing aging systems, migrating to the cloud, and enhancing citizen services—all while ensuring security, compliance, and cost efficiency. Siloed tools and limited visibility create roadblocks to achieving these goals. Datadog’s FedRAMP-authorized platform provides full-stack observability, AI-powered security, and cloud cost optimization, helping agencies simplify complexity, strengthen Zero Trust security, and maximize IT budgets.

View Video

Datadog

Read more about Modernizing Government IT: Observability, Security & Cost Optimization with Datadog

Updates to the Sentry Unreal Engine SDK

Mar 17, 2025 By Sentry In Sentry

Sentry's Unreal Engine SDK has gotten an uplift! We've added support for distributed tracing and made Unreal's Crash Reporter for desktop optional. Teams can now automatically send crashes and errors to Sentry, along with breadcrumbs, event filters, release health monitoring and more. Cody takes us through how to get started using the Unreal Engine SDK, and how you can use it to see crashes and errors, track down performance issues, and even get screenshots of what users were seeing right before their game crashed.

View Video

Sentry

Read more about Updates to the Sentry Unreal Engine SDK

What Is a Network Outage? Causes, Symptoms, Detection, and How to Fix It

Mar 17, 2025 By Andrii Kernitskyi In Obkio

If you’ve ever found yourself asking questions like: Why is my Internet acting weird? What is going on with the Wi-Fi? Is the network down for anyone else? Is everything down? Why is there weird behaviour with Teams and Outlook? When there is a network outage, what EXACTLY does that mean? How to troubleshoot/diagnose cause of Internet outages? How to tell if Internet outage is ISP or issues with my network? Why do I have intermittent Network Outages consistently lasting 30 seconds?

Read Post

Obkio

Read more about What Is a Network Outage? Causes, Symptoms, Detection, and How to Fix It

systemctl: The Complete Guide to Managing Linux Services

Mar 17, 2025 By Prathamesh Sonpatki In Last9

Ever found yourself staring at your terminal, wondering why a service won’t start? systemctl is the backbone of modern Linux service management, but if you’re new to it, it can feel overwhelming. This guide breaks it down—covering essential commands and advanced techniques in a clear, practical way. No unnecessary jargon, just the know-how you need to manage services with confidence.

Read Post

Last9

Read more about systemctl: The Complete Guide to Managing Linux Services

Syslog Servers Explained: How They Help with Logging

Mar 17, 2025 By Preeti Dewani In Last9

Your team lead just dropped, "We need to set up a syslog server," and now you're wondering what you've signed up for. Syslog servers aren’t just another checkbox in your infrastructure; they’re the quiet workhorses that keep logs organized and accessible. When things go wrong, they help you connect the dots faster. Imagine this: It’s 3 AM, and alerts are flooding in. Your authentication service is failing, but the logs on that server show nothing unusual.

Read Post

Last9

Read more about Syslog Servers Explained: How They Help with Logging

Understanding the Chrome DevTools Timeline

Mar 17, 2025 By Request Metrics In Request Metrics

Learn how to decode flame charts in this essential Concepts of Web Performance tutorial with Todd Gardner from Request Metrics. Perfect for entry-level web developers, this quick guide demystifies the intimidating flame charts found in Chrome DevTools that visualize your browser's main thread activity. Discover how to identify performance bottlenecks by understanding the color-coding system—gray for browser tasks, blue for HTML parsing, purple for layout and paint operations, dark yellow for script compilation, and light yellow for JavaScript execution.

View Video

Request Metrics

Monitoring

Read more about Understanding the Chrome DevTools Timeline

Is There Such a Thing as Good Friction in UX?

Mar 17, 2025 By Germain UX Team In Germain UX

If you’ve ever worked on a digital product—or just used one—you’ve probably heard this advice a million times: reduce friction. Make things fast. Make them seamless. Remove anything that slows users down. That’s solid advice. No one wants to fill out a form with 20 fields just to sign up for an app. Nobody enjoys a checkout process that feels like solving a puzzle. But here’s the thing: sometimes friction is actually a good thing.

Read Post

Germain UX

Read more about Is There Such a Thing as Good Friction in UX?

Sponsored Post

Some of the open source standards used with AI agents or agentic frameworks

Mar 16, 2025 By Shailesh Manjrekar In Fabrix

Several open-source standards have emerged for AI agents and agentic frameworks, aiming to improve interoperability and standardization in the rapidly evolving field of AI.

Read Post

Fabrix

Read more about Some of the open source standards used with AI agents or agentic frameworks

How We Enabled Loading a Million Spans in SigNoz Trace Details Page

Mar 16, 2025 By Vikrant Gupta In SigNoz

We recently launched a feature in our launch week that got a lot of attention - loading and visualizing even a million spans in our trace detail page. This sparked curiosity among users and developers, leading many to ask: How did we do it? The motivation behind building this feature was clear—our users needed this capability. It unlocks new debugging workflows, making it easier to analyze massive traces efficiently. Below is our revamped trace details page. Each line represents a span.

Read Post

SigNoz

Read more about How We Enabled Loading a Million Spans in SigNoz Trace Details Page

How digital experience monitoring (DEM) tools improve both customer and employee journeys

Mar 16, 2025 By Bela Susan Thomas In Site24x7

Outstanding digital experiences are becoming a basic requirement in today's digital economy rather than a distinction. From initial discovery to post-purchase assistance, customers demand smooth, personalized journeys that fulfil their expectations and flow naturally via each touchpoint. Employees need the tools and information to support these experiences effectively.

Read Post

Site24x7

Read more about How digital experience monitoring (DEM) tools improve both customer and employee journeys

AI in server monitoring

Mar 16, 2025 By Geoffrin Edwin In Site24x7

AI is what automation used to be: the latest problem-solver. Organizations have rallied their teams to integrate AI into their workflows to quadruple the efficiency quotient, and it's already started to yield results. As organizations increasingly rely on complex server ecosystems, traditional monitoring methods often struggle to keep pace with the volume and complexity of data generated. AI can be a star player here.

Read Post

Site24x7

Read more about AI in server monitoring

New Browser APIs for Detecting Javascript Performance Issues in the Production

Mar 16, 2025 By Janani In Atatus

Users nowadays demand the greatest possible experience, which implies top-notch performance. Smooth scrolling, prompt interaction responses, a fast page load time, and flawless animations are all things they anticipate. Local profiling to identify performance issues is convenient, but it only provides a limited amount of information. While things may run smoothly on our high-end developer machines, the user may be dealing with poor hardware and a bad experience.

Read Post

Atatus

Read more about New Browser APIs for Detecting Javascript Performance Issues in the Production

Understanding JavaScript Performance with Flame Charts

Mar 15, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Read more about Understanding JavaScript Performance with Flame Charts

DeepSeek's GRPO is the biggest breakthrough since transformers

Mar 14, 2025 By Shailesh Manjrekar In Fabrix

GRPO is a new reinforcement learning technique that replaces traditional methods like Proximal Policy Optimization (PPO). DeepSeek’s Group Relative Policy Optimization (GRPO) represents a paradigm shift in reinforcement learning (RL) for large language models, addressing key limitations of PPO through innovative simplifications and efficiency gains. Here’s why GRPO stands out.

Read Post

Fabrix

Read more about DeepSeek's GRPO is the biggest breakthrough since transformers

When to choose RAG or LoRA for training?

Mar 14, 2025 By Shailesh Manjrekar In Fabrix

When choosing between Retrieval-Augmented Generation (RAG) and Low-Rank Adaptation (LoRA) for model training, the decision hinges on your specific use case, resource constraints, and performance requirements. Here’s a structured comparison to guide your selection.

Read Post

Fabrix

Read more about When to choose RAG or LoRA for training?

3CX VoIP Call Detail Records In Graylog

Mar 14, 2025 By Jeff Darrington In Graylog

Even with the rise of high-speed networks and sophisticated monitoring tools, VoIP Call Data Records (CDR) remain an essential resource for troubleshooting and optimizing bandwidth usage. These records provide a granular view of call quality, latency, jitter, and packet loss—critical factors that directly impact voice performance.

Read Post

Graylog

Read more about 3CX VoIP Call Detail Records In Graylog

Identifying and fixing deadlocks in Java

Mar 14, 2025 By Kirubanandan RA In Site24x7

A deadlock occurs when two or more threads are blocked indefinitely, each waiting for a resource held by another. In other words, Thread A is waiting for a resource held by Thread B, while Thread B is also waiting for a resource held by Thread A. This creates a loop of blocking, causing the application to become unresponsive.

Read Post

Site24x7

Read more about Identifying and fixing deadlocks in Java

You Can See Who Is Spiking Your Cloud Costs

Mar 14, 2025 By Honeycomb In Honeycomb

With Honeycomb, you can use traces and wide attributes to find out who is spiking your cloud costs. Simply add attributes to your Lambdas and other serverless traces, use triggers to alert your team, and run Honeycomb Queries. This short video shows you a simple query flow.

View Video

Honeycomb

Read more about You Can See Who Is Spiking Your Cloud Costs

Fix IT Incidents Faster with AI | Meet Edwin AI: The First Agentic AI for ITOps

Mar 14, 2025 By LogicMonitor In LogicMonitor

Tired of drowning in IT alerts? Struggling to find the root cause of incidents? Edwin AI is here to help. Edwin AI is the first agentic AI built for IT teams, designed to cut through the noise, speed up resolutions, and prevent outages. It cuts alert noise by 90% (less clutter, more focus), fixes issues 60% faster (AI-powered insights and recommendations), and boosts team productivity by 20% (automates tasks and escalations).

View Video

LogicMonitor

Read more about Fix IT Incidents Faster with AI | Meet Edwin AI: The First Agentic AI for ITOps

Best practices for managing Datadog organizations at scale

Mar 14, 2025 By Santiago Gómez Sáez In Datadog

The adoption of Datadog in large enterprises typically goes beyond integrating metrics, traces, and logs to unify observability. These enterprises must implement and use Datadog in a compliant and standard way across divisions, teams, and projects to enhance data security, comply with regulations, manage costs, and increase operational efficiency.

Read Post

Datadog

Read more about Best practices for managing Datadog organizations at scale

A Simple HTML Document in a Flame Chart

Mar 14, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

Read more about A Simple HTML Document in a Flame Chart

Optimizing AWS NAT Gateway Usage

Mar 14, 2025 By Kentik In Kentik

AWS NAT gateways are essential but costly—especially when they're underutilized or overused. In this Kentik walkthrough, we'll show you how to quickly identify unnecessary NAT gateway expenses and optimize your cloud infrastructure spending. Learn to analyze traffic patterns, pinpoint problematic gateways, and achieve cost-effective network visibility using Kentik's Data Explorer.

View Video

Kentik

Read more about Optimizing AWS NAT Gateway Usage

3 Popular Methods to Shut Down or Reboot a Remote Computer

Mar 14, 2025 By Staff Member In SolarWinds

Managing IT systems in interconnected environments often requires shutting down or rebooting remote computers for several reasons. For instance, you might want to reboot the computer to troubleshoot errors and address software updates. Or you might shut it down as part of your security protocols. In this post, you’ll learn three popular methods for rebooting or shutting down remote computers. We’ll also cover some additional considerations, including potential issues and how to solve them.

Read Post

SolarWinds

Read more about 3 Popular Methods to Shut Down or Reboot a Remote Computer

Edwin AI kicks off a new era of ITOps, powered by LogicMonitor and OpenAI

Mar 14, 2025 By LogicMonitor In LogicMonitor

I know you’ve been there: a critical system goes down, and suddenly, you’re in a war room, staring at a blizzard of alerts, conflicting logs, and a dozen theories pointing in different directions. Time slips by as you sift through fragmented data, chasing symptoms instead of solutions. Hours of digging later, all you have are more questions and a cup of lukewarm coffee. This isn’t just frustrating—it’s draining.

Read Post

LogicMonitor

Read more about Edwin AI kicks off a new era of ITOps, powered by LogicMonitor and OpenAI

We Built a CLI Tool for Hosted Graphite - Here's How It Works!

Mar 14, 2025 By MetricFire In MetricFire

We'll walk through simple installation and setup, sending and querying metrics in seconds, and seamless integration with your monitoring stack. Ready to try it out? Install the Hosted Graphite CLI now, give it a spin, and let us know what you think in the comments!

View Video

MetricFire

Read more about We Built a CLI Tool for Hosted Graphite - Here's How It Works!

Teaching AI to Speak Nexthink Query Language: Lessons from Nexthink Assist

Mar 13, 2025 By Mohamed Kafsi In Nexthink

In today's fast-paced IT environments, managing the Digital Employee Experience (DEX) shouldn't require mastering query languages or wading through endless data. IT teams need immediate answers, not more complexity. That’s why we have built Nexthink Assist, our AI-powered virtual assistant in Nexthink Infinity. By leveraging the power of Generative AI (GenAI) and Large Language Models (LLMs), Assist transforms the way organizations manage their DEX.

Read Post

Nexthink

Read more about Teaching AI to Speak Nexthink Query Language: Lessons from Nexthink Assist

No Jitter Webinar: Move Beyond Reactive Fixes with Proactive Microsoft Teams Monitoring

Mar 13, 2025 By Mia Martello In Martello Technologies

In today’s hybrid work environment, Microsoft Teams has become the backbone of business communication. But as organizations rely more on Teams and Teams Phone, unexpected performance issues can lead to costly downtime, frustrated employees, and disrupted workflows. Traditional reactive troubleshooting is no longer enough—businesses need a proactive approach to ensure uninterrupted collaboration.

Read Post

Martello Technologies

Read more about No Jitter Webinar: Move Beyond Reactive Fixes with Proactive Microsoft Teams Monitoring

What Is a Status Page Aggregator?

Mar 13, 2025 By Nuno Tomas In isDown

Businesses today rely on multiple cloud services to manage their operations. Whether it's hosted services like AWS, customer relationship tools like Salesforce, or marketing platforms like HubSpot, these services play a crucial role in day-to-day business functions. However, businesses can suffer significant disruptions when a third-party service experiences downtime. A single outage in a critical service can halt operations, causing frustration for both employees and customers.

Read Post

isDown

Read more about What Is a Status Page Aggregator?

A Practical Guide to the OpenTelemetry Java Agent

Mar 13, 2025 By Prathamesh Sonpatki In Last9

Ever felt like you're missing crucial insights into your Java applications? The OpenTelemetry Java Agent changes that game completely. This comprehensive guide takes you beyond the basics, showing you not just how to implement it, but how to master it for maximum observability.

Read Post

Last9

Read more about A Practical Guide to the OpenTelemetry Java Agent

The Complete Guide to Monitoring Container CPU Usage

Mar 13, 2025 By Anjali Udasi In Last9

Have you ever opened your Kubernetes dashboard and wondered why your app seems to slow down? As containers multiply rapidly, keeping track of CPU usage becomes a must. Let’s break it down by focusing on one key metric: container_cpu_usage_seconds_total.

Read Post

Last9

Read more about The Complete Guide to Monitoring Container CPU Usage

How to Set Up Logging in Node.js (Without Overthinking It)

Mar 13, 2025 By Preeti Dewani In Last9

Logging in Node.js might not be the most exciting part of development, but it’s one of the most important. Whether you're troubleshooting bugs or keeping track of how your app is running, good logs make life easier. Let’s break down how to set up logging the right way.

Read Post

Last9

Read more about How to Set Up Logging in Node.js (Without Overthinking It)

LLMs Are Weird Computers

Mar 13, 2025 By Phillip Carter In Honeycomb

I’ve increasingly changed my perspective on LLMs and modern AI systems over the past few years: Let me elaborate on why I believe this now.

Read Post

Honeycomb

Read more about LLMs Are Weird Computers

Escaping the technical debt black hole with APM

Mar 13, 2025 By Site24x7 In ManageEngine

Technical debt accumulates when short-term solutions lead to long-term software inefficiencies, increasing maintenance costs, slowing development, and degrading performance. To effectively manage technical debt, teams need full-stack observability, from a high-level application view down to code execution and thread-level analysis. Tackling technical debt ensures long-term software sustainability.

Read Post

ManageEngine

Read more about Escaping the technical debt black hole with APM

Combine Fixtures & Page Object Models for DRYer Test Code in Playwright

Mar 13, 2025 By Nočnica Mellifera In Checkly

If you're using Playwright for end-to-end testing or synthetic monitoring with Checkly, you've likely considered reusing your test code across different test cases. A common approach for this is using Page Object Models (POMs). However, if you're like me, you might have mixed feelings about POMs—while they help organize your code, they can sometimes feel cumbersome to set up and maintain.

Read Post

Checkly

Read more about Combine Fixtures & Page Object Models for DRYer Test Code in Playwright

How we responded to a 2+ hour partial outage in Grafana Cloud

Mar 13, 2025 By Mick Gregg In Grafana

On Tuesday, Feb. 18, 2025, we experienced an outage that lasted approximately 150 minutes and impacted roughly 25% of our Grafana Cloud services. To our customers: we are very sorry and more than a little embarrassed that we stepped outside our own processes and advice to cause this. You rely on us to help monitor and troubleshoot your environments, and this type of incident obviously makes it harder for you to do that.

Read Post

Grafana

Read more about How we responded to a 2+ hour partial outage in Grafana Cloud

Why Monitoring iManage is Critical for Enhancing End-User Experience in Legal Firms

Mar 13, 2025 By Teneo In Teneo

As a Performance Field Technical Consultant working with customers in the legal industry, my primary focus is to ensure that technology enhances productivity rather than hinders it. Legal professionals rely on iManage as a business-critical application for document management, collaboration, and compliance. However, with the increasing shift to the cloud and integration with platforms like O365, ensuring a seamless user experience has become more complex.

Read Post

Teneo

Read more about Why Monitoring iManage is Critical for Enhancing End-User Experience in Legal Firms

Datadog On Datadog

Mar 13, 2025 By Datadog In Datadog

At Datadog, over 2,000 engineers deploy and ship new features daily. As a leading observability and security platform used by thousands of companies, ensuring quality and reliability is no small feat. Part of our commitment to excellence lies in our dogfooding culture where our engineering organization is one of the largest and most demanding users of the Datadog platform.

View Video

Datadog

Read more about Datadog On Datadog

Instrument, monitor, fix: a hands-on debugging session

Mar 13, 2025 By Sentry In Sentry

Join me for a hands-on session where you’ll build it, watch it break, debug it, and go from “no idea what’s wrong” to fixing issues—all in one go. Since we’re serious developers, we’ll use Next.js and.

View Video

Sentry

Read more about Instrument, monitor, fix: a hands-on debugging session

Console Updates

Mar 13, 2025 By Sentry In Sentry

Sentry now supports game consoles! Get crash reports and detailed context for PlayStation, Xbox, and Nintendo Switch.

View Video

Sentry

Read more about Console Updates

What is gen AI as a technology?

Mar 13, 2025 By SolarWinds In SolarWinds

Connect with SolarWinds.

View Video

SolarWinds

Read more about What is gen AI as a technology?

See It In Action: Graylog Demo

Mar 13, 2025 By Graylog In Graylog

Watch Seth Goldhammer, Graylog VP of Product, walk you through a Graylog Demo.

View Video

Graylog

Read more about See It In Action: Graylog Demo

Visualizing Browser Performance with Flame Charts

Mar 13, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

Read more about Visualizing Browser Performance with Flame Charts

Launching SigNoz Single Binary for Super Easy Open-Source Installation & Maintenance

Mar 13, 2025 By Ankit Anand In SigNoz

At SigNoz, we are always striving to make observability simple and accessible. In response to feedback from our open-source community, we have bundled key components of SigNoz into a single binary. This means fewer moving parts, simpler maintenance, and a much smoother installation experience.

Read Post

SigNoz

Read more about Launching SigNoz Single Binary for Super Easy Open-Source Installation & Maintenance

Essential Prometheus Queries: Simple to Advanced

Mar 13, 2025 By Anjali Udasi In Last9

Monitoring your infrastructure doesn't have to be a headache. With Prometheus, you've got a powerful ally in your corner—but like any tool, knowing how to use it makes all the difference. Let's cut through the noise and get straight to the good stuff: practical Prometheus query examples that extract exactly the insights you need when you need them most.

Read Post

Last9

Read more about Essential Prometheus Queries: Simple to Advanced

Dynatrace vs Prometheus - A Detailed Comparison for 2025

Mar 12, 2025 By Pavithra Parthiban In Atatus

When it comes to monitoring solutions, Dynatrace and Prometheus are two powerful tools that cater to different use cases. While Dynatrace is a comprehensive observability platform, Prometheus is an open-source monitoring tool designed for scalability and flexibility. But which one should you choose? This detailed Dynatrace vs. Prometheus comparison will help you make an informed decision by evaluating key aspects such as data collection, alerting, integrations, scalability, and pricing.

Read Post

Atatus

Read more about Dynatrace vs Prometheus - A Detailed Comparison for 2025

London Summit: Executive Opening Keynote

Mar 12, 2025 By Datadog In Datadog

View Video

Datadog

Read more about London Summit: Executive Opening Keynote

Incident Response: Keeping Cool When Everything's on Fire

Mar 12, 2025 By Datadog In Datadog

The DevOps revolution broke down the traditional silos between development and operations, fundamentally reshaping how we build and maintain software. But with this evolution came an inevitable reality for many engineers: being on-call and responding to incidents. While critical for service reliability, the on-call experience often brings significant stress.

View Video

Datadog

Read more about Incident Response: Keeping Cool When Everything's on Fire

Automation Solves a Reboot Nightmare for a Leading Technology Company

Mar 12, 2025 By Ella Drimer In Nexthink

In the growing DEX industry, we advocate for a predictive approach to digital workplace management. Build processes and systems around the goal of a seamless employee experience, and you’ll deal with fewer IT challenges as a result. However, even the most well-designed system cannot avoid the impact of technology's greatest foe: human error – as one of our customers, a global technology leader, recently discovered.

Read Post

Nexthink

Read more about Automation Solves a Reboot Nightmare for a Leading Technology Company

Update: Status Pages Now Support 11 Languages

Mar 12, 2025 By Tomas Koprusak In Uptime Robot

Hey everyone! We’re back with some exciting news for all status page users. Thanks to your feedback, we’ve added the option to switch your status page to any of 11 supported languages with a single click. You can create separate status pages for each language in all paid plans. Go try it now! Tip: Want to be part of future improvements? Drop your feature ideas on our Nolt board or vote on existing ones.

Read Post

Uptime Robot

Read more about Update: Status Pages Now Support 11 Languages

Serving Self-hosted Healthchecks Under a Path

Mar 12, 2025 By Pēteris Caune In Healthchecks

But I am also happy to incorporate features that enable or simplify self-hosting use cases. Examples include the first-party Docker image, the remote authentication support, the Apprise integration, and the Shell commands integration. A more niche feature that has come up a few times is the ability to serve Healthchecks on a subpath. Typically, Healthchecks would run at the root level of a domain.

Read Post

Healthchecks

Read more about Serving Self-hosted Healthchecks Under a Path

Efficient Error Triage: Reducing Debugging Time

Mar 12, 2025 By Rollbar In Rollbar

When software errors strike, developers must act fast. Efficiently triaging issues can drastically reduce downtime, improve user experience, and keep your development team focused on innovation. Rollbar offers powerful features designed to help teams streamline error triage and resolve issues quickly. Here's how you can master the triage process and leverage Rollbar to reduce time spent debugging.

Read Post

Rollbar

Read more about Efficient Error Triage: Reducing Debugging Time

How to assign bugs to a team or user

Mar 12, 2025 By Rollbar In Rollbar

Quickly assign critical items to a team or user in a team... ensuring no bug falls into a "black hole".

View Video

Rollbar

Monitoring

Read more about How to assign bugs to a team or user

Telemetry pipeline management at any scale: Fleet Management in Grafana Cloud is generally available

Mar 12, 2025 By Edwin Onattu In Grafana

We announced Fleet Management in Grafana Cloud last year to solve the pain points that come with managing dozens, hundreds, or even thousands of telemetry collectors across departments and environments. And today we’re excited to announce that Fleet Management is generally available for all Grafana Cloud users who need help managing telemetry collector deployments at scale.

Read Post

Grafana

Read more about Telemetry pipeline management at any scale: Fleet Management in Grafana Cloud is generally available

Reading Flame Charts for Web Performance

Mar 12, 2025 By Request Metrics In Request Metrics

View Video

Request Metrics

Monitoring

Read more about Reading Flame Charts for Web Performance

Easy debugging with Laravel breadcrumbs and Honeybadger

Mar 12, 2025 In Honeybadger

If you're building web applications and care about your users, Laravel breadcrumbs can help you debug why you're seeing an error, giving you greater insight into what users are experiencing. It's easy to take advantage of this feature and add breadcrumbs without much extra configuration, especially if you're already using Honeybadger. Here's a quick walkthrough.

Read Post

Honeybadger

Read more about Easy debugging with Laravel breadcrumbs and Honeybadger

Prometheus Port Configuration: A Detailed Guide

Mar 12, 2025 By Prathamesh Sonpatki In Last9

Setting up Prometheus should be straightforward, but when metrics stop flowing, it’s usually something simple—like a port issue. Misconfigure it, and suddenly, your whole monitoring setup feels like a guessing game. This guide breaks down how to configure Prometheus ports properly, whether you're sticking to defaults or need a custom setup.

Read Post

Last9

Read more about Prometheus Port Configuration: A Detailed Guide

Syslog Monitoring: A Guide to Log Management and Analysis

Mar 12, 2025 By Anjali Udasi In Last9

Relying on syslogs to debug issues at odd hours? It happens to the best of us. A solid syslog setup isn’t just about collecting logs—it’s about making them useful. This guide walks through setting up syslog, configuring it for better visibility, and using monitoring techniques that actually help when things go wrong. No fluff, just practical steps you can use right away.

Read Post

Last9

Read more about Syslog Monitoring: A Guide to Log Management and Analysis

Performance Impact of High Cardinality in Time-Series DBs

Mar 12, 2025 By Anjali Udasi In Last9

Time-series databases have become the backbone of modern observability, financial analytics, and IoT systems. But there's a common challenge that can bring even the most robust systems to their knees: high cardinality. When your database starts tracking millions of unique values across various dimensions, performance doesn't just dip—it can collapse entirely. Let's understand the technical details of what happens when cardinality spikes and how you can architect your systems to handle it.

Read Post

Last9

Read more about Performance Impact of High Cardinality in Time-Series DBs

What is Log Data? The SRE's Essential Guide

Mar 12, 2025 By Anjali Udasi In Last9

As an SRE, when systems fail and alerts flood in, log data becomes your most valuable asset. But what exactly is log data, and how can you use it to improve system reliability?

Read Post

Last9

Read more about What is Log Data? The SRE's Essential Guide

New Storage Support in SolarWinds 2025.1! | HPE Alletra, Dell PowerStore & More!

Mar 12, 2025 By SolarWinds In SolarWinds

New in SolarWinds Platform 2025.1! We’ve added support for HPE Alletra 5k, 6k, 9k, GreenLake for Block, and Dell PowerStore Q models, giving you deeper visibility into your storage infrastructure. In this quick walkthrough, SolarWinds Evangelist Chrystal Taylor dives into the HPE Alletra support now live in our online demo, showing you how to explore cluster performance and health, storage details (block, file, hardware), LUNs, and more!

View Video

SolarWinds

Read more about New Storage Support in SolarWinds 2025.1! | HPE Alletra, Dell PowerStore & More!

Flowing with Your Code: How Lightrun's Dynamic Traces Help Debug Complex Application Flows

Mar 12, 2025 By Barark Ben Ari In Lightrun

Debugging software, whether during development or incident investigation, often begins with a manual and error-prone process. Developers typically scatter logs and snapshots across the codebase, allowing them to trigger multiple times. They then inspect the outputs and sift through the results to identify those relevant to the issue under investigation. Developers tend to group results that stem from the same user request or transaction.

Read Post

Lightrun

Read more about Flowing with Your Code: How Lightrun's Dynamic Traces Help Debug Complex Application Flows

Project Managing Multiple Applications with AppSignal

Mar 12, 2025 By Michael Kurt In AppSignal

As a project manager for a small Rails agency, I find it challenging to keep track of every client’s application. Is the site live? Is it stable? Do we have a silent issue that frequently rears its head? AppSignal makes things like anomaly detection, uptime monitoring, and issue resolution easy, even for a non-technical project manager!

Read Post

AppSignal

Read more about Project Managing Multiple Applications with AppSignal

Introducing Alarms: Get real-time alerts from any query in Honeybadger

Mar 11, 2025 By Joshua Wood In Honeybadger

In Honeybadger, everything is an event. Application errors, logs, telemetry data? All events. While we provide simple APM-style (Application Performance Monitoring) views on top of these events, we also give you direct access through our advanced query engine in Honeybadger Insights. You can use BadgerQL to transform and aggregate events at query time, allowing you to analyze your data and derive metrics without deploying new instrumentation.

Read Post

Honeybadger

Read more about Introducing Alarms: Get real-time alerts from any query in Honeybadger

Grafana OnCall OSS in maintenance mode: your questions answered

Mar 11, 2025 By Richard "RichiH" Hartmann In Grafana

At Grafana Labs, we believe in treating everyone with respect, and a core aspect of respect is clear and transparent communication. When we decided to move Grafana OnCall (OSS) into maintenance mode, we knew that along with the public announcement, there would be a lot of questions.

Read Post

Grafana

Read more about Grafana OnCall OSS in maintenance mode: your questions answered

Incident response and on-call management in one app: Introducing Grafana Cloud IRM

Mar 11, 2025 By Joey Orlando In Grafana

At Grafana Labs, we’re always searching for ways to develop products that give our users the best tooling to help in their day-to-day understanding of their systems. We built OnCall and Incident in Grafana Cloud, our fully managed observability platform, to make it easier to respond to and fix incidents — all on top of the Grafana dashboards you know and love.

Read Post

Grafana

Read more about Incident response and on-call management in one app: Introducing Grafana Cloud IRM

How DX NetOps Reporting Improves Operational Efficiency

Mar 11, 2025 By Sandeep Tiwary In Broadcom

In my role in product management, I get the opportunity to meet with network operations professionals in a range of organizations. When I ask them about the challenges they’re facing, their responses can largely be summed up in one word: complexity.

Read Post

Broadcom

Read more about How DX NetOps Reporting Improves Operational Efficiency

How Do We Market Our SaaS? Here's What Worked And What Didn't

Mar 11, 2025 By Laurens Goethals In Oh Dear

Oh Dear is the underdog in the website monitoring software space. We’re a small player in a market dominated by big, well-funded competitors. And yet, we’ve built a solid user base and created a profitable SaaS business. But how did we do it? And more importantly, what worked, and what didn’t? This is our honest story about marketing Oh Dear, our lessons learned, and the strategies that helped us grow.

Read Post

Oh Dear

Read more about How Do We Market Our SaaS? Here's What Worked And What Didn't

Monitor Reporting & Analytics

Mar 11, 2025 By Leo Baecker In Hyperping

We're excited to launch our new Reporting Dashboard, giving you a powerful command center to analyze your monitoring data and make data-driven reliability decisions.

Read Post

Hyperping

Read more about Monitor Reporting & Analytics

Getting Started with Grafana Cloud IRM | Grafana Labs

Mar 11, 2025 By Grafana In Grafana

In this video, Joey Orlando, Engineering Manager at Grafana, walks you through Grafana Cloud Incident Response Management (IRM), a powerful new solution that unifies Grafana OnCall and Grafana Incidents into one seamless experience. You'll learn how to set up on-call schedules and escalation chains, configure integrations for your monitoring systems, respond to alerts efficiently with automated workflows, and migrate from PagerDuty or Splunk On-Call to Grafana IRM.

View Video

Grafana

Read more about Getting Started with Grafana Cloud IRM | Grafana Labs

Using Kerberos Authentication in WhatsUp Gold

Mar 11, 2025 By WhatsUp Gold In WhatsUp Gold

Watch this video to learn how to configure your devices and your WhatsUp Gold environment to enable Kerberos authentication for your Windows-based devices.

View Video

WhatsUp Gold

Read more about Using Kerberos Authentication in WhatsUp Gold

Alerting with InfluxDB 3 Core and Enterprise

Mar 11, 2025 By Anais Dotis-Georgiou In InfluxData

Monitoring is only as good as the alerts that surface critical issues before they spiral out of control. With InfluxDB 3 Core and Enterprise, you can extend alerting capabilities beyond built-in solutions by leveraging custom Python processing plugins. Whether you need real-time notifications when thresholds are exceeded or advanced anomaly detection tailored to your infrastructure, developing custom alerting logic ensures you get the right alerts at the right time.

Read Post

InfluxData

Read more about Alerting with InfluxDB 3 Core and Enterprise

Tackling geographic discrepancies in user experience for mid-market businesses with real user monitoring

Mar 11, 2025 By Sindu Priyadharshini V In Site24x7

Middle market businesses operate in a unique space—they need to do more with less. Whether you’re running an e-commerce store, a SaaS platform, or a service-based website, customers of mid-market businesses expect fast-loading pages and smooth interactions—no matter where they are. Creating a seamless digital experience is essential for customer retention and revenue growth. But here’s the challenge: Website and application performance aren’t the same everywhere.

Read Post

Site24x7

Read more about Tackling geographic discrepancies in user experience for mid-market businesses with real user monitoring

New! Website and Ping monitor metrics

Mar 11, 2025 By Colin Bartlett In StatusGator

We’re excited to announce a new enhancement to StatusGator: the Metrics tab on the monitor details page! Now, for both Website and Ping monitors, you can track vital performance data, including availability, downtime, and response time trends over the last 24 hours.

Read Post

StatusGator

Read more about New! Website and Ping monitor metrics

PHP Error Logs: The Complete Troubleshooting Guide You Need

Mar 11, 2025 By Preeti Dewani In Last9

That moment when your PHP application runs flawlessly on your local machine but crashes in production—we've all been there. The key difference between struggling with issues and resolving them efficiently often comes down to understanding PHP error logs. This guide will help you move from trial-and-error debugging to a structured approach for identifying and fixing problems faster.

Read Post

Last9

Read more about PHP Error Logs: The Complete Troubleshooting Guide You Need

Auto Instrumentation: An In-Depth Guide

Mar 11, 2025 By Anjali Udasi In Last9

Auto instrumentation might sound like something from a music studio, but it's one of the most powerful tools in a developer's arsenal for gaining visibility into applications without tedious manual code additions. If you're tired of littering your codebase with custom traces and want a more elegant solution, you're in the right place.

Read Post

Last9

Read more about Auto Instrumentation: An In-Depth Guide

Getting Started with OpenTelemetry JavaScript

Mar 11, 2025 By Prathamesh Sonpatki In Last9

Have you ever watched your JavaScript app fail in production and wondered, “What just happened?” OpenTelemetry JavaScript helps answer that question by giving you a practical way to track what’s going on under the hood. Let’s walk through how it works, why it’s useful, and how to set it up without unnecessary complexity. If you've ever struggled with vague logs and slow API calls, this is for you.

Read Post

Last9

Read more about Getting Started with OpenTelemetry JavaScript

Easily Monitor Multiple Heroku Apps with AppSignal

Mar 11, 2025 By Connor James In AppSignal

You can now connect multiple Heroku apps to a single AppSignal instance with the AppSignal add-on. We've also improved our installation wizard to help you get up and monitoring even quicker, so you can start tracking your apps in no time.

Read Post

AppSignal

Read more about Easily Monitor Multiple Heroku Apps with AppSignal

Silence during chaos: Why the X outage is a call to arms for proactive monitoring

Mar 11, 2025 By Ritik Sharma In Catchpoint

When X (formerly Twitter) suffered a global outage on March 10-11, 2025, millions of users and businesses were left in the dark. Apart from a solitary post from CEO Elon Musk claiming a cyber-attack, X has remained silent. Yet Catchpoint’s Internet Sonar detected the crisis in real time—highlighting the critical role independent, proactive monitoring plays when vendor communication fails.

Read Post

Catchpoint

Read more about Silence during chaos: Why the X outage is a call to arms for proactive monitoring

What Is AI Autonomous Debugging? A Deep Dive into the Future of Software Troubleshooting

Mar 11, 2025 By Dror Bereznitsky In Lightrun

In the fast-paced world of software development, debugging remains one of the most time-consuming and complex tasks for engineers. Modern observability tools that use logs, metrics, and traces help developers gain insights into system behavior, but they still require manual effort to identify and fix issues.

Read Post

Lightrun

Read more about What Is AI Autonomous Debugging? A Deep Dive into the Future of Software Troubleshooting

Monitor GitHub Copilot with Datadog

Mar 11, 2025 By Bowen Chen In Datadog

AI-powered coding tools are becoming more commonplace within developer workflows. GitHub Copilot is a popular AI coding assistant that can be integrated directly into IDEs or as a standalone chat interface. This tool helps you write code faster and with less effort by auto-completing code in real time, generating blocks of code from natural language prompts, and answering your questions to help you get over coding hurdles and roadblocks.

Read Post

Datadog

Read more about Monitor GitHub Copilot with Datadog

Enhancing Observability with the OTEL Framework and Virtana

Mar 11, 2025 By David McNerney In Virtana

In today’s rapidly evolving technological landscape, observability has become essential for supporting robust, efficient systems. According to Gartner’s report “Preparing for the Future of Observability” from September 2024, OpenTelemetry (OTEL) is emerging as the standard framework for collecting telemetry data across different application pipelines.

Read Post

Virtana

Read more about Enhancing Observability with the OTEL Framework and Virtana

New Discovery with NetScan for Automated Asset Management in Pandora FMS NG 781 RRR

Mar 11, 2025 By Pandora FMS team In Pandora FMS

In the recent NG 781 RRR update, Pandora FMS has significantly enhanced its Discovery system with the powerful NetScan feature, making it even easier to automatically detect and comprehensively monitor technological assets in complex networks.

Read Post

Pandora FMS

Read more about New Discovery with NetScan for Automated Asset Management in Pandora FMS NG 781 RRR

Integrate Checkly with Render for more reliable production environments

Mar 11, 2025 By Nočnica Mellifera In Checkly

With Render’s announcement this week of their new webhook integrations triggered by Render events, I wanted to explore how the integration between Render and Checkly can help ensure more reliable production services for your users. Render is a cloud application platform that enables developers to deploy and scale their apps without needing to manage infrastructure.

Read Post

Checkly

Read more about Integrate Checkly with Render for more reliable production environments

What To Know About Parsing JSON

Mar 11, 2025 By Jeff Darrington In Graylog

If you grew up in the 80s and 90s, you probably remember your most beloved Trapper Keeper. The colorful binder contained all the folders, dividers, and lined paper to keep your middle school and high school self as organized as possible. Parsing JSON, a lightweight data format, is the modern, IT environment version of that colorful – perhaps even Lisa Frank themed – childhood favorite.

Read Post

Graylog

Read more about What To Know About Parsing JSON

Unity Support Sweeps the Nation

Mar 11, 2025 By Sentry In Sentry

Sentry’s Unity SDK now has expanded support for multiplayer debugging and improved IL2CPP error tracking for faster fixing. In this video we’ll show you how to set up Sentry in your Unity game and give a short demo of Sentry’s features.

View Video

Sentry

Monitoring

Read more about Unity Support Sweeps the Nation

Monitor and troubleshoot logs in real-time with Sumo Logic's Live Tail

Mar 11, 2025 By Hadijah Creary In Sumo Logic

Troubleshooting production logs shouldn’t be a hassle. Developers and IT operations need real-time insights without jumping between tools or manually sifting through endless log files. Sumo Logic Live Tail simplifies this process. You can instantly search, filter, and troubleshoot log tails in real-time within a single interface to get the data you need without logging into business-critical applications.

Read Post

Sumo Logic

Read more about Monitor and troubleshoot logs in real-time with Sumo Logic's Live Tail

Shared dashboards now start at FREE

Mar 11, 2025 By Richard Benwell In Squared Up

Since we added the Open Access feature to Dashboard Server way back in 2014, it has been a customer favourite. Build a dashboard, grab a special URL, and share it with anyone without getting into the costs and hassle of user management - useful for embedding in other tools, showing it off on a very visible wall monitor, or sending it to management for a monthly report. It's versatile, simple, and most importantly, affordable.

Read Post

Squared Up

Read more about Shared dashboards now start at FREE

Announcing DX NetOps Active Experience

Mar 11, 2025 By Jason Normandin In Broadcom

There is one scene in the 2003 blockbuster film “Lord of the Rings: The Return of the King” that never fails to fill me with awe. As the forces of the Dark Lord Sauron move towards their bloody siege of the Gondor capital, Minas Tirith, the hobbit Pippin lights the beacon fire.

Read Post

Broadcom

Read more about Announcing DX NetOps Active Experience

Best Datadog alternatives in 2025 [29 analyzed, top 4 picks]

Mar 10, 2025 By Leo Baecker In Hyperping

Datadog is the leader in monitoring software. But that doesn't mean it's the best choice for everyone. And if you're reading this, you probably have your doubts. While Datadog used to be the default choice for DevOps teams, today's organizations often struggle to justify its complex pricing model and steep learning curve. Many companies that started with Datadog have found it becoming prohibitively expensive and harder to use as they scale.

Read Post

Hyperping

Read more about Best Datadog alternatives in 2025 [29 analyzed, top 4 picks]

Top 12 Best Remote Access Software for Efficient Connectivity

Mar 10, 2025 By Staff Contributor In SolarWinds

Today, the workforce is more geographically dispersed than ever before. In the past, remote access was primarily used by IT teams or freelancers who needed to access specific resources from afar. For several years, remote work has been gaining traction, and the COVID-19 pandemic accelerated the adoption of remote and hybrid work environments. Now, businesses of all sizes rely on remote access software to empower employees, maintain productivity, and stay connected across various locations and time zones.

Read Post

SolarWinds

Read more about Top 12 Best Remote Access Software for Efficient Connectivity

Reducing MTTR: Why Speed Matters for B2B SaaS Companies

Mar 10, 2025 By Sara Miteva In Checkly

For B2B SaaS companies, downtime isn’t just an inconvenience—it’s a direct threat to customer satisfaction and revenue. Unlike consumer applications, they serve a mix of power users pushing the system to its limits and new users expecting a seamless experience from day one. Reliability isn’t just about keeping services online—it’s about ensuring every user interaction runs smoothly. A minor hiccup for one customer might be a major disruption for another.

Read Post

Checkly

Read more about Reducing MTTR: Why Speed Matters for B2B SaaS Companies

How to Monitor Server Uptime Without Missing Critical Failures

Mar 10, 2025 By Sematext In Sematext

Server uptime monitoring is critical for ensuring the reliability and availability of your infrastructure and services. By keeping track of server uptime, you may be able to identify and address potential issues before they impact your end-users. Why just “may be able to”? Because “it depends”. It depends on whether your infrastructure/applications/deployments are built with redundancy in mind. Even if you have a redundant setup, it depends whether it actually works.

Read Post

Sematext

Read more about How to Monitor Server Uptime Without Missing Critical Failures

A Guide to Fixing Kafka Consumer Lag [Without Jargon]

Mar 10, 2025 By Prathamesh Sonpatki In Last9

Have you ever looked at your monitoring dashboard and wondered, "Why is my Kafka consumer lag spiking again?" It’s a common frustration. Consumer lag isn’t just an inconvenience—it’s a sign that something’s wrong with your data pipeline. When lag builds up, you're facing delayed data processing and the risk of system failures.

Read Post

Last9

Read more about A Guide to Fixing Kafka Consumer Lag [Without Jargon]

Retrieving All Keys in Redis: Commands & Best Practices

Mar 10, 2025 By Anjali Udasi In Last9

Need to list all the keys in your Redis database? If you're debugging an issue or just checking what's stored, retrieving all keys is a useful skill for any developer. This guide covers everything you need to know—from the basic commands to the performance implications—so you can query Redis efficiently without slowing things down.

Read Post

Last9

Read more about Retrieving All Keys in Redis: Commands & Best Practices

High Cardinality Is Eating Your Storage Budget-Here's Why

Mar 10, 2025 By Anjali Udasi In Last9

Have you noticed your storage costs rising even when you're keeping an eye on them? The reason might be something easy to overlook: high cardinality data. For data engineers and developers balancing performance and costs, understanding its impact isn’t just useful—it’s key to avoiding unnecessary spending and system slowdowns.

Read Post

Last9

Read more about High Cardinality Is Eating Your Storage Budget-Here's Why

Monitoring in Hyperconverged Infrastructures: Challenges and Solutions

Mar 10, 2025 By Isaac García In Pandora FMS

I have a not-so-secret suspicion that the dream of everyone working with technology is the Enterprise computer from Star Trek. Controlling shields, communications, engines, and everything else from a single place—and with voice commands, no less. “One button to rule them all,” as Sauron might whisper. But until that utopia becomes a reality, at least we can implement a hyperconverged infrastructure (HCI) in our organization’s technology stack.

Read Post

Pandora FMS

Read more about Monitoring in Hyperconverged Infrastructures: Challenges and Solutions

Let's Encrypt Stops Expiration Emails - How to Ensure Your Certificates Stay Valid with SSL Certificate Monitoring

Mar 10, 2025 By Simona Omidkar In Icinga

SSL/TLS certificates are critical for secure communication, and keeping track of their expiration is essential. Until now, Let’s Encrypt has sent email notifications when certificates were about to expire. However, as of June 2025, Let’s Encrypt will discontinue these expiration emails. This change could lead to expired certificates going unnoticed, potentially causing security risks and downtime.

Read Post

Icinga

Read more about Let's Encrypt Stops Expiration Emails - How to Ensure Your Certificates Stay Valid with SSL Certificate Monitoring

7 Java Exception Monitoring Blind Spots That SREs Must Eliminate

Mar 10, 2025 By Arun Aravamudhan In eG Innovations

It’s 2 a.m. Alerts flood your dashboard. Transactions are failing, but logs offer no clues. Your SRE team is drowning in noise—while users struggle with outages. As Java workloads shift to microservices, Kubernetes, and the cloud, this problem is compounded. Exceptions cascade across tiers, triggering blame games while the root cause remains buried under fragmented logs and scattered alerts. Legacy monitoring tools overwhelm SREs with raw data but fail to connect the dots.

Read Post

eG Innovations

Read more about 7 Java Exception Monitoring Blind Spots That SREs Must Eliminate

Generating Calculated Fields From Natural Language

Mar 10, 2025 By Molly Stamos In Honeycomb

If you’ve been using Honeycomb for a bit, you know that Calculated Fields (otherwise known as derived columns) are a powerful way to transform your events to a format that’s easier to query and understand. However, they use a lisp-esque language that can be difficult to read and a pain to write. If you dislike making Calculated Fields and want something a little easier, here’s a generative AI prompt that can generate them from natural language.

Read Post

Honeycomb

Read more about Generating Calculated Fields From Natural Language

The One Where We Talk About #CriblCon25

Mar 10, 2025 By Cribl In Cribl

Join Ed Bailey, Chris Breshears, and Mike Dupuis as they wrap up Cribl’s CKO, share their excitement for the 2025 product roadmap, and get even more hyped for!

View Video

Cribl

Read more about The One Where We Talk About #CriblCon25

Grafana Drilldown: first-class OpenTelemetry support now available for metrics

Mar 10, 2025 By Brendan O'Handley In Grafana

When we launched Grafana Drilldown, our queryless experience for quicker, easier insights into your telemetry, we focused first on Prometheus because it was—and is—such a great solution for storing time series data. But as the industry continued to evolve, a different open source project began to emerge as another standard for modern observability: OpenTelemetry.

Read Post

Grafana

Read more about Grafana Drilldown: first-class OpenTelemetry support now available for metrics

Solve a Problem Using Honeycomb's Frontend Observability Sandbox

Mar 10, 2025 By Honeycomb In Honeycomb

I have a performance problem with my web application. This video shows how I use Honeycomb's Frontend Observability Web Launchpad to quickly identify the symptoms causing the slowdown.

View Video

Honeycomb

Read more about Solve a Problem Using Honeycomb's Frontend Observability Sandbox

See Why Your SLO is Failing in One Click

Mar 10, 2025 By Honeycomb In Honeycomb

Honeycomb Service Level Objectives (SLO) can notify your team when one of your error budgets is being exhausted. See how the SLO view gets you from zero information to identification in one click!

View Video

Honeycomb

Read more about See Why Your SLO is Failing in One Click

Release Notes March 25

Mar 10, 2025 By Henn Idan In logz.io

Syntax highlighting and auto suggestions to streamline query building and troubleshooting.

Read Post

logz.io

Read more about Release Notes March 25

Playwright Now Lets You Easily Paste Test Errors into LLMs!

Mar 10, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he explains how to use Playwright's new "Copy Prompt" button to resolve end-to-end tests with the help of AI.

View Video

Checkly

Read more about Playwright Now Lets You Easily Paste Test Errors into LLMs!

Fine-Tune Your Charts with Minutely Metrics in AppSignal

Mar 10, 2025 By Connor James In AppSignal

We've enhanced our application performance monitoring capabilities to give you a granular view of your application's behavior with minutely metrics. Now, when you select specific time ranges in your charts, you can see short-term trends, spot anomalies faster, and gain deeper insights into your application's performance.

Read Post

AppSignal

Read more about Fine-Tune Your Charts with Minutely Metrics in AppSignal

Istio Zero-Code Instrumentation

Mar 9, 2025 By Israel Blancas Alvarez In Coralogix

Tracing in Istio environments should be seamless, but too often, teams run into a frustrating problem—traces are broken. Requests jump between services, but instead of a complete flow, Coralogix displays fragmented spans. Tracing should work out of the box in those environments. Istio’s sidecars capture spans automatically, so why are traces incomplete? The issue is almost always context propagation, and fixing it doesn’t have to mean modifying application code.

Read Post

Coralogix

Read more about Istio Zero-Code Instrumentation

Advanced Monitor Filtering

Mar 9, 2025 By Leo Baecker In Hyperping

We've just rolled out powerful improvements to our monitor filtering system, making it easier than ever to find and manage exactly the monitors you need.

Read Post

Hyperping

Read more about Advanced Monitor Filtering

Getting started with the CSV data source

Mar 9, 2025 By John Hayes In Squared Up

Most of the time, the dashboards we create are querying data from SQL databases, Web APIs or large backend systems. Sometimes though, we might want to visualize an ad hoc data set – and this is where the SquaredUp CSV plugin really shines. You can create powerful dashboards just by pointing to the path of a CSV file, or even just paste your CSV data into a text box.

Read Post

Squared Up

Read more about Getting started with the CSV data source

How to Analyze Logs Using AI

Mar 8, 2025 By James Yaria In LogicMonitor

Your tech stack is growing, and with it, the endless stream of log data from every device, application, and system you manage. It’s a flood—one growing 50 times faster than traditional business data—and hidden within it are the patterns and anomalies that hold the key to the performance of your applications and infrastructure. But here’s the challenge you know well: with every log, the noise grows louder, and manually sifting through it is no longer sustainable.

Read Post

LogicMonitor

Read more about How to Analyze Logs Using AI

How Employers Can Identify Internal Security Risks Through Cyber Investigations

Mar 8, 2025 By OpsMatters In OpsMatters

Insider threats are a major risk for employers in today’s digital world. Personnel with access to sensitive data can abuse their privileges to cause damage. The threats organizations face include data breaches, intellectual property theft, and destructive attacks on company infrastructure. Effective cyber investigations are central to detecting these threats because they help identify risks early, while the damage is still minimal.

Read Post

OpsMatters

Read more about How Employers Can Identify Internal Security Risks Through Cyber Investigations

Getting started with Azure DevOps dashboards

Mar 7, 2025 By Sameer Mhaisekar In Squared Up

Azure DevOps and its extensive feature set help teams plan smarter, collaborate better, and ship faster. With several integrated features such as Azure Pipelines or Azure Repos, it gives you the flexibility to use just what you need to complement your existing workflows. However, as your usage of Azure DevOps grows, you might find that monitoring and observing key CI/CD metrics across these services gets increasingly challenging.

Read Post

Squared Up

Read more about Getting started with Azure DevOps dashboards

ManageEngine: Loved by customers, recognized on Gartner Peer Insights

Mar 7, 2025 By General In ManageEngine

ManageEngine Applications Manager has been positioned in the Customers’ Choice quadrant in the 2024 Gartner Peer Insights Voice of the Customer for Observability Platforms.

Read Post

ManageEngine

Read more about ManageEngine: Loved by customers, recognized on Gartner Peer Insights

9 Kubernetes monitoring best practices: A practical guide to successful implementation

Mar 7, 2025 By Applications Manager In ManageEngine

Kubernetes has revolutionized containerized application deployment, but effective monitoring remains a crucial challenge. Unlike traditional infrastructures, Kubernetes environments are dynamic, distributed, and short-lived, making real-time visibility essential for performance, security, and cost optimization. Without proper monitoring, teams risk application downtime, resource wastage, and security vulnerabilities.

Read Post

ManageEngine

Read more about 9 Kubernetes monitoring best practices: A practical guide to successful implementation

Datadog CPO Yanbing Li on AI, LLMs & 2025 Product Innovations! #Datadog #AI #TechTalk

Mar 7, 2025 By Datadog In Datadog

On a special episode of This Month in Datadog, Datadog CPO Yanbing Li joins Jeremy to discuss the company’s approach to building products, use cases she’s excited about in 2025, AI and LLMs, and so much more.

View Video

Datadog

Read more about Datadog CPO Yanbing Li on AI, LLMs & 2025 Product Innovations! #Datadog #AI #TechTalk

What is a Status Page? All You Need to Know

Mar 7, 2025 By Nuno Tomas In isDown

Nobody likes being left in the dark when a service goes down. We can imagine how frustrating it is to refresh a page repeatedly, wondering if the issue is on your end or if something bigger is happening. A status page provides real-time updates and eliminates that uncertainty, keeping users informed and reducing confusion. But what is it all about?

Read Post

isDown

Read more about What is a Status Page? All You Need to Know

The $1 Million Lesson: Building a Culture of Quality Through SLAs

Mar 7, 2025 By Mehdi Daoudi In Catchpoint

In the early days of DoubleClick, back when SaaS was still known as Application Service Provider (ASP), I was tasked with setting up the QoS (Quality of Service) Team. Our primary mission was to establish a monitoring system, but we quickly found ourselves managing Service Level Agreements (SLAs)—a task that became critical after we paid out over $1 million in penalties for SLA violations to a single customer. The reason? Someone had signed a contract promising 100% uptime, an impossible commitment.

Read Post

Catchpoint

Read more about The $1 Million Lesson: Building a Culture of Quality Through SLAs

Elasticsearch vs. Solr: What Developers Need to Know in 2025

Mar 7, 2025 By Anjali Udasi In Last9

When your project calls for a high-performance search solution, the Elasticsearch vs. Solr debate inevitably surfaces. Both are Lucene-powered search engines with passionate communities, but their architectural approaches and performance characteristics differ significantly. This guide dives into the technical nuances that matter to developers and DevOps professionals, helping you make an informed decision based on concrete metrics and real-world implementation considerations.

Read Post

Last9

Read more about Elasticsearch vs. Solr: What Developers Need to Know in 2025

How to Make the Most of Redis Pipeline

Mar 7, 2025 By Anjali Udasi In Last9

If you’ve been using Redis but haven’t explored pipelining, you’re missing out on some significant performance benefits. Redis pipelining is like a hidden gem—those who know about it can’t imagine working without it. In this guide, we’ll break down why pipelining is important and how it can help improve the efficiency of your applications.

Read Post

Last9

Read more about How to Make the Most of Redis Pipeline

High vs Low Cardinality: Is Your Observability Stack Failing?

Mar 7, 2025 By Anjali Udasi In Last9

Imagine trying to find a friend in a packed stadium with 50,000 people versus spotting them in a quiet coffee shop. That’s the difference between high and low cardinality data. And if you’re working with distributed systems or microservices, this isn’t just a theoretical distinction—it’s a fundamental challenge that can make or break your observability setup.

Read Post

Last9

Read more about High vs Low Cardinality: Is Your Observability Stack Failing?

Logging Best Practices to Reduce Noise and Improve Insights

Mar 7, 2025 By Prathamesh Sonpatki In Last9

Are your logs helping you, or are they just creating more work? If you’re sifting through endless data but still missing the important details, you’re not alone. It’s a common challenge—but one that can be solved. For anyone managing infrastructure, logs are essential. They show what’s happening, what’s broken, and sometimes even why. But without the right approach, they can easily turn into clutter instead of clarity.

Read Post

Last9

Read more about Logging Best Practices to Reduce Noise and Improve Insights

SolarWinds Observability Self-Hosted | 2025.1 GA Release Features Demo

Mar 7, 2025 By SolarWinds In SolarWinds

This webcast shows off the latest features included in the 2025.1 GA Release of SolarWinds Observability Self-Hosted. Product experts Erik Eff and Chad Every discuss the importance of total cost of ownership and customer feedback in driving product development, highlighting key areas such as hybrid IT visibility and AI-driven solutions. The demo section showcases improvements in cloud monitoring, device support, and user experience, including a new NOC dashboard with dark theme.

View Video

SolarWinds

Read more about SolarWinds Observability Self-Hosted | 2025.1 GA Release Features Demo

AI Agents: Your data sidekick (minus the coffee breaks)

Mar 7, 2025 By Jade Lassery In logz.io

Do you ever wish you had a personal data guru who could magically sift through all your data, spot patterns before they become problems, summarize everything in a way that actually makes sense and propose recommendations? Well, meet AI Agents—the “digital teammates” who do all that without demanding coffee breaks.

Read Post

logz.io

Read more about AI Agents: Your data sidekick (minus the coffee breaks)

Best status page software in 2025 [25 analyzed, top 5 picks]

Mar 7, 2025 By Leo Baecker In Hyperping

Are you looking for a reliable status page solution to keep your users informed? Wondering what alternatives are available to help you communicate system status effectively? While Statuspage.io used to be everyone's default choice, today's DevOps and SRE teams have a hard time justifying this choice. And there are a lot of new tools popping up every year. For this guide, we analyzed 25 tools and we'll explore the best status page software available today.

Read Post

Hyperping

Read more about Best status page software in 2025 [25 analyzed, top 5 picks]

Cross Project Searching with Rollbar Advanced Plans

Mar 7, 2025 By Rollbar In Rollbar

Unlock More Power with Rollbar Advanced.

View Video

Rollbar

Monitoring

Read more about Cross Project Searching with Rollbar Advanced Plans

How to Monitor Apache Zookeeper Using the OpenTelemetry Collector

Mar 7, 2025 By Benjamin Pitts In MetricFire

Apache Zookeeper is a distributed coordination tool that helps keep large-scale systems in sync. It’s the backbone for managing leader elections, service discovery, and metadata storage in projects like Kafka, Hadoop, and Elasticsearch. Think of it as a highly available traffic controller for distributed apps, ensuring everything runs smoothly.

Read Post

MetricFire

Read more about How to Monitor Apache Zookeeper Using the OpenTelemetry Collector

Introduction To Grafana Faro | Frontend Observability

Mar 7, 2025 By Grafana In Grafana

Learn how to gain real-time visibility into your web applications with Grafana Faro! In this hands-on tutorial, we’ll walk through instrumenting a JavaScript app with Faro.

View Video

Grafana

Read more about Introduction To Grafana Faro | Frontend Observability

Inside vmselect: The Query Processing Engine of VictoriaMetrics

Mar 7, 2025 By Phuong Le In VictoriaMetrics

This piece is part of our ongoing VictoriaMetrics series, where we break down how different components of the system function: Inside vmselect: The Query Processing Engine of VictoriaMetrics.

Read Post

VictoriaMetrics

Read more about Inside vmselect: The Query Processing Engine of VictoriaMetrics

How to Reduce Operational Costs with Efficient Power Generation in Industrial Settings

Mar 7, 2025 By OpsMatters In OpsMatters

Energy costs represent a significant portion of operational expenses in industrial settings. Factories, manufacturing plants, and large-scale production facilities rely on consistent and efficient power generation to keep operations running smoothly. Inefficient power usage can lead to higher energy bills, increased maintenance costs, and operational downtime. By implementing strategic energy solutions and investing in modern power generation technologies, industries can significantly reduce costs while maintaining productivity.

Read Post

OpsMatters

Read more about How to Reduce Operational Costs with Efficient Power Generation in Industrial Settings

How to Monitor Distributed Networks: The Essential Guide

Mar 6, 2025 By Alyssa Lamberti In Obkio

Traditional centralized networks are a thing of the past—distributed networks have taken over. Why? Because they’re built to handle today’s cloud-based services and SaaS apps way more effectively. In a world where businesses operate across the globe and data moves in real time, distributed networks have become the foundation of modern IT.

Read Post

Obkio

Read more about How to Monitor Distributed Networks: The Essential Guide

Prometheus API: From Basics to Advanced Usage

Mar 6, 2025 By Prathamesh Sonpatki In Last9

Monitoring your infrastructure shouldn’t be a shot in the dark. The Prometheus API helps you pull the right metrics so you actually know what’s going on. Whether you’re just getting started or trying to make sense of your current setup, this guide breaks down how to use the API to get the answers you need—without the guesswork.

Read Post

Last9

Read more about Prometheus API: From Basics to Advanced Usage

Nginx Logging: A Complete Guide for Beginners

Mar 6, 2025 By Aditya Godbole In Last9

So, you're wrestling with Nginx logs, huh? Been there. In fact, I used to spend way too much time hunting down log files until I finally got smart about it. Let me save you the trouble. Nginx logs are like the black box flight recorder for your web server. When everything crashes and burns (and it will), those logs are often the only evidence left to figure out what happened. But first, you need to know where to find them.

Read Post

Last9

Read more about Nginx Logging: A Complete Guide for Beginners

InfoBlox NetMRI is Ending-Here's Why You Should Move to IP Fabric Now

Mar 6, 2025 By Teneo In Teneo

If you are a network owner, you know the importance of stability, visibility, and automation. With NetMRI reaching its Last Order Date on April 30, 2025, now is the time to think ahead and choose a solution that doesn’t just replace what you have—but actually makes your job easier. That’s where IP Fabric comes in. If you’re still relying on NetMRI for network configuration and change management (NCCM), I strongly recommend making the switch now. Here’s why.

Read Post

Teneo

Read more about InfoBlox NetMRI is Ending-Here's Why You Should Move to IP Fabric Now

This is why you need CNAPP.

Mar 6, 2025 By Sysdig In Sysdig

View Video

Sysdig

Read more about This is why you need CNAPP.

The 555 benchmark for cloud security

Mar 6, 2025 By Sysdig In Sysdig

View Video

Sysdig

Read more about The 555 benchmark for cloud security

Keep Your Slack Integration Up to Date

Mar 6, 2025 By Tomas Koprusak In Uptime Robot

Hi everyone! Slack has updated the process for adding webhooks. If you created your Slack integration after November 2024, it will continue to work. However, if it was created before that date, it is now obsolete. We strongly recommend updating it to prevent potential issues in case Slack discontinues support.

Read Post

Uptime Robot

Read more about Keep Your Slack Integration Up to Date

Building an agentic AIOps strategy? Don't start without this checklist.

Mar 6, 2025 By LogicMonitor In LogicMonitor

Most IT leaders know they need AIOps. Few have a strategy for making it work. The problem isn’t a lack of AI-powered tools; it’s the absence of a clear, outcome-driven plan. Especially given the rapid adoption of ChatGPT and LLMs in general, organizations are spending billions on AI. But without a defined strategy, AIOps quickly turns into a patchwork of disconnected tools, rising costs, and disappointing ROI.

Read Post

LogicMonitor

Read more about Building an agentic AIOps strategy? Don't start without this checklist.

Introducing TCP Monitoring - A More Reliable Way to Monitor Your Entire Network

Mar 6, 2025 By Sarbdeep Singh In Broadcom

Network operations teams are under constant pressure to ensure optimal performance and availability. But in today's complex network environments, gaining a clear picture of what's happening is difficult. Without a reliable method of collecting performance metrics across your most critical connections, identifying the root cause of slowdowns or outages becomes a frustrating and time-consuming process.

Read Post

Broadcom

Read more about Introducing TCP Monitoring - A More Reliable Way to Monitor Your Entire Network

Stackify Retrace Use Cases - Customer Support

Mar 6, 2025 By James Michaelis In Stackify

Since joining Stackify in 2014 during the early stages of the startup company, I have had the privilege of working in various roles along the way. I began in customer support, moved to the QA team, and am now the product manager for Stackify Retrace.

Read Post

Stackify

Read more about Stackify Retrace Use Cases - Customer Support

MetricFire's CLI Tool: Easy Monitoring & Automation!

Mar 6, 2025 By MetricFire In MetricFire

Looking for a powerful way to send and visualise metrics from the command line? Meet HG CLI, MetricFire’s official command-line tool! In this video, we’ll show you how to install, configure, and use HG CLI to manage your Hosted Graphite metrics and create dashboards, all without having to configure an agent yourself. Whether you're a DevOps engineer, SRE, or developer, this tool will streamline your monitoring workflows! Don't forget to like, subscribe, and hit the bell for more MetricFire insights!

View Video

MetricFire

Read more about MetricFire's CLI Tool: Easy Monitoring & Automation!

Proactive Protection Beyond the Endpoint

Mar 6, 2025 By Filip Cerny In Flowmon

The IT landscape for delivering applications and other services to end users has shifted to a hybrid deployment model, and this change is here to stay. While it provides myriad benefits for IT teams and their organizations, it also complicates the cybersecurity landscape, which needs protecting. Attackers continuously find new techniques to bypass traditional security measures.

Read Post

Flowmon

Read more about Proactive Protection Beyond the Endpoint

The One Where We Talk About Cribl University

Mar 6, 2025 By Cribl In Cribl

Join Ed Bailey and Dar Kobe as they break down the latest updates to Cribl U—what it is, who it’s for, and why it matters. Whether you’re just getting started or looking to level up your data skills, you won’t want to miss this!

View Video

Cribl

Read more about The One Where We Talk About Cribl University

This Month in Datadog: Conversations with two Datadog leaders, a sneak peek of DASH 2025, and more

Mar 6, 2025 By Datadog In Datadog

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. This month, we’re joined by Datadog CPO Yanbing Li and SVP of Engineering David Mitchell.

View Video

Datadog

Read more about This Month in Datadog: Conversations with two Datadog leaders, a sneak peek of DASH 2025, and more

6 Reasons Why Digital Transformations Fail

Mar 6, 2025 By Megan Brake In Nexthink

According to McKinsey research, 70% of digital transformation projects fail to meet the stated goals. Depending on the reasons for launching a digital transformation project, this failure can lead to loss of productivity, security, profitability, or any other number of costly outcomes. In today’s competitive landscape, businesses cannot afford this failure – and yet it continues. Why?

Read Post

Nexthink

Read more about 6 Reasons Why Digital Transformations Fail

Visualize Google Sheets data: how to turn your spreadsheets into Grafana dashboards

Mar 6, 2025 By Usman Ahmad In Grafana

In 2020, we launched the Google Sheets data source for Grafana, providing organizations with real-time data visualization capabilities for all their go-to spreadsheets. Since then, thousands of users have installed the data source to quickly and easily derive insights from their spreadsheet data. In this blog post, we’ll explore key features of the Google Sheets data source, as well as some helpful resources to install and start using the data source today.

Read Post

Grafana

Read more about Visualize Google Sheets data: how to turn your spreadsheets into Grafana dashboards

Getting MTTR to zero: the failed promise of observability

Mar 6, 2025 By Joe Kim In Sumo Logic

There’s an old cliche about sales and jobs to be done - no one wants to buy a drill, they need a hole… actually, they want a home with pictures on the wall. To get to that beautifully designed home, they will buy a drill, make holes for brackets that can support their various artwork and family photos, and progress toward their dream home experience. Similarly, no one wants to buy observability software. They want their mean time to resolve (MTTR) issues to be zero.

Read Post

Sumo Logic

Read more about Getting MTTR to zero: the failed promise of observability

Monitoring Netdata Restarts: A Journey to a Reliable and High-Performance Solution

Mar 6, 2025 By Netdata Team In netdata

For a tool like Netdata, monitoring crashes and abnormal events extends far beyond bug fixing—it’s essential for identifying edge cases, preventing regressions, and delivering the most dependable observability experience possible. With millions of daily downloads, each event provides a vital signal for maintaining the integrity of our systems.

Read Post

netdata

Read more about Monitoring Netdata Restarts: A Journey to a Reliable and High-Performance Solution

Secure Your Sign-Ins with AppSignal's Single Sign-On

Mar 6, 2025 By Connor James In AppSignal

Managing team access to your organization's AppSignal account just got easier. We're excited to introduce our new Security Assertion Markup Language (SAML) Single Sign-On (SSO) Business Add-On — a secure solution designed to integrate effortlessly with your existing identity provider. This powerful feature streamlines login processes and enhances secure access management across your organization, making single sign-on a breeze.

Read Post

AppSignal

Read more about Secure Your Sign-Ins with AppSignal's Single Sign-On

Top tips: 5 potential use cases of 6G networks

Mar 6, 2025 By AlarmsOne In ManageEngine

Top tips is a weekly column where we highlight what’s trending in the tech world and list ways to explore these trends. This week, we’ll look at five areas where 6G technology will spawn rapid digital transformation. We’re a lucky generation—at least in the sense that we’re living on the precipice of entering the futuristic world we’ve seen in movies and TV shows.

Read Post

ManageEngine

Read more about Top tips: 5 potential use cases of 6G networks

NiCE Linux Power Management Pack 1.50 is Here

Mar 5, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

We are excited to announce the release of NiCE Linux Power Management Pack 1.50! This latest update brings significant security enhancements and ensures seamless compatibility with the latest enterprise monitoring environments. With support for OpenSSL 3.x, improved stability, and future-ready integrations, this release strengthens your Linux on IBM Power Systems monitoring like never before.

Read Post

NiCE IT Mgmt

Read more about NiCE Linux Power Management Pack 1.50 is Here

EventSentry v5.2: Processes, Security & Inventory

Mar 5, 2025 By ingmar.koecher In EventSentry

The latest iteration of EventSentry adds many powerful security features, continuing to enhance EventSentry’s ability to improve the security of Windows-based networks by strengthening its foundation and detecting suspicious behavior.

Read Post

EventSentry

Read more about EventSentry v5.2: Processes, Security & Inventory

Unlocking Enhanced Observability and Troubleshooting with MetrixInsight for Citrix VAD/DaaS

Mar 5, 2025 By GripMatix In GripMatix

We are excited to introduce a series of powerful new features in the latest update of MetrixInsight for Citrix VAD/DaaS. These enhancements bring greater visibility, improved troubleshooting capabilities, and deeper integration within SCOM, making it easier than ever to monitor and manage your Citrix environments effectively.

Read Post

GripMatix

Read more about Unlocking Enhanced Observability and Troubleshooting with MetrixInsight for Citrix VAD/DaaS

Less than a quarter of IT pros say their budget is sufficient

Mar 5, 2025 By SolarWinds In SolarWinds

Over two-fifths (43%) consider budgetary issues to be their company's biggest challenge this year.

Read Post

SolarWinds

Read more about Less than a quarter of IT pros say their budget is sufficient

Turns any command into a plugin: check_rungrep

Mar 5, 2025 By Alexander Klimov In Icinga

Imagine you have one more special thing to monitor. While our Icinga 2 can observe infrastructure of almost any size, it still needs a plugin for each kind of check. Unfortunately not every command meets the monitoring plugin API: exit code 0-3 (ok, warning, critical, unknown), performance data, etc. E.g. often programs exit with 1 in case of a fatal error, which is considered just a warning by Icinga.
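The exit-code mismatch described above can be illustrated with a small wrapper. This is only a minimal sketch, not the check_rungrep implementation: the function name `run_as_check` and the mapping rule (any non-zero exit becomes CRITICAL, a command that cannot be started becomes UNKNOWN) are assumptions chosen for illustration.

```python
import subprocess
import sys

# Monitoring plugin exit codes as listed in the teaser above.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_as_check(command):
    """Run a command and translate its exit code into a plugin-style result.

    Hypothetical mapping: exit code 0 stays OK, any other exit code becomes
    CRITICAL, and a command that cannot be started becomes UNKNOWN.
    """
    if not command:
        return UNKNOWN, "UNKNOWN - no command given"
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN - could not run command: {exc}"
    if result.returncode == 0:
        return OK, "OK - command succeeded"
    return CRITICAL, f"CRITICAL - command exited with {result.returncode}"


# A program that exits with 1 on a fatal error is reported as CRITICAL
# here instead of being read as a plain WARNING.
code, message = run_as_check([sys.executable, "-c", "import sys; sys.exit(1)"])
print(code, message)
```

A real plugin would end with `sys.exit(code)` so that the monitoring system sees the translated status, and would usually print performance data as well.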

Read Post

Icinga

Read more about Turns any command into a plugin: check_rungrep

From detection to resolution: The DEM workflow

Mar 5, 2025 By Bela Susan Thomas In Site24x7

Like finicky eaters, customers look for a smooth, satisfying meal in which each course meets their needs. A slow server, a confusing menu, or a process hiccup takes away from the entire experience. Companies need a strong tool, such as digital experience monitoring (DEM), not only to spot problems but also to fix them promptly. Like a kitchen manager sourcing ingredients and presenting the food, the site owner makes sure everything goes off without a hitch.

Read Post

Site24x7

Read more about From detection to resolution: The DEM workflow

New Integration: GitHub Issues

Mar 5, 2025 By Pēteris Caune In Healthchecks

Healthchecks can now notify you about a failing check by opening a new issue in your chosen GitHub repository. The technical side of creating a new issue is straightforward: GitHub has an API call for creating an issue. You make an HTTP POST request with an access token in a request header and the issue title, body, and labels in the request body. However, where do we get the access token from? The API call accepts three types of access tokens.
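The request described above could look roughly like the following sketch. It is not the Healthchecks implementation: the helper name `build_issue_request`, the repository, the token, and the issue text are all placeholders, and the endpoint and headers follow the public GitHub REST API as I understand it. The request is only built, never sent.

```python
import json
import urllib.request


def build_issue_request(owner, repo, token, title, body, labels):
    """Build (but do not send) a POST request that would create a GitHub issue."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues"
    # Title, body, and labels travel in the JSON request body.
    payload = json.dumps({"title": title, "body": body, "labels": labels})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            # The access token travels in a request header.
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


# All values below are hypothetical placeholders.
request = build_issue_request(
    owner="example-org",
    repo="example-repo",
    token="PLACEHOLDER_TOKEN",
    title="Check 'nightly-backup' is down",
    body="The check has not reported in on time.",
    labels=["monitoring"],
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. Which kind of token to put in the header is the question the post goes on to discuss.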

Read Post

Healthchecks

Read more about New Integration: GitHub Issues

New Relic vs Zabbix - Which Monitoring Tool to Choose? [2025 Guide]

Mar 5, 2025 By Pavithra Parthiban In Atatus

Monitoring and observability are critical for ensuring system performance, stability, and reliability. New Relic and Zabbix are two widely used monitoring solutions, each catering to different needs. While Zabbix focuses on comprehensive infrastructure monitoring, New Relic excels in application performance monitoring (APM) and full-stack observability.

Read Post

Atatus

Read more about New Relic vs Zabbix - Which Monitoring Tool to Choose? [2025 Guide]

Why you should never use page.waitForTimeout() in Playwright

Mar 5, 2025 By Nočnica Mellifera In Checkly

Playwright isn’t a testing framework. Sure it’s got assertions, scripted behaviors, even controls over environments. But testing isn’t Playwright’s only purpose. Playwright is an automation tool. It can carry out any browser-based action consistently, and carry out instructions robustly. Locators for buttons and other elements aren’t visual or CSS class-based, but based on ARIA role, and even small styling changes won’t make the scripted action fail.

Read Post

Checkly

Read more about Why you should never use page.waitForTimeout() in Playwright

Does AI Help Write Better Software, or Just... More Code?

Mar 5, 2025 By Fahim Zaman In Honeycomb

As software teams race to integrate AI into their development workflows, we need to ask ourselves: are AI-powered tools actually making software better? The latest research from DORA confirms what many engineers have long suspected, and what we at Honeycomb have said for a long time: AI tools don’t magically lead to better software. In fact, without careful implementation, AI can introduce a whole slew of challenges, including decreased productivity and unreliable code.

Read Post

Honeycomb

Read more about Does AI Help Write Better Software, or Just... More Code?

Synthetic Monitoring: Frequently Asked Questions

Mar 5, 2025 By AlertBot In AlertBot

One of the most important features in a comprehensive enterprise-grade web monitoring solution is synthetic monitoring. Below, we answer some frequently asked questions, so that you can clearly understand what this is, how it works, and why it’s essential vs. optional.

Read Post

AlertBot

Read more about Synthetic Monitoring: Frequently Asked Questions

Grafana Alloy: OpenTelemetry, With Some Abstraction Issues

Mar 5, 2025 By Martin McLarnon In Coralogix

OpenTelemetry (OTel) is supposed to be the great equalizer in observability, giving teams full control over how they collect, process, and store telemetry data. It was built to be open, flexible, and vendor-neutral. Grafana Alloy claims to be OpenTelemetry-compatible, but scratch beneath the surface, and you’ll see that, based on our investigations, it is not a neutral OpenTelemetry Collector.

Read Post

Coralogix

Read more about Grafana Alloy: OpenTelemetry, With Some Abstraction Issues

Unlocking Zephyr Debugging

Mar 5, 2025 By Percepio In Percepio

If you’ve been working with Zephyr RTOS, you know how powerful and flexible it is for embedded development. At Percepio, we appreciate Zephyr’s hardware abstraction and kernel architecture, which make it easy to get up and running on a wide range of hardware. Now, we have exciting news for developers looking to improve their Zephyr debugging and performance analysis: we’ve validated that Percepio Tracealyzer works on over 600 Zephyr-supported development boards!

Read Post

Percepio

Read more about Unlocking Zephyr Debugging

Advanced Container Resource Monitoring with docker stats

Mar 5, 2025 By Preeti Dewani In Last9

If you’ve ever needed to check how much CPU or memory a Docker container is using, docker stats is the command for the job. It provides real-time resource usage metrics, helping you monitor and troubleshoot containers efficiently. This guide covers everything you need to know about docker stats: how to use it, what each metric means, and how to integrate it into a larger monitoring setup.
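As a rough illustration of feeding docker stats into a larger setup, the sketch below parses output in the shape produced by `docker stats --no-stream --format "{{json .}}"`, which prints one JSON object per container. The sample lines, container names, and the threshold are made up for illustration, and the helper names are hypothetical. Only the `Name`, `CPUPerc`, and `MemPerc` fields are used.

```python
import json

# Hypothetical sample in the shape of:
#   docker stats --no-stream --format "{{json .}}"
SAMPLE_OUTPUT = """\
{"Name": "web", "CPUPerc": "12.50%", "MemPerc": "3.20%"}
{"Name": "db", "CPUPerc": "85.00%", "MemPerc": "40.10%"}
"""


def parse_percent(text):
    """Convert a string such as '12.50%' into the float 12.5."""
    return float(text.strip().rstrip("%"))


def busy_containers(output, cpu_threshold):
    """Return the names of containers whose CPU usage exceeds the threshold."""
    names = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        if parse_percent(row["CPUPerc"]) > cpu_threshold:
            names.append(row["Name"])
    return names


print(busy_containers(SAMPLE_OUTPUT, cpu_threshold=80.0))  # prints ['db']
```

In practice the output string would come from running the command, for example through `subprocess.run`, rather than from a hard-coded sample.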

Read Post

Last9

Read more about Advanced Container Resource Monitoring with docker stats

Revolutionizing Incident Management with AI: Meet Mo Copilot

Mar 5, 2025 By Sumo Logic In Sumo Logic

Join us for this webinar as we explore how our newly launched Sumo Logic Mo Copilot redefines incident management with the power of AI. We'll examine the limitations of traditional troubleshooting methods and why they fall short in today’s fast-paced environments. Discover how Mo Copilot leverages advanced machine learning and automation to streamline root cause analysis and reduce mean time to resolution (MTTR). We'll also showcase a live demonstration and highlight how Mo Copilot integrates into your workflow, transforming how you manage operational reliability.

View Video

Sumo Logic

Read more about Revolutionizing Incident Management with AI: Meet Mo Copilot

How I used Graylog to Fix my Internet Connection

Mar 5, 2025 By The Graylog Team In Graylog

In today’s digital age, the internet has become an integral part of our daily lives. From working remotely to streaming movies, we rely on the internet for almost everything. However, slow internet speeds can be frustrating and can significantly affect our productivity and entertainment. Despite advancements in technology, many people continue to face challenges with their internet speeds, hindering their ability to fully utilize the benefits of the internet.

Read Post

Graylog

Read more about How I used Graylog to Fix my Internet Connection

Getting Started with MSSQL Dashboards

Mar 5, 2025 By Sameer Mhaisekar In Squared Up

If you're a Microsoft shop, it is a given that you have loads of SQL databases lying around with critical data in them. SquaredUp lets you connect to your SQL database and run SQL queries on the database to fetch data and build dashboards with it. Let's see how.

Read Post

Squared Up

Read more about Getting Started with MSSQL Dashboards

Bindplane Community Call in March 2025

Mar 5, 2025 By ObservIQ In ObservIQ

Tune in for the Bindplane Community Call in March to learn more about meeting the Bindplane team at KubeCon Europe in London, fresh product and community updates, and hands-on demos. March 12th at 11:00 am ET. Join us live and bring your questions for a dedicated Q&A session!

View Video

ObservIQ

Read more about Bindplane Community Call in March 2025

Sponsored Post

Using observability tools for security monitoring and incident detection

Mar 4, 2025 By Rowan Tandy In Raygun

Most security teams overlook a goldmine of data sitting right in their applications - crash reports and Real User Monitoring (RUM) telemetry. While engineers typically use these tools for performance tracking, they can reveal security incidents that might otherwise go unnoticed. Let's explore some practical ways to turn your observability data into a powerful security monitoring system.

Read Post

Raygun

Read more about Using observability tools for security monitoring and incident detection

What Is Jitter in Networking: The Network Jitterbug

Mar 4, 2025 By Alyssa Lamberti In Obkio

Welcome to the world of networking, where seamless connections keep businesses running smoothly. Today, we’re diving into a common but often misunderstood issue: jitter. You might be wondering, what exactly is jitter? Simply put, it’s the variation in packet arrival times that can cause choppy video calls, laggy VoIP conversations, and disrupted online experiences.
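To make "variation in packet arrival times" concrete, here is a minimal sketch that uses one simple definition: the mean absolute difference between consecutive latency samples. This is an assumption for illustration, not necessarily how Obkio measures jitter (RTP, for instance, uses a smoothed estimator defined in RFC 3550), and the sample latencies are made up.

```python
def mean_jitter(latencies_ms):
    """Mean absolute difference between consecutive latency samples, in ms."""
    if len(latencies_ms) < 2:
        return 0.0  # jitter is undefined for fewer than two samples
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return sum(diffs) / len(diffs)


# Hypothetical one-way latencies for five consecutive packets.
samples = [20.0, 22.0, 19.0, 30.0, 21.0]

# Differences are 2, 3, 11, and 9 ms, so the mean is 25 / 4.
print(mean_jitter(samples))  # prints 6.25
```

A perfectly steady stream, even a slow one, scores 0.0 here, which is why jitter is tracked separately from latency.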

Read Post

Obkio

Read more about What Is Jitter in Networking: The Network Jitterbug

ScienceLogic Transforms Computacenter's IT Operations, Achieving 50% Reduction in Incident Response Times

Mar 4, 2025 By ScienceLogic In ScienceLogic

Since our inception in 2003, ScienceLogic has been dedicated to empowering our partners with innovative solutions that deliver exceptional visibility and insights into their and their clients’ IT environments. Our mission is to help these organizations navigate complexity, transform inefficiencies into productive outcomes, and achieve and exceed their business goals.

Read Post

ScienceLogic

Read more about ScienceLogic Transforms Computacenter's IT Operations, Achieving 50% Reduction in Incident Response Times

Exciting Security Enhancements: Stronger, Smarter Access Tokens

Mar 4, 2025 By Rollbar In Rollbar

Security has been our top priority over the last year, and we’re rolling out major improvements to account and project access tokens to bring Rollbar up to today’s security standards. Newly created tokens will be stored in an encrypted format, inaccessible via the UI or API after being created, and you will be able to manually encrypt your existing tokens. This change to token storage will give you more control over who can submit, access or update data in your system.

Read Post

Rollbar

Read more about Exciting Security Enhancements: Stronger, Smarter Access Tokens

Everything You Need to Know About SIEM Logs

Mar 4, 2025 By Anjali Udasi In Last9

That moment when your production system goes down, and you're stuck piecing together logs from twenty different services? It’s frustrating and slow—especially when you need answers fast. SIEM logs help bring order to this chaos, giving you a structured way to track security events and system activity. But understanding how to use them effectively isn’t always straightforward, and most documentation can feel more complicated than the problem itself.

Read Post

Last9

Read more about Everything You Need to Know About SIEM Logs

Getting Started with the Grafana API: Practical Use Cases

Mar 4, 2025 By Prathamesh Sonpatki In Last9

Building dashboards one by one in Grafana can quickly become tedious. Clicking through the UI for every change isn’t exactly efficient. There’s a better way. The Grafana API lets you automate repetitive tasks and extend Grafana’s capabilities beyond the UI. If you're new to monitoring or managing a complex observability setup, understanding the API can make your workflow more efficient and scalable.

Read Post

Last9

Read more about Getting Started with the Grafana API: Practical Use Cases

Python Logging Exceptions: The Setup Guide You Actually Need

Mar 4, 2025 By Preeti Dewani In Last9

Debugging a Python app can be frustrating, especially when an unexpected crash leaves behind nothing but a vague error message. A well-configured exception log can make all the difference, turning guesswork into clear insights. Here’s how to set up logging that actually helps.

Read Post

Last9

Read more about Python Logging Exceptions: The Setup Guide You Actually Need

An Introduction to Absinthe for Elixir Monitoring with AppSignal

Mar 4, 2025 By Sapan Diwakar In AppSignal

Absinthe is a popular GraphQL toolkit for building robust APIs in Elixir. Monitoring such APIs is essential to ensure performance, detect bottlenecks, and handle errors effectively. AppSignal offers a seamless way to monitor and gain insights into your Absinthe-powered GraphQL APIs, enabling you to keep applications performant and reliable.

Read Post

AppSignal

Read more about An Introduction to Absinthe for Elixir Monitoring with AppSignal

Accelerate Network Incident Response With AppNeta, Automic Automation, and ConnectALL

Mar 4, 2025 By Nestor Falcon Gonzalez In Broadcom

Enabling accurate exchange of information between key applications has become crucial in today’s hybrid and complex IT operations. When we speak with potential customers, one common question we hear is, “How easy is it to consume and integrate the insights generated by Network Observability by Broadcom?” This might sound like table stakes, but it is often a challenge due to siloed teams, the high levels of expertise required, different data formats, and time-consuming processes.

Read Post

Broadcom

Read more about Accelerate Network Incident Response With AppNeta, Automic Automation, and ConnectALL

5 strategies to reduce false alerts in server monitoring

Mar 4, 2025 By Geoffrin Edwin In Site24x7

There are two types of alerts you don't want, and we call them false alerts. As a person with responsibility over your IT infrastructure, it is natural that you have configured your monitoring systems to alert you at every step. But when these false alerts take up too much of your time, real problems can follow. Let's explore more about false alerts before we dive into five strategies to avoid them.

Read Post

Site24x7

Read more about 5 strategies to reduce false alerts in server monitoring

The critical role of Kafka monitoring in managing big data streams

Mar 4, 2025 By Sinjan Ballav In Site24x7

Apache Kafka is the backbone of modern data streaming architectures, enabling real-time data movement, stream processing, and event-driven applications at scale. It enables high-throughput messaging between data sources and analytics platforms, supports log aggregation, and facilitates scalable extract, transform, load (ETL) pipelines for continuous data transformation and storage.

Read Post

Site24x7

Read more about The critical role of Kafka monitoring in managing big data streams

Java on containers: a guide to efficient deployment

Mar 4, 2025 By Nicholas Thomson In Datadog

Java remains one of the most widely used programming languages today, especially in enterprise backend systems—and for many good reasons. With each new release, Java’s robust runtime offers additional improvements in performance, security, scalability, and developer productivity. The portability of its code has proven increasingly relevant and useful as the industry embraces ARM64, making Java one of the go-to languages for modern workloads.

Read Post

Datadog

Read more about Java on containers: a guide to efficient deployment

Monitoring single-page app interactivity with Core Web Vitals and Datadog

Mar 4, 2025 By Addie Beach In Datadog

Web applications generate a wealth of performance data, but it’s challenging to know exactly which metrics are the most useful for monitoring your user experience. Focusing on irrelevant metrics wastes time and resources—but if you pare down the data you’re observing too much, you may miss critical insights.

Read Post

Datadog

Read more about Monitoring single-page app interactivity with Core Web Vitals and Datadog

Work faster with Sumo Logic: Mo Copilot, Otel Remote Management and more

Mar 4, 2025 By Hadijah Creary In Sumo Logic

Are you tired of always digging through data and not finding what you're looking for? We get it. Troubleshooting and data analysis should be easier, not harder, especially when time is of the essence. To simplify your work life, we’ve introduced several powerful new features designed to eliminate wasted time and help you focus on what matters: less time troubleshooting and more time building.

Read Post

Sumo Logic

Read more about Work faster with Sumo Logic: Mo Copilot, Otel Remote Management and more

Building Your First Python Plugin for the InfluxDB 3 Processing Engine

Mar 4, 2025 By Anais Dotis-Georgiou In InfluxData

One of the most compelling features of InfluxDB 3 is its built-in Python Processing Engine, a versatile component that adds powerful, real-time processing capabilities to both InfluxDB 3 Core and Enterprise. For those familiar with Kapacitor in InfluxDB 1.x or Flux Tasks in 2.x, the Processing Engine represents a more streamlined, integrated, and scalable approach to acting on data.

Read Post

InfluxData

Read more about Building Your First Python Plugin for the InfluxDB 3 Processing Engine

Challenges in Kubernetes monitoring and how to overcome them

Mar 4, 2025 By Applications Manager In ManageEngine

Kubernetes has revolutionized how organizations deploy, scale, and manage containerized applications, offering unprecedented efficiency and flexibility. However, the very characteristics that make Kubernetes so powerful—its dynamic, distributed, and ephemeral nature—also create significant challenges for monitoring. Without robust monitoring capabilities, organizations struggle to identify and resolve performance bottlenecks, optimize resource utilization, and maintain security.

Read Post

ManageEngine

Read more about Challenges in Kubernetes monitoring and how to overcome them

How I Code With LLMs These Days

Mar 4, 2025 By Phillip Carter In Honeycomb

I first started using AI coding assistants in early 2021, with an invite code from a friend who worked on the original GitHub Copilot team. Back then, the workflow was just single-line tab completion, but you could also guide code generation with comments and it’d try its best to implement what you want. Fast forward to 2025. There’s now a wide range of coding assistants that are packed with features.

Read Post

Honeycomb

Read more about How I Code With LLMs These Days

How to monitor your Shopify store with Grafana Cloud Frontend Observability

Mar 4, 2025 By Mark Covello In Grafana

Shopify is a fantastic tool for organizations who want to sell products, but don’t want to build or maintain an e-commerce platform themselves. Even some of the largest brands that have built their own e-commerce platforms in the past have seen the value of using Shopify to accelerate their business. As your Shopify site scales and grows, however, you may need more insight into the performance of your store.

Read Post

Grafana

Read more about How to monitor your Shopify store with Grafana Cloud Frontend Observability

Top B2B eCommerce Strategies for 2025: Less Hassle, More Sales

Mar 4, 2025 By Germain UX Team In Germain UX

B2B eCommerce is finally catching up. While B2C has spent the last decade perfecting one-click checkouts and AI-powered recommendations, B2B has been stuck in the past, relying on email chains, phone orders, and clunky procurement systems. But that’s changing. Fast. With B2B eCommerce sales already more than double D2C sales (we’re talking $7.7 trillion vs. $3.8 trillion), companies are finally realizing they need to streamline and automate the way they sell.

Read Post

Germain UX

Read more about Top B2B eCommerce Strategies for 2025: Less Hassle, More Sales

What Is Powershell? An Introduction

Mar 4, 2025 By Stackify Team In Stackify

PowerShell is a command-line-based shell and scripting language that automates tasks on the Windows OS. PowerShell lets you automate any task normally done on Windows, like installing programs or updating software, allowing you to complete those tasks faster and on a larger scale. You can even extend its powers with Azure PowerShell to control Azure’s robust functionality, allowing you to use cmdlets to provision VMs, create cloud services, and carry out a number of other complex processes.

Read Post

Stackify

Read more about What Is Powershell? An Introduction

When AI tools fail: How to map your AI dependencies for proactive visibility

Mar 4, 2025 By Ankit Kumar In Catchpoint

AI platforms have experienced several service interruptions over the past few months. We’ve all seen the memes fly when ChatGPT, Gemini or Perplexity go down. They’re funny at first, but then reality hits: if you rely on AI tools for work or business, these outages can grind your day to a halt.

Read Post

Catchpoint

Read more about When AI tools fail: How to map your AI dependencies for proactive visibility

EventSentry Training [11-09]: Sysmon Management / Security & Compliance

Mar 4, 2025 By EventSentry In EventSentry

Shows how to deploy Sysmon and centrally manage the Sysmon configuration file with EventSentry.

View Video

EventSentry

Read more about EventSentry Training [11-09]: Sysmon Management / Security & Compliance

DEM 101: Understanding and implementing digital experience monitoring

Mar 4, 2025 By Bela Susan Thomas In Site24x7

A faulty engine in a high-performance car: how disappointing can that be? The same is true of a slow-loading, poorly performing webpage for any digital business. All such a page gains is a group of tired, irritated customers and a loss of trust in the brand. Modern businesses need a fast, reliable, and seamless digital experience. Proactive monitoring of the user experience, which means understanding how users interact with all digital touchpoints, is vital.

Read Post

Site24x7

Read more about DEM 101: Understanding and implementing digital experience monitoring

Why you shouldn't run tests sequentially

Mar 4, 2025 By Nočnica Mellifera In Checkly

Frequently in support conversations and posts on Playwright forums, a problem has come up that’s a little bit hard to describe, but comes down to synchronous testing: developers writing a series of Playwright tests that operate on the assumption that one of the tests will either run first or run last, and perform the function of a setup and cleanup script.

Read Post

Checkly

Read more about Why you shouldn't run tests sequentially

Unlocking the Value of Network Observability

Mar 4, 2025 By Gedeon Hombrebueno In Broadcom

Today, a strong network forms the backbone of business success, making network visibility crucial. As modern networks continue their rapid evolution, it's essential to have an observability solution that is robust, resilient, and scalable. Teams need a solution that helps them enhance network performance and improve user experiences. They need a solution that enables them to confidently face current and future network operations challenges. Network Observability by Broadcom is that solution.

Read Post

Broadcom

Read more about Unlocking the Value of Network Observability

February product updates

Mar 4, 2025 By Colin Bartlett In StatusGator

February may be behind us, but we’ve got some great updates to share! Here’s a quick recap of what’s new at StatusGator to improve your monitoring experience.

Read Post

StatusGator

Read more about February product updates

Top 5 outages detected by StatusGator in February 2025

Mar 4, 2025 By Colin Bartlett In StatusGator

Service disruptions can happen at any time, affecting communication, productivity, and access to critical platforms. In February, several major services experienced outages, causing frustration for users worldwide. With its Early Warning Signals feature, StatusGator detected these issues in real time—often before official acknowledgments—helping users stay informed and prepared. Here are five notable outages from the past month.

Read Post

StatusGator

Read more about Top 5 outages detected by StatusGator in February 2025

ilert and Netdata: AIOps from Monitoring to Alerting

Mar 4, 2025 By Netdata In netdata

What is most important for efficient incident management? Effective incident management starts before incidents occur. Ideally, alerts should trigger preemptively to prevent outages or fire immediately when issues arise, minimizing downtime and resolution time.

View Video

netdata

Read more about ilert and Netdata: AIOps from Monitoring to Alerting

Is your #observability always one step behind?

Mar 4, 2025 By Netdata In netdata

Guess what: It is designed to be like that! And the only way for you to get ahead of your operational challenges is to think differently. With Netdata, you get high-fidelity, ultra-detailed insights with unmatched granularity and cardinality and instant root cause analysis. See your infrastructure like never before! Get X-Ray Vision for your infrastructure!

View Video

netdata

Read more about Is your #observability always one step behind?

IT Monitoring News | March '25 Edition

Mar 3, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Welcome to the March edition of the NiCE bi-monthly monitoring news! As the year starts to take full swing, we’re excited to bring you the latest updates, insights, and events to keep you at the forefront of IT monitoring. With significant developments, there’s much to explore, prepare for, and leverage in the coming months. Stay tuned!

Read Post

NiCE IT Mgmt

Read more about IT Monitoring News | March '25 Edition

What Causes Jitter: Your Go-To Troubleshooting Resource

Mar 3, 2025 By Andrii Kernitskyi In Obkio

Jitter is one of the most common (and frustrating) network issues, impacting both individuals and businesses. Whether it's choppy video calls, laggy online meetings, or inconsistent VoIP quality, jitter can quickly derail productivity and communication. But before jumping straight into troubleshooting, it's essential to understand what actually causes jitter in the first place.

Read Post

Obkio

Read more about What Causes Jitter: Your Go-To Troubleshooting Resource

Dashboard Studio: Your Dashboards, Now Guest-Friendly

Mar 3, 2025 By Lizzy Li In Splunk

In Splunk Cloud Platform 9.3.2411, we’re excited to announce support for publishing dashboards, the button input, and a number of small enhancements that will level up your dashboarding experience.

Read Post

Splunk

Read more about Dashboard Studio: Your Dashboards, Now Guest-Friendly

EC2 Monitoring: A Practical Guide for AWS Engineers

Mar 3, 2025 By Anjali Udasi In Last9

Monitoring your EC2 instances shouldn’t be complicated or exhausting. Yet, too often, engineers find themselves troubleshooting issues in the middle of the night, searching for the root cause of an unexpected failure. Whether you're managing a few instances or hundreds spread across multiple regions, effective EC2 monitoring helps you stay ahead of problems instead of constantly reacting to them. And if you've ever dealt with a critical alert at an inconvenient hour, you know how important that is.

Read Post

Last9

Read more about EC2 Monitoring: A Practical Guide for AWS Engineers

Nginx Error Logs: Troubleshooting and Security Guide

Mar 3, 2025 By Preeti Dewani In Last9

Nginx error logs can be tough to decipher, even for experienced sysadmins and DevOps engineers. They hold valuable clues about what’s going wrong, but sorting through them can feel overwhelming. Understanding these logs doesn’t have to be a challenge. This guide breaks them down in a clear, practical way—so you can find the issues that matter and fix them with confidence.

Read Post

Last9

Read more about Nginx Error Logs: Troubleshooting and Security Guide

How to Use journalctl --last to Check Recent System Logs

Mar 3, 2025 By Ujjwal Goyal In Last9

When your Linux server starts acting up at 3 AM, you don't need a philosophy lesson—you need answers. Fast. That's where journalctl last comes in, the command-line equivalent of having a time machine for your system's events. If you've been piecing together log information like some digital detective with a cork board and string, it's time to upgrade your toolkit. Let's cut through the noise and get you the intel you need, when you need it.

Read Post

Last9

Read more about How to Use journalctl --last to Check Recent System Logs

Cut Costs, Not Insights: A Practical Guide to Telemetry Data Optimization - A Mezmo Webinar

Mar 3, 2025 By Mezmo In Mezmo

Managing telemetry data efficiently is a constant balancing act—how do you maximize visibility while controlling costs? In this webinar, we’ll show you how Mezmo’s telemetry pipeline helps you make smarter decisions about your data.

View Video

Mezmo

Read more about Cut Costs, Not Insights: A Practical Guide to Telemetry Data Optimization - A Mezmo Webinar

EDR and Endpoint Security

Mar 3, 2025 By Isaac García In Pandora FMS

Endpoints are the primary target of cyberattacks. The most conservative estimates indicate that between 68% and 70% of data breaches begin on these devices. This is why implementing an EDR (Endpoint Detection and Response) solution is crucial to protect them in today’s cyber threat landscape.

Read Post

Pandora FMS

Read more about EDR and Endpoint Security

Top Audit Logging Best Practices

Mar 3, 2025 By David Benson In Logit.io

Audit logs, otherwise referred to as audit trails, are detailed records that document activities or a sequence of activities or events. Typically, they deal with the usage of systems, applications, and/or networks. They are crucial in ensuring security, compliance, and operational oversight and enable users to keep track of the history of all actions executed and who has done what and when.

Read Post

Logit.io

Read more about Top Audit Logging Best Practices

OpenShift vs. Kubernetes: What's the Difference?

Mar 3, 2025 By Wendy Howard In eG Innovations

If asked even a year ago to forecast the most dominant technologies of 2024, it may not be too surprising that containerization would be among those seeing widespread adoption. Now commonplace for modern app development, organizations are faced with deciding between two leading container orchestration platforms: OpenShift and Kubernetes, each touting superior orchestration. With both platforms vying for a share in the market, many struggle to choose one over the other.

Read Post

eG Innovations

Read more about OpenShift vs. Kubernetes: What's the Difference?

WhatsUp Gold Device Group Access Rights

Mar 3, 2025 By WhatsUp Gold In WhatsUp Gold

Watch this video to learn about Device Group Access Rights, which allows you to fine-tune Read and Write access to monitored devices in WhatsUp Gold. Find more information on WhatsUp Gold and Device Group Access Rights: WhatsUp Gold Device Group Access Rights online training; WhatsUp Gold User Authentication and Device Group Access Rights Learning Path; Device Group Access Rights documentation.

View Video

WhatsUp Gold

Read more about WhatsUp Gold Device Group Access Rights

AI: Where in the Loop Should Humans Go?

Mar 3, 2025 By Fred Hebert In Honeycomb

AI is everywhere, and its impressive claims are leading to rapid adoption. At this stage, I’d qualify it as charismatic technology—something that under-delivers on what it promises, but promises so much that the industry still leverages it because we believe it will eventually deliver on these claims. This is a known pattern.

Read Post

Honeycomb

Read more about AI: Where in the Loop Should Humans Go?

Turn Your Observability Data into Actionable Insights with DataPrime- Live Webinar

Mar 3, 2025 By Coralogix In Coralogix

Turn Your Observability Data into Actionable Insights with DataPrime: Live Webinar, March 3, 2025, 08:00-09:45 (GMT) | 13:30-15:15 (IST)

View Video

Coralogix

Read more about Turn Your Observability Data into Actionable Insights with DataPrime- Live Webinar

Monitor OracleDB EX with OpenTelemetry and MetricFire

Mar 3, 2025 By Benjamin Pitts In MetricFire

OracleDB remains a top choice as a relational database management system (RDBMS), despite its strict licensing requirements. It excels at handling complex SQL queries, massive datasets, and transactional workloads, making it ideal for large enterprise technology stacks. Its many benefits include robust indexing, partitioning, and in-memory processing to optimize query performance at scale.

Read Post

MetricFire

Read more about Monitor OracleDB EX with OpenTelemetry and MetricFire

Why IT Directors Love StatusGator

Mar 3, 2025 By Colin Bartlett In StatusGator

Maintaining uptime and reliability becomes crucial as more businesses move to the cloud. While platforms like AWS, Azure, and Google Cloud offer flexibility and cost-effectiveness, they also introduce risks that can disrupt critical services. Recent events show how fragile cloud infrastructure can be. On July 19, 2024, a routine cybersecurity update caused a global IT outage. Beyond large-scale incidents, human error remains a leading cause of downtime in IT and data centers.

Read Post

StatusGator

Read more about Why IT Directors Love StatusGator

Splunk AI Assistant for SPL app Overview

Mar 3, 2025 By Splunk In Splunk

Learn and write SPL faster with the Splunk AI Assistant for SPL. New to SPL? Have users who are new to SPL? No problem! Check out the video to learn how you can leverage the power of generative AI to easily write and explain SPL queries using natural language.

View Video

Splunk

Read more about Splunk AI Assistant for SPL app Overview

The importance of benchmarking in digital experience monitoring

Mar 3, 2025 By Bela Susan Thomas In Site24x7

Having a smooth and effective online experience is now essential rather than a differentiator. Customer loss, damaged brand reputation, and eventually a sharp decline in profitability can all result from a subpar digital experience. Gaining a significant competitive edge and promoting ongoing improvement are two benefits of knowing how your digital experience compares to industry best practices.

Read Post

Site24x7

Read more about The importance of benchmarking in digital experience monitoring

Azure Tagging: A Comprehensive Guide for Technophiles

Mar 3, 2025 By Turbo360 In Turbo360

Businesses and enterprises with complex environments may find Azure resource management difficult. Resource tags in Azure help manage those environments effectively. They improve the visibility and governance of cloud resources by organizing, tracking, and optimizing them. This post examines Azure tags and ways to maximize their benefits for resource management.

Read Post

Turbo360

Read more about Azure Tagging: A Comprehensive Guide for Technophiles

Complexity Can Be Chaos

Mar 3, 2025 By Lauren Barnes In MetricFire

Monitoring is integral to understanding what is happening in your infrastructure, applications, or other observability projects. However, a common predicament developers can land themselves in is their observability stack becoming unwieldy and unmanageable due to a lack of streamlining and/or over-complicated code. To simplify your workload, it is important to streamline your monitoring.

Read Post

MetricFire

Read more about Complexity Can Be Chaos

Monitoring Distributed Systems

Mar 3, 2025 By Dotcom-Monitor In Dotcom-Monitor

Monitoring distributed systems is essential to keep your system running smoothly, efficiently, and reliably. With the growing reliance on distributed systems in everything from web services to cloud computing and large-scale applications, having a robust monitoring setup is crucial. Let’s dive into what distributed systems are, their different types, key characteristics, and how monitoring plays a critical role in maintaining their performance.

Read Post

Dotcom-Monitor

Read more about Monitoring Distributed Systems

The ultimate guide to cloud-native application performance monitoring with AWS, GCP, and Azure

Mar 2, 2025 By Sindu Priyadharshini V In Site24x7

The rapid adoption of cloud-native applications has revolutionized how businesses innovate, scale, and optimize costs. These applications leverage microservices, containers, and serverless functions, allowing seamless collaboration across multiple platforms like AWS, GCP, and Azure. However, managing performance in such a distributed environment presents challenges such as latency, security risks, and cost-inefficiencies.

Read Post

Site24x7

Read more about The ultimate guide to cloud-native application performance monitoring with AWS, GCP, and Azure

IT Status Page - Reduce IT Ticket Burden For Tech Companies

Mar 1, 2025 By Colin Bartlett In StatusGator

In 2025, tech teams such as IT departments, DevOps engineers, SREs, and SaaS providers rely heavily on, and spend more and more on, cloud services, APIs, and third-party tools to keep operations running smoothly. Many organizations manage dozens of critical platforms, from AWS and Google Cloud to collaboration tools like Slack and project management software like Jira. With this complexity, IT teams often face an overwhelming number of support tickets.

Read Post

StatusGator

Read more about IT Status Page - Reduce IT Ticket Burden For Tech Companies

Dotcom-Monitor's Role in Ensuring SLA Compliance

Mar 1, 2025 By Dotcom-Monitor In Dotcom-Monitor

When businesses promise their customers top-tier service, they often formalize these commitments in Service Level Agreements (SLAs). An SLA outlines performance standards such as uptime, response times, and issue resolution windows. However, meeting these standards is easier said than done. That’s where Dotcom-Monitor comes in by providing comprehensive monitoring solutions to help businesses ensure SLA compliance.

Read Post

Dotcom-Monitor

Read more about Dotcom-Monitor's Role in Ensuring SLA Compliance

Grow your MSP business without straining your staff

Mar 1, 2025 By Mia Martello In Martello Technologies

Our previous blog post explained how managing Microsoft Teams for enterprise clients can be a powerful way to grow your MSP business and boost managed service revenues. But seizing that opportunity requires capacity. For MSPs with tight margins and already maxed-out support analysts, delivering enhanced, high-value Teams services may seem out of reach.

Read Post

Martello Technologies

Read more about Grow your MSP business without straining your staff
