Monthly Archive

Migrate to SCOM 2025: A Seamless Transition for Enhanced Monitoring

Feb 28, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Are you ready for the next evolution of System Center Operations Manager (SCOM)? Microsoft launched SCOM 2025 in November 2024, bringing new enhancements and improved capabilities. To help you navigate the transition smoothly, we’re hosting an exclusive webinar where our experts will walk you through the migration process, best practices, and new feature highlights.

Sponsored Post

How to Quickly Analyze CloudFront Cloud Logs in Amazon S3

Feb 28, 2025 By David Bunting In ChaosSearch

Content delivery networks (CDNs) such as Amazon CloudFront generate a flood of log files. In today's world where your customers are all around the globe, it's important to make sure that your websites' application assets are as close to the users as possible.

Sponsored Post

Monitoring Cloud Foundry in SAP Business Technology Platform (BTP)

Feb 28, 2025 By Robert MacDonald In Avantra

Cloud Foundry is possibly the most popular environment on SAP Business Technology Platform. When customers build applications with the SAP Cloud Application Programming (CAP) framework to extend SAP S/4HANA solutions and achieve a clean core, they typically deploy using Cloud Foundry. After the applications on Cloud Foundry go into productive use, they become business critical and that creates a need for observability in those applications and the platform. Monitoring of Cloud Foundry is now an essential requirement of SAP operations teams.

The 8 Hidden Pitfalls of Using AWS CloudWatch

Feb 28, 2025 By Tomer Levy In logz.io

AWS CloudWatch is a widely used observability tool that comes built into AWS. It provides easy access to logs, metrics, and alarms, making it a convenient choice for teams monitoring AWS workloads. But while CloudWatch offers a lot of power, many teams unknowingly misconfigure or misuse it, leading to unexpected costs, limited visibility, and operational challenges. Here are some common pitfalls we see—and how to avoid them.

Moving to VDI? Don't Forget Your Web Apps

Feb 28, 2025 By Dave Wagner In Nexthink

I recently spoke to one of our customers in financial services, a company that offers its services through a network of agents located across the United States. The agents are customer-facing and revenue-generating. They rely on a variety of browser-based applications to deliver services to their clients, which makes these applications mission-critical.

StatusGator now monitors 5,000 services - and growing!

Feb 28, 2025 By Colin Bartlett In StatusGator

We’re thrilled to announce a major milestone: StatusGator now monitors more than 5,000 services! Whether you rely on cloud platforms, SaaS tools, developer APIs, or infrastructure providers, we’ve got you covered. Our extensive service coverage means you can track the status of all your critical dependencies in one place, reducing downtime surprises and keeping your team informed.

10 Reasons Why Tech Companies Need StatusGator

Feb 28, 2025 By Colin Bartlett In StatusGator

Reliance on cloud services for infrastructure, collaboration, and seamless operations continues to grow. With organizations investing heavily in cloud infrastructure, Statista projects that the global public cloud computing market will reach $127 billion. As a result, monitoring solutions have become indispensable. A well-monitored cloud environment helps reduce downtime, prevent revenue loss, and ensure smooth business operations.

Comparing Go vs Ruby

Feb 28, 2025 By Ayooluwa Isaiah In Honeybadger

Ruby and Rails are great tools that allow you to create complex web applications quickly. Well, some kinds of complex web applications. While they excel at traditional, monolithic, server-rendered applications, they fail to excel at delivering real-time or distributed services. This is why it's so handy for Rubyists to learn a programming language like Go. Go is designed to write lightweight services that handle lots of inbound connections.

Top B2C eCommerce Strategies in 2025: What's Actually Working

Feb 28, 2025 By Germain UX Team In Germain UX

ECommerce is a mess right now. Luxury platforms are crashing. Social commerce is booming (but probably not for long). CAC is through the roof. And somehow, despite all this, brands still need to find a way to stand out, sell, and make money. If you’re running an eCommerce brand in 2025, here’s what’s actually working—and what’s just hype.

Prometheus Functions: How to Make the Most of Your Metrics

Feb 28, 2025 By Preeti Dewani In Last9

Keeping track of your infrastructure is non-negotiable. Prometheus makes that easier by collecting metrics and alerting you when something’s off. It’s a powerful tool that helps you understand what’s happening under the hood, whether you’re running a small cluster or managing large-scale applications. In this guide, we’ll break down Prometheus functions—what they do, how they work, and why they matter for better observability. Let’s get into it.

CloudFront on AWS: Basics & Setup Guide

Feb 28, 2025 By Ujjwal Goyal In Last9

Some websites load in a snap, while others make you wonder if the internet is broken. The difference? Often, it comes down to how (and where) their content is served. A Content Delivery Network (CDN) helps by storing copies of your content in multiple locations worldwide, so users don’t have to wait for a distant server to respond. If you're on AWS, CloudFront is the built-in way to do this—helping speed things up while also handling security and traffic optimization.

How to implement multi-window, multi-burn-rate alerts with Grafana Cloud

Feb 28, 2025 By Andrew Dedesko In Grafana

Andrew Dedesko is a backend software engineer with 13 years of experience. He became very interested in metrics and alerting after being woken up countless nights while on call. Outside of work, Andrew likes cycling, camping, making s’mores, and pancakes. Adriano Mariani is a software engineer with three years of experience specializing in backend software development. Currently, Adriano is working at Kijiji on SEO-related initiatives.

OpenTelemetry vs. Datadog: Key Differences Explained

Feb 28, 2025 By Anjali Udasi In Last9

Choosing between OpenTelemetry and Datadog isn't just another tool decision. It's about how you'll monitor your systems, troubleshoot issues, and ultimately keep your services running smoothly. If you've been tasked with figuring out which route to take, you're in the right place. Let's get started!

Getting started with GitHub Actions dashboards

Feb 28, 2025 By John Hayes In Squared Up

If you are part of an engineering team, monitoring the performance of your CI/CD pipelines is a high priority. With the SquaredUp GitHub plugin you can view key metrics for your GitHub repos and workflows all within a single pane of glass. We also have plugins for Jira, CircleCI, Azure DevOps and more. So even if you are using many different tools, you can still get an end-to-end view of your processes.

Mastering NIS2 Compliance: Advanced Threat Detection Simplified

Feb 28, 2025 By Flowmon In Flowmon

In this webinar, “Mastering NIS2 Compliance: Advanced Threat Detection Simplified,” we’ll demystify NIS2 and demonstrate how the Progress Flowmon Network Detection and Response (NDR) solution can streamline compliance efforts and enhance an organization's security posture.

View Video

Future Proof Your IT Monitoring with Microsoft SCOM 2025

Feb 28, 2025 By NiCE IT Management Solutions In NiCE IT Mgmt

Are you ready to transition to System Center Operations Manager (SCOM) 2025? The new release has been available since November 2024, so organizations must prepare to migrate seamlessly while maximizing new features and capabilities. This webinar will guide IT administrators, system engineers, and decision-makers through the migration process, best practices, and key enhancements in SCOM 2025.

View Video

AI Agents: Hype or Reality?

Feb 28, 2025 By Dan Mindru In Sentry

A few years ago, it was all about Blockchain; before that, IoT, then Big Data, and even earlier, the Cloud. Each era brought a paradigm shift of sorts, drawing huge investments and promises. Some delivered, some didn’t, but they each brought advancement in tech. Today, we find ourselves fully embracing the AI hype cycle that started circa 2022 with OpenAI.

How To Monitor Server Uptime

Feb 28, 2025 By Lauren Barnes In MetricFire

Keeping your servers online is always important for the health of your business and keeping users happy. Essentially, if you are keeping an eye on your servers, you can proactively fix problems before they blow up rather than fighting them as they arise. Setting all this up can be a breeze or a bit of a headache, depending on your servers, what metrics you're tracking, and your expertise. Either way, MetricFire’s got your back!

Best incident management tools in 2025 [45 analyzed, top 3 picks]

Feb 27, 2025 By Leo Baecker In Hyperping

PagerDuty, Splunk, ServiceNow — with dozens of incident management tools on the market, how do you know which one to choose? Here's the reality — downtime costs organizations an average of $9,000 per minute. That's why companies are increasingly investing in incident management tools to reduce disruption and improve their incident response. But with the market evolving rapidly and new players emerging constantly, selecting the right tool has become more challenging than ever.

9 Essential Network Monitoring Protocols: An Overview

Feb 27, 2025 By Alyssa Lamberti In Obkio

Network monitoring protocols are essential for keeping your network running smoothly. They are data-collection and analysis techniques that provide insights into the health of your network and can help you identify and fix network problems before they cause major disruptions. Think of your network like a city's road system: data packets are cars, routers are traffic lights, and switches are intersections.

Introducing CartShark

Feb 27, 2025 By Georgina Grant-Muller In RapidSpike

Ecommerce websites are more vulnerable than ever to cyberattacks. Among these threats, web-skimming attacks – also known as data exfiltration or Magecart attacks – stand as the number one threat, targeting sensitive customer data and payment information. RapidSpike is proud to introduce CartShark, a revolutionary cybersecurity platform that empowers ecommerce businesses to combat these threats swiftly and effectively.

Crafting Connections for Elevating Vendor-Client Partnerships

Feb 27, 2025 By Raygun In Raygun

Join us for an episode of "Founder & Friends," featuring Kyra Abbu, as she shares her experiences in product management and the vital role of nurturing robust vendor-client relationships.

View Video

InfluxDB 3 Core and Enterprise Architecture Highlights

Feb 27, 2025 By Jameelah Mercer In InfluxData

Time series data innovators and open source community members following us will know that we recently released two new products: InfluxDB 3 Core and InfluxDB 3 Enterprise. InfluxDB 3 Core is a high-performance recent data engine optimized for real-time monitoring, data collection, and streaming analytics use cases. InfluxDB 3 Enterprise builds on Core’s foundation by integrating historical analysis and data compaction, enabling efficient querying over extended time ranges.

HTTP Caching Headers: The Complete Guide to Faster Websites

Feb 27, 2025 By Request Metrics In Request Metrics

The fastest website is the website that is already loaded, and that’s exactly what HTTP caching delivers. HTTP caching is a powerful technique that lets web browsers reuse previously loaded resources like pages, images, JavaScript, and CSS without downloading them again. Understanding HTTP caching headers is essential for web performance optimization, but misconfiguration can cause big performance problems.
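
As a rough illustration of the reuse decision described above, the sketch below checks whether a stored response may be served again without contacting the server, based only on its Cache-Control header and its age. It is a simplified assumption-laden example, not code from the linked guide: the helper names `parse_cache_control` and `is_fresh` are hypothetical, and real browser caches also consider `Expires`, `s-maxage`, validators such as `ETag`, and heuristic freshness.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header value into a {directive: value} mapping."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. "public") map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """Return True if a stored response can be reused without revalidation.

    Simplified: only max-age, no-store, and no-cache are considered.
    """
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        return False
    # A response is fresh while its age is below its freshness lifetime.
    return age_seconds < int(max_age)
```

For example, `is_fresh("public, max-age=3600", 120)` treats a two-minute-old response as reusable, while `is_fresh("no-store", 0)` never does.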

What's required for modern observability in 2025?

Feb 27, 2025 By Catchpoint In Catchpoint

In an era of rapid digital transformation, IT leaders and DevOps teams face mounting pressure to align IT metrics with business goals, simplify tool sprawl, and manage costs effectively.

View Video

It's time for a new approach: Edwin AI solves ITOps biggest challenges with agentic AI

Feb 27, 2025 By LogicMonitor In LogicMonitor

For years, the term “AIOps” has been tossed around, but for IT teams, it hasn’t really brought the change it promised. Gartner coined the term, promising that machine learning and AI would forever change how we manage IT operations. Yet, the reality has been underwhelming. For most teams, traditional AIOps has amounted to little more than event management with a shiny new label.

Everything You Need to Know About OpenTelemetry Agents

Feb 27, 2025 By Prathamesh Sonpatki In Last9

If you’re reading this, chances are you’re already familiar with OpenTelemetry (OTel)—the open-source standard for collecting observability data. But what about OpenTelemetry agents? How do they work, and why do they matter? This guide unpacks everything you need to know about OTel agents—where they fit in your stack, how to set them up, and common pitfalls to watch out for. Let’s get into it.

Handling persistent storage problems in Kubernetes clusters

Feb 27, 2025 By Grace Nalini In Site24x7

Persistent storage is the backbone of stateful applications running in Kubernetes. Whether you are managing databases, logs, or application states, ensuring transactional data remains intact despite pod restarts or node failures is a challenge. In this blog, we will discuss the most common persistent storage issues in Kubernetes and how to handle them with practical, real-world solutions.

How to Effectively Monitor Nginx and Prevent Downtime

Feb 27, 2025 By Anjali Udasi In Last9

Nginx is widely known for its high performance and reliability. However, just like any software running in production, it requires continuous monitoring to ensure smooth operation. Issues such as high latency, unexpected crashes, or overwhelming traffic spikes can lead to performance degradation or even complete outages. Therefore, implementing a robust monitoring strategy is crucial to maintaining the health and stability of your Nginx deployment.

Troubleshooting Kubernetes deployment failures

Feb 27, 2025 By Grace Nalini In Site24x7

Do you feel like you're solving a puzzle when deploying applications in Kubernetes? You are not alone in this! When something goes wrong during application deployment, it becomes all the more crucial to diagnose the issue methodically and get things back on track. This guide walks you through practical steps for troubleshooting deployment failures efficiently.

Monitoring for Kubernetes API server performance lags

Feb 27, 2025 By Grace Nalini In Site24x7

The Kubernetes API server is a key component in the control plane. Every interaction, whether deploying applications, scaling workloads, or monitoring system health, depends on the API server. Consider the human body: We have the brain as the critical organ, and the nerves function as the control system. The Kubernetes API server is like the nerve center of cluster management.

How to perform a ping check with Grafana Cloud Synthetic Monitoring

Feb 27, 2025 By Bukola Ayodele In Grafana

Synthetic monitoring is a critical practice to proactively track the health and performance of web applications. By simulating user interactions, this approach helps developers identify issues before they impact real users. One of the simplest forms of synthetic monitoring is known as a ping check, which verifies whether an endpoint is reachable. In this blog post, we’ll take a closer look at what a ping check is, and then walk through how to perform one using Grafana Cloud Synthetic Monitoring.

Getting started with Azure cost dashboards

Feb 27, 2025 By Sameer Mhaisekar In Squared Up

As an Azure admin, it is of critical importance that you keep an eye on how much cost you are incurring running your workloads in the cloud. You also want to have sight of any deployed resources that are not contributing to business and accumulating cost over time. Using a dedicated Azure plugin, SquaredUp dashboards will help you understand your Azure costs across services, resources, locations and apps – so you can keep tabs on how much you're spending and identify opportunities to save costs.

How to Monitor Azure Cloud Services with Grafana Cloud | Demo | Observability | Grafana Labs

Feb 27, 2025 By Grafana In Grafana

Microsoft Azure Cloud monitoring has never been more streamlined! In this video, Vasil Kaftandzhiev, Product Manager for Cloud Provider Observability in Grafana Cloud, walks you through how easy it is to monitor Azure Cloud Services with Grafana. With out-of-the-box dashboards, you can instantly visualize key metrics for essential Azure services like API Gateway, Queue Storage, Virtual Machines, Log Storage, Events Hub, Network Load Balancers, and SQL.

View Video

Our New CLI: How and Why We Made It

Feb 27, 2025 By Lauren Barnes In MetricFire

We are happy to announce our latest project at MetricFire: a brand-new CLI tool! Get ready to start monitoring your systems in one step - no need to modify any configuration files manually. Just run a terminal command, follow the prompts, and forward your system metrics to Hosted Graphite in minutes. In this article, we’ll share an overview of the Hosted Graphite CLI, why we’re making it, and how we’re making it.

Improve gaming app performance with Unity support in Datadog RUM

Feb 27, 2025 By Jessica Manheimer In Datadog

As mobile gaming evolves, players have higher expectations for seamless experiences, real-time interactions, and cross-platform accessibility. Whether you’re developing games for iOS, Android, or another mobile operating system, maintaining and optimizing the performance of your game is critical for player retention. For instance, if a mobile game becomes laggy or begins to drop frames during gameplay, players will grow frustrated and abandon the game altogether.

The One Where We Streamline Agent Management

Feb 27, 2025 By Cribl In Cribl

Join us LIVE on YouTube, X, and LinkedIn as we explore the latest features and hidden gems of the Cribl Suite! In this episode, we're going to show you how to streamline agent management with Cribl Edge and Cribl Copilot!

View Video

TCP Checks Now Available in Checkly

Feb 27, 2025 By Sara Miteva In Checkly

Checkly has always helped you monitor your APIs and web services, ensuring they stay fast, reliable, and available. But application reliability doesn’t stop there—databases, message queues, and mail servers all play a crucial role in your infrastructure. To provide full application reliability, we’re expanding into network monitoring with TCP checks. Now, you can monitor critical non-HTTP services directly in Checkly—without adding extra tools to your stack.
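
To make the idea of a TCP check concrete, here is a minimal sketch of what such a check does in general: attempt a TCP connection to a host and port within a timeout, and report whether it succeeded and how long it took. This is an illustrative assumption, not Checkly's implementation or API; the function name `tcp_check` and its parameters are hypothetical.

```python
import socket
import time
from typing import Optional, Tuple


def tcp_check(host: str, port: int, timeout: float = 3.0) -> Tuple[bool, Optional[float]]:
    """Try to open a TCP connection; return (reachable, connect time in ms)."""
    start = time.perf_counter()
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            return True, elapsed_ms
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False, None
```

A check against a database or mail server would call something like `tcp_check("db.example.internal", 5432)`, where the hostname and port are placeholders. A successful handshake only shows that the port accepts connections; it says nothing about whether the service behind it is healthy.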

Why and How You Should Use Your Learning & Visiting Budget

Feb 27, 2025 By Lauren Singh In Checkly

When I joined Checkly as Junior People Operations Manager, one of the benefits that immediately stood out to me was the Learning & Visiting budget. I found myself wondering—how is this budget actually being used across the company? At the start of the year, many of our team members plan how they’ll use their learning budget—whether to enhance professional skills or pursue self-driven projects. With flexible guidelines, we encourage them to invest in what matters most.

Integrating Google SecOps with Bindplane February 2025

Feb 27, 2025 By ObservIQ In ObservIQ

Google SecOps (formerly Chronicle) is Google Cloud’s security operations platform (SIEM) that helps you detect, investigate, and respond to cybersecurity threats. Integrating Bindplane gives you an easy way to standardize how you collect, process, and forward security-relevant data to Google SecOps. In this live workshop you’ll get a hands-on demo of how to configure log collection with the Bindplane Distro for OpenTelemetry Collector, and best practices for data standardization using open standards and OpenTelemetry.

View Video

Optimized IBM Power Systems Monitoring

Feb 26, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Monitoring IBM Power Systems requires a robust, efficient, and proactive approach. With the release of NiCE HMC VIOS Management Pack v1.1, IT teams gain access to advanced monitoring capabilities designed to improve visibility, optimize performance, and ensure seamless operations within Microsoft SCOM.

Agentless monitoring for cloud VMs: Simplify scaling and observability

Feb 26, 2025 By LogicMonitor In LogicMonitor

Managing cloud infrastructure is challenging enough without adding the burden of deploying and maintaining monitoring agents. What if there was a simpler, more efficient way to monitor your virtual machines (VMs)? In the first part of this series, we looked at the (link) and presented a better solution: agentless monitoring. Agentless monitoring is an efficient approach to observability that eliminates the need to install and manage software agents on each monitored device.

NIS2 Directive and Cybersecurity: Requirements, Risk Management, and Monitoring

Feb 26, 2025 By Isaac García In Pandora FMS

The days when an antivirus and common sense were enough to guarantee an organization’s cybersecurity are long gone. Especially if you work in a critical sector. That’s why the NIS2 Directive (2022/2555) of the European Union establishes cybersecurity obligations for these key activities… and the consequences of non-compliance. These consequences are significant, so let’s analyze the regulation, when it applies, and how to implement it.

Why Super Bowl 2025 was a triumph for Internet Resilience

Feb 26, 2025 By Catchpoint Team In Catchpoint

When you’re spending close to $8 million for a 30-second Super Bowl ad, the one thing you don’t want to leave to chance is your website—especially when millions of viewers, whether they came for the game, Kendrick Lamar, or to catch a glimpse of Taylor Swift in the stands, might head there right after the spot airs. Make no mistake: web performance is just as critical as the ad itself.

SCOM 2025 upgrade: In-place upgrade or side-by-side installation

Feb 26, 2025 By Jonas Lenntun In OpsLogix

SCOM 2025 was released last year, and now is the time to start planning your upgrade. But where do you begin? Upgrading can be a complicated process, and it is important to consider the different options to make the process as smooth as possible. When upgrading, you can choose between an in-place upgrade or a side-by-side installation, and each approach leads to different outcomes. The right path for you depends on several factors.

OpenTelemetry Metrics Explained: A Guide for Engineers

Feb 26, 2025 By Rox Williams In Honeycomb

OpenTelemetry (often abbreviated as OTel) is the gold standard observability framework, allowing users to collect, process, and export telemetry data from their systems. OpenTelemetry’s framework is organized into distinct signals, each offering an aspect of observability. Among these signals, OpenTelemetry metrics are crucial in helping engineers understand their systems.

Understanding OpenTelemetry: A Practical Guide

Feb 26, 2025 By Sematext In Sematext

Observability is essential for understanding how modern applications perform and behave in production. OpenTelemetry has emerged as the industry standard for collecting, processing, and exporting telemetry data—traces, metrics, and logs—without vendor lock-in. This guide will walk you through OpenTelemetry’s core components, how it works, and why it’s a game-changer for observability.

Sentry AI - Autofix

Feb 26, 2025 By Sentry In Sentry

Autofix from Sentry takes all the context Sentry captures around errors and crashes in your code, and uses AI to determine root causes, propose solutions, and even open a PR to fix it. Autofix lets you add context as needed to help influence the solution. Code breaks, it happens. Autofix helps you fix it and get back up and running.

View Video

The challenges of agent-based monitoring for cloud virtual machines and how to overcome them

Feb 26, 2025 By LogicMonitor In LogicMonitor

Imagine discovering that 40% of your cloud infrastructure went unmonitored for a week because monitoring agents failed to deploy during an auto-scaling event. This scenario isn’t just hypothetical—it’s a growing reality for organizations relying on traditional agent-based monitoring in dynamic cloud environments.

Getting Started with OpenTelemetry for Browser Monitoring

Feb 26, 2025 By Preeti Dewani In Last9

OpenTelemetry is the go-to open-source standard for observability, but when it comes to tracking frontend performance and user interactions, things get a little tricky. Unlike backend services, browsers introduce challenges like CORS restrictions, asynchronous execution, and limited access to certain telemetry data. This guide covers everything you need to know about using OpenTelemetry in the browser, from setup to best practices, advanced configurations, and real-world debugging techniques.

How to avoid blowing the budget on Azure AI

Feb 26, 2025 By Adam Kinniburgh In Squared Up

So you had a great day playing with really awesome new tech, solving big business challenges, and feeling like you really nailed it. Then you wake up the next day to an alert from Azure telling you you've blown your monthly budget and it's only the first week of the month. We've all been there... right? Using any cloud service comes with a cost, but for most services the budget risk is low. Cost calculated daily isn't a problem when usage is predictable, but not everything works like that.

Graylog Data Routing: Index Set vs Data Warehouse

Feb 26, 2025 By Graylog In Graylog

Follow Seth Goldhammer, Graylog's VP of Product, as he shows you some of the different ways you can route your data in Graylog and the benefits of each.

View Video

Search and analyze unsampled logs in real time with Live Tail

Feb 26, 2025 By Candace Shamieh In Datadog

With thousands of logs generated every minute from your infrastructure, applications, services, and devices, retaining all of this data for active search and analysis can be cost-prohibitive. Because log volumes continue to grow rapidly as operations scale, it’s common for organizations to implement log management strategies and limit the amount that they store in order to minimize costs.

Integration roundup: Monitoring your modern data platforms

Feb 26, 2025 By Curtis Maher In Datadog

Modern applications increasingly rely on specialized databases and platforms to power real-time analytics and support advanced AI/ML capabilities. These tools help teams accelerate development by consolidating workflows and processes, enabling faster and more efficient data operations. That’s why Datadog has launched three new data platform integrations with Supabase, DuckDB, and Milvus.

Networks are everyone's business - TCP Checks for app developers

Feb 26, 2025 By Nočnica Mellifera In Checkly

Checkly is the industry’s best tool to monitor your production applications. With the power of Playwright, developers can test the systems they’ve developed, and roll out those tests as production monitors running from multiple geographies on the Checkly system. Checkly also monitors thousands of API endpoints with complex validation, setup and cleanup scripts, and reliable alerting. So why are we expanding into TCP-based checks?

Read Post

Checkly

Read more about Networks are everyone's business - TCP Checks for app developers

Tracking Complex User Transactions to Ensure E-Commerce Success

Feb 26, 2025 By Richa Gupta In WebSitePulse

Did you know that, on average, 70% of online shoppers abandon their carts? Yep, seven out of ten people leave without making a purchase. That's according to the Baymard Institute, which analyzed data from 49 different sources. Crazy, right?

Read Post

WebSitePulse

Read more about Tracking Complex User Transactions to Ensure E-Commerce Success

From basics to benefits: A beginner's guide to cloud computing

Feb 26, 2025 By Arun Madhavan In Site24x7

Cloud computing powers everything from startups to global enterprises. With it, a new business can scale quickly without investing in expensive servers, while large organizations can store vast amounts of data and run applications seamlessly across the world. Simply put, cloud computing delivers computing resources over the internet that are scalable, cost-effective, and accessible—anytime, anywhere. Let’s break down the fundamentals of cloud computing and why it matters.

Read Post

Site24x7

Read more about From basics to benefits: A beginner's guide to cloud computing

Mastering Docker for seamless application deployment

Feb 26, 2025 By Arun Madhavan In Site24x7

Imagine you're developing an application on your laptop. It runs perfectly, but when you deploy it on a server, things break—dependency mismatches, configuration issues, and endless debugging. Docker eliminates these problems by packaging applications and their dependencies into portable, lightweight containers. This ensures that applications run consistently across different environments, whether it's a developer’s laptop, a testing server, or a cloud platform.

Read Post

Site24x7

Read more about Mastering Docker for seamless application deployment

How Obkio's NPO Plan Supports Organizations Making a Global Impact With Affordable Network Monitoring

Feb 26, 2025 By Alyssa Lamberti In Obkio

At Obkio, we believe in using our resources to give back to organizations that make the world a better place. That’s why we launched our NPO Plan—a program designed to help non-profits access advanced Network Performance Monitoring at a significantly reduced cost. By offering our services to non-profits at a fraction of the price, we help them to focus on what matters most—supporting their missions, rather than worrying about IT costs.

Read Post

Obkio

Read more about How Obkio's NPO Plan Supports Organizations Making a Global Impact With Affordable Network Monitoring

How to Monitor Aerospike With OpenTelemetry and MetricFire

Feb 26, 2025 By Benjamin Pitts In MetricFire

Aerospike is a high-performance, real-time NoSQL database built for speed, scale, and low-latency transactions—think millions of reads/writes per second without breaking a sweat. When you're dealing with high-throughput applications, keeping an eye on Aerospike’s performance isn't just a good idea—it's mission-critical to avoid bottlenecks, connection issues, or unexpected slowdowns.

Read Post

MetricFire

Read more about How to Monitor Aerospike With OpenTelemetry and MetricFire

VictoriaLogs Status Update: Heading Towards the Cluster Version

Feb 26, 2025 By Aliaksandr Valialkin In VictoriaMetrics

Today, we’re thrilled to share the latest updates on VictoriaLogs, your trusted open-source solution for efficient and user-friendly log management. Whether you’re just discovering VictoriaLogs or have been using it for a while, this post will walk you through the recent enhancements and give you a sneak peek at the much anticipated cluster version that’s on the horizon.

Read Post

VictoriaMetrics

Read more about VictoriaLogs Status Update: Heading Towards the Cluster Version

Monitor Microsoft Azure in Grafana Cloud: simplify and centralize your cloud provider observability

Feb 26, 2025 By Vasil Kaftandzhiev In Grafana

Organizations around the world use Microsoft Azure to power their businesses. The cloud computing platform includes hundreds of products and services organizations can use to build and manage applications, but monitoring those environments can often feel like navigating a maze of fragmented data, tools, and processes.

Read Post

Grafana

Read more about Monitor Microsoft Azure in Grafana Cloud: simplify and centralize your cloud provider observability

Enhancing Jenkins performance: Resource optimization for high-traffic workloads

Feb 26, 2025 By Sinjan Ballav In Site24x7

Jenkins is the backbone of many CI/CD pipelines, automating builds, tests, and deployments at scale. However, when handling high-traffic workloads, such as during peak development hours, large-scale deployments, or parallel builds and pipelines, Jenkins can quickly become a resource hog, leading to slow builds, queue backlogs, and even system crashes. Optimizing resource usage is essential to ensure smooth, efficient, and scalable performance.

Read Post

Site24x7

Read more about Enhancing Jenkins performance: Resource optimization for high-traffic workloads

AI Governance in 2025: A Full Perspective on Governance in Artificial Intelligence

Feb 26, 2025 By Austin Chia In Splunk

In a world where artificial intelligence (AI) is leaping forward — growing at a CAGR of almost 36% from 2024 to 2030 — questions about governance and ethics with the use of AI are surfacing. As humans continue to develop AI systems, it is crucial to establish proper guidelines to ensure powerful technologies like generative AI and adaptive AI are used in a responsible manner.

Read Post

Splunk

Read more about AI Governance in 2025: A Full Perspective on Governance in Artificial Intelligence

Release webinar: Dashboard Server 6.5

Feb 26, 2025 By SquaredUp In Squared Up

What's available in DS 6.5: new customizable branding (choose your own home navigation logo, login dialog logo, and login background image); improvements to the Grid Tile; dashboard loading skeletons and related improvements; and customer subdomain support for Cloud Tile integration.

View Video

Squared Up

Read more about Release webinar: Dashboard Server 6.5

Lakehouse Demo

Feb 26, 2025 By Cribl In Cribl

Cribl Lakehouse is the first lakehouse built for the unpredictable nature of telemetry data. Unlike traditional solutions for structured data, it eliminates schema complexity and manual transformation while delivering elastic scalability, automated, cost-optimized tiered storage, and federated queries across diverse datasets. IT and security teams can effortlessly store and analyze massive volumes of evolving telemetry data in real time, without data engineering expertise, unlocking the full value of their data with a unified management experience.

View Video

Cribl

Read more about Lakehouse Demo

What is Hosted OpenSearch? A Complete Guide for Businesses

Feb 26, 2025 By Lee Smith In Logit.io

As data continues to grow exponentially, businesses need powerful tools to search, analyze, and visualize their data efficiently. OpenSearch has emerged as a top choice for organizations seeking an open-source, scalable search and analytics engine. However, managing OpenSearch in-house can be complex, costly, and resource-intensive. That’s where hosted OpenSearch comes in.

Read Post

Logit.io

Read more about What is Hosted OpenSearch? A Complete Guide for Businesses

Shorten your MTTR with Checkly Traces

Feb 26, 2025 By Nočnica Mellifera In Checkly

We all know that Checkly is a ‘secret weapon’ for engineering teams who want to shorten their mean time to detection (MTTD). With Checkly, you can know within minutes if your service is unavailable for users, or acting unexpectedly. In this article we’ll talk about how Checkly traces can help you expand on the benefits of Checkly, adding insights that will help you diagnose root causes, and further reduce your mean time to resolution (MTTR) for outages and other incidents.

Read Post

Checkly

Read more about Shorten your MTTR with Checkly Traces

Key metrics to monitor for optimal SQL Server performance

Feb 25, 2025 By Applications Manager In ManageEngine

Microsoft SQL Server is a critical database component of many business applications, ensuring data integrity, fast query performance, and seamless transactions. However, maintaining peak performance requires proactive monitoring of essential metrics. In this blog, we’ll explore the key SQL Server performance metrics you should track and how they help prevent performance issues, optimize resource usage, and enhance database efficiency.

Read Post

ManageEngine

Read more about Key metrics to monitor for optimal SQL Server performance

Optimizing AWS NAT Gateway Usage

Feb 25, 2025 By Phil Gervasi In Kentik

AWS NAT Gateways are essential for private subnet access but can quickly become a costly burden, even when idle. With Kentik, cloud and network engineers gain deep visibility into NAT Gateway traffic, allowing them to identify underutilized gateways, analyze high-cost usage, and explore cost-saving alternatives like VPC Endpoints, Internet Gateways, or direct peering.

Read Post

Kentik

Read more about Optimizing AWS NAT Gateway Usage

Why Context Matters: Mastering Serverless App Monitoring

Feb 25, 2025 By Datadog In Datadog

Hi there, and welcome to the second video in this series on observing AWS serverless applications with Datadog. In this video, you’ll learn how important it is to add custom business context to the telemetry you send to Datadog and how you can use that inside APM to quickly diagnose and debug issues. You’ll walk away with an understanding of the importance of distributed tracing, as well as how you can add specific business context to the telemetry you send.

View Video

Datadog

Read more about Why Context Matters: Mastering Serverless App Monitoring

Managing Multiple Service Instances with a Systemd Generator

Feb 25, 2025 By Johannes Rauh In Icinga

When working with systemd services in Linux, you might encounter situations where multiple instances of a service need to be managed dynamically. When I had to develop a solution to monitor multiple Kubernetes clusters with Icinga for Kubernetes, I ran into exactly this challenge.

Read Post

Icinga

Read more about Managing Multiple Service Instances with a Systemd Generator

Elasticsearch Reindex API: A Guide to Data Management

Feb 25, 2025 By Prathamesh Sonpatki In Last9

If you've been working with Elasticsearch for a while, you’ll eventually run into a situation where you need to reindex your data. Maybe you’re changing mappings, upgrading versions, or restructuring your documents. That’s where the Elasticsearch Reindex API comes in. In this guide, we'll walk through everything you need to know about the Reindex API—what it is, how it works, common use cases, performance optimizations, and potential pitfalls. Let’s dive in.

Read Post

Last9

Read more about Elasticsearch Reindex API: A Guide to Data Management

Netdata vs. Prometheus: Which Monitoring Tool is Right for You? #monitoring #realtime

Feb 25, 2025 By netdata In netdata

Netdata's founder Costa Tsaousis built Netdata with performance and efficiency in mind. The result? 8x less RAM usage, 30x less disk I/O, 40x more data retention, 40x more data stored, and up to 22x faster queries—all thanks to our innovative tiered storage system, enabling ultra-efficient long-term queries.

View Video

netdata

Read more about Netdata vs. Prometheus: Which Monitoring Tool is Right for You? #monitoring #realtime

7 Projects Building on DataFusion

Feb 25, 2025 By Charles Mahler In InfluxData

2024 was a huge year for Apache DataFusion, and 2025 is looking to be even bigger. The project officially became a top-level Apache Software Foundation project, achieved best-in-class performance for querying Parquet files in the ClickBench benchmark, and had a research paper accepted for SIGMOD.

Read Post

InfluxData

Read more about 7 Projects Building on DataFusion

GTMetrix Alternatives: The Best Tools for Website Performance Testing

Feb 25, 2025 By Todd H. Gardner In Request Metrics

GTMetrix used to be the go-to tool for checking website speed, but let’s be honest—paying for one-off synthetic tests isn’t worth it. If you’re still relying on synthetic testing alone, you’re missing a big part of the web performance picture. If you care about Core Web Vitals, SEO performance, and user experience, you need more than just lab data. The good news? There are better (and free) alternatives like PageSpeed Insights and WebPageTest for synthetic testing.

Read Post

Request Metrics

Read more about GTMetrix Alternatives: The Best Tools for Website Performance Testing

State of DevOps: 2024 DORA Report Insights with Google

Feb 25, 2025 By Catchpoint In Catchpoint

Enjoy this exclusive webinar with Ben Good from Google as we explore the findings in the 2024 State of DevOps report. For over a decade, the DORA report has provided critical insights into the capabilities and practices that fuel high-performing technology organizations. This report highlights the significant impact of AI on software development, explores platform engineering’s promises and challenges, and emphasizes user-centricity and stable priorities for organizational success.

View Video

Catchpoint

Read more about State of DevOps: 2024 DORA Report Insights with Google

Pino Logger: The Fastest and Efficient Node.js Logging Library

Feb 25, 2025 By Ujjwal Goyal In Last9

Logging is an integral part of any production-ready Node.js application. Whether you're debugging issues, monitoring application performance, or setting up a centralized logging system, an efficient logger is crucial. Pino is one of the best choices available due to its speed, low overhead, and powerful features. This guide goes beyond the basics, providing an in-depth exploration of how to optimize Pino for your applications, use advanced features, and integrate it seamlessly with other tools.

Read Post

Last9

Read more about Pino Logger: The Fastest and Efficient Node.js Logging Library

Fine-tune notifications with Alert sensitivity

Feb 25, 2025 By Colin Bartlett In StatusGator

We’re excited to introduce a new feature that gives you greater control over how and when you receive alerts from your website and ping monitors. With Alert sensitivity, you can now specify the number of retries before an alert is triggered, reducing false alarms and ensuring more reliable notifications.
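As a rough illustration of the idea (not StatusGator's implementation), retry-based alert sensitivity can be sketched as a counter of consecutive failed checks. The function name `should_alert` and the reading of "retries" as "failed re-checks allowed after the first failure" are assumptions made for this example only.

```python
from typing import Iterable


def should_alert(check_results: Iterable[bool], retries: int) -> bool:
    """Return True once a failure has persisted through `retries` re-checks.

    Hypothetical semantics: the first failed check plus `retries` further
    consecutive failures are needed before an alert fires. Any successful
    check in between resets the count.
    """
    consecutive_failures = 0
    for ok in check_results:
        if ok:
            # A passing check clears the failure streak.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures > retries:
                return True
    return False


# A single blip followed by a recovery does not alert when one retry is allowed.
print(should_alert([False, True, False], retries=1))
# Two failures in a row exhaust the single retry and trigger the alert.
print(should_alert([False, False], retries=1))
```

A higher `retries` value trades slower detection for fewer false alarms, which is the trade-off the feature description points at.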

Read Post

StatusGator

Read more about Fine-tune notifications with Alert sensitivity

Boosting IT Efficiency: How to Do More With Less

Feb 25, 2025 By WhatsUp Gold In WhatsUp Gold

IT teams are constantly asked to do more with limited resources and budgets. Is your IT team’s monitoring strategy keeping up? Thankfully, these challenges aren’t impossible to overcome. Check out this exclusive webinar where Greg Collins, Product Marketing Manager at Progress, and Jason Alberino, Principal Product Manager at Progress, will share tips on accomplishing your IT goals with less.

View Video

WhatsUp Gold

Read more about Boosting IT Efficiency: How to Do More With Less

Data sources, visualizations, and apps: A guide to extending and customizing Grafana

Feb 25, 2025 By Usman Ahmad In Grafana

Grafana’s extensibility has always been one of the keys to its success. It comes with a wide range of data sources that allow you to query your data no matter where it lives, visualizations to help you quickly make sense of that data, and apps that can provide complete observability solutions, all in a single package.

Read Post

Grafana

Read more about Data sources, visualizations, and apps: A guide to extending and customizing Grafana

How to Implement OpenTelemetry in NestJS

Feb 25, 2025 By Aditya Godbole In Last9

Modern applications are becoming increasingly complex, and debugging distributed systems can feel like searching for a needle in a haystack. This is where OpenTelemetry (OTel) comes in. If you're using NestJS, integrating OpenTelemetry can provide deep insights into your application's behavior, helping you track performance, troubleshoot issues, and understand service interactions.

Read Post

Last9

Read more about How to Implement OpenTelemetry in NestJS

Empowering DevOps Teams: Overcoming IT Complexity with Advanced AI + Automation

Feb 25, 2025 By ScienceLogic In ScienceLogic

As IT environments become more complex, larger, and inundated with data, DevOps teams encounter significant obstacles that make efficient operations more challenging. The heightened complexity can create difficulties in maintaining visibility and control across hybrid IT ecosystems. Additionally, the substantial volume of data generated can overwhelm resource-constrained DevOps teams, making it difficult to extract valuable insights and make informed decisions.

Read Post

ScienceLogic

Read more about Empowering DevOps Teams: Overcoming IT Complexity with Advanced AI + Automation

Introducing the new Google Cloud Trace Explorer

Feb 25, 2025 By Sujay Solomon In Google Operations

New UI features in Cloud Trace, part of Google Cloud Observability, make it easier to troubleshoot latency and errors in your applications.

Read Post

Google Operations

Read more about Introducing the new Google Cloud Trace Explorer

Microsoft Entra ID Outage: How Vantage DX Detected the Issue Before Microsoft Acknowledges the Issue

Feb 25, 2025 By Sara Purdon In Martello Technologies

On February 25, 2025, at 11:32 AM EST, Martello’s Vantage DX monitoring began alerting on an issue affecting Microsoft Entra ID (Azure AD SSO). While Microsoft had not yet acknowledged the incident, Reddit forums had noted the issue, and our Vantage DX proactive monitoring detected disruptions impacting authentication across multiple workloads, including Exchange, OneDrive, and SharePoint.

Read Post

Martello Technologies

Read more about Microsoft Entra ID Outage: How Vantage DX Detected the Issue Before Microsoft Acknowledges the Issue

Easy, comprehensive Logstash monitoring with Elastic Agent

Feb 25, 2025 By Trevor Blackford In Elastic

Logstash is a powerful tool for ingesting, transforming, and shipping data from various sources. Visibility into Logstash is critical for optimizing performance and troubleshooting issues related to data ingestion. We’ve greatly improved the Logstash integration to display the status of your Logstash nodes and pipelines at a glance. The integration is now powered by Elastic Agent, which queries Logstash monitoring APIs for data that populates managed dashboards.

Read Post

Elastic

Read more about Easy, comprehensive Logstash monitoring with Elastic Agent

Graylog Parsing Rules and AI Oh My!

Feb 25, 2025 By The Graylog Team In Graylog

In the log aggregation game, the biggest difficulty you face can be setting up parsing rules for your logs. To qualify this statement: simply getting log files into Graylog is easy. Graylog also has out-of-the-box parsing of a wide variety of common log sources, so if your logs fall into one of the many categories of log for which there is either a dedicated Input; a dedicated Illuminate component; or that uses a defined Syslog format; then yes, parsing logs is also easy.

Read Post

Graylog

Read more about Graylog Parsing Rules and AI Oh My!

Using Amazon RDS for high availability: How monitoring ensures reliable failover

Feb 25, 2025 By Sinjan Ballav In Site24x7

Database downtime can lead to significant disruptions, revenue loss, and frustrated users. Amazon Relational Database Service (RDS) provides a managed database solution with high availability and automated failover to minimize such risks. However, continuous monitoring is crucial to ensuring reliable failover and minimizing downtime by detecting potential issues before they impact operations.

Read Post

Site24x7

Read more about Using Amazon RDS for high availability: How monitoring ensures reliable failover

What are Kubernetes audit logs and how to monitor them?

Feb 25, 2025 By Mahalashmi Narayanan In Site24x7

Security and compliance: Many industries, especially those governed by regulations like HIPAA, the PCI DSS, or the GDPR, require detailed logs for compliance and to trace security incidents. Troubleshooting and forensic analysis: If something goes wrong—whether due to accidental configuration changes or malicious activity—having detailed logs helps diagnose the root cause and quickly remediate it.

Read Post

Site24x7

Read more about What are Kubernetes audit logs and how to monitor them?

What is Entra ID? .... and how Entra ID has evolved since the Azure AD rebranding

Feb 25, 2025 By Priya Balasubramaniam In eG Innovations

Entra ID is the new name for Azure Active Directory (Azure AD), Microsoft’s cloud-based identity and access management service. This rebranding, announced in July 2023, is part of Microsoft’s broader Entra product family, which focuses on securing access to digital resources and managing identities in a comprehensive way.

Read Post

eG Innovations

Read more about What is Entra ID? .... and how Entra ID has evolved since the Azure AD rebranding

Challenges in Monitoring Applications That Use OAuth

Feb 25, 2025 By Dotcom-Monitor In Dotcom-Monitor

OAuth (Open Authorization) has become a critical component in enabling secure third-party access to APIs, which makes it one of the most widely adopted authentication protocols for modern applications. From allowing users to sign into apps using their Google or Facebook accounts to enabling third-party service integrations, OAuth simplifies the process of granting access to resources without compromising security.

Read Post

Dotcom-Monitor

Read more about Challenges in Monitoring Applications That Use OAuth

The Benefits of Investing in a High-Quality Battery Box

Feb 25, 2025 By OpsMatters In OpsMatters

Having a dependable power source is crucial for various applications, from outdoor adventures and marine travel to off-grid living and emergency preparedness. While batteries provide the necessary energy, ensuring their protection and longevity requires a secure and efficient storage solution. A high-quality battery box not only safeguards the battery from environmental damage but also enhances safety, portability, and ease of use.

Read Post

OpsMatters

Read more about The Benefits of Investing in a High-Quality Battery Box

Deploying Prometheus with Docker Compose: A Step-by-Step Guide

Feb 24, 2025 By Prathamesh Sonpatki In Last9

Prometheus is one of the most popular open-source monitoring and alerting tools. Setting up Prometheus with Docker Compose can make your monitoring stack easier to deploy and manage if you're running containerized applications. This guide will walk you through everything you need to get Prometheus up and running with Docker Compose, from installation to configuration and setting up basic alerts.

Read Post

Last9

Read more about Deploying Prometheus with Docker Compose: A Step-by-Step Guide

Fix slow mobile apps before your users uninstall with Mobile Vitals

Feb 24, 2025 By Will McMullen In Sentry

Mobile devs know the struggle. Small regressions can cause big issues in production, and fixing them isn't as easy as pushing a quick patch. Unlike a web app, shipping fixes for apps means navigating app store approvals, and often hopping on meetings with customers to debug because mobile issues can be so challenging to recreate. Catching these issues before the 1-star reviews roll in is crucial. Luckily, Sentry just made it easier than ever.

Read Post

Sentry

Read more about Fix slow mobile apps before your users uninstall with Mobile Vitals

Optimizing Observability Data Volume and Cost with AI

Feb 24, 2025 By Logz.io In logz.io

Struggling with high observability costs? In this video, Jade Lassery breaks down the challenges of managing excessive data and skyrocketing expenses. She introduces the Logz.io AI agent, a powerful solution designed to optimize data usage, reduce unnecessary costs, and improve efficiency. Learn how to take control of your observability spending while maintaining high performance. Watch now to discover smarter data management strategies!

View Video

logz.io

Read more about Optimizing Observability Data Volume and Cost with AI

Why businesses lose trust after acquisition & how to choose wisely

Feb 24, 2025 By ManageEngine Site24x7 In Site24x7

Acquisitions are a double-edged sword. While they might seem like a sign of growth, they often leave customers dealing with slower updates, higher prices, and even privacy risks. If you’ve ever felt let down after your favorite tool was acquired, this video is for you. We’ll explore why businesses lose trust after acquisitions and share practical tips on how to choose tools that won’t leave you stranded. Plus, discover why ManageEngine is a standout choice for businesses looking for stability, innovation, and a customer-first approach.

View Video

Site24x7

Read more about Why businesses lose trust after acquisition & how to choose wisely

Understanding Reverse DNS Lookup

Feb 24, 2025 By Jeff Darrington In Graylog

On the information superhighway, an IP address is a series of numbers telling the location of a digital resource, similar to having a street address for a building. However, when all you know is the street address, you have no idea what the building itself looks like. If you’re a visual person, you might insert that address into Google Maps to pull up a picture of the building so you have a marker to help find a drive.

Read Post

Graylog

Read more about Understanding Reverse DNS Lookup

Fearless innovation is the true force behind IT project transformation

Feb 24, 2025 By Ugo Orsi In Digitate

Previously, I discussed the challenges of adopting AI in enterprises, focusing on middle managers’ concerns about its impact on their roles. In case you missed it, you can read it here: AI resistance isn’t where you expect it. In this post, I’ll highlight the crucial steps for ensuring successful AI adoption. All business transformations are complex by nature because they change the organizational balance – that is, the equilibrium of power held among different leaders.

Read Post

Digitate

Read more about Fearless innovation is the true force behind IT project transformation

How to Build Observability into Chaos Engineering

Feb 24, 2025 By Ujjwal Goyal In Last9

If you've ever deployed a distributed system at scale, you know things break—often in ways you never expected. That’s where Chaos Engineering comes in. But running chaos experiments without robust observability is like debugging blindfolded. This guide will walk you through how observability empowers Chaos Engineering, ensuring that your experiments yield meaningful insights instead of just causing chaos for chaos’ sake.

Read Post

Last9

Read more about How to Build Observability into Chaos Engineering

How to Implement OpenTelemetry in Next.js

Feb 24, 2025 By Preeti Dewani In Last9

OpenTelemetry is an open-source observability framework designed to instrument, generate, collect, and export telemetry data, including traces, metrics, and logs. It is vendor-agnostic, allowing developers to send data to multiple backend services like Last9, Prometheus, Datadog, or Jaeger without vendor lock-in. For Next.js applications, OpenTelemetry is particularly useful due to the framework’s hybrid rendering approach.

Read Post

Last9

Read more about How to Implement OpenTelemetry in Next.js

OpenTelemetry Is Not "Three Pillars"

Feb 24, 2025 By Austin Parker In Honeycomb

OpenTelemetry is a big, big project. It’s so big, in fact, that it can be hard to know what part you’re talking about when you’re talking about it! One particular critique I’ve seen going around recently, though, is about how OpenTelemetry is just ‘three pillars’ all over again. Reader, this could not be further from the truth, and I want to spend some time on why.

Read Post

Honeycomb

Read more about OpenTelemetry Is Not "Three Pillars"

Spoiler Alert: How "Zero Day" Might Have Played Out Differently with Teneo and Palo Alto Cortex XDR

Feb 24, 2025 By Teneo In Teneo

This weekend, I binge-watched Netflix’s new series Zero Day, starring Robert De Niro. The series has sparked excitement and curiosity among cybersecurity enthusiasts and political thriller fans alike. As the title suggests, the show revolves around a cyberattack that exploits unknown vulnerabilities—so-called “zero days”—to wreak havoc on critical systems. But what if the organizations targeted in Zero Day had the right cybersecurity strategy in place?

Read Post

Teneo

Read more about Spoiler Alert: How "Zero Day" Might Have Played Out Differently with Teneo and Palo Alto Cortex XDR

Maximizing Azure Network Insights with VNet Flow Logs

Feb 24, 2025 By Kentik In Kentik

Join Kentik’s Phil Gervasi and Chris O’Brien in this LinkedIn Live replay as they discuss how VNet flow logs in Microsoft Azure boost network observability far beyond what’s possible with NSG flow logs. Learn how easier deployment, comprehensive visibility, and advanced analytics—integrated with AI-driven query capabilities—can help optimize your Azure (and multi-cloud) environment.

View Video

Kentik

Read more about Maximizing Azure Network Insights with VNet Flow Logs

How to Monitor Snowflake with OpenTelemetry

Feb 24, 2025 By Benjamin Pitts In MetricFire

Snowflake is a powerful, cloud-based data platform designed for high-performance analytics. Whether you're running massive analytical queries, managing structured and semi-structured data, or optimizing data pipelines, visibility into your Snowflake instance is essential. Performance bottlenecks, query execution delays, and unexpected cost spikes can quickly become issues without proper monitoring.

Read Post

MetricFire

Read more about How to Monitor Snowflake with OpenTelemetry

Slack solutions and tools for errors

Feb 24, 2025 By Rollbar In Rollbar

This video shows you how to use Slack with Rollbar to get alerted to issues quickly and respond to them inside Slack.

View Video

Rollbar

Read more about Slack solutions and tools for errors

Instrument Google Cloud Run applications with the new Datadog Agent sidecar

Feb 24, 2025 By Jordan Obey In Datadog

Google Cloud Run is a fully managed service that allows you to deploy, manage, and scale workloads on serverless containers. Because Cloud Run abstracts away infrastructure management and runs on complex, distributed backends, it can be difficult to troubleshoot. Datadog’s integrations with Google Cloud and Google Cloud Run address that challenge by collecting and visualizing key metrics and logs.

Read Post

Datadog

Read more about Instrument Google Cloud Run applications with the new Datadog Agent sidecar

Grafana Loki 101: How to ingest logs with Alloy or the OpenTelemetry Collector

Feb 24, 2025 By Grafana Labs Team In Grafana

Logs play a critical role in observability, but they do come with their own challenges. Grafana Loki, our horizontally scalable, highly available, multi-tenant log aggregation system, addresses these challenges head on, giving you an open source tool that’s both cost effective and easy to operate.

Read Post

Grafana

Read more about Grafana Loki 101: How to ingest logs with Alloy or the OpenTelemetry Collector

Managing and resolving incidents effectively #shorts #datadog

Feb 24, 2025 By Datadog In Datadog

Platforms like @SeatGeek face challenges in managing and resolving incidents effectively. Discover how SeatGeek mastered incident response by integrating Datadog Incident Management.

View Video

Datadog

Read more about Managing and resolving incidents effectively #shorts #datadog

February 2025 Box Outage: Timeline and Post-Mortem

Feb 24, 2025 By Colin Bartlett In StatusGator

Box.com is a cloud-based content management and file-sharing platform designed for the enterprise and used by nearly 100,000 companies around the world. When a Box outage strikes, businesses can experience costly disruptions. On February 19, 2025, a disruption in core Box services, including uploads, downloads, and the All Files page, affected thousands who depend on the cloud storage and collaboration platform.

Read Post

StatusGator

Read more about February 2025 Box Outage: Timeline and Post-Mortem

Migrating to cloud: Top five reasons

Feb 24, 2025 By Geoffrin Edwin In Site24x7

Since the inception of public clouds, a lot of CXOs have considered moving their IT infrastructure to the cloud and many have already done that. If your organization is considering migration to the cloud, learn what drove this mass movement from on-premises servers to the cloud. In this article, we'll explain the major reasons why organizations prefer the cloud, the issues you should watch out for, and how you should protect your cloud infrastructure.

Read Post

Site24x7

Read more about Migrating to cloud: Top five reasons

Conquering Data Overload at Ingestion - Tech Talks #2

Feb 24, 2025 By VictoriaMetrics In VictoriaMetrics

Join us for our second Tech Talk, where we’ll tackle log ingestion challenges and explore how VictoriaLogs makes log management effortless. Modern infrastructure produces an overwhelming volume of log data, but traditional log management solutions struggle with scalability, performance, and cost.

View Video

VictoriaMetrics

Read more about Conquering Data Overload at Ingestion - Tech Talks #2

WhatsUp Gold 2024.0.2 Release Overview

Feb 24, 2025 By WhatsUp Gold In WhatsUp Gold

Watch this video to learn about the features included in version 2024.0.2 of WhatsUp Gold. Find more information on WhatsUp Gold.

View Video

WhatsUp Gold

Read more about WhatsUp Gold 2024.0.2 Release Overview

Troubleshoot Kubernetes Performance Issues with AI

Feb 24, 2025 By Logz.io In logz.io

Struggling with Kubernetes performance issues? This video introduces an AI-powered agent designed to help users quickly identify and resolve bottlenecks. By analyzing logs, the AI detects performance issues, streamlining troubleshooting and improving system efficiency. Watch now to see how AI can simplify Kubernetes performance management and keep your infrastructure running smoothly!

View Video

logz.io

Read more about Troubleshoot Kubernetes Performance Issues with AI

The One Where We Meet Cribl Copilot

Feb 24, 2025 By Cribl In Cribl

We’re kicking off our new live weekly product demo series, streaming on YouTube, X, and LinkedIn! Each week, we’ll dive into the latest features and hidden gems from the Cribl Suite of tools to help you unlock the full potential of your telemetry data. For our first session, we’re thrilled to welcome Nikhil Mungel, the visionary behind Cribl Copilot. This AI-powered assistant is designed to instantly surface answers from the documentation and build pipelines with just a simple request.

View Video

Cribl

Read more about The One Where We Meet Cribl Copilot

Entity Centric Detection -- Customer Brown Bag -- February 20th, 2025

Feb 24, 2025 By Sumo Logic In Sumo Logic

Please join us as Chas and Chris review how to use Entity Centric Detection with Sumo Logic.

View Video

Sumo Logic

Read more about Entity Centric Detection -- Customer Brown Bag -- February 20th, 2025

Free network monitoring: Full network visibility without the cost

Feb 23, 2025 By Rama Venkatesan In Site24x7

Investing in a network monitoring tool should mean complete visibility and faster troubleshooting. But what happens when an unexpected outage occurs and your expensive tool misses the warning signs? The result: hours of downtime, frustrated employees, and lost business productivity. Many organizations face this challenge, realizing that even premium monitoring solutions can leave critical gaps. The good news? You don’t have to break the bank to monitor your network effectively.

Read Post

Site24x7

Read more about Free network monitoring: Full network visibility without the cost

Optimize MTTD with the right check frequency

Feb 22, 2025 By Nočnica Mellifera In Checkly

Checkly enables engineers to automate the monitoring of their production services. Using the automation framework Playwright, you can run an end-to-end test on a regular cadence to make sure every feature is working for your users. But once you’ve got your check set up, whether with Playwright scripting, a Terraform template, or an OpenAPI spec, we come to the question of how frequently you should run these checks. Should you be checking every few minutes, or every hour?

Read Post

Checkly

Read more about Optimize MTTD with the right check frequency

How Forbes delivers a premium digital experience with Datadog

Feb 21, 2025 By Datadog In Datadog

Learn how Forbes, a global media powerhouse, successfully migrated to the cloud with Datadog. Discover how they enabled their teams across their entire tech stack to access IT data and make critical improvements. The team maintained a 99.5 percent uptime through proactive alerting and improved root cause analysis by 10 percent.

View Video

Datadog

Read more about How Forbes delivers a premium digital experience with Datadog

Breaking Free from Legacy Observability: Why Service Providers Choose Kentik Over Deepfield

Feb 21, 2025 By Lauren Basile In Kentik

Modern network operators need modern observability tools. In this post, we explore why Deepfield, a traditional network flow analytics platform, falls short in providing the comprehensive insights required for today’s network operations, and how Kentik’s modern data platform is purpose-built for today’s infrastructure teams.

Read Post

Kentik

Read more about Breaking Free from Legacy Observability: Why Service Providers Choose Kentik Over Deepfield

Increase control and reduce noise in your AWS logs using Datadog Observability Pipelines

Feb 21, 2025 By Ahmed Ahmed In Datadog

Today’s SRE and security operations center (SOC) teams often find themselves overwhelmed by the sheer volume and variety of logs generated by critical AWS services such as VPC Flow Logs, AWS WAF, and Amazon CloudFront. While these logs can be valuable for detecting and investigating security threats, as well as troubleshooting issues in your environment, managing them at scale can be challenging and costly.

Read Post

Datadog

Read more about Increase control and reduce noise in your AWS logs using Datadog Observability Pipelines

A deep dive into Database Monitoring index recommendations

Feb 21, 2025 By Alex Weisberger In Datadog

Datadog Database Monitoring (DBM) Recommendations help you proactively optimize performance throughout your database fleet. DBM draws on a wide range of data sources in order to detect and provide actionable guidance on issues such as blocking queries, low disk space, and missing indexes. In this post, we’ll show you how DBM formulates targeted indexing recommendations to help you optimize database performance.

Read Post

Datadog

Read more about A deep dive into Database Monitoring index recommendations

How to use locators to design more resilient synthetic tests

Feb 21, 2025 By Addie Beach In Datadog

Most modern web applications are frequently updated to implement new features, execute marketing campaigns, or enhance their UX with new libraries or APIs. While this helps you better engage your users, constant UI updates make designing flexible, long-lasting tests challenging.

Read Post

Datadog

Read more about How to use locators to design more resilient synthetic tests

EventSentry v5.2: New Features Overview

Feb 21, 2025 By EventSentry In EventSentry

EventSentry v5.2 includes numerous new features that help improve AD and end-point security.

View Video

EventSentry

Read more about EventSentry v5.2: New Features Overview

SolarWinds Observability 2025.1: Big Cloud Updates for GCP, AWS & Azure!

Feb 21, 2025 By SolarWinds In SolarWinds

New cloud support has landed in SolarWinds Observability 2025.1! Now with expanded monitoring for Google Cloud, AWS, and Azure, you can track even more cloud entities with ease. What’s new? Google Cloud: now supports Google Compute Engine. Azure: new support for Azure App Service and Blob Storage. AWS: expanded RDS support (MySQL, Aurora, PostgreSQL, Oracle) plus Load Balancer monitoring. See it in action! We explore the latest dashboards and drill into cloud resources like virtual machines, databases, and storage.

View Video

SolarWinds

Read more about SolarWinds Observability 2025.1: Big Cloud Updates for GCP, AWS & Azure!

Perses - A new language for dashboards?

Feb 21, 2025 By John Hayes In Squared Up

One of the most interesting stories in the dashboarding space over the past year or so has been the emergence of the Perses project. This is an open source project which not only provides a platform for dashboard creation, but also sets itself the very ambitious target of defining a common standard for dashboards as code. As a SquaredUp user, you may be wondering why we might want to talk about a potentially competing technology. Well, obviously, being SquaredUp, dashboards are in our DNA.

Read Post

Squared Up

Read more about Perses - A new language for dashboards?

Java based application log monitoring

Feb 21, 2025 By Rollbar In Rollbar

Discover how to monitor logs and errors for Java based applications. Improve the process using Rollbar's Java SDK.

View Video

Rollbar

Read more about Java based application log monitoring

An SRE's guide to optimizing ML systems with MLOps pipelines

Feb 21, 2025 By Max Saltonstall In Google Operations

As AI and ML become more prevalent, administrators can use Site Reliability Engineering (SRE) techniques to manage the ML infrastructure and software.

Read Post

Google Operations

Read more about An SRE's guide to optimizing ML systems with MLOps pipelines

How well-designed automations lead to efficient orchestration in AWS

Feb 21, 2025 By Sinjan Ballav In Site24x7

Managing resources efficiently in cloud-based environments like AWS is crucial for scalability, security, and cost-effectiveness. Automation is key to eliminating manual intervention in routine tasks, while orchestration ensures that these automated tasks are executed in a structured, coordinated manner. In AWS, leveraging well-designed automation enhances orchestration, enabling organizations to optimize performance, resource utilization, and security while maintaining operational agility.

Read Post

Site24x7

Read more about How well-designed automations lead to efficient orchestration in AWS

Elastic achieves AWS Government ISV Partner Competency, strengthening public sector solutions portfolio

Feb 21, 2025 By Udayasimha Theepireddy (Uday) In Elastic

Advancing digital transformation in government through Search AI and cloud innovation. We’re thrilled to share that Elastic has achieved the AWS Government ISV Partner Competency. This prestigious designation recognizes Elastic as an Amazon Web Services (AWS) partner that has proven expertise in delivering high-quality solutions that help government agencies meet mandates, reduce costs, drive efficiencies, and boost innovation.

Read Post

Elastic

Read more about Elastic achieves AWS Government ISV Partner Competency, strengthening public sector solutions portfolio

Prometheus Metrics Explained: Counters, Gauges, Histograms & Summaries

Feb 21, 2025 By Phuong Le In VictoriaMetrics

This discussion is the first part of the basic monitoring series, an effort to eliminate confusion in monitoring for both beginners and experienced users.

Read Post

VictoriaMetrics

Read more about Prometheus Metrics Explained: Counters, Gauges, Histograms & Summaries

Citrix Monitoring On Microsoft SCOM Webinar Recording 2025Q1

Feb 21, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

View Video

NiCE IT Mgmt

Read more about Citrix Monitoring On Microsoft SCOM Webinar Recording 2025Q1

Guide To Confluent Kafka vs Apache Kafka

Feb 21, 2025 By David Benson In Logit.io

Kafka is an open-source distributed streaming platform for high-throughput and fault-tolerant real-time data streaming in large-scale systems. It can integrate with a wide range of data sources and sinks, which include databases, message queues, big data processing frameworks like Apache Spark and Apache Flink, and many more.

Read Post

Logit.io

Read more about Guide To Confluent Kafka vs Apache Kafka

Getting Ready with Regex 101

Feb 21, 2025 By Jeff Darrington In Graylog

If you’ve dropped your house key in tall grass, you know how difficult it is to locate a small item hiding in an overgrown field. Perhaps you borrowed a metal detector from a friend, then returned to the field hoping to get the loud beep that indicates finding metal in an otherwise organic area. Trying to find patterns in strings of data is the same process.

Read Post

Graylog

Read more about Getting Ready with Regex 101

OpenTelemetry Visualization Setup: A Developer's Guide

Feb 21, 2025 By Prathamesh Sonpatki In Last9

If you've ever tried to set up OpenTelemetry visualization, you know it can be a bit overwhelming. But don't worry—in this guide, we'll break it all down step by step. Whether you're just getting started or looking to fine-tune your existing setup, this walkthrough will help you get the most out of your telemetry data.

Read Post

Last9

Read more about OpenTelemetry Visualization Setup: A Developer's Guide

How to Use OpenSearch with Python for Search and Analytics

Feb 21, 2025 By Preeti Dewani In Last9

If you're working with search and analytics, you’ve probably heard about OpenSearch—the open-source alternative to Elasticsearch. OpenSearch is a powerful tool, whether you're building a search engine, running log analytics, or implementing full-text search in your applications. And the best part? You can integrate it easily with Python.

Read Post

Last9

Read more about How to Use OpenSearch with Python for Search and Analytics

Making sure you get a Checkly alert for every detected failure

Feb 21, 2025 By Nočnica Mellifera In Checkly

It’s every ops team’s biggest anxiety: a monitoring system detects a failure, but the notification either isn’t delivered or isn’t noticed by the team. Now we have to wait for users to complain before our team knows about the problem. Checkly sends an alert every time the system detects a failure, but how can you be sure you’re getting those alerts, and that those alerts are going to the right people?

Read Post

Checkly

Read more about Making sure you get a Checkly alert for every detected failure

Prometheus Monitoring: Instant Queries and Range Queries Explained

Feb 21, 2025 By Phuong Le In VictoriaMetrics

Over the years, we’ve received many questions about MetricsQL/PromQL, even from experienced users, especially regarding range queries and instant queries. The topic is basic, but it turns out to be really important for explaining why your query behaves the way it does. This discussion is part of the basic monitoring series, an effort to eliminate confusion in monitoring for both beginners and experienced users.

Read Post

VictoriaMetrics

Read more about Prometheus Monitoring: Instant Queries and Range Queries Explained

React.js Performance Guide

Feb 21, 2025 By Armin Ulrich In Sentry

Which JS framework is the most performant? React, Vue, Svelte, Angular,…? When trying to answer this question, we often get lost in comparing benchmarks for reactivity, bundle size, memory usage and other factors. Of course we want to choose the best framework to create performant apps! But your app will only benefit from framework performance if you also follow best practices for performance optimization of web apps in general, and React apps in particular. So, where to start?

Read Post

Sentry

Read more about React.js Performance Guide

What is Network Availability: Your Guide to 99.9 Uptime

Feb 21, 2025 By Alyssa Lamberti In Obkio

In the fast-paced world of network admins, where data flows like the heartbeat of an organization, network availability is the top priority. For admins, it's not just another term, it's a make-or-break factor for the success and smooth operation of everything they manage. In a world where we're all plugged in and counting on the constant exchange of info, getting network availability right is absolutely critical.

Read Post

Obkio

Read more about What is Network Availability: Your Guide to 99.9 Uptime

Error handling in JavaScript

Feb 21, 2025 By Rollbar In Rollbar

Discover how error handling works in JavaScript and a better way to find and fix JavaScript errors using Rollbar.

View Video

Rollbar

Read more about Error handling in JavaScript

It was DNS Again: Why Your Status Page Needs Its Own Domain

Feb 21, 2025 By Colin Bartlett In StatusGator

On February 20, 2025, at 16:22 UTC, StatusGator detected an outage affecting Vultr. The issue appeared to stem from a DNS failure, causing vultr.com and any other services hosted on its domain to become inaccessible. But what does that include? The official Vultr status page. Because Vultr hosts its status page on status.vultr.com, the same domain hosting its primary website and dashboard, users were left without an official source of updates during the outage.

Read Post

StatusGator

Read more about It was DNS Again: Why Your Status Page Needs Its Own Domain

FinOps IT Financial Management

Feb 21, 2025 By Turbo360 In Turbo360

Cloud computing has revolutionized IT infrastructure by offering unparalleled scalability and adaptability. However, organizations face significant challenges when it comes to effectively managing their cloud costs. Traditional IT Financial Management (ITFM) methodologies, designed for on-premises operations, often struggle to address the advanced financial complexities of cloud-based investments. This is where FinOps IT Financial Management takes center stage.

Read Post

Turbo360

Read more about FinOps IT Financial Management

Almost Three in Ten UK Public Sector IT Professionals Are Concerned About Potential Security Risks Associated with Adopting AI, SolarWinds Report Finds

Feb 20, 2025 By SolarWinds In SolarWinds

Digital transformation is a work in progress for most organisations, with privacy, security concerns, AI adoption and the complexity of integrating new systems remaining key barriers.

Read Post

SolarWinds

Read more about Almost Three in Ten UK Public Sector IT Professionals Are Concerned About Potential Security Risks Associated with Adopting AI, SolarWinds Report Finds

Smarter 'Ignore' error status controls for flexible alerting

Feb 20, 2025 By Ollie Bannister In Raygun

We’ve heard your feedback and are excited to roll out an improvement to Raygun’s ‘Ignore’ error status functionality, which gives you more control over how and when you suppress errors.

Read Post

Raygun

Read more about Smarter 'Ignore' error status controls for flexible alerting

Cisco Live'25 - Amsterdam - Be Bold in AI!

Feb 20, 2025 By Shailesh Manjrekar In Fabrix

The recently concluded Cisco Live’25 event in Amsterdam gave a glimpse into Cisco’s strategy for the coming years. With its broad reach across networking and security, Cisco has the potential to join the $1 trillion club. The recent market trends around agentic AI are ripe to act as a catalyst for this.

Read Post

Fabrix

Read more about Cisco Live'25 - Amsterdam - Be Bold in AI!

Sponsored Post

Why AIX Monitoring Matters | Reasons, Obstacles, Solutions

Feb 20, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

AIX monitoring is essential for ensuring enterprise IT reliability, performance, and security. Traditional solutions often lack the depth needed for complex AIX environments, making specialized tools crucial for tracking performance and preventing downtime. As the need for real-time, automated monitoring grows, advanced solutions like NiCE AIX Management Pack integrate with Microsoft SCOM to enhance visibility and system optimization. By leveraging dedicated AIX monitoring, businesses can improve uptime, security, and efficiency, ensuring long-term infrastructure success.

Read Post

NiCE IT Mgmt

Read more about Why AIX Monitoring Matters | Reasons, Obstacles, Solutions

New Bindplane Destinations for Dash0 and Axiom #observability #opentelemetry #otlp

Feb 20, 2025 By ObservIQ In ObservIQ

View the Bindplane Community Call in February for more guidance.

View Video

ObservIQ

Read more about New Bindplane Destinations for Dash0 and Axiom #observability #opentelemetry #otlp

DORA Compliance - An Opportunity for MSPs

Feb 20, 2025 By Mike Ferioli In eG Innovations

For Managed Service Providers (MSPs) in the EU that serve financial organizations, DORA regulatory compliance is a hot topic. The Digital Operational Resilience Act (DORA) is an EU regulation that has applied since Jan 17th, 2025. It is aimed at ensuring the operational resilience of financial entities in the EU, focusing on technology risk management and minimizing disruptions in critical services.

Read Post

eG Innovations

Read more about DORA Compliance - An Opportunity for MSPs

Drilldown apps: An improved queryless experience for faster insights into your observability data

Feb 20, 2025 By Grafana In Grafana

See how we're improving the apps to help you quickly get insights into your logs, metrics, traces, and profiles, and find out why we changed the name from Explore apps to Drilldown. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more. We also have plans for every use case.

View Video

Grafana

Read more about Drilldown apps: An improved queryless experience for faster insights into your observability data

The Event Breaker Explained! Split and Unroll Logs with Bindplane Processor Bundles #observability

Feb 20, 2025 By ObservIQ In ObservIQ

View the Bindplane Community Call in February for more guidance.

View Video

ObservIQ

Read more about The Event Breaker Explained! Split and Unroll Logs with Bindplane Processor Bundles #observability

Enhance Network Performance Management With Next-Gen AIOps: Configuring Integration of DX Spectrum With DX Operational Observability

Feb 20, 2025 By Rubens Massini In Broadcom

To unlock the power of observability and advanced analytics of AIOps, teams need to collect exceptional monitoring data, establish connections and correlations between the data, and understand context with the help of robust and current topological maps. Because modern networks often span on-premises, cloud, and hybrid infrastructures, monitoring their performance and troubleshooting issues can be difficult. These complex infrastructures often lead to observability gaps for network teams.

Read Post

Broadcom

Read more about Enhance Network Performance Management With Next-Gen AIOps: Configuring Integration of DX Spectrum With DX Operational Observability

Intelligent Alerting with RapidSpike and ilert Integration

Feb 20, 2025 By Georgina Grant-Muller In RapidSpike

When it comes to website performance and uptime, every second counts. Businesses rely on tools like RapidSpike to monitor their digital presence, ensuring websites and applications run smoothly. However, effective alerting and incident management are just as critical as monitoring itself. That’s where ilert comes in.

Read Post

RapidSpike

Read more about Intelligent Alerting with RapidSpike and ilert Integration

Debugging a .NET Application with Loggly

Feb 20, 2025 By Loggly Team In SolarWinds

As modern applications grow more complex, debugging becomes increasingly challenging. Applications consist of multiple parts, which can generate enormous amounts of log data, making debugging difficult. SolarWinds Loggly can help store, manage, and sift through this data. To demonstrate, we’ll set up an application built on .NET Core 9.0 and MongoDB; then, we’ll walk through how to export its logs to Loggly.

Read Post

SolarWinds

Read more about Debugging a .NET Application with Loggly

Nexthink Recognized in G2's 2025 Best Software Awards

Feb 20, 2025 By Friedbert Schuh In Nexthink

We are honoured to announce that G2 has recognized Nexthink as a leading software company in EMEA in their 2025 Best Software Awards. This achievement builds on our ongoing recognition from G2, where we’ve been named a Category Leader consecutively since 2021. Our success is only possible because of the support we receive from our incredible DEX community.

Read Post

Nexthink

Read more about Nexthink Recognized in G2's 2025 Best Software Awards

CLM Chowder: Digging Into the Cloud Latency of Azure, Google Cloud, and OCI

Feb 20, 2025 By Doug Madory In Kentik

CLM Chowder is a new series which highlights notable observations of cloud connectivity surfaced by Kentik’s Cloud Latency Map. In this edition, we look at measurements from Alibaba (China), latency swings from South Africa, and a temporary latency jump from Marseilles to Asia.

Read Post

Kentik

Read more about CLM Chowder: Digging Into the Cloud Latency of Azure, Google Cloud, and OCI

The next generation of Grafana Mimir: Inside Mimir's redesigned architecture for increased reliability

Feb 20, 2025 By Zhehao Zhou In Grafana

This year Grafana Mimir — the open source, horizontally scalable, multi-tenant time series database (TSDB) — will celebrate its third anniversary. Over the years, Mimir has become the go-to, Prometheus-compatible metrics backend within the open source community, with 29 maintainers and more than 4.6k GitHub stars. Since introducing Mimir, we’ve worked hard to deliver on our promise of making it the most scalable and performant open source TSDB in the world.

Read Post

Grafana

Read more about The next generation of Grafana Mimir: Inside Mimir's redesigned architecture for increased reliability

Grafana Drilldown apps: the improved queryless experience formerly known as the Explore apps

Feb 20, 2025 By Andrew Stucky In Grafana

When we introduced the Explore apps suite for metrics, logs, traces, and profiles last year at ObservabilityCON 2024, our goal was simple: offer a queryless, point-and-click experience so you can quickly find insights in your observability data—no queries or complicated syntax required. Our commitment to that goal remains unchanged, but we’re excited to announce that the Explore apps have a new name: Grafana Drilldown.

Read Post

Grafana

Read more about Grafana Drilldown apps: the improved queryless experience formerly known as the Explore apps

Why Internet Performance Monitoring is the new health check for IT organizations

Feb 20, 2025 By Howard Beader In Catchpoint

Monitoring has been part of our lives for centuries. We watch ourselves, our environment, and our habits to gain insights and make better decisions. Even the much-dreaded annual health check we line up for each year is just another facet of this age-old process. The goal is simple: spot small red flags now, before they balloon into bigger health complications later. It’s the same principle that has guided us for generations—keeping tabs, so we can correct course before trouble takes hold.

Read Post

Catchpoint

Read more about Why Internet Performance Monitoring is the new health check for IT organizations

KubeCon 2024 | Interviews with Observability Experts | Observability Insights with Aunsh Chaudhari

Feb 20, 2025 By Splunk In Splunk

In this interview from KubeCon 2024, I sit down with Aunsh Chaudhari, a Product Manager at Splunk, to discuss the biggest trends shaping observability today. With a background in software development and hands-on experience with observability tools, Aunsh shares insights on OpenTelemetry adoption, cost optimization strategies, and the shift toward unified observability. We also touch on emerging topics like AI in observability and the challenges of scaling observability in modern environments.

View Video

Splunk

Read more about KubeCon 2024 | Interviews with Observability Experts | Observability Insights with Aunsh Chaudhari

Integrating OpenTelemetry with Grafana for Better Observability

Feb 20, 2025 By Aditya Godbole In Last9

Modern application observability is essential for ensuring system performance, diagnosing issues, and optimizing user experiences. OpenTelemetry (Otel) and Grafana serve as two key components in achieving end-to-end visibility. While OpenTelemetry focuses on instrumenting applications to collect telemetry data, Grafana specializes in visualizing this data, making it actionable and insightful.

Read Post

Last9

Read more about Integrating OpenTelemetry with Grafana for Better Observability

An In-Depth Guide to Java Performance Monitoring for SREs

Feb 20, 2025 By Ujjwal Goyal In Last9

If you've ever had a Java application slow down in production and struggled to pinpoint the cause, you know the pain of performance issues. Java is a powerful, high-level language, but it doesn’t come without challenges—especially when it comes to resource management, garbage collection, and thread handling. This guide will take you through everything you need to know about Java performance monitoring, from key metrics to tools and best practices.

Read Post

Last9

Read more about An In-Depth Guide to Java Performance Monitoring for SREs

OpenTelemetry UI: The Ultimate Guide for Developers

Feb 20, 2025 By Prathamesh Sonpatki In Last9

If you’ve ever struggled with understanding distributed traces, managing metrics, or debugging complex applications, OpenTelemetry is your best friend. But what about the OpenTelemetry UI? How do you visualize and interact with all that telemetry data? In this guide, we’ll explore the best ways to use OpenTelemetry’s UI options, from setting up a proper observability stack to choosing the right front-end visualization tools.

Read Post

Last9

Read more about OpenTelemetry UI: The Ultimate Guide for Developers

How to Optimize Websites for Ad Publishers

Feb 20, 2025 By Dotcom-Monitor In Dotcom-Monitor

Optimizing a website for ad publishers is a must for anyone looking to maximize ad revenue and improve their user experience. A fast, well-optimized website ensures better engagement, higher ad visibility, and increased revenue potential. Additionally, search engines favor well-performing sites, which leads to more organic traffic and greater ad impressions. In this updated guide, we’ll explore fresh strategies to enhance website optimization for ad publishers.

Read Post

Dotcom-Monitor

Read more about How to Optimize Websites for Ad Publishers

OpenTelemetry: The Future of Observability with Advanced Tracing and Metrics

Feb 20, 2025 By Oscar Parra In Apica

Hey there! Oscar here. After spending countless hours wrestling with various monitoring tools and proprietary solutions, I wanted to share my thoughts on what I believe is revolutionizing the observability landscape: OpenTelemetry (OTel). OpenTelemetry revolutionizes observability in distributed systems.

Read Post

Apica

Read more about OpenTelemetry: The Future of Observability with Advanced Tracing and Metrics

Transform Data with the New Python Processing Engine in InfluxDB 3

Feb 20, 2025 By Peter Barnett In InfluxData

In early January, we announced the launch of InfluxDB 3 Core and InfluxDB 3 Enterprise in public alpha. One of the newest included features is the InfluxDB 3 Processing Engine–a Python-based VM built to enable data transformation, enrichment, downsampling, alerting, and more, all from within the database itself. One month later, we’re excited to deliver a big update enabling new ways to interact with and transform your data.

Read Post

InfluxData

Read more about Transform Data with the New Python Processing Engine in InfluxDB 3

Application Performance Monitoring Product Walkthrough

Feb 20, 2025 By Raygun In Raygun

Happy coding from the team at Raygun.

View Video

Raygun

Read more about Application Performance Monitoring Product Walkthrough

Real User Monitoring Product Walkthrough

Feb 20, 2025 By Raygun In Raygun

Happy coding from the team at Raygun.

View Video

Raygun

Read more about Real User Monitoring Product Walkthrough

Crash Reporting Product Walkthrough

Feb 20, 2025 By Raygun In Raygun

Happy coding from the team at Raygun.

View Video

Raygun

Read more about Crash Reporting Product Walkthrough

Finding Root Cause Quickly with Logz.io AI Agent

Feb 20, 2025 By logz.io In logz.io

In the video, Jade Lassery discusses how to effectively manage complex environments, especially when faced with unexpected spikes in errors. She introduces a Logz.io AI agent prompt that assists users in quickly identifying the root cause of these issues. By simply asking the right questions, users can streamline their troubleshooting process and enhance their operational efficiency.

View Video

logz.io

Read more about Finding Root Cause Quickly with Logz.io AI Agent

Turn Your Browser Actions into End-to-End Tests with Playwright CodeGen

Feb 20, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he explains how to use Playwright's CodeGen command to turn browser actions into executable Playwright end-to-end test code.

View Video

Checkly

Read more about Turn Your Browser Actions into End-to-End Tests with Playwright CodeGen

Kubernetes made simple: A beginner's guide to managing containers

Feb 20, 2025 By Arun Madhavan In Site24x7

As applications become more complex, managing containers efficiently is key to scaling and maintaining performance. Kubernetes (also known as K8s) automates this process, making it easier to handle scaling, failures, and uptime. If you're new to Kubernetes, understanding the platform and how it's used is essential for managing your applications seamlessly. Let’s dive in and explore how Kubernetes makes it all possible.

Read Post

Site24x7

Read more about Kubernetes made simple: A beginner's guide to managing containers

Understanding Root Cause: Domain Name Systems (DNS) and Traceroute

Feb 20, 2025 By Pingdom In SolarWinds

You can think about a website the same way you think about your car. Every time something breaks, a professional—an engineer or a mechanic—usually charges a high amount for the fix (isn’t it annoying when you can’t tell if it’s a big or small fix?). Alternatively, you can learn some basics, get a few inexpensive tools, and troubleshoot many of the immediate issues yourself.

Read Post

SolarWinds

Read more about Understanding Root Cause: Domain Name Systems (DNS) and Traceroute

Logging vs. Metrics

Feb 20, 2025 By Lauren Barnes In MetricFire

When discussing observability, the “big 3” (logs, metrics, and traces) always get mentioned. But for some, more data doesn’t always mean better. Our lead engineer, JJ, had some advice to share about how logs may not be necessary for everyone. Simplifying your observability stack isn’t difficult; you just need to be intentional with implementation. Check out more MetricFire blog posts below, and our hosted Graphite service! Get a free trial and start using MetricFire now!

Read Post

MetricFire

Read more about Logging vs. Metrics

How APM and synthetic monitoring work together for better performance

Feb 20, 2025 By Sindu Priyadharshini V In Site24x7

Imagine this: A customer tries to log in to your app, but the page takes too long to load. Frustrated, they leave. Meanwhile, your IT team has no clue there was an issue—until complaints start pouring in. Sound familiar? Performance lags are the new downtime. Lags are not just an inconvenience—they lead to lost revenue and frustrated users. To prevent this, organizations turn to application performance monitoring (APM) and synthetic monitoring to maintain peak application performance.

Read Post

Site24x7

Read more about How APM and synthetic monitoring work together for better performance

Looking for app resilience? Looking to lower MTTD and MTTR? Well, look no further

Feb 19, 2025 By Catchpoint In Catchpoint

Learn more here: https://www.catchpoint.com/appassure

View Video

Catchpoint

Monitoring

Read more about Looking for app resilience? Looking to lower MTTD and MTTR? Well, look no further

Getting started with Snyk dashboards

Feb 19, 2025 By John Hayes In Squared Up

If you are involved in software development you will probably be aware of the ever-growing menace of supply chain attacks. These are attempts by attackers to insert malicious code into code libraries which might be downloaded or referenced by developers. Many modern frameworks can install hundreds or even thousands of dependencies, so the potential attack surface can be huge. As well as code libraries, attackers can also attempt to conceal malware in sources such as Docker images or CDNs.

Read Post

Squared Up

Read more about Getting started with Snyk dashboards

Diagnosing and resolving the 500 internal server error with Apache and Tomcat logs

Feb 19, 2025 By Subashree K In Site24x7

The dreaded 500 internal server error is a common challenge for web administrators, often signaling a disruption in server operations. Diagnosing the root cause requires in-depth visibility into both web server and application behavior. In this blog, we’ll explore how log management tools simplify the diagnosis and resolution of 500 errors by leveraging insights from both Apache and Tomcat logs.

Read Post

Site24x7

Read more about Diagnosing and resolving the 500 internal server error with Apache and Tomcat logs

How to leverage AI to enhance network monitoring in retail: A CXO's guide

Feb 19, 2025 By Rama Venkatesan In Site24x7

The retail industry has evolved into a mix of physical stores, e-commerce, digital payments, and omnichannel interactions. Now, GenAI has been added to this mix, which changes how people shop, how retailers operate, and how employees work. While this shift creates opportunities for retailers of all sizes, it also presents serious challenges in maintaining network performance and staying compliant with industry regulations.

Read Post

Site24x7

Read more about How to leverage AI to enhance network monitoring in retail: A CXO's guide

Diagnosing ActiveMQ broker performance issues with log analysis

Feb 19, 2025 By Subashree K In Site24x7

Apache ActiveMQ is a widely used message broker that enables seamless communication between distributed applications. However, as the volume of messages increases, performance bottlenecks can arise, leading to slow message processing, high latency, broker crashes, and out of memory (OOM) errors. One of the most critical issues affecting ActiveMQ is OOM errors, which occur when the broker exceeds its allocated heap memory. This can result in service failures, message loss, and prolonged downtime.

Read Post

Site24x7

Read more about Diagnosing ActiveMQ broker performance issues with log analysis

HTTP/3 is Fast!

Feb 19, 2025 By Request Metrics In Request Metrics

HTTP/3 is here, and it’s a big deal for web performance. See just how much faster it makes websites! Wait, wait, wait, what happened to HTTP/2? Wasn’t that all the rage only a few short years ago? It sure was, but there were some problems. To address them, there’s a new version of the venerable protocol working its way through the standards track. Ok, but does HTTP/3 actually make things faster? It sure does, and we’ve got the benchmarks to prove it.

Read Post

Request Metrics

Read more about HTTP/3 is Fast!

Getting started with Postgres dashboards

Feb 19, 2025 By John Hayes In Squared Up

In the last few years, Postgres has experienced a meteoric rise in popularity. A relational database that not long ago was relatively unknown outside of academic circles has now eclipsed MySQL as the most popular database for developers in the most recent Stack Overflow user survey. Why has it achieved such impressive popularity with developers?

Read Post

Squared Up

Read more about Getting started with Postgres dashboards

Manage your network with ManageEngine Site24x7!

Feb 19, 2025 By ManageEngine Site24x7 In Site24x7

As a network administrator, you know how critical it is to ensure seamless network performance, optimize bandwidth, and secure your infrastructure. But with the growing complexity of modern networks, staying on top of everything can be overwhelming. That’s where ManageEngine Site24x7 comes in! In this video, we dive into how Site24x7, a comprehensive network observability solution, empowers you to.

View Video

Site24x7

Read more about Manage your network with ManageEngine Site24x7!

InfluxData and AWS Expand Strategic Offering with New Capabilities to Power Large-Scale Time Series Workloads

Feb 19, 2025 By Company In InfluxData

Amazon Timestream for InfluxDB expands offering with Read Replicas, delivering enterprise-grade scalability and reliability to time series workloads on AWS.

Read Post

InfluxData

Read more about InfluxData and AWS Expand Strategic Offering with New Capabilities to Power Large-Scale Time Series Workloads

Breakpoint Episode 1 - Your Issues Have Issues

Feb 19, 2025 By Sentry In Sentry

The first episode of Breakpoint with Sentry is here! We're talking all things robots and Autofix, a brand new Issues refresh, Uptime watching for your downtime, and Dev Toolbars with red flags and feature flags galore. This is Breakpoint!

View Video

Sentry

Monitoring

Read more about Breakpoint Episode 1 - Your Issues Have Issues

Scale Time Series Workloads on AWS: Introducing Amazon Timestream for InfluxDB Read Replicas

Feb 19, 2025 By Peter Barnett In InfluxData

The world runs in real-time. From industrial automation and IoT monitoring to AI-powered analytics, developers rely on time series data to power critical systems and make split-second decisions. But as workloads grow, so do the challenges: keeping queries fast, ensuring high availability, and scaling efficiently without adding operational complexity. Not having to worry about operational overhead enables companies to focus on deriving value from their data.

Read Post

InfluxData

Read more about Scale Time Series Workloads on AWS: Introducing Amazon Timestream for InfluxDB Read Replicas

Manage All Your App Notifications in One Place with AppSignal

Feb 19, 2025 By Connor James In AppSignal

Alerts and notifications are the backbone of any Application Performance Monitoring (APM) tool, ensuring your team is immediately aware of critical issues. At AppSignal, we’re always improving our toolkit to help you stay ahead of problems before they impact performance or reliability. We've made huge improvements to how you can manage your app notifications and alerts with AppSignal.

Read Post

AppSignal

Read more about Manage All Your App Notifications in One Place with AppSignal

How to do Agentless Monitoring with check_by_ssh

Feb 19, 2025 By Alvar Penning In Icinga

Check plugins are the foundation of Icinga 2. They are executed, and their return values are mapped to either Host or Service objects. Everything else builds on top of that. These check plugins can come from the Monitoring Plugins project or be custom-written. Whatever their origin, they are the building blocks of an Icinga monitoring stack. If a plugin goes CRITICAL, Icinga 2 alerts the sysadmin.

Read Post

Icinga

Read more about How to do Agentless Monitoring with check_by_ssh

Grafana Cloud updates: Exemptions in Adaptive Logs, GPU monitoring in AI Observability, and more

Feb 19, 2025 By Kristin Knapp In Grafana

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up (the first of 2025!) of the latest and greatest Grafana Cloud updates. You can also read about all the features we add to Grafana Cloud in our What’s New in Grafana Cloud documentation.

Read Post

Grafana

Read more about Grafana Cloud updates: Exemptions in Adaptive Logs, GPU monitoring in AI Observability, and more

Optimizing Database Performance, Episode 1: The Solid Foundations of Database Design

Feb 19, 2025 By SolarWinds In SolarWinds

Join our resident database expert, Kevin Kline, for our upcoming webcast, “Optimizing Database Performance, Episode 1: The Solid Foundations of Database Design.” We’re going back to basics and focusing on how poor database design can impact even the most powerful and expensive hardware.

View Video

SolarWinds

Read more about Optimizing Database Performance, Episode 1: The Solid Foundations of Database Design

NANOG 93 in Atlanta: From Automation to AI

Feb 19, 2025 By Justin Ryburn In Kentik

NANOG 93 brought the networking community together for insightful discussions on the future of connectivity, emphasizing continuous learning, adaptation, and collaboration. Justin Ryburn highlights key takeaways from this premier event in the world of networking.

Read Post

Kentik

Read more about NANOG 93 in Atlanta: From Automation to AI

Breakpoint recap: Uptime Monitoring, robots, and feature flags galore

Feb 19, 2025 By Sasha Blumenfeld In Sentry

Bugs don’t announce themselves politely. They crash your checkout flow, break authentication, or slow your API to a crawl—usually right before your CEO asks how things are going. And when the error inbox is flooded with a hundred variations of TypeError: cannot read property of undefined, figuring out what actually matters can feel impossible.

Read Post

Sentry

Read more about Breakpoint recap: Uptime Monitoring, robots, and feature flags galore

What Are Network Packets & How to Monitor Them: The Secret Life Of Network Packets

Feb 19, 2025 By Alyssa Lamberti In Obkio

Ever wonder how the Internet actually works? It’s not just magic (though it sometimes feels that way). Behind every webpage you load, every video call you make, and every meme you send, tiny digital messengers called network packets are zipping through cyberspace, carrying data from one point to another. Think of them as the text messages of the Internet; small, efficient, and sometimes frustrating when they don’t arrive on time. But what exactly are network packets? How do they work?

Read Post

Obkio

Read more about What Are Network Packets & How to Monitor Them: The Secret Life Of Network Packets

Bindplane Expands Partnership with Google Cloud

Feb 19, 2025 By Joseph Howell In ObservIQ

We're only one month into 2025, but the momentum keeps building at Bindplane. In January, we rebranded our company as Bindplane, aligning our company name with our core mission: delivering the best OpenTelemetry-native telemetry pipeline on the market. Building on that excitement, we have another announcement: we've expanded and extended our partnership with Google Cloud.

Read Post

ObservIQ

Read more about Bindplane Expands Partnership with Google Cloud

Investigating Kubernetes Issues with Papertrail

Feb 19, 2025 By Papertrail Team In SolarWinds

While Kubernetes aims to streamline containerized application management, its multi-layered architecture creates potential points of failure. Problems in any of these layers can manifest as application crashes, resource overutilization, or failed deployments, making cluster maintenance a persistent challenge. Kubernetes meticulously logs all aspects of cluster activity and application output, from individual Pods to ReplicaSets.

Read Post

SolarWinds

Read more about Investigating Kubernetes Issues with Papertrail

Slicing Up-and Iterating on-SLOs

Feb 19, 2025 By Fred Hebert In Honeycomb

One of the main pieces of advice about Service Level Objectives (SLOs) is that they should focus on the user experience. Invariably, this leads to people further down the stack asking, “But how do I make my work fit the users?”—to which the answer is to redefine what we mean by “user.” In the end, a user is anyone who uses whatever it is you’re measuring.

Read Post

Honeycomb

Read more about Slicing Up-and Iterating on-SLOs

The integration directory is here!

Feb 19, 2025 By Colin Bartlett In StatusGator

Keeping your team informed about service disruptions has never been easier. The StatusGator Integration directory, available in the Menu on our website, allows you to explore all the integrations we support in one place. From collaboration tools to incident management platforms, we help you integrate status updates seamlessly into your workflow.

Read Post

StatusGator

Read more about The integration directory is here!

Batch Processing for Logs Streaming into Google SecOps #observability #tutorial #secops

Feb 19, 2025 By ObservIQ In ObservIQ

Use processors in Bindplane to configure batching for data that's flowing into SecOps. View the full hands-on workshop for more guidance.

View Video

ObservIQ

Read more about Batch Processing for Logs Streaming into Google SecOps #observability #tutorial #secops

Configure Google SecOps Log Ingestion Labels and Attributes #secops #observability

Feb 19, 2025 By ObservIQ In ObservIQ

Set log ingestion attributes and labels in Bindplane to identify data that's flowing into SecOps. View the full hands-on workshop for more guidance.

View Video

ObservIQ

Read more about Configure Google SecOps Log Ingestion Labels and Attributes #secops #observability

SolarWinds announces winners of 2025 EMEA Partner Awards

Feb 18, 2025 By SolarWinds In SolarWinds

SolarWinds shared the winners of its 2025 EMEA Partner Awards. The awards recognise the impressive growth and dedication within the distribution and reseller partner community over the past year.

Read Post

SolarWinds

Read more about SolarWinds announces winners of 2025 EMEA Partner Awards

Beyond Their Intended Scope: BGP Goof-UPX

Feb 18, 2025 By Doug Madory In Kentik

In this second installment of Beyond Their Intended Scope, we analyze a recent BGP leak out of Brazil that briefly affected networks around the world. Because this routing mishap was a path leak (i.e., it did not involve any mis-originations and was therefore immune to RPKI ROV protection), it demonstrates why we need a thing called ASPA … ASAP.

Read Post

Kentik

Read more about Beyond Their Intended Scope: BGP Goof-UPX

Easiest Way to Monitor NGINX Performance with OpenTelemetry

Feb 18, 2025 By Benjamin Pitts In MetricFire

If you're looking for a straightforward way to collect NGINX metrics via OpenTelemetry and send them to your Graphite-based monitoring setup, this article is for you! With minimal configuration, you’ll be collecting key metrics from your NGINX connections within minutes. In this article, we'll explain how to install the OpenTelemetry Collector and configure it to receive and export NGINX metrics to a Hosted Carbon endpoint.

Read Post

MetricFire

Read more about Easiest Way to Monitor NGINX Performance with OpenTelemetry

AI & Gartner's Strategic Roadmap Timeline for Cybersecurity - A Perspective from Teneo

Feb 18, 2025 By Teneo In Teneo

The integration of artificial intelligence (AI) presents both unprecedented opportunities and emerging threats. Gartner’s Strategic Roadmap for Cybersecurity Leadership emphasizes the need for adaptive strategies that align with business objectives and technological advancements. Concurrently, the UK’s National Cyber Security Centre (NCSC) has highlighted the dual-edged nature of AI in its report on the impact of AI on cyber threats.

Read Post

Teneo

Read more about AI & Gartner's Strategic Roadmap Timeline for Cybersecurity - A Perspective from Teneo

Observability for your NodeJS AWS Serverless Applications

Feb 18, 2025 By Datadog In Datadog

Hi there, and welcome to the first video in this series on observing AWS serverless applications with Datadog. In this video, you’ll learn how easy it is to get started observing your serverless NodeJS applications using Datadog and the AWS CDK. You’ll also look at how you can use the Datadog console to diagnose latency issues and errors inside your application. You’ll walk away with an understanding of how to instrument your Lambda functions with the AWS CDK, as well as practical steps you can take to debug your applications.

View Video

Datadog

Read more about Observability for your NodeJS AWS Serverless Applications

How to observe AWS Lambda functions using the OpenTelemetry Collector and Grafana Cloud

Feb 18, 2025 By Dominik Süß In Grafana

Getting telemetry data out of modern applications is very straightforward—or at least it should be. You set up a collector that either receives data from your application or asks it to provide an up-to-date state of various counters. This happens every minute or so, and if it’s a second late or early, no one really bats an eye. But what if the application isn’t around for long? What if every second waiting for the data to be collected is billed?

Read Post

Grafana

Read more about How to observe AWS Lambda functions using the OpenTelemetry Collector and Grafana Cloud

Self-Healing Infrastructure: Start Your Journey Now

Feb 18, 2025 By Meeta Lalwani In Virtana

Every CIO’s ultimate goal is to create a self-healing enterprise. Self-healing IT systems have the ability to proactively prevent issues within the IT environment, ensuring seamless and uninterrupted services that support business continuity. While automating every possible task seems like an obvious solution, implementing changes in a production environment can be challenging.

Read Post

Virtana

Read more about Self-Healing Infrastructure: Start Your Journey Now

Analysts Share Their 2025 Cybersecurity Predictions

Feb 18, 2025 By Filip Cerny In Flowmon

It's the start of a new year. Like last year, I want to examine what analysts are predicting for the cybersecurity landscape in 2025 and the risks they feel will be front and center. There is no shortage of predictions for this year’s cybersecurity landscape outlook—so many, it's impossible to compile them all. While not a thorough summary of the threats and risks in 2025, this article highlights the most common topics covered by cybersecurity specialists.

Read Post

Flowmon

Read more about Analysts Share Their 2025 Cybersecurity Predictions

A Quick Guide for OpenTelemetry Python Instrumentation

Feb 18, 2025 By Prathamesh Sonpatki In Last9

OpenTelemetry is an open-source tool that helps you keep an eye on your application’s performance. Whether you’re building microservices, using serverless setups, or working with a traditional monolithic app, it’s crucial to monitor and trace your app’s behavior for debugging and optimization. OpenTelemetry's Python instrumentation is an excellent way to track traces, metrics, and logs across your entire app.

Read Post

Last9

Read more about A Quick Guide for OpenTelemetry Python Instrumentation

AIOps Across the Board: 3 Industry Use Cases That Leverage the Power of the ScienceLogic Platform

Feb 18, 2025 By ScienceLogic In ScienceLogic

The ScienceLogic AI Platform and Skylar AI enable organizations to maintain the performance, health, and security of their IT environments. By providing comprehensive observability enhanced by unsupervised artificial intelligence, they turn data into actionable insights. With its unparalleled visibility and intelligence across complex multi-cloud, hybrid, and on-premises infrastructures, IT teams across the globe use ScienceLogic to proactively monitor, automate, and optimize operations.

Read Post

ScienceLogic

Read more about AIOps Across the Board: 3 Industry Use Cases That Leverage the Power of the ScienceLogic Platform

Tomcat Logs: Locations, Types, Configuration, and Best Practices

Feb 18, 2025 By Anjali Udasi In Last9

Apache Tomcat logs are essential for monitoring, debugging, and maintaining Java applications running on Tomcat. These logs capture critical information such as server startup details, request handling, and application errors. They help developers and system administrators troubleshoot issues, analyze traffic, and ensure application stability. Tomcat generates multiple logs, each serving a distinct purpose.

Read Post

Last9

Read more about Tomcat Logs: Locations, Types, Configuration, and Best Practices

Helm vs Terraform: A Detailed Comparison for Developers

Feb 18, 2025 By Anjali Udasi In Last9

When managing infrastructure and deploying applications in a cloud-native environment, two popular tools that developers often compare are Helm and Terraform. While both are used to automate deployments, they serve different purposes and operate in distinct ways. Understanding the differences can help you make the right choice for your use case.

Read Post

Last9

Read more about Helm vs Terraform: A Detailed Comparison for Developers

Eliminate log sprawl and cut costs with Sumo Logic

Feb 18, 2025 By Sumo Logic In Sumo Logic

How much money is your company wasting on using multiple tools for log ingestion? Security analysts, developers, and operations teams all rely on logs. But, when each team uses different and multiple tools to store and analyze logs, it leads to tool sprawl, wasted resources, and lost critical data. With Sumo Logic’s Log Analytics Platform, you get a single source of truth for all your log data. Gain context-driven insights into your performance, availability, security status, and threats, all while eliminating wasteful spending.

View Video

Sumo Logic

Read more about Eliminate log sprawl and cut costs with Sumo Logic

Stronger together: (Agentic) AIOps and observability are the keys to IT resilience

Feb 18, 2025 By LogicMonitor In LogicMonitor

Every new layer of infrastructure piles onto an already fragile web of interconnected challenges, making it painfully clear: traditional monitoring can’t keep up. You’re drowning in alerts, buried in data, and yet somehow still flying blind when real issues arise. More notifications don’t mean more insight, and more data doesn’t guarantee better decisions.

Read Post

LogicMonitor

Read more about Stronger together: (Agentic) AIOps and observability are the keys to IT resilience

How SNMP monitoring works

Feb 18, 2025 By LogicMonitor In LogicMonitor

Modern network management requires administrators to gather information about live network performance, detect faults as they are happening, and provide assurance of overall operations. Simple Network Management Protocol (SNMP) is a protocol commonly implemented for monitoring network infrastructure that satisfies each requirement.

Read Post

LogicMonitor

Read more about How SNMP monitoring works

How Dotcom-Monitor Enhances Your API Monitoring

Feb 18, 2025 By Dotcom-Monitor In Dotcom-Monitor

APIs (Application Programming Interfaces) play a crucial role in connecting applications, facilitating data exchange, and ensuring seamless user experiences. However, APIs are only as effective as their reliability and performance. Without proper monitoring, even the most well-designed API can encounter issues such as slow response times, unexpected downtime, or security vulnerabilities.

Read Post

Dotcom-Monitor

Read more about How Dotcom-Monitor Enhances Your API Monitoring

Enhancing Network Reliability: How to Measure, Test & Improve It

Feb 18, 2025 By Alyssa Lamberti In Obkio

Whether you're a business owner or an IT pro, you know that a solid network is the foundation of your organization’s success. And that’s where Network Reliability comes in. Think about it: a key video call with a client, a crucial file transfer right before a deadline, or an online transaction on your e-commerce site. What do they all have in common? They all depend on a network that just works—no glitches, no interruptions.

Read Post

Obkio

Read more about Enhancing Network Reliability: How to Measure, Test & Improve It

WhatsUp Gold Network Traffic Analysis Plus (NTA+) Demo

Feb 18, 2025 By WhatsUp Gold In WhatsUp Gold

This video gives a demonstration of the Network Traffic Analysis Plus (NTA+) functionality within WhatsUp Gold. Find more information on Network Traffic Analysis.

View Video

WhatsUp Gold

Read more about WhatsUp Gold Network Traffic Analysis Plus (NTA+) Demo

Effortless, Cost-Effective VMware Monitoring with NiCE

Feb 17, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Managing a VMware environment can be complex, time-consuming, and expensive — unless you have the right monitoring solution. At NiCE, we pride ourselves on delivering intuitive, cost-effective monitoring solutions that simplify IT operations. One of our recent customers shared their experience with the NiCE VMware Management Pack, and their words speak for themselves.

Read Post

NiCE IT Mgmt

Read more about Effortless, Cost-Effective VMware Monitoring with NiCE

Why use Playwright in Catchpoint for synthetic monitoring

Feb 17, 2025 By Arunita Banerjee In Catchpoint

Modern websites demand constant oversight to ensure every click, login, and checkout runs smoothly. That’s where synthetic monitoring shines: it acts like a tireless, virtual visitor that spots performance hiccups before they can bother real users. Our Internet Performance Monitoring (IPM) platform features Playwright support. You can run new or existing Playwright scripts with little to no changes.

Read Post

Catchpoint

Read more about Why use Playwright in Catchpoint for synthetic monitoring

Integrating FinOps and ITSM for Optimal Cloud Cost Management

Feb 17, 2025 By Turbo360 In Turbo360

The adoption of cloud computing has revolutionized how businesses manage IT infrastructure accountability and budget control. As cloud offerings become increasingly complex and scalable, modern business environments demand improved financial management practices. Through its data-driven and collaborative approach, FinOps IT Service Management bridges the gap between engineering teams, business units, and finance departments, ensuring maximum cloud benefit consumption while optimizing expenses.

Read Post

Turbo360

Read more about Integrating FinOps and ITSM for Optimal Cloud Cost Management

An Easy Guide to OpenFeature Flagging

Feb 17, 2025 By Ujjwal Goyal In Last9

In software development, feature flags have become an essential tool for teams looking to deploy code with more control and agility. OpenFeature flagging, in particular, stands out as an open-source standard that’s revolutionizing how teams manage feature rollouts, experiments, and toggling. In this guide, we’ll cover what OpenFeature flagging is, its key benefits, how to implement it, and best practices to help you get the most out of it.

Read Post

Last9

Read more about An Easy Guide to OpenFeature Flagging

What is DynamoDB Throttling and How to Fix It

Feb 17, 2025 By Anjali Udasi In Last9

When you're working with DynamoDB, one of the most critical things you need to keep an eye on is throttling. If you're not careful, throttling can severely impact your database's performance. It’s not just about slower response times—throttling can lead to system failures or unexpected downtime if not addressed properly.

Read Post

Last9

Read more about What is DynamoDB Throttling and How to Fix It

Wiring Up a Next.js Self-Hosted Application to Honeycomb

Feb 17, 2025 By Ken Rimple In Honeycomb

Are you attempting to connect Honeycomb to a standalone (not hosted with Vercel) Next.js application? Most of the Next.js OpenTelemetry samples in the wild show how to connect Next.js to Vercel’s observability solution when hosting on their platform. But what if you’re hosting your own standalone Next.js server on Node.js? This blog post will get you started ingesting your Next.js application’s telemetry into Honeycomb.

Read Post

Honeycomb

Read more about Wiring Up a Next.js Self-Hosted Application to Honeycomb

Browse Incident & Uptime History in Status Pages

Feb 17, 2025 By Leo Baecker In Hyperping

Today we're introducing historical browsing for your status page services. You can now navigate through past incidents and uptime data to get a complete picture of your services' reliability over time.

Read Post

Hyperping

Read more about Browse Incident & Uptime History in Status Pages

Why a mobile app is the key to better incident communication

Feb 17, 2025 By Arun Madhavan In Site24x7

While downtime is inevitable, communication should remain swift and transparent. Businesses need a way to relay updates as incidents unfold, ensuring customers, internal teams, and stakeholders stay informed in real time. Relying on emails and web-based updates alone is no longer enough. A mobile-first approach is the solution.

Read Post

Site24x7

Read more about Why a mobile app is the key to better incident communication

Introducing the Middleware Adoption Journey

Feb 17, 2025 By Sean Riley In meshIQ

Middleware plays a crucial role in modern IT infrastructure by enabling seamless communication between applications, systems, and services. It facilitates data exchange, enhances interoperability, and supports various business functions by providing capabilities like messaging, transaction management, and integration services. Over time, middleware has evolved from simple message brokers to sophisticated platforms supporting APIs, cloud computing, microservices, and event-driven architectures.

Read Post

meshIQ

Read more about Introducing the Middleware Adoption Journey

Setup Telegraf, InfluxDB 3 Core & Grafana using Docker easily & quickly

Feb 17, 2025 By InfluxData In InfluxData

Learn to set up the open source TIG Stack in less than 10 minutes.

View Video

InfluxData

Read more about Setup Telegraf, InfluxDB 3 Core & Grafana using Docker easily & quickly

Top reasons why businesses lose trust after acquisition and how you can be smart

Feb 16, 2025 By Santhi Santhanakrishnan In Site24x7

Did you wake up to the news that your favorite tool was acquired? You probably got used to the tool's intuitive interface, cost-effectiveness, and feature set, which aligned perfectly with your day-to-day requirements. Your disappointment doesn't end here. It's just the beginning of a series of potential negative consequences of acquisitions.

Read Post

Site24x7

Read more about Top reasons why businesses lose trust after acquisition and how you can be smart

Passing the Phone: Auvik Way Edition

Feb 14, 2025 By Auvik In Auvik

The Auvik Way, in Action! At Auvik, our culture isn’t just words on a page—it’s what drives us every day. The Auvik Way is a set of principles that shape how we work, collaborate, and grow together. At RKO, we decided to bring it to life in a fun way! We asked Auvikians to "pass the phone" to a coworker who truly embodies one of our principles—and the results were nothing short of inspiring. Check out the video to see how our team uplifts, supports, and celebrates each other.

View Video

Auvik

Read more about Passing the Phone: Auvik Way Edition

Add Amazon S3 to your Pipeline Destination

Feb 14, 2025 By Splunk In Splunk

In this demo, we will look at how you can add Amazon S3 as one of your destinations in Splunk Data Management. We are using Ingest Processor in this example, but the same concept can be applied to Edge Processor as well.

View Video

Splunk

Read more about Add Amazon S3 to your Pipeline Destination

FOSDEM 2025 recap

Feb 14, 2025 By Jose Gomez-Selles In VictoriaMetrics

In case you haven’t heard about it yet, FOSDEM (Free and Open Source Software Developers’ European Meeting) is a huge, free, gathering for open-source software enthusiasts that happens every February in Brussels, Belgium. It’s a non-profit event put together by the community, and it’s one of the biggest of its kind - we’re talking about around 10,000 people from all over the world coming to hang out and talk about all things open source.

Read Post

VictoriaMetrics

Read more about FOSDEM 2025 recap

SRE Challenges & APM Solutions

Feb 14, 2025 By ManageEngine Site24x7 In Site24x7

Site Reliability Engineers (SREs) face constant challenges as cloud environments and microservices grow more complex. Performance issues often go unnoticed until they escalate, leading to downtime and disruptions. With Site24x7 APM, you can stay ahead of issues before they impact your business. Our Application Performance Monitoring (APM) solution provides real-time insights, predictive analytics, and deep visibility across your entire IT ecosystem—helping you.

View Video

Site24x7

Read more about SRE Challenges & APM Solutions

Native AWS Integrations with AutoDiscovery

Feb 14, 2025 By Ankit Anand In SigNoz

For developers, the main quest is building and scaling their applications—not struggling with complex monitoring setups. Yet, observability in cloud-native environments is essential, and configuring monitoring for AWS services has traditionally been a complex and manual process. Developers had to set up Firehose streams, CloudWatch metric streams, and log subscriptions, all while ensuring continuous maintenance for new instances, turning observability into an unwelcome side quest.

Read Post

SigNoz

Read more about Native AWS Integrations with AutoDiscovery

High Cardinality Explained: The Basics Without the Jargon

Feb 14, 2025 By Anjali Udasi In Last9

Cardinality refers to the number of unique values in a dataset column. A column with many distinct values—like a user ID or timestamp—has high cardinality, while a column with limited distinct values—like a boolean flag (true/false) or a category with a few possible options—has low cardinality. For example, consider a database of an e-commerce platform.
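To make the definition concrete, here is a minimal Python sketch that counts the distinct values in each column of a few made-up e-commerce rows. The column names (`user_id`, `category`, `in_stock`) and the data are assumptions for illustration only, not taken from the post.

```python
from collections import defaultdict

# Hypothetical e-commerce order rows; all names and values are illustrative.
orders = [
    {"user_id": "u-1001", "category": "books", "in_stock": True},
    {"user_id": "u-1002", "category": "books", "in_stock": False},
    {"user_id": "u-1003", "category": "toys", "in_stock": True},
    {"user_id": "u-1004", "category": "books", "in_stock": True},
]


def column_cardinality(rows):
    """Return the number of distinct values seen in each column."""
    distinct = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            distinct[column].add(value)
    return {column: len(values) for column, values in distinct.items()}


cardinality = column_cardinality(orders)
print(cardinality)
```

In this toy data `user_id` has one distinct value per row (high cardinality), while `category` and `in_stock` stay at two values no matter how many rows are added (low cardinality).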

Read Post

Last9

Read more about High Cardinality Explained: The Basics Without the Jargon

Log Retention: Policies, Best Practices & Tools (With Examples)

Feb 14, 2025 By Anjali Udasi In Last9

Logs are the backbone of debugging, security, compliance, and performance monitoring. But if you don’t manage retention properly, you’ll either drown in unnecessary data or lose critical insights too soon. Log retention is all about striking a balance between keeping what’s necessary and discarding what’s not.

Read Post

Last9

Read more about Log Retention: Policies, Best Practices & Tools (With Examples)

Understanding Syslog Formats: A Quick and Easy Guide

Feb 14, 2025 By Anjali Udasi In Last9

Syslog is the backbone of logging in many Linux and Unix-based systems, playing a crucial role in monitoring, debugging, and auditing. But not all syslog messages are created equal. Depending on your system, software, and logging configuration, syslog messages may follow different formats. This guide walks you through the different syslog formats, why they matter, and how to work with them effectively.
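One element the common syslog formats (RFC 3164 and RFC 5424) share is the leading `<PRI>` field, which packs facility and severity into one number as facility × 8 + severity. The sketch below decodes it in Python; the function name is hypothetical, and the sample message follows the RFC 3164 style.

```python
import re

# Severity names in numeric order 0..7, as defined by the syslog RFCs.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(message):
    """Split the leading <PRI> of a syslog message into (facility, severity name)."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    pri = int(match.group(1))
    if pri > 191:
        raise ValueError("PRI must be between 0 and 191")
    facility, severity = divmod(pri, 8)
    return facility, SEVERITIES[severity]


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(decode_pri(sample))
```

Here `34` decodes to facility 4 (security/authorization messages) and severity 2 (critical).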

Read Post

Last9

Read more about Understanding Syslog Formats: A Quick and Easy Guide

What is agentic AIOps, and why is it crucial for modern IT?

Feb 14, 2025 By LogicMonitor In LogicMonitor

Every minute of system downtime costs enterprises a minimum of $5,000. With IT infrastructure growing more complex by the day, companies are put at risk of even greater losses. Adding insult to injury, traditional operations tools are woefully out of date. They can’t predict failures fast enough. They can’t scale with growing infrastructure.

Read Post

LogicMonitor

Read more about What is agentic AIOps, and why is it crucial for modern IT?

Managing resource contention in Google App Engine: Best practices for optimal performance

Feb 14, 2025 By Mahalashmi Narayanan In Site24x7

Use case 1: When unexpected traffic surges lead to slower responses. A sudden surge in user traffic during a high-demand event causes strain on resources in a cloud-based application running on App Engine. The platform automatically scales instances to handle the increased load, but since compute resources are shared, some instances experience CPU throttling. This leads to slower response times, delayed processing of critical operations, and potential errors that impact user experience. How to resolve it.

Read Post

Site24x7

Read more about Managing resource contention in Google App Engine: Best practices for optimal performance

What is Time Series Data?

Feb 14, 2025 By David Benson In Logit.io

Time series data is particularly prevalent, seen across numerous different industries and use cases. It offers significant value to various organizations, highlighting the importance of effectively monitoring and analyzing the data. By analyzing and monitoring time series data you can understand trends, patterns, and anomalies in sequential data collected at many points in time.

Read Post

Logit.io

Read more about What is Time Series Data?

Introducing Learning journeys: New step-by-step guides to get started with Grafana

Feb 14, 2025 By Fiona Peers Artiaga In Grafana

Our Big Tent philosophy provides the foundation for our broad, modular, and flexible observability platform. With Grafana’s powerful ability to integrate with a wide range of data sources, tools, and plugins, you can create customized solutions tailored to your unique needs.

Read Post

Grafana

Read more about Introducing Learning journeys: New step-by-step guides to get started with Grafana

The Role of ServiceOps in Enhancing IT Service Delivery and Efficiency

Feb 14, 2025 By Arpit Sharma In Motadata

Providing quick and effective IT services is paramount for organizational achievement in dynamic business operations. Technology development creates new obstacles for IT teams that must sustain service excellence and operational effectiveness standards. Recently developed ServiceOps implements a transformation of IT service management (ITSM) that surpasses all organizational needs.

Read Post

Motadata

Read more about The Role of ServiceOps in Enhancing IT Service Delivery and Efficiency

Improve developer experience and collaboration with Software Catalog

Feb 14, 2025 By Brooke Chen In Datadog

As software ecosystems grow more complex and fragmented, organizations are finding it harder to manage the thousands of interdependencies that make up their environments. For starters, engineers are collectively struggling to uphold security and reliability standards throughout their organizations because they lack a shared view of these complex software landscapes.

Read Post

Datadog

Read more about Improve developer experience and collaboration with Software Catalog

SolarWinds 2025.1: New Network Device Support You Need to See!

Feb 14, 2025 By SolarWinds In SolarWinds

Discover what’s new in SolarWinds Platform 2025.1! This update brings expanded network device support for Aruba, Fortinet, Ruckus Smart Zone Wireless, and Extreme Networks. Get hardware health insights, Layer 2 & 3 metrics, VLAN details, routing table utilization, and more!

View Video

SolarWinds

Read more about SolarWinds 2025.1: New Network Device Support You Need to See!

How to use APM data to improve your CI/CD pipeline performance

Feb 13, 2025 By Site24x7 In ManageEngine

Agile production has become the norm for software development cycles. The backbone for such a fast-paced landscape is the continuous integration and continuous delivery (CI/CD) pipeline. But merely depending on the CI/CD pipeline isn’t enough, even though the automated workflows give you a competitive edge. The pipeline needs to be optimized to function at its best. This is where monitoring your applications within the pipeline can be a game-changer.

Read Post

ManageEngine

Read more about How to use APM data to improve your CI/CD pipeline performance

Micro Lesson: Open Telemetry Collector Remote Management

Feb 13, 2025 By Sumo Logic In Sumo Logic

This video demonstrates remote management of Open Telemetry data collection by enabling setup and configuration from the Sumo Logic UI.

View Video

Sumo Logic

Read more about Micro Lesson: Open Telemetry Collector Remote Management

Native AWS Integrations with AutoDiscovery Demo

Feb 13, 2025 By SigNoz In SigNoz

Native AWS Integrations with AutoDiscovery Demo.

View Video

SigNoz

Read more about Native AWS Integrations with AutoDiscovery Demo

The Advanced Data Compression Techniques That Quietly Power Logz.io's AI Observability Agents

Feb 13, 2025 By David Lotan Bolotnikoff In logz.io

As an observability leader, at Logz.io, we pride ourselves on continuous innovation. That’s why, last year, we released our AI agents to revolutionize observability by helping businesses, and their engineering and DevOps teams, automate data analysis and root cause analysis. The primary way in which engineering and DevOps teams interact with the agents is by asking performance, troubleshooting, and optimization-related questions.

Read Post

logz.io

Read more about The Advanced Data Compression Techniques That Quietly Power Logz.io's AI Observability Agents

Types of Pods in Kubernetes: An In-depth Guide

Feb 13, 2025 By Anjali Udasi In Last9

When working with Kubernetes, pods are the fundamental building blocks of deployment. But not all pods are created equal. Understanding the different types of pods and their use cases is crucial for optimizing workloads, ensuring reliability, and maintaining efficiency in your cluster. Let's break it all down.

Read Post

Last9

Read more about Types of Pods in Kubernetes: An In-depth Guide

Telemetry Data Platform: Everything You Need to Know

Feb 13, 2025 By Anjali Udasi In Last9

As systems grow more distributed and complex, having a reliable way to monitor and understand what's happening across your infrastructure becomes essential. Telemetry data provides the visibility needed to keep everything running smoothly, whether you're managing microservices, cloud environments, or sophisticated AI systems. In this guide, we’ll break down what a telemetry data platform is, why it’s so important, and how you can choose the right one to meet your needs.

Read Post

Last9

Read more about Telemetry Data Platform: Everything You Need to Know

Uptime Monitoring: A Complete Beginner's Guide

Feb 13, 2025 By Simon Rodgers In WebSitePulse

Uptime monitoring checks whether a website, server, or online service is available. It runs automated tests at set intervals, verifying responses and sending alerts if a failure occurs. Businesses rely on uptime monitoring to detect issues early, prevent revenue loss, and maintain customer trust. A website outage can harm reputation, impact SEO rankings, and disrupt operations.
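The check-and-alert loop described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular monitoring product works: the function names, the "2xx or 3xx means up" rule, and the use of `print` as the alert channel are all assumptions.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Fetch the URL and return its HTTP status code, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and similar


def is_up(status):
    """Assumed availability rule: any 2xx or 3xx answer counts as 'up'."""
    return status is not None and 200 <= status < 400


def monitor(url, checks, interval_seconds, probe=http_status, alert=print, sleep=time.sleep):
    """Run `checks` probes spaced `interval_seconds` apart and alert on each failure."""
    failures = 0
    for attempt in range(checks):
        status = probe(url)
        if not is_up(status):
            failures += 1
            alert(f"ALERT: {url} failed check {attempt + 1} (status={status})")
        if attempt < checks - 1:
            sleep(interval_seconds)
    return failures
```

A call such as `monitor("https://example.com", checks=3, interval_seconds=60)` would probe the site three times, one minute apart. The `probe`, `alert`, and `sleep` parameters are injectable so the loop can be exercised without touching the network.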

Read Post

WebSitePulse

Read more about Uptime Monitoring: A Complete Beginner's Guide

Deeper Trace Analytics - Analyze Root & Entry Spans with Ease

Feb 13, 2025 By Ankit Anand In SigNoz

Debugging distributed systems can often feel like searching for a needle in a haystack. When issues arise, engineers need faster ways to pinpoint critical spans within their traces. With our latest Deeper Trace Analytics update, SigNoz now enables powerful filtering for root and entry spans—making it significantly easier to analyze and debug distributed traces.

Read Post

SigNoz

Read more about Deeper Trace Analytics - Analyze Root & Entry Spans with Ease

How to monitor the monitoring - Roman Khavronenko, Fosdem 2025

Feb 13, 2025 By VictoriaMetrics In VictoriaMetrics

Monitoring serves as a shield against the unknown, providing clarity through the complexity of distributed systems and hardware. However, monitoring itself can become a complex system, posing challenges when it fails.

View Video

VictoriaMetrics

Monitoring

Read more about How to monitor the monitoring - Roman Khavronenko, Fosdem 2025

Discovering the Magic Behind OpenTelemetry Instrumentation - Jose Gomez-Selles | Fosdem 2025

Feb 13, 2025 By VictoriaMetrics In VictoriaMetrics

Instrumentation is the secret ingredient that brings observability to life, revealing the intricate workings of applications in ways logs and metrics alone can’t match. In this talk, we’ll dive deep into the magic of OpenTelemetry instrumentation, exploring how to uncover hidden insights within your applications and services.

View Video

VictoriaMetrics

Read more about Discovering the Magic Behind OpenTelemetry Instrumentation - Jose Gomez-Selles | Fosdem 2025

Pandora ITSM 105: New Tools for More Efficient IT Management

Feb 13, 2025 By Pandora FMS team In Pandora FMS

With the new Pandora ITSM version 105, you now have features designed to improve your workflow and optimize ticket and project management.

Read Post

Pandora FMS

Read more about Pandora ITSM 105: New Tools for More Efficient IT Management

Evaluating Cloud Gateways for Cost and Performance

Feb 13, 2025 By Phil Gervasi In Kentik

Cloud networking costs can escalate due to inefficient routing and limited visibility. Kentik’s cloud visibility and analytics solution helps engineers optimize transit, reduce costs, and improve performance by analyzing AWS Transit Gateways and exploring alternatives like direct peering, storage endpoints, and AWS CloudWAN.

Read Post

Kentik

Read more about Evaluating Cloud Gateways for Cost and Performance

From Detection to Prevention: Leveraging InfluxDB for Cybersecurity and IoT Threat Mitigation

Feb 13, 2025 By Suyash Joshi In InfluxData

Cybersecurity in the Industrial Internet of Things (IIoT) is often overlooked despite powering critical infrastructure such as energy grids, telecom networks, factories, robotics, and aerospace, all of which are prime targets for cyberattacks and data breaches. A single breach can disrupt essential services or expose sensitive data. So, how do we stay ahead of bad actors and proactively defend these systems?

Read Post

InfluxData

Read more about From Detection to Prevention: Leveraging InfluxDB for Cybersecurity and IoT Threat Mitigation

Preempting Problems in a Sociotechnical System

Feb 13, 2025 By Nick Travaglini In Honeycomb

Here at Honeycomb, we emphasize that organizations are sociotechnical systems. At a high level, that means that “wet-brained” people and the stuff they do is irreducible to “dry-brained” computations. That cashes out as the inability to ultimately remove or replace people in organizations with computers, in spite of what artificial general intelligence (AGI) ideologues would have you believe.

Read Post

Honeycomb

Read more about Preempting Problems in a Sociotechnical System

Establish End-to-End Visibility in VMware VeloCloud Environments

Feb 13, 2025 By Seth Differ In Broadcom

In recent years, the move to SD-WAN technologies like VeloCloud has been both rapid and widespread. While the use of SD-WAN has yielded benefits, one EMA report revealed that only 38% of respondents said their SD-WAN deployments were fully successful.

Read Post

Broadcom

Read more about Establish End-to-End Visibility in VMware VeloCloud Environments

Crafting effective cloud architecture diagrams: A comprehensive guide

Feb 13, 2025 By Kirubanandan RA In Site24x7

Cloud architecture diagrams play a crucial role in communication, planning, and execution within the realm of cloud computing. They provide a visual depiction of the infrastructure, highlighting the interconnections between different components and their collaborative functionality. In this guide, we will delve into the five fundamental factors that every cloud architect should consider when crafting a cloud infrastructure.

Read Post

Site24x7

Read more about Crafting effective cloud architecture diagrams: A comprehensive guide

Grafana Loki 3.4: Standardized storage config, sizing guidance, and Promtail merging into Alloy

Feb 13, 2025 By Julie Stickler In Grafana

The Grafana Loki 3.4 release is here, and it brings a fresh wave of enhancements aimed at standardizing Loki’s object storage, helping you right size your instance, and improving the ability to ingest out-of-order logs. Loki 3.4 also represents the official merging of Promtail into Grafana Alloy as part of our efforts to give our users a single telemetry collector. There’s a lot to go over, so let’s dive in.

Read Post

Grafana

Read more about Grafana Loki 3.4: Standardized storage config, sizing guidance, and Promtail merging into Alloy

Organization Home

Feb 13, 2025 By SquaredUp In Squared Up

Get a bird's eye view of your estate via the Organization Home. This view displays all the workspaces, dashboards and monitors in your organization. Health states are visible at a glance – simply drill down to troubleshoot or explore.

View Video

Squared Up

Read more about Organization Home

Debug issues hiding behind feature flags

Feb 13, 2025 By Sentry In Sentry

In this video, we walk through how to debug faster with the new Sentry Dev Toolbar, specifically when you're also monitoring errors behind Feature Flags. See how you can track feature rollouts, catch errors tied to specific flags, and get real-time insights without leaving your app.

View Video

Sentry

Monitoring

Read more about Debug issues hiding behind feature flags

The ROI of Developer-First Observability: Why It's a Game Changer

Feb 13, 2025 By Eran Kinsbruner In Lightrun

In today’s fast-paced software landscape, downtime is costly, debugging is time-consuming, and developers are constantly under pressure to resolve issues quickly. Observability tools have traditionally been built for operations and SRE teams, focusing on post-mortem analysis rather than proactive debugging. Giving developers real-time insights into live applications, so they can fix issues without disrupting the software lifecycle, has proven to be a game changer for a myriad of reasons.

Read Post

Lightrun

Read more about The ROI of Developer-First Observability: Why It's a Game Changer

Scraping NGINX Metrics with OpenTelemetry & Exporting to Carbon

Feb 13, 2025 By Benjamin Pitts In MetricFire

Looking for a straightforward way to collect NGINX metrics with OpenTelemetry and send them to your Graphite-based monitoring setup? Unlike Prometheus, which requires configuring scrape jobs and query language nuances, Carbon/Graphite offers a simpler setup with minimal overhead—just send metrics as plain text and query them easily with familiar tools like Grafana. Whether you're setting up dashboards, alerts, or just keeping an eye on traffic, this guide will get you actionable insights in no time!
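For a sense of what "plain text" means here, Carbon's plaintext protocol takes one metric per line in the form `<metric path> <value> <timestamp>`, conventionally on TCP port 2003. The Python sketch below formats and sends such lines; the metric path, host, and port are assumptions, and it does not use the OpenTelemetry-based setup the post itself describes.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric as '<path> <value> <timestamp>' followed by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(lines, host="localhost", port=2003):
    """Send pre-formatted lines over TCP; host and port are assumed defaults."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(payload)


# Hypothetical metric name for an NGINX active-connections gauge.
line = carbon_line("nginx.web01.connections.active", 42, timestamp=1739404800)
print(repr(line))
```

Calling `send_to_carbon([line])` would deliver the sample to a Carbon listener, if one is running at the assumed address.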

Read Post

MetricFire

Read more about Scraping NGINX Metrics with OpenTelemetry & Exporting to Carbon

Challenges in designing AWS architecture

Feb 13, 2025 By Kirubanandan RA In Site24x7

Designing AWS architecture is a complex task. It requires careful planning; a deep understanding of cloud services; and the ability to balance performance, cost, security, and scalability. As organizations migrate to the cloud or expand their existing cloud infrastructure, they often face several challenges that can impact the success of their architecture. Once the architecture is deployed, effective cloud monitoring becomes critical to ensure optimal performance and reliability.

Read Post

Site24x7

Read more about Challenges in designing AWS architecture

Simplifying Kubernetes architecture for DevOps

Feb 13, 2025 By Kirubanandan RA In Site24x7

Kubernetes has become the go-to platform for managing containerized applications, but its architecture can seem complex to DevOps teams. Let’s break it down into simple terms and explore how tools like Site24x7 can simplify the process of designing and monitoring Kubernetes architecture.

Read Post

Site24x7

Read more about Simplifying Kubernetes architecture for DevOps

TIL - Playwright Supports Simple API Schema Validation!

Feb 13, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he explains how to use Playwright's `expect.any()` and `toMatchObject()` methods to implement a simple and basic API response schema validation.

View Video

Checkly

Read more about TIL - Playwright Supports Simple API Schema Validation!

Learn about cloud waste and 6 effective ways to reduce it

Feb 12, 2025 By CloudSpend In ManageEngine

Cloud waste occurs when cloud resources are unutilized or underutilized. Resource under-utilization occurs when more resources are procured than are actually needed by virtual machines (VMs) at runtime. Cloud providers continue to charge for these provisioned resources regardless of whether they are used or not, resulting in unchecked expenditure.

Read Post

ManageEngine

Read more about Learn about cloud waste and 6 effective ways to reduce it

Out-of-box OpenTelemetry-powered Kafka & Celery monitoring

Feb 12, 2025 By Ankit Anand In SigNoz

Messaging queues power modern distributed systems, handling background tasks, event-driven architectures, and real-time data streaming. However, debugging issues in Kafka and Celery queues has traditionally been a black box, with limited correlation between message producers, consumers, and broker metrics. With OpenTelemetry-powered Kafka & Celery monitoring, SigNoz introduces the industry's first fully integrated observability solution for messaging queues powered by OpenTelemetry.

Read Post

SigNoz

Read more about Out-of-box OpenTelemetry-powered Kafka & Celery monitoring

The Best API Monitoring Tools in 2025: A Complete Guide

Feb 12, 2025 By Patrick Edmonds In uptime

Imagine it's Black Friday and your e-commerce platform suddenly stops processing payments. The culprit? A critical API connection to your payment processor has failed, and you had no idea until angry customers started flooding your support channels. By the time your team identifies and fixes the issue, you’ve already lost thousands in potential sales and damaged your brand reputation.

Read Post

uptime

Read more about The Best API Monitoring Tools in 2025: A Complete Guide

What is Network Response Time & How to Monitor It

Feb 12, 2025 By Alyssa Lamberti In Obkio

In a world where every second counts, one crucial metric that often flies under the radar is: Network Response Time. You might be wondering, "What exactly is network response time, and why should I care about it?" In this blog post, we're going to break down the concept of network response time into digestible bits (pun intended), and we'll explore why it's a game-changer for businesses of all sizes.

Read Post

Obkio

Read more about What is Network Response Time & How to Monitor It

Caution: High Value Information #webinar #sre

Feb 12, 2025 By Catchpoint In Catchpoint

Join us for an exclusive webinar with Ben Good from Google as we explore the findings in the 2024 State of DevOps report. For over a decade, the DORA report has provided critical insights into the capabilities and practices that fuel high-performing technology organizations. This report highlights the significant impact of AI on software development, explores platform engineering’s promises and challenges, and emphasizes user-centricity and stable priorities for organizational success.

View Video

Catchpoint

Monitoring

Read more about Caution: High Value Information #webinar #sre

Server Monitoring with Graphite

Feb 12, 2025 By Elliot Langston In MetricFire

Learning server monitoring is crucial these days if you want to use your servers efficiently. It helps optimize a server's performance and diagnose issues productively. One useful tool is Graphite, which monitors a server’s performance and provides graphing solutions that give you valuable insights into your server. You can explore MetricFire’s Hosted Graphite service today by signing up for a free trial or booking a demo session.

Read Post

MetricFire

Read more about Server Monitoring with Graphite

How to Troubleshoot An Internet Local Loop Issue | Obkio Use Case Series

Feb 12, 2025 By Obkio In Obkio

Is your Internet connection acting up? In this video, we’ll walk you through how to identify and troubleshoot an Internet Local Loop issue using Obkio’s Network Performance Monitoring tool. Learn how to pinpoint the root cause of connectivity problems and ensure a reliable network for your business. What you’ll learn: what an Internet Local Loop is, how to detect Local Loop issues, and how Obkio helps you troubleshoot network problems.

View Video

Obkio

Read more about How to Troubleshoot An Internet Local Loop Issue | Obkio Use Case Series

How to cut costs for metrics and logs: a guide to lowering expenses in Grafana Cloud

Feb 12, 2025 By Daniel Bailey In Grafana

Observability is essential to maintaining system reliability, but as your infrastructure scales, so do your costs. Between metrics and logs, managing telemetry data can become overwhelming and expensive. Grafana Cloud is already designed to be cost-efficient, but scaling can still present cost challenges. The good news? Grafana provides robust tools and best practices to help optimize observability data and rein in spending.

Read Post

Grafana

Read more about How to cut costs for metrics and logs: a guide to lowering expenses in Grafana Cloud

Integrate AppSignal with AWS Fargate in Python Flask

Feb 12, 2025 By Camilo Reyes In AppSignal

In this tutorial, we’ll show you how to integrate AppSignal with a Flask application running on AWS Fargate. Fargate is a serverless container service that allows you to run Docker containers in the cloud. By integrating AppSignal with AWS Fargate, you can monitor the performance of your Flask application and get insights.

Read Post

AppSignal

Read more about Integrate AppSignal with AWS Fargate in Python Flask

Let's Encrypt Ends Expiry Emails - What Now?

Feb 12, 2025 By Sematext In Sematext

Let’s Encrypt has announced that it will no longer send certificate expiration notification emails. If you’ve relied on those emails to remind you when to renew your SSL/TLS certificates, it’s time to rethink your approach.
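One way to replace the reminder emails is to check expiry dates yourself. The sketch below uses only Python's standard `ssl` module to read a server certificate's `notAfter` field and compute the days remaining; the function names and the 30-day threshold in the usage note are assumptions, not a recommendation from the post.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter time, e.g. 'Jun  1 12:00:00 2025 GMT'."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400


def fetch_not_after(hostname, port=443, timeout=5):
    """Read the notAfter field from the certificate a server presents."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `days_until_expiry(fetch_not_after("example.com"))` and raise an alert when the result drops below, say, 30 days. Note that with the default context an already-expired certificate fails the TLS handshake, so that case surfaces as an exception instead of a negative number.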

Read Post

Sematext

Read more about Let's Encrypt Ends Expiry Emails - What Now?

ELK vs New Relic: Which Monitoring Tool Should You Choose in 2025?

Feb 12, 2025 By Pavithra Parthiban In Atatus

Effective observability is crucial for maintaining system performance and reliability. ELK Stack and New Relic are two widely used solutions that offer distinct approaches to monitoring, tracing, and logging. This comparison will help you understand their core features, use cases, and strengths, enabling you to make a more informed decision on which tool best aligns with your organizational goals. Let's get started!

Read Post

Atatus

Read more about ELK vs New Relic: Which Monitoring Tool Should You Choose in 2025?

How to Filter Docker Logs with Grep

Feb 12, 2025 By Anjali Udasi In Last9

Managing logs in Docker can quickly become overwhelming, especially when dealing with multiple containers. If you’ve ever tried to sift through a sea of log entries looking for a specific error or debugging message, you know the struggle. Fortunately, you can pipe docker logs output through grep to filter logs efficiently. This guide breaks down how to use docker logs with grep effectively, including practical examples to help you debug and monitor your containerized applications like a pro.
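On the command line the basic pattern is `docker logs <container> 2>&1 | grep -i error`, where `2>&1` matters because `docker logs` replays the container's stderr on its own stderr. The same filtering can be sketched in Python; the function names and sample log lines below are hypothetical.

```python
import re
import subprocess


def grep_lines(lines, pattern, ignore_case=False):
    """Keep only the lines that match the regular expression, like grep."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [line for line in lines if regex.search(line)]


def docker_logs(container):
    """Return a container's log lines, merging stderr into stdout like `2>&1`."""
    result = subprocess.run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Made-up log lines standing in for real container output.
sample_logs = [
    "2025-02-12T10:00:01Z INFO server started",
    "2025-02-12T10:00:05Z ERROR database connection refused",
    "2025-02-12T10:00:09Z info retrying",
    "2025-02-12T10:00:12Z Error: retry limit reached",
]
errors = grep_lines(sample_logs, r"error", ignore_case=True)
print(errors)
```

With Docker installed, `grep_lines(docker_logs("my-app"), "error", ignore_case=True)` would apply the same filter to a container assumed to be named `my-app`.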

Read Post

Last9

Read more about How to Filter Docker Logs with Grep

Ubuntu System Logs: How to Find and Use Them

Feb 12, 2025 By Anjali Udasi In Last9

System logs play a crucial role in debugging and monitoring in Ubuntu. When a service misbehaves or an unexpected crash happens, logs hold the answers. They’re also great for keeping an eye on system performance. Knowing how to access, read, and manage these logs can save you hours of troubleshooting. This guide covers everything you need to know about Ubuntu system logs—from where they’re stored to how to analyze them efficiently.

Read Post

Last9

Read more about Ubuntu System Logs: How to Find and Use Them

eG Innovations' AIOps-Powered APM

Feb 12, 2025 By Swaminathan J In eG Innovations

I recently wrote about how eG Innovations' AIOps-powered monitoring benefits those working with Digital Workspaces – today I’ll cover how those same AIOps (Artificial Intelligence for IT Operations) capabilities also make the eG Enterprise platform a leader in the APM (Application Performance Monitoring) space. The eG Enterprise platform is equipped with capabilities for automated corrective actions, event-based triggers, and remote-control functionalities.

Read Post

eG Innovations

Read more about eG Innovations' AIOps-Powered APM

Traces Without Limits - Load a Million Spans with SigNoz | Launch Week 3.0 Day 2

Feb 12, 2025 By SigNoz In SigNoz

SigNoz is now the only open-source distributed tracing tool capable of loading a million spans seamlessly. With our latest improvements, users can navigate, analyze, and debug even the largest traces effortlessly.

View Video

SigNoz

Read more about Traces Without Limits - Load a Million Spans with SigNoz | Launch Week 3.0 Day 2

Deeper Trace Analytics - Quickly search through all spans, entry spans and root spans

Feb 12, 2025 By SigNoz In SigNoz

Debugging distributed systems can often feel like searching for a needle in a haystack. When issues arise, devs need faster ways to pinpoint critical spans within their traces. With our latest Deeper Trace Analytics update, we now enable powerful filtering for root and entry spans — making it significantly easier to analyze and debug distributed traces.

View Video

SigNoz

Read more about Deeper Trace Analytics - Quickly search through all spans, entry spans and root spans

Reducing the Costs and Operational Overhead of Kafka Infrastructures

Feb 12, 2025 By Sean Riley In meshIQ

Kafka is powerful. No doubt about it. But it’s also a beast when it comes to operational complexity and cost. What starts as a simple deployment quickly turns into a resource-hungry system that eats up engineering hours, compute power, and budget. Let’s consider a company that eagerly rolls out Kafka to streamline event streaming. Year one? Smooth sailing. Everything runs fine, and the team feels great. Year two? The cracks start to show.

Read Post

meshIQ

Read more about Reducing the Costs and Operational Overhead of Kafka Infrastructures

Deeper Trace Analytics - Analyze Root & Entry Spans with Ease | SigNoz Launch Week 3.0 Day 4

Feb 12, 2025 By SigNoz In SigNoz

View Video

SigNoz

Read more about Deeper Trace Analytics - Analyze Root & Entry Spans with Ease | SigNoz Launch Week 3.0 Day 4

The top 5 network security threats every CIO should know in 2025

Feb 12, 2025 By Rama Venkatesan In Site24x7

During a routine network check, your network bandwidth monitoring tool flags an unusual spike in bandwidth usage from a critical server. Further investigation reveals an unauthorized data transfer attempt originating from a misconfigured device. What would have happened if the IT team did not have a monitoring tool to identify the spike? Without the right tools, this simple red flag could escalate into a costly disaster: ransomware, compliance fines, or even operational paralysis.

Read Post

Site24x7

Read more about The top 5 network security threats every CIO should know in 2025

Getting started with SCOM dashboards

Feb 12, 2025 By Sameer Mhaisekar In Squared Up

In this blog, we will use the SquaredUp Cloud SCOM plugin to connect to our SCOM Management Group and take a look at what we get out of the box. SquaredUp Cloud is a data visualization tool that can connect to 70+ data sources – perfect for bringing varied data together in a single pane of glass. Display your SCOM data alongside other important metrics.

Read Post

Squared Up

Read more about Getting started with SCOM dashboards

The Modern Data Center: How AI is Reshaping Infrastructure

Feb 12, 2025 By LogicMonitor In LogicMonitor

The traditional data center is undergoing a dramatic transformation. As artificial intelligence reshapes industries from healthcare to financial services, it’s not just the applications that are changing—the very infrastructure powering these innovations requires a fundamental rethinking. Today’s data center bears little resemblance to the server rooms of the past.

Read Post

LogicMonitor

Read more about The Modern Data Center: How AI is Reshaping Infrastructure

Reducing the Costs and Operational Overhead of Apache Kafka Infrastructures

Feb 12, 2025 By Sean Riley In meshIQ

The Hidden Costs of Apache Kafka: Apache Kafka is powerful. No doubt about it. But it’s also a beast when it comes to operational complexity and cost. What starts as a simple deployment quickly turns into a resource-hungry system that eats up engineering hours, compute power, and budget. Let’s consider a company that eagerly rolls.

Read Post

meshIQ

Read more about Reducing the Costs and Operational Overhead of Apache Kafka Infrastructures

Trusted by customers, recognized on Gartner Peer Insights

Feb 12, 2025 By ManageEngine

ManageEngine recognized as a Customers' Choice in the 2024 Gartner Peer Insights™ Voice of the Customer for Observability Platforms.

Get Report

ManageEngine

Monitoring

Read more about Trusted by customers, recognized on Gartner Peer Insights

Bridging identity management with operations

Feb 12, 2025 By ManageEngine

Monitoring solutions can help you eliminate AD service outages for good. Learn how you can implement proactive monitoring in your Active Directory environment with our free ebook!

Get EBook

ManageEngine

Monitoring

Read more about Bridging identity management with operations

Only 4% of public sector IT professionals think graduates are ready for real-world roles

Feb 11, 2025 By SolarWinds In SolarWinds

A lack of adequate graduates is a major reason why two-fifths (40%) also believe there is a significant skills gap in the public sector IT industry.

Read Post

SolarWinds

Read more about Only 4% of public sector IT professionals think graduates are ready for real-world roles

OpenTelemetry-Powered Infrastructure Monitoring - SigNoz Launch Week 3.0 Day 1

Feb 11, 2025 By SigNoz In SigNoz

Today, we’re excited to announce a much-awaited feature in SigNoz: Infrastructure Monitoring. With our latest OpenTelemetry-powered Infra Monitoring, we bring you a native OpenTelemetry experience that seamlessly integrates infrastructure metrics with application performance data.

View Video

SigNoz

Read more about OpenTelemetry-Powered Infrastructure Monitoring - SigNoz Launch Week 3.0 Day 1

Early Warning in AIOps from HEAL Software: The Key to Preventing Downtime

Feb 11, 2025 By Raja Shekar Mulpuri In HEAL Software

The answer is yes. But, as with any AI solution, the reality is more nuanced. At HEAL Software, we have spent years perfecting our Early Warning feature by analyzing anonymized data from thousands of global customers and collaborating with IT leaders across industries. AIOps isn’t just a buzzword—it’s a necessity for modern enterprises looking to minimize downtime and enhance operational efficiency.

Read Post

HEAL Software

Read more about Early Warning in AIOps from HEAL Software: The Key to Preventing Downtime

What is Synthetic Monitoring: The Secret Sauce to Network Monitoring

Feb 11, 2025 By Alyssa Lamberti In Obkio

Picture this: You're the IT manager at a large company, and you're responsible for ensuring that your network is running smoothly. But how do you know if everything is working as it should be? You could wait for someone to report a problem, but that's reactive and not ideal. You could monitor your network constantly, but that's impractical and time-consuming. So what's the solution? Enter synthetic monitoring, the secret sauce to network monitoring.

Read Post

Obkio

Read more about What is Synthetic Monitoring: The Secret Sauce to Network Monitoring

Distributed Tracing 101: Definition, Working and Implementation

Feb 11, 2025 By Anjali Udasi In Last9

Modern applications rely on microservices, making it tough to track issues across services. Distributed tracing helps by mapping a request’s journey and pinpointing latency, failures, and dependencies. Unlike traditional monitoring, tracing connects the dots between services, offering deeper visibility. But implementing it isn’t easy—it brings high data volumes, performance overhead, and complexity.

Read Post

Last9

Read more about Distributed Tracing 101: Definition, Working and Implementation

AWS CSPM Explained: How to Secure Your Cloud the Right Way

Feb 11, 2025 By Anjali Udasi In Last9

As organizations expand their AWS footprint, maintaining visibility and control over configurations can be challenging. Misconfigurations, unnoticed vulnerabilities, and compliance gaps can create serious security risks. AWS Cloud Security Posture Management (CSPM) helps teams navigate these challenges by automating security checks, ensuring compliance, and providing continuous monitoring. Here’s what you need to know about AWS CSPM and why it’s essential for securing your cloud environment.

Read Post

Last9

Read more about AWS CSPM Explained: How to Secure Your Cloud the Right Way

Monitoring Kubernetes Resource Usage with kubectl top

Feb 11, 2025 By Ujjwal Goyal In Last9

Efficient resource utilization is key to running Kubernetes workloads smoothly. Whether you're troubleshooting performance issues, optimizing resource requests and limits, or keeping an eye on cluster health, the kubectl top command is an essential tool. It provides real-time CPU and memory usage metrics for nodes and pods, helping you make informed decisions about scaling and resource allocation.

Read Post

Last9

Read more about Monitoring Kubernetes Resource Usage with kubectl top

Never Stand Watch Alone: Apica is the Always-On Partner for SREs

Feb 11, 2025 By Lori Bertelli In Apica

As we navigate through 2025, Site Reliability Engineers face unprecedented challenges in maintaining system reliability and performance at scale. With the rapid evolution of distributed systems, containerization, and AI-driven operations, SREs need more sophisticated tools than ever to successfully do their job of serving as grid guardians.

Read Post

Apica

Read more about Never Stand Watch Alone: Apica is the Always-On Partner for SREs

Uptime Monitoring Arrives in Sentry!

Feb 11, 2025 By Sentry In Sentry

Uptime Monitoring in Sentry has arrived! Cody from Sentry's developer experience team takes you through how to configure uptime monitors to track website status, and how Uptime Monitoring uses tracing to help troubleshoot and debug issues.

View Video

Sentry

Monitoring

Read more about Uptime Monitoring Arrives in Sentry!

Stop Losing Sales! The Biggest UX Friction Traps in eCommerce

Feb 11, 2025 By Germain UX Team In Germain UX

Friction in eCommerce is a silent sales killer. When customers hit roadblocks—slow pages, confusing layouts, unnecessary steps—they ditch their carts and move on. The problem? Many online stores create friction without even realizing it. But here’s the deal: Not all friction is the same. Some comes from clunky tech, while other issues stem from poor design choices or pushy sales tactics.

Read Post

Germain UX

Read more about Stop Losing Sales! The Biggest UX Friction Traps in eCommerce

Continuing the DEX Journey: Building on Success

Feb 11, 2025 By Nexthink In Nexthink

View Video

Nexthink

Read more about Continuing the DEX Journey: Building on Success

9 Reasons Your Business Needs Continuous Network Monitoring in 2025

Feb 11, 2025 By Arpit Sharma In Motadata

Numerous technological advancements have made it easier to conduct financial transactions and business. However, cyber-attacks and network inefficiency remain a threat. That’s why your business needs continuous network monitoring. Keeping constant watch over the IT infrastructure of your business is crucial for its survival. It would be very disappointing for your thriving enterprise to come crashing down due to easily thwarted threats that went unnoticed.

Read Post

Motadata

Read more about 9 Reasons Your Business Needs Continuous Network Monitoring in 2025

Traces Without Limits - Load a Million Spans with SigNoz

Feb 11, 2025 By Ankit Anand In SigNoz

Observability at scale is challenging—especially when dealing with high-volume distributed traces. Traditional tracing tools struggle with large traces containing thousands of spans, often leading to sluggish UIs and an unmanageable debugging experience. Most tracing tools we checked have a limit on the maximum spans they can load for a single trace. But with SigNoz, we’ve redefined what’s possible.

Read Post

SigNoz

Read more about Traces Without Limits - Load a Million Spans with SigNoz

Why LogicMonitor is best for network monitoring

Feb 11, 2025 By LogicMonitor In LogicMonitor

As modern networks evolve into intricate ecosystems spanning on-premises, cloud, and hybrid environments, the need for a robust, scalable monitoring solution has never been greater. Organizations face the challenge of maintaining performance, minimizing downtime, and managing ever-increasing complexity.

Read Post

LogicMonitor

Read more about Why LogicMonitor is best for network monitoring

5 Ways to Avoid Alert Fatigue in Network Monitoring

Feb 11, 2025 By LogicMonitor In LogicMonitor

Alert fatigue is the silent productivity killer in IT operations, and its impact is more significant than you might think. A 2023 survey by CloudHealth Technologies found that 63% of organizations deal with over 1,000 cloud infrastructure alerts every single day. 22% report receiving more than 10,000 alerts each day. This highlights the critical need to minimize alert fatigue.

Read Post

LogicMonitor

Read more about 5 Ways to Avoid Alert Fatigue in Network Monitoring

Logz.io Open 360 Platform Overview

Feb 11, 2025 By logz.io In logz.io

Welcome to Logz.io, where we make monitoring, troubleshooting, and optimizing your systems easier than ever. Our AI-driven observability platform helps you: ingest and manage your logs effortlessly; analyze and visualize data with powerful filtering & alerting; pinpoint root causes instantly with AI-powered RCA; optimize observability costs with DataHub; and ensure peak system performance with Kubernetes 360 & App 360.

View Video

logz.io

Read more about Logz.io Open 360 Platform Overview

Get your IT together with Auvik

Feb 11, 2025 By Auvik In Auvik

At Auvik, we simplify IT management by giving you greater visibility into your network, devices, and SaaS apps. Our software helps you navigate the complexities of deploying laptops, monitoring traffic, troubleshooting issues, and more—so you can manage your infrastructure seamlessly. Let Auvik be your compass to IT success, free of friction.

View Video

Auvik

Read more about Get your IT together with Auvik

Out-of-the-box OpenTelemetry-powered Kafka & Celery monitoring | SigNoz Launch Week 3.0 Day 3

Feb 11, 2025 By SigNoz In SigNoz

Today, we are excited to announce OpenTelemetry-powered messaging queue monitoring in SigNoz. Debugging issues in Kafka and Celery queues has traditionally been a black box, with limited correlation between message producers, consumers, and broker metrics. With our messaging queue monitoring, teams can correlate Kafka broker metrics with OpenTelemetry spans, enabling deep insights into consumer lag, throughput, drop rates, and performance bottlenecks.

View Video

SigNoz

Read more about Out-of-the-box OpenTelemetry-powered Kafka & Celery monitoring | SigNoz Launch Week 3.0 Day 3

Challenges and Best Practices for Monitoring SaaS-based Businesses

Feb 11, 2025 By Dotcom-Monitor In Dotcom-Monitor

SaaS offers convenience, scalability, and ease of access, which makes it a powerful choice for businesses of all sizes. However, monitoring SaaS applications presents unique challenges that can impact performance, security, and user experience if not handled properly. To maintain client trust and meet Service Level Agreements (SLAs), SaaS providers must implement a proactive monitoring strategy.

Read Post

Dotcom-Monitor

Read more about Challenges and Best Practices for Monitoring SaaS-based Businesses

How to Choose the Right Network Monitoring Tool: 7 Essential Factors

Feb 11, 2025 By LogicMonitor In LogicMonitor

Half of all server failures lead to staff working overtime, driving up costs and highlighting the critical need for effective monitoring. This underscores the importance of choosing the right network monitoring tool. It is a critical decision that impacts not only how well your infrastructure performs today but also how easily it can scale and adapt in the future. A comprehensive monitoring solution needs to balance deep technical capabilities with ease of use and scalability.

Read Post

LogicMonitor

Read more about How to Choose the Right Network Monitoring Tool: 7 Essential Factors

Think proactive monitoring for Teams Phone is too good to be true? Think again.

Feb 11, 2025 By Mia Martello In Martello Technologies

Collaboration platforms like Microsoft Teams are absolutely central to how enterprises get business done these days. But sometimes the fastest, most direct way to answer a question, solve a problem or make a connection is still to pick up the phone and call. The value of solutions like Microsoft Teams Phone is that they offer the best of both worlds: the simplicity and efficiency of voice communication integrated with digital collaboration tools and capabilities.

Read Post

Martello Technologies

Read more about Think proactive monitoring for Teams Phone is too good to be true? Think again.

Resolving Redis connection issues with comprehensive log review

Feb 11, 2025 By Subashree K In Site24x7

Redis is a highly efficient, versatile in-memory data store that is commonly utilized in modern applications. However, like any technology, it is not without its challenges, particularly when it comes to managing connections. By systematically reviewing Redis logs, you can diagnose and resolve these problems effectively. This blog provides an overview of Redis logs, explores their importance, and highlights how log management tools can simplify troubleshooting.

Read Post

Site24x7

Read more about Resolving Redis connection issues with comprehensive log review

Resolving Kafka consumer lag with detailed consumer logs for faster processing

Feb 11, 2025 By Subashree K In Site24x7

Apache Kafka is a distributed event streaming platform designed to handle large volumes of real-time data. It is widely used for messaging, logging, event processing, and real-time analytics. Kafka is known for its ability to handle high throughput, fault tolerance, and scalability, making it an essential tool for modern data-driven applications. Kafka operates with three main components: Latency refers to the time delay between when a message is produced and when it is consumed.

Read Post

Site24x7

Read more about Resolving Kafka consumer lag with detailed consumer logs for faster processing

Understanding the Observability Data Lifecycle: From Data Ingestion to Automated Actions

Feb 11, 2025 By ScienceLogic In ScienceLogic

Modern IT estates are increasingly complex, generating vast amounts of data – some critical and actionable, but much of it mere noise. Extracting meaningful insights to ensure optimal system health and IT performance is beyond the scope of humans. This is where observability, enhanced by AI and automation, becomes essential.

Read Post

ScienceLogic

Read more about Understanding the Observability Data Lifecycle: From Data Ingestion to Automated Actions

The Three Pillars of Network Monitoring: A Holistic Strategy

Feb 11, 2025 By LogicMonitor In LogicMonitor

To truly safeguard your infrastructure, it’s crucial to adopt a holistic strategy that covers every aspect of your network’s health and performance. This means integrating fault monitoring, performance monitoring, and availability monitoring into a comprehensive strategy. Let’s discuss how a well-rounded approach to network monitoring can help you maintain resilience, optimize performance, and prevent downtime.

Read Post

LogicMonitor

Read more about The Three Pillars of Network Monitoring: A Holistic Strategy

Monitor Google Cloud: simplify and centralize your cloud provider observability with Grafana Cloud

Feb 11, 2025 By Vasil Kaftandzhiev In Grafana

Organizations increasingly rely on Google Cloud to power critical parts of their businesses, but managing those environments often involves navigating a labyrinth of disparate data, tools, and processes. We built Google Cloud Observability in Grafana Cloud to reduce the complexity and confusion by providing a unified, scalable solution designed to simplify monitoring, enhance visibility, and optimize costs.

Read Post

Grafana

Read more about Monitor Google Cloud: simplify and centralize your cloud provider observability with Grafana Cloud

Your App Might Be Down; Let's Fix It - Introducing Sentry Uptime Monitoring

Feb 11, 2025 By Sasha Blumenfeld In Sentry

Even at Sentry, we're not immune to downtime. In a moment of "oh-the-irony," we once took down our own application with a bad migration. We were adding a field to a critical database table, and the migration locked it completely. Since this table was essential to Sentry’s operation, the entire app went down. The website wouldn’t load, ingestion paused—everything ground to a halt.

Read Post

Sentry

Read more about Your App Might Be Down; Let's Fix It - Introducing Sentry Uptime Monitoring

Right Data, Right Now: Why Timely, Actionable Network Observability is Essential

Feb 11, 2025 By Alec Pinkham In Broadcom

For teams in many organizations, the work of IT and network management keeps getting more difficult. A recent EMA survey offers some findings that clearly illustrate this point. When respondents were asked which networking skills are the most difficult to find, several roles received a response of 30% or more, including network security, network monitoring and troubleshooting, and data center networking.

Read Post

Broadcom

Read more about Right Data, Right Now: Why Timely, Actionable Network Observability is Essential

NiCE VMware Management Pack 5.8

Feb 10, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

Great news for all VMware users! NiCE just rolled out VMware Management Pack 5.8 for Microsoft SCOM, bringing full support for VMware vSphere 8.0.1, 8.0.2, and 8.0.3. This update keeps your monitoring sharp and up-to-date with the latest VMware environments. Plus, we’ve polished up the docs to make life easier. If you’re an existing customer, this update is ready for you! Stay ahead of the game and keep your virtual environments running smoothly!

Read Post

NiCE IT Mgmt

Read more about NiCE VMware Management Pack 5.8

Log Levels: Answers to the Most Common Questions

Feb 10, 2025 By Anjali Udasi In Last9

Logging is essential for understanding what’s happening inside your software. It helps developers and operators catch issues, monitor system health, and track application behavior. A big part of logging is log levels—these indicate how serious a message is, from routine updates to critical errors. In this post, we’ll break down everything you need to know about log levels, how they compare to Syslog log levels, and best practices for making the most of your logs.

Read Post

Last9

Read more about Log Levels: Answers to the Most Common Questions

The Ultimate Guide to OpenTelemetry Visualization

Feb 10, 2025 By Prathamesh Sonpatki In Last9

Modern software systems are complex, with multiple services interacting across different environments. Understanding how they behave—tracking performance, identifying bottlenecks, and diagnosing failures—requires more than just collecting data. OpenTelemetry provides a standardized way to gather logs, metrics, and traces, but the real value comes from making that data easy to interpret through visualization.

Read Post

Last9

Read more about The Ultimate Guide to OpenTelemetry Visualization

Cloud Surfing: Entering a Resilient Cloud Era

Feb 10, 2025 By Splunk In Splunk

Unlock the secrets to success in the cloud era. Hear from Splunk's Tom Stoner and The Futurum Group's Daniel Newman as they delve into the pivotal roles of security, consideration and adaptability in shaping organizations' journey to the cloud.

View Video

Splunk

Read more about Cloud Surfing: Entering a Resilient Cloud Era

Catching the Wave of Efficiency with Splunk Cloud Platform

Feb 10, 2025 By Splunk In Splunk

Curious about how the Splunk Cloud Platform boosts efficiency? Nemi George, VP of Information Security at Pacific Dental Services (PDS), shares how Splunk brings intelligence to the table and helps PDS work faster.

View Video

Splunk

Read more about Catching the Wave of Efficiency with Splunk Cloud Platform

Why Migrate to Splunk Cloud Platform video

Feb 10, 2025 By Splunk In Splunk

Catch the wave to navigate sensitive data and maintain compliance with the Splunk Cloud Platform. Splunk's Kuntal Das showcases the top reasons to migrate to the cloud and how we help you do it with confidence.

View Video

Splunk

Read more about Why Migrate to Splunk Cloud Platform video

OpenTelemetry-Powered Infrastructure Monitoring

Feb 10, 2025 By Ankit Anand In SigNoz

Today, we’re excited to announce a much-awaited feature in SigNoz: Infrastructure Monitoring, built natively on OpenTelemetry. Infrastructure monitoring is a critical aspect of modern observability. Without proper visibility into your infrastructure resources, troubleshooting issues, optimizing costs, and maintaining performance become challenging.

Read Post

SigNoz

Read more about OpenTelemetry-Powered Infrastructure Monitoring

Network Loops: The Good, the Bad and the Ugly

Feb 10, 2025 By Ryan LaFlamme In Auvik

Network loops are a bit of a boogeyman. Loops are necessary for a well-designed, resilient network – it’s the way to achieve redundancy with switches. Done right, loops protect your network from outages. Done wrong, a loop causes a major outage.

Read Post

Auvik

Read more about Network Loops: The Good, the Bad and the Ugly

Out-of-box OpenTelemetry-powered Kafka & Celery monitoring

Feb 10, 2025 By SigNoz In SigNoz

Out-of-box OpenTelemetry-powered Kafka & Celery monitoring.

View Video

SigNoz

Read more about Out-of-box OpenTelemetry-powered Kafka & Celery monitoring

OpenTelemetry Powered Infrastructure Monitoring Demo

Feb 10, 2025 By SigNoz In SigNoz

OpenTelemetry Powered Infrastructure Monitoring Demo.

View Video

SigNoz

Read more about OpenTelemetry Powered Infrastructure Monitoring Demo

Query the Latest Values in Under 10ms with the InfluxDB 3 Last Value Cache

Feb 10, 2025 By Scott Anderson In InfluxData

As part of the InfluxDB 3 Core and InfluxDB 3 Enterprise public alpha, the Last Value Cache (LVC) is available for testing. The LVC lets you cache the most recent values for specific fields in a table, improving the performance of queries that return the most recent value of a field for specific time series or the last N values of a field, typical of many monitoring workloads. With the LVC, these types of queries return in under 10ms.

Read Post

InfluxData

Read more about Query the Latest Values in Under 10ms with the InfluxDB 3 Last Value Cache

Cloud Monitoring's Blind Spot: The User Perspective

Feb 10, 2025 By Brandon DeLap In Catchpoint

The evolution of internet-centric application delivery has worsened IT's visibility gaps into what impacts an end user's experience. This problem is exacerbated when these gaps lead to negative business consequences, such as loss of revenue or lower Net Promoter Scores (NPS). The need to address this worsening visibility gap problem is reinforced by Gartner’s recent publication of its first Magic Quadrant for Digital Experience Monitoring (DEM).

Read Post

Catchpoint

Read more about Cloud Monitoring's Blind Spot: The User Perspective

What is Platform Engineering and Why is it Important?

Feb 10, 2025 By Wendy Howard In eG Innovations

Without the right frameworks in place, software development often feels like managing a project with too many moving parts and no cohesive plan. A good solution to this problem would be having a unified platform that streamlines processes, integrates tools, and provides consistency across the development lifecycle. That’s what platform engineering offers—it simplifies the complexities of software development by making it easier to build, deploy, and maintain digital infrastructure.

Read Post

eG Innovations

Read more about What is Platform Engineering and Why is it Important?

Stop Logging the Request Body!

Feb 10, 2025 By Martin Thwaites In Honeycomb

With more and more people adopting OpenTelemetry and specifically using the tracing signal, I’ve seen an uptick in people wanting to add the entire request and response body as an attribute. This isn’t ideal, just as it wasn’t when people were logging the body as text logs. In this blog post, I’ll explain why this is a bad idea, what the pitfalls are, and more importantly, what you should do instead.

Read Post

Honeycomb

Read more about Stop Logging the Request Body!

New Relic vs Kibana: A Guide to Choosing the Right Tool in 2025

Feb 10, 2025 By Pavithra Parthiban In Atatus

New Relic and Kibana are popular monitoring and observability tools that provide a wide range of features for analyzing and visualizing data. In this post, I have compared New Relic and Kibana based on key aspects such as data ingestion, dashboards and visualizations, log management, alerting, pricing and more. Let's take a look at each tool's capabilities, strengths, and weaknesses to help you understand how they differ and which one is best suited to your needs.

Read Post

Atatus

Read more about New Relic vs Kibana: A Guide to Choosing the Right Tool in 2025

Grafana Beyla 2.0: distributed traces, scalable Kubernetes deployments, and more

Feb 10, 2025 By Nikola Grcevski In Grafana

In November 2023, we released Grafana Beyla 1.0, the first major milestone in our pursuit of zero-code (and zero-effort) eBPF instrumentation. We delivered a way, through a single command line, to automatically instrument any application supporting HTTP/gRPC protocols, as well as provide basic network packet flow information.

Read Post

Grafana

Read more about Grafana Beyla 2.0: distributed traces, scalable Kubernetes deployments, and more

Observe Your Google Cloud Infrastructure | Demo: New Grafana Cloud Application | Grafana Labs

Feb 10, 2025 By Grafana In Grafana

Want to monitor your Google Cloud infrastructure more effectively? Join Vasil Kaftandzhiev as he introduces Grafana Cloud’s new application designed specifically for Google Cloud observability. In this video, you'll discover how to: optimize and troubleshoot your Google Cloud services; leverage out-of-the-box dashboards with key metrics and thresholds; set up comprehensive alerting for real-time incident response; streamline log management with an all-in-one logs view for faster root cause analysis; and configure logs and metrics effortlessly using Grafana Alloy.

View Video

Grafana

Read more about Observe Your Google Cloud Infrastructure | Demo: New Grafana Cloud Application | Grafana Labs

WhatsUp Gold SNMP Table Active Monitor

Feb 10, 2025 By WhatsUp Gold In WhatsUp Gold

Watch this video to learn how to monitor the dynamic SNMP tabular data on your network devices using WhatsUp Gold, even when the number of instances varies. As an example, we’ll demo how to create an active monitor for the percentage of disk space currently in use on a storage device.

View Video

WhatsUp Gold

Read more about WhatsUp Gold SNMP Table Active Monitor

Faster Android Debugging using Sentry - Demo

Feb 10, 2025 By Sentry In Sentry

Join Sentry Solutions Engineer Simon Zhong for an introductory demo on accelerating Android debugging with Sentry.

View Video

Sentry

Read more about Faster Android Debugging using Sentry - Demo

How To Monitor Kubernetes with Splunk Infrastructure Monitoring

Feb 10, 2025 By Caitlin Halla In Splunk

Kubernetes is the standard for orchestrating containerized microservices — but it can present some monitoring challenges. Luckily, we’ve already covered why monitoring Kubernetes is a must-do, the basics of how to do it, and the options you have for collecting monitoring data from a K8s environment.

Read Post

Splunk

Read more about How To Monitor Kubernetes with Splunk Infrastructure Monitoring

Solve Problems Faster with New, Smarter AI and Integrations in Splunk Observability

Feb 10, 2025 By Connor Tye In Splunk

As businesses scale across hybrid and multi-cloud environments and integrate AI-powered technologies, complexity grows — and with it, the risk of performance degradation and cost of downtime. To avoid facing customer-impacting IT issues, organizations need better ways to correlate data across environments, detect anomalies before they escalate, and resolve incidents more efficiently. That’s where Splunk and Cisco come in.

Read Post

Splunk

Read more about Solve Problems Faster with New, Smarter AI and Integrations in Splunk Observability

From Datadog to Grafana Cloud: Why companies migrate and how it changes business for the better

Feb 10, 2025 By Michelle Tan In Grafana

“Impossibly expensive.” “Generic database metrics.” “Exceeding limits.” “No transparency.” These are the words our customers use to explain why they looked for a Datadog alternative and migrated onto Grafana Labs’ observability solutions. Grafana Cloud provided the scalability that LexisNexis Risk Solutions needed to migrate acquired companies into a unified observability platform. “We’ve had migrations from Datadog.

Read Post

Grafana

Read more about From Datadog to Grafana Cloud: Why companies migrate and how it changes business for the better

How to visualize user journeys with Site24x7 to spot opportunities to improve the UX

Feb 10, 2025 By Ramkumar Ramaswamy In Site24x7

Before judging anyone, walk a mile in their shoes. This is a great idiom that emphasizes the importance of experiencing what your customers experience when you offer a service. With empathy, IT product owners can ensure that their operations take into account user journeys to be responsive and responsible.

Read Post

Site24x7

Read more about How to visualize user journeys with Site24x7 to spot opportunities to improve the UX

Switching things up with Auvik

Feb 10, 2025 By Auvik In Auvik

Switch Gears This Year! Tolerating tedious and slow manual network processes is soooo 2024.

View Video

Auvik

Read more about Switching things up with Auvik

4 Best Backlink Indexing Tools in 2025 (SEO Indexers Guide)

Feb 10, 2025 By OpsMatters In OpsMatters

Website owners and SEO professionals need effective tools to speed up search engine recognition of their backlinks. This review analyzes the leading backlink indexers available in 2025, testing their performance metrics, costs, and technical capabilities. We've tested Giga Indexer, Rapid URL Indexer, Backlink Indexing Tool, and Indexceptional to determine how well each one handles the indexing process.

Read Post

OpsMatters

Read more about 4 Best Backlink Indexing Tools in 2025 (SEO Indexers Guide)

Coralogix Releases eBPF Observability for K8s Workloads

Feb 9, 2025 By Chris Cooney In Coralogix

There are several big barriers to an effective tracing strategy. Modern applications require complex code instrumentation, legacy applications might not be so easy to alter, and that’s assuming every engineering team can be engaged to make the necessary changes. eBPF & OpenTelemetry flip this entire problem on its head, and Coralogix is one of the first major observability platforms to leverage this exciting functionality to provide an unobtrusive, low-risk overview of your system.

Read Post

Coralogix

Read more about Coralogix Releases eBPF Observability for K8s Workloads

Unveiling Azure's Hidden Costs: What You Need to Know

Feb 9, 2025 By Gil Gross In Anodot

So, you’re new to the cloud or just starting off with Azure. You’re probably starting your first project and using the Azure Calculator to help estimate your monthly run rate. The problem is that Azure, like all clouds, has hidden costs. So why does the cloud have hidden costs? Well, while we call them hidden costs, it’s really more a matter of unexpected costs or unknown costs.

Read Post

Anodot

Read more about Unveiling Azure's Hidden Costs: What You Need to Know

Cloud storage: Walkthrough, challenges and solutions

Feb 9, 2025 By Geoffrin Edwin In Site24x7

Cloud storage has become an integral part of enterprise IT infrastructure. Cloud engineers, SREs, SysAdmins, and CTOs are always on the lookout for more avenues to keep their organization's data secure, accessible, and managed. In this blog post, let us explain cloud storage in detail, the associated challenges, and how to overcome them.

Read Post

Site24x7

Read more about Cloud storage: Walkthrough, challenges and solutions

Strategic IP address management (IPAM): A must-have solution for high volume networks

Feb 9, 2025 By Rama Venkatesan In Site24x7

Managing enterprise IT infrastructure isn’t just about staying afloat—it’s about being one step ahead with strategic IP address management in modern enterprise IT. Each day, IT teams grapple with network sprawl, security challenges, and the constant demand for scalability. But here’s a question: how does your enterprise manage its IP address space? If your answer is “manually” or “through spreadsheets,” it’s time to rethink your approach.

Read Post

Site24x7

Read more about Strategic IP address management (IPAM): A must-have solution for high volume networks

Navigating AWS policy changes in 2025: The role of CloudSpend in mitigating the impacts

Feb 7, 2025 By CloudSpend In ManageEngine

AWS SP/RI policy changes: In a significant move, AWS announced policy changes to the use of Reserved Instances (RIs) and Savings Plans (SPs), which are set to take effect on June 1, 2025. These changes are particularly crucial for MSPs, resellers, and other organizations that rely on shared RIs and SPs to manage cloud costs.

Read Post

ManageEngine

Read more about Navigating AWS policy changes in 2025: The role of CloudSpend in mitigating the impacts

SolarWinds to be acquired by Turn/River Capital

Feb 7, 2025 By SolarWinds In SolarWinds

SolarWinds shareholders to receive $18.50 per share in cash, with a total enterprise value of $4.4 billion; SolarWinds to become a privately held company upon completion of the transaction.

Read Post

SolarWinds

Read more about SolarWinds to be acquired by Turn/River Capital

Managed OpenSearch: Pricing and How Logit.io is the Best Value

Feb 7, 2025 By Lee Smith In Logit.io

If you’re considering OpenSearch for your search and analytics infrastructure, the first question that likely comes to mind is: what will it cost? OpenSearch, the powerful, open-source search engine and analytics platform, provides a highly scalable solution for businesses. However, while the software itself is free to use, there are still costs associated with hosting, maintaining, and scaling OpenSearch clusters.

Read Post

Logit.io

Read more about Managed OpenSearch: Pricing and How Logit.io is the Best Value

Why Cybersecurity Asset Management is Crucial for Cyber Hygiene

Feb 7, 2025 By Teneo In Teneo

The concept of managing IT assets for security purposes has been around since the earliest days of computer networks in business. The term “Cybersecurity Asset Management (CAM)” itself is relatively new, but Teneo have been opening minds to CAM for some time now. Here is a summary of what it is and why it’s so important as part of maintaining good Cyber Hygiene.

Read Post

Teneo

Read more about Why Cybersecurity Asset Management is Crucial for Cyber Hygiene

How to Debug and Log in PHP

Feb 7, 2025 By Richard C. In Sentry

This guide explains how errors work in PHP and how to debug them efficiently using logging functions and Sentry. The information in this guide is correct for PHP 8 and perhaps above, depending on how much future PHP versions change.

Read Post

Sentry

Read more about How to Debug and Log in PHP

Casio UK Hit With Payment Skimming Attack

Feb 7, 2025 By Georgina Grant-Muller In RapidSpike

In early February 2025, reports emerged of a sophisticated web skimming attack that compromised the UK website of electronics manufacturer Casio, and at least 16 other ecommerce sites. This Magecart-style breach led to the theft of customers’ personal and payment information, highlighting the persistent threat of digital skimming to online retailers.

Read Post

RapidSpike

Read more about Casio UK Hit With Payment Skimming Attack

Speed up resolving iOS issues using Sentry

Feb 7, 2025 By Sentry In Sentry

Join Sentry Solutions Engineer Karan Pujji for an introductory demo on accelerating iOS debugging with Sentry. Watch the demo to see Sentry in action.

View Video

Sentry

Monitoring

Read more about Speed up resolving iOS issues using Sentry

How Azure Observability Optimizes Performance and Monitoring

Feb 7, 2025 By Anjali Udasi In Last9

Observability in Azure isn’t just about tracking metrics—it’s about truly understanding how your cloud infrastructure, applications, and services are performing. It helps you spot issues before they become problems, optimize performance, and ensure security. In this guide, we’ll break down Azure Observability in a way that’s easy to follow, covering key concepts, best practices, and some useful tricks to give you an edge.

Read Post

Last9

Read more about How Azure Observability Optimizes Performance and Monitoring

Releasing Icinga for Windows v1.13.0

Feb 7, 2025 By Christian Stein In Icinga

Today we are happy to announce that we released Icinga for Windows v1.13.0 a couple of days ago. We have already talked about the changes coming to v1.13.0 with the beta blog-post last year in more depth, and will focus only on some core changes here.

Read Post

Icinga

Read more about Releasing Icinga for Windows v1.13.0

Services can now be grouped within sections

Feb 7, 2025 By Leo Baecker In Hyperping

Today we're introducing groups for your status page sections, bringing a new level of organization to your service monitoring. You can now create logical groupings of related services within each section, making your status pages even more structured and easier to navigate.

Read Post

Hyperping

Read more about Services can now be grouped within sections

Top 5 outages detected by StatusGator in January

Feb 7, 2025 By Colin Bartlett In StatusGator

StatusGator continues to deliver crucial early warnings for major service disruptions, detecting outages before official acknowledgment. Below, we highlight major incidents from January 2025, where StatusGator’s real-time monitoring kept users informed and helped minimize workflow disruptions.

Read Post

StatusGator

Read more about Top 5 outages detected by StatusGator in January

How to Troubleshoot Networks with Employees Working from Home

Feb 7, 2025 By Alyssa Lamberti In Obkio

With employees working from home, often relying on personal Internet connections and consumer-grade equipment, IT teams face a new set of challenges in ensuring seamless connectivity. Unlike traditional office environments, where networks are controlled and optimized, home networks are unpredictable and prone to a variety of issues – from slow Internet speeds to intermittent connectivity.

Read Post

Obkio

Read more about How to Troubleshoot Networks with Employees Working from Home

Everything You Need to Know About Microsoft Sentinel Pricing

Feb 7, 2025 By Anjali Udasi In Last9

Keeping your organization secure is more important than ever. Microsoft Sentinel, a cloud-native Security Information and Event Management (SIEM) solution, helps detect and respond to threats effectively. But to get the most out of it, it’s important to understand how the pricing works.

Read Post

Last9

Read more about Everything You Need to Know About Microsoft Sentinel Pricing

Powerful monitoring tools, custom dashboards, and alerting systems #shorts #datadog

Feb 7, 2025 By Datadog In Datadog

By leveraging Datadog’s powerful monitoring tools, custom dashboards, and alerting systems, Telkomsel gained deep visibility into its infrastructure, significantly reducing incidents and improving operational efficiency.

View Video

Datadog

Read more about Powerful monitoring tools, custom dashboards, and alerting systems #shorts #datadog

Can you solve this riddle?

Feb 7, 2025 By Catchpoint In Catchpoint

Want to learn the answer?

View Video

Catchpoint

Monitoring

Read more about Can you solve this riddle?

NiCE AIX Management Pack | 5 Minutes Explainer Video

Feb 7, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

This short video will give a quick overview of the main features of the NiCE AIX Management Pack, such as Discovery, Monitors, Advanced Product Knowledge, Tasks, Performance Views, Reporting, and, of course, Security aspects of advanced AIX monitoring on Microsoft System Center Operations Manager.

View Video

NiCE IT Mgmt

Read more about NiCE AIX Management Pack | 5 Minutes Explainer Video

Elevating together: SolarWinds unveils new features in 2025 Partner Programme

Feb 6, 2025 By SolarWinds In SolarWinds

Latest advancements designed for accelerated partner growth and elevated experiences in 2025.

Read Post

SolarWinds

Read more about Elevating together: SolarWinds unveils new features in 2025 Partner Programme

Frontend Monitoring: Deliver Seamless and Performant User Experiences

Feb 6, 2025 By Rox Williams In Honeycomb

88% of online consumers are less likely to return to a site after a bad user experience. This means that addressing frontend issues such as slow load times, broken features, and unresponsive elements is crucial. Frontend monitoring helps development and IT teams proactively catch and resolve these issues to improve their user experience.

Read Post

Honeycomb

Read more about Frontend Monitoring: Deliver Seamless and Performant User Experiences

Get Started with #InfluxDB 3 Core in Seconds

Feb 6, 2025 By InfluxData In InfluxData

InfluxData Product Manager Pete Barnett breaks it down step by step, getting you up and running in seconds.

View Video

InfluxData

Read more about Get Started with #InfluxDB 3 Core in Seconds

Elastic Cloud Serverless now available in technical preview on Microsoft Azure

Feb 6, 2025 By Yuvi Gupta In Elastic

Elastic Cloud Serverless provides the fastest way to start and scale security, observability, and search solutions without managing infrastructure. Today, we are excited to announce the technical preview of Elastic Cloud Serverless on Microsoft Azure, now available in the EastUS region.

Read Post

Elastic

Read more about Elastic Cloud Serverless now available in technical preview on Microsoft Azure

Why observability needs FinOps, and vice versa: the Vantage integration with Grafana Cloud

Feb 6, 2025 By Ben Schaechter In Grafana

Ben Schaechter is co-founder & CEO of Vantage, a cloud cost management platform that provides actionable insights for every engineer. Observability tools have changed the way we monitor infrastructure and applications, as teams get complete visibility into performance across complex, multi-cloud environments. But as all that infrastructure scales, costs rise with it, and organizations are left to ask: Where are my costs going—and why?

Read Post

Grafana

Read more about Why observability needs FinOps, and vice versa: the Vantage integration with Grafana Cloud

Pro-level debugging: a tour of the new Sentry issue view

Feb 6, 2025 By Sentry In Sentry

Debugging is like solving a puzzle. Sentry's new issue view is designed to give you more clues when solving your puzzles, so you can debug faster, and get back to building.

View Video

Sentry

Read more about Pro-level debugging: a tour of the new Sentry issue view

The role of Redis monitoring in scaling applications for high-traffic environments

Feb 6, 2025 By Sinjan Ballav In Site24x7

High-traffic applications demand speed, reliability, and scalability, making Redis a top choice for tasks like caching and real-time analytics. However, as traffic grows, ensuring Redis operates at peak performance requires effective monitoring. By tracking key metrics, addressing bottlenecks, and optimizing resource use, Redis monitoring plays a vital role in maintaining stability and scalability.

Read Post

Site24x7

Read more about The role of Redis monitoring in scaling applications for high-traffic environments

Top 10 challenges for SREs and how to overcome them with APM tools

Feb 6, 2025 By Sindu Priyadharshini V In Site24x7

According to Google, “SRE is what you get when you treat operations as a software problem.” The role of site reliability engineers (SREs) is evolving rapidly to ensure optimal application performance in today's evolving IT environments. SREs are expected to provide proactive and predictive solutions for the issues arising from managing such environments. A Gartner report even suggests that by 2025, 70% of organizations will depend on SRE practices to ensure operational resilience.

Read Post

Site24x7

Read more about Top 10 challenges for SREs and how to overcome them with APM tools

Migrating to Amazon DaaS - Part 1 - How to leverage AIOps monitoring during a migration to Amazon WorkSpaces or AppStream 2.0

Feb 6, 2025 By Mike Ferioli In eG Innovations

If you are considering or planning a migration to Amazon WorkSpaces or AppStream 2.0, you’ll also want to consider how you integrate effective monitoring into your planning and execution. This will not only save you time and money in the long term but will also help you measure and achieve success.

Read Post

eG Innovations

Read more about Migrating to Amazon DaaS - Part 1 - How to leverage AIOps monitoring during a migration to Amazon WorkSpaces or AppStream 2.0

Expanded search now with 50+ fields, here is how to search

Feb 6, 2025 By Rollbar In Rollbar

How to search 50+ standard and custom fields in Rollbar.

View Video

Rollbar

Read more about Expanded search now with 50+ fields, here is how to search

Virtana in Gartner Research 2024: A Mark of Excellence in Infrastructure Observability

Feb 6, 2025 By Virtana Insight In Virtana

Research and analysis by Gartner carries significant weight in the technology industry, serving as a trusted source of insights for IT decision-makers worldwide. Their rigorous evaluation processes and comprehensive market analysis help organizations make informed technology investments. When a company is featured across multiple Gartner research publications, it demonstrates market relevance and solution maturity.

Read Post

Virtana

Read more about Virtana in Gartner Research 2024: A Mark of Excellence in Infrastructure Observability

Access your data with Federated Analytics for Amazon Security Lake. Insights from Splunk, AWS, and A

Feb 6, 2025 By Splunk In Splunk

Federated Analytics gives organizations the full power of Splunk extended to data stored in Amazon Security Lake. Trusted partners like Accenture are helping bring these new capabilities to life at organizations around the world.

View Video

Splunk

Read more about Access your data with Federated Analytics for Amazon Security Lake. Insights from Splunk, AWS, and A

What Does Low Network Bandwidth Mean & How to Fix It

Feb 6, 2025 By Andrii Kernitskyi In Obkio

Network performance is critical for everything from streaming videos to running cloud applications. But what happens when your network feels sluggish, and tasks that should take seconds suddenly take minutes? The culprit could be low network bandwidth. In this article, we’ll break down what low bandwidth means, how it affects your network, and actionable steps to fix it.

Read Post

Obkio

Read more about What Does Low Network Bandwidth Mean & How to Fix It

Getting started with PowerShell dashboards

Feb 6, 2025 By Sameer Mhaisekar In Squared Up

SquaredUp is a flexible dashboard and analytics platform that makes it really easy to turn your PowerShell scripts into dashboards that you can use for monitoring or sharing. In this article we’ll take a look at getting started with the PowerShell plugin for SquaredUp and build our first dashboard. Sign up for a free account if you’d like to follow along.

Read Post

Squared Up

Read more about Getting started with PowerShell dashboards

The Role of Log Monitoring in Securing Hybrid Cloud Infrastructures

Feb 6, 2025 By Arpit Sharma In Motadata

Hybrid cloud services have become a cornerstone for many businesses. These technologies, which combine the strengths of private and public clouds, assist enterprises in achieving their dreams of scalability, flexibility, and cost-efficiency. However, this added optimization comes at a cost, particularly with increased operational complexity and security concerns. To minimize cyber threats and secure their data, businesses must invest in more security solutions, such as log monitoring.

Read Post

Motadata

Read more about The Role of Log Monitoring in Securing Hybrid Cloud Infrastructures

Latest Product Updates and Features in Logz.io | February 2025

Feb 6, 2025 By Henn Idan In logz.io

We’re excited to announce a series of upgrades to our AI Agent, Log Management Explore UI and core integrations designed to empower you with even deeper observability and streamlined operations. These updates enhance account visibility, multi-telemetry trace insights, and logging capabilities while ensuring seamless compatibility with OpenTelemetry. Read on to discover how these enhancements can help you gain more clarity and control over your environment.

Read Post

logz.io

Read more about Latest Product Updates and Features in Logz.io | February 2025

How to Optimize Your Observability Spend in 2025

Feb 6, 2025 By Gregorio Fusco In logz.io

According to the 2024 Logz.io Observability Pulse Survey, 91% of respondents said they’re actively looking for ways to reduce observability costs, and 50% want better visibility into their monitoring expenses.

Read Post

logz.io

Read more about How to Optimize Your Observability Spend in 2025

Generation AI (Episode 2): How Generative AI is Shaping the Future of Security Operations

Feb 6, 2025 By Elastic In Elastic

The next golden age of artificial intelligence has arrived, but the path forward is far from certain. Technology leaders are presented with a tremendous opportunity to revolutionize their business — that is, if they can find a way to tap into the full potential of their organization's data. In Episode 2 of Elastic's new limited series, Generation AI, Elastic's CISO, Mandy Andress, shares how she believes generative AI will shape the future of the security operations in the modern enterprise.

View Video

Elastic

Read more about Generation AI (Episode 2): How Generative AI is Shaping the Future of Security Operations

Generation AI (Episode 1): How Generative AI is Shaping the Future of Enterprises

Feb 6, 2025 By Elastic In Elastic

The next golden age of artificial intelligence has arrived, but the path forward is far from certain. Technology leaders are presented with a tremendous opportunity to revolutionize their business — that is, if they can find a way to tap into the full potential of their organization's data. In Episode 1 of Elastic's new limited series, Generation AI, Elastic's CIO, Matt Minetola, shares how he believes generative AI will shape the future of the modern enterprise.

View Video

Elastic

Read more about Generation AI (Episode 1): How Generative AI is Shaping the Future of Enterprises

Ban Flaky Tests With These Playwright Command Line Gems!

Feb 6, 2025 By Checkly In Checkly

Join Stefan Judis, Playwright ambassador, as he explains how to use the Playwright CLI flags "--repeat-each" and "--only-changed" to identify flaky tests before adding them to your project.

View Video

Checkly

Read more about Ban Flaky Tests With These Playwright Command Line Gems!

AWS Monitoring Trends 2025

Feb 6, 2025 By ManageEngine Site24x7 In Site24x7

Discover the top trends shaping AWS monitoring in 2025! From AI-powered predictive analytics to sustainability-focused tools, this video dives into the innovations driving the future of cloud infrastructure. Stay ahead in the evolving cloud landscape with these key trends. Watch now to learn how to achieve smarter, faster, and more sustainable AWS monitoring in 2025 and beyond!

View Video

Site24x7

Read more about AWS Monitoring Trends 2025

Micro Lesson: Open Telemetry Collector Remote Management

Feb 6, 2025 By Sumo Logic In Sumo Logic

This video demonstrates remote management of data collection by enabling setup and configuration from the Sumo Logic UI. For more information, refer to the Sumo Logic documentation.

View Video

Sumo Logic

Read more about Micro Lesson: Open Telemetry Collector Remote Management

Monitor Amazon Kinesis Firehose in Hosted Graphite

Feb 6, 2025 By Charlie von Metzradt In MetricFire

We’ve supported syncing your metrics from Kinesis Streams, Amazon’s streaming data platform, for several years. Kinesis Streams helps you gather and process streaming data which can then be monitored in your Hosted Graphite account. Recently, we’ve added support for Firehose, a fully managed and scalable service that allows users to stream data to destinations like Amazon Simple Storage Service (Amazon S3), Amazon Redshift, or Amazon Elasticsearch Service (Amazon ES).

Read Post

MetricFire

Read more about Monitor Amazon Kinesis Firehose in Hosted Graphite

Real User Monitoring for B2B vs. B2C Businesses

Feb 6, 2025 By Emma Lampert In Coralogix

Imagine you’re a product manager at a B2B SaaS company. Monday morning, a frustrated client floods your inbox—their workflows were disrupted by a slowdown you could’ve caught sooner with better user insights. Now, imagine running an e-commerce store on Cyber Monday. Traffic surges, but abandoned carts spike. Your RUM dashboard reveals slow mobile checkouts. A quick fix saves thousands in sales.

Read Post

Coralogix

Read more about Real User Monitoring for B2B vs. B2C Businesses

Generation AI (Episode 3): How Generative AI is Shaping the Future of Customer Support

Feb 6, 2025 By Elastic In Elastic

The next golden age of artificial intelligence has arrived, but the path forward is far from certain. Technology leaders are presented with a tremendous opportunity to revolutionize their business — that is, if they can find a way to tap into the full potential of their organization's data. In Episode 3 of Elastic's new limited series, Generation AI, Elastic's VP of Global Customer Support, Julie Rudd, shares how she believes generative AI will shape the future of customer support.

View Video

Elastic

Read more about Generation AI (Episode 3): How Generative AI is Shaping the Future of Customer Support

NGINX Log Monitoring: What It Is, How to Get Started, and Fix Issues

Feb 6, 2025 By Ujjwal Goyal In Last9

Ensuring that your web applications run smoothly and securely is essential. NGINX, known for its high performance and scalability, plays a key role in delivering web content. But to keep everything running efficiently, you need to monitor and analyze its logs properly. This guide will walk you through how to configure, analyze, and make the most of NGINX logs to stay on top of your server’s health.

Read Post

Last9

Read more about NGINX Log Monitoring: What It Is, How to Get Started, and Fix Issues

Learn to Forecast Time Series Data Using ML & InfluxDB

Feb 6, 2025 By Suyash Joshi In InfluxData

Forecasting is all about predicting the future—in data science, it is one of the key skills in dealing with time series data, such as stock price prediction, sales forecasting, logistics planning, etc. In this tutorial, we’ll learn how to forecast the notorious weather pattern of London, UK, using the following free and open source technologies.

Read Post

InfluxData

Read more about Learn to Forecast Time Series Data Using ML & InfluxDB

Beyond monitoring: The power of observability

Feb 6, 2025 By Adam White In Sumo Logic

The demand for seamless user experiences and robust system reliability is at an all-time high, and businesses are racing to meet these expectations. But as system complexity increases, traditional monitoring tools are falling short. Observability offers a paradigm shift. It goes beyond tracking metrics and provides deep insights to understand the “why” behind system behavior by parsing and contextualizing unstructured data.

Read Post

Sumo Logic

Read more about Beyond monitoring: The power of observability

How to Monitor Error Logs in Real-Time: An In-Depth Guide

Feb 6, 2025 By Anjali Udasi In Last9

For system admins and developers, being able to track error logs in real time is crucial. It’s not just about fixing problems; it’s about keeping everything running smoothly, ensuring systems perform at their best, and catching issues before they snowball into bigger ones. This guide breaks down the tools and commands that make real-time log monitoring easier and more effective, offering more than just the basics.

Read Post

Last9

Read more about How to Monitor Error Logs in Real-Time: An In-Depth Guide

What is cloud cost anomaly detection?

Feb 5, 2025 By CloudSpend In ManageEngine

Cloud cost anomaly detection is a perfect example of the saying “a penny saved is a penny earned.” Imagine your monthly cloud bill suddenly skyrocketing despite no new services being added to your app.

Read Post

ManageEngine

Read more about What is cloud cost anomaly detection?

NiCE Office Hours for Microsoft SCOM

Feb 5, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

We are excited to introduce NiCE Office Hours – a dedicated time slot where you can get expert insights, guidance, and answers to all your questions about SCOM, NiCE Management Packs, and custom management pack authoring. When? Every Thursday from 3:00 PM to 5:00 PM CET/CEST.

Read Post

NiCE IT Mgmt

Read more about NiCE Office Hours for Microsoft SCOM

AWS CloudWatch Custom Metrics: Types & Setup Guide [With Examples]

Feb 5, 2025 By Anjali Udasi In Last9

Amazon CloudWatch is a monitoring and observability service that provides real-time insights into AWS resources and applications. While CloudWatch provides many default metrics, sometimes you need custom metrics to monitor specific aspects of your infrastructure or applications. This guide covers everything you need to know about CloudWatch custom metrics, from basics to advanced use cases.

Read Post

Last9

Read more about AWS CloudWatch Custom Metrics: Types & Setup Guide [With Examples]

Using a transformer-based text embeddings model to reduce Sentry alerts by 40% and cut through noise

Feb 5, 2025 By Tillman Elser In Sentry

Sentry uses Issue Grouping to aggregate identical errors and prevent duplicate issues from being created, and duplicate alerts being sent. One of the chief complaints we’ve heard from our users is that in some cases the existing algorithm did not sufficiently group similar errors together, and Sentry would create separate issues and alerts, causing unnecessary disruption–or at least annoyance–to developers.

Read Post

Sentry

Read more about Using a transformer-based text embeddings model to reduce Sentry alerts by 40% and cut through noise

5 Types of Checks Every Shopify Store Should Have

Feb 5, 2025 By Lucas Lalonde In uptime

Running an online store based on Shopify can be a stressful experience. Meeting sales quotas or metrics, ensuring the store’s accessibility, and accessing data on user statistics are all concerns that any Shopify store owner will encounter. Though Shopify provides an excellent solution for sellers, additional monitoring services to ensure that the store is always available can be very helpful.

Read Post

uptime

Read more about 5 Types of Checks Every Shopify Store Should Have

Solving a GitHub Mystery in Seconds

Feb 5, 2025 By Winston Bowden In Lumigo

It started like any other alert. An issue was detected by Lumigo. Service: prod_copilot_repo-parser_cont. Error Type: FailedToProcessGithubRepository. A notification popped up in Slack. Someone’s repository wasn’t processing correctly. Not a great way to start the day.

Read Post

Lumigo

Read more about Solving a GitHub Mystery in Seconds

Introducing WebPageTest Expert Plan: Real-Time Insights, Synthetic + RUM together in One Platform

Feb 5, 2025 By Mikayla Reddy In Catchpoint

Imagine this: You push a major update to your website, confident that everything looks great. Hours later, traffic plummets. Your users complain about slow load times, but when you check WebPageTest, everything seems fine. What’s missing? Real-time insights and proactive monitoring.

Read Post

Catchpoint

Read more about Introducing WebPageTest Expert Plan: Real-Time Insights, Synthetic + RUM together in One Platform

How To Ping Check with Grafana Cloud Synthetic Monitoring

Feb 5, 2025 By Grafana In Grafana

Learn how to set up a ping check using Grafana Cloud Synthetic Monitoring to monitor service availability and network performance.

View Video

Grafana

Read more about How To Ping Check with Grafana Cloud Synthetic Monitoring

Revising Icinga Exchange

Feb 5, 2025 By Noé Costa In Icinga

Icinga is an open-source project, but it’s only become the product we like to use thanks to co-development, brainstorming and suggestions from the community. That’s why we created a platform in the past to facilitate the exchange of custom implementations like check plug-ins, styles, extensions and bridges to third-party systems. We’re talking about our Exchange Portal, of course.

Read Post

Icinga

Read more about Revising Icinga Exchange

AI Monitoring and LLMOps with PagerDuty

Feb 5, 2025 By Mitra Goswami In PagerDuty

This post was authored by Mitra Goswami, Ralph Bird, Everaldo Aguiar, and Scott Sieper. Over the past two years, generative AI (GenAI) has come a long way, from the early excitement of ChatGPT to early explorations and more and more companies deploying GenAI-powered features into production.

Read Post

PagerDuty

Read more about AI Monitoring and LLMOps with PagerDuty

Getting Started with OpenTelemetry Java SDK

Feb 5, 2025 By Prathamesh Sonpatki In Last9

Understanding how your applications perform is crucial. OpenTelemetry has emerged as a powerful observability framework, offering a standardized approach to collecting telemetry data such as metrics, logs, and traces. For Java developers, the OpenTelemetry Java SDK provides the tools necessary to instrument applications effectively. This guide is all about the OpenTelemetry Java SDK, exploring its components, configuration, and advanced features to help you harness its full potential.

Read Post

Last9

Read more about Getting Started with OpenTelemetry Java SDK

Announcing Checkly Traces: Unified Synthetic Monitoring and Distributed Tracing

Feb 5, 2025 By Sara Miteva In Checkly

Until recently, Checkly was telling you what broke in your app. Now, it can also tell you why it broke. We're excited to announce the general availability of Checkly Traces, a new addition to our synthetic monitoring platform that bridges the gap between frontend monitoring and backend observability. By combining synthetic monitoring with distributed tracing, Checkly Traces empowers development teams to detect, diagnose, and resolve issues faster than ever before.

Read Post

Checkly

Read more about Announcing Checkly Traces: Unified Synthetic Monitoring and Distributed Tracing

Why Observability 2.0 Is Such a Gamechanger

Feb 5, 2025 By Erwin van der Koogh In Honeycomb

One of the hardest parts of my job is to get people to appreciate just how much of a difference Honeycomb/observability 2.0 is compared to their current way of working. It’s not just a small step up or a linear improvement. Rather, it’s an entire step change in the way that you write, deploy, and operate software for your customers.

Read Post

Honeycomb

Read more about Why Observability 2.0 Is Such a Gamechanger

Full Guide to Linux Disk IO Monitoring, Alerting and Tuning

Feb 5, 2025 By Sematext In Sematext

Disk IO (Input/Output) is a core aspect of system performance. Whether you’re managing a database, a web application, or a cloud server, how efficiently your system reads and writes data affects everything from response times to stability. Unlike high CPU usage or memory bottlenecks that often manifest immediately, disk IO issues tend to creep up silently—until they slow down critical processes.

Read Post

Sematext

Read more about Full Guide to Linux Disk IO Monitoring, Alerting and Tuning

How to Stop Memory Leaks Before they Crash Your Linux System

Feb 5, 2025 By Sematext In Sematext

Imagine you’ve got a leaky faucet in your kitchen. At first, it’s just a drip here and there—annoying, sure, but not enough to ruin your day. But leave it unchecked, and soon that drip turns into a steady trickle. Your water bill skyrockets, the sink overflows, and before you know it, you’re ankle-deep in chaos. Now, replace that faucet with a Linux system, and you’ve got a memory leak.

Read Post

Sematext

Read more about How to Stop Memory Leaks Before they Crash Your Linux System

5 Ways to Prevent CPU Overload on Linux Servers

Feb 5, 2025 By Sematext In Sematext

Every server administrator’s nightmare starts with a message: “CPU usage at 100%.” It’s that critical moment when your Linux server transforms from a reliable workhorse into a sluggish mess, taking your applications and user experience down. We’ve all been there… staring at a terminal, watching load averages climb, while frantically trying to figure out which process decided to throw a CPU-hungry party on our server.

Read Post

Sematext

Read more about 5 Ways to Prevent CPU Overload on Linux Servers

OpenTelemetry, Prometheus, and More: Which Is Better for Metrics Collection and Propagation?

Feb 5, 2025 By Zhu Jiekun In VictoriaMetrics

What happens if we put OpenTelemetry, Prometheus 2.x, Prometheus 3.x, and vmagent together for comparison in scraping and pushing data to remote storage?

Read Post

VictoriaMetrics

Read more about OpenTelemetry, Prometheus, and More: Which Is Better for Metrics Collection and Propagation?

AppSignal Now Offers Support for Long-Running Streaming Rack Responses in Ruby

Feb 5, 2025 By Connor James In AppSignal

We're excited to announce that AppSignal now offers improved monitoring for long-running streaming Rack responses. Our improved Rack response monitoring means you can gain deeper visibility into the health of your Ruby application's long-running responses, allowing you to catch errors that may arise minutes or even hours after a request's body is served. This new layer of observability results from a valuable contribution from Julik Tarkhanov, Director of Engineering at Cheddar Payments.

Read Post

AppSignal

Read more about AppSignal Now Offers Support for Long-Running Streaming Rack Responses in Ruby

What Is Network Device Monitoring? Find Out 5 Top Monitoring Tools

Feb 5, 2025 By Staff Contributor In SolarWinds

Businesses, organizations, and individuals rely on networks to communicate and exchange data. The rapid growth of technology and increasing reliance on networked systems have made robust network performance and security critical. However, maintaining optimal network performance and security is a difficult task. Network failures, security breaches, and performance bottlenecks can result in substantial financial losses and reputational damage. What Is Network Device Monitoring?

Read Post

SolarWinds

Read more about What Is Network Device Monitoring? Find Out 5 Top Monitoring Tools

Monitoring coffee: Tales from Hosted Graphite's secret lab

Feb 5, 2025 By MetricFire Blogger In MetricFire

It has been said that software engineers are organisms that convert caffeine into code. Not all software engineers need coffee to get by, but it's popular enough that it'd be silly for us not to have an office coffee machine. It'd also be sort of silly for a monitoring company not to monitor that coffee machine, which is so crucial that we could make a reasonable argument for it being part of the production infrastructure.

Read Post

MetricFire

Read more about Monitoring coffee: Tales from Hosted Graphite's secret lab

Locking Down PostgreSQL with SSL: Secure Remote Connections Like a Pro

Feb 5, 2025 By Benjamin Pitts In MetricFire

PostgreSQL is a beast when it comes to handling data, but if you're running an instance that needs to be accessed remotely, securing it with SSL is non-negotiable. Without SSL, your database connection is essentially an open book for anyone snooping on the network. Let’s lock it down with properly signed certificates!

Read Post

MetricFire

Read more about Locking Down PostgreSQL with SSL: Secure Remote Connections Like a Pro

Kubernetes Monitoring and Alerting Made Easy with Splunk Observability Cloud and OpenTelemetry

Feb 5, 2025 By Splunk In Splunk

In this video, I'll show you how to quickly set up monitoring and alerting for your Kubernetes clusters using Splunk Observability Cloud. We’ll start by deploying the Splunk OpenTelemetry Collector using Helm, and then use the Kubernetes Navigator inside Splunk Observability Cloud to view the health of our cluster and the applications it’s hosting. I’ll demonstrate AutoDetect detectors and alerts by intentionally triggering an issue in the cluster and walk through the alerting process. We’ll review the alerts in Splunk Observability Cloud and then resolve the issue in the cluster.

View Video

Splunk

Read more about Kubernetes Monitoring and Alerting Made Easy with Splunk Observability Cloud and OpenTelemetry

Getting Started with M365 dashboards

Feb 5, 2025 By Sameer Mhaisekar In Squared Up

SquaredUp is a flexible dashboard and analytics platform that makes it really easy to dashboard your M365 and Intune usage and analytics. You can then use it for monitoring or sharing! In this article we’ll take a look at getting started with the M365 plugin for SquaredUp and building our first dashboard. Sign up for a free account if you’d like to follow along.

Read Post

Squared Up

Read more about Getting Started with M365 dashboards

How AI-powered anomaly detection is transforming APM for SREs

Feb 5, 2025 By Sindu Priyadharshini V In Site24x7

Site reliability engineers (SREs) often face challenges in keeping an organization’s sites running smoothly as the complexity of distributed systems steadily increases. With the rise of microservices, cloud-native architectures, and massive data volumes, manual monitoring and troubleshooting are no longer sustainable. SREs must navigate hurdles like alert fatigue, incident response delays, and the constant pressure to maintain system reliability.

Read Post

Site24x7

Read more about How AI-powered anomaly detection is transforming APM for SREs

Top 5 EdTech outages detected by StatusGator in January 2025

Feb 5, 2025 By Colin Bartlett In StatusGator

Educational platforms are essential for students, educators, and institutions, making service disruptions especially impactful. StatusGator’s early detection ensures that users receive timely alerts before official acknowledgments, helping them navigate unexpected downtime. Below, we recap significant education-related outages from January 2025, where StatusGator kept users ahead of disruptions.

Read Post

StatusGator

Read more about Top 5 EdTech outages detected by StatusGator in January 2025

Petabyte Scale, Gigabyte Costs: Mezmo's Evolution from ElasticSearch to Quickwit

Feb 5, 2025 By Mezmo In Mezmo

At Mezmo, we handle an enormous volume of telemetry data for our customers and ourselves, requiring a robust and efficient search and analytics backend. For years, ElasticSearch served us well, but as our infrastructure grew to a multi-cluster, multi-petabyte scale, we started to see the cracks—rising costs, performance bottlenecks, and scalability concerns. We needed a change, one that would make our system more cost-effective while maintaining speed and reliability.

Read Post

Mezmo

Read more about Petabyte Scale, Gigabyte Costs: Mezmo's Evolution from ElasticSearch to Quickwit

How to Optimize Costs and Strengthen IT with Teneo's Deep Observability

Feb 5, 2025 By Teneo In Teneo

Teneo understands that it can be hard to balance cost and depth of observability in today's fast-paced digital landscape, where organizations face the challenge of managing increasingly complex IT infrastructures while keeping costs under control. Achieving this balance requires a new approach, which is why we have developed our Open Observability platform, a critical component of Teneo’s StreamlineX framework.

Read Post

Teneo

Read more about How to Optimize Costs and Strengthen IT with Teneo's Deep Observability

Telemetry Pipeline 101

Feb 5, 2025 By Mezmo In Mezmo

Are you looking to enhance your observability and gain deeper insights into your systems? Curious about how a Telemetry Pipeline can revolutionize your monitoring and troubleshooting capabilities while keeping the cost low? Join Mezmo’s Bill Balnave (Vice President of Technical Services) for an insightful webinar unraveling Telemetry Pipeline’s key concepts, highlighting its significance in modern software development and operations. Discover how a Telemetry Pipeline enables you to collect, profile, transform, and analyze crucial telemetry data from your applications and infrastructure.

View Video

Mezmo

Read more about Telemetry Pipeline 101

How to search custom fields / tags in Rollbar

Feb 5, 2025 By Rollbar In Rollbar

Quickly and easily search custom fields / tags in Rollbar to instantly understand the items affecting your customers.

View Video

Rollbar

Read more about How to search custom fields / tags in Rollbar

Take Control of Incidents: Smarter Filtering, Collaboration & Insights

Feb 5, 2025 By Tomas Koprusak In Uptime Robot

Managing incidents just got a whole lot easier. With our latest update, you get better visibility, smarter filtering, and a seamless way to collaborate with your team – right inside UptimeRobot. You can now access all incidents in one place under the new Incidents tab in the left sidebar. FYI: Some of our best improvements come directly from you—like the ability to add comments and make incidents searchable. We listen and we don’t judge—do you have an idea?

Read Post

Uptime Robot

Read more about Take Control of Incidents: Smarter Filtering, Collaboration & Insights

How to visualize CSV data with Grafana

Feb 5, 2025 By David Allen In Grafana

While CSV data is often associated with popular spreadsheet apps like Google Sheets or Microsoft Excel, Grafana offers a number of capabilities to quickly visualize and analyze data stored in a CSV format. In this post, we’ll walk through an example of how to use Grafana to visualize any CSV file from anywhere on the web. More specifically, we will: Moving forward, you can also apply these steps to build any kind of dashboard within Grafana.

Read Post

Grafana

Read more about How to visualize CSV data with Grafana

Sponsored Post

Top 10 .NET exceptions (part one)

Feb 4, 2025 By Rowan Tandy In Raygun

Exception handling is essential to .NET development, but not all exceptions are equal. Some, like NullReferenceException, surprise developers with unclear stack traces and production crashes. Others, such as MySQLException or HttpRequestException, often point to issues like resource mismanagement or network failures. At Raygun, we've worked with teams around the world to monitor and fix software issues, giving us deep insight into how exceptions occur and how to handle them effectively.

Read Post

Raygun

Read more about Top 10 .NET exceptions (part one)

6 key steps to drive successful network automation in your enterprise

Feb 4, 2025 By General In ManageEngine

The complexity of modern networks has surged due to digital transformation, hybrid work models, and evolving security threats, making manual management increasingly unsustainable. Network automation addresses this challenge by streamlining operations and enabling networks to adapt and remain resilient in an ever-changing environment. A recent Gartner study predicts that by 2026, 30% of enterprises will automate more than half of their network activities.

Read Post

ManageEngine

Read more about 6 key steps to drive successful network automation in your enterprise

Sponsored Post

Introducing Agentic AI Platform by Fabrix.ai

Feb 4, 2025 By Raju Datla In Fabrix

Over the past couple of years, many of us have been utilizing Generative AI interfaces and co-pilots to enhance our communication, conduct research, and summarize complex information. AI-based agents are digital entities created to autonomously derive insights from data and execute actions. Agents are focused on accomplishing a specific outcome without the need for constant human intervention.

Read Post

Fabrix

Read more about Introducing Agentic AI Platform by Fabrix.ai

On-premise vs SaaS 2025

Feb 4, 2025 By Sancho Lerena In Pandora FMS

In the world of infrastructure management and enterprise software, the choice between on-premise and SaaS (Software as a Service) solutions has become a strategic decision for every organization, influencing key areas such as security, flexibility and operational costs. Both models offer different approaches to software implementation and usage.

Read Post

Pandora FMS

Read more about On-premise vs SaaS 2025

Dynatrace vs Grafana - A Detailed Comparison for 2025

Feb 4, 2025 By Pavithra Parthiban In Atatus

Dynatrace and Grafana are popular tools when it comes to monitoring and observability. In this post, I have compared Dynatrace and Grafana on important features like APM, log management, infrastructure monitoring, pricing, etc.

Read Post

Atatus

Read more about Dynatrace vs Grafana - A Detailed Comparison for 2025

Streamlining Telemetry with Apica's Fleet Management Solution: A Deep Dive

Feb 4, 2025 By Lori Bertelli In Apica

In the rapidly evolving IT environment, observability at scale has become a critical challenge for organizations aiming to maintain operational excellence. The proliferation of telemetry collection agents across diverse infrastructures often increases complexity, resource strain, and configuration inconsistencies.

Read Post

Apica

Read more about Streamlining Telemetry with Apica's Fleet Management Solution: A Deep Dive

Getting started with JIRA dashboards

Feb 4, 2025 By Sameer Mhaisekar In Squared Up

SquaredUp is a data visualization tool that can connect to a variety of data sources to bring the data together in a single pane of glass. In this blog we will connect our JIRA and Confluence instances to SquaredUp using the built-in plugins and create dashboards for them.

Read Post

Squared Up

Read more about Getting started with JIRA dashboards

SLOs: a guide to setting and benefiting from service level objectives

Feb 4, 2025 By Jake Swiss In Grafana

If you’re running a technology-driven business, reliability isn’t optional—it’s essential. But how do you balance speed and innovation with a level of reliability that satisfies your customers? That’s where service level objectives (SLOs) come in. SLOs offer a framework for defining and achieving reliability goals, aligning technical efforts with user needs, and driving meaningful outcomes for your business.

Read Post

Grafana

Read more about SLOs: a guide to setting and benefiting from service level objectives

Enhanced Search: Key/Value Searching Now Available for Custom Fields

Feb 4, 2025 By Rollbar In Rollbar

Following our recent update on faster item search, we’re excited to introduce another powerful improvement to our search functionality: key/value searches for specific fields!

Read Post

Rollbar

Read more about Enhanced Search: Key/Value Searching Now Available for Custom Fields

Expanded Search: Nearly All Standard Fields Are Now Searchable

Feb 4, 2025 By Rollbar In Rollbar

Following our recent updates on faster item search and searching within custom data fields, we’re excited to announce another major improvement: nearly all standard data fields within Rollbar are now indexed and searchable!

Read Post

Rollbar

Read more about Expanded Search: Nearly All Standard Fields Are Now Searchable

Keeping Spending in Check: Observability's Positive Impact on Cost Management

Feb 4, 2025 By ScienceLogic In ScienceLogic

Tool sprawl within organizations doesn’t just create a fragmented user experience; it poses a real threat to enterprises’ bottom lines. Consider these statistics: This fragmentation significantly limits worker productivity. IT leaders spend hundreds of hours trying to manage multiple tools, map their environments, and upkeep aging systems that are either outdated or simply no longer necessary.

Read Post

ScienceLogic

Read more about Keeping Spending in Check: Observability's Positive Impact on Cost Management

Maintain smooth game play with Sentry's game engine support

Feb 4, 2025 By Bruno Garcia In Sentry

Sentry integrates directly with the tools you already use, giving you real-time crash and performance insights for Unity, Unreal Engine, and Godot. No matter what engine you use, Sentry helps you find and fix issues before they impact your players.

Read Post

Sentry

Read more about Maintain smooth game play with Sentry's game engine support

Quickly get rich, actionable context for alerts with Datadog's new Monitor Status page

Feb 4, 2025 By David Iparraguirre In Datadog

Providing rich context for monitor alerts is an essential part of any robust, scalable monitoring strategy. Alerts that send teams scrambling for basic background information prolong troubleshooting, hindering effective incident response and heightening the potential for service disruption. Given the increasing complexity of modern, distributed applications, however, breaking down knowledge silos in order to ensure consistent access to critical context for alerts can be a challenge.

Read Post

Datadog

Read more about Quickly get rich, actionable context for alerts with Datadog's new Monitor Status page

Wireless Network Management with Site24x7

Feb 4, 2025 By ManageEngine Site24x7 In Site24x7

Struggling with Wi-Fi connectivity issues? Wireless LAN controllers (WLCs) are the backbone of enterprise networks, but they’re not without challenges. From access point disconnections to overloaded controllers, even small issues can disrupt your operations. With Site24x7, you can proactively monitor and optimize your wireless network. Get real-time insights, detailed analytics, and instant alerts to troubleshoot problems before they impact users.

View Video

Site24x7

Read more about Wireless Network Management with Site24x7

How to search errors and logs in Rollbar

Feb 4, 2025 By Rollbar In Rollbar

Learn how you can now search any custom field quickly and easily in Rollbar, including exceptions, errors, logs, and more.

View Video

Rollbar

Read more about How to search errors and logs in Rollbar

Robots Arrive to Autofix Bugs

Feb 4, 2025 By Sentry In Sentry

We've been messing with AI, and we came up with a feature that automatically fixes your bugs for you!

View Video

Sentry

Read more about Robots Arrive to Autofix Bugs

How to Optimize Website Images: The Complete 2025 Guide

Feb 4, 2025 By Request Metrics In Request Metrics

Images are big. Really big. The bytes required for an image dwarf most sites’ CSS and JavaScript assets. Slow images will damage your Core Web Vitals, impacting your SEO and costing you traffic. Images are usually the element driving Largest Contentful Paint, and load delays can increase your Cumulative Layout Shift. If you’re not familiar with these metrics, check them out in the Definitive Guide to Measuring Web Performance.

Read Post

Request Metrics

Read more about How to Optimize Website Images: The Complete 2025 Guide

Taking a step towards network resilience: The importance of real-time alerts

Feb 4, 2025 By Rama Venkatesan In Site24x7

Is your network prepared to handle unexpected disruptions, or are you constantly in fire-fighting mode? As organizations become increasingly reliant on uninterrupted connectivity, network downtime, slow response times, or undetected vulnerabilities can directly affect customer experience, employee productivity, and even your bottom line. So, how can you proactively address these challenges?

Read Post

Site24x7

Read more about Taking a step towards network resilience: The importance of real-time alerts

Resolving Heroku deployment issues using comprehensive log data

Feb 4, 2025 By Subashree K In Site24x7

Deploying applications on Heroku offers a streamlined process for developers, but even the most well-optimized setups can encounter deployment issues. To effectively resolve these issues, it's crucial to gain real-time insights into your app’s behavior, traffic, and performance metrics. The solution to resolving Heroku deployment challenges lies in leveraging the power of log management.

Read Post

Site24x7

Read more about Resolving Heroku deployment issues using comprehensive log data

AKS Monitoring Explained: Tools and Strategies for Azure Kubernetes Service

Feb 4, 2025 By Albin Thomas In Digitate

In recent years, container platforms – also referred to as container orchestration systems – have grown in popularity and transformed the processes of software development, testing, and deployment.

Read Post

Digitate

Read more about AKS Monitoring Explained: Tools and Strategies for Azure Kubernetes Service

10 Kubernetes Monitoring Tools You Can't-Miss in 2025

Feb 4, 2025 By Anjali Udasi In Last9

Monitoring a Kubernetes cluster isn’t just about keeping an eye on CPU and memory usage. It’s about understanding system health, detecting anomalies before they cause outages, and ensuring applications run smoothly. With so many tools available, choosing the right one can feel overwhelming. This guide covers the best Kubernetes monitoring tools, their use cases, and key factors to consider.

Read Post

Last9

Read more about 10 Kubernetes Monitoring Tools You Can't-Miss in 2025

Find and Fix Performance Bottlenecks with Sentry's Trace Explorer

Feb 4, 2025 By Will McMullen In Sentry

We’ve all worked on that app that hangs just a little too long in weird places, or had that query we could never get to perform just right. The network waterfall in Chrome DevTools can’t quite show us what’s going on behind the scenes, and tracing with OTel (and honestly, tracing in Sentry) was just… hard. Today that changes.

Read Post

Sentry

Read more about Find and Fix Performance Bottlenecks with Sentry's Trace Explorer

CLI Operations for InfluxDB 3 Core and Enterprise

Feb 4, 2025 By Anais Dotis-Georgiou In InfluxData

This blog covers the nitty-gritty of essential command-line tools and workflows to effectively manage and interact with your InfluxDB 3 Core and Enterprise instances. Whether you’re starting or stopping the server with configurations like memory, file, or object store, this guide will walk you through the process. We’ll also look at creating and writing data into databases using authentication tokens, exploring direct line protocol input versus file-based approaches for tasks like testing.

Read Post

InfluxData

Read more about CLI Operations for InfluxDB 3 Core and Enterprise

SSHD Logs 101: Configuration, Security, and Troubleshooting Scenarios

Feb 4, 2025 By Anjali Udasi In Last9

Secure Shell (SSH) is a fundamental tool for remote system administration, and its logs play a critical role in security monitoring, debugging, and compliance. SSHD logs provide insights into authentication attempts, connection successes, failures, and potential intrusions. This guide explores everything you need to know about SSHD logs, including their location, format, analysis, and lesser-known security practices to maximize their effectiveness.

Read Post

Last9

Read more about SSHD Logs 101: Configuration, Security, and Troubleshooting Scenarios

Website Performance Benchmarks: What You Should Aim For [with Examples]

Feb 4, 2025 By Anjali Udasi In Last9

When it comes to your website, speed is everything. A slow site frustrates users, drives up bounce rates, and even impacts your revenue. That’s where website performance benchmarks come in. They help you figure out how well your site is performing, where it needs improvement, and—most importantly—what you can do to make it faster. In this guide, we'll walk you through the key benchmarks, the tools you need, and a few tips that’ll help your site outshine the competition.

Read Post

Last9

Read more about Website Performance Benchmarks: What You Should Aim For [with Examples]

Top 11 API Monitoring Tools You Need to Know

Feb 4, 2025 By Anjali Udasi In Last9

APIs are the backbone of modern software, quietly powering everything we interact with. But just because they’re invisible doesn’t mean they can’t run into issues. From response times to uptime, keeping an eye on your APIs is key to making sure everything works smoothly. In this guide, we’ll explore 11 popular API monitoring tools to help you find the one that best fits your needs.

Read Post

Last9

Read more about Top 11 API Monitoring Tools You Need to Know

How to Set Up Actually Useful SLOs | Introduction to SLOs | Grafana Labs

Feb 4, 2025 By Grafana In Grafana

Service Level Objectives (SLOs) should be more than just numbers on a dashboard—they should help your team deliver real value to your users. In this video, Jake Swiss from Grafana Labs walks you through three simple steps to create SLOs that align with business goals and drive better decision-making. Step 1: Understand What Really Matters – Align SLOs with customer expectations Step 2: Define Clear, Measurable Targets – Use RED metrics (Rate, Errors, Duration) to track meaningful performance Step 3: Continuously Iterate & Fine-Tune – Adjust SLOs based on historical data and team feedback.

View Video

Grafana

Read more about How to Set Up Actually Useful SLOs | Introduction to SLOs | Grafana Labs

How to Overcome Alert Fatigue in Your Alerting System | Introduction to SLOs | Grafana Labs

Feb 4, 2025 By Grafana In Grafana

Cut Through Alert Noise with SLOs! Tired of endless alerts that don’t reflect real issues? SLOs (Service Level Objectives) help reduce noise by focusing on what truly impacts users. Instead of reacting to every minor spike, set SLOs to trigger alerts only when reliability is at risk.

View Video

Grafana

Read more about How to Overcome Alert Fatigue in Your Alerting System | Introduction to SLOs | Grafana Labs

Kentik - Cloud Observability

Feb 4, 2025 By Kentik In Kentik

Kentik Cloud provides comprehensive visibility across all major public clouds, offering seamless insight into cloud-to-on-prem network paths and the public internet routes connecting them. Identify latency, loss, jitter, and application-specific traffic while providing deep visibility into cloud networking constructs like ACLs to spot security issues. With powerful analytics, Kentik Cloud enables you to visualize intra-cloud traffic, identify idle resources for optimization, and leverage historical data to uncover trends and seasonal patterns—ensuring optimal cloud performance and cost efficiency.

View Video

Kentik

Read more about Kentik - Cloud Observability

Kubernetes 101

Feb 4, 2025 By Jeff Darrington In Graylog

When you get behind the wheel of your car, one of the first things you see is the dashboard. Your dashboard provides various information about all the different technologies that make the car run smoothly, like helping you control your speed, providing insight into your fuel levels, and offering suggestions for regular maintenance, like oil changes. For developers, Kubernetes acts as that one-glance dashboard to provide insights about container performance, maintenance needs, and storage requirements.

Read Post

Graylog

Read more about Kubernetes 101

System Center 2025 Unveiled: Insights and Expert Discussion on SCOM, SCORCH, SCSM, and Beyond

Feb 3, 2025 By NiCE IT Mgmt In NiCE IT Mgmt

The future of IT operations is here! Join us for an exclusive expert panel discussion on Microsoft System Center 2025, where industry leaders will explore the latest advancements and strategies for optimizing enterprise IT environments.

Read Post

NiCE IT Mgmt

Read more about System Center 2025 Unveiled: Insights and Expert Discussion on SCOM, SCORCH, SCSM, and Beyond

Top 3 tools for reporting Zendesk metrics

Feb 3, 2025 By Mike Halfacree In Squared Up

Zendesk is a popular choice for customer service and support, offering a range of tools to manage interactions and boost customer satisfaction. However, making sense of all the data it collects requires robust reporting tools. Zendesk Explore, Power BI, and SquaredUp are three powerful tools that can help you unlock valuable insights from your Zendesk data, but each has its unique strengths.

Read Post

Squared Up

Read more about Top 3 tools for reporting Zendesk metrics

A Complete Guide on Synthetic Monitoring | How to Improve Your Web & App Performance?

Feb 3, 2025 By Arpit Sharma In Motadata

Making your brand stand out in digital business is more challenging than it sounds. More than 2.87 million apps are available on the Play Store and other platforms. According to reports, almost 252,000 new websites are created and launched daily. Competing in this large market without proper monitoring and strategy is a complete waste of time.

Read Post

Motadata

Read more about A Complete Guide on Synthetic Monitoring | How to Improve Your Web & App Performance?

9 essential metrics to track for effective IT operations with log management tools

Feb 3, 2025 By Subashree K In Site24x7

Monitoring the correct metrics is crucial for efficient IT operations, as it ensures the smooth functioning of an organization's infrastructure. One crucial aspect of this process is log management, which empowers IT teams to address critical aspects of IT infrastructure, including performance, availability, security, resource usage, and integration.

Read Post

Site24x7

Read more about 9 essential metrics to track for effective IT operations with log management tools

How to handle traffic spikes in Rollbar

Feb 3, 2025 By Rollbar In Rollbar

Spikes in traffic can happen at any time. Here are some ways to handle them in Rollbar.

View Video

Rollbar

Monitoring

Read more about How to handle traffic spikes in Rollbar

Getting started with WebAPI dashboards

Feb 3, 2025 By Sameer Mhaisekar In Squared Up

SquaredUp is a data visualization tool that can connect to a variety of data sources to bring the data together in a single pane of glass. In this blog we will demonstrate how SquaredUp can connect to any REST APIs out there and build dashboards on the data returned.

Read Post

Squared Up

Read more about Getting started with WebAPI dashboards

How To Configure a PostgreSQL Datasource in Grafana

Feb 3, 2025 By Benjamin Pitts In MetricFire

So, you’ve got a PostgreSQL database packed with juicy data, and you want to turn those raw numbers into slick, interactive Grafana dashboards? Good call! Grafana’s PostgreSQL datasource is like the secret handshake that lets you visualize your data in style—no extra ETL magic required. In this guide, we’ll walk through getting PostgreSQL and Grafana to play nice, covering everything from connection settings to query tuning.

Read Post

MetricFire

Read more about How To Configure a PostgreSQL Datasource in Grafana

January product updates

Feb 3, 2025 By Colin Bartlett In StatusGator

It’s been a busy month at StatusGator HQ as we focused on improvements to the status page, one of many features that help you communicate the status of all your cloud services to your stakeholders. Here’s a quick recap of this month’s updates. As a reminder, you can see all these updates here on the blog as they are released or in our product update sidebar inside StatusGator.

Read Post

StatusGator

Read more about January product updates

Monitor Google Cloud TPUs with Datadog

Feb 3, 2025 By Bowen Chen In Datadog

The rapidly growing interest in AI has raised a corresponding demand for specialized cloud compute that is built to run training and inference workloads in a cost-efficient and performant manner. Google Cloud Tensor Processing Units (TPUs) have become a popular accelerated compute solution for AI/ML workloads.

Read Post

Datadog

Read more about Monitor Google Cloud TPUs with Datadog

Booking.com's Journey to Enhanced Observability

Feb 3, 2025 By Brian Chang In Honeycomb

Since its early startup beginnings in Amsterdam, Booking.com has redefined the travel industry, establishing itself as a premier platform for millions of travelers worldwide. With over 28 million accommodation listings and a staggering 1.5 million room nights booked every day, Booking.com operates on a scale that demands a robust and constantly monitored infrastructure.

Read Post

Honeycomb

Read more about Booking.com's Journey to Enhanced Observability

Fast, Accurate and Powerful Item Search

Feb 3, 2025 By Rollbar In Rollbar

We’ve overhauled the search backend that powers the Item List UI and Item Search API. Item Search is now far more powerful, working with custom fields and nearly all the data you send Rollbar. Searches return quickly and return the results you'd expect.

Read Post

Rollbar

Read more about Fast, Accurate and Powerful Item Search

The Basics of Log Parsing (Without the Jargon)

Feb 3, 2025 By Anjali Udasi In Last9

Logs are crucial for understanding what's happening in your system, but they can often be hard to make sense of. Log parsing is the key to turning raw, unstructured data into something useful. In this blog, we'll explore the basics of log parsing, its importance, and how it helps you extract valuable insights from your logs without all the clutter.

Read Post

Last9

Read more about The Basics of Log Parsing (Without the Jargon)

OpenTelemetry Processors: Workflows, Configuration Tips, and Best Practices

Feb 3, 2025 By Prathamesh Sonpatki In Last9

Most developers are familiar with OpenTelemetry's core components: Traces, Metrics, and Logs. But there’s one part of the OpenTelemetry ecosystem that doesn’t always get the spotlight: processors. These behind-the-scenes operators shape your data pipeline, helping you filter, enrich, and fine-tune telemetry data before it reaches your backend systems. Processors play a key role in making sure your data is cleaner, more useful, and just the way you need it.

Read Post

Last9

Read more about OpenTelemetry Processors: Workflows, Configuration Tips, and Best Practices

Addressing issues and fixing incidents faster #datadog #shorts

Feb 3, 2025 By Datadog In Datadog

Addressing issues and fixing incidents faster than ever was important to SeatGeek, a leading ticketing platform that connects millions of users to live events. Watch how they mastered incident response by integrating Datadog Incident Management.

View Video

Datadog

Read more about Addressing issues and fixing incidents faster #datadog #shorts

Syslog Protocol: A Reference Guide

Feb 3, 2025 By Jeff Darrington In Graylog

Syslog was developed in the 1980s by Eric Allman as part of the Sendmail project and adopted by many systems over the years. When looking at Syslog, there are a few protocol options, each with slight differences. In this reference guide, I’ll break down the differences so that you can recognize these formats when working with the protocol.

Read Post

Graylog

Read more about Syslog Protocol: A Reference Guide

How CXOs can simplify compliance in high-regulation sectors

Feb 2, 2025 By Rama Venkatesan In Site24x7

How do businesses in highly regulated sectors ensure network compliance while still fostering innovation and maintaining operational efficiency? As regulatory pressure and operational complexities increase, along with the growing divide between external demands and internal capabilities, traditional approaches to compliance are becoming outdated and insufficient for the future.

Read Post

Site24x7

Read more about How CXOs can simplify compliance in high-regulation sectors

DOES Cache Rule Everything Around Me? - Using Compression for our Prometheus Cache

Feb 1, 2025 By Umut Uzgur In Checkly

Checkly is a key part of a professional developer’s workflow, making it easy to know if your service is up or down, and measure performance. As we integrate with almost any development workflow, we also have Prometheus endpoints to let you use the popular Grafana stack to keep track of your site checks’ status. As large enterprise users grew in usage, their check performance data grew in parallel, and our endpoint started returning occasional 429 status codes.

Read Post

Checkly

Read more about DOES Cache Rule Everything Around Me? - Using Compression for our Prometheus Cache
