Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

ICMP Required for Traceroute and Network Diagnostics

As previously detailed on the Exoprise blog, ICMP (Internet Control Message Protocol) is crucial for troubleshooting, monitoring, and optimizing network performance in today’s Internet-connected world. Despite historical security concerns, disabling ICMP is unnecessary and hampers network troubleshooting efforts. Modern firewalls can effectively manage the security risks associated with ICMP.

Top tips: 5 steps to take while implementing a predictive maintenance strategy

Top tips is a weekly column where we highlight what’s trending in the tech world today and list out ways to explore these trends. This week we’re looking at five steps you should follow when devising an effective predictive maintenance strategy for your organization. Have you ever wondered what it would feel like to be able to look into the future? Well, thanks to predictive maintenance, you can do just that!

5 AWS Logging Tips and Best Practices

If you’re an Amazon Web Services (AWS) user, you’re probably familiar with some of Amazon’s native services available for logging and monitoring, such as CloudWatch and CloudTrail. With that said, log management can get complicated quickly, especially if you’re dealing with a high volume of logs from AWS Lambda functions or a multi-cloud/hybrid cloud environment.

How AI is Changing Modern Networking

Artificial intelligence has changed the landscape of technology, especially as we collect and analyze vast amounts of data. As AI workloads are distributed among compute nodes and across data centers, what problems are posed for traditional networking, and what solutions can we discover? This live discussion between Justin Ryburn, Field CTO at Kentik, and Phillip Gervasi, Director of Technical Evangelism at Kentik, dives deep into the current state of data center networking in the age of AI.

Building a Distributed Security Team

In this live stream, Cjapi’s James Curtis joins me to discuss the challenges of building a distributed global security team. Watch the full video or read on to learn about some hard-won examples of how to be successful with remote team building and management. Talent is hard to find, and companies are hiring from all over the world to build the best teams possible, but this trend has a price.

Machine Learning for Fast and Accurate Root Cause Analysis

Machine Learning (ML) for Root Cause Analysis (RCA) is the state-of-the-art application of algorithms and statistical models to identify the underlying reasons for issues within a system or process. Rather than relying solely on human intervention or time-consuming manual investigations, ML automates and enhances the process of identifying the root cause.

Grafana 10.1: TraceQL query results streaming

Tempo offers amazing performance, but there are still cases where TraceQL queries take a long time to return results. This could be due to a multitude of reasons, from the complexity of the query to the amount of traces stored or the timeframe selected. See how to navigate your query results more quickly with query results streaming, available as an experimental feature in Grafana version 10.1.