Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Introducing StackState 4.6: Harnessing the Power of Topology + Telemetry + Traces + Time

Companies depend on observability insights to provide reliable online services to their customers. To support their efforts, StackState is proud to announce a new version of our unique topology-powered observability software, StackState v4.6, available now. This new version brings powerful new capabilities to DevOps and SRE teams who need to maintain a deep understanding of how their stack is behaving to meet their SLOs.

Debugging Race Conditions in Production

Race conditions can occur when a multithreaded application accesses a shared resource using over one thread. Unless we have guards in place, the result might depend on which thread “got there first”. This is especially problematic when the state is changed externally. A race can cause more than just incorrect behavior. It can enable a security vulnerability when the resource in question can be corrupted in the right way. A good example of race condition vulnerabilities is mangling memory.

APM is Legacy. Distributed Tracing is Designed for Modern Teams

Some background. Having implemented at least 20 or more APM systems in production as an end-user at various companies, and both deployed and managed countless monitoring tools outside APM, I understand the role of the practitioner. Later on, I shifted to Gartner and led the APM Magic Quadrant for four years, finally spending another four years at AppDynamics (operating under Cisco after two years).

How to Optimize Cloud Monitoring Costs Using Flow Logs in Progress Flowmon

This blog post discusses some of the best practices for balancing the costs of cloud traffic monitoring while maintaining a reasonable level of visibility. Progress Flowmon 12 has introduced the processing of native flow logs from Google Cloud and Microsoft Azure, plus it has enhanced support for Amazon Web Services (AWS) flow logs.

Running VictoriaMetrics on ARM-based processors

ARM processors become more popular and more cost-effective according to many benchmarks. One of them was made by Percona for MySQL. Some of our users reported issues with VictoriaMetrics at AWS Graviton instances. The main concerns were higher CPU and disk IO usage compared to x86 instances of the same size and for the same workload. By that time, we verified that VictoriaMetrics works fine for raspberry and IoT devices, but didn’t do any optimizations for ARM builds.

What Is Log Retention?

The idea of paying money to store logs nobody is looking at may seem like a waste. Well, that is until you need those logs. At that point, you see how valuable log retention is, especially if there’s a security or compliance issue. When you prioritize log retention, you can look back to investigate an incident or provide data for an audit — especially when you centralize log and metric data in one platform.