Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Three Pillars of Observability [And Beyond]

Observability is often defined in the context of three pillars: logs, metrics, and traces. Modern-day cloud-native applications are complex and dynamic. To avoid surprises and performance issues, you need a robust observability stack. But is observability limited to collecting logs, metrics, and traces? How is observability evolving to make our systems more observable? This tutorial takes up those questions.

How to Identify DNS Issues: The IT Handbook

In the world of the Internet, where every click, request, and data transfer relies on seamless connectivity, Domain Name System (DNS) issues can be the silent disruptors that bring the entire digital ecosystem to a halt. As organizations and individuals become increasingly dependent on the Internet for their day-to-day operations, understanding and troubleshooting DNS problems have become essential skills for IT professionals.

The concise guide to Grafana Loki: Everything you need to know about labels

Welcome to Part 2 of the “Concise guide to Loki,” a multi-part series where I cover some of the most important topics around our favorite logging database: Grafana Loki. As I reflect on the fifth anniversary of Loki, it felt like a good opportunity to summarize some of the important parts of how it works, how it’s built, how to run it, etc. And as the name of the series suggests, I’m doing it as concisely as I can.

The Three FinOps Phases for MVP Success

Your FinOps foundations are down in your company’s cloud (woohoo!), but what comes next? How can you boost your MVP success in the cloud with your FinOps strategy? In this blog post, we’ll briefly dive into the three phases of your FinOps for top-notch implementation from beginning to end. Need a refresher on setting up an MVP FinOps framework for your cloud? In part 1 of our series, we’ll show you how it’s done!

Five Tips for Monitoring Your Cloud Application

Page load time is inversely related to page views and conversion rates. This is probably not a controversial statement, since the causality is intuitive, and empirical data from industry leaders such as Amazon, Google, and Bing backs it up, as reported in High Scalability and O’Reilly’s Radar, for example. Although web technology has become much more complex over the last decade, performance remains a challenge for user experience.

Scaling Up, One Network Bottleneck at a Time

Processing data at scale involves moving packets through a network—but what happens when that network isn't cooperative? Anatole Beuzon, a Software Engineer at Datadog, discusses how he investigated and resolved network issues in Datadog’s larger data-processing apps and how you can apply these same methods to your own production workloads.

Time Series Data and Real World AI: A Fireside Chat

Recently, InfluxData CEO Evan Kaplan sat down with Developer Advocate Jay Clifford to discuss time series data and AI in industry, how that role is evolving, and, specifically, where time series data fits in AI. They also discussed the future of InfluxDB in terms of real-time analytics and its role in the AI landscape.