Operations | Monitoring | ITSM | DevOps | Cloud

How Delivery Hero uses Kubecost and Datadog to manage Kubernetes costs in the cloud

As the world’s leading local delivery platform, Delivery Hero brings groceries and household goods to customers in more than 70 countries. Their technology stack comprises over 200 services across 20 Kubernetes clusters running on Amazon EKS. This cloud-based, containerized infrastructure enabled them to scale their operation to support increasing demand as the volume of orders placed on their platform doubled during the pandemic.

Troubleshoot blocking queries with Datadog Database Monitoring

Blocked queries are one of the key issues faced by database analysts, engineers, and anyone managing database performance at scale. Blocking can be caused by inefficient query or database design as well as resource saturation, and can lead to increased latency, errors, and user frustration. Pinpointing root blockers—the underlying problematic queries that set off cascading locks on database resources—is key to troubleshooting and remediating database performance issues.

Webinar Highlight: Introducing InfluxDB's New Time Series Database Engine

As part of the launch of InfluxDB Cloud, powered by IOx, Paul Dix and Balaji Palani provided an InfluxDB Cloud overview and demo. In case you missed it, this blog is a quick 5-minute read summarizing the webinar. We shared the recording and the slides from the presentation for everyone to review and watch at their leisure.

Anomaly detection on Prometheus metrics

We have recently extended the native machine learning (ML) based anomaly detection capabilities of Netdata to support all metrics, regardless of their collection frequency (update every). Previously, only metrics collected every second were supported, but now Netdata can run anomaly detection out of the box, with zero config, on metrics with any collection frequency.

Public Dashboards, Incident Management, and Our New Analytics API

Late last year we announced improvements to our public dashboards, including a revamped design that lets users see monitoring data in a more easily digestible way on any device. We improved performance across the board and introduced new incident management functionality (available for paid plans only) that lets users more easily communicate scheduled maintenance notices and alert developers to minor and major incidents.

Log Analytics 2023 Guide

As enterprise networks grow larger and more complex, IT teams are increasingly dependent on the enhanced network visibility and monitoring capabilities provided by log analytics solutions. Log analytics gives enterprise Engineering, DevOps, and SecOps teams the ability to efficiently troubleshoot cloud services and infrastructure, monitor the security posture of enterprise IT assets, and measure application performance throughout the application lifecycle or DevOps release pipeline.

6 ways to supercharge mobile app performance

With over 7 billion mobile users worldwide, there’s almost one device for every person on the planet. Not surprisingly, the most popular apps are dominated by social media, messaging and entertainment platforms. But consumers are also shopping and managing finances via mobile devices. And while most users are accustomed to waiting a few seconds for a web application response, mobile users are less forgiving and expect an instant reaction to their swipes and taps.

Enabling TLS on a Cribl Leader Node: Step-by-Step Guide

Securing your internal systems with TLS can be a daunting task, even for experienced administrators. However, with the right tools and guidance, the process can be made more manageable. In this blog, we’ll show you how to enable TLS for your internal systems on your Cribl Leader Node. We’ll walk you through the steps, and provide a video tutorial embedded below to help you follow along.

What is Network Performance Monitoring: The Gandalf of Networks

In today's digital age, network performance monitoring has become a crucial aspect of IT infrastructure management. The ability to monitor and manage network performance is essential for ensuring that networks run efficiently and effectively. Just as Gandalf served as a wise and trusted advisor to the Fellowship of the Ring in J.R.R.