
Streamlining Incident Management with our latest feature update: Merge Incidents

Hey folks! We're back with another nifty feature for your Incident Management tool arsenal. You can now merge incidents with a few clicks! With this latest update, you can reduce the noise while dealing with a complex incident by merging incidents across services under a parent incident. This is typically useful when multiple incidents stem from the same underlying issue or root cause.

Top tips: 5 ways to enhance your knowledge in AI

Top tips is a weekly column where we highlight what’s trending in the tech world today and list out ways to explore these trends. This week we’re looking at five ways you can build upon the basics and start incorporating AI in your everyday life. AI technology is now utilized in some form by almost 77% of devices. Nearly every industry has incorporated, or is trying to incorporate, AI in some way or another.

Checkly Expands Monitoring Capabilities with Introduction of Heartbeat Checks

Checkly, the leading Monitoring as Code provider, expanded its platform's monitoring capabilities with the introduction of Heartbeat Checks, also known as CRON monitoring or dead man's switches. Also introduced today, Smart Retries is designed to reduce alert fatigue.

When generative AI helpdesks take control, will humans make the cut?

Let’s take a nostalgic trip down memory lane of traditional IT helpdesks. It’s reminiscent of waiting in long lines only to be told you’re in the wrong line, or being serenaded by never-ending loops of elevator music. But, in a world where businesses are now laser-focused on customer success, these methods are antiquated and impractical. Don’t believe me? Ask any customer success manager.

Comparing Datadog and New Relic's support for OpenTelemetry data

OpenTelemetry is the future of Observability, APM, Monitoring, whatever you want to call ‘the process of knowing what our software is doing.’ It’s becoming common knowledge that your time is better spent gaining experience with an open, standardized system for telemetry than with a closed-source or otherwise proprietary standard. This truth is so universally acknowledged that all the big players in the market have made announcements of how they’re embracing OpenTelemetry.

Can Your Racks Support NVIDIA DGX H100 Systems?

AI is booming. The AI market is projected to grow 37.3% annually from 2023 to 2030. With so many organizations adopting or considering AI applications, data centers need to be ready to support the new demand. However, without the right tools and data, it is difficult to understand if your existing facilities have the capacity to support systems like the “gold standard for AI infrastructure,” the NVIDIA DGX H100.

Kubernetes Logging with Filebeat and Elasticsearch Part 2

In this tutorial, we will learn how to configure Filebeat to run as a DaemonSet in our Kubernetes cluster in order to ship logs to the Elasticsearch backend. We are using Filebeat instead of FluentD or FluentBit because it is an extremely lightweight utility and has first-class support for Kubernetes, which makes it well suited to production-level setups. This blog post is the second in a two-part series. The first post runs through the deployment architecture for the nodes and the deployment of Kibana and ES-HQ.

Kubernetes Logging with Filebeat and Elasticsearch Part 1

This is the first post of a two-part series where we will set up production-grade Kubernetes logging for applications deployed in the cluster and the cluster itself. We will be using Elasticsearch as the logging backend for this. The Elasticsearch setup will be extremely scalable and fault-tolerant.