Operations | Monitoring | ITSM | DevOps | Cloud

The Great Debate of 2023: Single Vendor vs Best of Breed Solutions

The debate between single vendor solutions and best of breed approaches has been ongoing for decades in the technology industry. Engineers have always sought out options and choice, and this has led to a shift in the dominance of large vendors in each stage of technological development. As soon as IBM sold enterprises the mainframe solution, engineers started to look for other options.

A beginner's guide to Kubernetes application monitoring

Application performance monitoring (APM) involves a mix of tools and practices to track specific performance metrics. Engineers use APM to monitor and maintain the health of their applications and ensure a better user experience. This is crucial to high quality architecture, development, and operations, but it can be difficult to achieve in Kubernetes since the container orchestration system doesn’t provide an easy way to monitor application data like it does for other cluster components.

Kubernetes network monitoring: What is it, and why do you need it?

In this article, we will dive into Kubernetes network monitoring and metrics, examining these concepts in detail and exploring how metrics in an application can be transformed into tangible, human-readable reports. The article will also include a step-by-step tutorial on how to enable Calico’s integration with Prometheus, a free and open-source CNCF project created for monitoring the cloud.

Outages Happen. Now What?

Network outages happen more often than you think. We may not experience them directly or even know they're occurring at all. When outages affect household names like Facebook, Amazon, Microsoft, and others, however, we're sure to find out after the fact that there was an issue. Depending on the user's activities and the duration of the issue, stress and frustration levels can vary. When a marketer can’t get that ground-breaking advertisement up on Facebook, they can get antsy.

Webinar Recap: How Observability Impacts SRE, Development, and Security Teams

In today’s fast paced and constantly evolving digital landscape, observability has become a critical component of effective software development. Companies are relying more on and using machine and telemetry data to fix customer problems, refine software and applications, and enhance security. However, while more data has empowered teams with more insights, the value derived from that data isn’t keeping pace with this growth. So how can these teams derive more value from telemetry data?

GitHub Tried to Change the Checksum for Release Archives. You Should Start Hosting Your Own.

Yesterday, GitHub changed how the archives they provided are made. The result of this change surprised developers, triggering pipeline failures all over the world in most ecosystems. According to this GitHub post, this is a consequence of recent changes to Git itself, released almost six months ago and just deployed within GitHub now with unforeseen impact. This change has thankfully been retracted.

Datadog's commitment to OpenTelemetry and the open source community

The OpenTelemetry (OTel) project is an open source initiative with the goal of providing vendor-neutral standards and tools that enable users to collect telemetry from any source in their environment and send it to any backend. A core tenet of Datadog is to provide a single, unified platform for customers to easily collect and monitor all of their observability data, regardless of where it comes from.

Test Observability with Sumo Logic

The software industry has seen many evolutions. There is a new disruption in the market every five years or so. Software testing cannot remain isolated from all the latest trends and technologies. Testing strategies need to keep up with agile development, faster deployments and increasing customer demand for reliability and user-friendly interfacing. They need to be able to grow just as quickly and just as reliably as the business logic.