The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

What Is Distributed Tracing

Systems and applications alike have become progressively distributed as microservices, open-source tools, and containerisation have gained traction. To actively monitor and respond quickly to issues that arise in our environment, distributed tracing has proven to be vital for businesses such as Uber, Postmates, HelloFresh and TransferWise. It is, however, important to clarify what distributed tracing actually means.

Why AIOps may be necessary for the future of engineering

Machine learning has crossed the chasm. In 2020, McKinsey found that out of 2,395 companies surveyed, 50% had an ongoing investment in machine learning. By 2030, machine learning is predicted to deliver around $13 trillion in economic value. Before long, a good understanding of machine learning (ML) will be a central requirement in any technical strategy. The question is: what role is artificial intelligence (AI) going to play in engineering?

Monitoring Rails applications with Datadog

Rails is a Ruby framework for developing web applications. It favors the Model-View-Controller (MVC) architecture and includes generators that create the files needed for each MVC component. Rails applications consist of a database, an application server for running application code, and a web server for processing requests. Rails provides multiple integrations for its supporting database (e.g., MySQL and PostgreSQL) and web server (e.g., Apache and NGINX).

Autoscale your Kubernetes workloads with any Datadog metric

Editor’s note: This post was updated on August 9, 2022, to include a demonstration of how to enable highly available support for HPA. It was also updated on November 12, 2020, to include a demonstration of how to autoscale Kubernetes workloads based on custom Datadog queries using the new DatadogMetric CRD.

Deutsche Bergbau-Museum Bochum

The Deutsche Bergbau-Museum Bochum (DBM), or the German Mining Museum, is one of the premier destinations for those interested in the history of mining. Over its 100+ years of operation, the DBM has evolved its exhibits to use multimedia players and other digital devices connected to its network. However, the museum’s three-person IT staff faced a series of problems, including a lack of insight into where network outages occurred and no alert notifications. After working alongside P&W Netzwerk, DBM deployed Progress WhatsUp Gold to manage the broad network environment cost-effectively.

What is Infrastructure as Code?

Cloud services were born in the early 2000s, with companies such as Salesforce and Amazon paving the way. Simple Queue Service (SQS) was the first service to be launched by Amazon Web Services, in November 2004. It was offered as a distributed queuing service and is still one of the most popular services in AWS. By 2006, more and more services had been added to the offering.

Tensu: An Open Source Text UI for Sensu Go

A Two Sigma engineer explains why we built Tensu, an open source TUI (text user interface)-based program for interacting with Sensu Go’s observability pipeline and backend API. In this article we will be putting a spotlight on Tensu, an open source terminal-based dashboard for interacting with and responding to events from the Sensu Go observability pipeline and backend API.

Lessons Learned From Building a Company and Raising Kids

When I had my first child almost six years ago, I expected that most of my time would be spent in the role of a teacher rather than a student. I have two kids now, and I’m certainly teaching them as much as I can as they grow and learn to navigate the world, but if someone were keeping score, my kids might end up on top when it comes to who’s taught whom more. Another thing that surprised me is how similar building a family is to building a company from the ground up.

Improving DevOps Performance with DORA Metrics

Everyone in the software industry is in a race to become more agile. We all want to improve the performance of our software development lifecycle (SDLC). But how do you actually do that? If you want to improve your performance, first determine which KPI you’d like to improve. DORA metrics offer a good set of KPIs to track and improve. They started as a research effort by DevOps Research and Assessment (DORA) and Google Cloud (which later acquired DORA) to understand what makes high-performing teams.