Distributed Tracing with Sentry: How to Find the Root Cause of Errors Across Applications

Implementing Sentry on all your services lets you use distributed tracing to find the root cause of errors. An error that surfaces in a browser or mobile app does not necessarily originate in the frontend or mobile code; it could stem from code in a different project that the app interacts with. Distributed tracing helps developers find the actual cause of hard-to-fix issues.
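To make the idea concrete, here is a minimal sketch of trace propagation, not Sentry's SDK or API. The header name `x-trace-id`, the `TracedError` shape, and the helper functions are all hypothetical. It assumes each service forwards a shared trace ID and that the earliest error in a trace is a reasonable first suspect.

```typescript
// Hypothetical header name for this sketch; real tracing SDKs define their own.
const TRACE_HEADER = "x-trace-id";

// Hypothetical shape of an error event reported by any service.
interface TracedError {
  service: string;
  traceId: string;
  message: string;
  timestamp: number; // milliseconds, illustrative values only
}

type HeaderMap = { [headerName: string]: string | undefined };

// Reuse the caller's trace ID when present; otherwise start a new trace.
function continueTrace(incoming: HeaderMap, newId: () => string): string {
  return incoming[TRACE_HEADER] ?? newId();
}

// Attach the trace ID to the headers of an outgoing request.
function outgoingHeaders(traceId: string): HeaderMap {
  return { [TRACE_HEADER]: traceId };
}

// Heuristic: among errors sharing one trace, the earliest is a likely
// root-cause candidate. filter() copies, so the input array is not reordered.
function earliestInTrace(events: TracedError[], traceId: string): TracedError | undefined {
  return events
    .filter((e) => e.traceId === traceId)
    .sort((a, b) => a.timestamp - b.timestamp)[0];
}

// The frontend starts a trace; the backend continues it from the request headers.
const browserTraceId = continueTrace({}, () => "trace-abc");
const backendTraceId = continueTrace(outgoingHeaders(browserTraceId), () => "trace-unused");

const collected: TracedError[] = [
  { service: "frontend", traceId: browserTraceId, message: "Request failed with status 500", timestamp: 1002 },
  { service: "backend", traceId: backendTraceId, message: "Database connection refused", timestamp: 1001 },
  { service: "backend", traceId: "trace-other", message: "Unrelated error", timestamp: 900 },
];

const suspect = earliestInTrace(collected, browserTraceId);
```

Because both errors carry the same trace ID, the frontend failure can be linked to the backend error that preceded it, while the unrelated backend error is ignored.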

IBM's journey to tens of thousands of production Kubernetes clusters

IBM Cloud has made a massive shift to Kubernetes. From an initial plan for a hosted Kubernetes public cloud offering, it has snowballed to tens of thousands of production Kubernetes clusters running across more than 60 data centers around the globe, hosting 90% of the PaaS and SaaS services offered by IBM Cloud. I spoke with Dan Berg, IBM Distinguished Engineer, to find out more about their journey, what triggered such a significant shift, and what they learned along the way.

Announcing General Availability of PagerDuty's Slack Integration

When PagerDuty’s VP of Product Management Rachel Obstler announced the beta version of our new Slack integration in April in her “Anticipating, Monitoring, and Managing Incidents via Slack” panel at Slack Frontiers, we expected significant interest in the integration among our customers.

Open Source can be a silver bullet, but your application might be a werewolf

I was reminiscing with an old co-worker about an incident that happened at a past job. You know the one: you install a library that makes some task of yours simple, only to discover that the library makes things worse. This incident in particular involved the way images were served out of our Ruby on Rails application, and the library that made it possible to “easily resize before serving” them.

How to collect, customize, and centralize Node.js logs

When you need to troubleshoot an issue in your Node.js application, logs provide information about the severity of the problem, as well as insights into its root cause. You can use logs to capture stack traces and other types of activity, and trace them back to specific session IDs, user IDs, request endpoints, or anything else that will help you efficiently monitor your application.
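As a rough illustration of that idea, the sketch below builds a single JSON log line that carries a severity level, a stack trace, and request context. It is an assumption-laden example rather than any particular logging library's API: `formatLog`, `LogContext`, and the field names are hypothetical.

```typescript
// Hypothetical context fields; the names are illustrative only.
interface LogContext {
  sessionId?: string;
  userId?: string;
  endpoint?: string;
}

type Level = "debug" | "info" | "warn" | "error";

// Build one JSON log line so a log pipeline can filter on any field.
function formatLog(
  level: Level,
  message: string,
  context: LogContext = {},
  error?: Error
): string {
  const entry: Record<string, unknown> = {
    timestamp: new Date().toISOString(),
    level,
    message,
    ...context,
  };
  if (error) {
    // Keep the error name and stack trace alongside the request context.
    entry.errorName = error.name;
    entry.stack = error.stack;
  }
  return JSON.stringify(entry);
}

const logLine = formatLog(
  "error",
  "checkout failed",
  { sessionId: "s-123", userId: "u-42", endpoint: "POST /checkout" },
  new Error("card declined")
);
```

Emitting one JSON object per line keeps each entry machine-parseable, so entries can later be grouped by `sessionId`, `userId`, or `endpoint` in whatever system centralizes the logs.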

Is your website monitoring service using the latest browser version?

Chrome 77 dropped last week, and our 209 checkpoint locations are either in the process of updating or already using the newest version of Chrome to monitor your website. That means your Web Application Monitoring and Full Page Check performance monitors stay in sync with your users on the latest Chrome release. Is your website monitoring service current?

Sean Porter: Scaling Sensu Go

For over eight years, the Sensu community has been using Sensu to monitor their applications and infrastructure at scale. Sensu Go became generally available at the beginning of this year, and was designed to be more portable, easier and faster to deploy, and, most importantly, more scalable than ever before! In this talk, Sensu CTO Sean Porter will share Sensu Go scaling patterns, best practices, and case studies. He’ll also explain our design and architectural choices and talk about our plan to take things even further.