The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Delivering a Successful Microsoft Teams User Experience - Microsoft MVP, Nick Cavalancia

Think delivering Microsoft Teams service to your users is simple? Microsoft MVP Nick Cavalancia lists all of the factors at play in delivering a successful Teams user experience. Spoiler alert: most of them have nothing to do with Microsoft.

Using Playwright and Checkly to create an ecommerce synthetic monitoring check (no audio)

This short video shows how to create a synthetic monitoring check using Playwright and Checkly. Note: If you’d like to follow along with the steps in the video, make sure that you have Playwright installed first.

How IT Departments Can Manage the Infrastructure to Boost Microsoft Teams Call Quality

The person you are speaking to can hear only a sort of robot voice, and the dreaded pixelation of your face has descended on an important video call with your team or, much worse, with a VIP client. Does panic set in? If your IT team doesn’t have a good way to boost Teams call quality, then yes. Inconsistent Teams call quality is a problem plaguing businesses in every sector, and it’s an issue that can have a real negative impact on an organization.

Scaling Ingest With Ingest Telemetry

With the introduction of Environments & Services, we’ve seen a dramatic increase in the creation of new datasets. These new datasets are smaller than ones created with Honeycomb Classic, where customers would typically place all of their services under a single, large dataset. This change has presented some interesting scaling challenges, which I’ll detail in this post, along with the solution we used, and how we leveraged Honeycomb’s own telemetry to scale Honeycomb.

Is Your Ecommerce Site Ready for Black Friday and Cyber Monday?

The holiday shopping season is one of the most stressful periods for operators of retail and ecommerce businesses, as the seasonal surge of holiday shoppers can put massive amounts of stress and strain on even the most well-architected websites. Here’s a recent example from 2021: The Office Depot website suffered an outage during Cyber Monday that knocked the online shop offline for hours, impacting the ability of customers to place orders online.

Installing the HG Heroku Monitoring & Dashboards Add-on

HG, or Hosted Graphite, provides a complete infrastructure and application monitoring platform built from a suite of open-source monitoring tools. Depending on the setup, you can choose Hosted Graphite as your data source and view all required metrics on Grafana dashboards in real time. Hosted Graphite offers a wide range of tools, add-ons, and plugins that make it possible to measure, analyze, and visualize large amounts of data about your applications with ease.

How to centralize thousands of data sources with Grafana: Inside Adform's observability system

Over the course of two decades, Adform grew from a dream between friends huddled in a basement to a leading advertising tech platform powering more than 25,000 clients worldwide. Success brought external accolades, but it also created the need for internal innovation to support the company’s continued growth. In 2018, Adform was still operating in startup mode, which meant developers and teams cherry-picked the tools that worked best for them.

Democratizing Observability

DevOps principles have helped many organizations improve cross-team collaboration, which has in turn led to increased reliability and velocity in the development lifecycle. In this session moderated by Jason Yee, we hear from panelists who have applied these same DevOps principles to observability, helping them unlock data-based insights and empower teams to make smarter, more informed decisions.

Architecting for Reliability

As modern systems become increasingly complex, the risk of incidents and outages increases. Old approaches to reliability can sometimes be adapted to novel system designs, but other times new methods need to be invented. In this panel session moderated by Datadog’s Jason Yee, you’ll hear from SRE leaders and systems architects across the industry about how they’re designing and operating systems to achieve greater reliability.