Operations | Monitoring | ITSM | DevOps | Cloud

Blog

Microsoft 365 Outage on June 5th, EX571516, MO571683

On Monday morning, June 5th, there was a wide-scale outage for Microsoft 365. Interestingly, Microsoft first reported this one with a barrage of duplicate health status emails (why, we have no idea), but the issue was much more widespread than those emails suggested: it affected most Microsoft Office 365 services. The first incident was EX571516: Some users are unable to access Outlook on the web, and may experience issues with other Exchange Online services.

Customer-Centric Observability: Experiences, Not Just Metrics

Martin and Jess recently conversed with Todd Gardner of RequestMetrics as part of the O11ycast podcast. We don’t normally write blogs based on these conversations, but there were impactful comments in that episode that bear repeating. You can listen to the full conversation if you wish. Let’s get into it!

What Is Root Cause Analysis?

Root Cause Analysis (RCA) is a systematic process designed to uncover the fundamental, underlying issues that lead to IT incidents. These 'root causes' are often masked by surface-level symptoms, making them challenging to identify without a systematic approach. Root Cause Analysis serves as a metaphorical excavation, drilling past the initial problems to discover deeper, hidden issues.

Seamless Transition to Kubernetes: Ninetailed's Path to Production with Qovery

In today's fast-paced digital world, companies are always looking for ways to optimize their deployment processes. This is especially true for Ninetailed, a leading composable personalization and experimentation solution for digital teams, as they aim to provide exceptional experiences for their customers. To achieve their goals, Ninetailed embarked on a journey to enhance their DevOps practices and scale their infrastructure effectively.

Office 365 Monitoring: The Challenges, and What to Do About Them

Office 365 is used by more than one million companies around the world. Business employees count on these apps constantly to do their jobs, whether they’re writing documents, updating spreadsheets, building slides, or checking email. While cloud-based apps like Office 365 offer undeniable advantages for enterprises and business users, they also create tough challenges for IT operations and network operations (NetOps) teams.

How To Enable QUIC Load Balancing on HAProxy

HTTP/3 is the latest generation of the HTTP protocol, and its ability to serve applications over QUIC offers several advantages for user experience, including reduced latency, improved reliability, and faster page loading as a result of fixing the head-of-line blocking issue in previous versions of HTTP. Both HAProxy and HAProxy Enterprise offer support for using HTTP/3 over QUIC, although the steps for enabling QUIC in HAProxy and HAProxy Enterprise are different.

ITIL Service Strategy Definition, Processes, and Implementation

ITIL service strategy is the first stage of the service lifecycle, and it paves the way for the four stages that follow. This stage contains the guidelines organizations need to set out a solid strategy for their IT services, position them appropriately in the service portfolio, and ensure they add value from both a financial and an experience perspective. This article defines service strategy, explains its purpose, and examines in detail the five processes that comprise it.

Monitoring your Nextjs application using OpenTelemetry

Nextjs is a production-ready React framework for building single-page web applications. It enables you to build fast and user-friendly static websites, as well as web applications using Reactjs. Using OpenTelemetry Nextjs libraries, you can set up end-to-end tracing for your Nextjs applications. Nextjs has its own monitoring feature, but it is limited to metrics such as core web vitals and real-time analytics of the application.

6 Key Factors to Consider When Choosing a Website Platform

Choosing the right website platform is an important decision for anyone looking to establish a solid online presence. Choosing the wrong one has exposed brands to issues like security breaches, poor mobile responsiveness, and terrible load speeds. To support the last point, Google research found that the probability of a visitor bouncing rises by 32% as page load time goes from one second to three seconds. In other words, visitors want a good user experience.

Understanding the network edge and edge networking

Generally speaking, a network ‘edge’ is the boundary between two separate networks: the point where one network ends and another begins. Edges are important, primarily from a security standpoint, because they define the jurisdictions owned by different parties. ‘The edge’ has also become a more popular topic of conversation recently, driven by trends that move network resources from centralised locations out to network edges so that they sit closer to the end user.