
SRE

The latest news and information on Site Reliability Engineering and related technologies.

Sponsored Post

Kubernetes Monitoring Best Practices

Kubernetes can be installed using different tools, whether open-source, from a third-party vendor, or in a public cloud. In most cases, default installations have limited monitoring capabilities. Therefore, once a Kubernetes cluster is running, administrators must implement monitoring solutions to meet their requirements. Effective Kubernetes monitoring requires a mix of tools, strategy, and technical expertise. To help you get it right, this article will explore seven essential Kubernetes monitoring best practices in detail.

The Medium is the Message: How to Master the Most Essential Incident Communication Channels

We’ve all seen it: a company experiencing a major incident and going radio silent, leaving its customers to wonder, “Are they doing something about this?!” If you’ve ever been on the inside of something like this, you know the answer is most likely yes: there are people working hard to put out the fire as quickly as possible. But when it comes to incidents, perception is reality for customers.

Looking Beyond Atlassian StatusPage: The 5 Best Alternatives

Status Pages are crucial cogs in your Incident Communication process; they serve as vital channels to keep your stakeholders informed during periods of downtime. Although there are many proficient tools on the market, such as Atlassian Status Page and Status.io, these standalone Status Pages can come with a hefty price tag, with various pricing plans and tiers for both Public and Private Status Pages. Moreover, with Atlassian Cloud’s recent issues, its dependability is in question.

Breaking Down the Pillars of Observability from Data to Outcomes

The world of cloud-native and distributed microservices has revolutionized software development and deployment. However, the sheer volume of data these systems generate can often lead to confusion and uncertainty. You're not alone if you've ever felt lost in the sea of observability data.

Webinar: Embracing Declarative Provisioning and Observability in cloud environments

Organizations face increasingly complex challenges in deploying and managing their systems in today's rapidly evolving technological landscape. Declarative provisioning and observability have emerged as a powerful approach to address these challenges. This talk delves into declarative provisioning and observability, exploring their benefits, principles, and practical implementation strategies.

Introduction to ELK Tech Stack

ELK Stack, also known as the Elastic Stack, is a powerful and versatile open-source toolset that has revolutionized the way businesses manage and analyze their data. ELK Stack seamlessly integrates three robust components (Elasticsearch, Logstash, and Kibana) to offer a comprehensive solution for searching, analyzing, and visualizing large volumes of data in real time. So buckle up for a comprehensive overview of the ELK stack and its components, which will be a great starting point for beginners.

Pinpoint performance issues in downstream services with the Dependency Map Navigator

Visibility into the upstream and downstream dependencies of your services is key to maintaining a performant microservices environment. Application developers and SREs rely on this visibility to quickly trace issues back to the source, which is essential during incidents (when time is of the essence), throughout day-to-day operations, and as systems evolve and scale.