Latest Publications

Chaos Engineering: Finding Failures Before They Become Outages

Learn the basics of Chaos Engineering: discover the tools, tests, and culture needed to create better software and prevent outages and downtime. This whitepaper provides a comprehensive introduction to the discipline of Chaos Engineering, including why it is needed more than ever, how to get started, and best practices for maximizing learning and reducing risk.

How to Implement Chaos Engineering at Your Company

By following this guide, you'll increase your organization's reliability with minimal effort and risk. It covers how to implement Chaos Engineering and Gremlin within your organization. From educating your team on the principles of Chaos Engineering to running automated experiments, it walks through each stage of the adoption process to ensure a smooth and successful rollout.

Chaos Engineering for DynamoDB

Amazon DynamoDB is fast, powerful, and designed for high availability. These are all valuable attributes in a data storage solution, but to perform as advertised, it must be configured thoughtfully. Learn how to use Chaos Engineering to ensure DynamoDB performs the way you expect. Amazon DynamoDB is one of the most popular NoSQL databases and is the data store of choice for many teams running production workloads in AWS.

Tackle Application Modernization in Days and Weeks, Not Months and Years

Why You Need Application Modernization: Most developers work on existing applications (apps): products and services that have been built, maintained, and updated over long periods of time. Normally, these apps exist as a web of tightly coupled, sparsely documented systems.

How to Achieve AWS, Azure, or GCP Observability at Scale

Multi-cloud adoption is on the rise among enterprises. However, major cloud providers such as AWS, Azure, GCP, and VMware Cloud on AWS differ from one another, and monitoring across diverse cloud environments is not easy. Learn which metrics you, as a DevOps or SRE engineer, should observe on each of the major cloud providers. Also learn why Tanzu Observability by Wavefront is essential for unified, full-stack, multi-cloud observability and analytics.

Three Things Every SharePoint Administrator Should Monitor

Microsoft SharePoint is one of the most business-critical services in enterprise IT. It acts as a central warehouse that hosts large volumes of business data, stored primarily for internal collaboration among employees. It lets users access this data from anywhere, at any time, and from any device. As a result, any performance hiccup in a SharePoint environment can have major repercussions for an organization's business continuity, so it is crucial to monitor SharePoint environments to keep workflows running smoothly.

Observability with AIOps For Dummies

This new eBook explains how DevOps and SRE teams can develop more and operate less by applying AI to events, metrics, traces, and logs to keep CI/CD agile and your business growing. DevOps and SRE teams build critical digital services, but often spend more time troubleshooting their complex applications and infrastructure than innovating. What's the solution? Combine AIOps algorithmic analysis and automation with observability's detailed operational data.