The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Does Every Incident Need a Retrospective? Here's What the Experts Have to Say

Every quarter, we host a roundtable discussion centered around the challenges encountered by incident responders at the world’s leading organizations. These discussions are lightly facilitated and vendor-agnostic, with a carefully curated group of experts. Everyone brings their own unique perspective and experience to the group as we dive deep into the real-world challenges incident responders are facing today.

Unleashing the power of AI and automation for effective Cloud Cost Optimization in 2024

In the current dynamic business environment, cloud computing has emerged as the fundamental driver of innovation and scalability. As companies increasingly rely on the cloud for their business initiatives, achieving cloud cost optimization remains a significant hurdle.

Collecting OpenShift container logs using Red Hat's OpenShift Logging Operator

This blog explores a possible approach to collecting and formatting OpenShift Container Platform logs and audit logs with Red Hat OpenShift Logging Operator. We recommend using Elastic® Agent for the best possible experience! We will also show how to format the logs to Elastic Common Schema (ECS) so they are easier to view, search, and visualize. All examples in this blog are based on OpenShift 4.14.

Q&A: What IT Automation Best Practices Should You Know Right Now? - Part 2

With a limitless load of questions on IT automation and the industry’s biggest trends, Resolve’s “Ask Me Anything (AMA)” session went about tackling them in an all-new way. We threw out the preparation, we threw out the scripts, and we asked our community to submit the questions that matter most to them and their organizations. Part of our leadership team took the hot seat and provided answers in real time, sans dress rehearsal.

Supercharged with AI

One of the most painful parts of incident management is keeping on top of the many things that happen when you’re right in the middle of an incident. From figuring out and communicating what’s happening, to ensuring you learn from previous incidents, and even capturing the right actions – incidents are hard, but they don’t need to be this hard.

GitHub Pull Request Management with GitKraken Client

Let’s dive into the world of pull requests (PRs). They’re the bridges connecting your hard work to the bigger project, facilitating code review, collaboration, and more. But why are they so crucial, and how can tools like GitKraken Client and GitHub take their management to the next level? Keep reading to explore the unique features of both platforms, plus time-saving tips for efficient PR management.

EKS Add-ons And Integrations: Evaluating Cost Impacts

Amazon Elastic Kubernetes Service (EKS) has rapidly become the de facto solution for organizations seeking to deploy, manage, and scale containerized applications using Kubernetes. EKS simplifies the complexities associated with Kubernetes, allowing teams to focus on developing and deploying applications more efficiently. However, as organizations scale their Kubernetes environments, managing and optimizing costs can quickly become a significant concern.

Easily Monitor URL and IP Availability Using Telegraf with Ping

Monitoring your domain URLs and server IPs is important for many reasons and plays a crucial role in ensuring the health, performance, and security of a network or web application. Monitoring hosted IPs within your infrastructure helps track the availability and uptime of websites and services. It also allows organizations to identify and respond quickly to downtime or outages, minimizing the impact on users.