
DevOps

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Achieving SLO Success with Golden Signals and Reliability Testing

The four Golden Signals are an easy and effective way to measure the most important aspects of a system. Paired with a reliability management platform like Gremlin, they help you proactively meet your SLOs, so you can fulfill your legal obligations and deliver the perfect customer experience.
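As a rough illustration of what the four Golden Signals (latency, traffic, errors, and saturation) measure, here is a minimal Python sketch that derives them from a window of request records and checks them against an SLO. The `Request` type, the function names, the use of CPU as the saturation resource, and the thresholds are all assumptions made for this example; none of them come from Gremlin or the article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """Hypothetical request record used only for this illustration."""
    latency_ms: float
    ok: bool


def golden_signals(requests: List[Request], window_seconds: float,
                   cpu_used: float, cpu_capacity: float) -> Dict[str, float]:
    """Compute the four Golden Signals for one observation window."""
    if not requests:
        raise ValueError("need at least one request in the window")
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = (95 * len(latencies) + 99) // 100
    errors = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": latencies[rank - 1],          # latency
        "traffic_rps": len(requests) / window_seconds,  # traffic
        "error_rate": errors / len(requests),           # errors
        "saturation": cpu_used / cpu_capacity,          # saturation
    }


def meets_slo(signals: Dict[str, float], max_p95_ms: float,
              max_error_rate: float) -> bool:
    """Return True when latency and error rate are within the SLO targets."""
    return (signals["latency_p95_ms"] <= max_p95_ms
            and signals["error_rate"] <= max_error_rate)
```

A monitoring system would normally compute these values for you; the sketch only shows how the raw numbers relate to an SLO check.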

AI at the Peak of Inflated Expectations? A Reality Check

The AI hype is undeniable. Buzzwords like ‘machine learning’, ‘deep learning’, and ‘artificial intelligence’ have permeated boardrooms, media, and tech conferences. However, recent market movements suggest that AI might be at the ‘peak of inflated expectations’. Nvidia, a leading player in AI hardware, has seen its stock plummet by about 20% over the last month (8th July to 8th August 2024).

On-Call Rotations and Schedules: A Guide for 2024

In an increasingly connected world where businesses operate around the clock, the importance of having an effective on-call system cannot be stressed enough. With technological advances and the expectation of immediate attention to business-critical issues, creating a reliable on-call rotation and schedule is essential for ensuring operational continuity. This comprehensive guide will walk you through the various aspects of on-call rotations and schedules that you need to consider for 2024.
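To make the idea of a rotation concrete, the sketch below builds a simple weekly round-robin schedule in Python. It is only one possible approach, and the function name, the engineer names, and the weekly handover interval are assumptions for illustration, not recommendations taken from the guide.

```python
from datetime import date, timedelta
from typing import List, Tuple


def build_rotation(engineers: List[str], start: date,
                   weeks: int) -> List[Tuple[date, str]]:
    """Assign one engineer per week, cycling through the list in order."""
    if not engineers:
        raise ValueError("need at least one engineer")
    return [
        (start + timedelta(weeks=i), engineers[i % len(engineers)])
        for i in range(weeks)
    ]


if __name__ == "__main__":
    # Hypothetical three-person team, starting Monday 1 January 2024.
    for shift_start, engineer in build_rotation(
            ["ana", "ben", "chi"], date(2024, 1, 1), 4):
        print(shift_start.isoformat(), engineer)
```

Real schedules usually also need overrides for holidays, time zones, and escalation tiers, which this sketch leaves out.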

How to Import Existing ilert Resources into Terraform

Welcome to our detailed guide to bringing your current ilert incident management configuration into Terraform. Here you will find a step-by-step tutorial for importing your existing ilert resources into your Infrastructure as Code project, along with recommendations from our engineering team on best practices for maintaining consistency across your infrastructure and incident management processes.

Container Monitoring Demo

Datadog Container Monitoring gives you real-time, end-to-end visibility into the health, security, and resource usage of your containerized environments. In this demo, we’ll show you how Datadog measures container health alongside security posture and resource utilization, offering end-to-end monitoring and optimization for your container ecosystem.

The high stakes of SDLC compliance: Lessons from EVE Online's battle of B-R5RB and Equifax

In our previous exploration of The Punchcard Paradigm, we traced the roots of modern compliance practices back to the early days of computing. We saw how the physical constraints of punchcards shaped programming practices and how those practices lingered long after the technology had evolved. Now, let’s dive deeper into why modern compliance is more critical than ever in today’s digital landscape.

The Essential Guide to Cloud Migration Planning

Cloud migration is the process of moving your infrastructure, applications, and data to the cloud. While the potential benefits, like cost savings, scalability, and modernization, are significant, the risks are just as high if you don’t have a solid plan in place. Without clear strategies, you could face unexpected costs, technical challenges, and security issues.

Kubernetes Cost Optimization: 9+ Ways To Lower Costs in 2024

If cost optimization is your only reason for adopting Kubernetes and containers, you might be in for a rude surprise: many companies find that costs increase after moving to Kubernetes. Even companies that adopt Kubernetes for other reasons, like time-to-market advantages, should follow basic cost control best practices to stay within budget. Optimizing cloud costs related to running Kubernetes doesn’t have to involve trade-offs in performance or availability.