Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

PTO peace of mind: Sync Grafana OnCall with Google Calendar out-of-office events

Sometimes, the little things can make a big difference. We’ve added a new feature in Grafana Incident & Response Management (IRM) that lets you sync your Google Calendar out-of-office events with Grafana OnCall.

Insights of an Observability Advocate: The Challenges and Rewards

At a recent SRE Meetup in Bangalore, we had the pleasure of meeting Akshay Deshpande. During our conversation, Akshay, who manages a Performance/Observability Engineering team at Smarsh discussed his passion for observability and his constant drive to improve the field. Smarsh helps companies gain valuable insights from their communication data, enabling them to proactively identify potential regulatory and reputational risks before they escalate.
Sponsored Post

Comparing the Top 5 On-Call Management Software Solutions in 2024

SRE and DevOps teams are the backbone of system uptime and reliability. But managing On-Call schedules, alerts, and communication during incidents can quickly turn resolution efforts into burnout. This blog explores the top On-Call management tools in 2024, designed to streamline Incident Response and keep your team ready for action.

A Day in Life of DevOps Engineer

Let me tell you, the life of a DevOps engineer is anything but boring. It's a constant pull between automation, collaboration, and troubleshooting, all with a healthy dose of caffeine thrown in for good measure. One day you might be scripting a deployment pipeline, the next you’re diving into server logs to diagnose a critical error. It's a role that demands versatility, a problem-solving mindset, and a learner’s excitement.

The rising costs of downtime

IT outages are a financial nightmare. Beyond revenue impact, unplanned downtime translates to lost productivity, frustrated customers, and potential reputation damage. To understand the true impact of these events, Enterprise Management Associates (EMA) conducted a comprehensive study with more than 400 IT professionals from varying company sizes and roles in North America, EMEA, and APAC regions.

Igniting Innovation: The Power of Empowered Engineers

In the fast-paced world of technology, innovation is not just a buzzword—it's a necessity. As organizations strive to stay ahead of the curve and deliver cutting-edge solutions, they must foster a culture that empowers engineers to drive change and lead transformative projects. Throughout my career, I have witnessed firsthand the impact that empowered engineers can have on an organization, and I believe that unlocking their potential is key to achieving long-term success.

Beyond SLAs: Rethinking Service Level Objectives in Incident Response

In the context of IT service management, Service Level Agreements (SLAs) have long been the cornerstone for measuring and ensuring the quality of services provided to customers. However, as technology evolves and incidents become more complex, relying solely on SLAs may not be sufficient. This is where Service Level Objectives (SLOs) come into play, offering a more nuanced approach to Incident Response.

Operational Excellence at the New York Stock Exchange: Our Q&A with NYSE's President

Mitigating the risk of operational failure is top of mind—and a top budget priority—for executives. A single unplanned event can have a disruptive effect across the organization, an outcome management teams work hard to avoid. For the New York Stock Exchange (NYSE), operational resilience is critical given the role it plays in the global economy and capital flows.