The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Introducing Cloud Cost Intelligence for Snowflake

Jan 26, 2021 By Tim Buntel In CloudZero

Here at CloudZero, we work with some of the top software-driven companies out there. Like us, they’re building their products on Amazon Web Services (AWS), along with whatever best-of-breed providers meet their needs. It’s no secret that in recent years, Snowflake has seen — well, some serious success. For many companies, including CloudZero, they're the data warehouse provider of choice — and an essential component of delivering their products.

How to monitor your AWS servers via MetricFire

Jan 26, 2021 By Lindsey Rogerson In MetricFire

In this article we explore the basics of monitoring Amazon Web Services (AWS) by feeding metrics to Grafana through Hosted Graphite’s agent and also through Hosted Graphite’s AWS add-on. This will allow us to monitor metrics from applications and servers hosted in AWS with clarity and depth. This article assumes you have created a Hosted Graphite account.

What Is Interconnection And Why Is It So Important To Enterprises?

Jan 25, 2021 By Alex Hawkes In Console Connect

Enterprise network connectivity has evolved in line with changing business needs over the last few decades, and as we saw with the sudden shift to remote working in 2020, the evolution cycle is speeding up in response to environmental change. This makes interconnection more important than ever to the modern enterprise.

Tyler Wells on building a culture of reliability at Twilio

Jan 25, 2021 By Andre Newman In Gremlin

What does reliability look like at a company that has thousands of employees and provides critical communication services to over 150,000 customers? We talked with Tyler Wells, Senior Director of Engineering at Twilio, to learn how he and his team created a culture of reliability at Twilio. He talked in depth about his experiences developing reliability goals, building reliability practices, and aligning engineering teams on these objectives.

Achieving the Observability Imperative Requires AI

Jan 25, 2021 By Will Cappelli In Moogsoft

The shift to Observability: Over the last six months, unified monitoring, log management, and event management vendors have reoriented their technology portfolios (often without any change to the underlying functionality) towards Observability. In doing so, they have generated a fair amount of confusion in the market.

The Future of Kubernetes on DevOps Radio

Jan 25, 2021 By Jim Shilts In Shipa

In this episode of DevOps Radio, Shipa’s CEO and Founder Bruno Andrade joins host Brian Dawson to discuss his thoughts on the future of Kubernetes. DevOps Radio is a CloudBees-sponsored podcast series. Hosting experts from around the industry, the show dives into what it takes to successfully develop, deliver and deploy software in today’s ever-changing business environment. From DevOps to Docker, each episode features real-world insights and a few stories, tips, industry scoop and more.

How to build your own incident management process

Jan 25, 2021 By Eyal Katz In Exigence

IT incident management is a fundamental operational process designed to ensure rapid service restoration. This process is typically assigned to the help desk but is also very much entrenched in the day-to-day of DevOps. When incident management goes right, service is restored quickly and the impact on productivity, continuity, and customer satisfaction is minimal.

7 Tips On Building And Maintaining An SRE Team In Your Company

Jan 22, 2021 By Squadcast Community In Squadcast

In today's "always on" world, Reliability is a primary business KPI. Plant the culture of Reliability by implementing these 7 simple tips to build a solid SRE team in your organization. Many of today’s hottest jobs didn’t exist at the turn of the millennium. Social media managers, data scientists, and growth hackers were unheard of back then. Another relatively new role in demand is that of a Site Reliability Engineer, or SRE.

Take the first step toward SRE with Cloud Operations Sandbox

Jan 22, 2021 By Simon Zeltser In Google Operations

At Google Cloud, we strive to bring Site Reliability Engineering (SRE) culture to our customers not only through training on organizational best practices, but also with the tools you need to run successful cloud services. Part and parcel of that is comprehensive observability tooling—logging, monitoring, tracing, profiling and debugging—which can help you troubleshoot production issues faster, increase release velocity and improve service reliability.

Level Up 2020 Highlights

Jan 22, 2021 By LogicMonitor In LogicMonitor

Hear from LogicMonitor leadership on some of the biggest announcements and additions to the LM product suite in 2020. We released an array of features that allow IT and DevOps teams to have full visibility into every corner of their infrastructure, and with the addition of LM Logs we're on a mission to provide an extensible, fully unified observability platform.
