Advancing IT Operations Maturity for Digital Transformation

This is part two in our blog series on Digital Transformation during COVID-19. You can read the first blog here. The global pandemic and subsequent lockdowns, which affect over half the world’s population, are already shaping digital transformation at both the organizational and the employee level. Improving your internal customer experience is a good first step, after which digital transformation efforts can and should focus on enhancing the external customer experience.

Early Detection of Potential Sources of Failure in Serverless

We recently wrote about why serverless applications fail and how to design resilient architectures. Being able to detect early-stage failure indicators can be invaluable. With proper monitoring, developers can stop waiting for the system to crash and instead take a proactive approach to resource allocation and architecture design, avoiding bottlenecks and performance degradation.

The First and Last Conference of the Year

I was excited to attend DevOpsDays in New York City in March of 2020, but then again, who wouldn’t be? A whole week in the Big Apple with Liz Fong and Christine Yen, yes, please! I joined Honeycomb as a product designer in January of 2020, making this my first event as a Honeycomb employee. In addition to meeting our users, it was a chance for me to talk with people just starting their observability journey. As a product designer, my focus is on improving the overall user experience.

The role of shift-left testing in an agile environment

With the rapid growth of security threats to infrastructure, it’s more important than ever to proactively address vulnerabilities. As an open-source project built on the trust of users and contributors, Netdata has security concerns at its core. Because we’re committed to code security and quality, we apply Agile principles throughout the software development process. One component of this is regular static analysis.

IT and DevOps Resources for COVID-19

We’re all wrestling with less-than-ideal circumstances during the COVID-19 pandemic. Whether you’re sheltering in place or simply practicing social distancing, it’s safe to say we’re all adjusting to a temporary new normal. One commonality is the need for connectivity. If infrastructure fails, business will screech to a halt and we will find ourselves in a new kind of mess altogether.

Remote Monitoring Third Party Status Pages

The debate around allowing employees to work from home is now moot. In these unusual times, businesses must be able to handle the majority of their primary functions remotely. The implications are broad in scope and have IT shops scrambling to work out how to monitor the applications that enable efficient work-from-home strategies.

Monitoring Azure Backup and Replication Jobs

We all know that systems fail. We plan for this with failover partners and system backups. But can you really trust your backups? If you are using Azure, monitoring your backup and site recovery can be complicated. LogicMonitor provides clarity. Our Azure Backup monitoring service provides simple, secure, and cost-effective solutions for backing up and recovering your data using the Azure cloud.

How histograms changed the game for monitoring time series with Prometheus

Histograms are one of my favorite topics in the Prometheus universe. Last November, I delivered a talk at PromCon EU 2019 titled Prometheus Histograms – Past, Present, and Future. However, the part about the past had to be cut due to time constraints. I promised to resurrect that part of the talk, the history of histograms, and I kept my word. In February, I premiered the Secret History of Prometheus Histograms at FOSDEM 2020.