The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Putting Developers First: The Core Pillars of Dynamic Observability

Organizations today must embrace a modern observability approach to develop user-centric and reliable software. This isn’t just about tools; it’s about processes, mentality, and having developers actively involved throughout the software development lifecycle up to production release. In recent years, the concept of observability has gained prominence in the world of software development and operations.

Can you have a career in Node without knowing Observability?

“Isn’t Observability something for Ops to worry about?” I’ve heard this response more than once when talking about how developers should learn OpenTelemetry. I wanted to write this piece to show you how important and how easy it is to learn observability from day one as a coder.

The Microscope for Embedded Code: How Tracealyzer Revealed Our Bug

Tracealyzer. You can’t stay in the wonderful world of debugging and profiling code without hearing the name. If you look at Percepio’s website, it is compared to the oscilloscopes of embedded code. Use it to peek deep inside your code and see what it does. Of course, the code receives an interrupt and checks a CRC before sending the data through SPI, but how does it do it? And how long does it take?

Acing server performance: Don't overlook these crucial 11 monitoring metrics

A server, undeniably, is one of the most crucial components in a network. Every critical activity in a hybrid network architecture is somehow related to server operations. Servers don’t just serve as the spine of modern computing operations—they are also pivotal for network communications. From sending emails to accessing databases and hosting applications, a server’s reliability and performance have a direct impact on the organization’s growth.

Bill Kennedy: The mistake boot, building ACs, Black boxes & AI in software - The Reliability Podcast

The Reliability podcast speaks with engineers who have worked on large, complex systems and draws out what they have learned. What best practices should one adopt? What lessons are non-negotiable for getting better at a craft? What will ‘engineering’ look like with the advent of AI? We answer these questions and more, tracing the personal journeys of engineers who have built stellar careers around decoding the innumerable intricacies of software engineering.

From LCP to CLS: Improve your Core Web Vitals with Image Loading Best Practices

If you’re a front-end developer, there’s a high probability you’ve built (or will build) an image-heavy page. And you’ll need to make it look great by serving high-quality image files. But you’ll also need to prioritize building a high-quality user experience by making sure your Core Web Vitals, such as Cumulative Layout Shift and Largest Contentful Paint, aren’t negatively affected, since these metrics also influence your search engine rankings.

Grafana 10.1: How to build dashboards with visualizations and widgets

Learn how to distinguish widgets from visualizations for building better dashboards with Grafana 10.1. This update will improve your dashboard creation process because if you want to integrate elements like text, news, or an annotation list, you no longer need to select a data source first. Plus, to optimize your editing experience, the plugins list and library panels are now context-aware, adjusting in real time based on whether you’re working with a widget or a visualization.

Improved time series, trend, and state timeline visualizations in Grafana 10.1

When you’re visualizing data in time series, trend, and state timeline panels, one challenge you might have faced is arbitrary gaps in your data being automatically connected in your visualization. This can distort the true picture of your data, leading to potential misinterpretations. In Grafana 10.1, you can now set a threshold on the x-axis in your Grafana dashboards so that data points separated by a gap larger than this threshold are no longer connected.
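
The idea of disconnecting points across a gap can be illustrated with a minimal Python sketch. This is not Grafana's implementation; the function name `split_on_gaps`, the `(timestamp, value)` tuple layout, and the sample data are all assumptions made for illustration. It splits a time-ordered series into separate line segments wherever the distance between neighboring timestamps exceeds a threshold.

```python
from typing import List, Tuple

# Hypothetical data layout: (timestamp in seconds, value)
Point = Tuple[float, float]


def split_on_gaps(points: List[Point], threshold: float) -> List[List[Point]]:
    """Split time-ordered points into segments wherever the x-axis gap
    between two neighboring points is larger than `threshold`."""
    segments: List[List[Point]] = []
    for point in points:
        # Continue the current segment only if the gap to the previous
        # point is within the threshold; otherwise start a new segment.
        if segments and point[0] - segments[-1][-1][0] <= threshold:
            segments[-1].append(point)
        else:
            segments.append([point])
    return segments


# Illustrative series: samples every 60 s, with an 8-minute hole in the middle.
series = [(0, 1.0), (60, 1.2), (120, 1.1), (600, 3.0), (660, 3.1)]
segments = split_on_gaps(series, threshold=300)
```

With a 300-second threshold, the points before and after the hole end up in two separate segments, so a renderer drawing each segment as its own line would leave the gap visible instead of bridging it.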

The importance of SDT and how to successfully schedule planned downtime

Scheduled downtime (SDT), also known as planned downtime, lets you perform maintenance, testing, or repairs on your systems, servers, software, data centers, and other infrastructure. While no business likes being offline, this preventative work is essential for ensuring your assets function correctly. Unlike an unplanned outage, scheduled downtime can be timed and limited to minimize the impact on your company and customers.