The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

How a Globoplay engineer discovered the power of SigNoz, with Paulo Henrique de Morais Santiago

We sit down with Globo engineer and DevOps wizard Paulo Henrique de Morais Santiago, who, besides experimenting with SigNoz as a New Relic alternative for observability, is the author of one of the top DevOps courses on Udemy.

What is MITRE ATT&CK and How to Use the Framework?

The MITRE ATT&CK® framework is among the most widely known and used in cybersecurity. The Flowmon Anomaly Detection System (ADS) incorporates knowledge of the MITRE ATT&CK framework. Using ADS and its MITRE ATT&CK knowledge makes it easier to detect advanced threats against networks and IT systems, and simplifies explaining the danger and risks when outlining an attack to all stakeholders.

A Release Strategy for Continuous Innovation

At Cribl, we take pride in doing things differently. Our Customers First mentality is at the heart of everything we do as an organization, from free education, sandboxes, community programs, and platforms to streamlined legal reviews of contracts. We strive to solve problems from first principles, understanding root causes to build optimal experiences rather than piecing solutions together. We aim to be a partner, working with you to address your challenges holistically.

25 Best Status Page Examples Showcasing Top Communication Practices

Status pages are a transparent and effective way to inform users of any downtime or incidents disrupting a company’s service. Without a status page, users are left in the dark and support tickets pile up, damaging both your relationship with users and their trust. That’s why having a status page is essential for a business in 2023.

When Two Worlds Collide: AI and Observability Pipelines

In today's data-driven world, ensuring the stability and efficiency of software applications is not just a need but a requirement. Enter observability. But as with any evolving technology, there's always room for growth. That growth, as it stands today, is the convergence of artificial intelligence (AI) with observability pipelines. In this blog, we'll explore the idea behind this merge and its potential.

A complete guide to metrics cost management in Grafana Cloud

The macro economy can put a lot of pressure on organizations to reduce costs, typically with the central SRE and platform engineering teams coming under scrutiny. One common workaround we’ve seen countless teams make is compromising their observability by ingesting fewer metrics in the name of cost savings. But for centralized SRE/observability teams, the response to macro conditions should not be to monitor less, but rather to monitor smarter.

Mean Time to Repair (MTTR): Definition, Tips and Challenges

The availability and reliability of any IT service ultimately govern end-user experience and service performance, both of which have significant business impact. These two concepts — availability and reliability — are particularly relevant in the era of cloud computing, where software drives business operations, but that software is often managed and delivered as a service by third-party vendors.