
Cribl.Cloud Government Is a New Era of Secure Cloud Telemetry for Federal Agencies

As a Co-founder and CPO at Cribl, I'm genuinely stoked that our new federal suite, Cribl.Cloud Government, has achieved an “In Process” designation under the Federal Risk and Authorization Management Program (FedRAMP). This isn’t any old milestone. We’re bringing all of Cribl’s kickass capabilities to government agencies, even those that require the strictest compliance and security standards. Because, who doesn’t love a good set of rules?

Icinga Experience: Insights from Real-World Icinga Deployments Across Industries

Modern IT environments are hybrid, distributed, and constantly growing. To keep them reliable, organizations rely on monitoring that scales, automates, and integrates seamlessly into existing workflows. We collected 24 Icinga customer stories from industries including finance, telecom, manufacturing, and public services. What unites them is the choice of Icinga as a flexible and cost-efficient alternative to proprietary monitoring tools.

Faster, more memory-efficient performance in Grafana Mimir: a closer look at Mimir Query Engine

Until recently, Grafana Mimir — our open source, horizontally scalable, multi-tenant time series database (TSDB) — has exclusively used Prometheus’ PromQL engine to evaluate queries. While the PromQL engine works great, it sometimes needs a lot of memory to run, specifically in the Mimir querier component. To address this memory consumption issue, we recently introduced Mimir Query Engine (MQE).

What is Asynchronous Job Monitoring?

Modern applications don’t process everything inside the request/response path. To keep APIs responsive, time-consuming work like image resizing, payment processing, or data syncs is moved into background queues. Workers then pick up these asynchronous jobs and run them outside the main thread. Asynchronous job monitoring is the practice of tracking these background tasks. Without this visibility, background workers become a blind spot.
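
As a rough illustration of what tracking a background task can involve, here is a minimal Python sketch. It is not taken from the article: the `MonitoredQueue` and `JobRecord` names, the recorded fields (enqueue, start, and finish times, status, error), and the two example jobs are all assumptions made for this example. It uses only the standard library and a single worker thread.

```python
import queue
import threading
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class JobRecord:
    """Hypothetical per-job record; the fields are illustrative assumptions."""
    name: str
    enqueued_at: float
    started_at: Optional[float] = None
    finished_at: Optional[float] = None
    status: str = "queued"  # queued -> running -> succeeded | failed
    error: Optional[str] = None


class MonitoredQueue:
    """In-process job queue that keeps a record for every job it accepts."""

    def __init__(self) -> None:
        self._jobs: "queue.Queue" = queue.Queue()
        self._lock = threading.Lock()
        self.records: List[JobRecord] = []

    def enqueue(self, name: str, func: Callable[[], None]) -> None:
        record = JobRecord(name=name, enqueued_at=time.monotonic())
        with self._lock:
            self.records.append(record)
        self._jobs.put((record, func))

    def work(self) -> None:
        """Run jobs until the queue is empty, updating each record."""
        while True:
            try:
                record, func = self._jobs.get_nowait()
            except queue.Empty:
                return
            record.started_at = time.monotonic()
            record.status = "running"
            try:
                func()
                record.status = "succeeded"
            except Exception as exc:
                # A failed job is recorded instead of disappearing silently.
                record.status = "failed"
                record.error = repr(exc)
            finally:
                record.finished_at = time.monotonic()
                self._jobs.task_done()


def resize_image() -> None:
    time.sleep(0.01)  # stand-in for slow work


def sync_data() -> None:
    raise RuntimeError("upstream timeout")  # simulated failure


mq = MonitoredQueue()
mq.enqueue("resize_image", resize_image)
mq.enqueue("sync_data", sync_data)

worker = threading.Thread(target=mq.work)
worker.start()
worker.join()

for r in mq.records:
    wait = r.started_at - r.enqueued_at
    duration = r.finished_at - r.started_at
    print(f"{r.name}: {r.status}, waited {wait:.3f}s, ran {duration:.3f}s, error={r.error}")
```

In a real deployment the records would typically be exported to a monitoring system rather than printed, but the idea is the same: each job leaves a trace of how long it waited, how long it ran, and whether it failed.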

SQL performance improvements: finding the right queries to fix (part 1)

A few weeks ago, we massively improved the performance of the dashboard & website by optimizing some of our SQL queries. In this post, we'll share how we identified the queries that needed work. In the next post, we'll explore how we fixed each of them. We'll cover the basics and gradually work our way up to the more advanced/complex ways of identifying slow queries. Let's go!

Making the invisible visible: Are your cloud firewalls and DDoS protection really working?

Every business builds strong defences to keep attackers out. Firewalls and DDoS protection serve that purpose, standing guard over company apps and websites, like knights at the castle gate keeping out trolls (not just the ones on X). But here’s the problem: those defences only work if users actually walk through the front gate. Sometimes, people find hidden paths or side doors around your walls, so the guards never see them enter.

Improving Transaction Speed and Transparency in Digital Operations

Transactions happen fast in our lives, whether it's paying for a coffee, sending a gift, or moving money between accounts. We expect speed and clarity. Delays, confusion, or hidden steps frustrate everyone. That's why new digital systems aim to make money moves quick and visible. In the coming sections, we'll look at what drives this change, why it matters for users and businesses, and where it might go next.

Why Comprehensive IT Risk Mitigation Is Essential in Modern Operations

The digital economy offers unprecedented opportunities for innovation, but it also presents high-stakes risks that must be effectively managed to ensure operational resilience. Organisations that rely heavily on IT to provide services, control data, and establish trust with customers must prioritise risk avoidance as part of their operational resilience plan.