|
By Mezmo
Runbooks are rarely missing because teams don't value them. They're usually missing because incident response, follow-up, and platform work compete for the same limited time. By the time an issue is resolved, the knowledge is fresh, but the window to document it is already closing. That gap creates familiar failure modes: over-reliance on senior engineers, slower handoffs, and less confidence for whoever is on call next.
|
By Mezmo
How platform and SRE teams are using Mezmo's open-core agent framework — with any LLM, any tools, any observability backend.
|
By Henry Andrews
Over the last year, I’ve talked to dozens of SRE teams about AI. The excitement is real, but conversations hit a wall when we get to production reality. How does an agent manage complex context without losing the plot? How does it avoid hallucinating relationships between signals? Who owns the orchestration logic that ties it all together? We realized the bottleneck wasn’t model intelligence. It was the lack of a reliable logic layer between the data and the model.
|
By Mezmo
Grok structures logs. Context engineering connects systems. AI explains behavior. For years, Grok patterns have been the workhorse of the SRE world. Built on regular expressions, Grok helps teams extract structure from unstructured logs. As we explored in "Do You Grok It?", Grok is the key to turning messy log lines into usable fields. It's why our Grok Pattern Reference remains one of our most-visited resources — SREs are hungry for structure.
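To make the "built on regular expressions" point concrete, here is a minimal sketch of how a Grok-style pattern can expand into named capture groups. The pattern table, the `grok_to_regex` and `parse_line` helpers, and the sample log line are all hypothetical illustrations. The regexes are simplified stand-ins, not the definitions shipped with any real Grok library.

```python
import re
from typing import Optional

# Simplified, hypothetical stand-ins for a few common Grok building blocks.
# Real Grok pattern libraries define these far more carefully.
GROK_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "URIPATH": r"/\S*",
}


def grok_to_regex(pattern: str) -> "re.Pattern[str]":
    """Expand each %{NAME:field} token into a named capture group."""

    def expand(match: "re.Match[str]") -> str:
        name, field = match.group(1), match.group(2)
        return f"(?P<{field}>{GROK_PATTERNS[name]})"

    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, pattern))


def parse_line(pattern: str, line: str) -> Optional[dict]:
    """Return the extracted fields, or None if the line does not match."""
    m = grok_to_regex(pattern).search(line)
    return m.groupdict() if m else None


# Hypothetical access-log layout: client, method, path, status, duration.
ACCESS_LOG = (
    "%{IP:client} %{WORD:method} %{URIPATH:path} "
    "%{NUMBER:status} %{NUMBER:duration_ms}"
)

fields = parse_line(ACCESS_LOG, "203.0.113.7 GET /api/orders 500 182")
print(fields)
```

Every extracted value is still a string at this point. Converting `status` or `duration_ms` to numbers would be a separate step after parsing.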
|
By Mezmo
As budgets reset for 2026, engineering leaders are making a resolution: no more vendor lock-in. Here’s how to keep that promise by building on the technical foundations of data reliability and simplified collection. It’s January 2026, and if you’re like most engineering leaders, you’re staring at your observability vendor contracts with a mix of frustration and resignation.
|
By Mezmo
A note from Lauren Nagel, Mezmo's VP of Product: At Mezmo, we believe the best observability tools aren't just built for users, they're built with them. Since the launch of Mezmo's AI SRE agent, we've listened and learned from our customers. The feedback and insights have been invaluable in helping our teams refine and enhance the experience. Today, we're excited to share our latest release, packed with improvements and powerful new capabilities that make our AI SRE even faster and more intuitive.
|
By Mezmo
This is the second post in our New Year, New Resolution series on OTel migrations. The first post is "New Year, New Telemetry: Resolve to Stop Breaking Dashboards". Most New Year’s resolutions fail because they require a "big bang" change. If your 2026 mandate is to migrate to OpenTelemetry (OTel), the traditional approach is the definition of friction.
|
By Mezmo
It's 2026. Your New Year's resolution was to finally migrate to OpenTelemetry. But you're staring at dozens of dashboards that depend on your current data format, and that migration deadline is looming... Sound familiar? If you're an SRE or Platform Engineer facing a top-down OTel mandate, you're not alone. The challenge isn't just about adopting a new standard—it's about doing so without disrupting the observability systems your team depends on every day.
|
By Mezmo
By Bill Balnave, VP of Customer Success at Mezmo
The core promise of modern observability is simple: cut Mean Time To Resolution (MTTR). Yet, despite a boom in tooling and investment over the last four years, the data tells a sobering story: our industry is actually getting worse at finding and resolving issues. Dashboards, once our trusted guide, have become the starting point for a chaotic "dashboard hunt" that rarely leads to the definitive root cause.
|
By Mezmo
For SREs juggling multiple services, third-party dependencies, and constant alerts, a critical service slowdown can quickly turn into chaos. APM dashboards may show that everything is fine while users are still experiencing problems. That gap between application telemetry and real-world performance can turn a five-minute fix into a two-hour war room.
|
By Mezmo
See how Mezmo LiveTail helps teams move from passive log search to active, real-time investigation. In this demo, you'll watch live telemetry stream across services and environments, identify emerging issues as they happen, and use real-time context to troubleshoot faster before signals are delayed, buried, or lost in the noise. LiveTail is part of Mezmo's Active Telemetry platform — built for platform engineers and SREs who need immediate visibility into what's happening across their stack right now, not after the fact.
|
By Mezmo
AI-powered root cause analysis only works when the data going into the model is clean, relevant, and structured. In this demo, we show how Mezmo's Active Telemetry approach helps engineers and SREs move from noisy application errors to immediate clarity. Using a restaurant ordering application running in Kubernetes, we trigger a database connection pool exhaustion issue and walk through two ways to investigate it with Mezmo.
|
By Mezmo
This video shows how Mezmo's AI Assistant turns noisy telemetry into clear answers when errors spike. By preprocessing data and surfacing only the most relevant patterns, Mezmo quickly identifies issues like database connection failures or resource shortages and delivers actionable recommendations. Watch how AI-powered root cause analysis helps teams troubleshoot faster and with confidence. Mezmo's AI Assistant is built for platform engineers and SREs who need fast, reliable root cause analysis across high-volume telemetry pipelines — without manually sifting through noise.
|
By Mezmo
Watch AURA autonomously respond to a production incident in real time—from building its reasoning context and querying PagerDuty and ClickHouse, to triggering a human-in-the-loop approval with the on-call SRE, to removing the stuck pod and validating remediation. Every behavior is defined in a simple config. AURA is Mezmo's AI-powered incident response agent built for platform engineers and SREs managing high-volume telemetry pipelines.
|
By Mezmo
Many engineering teams rely on Elasticsearch for search and analytics, but as data volumes grow, so do the challenges of scale, cost, and performance. At Mezmo, we faced this reality head-on, recognizing the need for a more efficient and scalable solution to support our multi-cluster, multi-petabyte telemetry data backend. After extensive evaluation, we made the leap to Quickwit, an open-source, cloud-native search engine for logs. But making such a fundamental architectural shift without disrupting customers was no small feat.
|
By Mezmo
Managing telemetry data efficiently is a constant balancing act—how do you maximize visibility while controlling costs? In this webinar, we’ll show you how Mezmo’s telemetry pipeline helps you make smarter decisions about your data.
|
By Mezmo
Are you looking to enhance your observability and gain deeper insights into your systems? Curious about how a telemetry pipeline can improve your monitoring and troubleshooting capabilities while keeping costs low? Join Mezmo’s Bill Balnave (Vice President of Technical Services) for a webinar that walks through the key concepts of a telemetry pipeline and its significance in modern software development and operations. Discover how a telemetry pipeline enables you to collect, profile, transform, and analyze crucial telemetry data from your applications and infrastructure.
|
By Mezmo
Watch our discussion on the 2024 DORA Accelerate State of DevOps report, where we dive into insights impacting software delivery, organizational strategy, and AI adoption in DevOps. We’ll review key findings and highlight practical steps for leaders to optimize development and delivery performance. Whether your organization is embracing AI, building internal platforms, or addressing burnout and resilience, this webinar will provide actionable takeaways for adapting to today’s evolving DevOps landscape.
|
By Mezmo
In today's digital-first, cloud-native world, effective log management is crucial. It enhances software quality, operational efficiency, and the customer experience. However, with the rise of distributed and microservices-based architectures, organizations now generate petabytes of log data daily, making analysis and storage increasingly challenging.
|
By Mezmo
The exponential growth of telemetry data presents a significant challenge for organizations, which often overspend on data management without fully capitalizing on the data's potential value. To unlock that value, organizations must treat telemetry data as an enterprise asset, applying rigorous data engineering principles to glean the critical insights and accelerated investigations this data is meant to enable. The telemetry data platform approach democratizes access across disciplines and personas and fosters widespread utilization across the organization.
|
By Mezmo
Logging in the age of DevOps has become harder and more critical than ever because it is key to maintaining visibility and security in today's fast-moving, highly dynamic environments. With these needs and challenges in mind, Mezmo has prepared this eBook to offer guidance on how best to approach the log management challenges that teams face today.
|
By Mezmo
A growing number of log management solutions available on the market today are offered as cloud-only services. Although cloud logging has its benefits, many organizations have requirements that can only be fulfilled with self-hosted/on-premises log management systems.
|
By Mezmo
Here's a complete guide covering all core components to help you choose the best log management system for your organization. From scalability, deployment, compliance, and cost, to on-prem or cloud logging, we identify the key questions to ask as you evaluate log management and analysis providers.
|
By Mezmo
Although the ELK Stack has an extensive feature set and is open source, organizations are beginning to realize that a free ELK license is not free after all. Rather, it comes with many hidden costs, from hardware requirements to engineering time, that easily add to the total cost of ownership (TCO). Here, we uncover the true cost of running the Elastic Stack on your own vs using a hosted log management service.
Log Management Modernized. Instantly collect, centralize, and analyze logs in real-time from any platform, at any volume.
Why Mezmo?
- Powerful Logging at Scale: Get powerful log aggregation, auto-parsing, log monitoring, blazing-fast search, custom alerts, graphs, visualization, and a real-time log analyzer in one suite of tools. We handle hundreds of thousands of log events per second and 20+ terabytes per customer per day, and we boast the fastest live tail in the industry. Whether you run 1 or 100,000 containers, we scale with you.
- Easy, Instant Setup: Mezmo's SaaS log management platform sets up in under two minutes. Instantly collect logs from AWS, Docker, Heroku, Elastic, and more, with the flexibility to deploy anywhere: cloud, multi-cloud, or self-hosted. Logging in Kubernetes? Logs start flowing in just two kubectl commands. Whether you send logs via syslog, a code library, or an agent, we have hundreds of custom integrations.
- Affordable: Mezmo’s simple, pay-per-GB pricing model eliminates contracts, paywalls, and fixed data buckets. Try our free plan, or only pay for the data you use with no overage charges or data limits. Our user-friendly, frustration-free interface allows your team to get started with no special training required, saving even more time and money.
- Secure & Compliant: Our military-grade encryption keeps your logs secure in transit and in storage. We offer SOC2, PCI, and HIPAA-compliant logging. To comply with GDPR for our EU/Swiss customers, we are Privacy Shield certified. The privacy and security of your log data are always our top priority, and we are ready to sign Business Associate Agreements.
Blazing fast, centralized log management that's intuitive, affordable, and scalable.