Stream AWS Metrics to Grafana with Last9 in 10 minutes

It’s 2:47 AM and your Lambda functions are timing out. API response times are spiking. You’re flipping between the CloudWatch console, your APM tool, and your logs, trying to figure out what’s going wrong. CloudWatch has the metrics you need (CPU usage, memory pressure, and request rates), but connecting that data to what your app is doing takes time. The delay in stitching it all together slows down your incident response.

Query and Analyze Logs Visually, Without Writing LogQL

It’s 2 AM. An incident’s in progress. Error rates are climbing. You jump into the logs, filter by service, adjust the time window… and now you need a LogQL query. You write one. It errors out. You fix the syntax, try again, only to realize you need a different filter or a new aggregation. Back to rewriting. By the time you’ve got the query right, you’ve already lost 10–15 minutes. The system is still broken, and you still don’t know why.

Trace Go Apps Using Runtime Tracing and OpenTelemetry

When your Go service hits 500ms latencies but CPU usage is flat, tracing gives you visibility into what the profiler misses. With 1–2% runtime overhead, Go’s built-in tracing tools make it easier to debug performance regressions that don’t leave a clear footprint.

Kibana Logs: Advanced Query Patterns and Visualization Techniques

Kibana gives you a structured way to explore log data indexed in Elasticsearch. With the right queries and visualizations, you can identify anomalies, debug issues more quickly, and track trends across services. This blog covers practical ways to query logs using Kibana’s Lucene and KQL syntax, build visualizations that surface meaningful signals, and set up dashboards for ongoing log-based monitoring.

Enable Kong Gateway Tracing in 5 Minutes

Kong Gateway is a popular API gateway that sits at the edge of your infrastructure, routing and shaping traffic across microservices. It’s fast, pluggable, and battle-tested, but for many teams, it remains a black box. You might have OpenTelemetry set up across your application stack. Traces flow from your app servers, databases, and third-party APIs. But the moment a request enters through Kong, observability drops off.

Build Log Automation with Last9's Query API

Manual log investigation is one of those engineering tasks that quietly drains hours without offering much real value. You're debugging an incident. Monitoring shows elevated error rates. Now begins the familiar drill, a tedious cycle that doesn’t scale. The whole process breaks down when you’re trying to automate incident response, run continuous security monitoring, or generate compliance reports.

Jaeger Metrics: Internal Operations and Service Performance Monitoring

You're monitoring a microservices-based system. Alerts trigger when response times exceed 2 seconds. But when you open Jaeger, you're faced with thousands of traces. Identifying which service or operation is responsible becomes time-consuming. Jaeger metrics help reduce this friction by exposing aggregated telemetry. Instead of scanning individual traces, you get service-level and operation-level performance metrics (latency, throughput, and error rates) that highlight where the issue lies.

How to Get Grafana Iframe Embedding Right

Adding Grafana dashboards directly into your app lets users see monitoring data without switching tabs or tools. Using an iframe to embed Grafana does work, but it brings along some tricky authentication and security issues that aren’t always obvious at first. In this blog, we’ll go over the practical ways to embed Grafana dashboards, from easy public snapshots to secure, private dashboards that need authentication.

Optimize LangChain Performance with Trace Analytics

You’ve instrumented your LangChain app, and traces are now flowing into Last9. Now the issues are visible: API costs are crossing $200/day, average response times exceed 3 seconds, and performance degrades under 100 concurrent users. A single tool call adds over 2 seconds. Bloated context windows are pushing up token usage, wasting $50/day. Here’s how to use trace data to identify and fix these inefficiencies, systematically and at scale.

Elasticsearch with Python: A Detailed Guide to Search and Analytics

If you’re using Python for search, log aggregation, or analytics, you’ve probably worked with Elasticsearch. It’s fast, scalable, and fairly complex once you go beyond the basics. The official Python client gives you raw access to Elasticsearch’s REST API. But getting it to work the way you want, especially under load, can be tricky. This blog walks through practical ways to index, query, and monitor Elasticsearch from Python code, without getting lost in the docs.