
Latest Posts

Full-stack Observability, MELT, and Why You Need a Data Fabric

Full-stack observability is a term that you may have heard tossed around in many conversations on the topic of observability. What does it mean? Full-stack observability means having visibility into all layers of your technology stack. Collecting, correlating, and aggregating telemetry from all of these components provides insight into the behavior, performance, and health of the system.

Understanding Anomaly Detection with Time-Series Metrics

Observability data has three types: metrics, traces, and logs. In this article, we will look at how anomaly detection techniques can be applied to time-series metrics for observability use cases. There are many different anomaly detection algorithms, but they all share a common goal: to find data points that are significantly different from the rest of the data. This can be useful for identifying outliers, monitoring for unusual behavior, and detecting errors in data collection.
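As a toy illustration of that shared goal, here is a minimal z-score outlier check; the function, threshold, and sample data are our own, not from the article, and real systems typically use more robust, windowed techniques:

```python
import statistics

def zscore_anomalies(series, threshold=2.5):
    """Flag points more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(series)
    stdev = statistics.stdev(series)
    return [x for x in series if abs(x - mean) / stdev > threshold]

# Hypothetical CPU-utilization samples with one obvious spike.
cpu = [42, 41, 43, 40, 42, 41, 98, 42, 43, 41]
print(zscore_anomalies(cpu))  # → [98]
```

A global z-score like this assumes roughly stationary data; for metrics with trends or seasonality, a rolling window or a decomposition-based method is the usual next step.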

Are you stuck with alert fatigue in your observability stack?

If you’re using an observability stack, chances are you’re familiar with alerting. Alerts notify users when something interesting happens that they need to act on. Alerting comes with its own challenges, though: too many alerts become a problem rather than a solution. Is your team stuck with alert fatigue? It is a very real problem, and it’s not uncommon for observability solutions to generate false alerts.

10 Reasons Why You Need a Machine Data Fabric for Your Digital Transformation Journey

Businesses everywhere are going through digital transformation. This means that they are adapting their operations and strategies to better compete in the digital age. A key part of this transformation is better managing the large volumes of machine data generated every day. Machine data management can be complex, but with the help of a machine data fabric, organizations can tame these complexities and scale successfully.

Prometheus, Logs, and Root Cause Analysis

Prometheus is a widely deployed open-source monitoring system for time-series metrics. For observability use cases, it is important to bring logs and metrics together in a tightly integrated framework to enable faster root cause analysis. A Prometheus deployment is configured with scrape targets from which metrics are collected periodically. The data is stored in a multi-dimensional data model, with metric data stored along with a set of key-value pairs, commonly referred to as labels.
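To make the scrape-target and label concepts concrete, a minimal Prometheus configuration might look like the sketch below; the job name, target address, and label values are illustrative, not from the article:

```yaml
# prometheus.yml -- illustrative scrape configuration
global:
  scrape_interval: 15s          # how often Prometheus scrapes each target

scrape_configs:
  - job_name: "node"            # becomes the `job` label on every collected series
    static_configs:
      - targets: ["localhost:9100"]   # e.g. a node_exporter endpoint
        labels:
          env: "production"     # extra key-value labels attached to the series
```

Every metric scraped from this target is stored with its `job`, `instance`, and `env` labels, which is what makes the multi-dimensional queries possible.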

Demystifying Prometheus for Beginners

Metrics are the primary means of representing your system’s general health and other valuable information for monitoring, alerting, and observability. Even as the Kubernetes ecosystem grows by the day, the community continues to rely on a few key technologies to ease operational workloads. One of them is Prometheus. Prometheus bridges the gap between monitoring and alerting.

Tail logs live and debug production issues faster

For all the productivity gains created by DevOps, there’s a nasty side effect. Short release cycles, lots of infrastructure changes, and developer-driven changes to live environments are a recipe for frequent production issues. When problems occur, developers are pulled in to troubleshoot, which can be painful. Have you ever been on a production server and needed to troubleshoot an issue? Maybe there is an NGINX error or a 700 error from your Redis store.
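The kind of on-server tailing the post alludes to can be sketched with standard tools; the log path and messages below are fabricated for illustration:

```shell
# Simulate a small NGINX-style error log, then tail it and filter for errors.
# (Path and log lines are illustrative, not from the article.)
log=/tmp/demo_nginx_error.log
printf '%s\n' \
  '2022/01/10 12:00:01 [error] upstream timed out' \
  '2022/01/10 12:00:02 [info] config reload complete' \
  '2022/01/10 12:00:03 [error] connect() failed' > "$log"

# In production you would use `tail -f` to follow new lines as they arrive;
# here `tail -n +1` replays the whole file once so the pipe terminates.
tail -n +1 "$log" | grep -c '\[error\]'   # prints the count of error lines
```

A centralized live-tail feature replaces this per-server workflow with one search across all sources, which is the point the post builds toward.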

Partner Stories: Reducing Log Ingest by 90%

In late 2021, we kicked off a project with a customer who was interested in reducing log ingestion costs and had reached out to us via LinkedIn. Like many others, this customer had a combination of popular logging platforms in place that had been built and put together for various reasons over time. There was no single reason why; different departments, skills, tooling budgets, and business needs had shaped their position.

Replaying Historical Log Data On Demand

It’s likely that you already know how helpful logs can be in analyzing the inner workings of your IT environments. Beyond that, you may be familiar with the concept of using log data for debugging issues, identifying threats, or gaining context of performance bottlenecks. Log data from a single data source can fulfill a variety of use cases.

The LOGIQ.AI Changelog: January 2022

In January, the LOGIQ.AI team shipped plenty of new features, enhanced existing ones, and squashed a few bugs to make your full-stack observability, data pipeline control, and storage experience on the LOGIQ.AI platform better than ever before. This blog captures the key features and enhancements we pushed last month.