
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

6 Best Practices for Tuning Network Monitoring Alerts

Network monitoring and alerting provide the foundation for efficient IT operations and cyber resilience. By keeping track of the status and performance of network infrastructure and applications, network monitoring tools can automatically generate alerts when defined thresholds are exceeded or specific events occur. These network monitoring alerts allow IT teams to detect outages, performance degradation, and potential security incidents so they can respond swiftly to minimize disruption.
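As a rough illustration of the threshold-based alerting described above, here is a minimal Python sketch. The metric names, limits, and the `Threshold` and `evaluate` helpers are hypothetical, chosen only for the example, and do not reflect any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """A hypothetical alert rule: fire when `metric` rises above `limit`."""
    metric: str
    limit: float


def evaluate(samples: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message for each metric whose sample exceeds its limit."""
    alerts = []
    for t in thresholds:
        value = samples.get(t.metric)
        # Metrics with no sample are skipped rather than treated as failures.
        if value is not None and value > t.limit:
            alerts.append(f"ALERT {t.metric}={value} exceeds {t.limit}")
    return alerts


if __name__ == "__main__":
    # Illustrative values only; real thresholds depend on the network.
    thresholds = [Threshold("latency_ms", 200.0), Threshold("packet_loss_pct", 1.0)]
    samples = {"latency_ms": 350.0, "packet_loss_pct": 0.2}
    for line in evaluate(samples, thresholds):
        print(line)
```

In practice, tuning alerts largely comes down to choosing limits like these so that real problems trigger a notification without routine fluctuations doing so.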

Momentum: Announcing 268 Million Downloads & 320% Growth in 2023

We’re happy to announce a landmark 320% growth in 2023! VictoriaMetrics, our open source time series database and monitoring solution, already hit 268 million downloads this year (still counting), and received close to 13,000 stars on GitHub.

Demystifying Cloud and Cloud-Native Observability

In the fast-changing landscape of cloud computing and modern software development, achieving 360-degree visibility into your critical business services, applications, and infrastructure is essential. This is where observability comes into play. Observability, especially in a cloud-based or cloud-native environment, has become a critical aspect of maintaining and optimizing complex systems and services.

Bringing it all together: Speed, performance, and efficiency in InfluxDB 3.0

For most of the past year, we here at InfluxData focused on shipping the latest version of InfluxDB. To date, we launched three commercial products (InfluxDB Cloud Serverless, InfluxDB Cloud Dedicated, and InfluxDB Clustered), with more open source options on the way. All the while, we claimed that this latest version of InfluxDB surpasses anything we built before.

Crossed 15K+ GitHub Stars, Simplified Logs Parsing with Pipelines & Trending on Hacker News - SigNal 30

Welcome to the 30th edition of our monthly product newsletter - SigNal 30! Last month, our GitHub repo crossed 15K GitHub stars, a great milestone for our open-source project and for our team. We shipped the much-awaited logs pipeline, which will make log parsing a much better experience for our users. We also shipped other product improvements, hosted OpenTelemetry meetups and webinars, and much more.

Monitoring Single Page Applications with Synthetics and Browser-based RUM

Businesses today are increasingly dependent on Single Page Applications (SPAs) for better user experiences. In a Single Page App, the user loads a single web document and the application then updates different parts of the page with background requests. This contrasts with the more traditional Multi-page Applications (MPAs), where each click loads a different web document, much like the way you’re (hopefully) reading different pages on this web server.

EventSentry v5.1: Anomaly Detection / Permission Inventory / Training Courses & More!

We’re extremely excited to announce the availability of EventSentry v5.1, which detects threats and suspicious behavior more effectively while also providing users with additional reports and dashboards for CMMC and TISAX compliance. The usability of EventSentry was also improved across the board, making it easier to use, manage, and maintain on a day-to-day basis. We also released 60+ training videos to help you get started and take EventSentry to the next level.

Sponsored Post

Taking down (and restoring) the Raygun ingestion API

In a world where Software as a Service (SaaS) products are integral to daily life, maintaining uninterrupted service for end-users is paramount. However, stuff happens. When it does, our most valuable response (other than restoring service ASAP) is to review the series of events that led up to the incident and learn from them. On August 25th, 2023, at 7:02 AM NZT, Raygun experienced a significant incident that impacted our API ingestion cluster, leading to an outage lasting approximately 1 hour and 15 minutes. While this wasn't fun for anyone involved, this incident did prove to be a valuable learning experience, shedding light on the importance of infrastructure management and resilience.

The Role of Generative AI and Large Language Models in IT Operations

Artificial intelligence, particularly generative AI and large language models, has changed how we approach IT operations, cybersecurity, and observability. And though we can point to measurable benefits and outcomes from applying LLMs to ITOps, there is also a lot of speculation to deal with. Phillip Gervasi, Director of Technical Evangelism at Kentik, and Christoph Pfister, Chief Product Officer at Kentik, discuss what generative AI and LLMs are, how they can be used to improve IT operations, and what the future might hold.