Operations | Monitoring | ITSM | DevOps | Cloud

Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Send your logs to multiple destinations with Datadog's managed Log Pipelines and Observability Pipelines

As your infrastructure and applications scale, so does the volume of your observability data. Managing a growing suite of tooling while balancing the need to mitigate costs, avoid vendor lock-in, and maintain data quality across an organization is becoming increasingly complex. With a variety of installed agents, log forwarders, and storage tools, the mechanisms you use to collect, transform, and route data should be able to evolve and adjust to your growth and meet the unique needs of your team.

Automatic log level detection reduces your cognitive load to identify anomalies at 3 am

Let’s face it, when that alert goes off at 2:58am, abruptly shaking you out of a deep slumber because of a high-priority issue hitting the application, you’re not 100% “on”. You need to shake the fog out of your head to focus on the urgent task of fixing the problem. This is where having the best log analytics tool can take on some of that cognitive load. Sumo Logic recently released new features specific to our Log Search queries that automatically detect log levels.

Integration roundup: Monitoring your AI stack

Integrating AI, including large language models (LLMs), into your applications enables you to build powerful tools for data analysis, intelligent search, and text and image generation. There are a number of tools you can use to leverage AI and scale it according to your business needs, with specialized technologies such as vector databases, development platforms, and discrete GPUs being necessary to run many models. As a result, optimizing your system for AI often leads to upgrading your entire stack.

Using UX and Observability to Track Application Health

UX (user experience) is a core factor that determines the success of an application or platform in a distributed system. Specifically, developers need to understand the infrastructure within an entire application stack to improve and refine the user experience to meet customer expectations without guesswork. System downtime remains a significant source of revenue and reputational losses for enterprises, employees, and customers.

Enhance code reliability with Datadog Quality Gates

Maintaining the quality of your code becomes increasingly difficult as your organization grows. Engineering teams need to release code quickly while still finding a way to enforce best practices, catch security vulnerabilities, and prevent flaky tests. To address this challenge, Datadog is pleased to introduce Quality Gates, a feature that automatically halts code merges when they fail to satisfy your configured quality checks.

Store and analyze high-volume logs efficiently with Flex Logs

The volume of logs that organizations collect from all over their systems is growing exponentially. Sources range from distributed infrastructure to data pipelines and APIs, and different types of logs demand different treatment. As a result, logs have become increasingly difficult to manage. Organizations must reconcile conflicting needs for long-term retention, rapid access, and cost-effective storage.

Stopping Agents and modules during a time interval

Learn how to create planned stops. Imagine that we are monitoring our resources, and we have to do a maintenance of an equipment that is being monitored, that equipment, or some of its services, will stop being operative and false alerts and events will be generated. To avoid this, there are the planned stops, a PandoraFMS system that allows us to deactivate the agents and modules that we select during a time interval, this way we will avoid receiving false alerts and events.