Latest News

Diagnosing out-of-memory errors on Linux

Jul 9, 2020 By Paul Gottschling In Datadog

Out-of-memory (OOM) errors take place when the Linux kernel can’t provide enough memory to run all of its user-space processes, causing at least one process to exit without warning. Without a comprehensive monitoring solution, OOM errors can be tricky to diagnose. In this post, you will learn how to use Datadog to diagnose OOM errors on Linux systems.

How to use check aggregates in Sensu Go

Jul 9, 2020 By Aaron Sachs In Sensu

Aggregates, which allow you to monitor groups of checks or entities, were a much-beloved feature in Sensu Core (the predecessor to Sensu Go) — Ben Abrams describes them as “awesome” in his post on alert fatigue, noting that aggregates are like having “a bunch of nodes behind a load balancer where each node is healthchecked, and if a node drops out it may not be worth waking someone up in the middle of the night.”

Timestamps On Downtime Alerts

Jul 9, 2020 By Downtime Monkey In Downtime Monkey

We've made a useful improvement to Downtime Monkey alerts. Each downtime alert now includes a timestamp that shows when the website went down, and each uptime alert includes a timestamp that shows when the website came back up. This turned out to be more work than expected, largely because we thought we'd knock it out in under an hour :) Although it wasn't totally straightforward to develop, the end result is incredibly simple to use...

5 Serverless AWS Core Services Everyone Should Have in Their Starter Toolkit

Jul 9, 2020 By Mariliis Retter In Dashbird

When first looking into serverless migration and its architecture, it can feel like you’re staring down an endless shopping aisle of critical serverless tools that all need to be put into your basket straight away. Some services seem to offer the same function, while others feel wildly different. Either way, it can be hard to tell what is really necessary for your business and serverless application.

Pandora FMS 747 Release

Jul 9, 2020 By Pandora FMS team In Pandora FMS

These release notes describe new features, improvements and fixed issues in Pandora FMS NG 747. They also provide information about upgrades and describe some workarounds for known issues.

SRE Report 2020 - Balancing 'Dev' and 'Ops'

Jul 9, 2020 By Catchpoint Systems In Catchpoint

We recently released Catchpoint’s SRE Report 2020, which analyzes results from the SRE survey we conducted early this year along with a recent addendum survey. The report offers a detailed look at the current state of SRE and how the shift to an all-remote work environment has impacted SRE teams. In this blog post, we take a deeper look at one of the report highlights – ‘Heavy Ops Workload Comes at a Cost’.

How to Create a Python Stack

Jul 9, 2020 By Mukul Khanna In Scout

All programming languages provide efficient data structures that allow you to logically or mathematically organize and model your data. Most of us are familiar with simpler data structures like lists (or arrays) and dictionaries (or associative arrays), but these basic array-based data structures act more as generic solutions to your programming needs and aren’t really optimized for performance on custom implementations. There’s much more that programming languages bring to the table.

Best practices for alerting on Kubernetes

Jul 9, 2020 By Jorge Salamero Sanz In Sysdig

A step-by-step cookbook of best practices for alerting on the Kubernetes platform and orchestration, including PromQL alert examples. If you are new to Kubernetes and monitoring, we recommend that you first read Monitoring Kubernetes in production, in which we cover monitoring fundamentals and open-source tools. Interested in Kubernetes monitoring?

Monitoring Kubernetes in Production

Jul 9, 2020 By Carlos Arilla In Sysdig

Monitoring Kubernetes, both the infrastructure platform and the running workloads, is on everyone’s checklist as we evolve beyond day zero and into production. Traditional monitoring tools and processes aren’t adequate, as they do not provide visibility into dynamic container environments. Given this, what tools can you use to monitor Kubernetes and your applications?

Monitoring Your Dynamic Cloud Infrastructure

Jul 9, 2020 By Brian Gladden In LogicMonitor

Taking full advantage of cloud infrastructure includes the ability to scale up and down dynamically in response to the load on your services. Compute services such as Amazon Web Services (AWS) EC2, Azure Virtual Machines (VM), and Google Cloud Platform (GCP) Compute Engine support auto scaling of their instances. This helps manage the responsiveness and costs of your cloud services by ensuring that instance counts go up and down depending on demand.
