Latest News

Elastic Observability: Driving mean time to resolution to zero

Oct 5, 2021 By Gagan Singh In Elastic

At ElasticON Global 2021, Tanya Bragin, VP Product, Observability, and the Elastic Observability team showed how ongoing innovations continue to deliver actionable insights and faster root cause detection, reducing mean time to resolution (MTTR). The adoption of cloud, microservices, and ephemeral infrastructure is driving increased complexity, requiring an observability solution to provide end-to-end visibility.

Announcing General Availability of the Honeycomb Query Data API

Oct 5, 2021 By Phillip Carter In Honeycomb

The Query Data API is a Honeycomb Enterprise feature. With a Honeycomb Enterprise account, you can use this API today. Head over to our API docs to learn how to get access to your data. If you aren’t yet a Honeycomb Enterprise user, try it out by requesting an Enterprise Trial. Starting today, Honeycomb Enterprise customers can use the Honeycomb Query Data API to run queries programmatically, retrieve the results, and pull them into the data visualization tool of their choice.

Reliable Alerting with Icinga and SIGNL4

Oct 5, 2021 By Angelika Bang In Icinga

You’ve probably been in this situation before – you’re using Icinga to monitor your infrastructure and Icinga detects a critical issue but nobody notices it. It might be an urgent maintenance request, an unexpected breakdown, or a service quality issue. But your technicians or service engineers are neither in the control room nor in front of the dashboard to see the issue and its urgency.

Facebook, Instagram, and WhatsApp Down for Over Five Hours

Oct 5, 2021 By Pingdom In SolarWinds

Did you unconsciously open Instagram, Facebook, or WhatsApp several times throughout the day on Monday, only to get a “Couldn’t refresh feed” message? Did you try again every 20 or so minutes? Did you maybe even restart your phone, not once thinking websites as large as Facebook could possibly go down and believing it must be your own technology? Rejoice: it’s not you, it’s them.

The Future of AIOps Includes an ITOps Strategy

Oct 5, 2021 By meshIQ In meshIQ

One of the questions I get asked a lot by customers, prospects, and partners is whether AIOps will make them irrelevant. To them, AIOps is often equivalent to automated remediation: an AIOps system automatically detects an incident and kicks off a remediation process in response, knowing exactly what process will solve the problem. IT is out of the loop, data centers and NOCs just keep humming along unattended, and end users are none the wiser.

Facebook's historic outage, explained

Oct 5, 2021 By Doug Madory In Kentik

Yesterday, the world’s largest social media platform suffered a global outage of all of its services for nearly six hours, during which time Facebook and its subsidiaries, including WhatsApp, Instagram, and Oculus, were unavailable.

Facebook, Instagram, and WhatsApp's Outage - Understanding MTTR

Oct 5, 2021 By Deepa Ramachandra In ObservIQ

Yesterday, the most-used social media platforms in the world were inaccessible for six hours straight. Later, in a press release, Facebook revealed that the outage was due to configuration changes in its routers. There is no doubt that Facebook has an intensive incident response plan, yet a small blind spot resulted in a significant business interruption. So how do we avoid this? The truth is, outages and performance issues are bound to happen in any network.

AIOps and Performance Monitoring: A One-Two Punch for IT Operations

Oct 5, 2021 By ScienceLogic In ScienceLogic

Sugar Ray Robinson and Jake LaMotta. Marvelous Marvin Hagler and Tommy Hearns. Muhammad Ali and Joe Frazier. All were among history’s greatest boxers, but when they met in the ring, each brought out the best in the other. It’s the same in IT management. There are tools and platforms that on their own are essential to IT operations; but when paired as an infrastructure management tandem, each complements the other, ensuring maximal efficacy of both systems.

Better Kubernetes application monitoring with GKE workload metrics

Oct 5, 2021 By Nathan Beach In Google Operations

The newly released 2021 Accelerate State of DevOps Report found that teams who excel at modern operational practices are 1.4 times more likely to report greater software delivery and operational performance and 1.8 times more likely to report better business outcomes. A foundational element of modern operational practices is having monitoring tooling in place to track, analyze, and alert on important metrics.

Monitoring Kubernetes with Prometheus

Oct 5, 2021 By Prince Sinha In Scout

Kubernetes is an open-source project that is being adopted at a very fast rate. It is a portable, extensible platform for managing containerized workloads and services. Companies are widely adopting it to build their major products. Docker is often used to run Kubernetes on local systems for testing purposes. It therefore becomes essential for companies to monitor their Kubernetes containers.
