The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

PagerBeauty: Creating an Open Source On-Call Widget | Sergii Tkachenko (Google)

Aug 12, 2020 By Datadog In Datadog

A brief and entertaining story of PagerBeauty, a widget that shows on Datadog dashboards who is on call in PagerDuty: how to take a pet project from a "barely working MVP" to the top 10 on Show Hacker News in a week (or so).

How to Spot Website Errors and Reduce Troubleshooting Time

Aug 12, 2020 By Pingdom In SolarWinds

Errors and bugs are a nightmare for any software engineer or developer. Although they can seem like nothing but a bad experience for a developer or website owner, errors can help improve the quality of a website. You may be wondering, “But how?” Errors pinpoint the weaker parts of the website, showing you what to work on.

community.icinga.com

Aug 12, 2020 By Feu Mourek In Icinga

The community forum is a place where you can meet and chat with other Icinga users. It’s hosted by Icinga and moderated by both the Icinga team and members of the community. It’s mostly used as a platform for asking and answering technical questions about Icinga, which is a great way to learn more about the tool stack! What does it look like? It’s a Discourse platform, so it’s a collection of threads, or topics, that anyone can comment on!

Debugging AWS Lambda Timeouts

Aug 12, 2020 By Yan Cui In Lumigo

Some time ago, an ex-colleague of mine at DAZN received an alert through PagerDuty. There was a spike in error rate for one of the Lambda functions his team looks after. He jumped onto the AWS console right away and confirmed that there was indeed a problem. The next logical step was to check the logs to see what the problem was. But he found nothing. And so began an hour-long ghost hunt to find clues as to what was failing and why there were no error messages.

Pandora FMS: Checkmk alternative

Aug 12, 2020 By Laura Cano In Pandora FMS

Checkmk was created and developed by Mathias Kettner in Germany, and an open source version has been available since 2007. In this article, we will get to know Checkmk Open Source, which contains 90% of all the code of said monitoring software, and a Checkmk alternative… But since there are several versions, let’s see them first!

Interview: Why Applications Fail and What to Do About It

Aug 12, 2020 By Polly Traylor In OpsRamp

Lee Atchison is a recognized industry thought leader in cloud computing and has significant experience architecting and building high-scale, cloud-based, service-oriented SaaS applications. Formerly the Senior Director for Cloud Architecture at New Relic, Lee is now the owner of Atchison Technology LLC, a cloud consulting and advising firm. Lee is also the author of “Architecting for Scale,” a book published by O’Reilly Media.

Introducing the Sumo Logic Observability suite with distributed tracing (beta) - a cornerstone of cloud-native APM

Aug 12, 2020 By Pawel Brzoska In Sumo Logic

Last week Sumo Logic announced our new Observability Suite, which included the public introduction of the closed beta for our distributed tracing capabilities as part of our Microservices Observability solution. This new solution will provide end-to-end visibility into user transactions across services, as well as seamless integration into performance metrics and logs to accelerate issue resolution and root-cause analysis. In this blog, we’ll explore the new solution in detail.

ChaosSearch Announces New Integration With Opsgenie

Aug 12, 2020 By Kevin Davis In ChaosSearch

ChaosSearch is excited to announce its new integration with Opsgenie — Atlassian’s alerting and incident management platform. Using this integration, your teams can leverage the industry’s most powerful and comprehensive data monitoring and analytics capabilities channeled into a unified workflow through Opsgenie’s easy-to-use interface.

Backups Suck (But They Don't Have to)

Aug 12, 2020 By Rich Davis In Galileo

Focus on what matters with instant visibility into the condition of your backup application and detailed analytics to quickly pinpoint where any issues lie. IBM’s backup monster, Spectrum Protect (TSM, as we called it back in the day), sucks. Not because the software sucks (it’s actually the best there is), but because backups suck in general. It’s the quintessential necessary evil of IT.
