What's new in Grafana Enterprise Metrics for scaling Prometheus: enhanced access control and a compactor that supports 650 million active series and beyond

I’m a fresh starter here at Grafana Labs, leading one of our teams working on the Grafana Enterprise Stack. As a longtime user of Grafana, I couldn’t wait to see what’s new in versions 1.1 and 1.2 of Grafana Enterprise Metrics (GEM), our scalable, self-hosted Prometheus service. I tried out the shiny features and wanted to share some of the cool things I found.

Painless Kubernetes monitoring and alerting

Kubernetes is hard, but let's make monitoring and alerting for Kubernetes simple! At iLert we are creating architectures composed of microservices and serverless functions that scale massively and seamlessly to guarantee our customers uninterrupted access to our services. Like many others in the industry, we rely on Kubernetes for the orchestration of our services.

Guidelines for picking where to send monitoring alerts

If you've ever been on the receiving end of a monitoring system that uses email for alerts, you know how noisy things can get, particularly if you're working in an agency or freelance-like environment with dozens of client sites to maintain. You get so many emails that you start looking into integrations with third-party services like Zapier, and coming up with more and more complex rules to try to reduce the noise.

What if You Could Autonomously Monitor Across Your Databases?

When DevOps teams talk about monitoring a database, the primary motivation is to ensure that the database won’t suffer a performance hiccup. Long queries, timeouts and table scans are among the most popular causes behind lousy customer experience. However, in recent years, more data has been shifted to cloud databases.

How to monitor website availability

“100% website availability.” Which webmaster would not want to see this availability report? Every website owner would like their website to be available to users at least 99.9% of the time. Without a website that is accessible and running smoothly at any time of day, all web-related investments will go to waste. That is why website availability monitoring is so important.

Splunking AWS ECS And Fargate Part 3: Sending Fargate Logs To Splunk

Welcome to part 3 of the blog series where we go through how to forward container logs from Amazon ECS and Fargate to Splunk. In part 1, Splunking AWS ECS Part 1: Setting Up AWS And Splunk, we focused on understanding what ECS and Fargate are, along with how to get AWS and Splunk ready for log routing to Splunk’s Data-to-Everything Platform.

Using the Icinga Web API

Unfortunately, there is little to no documentation for using the Icinga Web API to perform monitoring actions such as scheduling downtimes. But it’s a simple thing and I’ll give you a quick example of how to do it. Using the Icinga Web API instead of the Icinga API gives you the advantages of the permission and restriction system, various authentication methods and auditing.

5 Things to Know About the Orion Platform Database Today

1. It’s the Back End for 14 Products and Connectors. The full list is published here. The key is that a lot of products leverage this single database, though not every customer runs every product. Over 20 years, the Orion Platform has evolved into, well, a pickup truck. It’s a utility vehicle, doing the best job it can for everyone who needs a ride. As such, this pickup truck works great most of the time.

SLF4J Tutorial: Example of How to Configure It for Logging Java Applications

Logging is a crucial part of the observability of your Java applications. Combined with metrics and traces, it gives full observability into the application's behavior and is invaluable when troubleshooting. Logs, combined with metrics, shorten the time needed to find the root cause and allow for quick and efficient resolution of problems.

The Big SCOM Survey: Results and expert analysis

What’s the future of SCOM? How do others set up their SCOM landscape? What tools are SCOM Managers using to help streamline processes? The answers are all here. The Big SCOM Survey 2021 is the first of what we hope will be an annual survey where we measure the pulse of the SCOM community, get insights and share best practices. This year, we asked 27 questions and had 118 respondents – and we’d like to share our findings with you.