Operations | Monitoring | ITSM | DevOps | Cloud

Building a Secure OpenTelemetry Collector

The OpenTelemetry Collector is a core part of telemetry pipelines, which makes it one of the parts of your infrastructure that must be as secure as possible. The general advice from the OpenTelemetry teams is to build a custom Collector executable for production use rather than relying on the supplied ones. However, that isn’t an easy task, which prompted me to build something.

Monitor Amazon EC2: key metrics for instances, regions, and more in one view

Amazon EC2 was one of the first services available on AWS, helping propel the cloud platform into the mainstream of IT. And while EC2 instances come in a wide range of sizes and flavors to address all sorts of use cases, keeping tabs on those instances isn’t always easy. That’s why we’re excited to introduce our new EC2 monitoring solution in Grafana Cloud.

Effective strategies for managing cron jobs: Best practices and tools

Cron jobs are essential for automating repetitive tasks and streamlining website and application management. Properly managing cron jobs is crucial for maintaining system efficiency and minimizing risks. In this article, we will explore the significance of cron jobs in tech environments, delve into common challenges in their management, and introduce advanced monitoring solutions like WebGazer. We will also provide best practices to ensure efficient and secure cron job management.

Building a Custom Read-only Global Role with the Rancher Kubernetes API

In 2.8, Rancher added a new field to the GlobalRoles resource (inheritedClusterRoles), which allows users to grant permissions on all downstream clusters. With the addition of this field, it is now possible to create a custom global role that grants user-configurable permissions on all current and future downstream clusters. This post will outline how to create this role using the new Rancher Kubernetes API, which is currently the best-supported method to use this new feature.
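
As a rough illustration of the idea above, the following Python sketch builds a GlobalRole manifest that sets inheritedClusterRoles and prints it as JSON. The field name comes from the text; the apiVersion (management.cattle.io/v3), the role name, the display name, and the role template name "custom-cluster-read-only" are assumptions made for illustration, and the helper function is hypothetical rather than part of any Rancher library.

```python
import json


def build_global_role(name, display_name, cluster_role_templates):
    """Return a GlobalRole manifest as a plain dict.

    Hypothetical helper: it only assembles data and does not call Rancher.
    """
    if not cluster_role_templates:
        raise ValueError("at least one cluster role template is required")
    return {
        # Assumed API group/version for Rancher management resources.
        "apiVersion": "management.cattle.io/v3",
        "kind": "GlobalRole",
        "metadata": {"name": name},
        "displayName": display_name,
        # Field added in Rancher 2.8: role templates granted on every
        # current and future downstream cluster.
        "inheritedClusterRoles": list(cluster_role_templates),
    }


# "custom-cluster-read-only" is a placeholder for a read-only role
# template that would have to exist already.
manifest = build_global_role(
    name="custom-read-only",
    display_name="Custom Read-only",
    cluster_role_templates=["custom-cluster-read-only"],
)

print(json.dumps(manifest, indent=2))
```

The printed JSON could then be saved to a file and applied against the Rancher management cluster, assuming the referenced role template has been created first.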

How To Set Up Monitoring for Your Hybrid Environment

The modern IT landscape consists of many distributed systems, which poses a challenge if you are responsible for their end-to-end performance. For today's platform engineers, that is exactly what the job requires: you must juggle dozens of tools to meet SLAs. This is why a modern solution is needed to bridge the gap between disjointed infrastructure and application stacks, and this is why the Splunk Observability platform was born.

Avoiding vendor lock-in with your IDP

Commercial internal developer portals (IDPs) are a valuable investment for teams that want to move quickly on initiatives around software ownership, production readiness, and developer experience. But there's a common misconception that all commercial IDPs carry an inherent risk of “vendor lock-in” compared with open-source alternatives like Backstage.

What is Network Error Rate & How to Measure It

If you've ever wondered why your network occasionally plays hard to get or experienced those head-scratching moments when everything seems fine, yet something's not quite right – you're in the right place. As the digital landscape evolves, understanding and effectively managing Network Error Rate has become a pivotal aspect of maintaining a robust and efficient network infrastructure.