
Loki 1.3.0 Released!

Welcome to 2020! (We’re a little slow with that on the Loki team.) To kick off the year we are releasing Loki 1.3! Anyone running Loki in microservices mode will be excited by this release as it introduces the Loki Query Frontend. (If you aren’t using microservices, be patient – good things will be coming your way soon.) The query frontend sits in front of the queriers and allows sharding queries based on time.
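As a rough illustration of what sharding a query by time can look like, the sketch below cuts one long time range into consecutive sub-ranges that could be handed to separate queriers. This is not Loki's implementation: the function name `split_by_interval` and the 30-minute interval are assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_by_interval(start, end, interval):
    """Split [start, end) into consecutive sub-ranges of at most `interval`.

    Hypothetical helper for illustration only; not part of Loki.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    ranges = []
    cursor = start
    while cursor < end:
        # The last sub-range is clipped so it never runs past `end`.
        upper = min(cursor + interval, end)
        ranges.append((cursor, upper))
        cursor = upper
    return ranges


if __name__ == "__main__":
    shards = split_by_interval(
        datetime(2020, 1, 1, 0, 0),
        datetime(2020, 1, 1, 1, 15),
        timedelta(minutes=30),
    )
    for lower, upper in shards:
        print(lower.isoformat(), "->", upper.isoformat())
```

Each sub-range can then be queried independently and the partial results merged, which is the general idea behind time-based sharding.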

How to Solve Real World Application Problems With APM - SolarWinds Lab Episode #83

Based on one of the most popular SWUG™ (SolarWinds User Group) sessions of 2019, Jim Hansen, SolarWinds VP of application management products, shows you how to combine user experience monitoring with custom metrics, distributed tracing, log analytics, and log management to provide unparalleled visibility into your custom applications. Jim will demonstrate, step by step, how Pingdom®, AppOptics™, and Loggly® integrate with one another to help you pinpoint performance issues and keep your end users happy.

January 2020 Online Meetup: Securing Your Production Grade Kubernetes Clusters Using Rancher

As DevOps teams deploy Kubernetes in production using Rancher, enterprises must focus on the runtime security and compliance requirements of their cloud-native platforms. Starting with Rancher 2.2, we published self-assessment and hardening guides that outline how to provision a cluster to comply with the CIS Kubernetes benchmark. After identifying gaps and pain points in the process, Rancher engineering added features to both Rancher and RKE to simplify it.

10 Alerts and Visualizations for S3 Server Access Logs to take control of AWS infrastructure

AWS S3 server access logs provide detailed records of the requests made to S3 buckets. They’re useful for many applications. For example, access log information can be useful in security and access audits. It can also help you generate customer insights and better understand your Amazon S3 bill. Coralogix makes it easy to integrate with your S3 server access logs via a Lambda function.

SLOs with Stackdriver Service Monitoring

Service Level Objectives or SLOs are one of the fundamental principles of site reliability engineering. We use them to precisely quantify the reliability target we want to achieve in our service. We also use their inverse, error budgets, to make informed decisions about how much risk we can take on at any given time. This lets us determine, for example, whether we can go ahead with a push to production or infrastructure upgrade.
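To make the relationship between an SLO and its error budget concrete, here is a minimal sketch of the arithmetic. It assumes a simple time-based availability SLO; the function names, the 99.9% target, and the 30-day window are illustrative assumptions and are not taken from Stackdriver Service Monitoring.

```python
def error_budget_minutes(slo_target, window_days):
    """Minutes of allowed unavailability for an availability SLO.

    slo_target is a fraction such as 0.999; the budget is its inverse
    (1 - slo_target) applied to the length of the window.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    return (1 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target, window_days, downtime_minutes):
    """Budget left after subtracting downtime already observed in the window."""
    return error_budget_minutes(slo_target, window_days) - downtime_minutes


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999, 30), 1))
    # After 30 minutes of downtime, roughly 13.2 minutes of budget remain.
    print(round(budget_remaining(0.999, 30, 30), 1))
```

A positive remaining budget suggests there is room to take on risk such as a production push; a budget at or below zero suggests holding off until reliability recovers.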

How To Auto Generate SSL Certificates On The Fly

Customers can generate hosted status pages that display the status of their services (websites, APIs, infrastructure), such as availability, response time, and incidents, to their own clients or for internal use. All status pages are hosted and maintained by us. Users point a custom domain to our DNS, and within seconds their page is ready. Hassle-free.

Unified Monitoring and the Paradox of IT Culture Change

I have a coffee mug on my desk that I got from a sales manager many years ago. It’s now filled with pens (I don’t drink coffee), but I take a look at it once in a while. I was starting to think about IT transformation again and noticed a picture of my daughter and an expired AAA card (I might have been thinking AARP) as I looked at that old mug… it reminded me I’m not getting any younger. But it also reminded me of a paradox of culture change.

Using Honeycomb to remember to delete a feature flag

Feature flags are great and serve us in so many ways. However, we do not love long-lived feature flags. They lead to more complicated code, and when we inevitably default them to be true for all our users, they lead to unused sections of code. In other words, tech debt. How do we stay on top of this? Find out how Honeycomb’s trigger alerts proactively tell you to go ahead and clean up that feature flag tech debt!