okmeter

Kubernetes in Production: Services

Dec 4, 2018 By Nikolay Sivko In okmeter

We migrated all of our services to Kubernetes about six months ago. At first glance, the task seemed quite simple: deploy a cluster, write application specifications, and that’s it. But since we’re obsessed with stability, we had to learn how k8s behaves under pressure, so we tested multiple failure scenarios. Most of the questions that arose were network-related. One particular point of concern was how Kubernetes Services function.

PgBouncer monitoring improvements in recent versions

Oct 15, 2018 By Pavel Trukhanov In okmeter

As I wrote in my previous article, “USE, RED and real world PgBouncer monitoring”, PgBouncer’s admin interface has some nice commands that let you collect stats on how things are going and spot problems, if you know where to look. This post is about the new stats added to these commands in recent PgBouncer versions.

okmeter

Oct 5, 2018

Okmeter.io shows you what's going on with your server infrastructure: deep-dive statistics and comprehensible charts give you insight into the behaviour of server-side processes.

USE, RED and real world PgBouncer monitoring

Sep 25, 2018 By Pavel Trukhanov In okmeter

Brendan Gregg’s USE (Utilization, Saturation, Errors) method for monitoring is quite well known. There are even some monitoring dashboard templates shared on the Internet. There’s also Tom Wilkie’s RED (Rate, Errors, Durations) method, which is said to be better suited to monitoring microservices than USE. We at okmeter.io recently updated our PgBouncer monitoring plugin, and while doing that we tried to comb through everything, using USE and RED as frameworks to do so.

PostgreSQL: why and how WAL bloats

Sep 3, 2018 By Pavel Trukhanov In okmeter

Any changes to a PostgreSQL database are first of all saved in the write-ahead log (WAL), so they will never get lost. Only after that are the actual changes made to the data in memory pages (in the so-called buffer cache), and these pages are marked dirty, meaning they need to be synced to disk later.

Real world SSD wearout

Aug 27, 2018 By Pavel Trukhanov In okmeter

A year ago we added SMART metrics collection to our monitoring agent, which collects disk drive attributes on clients’ servers. So here are a couple of interesting cases from the real world.

Simple/hard metrics that help reduce MTTR when looking for a root cause

Aug 21, 2018 By Pavel Trukhanov In okmeter

Recently there was a mini-incident in a data center where we host our servers. It did not affect our service after all. And thanks to the right operational metrics, we were able to instantly figure out what was happening. But then a thought occurred to me: how we would have been racking our brains trying to understand what was happening without 2 simple metrics.

Monitoring (with) Elasticsearch: A few more circles of hell

Mar 27, 2018 By Pavel Trukhanov In okmeter

This is the second part of our two-part article series devoted to Elasticsearch monitoring. The heading of this article refers to Dante Alighieri’s “Inferno”, in which Dante offers a tour through the nine increasingly terrifying levels of hell. Our journey into Elasticsearch monitoring was also filled with hardships, but we have overcome them and found solutions for each case.

PostgreSQL: Exploring how SELECT Queries can produce disk writes

Mar 4, 2018 By Nikolay Sivko In okmeter

We already wrote about monitoring PostgreSQL queries; at the time we thought that we completely understood how PostgreSQL works with various server resources. Working regularly with the statistics of PostgreSQL queries, we noticed some anomalies and decided to dig a bit deeper for better understanding. Through this process, we found that while the behavior of PostgreSQL is kind of strange at first glance (or at least very peculiar), the clarity of its source code is quite admirable.

Monitoring (with) Elasticsearch: Nine Circles of Hell

Feb 28, 2018 By Pavel Trukhanov In okmeter

We’ve finally put the finishing touches on our Elasticsearch monitoring and officially released it. Only after three complete reworks did we manage to achieve really nice results and detect all the issues in any ES cluster setup.
