
Experiment: Migrating OpenTracing-based application in Go to use the OpenTelemetry SDK

Jaeger’s HotROD demo has been around for a few years. It was written with OpenTracing-based instrumentation, including a couple of OSS libraries for HTTP and gRPC middleware, and used Jaeger’s native SDK for Go, jaeger-client-go. The latter was deprecated in 2022, so we had a choice: either convert all of the HotROD app’s instrumentation to OpenTelemetry, or try the OpenTracing bridge, which is a required part of every OpenTelemetry API/SDK.

An Introduction to Talos Linux: The New Kubernetes Operating System

As the cloud native environment becomes increasingly complex, new systems are needed to tame that complexity and create simplified, secure, and stable working environments. Sidero Labs developed Talos Linux as a way to run Kubernetes consistently across all platforms, such as Edge, Cloud, Virtual, and Bare Metal. Talos Linux is a secure Linux distribution designed specifically for running Kubernetes.

How to Ensure SCCM Client Compliance on All Endpoints with Nexthink

SCCM is one of the most business-critical applications, a must-have on all devices. Administrators use SCCM for endpoint protection, software distribution, and patch management. Any machine where the SCCM client is not functioning will be unable to receive necessary policies or application updates, which exposes your organization to compliance and security issues.

BIG Changes to Windows Feature Updates

It is safe to say that anyone responsible for patch management will have had their fair share of issues with Windows Feature Updates over the years. These updates have amounted to new operating system versions being released up to twice a year and, needless to say, could be huge, ranging anywhere from 3GB to 6GB. This resulted not only in long download times but also in lengthy install times (anywhere from one to two hours) that required multiple reboots to complete.

Inside ObservabilityCON: 'I picked up so much practical information'

I’ve always been wary about vendor events. In my experience, many of them are mostly marketing pitches, with little or no content that is applicable to my use cases. Despite that, last year I decided to convince my manager to let me attend ObservabilityCON 2022 to see what I could learn from it. My hope was that I would be able to get practical knowledge that could be applied as soon as I got back to work. (Spoiler alert: I did!)

How we reduced flaky tests using Grafana, Prometheus, Grafana Loki, and Drone CI

Flaky tests are a problem found in almost every codebase. By definition, a flaky test is a test that both succeeds and fails without any changes to the code. For example, a flaky test may pass when someone runs it locally, but then fail on continuous integration (CI). Another example: a flaky test may pass on CI, but when someone pushes a commit that hasn’t touched anything related to the test, it then fails.
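
The definition above can be turned into a simple check. What follows is a minimal sketch, not the approach taken in the article: it assumes test results are available as plain records, and the `Run` type, its fields, and the `FlakyTests` function are hypothetical names introduced only for illustration. A test is flagged when it has both a passing and a failing run against the same commit, that is, with no code change in between.

```go
package main

import (
	"fmt"
	"sort"
)

// Run is one recorded execution of a test.
// Hypothetical record shape; a real CI system will expose different fields.
type Run struct {
	Commit string // commit SHA the test ran against
	Test   string // test name
	Passed bool   // outcome of this run
}

// FlakyTests returns the sorted names of tests that both passed and
// failed on at least one identical commit.
func FlakyTests(runs []Run) []string {
	type key struct{ commit, test string }
	type outcome struct{ passed, failed bool }

	// Collect the set of outcomes seen for each (commit, test) pair.
	seen := make(map[key]outcome)
	for _, r := range runs {
		k := key{r.Commit, r.Test}
		o := seen[k]
		if r.Passed {
			o.passed = true
		} else {
			o.failed = true
		}
		seen[k] = o
	}

	// A pair with both outcomes means the result changed while the code did not.
	flaky := make(map[string]bool)
	for k, o := range seen {
		if o.passed && o.failed {
			flaky[k.test] = true
		}
	}

	names := make([]string, 0, len(flaky))
	for n := range flaky {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	runs := []Run{
		{"abc123", "TestCheckout", true},
		{"abc123", "TestCheckout", false}, // same commit, different result
		{"abc123", "TestLogin", true},
		{"def456", "TestLogin", false}, // different commit, so not flagged
	}
	fmt.Println(FlakyTests(runs)) // prints [TestCheckout]
}
```

This check only covers the first kind of flakiness, differing results on identical code. The second example, a failure after an unrelated commit, would need extra information about which files each commit touched, which this sketch does not model.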

Quick! Grab all the evidence: Capturing application state for post-incident forensics.

Everyone loves a good mystery thriller. OK, not everyone – but Hollywood certainly does. Whether it’s Sherlock Holmes or Hercule Poirot, audiences clearly enjoy a gripping plot about hunting down the culprit behind some heinous crime.

Just Stick to the Script?

Have you ever patched your servers using scripts, only to realize that you missed another script for pre-update configuration compliance that had to run beforehand? Most IT organizations start by scripting simple, repetitive tasks in Python, PowerShell, or Perl to get the quickest ROI. Common pain points addressed this way include creating user accounts, installing patches and software, and provisioning resources such as virtual machines (VMs).

What is HAProxy, and what is it used for?

In December 2022, the latest version of HAProxy, 2.7.0, was released. This open-source software is both a proxy and a load balancer, and is immensely popular due to the sheer volume of features it provides to help reduce or even avoid downtime and manage web traffic. Website or application downtime is disastrous for businesses. You want to serve as many users as possible, but if you have nothing in place to manage traffic, then your web applications can quickly become overwhelmed and fail.