Operations | Monitoring | ITSM | DevOps | Cloud

NIST 800-53 compliance for containers and Kubernetes

In this blog, we will cover the various requirements you need to meet to achieve NIST 800-53 compliance, as well as how Sysdig Secure can help you continuously validate NIST 800-53 requirements for containers and Kubernetes. NIST 800-53 rev4 is deprecated since 23 September 2021 Read about the differences between versions down below →

10 reasons you need a network configuration manager

On June 2, 2019, Google Cloud Platform had a major network outage that disrupted the services of Discord, Spotify, and Snapchat, among many others. The root cause was a benign misconfiguration coupled with a software bug that caused the loss of configuration data. The issue was resolved almost four hours later after the lost configuration data was rebuilt and redistributed.

Tutorial: Setting up AWS CloudWatch Alarms

AWS CloudWatch is a service that allows you to monitor and manage deployed applications and resources within your AWS account and region. It contains tools that help you process and use logs from various AWS services to understand, troubleshoot, and optimize deployed services. I’m going to show you how to get an email when your Lambda logs over a certain number of events.

Using Jaeger for your microservices

Jaeger is a popular open-source tool used for distributed tracing in a microservice architecture. In a microservice architecture, a user request or transaction can travel across hundreds of services before serving what a user wants. Distributed tracing helps to track the performance of a transaction across multiple services. Before we deep dive into how Jaeger accomplishes distributed tracing for microservices-based architecture, let's take a short detour to understand distributed tracing.

7 traits to look for when targeting new prospects

I came across an interesting sales stat in a 2020 report from Sales Insight Lab. They interviewed 400 sales professional and found that 71% of respondents said at least 50% of the prospects they were engaged with were NOT a good fit for what they were selling. This tells me there is a lot of wasted time and effort trying to convince incompatible businesses to sign with you. So how can you improve your odds of converting prospects?

Tips for designing distributed systems

With companies expecting software products to handle constantly increasing volumes of requests and network bandwidth use, apps must be primed for scale. If you need resilient, resource-conserving systems with rapid delivery, it is time to design a distributed system. To successfully architect a heterogeneous, secure, fault-tolerant, and efficient distributed system, you need conscientiousness and some level of experience.

Monitoring PostgreSQL With pgmetrics and pgDash

I am currently trialing pgmetrics and pgDash for monitoring PostgreSQL databases. Here are my notes on it. pgmetrics is a command-line tool you point at a PostgreSQL cluster and it spits out statistics and diagnostics in a text or JSON format. It is a standalone binary written in Go, and it is open source. Here is a sample pgmetrics report. Rapidloop, the company that develops pgmetrics, also runs pgDash – a web service that collects reports generated by pgmetrics and displays them in a web UI.

Unexpected Parallels Between Yoga and Observability

Yoga is to ideal human health what observability is to an application’s ideal functioning. It is well established that observability is a critical factor for the successful implementation and maintenance of cloud-native, serverless, cloud-agnostic, and microservices-based applications. Well-established observability helps DevOps and development teams cross the boundaries of complex systems and get complete visibility into their functioning.

Getting started with Memory attacks

Memory (or RAM, short for random-access memory) is a critical computing resource that stores temporary data on a system. Memory is a finite resource, and the amount of memory available determines the number and complexity of processes that can run on the system. Running out of RAM can cause significant problems such as system-wide lockups, terminated processes, and increased disk activity. Understanding how and when these issues can happen is vital to creating stable and resilient systems.