
10 Best Linux Monitoring Tools and Software to Improve Server Performance [2021...

Linux is one of the most popular operating systems today, powering a large portion of the Internet. According to W3Techs, almost half of today’s top-ranked 1 million websites currently run on Linux systems. So, if you want your site—and the application(s) running on it—to be high-performing with lots of uptime, you need to ensure the availability and reliability of your Linux-based servers.

What SREs Can Learn from Facebook's Largest Outage

Facebook’s October 2021 outage was the type of event that gives SREs nightmares: A series of critical business apps crashed in minutes and remained unavailable for hours, disrupting more than 3.5 billion users around the world and costing about 60 million dollars. As incidents go, this was a pretty big one.

What Is Kubernetes Pod Disruption?

Kubernetes pods are the smallest deployable units in the Kubernetes platform. Each pod represents a single instance of a running process and runs on a node, a worker machine that may be virtual or physical. Occasionally, Kubernetes pod disruptions occur, from either voluntary or involuntary causes.

An intro to Infrastructure as Code

Infrastructure as Code (IaC) is the practice of recording the desired state of your infrastructure using a declarative language. In this article, I’m going to assume that your team is starting from scratch. Maybe some of your build process has been scripted, and maybe there is some manual testing and quality assurance work happening. Many readers will find that they are midway through the IaC adoption journey I’ll describe, or that they have missed some steps.
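To make the idea of a recorded desired state concrete, here is a minimal sketch in Python. It is not taken from the article and does not model any real IaC tool: the `plan` function, the resource names, and the `size`/`count` fields are all hypothetical. It only shows the general pattern of comparing a declared state against what actually exists and deriving the changes needed.

```python
# Hypothetical illustration: resources are plain dicts keyed by name.
# Real IaC tools use their own declarative languages and state formats.

def plan(desired, actual):
    """Return the (action, name) steps needed to make `actual` match `desired`."""
    actions = []
    # Anything declared but missing must be created; anything that differs is updated.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    # Anything that exists but is no longer declared must be removed.
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Assumed example data, for illustration only.
desired = {
    "web": {"size": "small", "count": 2},
    "db": {"size": "large", "count": 1},
}
actual = {
    "web": {"size": "small", "count": 1},
    "cache": {"size": "small", "count": 1},
}

print(plan(desired, actual))
# [('create', 'db'), ('update', 'web'), ('delete', 'cache')]
```

The point of the declarative approach is that you edit only `desired`; the steps to get there are computed rather than scripted by hand, and running the plan against an already matching environment yields no actions.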

The Role of Fintech In The Financial Industry

With the rise and advancement of technology and artificial intelligence, there is no question that new rationalists are joining this group to enforce their ideas. However, most Rationalists and highly successful people do not hold back anything, and maybe this is because the technological and financial industry embraces disruptors of the norm. One rationalist concept adopted by the technology and financial sector is Fintech.

Daily SQL Server Performance Checklist for DBAs

A main focus for database administrators (DBAs) is to ensure server environments are optimized and performance is at its peak. Whether you’re a DBA starting in a new role and are evaluating an existing environment for the first time, or you’re a senior database professional with the ongoing task of maintaining optimal performance, following key SQL Server best practices for installing, configuring, and ensuring new instances of SQL Server are consistently deployed is all in a day’s work.

Incident Review - An Account Of The Telia Outage And Its Ripple Effect

Another major outage on the Internet has taken place today. Telia, a major backbone carrier in Europe, suffered from a network routing issue between 16:00 and 17:05 UTC. This had a huge ripple effect, causing issues for multiple key companies providing critical cloud and infrastructure services. Companies affected include:

- Google Cloud
- Equinix Metal
- Cloudflare
- Fastly
- NS1

It’s always arresting to see the secondary and tertiary effects that a major outage can have.

Mapping FluxCD Applications

Flux is a CNCF-based, open source stack of tools. Flux focuses on keeping Kubernetes clusters and cloud-native applications in sync with external resources and definitions hosted in environments such as GitHub. Implementing tools like FluxCD should enable you to achieve results such as: The results above can bring obvious benefits, and many teams are adopting FluxCD as their tool of choice for GitOps.