Latest Posts

The Importance of Uptime for Your Website

Web-based computing services have revolutionized business operations. Many organizations now look to reduce or eliminate expenditure, increase efficiency, and maximize profits by moving their processes online, drawn by the flexibility and scalability the cloud affords them. With this sea change to online, cloud-based operations has come a new challenge: availability.

How the Great Firewall of China Affects Performance of Websites Outside of China

The Great Firewall of China, or as it’s officially called, the Golden Shield Project, is an internet censorship project that blocks people from accessing specific foreign websites. It is the world’s most advanced and extensive internet censorship program. The project uses multiple techniques and tactics to censor China’s internet, and it controls the internet gateways to analyze, filter, and manipulate traffic flowing between China and the rest of the world.

WebSocket Application Monitoring

WebSockets have been around for over a decade now, but the real-time web existed long before they arrived. This earlier ‘real-time’ web was typically slower and harder to achieve. It was built by hacking together available web technologies that were not primarily designed for real-time applications. No solution offered TCP/IP socket-style capabilities while addressing all the concerns of operating in a web environment.

Optimizing Web Performance: Understanding Waterfall Charts

Waterfall charts are diagrams that show, on a timeline, how a website’s resources are downloaded and parsed by the engine, making the sequence of resources and the dependencies between them visible. They help identify where important events happened during the loading process. They also let you easily see how well or poorly your website performs, showing exactly what is slowing it down.

The 10 Most Common HTTP Status Codes

For a typical Internet user, nothing is more frustrating than waiting for a web page to display, only to receive a “Page Not Found” 404 error status code. Sure, we try reloading the page, and sometimes that gets the gremlins to start working, but most of the time the issue is out of our hands. As typical users, we either move on to the next thing or find a different site. There’s a lot going on in the background that most of us are completely unaware of.

Top 13 Site Reliability Engineer (SRE) Tools

The role and responsibilities of a site reliability engineer (SRE) may vary depending on the size of the organization, and so do the tools a site reliability engineer uses. For the most part, a site reliability engineer is focused on multiple tasks and projects at one time, so for most SREs, the various tools they use reflect their ever-evolving responsibilities.

SRE Incident Management: Overview, Techniques, and Tools

In the world of a site reliability engineer (SRE), failure is not only an option, but also expected. Systems, web applications, servers, devices, and so on are all prone to performance issues and unexpected outages at some point. It is an unavoidable fact. These unexpected failures can lead to huge revenue losses, lost customer trust, and, depending on the industry, possibly fines. Fortunately, SRE incident management is one of the core practices used to limit the disruption caused by unexpected issues.

Monitoring Distributed Systems

There was a time when standing up a website or application was simple and straightforward, without the complex networks involved today. Web developers and administrators did not have to worry about, or even consider, the complexity of today’s distributed systems. The recipe was straightforward. Do you have a database? Check. Do you have a web server? Check. Great, your system was ready to be deployed.

SRE Principles: The 7 Fundamental Rules

In one of our previous articles, we discussed what an SRE is, what they do, and some of the common responsibilities that a typical SRE may have, like supporting operations, dealing with trouble tickets and incident response, and general system monitoring and observability. In this article, we will take a deeper dive into the various SRE principles and guidelines that a site reliability engineer practices in their role.

Top 13 Site Reliability Engineer (SRE) Tools

The role and responsibilities of a site reliability engineer (SRE) may vary depending on the size of the organization. For the most part, a site reliability engineer is focused on multiple tasks and projects at one time, so for most SREs, the various tools they use reflect their ever-evolving responsibilities. A typical SRE is busy automating, cleaning up code, upgrading servers, and continually monitoring dashboards for performance, among other things, so they are going to carry more tools in that toolbelt.