Blog

The Complete AWS Lambda Handbook for Beginners (Part 1)

Welcome to the Serverless world. One of the first things you’ll hear about is AWS Lambda - and you’ll keep hearing about it! While an architecture can be serverless without any Lambdas involved, Lambda is very often the key component of a serverless application. In the first post of this 3-part AWS Lambda Handbook series, we run through what AWS Lambda is, going back to basics with the terminology, how to create a Lambda function and how to run it.
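As a rough illustration of what a Lambda function looks like, here is a minimal sketch of a Python handler. The `(event, context)` signature is the standard convention for Python handlers on AWS Lambda; the `name` field in the event and the shape of the response are assumptions made for this example, not something taken from the handbook.

```python
import json


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls for each invocation.

    `event` carries the input payload as a dict; `context` carries runtime
    metadata and is unused in this sketch.
    """
    # "name" is a hypothetical input field chosen for this example.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be called locally with a sample event, for example `lambda_handler({"name": "Ada"}, None)`, before it is deployed.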

How to Ensure and Optimize Application Performance

Application performance management (APM) involves managing the availability and efficiency of software applications by obtaining and translating IT metrics into business outcomes. Because application performance hiccups are inevitable, it’s crucial to have a robust APM solution in place so IT teams can resolve problems faster and more efficiently, lessening the impact on users and the bottom line. One factor muddling application performance optimization is the speed of change.

Analyze your logs quickly with suggested queries beta in Cloud Logging

Cloud Logging is a popular tool that helps developers, operators, and other users find the root cause of issues in their infrastructure. With features like the Logs Explorer, you can quickly and efficiently retrieve, view, and analyze logs. To help you get the most out of your logs, we’re excited to introduce suggested queries in Cloud Logging, which highlight important logs so you can start analyzing and troubleshooting issues quickly.

The State of Robotics - August 2020

So that’s the summer gone (hopefully, that heat was awful). Or winter if that’s where you are. Seasons change and so does the state of robotics. Fortunately, that’s what we’re here for. Before we get into it, as ever, if you’re working on any robotics projects that you’d like us to talk about, be sure to get in touch.

Network Monitoring: More Dashboards Lead to More Clutter. Here is What the Experts Do.

Cloud and digital transformation have made business operations more efficient than ever. However, with more connected tools, devices, and platforms, monitoring has become a major challenge for IT professionals. Here is a simple analogy: imagine you are the captain of a ship with thousands of passengers. Control of the entire ship rests in your hands. To better understand what is going on aboard, you have installed various tools.

How to monitor IIS effectively

Microsoft Internet Information Services (IIS) is a popular web server for hosting web applications and is widely used in sectors such as healthcare, banking, e-commerce, and logistics. The IIS web server is the backbone of many IT infrastructures. But if it encounters problems, websites and applications can suffer higher response times and timeouts, and end users either leave your website or complain about its performance.

No More False Alerts at Night

Do you know this situation? You are on-call and in the middle of the night you get a phone call. Loud enough to wake you up. Loud enough to wake your wife up, as well. You get up and check your emails to see what the problem is. OK, you got it. Then you log on to the console of your monitoring tool and – green. Green? False alert? Why did you get the call then? After double-checking, still a bit sleepy, you realize that the problem has resolved itself automatically.

Use proxy to process complex or aggregated data

Imagine you need a monitor that reacts to a value derived from several performance values. For example, you might need to trigger an alert only if CPU load and free memory have both crossed certain thresholds. If those monitors relate to the same host, you can always use a generic monitor type, such as Script or Program, a Python script, etc., and do whatever math is required. What should you do if the performance values can only be taken from different hosts? There are several solutions.
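For the single-host case, the combined check could look something like the following Python sketch. It is only an illustration: the `HostSample` type, the `should_alert` function, and the threshold values are hypothetical and not part of any particular monitoring tool, and collecting the two readings is left out.

```python
from dataclasses import dataclass


@dataclass
class HostSample:
    """One reading from a host (hypothetical structure for this sketch)."""
    cpu_load_percent: float
    free_memory_mb: float


def should_alert(sample: HostSample,
                 cpu_threshold: float = 90.0,
                 min_free_memory_mb: float = 512.0) -> bool:
    """Return True only when both thresholds are crossed at the same time."""
    cpu_breached = sample.cpu_load_percent > cpu_threshold
    memory_breached = sample.free_memory_mb < min_free_memory_mb
    return cpu_breached and memory_breached
```

With these assumed defaults, a sample of 95% CPU load and 200 MB free memory triggers an alert, while high CPU load alone or low memory alone does not.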

SRE Leaders Panel: Testing in Production

Blameless recently had the privilege of hosting some fantastic leaders in the SRE and resilience community for a panel discussion. Our panelists discussed testing in production, how feature flagging and testing can help us do that, and how to get managers to be on board with testing in production. The transcript below has been lightly edited, and if you’re interested in watching the full panel, you can do so here.