Many of you may be reading this blog from home, a remote office somewhere, a family member’s house, or—if Zoom backgrounds are to be trusted—the cockpit of the Millennium Falcon. We’re all learning how to get better at “working where we are,” and that includes optimizing the tool stack you use each day.
In today’s world, systems are increasingly becoming more and more complex. Due to this complexity, it’s no longer a matter of “if” our systems will fail but “when”. To manage expectations for when our systems do fail, we can look no further than our Service Level Agreement.
In recent years, digital transformation projects have dominated the tech priorities of most IT departments – and rightfully so given that they are tasked with ensuring their organizations stay relevant in a fast-changing world where customer expectations are soaring, and agility is everything. However, the COVID-19 pandemic has thrown a curveball to businesses around the globe.
Microsoft Azure customers worldwide now gain access to HackerBay’s Fyipe to take advantage of the scalability, reliability, and agility of Azure to drive application development and shape business strategies.
This tip is for those who are using Prometheus federation to monitor multiple clusters. How should alertmanager be configured for multiple clusters? Let us say that if there’s an issue for Cluster A it only needs to send an alert for cluster A? In such cases, every alert should be routed to proper team based on labels (if there is problem with application A on cluster B - team responsible should be notified). In the above case, two alerts are triggered by the same rule.