Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

It's a known issue - How Product Managers should deal with issue or feature related enquiries or feedback

I often hear folks in my network being triggered by interactions with product managers at their companies whenever they follow up on certain product-related issues. The triggering phrase is invariably “It’s a known issue”. And they often wonder: well, if it’s a known issue, why on earth isn’t anything being done about it?

How to build a customer advisory board

Regardless of where you are in your product journey, it is imperative that you constitute a customer advisory board whose members can share perspectives on their business challenges, so that you can gain insights on how to shape your roadmap, develop new features, and formulate your vision, while getting constant feedback on your product. So, how many customers should you include in a customer advisory board? Should you target higher-level stakeholders or individual users?

Defining your Sev-1s

One of the first things you need to figure out when your team is formulating its incident management process is how to describe in words what a Sev0 (your highest incident priority) looks like. “Website doesn’t work” is certainly not enough. “Website is up but a key resource (i.e. a CSS file) is missing, rendering the website unusable” is still not enough. “A single page on the website is 404’ing” is not a major incident but could be a minor one.

Sending Nagios alerts to Microsoft Teams and rapid incident response with Zenduty

Nagios is one of the most widely used open-source network monitoring tools, relied on by thousands of NOC teams globally to monitor the health of a vast array of hosts and services. Most teams rely on email as their primary Nagios alert notification channel, which means it may take your NOC team a few minutes to respond.
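
As a rough illustration of the idea, here is a minimal Python sketch of turning a Nagios alert into a Microsoft Teams message. It assumes a Teams incoming webhook that accepts a simple JSON body with a `text` field, and that the alert arrives as a dictionary keyed by standard Nagios macro names such as `HOSTNAME` and `SERVICESTATE`. The function names `build_teams_payload` and `post_to_teams` are hypothetical, and the webhook URL is something you would supply yourself.

```python
import json
import urllib.request


def build_teams_payload(alert: dict) -> dict:
    """Turn a Nagios alert (macro name -> value) into a simple Teams message body.

    Assumption: the webhook accepts a JSON object with a single "text" field.
    """
    host = alert.get("HOSTNAME", "unknown host")
    service = alert.get("SERVICEDESC", "")
    # Service alerts carry SERVICESTATE/SERVICEOUTPUT; host alerts carry HOSTSTATE/HOSTOUTPUT.
    state = alert.get("SERVICESTATE") or alert.get("HOSTSTATE", "UNKNOWN")
    output = alert.get("SERVICEOUTPUT") or alert.get("HOSTOUTPUT", "")
    target = f"{host}/{service}" if service else host
    return {"text": f"**{state}**: {target} - {output}"}


def post_to_teams(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the (user-supplied) webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the payload builder separate from the network call makes the formatting easy to check without a live webhook.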

Product Metrics for Discovery Activities

Most companies today compile a set of metrics for their product teams to report regularly to company management. This includes a variety of product performance metrics (usage frequency, churn rate, NPS, etc.). But a lot of them struggle with metrics for product discovery activities. So how do you track discovery?

Two tips to incorporate the voice of the customer in your story grooming/sprint planning

Constantly talking to your users about their business problems and incorporating solutions to those problems is key to the success of your product and company. There are many ways to bring the voice of your users into your product planning. Formulate an experience brief that’s less than 2 pages, or put together a 5-minute clip of user interviews. Best of all, have devs join the interviews and discovery activities with you.

Learning from Incidents - what to do after you write a postmortem?

For folks who’ve made postmortems more meaningful at their company, it is important to spread that learning around. A lot of companies have teams that do postmortems really well, and a lot of engineering managers (EMs) want to spread the practice organically. But writing and following postmortems is the kind of practice that a lot of devs just don’t think about or care about, and it can get extremely hard to enforce, especially without support from upper management.

Disaster recovery in AWS, GCP and Azure - thoughts on capacity planning and risks

One of the most popular cloud disaster recovery models in the industry today is the “pilot light” model, where critical applications and data are already in place so that they can be quickly retrieved if needed. A simple question to ask before adopting this model is what thought has been given to whether the AWS/GCP/Azure APIs will work and whether the requisite capacity will be available in the alternate region.