How to reduce your MTTR

By Joe Edwards

Feb 8, 2019

3 minutes

Squared Up

If you're a senior leader within IT, one of the core KPIs held against your name will be the Mean Time to Repair (MTTR) for application outages. And with bonus season on the horizon, here are six quick tips that will help you get on top of your MTTR stats and help IT win over the business along the way.

So, without further ado, let's crack on...

1. Focus on the important stuff

OK. So this one might not cut your MTTR directly, as I said these tips would, but if I hold my hands up now then hopefully you're more likely to trust me further down the line!

Yes, MTTR is super important, but if IT are working on low-priority items rather than business-critical applications then that's a sure-fire way for IT to lose friends and alienate people. However, if you focus on the things that matter most then you'll get yourself in the business's good books.

And with four in ten business executives believing that IT can be significantly replaced by third-party services, focusing in on business needs will help prove the doubters wrong.
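To make that concrete, here's a rough sketch of how you might split MTTR by priority, so a slow fix on a low-priority ticket doesn't hide how you're doing on the applications that matter. The incident records, field layout and priority names are all made up for illustration, and "repair time" is assumed to run from detection to resolution.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical incident records: (priority, detected, resolved).
INCIDENTS = [
    ("critical", "2019-01-07 09:00", "2019-01-07 10:30"),
    ("critical", "2019-01-15 14:00", "2019-01-15 14:30"),
    ("low", "2019-01-09 11:00", "2019-01-09 19:00"),
]

FMT = "%Y-%m-%d %H:%M"


def mttr_by_priority(incidents):
    """Return the mean time to repair, in minutes, for each priority tier."""
    durations = defaultdict(list)
    for priority, detected, resolved in incidents:
        # Assumption: repair time is measured from detection to resolution.
        delta = datetime.strptime(resolved, FMT) - datetime.strptime(detected, FMT)
        durations[priority].append(delta.total_seconds() / 60)
    return {tier: sum(mins) / len(mins) for tier, mins in durations.items()}


print(mttr_by_priority(INCIDENTS))
```

Reporting the critical tier on its own is one simple way to keep the conversation with the business focused on what they actually care about.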

2. Update your knowledge base

Now this is a simple one, but with Occam's razor front of mind, let's do the simple things well.

If employees were given the choice between fixing something themselves, or helplessly hanging on to the other end of the phone as they idly wait for support, you'd hope the majority would choose the former. A comprehensive knowledge base will therefore ensure that those who want to help themselves can, and those who don't, well, you still have to deal with them I'm afraid...

Following this mantra will also help the Service Desk steer well clear of the industry average (24.2 hours!!) for a first response.

3. Think applications, not infrastructure

Whilst infrastructure monitoring tools like SCOM will do a stellar job collecting rich health data across your entire technology stack, they often fail to present this information back to their users in a meaningful form. Monitoring strategies that are built from the top-down, rather than low-level component monitoring, will help organizations focus their attention on what matters most.

Put simply, you don’t care whether a mobile phone has an A12 Bionic chip or a Kirin 980. What really matters is what you can do with it. Take great photos. Record videos. Download all the latest apps. The same holds true for enterprise applications.

And to add more fuel to this particular fire, organizations that use Application Performance Management (APM) tools have been shown to reduce their MTTR by 27%. Pretty impressive stuff.

4. Map Every App

APM tools have already proven they can help reduce your MTTR, but with enterprise IT typically responsible for hundreds, or even thousands, of applications, APM doesn’t scale at the rate you need. This is where Enterprise Application Monitoring (EAM) steps in.

EAM tools typically utilize your existing monitoring data and help you put all that information in the context of your key applications and services, without the need for any new agents, databases or infrastructure. However, as with most things, there is a compromise. EAM tools won't provide the code-level insights that are available with APM, but the upside is that they will scale application-focused monitoring to the masses, helping you and your teams access monitoring in the context of business-critical applications and avoid costly outages like that of Delta Air Lines back in 2016.
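As a loose illustration of the idea (not how any particular EAM product works), here's a minimal sketch that rolls existing component health up to the application level using a "worst state wins" rule. The application names, server names, health states and the roll-up rule itself are all assumptions made for the example.

```python
# Hypothetical health states, ordered from best to worst.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}

# Hypothetical map of applications to the components they depend on.
APP_COMPONENTS = {
    "Booking": ["web-01", "sql-01"],
    "Payroll": ["app-07"],
}

# Health data you already collect from your existing monitoring tool.
COMPONENT_HEALTH = {"web-01": "healthy", "sql-01": "critical", "app-07": "warning"}


def application_health(app_components, component_health):
    """Roll component health up to each application: the worst state wins."""
    result = {}
    for app, components in app_components.items():
        # Assumption: a component with no health data is treated as a warning.
        states = [component_health.get(name, "warning") for name in components]
        result[app] = max(states, key=SEVERITY.__getitem__)
    return result


print(application_health(APP_COMPONENTS, COMPONENT_HEALTH))
```

Note that nothing new is being collected here. The only extra ingredient is the map from applications to components, which is the bit that gives the data its context.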

5. Contextualize existing data

With 31.9% of IT security professionals ignoring alerts, it goes to show that monitoring, without context, is a waste of everyone's time.

Put off by the user experience offered by centralized monitoring, many users will instead turn their attention to their own niche tools. And whilst these tools excel within their own narrow focus, they do little to discourage monitoring silos and can engender an “us versus them” culture within IT. However, if you were to take all that juicy monitoring data and put it in the context of your key business services, then you could quickly escalate application performance issues to the right team, at the right time. Now everybody is on the same page.
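Here's a small sketch of what putting alerts in context could look like in practice: each raw alert is tagged with the business service it affects and the team that owns the component, so it can be escalated to the right people. The component names, services, teams and the fallback to the service desk are all hypothetical.

```python
from collections import defaultdict

# Hypothetical map from monitored component to (business service, owning team).
SERVICE_MAP = {
    "sql-01": ("Online Banking", "database team"),
    "web-03": ("Online Banking", "web team"),
    "mail-02": ("Email", "messaging team"),
}


def contextualize(alerts):
    """Group raw (component, message) alerts by the business service they affect."""
    grouped = defaultdict(list)
    for component, message in alerts:
        # Assumption: anything unmapped falls back to the service desk.
        service, team = SERVICE_MAP.get(component, ("Unmapped", "service desk"))
        grouped[service].append(
            {"component": component, "message": message, "team": team}
        )
    return dict(grouped)


alerts = [
    ("sql-01", "High CPU"),
    ("mail-02", "Queue length rising"),
    ("web-03", "Slow response"),
]
for service, items in contextualize(alerts).items():
    print(service, [(item["team"], item["message"]) for item in items])
```

Two alerts that would otherwise land in two separate silos now show up side by side under the same business service.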

6. Don't try to be all things to all people

Worse than the stat above, only 9% of IT organizations use business-related metrics to measure success. And if you ask me, that's either a pretty arrogant or a pretty naive approach to take with IT.

Monitoring should never try to be all things to all people. Vanilla dashboards do more harm than good. Instead, look to create custom dashboards that will inspire action before the service desk gets bombarded with angry calls. Whilst application owners and senior management want to know if they’re hitting their SLAs, the server team wants visibility of performance issues before all hell breaks loose.

As always, the best fix is the one that prevents an issue occurring in the first place…