A good, proactive Problem Management process is a crucial way to minimize the impact of disruptions whilst increasing your customers' satisfaction and trust when incidents occur. Plus, it means you won't have to solve the same repeat Incidents over and over again!
But if you want to get the most out of your process, it's good to get off the ground with some best practice ideas. I implement Problem Management processes with organisations across the UK, so I thought I'd share some of my best tips and tricks for each step of the process here.
In this blog, I'll go through some nifty ideas for everything from problem detection to logging your problems. I'll take it step by step, so for some more over-arching best practices to keep in mind while implementing better Problem Management you can read my other Problem Management blog over here.
But first things first:
What is Problem Management?
So, what actually is a ‘Problem’ in ITSM-terms? ITIL provides the succinct definition that a problem is an underlying cause behind one or more incidents. There are several industry analogies that help contextualize this. For instance, incidents can be seen as the symptoms of an underlying illness: they reveal that there is a problem. It’s a two-way relationship. The problem, then, is the reason that incidents are occurring. Before reading on, for further information on the distinction between incidents and problems see our dedicated blog here.
Detecting the problem at hand is, naturally, the first part of any problem management process. This may happen in many shapes or forms: perhaps the Problem Manager notices a trend in the recent incidents logged, or the system monitoring tool picks up on it. One of the most important things at this stage is to have open conversations throughout the team. Often when the Problem Manager is beginning to see a correlation between incidents, the other team members on the ground will have relevant anecdotes and insights to offer, which can help a lot down the line.
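If you want a feel for what "noticing a trend in recent incidents" could look like in practice, here's a minimal Python sketch. It assumes your tool can export incidents with a date, a category and an affected asset. The sample data, the function name `find_problem_candidates`, the seven-day window and the threshold of three are all hypothetical choices for illustration, not a standard.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident log: (date logged, category, affected asset).
incidents = [
    (date(2024, 3, 1), "email", "mail-server-01"),
    (date(2024, 3, 2), "email", "mail-server-01"),
    (date(2024, 3, 2), "printing", "printer-3f"),
    (date(2024, 3, 4), "email", "mail-server-01"),
    (date(2024, 1, 10), "email", "mail-server-01"),  # too old to count
]


def find_problem_candidates(incidents, today, window_days=7, threshold=3):
    """Return (category, asset) pairs with at least `threshold`
    incidents logged in the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(
        (category, asset)
        for logged, category, asset in incidents
        if logged >= cutoff
    )
    return [key for key, n in counts.items() if n >= threshold]


candidates = find_problem_candidates(incidents, today=date(2024, 3, 5))
print(candidates)  # [('email', 'mail-server-01')]
```

A count like this only flags candidates. It's still worth talking to the people who handled those incidents before you log a Problem.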
Cultivating a culture of open communication, and allowing the team to report trends that they’re seeing on the ground, can be a pivotal method of identifying problems too. Focusing on the softer, more interactive, side of things is a good place to start. I wrote a piece on creating a Knowledge Sharing culture here, which is focused on Knowledge Management, but a lot of the tips are equally applicable to Problems.
After detection, it’s vital to make sure that you have a system to log everything, and that each Problem is categorized and prioritized correctly. Ideally, the record also reflects conversations you've had about the error previously. Spending some time educating colleagues on the reasoning behind this can be especially useful. Everybody likes to know the reason why processes are in place, especially if you frame it as what's in it for them.
Train your team not only on how to write up detected Problems using the right format in your Problem Wizard, but also on why this is done. This can ensure that everyone is on board with your Problem Management procedures and actually follows them.
Using a tried and tested formulaic system of writing things up is a great way to maximize efficiency. This can make sure that styles, formats, and acronyms are aligned across the whole of the team. This makes it much - much - easier to understand everyone’s separate contributions, as you’re all working in the same way. In the long term, it also means that you’ll generate a library of Problems that are all recorded in the same way, and a neat and tidy database is everyone’s friend!
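To make the idea of a formulaic write-up a bit more concrete, here's a rough Python sketch of a Problem record that enforces one format. The field names, the `ProblemRecord` class and the category and priority lists are all assumptions for illustration; your own tool will have its own fields and values.

```python
from dataclasses import dataclass, field

# Hypothetical allowed values; substitute your own organisation's lists.
CATEGORIES = {"hardware", "software", "network", "access"}
PRIORITIES = {"low", "medium", "high", "critical"}


@dataclass
class ProblemRecord:
    title: str
    category: str
    priority: str
    description: str
    linked_incidents: list = field(default_factory=list)

    def __post_init__(self):
        # Normalise so "Network " and "network" are stored identically.
        self.category = self.category.strip().lower()
        self.priority = self.priority.strip().lower()
        if self.category not in CATEGORIES:
            raise ValueError(f"Unknown category: {self.category!r}")
        if self.priority not in PRIORITIES:
            raise ValueError(f"Unknown priority: {self.priority!r}")
        # A Problem is the cause behind one or more incidents.
        if not self.linked_incidents:
            raise ValueError("Link at least one incident to the Problem")


record = ProblemRecord(
    title="Intermittent VPN drops",
    category="Network ",
    priority="High",
    description="Remote users lose their VPN connection several times a day.",
    linked_incidents=["INC-101", "INC-107"],
)
print(record.category, record.priority)  # network high
```

Rejecting unknown categories at the point of entry is one way to keep the library tidy. A drop-down list in your logging form does the same job.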
Arguably, investigation is the core task of Problem Management. You’ll be doing a comprehensive analysis of trends, occurrence, linked assets and recent changes: anything that might have been the catalyst. The goal of this stage is both to identify the origin of the problem and to find a workaround. You can use sophisticated techniques if you want to, or just a simple 5-whys approach (it's honestly usually enough).
When you identify the actual root cause, it's also often the best time to identify a viable workaround. A good workaround minimizes the impact and allows operators to resume normal service. In this sense you’re killing two birds with one stone: finding a root cause, but also finding a temporary solution to the problem. Do this with Customer Experience in mind: find the solution that is easiest for those most affected. This will go a long way towards mitigating the impact of the Problem at hand.
Over time you’ll develop a library of Known Errors. This is a virtual warehouse of sorts, which contains every closed Problem. This comprehensive database should be stored separately from the Problems themselves. It might seem convoluted, having multiple dedicated spaces for problems and known errors. But resist the temptation of piling everything into one place! If you keep things separated, it means that if you need to re-categorize a problem in the future, then you can.
But why even re-categorize? Well, the category that you chose at the start of the process might not necessarily be the best fit. After your investigation, with the benefit of hindsight, you can revisit these categories and make sure they're appropriate. Separate storage means that your database is as flexible as your thought processes.
It also means that your operators have a centralized area to refer to if errors re-occur. And with separate logging and easy categorisation, it's super easy to find out if the problem is new, or simply a new iteration of an already known error. This saves massive amounts of time in the long run!
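As an illustration of checking whether a problem is new or an already known error, here's a small Python sketch. It assumes each Known Error stores a category and a set of symptom keywords. The `KnownError` class, the `match_known_error` function, the sample entries and the simple word-overlap matching are all hypothetical; a real tool would likely offer a proper search instead.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    reference: str
    category: str
    symptoms: set
    workaround: str


# Hypothetical Known Error library, kept separate from open Problems.
KNOWN_ERRORS = [
    KnownError("KE-001", "email", {"outlook", "sync", "timeout"},
               "Restart the sync service"),
    KnownError("KE-002", "printing", {"queue", "stuck", "spooler"},
               "Clear the print queue"),
]


def match_known_error(category, description, known_errors, min_overlap=2):
    """Return the Known Error in the same category that shares the most
    symptom keywords with the description, or None if nothing matches."""
    words = set(description.lower().split())
    best, best_score = None, 0
    for ke in known_errors:
        if ke.category != category:
            continue
        score = len(words & ke.symptoms)
        if score >= min_overlap and score > best_score:
            best, best_score = ke, score
    return best


hit = match_known_error(
    "email", "Outlook sync keeps failing with a timeout", KNOWN_ERRORS
)
print(hit.reference if hit else "New problem")  # KE-001
```

If nothing matches, the operator can treat it as a new problem. If something does, the workaround is right there.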
So, whilst your immediate goal might be to find a workaround for incidents once you’ve identified the problem at hand, your ultimate goal should be to close off your Known Errors and to find a permanent solution to fix the problem. That might mean raising a change to eliminate the cause of the problem. The benefit of this is that resource can then be dedicated elsewhere, as it won’t be taken up by repeat incidents.
A fix doesn't have to be a revolutionary new idea. It's absolutely fine to use your Knowledge Base to find a way to solve the problem. Or ask around other departments, or search on Google, if applicable! Sometimes the easiest way to solve a problem is the best.
Of course, there's way more to cover with a process like Problem Management! But I hope this serves as a good primer that answers some basic questions you may have about each step of the process. If you want some more best practices:
In our Best Practice Service Management e-book, we go through the best approaches to key processes, and argue for why you don't need to always follow frameworks. Check it out here: