Your customer has a critical business dependency on your product, and some glitch has degraded it or taken it completely out of service. They turn to you, and the technical support team, to make things right. Fix it now. If you’re in technical support, you’ve seen this movie before. Bring in the engineers, order pizza, pore over stack traces into the wee hours of the morning - work the problem until it’s resolved. For an adrenaline junkie, it’s a fun ride, but your customer and your business won’t see it that way: there is an erosion of confidence, a risk to the account, a blemish on your corporate image.
However, in every crisis there is opportunity.
In this case, you’re set to prove that your team not only has the technical competency to solve complex problems, but also truly understands what the customer needs, and that your empathy is genuine. In fact, you’re about to prove how committed you are to this customer, and just how much they can count on you when there is a problem.
Let’s examine some of the best practices, developed over years of technical support incidents with Fortune 500 customers, that can truly impact the outcome of an otherwise bad situation.
In modern IT environments, the root cause of technical problems can be anywhere - in your software, another vendor’s software, the configuration, the integrations - perhaps even caused by external factors like network latency or unusual demand profiles. Regardless of what technology component is to blame, remember that it’s not the technology that has a problem, it’s your customer that has a problem.
Your role is to step up and assume ownership regardless of where the problem lies. We’re in the sinking boat with the customer — it may be academically interesting that we’re not the boat manufacturer and don’t supply leak repair kits, but we’re about to drown too. Act like it.
Customers in a crisis situation are under a lot of stress, and people react differently to stressful events. The technical contacts have executives breathing down their necks, and everybody is edgy. As the vendor, you take a lot of heat - it’s easy to point the frustration at you. In all cases where it is reasonably possible, find a way to use agreeable and affirmative words like “yes”, “true”, and “I agree”.
It’s particularly useful to agree on points that identify a failure on behalf of your team or technology, as it promptly illustrates that you’re not defensive.
On a regular basis, take a minute to step back from the details of the diagnosis and orient everyone as to what we’re trying to do. This “forest for the trees” approach ensures that everyone is in sync on the relative importance of the immediate task at hand, and how it fits into the big picture.
A resolution strategy includes a working theory and a plan to prove it.
This is particularly important for technical teams, as ‘geeks’ tend to see a technical problem as an end, not a means to an end. Top-down framing will help identify those cases where our attention is inappropriately diverted, and show how critical the present task is to the resolution strategy. When this is done well, the audience need not be subject matter experts to agree we’re spending time on the next most logical thing.
There is a phenomenon amongst technical people that I have observed for many years, and I bet you’ve seen it too. Deep into the problem solving, a theory emerges as to the root cause, and there is just a short distance left before we confirm the theory and have our answer.
How long? 10 minutes. 10 minutes later, it’s another 10 minutes. Then 30, oh maybe by 4. Ok, tonight for sure. By morning I promise!
There is a deep conviction inside the problem solver that is absolutely sure they have it. It’s not stalling, it’s very real. Unfortunately, it’s not always correct - and without the discipline to chop the effort and move on, you can end up chasing a theory for far too long.
The only solution is deadlines - written and irrevocable - that identify the next step in the diagnostic plan. Set them, and keep them!
When a major IT system experiences a service disruption, two primary groups respond: the IT folks and the business that’s affected. In providing service to these two diverse groups, it’s best to use two discrete channels.
Technical and business teams need different information.
The technical channel is detail oriented, fluid, and 24/7. The business channel is a high-level roll-up, delivered in periodic briefings, and oriented around impact assessment & timeliness. For obvious reasons, these channels demand different resource allocation; it’s rare to find one person who can manage both channels well.
During the diagnosis phase, lots of information flows around in support of the effort. In all cases, never let information pass that you either don’t understand or disagree with; ask for clarification.
Challenge the facts openly, in front of the customer. Everybody needs to understand the facts, and how we got them.
This goes for information supplied by your own colleagues; if it demands clarification, openly ask for it right then and there. Doing this in front of the customer helps them feel included in the process, demonstrates that we’re willing to be vulnerable and not pretend we have all of the answers, and shows that we’re challenging ourselves in the interest of the truth.
Sometimes, it can be tempting to jump to problem solving before you understand the scope & scale of the problem itself. Impact assessment is a valuable step in the process for many reasons. Primarily, it helps calibrate severity. Secondly, it is vital data for framing and plotting a diagnostic strategy.
Many times, a “critical” problem is affecting only one important user, or an average severity problem is actually a global service degradation that’s just not quite bad enough to provoke a flood of customer complaints.
Crisis events are no time for wishy-washy commitments or positions.
It takes some self-confidence to do so, but strong terms and hard commitments/positions are mandatory in this environment. It can be difficult for engineering types to swallow, but this can often require 80% probabilities to be stated categorically, just to avoid the inflammatory nature of sounding unsure. “I think this will work” is very different from “This is our recommendation”.
Even if you don’t know, you can be certain you don’t know. If you’re using hedge words or sound like you’re not sure, you’re providing fuel for a perception that leads to unnecessary escalation & tension.
Generally speaking, most reasonable people reach the same conclusions when coming from the same perspective and provided all of the same facts. On this premise, taking the time to explain “why”, even when you’re not asked, is time well spent.
“Why” is the grease that keeps everything flowing smoothly.
Do this for everything - even something simple, like “the next update will be at 7 pm”, raises the question “why not 6:30?” or “can they really get me something material that quick?”. If you follow the diagnosis, statements and expectations with some commentary on the why, people have a chance to understand and, if required, challenge the thinking that led to your assertion. Done well, this exposes ample opportunity to flush out inconsistencies in premise and get people thinking with the same rationale.
IT problem diagnosis is often a labor-intensive proposition.
There is data to collect, scripts to run, test environments to configure, results to parse, debug flags to be set, etc. Many steps may be necessary over the course of the diagnosis, and in many cases it is easier or more convenient for you to ask the customer to perform the action, as you might normally do in a standard problem diagnostic process.
Let the customer focus on the impact of the problem, while you focus on solving the problem.
Well, this isn’t a normal circumstance - the customer needs time to handle the internal pressures of a service impacting event on their business, and they need you to operate independently. If you’re chasing them constantly for help where it’s strictly not required, you’ll be perceived as not “stepping up”, and it can look like a stall tactic.
Own this - don’t ask the customer unless there is no other way. It will make them more receptive to responding promptly & positively to the requests for help that you do have to make.
“Who doesn’t sleep until this is fixed?” is an important question that drives at the heart of accountability. To be sure, your customer fits this description; don’t let them be the only one.
A single person needs to be held accountable for marshaling the resources required to get the customer what they need. This can’t be a “team”; when everyone is responsible, nobody is responsible.
Designate the in-charge person, share that name & phone number with your customer, and make sure they know you’re there 24/7 until this is fixed.
Whew - problem solved. Some well deserved rest, and on to the next one, right? Not exactly! At the heart of good execution is the incorporation of lessons learned into the modus operandi. Take the time to conduct a full retrospective analysis after the dust settles; talk to everyone involved (including the customer), and feed what you learn back into the process.
Make a commitment to the customer to debrief when the incident is over.
Commitment to a debrief is a useful tactical tool during the crisis as well, allowing for complaints or unhelpful observations by frustrated people to be channeled into a transparent process after we’ve solved the problem. It sends the message that you’re willing to listen, just not right now.
Technical Support is not for the faint of heart. For those of you supporting enterprise customers on mission-critical systems, I hope that you find this information helpful. As much as it may not feel that way at the time, these events really do cement a high-trust, enduring relationship between you and your customer, and amongst the support team as well - everybody is stronger after the battle.