Getting Better at Failure #FailBetter
I recently took part in a virtual round table discussion as part of the DRJ conference in October 2020. The topic was ‘Maintaining Operational Resilience while Delivering at Pace’. Emma Robinson from Cutover has posted her own thoughts on the event here, but below are some of my key thoughts, which I also explore in my Transformer Resilience article.
My main thoughts are that all companies need to get better at change and failure. I use Zoom as an example of doing change well (at least externally) but also, and crucially, doing failure well. They had ‘EncryptionGate’ and ‘DropInGate’, but both were handled very well with good comms and rapid change. They also managed to scale from fewer than 40k daily active users in the UK in January 2020 to a peak of over 770k daily active users in May 2020! That’s nearly a 20x increase in 5 months! Most digital services would have cratered under the pressure, but Zoom didn’t.
Secondly, the pressure to be resilient will not go away, and if a company is resting on its laurels thinking it’s the most resilient it can be, there will be a nasty surprise awaiting it in the future! Nothing is truly ‘fully resilient’. It may be ‘more resilient’ or ‘less resilient’ than a standard, but not ‘fully’. In the same way that the Titanic was clearly not ‘unsinkable’, there is a metaphorical iceberg (or an actual one, depending on your business!) lurking out there waiting to strike.
The key thing is to learn from failure: Continuously test yourself against threats and learn from outages.
I talk about Chaos Monkey and how this could be applied to an organisation. For companies that have grown organically over years, this is likely to be further away than those that are green field start-ups. But the principles behind Chaos Monkey can be applied anywhere and to everything!
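To make the principle concrete, here is a minimal sketch of the idea in Python. It is not Netflix's Chaos Monkey, and every name in it (`Service`, `chaos_round`, `run_experiment`, the `min_healthy` threshold) is hypothetical and made up for illustration. It simply takes something away at random and then asks the only question that matters: is the service still available?

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Service:
    """A hypothetical service backed by interchangeable instances."""
    name: str
    instances: Set[str] = field(default_factory=set)
    min_healthy: int = 1  # assumed threshold below which the service is down

    def is_available(self) -> bool:
        return len(self.instances) >= self.min_healthy


def chaos_round(service: Service, rng: random.Random) -> Optional[str]:
    """Remove one randomly chosen instance and return its name (None if empty)."""
    if not service.instances:
        return None
    victim = rng.choice(sorted(service.instances))
    service.instances.remove(victim)
    return victim


def run_experiment(service: Service, rounds: int, seed: int = 0) -> List[Tuple[int, Optional[str], bool]]:
    """Run several chaos rounds and record whether the service survived each one."""
    rng = random.Random(seed)
    findings = []
    for round_number in range(1, rounds + 1):
        victim = chaos_round(service, rng)
        findings.append((round_number, victim, service.is_available()))
    return findings


if __name__ == "__main__":
    payments = Service("payments", {"node-a", "node-b", "node-c"}, min_healthy=2)
    for round_number, victim, available in run_experiment(payments, rounds=2):
        status = "still available" if available else "DOWN - learn from this"
        print(f"round {round_number}: lost {victim}, service {status}")
```

The same loop works just as well if the "instances" are suppliers, office locations or key people rather than servers. The point is that the failure is deliberate, small and followed by a check, so the learning happens on your terms rather than during a real outage.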
To be able to do that, you need to test the way you run. If your testing programme is not the same as your runbooks, you are not resilient. If you don’t learn from mistakes or plan for failure, you are not resilient by design.
Watch the video and let me know your thoughts!
ps: I bet you can’t stop singing Celine Dion now. 😜😂
#Resilience #OperationalResilience #Webinar #RoundTable #Zoom #ServiceManagement #Education #BusinessContinuity #Vlog #ITDR #Recovery #Disaster #Crisis #Incident #FailBetter #TestHowYouRun