When it comes to restoring service, being able to pinpoint the problem while an alert storm is creating noise is critical
In association with Daysha DevOps
When a company’s core applications fail, the damage is financial at best and reputational at worst. The teams of people who build, deploy, and support software are under tremendous pressure to restore service as quickly as possible, and may work nights and weekends to get the show back on the road. But what do the processes and tools needed to reduce “mean time to restore” (MTTR) look like?
DevOps training and solutions provider Daysha DevOps is an authorized Gold Partner of Atlassian, whose service management product Jira is a platform used to codify support processes.
DevOps is a well-understood concept. Most people will correctly assume that it is a workflow that starts with defining requirements and ends with running software in production or, as Gene Kim termed it in his seminal book The Phoenix Project, the “First Way”.
According to Kim, in the First Way, work always flows in one direction: downstream. The Second Way shows us how to create, shorten and amplify feedback loops. The Third Way emphasizes continuous experimentation, so that we learn from our mistakes and achieve mastery.
Reducing MTTR is a matter of the Second Way. The feedback loop starts when customers discover bugs or identify missing features, and the speed and quality of the response to these requests are determining factors in customer or end-user satisfaction.
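As a point of reference, MTTR is commonly calculated as the total time spent restoring service divided by the number of incidents. The sketch below shows that arithmetic under stated assumptions: the function name `mean_time_to_restore` and the incident timestamps are illustrative and are not taken from Daysha DevOps or any particular tool.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Return the average outage duration as a timedelta.

    `incidents` is a list of (detected_at, restored_at) datetime pairs.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    # Sum the individual outage durations, starting from a zero timedelta.
    total = sum(
        (restored - detected for detected, restored in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incident log (illustrative values only).
incidents = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 30)),     # 30 minutes
    (datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 15, 30)), # 90 minutes
]

print(mean_time_to_restore(incidents))  # 1:00:00
```

Teams differ on when the clock starts (first customer report, first alert, or incident declaration), so the figure is only comparable over time if that choice is kept consistent.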
Daysha DevOps has been working in this area since 2014. It advises clients to focus on reducing cycle time, which is the time it takes to build and deploy code. But once customers progress to the point where change reaches production at a pace appropriate to business needs, that pace creates new and different pressures for customer-facing SRE and support teams.
As a result, processes such as ITIL (Information Technology Infrastructure Library) and tools that reduce the risk of change failure and lower MTTR come into sharper focus. One such process is “progressive delivery”, a way to decouple code deployment from feature release. This is made possible by feature flagging, which gives operations teams the ability to switch off buggy or non-performing features “in-flight”.
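The idea behind feature flagging can be sketched in a few lines. This is a minimal illustration, assuming an in-memory flag store; the `FeatureFlags` class, the `new_checkout` flag and the `checkout` function are hypothetical names invented for the example, and a production setup would typically use a dedicated flag service instead.

```python
class FeatureFlags:
    """In-memory flag store (illustrative stand-in for a real flag service)."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name):
        # Unknown flags default to off, so new code paths stay dark
        # until someone explicitly releases them.
        return self._flags.get(name, False)

    def set(self, name, enabled):
        self._flags[name] = enabled


# The new code path is already deployed; the flag controls whether it is released.
flags = FeatureFlags({"new_checkout": True})


def checkout(cart_total):
    if flags.is_enabled("new_checkout"):
        return f"new flow: {cart_total:.2f}"
    return f"legacy flow: {cart_total:.2f}"


print(checkout(10))               # new flow: 10.00
flags.set("new_checkout", False)  # operations turns the feature off "in-flight"
print(checkout(10))               # legacy flow: 10.00
```

The point of the design is that turning a misbehaving feature off is a configuration change rather than a redeployment, which is what shortens the path to restored service.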
Root Cause Analysis (RCA) is another well-understood process, ideally undertaken in a blameless way, as it identifies what needs to be corrected after a major incident so that the problem does not recur. But all too often, SRE and operations teams have too much data to sift through, and endless war room meetings waste time.
Major incident management processes have been in place for some time, and it is important to collect as much data as possible to inform the RCA. But when the pressure is on to restore service, being able to pinpoint the problem while an alert storm is creating noise is vital. How do you find the signal?
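One simple way to cut through that noise is to collapse repeated alerts and look at what fired first. The sketch below is an assumption-laden illustration rather than a description of any vendor's product: the alert fields (`ts`, `service`, `check`) and the `collapse_alerts` function are hypothetical, and the earliest alert is only a starting point for investigation, not proof of the cause.

```python
def collapse_alerts(alerts):
    """Collapse repeats into one entry per (service, check), oldest first."""
    grouped = {}
    # Process alerts in time order so each group records its first occurrence.
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["service"], alert["check"])
        entry = grouped.setdefault(
            key,
            {
                "service": alert["service"],
                "check": alert["check"],
                "first_seen": alert["ts"],
                "count": 0,
            },
        )
        entry["count"] += 1
    # Dicts preserve insertion order, so groups come back ordered by first_seen.
    return list(grouped.values())


# Hypothetical alert storm; "ts" is seconds since the start of the incident.
storm = [
    {"ts": 105, "service": "api", "check": "latency"},
    {"ts": 100, "service": "db", "check": "disk_full"},
    {"ts": 110, "service": "api", "check": "latency"},
    {"ts": 120, "service": "web", "check": "http_5xx"},
    {"ts": 125, "service": "api", "check": "latency"},
]

for entry in collapse_alerts(storm):
    print(entry["first_seen"], entry["service"], entry["check"], f"x{entry['count']}")
```

Here five alerts reduce to three lines, with the database disk alert at the top of the list because it fired before the downstream latency and HTTP errors.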
Observability tooling is an emerging area of interest for teams that live and die by major incidents and the RCAs that follow.
These challenges and more will be discussed at the Daysha DevOps Agile ITSM event next month. Taking place on November 16 at the Alexander Hotel, Dublin 2, the event aims to educate and inform IT professionals tasked with keeping applications or services operational and restoring them in the event of an outage.
From 9 a.m. to 2 p.m., attendees will hear from customers and partners about their processes and tools, what works and what needs improvement.
Speakers include Jerry O’Sullivan, Delivery Manager at Utmost, who will discuss setting up a help desk; Fabrizio Fortunato, Head of Frontend Development at Ryanair, who will explore the development journey of the business-to-consumer website myryanair.com; Mark Arts, Senior Solutions Engineer at Stackstate, who will investigate the lack of any relationship between the effect of incidents and their causes; and Adrian Skehill from Bank of Ireland, who will discuss insourcing technology delivery.
The presentations will end with a panel discussion followed by lunch for participants at 1:15 p.m.
To register for the event, go to: dayhadevops.co.uk/agile-itsm-dublin-nov-16th/