Scenario Testing for Operational Resilience: Key Considerations & Data Points

Published by Ben Saunders - OpRes Founder

Roughly a 5-minute Read

Introduction: 

Last week was quieter than usual for us on the blogging front as we doubled down on the development of OpRes here at HQ. In doing so, we’ve begun to connect many of the key screens within OpRes that will allow our users to scenario test their important business services over the coming months.

As part of that solution, we want to give firms the capability to conduct lightweight scenario testing, so that they can identify the impact of service disruptions across systems built and maintained by the firm itself, as well as systems built and maintained by its 3rd & 4th party suppliers. Our aim is to enable users to correlate multiple data points to determine whether an outage affecting an important business service could cause intolerable harm to customers, the firm, or the market.

As we peeled back the layers of the onion on this subject, it became clear that we needed to provide users with several different capabilities in order to conduct worthwhile testing simulations. We also found that the important business services our users test may sit at different stages of the customer lifecycle. This, in turn, affects the data points that firms need in order to calculate the harm that a service disruption could cause to the firm, its customers or the wider market.

Over the course of this blog, we will touch on some of the key data points that firms need to correlate in order to establish a rigorous scenario testing regime for operational resilience. However, let us first provide clarity around the customer lifecycle and provide some examples behind our perspectives in this respect.


The Customer Lifecycle:

The customer lifecycle is pretty homogeneous across most industries: more customers often means more revenue. However, it isn’t usually that simple, especially when you add factors like the cost to serve across areas such as technology, people, and customer acquisition overheads. As such, we are going to unpack the customer lifecycle and show how a disruption at each stage could impact different parties.

As we break down the stages of the customer lifecycle we will use retail banking business services to support our examples. The customer lifecycle can generally be broken down into 6 stages. These are:

  1. Awareness - In the awareness stage, the goal of many firms is to make new people aware of their brand and products. This would generally be achieved through targeted marketing campaigns across a number of different media. Typically, the only party who would be impacted if these channels were disrupted is the firm itself. 

  2. Acquisition - Next comes the acquisition stage, when firms convince potential customers to subscribe to their brand and share personal information such as their email, phone number, and/or social media account. In the retail banking domain, this could be through the use of a simple mortgage or loan calculator, whereby the customer can explore the various lending products at their disposal in return for providing the firm with a few personal details for a more detailed engagement and follow-up conversation.

    If these types of business services are disrupted, one could argue that there is a material impact not only on the firm but also on the customer, whose ability to conduct market analysis and identify the most suitable products for their requirements is hindered.

  3. Conversion - This is the stage where a product or service is acquired by the customer. In this example, we could be looking at an onboarding journey for opening a new current account, taking out a new mortgage or taking out a credit card with the firm.

    If an outage is experienced against these types of business services, then intolerable harm could be caused to the customer and the firm, especially if a customer has completed their onboarding process and the business service requires a number of checks that are manually administered and recorded on the backend of the business process. Indeed, if the firm can’t process the final steps of the conversion process, it runs the risk of missing out on capital conversion opportunities.

  4. Fulfillment - This is when the customer has completed their transaction and is in receipt of their goods or services. If we were to leverage our onboarding process again, the required Know Your Customer (KYC) checks around identity & verification and credit histories will have been completed. A series of sub-processes will also have been instigated in order to execute some or all of the following: create the customer’s core banking details, set up or transfer any direct debits from pre-existing accounts, order the delivery of new PINs and physical cards, distribute card readers and welcome packs, or execute a faster payment.

    At this stage, both the customer and the firm could be impacted if business services at this point in the customer lifecycle are disrupted.

  5. Retention & Growth - Amidst the rise of digital neobanks, cross-selling products to customers is becoming a more fluid exercise. As an example, a firm may up-sell its current account services by charging the customer more for a value-added current account that gives them access to travel or mobile phone insurance, exclusive offers, or fee-free withdrawals abroad.

    Alternatively, firms may provide customers with a facility to create saving nests for moments that matter in their life. By setting a saving target, the customer tells the firm that they have an end goal in mind, and the firm can use this data to identify and suggest suitable products to fulfill their needs. This introduces cross-selling opportunities for firms and enables them to increase their bottom line and drive customer retention. Again, this could be for a new mortgage application or, alternatively, a new stocks and shares ISA.

    At this stage, the customer, the firm, and the wider market could all be impacted if business services at this point in the customer lifecycle are disrupted. As an example, a firm may be using a white-labeled ISA product provided by a 3rd party. If the capacity to open a new ISA or transfer funds into the account is disrupted, then this impacts the customer’s objectives, as well as the ability of the firm and its partner to secure the customer’s capital and make investment decisions for them over a period of time.

  6. Exit & Loss - In some instances, customers get seduced by other providers or become frustrated with the level of service delivered by their existing bank. In these cases, a customer may wish to close their current account by using the 7-day Current Account Switch Service (CASS). This kicks off a number of sub-processes to support the movement of financial data (account balance, direct debits, standing orders) between the existing provider and the customer's new provider.

    At this stage, the customer, the firm, and the wider market could all be impacted if business services at this point in the customer lifecycle are disrupted. For example, if the CASS cannot be executed, the customer’s ability to make transactions could be hindered, along with the new provider’s ability to transfer the required account details and regular payments such as direct debits and standing orders.

It might not seem obvious at first. However, depending on the stage of the customer lifecycle, a service disruption could cause intolerable harm to a single party (e.g., the customer) or to multiple parties (e.g., the customer, the firm, and the wider market). We will expand on this as we provide examples of the key data points firms should calculate and analyse as part of their scenario testing plans for operational resilience.

6 Stages of The Customer Lifecycle
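
The stage-by-stage view above can be captured as a simple lookup that a scenario test could use to decide which parties to assess for harm. The Python sketch below is illustrative only: the names `Party`, `IMPACTED_PARTIES` and `parties_at_risk` are our own assumptions, and the mapping simply restates the retail banking examples in this post rather than a definitive rule.

```python
from enum import Enum


class Party(Enum):
    CUSTOMER = "customer"
    FIRM = "firm"
    MARKET = "market"


# Restates the retail banking examples from this post (an assumption,
# not a rule); a firm would adjust the mapping for its own services.
IMPACTED_PARTIES = {
    "awareness": {Party.FIRM},
    "acquisition": {Party.FIRM, Party.CUSTOMER},
    "conversion": {Party.FIRM, Party.CUSTOMER},
    "fulfillment": {Party.FIRM, Party.CUSTOMER},
    "retention_and_growth": {Party.FIRM, Party.CUSTOMER, Party.MARKET},
    "exit_and_loss": {Party.FIRM, Party.CUSTOMER, Party.MARKET},
}


def parties_at_risk(stage: str) -> set[Party]:
    """Return the parties a disruption at this lifecycle stage could harm."""
    return IMPACTED_PARTIES[stage.lower()]
```

Calling `parties_at_risk("exit_and_loss")` returns all three parties, which tells the tester that market-level data points are needed for that scenario as well as customer and firm data.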


Key Data Points for Scenario Testing:  

The FCA and PRA have been clear in setting their expectations that firms must have baselined their impact tolerance as well as conducted sufficient levels of mapping and scenario testing by March 2022. In order to support these efforts, we are aiming to build an operational resilience scenario testing tool. 

In short, this will enable firms to identify the intolerable harm that could be caused to customers, the firm or the market if critical or important technology components experience outages that last for seconds, minutes, hours, days, weeks and (hopefully not) months! Let’s start with the initial baseline data points that firms may consider capturing.

Service Level Targets: 

In some of our earlier blogs and show & tells we have explained how firms capture a set of baseline metrics for their important business services. These are:

  1. Service Level Agreements (SLA)

  2. Service Level Objectives (SLO)

  3. Service Level Indicators (SLI): traffic, saturation, errors, latency

  4. Recovery Time Objective (RTO)

  5. Recovery Point Objective (RPO)

In addition, firms should have a detailed understanding of the systems and suppliers that underpin the important business service. Whether these be hosted by the firm or external 3rd & 4th party suppliers. 
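
To show how these baseline metrics might feed a scenario test, here is a minimal Python sketch. It assumes a hypothetical `ServiceBaseline` record of our own design (the field names and the two helper functions are illustrative, not part of OpRes) and simply compares a simulated outage against the Recovery Time Objective and the downtime budget implied by an availability SLO.

```python
from dataclasses import dataclass


@dataclass
class ServiceBaseline:
    """Hypothetical baseline record for one important business service."""
    name: str
    slo_availability_pct: float  # e.g. 99.9 means a 99.9% availability target
    rto_minutes: float           # Recovery Time Objective
    rpo_minutes: float           # Recovery Point Objective


def allowed_downtime_minutes(baseline: ServiceBaseline, period_minutes: float) -> float:
    """Downtime budget implied by the availability SLO over a period."""
    return period_minutes * (1 - baseline.slo_availability_pct / 100)


def breaches_rto(baseline: ServiceBaseline, outage_minutes: float) -> bool:
    """True when a simulated outage lasts longer than the RTO."""
    return outage_minutes > baseline.rto_minutes
```

For example, a service with a 99.9% availability SLO has a downtime budget of roughly 43 minutes over a 30-day month, so a simulated 90-minute outage would exhaust that budget and, with a 60-minute RTO, breach the recovery target too.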

When using our scenario testing calculator, we expand on these areas by suggesting firms start to capture & correlate data points across the following domains: 

Customer Data Points

  1. The total number of customers using the product.

  2. Customer demographics and the most and least used product channels.

  3. Total customer assets attributed to the product.

  4. The total number of customers who are deemed to be vulnerable.
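
As a rough illustration of how these customer data points could be correlated, the Python sketch below estimates how many customers, vulnerable customers and assets sit behind a disrupted channel. The `CustomerProfile` structure and `customers_affected` function are hypothetical, and the calculation assumes vulnerable customers and assets are spread across channels in the same proportion as customers overall, which a firm would want to replace with real segment data.

```python
from dataclasses import dataclass


@dataclass
class CustomerProfile:
    """Hypothetical customer data points for one product."""
    total_customers: int
    vulnerable_customers: int
    total_assets: float              # customer assets attributed to the product
    channel_share: dict[str, float]  # share of customers by main channel, summing to 1.0


def customers_affected(profile: CustomerProfile, disrupted_channels: list[str]) -> dict[str, float]:
    """Estimate exposure when one or more product channels are disrupted."""
    # Assumption: vulnerable customers and assets follow the same
    # channel split as the overall customer base.
    share = sum(profile.channel_share.get(c, 0.0) for c in disrupted_channels)
    return {
        "customers": round(profile.total_customers * share),
        "vulnerable_customers": round(profile.vulnerable_customers * share),
        "assets": profile.total_assets * share,
    }
```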

Economic Data Points

  1. The firm's overall market share of the product and associated important business services. 

  2. Historical customer acquisitions over a defined period (hourly, daily, weekly, monthly, annual)

  3. Average customer lifetime value, or the categorisation of customer lifetime value across bandings (e.g., low, medium, high yield).

  4. Total capital, liquidity & liabilities for the product and associated important business services.

  5. Potential regulatory penalties for the firm in the event of intolerable harm being experienced by customers or the wider market. 

  6. Historical consumer behavior during and after a disruption (e.g. moving trades to another firm or platform, or closing accounts).
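
To make the economic data points more concrete, the following Python sketch combines historical acquisition rates with average customer lifetime value to project the value at risk from an outage on an onboarding journey. The function name and the `abandonment_rate` parameter (the share of would-be customers who never come back) are our own assumptions for illustration; the formula is a simple linear estimate, not a definitive model.

```python
def projected_acquisition_loss(
    acquisitions_per_hour: float,
    avg_lifetime_value: float,
    outage_hours: float,
    abandonment_rate: float = 1.0,
) -> float:
    """Estimate lifetime value lost from acquisitions missed during an outage.

    abandonment_rate is a hypothetical input: the fraction of prospective
    customers who do not return once the service is restored.
    """
    if not 0.0 <= abandonment_rate <= 1.0:
        raise ValueError("abandonment_rate must be between 0 and 1")
    missed_customers = acquisitions_per_hour * outage_hours * abandonment_rate
    return missed_customers * avg_lifetime_value
```

With 20 acquisitions per hour, an average lifetime value of 1,500 and a four-hour outage, the worst-case projection is 120,000 in lost lifetime value; historical behavior data from previous disruptions could be used to set a more realistic abandonment rate.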

Performance Data Points

  1. Transactions per second (TPS) over a defined period (hourly, daily, weekly, monthly, annual)

  2. Response times in milliseconds (ms).

  3. Peak transactions per second over a defined period (hourly, daily, weekly, monthly, annual)

  4. Monetary volumes exchanged by the transactions over a defined period (hourly, daily, weekly, monthly, annual)

  5. Batch processing, reconciliation, and market reporting time frames, including frequency, duration, and the number of successes and failures to complete processing in the allotted time frames.

  6. Historical incident data & mean time to resolution (MTTR) durations, as well as the severity of each disruption.
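
The performance data points lend themselves to simple projections. As a minimal Python sketch (the function names are hypothetical, and it assumes throughput stays constant for the length of the outage), the code below estimates the transactions and monetary volume disrupted by an outage, and derives MTTR from a list of historical incident durations.

```python
from datetime import timedelta
from statistics import mean


def transactions_disrupted(tps: float, outage: timedelta) -> float:
    """Transactions that would have been processed during the outage."""
    return tps * outage.total_seconds()


def value_disrupted(tps: float, avg_transaction_value: float, outage: timedelta) -> float:
    """Monetary volume that would have been exchanged during the outage."""
    return transactions_disrupted(tps, outage) * avg_transaction_value


def mean_time_to_resolution(incident_durations: list[timedelta]) -> timedelta:
    """Average resolution time across historical incidents."""
    return timedelta(seconds=mean(d.total_seconds() for d in incident_durations))
```

Passing average TPS gives a typical-case projection, whilst passing peak TPS gives a worst-case one. Feeding the MTTR result back in as the outage duration is one way to ground a simulation in a firm's own incident history.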

In Closing: 

Whilst not exhaustive, these data points are a sound starting point from which to understand the potential impact of a service interruption on customers, firms themselves, and the wider market. Invariably, this type of data exists but is often stored in many disparate systems across a firm's technology landscape. As such, getting access to this type of information can be troublesome. However, we believe that by securing these data points firms will be able to conduct more meaningful simulations and scenario testing than they would by purely executing a table-top exercise. 


Furthermore, by correlating impact projections based on trusted data captured across their systems of record, firms can build more objective business cases and secure the required funds to support their remediation efforts between March 2022 and 2025.
