In the previous chapters, we learned what ITSCM is and what its goals and objectives are. In this chapter, we look at some of the key concepts of ITSCM.
The Service Continuity Management Lifecycle
Establishing and maintaining ITSCM is a cyclical process that ensures continued alignment with Business Continuity plans and business priorities. It is explained in the picture below:
The first two steps, Initiation and then Requirements and Strategy, mainly relate to BCM. ITSCM begins with producing an ITSCM strategy to underpin the BCM strategy. The ITSCM strategy must ensure that cost-effective plans exist to recover the IT services and IT infrastructure needed to maintain Vital Business Functions (VBFs).
The situation is more complex where some or all of the IT services are outsourced to another organization. In this case, the ITSCM Manager must ensure that the outsourcer’s continuity and recovery plans meet the objectives and timescales of the business.
Business Impact Analysis
Business Impact Analysis (BIA) is the activity performed by ITSCM, often together with Availability Management, that works with the business to understand the impact on the organization of suffering degraded service or losing an IT service or component. The analysis will identify business functions that are critical to the success of the organization (VBFs) and it is these functions that ITSCM must protect from the impact of an IT failure. The business will define the recovery requirement for these functions that ITSCM must address through its IT continuity plans. Over time, the importance of business functions can change and new ones appear, so ITSCM must undertake regular BIA exercises and feed the results back into the continuity plans to ensure they remain appropriate and up to date.
Risk Analysis and Management
A risk is a possible event that could cause harm or loss, or affect the ability to achieve objectives. A risk is measured by the probability of a threat, the vulnerability of the asset to that threat, and the impact it would have if it occurred.
The first step in protecting VBFs is to understand their dependency on the IT services and infrastructure. This information can be discovered from the Configuration Management System. Next, ITSCM must consider a number of factors:
• What could cause a service or component to fail? Examples can include fire, flood and security breaches in addition to simple mechanical or electrical failure.
• What is the likelihood of this happening? In other words, what are the chances that each of the events defined above could occur?
• What is the impact of such an occurrence? If one of the events did occur, what effect would this have on the business? This might be expressed in terms of the impact on its reputation, its customers, its finances or its legal or compliance requirements, for example.
The outcome of these considerations will determine the appropriate actions ITSCM has to take to mitigate the risks adequately and cost-effectively. Typically, the greater the likelihood of failure and the greater the impact, the greater the level of protection needed and the greater the justification for the necessary expense.
The first stage of risk analysis and management is to identify potential threats to an asset or service, estimate the probability that each threat might materialize, assess how vulnerable the asset or service is to these threats, and assess the impact should a threat materialize. For example, as identified above, flood is a threat that might be relevant to an asset such as a data center. We would determine the probability that the center might be flooded, assess the vulnerability of the data center to flooding, and assess the impact on the organization if it did flood. Putting all these together would give us a measure of risk.
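As a minimal sketch of how probability, vulnerability and impact might be put together, the example below multiplies the three values into a single score and ranks the risks by it. The multiplicative formula, the scales, the class and function names, and all the figures are assumptions made for illustration; the text does not prescribe a particular calculation, and an established risk framework may combine the factors differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One threat against one asset. All scales here are illustrative assumptions."""
    asset: str
    threat: str
    probability: float    # chance of the threat occurring in a year, 0.0 to 1.0
    vulnerability: float  # how exposed the asset is to the threat, 0.0 to 1.0
    impact: float         # business impact if it happens, 1 (minor) to 5 (severe)

    def score(self) -> float:
        # Multiplying the three factors is an assumption, not a prescribed formula.
        return self.probability * self.vulnerability * self.impact


def rank_risks(risks):
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score(), reverse=True)


# Hypothetical figures for the data center example.
risks = [
    Risk("data center", "power failure", 0.30, 0.2, 3),
    Risk("data center", "flood", 0.10, 0.8, 5),
]
ranked = rank_risks(risks)
for r in ranked:
    print(f"{r.asset} / {r.threat}: {r.score():.2f}")
```

With these made-up numbers the flood risk ranks first: it is less likely than a power failure, but the data center is far more vulnerable to it and the impact is greater.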
The second part of risk management is doing something about the risks identified. Generally, we can do a number of things about risks:
• Some risks can just be accepted and provision made in case the worst happens. If we cannot insure our data center because it sits in a flood plain, we may decide to hold a contingency fund in case it does flood.
• We can avoid or eliminate the risk; for example, we can eliminate the risk to our data center by deciding to go back to manual processing. This is not always a practical solution.
• We can transfer the risk to somebody else, for example by taking out insurance or by outsourcing the data center and disaster recovery.
• We can reduce the risk by reducing the probability of the threat or by reducing the severity of the impact if the threat materializes. For our data center, we might move it to the top of a hill to reduce the probability of a flood, or reduce the impact of a flood by replacing under-floor cables with fiber optics.
In many cases, the response to risk will be a combination of some or all of these options, with a balance being established between the business's tolerance for risk and the cost of countermeasures.
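One way to picture this balance is the sketch below, which accepts a risk that is already within the business's tolerance and otherwise picks the cheapest single countermeasure that brings the risk within tolerance. The scoring scale, the tolerance threshold, the costs, the residual scores and the names `Countermeasure` and `choose_response` are all hypothetical; a real decision would usually weigh combinations of measures and factors that a single number cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Countermeasure:
    name: str
    annual_cost: float     # hypothetical cost, in any consistent currency
    residual_score: float  # assumed risk score remaining once the measure is in place


def choose_response(current_score, tolerance, options):
    """Return None to accept the risk, else the cheapest adequate countermeasure."""
    if current_score <= tolerance:
        return None  # within tolerance: accept the risk
    adequate = [o for o in options if o.residual_score <= tolerance]
    if not adequate:
        raise ValueError("no single option is adequate; combine measures or revisit the tolerance")
    return min(adequate, key=lambda o: o.annual_cost)


# Made-up options for the flood risk to the data center.
options = [
    Countermeasure("insure the data center (transfer)", 20_000, 0.15),
    Countermeasure("move to the top of a hill (reduce)", 150_000, 0.05),
    Countermeasure("replace under-floor cables with fiber optics (reduce)", 8_000, 0.30),
]
chosen = choose_response(current_score=0.40, tolerance=0.20, options=options)
print(chosen.name if chosen else "accept the risk")
```

Here the cheapest option is rejected because it leaves the risk above the tolerance, so the sketch settles on insurance rather than the far more expensive relocation.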
A key issue for IT Service Management, and ITSCM in particular, is to have some way of analyzing and managing risk. The best and safest approach is to use a tried and tested framework that covers all aspects of risk identification and management.