What are recovery strategies for a business continuity plan?

Author: Manu Steens

In this post I give my own opinion, not that of any organization.

Recovery strategies are part of a business continuity plan (BCP). They describe how an organization should act to restore business operations after a disruption or crisis. Several of them can be combined.

What are the main recovery strategies?

Some common recovery strategies that you will find in the literature are:

  • Data recovery strategies
  • Facilities recovery strategies
  • IT recovery strategies
  • Communication recovery strategies
  • Workforce recovery strategies
  • Supplier and partner recovery strategies
  • Financial recovery strategies
  • Test and practice strategies

What do you need to know about that?

I give a basic explanation here and some contexts in which recovery strategies are relevant.

  • Data recovery strategies: These start with regular backups of the most important data, which you store offsite. Once a disaster occurs, you also need procedures to restore this data.

Backing up means scheduling regular backups of all critical data and retaining them. Use reliable media, as recommended in norms and standards. Encryption is recommended.

Then there is data recovery after an emergency. To this end, you develop step-by-step procedures. These procedures are complete in the sense that they also designate those responsible for the implementation, with their contact details. Finally, they indicate how many restore tests from backups are needed to keep relying on them.
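As a rough illustration of the last point, the sketch below flags backup sets whose restore test is overdue and prints who to contact. It is only a minimal example under assumptions: the class `BackupSet`, its fields, the names, phone numbers, and the 90-day interval are all hypothetical and do not come from any standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class BackupSet:
    name: str                 # hypothetical label for a set of critical data
    owner: str                # person responsible for the restore procedure
    contact: str              # how to reach the owner during an emergency
    last_restore_test: date   # last time a restore from this backup was tested
    test_interval_days: int   # assumed maximum gap between restore tests


def overdue_restore_tests(backups: list[BackupSet], today: date) -> list[BackupSet]:
    """Return the backup sets whose last restore test is older than allowed."""
    return [
        b for b in backups
        if today - b.last_restore_test > timedelta(days=b.test_interval_days)
    ]


# Illustrative data only; the names, numbers, and dates are made up.
backups = [
    BackupSet("payroll", "A. Example", "+00 000 0001", date(2024, 1, 10), 90),
    BackupSet("case-files", "B. Example", "+00 000 0002", date(2024, 5, 1), 90),
]

for b in overdue_restore_tests(backups, today=date(2024, 6, 1)):
    print(f"{b.name}: restore test overdue, contact {b.owner} ({b.contact})")
```

Keeping the responsible person and the contact details next to the test date mirrors the idea that a procedure is only complete when it says who acts and how to reach them.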

A common data problem is a data breach. It can be caused by a cyberattack or by human error (e.g., as a result of social engineering). It often involves extortion, and the matter may end up in court.

An example is when a government agency is hit by a cyberattack that corrupts sensitive data. The recovery strategy can then restore this data from the last protected backups.

A first political example was in 2017, when the United Kingdom was hit by the “WannaCry” ransomware attack. It encrypted patient data at several National Health Service (NHS) organizations. This was addressed by using backups. Afterwards, stricter security measures were implemented.

A second political example was the Snowden revelations, which disclosed information from intelligence agencies. Afterwards, several countries revised their data storage policies and applied encryption to sensitive information.

A third political example is the GDPR, which took effect in the European Union in 2018. Companies and organizations had to develop strategies to comply with its stricter data protection standards.

  • Facilities recovery strategies: These include alternate locations and temporary workspaces. Before working from home became commonplace on a large scale, organizations identified alternative work locations where activities could continue if the primary location was not available. Access to temporary spaces, such as co-working spaces, was arranged so that operations could resume quickly. To make this possible, mutual agreements were made with neighboring companies and organizations.

For alternative locations, for example, this meant specific office space or external data centers that were quickly available to continue business operations. For temporary workspaces, agreements could be concluded with specialized companies and serviced offices to ensure that workspaces with the necessary infrastructure and facilities were ready in time, for as long as needed. Meanwhile, the primary facilities were restored.

One possible cause is a natural disaster. Europe has natural disasters such as earthquakes and forest fires, but it is mainly floods that stay in the collective memory.

Another possible problem is when government buildings are damaged during protests or social unrest. The recovery strategy then helps to ensure the continuity of essential services. Since COVID-19, much of this is covered by teleworking.

A political example was the Kosovo war in 1999, when many government buildings were destroyed. In response to that crisis, several international organizations and the United Nations developed recovery strategies for their facilities.

A second example was after the terrorist attacks in Madrid in 2004. Train stations were targeted. The Spanish government then used recovery strategies to quickly restore the affected stations so that public transport could resume its functions.

A third example was when, after the floods in Central Europe in 2013, countries such as Germany and the Czech Republic developed strategies to repair damaged roads, bridges, … so that mobility could recover.

  • IT recovery strategies consist of an IT recovery plan combined with failover and redundancy. Develop a plan with tailored procedures to restore the IT systems and networks. This includes hardware and software recovery procedures. With failover and redundancy, you implement solutions that keep critical systems available in the event of an outage. This covers the continuity aspect during a crisis.

The IT recovery plan includes a list of critical systems, hardware and software requirements, and recovery procedures with timeframes. This is one of the results of the Business Impact Analysis. For failover and redundancy, use technologies such as failover clustering, load balancing, and geographic redundancy.
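To make this concrete, here is a minimal sketch of how a list of critical systems with recovery timeframes could drive the recovery order, and how a simple failover choice could work. Everything in it is an assumption for illustration: the `System` class, the field `max_downtime_hours`, the node names, and the health flags are hypothetical, and real failover clustering or load balancing products do far more than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class System:
    name: str
    max_downtime_hours: float   # recovery timeframe, e.g. taken from the BIA (assumed field)
    nodes: dict[str, bool]      # node name -> currently healthy? (illustrative only)


def recovery_order(systems: list[System]) -> list[str]:
    """Systems with the shortest tolerated downtime are recovered first."""
    return [s.name for s in sorted(systems, key=lambda s: s.max_downtime_hours)]


def failover_target(system: System) -> Optional[str]:
    """Return the first healthy node in priority order, or None if all are down."""
    for node, healthy in system.nodes.items():
        if healthy:
            return node
    return None


# Illustrative data only; system names, sites, and hours are made up.
systems = [
    System("email", 24, {"site-a": True, "site-b": True}),
    System("payments", 4, {"site-a": False, "site-b": True}),
    System("intranet", 72, {"site-a": False}),
]

print("Recovery order:", recovery_order(systems))
for s in systems:
    print(s.name, "->", failover_target(s) or "no healthy node, start recovery procedure")
```

A system with only one node, like the hypothetical intranet above, has no redundancy at all, so its recovery depends entirely on the recovery procedure.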

A possible problem that makes this necessary is a cyberattack, for example a ransomware attack. Therefore, businesses and organizations must have IT recovery plans in place to quickly recover critical systems.

A possible scenario is when a country becomes the target of a large-scale cyberattack that disrupts critical government websites. The IT recovery strategy then involves not only the (rapid) recovery of these services, but also the improvement of cybersecurity.

One example was when Estonia was hit by a cyberattack targeting government sites and financial institutions in 2007. The solution ultimately consisted of improved cybersecurity and the recovery of the IT infrastructure.

A second example was the Stuxnet attack in 2010. It originally targeted Iran’s nuclear program. However, it also spread to many other systems.

As a third example, during the COVID-19 pandemic in 2020-2021, IT strategies were developed to enable the increased demand for teleworking and online education.

  • Communication recovery strategies: These include a communication plan and alternative means of communication. The communication plan serves to inform internal and external stakeholders about the current situation. Alternative means of communication are things such as satellite phones or mobile hotspots.

In a communication plan, you include a list of contacts, including internal teams, external partners, suppliers, and customers. You define the channels and methods of communication in advance. In addition, you provide alternative means of communication such as satellite phones, mobile hotspots, and web conferencing services.
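A contact list with predefined channels can be sketched as a small data structure. The example below is only an illustration under assumptions: the `Contact` class, the group labels, and the channel names are hypothetical. It picks the first predefined channel that still works and lists the contacts that cannot be reached at all, which points to a gap in the plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    name: str
    group: str            # e.g. "internal", "partner", "supplier", "customer"
    channels: list[str]   # preferred channel first, alternatives after it


def channel_for(contact: Contact, available: set[str]) -> Optional[str]:
    """Return the first predefined channel that is still available."""
    for channel in contact.channels:
        if channel in available:
            return channel
    return None


def unreachable(contacts: list[Contact], available: set[str]) -> list[str]:
    """Names of contacts that have no working channel left."""
    return [c.name for c in contacts if channel_for(c, available) is None]


# Illustrative data only.
contacts = [
    Contact("Crisis team", "internal", ["email", "mobile", "satellite phone"]),
    Contact("Key supplier", "supplier", ["email", "mobile"]),
]

# Assume email and the mobile network are down.
available = {"satellite phone"}
for c in contacts:
    print(c.name, "->", channel_for(c, available))
print("Unreachable:", unreachable(contacts, available))
```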

A possible use of this plan is a national emergency such as a pandemic or a large-scale natural disaster. Organizations then need effective communication tools and plans to keep employees, customers, and suppliers informed.

An example is a political crisis or national emergency, such as the terrorist attack at Zaventem airport in 2016. Public authorities must then be able to communicate quickly and effectively with the public. This is important to keep the population informed and to maintain calm.

  • Workforce recovery strategies are about staffing and staff training. It must be possible to mobilize personnel quickly, and they must know their tasks and responsibilities during a crisis in advance. Therefore, make sure that employees are trained in dealing with emergencies. Sometimes the workforce recovery strategy involves mobilizing additional employees and adjusting their work schedules.

For staffing, draw up a list of crucial positions and appoint responsible employees and backup employees.
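Such a list can be checked automatically for gaps. The sketch below is a minimal example with hypothetical names and positions: it reports crucial positions that have no backup, or where the backup is the same person as the primary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Position:
    title: str
    primary: str
    backup: Optional[str]   # None means no backup has been appointed yet


def positions_without_backup(roster: list[Position]) -> list[str]:
    """Titles of positions lacking a backup who is a different person."""
    return [p.title for p in roster if p.backup is None or p.backup == p.primary]


# Illustrative roster only.
roster = [
    Position("Crisis coordinator", "A. Example", "B. Example"),
    Position("Payroll officer", "C. Example", None),
    Position("IT administrator", "D. Example", "D. Example"),
]

print(positions_without_backup(roster))
```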

Provide training and raise staff awareness about their role during a crisis.

In the event of a health crisis, such as COVID-19, organizations must have plans for the protection of staff. This includes implementing working from home and determining responsibilities for operational continuity.

For example, during a large-scale disaster such as the earthquake in Turkey in 2023, there may be a need to deploy additional personnel for population safety and law enforcement.

  • Supplier and partner recovery strategies: For supplier relationships, you maintain good relationships with key suppliers during ‘peacetime’. In addition, you provide alternative suppliers in case of emergency. In partnerships, you work with strategic partners on joint recovery plans. These can be tested by all parties together.

With the critical suppliers you develop good relationships that ensure support in case of emergency. This can be contractually stipulated.

You will regularly work with strategic partners to develop, test, and maintain shared recovery plans. Recovery efforts are coordinated together, including in exercises.

An example concerns international trade conflicts. In such conflicts, or in the event of customs restrictions, organizations must already have alternative suppliers lined up. They should include this in their supply chain planning in advance.

A political example is when the EU is involved in an international trade conflict, such as with Russia because of the war in Ukraine. That creates supply chain disruptions on both sides. The solution appears to be to develop alternative supply routes to maintain economic stability. Diversifying the customer base and the supplier base is then a survival strategy.

  • Financial recovery strategies include financial reserves and insurance. The organization maintains financial reserves for operational costs, including recovery and emergency measures. In addition, it takes out appropriate insurance. Examples include a captive insurer and business interruption insurance.

During economic crises, one must have financial reserves to cover costs. Consider a possible financial restructuring to maintain stability. The government can then take financial measures to stimulate the economy. Financial resources are used to restore economic stability.

A political example of government intervention was the Fortis crisis. Another example was during COVID-19, when the Belgian authorities provided basic financial support for people who could no longer carry out their work.

  • Test and practice strategies: These are the final piece of any strategy. Regularly test the effectiveness and efficiency of the recovery plan. The purpose of testing and practice is to learn lessons. Identify the areas for improvement, learn from each disruption, and adjust the plan based on the lessons identified.

Therefore, plan regular tests and crisis exercises that rely on the BCP and the recovery strategies. Evaluate the results, identify problems in the plans, and improve the plans and strategies.

Organizations should also simulate national emergencies to test and evaluate the response and recovery of different recovery strategies. They do this in partnership with governments where useful and necessary.

Examples are when national, provincial, or municipal governments conduct exercises, together or separately, to evaluate the response to different emergencies. Topics are diverse, such as natural disasters, terrorist attacks, or cyberattacks. This helps to make society resilient.

Conclusion

There are multiple types of recovery strategies. These should be regularly evaluated and reworked to keep them relevant and effective. After all, the business environment is constantly changing. Each of them should be specifically tailored to your organization.

There may be various crises that require recovery strategies. It is important to note that the nature and extent of crises can vary. The actual implementation will therefore depend on the specific circumstances. A well-tested BCP is essential to be able to respond effectively to unexpected events.

Manu Steens

Manu works at the Flemish Government in risk management and Business Continuity Management. On this website, he shares his own opinions regarding these and related fields. Since 2012, he has been working at the Crisis Centre of the Flemish Government (CCVO), where he has progressed in BCM, risk management, and crisis management. Since August 2021, he has been a knowledge worker for the CCVO. As of January 2024, he works at the Department of Chancellery and Foreign Affairs of the Flemish Government. Here, he combines BCM, risk management, and crisis management to create a tailored form of resilience management to meet the needs of the Flemish Government.
