IT Risk Management, Planning and Mitigation TCOM 5253/MSIS 4373 Business Continuity Planning 6 December 2007 Charles G. Gray (c) 2007 Charles G. Gray
Business Continuity and Disaster Recovery • Business Continuity - Continuation of the “business” (revenue-generation) in the face of any unusual or unforeseen event • Overall identification of potential events and the predicted impact on the organization • Disaster – an event that causes significant damage to business operations and requires some actions to recover
DR vs. BCP • Disaster recovery is no longer enough • Business operations must be sustained • Legal requirements • Cash flow • Customer retention • Business continuity is the first priority – then disaster recovery
Business Continuity Planning • An exercise in risk management • Not a “revenue producing” activity • Business overhead (“cost of doing business”) • A form of business insurance justified on losses that might occur • Adequate budgets must be planned • Money • Staff • Time
Key Components of Business Operations • People • Equipment • Workplace • Suppliers • Logistics • Finance
Disaster Examples • Fire • Flood • Malicious damage • Theft • Terrorism • Sabotage • Explosion • Chemical spill • Gas leak • Disease • Earthquake • Tropical storm • Biological agent • Hostage situation • Threat of action • Criminal damage • Accidental damage • Fault or failure
Key Factors Affected by a Disaster • Financial • Reputation • Business (Tylenol, Arthur Andersen) • Personal • NYC Mayor Giuliani • Enron CEO Ken Lay • Customer service • National security • Health and safety • Employees • General public • Regulatory
What is the Cost of Downtime? • Productivity • Number of employees times loaded pay rate • Damaged reputation • Customers • Suppliers and business partners • Banks and financial markets • Revenue • Direct loss, billing losses • Compensatory payments • Loss of future revenue
What is the Cost of Downtime? • Financial performance • Revenue recognition • Cash flow • Lost discounts (Accounts payable) • Credit rating • Stock price • Other expenses • Temporary employees, equipment rental, overtime costs, extra shipping, travel expenses, legal obligations
Examples of Downtime Costs (per hour) • Energy $2.8 M • Telecommunications $2.1 M • Manufacturing $1.6 M • Finance/brokerage $1.5 M • Info Technology $1.3 M • Insurance $1.2 M • Retail $1.1 M • Pharmaceuticals $1.1 M • Source – Meta Group 2006
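The downtime figures lend themselves to a quick back-of-the-envelope estimate. The sketch below is illustrative only. It assumes every figure on the slide is in millions of dollars per hour, as the first entry states, and it uses the "number of employees times loaded pay rate" rule for productivity loss. The function names, the 500-person headcount and the $60 loaded rate are hypothetical and do not come from the lecture.

```python
# Hourly downtime cost by industry in millions of USD (Meta Group 2006, as quoted on the slide).
# ASSUMPTION: all entries share the "M per hour" unit given for the first one.
HOURLY_DOWNTIME_COST_MUSD = {
    "Energy": 2.8,
    "Telecommunications": 2.1,
    "Manufacturing": 1.6,
    "Finance/brokerage": 1.5,
    "Info Technology": 1.3,
    "Insurance": 1.2,
    "Retail": 1.1,
    "Pharmaceuticals": 1.1,
}


def productivity_loss(employees, loaded_hourly_rate, hours):
    """Lost productivity in USD: idle employees x loaded pay rate x hours down."""
    return employees * loaded_hourly_rate * hours


def industry_outage_cost_musd(industry, hours):
    """Rough outage cost in millions of USD, using the quoted industry average."""
    return HOURLY_DOWNTIME_COST_MUSD[industry] * hours


if __name__ == "__main__":
    # Hypothetical company: 500 idle employees at a $60/hour loaded rate, down for 4 hours.
    print(productivity_loss(500, 60.0, 4))
    # Industry-average view of a 2-hour outage at an energy company.
    print(industry_outage_cost_musd("Energy", 2))
```

Productivity is only one of the cost categories listed; reputation, revenue and financial-performance effects would have to be estimated separately.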
The Ultimate Cost of Downtime • 80% of businesses that suffer a major disruption fail within 18 months (Financial Times 18 April 2007) • Most disruptions are relatively mundane • Drilling through an outside power cable • Failure of air conditioning • “Banana skins” – business slips that result in loss of customers
BCP and IT • IT facilitates the majority of key business processes in a modern company • IT systems control the workflow, production, shipping, billing, customer service (?), etc. • Even the simplest operations can fail when “the computer is down” • IT is a strong management tool • Anything with costs associated with it is tracked for audit and control
Disaster Recovery • Implementation of a response to a specific type of event • A plan with supporting infrastructure, which is implemented in the event of a disaster • Usually treated as an “add on” • Tested occasionally, but rarely emphasized • Financial considerations (CBA) • Cost of downtime vs. cost of system resilience
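The cost-benefit comparison in the last bullet (cost of downtime vs. cost of system resilience) can be sketched as a one-line calculation. This is a minimal illustration under stated assumptions, not a method given in the lecture: the function name, the idea of comparing expected annual downtime hours before and after a resilience investment, and all numbers in the example are hypothetical.

```python
def resilience_net_benefit(cost_per_hour, hours_down_before, hours_down_after,
                           annual_resilience_cost):
    """Annual downtime loss avoided minus the annual cost of the resilience measure.

    ASSUMPTION: downtime loss scales linearly with hours of outage.
    A positive result suggests the measure pays for itself.
    """
    avoided_loss = cost_per_hour * (hours_down_before - hours_down_after)
    return avoided_loss - annual_resilience_cost


if __name__ == "__main__":
    # Hypothetical case: outages cost $1.1M per hour, expected downtime falls
    # from 6 hours to 1 hour per year, and the measure costs $2M per year.
    net = resilience_net_benefit(1_100_000, 6, 1, 2_000_000)
    print(f"Net annual benefit: ${net:,.0f}")
```

A negative result does not settle the question on its own, since legal, safety and reputational factors are not captured in an hourly cost figure.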
“Gap” Analysis • Lack of knowledge transfer between business continuity and technical disaster recovery • IT security and physical security operate autonomously • No clear quantitative methodology to rate and benchmark • Health and safety issues not integrated into the business • Continuity planning is isolated • No senior-level champion • Not integrated throughout the business
IT/Business Boundary • IT segmented apart from the “business” • Creators of technology on one side, users on the other • Business analysts, project managers and “relationship” managers are expected to bridge the gap • The business may duplicate some IT support functions to gain some “control” • IT may not even know about it • Highly inefficient
Cultural Issues - Mistrust • Business tells IT that the requirement was misunderstood • Business rejects the technology as not working • Business realizes its error but, to “save face”, accepts the technology and does not implement it • Business realizes its error and tries to negotiate • Business finds any other way possible to “save face”
Relationship to BCP • BCP is about building a solid and resilient organization that can deal with difficult circumstances or situations • Organization must be designed with business continuity in mind – not “bolted on” later • Ugly to look at • Difficult to manage • Costly!
Health and Safety • Most important business continuity indicators • People are the principal asset of any business – without them, nothing happens • Most companies comply with the “letter of the law” – even if they don’t understand what the law is trying to effect • Companies are responsible for doing all they can to provide a safe workplace
Not Just Fire Anymore • Fire escapes are needed, but that’s not all • Think about emergency slides (airplanes) • Terrorism • Natural disaster (global warming??) • Tropical storms, tornados, tsunami, etc. • Workplace must be designed for protection and evacuation • Flying glass is the biggest cause of injury • ADA compliance (rules on access, but not egress)
Terrorism • Direct loss of life • General economic impact • “Multiplier” effect (trickle-down) • A company with 10,000 employees may influence $1B in indirect community economic impact • Salaries, goods, services, taxes • Mere threat of direct and indirect impact • Psychological effect on employees • Highest impact on business continuity is employee perception and panic
Risk, Motivation and CBA • In failing to protect against a disaster that could be foreseen, is a company being negligent? • When acts of terror can strike any business at any time, is there not a predictable risk to ALL businesses? • What is the cost of lost business, loss of reputation or loss of life? • Are not all businesses bound to protect employees against such events?
Key Issues (1) • Business continuity measures are typically reactive – need to be more proactive • No standard approach to business continuity across organizations/industries • Organizations are not designed with business continuity in the forefront • The threat of terrorism needs to be addressed more specifically when planning for business continuity
Key Issues (2) • Focus needs to be put on people as the core asset of the organization • Organizations need to be motivated toward better continuity preparation, security and health and safety • A means of financially justifying these or even more comprehensive measures must be found • Insurers need to cooperate with industry to ensure that individuals, economies and national security are better protected
The Continuity Assurance Framework • [Figure: framework diagram. Labels: Communication; Security & Safety; Quality Assurance; Management; Governance and Strategy; Iterative Process; and the seven levels Rationalization, Risk Reduction, Rating, Rigor, Robustness, Resilience, Recovery]
Continuity Assurance Methodology • Strategy sets the direction • Governance is the navigation that keeps us on course • Management controls the day-to-day operation of the continuity assurance machine • QA measures progress in terms of achievement • Interfaces across and around all other functions
The “Machine” Model • Seven levels of quality (continuity) assurance are the spokes in the wheel • The hub and spokes of the wheel are encircled by a ring of security and safety • Encircling all of the elements is communication and knowledge transfer
Core Methodology • Rationalization • Risk Reduction • Rating • Rigor • Robustness • Resilience • Recovery
Rationalization • First step on the path to continuity assurance • If the foundation is wrong the whole method is undermined • Rationalize the organization to harmonize security, continuity and recovery functional areas • Review of processes to avoid overlap • Ensure that business continuity is integrated into the organization rather than “bolted on”
Risk Reduction • Identification of risks to the business • Measures the organization determines to put in place to reduce each risk identified • Eliminate as many risks as possible in order to accurately rate true criticality of processes, people, and systems
Rating • Rating of people, processes and systems to ensure that the organization is aware of its critical components and assets • You may not even know what the components are! • Must understand the business structure before looking at individual components in detail • Expose weaknesses
Rigor of Process • Processes must be in place to manage component configuration • Configuration/change management • Very few organizations have adequate controls • Rating should identify business areas that require reinforced/improved processes • Identify which supporting systems need to be reinforced or made more robust
Robustness of Architecture • Determine the vulnerabilities in the infrastructure and take an integrated architectural approach to correction • Exercise control of the environment to safely manage any fundamental changes to the architecture • Proceed cautiously! • Make sure underlying architecture is sound so as to not replicate something less than ideal
Resilience • Once the underlying systems architecture is strengthened, add new levels of insular resilience to the critical components • Applies to more than just IT systems – includes people • Need the information that systems AND people have for business continuity • Geographic diversity can avoid having to go to “recovery” from a localized event
Resilience–Sun Microsystems • Championed by a senior executive at HQ • Plan “owned” by business units • Ask “what is most critical to the business?” • Why are we doing this? • Will this work in the event of a catastrophe? • Plan must be simple and workable • Simulation/dry run/dress rehearsal is a necessity • You may be amazed at the glitches discovered
Recovery • Recovery is what is left after all else has failed because the “event” was too widespread or severe • If you have been successful at all of the previous levels then recovery will be necessary only in the most severe circumstances • Recovery process has its own set of risks
Success of the Framework • Iterative process • Continuous improvement • Revisit each level to tweak its capabilities • Each level builds on the previous levels • Holistic view of the organization • Employ new capabilities in response to the ever-changing business environment • Key performance indicators (KPI) at every level
Continuity Rating • Continuity Assurance Achievement Rating (CAAR) • Overall rating of all KPIs across all levels of the model • Measure of overall business continuity capability
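The slide does not say how the CAAR is computed, so the following is only a sketch of one possible roll-up. It assumes each KPI is scored from 0 to 100, that a level's score is the unweighted mean of its KPIs, and that the CAAR is the unweighted mean of the seven level scores. The function name `caar`, the scoring scale and the example scores are all hypothetical.

```python
from statistics import mean

# The seven levels of the core methodology, in order.
LEVELS = [
    "Rationalization", "Risk Reduction", "Rating", "Rigor",
    "Robustness", "Resilience", "Recovery",
]


def caar(kpi_scores):
    """Roll KPI scores up into a single rating.

    kpi_scores maps each level name to a list of KPI scores (0-100).
    ASSUMPTION: unweighted mean of per-level means; the lecture gives no formula.
    """
    missing = [level for level in LEVELS if not kpi_scores.get(level)]
    if missing:
        raise ValueError(f"No KPI scores for: {', '.join(missing)}")
    return mean(mean(kpi_scores[level]) for level in LEVELS)


if __name__ == "__main__":
    # Hypothetical scores: two KPIs per level.
    scores = {level: [50, 70] for level in LEVELS}
    scores["Recovery"] = [20, 40]  # a weak level pulls the overall rating down
    print(round(caar(scores), 1))
```

Requiring scores at every level reflects the point that KPIs exist at each level of the model; a weighted mean would be a natural variation if some levels matter more to a given business.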
Solving Continuity Problems • Root cause analysis • Pareto charts • The “80/20 rule” • The “trivial many, and vital few” • Fishbone (Ishikawa) process • Cause and effect diagrams • Systematically list all of the different causes that can be attributed to a specific problem • Ask the “why” question five levels down
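The Pareto idea of separating the "vital few" from the "trivial many" can be sketched in a few lines. This is an illustrative example, not part of the lecture: the function name `vital_few`, the 80% default threshold as a cut-off, and the incident counts are assumptions made for the sketch.

```python
def vital_few(cause_counts, threshold=0.8):
    """Return the top causes that together account for `threshold` of all incidents.

    cause_counts maps a cause name to the number of incidents attributed to it.
    Causes are ranked by count, then accumulated until the threshold is reached.
    """
    total = sum(cause_counts.values())
    ranked = sorted(cause_counts.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0
    for cause, count in ranked:
        if cumulative >= threshold * total:
            break
        selected.append(cause)
        cumulative += count
    return selected


if __name__ == "__main__":
    # Hypothetical incident log for one year.
    incidents = {
        "Power loss": 45,
        "Cooling failure": 25,
        "Configuration change": 15,
        "Hardware fault": 10,
        "Other": 5,
    }
    print(vital_few(incidents))
```

The causes returned are the natural starting points for a fishbone diagram or a "five whys" session.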
Communication • Changes needed to truly incorporate business continuity processes are traumatic • The only thing worse may be a merger • Consistent and complete communication across the organization is imperative • Akin to PR and marketing • Must have “buy-in” from top to bottom • Everyone becomes part of the solution • Demanding task requiring full-time resources and materials
Summary • Business continuity capabilities are not simple and may require fundamental change across the entire organization • Disaster comes not only in random events • Can be planned by some and thrust upon others • Not just natural and indiscriminate but can be orchestrated and targeted • Business must orchestrate responses and target defenses to maintain safety, security, and overall continuity