1. VTN Executive Forum: Disaster Recovery & Business Continuity Terry Buchanan, VP Services, CONPUTE
November 2nd, 2006
2. Purpose & Expectations How to get started in DR/BCP and keep it simple
How to evaluate risk for your company
Risk mitigation and planning techniques
Best practice processes to start planning
Plan structure and application
3. Definition Business Continuance (or Continuity) Planning (BCP) involves the processes and procedures organizations put in place to ensure that mission critical functions can continue during and after a disaster.
BCP focuses on
People (Human Safety),
Procedures/Process (How to) and
Infrastructure (Facilities & IT) continuance.
4. Getting Started Considering the “What if” scenario
Keeping your business running
What does that mean for each of you?
What does that mean for your staff + co-workers?
Do you have a plan to:
Stay open?
Recover?
Rebuild?
Do nothing?
5. Best Practice BCP Process
6. Best Practice BCP Process Project Definition
Risk Assessment
Business Impact Analysis
RTO, RPO, Time Critical, Mission Critical
BC Management Design
Emergency Response and Operations
Crisis Communications
Coordination with External Agencies
Detail and Implementation
Awareness and Education
Exercise and Adapt
7. Sponsorship Sponsor has to own initiative.
Sponsor has to fund initiative.
Sponsor is accountable for initiative.
Different levels of sponsors.
Who are sponsors?
President, CEO (Primary)
CIO, CAO, CFO (Secondary)
Directors and Managers (Tertiary)
8. Best Practice BCP Process
9. Project Planning Sponsor
Sets Goals and Charter
Project/Program Manager
Needs Project Management Experience
Manages process/administration/negotiation
Steering Committee
Senior Management
C level, VP, etc.
Approves and makes decisions
Manages budget
Project Team
Business Unit representatives
Managers, Senior staff, Subject Matter Experts
Primary knowledge base
10. Best Practice BCP Process
11. Risk Assessment
12. Risk Assessment
13. Risk Assessment Threats are the cause
Nature – Flood (Peterborough), Hurricane (Katrina/Louisiana), Ice Storm (Ottawa)
Political – Terrorist Attacks (9-11-01), Unions
Engineered – Viruses (Blaster)
Infrastructure – Power Outage (Ontario)
Pandemics – Virus (SARS)
Vulnerability
Probability
Severity
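The three factors above (vulnerability, probability, severity) can be combined into a rough ranking of threats. The following is a minimal sketch, not a standard method: the 1 to 5 scales, the multiplicative score, the `Threat` class and the sample ratings are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    """One row of a hypothetical risk register."""
    name: str
    vulnerability: int  # 1 (well protected) .. 5 (fully exposed)
    probability: int    # 1 (rare) .. 5 (expected)
    severity: int       # 1 (minor) .. 5 (catastrophic)

    def score(self) -> int:
        # Multiplying the three ratings is an assumed scoring scheme;
        # weighted sums or lookup matrices are equally valid choices.
        return self.vulnerability * self.probability * self.severity


def rank_threats(threats):
    """Return the threats sorted from highest to lowest score."""
    return sorted(threats, key=lambda t: t.score(), reverse=True)


if __name__ == "__main__":
    # Ratings are placeholders, not assessed values.
    register = [
        Threat("Flood", vulnerability=2, probability=2, severity=5),
        Threat("Power outage", vulnerability=4, probability=3, severity=3),
        Threat("Virus outbreak", vulnerability=3, probability=4, severity=3),
    ]
    for threat in rank_threats(register):
        print(f"{threat.name:15s} {threat.score():3d}")
```

The ranking only orders threats for discussion; the ratings themselves still have to come from the project team.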
14. Risk Assessment Risks relate to impact
Function disruption to one or many
Elapsed Time
Reach
Controls
Deterrent
Mitigates
Reduces
15. Risk Assessment Analysis Develop worst case scenarios
Develop realistic scenarios
Controllable versus uncontrollable
Identify how functions necessary for conducting business operations and communication are impacted by risks
Identify how threats and risks impact human safety and revenue generation
Preventive controls (security & redundancy)
Reactive controls (failing over to, or restarting operations at, a designated location)
16. Risk Template
17. Application Impact
18. Ten Worst Mistakes IT Staff Make Connecting systems to the Internet before hardening them
Connecting systems to the Internet with default passwords (wireless issues)
Failing to update systems when vulnerabilities are found
Using telnet or other unencrypted protocols for network management
Giving network access over the phone/without authentication
Failing to maintain backups according to legislated archiving
Running misconfigured, unvalidated, security tools (hacker software, I/10)
Failing to implement or update antivirus software
Allowing untrained, uncertified users to take responsibility for securing important systems (accidental use of vendor bias / training)
Failing to train users on what constitutes a security problem
Source: RCMP Technical Security Branch, 2002
19. Best Practice BCP Process
20. Business Impact Analysis Exposure to loss over time
Direct versus Indirect
Cost to recover
Cost to avoid recovery
Business Process interdependencies
Workflow
Legal and Regulatory
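Exposure to loss over time, and the comparison between cost to recover and cost to avoid recovery, can be put into numbers. This is a minimal sketch under stated assumptions: losses grow linearly per hour, indirect losses may begin after a delay, and the function names and figures are hypothetical. It ignores how likely the outage is, which a real analysis would weigh in.

```python
def outage_exposure(hours_down, direct_per_hour, indirect_per_hour,
                    indirect_delay_hours=0.0):
    """Cumulative loss for an outage lasting hours_down hours.

    Direct losses (e.g. lost sales) accrue from the start; indirect losses
    (e.g. reputation, penalties) are assumed to start after a delay.
    """
    if hours_down < 0:
        raise ValueError("hours_down must be non-negative")
    direct = direct_per_hour * hours_down
    indirect = indirect_per_hour * max(0.0, hours_down - indirect_delay_hours)
    return direct + indirect


def cheaper_to_avoid(hours_down, direct_per_hour, indirect_per_hour,
                     cost_to_recover, cost_to_avoid):
    """True if preventing the outage costs less than suffering and recovering."""
    total_if_hit = outage_exposure(
        hours_down, direct_per_hour, indirect_per_hour) + cost_to_recover
    return cost_to_avoid < total_if_hit


if __name__ == "__main__":
    # Placeholder figures for one business function.
    for hours in (1, 4, 8, 24):
        loss = outage_exposure(hours, 1000, 500, indirect_delay_hours=4)
        print(f"{hours:2d} h down: exposure {loss:,.0f}")
```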
21. Recovery Objectives
22. Business Impact Analysis Recovery Point Objective (RPO)
The point in time at which your company’s data was last replicated or stored off-site
Does not guarantee that the data is uncorrupted or synchronized
How often you create a recovery point determines YOUR risk or data loss potential
Recovery Time Objective (RTO)
How long can you wait for certain functions to be restored?
Tie in RTO to manual process
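The link between recovery-point frequency and data loss potential is simple arithmetic: everything written after the last off-site copy is at risk. A minimal sketch follows; the function names, dates and durations are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def data_at_risk(last_copy: datetime, failure: datetime) -> timedelta:
    """Window of work lost if the primary fails at `failure`."""
    if failure < last_copy:
        raise ValueError("failure cannot precede the last copy")
    return failure - last_copy


def worst_case_loss(copy_interval: timedelta) -> timedelta:
    """A failure just before the next copy loses nearly a full interval."""
    return copy_interval


def meets_rto(estimated_restore: timedelta, rto: timedelta) -> bool:
    """True if the estimated restore time fits within the RTO."""
    return estimated_restore <= rto


if __name__ == "__main__":
    # Hypothetical nightly copy at 23:00, failure the next afternoon.
    lost = data_at_risk(datetime(2006, 11, 1, 23, 0),
                        datetime(2006, 11, 2, 15, 30))
    print("Data at risk:", lost)
    print("Worst case with nightly copies:", worst_case_loss(timedelta(hours=24)))
    print("Restore of 6 h meets 4 h RTO:",
          meets_rto(timedelta(hours=6), timedelta(hours=4)))
```

If the restore estimate exceeds the RTO, a manual process has to carry the function in the meantime.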
23. DR SLA 101 for IT
24. Risk Mitigation: No BCP
25. Risk Mitigation: With BCP
26. Legal Bill C-471, C-387
EI Wait period can be waived in Disaster Region
Bill C-6 “PIPEDA”
Personal Information Protection and Electronic Documents Act
Bill C-45 “Westray”
Law to hold corporations, their directors and executives criminally accountable for the health and safety of workers
27. Vital Records Management Legal
Audits, Financials, Intellectual Property, Privacy
Duplication
Electronic, Hard Copy
Off-Site Storage
Access, security, maintenance
28. Best Practice BCP Process
29. ITIL Definition BCP is part of the ITIL (IT Infrastructure Library) framework
Continuity management involves the following basic steps:
Prioritizing the businesses to be recovered by conducting a Business Impact Analysis (BIA).
Performing a Risk Assessment (aka Risk Analysis) for each of the IT Services to identify the assets, threats, vulnerabilities and countermeasures for each service.
Evaluating the options for recovery.
Producing the Contingency Plan.
Testing, reviewing, and revising the plan on a regular basis.
30. BCM Design How will you avoid recovery?
How will you recover?
How will you manage recovery?
Defined by RA and BIA
Interdependencies
Relates to RTO & RPO
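Interdependencies decide the order of recovery: a service cannot come back before the services it relies on. One way to derive that order is a topological sort, sketched here with Python's standard `graphlib` module (Python 3.9+). The service names and dependency map are hypothetical.

```python
from graphlib import TopologicalSorter


def recovery_order(depends_on):
    """Return services in an order where every dependency comes first.

    `depends_on` maps each service to the set of services it needs
    running before it can be restored.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical dependency map gathered during the RA and BIA.
    services = {
        "network": set(),
        "directory": {"network"},
        "email": {"network", "directory"},
        "order entry": {"directory"},
    }
    for step, name in enumerate(recovery_order(services), start=1):
        print(step, name)
```

`TopologicalSorter` raises `graphlib.CycleError` when two services depend on each other, which is itself a useful finding for the design stage.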
31. Planning Strategies Perform a Business Impact Analysis (BIA).
Determine risk and cost of downtime by critical application or function.
Create a Business Continuity Plan (BCP).
Create a Disaster Recovery Plan (DRP).
Create a Business Recovery Plan.
Create a Business Resumption Plan.
Create a Contingency Plan.
Execute the Plans.
Test and adapt the plans regularly.
32. Technology BC and DR Planning Software assist tools
iSCSI – low cost network data transport
Vendor Support (Microsoft, Novell, HDS, EMC, SUN, STK, McData, CISCO, etc.)
Drivers + Initiators, NICs, TOE
FAS – data replication/compression
Protocols: FC, FCIP, iFCP, iSCSI, TCP/IP, DWDM
Data Networks: FAS, SAN, NAS
Data Recovery
Tape Backup
Archiving
Database log synchronization
33. Data Replication
34. Emergency Response Steering Committee (SC) becomes Critical Management Team
Escalation & Notification
Disaster declaration
Emergency Operations Centre (EOC)
Life Safety
Evacuation
Fire, Police, Medical
Security
Property and news media
35. Crisis Communication Respond
Emergency Response Team
Contact lists (up to date)
Recover Operations
Departmental Recovery Teams
Damage Assessment Team
Restoration
Clean Up
Insurance, rebuild, fail back
36. Communicate with EXT Agencies Plans to deal with local emergency services
Plans to deal with media
Plans to deal with volunteers
Contact Lists
Call them before they call you
Lists for internal response teams
Lists for restoration companies
Lists for short/long term staffing
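Contact lists are only useful if they are kept up to date. A minimal sketch of a staleness check follows, assuming each entry records the date it was last verified; the field names, the 90-day threshold and the sample entries are hypothetical.

```python
from datetime import date, timedelta


def stale_contacts(contacts, today, max_age_days=90):
    """Return names of contacts not verified within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [c["name"] for c in contacts if c["last_verified"] < cutoff]


if __name__ == "__main__":
    # Placeholder entries; a real list would come from the plan appendix.
    contacts = [
        {"name": "Restoration company", "last_verified": date(2006, 3, 1)},
        {"name": "Emergency Response Team lead", "last_verified": date(2006, 10, 15)},
    ]
    for name in stale_contacts(contacts, today=date(2006, 11, 2)):
        print("Re-verify:", name)
```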
37. Best Practice BCP Process
38. Detail & Implement Plan is now a program
BCP is part of business culture
Implement controls
Vital Records replication and storage
Information Technology
Physical Security
Clean Desk Policies
Test controls with scenarios
39. Controls: Information Technology Data Availability
Clustering & Load Balancing
Data Replication – Synch vs. Asynch
Data Compression
Off Site Storage – Tapes, CD/DVD, WORM
Data Recovery – Tape Backup
Intrusion Prevention/Detection
Anti-Virus, Anti-Spam, Firewalls
IP Surveillance
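The synchronous versus asynchronous replication choice listed above comes down to what is lost when the primary site fails. The toy model below is a sketch only: `ReplicatedStore` is a hypothetical class, and real replication products add acknowledgements, ordering guarantees and network latency that are not modelled here.

```python
class ReplicatedStore:
    """Toy model of a primary store with one off-site replica."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []
        self.replica = []
        self._pending = []  # writes not yet shipped (asynchronous mode only)

    def write(self, record):
        self.primary.append(record)
        if self.synchronous:
            # Synchronous: the replica holds the record before the write
            # is considered complete.
            self.replica.append(record)
        else:
            # Asynchronous: the write completes now and is shipped later.
            self._pending.append(record)

    def ship(self):
        """Transfer queued writes to the replica (asynchronous mode)."""
        self.replica.extend(self._pending)
        self._pending.clear()

    def lost_if_primary_fails(self):
        """Records that exist only on the primary right now."""
        return list(self._pending)


if __name__ == "__main__":
    store = ReplicatedStore(synchronous=False)
    store.write("order 1")
    store.ship()
    store.write("order 2")  # not shipped yet
    print("Lost on failure:", store.lost_if_primary_fails())
```

In this model the synchronous store never has pending writes, so nothing is lost on failure; the trade-off in practice is that every write waits on the remote site.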
40. Strategy Focus Embrace BCP as part of your culture
Be prepared
Educate and promote sponsorship
Human Safety
Security: Physical, records, IT Network
Emergency Response
Maintain business infrastructure
Mission Critical Operations
Support Operations
Self Insurance as a strategy
Reactive effort is not protection, it’s wasteful and costly
Public Perception
Privacy & Legislation
Compliance Legislation
41. Plan Structure Cheat sheet with numbers on a business card sized doc.
Speed kills
Outside help to identify risks and costs saves time
One person owns editing on plan documents
Get a partner to review it and/or test it with you
Keep main document short (no more than 4 pages)
Write the document yourself (you’ll understand it better)
Have the backups for subject matter experts write content
Appendices for workflow functions owned by departments
Assume you will only have 50% of your staff to manage the plan
Assume whoever uses the plan knows something about your business and/or applications
Assume you are on your own
42. Best Practice BCP Process
43. Application of the Plan Exercise the Controls
Exercise different scenarios
Exercise by cross training
Communicate results
Update documentation
Audit requirements
Develop New Hire training
Develop general training
44. Resources http://www.drii.org Disaster Recovery International
http://www.dri.ca DRI Canada
http://www.drj.com Disaster Recovery Journal
http://www.redcross.ca Red Cross
http://www.fema.gov Federal Emergency Management Agency
http://www.overt.ca Ontario Volunteer Emergency Response Team
http://www.ocipep.gc.ca Public Safety and Emergency Preparedness Canada
http://www.drie.org Disaster Recovery Information Exchange
http://continuitycentral.com/itdr.htm Continuity Central
45. Are you prepared?
46. Questions?