National Cancer Institute Center for Biomedical Informatics and Information Technology (CBIIT). 2009 Exercise for the NCI CBIIT LAN General Support System IT Contingency Plan Orientation Deck. March 31, 2009.
This document is confidential and is intended solely for the use and information of the client to whom it is addressed.
Orientation, Training, and Exercise – Program Schedule
• Orientation (30 min.)
  • Goals and Objectives
  • IT Contingency Planning Defined
  • Contingency Plan Design
  • Plan Overview
  • Roles and Responsibilities
  • Organizational Structure
  • Three Main Phases and Process Flow
  • Maintenance and Testing
  • Appendices
• Training and Exercise Activities (50 min.)
  • Tabletop discussions
    • Scenario 1 – Minor System Failure
    • Scenario 2 – Major System Failure
• Wrap Up (5 min.)
  • Next steps/Comments and Corrections
  • Collect signed Key Personnel Acceptance forms
Orientation, Training, and Exercise – Goals and Objectives
• Today’s orientation and training seek to provide key personnel with:
  • A general explanation of IT contingency planning principles and business continuity planning (BCP) in accordance with NIST and HHS guidance
  • A high-level view of the Contingency Plan structure
  • A brief description of key roles and responsibilities
  • A brief description of the contingency plan organization and process flow
  • A brief description of contingency plan maintenance and training
  • A series of training and exercise activities to test basic understanding of the plan
IT Contingency Planning – Definition
An IT Contingency Plan defines appropriate, cost-effective, and approved strategies necessary for recovering and reconstituting IT services following an emergency or major system disruption.*
* HHS Information Security Program, Contingency Planning for Information Security Guide. July 21, 2004.
IT Contingency Planning – What Does It Do?
• An IT Contingency Plan establishes:
  • A support organization and recovery teams
  • System boundaries (diagrams, inventory, and architecture) and the concept of operations
  • Damage assessment guidelines and procedures
  • Step-by-step system recovery and resumption activities for minor and major system failures
  • Key points of contact (internal/external)
• The plan is designed to be a viable tool that helps coordinate the activities of key personnel and the movement of critical assets for the efficient recovery of a major application or general support system.
IT Contingency Plans Are an Important Part of Any Information Assurance Program
• With an ever-increasing reliance on information technology to support critical business processes, demands for contingency planning are also increasing
• An evolving threat environment will inevitably crack perimeter and boundary protection
• Organizations therefore need a response, recovery, and resumption capability
• Contingency planning is mandated by regulatory requirements (e.g., NIST SP 800-34, FISMA, OMB A-130) and is a key component of the Certification and Accreditation (C&A) process
IT Contingency Plans Are a Key Element of Overall Business Continuity Planning (BCP)
[Figure not reproduced. Callout: "Our focus today". Source: NIST 800-34, IT Contingency Plan Guide]
NCI IT Contingency Plans
• Purpose: To provide procedures and capabilities for recovering a major application or general support system.
• Scope: All management, operations, and support personnel for major applications and general support systems:
  • NCI LAN GSS Administration staff
  • CBIIT Infrastructure support staff (contractors and FTEs)
  • Enterprise Database support teams
Roles and Responsibilities – Individuals and Teams Responsible for Performing Contingency Operations
Contingency Planning Operations – Three Main Phases
• Contingency planning activities occur in three main phases:
  • Activation/Notification
  • Recovery
  • Resumption/Reconstitution
• Each phase has a distinct beginning and end; however, Recovery and Resumption activities may overlap.
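The three-phase flow can be pictured as a small state machine. The sketch below is illustrative only: the `Phase` enum, the `ALLOWED` transition table, and the `advance` and `run_contingency` helpers are hypothetical names that are not part of the plan, and the model treats the phases as strictly sequential, so it does not capture the possible overlap between Recovery and Resumption. It also assumes that a disruption can be assessed during Activation/Notification without the plan being activated.

```python
from enum import Enum


class Phase(Enum):
    NORMAL = "Normal operations"
    ACTIVATION = "Activation/Notification"
    RECOVERY = "Recovery"
    RESUMPTION = "Resumption/Reconstitution"


# Allowed moves between phases. This table is an assumption made for
# illustration; the plan itself is the authority on process flow.
ALLOWED = {
    Phase.NORMAL: {Phase.ACTIVATION},
    # A disruption may be assessed without the plan being activated.
    Phase.ACTIVATION: {Phase.RECOVERY, Phase.NORMAL},
    Phase.RECOVERY: {Phase.RESUMPTION},
    Phase.RESUMPTION: {Phase.NORMAL},
}


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the target phase, rejecting transitions not in ALLOWED."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def run_contingency(plan_activated: bool) -> list:
    """Return the sequence of phases visited for one disruption."""
    path = [Phase.NORMAL]
    path.append(advance(path[-1], Phase.ACTIVATION))
    if plan_activated:
        path.append(advance(path[-1], Phase.RECOVERY))
        path.append(advance(path[-1], Phase.RESUMPTION))
    path.append(advance(path[-1], Phase.NORMAL))
    return path


if __name__ == "__main__":
    for phase in run_contingency(plan_activated=True):
        print(phase.value)
```

Encoding the transitions in a table makes an out-of-order step, such as jumping from normal operations straight to Recovery, fail loudly instead of passing unnoticed.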
Phase I - Activation/Notification
• Purpose:
  • Identify and assess the system disruption;
  • Activate the contingency plan if needed;
  • Notify all necessary personnel to begin recovery activities.
• Activities:
  • The CPC is notified that a system disruption has occurred.
  • The CPC notifies all users (internal and external) of the outage through appropriate means (e.g., email, web posting, pagers).
  • The CPC requests a damage assessment from the appropriate recovery team leads.
  • The CPC reviews the damage assessment report and classifies the disruption as either a ‘Minor System Failure’ or a ‘Major System Failure’.
  • If the plan is activated, the CPC notifies all application user groups and the NCI CIO of the outage and the expected recovery time.
  • The CPC ensures the alternate site location manager is notified that a contingency has been declared and is asked to prepare for the organization’s arrival.
  • The CPC coordinates recovery operations with external systems/resources if needed (e.g., placing a high-priority trouble ticket with NIH for facility, infrastructure, or other GSS issues).
NOTE: 6116 Room 120 has been designated as an Emergency Operations Center (EOC) for local recovery operations, and Helgerman Court has been designated as the alternate EOC.
Phase II - Recovery Phase
• Purpose: To provide a structured means for restoring NCI LAN services on a temporary basis, either at an alternate location or on temporary resources.
• The CPC’s primary activity during this phase is to monitor progress by obtaining periodic updates from the Recovery Team Leads.
• Activities:
  • The CPC shall monitor all recovery plan activities, adjust them to reflect any delays in infrastructure restoration, and communicate the changes to the service and application recovery teams.
  • The CPC references the SLAs between NCI and NIH/ORS, monitors progress on recovery operations, and escalates if necessary.
  • The CPC monitors recovery teams’ progress, updates the NCI CIO, and functions as the liaison to all other NCI and NIH personnel not directly involved in the recovery effort.
  • Upon recovery of the primary IT infrastructure and facilities, contingency operations move into the resumption phase.
Phase III - Resumption
• Purpose: To provide a seamless transition from contingency operations back to normal operations.
• Activities:
  • Recovered systems/applications on the primary infrastructure, at the primary location, should be tested for normal operation.
  • The CPC notifies the NCI CIO that the GSS (primary) infrastructure has been tested and is functioning properly.
  • The CPC notifies system owners and recovery team leads to end manual processing activities if they were initiated.
  • Recovery Teams should return all materials, plans, and equipment used during recovery and testing to storage or to their proper location.
  • All sensitive materials must be properly returned to their storage locations or destroyed.
  • The CPC notifies stakeholders regarding the resumption of normal system/business operations.
  • The CPC should help develop an after-action report to be filed with the NCI ISSO.
Maintaining a Viable Tool – Contingency Plan Training, Testing, and Exercise (TT&E)
• To ensure an effective and viable contingency planning capability, periodic training, testing, and exercises (TT&E) should be conducted on the Contingency Plan.
• Periodic (e.g., bi-annual) audits will be conducted to verify availability and accuracy of supporting documentation (SOPs, DR Plans, etc.), monitoring and notification procedures, etc.
• Annual testing will be done using progressively more detailed and realistic exercises to demonstrate CBIIT’s preparedness for ‘Major’ disruptions.
• A CP is a living document and should be updated periodically to reflect any changes to the systems architecture, inventory, lessons learned, and key personnel.
Appendices – Useful Information at Your Fingertips
• A variety of useful information is located in the Contingency Plan appendices, such as:
  • Key Personnel Contact Information
  • Configuration Diagrams
  • System Inventories
  • Contingency Plan Flow Chart
  • Damage Assessment Report Form
  • Action Item Checklists
  • MOUs/SLAs
  • After Action Report Template
• Appendices serve as a useful location for critical information that may change periodically
• Appendices can be added or modified over time without making major changes to the plan, for example:
  • Updates to inventory lists
  • Points of Contact Information for key personnel
  • New component diagrams
Conclusion
• In Summary:
  • It is important to review the plan in greater depth to better understand your roles and responsibilities.
  • Knowing how the process works will help ensure the most efficient recovery of critical operations.
  • Making sure alternates have a basic understanding of the plan will provide for a seamless handoff of responsibilities during a crisis.
  • Helping the CPC note any changes to key personnel, system architecture, and primary facilities will keep the plan current and prevent it from becoming obsolete.