Raval • Fichadia, John Wiley & Sons, Inc., 2007. Systems Availability and Business Continuity. Chapter Four. Prepared by: Raval, Fichadia.
Chapter Four Objectives
• Understand system availability and business continuity, and recognize the differences between the two.
• Comprehend incident response systems and their role in achieving the system availability objective.
• Explain the objectives of disaster recovery planning and its design, implementation, and testing requirements.
• Comprehend the link between business continuity and disaster recovery.
• Understand the role of backup and recovery in disaster recovery plans.
Power outage at Northwest Airlines
• A thunderstorm and lightning at the datacenter location caused the problem.
• Systems, down initially, operated in a degraded manner the next morning.
• It took a very long time to check passengers in for flights.
• NWA triggered manual processes. Lines became longer and so did the delays in departure.
• Arrivals were late, and delayed departures from gates at the destination airports made arriving flights wait before they could reach a gate.
• NWA announced an embargo, limiting itself to what it could handle under the circumstances.
System Availability and Business Continuity
• System availability assures you that business will continue to operate.
• Business continuity is necessary for systems to add value on an ongoing basis.
• The issues of business continuity and systems availability are related and even overlap to a degree.
Incident Response
• Incident: A level of interruption in system availability that appears to be temporary.
• An incident can be triggered by an accidental action by an authorized user, or it may result from a threat.
• Incidents may be detected by:
  • End-users, who may describe the symptom but not the cause.
  • Those monitoring systems and processes, who may detect anomalies that point to an incident that has occurred.
• Attack: A series of steps taken by an attacker to achieve an unauthorized result.
• Event: An action directed at a target that is intended to result in a change of state, or status, of the target.
• An event consists of an action and a target.
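The definitions on this slide can be sketched as a small data model. This is only an illustration in Python, not a standard taxonomy implementation: the class names, field names, and the helper `describe` are assumptions made for the example, and the sample action and target labels are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Event:
    """An action directed at a target, intended to change the target's state."""
    action: str  # hypothetical label, e.g. "probe" or "modify"
    target: str  # hypothetical label, e.g. "account" or "network"


@dataclass
class Attack:
    """A series of steps taken by an attacker to achieve an unauthorized result."""
    steps: List[Event] = field(default_factory=list)
    unauthorized_result: str = ""


@dataclass
class Incident:
    """An interruption in system availability that appears to be temporary."""
    symptom: str       # what was observed; the cause may still be unknown
    detected_by: str   # "end-user" or "monitoring", following the slide
    accidental: bool   # True if triggered by an authorized user's mistake


def describe(event: Event) -> str:
    """Render an event as the action/target pair it consists of."""
    return f"{event.action} -> {event.target}"
```

A report from an end-user would typically fill in only `symptom` and `detected_by`; relating the incident to an `Attack` or to individual `Event` records is left to the response team's analysis.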
Nature of Response to an Incident
• Assess the business significance of the incident’s impact.
• Identify critical business processes that might have been compromised.
• Determine the root causes of the incident. This might present a challenge, for every incident could be of a different variety. The team may need to consult experts from outside the team.
• Training in forensics could help the team collect and evaluate evidence systematically.
• Standard procedures must be followed for restoring the affected systems and processes, instead of ad hoc, one-off attempts to restore what is compromised or lost.
Preventive Measures
• Prevention is better, and could be more cost-effective, than a cure.
• Preventive measures require anticipating or predicting what might happen in terms of incidents and consequent compromises.
• Lessons learned from the organization’s own experiences and from those of others can help design and implement effective preventive measures.
Incident Response Team
• A multi-skilled group, since the incident may be of any variety and may impact almost any information asset.
• May include representation from human resources, legal, information systems, networks and communications, physical security, information security, and public relations.
• A top management team member may be designated as a direct contact for counseling and support.
CERT
• CERT stands for Computer Emergency Response Team.
• Also called the CERT Coordination Center (CERT/CC), it is the Internet’s official emergency team.
• Provides alerts and offers incident handling and avoidance guidelines.
• Is located at Carnegie Mellon University.
• www.cert.org
Disaster Recovery
• Disaster: An event that causes a significant and perhaps prolonged disruption in system availability.
• Disasters can be man-made or natural.
• Man-made disasters can be malicious or unintentional.
• Disaster recovery is a systematic effort to recover from the impact of a disaster.
• The best way to understand recovery is to focus on the post-disaster phases.
• Post-disaster phases:
  • Immediate response
  • Near-term resumption
  • Recovery toward normalization
  • Restoration to pre-disaster state
Timeliness of Action and Value of Recovery
• Timeliness of action
  • The timeline of planned actions should reflect the value of each action at the time it is taken.
  • Certain steps can wait while others must be taken without delay, to minimize losses.
• Value of recovery
  • Timeliness of action reflects the value of the recovery target.
  • Considering this, recovery tasks should be systematically assigned to each post-disaster phase.
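As a rough sketch of assigning recovery tasks to post-disaster phases by timeliness, the Python below buckets tasks by how long each can wait. The four phase names come from the slides; everything else is an assumption made for illustration: the `max_delay_hours` measure, the cut-off values, and the sample task names are hypothetical, and a real plan would set them from a business impact analysis.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class Phase(IntEnum):
    """The four post-disaster phases, in order."""
    IMMEDIATE_RESPONSE = 1
    NEAR_TERM_RESUMPTION = 2
    RECOVERY_TOWARD_NORMALIZATION = 3
    RESTORATION_TO_PRE_DISASTER_STATE = 4


@dataclass
class RecoveryTask:
    name: str
    max_delay_hours: float  # assumed measure: how long the task can wait


# Hypothetical cut-offs (in hours) for the first three phases.
PHASE_CUTOFFS_HOURS = [
    (4, Phase.IMMEDIATE_RESPONSE),
    (48, Phase.NEAR_TERM_RESUMPTION),
    (336, Phase.RECOVERY_TOWARD_NORMALIZATION),
]


def assign_phase(task: RecoveryTask) -> Phase:
    """Place a task in the earliest phase whose cut-off covers its tolerable delay."""
    for cutoff, phase in PHASE_CUTOFFS_HOURS:
        if task.max_delay_hours <= cutoff:
            return phase
    return Phase.RESTORATION_TO_PRE_DISASTER_STATE


def plan(tasks: List[RecoveryTask]) -> Dict[Phase, List[str]]:
    """Group task names by phase, most urgent tasks first within each phase."""
    schedule: Dict[Phase, List[str]] = {phase: [] for phase in Phase}
    for task in sorted(tasks, key=lambda t: t.max_delay_hours):
        schedule[assign_phase(task)].append(task.name)
    return schedule


if __name__ == "__main__":
    sample = [
        RecoveryTask("Restore archived reports", 1000),
        RecoveryTask("Activate emergency contacts", 1),
        RecoveryTask("Resume order entry", 24),
    ]
    for phase, names in plan(sample).items():
        print(phase.name, names)
```

Using tolerable delay as the sorting key is one simple proxy for the value of a recovery target; a weighted score that also accounts for cost or dependencies between tasks would be an equally reasonable choice.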