September 11 What Worked, What Didn’t Sean Donelan Donelan.COM Critical Infrastructure Design
Introduction • Impact on the Internet • Rumors • Causes • What worked • What didn’t work • Duct tape solutions • Recommendations
Names Omitted • Individual company names omitted, unless there is only one company • Building addresses used if well-known location • General description of problems or vulnerabilities
Killed and Missing • World Trade Center: 445 people confirmed killed, 4500 to 5000 people missing • Pentagon: 125 killed • American Flight 11: 92 killed • United Flight 175: 65 killed • American Flight 77: 64 killed • United Flight 93: 44 killed • Estimated 2,600 citizens from 80 countries included in above numbers
Impact on the Internet • The Internet wasn’t a target • You aren’t a Tier-1 provider if you weren’t affected by something • Limited network partitioning US/Europe • Local impact ranged from complete destruction to no impact • Most network disruptions happened hours after the initial attack • Most service disruptions due to problems in edge networks
Rumors • 60 Hudson structurally unsound • FBI seizing ISP equipment “supporting” terrorist web sites • Military taking over satellite transponders shutting down ISPs • Carrier/Ryder trucks missing/stolen • Carnivore slowing down the Internet • Terrorists knew the code name for Air Force One
Yogi Berra It ain’t over till it’s over.
Causes • “Normal” disruptions like maintenance, fiber cuts, tropical storms, and crackers continue • Loss of third-party infrastructure • Operator errors & omissions • Exceeded environmental design • Direct damage due to the attack • Software bugs/Hardware failures • Lack of coordination/planning/information • Lack of auto-start/auto-boot
What Worked: Internet • Undamaged portions of the Internet continued to function (mostly) • TCP/IP worked (best-effort delivery) • BGP routing worked • Multicast routing worked • Core application protocols (DNS, E-mail) worked • VoIP (excess capacity, NMC bypass) • Packet wireless, Blackberry, Ricochet, 802.11b • Carrier Hotels/Colos
What Worked: Content • IRC used to feed live news captions • Instant Messenger usage increased by an estimated 20% • Mirroring/Local caches • Corporate web sites distributed updated information. Non-Internet companies seemed to use the web more effectively immediately after the attack • Charity fundraising from web sites with help from some e-commerce sites • SPAM, SPAM, SPAM
O’Toole’s Commentary on Murphy’s Law Murphy was an optimist.
What Didn’t Work: Complex Services • Load-balancing products replaced with DNS round-robin • Generated web pages replaced with direct load pages • Software disk mirroring product didn’t automatically recover after power failures • Analog lines repaired first
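The DNS round-robin fallback mentioned on this slide can be sketched in a few lines. This is only an illustration of the rotation idea, not any particular DNS server's implementation; the class name `RoundRobinRecordSet` and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made for the example.

```python
from collections import deque


class RoundRobinRecordSet:
    """Hypothetical record set that rotates its address list once per query."""

    def __init__(self, addresses):
        if not addresses:
            raise ValueError("need at least one address")
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current ordering, then rotate it for the next query."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


# Example addresses are placeholders from the documentation range.
records = RoundRobinRecordSet(["192.0.2.10", "192.0.2.11", "192.0.2.12"])

# Clients that take the first address in each answer are spread across servers.
first_choices = [records.answer()[0] for _ in range(6)]
```

The appeal in a crisis is that there is no health checking, session state, or extra appliance to fail; the cost is that a dead server keeps receiving its share of clients until its address is removed by hand.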
What Didn’t Work: Security & Authentication • Dialup authentication problems • Connect, but couldn’t log in • Central authentication servers were located in other regions • Several register/pay news web sites suspended authentication checks (public service, improved performance) • Difficulties verifying authenticity of requests from the “government” (possible social engineering or just FUD)
What Didn’t Work: Congestion • It’s so crowded, no one goes there anymore • Well-known news web sites initially overloaded (cached by other sources) • Government web site overloaded (FBI tip site) • NANOG and other mailing lists posting delays, but did deliver • Unicast (distributed and single source) streaming news sources overloaded • Generally a point-source problem • Not a backbone capacity issue (yet)
What Didn’t Work: POTS/Voice • “Worked” but did calls get through? • Carrier 1-800 call problems • Cell sites depend on landlines • ILEC versus CLEC access • ISPs established new dialup numbers replacing out of service numbers • Call centers were evacuated; who answered the phones?
What Didn’t Work: New York City • Network-wide effects • Physical damage in New York City • Network problems in New York City • Pentagon and Western Pennsylvania are not major public Internet hubs
What Didn’t Work: The net needs electricity • Electric substations and grid damaged • Outside plant carrier equipment not connected to the best available backup power source • Batteries don’t last a week • Generator failures • Operator turned off generator to save fuel • Fuel delivery problems • Lack of maintenance • Environment exceeded design conditions • Cooling (HVAC) equipment power supply
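A rough back-of-the-envelope calculation shows why "batteries don't last a week". The sketch below assumes a simple energy model (usable watt-hours divided by load) and ignores discharge-rate effects, temperature, and battery age. Every figure in it, including the 80% usable fraction, is a hypothetical value chosen for illustration and not a measurement from any affected site.

```python
def battery_runtime_hours(capacity_ah, bus_voltage, load_watts, usable_fraction=0.8):
    """Estimate runtime in hours for a battery string feeding a constant load.

    All inputs are assumptions supplied by the caller; usable_fraction is a
    hypothetical allowance for the capacity that should not be drawn down.
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    usable_wh = capacity_ah * bus_voltage * usable_fraction
    return usable_wh / load_watts


# Hypothetical site: 400 Ah string on a 48 V bus carrying a 2 kW load.
hours = battery_runtime_hours(capacity_ah=400, bus_voltage=48, load_watts=2000)

# A week without grid power or a working generator is 168 hours.
week_hours = 7 * 24
shortfall_hours = week_hours - hours
```

With these assumed numbers the batteries cover under eight hours, which is why the generator, fuel delivery, and maintenance items on this slide matter more than battery sizing for an outage lasting days.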
What Didn’t Work: Redundancy & Spares • If only a single circuit exists and it is destroyed, no IP traffic • Most end-users connected by a single circuit • Multi-homing versus a second circuit • Limited spare parts stored locally, rely on overnight couriers for replacement parts from central parts depots • Non-revenue generating equipment
What Didn’t Work: Diversity & Avoidance • Equipment in the World Trade Center primarily served tenants in the complex (shared fate) • SONET ring through WTC tower 1 and alternate path through WTC tower 2 • Damage to 140 West Street central office and surrounding underground infrastructure • Backup circuit routed through same facility • “Advanced” data circuits (ISDN/DSL) concentrated in a few central offices
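The "backup circuit routed through same facility" problem is a shared-fate check that can be run on paper or in a script, provided the carrier discloses the facilities each circuit traverses. The sketch below assumes each circuit is described as an ordered list of facility names; the function `shared_facilities` and all facility names are hypothetical.

```python
def shared_facilities(primary_path, backup_path, ignore=()):
    """Return facilities that both circuit paths traverse, in primary-path order.

    Facilities listed in `ignore` (typically the two endpoints, which every
    path must share) are left out of the result.
    """
    backup = set(backup_path) - set(ignore)
    seen = set()
    shared = []
    for facility in primary_path:
        if facility in backup and facility not in seen:
            shared.append(facility)
            seen.add(facility)
    return shared


# Hypothetical circuit layouts; real ones would come from carrier records.
primary = ["customer-site", "central-office-A", "carrier-hotel-X", "isp-pop"]
backup = ["customer-site", "central-office-B", "central-office-A", "isp-pop"]

# Any facility returned here is a single point of failure for both circuits.
common = shared_facilities(primary, backup, ignore=("customer-site", "isp-pop"))
```

An empty result only means the listed facilities differ. Two buildings can still share a conduit, a power feed, or, as with the two towers, a neighborhood.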
Duct Tape Solutions • Cables out windows and manholes and along streets • Carriers shared working facilities in telco hotels to restore service, more carriers generally means more facilities • Carrier provided emergency transit to ISPs in Europe to heal breaks in NYC • ConEd organized generators and fuel truck route for many buildings • Lots of offers of assistance
Blaise Pascal People are generally better persuaded by the reasons which they have themselves discovered than by those which have come into the mind of others.
Recommendations • Rumors will happen, must actively share information to combat them • Update government response plans to include the Internet and post-1982 telecommunication carriers • Automatic/Remote operation of backup systems in case of evacuation • Plan for customer service during evacuation of call centers
More Recommendations • Pre-plan emergency access with authorities, building owner, etc. • Pre-plan load shedding procedures to prevent shutting off critical equipment (Note: specify “critical equipment”) • “Outside plant” network transport equipment should be connected to building generator(s)
Net Recommendations • Operators are dangerous, do nothing? • Weakest link, know your circuits • Centralized login can create a denial of service vulnerability during a crisis • Using ISDN for out-of-band access may delay recovery • Simple services work best in a crisis • Diversity, Diversity, Diversity
What Worked, What Didn’t Work • Questions? • Sean Donelan