
Loss of the Columbia

This article discusses the loss of the Columbia space shuttle, examining the failures and resistance to risk assessment at NASA. It draws parallels to the 7x24 industry and explores the need for a change in culture and management practices. The article emphasizes the importance of learning from past failures and implementing effective risk assessment techniques.

josefinaj

Presentation Transcript


  1. Loss of the Columbia Lessons Not Learned Steve Fairfax MTechnology, Inc.

  2. Outline • Background • 7x24 Perspective • Review of the Challenger Loss • PRA at NASA after Challenger • Details of the Columbia Loss • NASA Resistance to PRA • Lessons for 7x24 • The Future of Human Space Flight

  3. Background • Challenger Loss January 1986 • Presentation to 7x24 Exchange Spring 1998 • 12 years hindsight • 100+ books published • Many obvious parallels to 7x24 industry • Columbia Loss February 1, 2003 • Asked to present analysis to 7x24 February 4 • 1200+ documents reviewed • 2.5 GB digital files, primarily from internet searches • 4 cubic feet hardcopy • Shuttle design and test information from early 1980s • Books and articles on the Challenger • About 300 hours effort

  4. 7x24 Perspective: Many similarities between NASA and 7x24 firms • Failures can produce catastrophic damages • Billions of dollars • Loss of human life • The role of the organization in technical failure • “Human error is responsible for 67% of all downtime” • Why is this tolerated? What has been done to change this? • Culture of secrecy • Failures and concerns about potential failure are rarely explored • NASA does a much better job than any 7x24 firm, still fails • Normalization of deviance • Near-misses and heroic “saves” are signs of impending failure, not factors of safety, not validation of design or operations • Production culture, time pressures • No time for initial commissioning • No scheduled maintenance windows • Dramatic failure required to get management attention

  5. 7x24 Perspective: PRA • Reliance on redundancy, factors of safety, anecdote, and appeals to experience • Lack of acceptance of PRA techniques, results • “Lack of data” used to excuse sloppy thinking • Results that gore sacred cows are unwelcome • Practices that clearly reduce reliability enshrined as “years of experience” • Inability to quantify risks makes optimal allocation of scarce resources impossible

  6. 7x24 Perspective: Important Differences • NASA: Government monopoly; resources determined by political process; formerly used to advance political goals; large bureaucracy, very difficult to change • 7x24 Industries: Varied, competitive; resources determined by market success or failure; mission critical, not political; range of sizes, change management for survival • Look for the [icon] for 7x24 relevant points

  7. Summary of the Challenger Loss • Engineers at Solid Rocket Booster (SRB) manufacturer recommended delay in launch • Cold temperatures stiffen O-rings • Seal failures observed 12 times on previous missions • Worst failure, 1/3 O-ring diameter erosion, on coldest launch at 53 °F • Predicted launch temperature 29 °F • 2 conference calls among multiple NASA centers and SRB manufacturer the evening before launch • NASA “appalled” at request for delay • SRB manufacturer management certified safe for launch • O-ring seals on solid rocket boosters (SRB) failed at ignition • Shuttle destroyed by aerodynamic forces 73 seconds after launch • Hot gas from failed O-ring seal caused SRB – ET attachments to fail • Loss of ET internal pressure caused tank to fail

  8. SRB failed joint seal leaking hot rocket gas

  9. Investigation of the Challenger Loss • Rogers Commission (reports to President) • Testimony under oath • Misleading Testimony by NASA managers • Disputed O-rings as cause of failure • Testified that Thiokol had certified system safe • Failed to mention initial recommendation to delay • Commission attitude shifts from co-operative to confrontational

  10. Causes of the Challenger Loss • Immediate Cause: O-ring failure due to improper design, cold temperatures • Contributing causes • Excessive schedule pressures • Normalization of deviance • Accepting failures as evidence of safety • Gradual shift from requirement to prove that flight is safe to proving it is unsafe • Reliance on experience to approve launch • Demand for “hard” data to halt launch

  11. Recommendations from the Challenger Loss • Change design with independent oversight and review • Change in management practices • Criticality Review and Hazard Analysis • Establish independent safety organization • Improved communications • Avoid reliance on shuttle alone

  12. Risk Assessment at NASA • Emphasis on Redundancy, Safety Factors, Criticality • 4N redundant avionics and computer systems • 1.4 Safety Factor in structure • Single failure of “Criticality 1” component leads to Loss of Vehicle and Crew (LOV/C) • Absent quantitative risk data, impossible to allocate limited resources most effectively • NASA forbade use of formal probabilistic risk analysis (PRA) after Apollo consultant found small probability of successful moon missions • NASA reluctantly started PRA in 1987, after Challenger loss

  13. Risk Assessment at NASA: Tile PRA • “Risk Assessment is a management tool.” • Risk Management for the Tiles of the Space Shuttle, Paté-Cornell & Fischbeck, 1994 • 15% of tiles account for 85% of the risk of LOV/C • Combination of 3 factors • Probability of tile loss/damage • Heat load during re-entry • Criticality of underlying structures and systems • Tile PRA allowed NASA to use same amount of maintenance resources (time, money, attention) to provide more safety • “NASA seems to have grown from a can-do organization to a large bureaucracy.” Ibid.
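The three-factor combination on this slide can be illustrated with a small sketch. The multiplicative score, the zone names, and every number below are hypothetical placeholders chosen for illustration. They are not taken from the Paté-Cornell study, which used a far more detailed model.

```python
from dataclasses import dataclass


@dataclass
class TileZone:
    name: str           # hypothetical zone label
    p_loss: float       # assumed probability of tile loss/damage per flight
    heat_load: float    # assumed relative re-entry heat load, 0..1
    criticality: float  # assumed criticality of underlying systems, 0..1

    @property
    def risk(self) -> float:
        # Illustrative score: product of the three factors named on the slide.
        return self.p_loss * self.heat_load * self.criticality


def rank_zones(zones):
    """Return (name, share_of_total_risk, cumulative_share), highest risk first."""
    total = sum(z.risk for z in zones)
    ranked = sorted(zones, key=lambda z: z.risk, reverse=True)
    rows, cumulative = [], 0.0
    for z in ranked:
        share = z.risk / total
        cumulative += share
        rows.append((z.name, share, cumulative))
    return rows


# Hypothetical input data, for illustration only.
ZONES = [
    TileZone("wheel-well door", 0.010, 0.9, 1.0),
    TileZone("wing underside, forward", 0.008, 0.8, 0.9),
    TileZone("mid fuselage underside", 0.006, 0.5, 0.4),
    TileZone("upper surface", 0.004, 0.1, 0.2),
]

if __name__ == "__main__":
    for name, share, cumulative in rank_zones(ZONES):
        print(f"{name:26s} share={share:6.1%}  cumulative={cumulative:6.1%}")
```

A ranking of this kind is what lets a fixed maintenance budget be directed at the small set of zones that carry most of the risk, rather than being spread evenly over all tiles.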

  14. Tile PRA Applied to NASA Management Practices • New techniques allowed PRA to account for effects of: • Lower pay scale for tile technicians • Lack of training and experience • No sense of priorities, procedures meant to ensure that everything was done “perfectly.” • Fixed daily quota of tile inspections • Schedule pressure effects on technician error rates • High Expectations set for flight frequency, safety • “High visibility makes it difficult for the organization to learn.” Ibid.

  15. Tile inspection and repair is difficult, painstaking work.

  16. PRA of the Entire Shuttle System • “Top-Down” PRA would allow NASA to • Identify systems contributing most to risk • Determine the uncertainty in the findings • Assess the benefits of potential improvements • Select systems with best risk/reward ratio for further work • Track effects and continuously improve • NASA opted for piecemeal PRA of subsystems • Goal was to show that risks were acceptable

  17. Brief History of the Tiles • Early shuttle concepts included Titanium alloy structure, skin, tiles • Excellent high-temperature strength, corrosion resistance • Difficult to form, new alloy development required • Expensive, primary source is Russia (then USSR) • Switch to Aluminum • Much lower cost • Large experience base from previous aerospace work • Loses 95% of room-temperature strength at 800 °F • Places much greater demands on tiles • Politically beneficial, domestic suppliers • Lowest capital cost, highest operating cost

  18. Tile History Continued • Tiles developed, manufactured by NASA • Ceramic/glass composite • 25,000 unique shapes, 14+ material mixes • Orbiter Skin must be smooth to control heating • Gap fillers used to fill voids • Early Columbia wing roughness led to “hottest” re-entry • Fragility not understood at first • Glass-like layer added to surface for protection • “Densified” layer added to base for strength

  19. Tile Fragility • Long history of bonding and debris damage • 40% loss of tiles on 1st ferry flight to KSC • Average of 179 tiles damaged on each of 1st 33 flights • High of 707 • Low of 53 • Design Specification for Tile Damage: 0 • “The reusable surface insulation (RSI) material used in the shuttle thermal protection system is susceptible to damage. If any RSI tiles are damaged or lost during ascent, they must be repaired or replaced prior to entry.” - NASA-TM-81822 (1980) • NASA cancelled on-orbit tile repair development program

  20. Photograph of tile damage to shuttle Endeavour, in orbit.

  21. New tiles installed on space shuttle Atlantis.

  22. Brief History of the External Tank • Initial RFP did not require insulation • Assumed ice would form, be shed as in Apollo • Extreme fragility of tiles added new requirement • No ice on external tank • “Orbiter tiles were so fragile that an ice cube dropped four inches would crack the tile glass coating.” • Lessons Learned from Space Shuttle External Tank Development by Myron Pessin, 2002 • Extensive effort to develop Spray-On Foam Insulation (SOFI) • 1st material withdrawn from market after Univ. Utah professor fed burned residue to rats, showed possible toxicity

  23. Brief History of the External Tank • 2nd SOFI material used Freon blowing agent • Banned by EPA for ozone damage during post-Challenger launch hiatus • NASA 1987 press releases tout “Environmentally Friendly” foam • 1st flight with 3rd SOFI produces 308 hits, 132 larger than 1-inch. Some gouges 15 inches long, depths up to 1.5 inches in 2-inch thick tile. • EPA granted NASA waiver to use Freon in 2001 • NASA continued to use new SOFI • 3rd foam material uses HCFC 141b blowing agent • 15% as harmful as Freon for ozone damage • Banned by EPA effective 2004 • No replacement found for SOFI

  24. Focus on the Failure: Leading-Edge RCC • Front edge of shuttle wings, shuttle nose get the most heating • 2500 °F to 3000 °F in typical re-entry • Reinforced Carbon-Carbon used in these areas • Dense matrix of carbon fiber cloth, carbonized epoxy resin • Approximately ¼” thick • Strong but brittle • SiC coating prevents oxidation (burning) of carbon • Nose RCC shows no signs of corrosion • Protected by cap on launch pad • Wing RCC panels (22 per wing) subject to corrosion • Salt spray creates pinholes in SiC protective coating • NASA inspects and repairs coating after each flight • NASA refuses to place canvas covers over RCC before flight

  25. [Figures: wing box structure during assembly, with labels “RCC Panels” (panels 8 and 9 marked) and “Wing bulkhead holds RCC panels”; IR photo shows wing heating]

  26. Focus on the Failure: T-seals • RCC panels on wing expand and move with large temperature changes • T-seals between each RCC panel seal gaps • T-seals constructed of RCC • Location, shape of T-seals makes inspection difficult • Columbia has unique attachment hardware, subject to corrosion

  27. RCC Panel T-Seal

  28. Focus on the Failure: Bipod Ramp • 2 struts attach shuttle nose to ET: the Bipod • Hand-applied foam ramp used to smooth airflow near ET attach points • Bipod ramp foam failed on previous flight, at least 4 other occasions • Previous flight bipod ramp SOFI debris hit SRB aft skirt • NASA did not order additional inspection of STS-107 ET bipod ramp for defects • Dissection of next ET bipod ramp showed multiple defects • Several large voids that reduced strength, trapped water or liquid air • Duct tape embedded in foam; increases chance of shear failure

  29. Bipod Attach point

  30. Bipod Ramp Foam on next (after Columbia STS-107) external tank.

  31. Voids and duct tape found in bipod foam on next ET

  32. Previous Bipod Ramp Foam Failure: External Tank photographed after jettison, STS-32, Jan. 9, 1990 • Missing foam: left bipod ramp, many other areas

  33. External Tank photograph from orbiter wheel well camera after ET jettison, flight STS-50, June 25, 1992 • Foam intact: right bipod ramp • Missing foam: left bipod ramp

  34. Details of the Failure: The Launch on January 16, 2003 • Left bipod ramp foam broke off 81 seconds after launch • Estimated size 21” x 16” x 6” • Shuttle velocity Mach 2.4 (1800 MPH) • Foam impacted lower portion left wing at ~750 FPS (500 MPH) • Cloud of debris observed below wing after strike • Strike appeared to occur on or near forward edge of wing
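The figures on this slide can be checked with a few lines of arithmetic. The sketch below converts the quoted impact speed to miles per hour and computes the volume of a full 21" x 16" x 6" rectangular block. Treating the debris as a solid box is an assumption made here, and it gives an upper bound, since the slide gives only estimated overall dimensions.

```python
FT_PER_MILE = 5280
SEC_PER_HOUR = 3600
CUBIC_IN_PER_CUBIC_FT = 12 ** 3


def fps_to_mph(feet_per_second: float) -> float:
    """Convert feet per second to miles per hour."""
    return feet_per_second * SEC_PER_HOUR / FT_PER_MILE


def box_volume_ft3(length_in: float, width_in: float, height_in: float) -> float:
    """Volume in cubic feet of a rectangular block measured in inches."""
    return length_in * width_in * height_in / CUBIC_IN_PER_CUBIC_FT


impact_mph = fps_to_mph(750)              # slide quotes ~750 FPS (500 MPH)
max_volume = box_volume_ft3(21, 16, 6)    # bounding-box volume, an upper bound

if __name__ == "__main__":
    print(f"750 ft/s = {impact_mph:.0f} mph")
    print(f"bounding-box volume = {max_volume:.2f} cubic feet")
```

The conversion gives roughly 511 mph, consistent with the slide's rounded 500 MPH, and the bounding box holds a little over one cubic foot of foam.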

  35. Details of the Failure: NASA-Boeing Analysis • Launch on January 16, 2003 • Reports presented January 21, 23, 24 • All reports assumed no water in foam • 4% water content would double weight of foam • 10.6 inches of rain fell while Columbia sat on launch pad • NASA pokes thousands of holes in foam to reduce debris shedding • CRATER program used to predict tile damage • Based on testing with 3 mm SOFI pellets • RCC damage predicted by comparison with ice impact database • RCC not designed for ice impact • Size of ice debris not revealed
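The claim that 4% water content doubles the weight of the foam can be sanity-checked with a rough calculation. The sketch below assumes the 4% is measured by volume and that the water fills voids without displacing foam. Neither assumption is stated on the slide. Under those assumptions it solves for the dry foam density at which the claim holds exactly.

```python
WATER_DENSITY_LB_FT3 = 62.4  # fresh water, approximate


def implied_dry_density(water_volume_fraction: float,
                        weight_multiplier: float = 2.0) -> float:
    """Dry foam density (lb/ft^3) at which the given water fraction by volume
    multiplies total weight by weight_multiplier (assumed model, see lead-in)."""
    added_water = water_volume_fraction * WATER_DENSITY_LB_FT3
    return added_water / (weight_multiplier - 1.0)


def wet_to_dry_ratio(dry_density: float, water_volume_fraction: float) -> float:
    """Ratio of wet weight to dry weight for one cubic foot of foam."""
    wet = dry_density + water_volume_fraction * WATER_DENSITY_LB_FT3
    return wet / dry_density


if __name__ == "__main__":
    density = implied_dry_density(0.04)
    print(f"implied dry foam density: {density:.2f} lb/ft^3")
    print(f"wet/dry ratio at that density: {wet_to_dry_ratio(density, 0.04):.2f}")
```

Under these assumptions the claim corresponds to a dry foam density of about 2.5 lb/ft³, which shows how little absorbed water is needed to double the mass, and therefore the impact energy, of a very light material.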

  36. SOFI debris damage to the shuttle tiles • Design requirement: No debris allowed to hit shuttle • History: more than 100 impacts on many flights • More than 25 hits larger than 1 inch on multiple flights • CRATER program developed to predict effects of SOFI impact on tiles • Based on study using 1/8” diameter SOFI pellets • CRATER program used to justify safety of Columbia after damage known • CRATER prediction exactly matches depth of gouge on STS-50 tiles • Inconsistent with NASA characterization of results as “conservative” • Possible that STS-50 single datum used to calibrate CRATER

  37. Critical area. Color added, not in original.

  38. Notes on NASA/Boeing Analysis • No mention of 1994 Tile PRA • No acknowledgement of different risk levels • Wing RCC loss = vehicle loss! “Criticality One” • Tile near front of wing: Criticality One • Main landing gear doors: Criticality One • Main landing gear door seals: Criticality One • T-seal loss: Criticality One • Uses close call on STS-50 to predict safety on STS-107 • Same flawed logic as Challenger pre-flight debate • Ignores huge extrapolation in energy, damage potential
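The size of the extrapolation can be estimated from figures given earlier in the deck: test pellets of 3 mm (slide 35) and a debris estimate of 21" x 16" x 6" (slide 34). The sketch below assumes a spherical pellet and a full rectangular debris block, both simplifications made here. With equal density and equal impact velocity (also assumed, since the test velocity is not given), kinetic energy scales directly with volume.

```python
import math

MM_PER_INCH = 25.4


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of a sphere from its diameter."""
    return math.pi * diameter_mm ** 3 / 6


def box_volume_mm3(length_in: float, width_in: float, height_in: float) -> float:
    """Volume in cubic millimetres of a rectangular block measured in inches."""
    return length_in * width_in * height_in * MM_PER_INCH ** 3


pellet = sphere_volume_mm3(3.0)        # assumed spherical 3 mm test pellet
debris = box_volume_mm3(21, 16, 6)     # bounding-box upper bound for the debris
volume_ratio = debris / pellet         # equals the energy ratio under the stated assumptions

if __name__ == "__main__":
    print(f"pellet volume: {pellet:.1f} mm^3")
    print(f"debris volume: {debris:.3e} mm^3")
    print(f"ratio: about {volume_ratio:.2e}")
```

Even if the real debris filled only a fraction of its bounding box, the ratio stays in the range of hundreds of thousands to millions, far outside anything the pellet tests could validate.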
