The Case for Software Infrastructure Maintenance • Jim Horning, Chief Scientist, Information Systems Security Operation, SPARTA, Inc. • Sonoma State University, November 13, 2008
Overview • Definitions • Some ancient history • Some recent history • Maintenance of civil infrastructures • Maintenance of software • Two things that are not software maintenance • SCADA • A final puzzle for you • References
Infrastructure • An underlying base or foundation especially for an organization or system. • The basic public works of a city or subdivision, including roads, bridges, sewer and water systems, drainage systems, and essential public utilities. • The roads, bridges, rail lines, and similar public works that are required for an industrial economy, or a portion of it, to function. • Throughout history, infrastructure systems and services have continuously evolved in both technology and organization. Indeed, in many instances, social scientists measure the level of civilization or advancements of a society on the basis of the richness and articulation of its infrastructure systems. One can easily distinguish at least fifty systems and subsystems that constitute a city's infrastructure, ranging from large-scale transportation and water projects to neighborhood medical clinics and libraries. • A computer system's infrastructure would include the hardware, the operating system, database management system, communications protocols, compilers and other development tools; more generally, any element implicitly relied on in the provision of a service.
Maintenance • The work of keeping something in proper condition; upkeep. • Accounting: Periodic expenditures undertaken to preserve or retain an asset's operational status for its originally intended use. • Military: The routine recurring work required to keep a facility in such condition that it may be continuously used, at its original or designed capacity and efficiency for its intended purpose. Includes inspection, testing, classification as to serviceability, adjustment, servicing, recovery, evacuation, repair, overhaul, and modification. • Software: The recurring updating of programs in order to continue to operate as intended in a changing environment.
Ancient history: Key Roman Infrastructures • Roads • Agriculture and food stores • Aqueducts • [Photo from Assante]
[Figure: Timeline of Roman aqueducts, from Assante]
[Figure: Lack of maintenance, from Assante]
Recent history: Civil infrastructures • Much has been said about the neglect and consequent deterioration of America’s civil infrastructure: the publicly financed or regulated structures and facilities that support essential functions such as transportation (land, water, and air), water supply and wastewater treatment, power, and waste disposal. • There have been many costly infrastructure failures that could have been prevented by timely maintenance. • American engineers have been warning about under-investment in infrastructure maintenance for at least a quarter-century (e.g., America in Ruins: The Decaying Infrastructure, 1983). • But less has been done than said.
[Photo: New Orleans after Hurricane Katrina]
Hurricane Katrina, Aug. 29, 2005 • Cascading problems • Wind • High water • Levees collapsed • Massive flooding • Electricity lost • Pumps failed • Telephones largely failed • Water and sewer systems largely failed • Hospitals, schools, police, transportation, libraries, banks, … • Each collapsed infrastructure made restoring others harder • Over 1,500 dead • Over $100 billion in Federal aid alone • Over 100,000 trapped in city during storm; over 250,000 refugees • Complete recovery may take 20 years
[Photo: Interstate 35W bridge collapse, Aug. 1, 2007, New York Times]
Interstate 35W bridge collapse, Aug. 1, 2007 • Multiple causes • Faulty design • Gusset plates were too thin for design load (½” instead of 1”) • Structure was “fracture critical” • Inspection two years prior failed to recognize gusset plate buckling that was visible in photographs • Deferred maintenance (rated in “poor” condition for 17 straight years) • Bridge overloaded with construction equipment and materials • 13 killed, 145 injured • $38 M compensation package for victims • Expedited replacement of bridge cost $400 M • Replacement had been scheduled for 2020-25 • See http://www.transportation.org/sites/bridges/docs/I-35%20Bridge%20Collapse%20and%20Response.pdf for details and many graphic photos
My argument • Civilization and infrastructure are intimately intertwined. • Rising civilizations build and benefit from their infrastructures in a “virtuous cycle.” • As civilizations decline, their infrastructures decay.
Dependence on critical infrastructures is increasing globally. • This is true not only of information systems and network services, but also of many others that we rely on for our livelihoods and well-being. • These critical infrastructures are becoming more interrelated, and more heavily dependent on information technology. • People demand ever more and better services, but understand ever less about what it takes to provide those services.
The failure of a critical infrastructure can cascade into others. • The very synergies among infrastructures that allow progress to accelerate are a source of positive feedback, allowing initial failures to escalate into much larger long-term problems involving many different infrastructures. • Remediating after a collapse often involves many secondary costs that were not foreseen. • The more different infrastructures that fail concurrently, the more difficult it becomes to restore service in any of them. • Restoring a lost “ecosystem” generally costs much more than the sum of the costs of restoring each element separately.
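The cascade described on this slide can be sketched as a walk over a dependency graph. This is a minimal illustration, not a model of any real system: the dependency map is hypothetical (loosely patterned on the Katrina sequence listed earlier), and the rule that a service fails as soon as any one of its dependencies fails is a simplifying assumption.

```python
from collections import deque

# Hypothetical dependency map (an assumption for illustration only):
# each service lists the infrastructures it relies on.
DEPENDS_ON = {
    "levees": [],
    "electricity": ["levees"],
    "pumps": ["electricity"],
    "telephones": ["electricity"],
    "water_and_sewer": ["electricity", "pumps"],
    "hospitals": ["electricity", "water_and_sewer", "telephones"],
}


def cascade(initial_failure, depends_on):
    """Return the set of services that end up failed after one initial failure.

    Assumed rule: a service fails as soon as any single dependency has failed.
    """
    failed = {initial_failure}
    queue = deque([initial_failure])
    while queue:
        down = queue.popleft()
        for service, deps in depends_on.items():
            if service not in failed and down in deps:
                failed.add(service)
                queue.append(service)
    return failed


if __name__ == "__main__":
    print(sorted(cascade("levees", DEPENDS_ON)))
    print(sorted(cascade("telephones", DEPENDS_ON)))
```

In this toy map, a failure at the bottom of the graph (the levees) takes every service down, while a failure near the top (telephones) takes down only what sits directly above it. The restoration order is constrained by the same graph, which is one way to read the point that concurrent failures make each repair harder.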
The maintenance trade-off • Engineers know that physical infrastructures decay without regular maintenance, and they prepare for aging (e.g., corrosion and erosion) that requires inspections and repairs. • Proper maintenance is generally the cheapest form of insurance against failures (with rare exceptions, such as spacecraft, where maintenance is not feasible). • However, it has a definite present cost that must be balanced against the unknown future cost of possible failures.
Software maintenance • Although computer software does not erode or corrode, it is subject to incompatibilities and failures caused by changing environments, changing user practices, and changes in underlying hardware and software. • Therefore, it requires maintenance. • Yet the costs of software maintenance are often ignored in the planning, design, construction, and operation of critical systems. • Incremental upgrades to software are error-prone and complicate maintenance.
Software maintenance examples • Y2K • In the 60s it seemed perfectly reasonable to use two digits in dates to encode the year. • Who knew the COBOL software would still be used in ’00? • Global Positioning System satellite 32 • In the November 2008 issue of BoatU.S. magazine, there's a reference to a new GPS satellite being switched on. It uses the identifier “PRN 32,” which causes some Northstar GPS units to “become confused” and “shut down.” Fortunately, there are firmware updates available, though in some cases they cost money. Unfortunately, most boaters wouldn't know a firmware update if they hooked one, so there will undoubtedly be accidents and other problems, and GPS units “acting flakey” (they only crash when that particular satellite is in view).
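The two-digit-year problem is easy to reproduce. The following is a minimal sketch with hypothetical function names; it is not taken from any actual COBOL system, and the pivot value of 50 used in the "windowing" repair is an assumption chosen only for illustration.

```python
def age_two_digit(birth_yy, today_yy):
    """Age computed by plain subtraction of two-digit years (the Y2K-era shortcut)."""
    return today_yy - birth_yy


def expand_year(yy, pivot=50):
    """Windowing repair: treat 00..pivot-1 as 20xx and pivot..99 as 19xx.

    The pivot of 50 is an assumed value; any real system would pick its own.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year in the range 0..99")
    return 2000 + yy if yy < pivot else 1900 + yy


def age_windowed(birth_yy, today_yy, pivot=50):
    """Age computed after expanding both two-digit years to four digits."""
    return expand_year(today_yy, pivot) - expand_year(birth_yy, pivot)


if __name__ == "__main__":
    print(age_two_digit(65, 99))  # born '65, evaluated in '99
    print(age_two_digit(65, 0))   # same person, evaluated in '00
    print(age_windowed(65, 0))    # after the windowing repair
```

Plain subtraction works through 1999 and then yields a negative age in 2000, although the program itself never changed; only its environment did. Windowing is itself a deferral: it moves the failure to whatever year crosses the chosen pivot.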
Two things that I don’t call Software Maintenance • Adding new functionality: This is Software Extension. • Adding a new wing to a building is not maintenance. • Patching bugs: This is just Belated Quality Assurance (BQA). • November 11, 2008 (IDG News Service): Some security patches take time: seven and a half years, in fact, if you count the time it's taken Microsoft Corp. to patch a security issue in its SMB (Server Message Block) service, which was fixed Tuesday. This software is used by Windows to share files and print documents over a network. In a blog posting, Microsoft acknowledged that “Public tools, including a Metasploit module, are available to perform this attack.” Metasploit is an open-source tool kit used by hackers and security professionals to build attack code. According to Metasploit, the flaw goes back to March 2001, when a hacker named Josh Buchbinder (a.k.a. Sir Dystic) published code showing how the attack worked. Ben Greenbaum, research manager at Symantec Corp., said the flaw may have first been disclosed at Defcon 2000, by Christien Rioux (a.k.a. Dildog), chief scientist at Veracode Inc. Whoever discovered the flaw, Microsoft seems to have taken an unusually long time to fix it.
Neglecting maintenance • Creating maintainable systems is difficult and requires significant foresight, appropriate budgets, and skilled individuals. • Neglect is the inertially easy path; maintenance requires recurring effort, talent, and funding. • But appropriate investments in maintenance and in maintainability could yield enormous long-term benefits, through reliability, robustness against attack, ease of use, and adaptability to new needs.
Supervisory Control and Data Acquisition Systems • SCADA refers to a system that collects data from various sensors at a factory, plant, or other remote location and then sends it to a computer system that uses the data to manage and control a device, a facility, or a collection of facilities. • SCADA is used broadly to describe control and management solutions in a wide range of industries, including Water Management Systems, Electric Power, Traffic Signals, Mass Transit Systems, Environmental Control Systems, and Manufacturing Systems. • This is where software and civil infrastructure meet (or collide). • Virtually all modern SCADA systems are controlled by software. • For operational efficiency, more and more SCADA systems are being connected to the Internet.
(In)security • Insecure networked computers provide vandals easy access to the Internet, where spam, denial-of-service attacks, and botnet acquisition and control constitute an increasing fraction of all traffic. • They directly threaten the viability of one of our most critical modern infrastructures (the Internet), and indirectly threaten all the infrastructures connected to it via SCADA. • “Although many technological advances are emerging in the research community, those that relate to critical systems seem to be of less interest to the commercial development community.” (“Risks in Retrospect,” Comm. ACM, July 2000) • Our networked computers, in turn, depend on various other critical infrastructures: electricity, telecommunications, …
A final puzzle for you • Why do tomorrow’s software engineers receive so little education about • designing for maintainability, • preparing for software aging, • maintaining legacy software, and • knowing when and how to terminate decrepit legacy software systems?
To Dig Deeper: Civil Infrastructures • Infrastructure Protection in the Ancient World, Michael J. Assante, http://www.inl.gov/nationalsecurity/energysecurity/d/infrastructure_protection_in_the_ancient_world.pdf • America in Ruins: The Decaying Infrastructure, Pat Choate and Susan Walker, Duke University Press, 1983. • Cities and Their Vital Systems: Infrastructure Past, Present, and Future, Jesse H. Ausubel and Robert Herman (eds.), National Academies Press, 1988. • Civil Engineering: Public Works/Infrastructure, Library of Congress, 1991. http://www.loc.gov/rr/scitech/tracer-bullets/civilengtb.html • America's Ailing Cities: Fiscal Health and the Design of Urban Policy, Helen F. Ladd and John Yinger, Johns Hopkins University Press, 1991. • “It's Time to Rebuild America,” Felix G. Rohatyn and Warren Rudman, Washington Post, Dec. 13, 2005. • The Decaying Infrastructure of Complex Society, 2007. http://deconstructingthemanifest.blogspot.com/search/label/Complex%20Society • 4 Things the Roman Aqueducts Can Teach Us About Securing the Power Grid, Michael Assante and Mark Weatherford, CSO Security and Risk, 2005. http://www.csoonline.com/article/217014
To Dig Deeper: Software • International Conference on Software Maintenance (ICSM) • http://www.icsm2008.org • European Conference on Software Maintenance and Reengineering (CSMR) • http://www.csmr2008.uwaterloo.ca/ • “Risks of Neglecting Infrastructure,” Jim Horning and Peter Neumann • http://www.csl.sri.com/users/neumann/insiderisks08.html • Communications of the ACM, “Inside Risks” • http://www.csl.sri.com/users/neumann/insiderisks.html • Confessions of a Used Program Salesman: Institutionalizing Software Reuse, Will Tracz, Addison Wesley Longman, 1995. • Risks Digest • http://www.risks.org • Computer-Related Risks, Peter G. Neumann, Addison-Wesley/ACM Press, 1995. • Illustrative Risks to the Public in the Use of Computer Systems and Related Technology • http://www.csl.sri.com/users/neumann/illustrative.html