
CSCE 548 Secure Software Development: Independence in Multiversion Programming


Presentation Transcript


  1. CSCE 548 Secure Software Development: Independence in Multiversion Programming

  2. Reading • This lecture: • B. Littlewood, P. Popov, L. Strigini, "Modelling software design diversity - a review", ACM Computing Surveys, Vol. 33, No. 2, June 2001, pp. 177-208, http://portal.acm.org/citation.cfm?doid=384192.384195 • Software reliability: John C. Knight, Nancy G. Leveson, "An Experimental Evaluation of the Assumption of Independence in Multi-Version Programming", http://citeseer.ist.psu.edu/knight86experimental.html • Recommended: • Nancy Leveson, "The Role of Software in Spacecraft Accidents", AIAA Journal of Spacecraft and Rockets, Vol. 41, No. 4, July 2004.

  3. Modeling Software Design Diversity – A Review • BEV LITTLEWOOD, PETER POPOV, and LORENZO STRIGINI, Centre for Software Reliability, City University • All systems need to be sufficiently reliable • Required level of reliability • Catastrophic failures • Need: • Achieving reliability • Evaluating reliability

  4. Single-Version Software Reliability • The Software Failure Process • Why does software fail? • What are the mechanisms that underlie the software failure process? • If software failures are “systematic,” why do we still talk of reliability, using probability models? • Systematic failure: if a fault of a particular class has shown itself in certain circumstances, then it can be guaranteed to show itself whenever these circumstances are exactly reproduced

  5. Systematic failure in software systems: If a program failed once on a particular input case it would always fail on that input case until the offending fault had been successfully removed

  6. Failure Process • System in its operational environment • Real-time system – time • Safety systems – process of failed demands • Failure process is not deterministic • Software failures: inherent design faults

  7. Demand space • Uncertainty: which demand will be selected, and whether this demand will lie in DF (the set of demands on which the program fails) • Source: Littlewood et al. ACM Computing

  8. Predicting Future Reliability • Steady-state reliability estimation • Testing the version of the software that is to be deployed for operational use • Sample testing • Reliability growth-based prediction • Consider the series of successive versions of the software that are created, tested, and corrected, leading to the final version • Extrapolate the trend of (usually) increasing reliability
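The steady-state (sample testing) idea on this slide can be illustrated with a small calculation. This is a minimal sketch, not taken from the lecture: it assumes the test cases are drawn independently from the operational demand profile and that no failures were observed, and it uses the standard bound obtained by solving (1 - p)^n = 1 - confidence for p. The function name `pfd_upper_bound` is hypothetical.

```python
def pfd_upper_bound(num_tests: int, confidence: float) -> float:
    """Upper bound on the probability of failure per demand (pfd) after
    num_tests failure-free tests, at the given confidence level.

    Assumption: tests are independent samples from the operational profile.
    """
    if num_tests <= 0:
        raise ValueError("num_tests must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # If the true pfd were p, the chance of n failure-free tests is (1 - p)^n.
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / num_tests)


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        print(n, pfd_upper_bound(n, 0.99))
```

The bound shrinks only in proportion to 1/n, which suggests why failure-free testing alone struggles to support claims of extreme reliability.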

  9. Design Diversity • Design diversity has been suggested to • Achieve a higher level of reliability • Assess the level of reliability • “Two heads are better than one.” • Hardware: redundancy • Mathematical curiosity: Can we make an arbitrarily reliable system from arbitrarily unreliable components? • Software: diversity and redundancy

  10. Software Versions • Independent development • Forced diversity • Different types of diversity • Source: Littlewood et al. ACM Computing

  11. N-Version Software • Use scenarios: • Recovery block • N-self checking • Acceptance

  12. Does Design Diversity Work? • Evaluation: • operational experience • controlled experimental studies • mathematical models • Issues: • applications with extreme reliability requirements • cost-effectiveness

  13. Multi-Version Programming • N-version programming • Goal: increase fault tolerance • Separate, independent development of multiple versions of the same software • Versions executed in parallel • Identical input → identical output? • Majority vote
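The majority-vote step can be sketched as follows. This is an illustrative sketch only: the names `majority_vote` and `run_n_versions` are hypothetical, the versions are called one after another rather than in parallel, outputs are assumed to be hashable and comparable for exact equality, and exceptions raised by a version are not handled.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the output produced by a strict majority of versions, or None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def run_n_versions(versions: Sequence[Callable], demand) -> Optional[Hashable]:
    """Run every version on the same demand and vote on the results."""
    return majority_vote([version(demand) for version in versions])


if __name__ == "__main__":
    # Three hypothetical versions of a threshold check; the third has a
    # boundary fault (uses > instead of >=).
    versions = [lambda x: x >= 10, lambda x: x >= 10, lambda x: x > 10]
    print(run_n_versions(versions, 10))  # the two correct versions outvote the faulty one
```

The voter masks the faulty version here only because the other two agree on the correct answer; if two versions shared the same boundary fault, the vote would confirm the wrong output.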

  14. Separate Development • At which point of software development? • Common form of system requirements document • Voting on intermediate data • Ramamoorthy et al. • Independent specifications in a formal specification language • Mathematical techniques to compare specifications • Kelly and Avizienis • Separate specifications written by the same person • 3 different specification languages

  15. Difficulties • How to isolate versions • How to design voting algorithms

  16. Advantages of N-Versioning • Improve reliability • Assumption: N different versions will fail independently • Outcome: probability of two or more versions failing on the same input is small • If the assumption is true, the reliability of the system could be higher than the reliability of the individual components
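To see what the independence assumption buys, the following sketch computes the probability that a majority of versions fail on the same demand. It assumes every version fails with the same probability p and that failures are statistically independent, which is exactly the assumption under question. It also treats any majority of failing versions as a system failure (a worst-case simplification, since failing versions might disagree with each other). The function name is hypothetical.

```python
from math import comb


def majority_failure_probability(n: int, p: float) -> float:
    """P(a strict majority of n versions fail on one demand), assuming each
    version fails independently with probability p."""
    majority = n // 2 + 1
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(majority, n + 1)
    )


if __name__ == "__main__":
    p = 0.01
    print("single version:", p)
    print("3-version system:", majority_failure_probability(3, p))
```

With p = 0.01 and three versions, the result is 3p²(1 - p) + p³ = 0.000298, roughly a 30-fold improvement over a single version. That gain depends entirely on the independence assumption.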

  17. Is the assumption TRUE?

  18. False? • Solving difficult problems → people tend to make the same mistakes • Common design faults • Common Failure Mode Analysis • Mechanical systems • Software systems • N-version-based analysis may overestimate the reliability of software systems!

  19. (1) How to achieve reliability? (2) How to measure reliability?

  20. How to Achieve Reliability? • Need independence • Even small probabilities of coincident errors cause a substantial reduction in reliability • Assuming independence overestimates reliability • Crucial systems • Aircraft • Nuclear reactors • Railways

  21. Testing of Critical Software Systems • Dual programming: • Producing two versions of the software • Executing them on a large number of test cases • Output is assumed to be correct if both versions agree • No manual or independent evaluation of correct output – expensive to do so • Assumption: it is unlikely that two versions contain identical faults for a large number of test cases
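Dual programming as described here amounts to back-to-back testing, which can be sketched as below. The harness and its names (`back_to_back_test`, `generate_input`) are hypothetical, and the two example versions are toy threshold checks, not code from any study.

```python
import random
from typing import Callable, List


def back_to_back_test(
    version_a: Callable,
    version_b: Callable,
    generate_input: Callable[[random.Random], object],
    num_cases: int,
    seed: int = 0,
) -> List[object]:
    """Run both versions on randomly generated inputs and return the inputs
    on which their outputs disagree. Agreement is taken as 'correct'."""
    rng = random.Random(seed)
    disagreements = []
    for _ in range(num_cases):
        case = generate_input(rng)
        if version_a(case) != version_b(case):
            disagreements.append(case)
    return disagreements


if __name__ == "__main__":
    draw = lambda rng: rng.randint(0, 20)
    # Intended behaviour (assumed for this example): x >= 10.
    # Both versions share the same boundary fault, so they always agree.
    faulty_a = lambda x: x > 10
    faulty_b = lambda x: x > 10
    print(back_to_back_test(faulty_a, faulty_b, draw, 1000))
```

Because both versions contain the identical fault, the harness reports no disagreements even though both are wrong at x = 10. This is the weakness the independence question targets: agreement says nothing about common faults.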

  22. Voting • Individual software versions may have low reliability • Run multiple versions and vote on “correct” answer • Additional cost: • Development of multiple versions • Voting process

  23. Common Assumption: low probability of common mode failures (identical, incorrect output generated from the same input)

  24. Independence • Assumed and not tested • Two versions were assumed to be correct if the two outputs for the test cases agree • Test for common errors but not for independence • Kelly and Avizienis: 21 related and 1 common fault – nuclear reactor project • Taylor: common faults in European practical systems • Need evaluation/testing of independence

  25. Experiment • University of Virginia and University of California at Irvine • Graduate and senior level computer science students • 27 programs (9 UVA, 18 UCI) • 1 million randomly-generated test cases

  26. Software System • Simple anti-missile system • Read data that represent radar reflections • Decide whether the reflections come from an object that is a threat • Heavily parameterized conditions • Research Triangle Institute (RTI) had previously conducted a 3-version study on the same problem → RTI supplied the requirement specifications

  27. Development Process • No overall software development methodology was imposed on the developers • Must use Pascal and specified compiler and operating system • Students were instructed about the experiment and N-versioning • Students were instructed not to discuss the project with each other • No restriction on reference sources

  28. Requirement Specification • Questions were answered by email → removes the potential for transferring unintended information • General flaws in the specification were broadcast to all the programmers • Debugging: each student received 15 input and expected output data sets • Acceptance test: • 200 randomly generated test cases • Different data sets for each program → avoids filtering of common faults

  29. Acceptance Test • All 27 versions passed it • Success was evaluated against a “gold” program • Written in FORTRAN for the RTI experiment • Has been evaluated on several million test cases • Considered to be highly accurate

  30. Evaluation of the Independence • 1 million tests were run on • 27 versions • Gold program • 15 computers were used between May and Sept. 1984 • Programmers had diverse backgrounds and expertise in software development

  31. Time Spent on Development • Reading requirement specification: 1-35 hours (avg. 5.4 hours) • Design: 4-50 hours (avg. 15.7 hours) • Debugging: 4-70 hours (avg. 26.6 hours)

  32. Experimental Results • Failure: any difference between the version’s output and the output of the gold program, or a raised exception • 15 x 15 Boolean array • Precision • High quality versions (Table 1) • No failure: 6 versions • Successful on 99.9% of the tests: 23 versions

  33. Multiple Failures • Table 2 • Common failures in versions from different universities

  34. Independence • Two events, A and B, are independent if the conditional probability of A occurring given that B has occurred is the same as the probability of A occurring, and vice versa. That is pr(A|B)=pr(A) and pr(B|A)=pr(B). • Intuitively, A and B are independent if knowledge of the occurrence of A in no way influences the occurrence of B, and vice versa.
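The definition above can be checked empirically from failure records. The sketch below estimates pr(A) and pr(A|B), where A and B are the events "version A fails" and "version B fails" on a test case. The function name and the ten-element failure records are made up for illustration and are not data from the experiment.

```python
from typing import Optional, Sequence, Tuple


def independence_check(
    fail_a: Sequence[bool], fail_b: Sequence[bool]
) -> Tuple[float, Optional[float]]:
    """Estimate pr(A) and pr(A|B) from per-test failure records.

    fail_a[i] / fail_b[i] are True if version A / B failed on test i.
    Returns (pr(A), pr(A|B)); pr(A|B) is None if B never failed.
    """
    if len(fail_a) != len(fail_b) or not fail_a:
        raise ValueError("records must be non-empty and of equal length")
    n = len(fail_a)
    p_a = sum(fail_a) / n
    b_failures = sum(fail_b)
    both = sum(1 for a, b in zip(fail_a, fail_b) if a and b)
    p_a_given_b = both / b_failures if b_failures else None
    return p_a, p_a_given_b


if __name__ == "__main__":
    # Hypothetical records for ten test cases.
    fail_a = [True, False, False, False, True, False, False, False, False, False]
    fail_b = [True, False, False, False, True, False, True, False, False, False]
    print(independence_check(fail_a, fail_b))
```

In this made-up data pr(A) is 0.2 but pr(A|B) is about 0.67: knowing that B failed makes a failure of A much more likely, so the two versions do not fail independently. With real data, a significance test is needed before drawing that conclusion.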

  35. Evaluating Independence • Examining faults → correlated faults? • Examine the observed behavior of the programs (it does not matter why the programs fail; what matters is that they fail) • Use a statistical method to evaluate the distribution of failures • 45 faults were detected, evaluated, and corrected in the 27 versions (Table 4)
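One way to apply a statistical method to the distribution of failures is sketched below. It is a sketch under stated assumptions, not necessarily the exact procedure used in the experiment. Under the null hypothesis of independence, the probability that two or more versions fail on a test follows from the individual failure probabilities, and the observed count of such tests can be compared with its expected value using a normal approximation to the binomial. All names and numbers are hypothetical.

```python
import math
from typing import Sequence


def prob_two_or_more_fail(p: Sequence[float]) -> float:
    """P(at least two versions fail on one test) if versions fail
    independently with individual probabilities p[i]."""
    p_none = math.prod(1 - pi for pi in p)
    p_exactly_one = sum(
        pi * math.prod(1 - pj for j, pj in enumerate(p) if j != i)
        for i, pi in enumerate(p)
    )
    return 1 - p_none - p_exactly_one


def coincident_failure_z(p: Sequence[float], num_tests: int, observed: int) -> float:
    """z-score of the observed number of multi-version failures against the
    count expected under independence (normal approximation).
    Assumes 0 < P(two or more fail) < 1, otherwise the variance is zero."""
    p_multi = prob_two_or_more_fail(p)
    expected = num_tests * p_multi
    std = math.sqrt(num_tests * p_multi * (1 - p_multi))
    return (observed - expected) / std


if __name__ == "__main__":
    # Hypothetical failure probabilities for three versions.
    probs = [0.001, 0.002, 0.0005]
    print(coincident_failure_z(probs, num_tests=1_000_000, observed=50))
```

A large positive z-score means coincident failures occurred far more often than independence predicts, which is grounds for rejecting the independence hypothesis.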

  36. Faults • Non-correlated faults: unique to individual versions • Correlated faults: several occurred in more than one version • More obscure than non-correlated ones • Resulted from lack of understanding of related technology

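  37. Discussion • Programmers: diverse backgrounds and experience (most skilled programmer’s code had several faults) • Program length: smaller than a real system → does not sufficiently address inter-component communication faults • 1 million test cases reflect about 20 years of operational time (assuming 1 execution/second and that each test case represents an unusual event occurring about once every 10 minutes; at 1 execution/second alone, 1 million executions would take under 12 days)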

  38. Conclusion on Independent Failures • Independence assumption does NOT hold • Reliability of N-versioning may NOT be as high as predicted • Approximately ½ of the faults involved 2 or more programs • Either programmers make similar faults • Or common faults are more likely to remain after debugging and testing • Need independence at the design level?

  39. Next Class • Penetration Testing
