This article discusses software assurance and the evaluation of tools used in the development and testing process. It explores the importance of good requirements, operations, testing, coding, and design in achieving software assurance. The concept of an assurance case and the evidence that supports it is also explained. The article touches on the role of NIST in software assurance and the details of tool evaluations. It concludes with the need to measure the effectiveness of tools and techniques used in software assurance.
Software Assurance Metrics and Tool Evaluation Paul E. Black National Institute of Standards and Technology http://www.nist.gov/ paul.black@nist.gov
What is NIST?
• National Institute of Standards and Technology
• A non-regulatory agency in the Dept. of Commerce
• 3,000 employees + adjuncts
• Gaithersburg, Maryland and Boulder, Colorado
• Primarily research, not funding
• Over a century of experience in standards and measurement: from dental ceramics to microspheres, from quantum computers to building codes
What is Software Assurance?
• … the planned and systematic set of activities that ensures that software processes and products conform to requirements, standards, and procedures.
  • from NASA Software Assurance Guidebook and Standard
• to help achieve
  • Trustworthiness - No vulnerabilities exist, either of malicious or unintentional origin
  • Predictable Execution - Justifiable confidence that the software functions as intended
Getting Software Assurance
• Good Requirements
• Good Operations
• Good Testing
• Good Coding
• Good Design
What is an Assurance Case?
• A documented body of evidence that provides a convincing and valid argument that a specified set of claims about a system’s properties are justified in a given environment.
  • after Howell & Ankrum, MITRE, 2004
… in other words?
• Claims, subclaims
• Arguments
• Evidence
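The claims, arguments, and evidence structure can be sketched as a small tree. This is only an illustration under assumptions: the class names, fields, and the `is_supported` rule are hypothetical and are not taken from the talk or from any assurance case standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    source: str       # hypothetical: the tool or activity that produced it
    description: str  # what was observed


@dataclass
class Claim:
    statement: str
    argument: str = ""  # why the subclaims and evidence justify the claim
    subclaims: List["Claim"] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)

    def is_supported(self) -> bool:
        """Assumed rule: a claim needs evidence or subclaims of its own,
        and every subclaim must itself be supported."""
        if not self.evidence and not self.subclaims:
            return False
        return all(sub.is_supported() for sub in self.subclaims)


# Made-up example claims, for illustration only.
bounds = Claim(
    "All array accesses are bounds-checked",
    evidence=[Evidence("source code analyzer", "no out-of-bounds alarms")],
)
fuzzing = Claim("Fuzzing found no crashes")  # no evidence attached yet
top = Claim(
    "The parser has no buffer overflows",
    argument="Static and dynamic checks cover the overflow classes considered",
    subclaims=[bounds, fuzzing],
)

print(top.is_supported())  # False: the fuzzing subclaim has no evidence
fuzzing.evidence.append(Evidence("test manager", "fuzzing campaign log"))
print(top.is_supported())  # True once every subclaim carries evidence
```

The rule here only checks that evidence exists. Judging whether the evidence is convincing remains a human task.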
Evidence Comes From All Phases
• All tools should produce explicit assurance evidence
  • Design tools
  • Compilers
  • Test managers
  • Source code analyzers
  • Etc.
• What form should the evidence take?
  • OMG: “a common framework for analysis and exchange”
  • Grand Verification Challenge “tool bus”
  • Software certificates, à la proof-carrying code
  • Software security facts label
So… What About Maintenance?
• A change should trigger reevaluation of the assurance case
• A typical change should entail regular (re)assurance work: unit test, subsystem regression test, etc.
• A significant change may imply changes to the assurance case model
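One way to picture a change triggering reevaluation is to record which files each piece of evidence was derived from, then flag the evidence touched by a change. This is a minimal sketch. The mapping, the file names, and the `stale_evidence` helper are all hypothetical.

```python
from typing import Dict, List, Set


def stale_evidence(evidence_inputs: Dict[str, Set[str]],
                   changed_files: Set[str]) -> List[str]:
    """Return the names of evidence items derived from any changed file."""
    return sorted(
        name for name, files in evidence_inputs.items()
        if files & changed_files
    )


# Made-up mapping from evidence items to the files they were derived from.
evidence_inputs = {
    "unit tests for parser": {"parser.c"},
    "static analysis report": {"parser.c", "net.c"},
    "design review": {"design.md"},
}

to_redo = stale_evidence(evidence_inputs, {"net.c"})
print(to_redo)  # ['static analysis report']
```

A significant change, in the sense of the last bullet, would alter the mapping itself rather than only marking entries as stale.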
Why Does NIST Care?
• NIST and DHS signed an agreement in 2004 that kicked off the Software Assurance Metrics And Tool Evaluation (SAMATE) project: http://samate.nist.gov/
• NIST will take the lead to
  • Examine software development and testing methods and tools to target bugs, backdoors, bad code, etc.
  • Develop requirements to take to DHS for R&D funding.
  • Create studies and experiments to measure the effectiveness of tools and techniques.
  • Develop SwA tool specifications and tests.
• NIST already accredits labs (NVLAP, NIAP) and produces security standards (FISMA, DES, AES)
Details of SwA Tool Evaluations
• Develop clear (testable) requirements
  • Focus group develops tool function specification
  • Spec posted to web for public comment
  • Comments incorporated
  • Testable requirements developed
• Develop a measurement methodology:
  • Write test procedures
  • Develop reference datasets or implementations
  • Write scripts and auxiliary programs
  • Document interpretation criteria
• Come up with test cases
SAMATE Reference Dataset (SRD)
• Much more than a group of (vetted) test sets
[Figure labels: SRD; pen test eval set; code scanner benchmark; pen test minimum; …]
SRD Home Page
But, are the Tools Effective?
Do they really find vulnerabilities?
In other words, how much assurance does running a tool or using a technique provide?
Studies and Experiments to Measure Tool Effectiveness
• Do tools find real vulnerabilities?
• Is a program secure (enough)?
• How secure does tool X make a program?
• How much more secure does technique X make a program after doing Y and Z?
• Dollar for dollar, can I get more reliability from methodology P or S?
Contact for Participation
• Paul E. Black
  SAMATE Project Leader
  Information Technology Laboratory (ITL)
  U.S. National Institute of Standards and Technology (NIST)
  paul.black@nist.gov
  http://samate.nist.gov/
Possible Study: Do Tools Catch Real Vulnerabilities?
• Step 1: Choose programs that are widely used, have source available, and have long histories.
• Step 2: Retrospective
  • 2a: Run tools on older versions of the programs.
  • 2b: Compare alarms to reported vulnerabilities and patches.
• Step 3: Prospective
  • 3a: Run tools on current versions of the programs.
  • 3b: Wait (6 months to 1 year or more).
  • 3c: Compare alarms to reported vulnerabilities and patches.
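The comparison in steps 2b and 3c can be sketched as set operations, assuming alarms and reported vulnerabilities can both be reduced to a common location key. The `Location` key, the `compare_alarms` helper, and all data below are hypothetical. Real matching of alarms to vulnerability reports and patches is likely to be much fuzzier.

```python
from typing import Dict, NamedTuple, Set


class Location(NamedTuple):
    """Hypothetical matching key: where an alarm or vulnerability sits."""
    file: str
    function: str


def compare_alarms(alarms: Set[Location],
                   reported: Set[Location]) -> Dict[str, object]:
    """Split alarms and reported vulnerabilities into matched and unmatched."""
    caught = alarms & reported        # vulnerabilities the tool flagged
    missed = reported - alarms        # vulnerabilities the tool did not flag
    unconfirmed = alarms - reported   # alarms with no matching report (yet)
    recall = len(caught) / len(reported) if reported else 0.0
    return {"caught": caught, "missed": missed,
            "unconfirmed": unconfirmed, "recall": recall}


# Made-up data for illustration only.
alarms = {
    Location("parse.c", "read_header"),
    Location("net.c", "recv_packet"),
    Location("util.c", "copy_name"),
}
reported = {
    Location("parse.c", "read_header"),
    Location("auth.c", "check_token"),
}

result = compare_alarms(alarms, reported)
print(result["recall"])  # 0.5: one of two reported vulnerabilities was flagged
```

Unconfirmed alarms are not necessarily false alarms. In the prospective step, some of them may match vulnerabilities that are reported only after the waiting period.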
Possible Study: Transformational Sensitivity
• Step 1: Choose a program measure.
• Step 2: Transform a program into a semantically equivalent one (loop unrolling, in-lining, breaking into procedures, etc.).
• Step 3: Measure the transformed program.
• Step 4: Repeat steps 2 and 3.
• If the measurement is consistent across transformations, it measures the algorithm. If it differs, it measures the program.
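The steps above can be tried on a toy case. This sketch assumes a simple measure (one plus the number of `if`, `for`, and `while` nodes in the syntax tree) and a hand-written loop unrolling. The measure, the two program texts, and the helper names are illustrative choices, not anything prescribed by the study design.

```python
import ast


def branch_count(source: str) -> int:
    """Assumed measure: 1 + number of if/for/while nodes in the syntax tree."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, (ast.If, ast.For, ast.While))
                   for node in ast.walk(tree))


ORIGINAL = """
def total(xs):
    s = 0
    for i in range(3):
        s += xs[i]
    return s
"""

# The same computation with the loop unrolled by hand.
UNROLLED = """
def total(xs):
    s = 0
    s += xs[0]
    s += xs[1]
    s += xs[2]
    return s
"""


def run(source: str, arg):
    """Execute a program text and call its total() function."""
    namespace = {}
    exec(source, namespace)
    return namespace["total"](arg)


# Spot-check equivalence on one input; this is not a proof of equivalence.
same_behavior = run(ORIGINAL, [1, 2, 3]) == run(UNROLLED, [1, 2, 3])

measures = [branch_count(ORIGINAL), branch_count(UNROLLED)]
consistent = len(set(measures)) == 1

print(same_behavior)  # True
print(measures)       # [2, 1]
print(consistent)     # False: this measure depends on the program text
```

By the criterion on the slide, this measure reflects the program rather than the algorithm, since unrolling changes its value without changing behavior on the tested input.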
Dawson Engler’s Qualm
[Figure: two panels, each labeled “# vulnerabilities”, “time”, and “fixing weaknesses reported by tools”]
Possible Study: Engler’s Qualm
• Step 1: Choose programs that are widely used, have long histories, and have adopted tools.
  • Use number of vulnerabilities reported as a surrogate for (in)security.
  • DHS funded Coverity to check many programs.
• Step 2: Count vulnerabilities before and after.
• Step 3: Compare for statistical significance.
• Confounding factors
  • Change in size or popularity of a package
  • When was tool feedback incorporated?
  • Reported vulnerabilities may come from the tool.
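Step 3 does not name a statistical test. As one possibility, a paired sign test compares per-package vulnerability counts before and after tool adoption using only the direction of each change. The choice of test, the `sign_test_p_value` helper, and the counts below are assumptions for illustration. The test does nothing to address the confounding factors listed above.

```python
from math import comb
from typing import Sequence


def sign_test_p_value(before: Sequence[int], after: Sequence[int]) -> float:
    """Two-sided exact sign test on paired counts; ties are dropped."""
    decreases = sum(b > a for b, a in zip(before, after))
    increases = sum(b < a for b, a in zip(before, after))
    n = decreases + increases
    if n == 0:
        return 1.0
    k = min(decreases, increases)
    # Probability of k or fewer of the rarer outcome if each direction
    # is equally likely, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up vulnerability counts for eight packages, for illustration only.
before = [12, 9, 15, 7, 10, 8, 11, 6]
after = [8, 7, 11, 7, 6, 9, 5, 4]

p = sign_test_p_value(before, after)
print(p)  # 0.125: six decreases, one increase, one tie
```

With these made-up numbers the drop is not significant at the usual 0.05 level, even though most packages improved. A study with so few packages would have little power.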