Frederick P. Brooks, Jr., Kenan Professor & Department Founder
Some Things that Computer Science Can Learn from Nature
Mike Reiter, Lawrence M. Slifkin Distinguished Professor, Department of Computer Science, University of North Carolina at Chapel Hill
Natural Science and Computer “Science”
• Natural science: any of the sciences (e.g., physics, chemistry, or biology) that deal with matter, energy, and their interrelations and transformations
• In contrast, computer science is a “formal science”
• Some have even argued that computer science is not a science at all
• Computer science derived primarily from math and engineering, not from the natural sciences
• Numerous qualitative differences have been suggested, mostly deriving from their objects of study
[Figure, after Mowry: Computer Science as a source of tools. Labels: “Tools” (Mathematical Reasoning; Systems, Theory, Applications) and “Used to Improve” (Computing, i.e. writing & running code; Computers; Everything Else, e.g., Medicine, Entertainment, Business, Safety, Science, etc.)]
Computing Systems vs. Natural Systems
• “Natural systems are much more complex than computers.”
  • Just because we built computers doesn’t mean we understand them
Computing Systems vs. Natural Systems
• “Natural systems adapt.”
[Figure: timeline from 1990 to 2004, labeled “Attack Sophistication” and “Intruder Knowledge”. Techniques marked on the chart: packet spoofing; automated probes/scans; techniques to analyze code for vulnerabilities without source code; Windows-based remote controllable Trojans (Back Orifice); widespread denial-of-service attacks; Internet social engineering attacks; increase in wide-scale Trojan horse distribution; hijacking sessions; distributed attack tools; GUI intruder tools; automated widespread attacks; home users targeted; executable code attacks (against browsers); anti-forensic techniques; widespread attacks on DNS infrastructure; widespread attacks using NNTP to distribute attack; sophisticated command & control; increase in worms; “stealth”/advanced scanning techniques; DDoS attacks; email propagation of malicious code]
Computing Systems vs. Natural Systems
• This is not a depiction of any biological phenomenon
• It’s the geographic spread of the Sapphire worm 30 minutes after release
Source: http://www.caida.org
Can CS Learn from Nature? • Modularity
Can CS Learn from Nature? • Diversity
Can CS Learn from Nature? • Redundancy
Modularity
• Decomposing a system into components separated by narrow interfaces at which access control is applied
• Often separation is enforced by physical constraints
• Modularity least privilege (in my view)
• Can be thought of as a method of damage containment
Modularity: Trusted Computing Base (TCB)
[Figure: two system stacks shown side by side. Labels: App 1 … App, S, OS, Shim, DMA Devices (Network, Disk, USB, etc.), CPU, RAM, TPM, Chipset]
Modularity: TPM Background
• The Trusted Platform Module (TPM) is a dedicated security chip
• It can provide an attestation to remote parties
• Platform Configuration Registers (PCRs) summarize the computer’s software state
• PCR_Extend(N, V): PCR_N ← SHA-1(PCR_N | V)
• TPM provides a signature over PCR values
• TPM spec v1.2 includes dynamic PCRs
  • Values can be reset without a reboot
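The extend operation above can be sketched in a few lines of Python. This is an illustration only, not TPM driver code: the all-zero starting value and the component names are assumptions, and each component is hashed first so that the extended value V is itself a SHA-1 digest.

```python
import hashlib

PCR_SIZE = 20  # SHA-1 digest length in bytes


def pcr_extend(pcr: bytes, value: bytes) -> bytes:
    """Return SHA-1(pcr | value), i.e. the new register contents."""
    return hashlib.sha1(pcr + value).digest()


# Assumed starting state: an all-zero register.
pcr = bytes(PCR_SIZE)

# Hypothetical software components, measured in load order.
for component in (b"bootloader", b"kernel", b"app"):
    measurement = hashlib.sha1(component).digest()
    pcr = pcr_extend(pcr, measurement)

print(pcr.hex())
```

Because each new value depends on the previous one, the final register summarizes both what was measured and in what order; a register can be extended but never set directly.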
Modularity: Late Launch Background
• Supported by new commodity CPUs
  • SVM for AMD
  • TXT (formerly LaGrande) for Intel
• Designed to launch a VMM without a reboot
• Hardware-based protections ensure launch integrity
• New CPU instruction (SKINIT/SENTER) accepts a memory region as input and atomically:
  • Resets dynamic PCRs
  • Disables interrupts
  • Extends a measurement of the region into PCR 17
  • Begins executing at the start of the memory region
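As a rough mental model, the four atomic steps can be written as a toy state machine. Nothing here touches real hardware: the `Machine` class, the choice of PCRs 17 to 23 as the dynamic set, and the boot and reset values are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field

DYNAMIC_PCRS = range(17, 24)  # assumption: PCRs 17-23 are the resettable ones


@dataclass
class Machine:
    """Hypothetical model of the state that a late launch touches."""
    # Assumed boot-time value of the dynamic PCRs (all 0xFF bytes).
    pcrs: dict = field(default_factory=lambda: {i: b"\xff" * 20 for i in DYNAMIC_PCRS})
    interrupts_enabled: bool = True
    running: bytes = b""


def late_launch(m: Machine, region: bytes) -> None:
    """Model the steps SKINIT/SENTER performs atomically on a memory region."""
    for i in DYNAMIC_PCRS:
        m.pcrs[i] = bytes(20)                       # 1. reset dynamic PCRs
    m.interrupts_enabled = False                    # 2. disable interrupts
    measurement = hashlib.sha1(region).digest()
    m.pcrs[17] = hashlib.sha1(m.pcrs[17] + measurement).digest()  # 3. extend PCR 17
    m.running = region                              # 4. start executing the region


m = Machine()
late_launch(m, b"security-sensitive code")
print(m.pcrs[17].hex())
```

In this model, PCR 17 after the launch depends only on the launched region, not on whatever software ran before it.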
Modularity: The Flicker System [w/ McCune, Parno, Perrig, and Seshadri]
• Core technique
  • Pause current execution environment
  • Execute security-sensitive code with hardware-enforced isolation
  • Resume previous execution
• Extensions
  • Preserve state securely across invocations
  • Attest only to code execution and protection
  • Establish secure communication with remote parties
Modularity: Flicker Execution Flow
[Figure: execution flow diagram. Labels: App, OS, Module, RAM, CPU, S, Shim, SKINIT, Reset, Inputs, Outputs, TPM with PCRs (0 0 0 …), K^-1]
Modularity: Flicker Attestation
[Figure: a remote party asks “What code did you run?” The TPM’s PCRs, reset to 0, record S and the Shim, the Inputs, and the Outputs; the reply is Sign(…, K^-1) over these values]
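One way to read the attestation figure: the remote party recomputes the PCR value it expects from the code, inputs, and outputs it believes were used, and compares it with the signed value. The sketch below assumes a single PCR that starts at zero and is extended in that order; the real measurement layout may differ, and the check of the TPM’s signature is assumed to have succeeded already and is not shown.

```python
import hashlib
import hmac


def extend(pcr: bytes, data: bytes) -> bytes:
    """PCR extend with the SHA-1 measurement of `data`."""
    return hashlib.sha1(pcr + hashlib.sha1(data).digest()).digest()


def expected_pcr(code: bytes, inputs: bytes, outputs: bytes) -> bytes:
    """Replay the assumed sequence: reset to zero, then code, inputs, outputs."""
    pcr = bytes(20)
    for item in (code, inputs, outputs):
        pcr = extend(pcr, item)
    return pcr


def verify(reported_pcr: bytes, code: bytes, inputs: bytes, outputs: bytes) -> bool:
    """True if the reported PCR matches what the verifier expects."""
    return hmac.compare_digest(reported_pcr, expected_pcr(code, inputs, outputs))


# Hypothetical values standing in for the shim plus S, its inputs, and its outputs.
reported = expected_pcr(b"shim+S", b"inputs", b"outputs")
print(verify(reported, b"shim+S", b"inputs", b"outputs"))    # True
print(verify(reported, b"other code", b"inputs", b"outputs"))  # False
```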
Diversity
• Studied first in the reliability community
  • Goal: Promote failure independence between program versions
  • Manual variant creation by different teams does not necessarily provide fault independence [Knight & Leveson 1986; Littlewood et al. 1989]
• More recently studied in the security community
  • Goal: Increase attacker’s effort to compromise systems
  • Has been studied at the O/S level, operator/user interface, and others [Forrest et al. 1997; Deswarte et al. 1998; Bain et al. 2000 …]
• Still an active topic of investigation
  • Ex: “Diversity as a computer defense mechanism: A panel” at the New Security Paradigms Workshop (NSPW) 2005
Diversity: Behavioral Distance [w/ Gao & Song]
• “Behavioral distance” is a measurement of the extent to which the system calls indicate similar simultaneous behavior
• A compromise of one variant causes divergence from the other variant
[Figure: system calls compared across variants. Labels: Apache Web Server, Abyss Web Server, Windows, Linux]
Diversity: Behavioral Distance
• Diverse Platform (Windows and Linux)
  • The same system call number in two sequences does not really denote the “same” call
  • System calls may not have a one-to-one correspondence
  • System call sequences may have different lengths
• Diverse Implementation (Apache and Abyss)
  • Difficult to map individual system calls between two sequences
• Experimented with two approaches
  • Evolutionary distance, originally proposed to evaluate whether two DNA sequences derive from a common ancestral sequence
  • Hidden Markov models
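The evolutionary-distance idea can be illustrated with a standard minimum-cost sequence alignment, in which either sequence may be padded with a dummy symbol at a fixed gap cost. This is a minimal sketch, not the authors’ implementation: the system call names, the pairwise cost table, and the gap cost are hypothetical, and a real cost table would have to be learned from normal traces.

```python
def behavioral_distance(seq_a, seq_b, cost, gap=1.0, default=1.0):
    """Minimum total cost of aligning seq_a with seq_b.

    cost maps (call_a, call_b) pairs to a mismatch cost; unknown pairs cost
    `default`. Aligning a call with the dummy symbol costs `gap`.
    """
    n, m = len(seq_a), len(seq_b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * gap
    for j in range(1, m + 1):
        d[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = cost.get((seq_a[i - 1], seq_b[j - 1]), default)
            d[i][j] = min(d[i - 1][j - 1] + pair,  # align the two calls
                          d[i - 1][j] + gap,       # pad seq_b with a dummy
                          d[i][j - 1] + gap)       # pad seq_a with a dummy
    return d[n][m]


# Hypothetical table: pairs of calls considered equivalent cost nothing.
COST = {("open", "NtCreateFile"): 0.0,
        ("read", "NtReadFile"): 0.0,
        ("write", "NtWriteFile"): 0.0}

windows = ["NtCreateFile", "NtReadFile", "NtReadFile", "NtWriteFile"]
normal = behavioral_distance(["open", "read", "write"], windows, COST)
suspect = behavioral_distance(["open", "read", "execve"], windows, COST)
print(normal, suspect)  # 1.0 2.0
```

The dummy symbol handles sequences of different lengths and calls with no one-to-one counterpart, and the cost table replaces the assumption that equal call numbers mean equal behavior.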
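Diversity: Hidden Markov Models
[Figure: example hidden Markov model with states q1, q2, q3, annotated with transition and emission probabilities (values shown: 65%, 100%, 70%, 25%, 10%, 30%, 50%, 10%, 90%, 50%, 30%)]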
Diversity: Hidden Markov Model for Behavioral Distance
• “-” represents a dummy symbol
[Figure: system call symbols emitted by the model, including dummies: - - 12 7 6 - 155 76 8 274]
Diversity: Elements of the Hidden Markov Model
• Once the HMM is trained, the probability that the HMM would have produced an observed sequence can be used to detect intrusions
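The probability mentioned above is what the standard forward algorithm computes. The sketch below uses a toy two-state model whose observations are pairs of system calls, one per variant; the states, probabilities, and call names are invented for illustration, and a trained model would supply the real parameters along with a detection threshold.

```python
def sequence_probability(obs, start, trans, emit):
    """Forward algorithm: probability that the HMM emits `obs` (non-empty)."""
    states = range(len(start))
    alpha = [start[s] * emit[s].get(obs[0], 0.0) for s in states]
    for symbol in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in states)
                 * emit[s].get(symbol, 0.0)
                 for s in states]
    return sum(alpha)


# Hypothetical model: state 0 emits the "open" pair, state 1 the I/O pairs.
START = [1.0, 0.0]
TRANS = [[0.5, 0.5],
         [0.0, 1.0]]
EMIT = [{("open", "NtCreateFile"): 1.0},
        {("read", "NtReadFile"): 0.5, ("write", "NtWriteFile"): 0.5}]

normal = sequence_probability(
    [("open", "NtCreateFile"), ("read", "NtReadFile"), ("write", "NtWriteFile")],
    START, TRANS, EMIT)
attack = sequence_probability(
    [("open", "NtCreateFile"), ("execve", "NtReadFile"), ("write", "NtWriteFile")],
    START, TRANS, EMIT)
print(normal, attack)  # 0.125 0.0
```

A sequence whose probability falls below a chosen threshold would be flagged; here the pair containing an unexpected call has zero emission probability in every state, so the whole sequence scores 0.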
How Far Does the Analogy Go?
• These examples show how we can learn strategies for survival from natural systems
• I believe these examples are just a sample of what we can learn from nature about managing systems that we don’t understand
  • Even if we built them ourselves!
• There’s also plenty of room for doubt
  • Clearly nature has its failures (extinct species, global warming, …)
  • The tactics (implementations) are quite different
• But I hope I’ve encouraged you to think about computer science in the broader context of all sciences, and to look for new opportunities at their intersections