250 likes | 420 Views
Security & Trusting Trust. Swarun Kumar Based on slides courtesy: Jorge Simosa MIT 6.033 Spring 2013. Reflections on Trusting Trust. By Ken Thompson (UNIX paper, co-invented C and UNIX) Key Ideas: Difficult to know what the software you use actually does.
E N D
Security & Trusting Trust Swarun Kumar Based on slides courtesy: Jorge Simosa MIT 6.033 Spring 2013
Reflections on Trusting Trust • By Ken Thompson (UNIX paper, co-invented C and UNIX) • Key Ideas: • Difficult to know what the software you use actually does. • So write all of software yourself! • ... but that’s overwhelmingly impractical! • No choice but to trust software from certain sources.
What’s a compiler? • Transforms a code written in one language to another … usually from higher-level language (e.g. C) into machine code • New Compilers provide new features, new optimizations Input: Source Code Program: A Compiler (e.g. gcc) Output: New Compiler Binary App.c (written in C.1) C.1 App.exe
But where do compilers come from? • To write a new compiler say for C.2 (C version 2.0) • Write the program source for C.2 in the C.1 language • Feed it into the C.1 compiler • Resulting binary is a new compiler C.2 • Chicken-and-egg: But where did the first compiler C.0 come from? Input: Source Code Program: Old Compiler Output: New Compiler Binary C.2 (written in C.1) C.1 C.2
Example: C.2 has a new feature! • C.2: Accepts vertical tab ‘\v’ as a special character, like ‘\n’ and ‘\t’ • Source of C.2 (written in C.1): if(c[0] == ‘\\’ && c[1] ==‘n’) return ‘\n’; if(c[0] == ‘\\’ && c[1] == ‘v’) return 11; // Note we are using 11, since C.1 doesn’t recognize ‘\v’ • Output: The new compiler (C.2) can compile programs with ‘\v’
Interestingly, C.2 can now compile itself! C.2 Program: Output: Input: Source of C.2 written in C.1 C.1 C.2 if(c[0] == ‘\\’ && c[1] ==‘n’) return ‘\n’; if(c[0] == ‘\\’ && c[1] == ‘v’) return11; C.2 Output: Program: if(c[0] == ‘\\’ && c[1] == ‘n’) return ‘\n’; if(c[0] == ‘\\’ && c[1] == ‘v’) return‘\v’; Input: Source of C.2 written in C.2! C.2 C.2
So, can we discard old source of C.1/C.2? No! Because: C.2 may contain a hidden backdoor!e.g. a “master” password for all UNIX logins So what? … Why not patch that up in C.3? C.2 can infect C.3, C.4, etc… without leaving any evidence in their source code
How can C.2 infect other compilers? • Source of C.2+Trojan: if(matches login code) return (login with master password); if(matches C compiler) return (C compiler with hidden backdoor); • Input: Login Code => Output: Login with master password • Input: C.3 compiler => Output: C.3 compiler backdoor • Input: Regular C code => Output: Regular executable
How can we detect Trojans? Output-1 Output-1 C.3 C.3 C.3 C.3’’ C.2 C.3 Output-2 Output-2 C.3 C.3 C.3’ C.3’’’ C.2’ C.3’ Do we expect C.3 == C.3’ ? Do we expect C.3’’ == C.3’’’ ? • Not necessary • C.2, C.2’ may differ in optimizations • But must be functionallyidentical • Yes, absolutely! • C.3, C.3’ give same output with same input • If not, one of C.2, C.2’ has a Trojan/bug
Quiz 3 (2010) – Q8 Answer True/False based on the Trusting Trust paper: A. Thompson believes that self-reproducing programs shouldn’t be trusted. Answer: False. He doesn’t say anything about making them more or less trustworthy. Talks about programs in general.
Quiz 3 (2010) – Q8 Answer True/False based on the Trusting Trust paper: B. A Trojan horse like the one Thompson describes could not have been hidden in a compiler for a more modern language like Java. Answer: False. Backdoor is not language-specific.
Quiz 3 (2010) – Q8 Answer True/False based on the Trusting Trust paper: C. The Trojan horse Thompson embedded in the login program could have been found by looking at the machine instructions being executed by the CPU. Answer: True. Even though it might take a long time to figure out what the binary is doing.
Quiz 3 (2010) – Q8 Answer True/False based on the Trusting Trust paper: D. A programmer can prevent the type of attack Thompson describes by writing all of his or her programs in assembly code. Answer: False. Assembly code is still considered a “higher-language”, since it must be translated to machine code/instructions through an assembler.
Quiz 3 (2012) – Q13 Ben has Ken’s compiler (B) and its “supposed” source (S). He wants to know if it still has the login Trojan. His friend Alyssa has a clean compiler binary (A). The source code for the UNIX login program is L. Give an example of two compilation chains that can be compared to detect a possible Trojan? Notation: X -> Y is the result of using binary X to compile source Y • B -> S = A -> S • NO, they might make different optimizations, i.e. not the same output
Quiz 3 (2012) – Q13 Ben has Ken’s compiler (B) and its “supposed” source (S). He wants to know if it still has the login Trojan. His friend Alyssa has a clean compiler binary (A). The source code for the UNIX login program is L. Give an example of two compilation chains that can be compared to detect a possible Trojan? Notation: X -> Y is the result of using binary X to compile source Y • B -> S -> S = A -> S -> S • YES, if A and B have no Trojans, the intermediate output (new binary) should produce the same output when using the same input (S)
Quiz 3 (2012) – Q13 Ben has Ken’s compiler (B) and its “supposed” source (S). He wants to know if it still has the login Trojan. His friend Alyssa has a clean compiler binary (A). The source code for the UNIX login program is L. Give an example of two compilation chains that can be compared to detect a possible Trojan? Notation: X -> Y is the result of using binary X to compile source Y • B -> S = A -> S -> S • YES, since B should already be a compiled version of S, we can skip the step of B -> S
Quiz 3 (2012) – Q13 Ben has Ken’s compiler (B) and its “supposed” source (S). He wants to know if it still has the login Trojan. His friend Alyssa has a clean compiler binary (A). The source code for the UNIX login program is L. Give an example of two compilation chains that can be compared to detect a possible Trojan? Notation: X -> Y is the result of using binary X to compile source Y • B -> S -> L = A -> S -> L • YES, similar to second answer, we can instead feed just the login source
Quiz 3 (2012) – Q13 Ben has Ken’s compiler (B) and its “supposed” source (S). He wants to know if it still has the login Trojan. His friend Alyssa has a clean compiler binary (A). The source code for the UNIX login program is L. Give an example of two compilation chains that can be compared to detect a possible Trojan? Notation: X -> Y is the result of using binary X to compile source Y • B -> L = A -> S -> L • YES, similar to fourth answer, but we can skip the step of B -> S
More Past Quizzes (Trusting Trust) Visit http://web.mit.edu/6.033/www/assignments/quiz-3.shtml • 2012 Q3 - #13 (Section 6) • 2010 Q3 - #8 • 2010 Q3 - #13-15 (Section 3) • 2008 Q3 - #5 (Section 3) • 2006 Q3 - #2 *There may be more that I have accidentally overlooked.
Secure Channels • Alice wants to authenticate message m sent to Bob • First cut for security: Let k be a shared key • Then Alice, besides m, sends y = H(“m|k”) where | is a delimiter • Bob verifies if y == H(“m|k”), since he also has k How can Alice and Bob securely exchange the key k?
Diffie-Hellman key exchange Bob Alice random a ga mod p random b gb mod p k = (ga)b = gab mod p k = (gb)a = gba mod p Both Alice and Bob have the same key k, without sending it on the network
Taking it a step further… • Use Public/Secret Keys (… like many of you in DP2) • Can use a PK/SK to authenticate the shared key exchange • Can use PK/SK based signatures • Many more attacks possible (DOS, TCP SYN flooding, Botnets) Security is an arms-race… So, Fewer assumptions in threat model => stronger security