Countering Trusting Trust through Diverse Double-Compiling

Russ Giordano • CSC 8410 Operating Systems • 4/23/2007

Product Security Evaluations • Evaluations generally assume that reviewing the source code is enough • Each time the source code is recompiled, check that the result is the same binary • But what happens if malicious code has been inserted into the binary files of the compiler itself?
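
The "same binary" check on this slide can be sketched as a digest comparison. This is a minimal illustration, not a prescribed tool: the function names `file_digest` and `builds_match` are hypothetical, and it assumes the build is bit-for-bit reproducible.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def builds_match(reference: Path, rebuilt: Path) -> bool:
    """True if the rebuilt binary is bit-for-bit identical to the reference."""
    return file_digest(reference) == file_digest(rebuilt)
```

A match only shows that the same compiler produced the same output twice. If the compiler binary itself is subverted, both builds carry the same inserted code and the check still passes.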

Inadequate Solutions • Compiler binary files could be manually compared with their source code • Automating the comparison of compiler source to compiler binary • Compile source code with a second compiler • Receivers could require that they only receive source code • Programs can be written in interpreted languages

What is Trusting Trust? • When an attacker modifies one or more binaries so that the compilation process inserts different code than would be expected • Recompilation of the compiler still results in the reinsertion of malicious code • Original source code can be examined without finding the attack, and the compiler itself can be recompiled without removing the attack

Analysis of Threat – Attacker Motivation • Potential benefits of a “trusting trust” attack include: • Complete control of all systems compiled by the affected binary [and its descendants] • Backdoor passwords/logins that allow unlimited privileges on entire classes of systems • For a widely-used compiler, or one used to compile a widely-used program or operating system, an attack could result in global control over banks, financial markets, military systems, etc.

Analysis of Threat – Attacker Motivation • Such an attack requires: • Knowledge of compilers • The effort involved with creating an attack • Access to the compiler binary

Triggers, payloads, and non-discovery • A successful attack depends on three things: • Triggers • Payloads • Non-discovery

Triggers, payloads, and non-discovery • For a trusting trust attack to be valuable, there must be at least two triggers: • One that activates a malicious payload providing some direct value to the attacker • One that propagates the ability to attack into future versions of the compiler
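
The two triggers can be illustrated with a toy model. Everything here is an assumption made for illustration: "binaries" are plain strings, the source texts are invented, and `load` is a hypothetical stand-in for executing a compiler binary.

```python
from typing import Callable

LOGIN_SRC = "login: grant access only if the password matches"
COMPILER_SRC = "compiler: translate source faithfully"  # clean, reviewable source
OTHER_SRC = "hello: print a greeting"

def clean_compile(source: str) -> str:
    """An honest compiler: the output depends only on the source."""
    return f"BIN<{source}>"

def subverted_compile(source: str) -> str:
    """A subverted compiler binary with two triggers."""
    binary = clean_compile(source)
    if source == LOGIN_SRC:       # trigger 1: payload with direct value
        return binary + "+BACKDOOR"
    if source == COMPILER_SRC:    # trigger 2: propagate into the next compiler
        return binary + "+PROPAGATE"
    return binary                 # everything else compiles normally

def load(compiler_binary: str) -> Callable[[str], str]:
    """Toy stand-in for running a compiler binary."""
    return subverted_compile if "+PROPAGATE" in compiler_binary else clean_compile

gen1 = subverted_compile(COMPILER_SRC)   # attacker's binary rebuilds the compiler
gen2 = load(gen1)(COMPILER_SRC)          # rebuilding from clean source again
login_bin = load(gen2)(LOGIN_SRC)
other_bin = load(gen2)(OTHER_SRC)
```

In this sketch `gen2` still carries the propagation marker even though `COMPILER_SRC` is clean, `login_bin` carries the backdoor, and `other_bin` is indistinguishable from an honest build, which is what makes the attack hard to discover.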

Triggers, payloads, and non-discovery • The “fragility” of an attack is its susceptibility to failure • An attacker can counter fragility by incorporating many narrowly defined triggers and payloads • There may be enough vulnerabilities in the resulting system to allow an attacker to re-enter a compiler at will to add new triggers and payloads or modify existing ones

Diverse Double Compiling • To perform Diverse Double Compiling [DDC], you recompile a compiler’s source code twice: once with a second “trusted” compiler, and then again using the result of the first compilation. • Check if the final result exactly matches the original compiler binary • In order to perform DDC on a compiler, it must be able to self-regenerate

Diverse Double Compiling • Start by using a trusted compiler T to compile the source code SA of an untrusted compiler A, resulting in c(SA,T) • Next, use c(SA,T) to compile SA again, resulting in c(SA,c(SA,T)) • Finally, compare c(SA,c(SA,T)) with A • If they are identical, we can say that SA accurately reflects A • c(SA,T) should behave like A but need not be bit-identical to it, since T may generate different code
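
The two compilation stages and the final comparison can be sketched with a toy model. This is an illustration under stated assumptions, not Wheeler's implementation: a `Binary` is a hypothetical record whose `bits` depend on the code generator that produced it and on the source compiled, and `infected` models a hidden payload that no source describes.

```python
import hashlib
from dataclasses import dataclass

S_A = "source of compiler A"   # hypothetical source texts
S_T = "source of compiler T"

@dataclass(frozen=True)
class Binary:
    bits: str               # stand-in for the file contents
    implements: str         # source whose code generator this binary carries
    infected: bool = False  # hidden payload that appears in no source

def c(source: str, compiler: Binary) -> Binary:
    """c(source, compiler): the binary produced by running compiler on source."""
    # An infected compiler re-inserts its payload when it sees compiler A's source.
    infect = compiler.infected and source == S_A
    text = compiler.implements + "\0" + source + ("\0payload" if infect else "")
    return Binary(hashlib.sha256(text.encode()).hexdigest(), source, infect)

def ddc_matches(source_a: str, binary_a: Binary, trusted: Binary) -> bool:
    """Diverse double-compiling: compare c(SA, c(SA, T)) with A."""
    stage1 = c(source_a, trusted)   # c(SA, T)
    stage2 = c(source_a, stage1)    # c(SA, c(SA, T))
    return stage2.bits == binary_a.bits

T = Binary(bits="trusted-compiler-bits", implements=S_T)
honest_A = c(S_A, Binary("unused", S_A))                # A built by a clean copy of itself
evil_A = c(S_A, Binary("unused", S_A, infected=True))   # A built by a subverted copy

print(ddc_matches(S_A, honest_A, T))  # True
print(ddc_matches(S_A, evil_A, T))    # False
```

In this model the stage-one result `c(S_A, T)` has different bits from `honest_A` because T's code generator produced it, yet it behaves like A, so the stage-two result matches an honest A exactly.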

Justification • To justify the DDC technique, we must make some assumptions: • We must have a trusted compilation process T, comparer, and environments to perform all of the actions involved with DDC • T must have the same semantics for the same constructs as A • Information that affects the output of compilation must be semantically identical when generating c(SA,T), and c(SA,c(SA,T)) • The compiler defined by SA should be deterministic given only its inputs, and not use or write undefined values
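
The determinism assumption can be probed, though not proved, by compiling the same input several times and comparing the outputs. The two toy "compilers" below are hypothetical; the second embeds random bytes to mimic a build timestamp or an uninitialised value leaking into the output.

```python
import hashlib
import os
from typing import Callable

def deterministic_compile(source: str) -> bytes:
    """Toy compiler whose output depends only on its input."""
    return hashlib.sha256(source.encode()).digest()

def nondeterministic_compile(source: str) -> bytes:
    """Toy compiler that leaks 16 random bytes into its output."""
    return deterministic_compile(source) + os.urandom(16)

def is_repeatable(compile_fn: Callable[[str], bytes], source: str, runs: int = 3) -> bool:
    """True if every run produced identical output for the same source."""
    outputs = {compile_fn(source) for _ in range(runs)}
    return len(outputs) == 1
```

A compiler that fails this check cannot be verified by a bit-for-bit comparison until the source of the variation is removed or controlled. Passing it for a few runs is evidence of determinism, not a guarantee.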

Methods to increase diversity • Diversity in compiler implementation • Compiler T’s binary should come from a completely different implementation than that of compiler A • Diversity in time • Compiler T was developed long before compiler A, and they do not share a common implementation heritage • Diversity in environment • Compiler T could generate code for a different environment, and c(SA,T) could run on a different environment • Diversity in source code input • Use mutations of compiler A’s source code as the input to the first stage of DDC

Ramifications • The DDC technique has many strengths: it can be completely automated, it can be applied to any common language, and it does not require complex mathematical proofs • Unintentional defects in either compiler are also detected by the technique

Ramifications • DDC only shows that the source code corresponds to a given compiler’s binary [nothing is hidden in the binary] • The binary may still have errors or malevolent code; the DDC technique simply ensures that these errors and malevolent code can be found by examining the source code

Cited Works • David A. Wheeler, Countering Trusting Trust through Diverse Double-Compiling: http://www.acsa-admin.org/2005/papers/47.pdf

Questions?
