Interactive deobfuscation

Interactive deobfuscation: A thrift shop for static deobfuscation

whoami • Security researcher • Break stuff, reverse it, make it better and break it again • Part of the nullsec non-profit group

How it all started (blame this person =>) • Presumably a simple crackme • Eventually discovered to be whitebox (wb) AES • I wanted to solve it statically • Since running things is cheating • Goal was to solve it in less than a month • A race I did not manage to win when working statically

Name is md5’ed • Serial is transformed / permuted using an unknown function

Challenge archeology • Overall the crackme was divided into 2 main parts • Deobfuscation • Opaque predicates, lookup tables, value tables and “spaghetti” code • Cryptanalysis • The original cipher was whitebox’ed

Deobfuscation

Deobfuscation – Layer 0 • Found some jmps, decided to map them all • find_lookuptables(“mov <register>, dword ptr [addr*4]”) • Add xrefs, define locs • IDA can’t map them all into graph views (due to size, more RAM == bigger graph) • After looking a bit, there seems to be some logic and different operations inside them • However, they all lead to the same path eventually
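A minimal sketch of the pattern search above, run over a plain-text disassembly listing rather than inside IDA. The regexes, the listing format and this reimplementation of find_lookuptables are assumptions made for illustration; the original script is not shown in the slides.

```python
import re

# Scaled dword load, e.g. "mov eax, dword ptr [0x404000+ecx*4]".
TABLE_LOAD = re.compile(
    r"mov\s+(?P<reg>eax|ebx|ecx|edx|esi|edi|ebp|esp)\s*,\s*"
    r"dword\s+ptr\s*\[(?P<expr>[^\]]*\*4[^\]]*)\]",
    re.IGNORECASE,
)
HEX = re.compile(r"0x[0-9a-f]+", re.IGNORECASE)


def find_lookuptables(listing):
    """Return the sorted table base addresses used by scaled dword loads."""
    bases = set()
    for line in listing:
        match = TABLE_LOAD.search(line)
        if not match:
            continue
        base = HEX.search(match.group("expr"))
        if base:
            bases.add(int(base.group(0), 16))
    return sorted(bases)


# Hypothetical listing, only here to exercise the pattern.
listing = [
    "mov eax, dword ptr [0x404000+ecx*4]",
    "add eax, 5",
    "mov edx, dword ptr [ebx*4+0x405000]",
    "mov esi, dword ptr [esp+8]",
]
print([hex(a) for a in find_lookuptables(listing)])
```

Each base found this way is a candidate jmp table whose entries can then get xrefs and locs.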

Deobfuscation – Layer 1 • Removal of jmps and basic block (bb) identification • All the obfuscation was done in a manner that affects only the bb itself; after a jmp to another table occurred, everything was restored • Follow_jmps_by_addr(addr) to find bb boundaries • Follow jcc until a jmp / push + ret sequence is found • Compress it, remove jccs and make one BB • If there are xrefs, patch them together
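The walk above can be sketched on a toy instruction model. Everything here is an assumption for illustration: the program dictionary, the tuple layout, and the choice to drop every jcc by taking its fall-through (justified only by the earlier observation that all paths lead to the same place). It is not the original Follow_jmps_by_addr.

```python
def follow_jmps_by_addr(program, addr):
    """Collect one logical basic block starting at addr.

    program maps addr -> (mnemonic, operand, fallthrough_addr).
    A jmp with an int operand is a direct jmp; any other jmp is a
    table dispatch and ends the block, as does ret (push + ret).
    """
    block, seen = [], set()
    while addr in program and addr not in seen:
        seen.add(addr)
        mnem, operand, nxt = program[addr]
        if mnem == "jmp" and isinstance(operand, int):
            addr = operand          # direct jmp is only glue: follow it
            continue
        if mnem.startswith("j") and mnem != "jmp":
            addr = nxt              # jcc dropped (assumed not to matter)
            continue
        block.append((addr, mnem, operand))
        if mnem in ("jmp", "ret"):
            break                   # table dispatch or push + ret
        addr = nxt
    return block


# Hypothetical scattered code: three fragments glued by a jcc and a jmp.
program = {
    0x10: ("mov", "eax, 1", 0x12),
    0x12: ("jz", 0x40, 0x14),
    0x14: ("jmp", 0x30, None),
    0x30: ("xor", "eax, ebx", 0x32),
    0x32: ("jmp", "[0x404000+eax*4]", None),
}
bb = follow_jmps_by_addr(program, 0x10)
print([mnem for _, mnem, _ in bb])
```

The returned list is the compressed bb: the glue jmp and the jcc are gone and only the real work plus the final dispatch remain.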

Deobfuscation – Layer 2 • Opaque predicates • Ops which are used to make the bb bigger • Simple rule – operations are per bb and do not cross its boundaries • Wrote a simple emulator to emulate each bb and optimize it into simple instructions • 1 exception – do not touch lookup table values • More on this later
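A rough idea of what a per-bb emulator can do, shown as constant folding over a toy three-field instruction format. The format, the "load" pseudo-op for table reads and the function name are assumptions of this sketch; flags and memory writes are ignored, and the original emulator is not shown in the slides.

```python
MASK = 0xFFFFFFFF
OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "xor": lambda a, b: a ^ b,
}


def simplify_bb(block):
    """Fold constant arithmetic inside one bb; table loads pass through."""
    known, out = {}, []          # known: register -> constant value
    for mnem, dst, src in block:
        if mnem == "load":       # lookup-table read: value is never folded
            table, index = src
            src = (table, known.get(index, index))
        elif isinstance(src, str) and src in known:
            src = known[src]     # propagate a known constant
        if mnem == "mov" and isinstance(src, int):
            known[dst] = src     # remember it, emit once at the end
        elif mnem in OPS and dst in known and isinstance(src, int):
            known[dst] = OPS[mnem](known[dst], src)
        else:
            if mnem in OPS and dst in known:
                # dst is read by this op, so materialise its value first
                out.append(("mov", dst, known[dst]))
            known.pop(dst, None)
            out.append((mnem, dst, src))
    out.extend(("mov", reg, val) for reg, val in sorted(known.items()))
    return out


# Hypothetical bloated bb: junk arithmetic around one real table read.
bb = [
    ("mov", "eax", 0x1234),
    ("xor", "eax", 0x1234),
    ("add", "eax", 7),
    ("load", "ebx", ("T1", "ecx")),
    ("sub", "eax", 2),
    ("xor", "ebx", "eax"),
]
for ins in simplify_bb(bb):
    print(ins)
```

Six instructions shrink to three, and the table read survives untouched, which matches the one exception in the rule above.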

Deobfuscation – Layer 3 • Tables, and lots of them • Apart from the jmptables which lead the way • Tables are used as part of the cipher itself • Key is dismantled inside them (more on this later) • Each table has a different role and some are doubled for obfuscation • FindTables to the rescue

Deobfuscation – Layer 3 • FindTables basically taints memory and looks for reads of 16b tables • Once it finds one, it defines an array of 0xFF at that addr • All value tables are mapped this way; their usage, however, varies
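A very reduced sketch of the idea behind FindTables: collect the memory reads seen during emulation and keep the bases that are indexed with many different offsets. The trace format of (base, index) pairs, the threshold and the table size are all assumptions of this sketch, not details taken from the original tool.

```python
from collections import defaultdict


def find_tables(reads, min_hits=8, size=0x100):
    """Return {base: size} for bases read with at least min_hits distinct indices.

    reads is an iterable of (base, index) pairs taken from memory
    operands of the form [base + index].
    """
    seen = defaultdict(set)
    for base, index in reads:
        if 0 <= index < size:
            seen[base].add(index)
    return {base: size for base, hits in seen.items() if len(hits) >= min_hits}


# Hypothetical trace: one heavily indexed table and two stack reads.
reads = [(0x500000, (i * 17) % 256) for i in range(32)]
reads += [(0x7FFE0000, 4), (0x7FFE0000, 8)]
print({hex(base): size for base, size in find_tables(reads).items()})
```

Every base that survives the filter is a place to define an array in the database.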

Deobfuscation – Layer 4 • Once we have all the code cleaned we get several consecutive lookup tables • Loops are unrolled and become normal repetitive ops (per round and state) • All deobfuscated code was written into a new section called “deobf” to make code reading easier • It is now time to move on to the cryptanalysis stage

Cryptanal

Cryptanal • Automating every step is infeasible and too time consuming • I decided to split the work into two main stages: • Operation identification • Key extraction • Both are used interactively • Thus the name interactive deobfuscation

Cryptanal archeology • Discovered BGE attacks from academia • Chow, Xiao • sysk’s phrack article • Eventually said FUCK YOU ALL gonna do it myself w/o cryptic math • Lack of algebra lessons and focus

Cryptanal – Layer 0 • Actual wb code to encrypt a text • Loops 9 times, which made me quite frustrated • Before discovering it was wb’ed • After counting the loops by hand I thought it might be AES • But where’s the key? • LOLWTF? md5(user) == wbaes.dec(serial,user_as_key) • No, key must be *embedded* • LOLWUT? md5(user) == wbaes.d/enc(serial,key) ?? • Output isn’t ascii so it could be both enc/dec

Cryptanal – Rijndael on a toe • Several simple operations • AddRoundKey, SubBytes, ShiftRows, MixColumns • Some operations are linear and can swap positions with the op before them • The key to understanding the attack is to sniff the first round and extract the key • Later I found out that Eloi had made my life harder

rijndael

whitebox(rijndael) => evolves into =>

whitebox(rijndael) • 1st transformation: • ShiftRows is linear, and thus can swap op position with AddRoundKey (the round key gets shifted too) • SubBytes and ShiftRows can swap op position, as SubBytes applies the same op to every byte • Let “linear” aka lin mean • lin(x) ^ lin(y) == lin(x ^ y)
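The three claims on this slide can be checked in a few lines. This is a self-contained sketch: the S-box is rebuilt from its definition (inverse in GF(2^8) plus the affine map) so no table has to be pasted in, and the state is taken as 16 bytes in column-major order. The helper names are made up for the example.

```python
import random


def gmul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):      # XOR of the four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = make_sbox()


def sub_bytes(state):
    return [SBOX[b] for b in state]


def shift_rows(state):
    # Byte at (row r, column c) lives at index r + 4*c.
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def xor(a, b):
    return [p ^ q for p, q in zip(a, b)]


rng = random.Random(0)
x = [rng.randrange(256) for _ in range(16)]
k = [rng.randrange(256) for _ in range(16)]
zero = [0] * 16

# ShiftRows is linear, so it can move across the XOR with the round key.
print(shift_rows(xor(x, k)) == xor(shift_rows(x), shift_rows(k)))
# SubBytes works per byte, so it commutes with the byte shuffle.
print(sub_bytes(shift_rows(x)) == shift_rows(sub_bytes(x)))
# SubBytes itself is not linear: already fails on the all-zero state.
print(sub_bytes(xor(zero, zero)) == xor(sub_bytes(zero), sub_bytes(zero)))
```

The first two lines print True and the last one prints False, since SubBytes maps 0 to 0x63 while the XOR of two equal outputs is 0.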

2nd transformation • It is possible to transform and “compress” several ops into one • By using XOR tables and T-boxes / Tyi tables • T-box / Tyi table • Combine AddRoundKey and SubBytes into one operation (lookup table) that emits 1 byte • SubBytes(x ^ k[i]) • XOR table • Transform MixColumns into a series of lookup tables; in particular, these tables are created by XORing one input byte at a time through the MixColumns vector
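To make the MixColumns part concrete, here is a small sketch that turns one column of MixColumns into four 256-entry tables of 32-bit words whose outputs are XORed together. The table name TY and the packing order are choices made for this example; a real whitebox would additionally split the XOR itself into nibble-wide XOR tables.

```python
def gmul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


# MixColumns matrix, row by row.
MC = [
    [2, 3, 1, 1],
    [1, 2, 3, 1],
    [1, 1, 2, 3],
    [3, 1, 1, 2],
]


def mix_column(col):
    """Reference MixColumns on one 4-byte column."""
    out = []
    for r in range(4):
        acc = 0
        for j in range(4):
            acc ^= gmul(MC[r][j], col[j])
        out.append(acc)
    return out


def pack(b):
    return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]


# TY[j][x] is the contribution of input byte j with value x to the
# whole output column, packed as one 32-bit word.
TY = [[pack([gmul(MC[r][j], x) for r in range(4)]) for x in range(256)]
      for j in range(4)]


def mix_column_tables(col):
    """Same result as mix_column, using only lookups and XOR."""
    return TY[0][col[0]] ^ TY[1][col[1]] ^ TY[2][col[2]] ^ TY[3][col[3]]


col = [0xDB, 0x13, 0x53, 0x45]
print([hex(b) for b in mix_column(col)])
print(hex(mix_column_tables(col)))
```

Both lines describe the same column (8e 4d a1 bc), which shows that the multiplication by the MixColumns vector can be hidden entirely inside lookup tables.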

3rd transformation • Apply encodings to the keys and lookup tables • Replace table values with random ones at each stage • 41 => 32, 21 => 56, 12 => 4 • Let G & F be the encodings • G() o AES() o F() • Such that G & F cancel each other out eventually • The external encoding is what makes the whitebox variant “attack resistant”
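A toy model of how the random encodings cancel between consecutive tables while the outermost ones stay. The two plain tables T1 and T2, the bijection names F, G, H and the compose helper are invented for this sketch; they only stand in for real AES round tables.

```python
import random


def random_bijection(rng):
    """Return a random byte permutation and its inverse."""
    perm = list(range(256))
    rng.shuffle(perm)
    inv = [0] * 256
    for i, v in enumerate(perm):
        inv[v] = i
    return perm, inv


def compose(*tables):
    """compose(a, b, c)[x] == a[b[c[x]]]"""
    out = list(range(256))
    for table in reversed(tables):
        out = [table[v] for v in out]
    return out


rng = random.Random(1)
F, F_inv = random_bijection(rng)   # external input encoding
G, G_inv = random_bijection(rng)   # internal encoding between the tables
H, H_inv = random_bijection(rng)   # external output encoding

# Two stand-in plain tables (not real AES tables).
T1 = [x ^ 0x5A for x in range(256)]
T2 = [(x * 7 + 3) & 0xFF for x in range(256)]

# What ships in the binary: every table wrapped in random bijections.
enc1 = compose(G, T1, F_inv)
enc2 = compose(H, T2, G_inv)

# G cancels when the tables are chained; only F and H remain outside.
print(compose(enc2, enc1) == compose(H, T2, T1, F_inv))
# Whoever knows F and H gets the naked computation back.
print(compose(H_inv, enc2, enc1, F) == compose(T2, T1))
```

Looking at enc1 or enc2 alone shows only random-looking bytes, yet the chain still computes the original function up to the two outer encodings.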

Attaq

Attaq 101 • Chow stated that his implementation doesn’t leak any information • In reality the XOR tables and T-boxes / Tyi tables still leak one nibble each time • Not very helpful but still something • Since the external encodings cancel each other out, it might be worth understanding them • Hint hint

Attaq! • If we look at the input encoding and the output encoding, we know that they cancel each other out • Thus if we manage to find the values of the encodings, we are left with only a “naked” implementation of wbaes • And then we just sniff the first round and extract the key
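Why a “naked” table gives the key away can be shown on a single unencoded T-box. This sketch rebuilds the S-box from its definition and assumes the encodings are already stripped; the function names are invented for the example.

```python
def gmul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):      # XOR of the four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = make_sbox()
INV_SBOX = [0] * 256
for value, substituted in enumerate(SBOX):
    INV_SBOX[substituted] = value


def make_tbox(key_byte):
    """Unencoded T-box: AddRoundKey and SubBytes fused into one table."""
    return [SBOX[x ^ key_byte] for x in range(256)]


def sniff_key_byte(tbox):
    """T[0] == SBOX[k], so inverting the S-box on T[0] returns k."""
    return INV_SBOX[tbox[0]]


secret = 0x2B                      # hypothetical round key byte
tbox = make_tbox(secret)
print(hex(sniff_key_byte(tbox)))
```

One lookup per table is enough, which is why all the real protection sits in the encodings and not in the tables themselves.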

Cryptbox • Let’s look at MixColumns in the Tyi tables transformation • In general it transforms 32b values to 32b values • Let P be the input encoding and Q the output encoding
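One way to write the 32b to 32b mapping down, as a hedged sketch in the spirit of the BGE description. The indices and the symbols T_j (T-box of byte j) and mc_{i,j} (MixColumns coefficients) are notation added here and do not appear on the slides; P_j^{-1} means that the table first strips the encoding P_j that was applied to its input byte.

```latex
y_i(x_0, x_1, x_2, x_3)
  = Q_i\!\left( \bigoplus_{j=0}^{3} mc_{i,j} \cdot T_j\!\left( P_j^{-1}(x_j) \right) \right),
  \qquad i = 0, \dots, 3
```

Each output byte y_i depends on all four input bytes, and the only unknowns besides the key hidden in T_j are the byte encodings P_j and Q_i.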

Now let’s try to give an approximation of the encoding values • Billet suggests zeroing out two bits out of the 4, building a new lookup table and performing the transformation • Once we have that, we construct a new lookup table for the reversed operation

whitebox^whitebox • We get 256 possible bijections which can be used to build up output encoding approximations • The same operation is done on the input encoding, using the approximation we acquired for Q • Once we have the external encoding values we can just sniff the first round and extract the keys

FIN • @shiftreduce • shiftreduce@gmail.com • Thanks to Eloi for making this challenge • greetz @ #ecl, #nullsec, inbarr, nirizr, skier_, emdel, over, Mikae, l_inc
