Analysis Of Stripped Binary Code Laune Harris University of Wisconsin – Madison lharris@cs.wisc.edu www.paradyn.org
The need for binary analysis • Foundation of many applications • Binary modification, performance profiling, security, etc… • Provides program structure info • Modules, functions, control flow, data flow, etc…
Code Spectrum • All compiler info available • Managed runtime environments (e.g., Microsoft’s Vulcan, Intel’s ORP) • Some debugging info available • Object files (relocation info) • Shared libraries (exported symbols) • Partially stripped code • Minimum info • Fully stripped binaries
Binary code (with assembly)
856c : 55            push %ebp
856d : 89e5          mov %esp, %ebp
856f : 83ec08        sub 8, %esp
8572 : e8ddffffff    call 857d
857b : c9            leave
857c : c3            ret
857d : 55            push %ebp
857e : 89e5          mov %esp, %ebp
8581 : 83ec18        sub %eax, %ebp
858b : e8bfffffff    call 866c
8591 : c9            leave
8592 : c3            ret
Binary code (with symbol info)
main:
856c : 55            push %ebp
856d : 89e5          mov %esp, %ebp
856f : 83ec08        sub 8, %esp
8572 : e8ddffffff    call foo
857b : c9            leave
857c : c3            ret
foo:
857d : 55            push %ebp
857e : 89e5          mov %esp, %ebp
8581 : 83ec18        sub %eax, %ebp
858b : e8bfffffff    call printf
8591 : c9            leave
8592 : c3            ret
Why parse stripped binaries? • Lots of stripped code • Commercial applications (usually) • Proprietary libraries (often) • Malicious code • OS libraries and utilities (depends on OS and OS version)
AbiWord_d citesub dvilj4 gcc.bin FvwmCommand citesub-0.04 dvilj4l gcj X cjpeg dvilj6 gcjh XFree86 cksum dvipdfm gcov Xnestclearafscache dvips gdb Xprt clog2alog dvitomp gdbserver Xvfb clog2slog dvitype gdk-pixbuf-csourc a2p clog_print ebb gdk-pixbuf-query- a2ps cmake ebrowse gftodvi ab cmaketest editres gftopk access cmp egrep gftype acyclic co einitex gif2tiff addbib colorize elatex gij addresses comm emacs gimp afm2tfm composite emacs-21.2 gimp-1.2 animate conjure emacsclient gimp-remote appletviewer console.real epsffit gimp-remote-1.2 appres convcal epstool glib-genmarshal aspell convert eqn glxgears atobm counterfile eqn.broke glxinfo awk cplex escputil gnuclient b2m cpp escputil-1.2 gnuplot bash csplit etags gnuserv bc css-cat etex gobject-query bdftopcf ctags eview gp beforelight ctangle evim gp-2.1 bggen ctest evirtex gpg bib cut ex gpgsplit bibcheck cvs expand gpgv bibclean cweave expect gpr bibclean-2.11.4 cxpm expectk gpsfig bibindex dc extcheck gracebat biblex ddd f grap biblook debugsh fax2ps grconvert bibparse deroff fax2tiff grep bibtex detex fgrep grepjar bibunlex dga find grn bison diff finger gs bitmap diff3 fixnt gsc bltwish disdvi flac gsftopk bltwish24 dispatch_maya_ren flex gss-client bmtoa display flex++ gsview bunzip2 dist fmt gsx bzcat djpeg fold gtk-demo bzip2 dlpsh forw gtk-query-immodul bzip2recover dmp fslsfonts gunzip c++.bin dos2unix fstobdf gview calcinode dot ftp gvim cancel dot2gxl funzip gvimdiff cat dpsexec fvwm-root gxl2dot ccexample dpsinfo fvwm2 gzip ccmake dvdinfo g++.bin head ccomps dvi2tty g77.bin hinotes checkgid dvicopy gawk hpfilter chsh dvilj gawk-3.0.3 htdigest ci dvilj2p gc htpasswd httpd klist mimencode outocp i686-pc-linux-gnu kpasswd mkcfm ovf2ovp i686-pc-linux-gnu kpsestat mkfontdir ovp2ovf i686-pc-linux-gnu kpsewhich mkisofs pal2rgb i686-pc-linux-gnu krb524init mmencode paste ical ksu mogrify patgen ical-2.2 ktab money2qif pathof iceauth kvno montage pcitweak ico lacheck movemail pcmx ident lambda mpack pcv 
identify latex mpeg2audio pdfeinitex idlj lbxproxy mpeg2player pdfelatex imake lefty mpeg2video pdfetex imecho less mpeg_play pdfevirtex import lessecho mpost pdffonts inews lesskey mpto pdfimages info listrefs mred pdfinfo infokey listres mrsh pdfinitex inimf lndir msgs pdflatex inimpost lockfile msh pdftex iniomega logresolve mtv pdftoppm initex lookbib mtvp pdftops initpass lookup munpack pdftosrc install-datebook lp mutt pdftotext install-expenses lpq mzscheme pdfvirtex install-hinote lpquot nasm perl install-info lpquota native2ascii perl5.6.1 install-memo lpr ncftp pfb2pfa install-netsync lprm ncftpbatch pgpewrap install-todo lpstat ncftpbookmarks pgpring install-todos lsof ncftpget phbook install-user luit ncftpls php invert lynx ncftpput pi-address isapty macref ndisasm pi-csd jar mag neato pi-getram jarsigner mailto newer pi-getrom java main newpag pi-getromtoken javac make next pi-nredir javadoc makedepend nl pic javah makedev nntplist pico javap makeindex nop pilot jcf-dump makeinfo nroff pilot-addresses jdb makepsres oclock pilot-archive jikes makestrs octave pilot-clip join mayaClockServer octave-2.1.36 pilot-datebook jpegtran mayaServerTest od pilot-dedupe jpilot md5sum odvicopy pilot-file jpilot-dial memos odvitype pilot-foto jpilot-dump merge ofm2opl pilot-prc jpilot-sync metaflac omega pilot-schlep jv-scan metamail omfonts pilot-xfer k52token mf omshell pine.bin kdestroy mf-nowin opl2ofm pitclsh kermit mft orbd pk2bm keytool mfw otangle pkg-config kinit mhn otp2ocp pktogf pktype resize suidperl vdcomp pltotf revpath sum vftovp policytool rgb2ycbcr sxpm viamail pooltype rgview syncal view ppm2tiff rgvim tac viewres pr richtext tail vim prev richtoatk tangle vimdiff procmail rlm_dbm_cat tar virmf proxymngr rlm_dbm_parser tbl virmpost prune rlm_ippool_tool tcdialog viromega ps2pk rlog tclsh virtex ps4014 rlogin tclsh8.3 vptovf ps630 rman tcsh w3m psbook rmid telnet wc psc rn tex weave pscat rotatelogs texindex wish8.3 pscatmap rsh texteroids wmmon 
psdit rsync tftopl wmxmms psdraft runauth thumbnail word-list-compres psfax rview tie wrjpgcom psfig rvim tiff2bw wrl2ma pslpr savepag tiff2ps wvConvert psplot sc tiff2rgba wvRTF psresize scanpci tiffcmp wvSummary psselect sccmap tiffcp wvVersion pstops sclient tiffdither wvWare pswrap scp tiffdump x11perf ptx scqref tiffinfo xanim purecov screen tiffmedian xargs purify sdiff tiffset xauth pv serialver tiffsplit xcalc pxspread serv_p4 tnameserv xclipboard python servertool tnef xclock python2.3 sessreg tr xcmap quantify setxkbmap tr2tex xcmsdb radclient sftp tred xconsole radrelay sha1sum trn xcutsel radwho show trn-artchk xditview radzap showfont troff xdm ras2tiff showrgb tsort xdpyinfo rcp sim_client tstdvd xdvi.bin rcs slog_print ttf2afm xedit rcsclean slogin ttftool xev rcsdiff smbencrypt twm xeyes rcsmerge smproxy twopi xf86cfg rdjpgcom sort unexpand xf86config read-expenses sortbib unflatten xfd read-ical sperl5.6.1 uniq xfindproxy read-notepad spim unzip xfontsel read-palmpix split unzipsfx xfs read-todos splitmail uuclient xfsinfo readlink ssh uufilter xftcache refer ssh-add uwcachename xfwp reminders ssh-agent v5passwd xgamma repl ssh-keygen vacation xgc reset ssh-keyscan valgrind-listener xhost xinit xmag xsetpointer xvidtune xkbbell xman xsetroot xvinfo xkbcomp xmessage xsm xvpictoppm xkbcomp.bak xmgrace xspim xwd xkbevd xmh xspread xwininfo xkbprint xmms xstdcmap xwud xkbvleds xmodmap xterm xxd xkbwatch xpdf xtrapchar yap xkill xprop xtrapin ytalk xload xrandr xtrapinfo zcat xloadimage xrdb xtrapout zipcloak xlogo xrefresh xtrapproto zipinfo xlsatoms xset xtrapreset zipnote xlsclients xsetbg xtrapstats zipsplit xlsfonts xsetmode xv
Analysis • Full control flow analysis of binary • Interprocedural CFG (call graph) • Function start addresses • Intraprocedural CFG • Function basic blocks • Function size • Function entry and exit points
Call Graph creation
main:
856c: push %ebp
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call 857d
857b: leave
857c: ret
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp
Call Graph creation
main:
856c: push %ebp
856d: mov %esp, %ebp
856f: sub 8, %esp
8572: call func857d
857b: leave
857c: ret
func857d:
857d: push %ebp
857e: mov %esp, %ebp
8581: sub %eax, %ebp
858b: call 865e
8591: call 866d
8596: leave
8597: ret
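The call graph slides above grow the set of known functions outward from main: every direct call target becomes a new function start, named after its address (as in func857d). A minimal Python sketch of that worklist idea follows. The `decode` helper is a hypothetical toy that recognizes only the six IA-32 opcodes shown on the slides, and the byte image is made up so that the call displacement is self-consistent; neither is taken from the tool described in the talk.

```python
import struct


def decode(code, base, addr):
    """Toy decoder (assumption): returns (length, mnemonic, call target or None)."""
    off = addr - base
    op = code[off]
    if op == 0x55:
        return 1, "push %ebp", None
    if op == 0x89 and code[off + 1] == 0xE5:
        return 2, "mov %esp, %ebp", None
    if op == 0x83 and code[off + 1] == 0xEC:
        return 3, "sub imm8, %esp", None
    if op == 0xE8:  # call rel32: displacement is relative to the next instruction
        (rel,) = struct.unpack_from("<i", code, off + 1)
        return 5, "call", addr + 5 + rel
    if op == 0xC9:
        return 1, "leave", None
    if op == 0xC3:
        return 1, "ret", None
    raise ValueError(f"unsupported opcode {op:#x} at {addr:#x}")


def discover_functions(code, base, entry):
    """Worklist traversal: follow direct calls starting at `entry`."""
    call_graph = {}              # function start -> set of callee starts
    worklist = [entry]
    while worklist:
        start = worklist.pop()
        if start in call_graph:
            continue             # already parsed
        callees = call_graph.setdefault(start, set())
        addr = start
        while True:              # straight-line walk; this toy code has no branches
            length, mnemonic, target = decode(code, base, addr)
            if target is not None:
                callees.add(target)
                worklist.append(target)
            if mnemonic == "ret":
                break
            addr += length
    return call_graph


BASE = 0x856C
CODE = bytes.fromhex(
    "55" "89e5" "83ec08" "e802000000" "c9" "c3"   # entry function at 0x856c
    "55" "89e5" "83ec18" "c9" "c3"                # callee at 0x8579
)
graph = discover_functions(CODE, BASE, BASE)
for start, callees in sorted(graph.items()):
    names = ", ".join(f"func{c:x}" for c in sorted(callees)) or "(none)"
    print(f"func{start:x} calls {names}")
```

Only direct calls are followed here, so a function that is reached solely through a pointer never enters the worklist.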
Intraprocedural CFG creation • Disassemble a function’s code by traversing its intraprocedural control flow graph • The highest address reached determines the function size
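The traversal described above can be sketched in Python as follows. This is an illustration under stated assumptions: the `INSNS` table is a hypothetical, already-decoded instruction stream (address, size, kind, branch target), standing in for a real disassembler. The sketch follows fall-through and branch edges from the entry point, records where basic blocks begin, and derives the function size from the highest address reached.

```python
# Hypothetical pre-decoded instructions: address -> (size, kind, branch target).
# kind is "seq" (falls through), "jcc" (conditional branch), "jmp", or "ret".
INSNS = {
    0x1000: (1, "seq", None),
    0x1001: (2, "jcc", 0x1008),   # two successors: target and fall-through
    0x1003: (3, "seq", None),
    0x1006: (2, "jmp", 0x1009),
    0x1008: (1, "seq", None),
    0x1009: (1, "ret", None),
    0x100A: (1, "seq", None),     # not reachable from 0x1000
    0x100B: (1, "ret", None),
}


def successors(addr):
    """Intraprocedural successors of the instruction at `addr`."""
    size, kind, target = INSNS[addr]
    if kind == "seq":
        return [addr + size]
    if kind == "jcc":
        return [target, addr + size]
    if kind == "jmp":
        return [target]
    return []                     # "ret" ends the path


def traverse(entry):
    """Return (reached addresses, basic block leaders, function size in bytes)."""
    seen, leaders, worklist = set(), {entry}, [entry]
    while worklist:
        addr = worklist.pop()
        if addr in seen:
            continue
        seen.add(addr)
        size, kind, target = INSNS[addr]
        if kind in ("jcc", "jmp"):
            leaders.add(target)           # a branch target starts a block
        if kind == "jcc":
            leaders.add(addr + size)      # so does the fall-through of a jcc
        worklist.extend(successors(addr))
    end = max(a + INSNS[a][0] for a in seen)   # one past the highest byte reached
    return seen, leaders, end - entry


reached, leaders, size = traverse(0x1000)
print(sorted(hex(a) for a in leaders), size)
```

Because only reachable instructions are decoded, the bytes at 0x100A and 0x100B are left out of the function rather than being swept in linearly.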
Challenge: Find all functions • Some functions are only called indirectly • Problem: static call graph traversal does not discover these functions • Solution: examine gaps in text space and use heuristics to find functions
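One way to picture the gap-scanning step is sketched below in Python. The slide does not say which heuristics are used, so matching the common IA-32 prologue bytes `55 89 e5` (push %ebp; mov %esp, %ebp) is an assumption chosen for illustration, as are the helper names and the sample image.

```python
PROLOGUE = bytes.fromhex("5589e5")   # push %ebp ; mov %esp, %ebp (assumed heuristic)


def find_gaps(text_start, text_end, functions):
    """`functions` is a list of (start, size) pairs already found by traversal."""
    gaps, cursor = [], text_start
    for start, size in sorted(functions):
        if start > cursor:
            gaps.append((cursor, start))     # bytes no known function covers
        cursor = max(cursor, start + size)
    if cursor < text_end:
        gaps.append((cursor, text_end))
    return gaps


def candidates_in_gaps(code, base, functions):
    """Addresses inside gaps where the prologue pattern occurs."""
    found = []
    for lo, hi in find_gaps(base, base + len(code), functions):
        pos = code.find(PROLOGUE, lo - base, hi - base)
        while pos != -1:
            found.append(base + pos)
            pos = code.find(PROLOGUE, pos + 1, hi - base)
    return found


BASE = 0x8000
CODE = bytes.fromhex(
    "5589e5c9c3"          # known function at 0x8000
    "909090"              # padding
    "5589e583ec08c9c3"    # function only called indirectly, at 0x8008
    "5589e5c9c3"          # known function at 0x8010
)
known = [(0x8000, 5), (0x8010, 5)]
candidates = candidates_in_gaps(CODE, BASE, known)
print([hex(a) for a in candidates])
```

A pattern match is only a candidate: the same bytes can occur in data, so a real tool would still need to check that the candidate parses as plausible code.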
Challenge: Find all basic blocks • Indirect Jumps • Problem: need to find targets to complete CFG • Solution: parse jump tables to find possible targets
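As a rough illustration of jump table parsing, the Python sketch below reads table entries until one stops looking like a code address. It assumes a table of absolute 32-bit little-endian addresses and a caller-supplied text range; the function name, the entry format, and the stop rule are all assumptions for this example, and a real parser would normally also recover the table bound from the comparison that guards the indirect jump.

```python
import struct


def read_jump_table(image, image_base, table_addr, text_lo, text_hi, max_entries=256):
    """Collect consecutive entries that point into [text_lo, text_hi)."""
    targets = []
    off = table_addr - image_base
    while len(targets) < max_entries and off + 4 <= len(image):
        (entry,) = struct.unpack_from("<I", image, off)
        if not (text_lo <= entry < text_hi):
            break                     # first non-code value ends the table
        targets.append(entry)
        off += 4
    return targets


IMAGE_BASE = 0x8000
# 0x100 bytes standing in for code, followed by a three-entry table and a zero word.
IMAGE = bytes(0x100) + struct.pack("<4I", 0x8010, 0x8020, 0x8030, 0)
targets = read_jump_table(IMAGE, IMAGE_BASE, 0x8100, 0x8000, 0x8100)
print([hex(t) for t in targets])
```

Each recovered target becomes a successor of the indirect jump, which lets the intraprocedural traversal continue into blocks it would otherwise miss.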
Challenge: Identify CFG exits • CFG exit points are sometimes hard to identify • Assume branches that are not obvious exits are intra-procedural • Errors result in overestimation of function size • Overlapping functions indicate error
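The overlap check mentioned above can be sketched in a few lines of Python. The function names and extents are invented for illustration: `func857d` has been given an overestimated size, as would happen if an exit branch were mistaken for an intraprocedural one, so it runs into the function that starts at 0x8590.

```python
def find_overlaps(functions):
    """`functions` maps a name to (start, size). Returns overlapping name pairs."""
    ordered = sorted(functions.items(), key=lambda kv: kv[1][0])
    overlaps = []
    for i, (name, (start, size)) in enumerate(ordered):
        end = start + size
        for other, (ostart, _) in ordered[i + 1:]:
            if ostart >= end:
                break                 # sorted by start, so no later overlap either
            overlaps.append((name, other))
    return overlaps


# Hypothetical parse results.
parsed = {
    "main": (0x856C, 0x11),       # ends exactly where func857d begins
    "func857d": (0x857D, 0x40),   # overestimated: extends to 0x85bd
    "func8590": (0x8590, 0x10),
}
overlaps = find_overlaps(parsed)
print(overlaps)
```

A reported pair does not say which function is wrong, only that at least one exit was misclassified and the pair should be re-examined.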
Problems and Solutions cont’d • Exception handling code • Problem: creates code blocks that appear unreachable • Solution: get block addresses from exception table
Status • Implemented on x86, Power • Currently used for instrumentation and analysis
Future Work • Develop more accurate heuristics to identify code in unlit areas of the binary • Incorporate data flow analyses • Port to other platforms • Support unconventional function constructs • More comparisons with other tools • Extend for use in other domains (e.g., security)