Analysis of Security Concerns in Very Simple Processor Designs and Potential Exploits of Current Commercial Processor Errata
Patrick Quinn, Matt Blythe
EECS 578, Winter 2008 Final Project
Hardware Security Bugs?!
• What is a hardware security bug?
  • Famous bug (Intel’s F00F)
• Why do they matter?
  • Intel’s Core2 line currently has 54 published outstanding errata*, of which 10 seem to be potential security bugs
  • If a security bug can be hidden by a designer, bad things can happen!
*http://download.intel.com/design/processor/specupdt/31873303.pdf
What We Can Do
• Create obscure bugs in simple designs
  • Formal verification might not catch a bug if it does not affect specified functionality
  • If the bug looks unintentional, the designer can get away cleanly when it does come out
• Exploit bugs in currently available commercial processors
Part 1: Inserting malicious bugs
• Adding the bug must change a minimal amount of RTL code
• Any changes should not make the bug obvious
• It should be possible to create malicious software that is useful on non-buggy hardware
• Inserted bugs should not change functionality
Register 31 Bug in Alpha Core
• R31 is the zero register
• Bug allows attacker to write an arbitrary value to the register
• Used to hide functionality in assembly
• Can hide a memory offset to a location outside of main memory (e.g., memory-mapped I/O)
• Implemented via unspecified instruction
R31 Bug in the Decoder

With the bug inserted:

    …
    6'h08, 6'h20, 6'h28:
      begin
        opa_select = `ALU_OPA_IS_MEM_DISP;
        opb_select = `ALU_OPB_IS_REGB;
        alu_func = `ALU_ADDQ;
        dest_reg = `DEST_IS_REGA;
        case (inst[31:26])
          `LDA_INST: /* defaults are OK */;
          `LDAH_INST: dest_reg = `DEST_IS_REGB;  // bug: this entry is absent from the original
          `LDQ_INST:
            begin
              rd_mem = `TRUE;
              dest_reg = `DEST_IS_REGA;
              func_unit = 4'b1000;
            end // case: `LDQ_INST
          `STQ_INST:
            begin
              wr_mem = `TRUE;
              dest_reg = `DEST_NONE;
              func_unit = 4'b1000;
            end // case: `STQ_INST
          default: illegal = `TRUE;
        endcase // case(inst[31:26])
      end
    …
            default:
              begin
                cond_branch = `TRUE; // all others are conditional
                func_unit = 4'b0001;
              end
          endcase // case(inst[31:26])
        end
      endcase // case(inst[31:29] << 3)
    end // if(~valid_inst_in)
    case (dest_reg)
      `DEST_IS_REGC: dest_reg_idx = rc_idx;
      `DEST_IS_REGA: dest_reg_idx = ra_idx;
      `DEST_IS_REGB: dest_reg_idx = 6'h3F;       // bug: 6-bit index whose low 5 bits are 31
      `DEST_NONE:    dest_reg_idx = `ZERO_REG;
      default:       dest_reg_idx = `ZERO_REG;
    endcase
    end // always
    endmodule // decoder

Original:

    …
    6'h08, 6'h20, 6'h28:
      begin
        opa_select = `ALU_OPA_IS_MEM_DISP;
        opb_select = `ALU_OPB_IS_REGB;
        alu_func = `ALU_ADDQ;
        dest_reg = `DEST_IS_REGA;
        case (inst[31:26])
          `LDA_INST: /* defaults are OK */;
          `LDQ_INST:
            begin
              rd_mem = `TRUE;
              dest_reg = `DEST_IS_REGA;
              func_unit = 4'b1000;
            end // case: `LDQ_INST
          `STQ_INST:
            begin
              wr_mem = `TRUE;
              dest_reg = `DEST_NONE;
              func_unit = 4'b1000;
            end // case: `STQ_INST
          default: illegal = `TRUE;
        endcase // case(inst[31:26])
      end
    …
            default:
              begin
                cond_branch = `TRUE; // all others are conditional
                func_unit = 4'b0001;
              end
          endcase // case(inst[31:26])
        end
      endcase // case(inst[31:29] << 3)
    end // if(~valid_inst_in)
    case (dest_reg)
      `DEST_IS_REGC: dest_reg_idx = rc_idx;
      `DEST_IS_REGA: dest_reg_idx = ra_idx;
      `DEST_NONE:    dest_reg_idx = `ZERO_REG;
      default:       dest_reg_idx = `ZERO_REG;
    endcase
    end // always
    endmodule // decoder
R31 Bug in the Register File

With the bug inserted:

    module ARF( //inputs
                reset, clock, wr_en, wr_reg, wr_data, rd1_reg, rd2_reg,
                // outputs
                rd1_data, rd2_data );
      input reset;
      input clock;
      input wr_en;
      input [5:0] wr_reg;    // bug: 6 bits wide (the original is [4:0])
      input [63:0] wr_data;
      input [4:0] rd1_reg;
      input [4:0] rd2_reg;
      // outputs
      output [63:0] rd1_data;
      output [63:0] rd2_data;

      reg [63:0] data [31:0];

      assign rd1_data=((rd1_reg==wr_reg)&wr_en&~(wr_reg==31))?wr_data:data[rd1_reg];
      assign rd2_data=((rd2_reg==wr_reg)&wr_en&~(wr_reg==31))?wr_data:data[rd2_reg];

      integer i;
      always @(posedge clock)
      begin
        if (reset)
        begin
          for (i = 0; i < 32; i = i+1)
            data[i] = 0;
        end
        else if (wr_en && ~(wr_reg == 31))
          data[wr_reg[4:0]] = wr_data;  // bug: 6'h3F passes the check above and indexes entry 31
      end
    endmodule

Original:

    module ARF( //inputs
                reset, clock, wr_en, wr_reg, wr_data, rd1_reg, rd2_reg,
                // outputs
                rd1_data, rd2_data );
      input reset;
      input clock;
      input wr_en;
      input [4:0] wr_reg;
      input [63:0] wr_data;
      input [4:0] rd1_reg;
      input [4:0] rd2_reg;
      // outputs
      output [63:0] rd1_data;
      output [63:0] rd2_data;

      reg [63:0] data [31:0];

      assign rd1_data=((rd1_reg==wr_reg)&wr_en&~(wr_reg==31))?wr_data:data[rd1_reg];
      assign rd2_data=((rd2_reg==wr_reg)&wr_en&~(wr_reg==31))?wr_data:data[rd2_reg];

      integer i;
      always @(posedge clock)
      begin
        if (reset)
        begin
          for (i = 0; i < 32; i = i+1)
            data[i] = 0;
        end
        else if (wr_en && ~(wr_reg == 31))
          data[wr_reg] = wr_data;
      end
    endmodule
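The interaction between the widened write port and the zero-register guard can be sketched with a small behavioural model. This is an illustration in Python rather than the project's RTL: the class name `RegFile` and its `wr_reg_bits` parameter are made up for this sketch, and the same-cycle forwarding path on the read ports is left out.

```python
ZERO_REG = 31  # R31 is the Alpha zero register


class RegFile:
    """Hypothetical behavioural model of the ARF write path (not the RTL)."""

    def __init__(self, wr_reg_bits):
        # Width of the wr_reg port: 5 in the original, 6 in the buggy design.
        self.wr_reg_bits = wr_reg_bits
        self.data = [0] * 32

    def write(self, wr_reg, value):
        # The port truncates the incoming index to its declared width.
        wr_reg &= (1 << self.wr_reg_bits) - 1
        # Guard from the RTL: writes are dropped only when wr_reg == 31.
        if wr_reg != ZERO_REG:
            # The storage array is indexed with the low 5 bits only.
            self.data[wr_reg & 0x1F] = value

    def read(self, rd_reg):
        return self.data[rd_reg & 0x1F]


if __name__ == "__main__":
    buggy = RegFile(wr_reg_bits=6)
    buggy.write(0x3F, 0x0800)      # index the buggy decoder emits for LDAH
    print(hex(buggy.read(31)))     # R31 now holds the attacker's value

    original = RegFile(wr_reg_bits=5)
    original.write(0x3F, 0x0800)   # truncates to 31, so the guard drops it
    print(hex(original.read(31)))  # R31 stays zero
```

With a 6-bit port the index 0x3F is not equal to 31, so the guard lets the write through, yet its low five bits still select entry 31. Ordinary writes to R31 (index 31) are dropped in both versions, which is why the change does not disturb specified behaviour.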
Sample Code to Exploit R31

The following code generates all even numbers between 0 and 15 and stores them to a specified location in memory. When run on our buggy processor, r31 is written, which changes the memory offset.

            data = 0x1000
            data2 = 0x0800
            ldah $r1,data2        // store 0x0800 to $r31 in the buggy processor
            lda $r3,data          // load the initial memory offset to $r3
            lda $r2,0
            lda $r5,0x8
            addq $r1,$r1,$r1      // noop to allow ldah to commit
            addq $r3,$r31,$r3     // introduces buggy $r31 value to memory offset
    loop1:  blbs $r2,loop2        // if $r2 is odd, don’t store
            stq $r2,0($r3)
            addq $r3,$r5,$r3      // increment memory offset
    loop2:  addq $r2,0x1,$r2      // increment $r2
            cmple $r2,0xf,$r1     // $r1 = 1 while $r2 <= 15
            bne $r1,loop1
            call_pal 0x555        // halt
Results of Previous Code

Buggy processor:

    @@@ mem[    0] = 207f1000243f0800 : 2341607924050102272
    @@@ mem[    8] = 20bf0008205f0000 : 2359604764668264448
    @@@ mem[   16] = 407f040340210401 : 4647437752476959745
    @@@ mem[   24] = b4430000f0400002 : 12989225754297368578
    @@@ mem[   32] = 4040340240650403 : 4629757601211810819
    @@@ mem[   40] = f43ffffa4041fda1 : 17600067319072161185
    @@@ mem[   48] = 0000000000000555 : 1365
    @@@
    @@@ mem[ 6152] = 0000000000000002 : 2
    @@@ mem[ 6160] = 0000000000000004 : 4
    @@@ mem[ 6168] = 0000000000000006 : 6
    @@@ mem[ 6176] = 0000000000000008 : 8
    @@@ mem[ 6184] = 000000000000000a : 10
    @@@ mem[ 6192] = 000000000000000c : 12
    @@@ mem[ 6200] = 000000000000000e : 14

Non-buggy processor:

    @@@ mem[    0] = 207f1000243f0800 : 2341607924050102272
    @@@ mem[    8] = 20bf0008205f0000 : 2359604764668264448
    @@@ mem[   16] = 407f040340210401 : 4647437752476959745
    @@@ mem[   24] = b4430000f0400002 : 12989225754297368578
    @@@ mem[   32] = 4040340240650403 : 4629757601211810819
    @@@ mem[   40] = f43ffffa4041fda1 : 17600067319072161185
    @@@ mem[   48] = 0000000000000555 : 1365
    @@@
    @@@ mem[ 4104] = 0000000000000002 : 2
    @@@ mem[ 4112] = 0000000000000004 : 4
    @@@ mem[ 4120] = 0000000000000006 : 6
    @@@ mem[ 4128] = 0000000000000008 : 8
    @@@ mem[ 4136] = 000000000000000a : 10
    @@@ mem[ 4144] = 000000000000000c : 12
    @@@ mem[ 4152] = 000000000000000e : 14
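The two dumps differ only in the store addresses, which are shifted by 0x0800 on the buggy processor. As a cross-check, the loop from the sample program can be replayed in a few lines of Python. This is a sketch of the program's arithmetic only: `run_sample` is a name invented here, and the value of R31 is passed in rather than produced by a simulated pipeline.

```python
def run_sample(r31):
    """Replay the sample loop and return {address: value} for every stq.

    r31 is the value the program reads from $r31: 0 on the original
    processor, 0x0800 on the buggy one (an input to this sketch).
    """
    mem = {}
    r3 = 0x1000 + r31          # lda $r3,data ; addq $r3,$r31,$r3
    r2 = 0
    while True:
        if r2 % 2 == 0:        # blbs skips the store when $r2 is odd
            mem[r3] = r2       # stq $r2,0($r3)
            r3 += 8            # addq $r3,$r5,$r3 with $r5 = 8
        r2 += 1
        if not r2 <= 0xF:      # cmple / bne: loop while $r2 <= 15
            break
    return mem


if __name__ == "__main__":
    for label, r31 in (("buggy", 0x0800), ("non-buggy", 0)):
        stores = run_sample(r31)
        # The dumps above list only non-zero words, so skip the stored 0.
        print(label, [(addr, val) for addr, val in sorted(stores.items()) if val])
```

The value 0 is also stored, at 6144 or 4096 respectively, but it does not appear among the non-zero words in the dumps.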
Duplicate Store Bug in Alpha Core
• When activated, the bug duplicates the next store to a specified address
• Bug allows attacker to export important data from the processor
• For instance, it can be used to write out cryptographic private keys
• Activated via unspecified instruction
DS Bug in the Decoder

With the bug inserted:

    …
    6'h08, 6'h20, 6'h28:
      begin
        opa_select = `ALU_OPA_IS_MEM_DISP;
        opb_select = `ALU_OPB_IS_REGB;
        alu_func = `ALU_ADDQ;
        dest_reg = `DEST_IS_REGA;
        case (inst[31:26])
          `LDA_INST: /* defaults are OK */;
          `LDAH_INST:
            begin
              rd_mem = `TRUE;
              wr_mem = `TRUE;
            end
          `LDQ_INST:
            begin
              rd_mem = `TRUE;
              dest_reg = `DEST_IS_REGA;
              func_unit = 4'b1000;
            end // case: `LDQ_INST
          `STQ_INST:
            begin
              wr_mem = `TRUE;
              dest_reg = `DEST_NONE;
              func_unit = 4'b1000;
            end // case: `STQ_INST
          default: illegal = `TRUE;
        endcase // case(inst[31:26])
      end

Original:

    …
    6'h08, 6'h20, 6'h28:
      begin
        opa_select = `ALU_OPA_IS_MEM_DISP;
        opb_select = `ALU_OPB_IS_REGB;
        alu_func = `ALU_ADDQ;
        dest_reg = `DEST_IS_REGA;
        case (inst[31:26])
          `LDA_INST: /* defaults are OK */;
          `LDQ_INST:
            begin
              rd_mem = `TRUE;
              dest_reg = `DEST_IS_REGA;
              func_unit = 4'b1000;
            end // case: `LDQ_INST
          `STQ_INST:
            begin
              wr_mem = `TRUE;
              dest_reg = `DEST_NONE;
              func_unit = 4'b1000;
            end // case: `STQ_INST
          default: illegal = `TRUE;
        endcase // case(inst[31:26])
      end
DS Bug in the Load/Store Queue

    assign MEM_init_out = valid_entry_out[head] & RA_valid_out & RB_valid_out
                          & ~data_in_RA_out & (~write_out | (destROB_out==head_in));
    ...
    always @*
    begin
      squash_all=0;
      next_wrap=wrap;
      if(rwbar_in & write_in)
        next_wrap=1;
      if(issueEN_in&~full&~rwbar_in)
        next_wrap=0;
      next_tail=tail;
      next_head=head;
      output_en=0;
      if(valid_entry_out[head])
        output_en[head]=1;
      new_en=0;
      new_en[tail]=issueEN_in&(~full);
      if(issueEN_in&~full)
        next_tail=next_tail+1;
      if(issueEN_in&~full&wrap&~rwbar_in)
      begin
        new_en[next_tail]=1;
        output_en[next_tail]=1;
        next_tail=next_tail+1;
      end
      ...
    end

    if(new_en)
    begin
      next_RA_value = RA_value_in;
      next_RA_valid = RA_valid_in;
      next_RB_value = RB_value_in;
      next_RB_valid = RB_valid_in;
      next_destROB = destROB_in;
      next_mem_disp = mem_disp_in;
      next_rwbar = rwbar_in;
      next_write = ~rwbar_in;
      next_data_in_RA = 0;
      next_valid_entry_out = 1;
      if(~RA_valid_in & (RA_value_in==current_addr_CDB) & valid_addr_last_CDB)
      begin
        next_RA_value = data_CDB;
        next_RA_valid = 1;
      end
      if(~RB_valid_in & (RB_value_in==current_addr_CDB) & valid_addr_last_CDB)
      begin
        next_RB_value = data_CDB;
        next_RB_valid = 1;
      end
      if(output_en)
      begin
        next_RB_value = 64'h12340;   // hard-wired address for the duplicated store
        next_RB_valid = 1;
        next_mem_disp = 0;
        next_destROB = destROB_in+1;
        next_write = 0;
      end
    end
Sample Code to Exploit DS Bug

This program generates multiples of 10, stores them to a region of memory, then copies (loads then stores) the values to a different part of memory. When run on our buggy processor, the copied values also get written to a specified location in memory.

            data = 0x1000
            lda $r5,0
            lda $r1,data          // load the memory location of the data
    loop:   mulq $r5,0x0a,$r2     // generate the multiple of 10
            stq $r2,0($r1)        // store the value out (the original value)
            ldq $r3,0($r1)        // load that stored value
            ldah $r6,0            // unimplemented instruction...triggers bug
            stq $r3,0x100($r1)    // store the copied value out (duplicated)
            addq $r1,0x8,$r1
            addq $r5,0x1,$r5      // increment values
            cmple $r5,0xf,$r4
            bne $r4,loop          // repeat until $r5 > 15
            call_pal 0x555        // halt
Results of Previous Code

Buggy processor:

    ...
    @@@ mem[ 4360] = 000000000000000a : 10
    @@@ mem[ 4368] = 0000000000000014 : 20
    @@@ mem[ 4376] = 000000000000001e : 30
    @@@ mem[ 4384] = 0000000000000028 : 40
    @@@ mem[ 4392] = 0000000000000032 : 50
    @@@ mem[ 4400] = 000000000000003c : 60
    @@@ mem[ 4408] = 0000000000000046 : 70
    @@@ mem[ 4416] = 0000000000000050 : 80
    @@@ mem[ 4424] = 000000000000005a : 90
    @@@ mem[ 4432] = 0000000000000064 : 100
    @@@ mem[ 4440] = 000000000000006e : 110
    @@@ mem[ 4448] = 0000000000000078 : 120
    @@@ mem[ 4456] = 0000000000000082 : 130
    @@@ mem[ 4464] = 000000000000008c : 140
    @@@ mem[ 4472] = 0000000000000096 : 150
    @@@
    @@@ mem[74560] = 0000000000000096 : 150

Non-buggy processor:

    ...
    @@@ mem[ 4360] = 000000000000000a : 10
    @@@ mem[ 4368] = 0000000000000014 : 20
    @@@ mem[ 4376] = 000000000000001e : 30
    @@@ mem[ 4384] = 0000000000000028 : 40
    @@@ mem[ 4392] = 0000000000000032 : 50
    @@@ mem[ 4400] = 000000000000003c : 60
    @@@ mem[ 4408] = 0000000000000046 : 70
    @@@ mem[ 4416] = 0000000000000050 : 80
    @@@ mem[ 4424] = 000000000000005a : 90
    @@@ mem[ 4432] = 0000000000000064 : 100
    @@@ mem[ 4440] = 000000000000006e : 110
    @@@ mem[ 4448] = 0000000000000078 : 120
    @@@ mem[ 4456] = 0000000000000082 : 130
    @@@ mem[ 4464] = 000000000000008c : 140
    @@@ mem[ 4472] = 0000000000000096 : 150
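The only difference between the two dumps is the extra word at address 74560, which is 0x12340, the constant in the modified load/store queue. A rough behavioural replay of the copy loop shows why only the last value (150) remains there: every duplicated store lands on the same address. The sketch below is an assumption-laden Python model, not the queue logic itself. `run_copy` and `DUP_ADDR` are names invented here, and the trigger is modelled as "the store right after `ldah` is duplicated".

```python
DUP_ADDR = 0x12340  # address constant from the modified load/store queue


def run_copy(buggy):
    """Replay the copy loop; return the memory image as {address: value}."""
    mem = {}
    r1 = 0x1000                      # lda $r1,data
    for r5 in range(16):             # loop runs while $r5 <= 15
        r2 = r5 * 10                 # mulq $r5,0x0a,$r2
        mem[r1] = r2                 # stq $r2,0($r1)
        r3 = mem[r1]                 # ldq $r3,0($r1)
        armed = buggy                # ldah $r6,0 arms the bug (buggy core only)
        mem[r1 + 0x100] = r3         # stq $r3,0x100($r1)
        if armed:
            mem[DUP_ADDR] = r3       # duplicated store, always to DUP_ADDR
        r1 += 8                      # addq $r1,0x8,$r1
    return mem


if __name__ == "__main__":
    print(DUP_ADDR)                         # decimal form used in the dump
    print(run_copy(True).get(DUP_ADDR))     # last copied value
    print(run_copy(False).get(DUP_ADDR))    # None: never written
```

An attacker who wanted every value rather than the last one would need the leak address to vary, which this fixed-constant version does not do.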
Part 2: Exploiting Commercial Bugs
• Exploits must be run from user code
• Any result is a good result
  • System crash
  • Privilege escalation
  • Data corruption
  • Singularity
Exploiting Memory Aliasing Bugs:
• Problem occurs when two pages that refer to the same memory have conflicting traits
• mmap() in Linux allows a user to access the same memory in two processes
• The fine print says that mmap() is written to only allow one process access at any time
• Therefore, this bug would have to be in OS code
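As an illustration of the aliasing setup described above, the following Python sketch maps the same file-backed page twice with different protections (one read-only, one writable) and confirms that both views see the same bytes. It is a simplification and not the project's test program: it uses two mappings in one process rather than two processes, the function name `alias_same_page` is made up, and the only "trait" that differs is the access protection. Traits such as cacheability are generally not something unprivileged code can set, which is consistent with the conclusion that OS cooperation would be needed.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def alias_same_page():
    """Map one file-backed page twice: once writable, once read-only."""
    with tempfile.TemporaryFile() as f:
        f.write(b"\0" * PAGE)   # the file must be at least one page long
        f.flush()
        rw = mmap.mmap(f.fileno(), PAGE, access=mmap.ACCESS_WRITE)
        ro = mmap.mmap(f.fileno(), PAGE, access=mmap.ACCESS_READ)
        try:
            rw[:5] = b"hello"            # write through the writable alias
            seen = bytes(ro[:5])         # read back through the read-only alias
            try:
                ro[0] = 0                # the read-only alias must refuse writes
                ro_writable = True
            except TypeError:
                ro_writable = False
            return seen, ro_writable
        finally:
            ro.close()
            rw.close()


if __name__ == "__main__":
    print(alias_same_page())
```

Both views stay coherent here because they are shared mappings of the same file; nothing in this sketch exercises a processor erratum.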
Exploiting Memory Aliasing Bugs: [Diagram: Process 1 and Process 2, each with mmap()-ed regions]
Exploiting self-modifying code bugs:
• Pentium II
  • If code is modified after it has been pre-fetched, it could cause a system hang or crash
  • Tried lots of variations (page alignment, space between instructions) with no result
• Opteron
  • Bug was fixed in our chip’s revision
Exploiting self-modifying code bugs: [Diagram: process memory layout with Stack and Text & Data regions; fn(*addr)]
Exploiting Address Boundary Bug:
• In Opteron, if code sequentially executes across the 32-bit boundary (between 0xFFFF FFFF and 0x1 0000 0000) it should crash
• Linux refused to allocate memory in that range
  • OS workaround for known errata?
• Crashed the machine anyway (too much memory)
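The boundary in question is the first address that no longer fits in 32 bits. A short Python check makes the condition concrete. This is arithmetic only: `straddles_boundary` is a hypothetical helper, and the sketch does not attempt to place code at those addresses.

```python
BOUNDARY = 1 << 32  # 0x1_0000_0000, one past 0xFFFF_FFFF


def straddles_boundary(start, length):
    """True if the byte range [start, start + length) crosses BOUNDARY.

    Sequentially executed code would have to occupy such a range for
    the erratum described above to be reachable.
    """
    return start < BOUNDARY < start + length


if __name__ == "__main__":
    print(hex(0xFFFF_FFFF + 1))                    # the boundary address
    print(straddles_boundary(0xFFFF_FFF0, 0x20))   # crosses the boundary
    print(straddles_boundary(0xFFFF_FFF0, 0x10))   # ends exactly at it
```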
Exploiting Address Boundary Bug: [Diagram: process memory layout with Stack and Text & Data regions and a line labeled "Magic Line!"]
Exploiting Debug Mode Bugs:
• Many bugs related to “debug mode” and setting hardware breakpoints with certain characteristics
• Debuggers like GDB use hardware breakpoints
• Understanding GDB implementation and x86 debug mode could take a long time
Conclusions: Inserting Bugs
• It’s easy to create security bugs that do not break specified functionality as long as the spec is not fully defined
• If the spec is fully defined, formal verification can be used to find the bug in small designs
• In large designs, it would be much harder to find bugs due to the limits of current verification techniques
Conclusions: Exploiting Commercial Bugs
• Published errata do not seem to contain enough detail to trivially create exploits
• Knowledge of the design or the test that revealed the bug would be very useful
• Many bugs require cooperation from the OS
• Due to the prevalence of software security bugs, hardware bugs are not as appealing
Lessons Learned
• We don’t have enough knowledge of Linux internals to effectively exploit bugs
• It’s fun to simulate a processor on a simulation of a processor
• Cross compilers are helpful (Thanks Matt)
• Subversion >> Mercurial
Group Dynamic
• Pat created the R31 bug and modified the processor to allow for instruction streams
• Matt created the Duplicate Store bug and attempted to exploit the commercial errata
• Both researched commercial bugs to exploit, created test programs for the Alpha processor, and did a bunch of other things that didn’t make it into the final project