assembly language source code

jblackburn

Presentation Transcript


  1. assembly language source code → assembler → "machine code"

  2. To execute a program:
       1. put the "machine code" into memory
       2. jal __start (the OS does this)
     [Figure: memory holding the "machine code"]

  3. assembler's task
       • assign addresses
       • generate "machine code"
       • (architecture dependent) further translation of assembly language source code

  4. previous architectures:
       1 assembly language instruction → 1 machine code instruction
     MIPS architecture:
       1 assembly language instruction → 1 or more machine code instructions

  5. This further translation is also called synthesis. MIPS example of synthesis:
         add $8, $9, -16
     becomes
         addi $8, $9, -16

  6. Two operands in the source code
         add $8, $9
     are expanded back out to become
         add $8, $8, $9
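
The two rewrites of add shown above can be sketched as a small function. This is only an illustration: the name synthesize_add and the list-of-strings operand format are assumptions made here, not part of any real assembler.

```python
def synthesize_add(operands):
    """Rewrite a source-level `add` into the form the hardware accepts.

    operands is a list of operand strings, e.g. ["$8", "$9", "-16"].
    (Hypothetical helper; the operand format is an assumption.)
    """
    if len(operands) == 2:
        # Two-operand form: the destination doubles as the first source.
        operands = [operands[0], operands[0], operands[1]]
    rd, rs, last = operands
    if last.startswith("$"):
        # All three operands are registers, nothing more to do.
        return f"add {rd}, {rs}, {last}"
    # The third operand is a constant, so the immediate form is needed.
    return f"addi {rd}, {rs}, {last}"


print(synthesize_add(["$8", "$9", "-16"]))
print(synthesize_add(["$8", "$9"]))
```

The function returns text rather than machine code because synthesis happens before encoding.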

  7. integer multiplication and division each produce 2 32-bit results
       • integer division produces
           • quotient
           • remainder
       • integer multiplication of 2 32-bit operands produces a 64-bit result

  8. MIPS hardware implements 2 extra registers (called HI and LO) to hold these results. Here are 4 more MIPS instructions:
         mflo R
         mtlo R
         mfhi R
         mthi R
     Reading the mnemonics: m = move, f = from, t = to, lo = register LO, hi = register HI.

  9. multiplication
         mul $8, $9, $10
     becomes
         mult $9, $10
         mflo $8
     [Figure: the multiplier (X) writes its result into HI and LO]
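
To make the HI/LO split concrete, here is a minimal sketch that models where the 64-bit product ends up. It assumes signed operands and treats HI and LO as unsigned 32-bit words; the function name mult_hi_lo is hypothetical.

```python
def mult_hi_lo(a, b):
    """Model mult: return (HI, LO) for the 64-bit product of a and b.

    a and b are Python ints in the signed 32-bit range. HI receives the
    upper 32 bits of the product and LO the lower 32 bits.
    """
    product = (a * b) & 0xFFFFFFFFFFFFFFFF  # keep 64 bits, 2's complement
    hi = product >> 32
    lo = product & 0xFFFFFFFF
    return hi, lo


print(mult_hi_lo(3, 4))              # small product fits entirely in LO
print(mult_hi_lo(0x10000, 0x10000))  # 2**32 spills over into HI
```

Because mul is synthesized as mult followed by mflo, only the LO half reaches the destination register.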

  10. division
          div $8, $9, $10
      becomes
          div $9, $10
          mflo $8          # quotient in LO
      and
          rem $12, $13, $14
      becomes
          div $13, $14
          mfhi $12         # remainder in HI
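
The three expansions for mul, div, and rem follow one pattern, which the sketch below captures. The function name and the table layout are assumptions for illustration only.

```python
# Maps each three-operand source instruction to the hardware instruction
# that does the work and the move that fetches the wanted half.
EXPANSIONS = {
    "mul": ("mult", "mflo"),  # product (low 32 bits) comes from LO
    "div": ("div", "mflo"),   # quotient is left in LO
    "rem": ("div", "mfhi"),   # remainder is left in HI
}


def synthesize_muldiv(op, rd, rs, rt):
    """Expand `op rd, rs, rt` into two instructions (hypothetical helper)."""
    if op not in EXPANSIONS:
        raise ValueError(f"no expansion known for {op!r}")
    work, move = EXPANSIONS[op]
    return [f"{work} {rs}, {rt}", f"{move} {rd}"]


for line in synthesize_muldiv("rem", "$12", "$13", "$14"):
    print(line)
```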

  11. puts, putc, getc, and done are not TAL! I/O is accomplished by requesting service from the operating system (OS). All architectures do this with a single instruction. On MIPS, this instruction is
          syscall
      (no operands)

  12. (note that this is specific to our simulator) To help the OS distinguish what service is required, $v0 ($2) is set:

  13. synthesis of puts $8

  14. lw $8, X
      becomes
          la $8, X
          lw $8, 0($8)
      Oops! la must be synthesized.

  15. synthesis of la $8, my_label
        • requires the address assigned for my_label
        • every address is assigned by the assembler
      [Figure: the 32-bit address split into a 16-bit MS part and a 16-bit LS part]

  16. la $8, my_label
      becomes
          lui $8, 0xMS part
          ori $8, $8, 0xLS part

  17. After lui $8, 0xMS part, the contents of $8 are
          $8:  MS part | 000 . . . 0
      This is then logically ORed with
               000 . . . 0 | LS part
      due to the instruction ori $8, $8, 0xLS part. Resulting contents of $8:
          $8:  MS part | LS part

  18. Synthesize lw $8, X. Assume X is assigned address 0xaaee0018.
      First try:
          la $8, X
          lw $8, 0($8)
      With synthesis of the la instruction:
          lui $8, 0xaaee
          ori $8, $8, 0x0018
          lw $8, 0($8)
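
The split into MS and LS parts is simple bit manipulation. The following sketch reproduces the lw example above; the helper names are assumptions, and it reuses the destination register as the address register exactly as the slide does.

```python
def synthesize_la(reg, address):
    """Expand `la reg, label` given the 32-bit address of the label."""
    ms_part = (address >> 16) & 0xFFFF  # most significant 16 bits
    ls_part = address & 0xFFFF          # least significant 16 bits
    return [f"lui {reg}, 0x{ms_part:04x}",
            f"ori {reg}, {reg}, 0x{ls_part:04x}"]


def synthesize_lw(reg, address):
    """Expand `lw reg, label` as la followed by a load through reg."""
    return synthesize_la(reg, address) + [f"lw {reg}, 0({reg})"]


for line in synthesize_lw("$8", 0xAAEE0018):
    print(line)
```

Recombining the halves with (ms_part << 16) | ls_part gives back the original address, which is what lui followed by ori does in the register.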

  19. Synthesize sb $12, X. Assume X is assigned address 0x080001a0.

  20. Generate machine code for addi $8, $20, 15
      From the TAL table: addi Rt, Rs, I
        Rt is $8
        Rs is $20
        I is 0000 0000 0000 1111
      Format: 0010 00ss ssst tttt ii .. ii (op code first, 16 bits of I last)
        sssss is 10100 (for $20)
        ttttt is 01000 (for $8)
      0010 0010 1000 1000 0000 0000 0000 1111
      in hex 0x2288000f

  21. Generate machine code for lw $8, 12($sp)
      lw Rt, I(Rb)
        Rt is $8
        Rb is $sp (which is $29)
        I is 12
      Format: 1000 11bb bbbt tttt ii .. ii (op code first; the last 16 bits hold the value 12)
        bbbbb is 11101
        ttttt is 01000
      1000 1111 1010 1000 0000 0000 0000 1100
      in hex 0x8fa8000c
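
Both worked examples use the same field layout: a 6-bit op code, two 5-bit register fields, and a 16-bit immediate. A minimal sketch of that packing is below; the function name encode_i_type and the constant names are assumptions, while the op code bit patterns are the ones shown on the slides.

```python
ADDI_OPCODE = 0b001000  # from the addi format 0010 00ss ssst tttt
LW_OPCODE = 0b100011    # from the lw format 1000 11bb bbbt tttt


def encode_i_type(opcode, rs, rt, imm):
    """Pack op code, source/base register, target register, and I."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# addi $8, $20, 15
print(hex(encode_i_type(ADDI_OPCODE, 20, 8, 15)))
# lw $8, 12($sp), where $sp is $29 and acts as the base register
print(hex(encode_i_type(LW_OPCODE, 29, 8, 12)))
```

Masking the immediate with 0xFFFF lets negative constants such as -16 be passed directly; they are stored in 16-bit 2's complement form.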

  22. assembly language source code → assembler (assign addresses, produce machine code) → memory image

  23. Problem: forward references
          .text
                  beq $8, $11, later_in_code
          later_in_code:
                  lw $20, X
          .data
          X:      .word 16

  24. Simple solution: 2-pass assembler
        • first pass:
            • (MIPS-only) MAL → TAL synthesis
            • assign all addresses
        • second pass:
            • produce all machine code
      More complex and more efficient: 1-pass assembler
        • Keep a list of instructions that cannot be completed due to yet-to-be-assigned addresses. As addresses are assigned, check the list and complete instructions.

  25. "Assign all addresses" (and remember them) implies the use of a table that holds the mapping between labels and their addresses, called a symbol table.
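
A first pass that only assigns addresses can be sketched in a few lines. This is a simplification under stated assumptions: every line is taken to be a single 4-byte instruction (synthesis already done), labels end with a colon, and the code start address 0x00800000 is just an example value passed in by the caller.

```python
def first_pass(lines, start_address):
    """Return a symbol table mapping each label to its address."""
    symbol_table = {}
    next_address = start_address
    for line in lines:
        line = line.split("#")[0].strip()   # drop comments
        if ":" in line:
            label, line = line.split(":", 1)
            symbol_table[label.strip()] = next_address
            line = line.strip()
        if line:
            next_address += 4               # one instruction, one word
    return symbol_table


code = [
    "beq $8, $11, later_in_code",           # forward reference
    "later_in_code: lw $20, 0($9)",
]
table = first_pass(code, 0x00800000)
print({name: hex(addr) for name, addr in table.items()})
```

With the table complete, a second pass can encode the beq even though its target appears later in the source.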

  26. As the assembler works on the source code, it scans the characters in the file.
        • Scanner (a SW module)
            • breaks a set of characters into significant groups known as tokens
            • often, tokens are separated by white space or special punctuation
          .data
          a1:     .word 3
          loop:   lw $7, 4($6)
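
A scanner for lines like these can be approximated with one regular expression. The token classes below (names and directives, registers, numbers, punctuation) are an assumption about what a simple assembler would need, not a description of any particular tool.

```python
import re

# One alternative per token class; white space between tokens is skipped.
TOKEN = re.compile(
    r"\.?[A-Za-z_]\w*:?"            # directive, mnemonic, or label (with ':')
    r"|\$\w+"                       # register such as $7 or $sp
    r"|-?(?:0x[0-9a-fA-F]+|\d+)"    # decimal or hex number
    r"|[(),:]"                      # punctuation kept as its own token
)


def scan(line):
    """Break one source line into a list of token strings."""
    return TOKEN.findall(line)


print(scan("loop: lw $7, 4($6)"))
print(scan("a2: .word 16:4"))
```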

  27.     .data
          a1:     .word 3
          a2:     .word 16:4
          a3:     .word 5
          .text
          __start: la $6, a2
          loop:   lw $7, 4($6)
                  mult $9, $10
                  b loop
                  done

  28. 2 segments: code and data
        • The assembler places items into these 2 segments. So, it needs addresses.
        • Use starting addresses of
            data 0x0040 0000
            code 0x0080 0000
        • The variable internal to the assembler that represents the next address to be assigned is the location counter.

  29. TAL equivalent of code:
          .text
          __start: lui $6, 0x0040        # la $6, a2
                  ori $6, $6, 0x0004
          loop:   lw $7, 4($6)
                  mult $9, $10
                  beq $0, $0, loop       # b loop
                  ori $2, $0, 10         # done
                  syscall

  30. As a result of processing the entire .data section, the memory image will be
          address        contents
          0x0040 0000    0x0000 0003
          0x0040 0004    0x0000 0010
          0x0040 0008    0x0000 0010
          0x0040 000c    0x0000 0010
          0x0040 0010    0x0000 0010
          0x0040 0014    0x0000 0005
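
The table above can be reproduced by walking the .word directives with a location counter. The sketch assumes each directive has already been reduced to a (label, operand) pair and that value:count means count copies of value; the function name is hypothetical.

```python
def assemble_data(directives, start_address):
    """Build the data memory image and symbol table for .word lines."""
    image = {}         # address -> 32-bit contents
    symbol_table = {}  # label -> address
    location_counter = start_address
    for label, operand in directives:
        symbol_table[label] = location_counter
        value, _sep, count = operand.partition(":")
        for _ in range(int(count) if count else 1):
            image[location_counter] = int(value) & 0xFFFFFFFF
            location_counter += 4          # one word per value
    return image, symbol_table


image, symbols = assemble_data(
    [("a1", "3"), ("a2", "16:4"), ("a3", "5")], 0x00400000)
for address in sorted(image):
    print(f"0x{address:08x}  0x{image[address]:08x}")
print(hex(symbols["a2"]))
```

The address recorded for a2 is what later supplies the 0x0040 and 0x0004 halves when la $6, a2 is synthesized.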

  31.     .data
          a1:     .word 3
          a2:     .word 16:4
          a3:     .word 5
          .text
          __start: la $6, a2
          loop:   lw $7, 4($6)
                  mult $9, $10
                  b loop
                  done

  32. Machine code for la $6, a2
      Synthesized:
          (1) lui $6, 0x0040        (address from symbol table)
          (2) ori $6, $6, 0x0004
      For (1): lui Rt, I
        Rt is $6
      Format: 0011 1100 000t tttt ii .. ii (op code first, 16 bits of I last)
        ttttt is 00110
      0011 1100 0000 0110 0000 0000 0100 0000
      in hex 0x3c060040

  33. Add to the memory image
          address        contents
          0x0080 0000    0x3c06 0040

  34. Machine code for (2): ori $6, $6, 0x0004
      ori Rt, Rs, I
        Rt is $6
        Rs is $6
      Format: 0011 01ss ssst tttt ii .. ii (op code first, 16 bits of I last)
        ttttt is 00110
        sssss is 00110
      0011 0100 1100 0110 0000 0000 0000 0100
      in hex 0x34c60004

  35. Add it to the memory image as well, updating the location counter
          address        contents
          0x0080 0000    0x3c06 0040
          0x0080 0004    0x34c6 0004

  36.     .data
          a1:     .word 3
          a2:     .word 16:4
          a3:     .word 5
          .text
          __start: la $6, a2
          loop:   lw $7, 4($6)
                  mult $9, $10
                  b loop
                  done

  37. Scanning on, machine code for lw $7, 4($6)
      lw Rt, I(Rb)
        Rt is $7
        Rb is $6
        I is 4
      Format: 1000 11bb bbbt tttt ii .. ii (op code first, 16 bits of I last)
      1000 1100 1100 0111 0000 0000 0000 0100
      in hex 0x8cc70004

  38. Add it to the memory image as well, updating the location counter
          address        contents
          0x0080 0000    0x3c06 0040    (lui)
          0x0080 0004    0x34c6 0004    (ori)
          0x0080 0008    0x8cc7 0004    (lw)

  39.     .data
          a1:     .word 3
          a2:     .word 16:4
          a3:     .word 5
          .text
          __start: la $6, a2
          loop:   lw $7, 4($6)
                  mult $9, $10
                  b loop
                  done

  40. Next comes mult $9, $10
        Rs is $9 (01001)
        Rt is $10 (01010)
      Format: 0000 00ss ssst tttt 0000 0000 0001 1000 (the op code and Rd fields are all zeros)
      0000 0001 0010 1010 0000 0000 0001 1000
      in hex 0x012a0018

  41. op code 000000 is used for any arithmetic or logical instruction with 3 register operands
          0000 00ss ssst tttt dddd d??? ???? ????
      The ? bits say which operation.
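
For instructions with op code 000000, the low bits pick the operation. The sketch below packs the register fields and those low bits; it names the low fields shamt (5 bits) and funct (6 bits) following the usual MIPS convention, which the slides do not spell out, and the function name is hypothetical.

```python
MULT_FUNCT = 0b011000  # low 6 bits of the mult encoding shown above
ADD_FUNCT = 0b100000   # low 6 bits used by add in the MIPS encoding


def encode_r_type(rs, rt, rd, shamt, funct):
    """Pack a register-format instruction; the op code field is 000000."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# mult $9, $10 has no destination register, so Rd is left as zeros.
print(f"0x{encode_r_type(9, 10, 0, 0, MULT_FUNCT):08x}")
# add $8, $8, $9 uses all three register fields.
print(f"0x{encode_r_type(8, 9, 8, 0, ADD_FUNCT):08x}")
```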

  42. Add mult to the memory image as well, updating the location counter
          address        contents
          0x0080 0000    0x3c06 0040    (lui)
          0x0080 0004    0x34c6 0004    (ori)
          0x0080 0008    0x8cc7 0004    (lw)
          0x0080 000c    0x012a 0018    (mult)

  43.     .data
          a1:     .word 3
          a2:     .word 16:4
          a3:     .word 5
          .text
          __start: la $6, a2
          loop:   lw $7, 4($6)
                  mult $9, $10
                  b loop
                  done

  44. b loop is a pseudoinstruction (must be synthesized)
      Many translations:
          beq $0, $0, loop
          bgez $0, loop
          blez $0, loop
          j loop

  45. beq $0, $0, loop
      Format: 0001 00ss ssst tttt iii ... ii (op code, Rs, Rt, I)
        Rs is 00000
        Rt is 00000
      I is a derivation of an offset.

  46. At run (execution) time, for a taken branch:
        I (from instruction)
        I || 00 (concatenate)
        I || 00 (sign extend to 32 bits)
        (sign-extended I || 00) + PC → PC

  47. The I computed by the assembler relies on
          byte difference = target address - branch address
      except, when the PC (the branch address!) is used (at execution time), the PC update step (of the fetch and execute cycle) has already been completed. So,
          byte difference = target address - (branch address + 4)

  48. From the symbol table, the target is loop, at 0x0080 0008. The beq is at 0x0080 0010.
          byte offset = 0x00800008 - (0x00800010 + 4)
            0000 0000 1000 0000 0000 0000 0000 1000
          - 0000 0000 1000 0000 0000 0000 0001 0100
      (can't do this in unsigned, so convert to 2's complement)
      The additive inverse of 0000 0000 1000 0000 0000 0000 0001 0100 is
            1111 1111 0111 1111 1111 1111 1110 1100

  49.       0000 0000 1000 0000 0000 0000 0000 1000
          + 1111 1111 0111 1111 1111 1111 1110 1100
          = 1111 1111 1111 1111 1111 1111 1111 0100
      This represents -12. -12 is the byte offset to be added to the PC to form the new (correct) target PC.

  50. Why? Recall that at run (execution) time, for a taken branch:
        I (from instruction)
        I || 00 (concatenate)
        I || 00 (sign extend to 32 bits)
        (sign-extended I || 00) + PC → PC
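
The two sides of this calculation can be checked against each other. The sketch below computes I the way the assembler would for the beq at 0x00800010 targeting loop at 0x00800008, then rebuilds the target the way the hardware would. The function names are hypothetical, and the step from byte offset to I (dropping the two low 0 bits) is inferred from the I || 00 rule rather than quoted from a slide.

```python
def branch_immediate(target_address, branch_address):
    """Assembler side: compute the 16-bit I field for a branch."""
    byte_offset = target_address - (branch_address + 4)
    if byte_offset % 4 != 0:
        raise ValueError("branch target is not word aligned")
    word_offset = byte_offset // 4       # the hardware re-appends 00
    if not -0x8000 <= word_offset <= 0x7FFF:
        raise ValueError("branch target out of range")
    return word_offset & 0xFFFF          # 16-bit 2's complement


def branch_target(branch_address, imm16):
    """Run-time side: rebuild the target from I and the updated PC."""
    offset = imm16 << 2                  # I || 00
    if offset & 0x20000:                 # sign bit of the 18-bit value
        offset -= 0x40000                # sign extend
    return (branch_address + 4 + offset) & 0xFFFFFFFF


imm = branch_immediate(0x00800008, 0x00800010)
print(hex(imm))
print(hex(branch_target(0x00800010, imm)))
# beq $0, $0, loop: op code 000100, Rs and Rt both 00000, then I
print(f"0x{(0b000100 << 26) | imm:08x}")
```

The round trip returning 0x00800008 is the reason the assembler subtracts branch address + 4 rather than the branch address itself.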
