
March, 2004 Modified from Notes by Saeid Nooshabadi

COMP 3221 Microprocessors and Embedded Systems Lecture 4: Memory Access http://www.cse.unsw.edu.au/~cs3221


Presentation Transcript


  1. COMP 3221 Microprocessors and Embedded Systems Lecture 4: Memory Access http://www.cse.unsw.edu.au/~cs3221 March, 2004 Modified from Notes by Saeid Nooshabadi

  2. Data Transfer: Memory to Reg (#4/4)
     [Figure: array elements arr[0] to arr[3] in memory, with offset #8 measured from the base]
     • Example: ldr a1, [v1, #8]
       This instruction takes the pointer in v1, adds 8 bytes to it, and then loads the value from the memory pointed to by this calculated sum into register a1
     • Notes:
       • v1 is called the base register
       • 8 is called the offset
       • An offset is generally used to access elements of an array: the base register points to the beginning of the array
     • Example: ldr a1, [v1, v2]
       This instruction takes the pointer in v1, adds the index offset in register v2 to it, and then loads the value from the memory pointed to by this calculated sum into register a1
     • Notes:
       • v1 is called the base register
       • v2 is called the index register
       • An index is generally used to access elements of an array with a variable index: the base register points to the beginning of the array

  3. Data Transfer: Other Mem to Reg Variants (#1/2)
     • Pre Indexed Load Example: ldr a1, [v1,#12]!
       This instruction takes the pointer in v1, adds 12 bytes to it, and then loads the value from the memory pointed to by this calculated sum into register a1. Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
     • Pre Indexed Load Example: ldr a1, [v1, v2]!
       This instruction takes the pointer in v1, adds the index offset in register v2 to it, and then loads the value from the memory pointed to by this calculated sum into register a1. Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).

  4. Data Transfer: Other Mem to Reg Variants (#2/2)
     • Post Indexed Load Example: ldr a1, [v1], #12
       This instruction loads the value from the memory pointed to by the value in register v1 into register a1. Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
     • Post Indexed Load Example: ldr a1, [v1], v2
       This instruction loads the value from the memory pointed to by the value in register v1 into register a1. Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).
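
The three load forms on slides 2 to 4 differ only in which address is used and whether the base register changes. The following C sketch models that, assuming a hypothetical byte array `mem` standing in for memory and a `uint32_t` standing in for register v1; the function names are illustrative and are not ARM or library APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: memory is a byte array, registers are plain variables. */
static uint32_t load_word(const uint8_t *mem, uint32_t addr) {
    uint32_t w;
    assert(addr % 4 == 0);              /* ldr expects a word-aligned address */
    memcpy(&w, mem + addr, sizeof w);
    return w;
}

/* ldr a1, [v1, #off] : address = v1 + off, v1 is left unchanged */
static uint32_t ldr_offset(const uint8_t *mem, const uint32_t *v1, uint32_t off) {
    return load_word(mem, *v1 + off);
}

/* ldr a1, [v1, #off]! : address = v1 + off, then v1 <- v1 + off */
static uint32_t ldr_pre(const uint8_t *mem, uint32_t *v1, uint32_t off) {
    *v1 += off;
    return load_word(mem, *v1);
}

/* ldr a1, [v1], #off : address = v1, then v1 <- v1 + off */
static uint32_t ldr_post(const uint8_t *mem, uint32_t *v1, uint32_t off) {
    uint32_t a1 = load_word(mem, *v1);
    *v1 += off;
    return a1;
}
```

With v1 = 0 and four words in memory, `ldr_offset(mem, &v1, 8)` returns the third word and leaves v1 at 0, while `ldr_pre` and `ldr_post` both leave v1 advanced by the offset but read from different addresses.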

  5. Data Transfer: Reg to Memory (1/2) • Also want to store value from a register into memory • Store instruction syntax is identical to Load instruction syntax • Instruction Name: str (meaning Store from Register, so 32 bits or one word are stored from register to memory at a time)

  6. Data Transfer: Reg to Memory (2/2)
     • Example: str a1,[v1, #12]
       This instruction takes the pointer in v1, adds 12 bytes to it, and then stores the value from register a1 into the memory address pointed to by the calculated sum.
     • Example: str a1,[v1, v2]
       This instruction takes the pointer in v1, adds register v2 to it, and then stores the value from register a1 into the memory address pointed to by the calculated sum.

  7. Data Transfer: Other Reg to Mem Variants (#1/2)
     • Pre Indexed Store Example: str a1, [v1,#12]!
       This instruction takes the pointer in v1, adds 12 bytes to it, and then stores the value from register a1 into the memory address pointed to by the calculated sum. Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
     • Pre Indexed Store Example: str a1,[v1, v2]!
       This instruction takes the pointer in v1, adds register v2 to it, and then stores the value from register a1 into the memory address pointed to by the calculated sum. Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).

  8. Data Transfer: Other Reg to Mem Variants (#2/2)
     • Post Indexed Store Example: str a1, [v1],#12
       This instruction stores the value from register a1 into the memory address pointed to by register v1. Subsequently, v1 is updated with the computed sum of v1 and 12 (v1 ← v1 + 12).
     • Post Indexed Store Example: str a1,[v1], v2
       This instruction stores the value from register a1 into the memory address pointed to by register v1. Subsequently, v1 is updated with the computed sum of v1 and v2 (v1 ← v1 + v2).

  9. Pointers v. Values • Key Concept: A register can hold any 32-bit value. That value can be a (signed) int, an unsigned int, a pointer (memory address), etc. • If you write add v3,v2,v1 then v1 and v2 better contain values • If you write ldr a1,[v1] then v1 better contain a pointer • Don’t mix these up!

  10. Addressing: Byte vs. halfword vs. word
      • Every word in memory has an address, similar to an index in an array
      • Early computers numbered words like C numbers elements of an array:
        Memory[0], Memory[1], Memory[2], …
      • Computers needed to access 8-bit bytes and half words (2 bytes/halfword) as well as words (4 bytes/word)
      • Today machines address memory in bytes, hence:
        • Half word addresses differ by 2: Memory[0], Memory[2], Memory[4], …
        • Word addresses differ by 4: Memory[0], Memory[4], Memory[8], …
      • Each of these numbers is called the “address” of a word

  11. Compilation with Memory
      • What offset in ldr to select my_Array[8] in C?
      • 4x8=32 to select my_Array[8]: byte v. word
      • Compile by hand using registers: g = h + my_Array[8];
        • g: v1, h: v2, v3: base address of my_Array
      • 1st transfer from memory to register:
          ldr v1, [v3,#32]   ; v1 gets my_Array[8]
        • Add 32 to v3 to select my_Array[8], put into v1
      • Next add it to h and place in g:
          add v1,v2,v1       ; v1 = h + my_Array[8]

  12. Same thing in pictures
      [Figure: memory from address 0 to 0xFFFFFFFF; v3 points to my_Array[0] and my_Array[8] lies 32 bytes above it; v1 holds g and v2 holds h]
      • v3 contains the base address of my_Array
      • ldr v1, [v3,#32]
        • Adds the offset “8 x 4 = 32” to v3 to select my_Array[8], and puts the value into v1
        • The value in register v3 is an address
        • Think of it as a pointer into memory
      • add v1, v2, v1
        • The new value in register v1 is the sum of v2 and the old v1 (v1 ← v2 + v1)
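
The offset arithmetic behind `ldr v1, [v3,#32]` is element index times element size in bytes. A minimal C sketch of that rule follows; `byte_offset` and `add_element8` are hypothetical helper names introduced only for illustration, and `int32_t` is assumed to stand for a 32-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` when each element occupies `elem_size` bytes. */
static size_t byte_offset(size_t index, size_t elem_size) {
    return index * elem_size;
}

/* g = h + my_Array[8], written with explicit byte arithmetic on the base address. */
static int32_t add_element8(int32_t h, const int32_t *my_Array) {
    const unsigned char *base = (const unsigned char *)my_Array;       /* v3 */
    const int32_t *elem =
        (const int32_t *)(base + byte_offset(8, sizeof(int32_t)));      /* v3 + 32 */
    return h + *elem;                                                   /* add v1, v2, v1 */
}
```

The same helper gives 8 for an array of bytes and 16 for an array of half words, which is why the offset depends on the element type and not just on the index.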

  13. Compile with variable index
      • What if the array index is not a constant? g = h + my_Array[i];
        • g: v1, h: v2, i: v3, v4: base address of my_Array
      • To load my_Array[i] into a register, first turn i into a byte offset: multiply by 4
      • How to multiply using adds?
        • i + i = 2i, 2i + 2i = 4i
          mov a1, v3         ; a1 = i
          add a1, a1, a1     ; a1 = 2*i
          add a1, a1, a1     ; a1 = 4*i
      • Better alternative:
          mov a1, v3, lsl #2 ; a1 = 4*i

  14. Compile with variable index, con’t
      • Now load my_Array[i], located at address of my_Array[0] + 4*i, into register v1:
          ldr v1, [v4, a1]   ; v1 = my_Array[i]
      • Finally add h to it and put the sum in g:
          add v1, v1, v2     ; g = h + my_Array[i]

  15. Compile with variable index: Summary
      • C statement: g = h + my_Array[i];
      • Compiled ARM assembly instructions:
          mov a1, v3, lsl #2 ; a1 = 4*i
          ldr v1, [v4, a1]   ; v1 = my_Array[i] (v4: base register, a1: index register)
      • Finally add h to it and put the sum in g:
          add v1, v1, v2     ; g = h + my_Array[i]
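
As a cross-check of the three-instruction sequence, here is a short C sketch that performs the same steps explicitly: scale the index by a left shift of 2, add it to the base address, load the word, then add h. The function name `add_indexed` is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* g = h + my_Array[i], mirroring the three ARM instructions one by one. */
static int32_t add_indexed(int32_t h, const int32_t *my_Array, uint32_t i) {
    uint32_t a1 = i << 2;                                   /* mov a1, v3, lsl #2 ; a1 = 4*i */
    int32_t v1;
    memcpy(&v1, (const unsigned char *)my_Array + a1,
           sizeof v1);                                      /* ldr v1, [v4, a1] */
    return v1 + h;                                          /* add v1, v1, v2 */
}
```

Shifting left by 2 gives the same result as the two doubling adds (i + i, then 2i + 2i) but takes a single instruction on ARM.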

  16. Compile with variable index Example • Compile this into ARM code: B_Array[i] = h + A_Array[i]; • h: v1, i:v2, v3:base address of A_Array, v4:base address of B_Array

  17. Compile with variable index Example (Solution)
      • Compile this C code into ARM: B_Array[i] = h + A_Array[i];
        • h: v1, i: v2, v3: base address of A_Array, v4: base address of B_Array
          mov a1, v2, lsl #2   ; a1 = 4*i
          ldr a2, [v3, a1]     ; v3 + a1 = addr of A_Array[i]
                               ; a2 = A_Array[i]
          add a2, a2, v1       ; a2 = h + A_Array[i]
          str a2, [v4, a1]     ; v4 + a1 = addr of B_Array[i]
                               ; B_Array[i] = a2
      • In both ldr and str, the first register in the brackets is the base register and the second is the index register

  18. Notes about Memory • Pitfall: Forgetting that sequential word addresses in machines with byte addressing do not differ by 1. • Many an assembly language programmer has toiled over errors made by assuming that the address of the next word can be found by incrementing the address in a register by 1 instead of by the word size in bytes. • So remember that for both ldr and str, the sum of the base address and the offset must be a multiple of 4 (to be word aligned)

  19. More Notes about Memory: Alignment (#1/2)
      [Figure: words drawn as bytes 3 2 1 0, showing aligned and not aligned placements]
      • ARM requires that all words start at addresses that are multiples of 4 bytes
      • Called Alignment: objects must fall on an address that is a multiple of their size.
      • Some machines like Intel allow non-aligned accesses

  20. More Notes about Memory: Alignment (#2/2)
      • Memory contents (the word at 0x80 is 0x0982a22e):
          Address   0x83  0x82  0x81  0x80
          Byte      09    82    a2    2e
      • A non-aligned memory access rotates the bytes to the right within the word:
          ldr a1, 0x80    a1 = 0x0982a22e
          ldr a1, 0x81    a1 = 0x2e0982a2
          ldr a1, 0x82    a1 = 0xa22e0982
          ldr a1, 0x83    a1 = 0x82a22e09
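
The four results on slide 20 follow one rule: fetch the aligned word, then rotate it right by 8 bits for each byte of misalignment. The C sketch below models only that rule as described on the slide; `ror32` and `ldr_rotated` are hypothetical names, and the behaviour of a particular ARM core on unaligned loads may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by `bits` (taken modulo 32). */
static uint32_t ror32(uint32_t x, unsigned bits) {
    bits &= 31u;
    return bits ? (x >> bits) | (x << (32u - bits)) : x;
}

/* Model of the slide's behaviour: the aligned word containing `addr` is
   fetched, then rotated right by 8 bits per byte of misalignment. */
static uint32_t ldr_rotated(uint32_t aligned_word, uint32_t addr) {
    return ror32(aligned_word, 8u * (addr & 3u));
}
```

Calling `ldr_rotated(0x0982a22e, 0x81)` reproduces the slide's 0x2e0982a2: the least significant byte 2e wraps around to the top of the register.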

  21. Role of Registers vs. Memory
      • What if there are more variables than registers?
        • The compiler tries to keep the most frequently used variables in registers
        • Writing less commonly used variables to memory is called spilling
      • Why not keep all variables in memory?
        • Smaller is faster: registers are faster than memory
        • Registers are more versatile:
          • ARM data processing instructions can read 2 operands, operate on them, and write 1 result per instruction
          • ARM data transfer instructions only read or write 1 operand per instruction, and perform no operation on it

  22. Overview • Word/ Halfword/ Byte Addressing • Byte ordering • Signed Load Instructions • Instruction Support for Characters

  23. Data Transfer: More Mem to Reg Variants (#1/2)
      [Figure: 1 word = 4 bytes; the byte at v1 + 12 is loaded into the low byte of a1 and the upper three bytes are 0]
      • Load Byte Example: ldrb a1, [v1,#12]
        This instruction takes the pointer in v1, adds 12 bytes to it, and then loads the byte value from the memory pointed to by this calculated sum into register a1.
      • Load Byte Example: ldrb a1, [v1, v2]
        This instruction takes the pointer in v1, adds the index offset in register v2 to it, and then loads the byte value from the memory pointed to by this calculated sum into register a1.

  24. Data Transfer: More Mem to Reg Variants (#2/2)
      [Figure: 1 word = 4 bytes; the half word at v1 + 12 is loaded into the low half of a1 and the upper half is 0]
      • Load Half Word Example: ldrh a1, [v1,#12]
        This instruction takes the pointer in v1, adds 12 bytes to it, and then loads the half word value from the memory pointed to by this calculated sum into register a1.
      • Load Half Word Example: ldrh a1, [v1, v2]
        This instruction takes the pointer in v1, adds the index offset in register v2 to it, and then loads the half word value from the memory pointed to by this calculated sum into register a1.

  25. Data Transfer: More Reg to Mem Variants (#1/2)
      [Figure: 1 word = 4 bytes; the low byte of a1 is stored to the byte at v1 + 12]
      • Store Byte Example: strb a1, [v1,#12]
        This instruction takes the pointer in v1, adds 12 bytes to it, and then stores the least significant byte of register a1 into the memory address pointed to by the calculated sum.
      • Store Byte Example: strb a1,[v1, v2]
        This instruction takes the pointer in v1, adds register v2 to it, and then stores the least significant byte of register a1 into the memory address pointed to by the calculated sum.

  26. Data Transfer: More Reg to Mem Variants (#2/2)
      [Figure: 1 word = 4 bytes; the low half word of a1 is stored to the half word at v1 + 12]
      • Store Half Word Example: strh a1, [v1,#12]
        This instruction takes the pointer in v1, adds 12 bytes to it, and then stores the least significant half word of register a1 into the memory address pointed to by the calculated sum.
      • Store Half Word Example: strh a1,[v1, v2]
        This instruction takes the pointer in v1, adds register v2 to it, and then stores the least significant half word of register a1 into the memory address pointed to by the calculated sum.

  27. Compilation with Memory (Byte Addressing)
      • What offset in ldrb to select my_Array[8] (defined as char) in C?
      • 1x8=8 to select my_Array[8]: byte
      • Compile by hand using registers: g = h + my_Array[8];
        • g: v1, h: v2, v3: base address of my_Array
      • 1st transfer from memory to register:
          ldrb v1, [v3,#8]   ; v1 gets my_Array[8]
        • Add 8 to v3 to select my_Array[8], put into v1
      • Next add it to h and place in g:
          add v1,v2,v1       ; v1 = h + my_Array[8]

  28. Compilation with Memory (half word Addressing)
      • What offset in ldrh to select my_Array[8] (defined as halfword) in C?
      • 2x8=16 to select my_Array[8]: half word
      • Compile by hand using registers: g = h + my_Array[8];
        • g: v1, h: v2, v3: base address of my_Array
      • 1st transfer from memory to register:
          ldrh v1, [v3, #16] ; v1 gets my_Array[8]
        • Add 16 to v3 to select my_Array[8], put into v1
      • Next add it to h and place in g:
          add v1,v2,v1       ; v1 = h + my_Array[8]

  29. More Notes about Memory: Word
      [Figure: the strings “COMP” and “3221” stored at byte addresses 100 to 107, shown with little endian and big endian numbering of bytes 0 to 3 between the msb and lsb of each word]
      • How are bytes numbered in a word?
      • Gulliver’s Travels: Which end of egg to open?
        • Cohen, D. “On holy wars and a plea for peace (data transmission).” Computer, vol.14, (no.10), Oct. 1981. p.48-54.
      • Little Endian: the address of a word is the address of its least significant byte (Intel 80x86, DEC Alpha)
      • Big Endian: the address of a word is the address of its most significant byte (HP PA, IBM/Motorola PowerPC, SGI, Sparc)
      • ARM is Little Endian by default. However, it can be made Big Endian by configuration.

  30. Endianness Example
      • r0 = 0x11223344, r1 = 0x100
      • STR r0, [r1] writes the word to memory:
          Address         0x100  0x101  0x102  0x103
          Big-endian      11     22     33     44
          Little-endian   44     33     22     11
      • LDRB r2, [r1] reads back the byte at 0x100:
          Big-endian:     r2 = 0x00000011
          Little-endian:  r2 = 0x00000044
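
The STR/LDRB experiment can be replayed in C by storing the word one byte at a time in each byte order and reading back the byte at the lowest address. This is a sketch over a hypothetical byte array `mem`; the helper names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* str r0, [r1] on a little-endian system: least significant byte at the lowest address. */
static void store_word_le(uint8_t *mem, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/* str r0, [r1] on a big-endian system: most significant byte at the lowest address. */
static void store_word_be(uint8_t *mem, uint32_t addr, uint32_t value) {
    for (int i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * (3 - i)));
}

/* ldrb r2, [r1]: load one byte, zero-extended to 32 bits. */
static uint32_t load_byte(const uint8_t *mem, uint32_t addr) {
    return mem[addr];
}
```

Storing 0x11223344 at 0x100 and then loading the byte at 0x100 yields 0x44 with `store_word_le` and 0x11 with `store_word_be`, matching the two cases of the example.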

  31. Code Example
      [Figure: an array with element 0 at the start; a1 points to the array and the n elements x, x + 1, …, x + (n - 1) are marked]
      • Write a segment of code that adds together elements x to x+(n-1) of an array, where the element x = 0 is the first element of the array.
      • Each element of the array is word sized (i.e. 32 bits).
      • The segment should use post-indexed addressing.
      • At the start of your segment, you should assume that:
        • a1 points to the start of the array.
        • a2 = x
        • a3 = n

  32. Code Example: Sample Solution
            add a1, a1, a2, lsl #2   ; Set a1 to address of element x
            add a3, a1, a3, lsl #2   ; Set a3 to address of element x+n
            mov a2, #0               ; Initialise accumulator
      Loop: ldr a4, [a1], #4         ; Access element and move to next
            add a2, a2, a4           ; Add contents to accumulator
            cmp a1, a3               ; Have we reached element x+n?
            blt Loop                 ; If not, repeat for next element
                                     ; On exit, sum is contained in a2
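
For comparison, the same summation can be written in C with a pointer that is read and then advanced, which is what the post-indexed `ldr a4, [a1], #4` does. This is a sketch rather than a line-for-line translation: the function name `sum_range` is an assumption, and the C `while` loop tests before the first load, so unlike the assembly loop it also handles n = 0.

```c
#include <assert.h>
#include <stdint.h>

/* Sum elements x .. x+(n-1) of a word-sized array. */
static int32_t sum_range(const int32_t *array, uint32_t x, uint32_t n) {
    const int32_t *p = array + x;     /* add a1, a1, a2, lsl #2 : address of element x   */
    const int32_t *end = p + n;       /* add a3, a1, a3, lsl #2 : address of element x+n */
    int32_t sum = 0;                  /* mov a2, #0 */
    while (p < end)                   /* cmp a1, a3 ; blt Loop */
        sum += *p++;                  /* ldr a4, [a1], #4 ; add a2, a2, a4 */
    return sum;
}
```

Note that pointer arithmetic in C scales by the element size automatically, whereas the assembly has to apply `lsl #2` and step by `#4` explicitly.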

  33. Sign Extension and Load Byte & Load Half Word
      [Figure: ldrsb a1, [v1,#12] and ldrsb a1, [v1,v2] copy the sign bit S (bit 7) of the loaded byte into bits 8 to 31 of a1; ldrsh a1, [v1,#12] and ldrsh a1, [v1,v2] copy the sign bit S (bit 15) of the loaded half word into bits 16 to 31 of a1]
      • ARM instruction (ldrsb) automatically extends the “sign” of the byte for load byte.
      • ARM instruction (ldrsh) automatically extends the “sign” of the half word for load half word.
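
The difference between the zero-extending and sign-extending loads can be shown with a few lines of C. The sketch below assumes a hypothetical byte array `mem` and little-endian half words; the `_model` function names are illustrative and not part of any ARM toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* ldrb: load a byte and fill bits 8..31 with zeros. */
static uint32_t ldrb_model(const uint8_t *mem, uint32_t addr) {
    return mem[addr];
}

/* ldrsb: load a byte and copy its bit 7 into bits 8..31. */
static uint32_t ldrsb_model(const uint8_t *mem, uint32_t addr) {
    uint32_t b = mem[addr];
    return (b & 0x80u) ? (b | 0xFFFFFF00u) : b;
}

/* ldrsh: load a little-endian half word and copy its bit 15 into bits 16..31. */
static uint32_t ldrsh_model(const uint8_t *mem, uint32_t addr) {
    uint32_t h = (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}
```

For the byte 0xFE, `ldrb_model` gives 0x000000FE (254 as an unsigned value) while `ldrsb_model` gives 0xFFFFFFFE (the 32-bit pattern for -2), which is the behaviour a C compiler needs for `signed char`.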

  34. Instruction Support for Characters
      • ARM (and most other instruction sets) include instructions to operate on bytes:
        • load byte (ldrb) loads a byte from memory, placing it in the rightmost 8 bits of a register; store byte (strb) does the reverse
      • Byte variables are declared in C as “char”
      • Assume x, y are declared char, with x in memory at [v1,#4] and y at [v1,#0]. What is the ARM code for x = y; ?
          ldrb a1, [v1,#0]
          strb a1, [v1,#4]   ; transfer y to x

  35. Strings in C: Example
      • A string is simply an array of char
          void strcpy (char x[], char y[]){
            int i = 0;                       /* declare, initialize i */
            while ((x[i] = y[i]) != '\0')    /* copy and test byte */
              i = i + 1;
          }
      • i, addr. of x[0], addr. of y[0]: v1, a1, a2; function return addr.: lr
          strcpy: mov v1, #-1        ; i = -1
          L1:     add v1, v1, #1     ; i = i + 1
                  ldrb a3, [a2,v1]   ; a3 = y[i]
                  strb a3, [a1,v1]   ; x[i] = y[i]
                  cmp a3, #0
                  bne L1             ; y[i] != 0: goto L1
                  mov pc, lr         ; return

  36. Strings in C: Example using pointers
      • A string is simply an array of char
          void strcpy2 (char *px, char *py){
            while ((*px++ = *py++) != '\0')  /* copy and test byte */
              ;
          }
      • px (addr. of x[0]), py (addr. of y[0]): v2, v3; function return addr.: lr
          strcpy2:
          L1:     ldrb a1, [v3],#1   ; a1 = *py, py = py + 1
                  strb a1, [v2],#1   ; *px = a1, px = px + 1
                  cmp a1, #0
                  bne L1             ; copied byte != 0: goto L1
                  mov pc, lr         ; return
      • Ideally the compiler optimizes the code for you

  37. Block Copy Transfer (#1/5)
      [Figure: a1 to a4 stored in consecutive words (0x100, 0x104, 0x108, 0x10C), with v1 moving up after each store]
      • Consider the following code:
          str a1, [v1],#4
          str a2, [v1],#4
          str a3, [v1],#4
          str a4, [v1],#4
        Replace this with:
          stmia v1!, {a1-a4}   ; STMIA: STORE MULTIPLE INCREMENT AFTER
      • Consider the following code:
          str a1, [v1, #4]!
          str a2, [v1, #4]!
          str a3, [v1, #4]!
          str a4, [v1, #4]!
        Replace this with:
          stmib v1!, {a1-a4}   ; STMIB: STORE MULTIPLE INCREMENT BEFORE

  38. Block Copy Transfer (#2/5)
      [Figure: a4, a3, a2, a1 stored from the highest address downwards in consecutive words, with v1 moving down after each store]
      • Consider the following code:
          str a4, [v1],#-4
          str a3, [v1],#-4
          str a2, [v1],#-4
          str a1, [v1],#-4
        Replace this with:
          stmda v1!, {a1-a4}   ; STMDA: STORE MULTIPLE DECREMENT AFTER
      • Consider the following code:
          str a4, [v1, #-4]!
          str a3, [v1, #-4]!
          str a2, [v1, #-4]!
          str a1, [v1, #-4]!
        Replace this with:
          stmdb v1!, {a1-a4}   ; STMDB: STORE MULTIPLE DECREMENT BEFORE
      • Store multiple always places the lowest-numbered register at the lowest address, so in the decrementing forms a4 is written first (highest address) and a1 last

  39. Block Copy Transfer (#3/5)
      [Figure: a1 to a4 stored in consecutive words at increasing addresses; v1 is not updated]
      • Consider the following code:
          str a1, [v1]
          str a2, [v1,#4]
          str a3, [v1,#8]
          str a4, [v1,#12]
        Replace this with:
          stmia v1, {a1-a4}    ; STMIA: STORE MULTIPLE INCREMENT AFTER
      • Consider the following code:
          str a1, [v1, #4]
          str a2, [v1, #8]
          str a3, [v1, #12]
          str a4, [v1, #16]
        Replace this with:
          stmib v1, {a1-a4}    ; STMIB: STORE MULTIPLE INCREMENT BEFORE

  40. Block Copy Transfer (#4/5)
      [Figure: a4, a3, a2, a1 stored from the highest address downwards in consecutive words; v1 is not updated]
      • Consider the following code:
          str a4, [v1]
          str a3, [v1,#-4]
          str a2, [v1,#-8]
          str a1, [v1,#-12]
        Replace this with:
          stmda v1, {a1-a4}    ; STMDA: STORE MULTIPLE DECREMENT AFTER
      • Consider the following code:
          str a4, [v1,#-4]
          str a3, [v1,#-8]
          str a2, [v1,#-12]
          str a1, [v1,#-16]
        Replace this with:
          stmdb v1, {a1-a4}    ; STMDB: STORE MULTIPLE DECREMENT BEFORE
      • Store multiple always places the lowest-numbered register at the lowest address, so a1 ends up at the lowest address in every form
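
The four store-multiple variants differ only in where the block of words starts relative to the base register and in the value written back when `!` is used. The C sketch below captures that, assuming a hypothetical word array `mem` indexed by byte address / 4; `stm`, `enum stm_mode` and the parameter names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum stm_mode { STM_IA, STM_IB, STM_DA, STM_DB };

/* Model of stm<mode> base, {regs}: `regs[0..count-1]` are the listed registers
   in ascending register order, `base` is a byte address into `mem`.
   Returns the base value that would be written back when "!" is given. */
static uint32_t stm(uint32_t *mem, uint32_t base, const uint32_t *regs,
                    uint32_t count, enum stm_mode mode) {
    uint32_t bytes = 4u * count;
    uint32_t lowest;                              /* lowest address written */
    switch (mode) {
    case STM_IA: lowest = base;              break;
    case STM_IB: lowest = base + 4u;         break;
    case STM_DA: lowest = base - bytes + 4u; break;
    default:     lowest = base - bytes;      break;   /* STM_DB */
    }
    assert(lowest % 4u == 0);                     /* word-aligned block */
    for (uint32_t i = 0; i < count; i++)
        mem[lowest / 4u + i] = regs[i];           /* lowest register -> lowest address */
    return (mode == STM_IA || mode == STM_IB) ? base + bytes : base - bytes;
}
```

One consequence worth noticing: `stm(..., STM_DB)` from base 32 and `stm(..., STM_IA)` from base 16 write exactly the same four words, which is why a decrement-before store pairs with an increment-after load when a block is later read back.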

  41. Block Data Transfer (#5/5)
      • Similarly we have:
        • LDMIA : Load Multiple Increment After
        • LDMIB : Load Multiple Increment Before
        • LDMDA : Load Multiple Decrement After
        • LDMDB : Load Multiple Decrement Before
      • For details see Chapter 3, pages 61–62 of Steve Furber: ARM System-on-Chip Architecture; 2nd Ed, Addison-Wesley, 2000, ISBN: 0-201-67519-6.

  42. COMP3221 Reading Materials (Week #4)
      • Week #4: Steve Furber: ARM System-on-Chip Architecture; 2nd Ed, Addison-Wesley, 2000, ISBN: 0-201-67519-6. We use chapters 3 and 5
      • ARM Architecture Reference Manual, on CD ROM

  43. “And in Conclusion…” (#1/2) • In ARM Assembly Language: • Registers replace C variables • One Instruction (simple operation) per line • Simpler is Better • Smaller is Faster • Memory is byte-addressable, but ldr and str access one word at a time. • Access byte and halfword using ldrb, ldrh,ldrsb and ldrsh • A pointer (used by ldr and str) is just a memory address, so we can add to it or subtract from it (using offset).

  44. “And in Conclusion…”(#2/2) • New Instructions: ldr, str ldrb, strb ldrh, strh ldrsb, ldrsh
