Memory Organization and Addressing CSCI 224 / ECE 317: Computer Architecture. Instructor: Prof. Jason Fritts. Slides adapted from Bryant & O’Hallaron’s slides. Data Representation in Memory. Memory organization within a process Virtual vs. Physical memory Fundamental Idea and Purpose
Memory Organization and Addressing • CSCI 224 / ECE 317: Computer Architecture • Instructor: Prof. Jason Fritts • Slides adapted from Bryant & O’Hallaron’s slides
Data Representation in Memory • Memory organization within a process • Virtual vs. Physical memory • Fundamental Idea and Purpose • Page Mapping • Address Translation • Per-Process Mapping and Protection • Memory addressing and ordering of multi-byte data • Addressing • Byte ordering • Arrays • Data structures • Ordering in arrays/structures vs. single multi-byte data elements
Recall: Basic Memory Organization • Byte-Addressable Memory • Conceptually a very large array, with a unique address for each byte (addresses run from 0x00…0 to 0xFF…F) • Processor width determines address range: • 32-bit processor has 2^32 unique addresses • 64-bit processor has 2^64 unique addresses • Where does a given process reside in memory? • Depends upon the perspective… • virtual memory: process can use almost any virtual address • physical memory: location controlled by OS
Virtual Address Space for IA32 (x86) Linux • All processes have the same uniform view of memory • Stack • Runtime stack (8 MB limit) • E.g., local variables • Heap • Dynamically allocated storage • Allocated by calls to malloc(), calloc(), new • Data • Statically allocated data • E.g., global variables, arrays, structures, etc. • Text • Executable machine instructions • Read-only data
[Figure, not drawn to scale: address space from 0x00000000 to 0xFFFFFFFF, with Text starting at 0x08000000, then Data and Heap above it, and the Stack (8 MB) at the top]
Memory Allocation Example • Where does everything go?
  #include <stdlib.h>              /* for malloc() */

  char big_array[1<<24];           /* 16 MB */
  char huge_array[1<<28];          /* 256 MB */
  int beyond;
  char *p1, *p2, *p3, *p4;

  int useless() { return 0; }

  int main()
  {
      p1 = malloc(1 << 28);        /* 256 MB */
      p2 = malloc(1 << 8);         /* 256 B */
      p3 = malloc(1 << 28);        /* 256 MB */
      p4 = malloc(1 << 8);         /* 256 B */
      /* Some print statements ... */
  }
[Figure, not drawn to scale: address space from 0x00…0 to 0xFF…F, with Text starting at 0x08…0, then Data and Heap above it, and the Stack at the top]
Addresses in IA32 (x86) • Address range ~2^32 • malloc() is dynamically linked; its address is determined at runtime
  $esp          0xffffbcd0
  p3            0x65586008
  p1            0x55585008
  p4            0x1904a110
  p2            0x1904a008
  &p2           0x18049760
  &beyond       0x08049744
  big_array     0x18049780
  huge_array    0x08049760
  main()        0x080483c6
  useless()     0x08049744
[Figure, not drawn to scale: address space from 0x00000000 to 0xFFFFFFFF, with Text starting at 0x08000000, then Data and Heap, a marker at 0x80000000, and the Stack at the top]
Addresses in x86-64 • Address range ~2^47 • malloc() is dynamically linked; its address is determined at runtime
  $rsp          0x00007ffffff8d1f8
  p3            0x00002aaabaadd010
  p1            0x00002aaaaaadc010
  p4            0x0000000011501120
  p2            0x0000000011501010
  &p2           0x0000000010500a60
  &beyond       0x0000000000500a44
  big_array     0x0000000010500a80
  huge_array    0x0000000000500a50
  main()        0x0000000000400510
  useless()     0x0000000000400500
[Figure, not drawn to scale: address space from 0x000000…0 to 0x00007F…F, with Text, Data, and Heap at the bottom, a marker at 0x000030…0, and the Stack at the top]
Detailed Virtual Address Space for a Linux Process (figure, listed from highest address to lowest)
  Kernel virtual memory: kernel code and data
  Process virtual memory:
    User stack (top of stack pointed to by %esp)
    Shared libraries
    Runtime heap (malloc(), etc.), with brk marking its top
    Uninitialized data (.bss)
    Initialized data (.data)
    Read-only data (.rodata)
    Program code (.init, .text), starting at 0x08048000 (32-bit) or 0x00400000 (64-bit)
  Address 0
Contrast: System Using Physical Addressing • Used in “simple” systems such as the embedded microcontrollers in cars, elevators, and digital picture frames
[Figure: the CPU sends a physical address (PA), here 4, directly to main memory, whose bytes are addressed 0 to M-1; memory returns the data word]
Contrast: System Using Virtual Addressing • Used in all modern servers, desktops, and laptops • One of the great ideas in computer science
[Figure: on the CPU chip, the CPU issues a virtual address (VA), here 4100; the MMU translates it to a physical address (PA), here 4; main memory, addressed 0 to M-1, returns the data word]
Address Spaces • Linear address space: ordered set of contiguous non-negative integer addresses: {0, 1, 2, 3, …} • Virtual address space: set of N = 2^n virtual addresses {0, 1, 2, 3, …, N-1} • Physical address space: set of M = 2^m physical addresses {0, 1, 2, 3, …, M-1} • Clean distinction between data (bytes) and their attributes (addresses) • Each object can now have multiple addresses • Every byte in main memory: one physical address; one (or more) virtual addresses
Why Virtual Memory (VM)? • Uses main memory efficiently • Use DRAM as a cache for the parts of a virtual address space • Simplifies memory management • Each process gets the same uniform linear address space • Isolates address spaces • One process can’t interfere with another’s memory • User program cannot access privileged kernel information
VM as a Tool for Caching • Virtual memory: an array of N contiguous bytes stored on disk • The contents of the array on disk are cached in physical memory (DRAM cache) • These cache blocks are called pages (size is P = 2^p bytes)
[Figure: virtual memory holds virtual pages VP 0 to VP 2^(n-p) - 1 (addresses 0 to N-1), stored on disk, each unallocated, cached, or uncached; physical memory holds physical pages PP 0 to PP 2^(m-p) - 1 (addresses 0 to M-1), cached in DRAM, some of them empty]
DRAM as a Cache for Disk • Disk has an enormous miss penalty • DRAM is only about 10x slower than SRAM • Disk is much slower than DRAM: about 10,000x slower • Consequences • Highly sophisticated algorithms are used to organize DRAM effectively (by contrast, cache memory uses relatively simple mechanisms) • Large page (block) size: typically 4-8 KB, sometimes 4 MB • Fully associative • any VP can be placed in any PP
Page Tables • A page table is an array of page table entries (PTEs) that maps virtual pages to physical pages • Per-process kernel data structure in DRAM
[Figure: a memory-resident page table (DRAM) with entries PTE 0 to PTE 7, each holding a valid bit and either a physical page number or a disk address (null if unallocated); physical memory (DRAM) holds VP 1, VP 2, VP 7, and VP 4 in PP 0 to PP 3; virtual memory (disk) holds VP 1, VP 2, VP 3, VP 4, VP 6, and VP 7]
Page Hit • Page hit: reference to a VM word that is in physical memory (DRAM cache hit)
[Figure: same page table and memory contents as on the Page Tables slide; the virtual address selects a PTE whose valid bit is 1]
Page Fault • Page fault: reference to a VM word that is not in physical memory (DRAM miss)
[Figure: same page table and memory contents as on the Page Tables slide; the virtual address selects a PTE whose valid bit is 0]
Handling Page Fault • Page miss causes page fault (an exception) • Page fault handler selects a victim to be evicted (here VP 4) • The requested page (here VP 3) is brought in from disk into the freed physical page (PP 3), and the page table is updated • Offending instruction is restarted: page hit!
[Figure: same page table as on the Page Tables slide, before and after the fault; afterwards PP 3 holds VP 3 instead of VP 4, the PTE for VP 3 is valid, and the PTE for VP 4 is not]
VM Address Translation • Virtual Address Space • V = {0, 1, …, N–1} • Physical Address Space • P = {0, 1, …, M–1} • Address Translation • map: V → P ∪ {∅} • For virtual address A: • map(A) = A’ if the data at virtual address A is stored at physical address A’ in P • map(A) = ∅ if the data at virtual address A is not in physical memory • Either invalid or stored on disk
Simple Memory System Example • Addressing • 14-bit virtual addresses • 12-bit physical addresses • Page size = 64 bytes = 2^6 (6-bit page offset) • Virtual address: bits 13-6 are the virtual page number (VPN), bits 5-0 the virtual page offset (VPO) • Physical address: bits 11-6 are the physical page number (PPN), bits 5-0 the physical page offset (PPO)
Simple Memory System Page Table • Only the first 16 entries (out of 256) are shown
  VPN  PPN  Valid      VPN  PPN  Valid
  00   28   1          08   13   1
  01   –    0          09   17   1
  02   33   1          0A   09   1
  03   02   1          0B   –    0
  04   –    0          0C   –    0
  05   16   1          0D   2D   1
  06   –    0          0E   11   1
  07   –    0          0F   0D   1
Address Translation Example #1 • Virtual address: 0x0020 = 00000000 100000 (VPN | VPO) • VPN: 0x00 • Page table: Valid? Y • PPN: 0x28 • Location: main memory • Physical address: 101000 100000 (PPN | PPO) = 0xA20
Address Translation Example #2 • Virtual address: 0x0256 = 00001001 010110 (VPN | VPO) • VPN: 0x09 • Page table: Valid? Y • PPN: 0x17 • Location: main memory • Physical address: 010111 010110 (PPN | PPO) = 0x5D6
VM as a Tool for Memory Management • Key idea: each process has its own virtual address space • viewed as a simple linear array • mapping function scatters addresses through physical memory • Can share code and data among processes • Map virtual pages to the same physical page (here: PP 6)
[Figure: address translation maps VP 1 and VP 2 in the virtual address spaces of Process 1 and Process 2 (addresses 0 to N-1) onto PP 2, PP 6, and PP 8 of the physical address space (DRAM, addresses 0 to M-1); PP 6 is shared by both processes, e.g., read-only library code]
VM as a Tool for Memory Protection • Extend PTEs with permission bits • Page fault handler checks these before remapping • If violated, send process SIGSEGV (segmentation fault)
  Process i:   SUP   READ   WRITE   Address
  VP 0:        No    Yes    No      PP 6
  VP 1:        No    Yes    Yes     PP 4
  VP 2:        Yes   Yes    Yes     PP 2

  Process j:   SUP   READ   WRITE   Address
  VP 0:        No    Yes    No      PP 9
  VP 1:        Yes   Yes    Yes     PP 6
  VP 2:        No    Yes    Yes     PP 11
[Figure: the physical address space contains PP 2, PP 4, PP 6, PP 8, PP 9, and PP 11]
Address of Multi-byte Data • Every byte has a unique address • So, if data spans multiple bytes, what is its address? • Data is always addressed by its lowest address • i.e., the address of its first byte in memory
[Figure: the bytes at addresses 0000 to 000F shown as 8-bit, 16-bit, 32-bit, and 64-bit data; each datum is labeled with the address of its first byte, e.g., 32-bit data at 0000, 0004, 0008, and 000C, and 64-bit data at 0000 and 0008]
Address of Multi-byte Data • Alignment • Data elements are aligned by size • for a primitive (single datum) of K bytes, the address must be a multiple of K • chars, booleans at any address • shorts at even addresses • ints, floats, pointers (32-bit) at every 4th address • doubles at every 8th address • etc. • Arrays, structures, and classes • alignment determined by size of largest primitive (single datum)
[Figure: same 8-bit / 16-bit / 32-bit / 64-bit address diagram as on the previous slide]
Byte Ordering • How should bytes within a multi-byte word be ordered in memory? • Affects only primitive data elements with multiple bytes • i.e. a single data element composed of multiple bytes • short, int, long, float, double, boolean, … • does not affect arrays, structs, or classes • Conventions • Big Endian: Sun, PPC Mac, Internet • Least significant byte has highest address • Little Endian: x86 • Least significant byte has lowest address
Byte Ordering Example • Big Endian • Least significant byte has highest address • Little Endian • Least significant byte has lowest address • Example • Variable m has 4-byte representation 0x01234567 • Address given by &m is 0x100
  Address          0x100   0x101   0x102   0x103
  Big Endian       01      23      45      67
  Little Endian    67      45      23      01
Representing Integers • Signed integer (two’s complement) representation • Decimal: 15213 • Binary: 0011 1011 0110 1101 • Hex: 3B6D • Bytes below are listed from lowest to highest address
  int A = 15213;         IA32, x86-64: 6D 3B 00 00      Sun: 00 00 3B 6D
  int B = -15213;        IA32, x86-64: 93 C4 FF FF      Sun: FF FF C4 93
  long int C = 15213;    IA32: 6D 3B 00 00      x86-64: 6D 3B 00 00 00 00 00 00      Sun: 00 00 3B 6D
Basic Data Types • Integral • Stored & operated on in general (integer) registers • Signed vs. unsigned depends on instructions used
  Intel          ASM   Bytes      C
  byte           b     1          [unsigned] char
  word           w     2          [unsigned] short
  double word    l     4          [unsigned] int
  quad word      q     8          [unsigned] long int (x86-64)
• Floating Point • Stored & operated on in floating point registers
  Intel          ASM   Bytes      C
  Single         s     4          float
  Double         l     8          double
  Extended       t     10/12/16   long double
Array Allocation • Basic Principle: T A[L]; • Array of data type T and length L • Contiguously allocated region of L * sizeof(T) bytes
  char string[12];    elements from x to x + 12, one per byte
  int val[5];         x, x + 4, x + 8, x + 12, x + 16 (ends at x + 20)
  double a[3];        x, x + 8, x + 16 (ends at x + 24)
  char *p[3];         IA32:   x, x + 4, x + 8 (ends at x + 12)
                      x86-64: x, x + 8, x + 16 (ends at x + 24)
Array Access • Basic Principle: T A[L]; • Array of data type T and length L • Identifier A can be used as a pointer to array element 0: type T*
  int val[5];   holds 1, 5, 2, 1, 3 at x, x + 4, x + 8, x + 12, x + 16 (ends at x + 20)

  Reference    Type     Value
  val[4]       int      3
  val          int *    x
  val+1        int *    x + 4
  &val[2]      int *    x + 8
  val[5]       int      ??
  *(val+1)     int      5
  val + i      int *    x + 4i
Array Example
  #define ZLEN 5
  typedef int zip_dig[ZLEN];

  zip_dig cmu = { 1, 5, 2, 1, 3 };
  zip_dig mit = { 0, 2, 1, 3, 9 };
  zip_dig ucb = { 9, 4, 7, 2, 0 };
• Declaration “zip_dig cmu” equivalent to “int cmu[5]” • Example arrays were allocated in successive 20 byte blocks • Not guaranteed to happen in general
[Figure: cmu occupies addresses 16 to 36, mit 36 to 56, and ucb 56 to 76, with one element every 4 bytes]
Array Accessing Example
  int get_digit(zip_dig z, int dig)
  {
      return z[dig];
  }
IA32:
  # %edx = z
  # %eax = dig
  movl (%edx,%eax,4),%eax    # z[dig]
• Register %edx contains starting address of array • Register %eax contains array index • Desired digit at 4*%eax + %edx • Use memory reference (%edx,%eax,4)
[Figure: zip_dig cmu holds 1, 5, 2, 1, 3 at addresses 16, 20, 24, 28, 32 (ends at 36)]
Array Loop Example (IA32)
  void zincr(zip_dig z)
  {
      int i;
      for (i = 0; i < ZLEN; i++)
          z[i]++;
  }

  # edx = z
    movl $0, %eax               # %eax = i
  .L4:                          # loop:
    addl $1, (%edx,%eax,4)      # z[i]++
    addl $1, %eax               # i++
    cmpl $5, %eax               # i:5
    jne  .L4                    # if !=, goto loop
Pointer Loop Example (IA32)
  /* ISIZE is not defined on the slide; the assembly below implies
     4, i.e. sizeof(int) on IA32. Arithmetic on void * is a GCC extension. */
  void zincr_v(zip_dig z)
  {
      void *vz = z;
      int i = 0;
      do {
          (*((int *) (vz+i)))++;
          i += ISIZE;
      } while (i != ISIZE*ZLEN);
  }

  void zincr_p(zip_dig z)
  {
      int *zend = z+ZLEN;
      do {
          (*z)++;
          z++;
      } while (z != zend);
  }

  # edx = z = vz
    movl $0, %eax               # i = 0
  .L8:                          # loop:
    addl $1, (%edx,%eax)        # Increment vz+i
    addl $4, %eax               # i += 4
    cmpl $20, %eax              # Compare i:20
    jne  .L8                    # if !=, goto loop
Nested Array Example
  #define PCOUNT 4
  zip_dig pgh[PCOUNT] =
      {{1, 5, 2, 0, 6},
       {1, 5, 2, 1, 3},
       {1, 5, 2, 1, 7},
       {1, 5, 2, 2, 1}};
• “zip_dig pgh[4]” equivalent to “int pgh[4][5]” • Variable pgh: array of 4 elements, allocated contiguously • Each element is an array of 5 ints, allocated contiguously • “Row-Major” ordering of all elements guaranteed
[Figure: zip_dig pgh[4] with rows starting at addresses 76, 96, 116, and 136 (ends at 156)]
Multidimensional (Nested) Arrays • Declaration: T A[R][C]; • 2D array of data type T • R rows, C columns • Type T element requires K bytes • Array Size • R * C * K bytes • Arrangement • Row-Major Ordering
[Figure: int A[R][C] laid out in memory as row 0 (A[0][0] … A[0][C-1]), then row 1 (A[1][0] … A[1][C-1]), and so on through row R-1 (A[R-1][0] … A[R-1][C-1]), for a total of 4*R*C bytes]