
Main Memory and Address Translation

Main Memory and Address Translation. CS 170, Fall 2015, Tao Yang. Some slides from John Kubiatowicz, http://cs162.eecs.Berkeley.edu. What to Learn: program execution and address space; how to translate addresses; contiguous memory allocation; segmentation; address translation with paging.


Presentation Transcript


  1. Main Memory and Address Translation • CS 170, Fall 2015 • Tao Yang • Some slides from John Kubiatowicz, http://cs162.eecs.Berkeley.edu

  2. What to Learn • Program execution and address space • How to translate addresses • Contiguous Memory Allocation • Segmentation • Address Translation with Paging • TLB support and performance impact

  3. Recall: Single and Multithreaded Processes • Threads encapsulate concurrency • “Active” component of a process • Address spaces encapsulate protection • Keeps buggy program from trashing the system • “Passive” component of a process

  4. Process Control Block • What is needed as thread control block (TCB)? • Stack info • Registers and machine states • What is new in PCB? • Meta information for memory space management

  5. Objective of Memory Management: Run Programs • Get CPU cycle and allocate memory • Load instruction and data segments of executable file into memory • Create stack and heap • Set the starting address and execute • Provide services to it • [Figure: program source foo.c (int main() { … ; }) passes from editor to compiler to the executable a.out (instructions, data); load and execute places instructions, data, heap, stack, and OS in memory between 0x000… and 0xFFF…; the processor holds the PC and registers]

  6. Memory Management • All data and all instructions must be in memory in order to execute • Memory management activities • Keeping track of which parts of memory are currently being used and by whom • Deciding which processes (or parts thereof) and data to move into and out of memory • Allocating and deallocating memory space as needed

  7. Logical vs. Physical Address Space • Logical address: generated by the CPU; also referred to as virtual address • Physical address: address seen by the memory unit • Logical and physical addresses are the same in compile-time and load-time address-binding schemes • Logical (virtual) and physical addresses differ in the execution-time address-binding scheme

  8. Binding of Instructions and Data to Memory • Assume 4-byte words: 0x300 = 4 * 0x0C0 (0x0C0 = 0000 1100 0000, 0x300 = 0011 0000 0000) • Process view of memory paired with physical addresses: • 0x0300 00000020 (data1: dw 32) • … • 0x0900 8C2000C0 (start: lw r1,0(data1)) • 0x0904 0C000280 (jal checkit) • 0x0908 2021FFFF (loop: addi r1, r1, -1) • 0x090C 14200242 (bnz r1, loop) • … • 0x0A00 (checkit: …)

  9. Binding of Instructions and Data to Memory • Process view of memory: data1: dw 32 • … • start: lw r1,0(data1) • jal checkit • loop: addi r1, r1, -1 • bnz r1, loop … • checkit: … • Physical addresses: 0x0300 00000020 • … • 0x0900 8C2000C0 • 0x0904 0C000280 • 0x0908 2021FFFF • 0x090C 14200242 • … • 0x0A00 • [Figure: physical memory from 0x0000 to 0xFFFF, holding 00000020 at 0x0300 and the words 8C2000C0, 0C000340, 2021FFFF, 14200242 starting at 0x0900]

  10. Second copy of program from previous example • Process view of memory: data1: dw 32 • … • start: lw r1,0(data1) • jal checkit • loop: addi r1, r1, -1 • bnz r1, r0, loop … • checkit: … • Physical addresses: 0x300 00000020 • … • 0x900 8C2000C0 • 0x904 0C000280 • 0x908 2021FFFF • 0x90C 14200242 • … • 0x0A00 • [Figure: physical memory from 0x0000 to 0xFFFF with App X shown at 0x0300 to 0x0900 and a "?" for where the second copy should go] • Need address translation!

  11. Second copy of program from previous example • Process view of memory: data1: dw 32 • … • start: lw r1,0(data1) • jal checkit • loop: addi r1, r1, -1 • bnz r1, r0, loop … • checkit: … • Processor view of memory: 0x1300 00000020 • … • 0x1900 8C2004C0 • 0x1904 0C000680 • 0x1908 2021FFFF • 0x190C 14200642 • … • 0x1A00 • [Figure: physical memory from 0x0000 to 0xFFFF with App X shown at 0x0300 to 0x0900 and the second copy placed at 0x1300 (00000020) and 0x1900 (8C2004C0, 0C000680, 2021FFFF, 14200642)] • One of many possible translations! • Where does translation take place? Compile time, link/load time, or execution time?
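The relocated words on this slide can be reproduced by adding the load offset to each instruction's address field. The following is only a sketch under the slide's simplified encoding, which is an assumption rather than a statement about any real ISA manual: address fields count 4-byte words, lw (opcode 0x23) and bnz (opcode 0x05) carry a 16-bit field, and jal (opcode 0x03) carries a 26-bit field. The names `relocate` and `patch_field` are hypothetical.

```c
#include <stdint.h>

/* Add delta_words to the address field selected by mask, keeping other bits. */
static uint32_t patch_field(uint32_t insn, uint32_t mask, uint32_t delta_words)
{
    uint32_t field = (insn & mask) + delta_words;
    return (insn & ~mask) | (field & mask);
}

/*
 * Relocate one 32-bit instruction word by delta_bytes (a multiple of 4).
 * Assumption from the slide: address fields hold word addresses, so the
 * byte delta is divided by 4 before it is added to the field.
 */
uint32_t relocate(uint32_t insn, uint32_t delta_bytes)
{
    uint32_t delta_words = delta_bytes / 4u;

    switch (insn >> 26) {            /* opcode = top 6 bits */
    case 0x23u:                      /* lw: 16-bit address field */
    case 0x05u:                      /* bnz: 16-bit address field */
        return patch_field(insn, 0x0000FFFFu, delta_words);
    case 0x03u:                      /* jal: 26-bit address field */
        return patch_field(insn, 0x03FFFFFFu, delta_words);
    default:                         /* e.g. addi: no address to patch */
        return insn;
    }
}
```

Calling `relocate(0x8C2000C0, 0x1000)` yields 0x8C2004C0, matching the word shown at 0x1900; the addi word 2021FFFF is returned unchanged because it carries no address.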

  12. Binding of Instructions and Data to Memory • Binding can happen at three different stages • Compile time: If the memory location is known a priori, absolute code can be generated (example: gcc) • Load time: Relocatable code must be generated if the memory location is not known at compile time (example: Linux ld) • Execution time: Binding is delayed until run time if the process can be moved during its execution from one memory segment to another (example: dynamic library)

  13. Better Solution: Address Space Translation • Address Space: • All the addresses and state a process can touch • Each process and the kernel has a different address space • Program operates in an address space that is distinct from the physical memory space of the machine • [Figure: the processor issues a “virtual address” to a translator, which sends a “physical address” to memory spanning 0x000… to 0xFFF…]

  14. Address translation with MMU • [Figure: the CPU sends virtual addresses to the MMU, which sends physical addresses to memory; a separate path is labeled "untranslated read or write"] • Consequently, two views of memory: • View from the CPU (what program sees, virtual memory) • View from memory (physical memory) • Translation box (MMU) converts between the two views • Translation essential to implementing protection • If task A cannot even gain access to task B’s data, no way for A to adversely affect B • With translation, every program can be linked/loaded into same region of user address space

  15. Contiguous Memory Allocation • Main memory has two partitions: • Resident operating system, usually held in low memory • User processes then held in high memory • Hardware support and protection • Base register contains value of smallest physical address • Limit register contains range of logical addresses: each logical address must be less than the limit register • MMU maps logical address dynamically • [Figure: memory layout holding OS, process 5, process 8, process 2]

  16. Example of Memory Protection for Multiprogramming • During switch, kernel loads new base/limit from PCB (Process Control Block) • User not allowed to change base/limit registers

  17. Simple Base and Bounds (CRAY-1) • [Figure: the CPU issues a virtual address; it is compared against Limit (<?), with "No: Error!" on failure, and added (+) to Base to form the physical address sent to DRAM] • Could use base/limit for dynamic address translation (translation happens at execution): • Alter address of every load/store by adding “base” • Generate error if address bigger than limit • This gives program the illusion that it is running on its own dedicated machine, with memory starting at 0 • Program gets contiguous region of memory • Addresses within program do not have to be relocated when program placed in different region of DRAM
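The base-and-bounds check described on this slide can be sketched in a few lines of C. This is an illustrative model only: the struct and function names are hypothetical, real hardware performs the compare and add in parallel on every load and store, and the sketch follows the earlier slide's rule that a legal logical address must be strictly less than the limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-process register pair, loaded by the kernel on a switch. */
typedef struct {
    uint32_t base;   /* smallest physical address of the process */
    uint32_t limit;  /* size of the logical address range */
} base_bounds_t;

/*
 * Translate a virtual address. Returns false (the "Error!" path) when the
 * address is outside the limit; otherwise stores base + vaddr in *paddr.
 */
bool bb_translate(const base_bounds_t *regs, uint32_t vaddr, uint32_t *paddr)
{
    if (vaddr >= regs->limit)
        return false;
    *paddr = regs->base + vaddr;
    return true;
}
```

With base 0x1000 and limit 0x0A04, virtual address 0x0900 translates to 0x1900, while 0x0A04 is rejected. The program itself never sees the base, which is why it needs no relocation when moved.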

  18. Issues with Simple B&B Method • External Fragmentation problem (free gaps between used slots) • Not every process is the same size • Over time, memory space becomes fragmented • Missing support for sparse address space • Would like to have multiple chunks/program • E.g.: Code, Data, Stack • Hard to do inter-process sharing • Want to share code segments when possible • Want to share memory between processes • Helped by providing multiple segments per process • [Figure: four snapshots of memory above the OS as processes 2, 5, 6, 9, 10, and 11 come and go, leaving free gaps]

  19. Segmentation • Memory-management scheme that supports the user's semantic view of memory • A program is a collection of segments • A segment is a logical unit such as: • Code, data, stack • Others: memory-sharing • Each segment is given a region of contiguous memory • Has a base and limit • Can reside anywhere in physical memory

  20. Example of Segmentation

  21. Segmentation Hardware

  22. Implementation of Multi-Segment Model • [Figure: the virtual address is split into Seg # and Offset; Seg # selects one entry of a segment map (Base0 to Base7, Limit0 to Limit7, each entry marked V or N); the offset is compared against the limit (>) to raise an error; base + offset forms the physical address; a valid check raises an access error] • Segment map resides in processor • Segment number mapped into base/limit pair • Base added to offset to generate physical address • Error check catches offset out of range • As many chunks of physical memory as entries • Segment addressed by portion of virtual address • However, could be included in instruction instead: • x86 Example: mov [es:bx],ax • What is “V/N” (valid / not valid)? • Can mark segments as invalid; requires check as well

  23. Example: Four Segments (16 bit addresses) • Virtual address format: Seg in bits 15-14, Offset in bits 13-0 • [Figure: virtual address space mapped into physical address space; address marks shown are 0x0000, 0x4000, 0x4800, 0x5C00, 0x8000, 0xC000, 0xF000; labels: SegID = 0, SegID = 1, Might be shared, Space for Other Apps, Shared with Other Apps]
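A rough C sketch of the multi-segment lookup for this 16-bit format follows: the top two bits select a segment map entry, and the low 14 bits are the offset. The bit layout comes from the slide; everything else is an assumption for illustration, including the type and function names and whatever base/limit values a caller places in the map (the transcript does not preserve the slide's segment table).

```c
#include <stdbool.h>
#include <stdint.h>

#define SEG_SHIFT   14       /* offset occupies bits 13-0 */
#define OFFSET_MASK 0x3FFFu
#define NUM_SEGS    4        /* 2-bit segment ID */

typedef struct {
    uint16_t base;   /* physical start of the segment */
    uint16_t limit;  /* segment length in bytes */
    bool     valid;  /* the V/N bit */
} segment_t;

typedef enum { SEG_OK, SEG_INVALID, SEG_OUT_OF_RANGE } seg_status_t;

/* Translate vaddr through the segment map; *paddr is set only on SEG_OK. */
seg_status_t seg_translate(const segment_t map[NUM_SEGS],
                           uint16_t vaddr, uint16_t *paddr)
{
    unsigned seg    = vaddr >> SEG_SHIFT;
    uint16_t offset = vaddr & OFFSET_MASK;

    if (!map[seg].valid)
        return SEG_INVALID;          /* access error: segment marked N */
    if (offset >= map[seg].limit)
        return SEG_OUT_OF_RANGE;     /* offset beyond the segment's limit */
    *paddr = (uint16_t)(map[seg].base + offset);
    return SEG_OK;
}
```

For instance, with a hypothetical entry 1 of base 0x4800 and limit 0x1400, virtual address 0x4050 (segment 1, offset 0x50) maps to 0x4850.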

  24. Problems with Segmentation • Must fit variable-sized chunks into physical memory • May move processes multiple times to fit everything • Limited options for swapping to disk • Fragmentation: wasted space • External: free gaps between allocated chunks • Internal: don’t need all memory within allocated chunks

  25. Paging • Divide physical memory into fixed-sized blocks called frames (or physical pages) • (size is power of 2, between 1K bytes and 16K bytes) • Divide logical memory into blocks of same size called pages or logical pages • Translation conducted by page table which is kept in main memory • Page-table base register (PTBR) points to the page table • Page-table length register (PTLR) indicates size of the page table

  26. Paging Model of Logical and Physical Memory A page table per process is needed to translate logical to physical addresses

  27. Management of Free Pages • Can use vector of bits to represent availability of each page • Example: 00110001110001101 … 110010 (1 = allocated, 0 = free) • [Figure: free pages before allocation and after allocation]
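The bit-vector bookkeeping can be sketched as below, using the slide's convention that 1 means allocated and 0 means free. The frame count of 64 and all function names are assumptions chosen for illustration; a real kernel would typically scan a word at a time rather than bit by bit.

```c
#include <stdint.h>

#define NUM_FRAMES 64   /* hypothetical number of physical frames */

/* One bit per frame: 1 = allocated, 0 = free. Starts all free. */
static uint8_t frame_bitmap[NUM_FRAMES / 8];

int frame_is_allocated(int frame)
{
    return (frame_bitmap[frame / 8] >> (frame % 8)) & 1;
}

/* Find the lowest-numbered free frame, mark it allocated, and return it.
 * Returns -1 when every frame is in use. */
int frame_alloc(void)
{
    for (int f = 0; f < NUM_FRAMES; f++) {
        if (!frame_is_allocated(f)) {
            frame_bitmap[f / 8] |= (uint8_t)(1u << (f % 8));
            return f;
        }
    }
    return -1;
}

/* Mark a frame free again. */
void frame_free(int frame)
{
    frame_bitmap[frame / 8] &= (uint8_t)~(1u << (frame % 8));
}
```

Because any free frame can back any page, allocation never has to search for a large enough hole, which is what removes external fragmentation.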

  28. Should pages be as big as our previous segments? • Pages remove/reduce external fragmentation, but lead to internal fragmentation: allocated memory is larger than requested memory • Typically have small pages (1K-16K) • [Figure: memory divided into 32K pages; 32K allocated for process 1; 8 M unused]

  29. Address Translation Scheme • Address generated by CPU is divided into: • Page number (p) • Page offset (d) • Logical (virtual) address mapping • Offset from virtual address copied to physical address • Example: 10-bit offset → 1024-byte pages • Virtual page # is all remaining bits • Example for 32 bits: 32 - 10 = 22 bits, i.e. 4 million entries • Translated physical page # copied from page table into physical address • [Figure: address layout with page number p followed by page offset d]
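The split into page number and offset is plain shifting and masking. A minimal sketch for the slide's example of 32-bit addresses with a 10-bit offset follows; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define OFFSET_BITS 10u                          /* 10-bit offset */
#define PAGE_BYTES  (1u << OFFSET_BITS)          /* 1024-byte pages */
#define OFFSET_MASK (PAGE_BYTES - 1u)
#define NUM_VPAGES  (1u << (32u - OFFSET_BITS))  /* 2^22 page table entries */

/* Page number p: all bits above the offset. */
uint32_t page_number(uint32_t vaddr)
{
    return vaddr >> OFFSET_BITS;
}

/* Page offset d: the low OFFSET_BITS bits, copied unchanged on translation. */
uint32_t page_offset(uint32_t vaddr)
{
    return vaddr & OFFSET_MASK;
}
```

For address 0x1234 this gives page number 4 and offset 0x234, and `NUM_VPAGES` evaluates to 4194304, the "4 million entries" on the slide.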

  30. Address Translation for Paging • Logical address = logical page number + page offset (the two fields are concatenated, not arithmetically added) • Role of page table: logical page number → physical page number • Physical address = physical page number + page offset

  31. Memory Protection • Memory protection implemented by associating permission bits with each page • attached to each entry in the page table: • Permissions include: Valid bits, Read, Write, etc • “valid” indicates that the associated page is in the process’ logical address space, and is thus a legal page • “invalid” indicates that the page is not in the process’ logical address space

  32. Simple Page Table Example (4 byte pages) • Virtual memory: 0x00 (0000 0000) holds a b c d; 0x04 (0000 0100) holds e f g h; 0x08 (0000 1000) holds i j k l • Page table: entry 0 = 4, entry 1 = 3, entry 2 = 1 • Physical memory (marks at 0x00, 0x04, 0x08, 0x0C, 0x10): 0x04 (0000 0100) holds i j k l; 0x0C (0000 1100) holds e f g h; 0x10 (0001 0000) holds a b c d • Virtual 0x06 (0000 0110)? Physical 0x0E (0000 1110)! • Virtual 0x09 (0000 1001)? Physical 0x05 (0000 0101)!
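The lookups on this slide can be checked with a tiny C model of the three-entry page table. The table contents (4, 3, 1) and the 4-byte page size are taken from the slide; the function name and the 0xFF fault marker are assumptions made for the sketch.

```c
#include <stdint.h>

#define PAGE_BYTES        4u
#define NUM_PAGES         3u
#define TRANSLATION_FAULT 0xFFu   /* hypothetical marker for an unmapped page */

/* Virtual page number -> physical page number, as drawn on the slide. */
static const uint8_t page_table[NUM_PAGES] = { 4, 3, 1 };

/* Translate an 8-bit virtual address with 4-byte pages. */
uint8_t pt_translate(uint8_t vaddr)
{
    uint8_t vpn    = (uint8_t)(vaddr / PAGE_BYTES);
    uint8_t offset = (uint8_t)(vaddr % PAGE_BYTES);

    if (vpn >= NUM_PAGES)
        return TRANSLATION_FAULT;
    /* Physical page number replaces the page number; offset is copied. */
    return (uint8_t)(page_table[vpn] * PAGE_BYTES + offset);
}
```

Virtual 0x06 is page 1, offset 2, so it lands at 3 * 4 + 2 = 0x0E; virtual 0x09 is page 2, offset 1, so it lands at 1 * 4 + 1 = 0x05, as on the slide.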

  33. Performance of Address Translation • How many memory accesses for a memory load? 2 • 1 memory access for the page table • 1 memory access to obtain the final data

  34. TLB for Faster Address Translation • Every data/instruction access requires two memory accesses • One for the page table and one for the data/instruction • Slowness due to two memory accesses can be improved by using a special fast-lookup hardware cache called associative memory or translation look-aside buffer (TLB) • Address translation for logical page p • If p is in an associative register, get physical page # out • Otherwise get frame # from page table in memory • [Figure: TLB entries pair a logical page # with a physical page #; the address is split into p and d]
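The hit-or-miss path can be modeled in software as follows. This is a sketch under stated assumptions: a 4-entry TLB, a 16-entry page table, round-robin replacement, and all names are hypothetical. A hardware TLB compares all entries at once, so the loop is only a stand-in for that associative search.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 4    /* hypothetical, far smaller than a real TLB */
#define PT_ENTRIES  16   /* hypothetical page table size */

typedef struct {
    bool     valid;
    uint32_t vpn;   /* logical page # */
    uint32_t pfn;   /* physical page # */
} tlb_entry_t;

static tlb_entry_t   tlb[TLB_ENTRIES];
static uint32_t      page_table[PT_ENTRIES];  /* in-memory page table */
static unsigned      next_victim;             /* round-robin replacement */
static unsigned long tlb_hits, tlb_misses;

/* Return the physical page # for vpn (caller ensures vpn < PT_ENTRIES). */
uint32_t lookup_frame(uint32_t vpn)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            tlb_hits++;
            return tlb[i].pfn;        /* hit: no page table access */
        }
    }
    tlb_misses++;                     /* miss: read the page table ... */
    uint32_t pfn = page_table[vpn];
    tlb[next_victim] = (tlb_entry_t){ true, vpn, pfn };  /* ... and cache it */
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    return pfn;
}
```

The first lookup of a page pays for the page table read; repeated lookups of the same page are served from the TLB until the entry is replaced.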

  35. Paging Hardware With TLB • Typical TLB • Size: 8 - 4,096 entries • Access time: 0.5 - 1 clock cycle • Miss penalty: 10 - 100 clock cycles • Miss rate: 0.01 - 10%

  36. Memory access cost with TLB, L1, L2, & L3 • Step 1A: Look up the TLB • Step 1B: If missed, look up the page table • Step 2: Fetch data • [Figure: TLB, page table, and data, reached through the L1, L2, and L3 caches and memory]

  37. Cost of accessing memory (2003)

  38. Cost of accessing memory

  39. Example: Performance Characteristics of TLB • Effective (average) address translation cost = Cost(TLB lookup) + Cost(full translation) * TLB miss rate • Example: TLB hit takes 1 clock cycle, a miss takes 30 clock cycles to access a memory page table, and the miss rate is 1% • Average cost for translation: 1 + 30 * 0.01 = 1.30 clock cycles • Total memory access cost = address translation + memory access = 1.30 + 30 = 31.30 clock cycles (taking the memory access itself as 30 cycles) • 1 GHz CPU → 1 ns/cycle
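The arithmetic on this slide can be written as two small C helpers, which makes it easy to try other miss rates. The function names are hypothetical, and the 30-cycle memory access used for the total is inferred from the slide's 31.30 figure rather than stated there.

```c
/* Average translation cost in cycles:
 * cost(TLB lookup) + cost(full translation) * TLB miss rate. */
double translation_cost(double tlb_lookup, double full_translation,
                        double miss_rate)
{
    return tlb_lookup + full_translation * miss_rate;
}

/* Total cost of one memory access: translation plus the access itself. */
double total_access_cost(double translation, double memory_access)
{
    return translation + memory_access;
}
```

With a 1-cycle lookup, a 30-cycle full translation, and a 1% miss rate, `translation_cost` gives about 1.30 cycles; adding a 30-cycle memory access gives about 31.30 cycles, or roughly 31.3 ns at 1 ns per cycle.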

  40. Example: Performance Characteristics of TLB • Each page has 16 bytes, hosting 4 integers. Initially the TLB is empty. • What is the TLB miss rate of this code? int sum = 0; for (i = 3; i < 10; i++) sum += a[i]; • How about this? int sum = 0; for (i = 0; i < 100000; i++) sum += a[i];
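One way to work these questions is to count how many distinct pages the scan enters. The sketch below does that under several assumptions that the slide does not state: `a[0]` starts on a page boundary, integers are 4 bytes, only accesses to `a` are counted (`sum` and `i` are taken to live in registers), and the page currently being scanned stays in the TLB. The function name is hypothetical.

```c
#define PAGE_BYTES 16u   /* from the slide: 16-byte pages */
#define INT_BYTES  4u    /* 4 integers per page */

/*
 * Count TLB misses for: for (i = first; i < end; i++) sum += a[i];
 * A miss happens each time the scan moves onto a page it has not
 * touched yet, which for a forward scan is every page change.
 */
unsigned long sequential_scan_misses(unsigned long first, unsigned long end)
{
    unsigned long misses = 0;
    unsigned long current_page = 0;
    int have_page = 0;   /* TLB is initially empty */

    for (unsigned long i = first; i < end; i++) {
        unsigned long page = (i * INT_BYTES) / PAGE_BYTES;
        if (!have_page || page != current_page) {
            misses++;
            current_page = page;
            have_page = 1;
        }
    }
    return misses;
}
```

Under these assumptions the first loop touches a[3] on page 0, a[4] to a[7] on page 1, and a[8] to a[9] on page 2, so 3 of 7 accesses miss. The second loop misses once per 4 elements, 25,000 of 100,000 accesses, or 25%.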

  41. Example: Performance Characteristics of TLB • What is the TLB miss rate of this code? int b[1024][1024]; for (i = 0; i < 1024; i++) for (j = 0; j < 1024; j++) b[i][j] += 1; • Page contents: b[0][0], b[0][1], b[0][2], b[0][3] | b[0][4], b[0][5], b[0][6], b[0][7] • How about this? for (j = 0; j < 1024; j++) for (i = 0; i < 1024; i++) b[i][j] += 1;
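The two loop orders can be compared with a small TLB simulation. The sketch assumes things the slide leaves open: a 64-entry TLB with FIFO replacement, 4 integers per page as in the page grouping shown, `b` starting on a page boundary, and each `b[i][j] += 1` counted as a single access. All names are hypothetical, and the result for the column order depends on the TLB being smaller than the 1024 pages one column sweep touches.

```c
#include <stdint.h>

#define DIM           1024u
#define INTS_PER_PAGE 4u
#define TLB_ENTRIES   64u   /* hypothetical TLB size, FIFO replacement */

static uint32_t tlb_pages[TLB_ENTRIES];
static unsigned tlb_used, tlb_next;

/* Access one page through the TLB; returns 1 on a miss, 0 on a hit. */
static int tlb_access(uint32_t page)
{
    for (unsigned k = 0; k < tlb_used; k++)
        if (tlb_pages[k] == page)
            return 0;
    tlb_pages[tlb_next] = page;               /* replace the oldest entry */
    tlb_next = (tlb_next + 1) % TLB_ENTRIES;
    if (tlb_used < TLB_ENTRIES)
        tlb_used++;
    return 1;
}

/* Count misses for the i-outer (row_major != 0) or j-outer loop order. */
unsigned long count_misses(int row_major)
{
    unsigned long misses = 0;
    tlb_used = 0;                             /* start with an empty TLB */
    tlb_next = 0;

    for (uint32_t outer = 0; outer < DIM; outer++) {
        for (uint32_t inner = 0; inner < DIM; inner++) {
            uint32_t i = row_major ? outer : inner;
            uint32_t j = row_major ? inner : outer;
            /* C stores b row by row, so b[i][j] is element i*DIM + j. */
            misses += (unsigned long)tlb_access((i * DIM + j) / INTS_PER_PAGE);
        }
    }
    return misses;
}
```

Under these assumptions the i-outer order misses on 262,144 of 1,048,576 accesses (25%), while the j-outer order misses on every access, because consecutive accesses are 1024 integers apart and each page is evicted before it is reused.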

  42. Summary: Paging • Address format: page # (5 bits) followed by offset (3 bits) • Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1111 0000 to 1111 1111 • Page table (virtual page # → physical page #; all other entries are null): 00000 → 00010, 00001 → 00011, 00010 → 00100, 00011 → 00101, 01000 → 01010, 01001 → 01011, 01010 → 01100, 01011 → 01101, 10000 → 01110, 10001 → 01111, 10010 → 10000, 11110 → 11100, 11111 → 11101 • Physical memory view: code from 0001 0000, data from 0101 0000, heap from 0111 0000, stack from 1110 0000 to 1110 1111

  43. Summary: Paging • Address format: page # (5 bits) followed by offset (3 bits) • Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1111 0000 to 1111 1111 • Page table (virtual page # → physical page #; all other entries are null): 00000 → 00010, 00001 → 00011, 00010 → 00100, 00011 → 00101, 01000 → 01010, 01001 → 01011, 01010 → 01100, 01011 → 01101, 10000 → 01110, 10001 → 01111, 10010 → 10000, 11110 → 11100, 11111 → 11101 • Physical memory view: code from 0001 0000, data from 0101 0000, heap from 0111 0000, stack from 1110 0000 • What happens if stack grows to 1110 0000?

  44. Summary: Paging • Address format: page # (5 bits) followed by offset (3 bits) • Virtual memory view: code from 0000 0000, data from 0100 0000, heap from 1000 0000, stack from 1110 0000 to 1111 1111 • Page table (virtual page # → physical page #; all other entries are null): 00000 → 00010, 00001 → 00011, 00010 → 00100, 00011 → 00101, 01000 → 01010, 01001 → 01011, 01010 → 01100, 01011 → 01101, 10000 → 01110, 10001 → 01111, 10010 → 10000, 11100 → 10110, 11101 → 10111, 11110 → 11100, 11111 → 11101 • Physical memory view: code from 0001 0000, data from 0101 0000, heap from 0111 0000, stack from 1110 0000, plus the new stack pages 10110 and 10111 • Allocate new pages where room! • Challenge: Table size equal to # of pages in virtual memory!

  45. Summary • Memory is a resource that must be multiplexed • Controlled Overlap: only shared when appropriate • Translation: Change virtual addresses into physical addresses • Protection: Prevent unauthorized sharing of resources • Simple protection through segmentation • Base + Limit registers restrict memory accessible to user • Can be used to translate as well • Page tables • Memory divided into fixed-sized chunks of memory • Offset of virtual address same as physical address • Page sharing among processes is easy • TLB • Essential for speeding up address translation
