This article explores the benefits and challenges of using segmentation in operating systems, including advantages like sharing and flexible protection, as well as disadvantages such as dynamic allocation overhead and external fragmentation. It also discusses the implementation of segmentation in hardware and how it can be combined with paging for efficient memory management.
W4118 Operating Systems Instructor: Junfeng Yang
Logistics • Homework 4 deadline extended to 3:09pm 3/31
Last Lecture: Paging • Disadvantages of contiguous allocation • External fragmentation • Wasteful allocation of unused memory • No sharing • Paging: divide memory into fixed-sized pages • Each address is split into page number and page offset • Page table maps virtual page number to physical page number
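The page-number/offset split above can be sketched in C. This assumes 4 KB pages (a 12-bit offset) and a flat single-level table; the constants and the toy `translate` helper are illustrative, not any particular architecture's layout:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                      /* 4 KB pages: 12-bit offset */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1)

/* Split a virtual address into virtual page number and page offset. */
static inline uint32_t vpn(uint32_t va)    { return va >> PAGE_SHIFT; }
static inline uint32_t offset(uint32_t va) { return va & PAGE_MASK; }

/* Translate: look up the physical frame for the VPN, keep the offset. */
static inline uint32_t translate(const uint32_t *page_table, uint32_t va)
{
    return (page_table[vpn(va)] << PAGE_SHIFT) | offset(va);
}
```

Note that translation only rewrites the page-number bits; the offset passes through unchanged, which is why pages must be a power of two in size.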
Last lecture: Paging Advantages • No external fragmentation • Don’t need to allocate unused memory • Fine-grained sharing • Two page table entries from two processes can point to the same physical page • Easy to swap out to disk (later this lecture) • Efficient allocation and free • Allocation: since fixed size, no search necessary • Free: insert page into free list
Last lecture: Paging Disadvantages • Internal fragmentation • Page tables can be large • Techniques to reduce memory overhead • Multi-level page tables • Hashed page tables • Inverted page tables • Inefficiency: two memory accesses for each CPU memory access • Translation lookaside buffer (TLB): exploit temporal and spatial locality to reduce the number of memory accesses • Page size? • Too small? • Too large?
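A two-level page table walk, one of the overhead-reduction techniques listed above, can be sketched as follows. The 10/10/12 split mirrors 32-bit x86, but the table layout here is a toy: only populated directory slots consume memory, which is how multi-level tables save space for sparse address spaces:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy two-level walk: 10-bit directory index, 10-bit table index,
 * 12-bit offset.  An absent second-level table means the address is
 * unmapped and would fault. */
static uint32_t walk(uint32_t **page_dir, uint32_t va)
{
    uint32_t dir_idx = va >> 22;
    uint32_t tbl_idx = (va >> 12) & 0x3FF;
    uint32_t off     = va & 0xFFF;
    uint32_t *table  = page_dir[dir_idx];
    if (table == NULL)
        return (uint32_t)-1;               /* unmapped: would page-fault */
    return (table[tbl_idx] << 12) | off;   /* frame number + offset */
}
```

The walk costs two memory reads per translation, which is exactly the cost the TLB exists to hide.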
Today Segmentation Virtual memory
Segmentation Divide address space into logical segments Each logical segment can be part of physical memory Separate base and limit for each segment (+ protection bits) How to specify a segment? Use part of logical address (similar to how to select a page) Top bits specify segment Low bits specify offset within segment Implicitly by type of memory reference Code vs. data segment Special registers
Logical View of Segmentation [Figure: segments 1-4 of the user address space mapped to non-contiguous regions of physical memory]
Segmentation Architecture Logical address consists of a two-tuple: <segment-number, offset> Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has: base – contains the starting physical address where the segment resides in memory limit – specifies the length of the segment
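The base/limit lookup above can be sketched in C; the struct layout and the -1 fault convention are illustrative, not any real MMU's format:

```c
#include <assert.h>
#include <stdint.h>

struct segment { uint32_t base, limit; };  /* one segment-table entry */

/* Translate <seg#, offset> to a physical address; returns -1 to signal
 * an addressing error (offset beyond the segment's limit). */
static int64_t seg_translate(const struct segment *segtab,
                             uint32_t seg, uint32_t off)
{
    if (off >= segtab[seg].limit)
        return -1;                         /* trap: addressing error */
    return (int64_t)segtab[seg].base + off;
}
```

Unlike paging, the add of `base` means segments can start anywhere, which is precisely what makes their allocation a dynamic storage-allocation problem.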
Segmentation Architecture (Cont.) Protection With each entry in segment table associate: validation bit (= 0 means illegal segment) read/write/execute privileges Protection bits associated with segments; code sharing occurs at segment level Since segments vary in length, memory allocation is a dynamic storage-allocation problem A segmentation example is shown in the following diagram
Segmentation Advantages • Advantages • Sharing of segments • Easier to relocate segment than entire program • Avoids allocating unused memory • Flexible protection • Efficient translation • Segment table is small, fits in MMU • Disadvantages • Segments have variable lengths, so dynamic allocation overhead (Best fit? First fit?) • External fragmentation: wasted memory • Segments can be large
Combine Paging and Segmentation Structure Segments: logical units in program, such as code, data, and stack Size varies; can be large Each segment contains one or more pages Pages have fixed size Two levels of mapping to reduce page table size Page table for each segment Base and limit for each page table Similar to multi-level page table Logical address divided into three portions: seg # | page # | offset
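The three-portion address split can be sketched with masks and shifts. The widths here (4-bit segment, 8-bit page, 12-bit offset) are a toy choice for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Toy split for combined segmentation + paging:
 * | seg# (top bits) | page# (8 bits) | offset (12 bits) | */
#define OFF_BITS  12
#define PAGE_BITS 8

static uint32_t seg_of(uint32_t va)  { return va >> (OFF_BITS + PAGE_BITS); }
static uint32_t page_of(uint32_t va) { return (va >> OFF_BITS) & ((1u << PAGE_BITS) - 1); }
static uint32_t off_of(uint32_t va)  { return va & ((1u << OFF_BITS) - 1); }
```

The segment number selects a per-segment page table (checked against that table's limit), and the page number indexes into it, just as in a multi-level page table.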
Example: 80x86 Supports both segmentation and segmentation with paging CPU generates logical address Given to segmentation unit Which produces linear addresses Linear address given to paging unit Which generates physical address in main memory Paging units form equivalent of MMU
80x86 Segment Selector Logical address: segment selector + offset Segment selector stored in segment registers (16-bit) cs: code segment selector ss: stack segment selector ds: data segment selector es, fs, gs Segment register can be implicitly or explicitly specified mov 0x8049780, %eax // implicitly uses ds Logical address: ds : 0x8049780 mov %ss:0x8049780, %eax // explicitly uses ss Logical address: ss : 0x8049780
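A 16-bit x86 segment selector is itself structured: a 13-bit descriptor-table index, a table indicator (0 = GDT, 1 = LDT), and a requested privilege level in the low two bits. A small sketch of the decomposition (the example value 0x23 is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Decompose an x86 segment selector:
 * bits 15..3 = descriptor-table index, bit 2 = table indicator,
 * bits 1..0 = requested privilege level (RPL). */
static unsigned sel_index(uint16_t sel) { return sel >> 3; }
static unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 1; }
static unsigned sel_rpl(uint16_t sel)   { return sel & 3; }
```

For example, selector 0x23 names GDT entry 4 at privilege level 3 (user mode).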
80x86 Segmentation Unit [Figure: the selector in ds indexes a descriptor table in memory; the fetched segment descriptor is combined with the offset from an instruction such as mov 0x8049780, %eax] Two memory references for one load! How to optimize?
80x86 Paging Unit 4MB pages supported starting with the Pentium
Today Segmentation Virtual memory
Motivation Previous approach to memory management Must completely load user process in memory One process with a large address space, or many processes with a large combined address space, can run out of memory Observation: locality of reference Temporal locality: access memory locations accessed just now Spatial locality: access memory locations adjacent to locations just accessed Programs spend majority of time in small piece of code 90% of time in 10% of code (Knuth’s estimate) Thus, processes only need a small amount of their address space at any moment
Virtual Memory Idea OS and hardware produce illusion of a disk as fast as main memory Process can run without all its pages loaded in memory Keep referenced pages in main memory Keep unreferenced pages on slower, cheaper backing store (disk)
Memory Hierarchy Levels of memory in computer system (size grows while speed and cost per byte drop as you go down): registers (< 1 cycle) → cache (a few cycles) → memory (< 100 ns) → disk (a few ms)
Virtual Address Space Virtual address maps to one of three locations Physical memory: small, fast, expensive Disk: large, slow, cheap Nothing
Virtual Memory Operation What happens when we reference a page in backing store? Recognize location of page Choose a free page Bring page from disk into memory Above steps need hardware and software cooperation How to detect if a page is in memory? Extend page table entries with present bits Page fault: if the bit is cleared, referencing the page results in a trap into the OS
Handling a Page Fault OS selects a free page OS brings faulting page from disk into memory Page table is updated, present bit is set Process continues execution
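The present-bit check and fault-handling steps above can be sketched together. This is a toy model: `pick_free_frame` and `read_from_disk` are hypothetical stand-ins for the real OS machinery, and the PTE format (frame number plus a present bit in bit 0) is illustrative:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PRESENT 0x1

static uint32_t pte[8];                    /* toy page table */
static int fault_count;

/* Hypothetical helpers standing in for the real OS machinery. */
static uint32_t pick_free_frame(void) { static uint32_t next = 1; return next++; }
static void read_from_disk(uint32_t vpn, uint32_t frame) { (void)vpn; (void)frame; }

/* On access: if the present bit is clear, 'trap' into the fault handler,
 * bring the page in, set the present bit, and retry.  Returns the frame. */
static uint32_t access_page(uint32_t vpn)
{
    if (!(pte[vpn] & PRESENT)) {           /* page fault */
        fault_count++;
        uint32_t frame = pick_free_frame();
        read_from_disk(vpn, frame);        /* process blocks until I/O done */
        pte[vpn] = (frame << 1) | PRESENT; /* update PTE, set present bit */
    }
    return pte[vpn] >> 1;                  /* frame number */
}
```

After the handler runs, the faulting instruction is re-executed and now hits in the page table, which is why the second access below faults no further.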
Continuing Process Continuing process is tricky Page fault may have occurred in middle of instruction Want page fault to be transparent to user processes Options Skip faulting instruction? Restart instruction from beginning? What about instruction like: mov ++(sp), R2 Requires hardware support to restart instructions
OS Decisions Page selection When to bring pages from disk to memory? Page replacement When no free pages available, must select victim page in memory and throw it out to disk
Page Selection Algorithms Demand paging: load page on page fault Start up process with no pages loaded Wait until a page absolutely must be in memory Request paging: user specifies which pages are needed Users do not always know best Prepaging: load page before it is referenced When one page is referenced, bring in next one Does not work well for all workloads Difficult to predict future
Page Replacement Algorithms Optimal: throw out page that won’t be used for longest time in future Best algorithm if we can predict future Good for comparison, but not practical Random: throw out a random page Easy to implement Works surprisingly well FIFO: throw out page that was loaded in first Fair: all pages receive equal residency LRU: throw out page that hasn’t been used in longest time Past predicts future With locality: approximates Optimal
Page Replacement Algorithms Want lowest page-fault rate Evaluate algorithm by running it on a particular string of memory references (reference string) and computing the number of page faults on that string In all our examples, the reference string is 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Optimal Algorithm Replace page that will not be used for longest period of time 4 frames example: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 → 6 page faults How do you know this? Used for measuring how well your algorithm performs
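The optimal (Belady) policy can be simulated directly: on a fault with no free frame, evict the resident page whose next use lies farthest in the future. A minimal sketch (frame count capped at 16 for simplicity):

```c
#include <assert.h>

/* Count page faults under the optimal replacement policy. */
static int opt_faults(const int *refs, int n, int nframes)
{
    int frames[16], used = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) hit = 1;
        if (hit) continue;
        faults++;
        if (used < nframes) { frames[used++] = refs[i]; continue; }
        /* victim: resident page referenced farthest in the future */
        int victim = 0, farthest = -1;
        for (int j = 0; j < used; j++) {
            int next = n;                  /* never used again */
            for (int k = i + 1; k < n; k++)
                if (refs[k] == frames[j]) { next = k; break; }
            if (next > farthest) { farthest = next; victim = j; }
        }
        frames[victim] = refs[i];
    }
    return faults;
}
```

Running it on the lecture's reference string with 4 frames confirms the 6 faults; the look-ahead over future references is exactly what makes the algorithm unimplementable online.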
First-In-First-Out (FIFO) Algorithm Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 3 frames (3 pages can be in memory at a time per process): 9 page faults 4 frames: 10 page faults Belady’s Anomaly: more frames can mean more page faults
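FIFO is simple to simulate because insertion order equals replacement order, so a circular pointer to the oldest frame suffices. A minimal sketch (frame count capped at 16):

```c
#include <assert.h>

/* Count page faults under FIFO: evict the page resident longest. */
static int fifo_faults(const int *refs, int n, int nframes)
{
    int frames[16], used = 0, oldest = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) hit = 1;
        if (hit) continue;
        faults++;
        if (used < nframes) {
            frames[used++] = refs[i];
        } else {
            frames[oldest] = refs[i];      /* replace oldest resident page */
            oldest = (oldest + 1) % nframes;
        }
    }
    return faults;
}
```

On the lecture's reference string this reproduces Belady's anomaly: 9 faults with 3 frames but 10 faults with 4.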
FIFO Illustrating Belady’s Anomaly • Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Least Recently Used (LRU) Algorithm Reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (8 page faults with 4 frames) Counter implementation: Every page entry has a counter; every time page is referenced through this entry, copy the clock time into the counter When a page needs to be changed, look at the counters to determine which to change Problem: have to search all pages/counters!
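The counter implementation can be simulated with the loop index as the logical clock: stamp each resident page on every reference, and on a fault do the linear search for the smallest (oldest) stamp that the slide warns about. A minimal sketch:

```c
#include <assert.h>

/* Count page faults under counter-based LRU. */
static int lru_faults(const int *refs, int n, int nframes)
{
    int frames[16], stamp[16], used = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int found = -1;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) found = j;
        if (found >= 0) { stamp[found] = i; continue; }    /* hit: restamp */
        faults++;
        int slot;
        if (used < nframes) {
            slot = used++;
        } else {                           /* linear search for oldest stamp */
            slot = 0;
            for (int j = 1; j < used; j++)
                if (stamp[j] < stamp[slot]) slot = j;
        }
        frames[slot] = refs[i];
        stamp[slot] = i;
    }
    return faults;
}
```

With 4 frames on the lecture's reference string this gives 8 faults, between Optimal's 6 and FIFO's 10.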
Implementing LRU: Stack Stack implementation – keep a stack of page numbers in doubly linked form: Page referenced: move it to the top (requires 6 pointers to be changed) No search for replacement: bottom entry is by definition least recently used
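The move-to-top operation on the doubly linked stack can be sketched as follows; in the general (middle-node) case it touches the classic six pointers: the neighbors' two links, the moved node's two links, the old top's back link, and the head:

```c
#include <assert.h>
#include <stddef.h>

struct node { int page; struct node *prev, *next; };

/* Move a referenced node to the top of the doubly linked stack.
 * The bottom (tail) entry is then always the LRU victim, so
 * replacement needs no search. */
static void move_to_top(struct node **head, struct node **tail, struct node *n)
{
    if (*head == n) return;
    /* unlink n from its current position */
    n->prev->next = n->next;
    if (n->next) n->next->prev = n->prev;
    else *tail = n->prev;
    /* relink n at the top */
    n->prev = NULL;
    n->next = *head;
    (*head)->prev = n;
    *head = n;
}
```

The trade-off versus the counter scheme is explicit here: every memory reference pays for pointer updates, but eviction is O(1).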
LRU: Concept vs. Reality LRU is considered to be a reasonably good algorithm Problem is in implementing it Counter implementation: counter per page, copied per memory reference, have to search pages on page replacement to find oldest Stack implementation: no search, but pointer swaps on each memory reference Hence the effort to design efficient implementations that approximate LRU
LRU Approximation Algorithms Reference bit With each page associate a bit, initially = 0 When page is referenced, bit set to 1 Replace one whose bit is 0 (if one exists) We do not know the order, however Second chance (clock replacement) Needs reference bit If page to be replaced (in clock order) has reference bit = 1 then: set reference bit to 0, leave page in memory, and replace next page (in clock order), subject to same rules
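The clock (second-chance) rules above can be sketched directly: the hand sweeps frames, clearing reference bits until it finds one already clear, and that frame is the victim. The 3-frame fixed-size table and globals are a toy setup for illustration:

```c
#include <assert.h>
#include <string.h>

#define NFRAMES 3
static int page_in[NFRAMES];               /* page loaded in each frame */
static int refbit[NFRAMES];                /* reference bit per frame */
static int hand;                           /* clock hand */
static int nfaults;

/* One memory reference under the clock (second-chance) policy. */
static void clock_ref(int page)
{
    for (int i = 0; i < NFRAMES; i++)
        if (page_in[i] == page) { refbit[i] = 1; return; }   /* hit */
    nfaults++;
    while (refbit[hand]) {                 /* second chance: clear and skip */
        refbit[hand] = 0;
        hand = (hand + 1) % NFRAMES;
    }
    page_in[hand] = page;                  /* victim found: ref bit was 0 */
    refbit[hand] = 1;
    hand = (hand + 1) % NFRAMES;
}
```

A recently referenced page survives one full sweep of the hand, which is how the single bit approximates LRU ordering without per-reference bookkeeping.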