Lecture 52: Swapping & Flushing. When pages are updated
[Figure: a.out on disk (block device) is loaded, scattered, into main memory. Text pages are execute-only (X) and always clean. Data pages are read-write (RW); once written, they are written back to disk, either immediately or by a periodic flush. Stack pages are also RW.]
[Figure: in the page cache, reached through the address_space struct, each page is clean, dirty, or locked.]
[Figure: the stack page is RW and has been written, but it has no disk counterpart, so write-back or flush does not apply to it. The kernel swaps (saves) this page to the swap area for later use.]
An "anonymous page" such as stack or heap does not map to a file on disk.
Page Frame Reclamation
• The kernel refills the free-block list before all free memory is gone
• PFRA (Page Frame Reclaiming Algorithm): select a page frame and make it free
• Target pages can belong to
  • user-mode processes, or
  • kernel caches (slab layer)
• If the page is dirty: write it back or swap it out
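The last bullet above can be sketched as a small decision function. This is only an illustration under simplifying assumptions: `struct frame`, `enum backing`, and `reclaim_action_for` are hypothetical names, not kernel structures, and a clean page is assumed to have an up-to-date copy on disk (or never to have been written).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a page frame (not a kernel type). */
enum backing { BACKING_FILE, BACKING_ANON };

enum reclaim_action {
    RECLAIM_DISCARD,    /* clean page: nothing to save, just free the frame */
    RECLAIM_WRITEBACK,  /* dirty file-backed page: write it to its file first */
    RECLAIM_SWAP_OUT    /* dirty anonymous page: save it to the swap area first */
};

struct frame {
    enum backing backing;
    bool dirty;
};

/* Decide what must happen before the frame can be reused. */
enum reclaim_action reclaim_action_for(const struct frame *f)
{
    if (!f->dirty)
        return RECLAIM_DISCARD;
    return f->backing == BACKING_FILE ? RECLAIM_WRITEBACK : RECLAIM_SWAP_OUT;
}
```

A text page would map to `RECLAIM_DISCARD`, a modified data page to `RECLAIM_WRITEBACK`, and a modified stack page to `RECLAIM_SWAP_OUT`.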
Writing out Dirty Pages Chapter 15, Love’s book
[Figure: text (X), data (RW) and stack (RW) pages in main memory; written data pages are written back to a.out on the block device, immediately or by a periodic flush.]
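pdflush Daemon
[Figure: page cache holding clean, dirty, and locked pages]
• Dirty page writeback occurs when
  • free memory falls below a specified threshold
  • dirty data becomes older than a specified threshold
• pdflush daemon settings
  • dirty_background_ratio: percentage of total memory in dirty pages at which pdflush begins background writeback
  • dirty_expire_centisecs: age, in hundredths of a second, after which dirty data is written out at the next periodic wakeup
  • dirty_ratio: percentage of total memory in dirty pages at which the writing process itself begins writeback
  • dirty_writeback_centisecs: cycle time of pdflush, in hundredths of a second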
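As a rough illustration of how these four settings could drive writeback decisions, the sketch below compares a dirty-page count and a page age against the thresholds. It is a user-space model, not kernel code: `struct vm_tunables` and the three functions are hypothetical, and the ratios are assumed to be percentages of total memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the four tunables named above. */
struct vm_tunables {
    unsigned dirty_background_ratio;    /* percent of total memory */
    unsigned dirty_ratio;               /* percent of total memory */
    unsigned dirty_expire_centisecs;    /* hundredths of a second */
    unsigned dirty_writeback_centisecs; /* hundredths of a second */
};

/* True when the dirty share of memory exceeds the given percentage. */
static bool over_ratio(unsigned long long dirty, unsigned long long total,
                       unsigned percent)
{
    return dirty * 100ULL > total * percent;
}

/* Should the background flusher start writing? */
bool background_writeback_needed(const struct vm_tunables *t,
                                 unsigned long long total_pages,
                                 unsigned long long dirty_pages)
{
    return over_ratio(dirty_pages, total_pages, t->dirty_background_ratio);
}

/* Should the process that is dirtying pages write them out itself? */
bool writer_must_flush(const struct vm_tunables *t,
                       unsigned long long total_pages,
                       unsigned long long dirty_pages)
{
    return over_ratio(dirty_pages, total_pages, t->dirty_ratio);
}

/* Is a dirty page old enough for the next periodic pass? */
bool page_expired(const struct vm_tunables *t, unsigned age_centisecs)
{
    return age_centisecs > t->dirty_expire_centisecs;
}
```

On a running Linux system the actual values are exposed under /proc/sys/vm/; the numbers used to exercise this sketch are arbitrary.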
Swapping out Anonymous Pages
[Figure: text (X), data (RW) and stack (RW) pages in main memory. Written data pages go back to a.out on the block device by write-back or flush; the written stack page is swapped (saved) to the swap area for later use.]
Linux swapping
• Pages like stack/heap cannot be discarded (they are used later)
• They have to be copied to a backing store, called the swap area
• Strictly speaking, Linux does not swap, because
  • 'swapping' means copying an entire process address space to disk
  • 'paging' means copying out individual pages
  • Linux actually implements paging (traditionally called swapping)
[Figure: Linux swapping as page frame reclaiming; pages move between memory and the swap area]
Swap Area in Disk (p 179 Gorman)
• Multiple swap areas
  • the system administrator spreads load among several disks
  • faster swap areas (faster disks) may have higher priority
  • swapping may start from the faster swap area
  • multiple swap areas may be read/written concurrently
  • each active swap area is a file or partition (max 32 swap areas)
• Each swap area is divided up into page-sized slots on disk
swap_info[] has entries 0 to 31, one struct swap_info_struct per swap area:
    struct swap_info_struct {
        unsigned int flags;
        spinlock_t sdev_lock;
        struct file *swap_file;
        struct block_device *bdev;
    };
[Figure: each swap_info[] entry points to a swap area (file or partition) divided into slots, one page per slot]
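The priority rule and the page-sized slots can be illustrated with a small user-space sketch. Everything here is an assumption made for the example: `struct swap_area` is a hypothetical stand-in (not `swap_info_struct`), the page size is taken as 4096 bytes, and a larger `priority` value is taken to mean "preferred".

```c
#include <assert.h>

#define SKETCH_MAX_SWAP_AREAS 32      /* the "max 32 swap areas" limit */
#define SKETCH_PAGE_SIZE      4096UL  /* assumed page size in bytes */

/* Hypothetical per-area bookkeeping for this sketch only. */
struct swap_area {
    int active;                 /* nonzero if the area is enabled */
    int priority;               /* higher value = preferred (faster disk) */
    unsigned long nr_slots;     /* page-sized slots in the area */
    unsigned long used_slots;   /* slots already holding a page */
};

/* Return the index of the highest-priority active area with a free slot,
 * or -1 if every area is inactive or full. */
int pick_swap_area(const struct swap_area areas[], int n)
{
    int best = -1;
    for (int i = 0; i < n && i < SKETCH_MAX_SWAP_AREAS; i++) {
        if (!areas[i].active || areas[i].used_slots >= areas[i].nr_slots)
            continue;
        if (best < 0 || areas[i].priority > areas[best].priority)
            best = i;
    }
    return best;
}

/* Byte offset of a slot inside its swap area. */
unsigned long slot_byte_offset(unsigned long slot)
{
    return slot * SKETCH_PAGE_SIZE;
}
```

With this layout, slot 3 starts at byte 12288 of the swap file or partition.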
Swapping Subsystem
• The PTE keeps track of the position of the data in the swap area
[Figure: page Pk was swapped out. Its PTE holds two indexes: one selects an entry (0 to 31) of swap_info[], each a swap_info_struct { flags; sdev_lock; *swap_file; *bdev; }, and the other selects slot k within that swap area (file or partition).]
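One way to picture a PTE that "remembers" a swapped-out page is to pack the swap area index and the slot index into a single word. The bit layout below is purely an assumption for illustration (the real layout is architecture-specific), and the `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch only:
 *   bit 0      present bit (0 = page is not in memory)
 *   bits 1..5  swap area index (5 bits cover 32 areas)
 *   bits 6..   slot index within that area
 */
#define SKETCH_TYPE_BITS  5
#define SKETCH_TYPE_MASK  ((1u << SKETCH_TYPE_BITS) - 1)
#define SKETCH_TYPE_SHIFT 1
#define SKETCH_SLOT_SHIFT (SKETCH_TYPE_SHIFT + SKETCH_TYPE_BITS)

/* Build the value stored in a not-present PTE. */
uint64_t sketch_swp_entry(unsigned area, uint64_t slot)
{
    return (slot << SKETCH_SLOT_SHIFT) |
           ((uint64_t)(area & SKETCH_TYPE_MASK) << SKETCH_TYPE_SHIFT);
}

/* Recover the swap area index from the stored value. */
unsigned sketch_swp_area(uint64_t entry)
{
    return (unsigned)((entry >> SKETCH_TYPE_SHIFT) & SKETCH_TYPE_MASK);
}

/* Recover the slot index from the stored value. */
uint64_t sketch_swp_slot(uint64_t entry)
{
    return entry >> SKETCH_SLOT_SHIFT;
}

/* A swap entry always has the present bit clear. */
int sketch_pte_present(uint64_t entry)
{
    return (int)(entry & 1u);
}
```

On a page fault, a handler in this model would decode the two indexes, find the swap area, and read the page back from the slot.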
Swap in – Race Problem
• Swap in can cause a race condition
• Example [2-process case]
  • 1st process accesses page X and page faults
    • kernel tries to swap in
    • allocates a new page frame
    • starts the I/O operation
  • 2nd process accesses page X and page faults
    • kernel tries to swap in
    • allocates a new page frame
    • starts the I/O operation
  • Without coordination, the same page X is read into two different page frames
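A common way to avoid the double read is to look the page up in a cache keyed by its swap entry before starting I/O. The single-threaded sketch below only models that idea: `struct swap_cache` and `swap_in` are hypothetical, I/O is represented by a counter, and the locking a real kernel would need is left out.

```c
#include <assert.h>

#define SKETCH_CACHE_SLOTS 16

/* One cached page: which swap entry it came from and which frame holds it. */
struct cached_page {
    unsigned long entry;  /* swap entry identifying the page */
    int frame;            /* page frame number assigned to it */
    int in_use;           /* nonzero if this cache slot is occupied */
};

/* Hypothetical swap cache with counters for the demonstration. */
struct swap_cache {
    struct cached_page slots[SKETCH_CACHE_SLOTS];
    int next_frame;  /* next free frame number to hand out */
    int io_count;    /* number of simulated disk reads */
};

/* Return the frame holding the page for `entry`, reading it at most once.
 * Returns -1 if the cache is full. */
int swap_in(struct swap_cache *c, unsigned long entry)
{
    int free_slot = -1;
    for (int i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        if (c->slots[i].in_use && c->slots[i].entry == entry)
            return c->slots[i].frame;  /* already read (or being read) */
        if (!c->slots[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    c->slots[free_slot] = (struct cached_page){ entry, c->next_frame++, 1 };
    c->io_count++;  /* stands in for starting the disk read */
    return c->slots[free_slot].frame;
}
```

In this model, two faults on the same swap entry get the same frame and trigger a single read.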
Swap out – Race Problem
• Swap out can cause a race condition
• Example [N processes share a page]
  • 1st process swaps out page X
  • Other processes access page X
[Figure: processes PA, PB, PC and PD, running on CPU1 to CPU4, each hold a PTE that maps the shared page X, shown before and after PA swaps X out]
Linux Solution – Swap Cache: Swapping out pages
• If a page is shared, a special entry (swap entry) is allocated
• "Swap out" just decrements the reference count in the swap entry
• Only when the count reaches zero will the page be freed
• Pages like this are considered to be in the swap cache
• The swap cache is implemented by the page cache data structure
• The swap cache is purely conceptual because it is simply a specialization of the page cache
[Figure: page X is shared by PA and PB and sits in the swap cache with count 2. When PA's PTE is removed the count drops to 1; when PB's PTE is removed it drops to 0 and page X remains only in the swap area.]
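The counting rule in the bullets above can be sketched in a few lines. This is a toy model, not the kernel's implementation: `struct shared_page` and `swap_out_one_mapping` are hypothetical names, and the count stands for the number of PTEs that still map the page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state of one shared page held in the swap cache. */
struct shared_page {
    int count;       /* PTEs that still map the page frame */
    bool in_memory;  /* true while the page frame has not been freed */
};

/* One process swaps the page out: drop its reference.
 * Returns true only when this call released the last reference,
 * which is the moment the page frame is freed. */
bool swap_out_one_mapping(struct shared_page *p)
{
    if (!p->in_memory || p->count <= 0)
        return false;      /* nothing left to release */
    p->count--;
    if (p->count > 0)
        return false;      /* other processes still use the frame */
    p->in_memory = false;  /* last user gone: the frame can be freed */
    return true;
}
```

With a page shared by two processes, the first call leaves the frame in memory and the second call frees it, matching the count going 2, 1, 0.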
[Figure: task_struct (fields mm, tty, files, fs) points through its mm field to mm_struct (fields mmap, pgd). pgd leads to the page directory and the PTEs; mmap leads to the list of vm_area_struct: VMA - text, VMA - data, VMA - stack. Each vm_area_struct holds start_address, end_address, permission, file, and operations: page fault(), add_vma, remove_vma.]

    /* This routine handles page faults. */
    asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
    {
        struct task_struct *tsk;
        struct mm_struct *mm;
        struct vm_area_struct *vma;
        unsigned long address;
        unsigned long page;

        /* get the address */
        __asm__("movl %%cr2,%0":"=r" (address));

        tsk = current;
        mm = tsk->mm;

        down_read(&mm->mmap_sem);
        vma = find_vma(mm, address);
        if (vma->vm_start <= address)
            goto good_area;
        …..
    good_area:
        switch (error_code & 3) {
        default:    /* 3: write, present */
            /* fall through */
        case 2:     /* write, not present */
            if (!(vma->vm_flags & VM_WRITE))
                goto bad_area;
            write++;
            break;
        case 1:     /* read, present */
            goto bad_area;
        case 0:     /* read, not present */
            if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
                goto bad_area;
        }
[Figure: summary of the structures involved: the CPU (SP) and the kernel stack with thread_info; task_struct (mm, tty, fs, files); mm_struct (mmap, pgd); the page directory, PTEs and pages of the address space; the VMAs (start, end, file, nopage()); the slab-managed filp, dentry and inode caches; and the inode's address_space with its clean, dirty and locked pages, page methods, and the LRU list.]