
Chapter 9, Virtual Memory Overheads, Part 1 Sections 9.1-9.5




  1. Chapter 9, Virtual Memory Overheads, Part 1, Sections 9.1-9.5

  2. 9.1 Background • Remember the following sequence from the last chapter: • 1. The simplest approach: Define a memory block of fixed size large enough for any process and allocate such a block to each process (see MISC)—this is tremendously rigid and wasteful • 2. Allocate memory in contiguous blocks of varying size—this leads to external fragmentation and a waste of 1/3 of memory

  3. 3. Do paging. This approach breaks the need for allocation of contiguous memory—the costs as discussed so far consist of the overhead incurred from maintaining and using a page table

  4. Virtual memory addresses several limitations of the paging scheme • One limitation is that a complete program must be loaded in order for it to run • There is also another limitation: If you ignore medium term scheduling, swapping, and compaction, in general the idea was that once a logical page was allocated a physical frame, it didn’t move

  5. If page locations are fixed in memory, that implies a fixed mapping between the logical and physical address space throughout a program run • More flexibility can be attained if the logical and physical address spaces are delinked and address resolution at run time can handle finding a logical page in one physical frame at one time and in another physical frame at another time

  6. These are examples of why it might not be necessary to load a complete program: • 1. Error handling routines may not be called during most program runs • 2. Arrays of predeclared sizes may never be completely filled • 3. Other routines besides error handling may also be rarely used • 4. For a large program, even if all parts are used at some time during a run, by definition, they can’t all be used at the same time—meaning that at any given time the complete program doesn’t have to be loaded

  7. Reasons for wanting to be able to run a program that’s only partially loaded • Under the assumptions of paging that a complete program has to be loaded, you could observe that the size of a program is limited to the physical memory on the machine • Given current memory sizes, this by itself is not a serious limitation, although in some environments it might still be

  8. For a large program, significant parts of it may not be used for significant amounts of time. If so, it’s an absolute waste to have the unused parts loaded into memory • Specifically, if you are doing multi-tasking, you would like the memory unneeded by one process to be available to allocate to another

  9. Another area of saving is in loading or swapping cost from secondary storage • If parts of a program are never needed, reading and writing from secondary storage can be saved • In general this means leaving more I/O cycles available for useful work • It also means that a program will start faster when initially scheduled because there is less I/O for the long term scheduler to do • It also means that the program will be faster and less wasteful during the course of its run in a system that does medium term scheduling or swapping

  10. Definition of virtual memory: • The complete separation of logical memory space from physical memory space from the programmer’s point of view • In other words, at any given time during a program run, any page, p, in the logical address space could be at any frame, f, in the physical memory space

  11. Side note: • Both segmentation and paging were mentioned in the last chapter • In theory, virtual memory can be implemented with segmentation • However, that’s a mess • The most common implementation is with paging, and that is the only approach that will be covered

  12. 9.2 Demand Paging • If it’s necessary to load a complete process in order for it to run, then there is an up-front cost of swapping all of its pages in from secondary storage to main memory • If it’s not necessary to load a complete process in order for it to run, then a page only needs to be swapped into main memory if the process generates an address on that page • This is known as demand paging

  13. In general, when a process is scheduled it may be given an initial allocation of frames in memory • From that point on, additional frames may be allocated through demand paging • If a process is not even given an initial footprint and it acquires all of its memory through demand paging, this is known as pure demand paging

  14. Demand paging from secondary storage to main memory is roughly analogous to what happens on a miss between the page table and the TLB • Initially, the TLB can be thought of as empty • The first time the process generates an address on a given page, that causes a TLB miss, and the page entry is put into the TLB
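The analogy can be sketched in a few lines of Python. The page-to-frame numbers and the dictionary-based lookup below are purely illustrative, not a real MMU:

```python
# Toy model of the TLB analogy: a small cache (the "TLB") sits in front of
# the full page table. A miss walks the page table and caches the entry,
# just as a page fault brings a page in from secondary storage.
# The page-to-frame mapping below is arbitrary illustration.

page_table = {0: 5, 1: 9, 2: 3}   # full mapping: page -> frame
tlb = {}                          # starts out empty

def translate(page):
    """Return (frame, 'hit'|'miss') for a page, filling the TLB on a miss."""
    if page in tlb:
        return tlb[page], "hit"
    frame = page_table[page]      # slow path: consult the full page table
    tlb[page] = frame             # cache the entry for later references
    return frame, "miss"

print(translate(1))   # → (9, 'miss')  first reference fills the TLB
print(translate(1))   # → (9, 'hit')   later references hit the cached entry
```

The same access pattern that makes this cache worthwhile—locality of reference—is what makes demand paging worthwhile, as the later slides note.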

  15. An earlier statement characterized virtual memory as completely separating the logical and physical address spaces • Another way to think about this is that from the point of view of the logical address space, there is no difference between main memory and secondary storage

  16. In other words, the logical address space may refer to parts of programs which have been loaded into memory and parts of programs which haven’t • Accessing memory that hasn’t been loaded is slower, but the loading is handled by the system • From the point of view of the address, the running process doesn’t know or care whether it’s in main memory or secondary storage

  17. This is a side issue, but note the following: • The address space is limited by the architecture—how many bits are available for holding an address • However, even if the amount of attached memory is not the maximum, the address space extends into secondary storage • Virtual memory effectively means that secondary storage functions as a transparent extension of the memory space

  18. From a practical point of view, it becomes necessary to have support to tell which pages have been loaded into physical memory and which have not • This is part of the hardware support for the MMU • In the earlier discussions of page tables, the idea of a valid/invalid bit was introduced

  19. Under that scheme, the page table was long enough to accommodate the maximum number of allocated frames • If a process wasn’t allocated the maximum, then page addresses outside of its allocation were marked invalid • The scheme can be extended: valid means valid and in memory. Invalid means either invalid or not loaded

  20. Under the previous scheme, if an invalid page was accessed, a trap was generated, and the running process was halted due to an attempt to access memory outside of its range • Under the new scheme, an attempt to access an invalid page also generates a trap, but this is not necessarily an error • The trap is known as a page fault trap

  21. This is an interrupt which halts the user process and triggers system software which does the following: • 1. It checks a table to see whether the address was really invalid or just not loaded • 2. If invalid, it terminates the process • 3. If valid, it gets a frame from the list of free frames (the frame table), allocates it to the process, and updates the data structures to show that the frame is allocated to page x of the process

  22. 4. It schedules (i.e., requests) a disk operation to read the page from secondary storage into the allocated frame • 5. When the read is complete, it updates the data structures to show that the page is now valid • 6. It allows the user process to restart on exactly the same instruction that triggered the page fault trap in the first place
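The six-step service sequence can be sketched as follows. The data structures and helper names (`free_frames`, `disk`, and so on) are invented for illustration, and the asynchronous disk read is collapsed into a dictionary lookup:

```python
# Hedged sketch of the six-step page-fault service sequence.
# All names and values here are illustrative, not from a real kernel.

VALID, NOT_LOADED = "valid", "not-loaded"

free_frames = [7, 8, 9]                        # the frame table's free list
disk = {2: "page-2-bytes", 4: "page-4-bytes"}  # backing store contents
memory = {}                                    # frame -> contents

def handle_page_fault(page, page_table, legal_pages):
    # Step 1: was the address truly invalid, or just not loaded yet?
    if page not in legal_pages:
        return "terminated"                    # Step 2: real protection error
    frame = free_frames.pop()                  # Step 3: allocate a free frame
    memory[frame] = disk[page]                 # Steps 4-5: read the page in
    page_table[page] = (frame, VALID)          # ...and update the tables
    return "restart"                           # Step 6: rerun the instruction

pt = {2: (None, NOT_LOADED)}
print(handle_page_fault(2, pt, legal_pages={2, 4}))   # → restart
print(handle_page_fault(99, pt, legal_pages={2, 4}))  # → terminated
```

Note that the sketch assumes a free frame is always available; the "over-allocated" case discussed a few slides later is exactly when that assumption fails.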

  23. Note two things about the sequence of events outlined above. First: • Restarting is just an example of context switching • By definition, the user process’s state will have been saved • It will resume at the IP value it was on when it stopped • The difference is that the page will now be in memory and no fault will result

  24. Second: • The statement was made, “get a frame from the list of free frames”. • You may be wondering, what if there are no free frames? • At that point, memory is “over-allocated”. • That means that it’s necessary to take a frame from one process and give it to another

  25. Again, demand paging from secondary storage to main memory is analogous to bringing an entry from the page table to the TLB • Remember that the TLB is a specialized form of cache • Its effectiveness relies on locality of reference • If references were all over the map, it would provide no benefit

  26. In practice, memory references tend to cluster in certain areas over certain periods of time, and then move on • This means that entries remain in the TLB and remain useful over a period of time • Likewise, pages that have been allocated to frames will remain useful over time, and can profitably remain in those frames • Pages will tend not to be used only once and then have to be swapped out immediately because another page is to be referenced

  27. Keep in mind that hardware support for demand paging is the same as for regular paging • 1. A page table that records valid/invalid pages • 2. Secondary storage—a disk. • Recall that program images are typically not swapped in from the file system. The O/S maintains a ready queue of program images in the swap space, a.k.a., the backing store

  28. A serious problem can occur when restarting a user process after a page fault • This is not a problem with context switching per se • It is a problem that is reminiscent of the problems of concurrency control • Memory is like a resource, and when a process is halted, any prior action it’s taken on memory has to be “rolled back” before it can be restarted

  29. Instruction execution can be broken down into these steps: • 1. Fetch the instruction • 2. Decode the instruction • 3. Fetch operands, if any • 4. Do the operation (execute) • 5. Write the results, if any

  30. If the page fault occurs on the instruction fetch, there is no problem • If the page fault occurs on the operand fetches, a little work is wasted on a restart, but there are no problems • In some hardware architectures there are instructions which can modify more than one thing (write >1 result). If the page fault occurs in the sequence of modifications, there is a potential problem

  31. The potential problem has to be dealt with. Since you don’t know whether it will occur, you have to set up the memory management page fault trap handling mechanism so that the problem won’t occur in any case • The book gives two concrete examples of machine instructions which are prone to this • One example was from a DEC (rest in peace) machine. It will not be pursued

  32. The other example comes from an IBM instruction set • There was a memory move instruction which would cause a block of 256 bytes to be relocated to a new address • Because memory paging should be transparent, the move could be from a location on one page to a location on another page

  33. It’s also important to note that to be flexible, the instruction allowed the new location to overlap with the old location • In other words, the move instruction could function as a shift

  34. The problem scenario goes like this: • You have a 256 byte block of interest, and it is located at the end of a page • This page is in memory, but the following page is not in memory • For the sake of argument, let the move instruction in fact cause a shift to the right of 128 bytes

  35. Instruction execution starts by “picking up” the full 256 bytes • It shifts to the right and lays down the first 128 of the 256 bytes • It then page faults because the second page isn’t in memory yet

  36. Restarting the user process on the same instruction after the page fault without protection will result in an error condition • Memory on the first page has already been modified • When the instruction starts over, it will then shift the modified memory on the first page 128 bytes to the right

  37. You do not get the original 256 bytes shifted 128 bytes to the right • At a position 128 bytes to the right you get 128 blank bytes followed by the first 128 bytes of the original 256 • The problem is that the effects of memory access should be all or nothing. In this case you get half and half
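The failure can be reproduced with a scaled-down model (an 8-byte block and a 4-byte shift instead of 256 and 128; the function and sizes are illustrative):

```python
# Sketch of the overlapping-move failure: a naive "pick up, then lay down"
# move that faults halfway leaves the source half-overwritten, so a naive
# restart would shift the corrupted bytes, not the originals.

def broken_shift(mem, src, dst, length, fault_after):
    """Copy mem[src:src+length] to dst, simulating a fault mid-instruction."""
    data = mem[src:src+length]        # pick up the full block
    for i in range(length):
        if i == fault_after:
            raise MemoryError("page fault mid-instruction")
        mem[dst + i] = data[i]        # lay bytes down one at a time

mem = list(range(8)) + [0] * 4        # the block, then empty space
try:
    broken_shift(mem, src=0, dst=4, length=8, fault_after=4)
except MemoryError:
    pass
# The second half of the source region (indices 4..7) has been overwritten,
# so restarting the instruction would pick up [0,1,2,3,0,1,2,3] instead of
# the original block.
print(mem[:8])   # → [0, 1, 2, 3, 0, 1, 2, 3]
```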

  38. There are two basic approaches to solving the problem • They are reminiscent of solutions to the problem of coordinating locking on resources • The instruction needs a lock on both the source and the destination in order to make sure that it executes correctly

  39. Solution approach 1: • Have the instruction try to access both the source and the destination addresses before trying to shift • This will force a page fault, if one is needed, before any work is done • This is the equivalent of having the process acquire all needed resources in advance
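In the scaled-down model, approach 1 looks like this. The `loaded_pages` set and `page_size` parameter stand in for the real valid/invalid machinery and are assumptions of the sketch:

```python
# Sketch of solution approach 1: probe both ends of the source and the
# destination before doing any work, so any page fault fires before the
# instruction has modified memory (acquire all "resources" up front).

def pretouch_then_shift(mem, loaded_pages, page_size, src, dst, length):
    # Probe every boundary address the move will touch; fault now, not mid-move.
    for addr in (src, src + length - 1, dst, dst + length - 1):
        if addr // page_size not in loaded_pages:
            raise MemoryError(f"page fault on address {addr}")
    mem[dst:dst + length] = mem[src:src + length]  # safe: no fault possible now

mem = list(range(12))                 # three 4-byte pages: 0, 1, 2
pretouch_then_shift(mem, loaded_pages={0, 1, 2}, page_size=4,
                    src=0, dst=4, length=8)
print(mem[4:12])   # → [0, 1, 2, 3, 4, 5, 6, 7]
```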

  40. Solution approach 2: • Use temporary registers to hold operand values • In other words, let the system store the contents of the source memory location before any changes are made • If a page fault occurs when trying to complete the instruction, restore the prior state to memory before restarting the instruction • This is the equivalent of rolling back a half finished process • Note that it is also the equivalent of extending the state saving aspect of context switching from registers, etc., to memory.
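Approach 2, in the same scaled-down model (the Python list stands in for the temporary registers; the fault simulation is the same assumption as before):

```python
# Sketch of solution approach 2: save the destination's prior contents
# before modifying, and restore them if a fault interrupts the move, so
# the instruction can be restarted against unmodified memory.

def safe_shift(mem, src, dst, length, fault_after=None):
    saved = mem[dst:dst + length]        # temporary copy = the "registers"
    data = mem[src:src + length]
    try:
        for i in range(length):
            if i == fault_after:
                raise MemoryError("page fault mid-instruction")
            mem[dst + i] = data[i]
    except MemoryError:
        mem[dst:dst + length] = saved    # roll back before restarting
        raise

mem = list(range(8)) + [0] * 4
try:
    safe_shift(mem, 0, 4, 8, fault_after=4)
except MemoryError:
    pass
print(mem[:8])   # → [0, 1, 2, 3, 4, 5, 6, 7]  (state was rolled back)
```

A retry of `safe_shift` after the rollback then completes against the original bytes, which is the all-or-nothing behavior the slides call for.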

  41. The problem of inconsistent state in memory due to an instruction interrupted by a page fault is not the only difficulty in implementing demand paging • Other problems will be discussed • It is worth reiterating that demand paging should be transparent • In other words, the solutions to any problems should not require user applications to do anything but merrily roll along generating logical addresses

  42. Demand paging performance • Let the abbreviation ma stand for memory access time, the time to access a known address in main memory • In the previous chapter, a figure of 100 ns was used to calculate costs • The author gives 10 to 200 ns as the range for current computers

  43. The previous cost estimates were based on the assumption that all pages were in memory • The only consideration was whether you had a TLB hit or miss and incurred the cost of one or more additional hits to memory for the page table • Under demand paging an additional, very large cost can be incurred: The cost of a page fault, requiring a page to be read from secondary storage

  44. Given a probability p, 0 <= p <= 1, of a page fault, the average effective access time of a system can be calculated • Average effective access time • = (1 – p) * ma + p * (page fault time)
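The formula can be worked through numerically. The figures below (ma = 200 ns, a page fault time of 8 ms) are representative values, not fixed constants:

```python
# Worked example of the effective-access-time formula:
#   EAT = (1 - p) * ma + p * (page fault time)
# using representative figures: ma = 200 ns, fault time = 8 ms.

def effective_access_time(p, ma_ns, fault_ns):
    return (1 - p) * ma_ns + p * fault_ns

ma = 200                 # memory access time, in ns
fault = 8_000_000        # page fault service time: 8 ms, in ns

print(effective_access_time(0.0, ma, fault))    # → 200.0 (no faults)
print(effective_access_time(0.001, ma, fault))  # ~8200 ns: one fault per
                                                # 1000 accesses slows memory
                                                # down by a factor of ~40
```

The point of the example is the asymmetry: because the fault time is tens of thousands of times larger than ma, even a tiny fault probability dominates the average.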

  45. Page fault time includes twelve components: • 1. The time to send the trap to the O/S • 2. Context switch time (saving process state, registers, etc.) • 3. Determine that the interrupt was a page fault (i.e., interrupt handling mechanism time) • 4. Checking that the page reference was legal and determine the location on disk (this is the interrupt handling code in action)

  46. 5. Issuing a read from the disk to a free frame. This means a call through the disk management system code and includes • A. Wait in a queue for this device until the read request is serviced • B. Wait for the device seek and latency time • C. Begin the transfer of the page to a free frame

  47. 6. While waiting for the disk read to complete, optionally schedule another process. Note what this entails: • A. It has the desirable effect of increasing multi-programming and CPU utilization • B. There is a small absolute cost simply to schedule another process • C. From the point of view of the process that triggered the page fault, there is a long and variable wait before being able to resume

  48. 7. Receive the interrupt (from the disk when the disk I/O is completed) • 8. Context-switch the other process out if step 6 was taken • 9. Handle the disk interrupt • 10. Correct (update) the frame and page tables to show that the desired page is now in a frame in memory

  49. 11. As noted in 6.C, wait for the CPU to be allocated again to the process that generated the page fault • 12. Context switch—restore the user registers, process state, and updated page table; then resume the process at the instruction that generated the page fault

  50. The twelve steps listed above fall into three major components of page fault service time: • 1. Service the page fault interrupt • 2. Read in the page • 3. Restart the process
