
Fast File System




  1. Fast File System 2/17/2006

  2. Introduction • The paper describes changes to the old UNIX file system (FS), released with 4.2BSD • Motivation: applications require greater throughput • Large amounts of paging • Want to retain what was good in the old FS • Abstraction • Backward compatibility with existing software

  3. Old File System • Block size change from 512 bytes to 1024 bytes → performance more than doubled. Why? • Each disk access transferred 2x more data • Direct blocks held twice the data, so indirect blocks were needed less often • Performance degradation over time • 175 KB/s → 30 KB/s due to randomization of block placement on disk • Fundamental limits: • Small block size • Limited read-ahead in the system • Many seeks limit file system throughput

  4. New File System (FS) • Drives mapped into partitions • Each partition has a FS described by a redundant superblock • File system blocks = 4096 bytes • Cylinder groups: what are they?

  5. Cylinder Groups • The file system is a collection of cylinder groups; each cylinder group has the following components: • a backup copy of the superblock • a cylinder group header, with statistics, free lists, etc., about this cylinder group, similar to those in the superblock • a number of inodes, each containing file attributes • a number of data blocks
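
The components listed on this slide can be pictured as a small data structure. The following is only an illustrative sketch: every class, field, and function name is hypothetical and does not mirror the actual BSD on-disk structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Inode:
    """File attributes only; the field names are hypothetical."""
    size: int = 0
    block_addrs: List[int] = field(default_factory=list)


@dataclass
class CylinderGroup:
    superblock_copy: dict   # backup copy of the superblock
    free_blocks: int        # header statistics for this group
    free_inodes: int
    inodes: List[Inode] = field(default_factory=list)
    n_data_blocks: int = 0  # data blocks are only counted in this sketch


@dataclass
class FileSystem:
    superblock: dict
    groups: List[CylinderGroup]


def make_fs(n_groups, inodes_per_group, blocks_per_group, block_size=4096):
    """Build an empty file system whose groups each carry a superblock copy."""
    superblock = {"block_size": block_size, "n_groups": n_groups}
    groups = [
        CylinderGroup(
            superblock_copy=dict(superblock),  # independent backup copy
            free_blocks=blocks_per_group,
            free_inodes=inodes_per_group,
            inodes=[Inode() for _ in range(inodes_per_group)],
            n_data_blocks=blocks_per_group,
        )
        for _ in range(n_groups)
    ]
    return FileSystem(superblock=superblock, groups=groups)
```

Keeping a separate superblock copy in every group is what makes the superblock "redundant": losing one copy does not lose the file system description.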

  6. Storage Utilization • 4x bigger block size (4096 bytes) → higher throughput • Problem: UNIX file systems are commonly composed of small files → wasted space
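
The space problem on this slide is simple arithmetic: every file occupies a whole number of blocks, so the unused remainder of its last block is wasted. A minimal sketch follows; the file sizes are hypothetical and chosen only for illustration, not measurements from the paper.

```python
def allocated_bytes(file_size, block_size):
    """Bytes of disk consumed when a file is stored in whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


def wasted_fraction(file_sizes, block_size):
    """Fraction of allocated space that holds no file data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(s, block_size) for s in file_sizes)
    return (allocated - used) / allocated


# Hypothetical mix of mostly small files.
sizes = [100, 700, 1500, 3000, 5000]
for block_size in (512, 1024, 4096):
    print(block_size, round(wasted_fraction(sizes, block_size), 3))
```

For a mix dominated by small files, the wasted fraction grows quickly with the block size, which is the cost the larger block has to be weighed against.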

  7. Utilization Cont’d • Small files stored in a more efficient way • Blocks broken into fragments of 512 bytes • A block map associated with each cylinder group records the space available in that cylinder group at the fragment level • Is a block available? → look at aligned fragments

  8. Utilization Cont’d • Fragments of adjoining blocks cannot be used as a full block, even if they are large enough. • If no block with enough aligned fragments is available at file creation, a full-size block is split, yielding the necessary fragments and a single unused fragment.
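
The fragment-level block map and the alignment rule can be sketched as a list of free flags, one per fragment. This is a simplified model, assuming 4096-byte blocks split into eight 512-byte fragments; the function names are hypothetical and the real block map is a bitmap rather than a Python list.

```python
FRAGS_PER_BLOCK = 8  # 4096-byte block / 512-byte fragment


def block_is_free(frag_map, block_index):
    """A block is free only if all of its aligned fragments are free."""
    start = block_index * FRAGS_PER_BLOCK
    return all(frag_map[start:start + FRAGS_PER_BLOCK])


def find_fragments(frag_map, count):
    """Find `count` consecutive free fragments inside a single block.

    Returns the index of the first fragment, or None. Runs are never
    allowed to cross a block boundary.
    """
    for block_start in range(0, len(frag_map), FRAGS_PER_BLOCK):
        run = 0
        for i in range(block_start, block_start + FRAGS_PER_BLOCK):
            run = run + 1 if frag_map[i] else 0
            if run == count:
                return i - count + 1
    return None


# True = free. Eight free fragments in a row, but split across two blocks.
frag_map = [False] * 4 + [True] * 4 + [True] * 4 + [False] * 4
print(block_is_free(frag_map, 0), block_is_free(frag_map, 1))  # False False
print(find_fragments(frag_map, 4))  # 4
print(find_fragments(frag_map, 5))  # None
```

The example map has eight adjacent free fragments, yet neither block counts as free and no run of five fragments can be allocated, because the free space straddles a block boundary.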

  9. Utilization • One of three conditions applies when a file grows • Enough space in the already allocated block or fragment → data written to that space • File contains no fragments • File contains fragments • Problem: file growth one fragment at a time → many data copies • Soln: user programs write one block at a time
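
The copy problem can be made concrete with a small worst-case model. The sketch below assumes that whenever the partial last block of a file needs more fragments than it currently holds, the existing tail data is copied to a newly allocated location; an actual implementation might avoid some of these copies. The function names are hypothetical.

```python
FRAG = 512
BLOCK = 4096


def frags_needed(nbytes):
    """Number of fragments required to hold nbytes."""
    return -(-nbytes // FRAG)  # ceiling division


def bytes_copied(total_size, write_size):
    """Bytes of existing data copied while a file grows to total_size.

    Worst-case assumption: the tail is relocated every time it outgrows
    its current fragments (or is promoted to a full block).
    """
    copied = 0
    size = 0
    while size < total_size:
        new_size = min(size + write_size, total_size)
        old_tail = size % BLOCK  # data in the partial last block
        if old_tail > 0:
            new_tail = min(old_tail + (new_size - size), BLOCK)
            if frags_needed(new_tail) > frags_needed(old_tail):
                copied += old_tail
        size = new_size
    return copied


print(bytes_copied(4096, 512))   # 14336
print(bytes_copied(4096, 4096))  # 0
```

Under this model, filling one 4096-byte block in 512-byte writes copies 14336 bytes of old data, while a single block-sized write copies nothing, which is why writing a block at a time is the suggested fix.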

  10. Utilization • Capacity • Problem: as unallocated space → 0, throughput falls to 50% • System should keep ~10% unallocated space • Soln: at a threshold, only the administrator can allocate new blocks
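
The free-space reserve amounts to a threshold check at allocation time. A minimal sketch, assuming a 10% reserve as on the slide; the function and parameter names are hypothetical.

```python
def may_allocate(free_blocks, total_blocks, is_admin, min_free=0.10):
    """Decide whether a new block may be allocated.

    Ordinary users may allocate only while the free fraction is above
    the reserve; the administrator may use the reserve as well.
    """
    if free_blocks <= 0:
        return False  # nothing left for anyone
    if is_admin:
        return True
    return free_blocks / total_blocks > min_free


print(may_allocate(200, 1000, is_admin=False))  # True
print(may_allocate(100, 1000, is_admin=False))  # False
print(may_allocate(100, 1000, is_admin=True))   # True
```

Refusing ordinary allocations at the threshold keeps enough free blocks for the allocator to find well-placed ones, at the price of some unusable capacity.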

  11. Parameterization of HW • Why? • Old FS had no information about the physical characteristics of the storage device • Goal: allocate blocks optimally for the configuration • Parameters: • Processor speed • HW support for mass transfers • Characteristics of mass storage devices (# platters, physical data layouts, etc.)

  12. Parameterization of HW • Physical characteristics of each disk: • number of blocks per track • rate of disk spin • Cylinder group summary information: • Finding rotationally optimal blocks has a cost • Soln: keep a count of the available blocks in a cylinder group at different rotational positions • FS can be parameterized for the minimum time the processor needs to schedule a disk operation

  13. Layout Policies • Two parts to data layout policies: • Top level: global policies use FS-wide summary information to make decisions regarding the placement of inodes and blocks • Lower level: local allocation routines use a locally optimal scheme to lay out data blocks • Global policies try to balance two conflicting goals: • localizing data that is concurrently accessed • spreading out unrelated files

  14. Layout Policies Cont’d • Two allocatable resources • Inodes • Blocks • Layout policy tries to place all the inodes of files in a directory in the same cylinder group. • Data blocks are usually accessed together → layout policy tries to place all data blocks for a file in the same cylinder group, preferably at rotationally optimal positions in the same cylinder.
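
The inode placement policy can be sketched as two small selection functions. The file rule follows the slide. The directory rule (prefer a group with at least the average number of free inodes and the fewest directories) follows the description in the FFS paper rather than these slides, and the fallback scan for a full group is an assumption made for this sketch. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    free_inodes: int
    n_dirs: int


def group_for_file(groups, parent_group):
    """Place a file's inode in its directory's group when possible.

    Assumed fallback: scan the following groups for a free inode.
    """
    n = len(groups)
    for offset in range(n):
        g = (parent_group + offset) % n
        if groups[g].free_inodes > 0:
            return g
    return None  # no free inodes anywhere


def group_for_directory(groups):
    """Spread directories: pick a roomy group with the fewest directories."""
    avg = sum(g.free_inodes for g in groups) / len(groups)
    # ">=" keeps the candidate list non-empty even if all groups are equal.
    candidates = [i for i, g in enumerate(groups) if g.free_inodes >= avg]
    return min(candidates, key=lambda i: groups[i].n_dirs)


groups = [Group(10, 5), Group(0, 1), Group(30, 2), Group(20, 0)]
print(group_for_file(groups, 0))    # 0
print(group_for_file(groups, 1))    # 2
print(group_for_directory(groups))  # 3
```

The two functions pull in opposite directions on purpose: files cluster around their directory, while new directories are pushed toward emptier groups so unrelated data spreads out.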

  15. Performance • % of bandwidth in Table 2 measures how effectively the file system utilizes the disk. • Upper bound on the transfer rate from the disk: • number of bytes on a track x number of revolutions of the disk per second. • Bandwidth is the data rate the file system achieves, expressed as a percentage of this bound. • Results: • the old FS uses 3−5% of the disk bandwidth • the new FS uses up to 47% of the bandwidth.
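
The bound and the percentage on this slide reduce to two one-line formulas. The sketch below uses hypothetical disk figures (16384 bytes per track, 60 revolutions per second) and a hypothetical achieved rate; none of these numbers come from the paper.

```python
def max_transfer_rate(bytes_per_track, revolutions_per_second):
    """Upper bound on disk transfer rate, in bytes per second."""
    return bytes_per_track * revolutions_per_second


def bandwidth_utilization(achieved_rate, bytes_per_track, revolutions_per_second):
    """Achieved data rate as a percentage of the upper bound."""
    bound = max_transfer_rate(bytes_per_track, revolutions_per_second)
    return 100.0 * achieved_rate / bound


# Hypothetical disk: 16384 bytes per track, 3600 RPM = 60 revolutions/s.
print(max_transfer_rate(16384, 60))              # 983040
print(bandwidth_utilization(245760, 16384, 60))  # 25.0
```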

  16. Performance • Some stats:

  17. Performance Continued • Limits • Processors limit throughput • Memory to memory copying – 40% of I/O time • Block chaining would require driver rewrites • One block allocated at a time – 10% of system writes

  18. Enhancements • File system changes and required downtime allowed for some new ideas • Long File Names • File Locking • Symbolic Links • Rename • Quotas

  19. Summary • File System Changes from Old System • Block size increased • Layout more efficient • Fragments used to reduce space waste • Performance increased • New items implemented into FS that had been requested by users

  20. Summary Cont’d • New FS organization • Utilization • Parameterization • New Layout Policies • Performance Improvements

  21. Key Ideas • Old FS used only a small fraction of the available disk bandwidth • New FS: • Same data structures • Same FS semantics • New FS has new functionality
