Disks
Secondary Storage • Secondary Storage Typically • Storage systems outside of “primary memory” • Cannot access data using load/store instructions • Characteristics • Large: Terabytes • Cheap: few cents / GB • Persistent: data survives power loss • Slow: 100us – 10ms to access • RAM: 100ns access
Disks and the OS • Disks are messy, messy devices • Errors, bad blocks, etc… • OS hides details from higher level software • Low level device drivers (block reads, etc) • Higher level abstractions (files, databases, etc) • Different abstractions for different clients • Physical disk blocks (surface, cylinder, sector) • Logical disk blocks (disk block index #) • Logical files (filenames, byte index #)
I/O System • [Figure: layered I/O path, top to bottom] User Processes -> OS (File System -> I/O System -> Device Driver) -> Device Controller -> Disk
Disk Performance • How long to read or write n sectors? • Positioning Time + Transfer Time • Positioning Time: Seek time + rotational delay • Transfer time (in minutes): n / (RPM * sectors/track)
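The cost model on this slide can be turned into a small calculator. This is a rough sketch only: it assumes all n sectors sit on one track, an average rotational delay of half a rotation, and made-up drive parameters (7200 RPM, 4 ms average seek, 500 sectors per track) chosen purely for illustration.

```python
def access_time_ms(n_sectors, seek_ms, rpm, sectors_per_track):
    """Estimate the time to read n contiguous sectors on one track, in ms."""
    ms_per_rotation = 60_000 / rpm
    # Assumption: on average the target sector is half a rotation away.
    rotational_delay_ms = ms_per_rotation / 2
    # n sectors occupy n / sectors_per_track of one full rotation.
    transfer_ms = n_sectors / sectors_per_track * ms_per_rotation
    return seek_ms + rotational_delay_ms + transfer_ms

# Hypothetical drive parameters, not taken from any real data sheet.
one_sector = access_time_ms(1, seek_ms=4.0, rpm=7200, sectors_per_track=500)
full_track = access_time_ms(500, seek_ms=4.0, rpm=7200, sectors_per_track=500)
print(f"1 sector:   {one_sector:.2f} ms")
print(f"full track: {full_track:.2f} ms")
```

Under these assumptions a single-sector read is dominated by positioning time, so reading a whole track costs only about twice as much as reading one sector.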
Disk Performance • Performance depends on number of steps • Seek: moving disk arm to correct cylinder • Depends on how fast disk arm can move • Seek times aren’t improving very quickly (why?) • Rotation (latency): wait for sector to rotate under read head • Depends on rotation rate of disk • Rates are increasing slowly • Transfer: Disk Surface -> Disk Controller -> Host Memory • When the OS uses disk, it tries to minimize total cost of all steps • Particularly seeks and rotation
Interacting with Disks • Old days • OS would specify cylinder, sector, surface, and transfer size • OS would need to know all disk parameters • Modern disks are even more complicated • Not all sectors are the same size, sectors are remapped, etc… • Disks now provide higher level interface (SCSI, EIDE) • Exports data as logical array of blocks [0….N] • Disk controller maps logical blocks to cylinder/surface/sector • On board cache • Thus, physical parameters are hidden from OS
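To make the logical-block mapping concrete, here is the classic textbook formula for an idealized disk with fixed geometry. It is only a sketch: the function names and the 16-surface, 63-sector geometry are illustrative assumptions, and real controllers deviate from it (variable sectors per track, remapped sectors), which is exactly why the mapping is hidden from the OS.

```python
def chs_to_lba(cylinder, surface, sector, surfaces, sectors_per_track):
    """Map (cylinder, surface, sector) to a logical block number.

    Sectors are traditionally numbered from 1; cylinders and surfaces from 0.
    """
    return (cylinder * surfaces + surface) * sectors_per_track + (sector - 1)

def lba_to_chs(lba, surfaces, sectors_per_track):
    """Inverse mapping for the same idealized geometry."""
    cylinder, rest = divmod(lba, surfaces * sectors_per_track)
    surface, sector0 = divmod(rest, sectors_per_track)
    return cylinder, surface, sector0 + 1

# Hypothetical geometry: 16 surfaces, 63 sectors per track.
lba = chs_to_lba(5, 3, 17, surfaces=16, sectors_per_track=63)
print(lba, lba_to_chs(lba, surfaces=16, sectors_per_track=63))
```

With this numbering, consecutive logical blocks fall on the same track first and then the same cylinder, so sequential access avoids seeks.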
Disk Controller • Responsible for Interface between OS and disk drive • Common interfaces: (S)ATA/IDE vs SCSI • (S)ATA/IDE used for personal storage • Slower rotation speeds, slower seek times • SCSI (SAS) used for enterprise systems • Faster rotation and seek times • Basic Operations • Read Block <index> / Write Block <index> • OS does not know of internal disk complexity • Disks export array of Logical Block Addresses (LBAs) • Disk maps internal structures to LBA numbers • Implicit Contract: • Large sequential accesses to contiguous LBA regions achieve better performance than small transfers or random accesses
Reliability • Disks fail more often • When continuously powered-on • With heavy workloads • Under high temperatures • How do disks fail? • Whole disk can stop working (e.g. motor dies) • Transient problem (cable disconnected) • Individual sectors can fail (e.g. head crash or scratch) • Data can either be corrupted or (more likely) unreadable • Disks can internally fix some errors • ECC (error correction code): Detect/correct bit errors • Retry sector reads/writes • Move sectors to other locations on disk • Maintain same LBA location
Buffering • Disks contain internal memory (2MB – 16MB) used as cache • Read-ahead • Read entire cylinder into memory during rotational delay • Write caching to volatile memory • Returns from writes immediately • Data stored in disk’s cache • Can be lost on power failure • Command Queuing • Keep multiple outstanding requests for disk • Disk can reorder (schedule) requests to improve performance
Disk Scheduling • Goal: Minimize positioning time • Performed by both OS and disk • FCFS: Schedule requests in order received • Advantage: fair • Disadvantage: High seek and rotation cost • Shortest seek time first (SSTF) • Handle nearest cylinder next • Advantage: Reduces arm movement (seek time) • Disadvantage: Unfair, can starve requests • Where have we seen this before?
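The trade-off between FCFS and SSTF is easier to see with numbers. The sketch below counts total cylinders travelled for each policy; the request queue and starting head position are illustrative values, and seek cost is assumed to be proportional to distance.

```python
def fcfs(start, requests):
    """Serve requests in arrival order; return (order, cylinders moved)."""
    pos, moved = start, 0
    for cyl in requests:
        moved += abs(cyl - pos)
        pos = cyl
    return list(requests), moved

def sstf(start, requests):
    """Always serve the pending request nearest to the current head position."""
    pending, order, pos, moved = list(requests), [], start, 0
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - pos))
        pending.remove(nearest)
        moved += abs(nearest - pos)
        order.append(nearest)
        pos = nearest
    return order, moved

# Illustrative queue of cylinder numbers, head initially at cylinder 53.
queue = [98, 183, 37, 122, 14, 124, 65, 67]
print("FCFS:", fcfs(53, queue))
print("SSTF:", sstf(53, queue))
```

For this queue SSTF moves the arm far less than FCFS, but note that the request for cylinder 183 is served last even though it arrived second. With a steady stream of nearby requests it could wait indefinitely.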
Disk Scheduling • SCAN (elevator): Move from outer cylinder in, then reverse direction • Advantage: More fair to requests, minimizes seeks
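A minimal sketch of the elevator policy follows. It assumes the arm sweeps all the way to the edge of the disk before reversing (and only when requests remain on the other side); the queue, head position, and 200-cylinder disk are illustrative values.

```python
def scan(start, requests, max_cylinder, toward_higher=True):
    """Elevator scheduling: sweep one way, then reverse.

    Returns (service order, total cylinders moved).
    """
    below = sorted((c for c in requests if c < start), reverse=True)
    above = sorted(c for c in requests if c >= start)
    if toward_higher:
        first, edge, second = above, max_cylinder, below
    else:
        first, edge, second = below, 0, above
    # The arm travels to the edge only if it has to come back for more work.
    path = first + ([edge] if second else []) + second
    moved, pos = 0, start
    for cyl in path:
        moved += abs(cyl - pos)
        pos = cyl
    return first + second, moved

queue = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative request queue
print(scan(53, queue, max_cylinder=199, toward_higher=False))
print(scan(53, queue, max_cylinder=199, toward_higher=True))
```

Because every request is reached within at most two sweeps, no request can be starved, which is the fairness advantage the slide refers to.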
SSDs: Solid State Storage • Forget everything about rotating disks • Completely different • Greater complexity but higher performance • Lower density • A block of NVRAM (non-volatile) arranged in a grid • Single Level Cell (SLC): 1 bit per cell • Faster and more reliable • Multi Level Cell (MLC): 2 bits per cell • Slower and less reliable
Role of OS • Standard Library • Common block interface across all storage devices • Resource Coordination • Fairness and isolation • Protection • Prevent applications from crashing system • Provide file system semantics • Even higher level abstraction than block devices
File Systems • In general file systems are simple • Abstraction for secondary storage • Files • Logical organization of files • Directories • Sharing of data between users/processes • Permissions/ACLs
Files • Collection of data with some properties • Contents, size, owner, permissions • Files may also have types • Understood by file system • Device, directory, symbolic link, … • Understood by other parts of OS/runtime • Executable, DLL, source code, object code, … • Types can be encoded in name or contents • File extension: .exe, .txt, .jpg, .dll • Content: “#!<interpreter>” • Operating system view • Bytes mapped to collection of blocks on persistent physical storage
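The two ways of encoding a file's type can be sketched in a few lines. The helper names below are hypothetical, and the content check only looks for the "#!" marker mentioned on the slide; real tools inspect many more signatures.

```python
import os

def type_from_name(path):
    """Type encoded in the name: return the lower-cased extension, if any."""
    ext = os.path.splitext(path)[1].lower()
    return ext or None

def interpreter_from_contents(first_line):
    """Type encoded in the contents: return the interpreter after '#!'."""
    if first_line.startswith(b"#!"):
        return first_line[2:].strip().decode() or None
    return None

print(type_from_name("photo.JPG"))
print(type_from_name("README"))
print(interpreter_from_contents(b"#!/bin/sh\n"))
print(interpreter_from_contents(b"plain text\n"))
```

The extension is only a naming convention that users can change freely, whereas the "#!" line travels with the file's data.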
Directories • Provides: • Method for organizing files • Namespace that is accessible by both users and FS • Map from file name to file data • Actually maps name to meta-data, meta-data maps to data • Directories contain files and other directories • /, /usr, /usr/local, • Most file systems support notion of a current directory • Absolute names: fully qualified starting from root of FS • /usr/local (absolute) • Relative names: specified with respect to current directory • ./local (relative)
File Meta-Data • Meta-Data: Additional information associated with a file • Name • Type • Location of data blocks on disk • Size • Times: Creation, access, modification • Ownership • Permissions (read/write) • Stored on disk • How and where it’s stored depends on file system
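Most of these fields can be inspected from user space. The sketch below uses Python's `os.stat`, which wraps the operating system's stat call; the `describe` helper and the file name are made up for illustration, and which fields are meaningful varies by platform and file system.

```python
import os
import stat
import tempfile

def describe(path):
    """Return a few of the meta-data fields the OS keeps for a file."""
    st = os.stat(path)
    return {
        "size": st.st_size,
        "is_regular": stat.S_ISREG(st.st_mode),
        "is_directory": stat.S_ISDIR(st.st_mode),
        "permissions": stat.filemode(st.st_mode),
        "owner_uid": st.st_uid,
        "accessed": st.st_atime,
        "modified": st.st_mtime,
    }

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello")
    for field, value in describe(path).items():
        print(field, value)
```

The location of the data blocks is deliberately absent: that part of the meta-data stays inside the file system.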
Directory entries • A directory is a file that contains only meta-data • List of files with associated meta-data • Each file’s meta-data = attributes • Size, protection, location on disk, creation time, … • List is usually un-ordered (effectively random) • Running ‘ls’ sorts files in memory
Directory Trees • Directory entries specify files… • But a directory is a file • Special bit in meta-data stored in directory entry • User programs can read directories • But, only system programs can write directories • Why is this true? • Special directories • This directory: ‘.’ • Parent directory: ‘..’ • Root: ‘/’ • Fixed directory entry for its own meta-data
File Operations • Create file with given pathname /a/b/file • Traverse pathname, allocate meta-data and directory entry • Read-from/write-to offset in file • Find (or allocate) blocks on disk, update meta-data • Delete • Remove directory entry, free disk blocks • Rename • Change directory entry • Copy • Allocate new directory entry, allocate new space on disk, copy • Change permissions • Change directory entry, or ACL meta-data
Opening Files • Files must be opened before accessing data • User must specify access mode: read and/or write • OS tracks open files for each process • Table containing open file’s metadata • How do you know which table entry belongs to a file? • File descriptor = index into open file table • Table contains meta-data + current position and access mode
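The idea that a file descriptor is just an index into a per-process table can be modelled directly. This is a toy simulation, not how any particular kernel lays out its tables; the class and field names are invented for the example.

```python
class OpenFileTable:
    """Toy per-process open file table; the list index is the descriptor."""

    def __init__(self):
        self.entries = []

    def open(self, name, mode):
        entry = {"name": name, "mode": mode, "pos": 0}
        # Reuse the lowest free slot, as UNIX-like systems do.
        for fd, slot in enumerate(self.entries):
            if slot is None:
                self.entries[fd] = entry
                return fd
        self.entries.append(entry)
        return len(self.entries) - 1

    def advance(self, fd, nbytes):
        """Model a read or write moving the current position forward."""
        self.entries[fd]["pos"] += nbytes
        return self.entries[fd]["pos"]

    def close(self, fd):
        self.entries[fd] = None

table = OpenFileTable()
a = table.open("/a/b/file", "r")
b = table.open("/tmp/log", "w")
table.advance(a, 100)
table.close(a)
c = table.open("/etc/hosts", "r")
print(a, b, c, table.entries)
```

Because the position lives in the table entry rather than in the file, two opens of the same file can read from different offsets independently.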
Basic operations
UNIX                  Win32
create(name)          CreateFile(name, CREATE)
open(name, mode)      CreateFile(name, OPEN)
read(fd, buf, len)    ReadFile(handle, …)
write(fd, buf, len)   WriteFile(handle, …)
sync(fd)              FlushFileBuffers(handle, …)
seek(fd, pos)         SetFilePointer(handle, …)
close(fd)             CloseHandle(handle, …)
unlink(name)          DeleteFile(name)
-                     CopyFile(name, …)
rename(old, new)      MoveFile(name, …)
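The UNIX column can be exercised through Python's `os` module, whose functions are thin wrappers around the corresponding system calls. The sketch below is one possible sequence; the file names are arbitrary, and `os.fsync` and `os.lseek` stand in for the slide's sync(fd) and seek(fd, pos).

```python
import os
import tempfile

def demo(dirpath):
    """Create, write, sync, seek, read, close, rename, and unlink a file."""
    old = os.path.join(dirpath, "old.txt")
    new = os.path.join(dirpath, "new.txt")
    fd = os.open(old, os.O_CREAT | os.O_RDWR, 0o600)  # create + open
    os.write(fd, b"hello disk")                       # write(fd, buf, len)
    os.fsync(fd)                                      # force data to storage
    os.lseek(fd, 6, os.SEEK_SET)                      # seek to byte offset 6
    data = os.read(fd, 4)                             # read(fd, buf, len)
    os.close(fd)
    os.rename(old, new)                               # changes the directory entry
    existed_after_rename = os.path.exists(new)
    os.unlink(new)                                    # remove the directory entry
    return data, existed_after_rename, os.path.exists(new)

with tempfile.TemporaryDirectory() as d:
    print(demo(d))
```

Every call after `os.open` takes the descriptor rather than the name, so the path is translated only once.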
Path name translation • Example: Open “/one/two/three” • int fd = open("/one/two/three", O_RDWR); • What happens in FS? • Open directory “/” at known FS location • Search for directory entry of “one” • Get location of “/one” from directory entry • Open “one”, search for “two”, get location of “two” • Open “two”, search for “three”, get location of “three” • Open file “three” • (Of course check permissions at each step) • FS spends a lot of time walking through directories • This is why open is separated from read/write • OS can cache directory entries for faster lookups
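The walk described on this slide can be simulated with nested dictionaries standing in for directories. This is a simplified model under several assumptions: permission checks are omitted, the cache is keyed by full path (real systems typically cache individual directory entries), and the class and counter names are invented.

```python
class ToyFS:
    """In-memory tree: dicts are directories, anything else is file data."""

    def __init__(self, root):
        self.root = root
        self.cache = {}        # full path -> resolved node
        self.dir_searches = 0  # how many directories were searched

    def resolve(self, path):
        if path in self.cache:
            return self.cache[path]
        node = self.root  # the root lives at a known location
        for name in [part for part in path.split("/") if part]:
            if not isinstance(node, dict):
                raise NotADirectoryError(path)
            self.dir_searches += 1  # search this directory for `name`
            if name not in node:
                raise FileNotFoundError(path)
            node = node[name]
        self.cache[path] = node
        return node

fs = ToyFS({"one": {"two": {"three": "file data"}}})
print(fs.resolve("/one/two/three"), fs.dir_searches)
print(fs.resolve("/one/two/three"), fs.dir_searches)
```

The first lookup searches three directories; the second is answered from the cache without touching any of them.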
Acyclic-graph Directories • More general than a tree structure • Add connections across the tree (no cycles) • Create links from one file (or directory) to another • Hard link: “ln a b” (“a” must exist) • Idea: Can use name “a” or “b” to access same file data • Implementation: Multiple directory entries point to same meta-data • What happens when you remove “a”? (Does “b” still exist?) • UNIX: No hard links to directories. Why?
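The question of what happens when “a” is removed can be checked directly. The sketch below assumes a POSIX-like file system that supports hard links; `os.link` is the Python wrapper for the link system call, and the function name is made up for this example.

```python
import os
import tempfile

def hard_link_demo(dirpath):
    a = os.path.join(dirpath, "a")
    b = os.path.join(dirpath, "b")
    with open(a, "w") as f:
        f.write("shared data")
    os.link(a, b)  # equivalent of "ln a b"
    # Both directory entries refer to the same meta-data (same inode number).
    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
    links_before = os.stat(b).st_nlink
    os.unlink(a)   # removes only the directory entry named "a"
    links_after = os.stat(b).st_nlink
    with open(b) as f:
        contents = f.read()
    return same_inode, links_before, links_after, contents

with tempfile.TemporaryDirectory() as d:
    print(hard_link_demo(d))
```

The data survives under the name “b” because the file system frees the blocks only when the link count in the meta-data drops to zero.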
Acyclic-graph Directories • Symbolic (soft) link: “ln -s a b” • Can use name “a” or “b” to get same file data • But only if “a” exists • When referencing “b”, lookup pathname of “a” • “b”: special file (designated via bit in meta-data) • Contents of “b” are simply the pathname of “a” • Since data is small, many FS implementations just embed pathname of “a” into directory entry for “b”
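A short experiment shows that a symbolic link stores only a pathname. It assumes a POSIX-like system where unprivileged users may create symlinks (on Windows this may require extra privileges); the function name is hypothetical.

```python
import os
import tempfile

def symlink_demo(dirpath):
    a = os.path.join(dirpath, "a")
    b = os.path.join(dirpath, "b")
    with open(a, "w") as f:
        f.write("target data")
    os.symlink(a, b)  # equivalent of "ln -s a b"
    stores_pathname = os.readlink(b) == a  # contents of "b" are the path of "a"
    is_link = os.path.islink(b)
    with open(b) as f:  # opening "b" follows the link to "a"
        via_link = f.read()
    os.unlink(a)
    # "b" still exists as a link, but the path it names no longer resolves.
    dangling = os.path.islink(b) and not os.path.exists(b)
    return stores_pathname, is_link, via_link, dangling

with tempfile.TemporaryDirectory() as d:
    print(symlink_demo(d))
```

Unlike a hard link, removing “a” leaves “b” dangling, since “b” never referred to the meta-data of “a”, only to its name.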