File Systems

File Systems (Some Theory)

Topics: files, directories, file system implementation, example file systems.

Long-term Information Storage: We need to store large amounts of data. Stored information must outlive the process creating it. Multiple processes must be able to access data concurrently.

File Structures: There are three kinds of file structures: (a) byte sequential (the O/S doesn't care what's inside), (b) record sequential (the O/S understands the record structure), (c) tree (database systems).

Sequential Access
• Read all bytes/records from the beginning.
• Cannot jump around, but can rewind or back up.
• Convenient when the medium was magnetic tape.

Random Access
• Bytes/records can be read in any order (disk drives).
• Essential for database systems.
• A read can be either:
  -- move the file marker, then read, or
  -- read, then move the file marker.
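A minimal sketch of the "move the file marker, then read" style, using Python's standard file API. The file contents and the 16-byte record size are assumptions made up for illustration.

```python
import os
import tempfile

RECORD_SIZE = 16  # assumed fixed record length, for illustration only

def read_record(f, index):
    """Move the file marker to record `index`, then read that record."""
    f.seek(index * RECORD_SIZE)
    return f.read(RECORD_SIZE)

# Build a small file of fixed-size records to read from.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(5):
        tmp.write(f"record-{i}".ljust(RECORD_SIZE).encode("ascii"))
    path = tmp.name

with open(path, "rb") as f:
    third = read_record(f, 3)   # jump straight to record 3
    first = read_record(f, 0)   # then back to record 0; order does not matter
    marker_after = f.tell()     # the read itself advanced the marker

os.remove(path)
```

Each read also moves the marker forward by the number of bytes read, which is why a purely sequential reader never needs to call `seek` at all.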

Some Typical File Attributes

Typical File Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Write
7. Append
8. Seek
9. Get Attributes
10. Set Attributes
11. Rename

Memory Mapped Files: In some cases, it is convenient to map a file into the address space of a running process. File access is then done by normal reads and writes of memory. This is often faster than issuing read and write system calls and, to some, much easier to program. Memory mapping is done by changing the system's internal tables so that the file becomes the backing store (à la paging) for the memory region into which the file is mapped.
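As a rough illustration, Python's standard `mmap` module exposes this idea: once a file is mapped, it is read and written with ordinary slicing. The temporary file and its contents are assumptions made up for the example.

```python
import mmap
import os
import tempfile

# Create a small file to map (contents are illustrative).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello file systems")
    path = tmp.name

with open(path, "r+b") as f:
    # Length 0 maps the whole file into the process's address space.
    with mmap.mmap(f.fileno(), 0) as region:
        first_word = region[0:5]   # a "read" is an ordinary slice
        region[0:5] = b"HELLO"     # a "write" is an ordinary slice assignment
        region.flush()             # ask the O/S to write the changed pages back

# Re-read the file through the normal file API to see the change.
with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
```

Note that a slice assignment cannot change the size of the mapping, so the replacement bytes must have the same length as the slice they overwrite.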

Path Names A UNIX directory tree

Directory Operations
1. Create
2. Delete
3. Open
4. Close
5. Read
6. Rename
7. Link
8. Unlink

A File System Implementation
The disk starts with the Master Boot Record (MBR) and the partition table, followed by the disk partitions. One partition is marked as the active partition. A partition holds a boot block, a super block, free space management information, the i-nodes, the root directory, and the files and directories.
When the system is started, the BIOS reads in and executes the MBR. The MBR locates the active partition and finds its boot block. The boot block loads the O/S.

The super block contains information about the file system itself, for example, how big the file system is. (Layout of the file system: super block, i-node table, data.)

The i-node table is an array of i-node structures. An i-node is identified by its position in the array.

i-nodes: I-nodes are fixed in size. When a file is opened, its i-node is loaded from disk into memory. Thus only a small amount of memory is required, and only while the file is open.

All directories and files are stored in the data area of the file system. Everything in the file system is stored in "blocks". Blocks are fixed in size and represent the smallest unit of storage in the file system.


Creating a new file involves the following operations:
1. The kernel finds an unused i-node.
2. The kernel stores file attributes in the i-node.
3. The kernel finds free blocks and stores the file data.
4. The kernel stores the block numbers in the i-node.
5. The kernel adds an entry to the directory.
(Figure: i-node 47 holds the attributes and the block numbers 200, 635 and 821; the data is stored in blocks 200, 635 and 821; the directory entry reads "47 myFile.txt".)
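The five steps above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the class `ToyFileSystem`, its dictionary-based i-nodes, and the tiny 4-byte block size are all hypothetical, and a real kernel would use free lists or bitmaps rather than a linear search.

```python
class ToyFileSystem:
    """Toy in-memory model: an i-node table, data blocks, and one directory."""

    def __init__(self, num_inodes, num_blocks, block_size):
        self.block_size = block_size
        self.inodes = [None] * num_inodes   # None marks an unused i-node
        self.blocks = [None] * num_blocks   # None marks a free block
        self.directory = {}                 # file name -> i-node number

    def create(self, name, data, attributes):
        # 1. Find an unused i-node (its position in the table identifies it).
        inode_no = self.inodes.index(None)
        # 2. Store the file attributes in the i-node.
        inode = {"attributes": attributes, "block_numbers": []}
        # 3. Find free blocks and store the file data in them.
        for start in range(0, len(data), self.block_size):
            block_no = self.blocks.index(None)
            self.blocks[block_no] = data[start:start + self.block_size]
            # 4. Store each block number in the i-node.
            inode["block_numbers"].append(block_no)
        self.inodes[inode_no] = inode
        # 5. Add an entry to the directory.
        self.directory[name] = inode_no
        return inode_no

    def read(self, name):
        """Look up the i-node via the directory, then gather its blocks."""
        inode = self.inodes[self.directory[name]]
        return b"".join(self.blocks[n] for n in inode["block_numbers"])


fs = ToyFileSystem(num_inodes=8, num_blocks=16, block_size=4)
inode_no = fs.create("myFile.txt", b"0123456789", {"owner": "student"})
```

The sketch does no error handling: `list.index` raises `ValueError` when no i-node or block is free, where a real file system would report that the disk is full.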

Terminology (figure: a disk surface showing a track, a sector, a block or cluster, and the read/write head)
• A block is the minimum unit of storage for data. Block sizes are defined by the O/S and the file system.
• Internal fragmentation occurs when a file does not use all of a block allocated to it.
• External fragmentation occurs when the blocks used to store a file are not contiguous.
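Internal fragmentation is easy to quantify. The short sketch below computes it for an assumed example of a 10,000-byte file on a file system with 4,096-byte blocks; both numbers and both function names are chosen only for illustration.

```python
def blocks_needed(file_size, block_size):
    """Number of whole blocks required to hold file_size bytes."""
    return (file_size + block_size - 1) // block_size   # integer ceiling

def internal_fragmentation(file_size, block_size):
    """Bytes allocated to the file but not used by it."""
    return blocks_needed(file_size, block_size) * block_size - file_size

# Assumed example: a 10,000-byte file with 4,096-byte blocks.
n_blocks = blocks_needed(10_000, 4096)          # 3 blocks
wasted = internal_fragmentation(10_000, 4096)   # 3 * 4096 - 10000 = 2288 bytes
```

Larger blocks mean fewer blocks per file but more wasted space in the last block of each file.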

Problem: Given that you need n blocks on the disk to hold the contents of a file, how do you allocate those blocks to the application?

Contiguous File Allocation
The simplest file allocation scheme is to take blocks sequentially from the disk, as they are needed for each file. This has two major advantages:
o It is simple to implement. You only need to keep track of the starting block and the number of blocks in the file.
o It is very efficient. Only one seek is required to read in the entire file. (A seek is the operation that moves the read/write head over the correct track.)


Contiguous File Allocation: an example
Create a file of 3 blocks, then a file of 5 blocks, then a file of 4 blocks, then a file of 6 blocks. Now delete the 2nd file.
What's the problem with this design? External disk fragmentation. A file must be 5 blocks or less to fit in the hole that is left. If you fill the hole with a file that takes up fewer than 5 blocks, you end up with a small hole that may be really hard to fill.
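The scenario above can be replayed on a toy disk. This is a sketch under assumptions: the disk is a 20-block Python list, the file names A to F are made up, and first-fit placement is one possible policy rather than anything the slides prescribe.

```python
FREE = None

def allocate_contiguous(disk, name, n_blocks):
    """First fit: place the file in the first run of n_blocks free blocks.

    Returns the starting block number, or None if no hole is big enough.
    """
    run_start, run_length = 0, 0
    for i, owner in enumerate(disk):
        if owner is FREE:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length == n_blocks:
                disk[run_start:run_start + n_blocks] = [name] * n_blocks
                return run_start
        else:
            run_length = 0
    return None

def delete(disk, name):
    """Free every block owned by the named file."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = FREE

disk = [FREE] * 20
allocate_contiguous(disk, "A", 3)   # blocks 0-2
allocate_contiguous(disk, "B", 5)   # blocks 3-7
allocate_contiguous(disk, "C", 4)   # blocks 8-11
allocate_contiguous(disk, "D", 6)   # blocks 12-17
delete(disk, "B")                   # leaves a 5-block hole at blocks 3-7

free_total = disk.count(FREE)                  # 7 blocks are free in total
too_big = allocate_contiguous(disk, "E", 6)    # None: no run of 6 free blocks
small = allocate_contiguous(disk, "F", 3)      # fits at block 3
leftover = disk[6:8]                           # a 2-block hole remains
```

Seven blocks are free, yet a 6-block file cannot be stored, and placing the 3-block file leaves a 2-block hole that only very small files can use.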

Linked List File Allocation
+ Every block on disk can be used. Disk blocks can be anywhere.
+ The directory need only store the address of the first block of the file.
- Each block sacrifices the space required to store the pointer.
- Random access of blocks in the file is slow.

Random Access (figure: a file stored in blocks 1 to 4, scattered over the disk)
What if you want to randomly access the 3rd block? The directory only contains the address of the first block, so you have to access the first block, because it contains the pointer to the next block. Now you need to access the 2nd block to get the location of the 3rd block, but this will more than likely involve a disk seek, i.e. moving the disk head.
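A small sketch makes the cost visible by counting block reads. The toy disk below is a dictionary whose block numbers (4, 9, 2, 7) are arbitrary assumptions, chosen to show that the blocks of one file can be anywhere.

```python
# Toy disk: block number -> (data, number of the next block or None).
disk = {
    4: ("block-0", 9),
    9: ("block-1", 2),
    2: ("block-2", 7),
    7: ("block-3", None),
}
first_block = 4   # the only address the directory stores

def read_nth_block(disk, first_block, n):
    """Return (data, disk_reads) for the n-th block of the file, n from 0."""
    current = first_block
    reads = 0
    for _ in range(n):
        _, next_block = disk[current]   # must read the block to get its pointer
        reads += 1
        current = next_block
    data, _ = disk[current]
    reads += 1
    return data, reads

data, reads = read_nth_block(disk, first_block, 2)   # the 3rd block of the file
```

Reaching the 3rd block costs three block reads, and in general reaching block n costs n + 1 reads, each of which may need a seek.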

A File Allocation Table (FAT)
• There is an entry in the table for every block on the disk.
• A -1 signals the end of the list of blocks.
• The FAT usually resides in a fixed location at the beginning of the disk.
• No space is taken up in the file for pointers.
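A minimal sketch of following a chain through such a table. The 12-entry table, the file occupying blocks 4, 7, 2 and 10, and the use of 0 for entries that belong to no file are all assumptions for illustration.

```python
END_OF_FILE = -1

# One entry per disk block: fat[b] is the block that follows b in its file.
# Assumed layout: one file occupying blocks 4 -> 7 -> 2 -> 10.
fat = [0] * 12
fat[4], fat[7], fat[2], fat[10] = 7, 2, 10, END_OF_FILE

def file_blocks(fat, first_block):
    """Follow the chain in the table and list the file's blocks in order."""
    chain = []
    block = first_block
    while block != END_OF_FILE:
        chain.append(block)
        block = fat[block]
    return chain

chain = file_blocks(fat, 4)
third_block = chain[2]   # found by table lookups alone, no data block was read
```

The pointers live in the table rather than in the data blocks, so the whole chain can be followed without touching the file's data.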

Why cache the FAT? If the FAT is only on disk, reading a file moves the disk head back and forth:
1. Move the disk head to read the first entry in the FAT.
2. Now move the disk head to read in the first block of the file.
3. Move the disk head back to read the next entry in the FAT.
4. Move the disk head to read in the next block in the file.

When the FAT is in cache
+ Random access is easier: the chain is entirely in memory.
- The biggest disadvantage is that the whole FAT resides in memory. Assume a 20 GB disk with a 1 KB block size. The FAT needs 20 million entries (60-80 MB, at 3 to 4 bytes per entry).
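The arithmetic behind the 20 million entries and 60-80 MB can be checked in a few lines. This sketch assumes 1 KB blocks, decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) to match the round figures, and entry widths of 3 and 4 bytes.

```python
disk_size = 20 * 10**9   # 20 GB, taken as 20 * 10^9 bytes
block_size = 10**3       # 1 KB, taken as 1000 bytes to match the round figure

entries = disk_size // block_size   # one FAT entry per disk block

# Assumed entry widths of 3 and 4 bytes; table size reported in MB.
table_sizes_mb = {width: entries * width / 10**6 for width in (3, 4)}

for width, size_mb in table_sizes_mb.items():
    print(f"{width}-byte entries: {size_mb:.0f} MB")
```

The table grows linearly with the disk, which is why keeping the entire FAT in memory scales poorly to large disks.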

Unix uses i-nodes!

Directory Implementations: A Simple Directory (MS-DOS and Windows 3.x)
* File attributes are stored in the directory.
* The disk address (of the first block) is stored in the directory.
* Entries are fixed in size (so file names are fixed in length).
(Figure: entries for mail, games, homework, music and photos, each holding a name, attributes and a disk address.)

Directory Implementations: Unix
* Each directory entry points to an i-node.
* File attributes are stored in the i-node.
(Figure: entries for mail, games, homework, music and photos, each holding a name and the address of an i-node.)

Long File Names
* Fragmentation issues (when a directory entry is taken out).
* Page faults may occur (when a directory spans multiple pages).

Long File Names on the Heap

File Sharing: Unix allows different processes to share files.
The Process Table: every process has an entry in the process table. The entry includes all open file descriptors owned by the process (fd 0, fd 1, fd 2, ...). For each descriptor it holds:
- file descriptor flags
- a pointer into the file table

The File Table (kept by the kernel): a table of all open files. Each file table entry holds:
- status flags for the file (read, write, append, etc.)
- the current file offset
- a pointer to the i-node for this file

The i-node Table: one entry for each open file, read from disk when the file is opened. Each entry holds the file's i-node information:
- file permissions
- file owner (user and group ids)
- file size
- time stamps
- device the file is physically located on
- pointers to the actual file blocks on disk
- etc.

A Single Process with Two Open Files (figure): the process table entry has two file descriptors in use. Each descriptor points to its own file table entry (file status flags, current file offset, i-node pointer), and each file table entry points to its own i-node (permissions, user and group ids, file size, time stamps, ..., pointer to first disk block).
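The role of the file table entry can be observed from user space on a POSIX system. In this sketch, opening the same file twice yields two file table entries with independent offsets, while `os.dup` (a call not covered in the slides, used here as an assumption for illustration) yields a second descriptor that shares one entry. The temporary file and its contents are made up.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abcdefghij")
    path = tmp.name

fd_a = os.open(path, os.O_RDONLY)   # new file table entry, offset 0
fd_b = os.open(path, os.O_RDONLY)   # second entry for the same i-node
fd_c = os.dup(fd_a)                 # new descriptor, same entry as fd_a

os.read(fd_a, 3)                            # moves fd_a's offset to 3
offset_b = os.lseek(fd_b, 0, os.SEEK_CUR)   # separate entry: still 0
offset_c = os.lseek(fd_c, 0, os.SEEK_CUR)   # shared entry: 3

# Both opens refer to the same i-node on disk.
same_inode = os.fstat(fd_a).st_ino == os.fstat(fd_b).st_ino

for fd in (fd_a, fd_b, fd_c):
    os.close(fd)
os.remove(path)
```

The current file offset lives in the file table entry, not in the descriptor or the i-node, which is why `fd_c` sees the read made through `fd_a` while `fd_b` does not.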
