
File Management

Learn essential concepts of file management for programmers, including file organization, storage, and access techniques. Understand the importance of file descriptors and logical structures in managing data effectively.

benitez

Presentation Transcript


  1. File Management

  2. Operating System Components • Process & Resource Manager • Memory Manager • Device Manager • File Manager • [Figure: the operating system components sit above the computer hardware: processor(s), main memory, devices]

  3. Why Programmers Need Files • Structured information • Can be read by any application • Accessibility • Protocol • Persistent storage • Shared device • [Figure: an HTML editor and a web browser each reach foo.html (<head> … </head> <body> … </body>) through the file manager]

  4. Fig 13-2: The External View of the File Manager • UNIX application calls: mount(), open(), close(), read(), write(), lseek() • Windows application calls: CreateFile(), ReadFile(), WriteFile(), CloseHandle(), SetFilePointer() • In both systems the application program calls the File Mgr, which is drawn with the Process Mgr, Memory Mgr, and Device Mgr above the hardware

  5. Introduction • What is a file? • Where is a file located physically? • What are the steps to access a file?

  6. File Management • File is a named, ordered collection of information • The file manager administers the collection by: • Storing the information on a device • Mapping the block storage to a logical view • Allocating/deallocating storage • Providing file directories • What abstraction should be presented to programmer?

  7. File system context

  8. Levels in a file system

  9. Levels of data abstraction

  10. Logical structures in a file

  11. Information Structure • [Figure, layers from top to bottom: Applications; Structured Record Files (records); Record-Stream Translation; Byte Stream Files; Stream-Block Translation; Storage device]

  12. Byte Stream File Interface • Implements the block-stream interface • Info on file held in file descriptor • described later • Typical operations on file • fileID = open(fileName) • close(fileID) • read(fileID, buffer, length) • write(fileID, buffer, length) • seek(fileID, filePosition)

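  13. Low Level Files • The application sees a stream of bytes b0, b1, b2, ..., bi and calls:

        fid = open("fileName", …);
        …
        read(fid, buf, buflen);
        …
        close(fid);

      • The file manager provides int open(…) {…}, int close(…) {…}, int read(…) {…}, int write(…) {…}, int seek(…) {…} • Stream-block translation sits between these calls and the storage device, which responds to commands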
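The byte stream operations on slides 12 and 13 correspond closely to the POSIX calls open(), lseek(), read(), and close(). The following is a minimal sketch under that assumption; `read_at` is a hypothetical helper name introduced here for illustration, not something from the slides.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to len bytes starting at byte offset pos of the named file.
 * Returns the number of bytes read, or -1 on error.
 * read_at is a hypothetical helper, not part of any real API. */
ssize_t read_at(const char *path, off_t pos, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fid = open(path, O_RDONLY);              /* fileID = open(fileName) */
    if (fid < 0)
        return -1;

    ssize_t n = -1;
    if (lseek(fid, pos, SEEK_SET) != (off_t)-1)  /* seek(fileID, filePosition) */
        n = read(fid, buf, len);                 /* read(fileID, buffer, length) */

    close(fid);                                  /* close(fileID) */
    return n;
}
```

A call such as `read_at("foo.html", 0, buf, sizeof buf)` would fetch the first bytes of the stream; the caller never sees which storage blocks hold them.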

  14. File meta-data • Files contain data plus information about the data, that is, meta-data • Meta-data is kept in a file descriptor

  15. File Descriptor Information • External name • Current state • Sharable • Owner • User • Locks • Protection settings • Length • Time of creation • Time of last modification • Time of last access • Reference count • Storage device details

  16. File Descriptor in Unix • File descriptor in UNIX is called an inode (index node), containing the following entries
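The list of inode entries was not captured in this transcript. As a rough illustration only, the POSIX stat() call reports several per-file fields that parallel the descriptor information on slide 15 (owner, protection settings, length, access times, reference count). This sketch is not a complete listing of what an inode holds, and `print_inode_info` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Print a few descriptor fields for path.
 * Returns 0 on success, -1 if stat() fails. */
int print_inode_info(const char *path)
{
    assert(path != NULL);

    struct stat st;
    if (stat(path, &st) != 0)
        return -1;

    printf("inode number : %llu\n", (unsigned long long)st.st_ino);
    printf("owner uid    : %lu\n", (unsigned long)st.st_uid);
    printf("mode bits    : %lo\n", (unsigned long)st.st_mode);   /* type + protection */
    printf("link count   : %lu\n", (unsigned long)st.st_nlink);  /* reference count */
    printf("length       : %lld bytes\n", (long long)st.st_size);
    printf("last modified: %lld (seconds since epoch)\n", (long long)st.st_mtime);
    printf("last accessed: %lld (seconds since epoch)\n", (long long)st.st_atime);
    return 0;
}
```

Note that the external name is absent from this output: in UNIX the name lives in a directory entry rather than in the inode.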

  17. Structured Files • A file is a stream of bytes • Usually want to access in a structured manner • May have no structure imposed (UNIX) • Must be provided by application • May have a structure imposed (VMS) • Need to maintain additional information • Type of file • Access methods • Other information

  18. Block Record Translation • [Figure: records mapped onto storage blocks by record-block translation]

  19. Record-Oriented Sequential Files • A structured sequential file is a named sequence of logical records, indexed by nonnegative integers • Records may be of fixed size, or variable size • This is determined by file manager • Typical operations • fileID = open(fileName) • close(fileID) • getRecord(fileID, record) • putRecord(fileID, record) • seek(fileID, position)

  20. Record-Oriented Sequential Files • Each record is stored as an H byte header followed by a k byte logical record, then the next header, and so on • Header contains record descriptor information (occupies H bytes) • Logical record takes up k bytes (fixed size)
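With an H byte header and a k byte logical record, locating record n in the underlying byte stream is simple arithmetic. The sketch below assumes records are packed back to back starting at byte 0 of the stream, which the slide suggests but does not state; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the header of record n, when every record is an
 * H-byte header followed by a k-byte logical record. */
size_t record_header_offset(size_t n, size_t H, size_t k)
{
    assert(k > 0);
    return n * (H + k);
}

/* Byte offset of the k-byte payload of record n (just past its header). */
size_t record_data_offset(size_t n, size_t H, size_t k)
{
    return record_header_offset(n, H, k) + H;
}
```

For example, with H = 4 and k = 60, record 3 has its header at byte 192 and its data at byte 196, so a record-level seek reduces to a byte-level seek.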

  21. Record-Oriented Sequential Files • [Figure: the sequence of H byte headers and k byte logical records laid over the physical storage blocks, with a fragment marked]

  22. Electronic Mail Example

        struct message {              /* The mail message */
            address to;
            address from;
            line    subject;
            address cc;
            string  body;
        };

        /* Read one message from the file, field by field */
        struct message *getRecord(void) {
            struct message *msg;
            msg = allocate(sizeof(struct message));
            msg->to = getAddress(...);
            msg->from = getAddress(...);
            msg->cc = getAddress(...);
            msg->subject = getLine();
            msg->body = getString();
            return msg;
        }

        /* Write one message to the file in the same field order */
        void putRecord(struct message *msg) {
            putAddress(msg->to);
            putAddress(msg->from);
            putAddress(msg->cc);
            putLine(msg->subject);
            putString(msg->body);
        }

  23. Record-Oriented Sequential Files • Fixed size records can be a problem • Applications requiring large record sizes would require that the programmer break the records into smaller pieces • Applications only requiring small record sizes would waste space • A solution is for the file system to be enhanced to include a function to define the record size for a file – encoded in header

  24. Indexed Sequential File • Suppose we want to directly access records • Add an index to the file • Typical operations • fileID = open(fileName) • close(fileID) • getRecord(fileID, index) • index = putRecord(fileID, record) • deleteRecord(fileID, index)

  25. Indexed Sequential File (cont) • [Figure: the application structure holds index values i, j, and k; the index maps them to records keyed by account number: 012345, 123456, 294376, ..., 529366, ..., 965987]
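The slides do not say how the index itself is organized. One possible sketch, assuming the index is an array of (key, record position) pairs kept sorted by key, uses binary search to turn an account number into a record position. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct index_entry {
    long key;     /* e.g. an account number */
    long record;  /* position of the matching record in the file */
};

/* Binary search over entries sorted in ascending key order.
 * Returns the record position for key, or -1 if key is absent. */
long index_lookup(const struct index_entry *entries, size_t count, long key)
{
    assert(entries != NULL || count == 0);

    size_t lo = 0, hi = count;           /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].key == key)
            return entries[mid].record;
        if (entries[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

The returned position would then be passed to something like getRecord(fileID, index), so the application never scans the file sequentially.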

  26. More Abstract Files • Inverted files • System index for each datum in the file • Records accessed based on appearance in table rather than their logical location • Company accounts may be accessed by customer name, but customer may have several accounts • Set up external index table by name with pointers to the main table • Multimedia storage • Records contain radically different types • Access methods must be general

  27. Database Management Systems • A database is a very highly structured set of information • Stored across different files • Optimized to minimize access time • DBMS implementation • Some DBMSs use the normal files provided by the OS for generic use • Some manage their own storage device blocks

  28. File systems • File system • A data structure on a disk that holds files • actually a file system is in a disk partition • a technical term different from a “file system” as the part of the OS that implements files • File systems in different OSs have different internal structures

  29. A file system layout

  30. Implementing Low Level Files • Process needs to be able to read from and write to storage devices • Simplest system is byte stream file system • (will consider record-oriented systems later) • Storage device may be accessed 2 ways • Sequentially – like a tape drive • Randomly – like a magnetic disk

  31. Low-level File System Architecture • [Figure: the byte stream b0, b1, b2, b3, ..., bn-1 is mapped onto blocks (Block 0, …) of either a sequential device or a randomly accessed device]

  32. Low Level Files Management • Secondary storage device contains: • Volume directory (sometimes a root directory for a file system) • External file descriptor for each file • The file contents • Manages blocks • Assigns blocks to files (descriptor keeps track) • Keeps track of available blocks • Maps to/from byte stream

  33. File Manager Data Structures • (1) Copy info from the external file descriptor to the open file descriptor • (2) Keep the state of the process-file session • (3) Return a reference to the data structure

  34. An open Operation • Locate the on-device (external) file descriptor • Extract info needed to read/write file • Authenticate that process can access the file • Create an internal file descriptor in primary memory • Create an entry in a “per process” open file status table • Allocate resources, e.g., buffers, to support file usage

  35. A close Operation • Complete all pending operations • Release I/O buffers • Release locks process holds on file • Update external file descriptor • Deallocate file status table entry
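The table bookkeeping in the open and close steps can be sketched in memory. This is a simplified model, not how any particular kernel does it: it covers only creating and deallocating an entry in the per-process open file status table, and every name in it (`file_table`, `table_open`, `table_close`, `MAX_OPEN`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPEN 16   /* assumed table size, for illustration only */

struct open_file {
    int  in_use;      /* is this slot allocated? */
    long position;    /* current byte offset of the process-file session */
    int  descriptor;  /* stand-in for a reference to the internal descriptor */
};

struct file_table {
    struct open_file slot[MAX_OPEN];
};

/* Allocate the lowest free slot; returns its index, or -1 if the table is full. */
int table_open(struct file_table *t, int descriptor)
{
    assert(t != NULL);
    for (int i = 0; i < MAX_OPEN; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].in_use = 1;
            t->slot[i].position = 0;
            t->slot[i].descriptor = descriptor;
            return i;
        }
    }
    return -1;
}

/* Deallocate a slot; returns 0 on success, -1 if fid is not an open entry. */
int table_close(struct file_table *t, int fid)
{
    assert(t != NULL);
    if (fid < 0 || fid >= MAX_OPEN || !t->slot[fid].in_use)
        return -1;
    t->slot[fid].in_use = 0;
    return 0;
}
```

Handing out the lowest free index mirrors the UNIX convention that a new file identifier is the smallest unused one, which is why a freshly opened file typically gets 3 after stdin, stdout, and stderr.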

  36. Opening a UNIX File • fid = open("fileA", flags); … read(fid, buffer, len); • [Figure labels: open file table with entries 0 stdin, 1 stdout, 2 stderr, 3 ...; file structure; inode; internal file descriptor; on-device file descriptor]

  37. Block Management • The job of selecting & assigning storage blocks to the file • For fixed-size blocks of k bytes • File of length m bytes requires N = ⌈m/k⌉ blocks • Byte bi is stored in block ⌊i/k⌋ • The logical file is divided into logical blocks • Each logical block is mapped to a physical disk block
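The block arithmetic above can be written out directly. This sketch assumes k is the block size in bytes and that the block count is rounded up so a partly filled last block still counts; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of k-byte blocks needed for a file of m bytes (rounded up). */
size_t blocks_needed(size_t m, size_t k)
{
    assert(k > 0);
    return (m + k - 1) / k;
}

/* Logical block that holds byte i of the file. */
size_t block_of_byte(size_t i, size_t k)
{
    assert(k > 0);
    return i / k;
}

/* Position of byte i inside its logical block. */
size_t offset_in_block(size_t i, size_t k)
{
    assert(k > 0);
    return i % k;
}
```

With 4096-byte blocks, a 10000-byte file needs 3 blocks, and byte 5000 is at offset 904 of logical block 1.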

  38. Locating file data • The file descriptor contains data on how to perform this mapping • there are many methods for performing this mapping • Three basic strategies: • Contiguous allocation • Linked lists • Indexed allocation

  39. Dividing a file into blocks

  40. Disk Organization • Boot sector • Volume directory • Blk0, Blk1, …, Blkk-1 on Track 0, Cylinder 0 • Blkk, Blkk+1, …, Blk2k-1 on Track 0, Cylinder 1 • … • Further blocks on Track 1, Cylinder 0 • … • Track N-1, Cylinder 0 • … • Track N-1, Cylinder M-1

  41. Contiguous Allocation • Maps the N blocks into N contiguous blocks on the secondary storage device • Simple to implement • Random access • Does not provide for dynamic file sizes • If you want to extend a file, hope there is an empty block following, or recopy the entire file to a larger group of unallocated contiguous blocks • Example file descriptor: Head position 237, …, First block 785, Number of blocks 25
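Random access is cheap under contiguous allocation because the physical block is the first block plus the logical block number. A minimal sketch, assuming a descriptor that records only the first block and the block count as in the slide's example (785 and 25); the names and the block size parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct contiguous_file {
    unsigned long first_block;  /* physical number of the file's first block */
    unsigned long block_count;  /* number of contiguous blocks allocated */
};

/* Physical block holding the byte at the given offset,
 * or -1 if the offset lies beyond the allocated blocks. */
long contiguous_block(const struct contiguous_file *f,
                      unsigned long offset, unsigned long block_size)
{
    assert(f != NULL && block_size > 0);

    unsigned long logical = offset / block_size;
    if (logical >= f->block_count)
        return -1;
    return (long)(f->first_block + logical);
}
```

The -1 case is where the allocation scheme hurts: growing the file past its last block requires a free neighbor or a full copy elsewhere.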

  42. A contiguous file

  43. Keeping a file in pieces • We need a block pointer for each logical block, an array of block pointers • block mapping indexes into this array • Each file is a linked list of disk blocks • But where do we keep this array? • usually it is not kept as contiguous array • the array of disk pointers is like a second related file (that is 1/1024 as big, which matches 4-byte pointers to 4096-byte blocks)

  44. Block pointers in the file descriptor

  45. Block pointers in contiguous disk blocks

  46. Linked Lists • Each block contains a header with • Number of bytes in the block • Pointer to next block • Blocks need not be contiguous • Files can expand and contract • Seeks can be slow • [Figure: file descriptor with fields First block, …, Head: 417, ...; Block 0, Block 1, ..., Block N-1 each hold a length, a pointer to the next block, and Byte 0 through Byte 4095; the last block's pointer is NULL]
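The reason seeks can be slow is that reaching logical block n means following n pointers from the first block. The sketch below models the disk as an in-memory array of block headers so the traversal is visible; a real file manager would read each header from the device. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NIL (-1L)   /* stands in for the NULL next-pointer of the last block */

struct block_header {
    long length;    /* number of bytes used in this block */
    long next;      /* physical number of the next block, or NIL */
};

/* Follow the chain from head to logical block n.
 * Returns its physical block number, or NIL if the file is shorter.
 * Cost grows linearly with n: one header read per hop. */
long linked_block(const struct block_header *disk, long head, long n)
{
    assert(disk != NULL && n >= 0);

    long cur = head;
    while (cur != NIL && n > 0) {
        cur = disk[cur].next;
        n--;
    }
    return cur;
}
```

Because the blocks are found only by chasing pointers, they can sit anywhere on the device, which is what lets the file grow and shrink freely.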

  47. Linked Lists– cont.

  48. Indexed Allocation • Extract headers and put them in an index • Simplify seeks • May link indices together (for large files) • [Figure: file descriptor with fields Index block, …, Head: 417, ...; the index block holds a Length entry for each of Block 0, Block 1, ..., Block N-1; each block holds Byte 0 through Byte 4095]
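Once the per-block headers are gathered into an index, finding logical block n becomes a single array lookup instead of a chain walk. A minimal sketch, assuming one small in-memory index block with a fixed capacity and no linked indices; the names and the capacity are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define INDEX_CAPACITY 8   /* assumed capacity, for illustration only */

struct index_block {
    long count;                  /* number of entries in use */
    long block[INDEX_CAPACITY];  /* block[n] = physical block of logical block n */
};

/* Physical block for logical block n, or -1 if n is out of range.
 * A single lookup, regardless of how far into the file n is. */
long indexed_block(const struct index_block *ib, long n)
{
    assert(ib != NULL && ib->count <= INDEX_CAPACITY);

    if (n < 0 || n >= ib->count)
        return -1;
    return ib->block[n];
}
```

A file larger than one index block can address would need the linked indices the slide mentions, which this sketch leaves out.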

  49. Block pointers in an index block

  50. Block pointers in an index block – cont.
