
Chapter 13: File Management

Chapter 13: File Management. Prof. Steven A. Demurjian, Sr. † Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155 Storrs, CT 06269-3155. steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 - 4818.


Presentation Transcript


  1. Chapter 13: File Management Prof. Steven A. Demurjian, Sr. † Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155 Storrs, CT 06269-3155 steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 - 4818 † These slides have been modified from a set of originals by Dr. Gary Nutt.

  2. Purpose of this Chapter • What is Role of File Manager in OS? • What are Abstractions for Managing Information? • Heavily Used Abstractions (Sequential Files) • Classical (Indexed Sequential Files) • Database • Futuristic Abstractions (Multi-media Including Audio, Video, etc.) • How are Abstractions Implemented as Part of File Management? • What is Role of Directories in File Management? • System Perspective • User Perspective

  3. Recall: OS Functionality and Processing • [Figure: Processes Run on an Abstract Computing Environment Provided by the OS; Labeled Components: File Manager, Process Manager, Memory Manager, Device Manager, Process Description, Synchronization, Protection, Deadlock, Scheduler, Resource Managers; Hardware: CPU, Memory, Devices, Other H/W]

  4. File Management • What are Files? • Fundamental Abstraction for Organizing Information on Secondary Storage • Named Collection of Information • File Manager Administers the Collection By: • Storing the Information on a Device • Mapping the Block Storage to the Logical View • Allocating/Deallocating Storage • Providing File Directories • Strong Ties to Virtual Memory Management • What Abstractions are Presented to Programmer?

  5. View of Information Varies Based on Needs and Requirements • Applications Need Structured Perspective Via PL Primitives or Provided by OS • Translation to Byte-Streams Facilitates Storage to Media • Bi-directional • Still Must Organize Blocks on Storage Devices • Where Does Object Serialization Fit in? • [Figure: Information Structure, with Labels Applications, Records, Structured Record Files, Record-Stream Translation, Byte Stream Files, Stream-Block Translation, Storage Device]

  6. Low Level Files • [Figure: Byte Stream b0, b1, b2, ..., bi, ... Passed Through Stream-Block Translation] • Byte-Streams Translated to Block Streams • Byte-Stream is a Named Sequence of Bytes Indexed by Non-Negative Integers • File Pointer Maintains Position within Stream • After open Operation, File Pointer • References Next Byte to be Read/Written • Progresses Through Stream • May Move in Both Directions

  7. File Descriptors: Maintaining Detailed Information on Each File • External Name • Character String for the File Name • Symbolic Name Associated with File • Current State • Archived (Stored on Tape or Tertiary Media) • Closed, Open for Read, Write, Execute, etc. • Sharable: Read, Write, Execution Sharable • Owner: w.r.t. Process and/or User - Protection • User: Process(es) Accessing the Open File • Locks • Read Lock: Exclusivity of Read Access • Write Lock: Exclusivity of Write Access

  8. File Descriptors • Protection Settings: We’ll Discuss in Chapter 14 • Length: Number of Bytes • Time of Creation, Last Modification, Last Access • When File Created • When File Last Written To • When File Last Accessed • Reference Count • Number of Directories Referencing File • When Count is Zero, Remove File • Else, Remove Reference to File • Storage Device Details • Access Strategy for Blocks that Comprise File

  9. Unix File Descriptor: Referred to as inode (index) • Mode: Access Permissions for Owner and Users • UID: ID of User Creating File • Group ID: ID Associated with User's Group • Length in Bytes and Length in Blocks • Last Modification and Last Access • Last inode Modification • Reference Count: As with File Descriptors • Block References: Pointers and Indirect Pointers to the Blocks that Comprise the File • Example Mode Strings: drwxrwxrwx, drwx------, drwxr-xr-x, -rwxrwxrwx, -rw-------, -r--r--r--
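
The mode strings on this slide can be derived from the nine permission bits stored in the inode. The following is a minimal sketch, assuming the mode is available as an integer whose low nine bits are the owner/group/other read, write, and execute flags; the class name `ModeString` and its `format` method are hypothetical helpers, not part of any UNIX API.

```java
// Sketch: render the low nine permission bits of a UNIX mode as an ls-style string.
class ModeString {

    // isDirectory selects the leading 'd' or '-'; mode holds the rwx bits (e.g. 0755).
    static String format(boolean isDirectory, int mode) {
        char[] flags = {'r', 'w', 'x'};
        StringBuilder out = new StringBuilder(isDirectory ? "d" : "-");
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = ((mode >> bit) & 1) == 1;
            // Bits 8..6 are owner rwx, 5..3 group rwx, 2..0 other rwx.
            out.append(set ? flags[(8 - bit) % 3] : '-');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(true, 0755));
        System.out.println(format(false, 0600));
    }
}
```

Octal literals keep the three-bit groups readable: each octal digit corresponds to one rwx triple.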

  10. Byte Stream File Interface • Comprised of the Following Operations • fileID = open(fileName) • close(fileID) • read(fileID, buffer, length) • write(fileID, buffer, length) • seek(fileID, filePosition) • File Manager Maps Filenames to Collection of Physical Blocks on Storage Device • Device Drivers to Read/Write Blocks • write Operation Allocates Unused Blocks • delete Operation Deallocates Blocks • Approach Taken is Part of File Management Implementation Strategy
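
The operations above map closely onto `java.io.RandomAccessFile`. The sketch below is one possible illustration rather than the slide author's code: it opens a scratch file, writes a byte stream, moves the file pointer with `seek`, and reads back from that position. The class `ByteStreamDemo` and the method `writeSeekRead` are names invented for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: the open / write / seek / read / close cycle on a byte stream file.
class ByteStreamDemo {

    // Writes text to a scratch file, seeks to position, and reads length bytes back.
    static String writeSeekRead(String text, long position, int length) {
        try {
            Path path = Files.createTempFile("bytestream", ".dat");
            try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) { // open
                file.write(text.getBytes(StandardCharsets.US_ASCII));                // write
                file.seek(position);                                                 // seek
                byte[] buffer = new byte[length];
                file.readFully(buffer);                                              // read
                return new String(buffer, StandardCharsets.US_ASCII);
            } finally {
                // The try-with-resources block has already closed the file here.
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeSeekRead("file management", 5, 10));
    }
}
```

Because the file pointer advances on every read and write, the explicit `seek` is what allows the stream to be revisited without reopening the file.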

  11. Structured Files • [Figure: Records Passed Through Record-Block Translation] • File Logically Composed of Stream of Records • Who Does Conversion Between Record and Byte Stream? • Application Program • PL API (Java Object Serialization) • OS Abstraction • Unix Text Files • Printable ASCII chars Organized into Lines • Unix Commands wc, grep, diff, vi Operate on Text Files • wc Gives No Meaningful Line/Word Counts on Executable Code

  12. Record-Oriented Sequential Files • Manage & Store Set of Records on List • For Example, Mail Messages are Records with • Header, Sender, Subject, Receiver, Body • Manipulated by Editor, Mailer (Sender and Receiver), Browser, etc. • Structured Sequential File is a Named Sequence of Logical Records Indexed by Non-Negative Integers • Operations Include: • fileID = open(fileName) & close(fileID) • getRecord(fileID, record) & putRecord(fileID, record) • seek(fileID, position)

  13. Record-Oriented Sequential Files • Every Logical Record Must be Mapped to a Physical Byte Stream • H-Byte Header for Record Descriptor • k Bytes for Record Fields (Sum of Individual Fields) • Padding Used to Align Field Boundaries to 2, 4, or 8 Byte Increments • Block Translation to/from Record to Stream • Stream Must then be Mapped to Physical Storage • [Figure: Stream Laid Out as an H Byte Header Followed by a k Byte Logical Record, Repeated]
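
The header, fields, and padding layout can be sketched with `java.nio.ByteBuffer`. This is an illustration under stated assumptions: the header is taken to be H = 4 bytes holding the payload length k, the record is padded as a whole to a 4-byte boundary, and the single string payload stands in for the record fields. None of these choices come from the slides; `RecordCodec` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: map a logical record to bytes as [H-byte header][k bytes of fields][padding].
class RecordCodec {
    static final int HEADER_BYTES = 4; // assumed H: an int holding the payload length k
    static final int ALIGNMENT = 4;    // assumed padding increment

    // Total encoded size of a record with k payload bytes, rounded up to the alignment.
    static int paddedSize(int k) {
        int unpadded = HEADER_BYTES + k;
        return (unpadded + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    static byte[] encode(String payload) {
        byte[] fields = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(paddedSize(fields.length));
        buffer.putInt(fields.length); // header
        buffer.put(fields);           // record fields
        return buffer.array();        // any remaining bytes stay zero and act as padding
    }

    static String decode(byte[] record) {
        ByteBuffer buffer = ByteBuffer.wrap(record);
        int k = buffer.getInt();      // header says how many bytes are payload
        byte[] fields = new byte[k];
        buffer.get(fields);
        return new String(fields, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] encoded = encode("hello");
        System.out.println(encoded.length + " bytes, payload " + decode(encoded));
    }
}
```

Storing the length in the header is what lets the reader skip the padding without knowing the record layout in advance.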

  14. Record-Oriented Sequential Files • Mapping to Physical Storage Requires Individual and Sets of Fields to be Organized into Secondary Storage Blocks • Possibility of Fragmentation • [Figure: Stream of H Byte Headers and k Byte Logical Records Mapped onto Physical Storage Blocks, with a Fragment Left Over]

  15. Indexed Sequential File • Suppose We Want to Directly Access Records • For Example, in ATM, Access Individual Accounts without Searching Through all Accounts • Indexed Sequential File Adds an Index to the File • Data Structure Allows Translation from Index to Specific Record • Application Must Maintain Table to Map Key Value (Bank Acct #s) to Record Indices • Operations Include: • fileID = open(fileName) & close(fileID) • getRecord(fileID, index) • index = putRecord(fileID, record) • deleteRecord(fileID, index)
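
The division of labor described above (the file translates an index to a record, the application maps key values to indices) can be sketched in memory. The class `IndexedFileSketch` is hypothetical, the record contents are made up, and a real implementation would store records on secondary storage rather than in a list; the account numbers are the ones shown on the following slide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: an in-memory stand-in for an indexed sequential file.
class IndexedFileSketch {
    private final List<String> records = new ArrayList<>();

    // index = putRecord(record): append and report where the record landed.
    int putRecord(String record) {
        records.add(record);
        return records.size() - 1;
    }

    // getRecord(index): direct access, no scan through earlier records.
    String getRecord(int index) {
        return records.get(index);
    }

    // deleteRecord(index): leave a hole so other indices stay valid.
    void deleteRecord(int index) {
        records.set(index, null);
    }

    // Application side: a table from key value (account number) to record index.
    static String lookupDemo() {
        IndexedFileSketch file = new IndexedFileSketch();
        Map<String, Integer> accountToIndex = new HashMap<>();
        accountToIndex.put("012345", file.putRecord("balance=100"));
        accountToIndex.put("123456", file.putRecord("balance=250"));
        accountToIndex.put("294376", file.putRecord("balance=75"));
        return file.getRecord(accountToIndex.get("123456"));
    }

    public static void main(String[] args) {
        System.out.println(lookupDemo());
    }
}
```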

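  16. Indexed Sequential File (continued) • [Figure: Application Table Maps Account Numbers (012345, 123456, 294376, ..., 529366, ..., 965987) to Record Indices; Lookups index = i, index = k, and index = j Select Individual Records in the File]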

  17. Additional Abstract Files • Inverted Files • Index for Each Datum in the File • Access Based on Different Key Fields • Databases: CSE255 • More Elaborate Indexing Mechanism • DDL & DML • Multimedia Storage and Usage • Records Contain Radically Different Types • Access Methods Must Be General • Evolving Media (DVD, Video, Audio) Managed by Application Level Software • What are Appropriate OS File Abstractions for Real-Time Video, Audio, DVD, Web Sites?

  18. Implementing Low Level Files • Secondary Storage Device Contains: • Volume Directory (Sometimes a Root Directory for a File System) • External File Descriptor for Each File • Contents of Files • Manages Blocks • Assigns Blocks to Files (Descriptor Keeps Track) • Keeps Track of Available Blocks • Maps File To/From Byte Stream

  19. An open Operation • Multi-Step Process • Locate the External File Descriptor • Extract Information Needed to Read/Write File • Authenticate that Process Can Access File (to be Discussed in Chapter 14) • Create Internal File Descriptor in Primary Memory • Create Entry in a “Per Process” Open File Status Table • Allocate Resources (Buffers) for File Usage • close Operation Completes All Pending Operations, Releases I/O Buffers, Frees Locks, Updates External File Descriptor, Deallocates File Status Table Entry
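
The steps listed above can be modeled with a toy file manager. The sketch below is illustrative only: every class, field, and method name is hypothetical, the access check is reduced to an owner comparison, and nothing is read from a real device.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the open/close steps; not a real file system.
class OpenSketch {
    // External descriptor as kept on the device (two illustrative fields only).
    static final class ExternalDescriptor {
        final String owner;
        final long length;
        ExternalDescriptor(String owner, long length) {
            this.owner = owner;
            this.length = length;
        }
    }

    // Internal descriptor created in primary memory when the file is opened.
    static final class OpenFile {
        final ExternalDescriptor descriptor;
        final byte[] buffer = new byte[4096]; // buffer allocated for file usage
        long position = 0;                    // file pointer
        OpenFile(ExternalDescriptor descriptor) {
            this.descriptor = descriptor;
        }
    }

    private final Map<String, ExternalDescriptor> volumeDirectory = new HashMap<>();
    private final List<OpenFile> openFileTable = new ArrayList<>(); // per-process table

    void create(String name, String owner, long length) {
        volumeDirectory.put(name, new ExternalDescriptor(owner, length));
    }

    int open(String name, String user) {
        ExternalDescriptor descriptor = volumeDirectory.get(name);   // locate external descriptor
        if (descriptor == null) {
            throw new IllegalArgumentException("no such file: " + name);
        }
        if (!descriptor.owner.equals(user)) {                         // simplified access check
            throw new SecurityException(user + " may not open " + name);
        }
        OpenFile entry = new OpenFile(descriptor);                    // internal descriptor + buffer
        for (int fid = 0; fid < openFileTable.size(); fid++) {        // reuse the lowest free slot
            if (openFileTable.get(fid) == null) {
                openFileTable.set(fid, entry);
                return fid;
            }
        }
        openFileTable.add(entry);
        return openFileTable.size() - 1;
    }

    void close(int fid) {
        openFileTable.set(fid, null); // deallocate the open file table entry
    }

    public static void main(String[] args) {
        OpenSketch manager = new OpenSketch();
        manager.create("fileA", "steve", 100);
        System.out.println("fid = " + manager.open("fileA", "steve"));
    }
}
```

Returning the lowest free slot mirrors the usual descriptor-table behavior, so a slot released by `close` is handed out again by the next `open`.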

  20. Opening a UNIX File • fid = open("fileA", flags); … read(fid, buffer, len); • Changes to In-Memory inode are Copied Back to Secondary Storage Either Periodically or When File is Closed • [Figure: Descriptor Table (0 stdin, 1 stdout, 2 stderr, 3 ...) References File Structure, which References inode]

  21. Programming Language Abstractions • Programming Languages Offer Many Abstractions for Software Engineers • Structures and Records • Object Oriented Classes • Collections and Lists • C++ Pioneered Abstractions that Blurred Line Between PL and OS Capabilities • C++ API to Store Instance to File • Reverse Also Supported • Forerunner of Object Serialization • What do Languages Like Java Offer us Today?

  22. Object Serialization in Java • Object Serialization is the Process of Reading and Writing Objects • Bi-directional Process of Write (Save in Serialized form) and Read (Reconstruct from Serialized form) • ObjectInputStream and ObjectOutputStream are used for Reading and Writing Objects • Used in: • Remote Method Invocation (RMI) • Lightweight Persistence - Archival for Use in a Later Invocation of a Program • Exchange of Information Across Network • Agent-Based/Aglet Computing

  23. Using Object Serialization • Straightforward Process in Java • Serialize Objects by ... • Writing Objects to an ObjectOutputStream • Deserialize Objects by ... • Reading Objects using ObjectInputStream • Design/Develop Classes that Promote the Serialization/Deserialization of Instances • Serialize/Deserialize at Topmost Level: • Automatically Includes Component Instances, Set and Collection Instances, User-Defined Class Instances, etc. • Whatever is Declared with Class and Active within Instance

  24. Writing to an ObjectOutputStream • The Following Code Segment Serializes the Date Object FileOutputStream out = new FileOutputStream("theTime"); ObjectOutputStream s = new ObjectOutputStream(out); s.writeObject("Today"); s.writeObject(new Date()); s.flush(); • Serializes the Object to a File named “theTime”

  25. Reading From an ObjectInputStream • The Following Code Segment Reconstructs by Deserializing the Date Object: FileInputStream in = new FileInputStream("theTime"); ObjectInputStream s = new ObjectInputStream(in); String today = (String)s.readObject(); Date date = (Date)s.readObject(); • Object and its Components (if any) are Read from the File “theTime” • If Multiple Independent Objects were Written, Objects must be Read in the Same Order • Return Value from readObject has to be Cast to a Specific Type
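
The two code segments on the preceding slides omit imports, stream closing, and exception handling. A self-contained variant might look like the sketch below, which writes the same two objects and reads them back in the same order. It uses a temporary file instead of the slide's "theTime", and the class `TimeRoundTrip` is a name introduced here for illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Date;

// Sketch: serialize a String and a Date to a file, then deserialize them in the same order.
class TimeRoundTrip {

    static Object[] roundTrip(Date original) {
        try {
            Path path = Files.createTempFile("theTime", ".ser");
            try {
                try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
                    out.writeObject("Today");
                    out.writeObject(original);
                } // closing the stream also flushes it
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
                    String label = (String) in.readObject(); // same order as written
                    Date date = (Date) in.readObject();      // the cast is required
                    return new Object[] {label, date};
                }
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object[] result = roundTrip(new Date());
        System.out.println(result[0] + ": " + result[1]);
    }
}
```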

  26. Utilizing Object Serialization • Serialization/Deserialization Occurs via the Implementation of the Java Serializable Interface • An Object is Serializable only if its Class Implements the Serializable Interface • Serialization Utilizes Exception Handling • For Example, writeObject Method Throws a NotSerializableException if the given Object is not serializable • Serializable is an Empty Interface • Does not Contain Any Method Declarations • Identifies Classes whose Objects are Serializable

  27. Implementing the Serializable Interface • The Serializable Interface • public interface Serializable { // there's nothing in here!}; • To Make Instances of a Class Serializable, Add implements Serializable to the Class Definition • public class MySerializableClass implements Serializable { … } • User does not have to Write any Methods • Serialization of Objects is Handled by the defaultWriteObject Method of the ObjectOutputStream

  28. defaultWriteObject Method • The defaultWriteObject Method is Defined in ObjectOutputStream Class • defaultWriteObject Writes all Necessary Details to Reconstruct an Instance of the Class • Class of the Object • Class Signature • Values of Non-Transient and Non-Static Members Including References to Other Contained Objects • In Turn, Contained Objects, their Classes, Signatures, Values, etc., are also Written • Process Continues in a Logically Recursive Fashion
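
The rule that only non-transient, non-static members are written can be observed directly. The following sketch serializes an instance to an in-memory byte array and reads it back; the `Account` class, its fields, and the sample values are hypothetical and exist only to show which members survive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch: default serialization keeps ordinary fields and skips transient ones.
class SerializableSketch {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String number;        // ordinary field: written
        transient String pin; // transient field: skipped, restored as null

        Account(String number, String pin) {
            this.number = number;
            this.pin = pin;
        }
    }

    // Serializes the account to bytes and reconstructs a new instance from them.
    static Account copyViaSerialization(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = copyViaSerialization(new Account("012345", "9999"));
        System.out.println(copy.number + " / " + copy.pin);
    }
}
```

Marking a field `transient` is the usual way to keep sensitive or derivable state out of the serialized form.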

  29. Block Management • Assigning Storage Blocks to the File • Fixed-Size Blocks of k Bytes • File of Length m Requires N = ⌈m/k⌉ Blocks • Byte bi is Stored in Block ⌊i/k⌋ • File Manager has Three Basic Strategies for Allocating and Managing Blocks: • Contiguous Allocation • Linked Lists • Indexed Allocation
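
The block arithmetic can be written out with integer division. A file whose length is not a multiple of k still needs a final, partly filled block, so the block count rounds up, while the block holding a given byte rounds down. `BlockMath` is a hypothetical helper class for this sketch.

```java
// Sketch: block counts and byte-to-block mapping for fixed-size blocks of k bytes.
class BlockMath {

    // Number of blocks for a file of m bytes: ceiling of m / k.
    static long blocksNeeded(long fileLength, long blockSize) {
        return (fileLength + blockSize - 1) / blockSize;
    }

    // Logical block that holds byte i: floor of i / k.
    static long blockOf(long byteIndex, long blockSize) {
        return byteIndex / blockSize;
    }

    // Position of byte i inside its block.
    static long offsetInBlock(long byteIndex, long blockSize) {
        return byteIndex % blockSize;
    }

    public static void main(String[] args) {
        System.out.println(blocksNeeded(10000, 4096) + " blocks");
        System.out.println("byte 5000 is in block " + blockOf(5000, 4096)
            + " at offset " + offsetInBlock(5000, 4096));
    }
}
```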

  30. Contiguous Allocation • Maps the N Logical Blocks of File into N Contiguous Blocks on Secondary Storage Device • Difficult to Support Dynamic File Sizes • If File Grows, Rewrite File to Secondary Store • Utilizes Best-Fit, First-Fit, and Worst-Fit • Fragments Physical Disk Space • Resulting Contiguous Blocks Too Small to Hold Files • [Figure: File Descriptor with Head Position 237, …, First Block 785, Number of Blocks 25]
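
First-fit and best-fit differ only in which sufficiently large free region they pick. The sketch below assumes the free space is tracked as a list of extents, each given as a first block and a length in blocks; that representation and the class name `ContiguousAllocator` are assumptions made for illustration.

```java
// Sketch: choosing a free extent for a file that needs N contiguous blocks.
class ContiguousAllocator {

    // Each free extent is {firstBlock, lengthInBlocks}. Returns the first block
    // of the first extent that is large enough, or -1 if none fits.
    static int firstFit(int[][] freeExtents, int blocksNeeded) {
        for (int[] extent : freeExtents) {
            if (extent[1] >= blocksNeeded) {
                return extent[0];
            }
        }
        return -1;
    }

    // Returns the first block of the smallest extent that is large enough, or -1.
    static int bestFit(int[][] freeExtents, int blocksNeeded) {
        int bestStart = -1;
        int bestLength = Integer.MAX_VALUE;
        for (int[] extent : freeExtents) {
            if (extent[1] >= blocksNeeded && extent[1] < bestLength) {
                bestStart = extent[0];
                bestLength = extent[1];
            }
        }
        return bestStart;
    }

    public static void main(String[] args) {
        int[][] free = {{0, 10}, {50, 30}, {100, 25}};
        System.out.println("first fit: " + firstFit(free, 25));
        System.out.println("best fit:  " + bestFit(free, 25));
    }
}
```

With these extents, first-fit carves the 25-block file out of the 30-block region and leaves a 5-block remainder, which is the kind of leftover that accumulates as fragmentation.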

  31. Linked Lists • Each Block Contains a Header With • Number of Bytes in the Block - Allows Storage of Variable Length Blocks • Pointer to Next Block • Blocks Need Not Be Contiguous • Files Can Expand and Contract • Seeks Can Be Slow, Since Reaching a Block Requires Following the Chain from the Head • [Figure: File Descriptor with Head: 417 References First Block; Block 0, Block 1, ..., Block N-1 Each Hold a Length Field and Byte 0 ... Byte 4095]

  32. Indexed Allocation • Extract Headers and Put Them in an Index • Simplify Seeks • May Link Indices Together (for Large Files) • [Figure: File Descriptor with Head: 417 References an Index Block Holding a Length Entry per Block; Entries Reference Block 0, Block 1, ..., Block N-1, Each Holding Byte 0 ... Byte 4095]
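
The difference in seek cost between linked and indexed allocation shows up clearly in a small model. In this sketch the disk is reduced to an array of next-block pointers (for the linked case) and an array of block numbers (for the index block); the block numbers and the class name `AllocationSeek` are invented for the example.

```java
// Sketch: locating logical block n under linked versus indexed allocation.
class AllocationSeek {

    // Linked allocation: next[b] is the block that follows block b, or -1 at the end.
    // Reaching logical block n means following n pointers from the head.
    static int linkedSeek(int head, int[] next, int logicalBlock) {
        int block = head;
        for (int hop = 0; hop < logicalBlock; hop++) {
            block = next[block];
        }
        return block;
    }

    // Indexed allocation: the index block lists the physical block of every logical block.
    static int indexedSeek(int[] indexBlock, int logicalBlock) {
        return indexBlock[logicalBlock];
    }

    public static void main(String[] args) {
        // A four-block file stored in physical blocks 4 -> 1 -> 6 -> 2.
        int[] next = {-1, 6, -1, -1, 1, -1, 2, -1};
        int[] index = {4, 1, 6, 2};
        System.out.println("linked:  " + linkedSeek(4, next, 2));
        System.out.println("indexed: " + indexedSeek(index, 2));
    }
}
```

In a real system each pointer hop in the linked case is a disk read, whereas the indexed lookup needs only the index block.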

  33. UNIX Files • inode Holds mode, owner, …, Direct block 0 Through Direct block 11, Single Indirect, Double Indirect, and Triple Indirect Pointers • Direct Pointers Reference Data Blocks; Indirect Pointers Reference Index Blocks that Lead to Data Blocks • With 4K Blocks, Direct Blocks Hold 4K x 12 = 48K • For Single Indirect: 1000 Indices/Block, 4K x 1000 = 4M • For Double Indirect: 4K x 1,000,000 • For Triple Indirect: 4K x ????????
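
The capacity figures on this slide follow from multiplying the block size by the number of data blocks each pointer level can reach. The sketch below repeats that arithmetic using the slide's rounded figure of 1000 indices per block (a 4K block of 4-byte pointers would actually hold 1024); `InodeCapacity` is a hypothetical helper.

```java
// Sketch: maximum file size reachable through direct and indirect inode pointers.
class InodeCapacity {

    static long maxBytes(long blockSize, long pointersPerBlock, int directBlocks) {
        long single = pointersPerBlock;            // data blocks via the single indirect block
        long doubly = single * pointersPerBlock;   // data blocks via the double indirect block
        long triple = doubly * pointersPerBlock;   // data blocks via the triple indirect block
        return blockSize * (directBlocks + single + doubly + triple);
    }

    public static void main(String[] args) {
        // 4K blocks, 12 direct pointers, 1000 indices per block as on the slide.
        System.out.println(maxBytes(4096, 1000, 12) + " bytes");
    }
}
```

The triple indirect level dominates the total: under these assumptions it alone reaches 1000 x 1000 x 1000 data blocks.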

  34. Unallocated Blocks • How Should Unallocated Blocks Be Managed? • Proposal of a Free List for Unused Blocks • Need a Data Structure to Keep Track of Them • Linked List • Very Large • Hard to Manage Spatial Locality • Block Status Map (“Disk Map”) • Bit Per Block • Easy to Identify Nearby Free Blocks • Useful for Disk Recovery • 1GB Disk with 4K Blocks ==> 256K Entries in Map • Each Entry 1 bit ==> 32K Bytes Required • Maintained in Primary Memory
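
A block status map is naturally expressed with `java.util.BitSet`. The sketch below keeps one bit per block, allocates the first free block at or after a hint (which is how a bitmap makes nearby free blocks easy to find), and reproduces the slide's size estimate. The class `DiskMap` and its methods are names chosen for this illustration.

```java
import java.util.BitSet;

// Sketch: a block status map with one bit per disk block (set = allocated).
class DiskMap {
    private final BitSet allocated = new BitSet();
    private final int blockCount;

    DiskMap(int blockCount) {
        this.blockCount = blockCount;
    }

    // Size of the map in bytes: one bit per block, eight bits per byte.
    static long mapBytes(long diskBytes, long blockSize) {
        return diskBytes / blockSize / 8;
    }

    // Allocates the first free block at or after hint, wrapping to block 0.
    // Returns -1 when the disk is full.
    int allocateNear(int hint) {
        int block = allocated.nextClearBit(hint);
        if (block >= blockCount) {
            block = allocated.nextClearBit(0);
        }
        if (block >= blockCount) {
            return -1;
        }
        allocated.set(block);
        return block;
    }

    void free(int block) {
        allocated.clear(block);
    }

    public static void main(String[] args) {
        System.out.println(mapBytes(1L << 30, 4096) + " bytes of map for a 1GB disk");
    }
}
```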

  35. Managing the Byte Stream • Implementation of Byte Stream on Top of Contiguous Set of Blocks • Packing and Unpacking Blocks • Must Read-Ahead on Input (Convert Secondary Storage Blocks to Byte Stream - Unpack) • Must Write-Behind on Output (Convert Byte Stream to Secondary Storage Blocks - Pack) • Seek (Locate a Block for Particular Operation) • Inserting/Deleting Bytes in the Interior of the Stream • Block I/O • Buffer Several Blocks • Memory Mapped Files

  36. Memory Mapped Files • Key Concept for Virtual Memory Management • Multiple Active Processes Each with Own Page Table • Processes Allowed to Share Files During Execution • File Descriptor Maps Blocks of File that Have Been Loaded to Primary Memory Addresses • Blocks A, B, C, D Map to Page Frames I, J, K, L • Page Table for Process P Can Reference Page Frames I and K • Page Table for Process Q Can Reference Page Frames I, J, and L • Reference Counts for Blocks Critical
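
Java exposes memory-mapped files through `FileChannel.map`. The sketch below maps a region of a scratch file, stores bytes into the mapped buffer as if it were ordinary memory, and then reads the file back through the normal file API to show that the contents reached the file. It illustrates the mapping mechanism only, not the cross-process sharing described above; `MappedFileDemo` is a hypothetical name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: write to a file through a memory mapping, then read it back normally.
class MappedFileDemo {

    static String writeThroughMapping(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            Path path = Files.createTempFile("mapped", ".dat");
            path.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping in READ_WRITE mode grows the file to the mapped size.
                MappedByteBuffer region =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
                region.put(bytes); // a plain memory store, no explicit write call
                region.force();    // ask for the changes to be written to the file
            }
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThroughMapping("page frame"));
    }
}
```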

  37. Directories • A Set of Logically Associated Files and Sub Directories • File Manager Provides Set of Controls: • Enumerate: Return File List and Nested Directories • Copy: Duplicate a File • Rename: Change Symbolic Name of File • Delete: Remove File from Directory • Release All Blocks and File Descriptor • Traverse: Maintain at Least Hierarchical Structure and Ability to Navigate • Unix Supports Graphs (Symbolic Links) • Does NT and Win98 Also?
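
Most of the controls listed above have direct counterparts in `java.nio.file.Files`. The sketch below runs copy, rename, delete, and enumerate against a temporary directory; the file names and the class `DirectoryOps` are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Sketch: directory operations (copy, rename, delete, enumerate) on a scratch directory.
class DirectoryOps {

    // Returns the sorted entry names left in the directory after the operations.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("dirdemo");
            Path a = Files.write(dir.resolve("a.txt"), "data".getBytes(StandardCharsets.UTF_8));
            Files.copy(a, dir.resolve("b.txt"));        // copy: duplicate a file
            Files.move(a, dir.resolve("c.txt"));        // rename: change the symbolic name
            Files.delete(dir.resolve("b.txt"));         // delete: remove file from directory
            Files.createDirectory(dir.resolve("sub"));  // a nested directory

            List<String> names = new ArrayList<>();
            try (Stream<Path> entries = Files.list(dir)) { // enumerate files and subdirectories
                entries.forEach(entry -> names.add(entry.getFileName().toString()));
            }
            Collections.sort(names);

            // Clean up the scratch directory.
            Files.delete(dir.resolve("c.txt"));
            Files.delete(dir.resolve("sub"));
            Files.delete(dir);
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```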

  38. Directory Structures • How Should Files Be Organized Within Directory? • Flat Name Space: All Files Appear in a Single Directory • Hierarchical Name Space • Directory Contains Files and Subdirectories • Each File/directory Appears As an Entry in Exactly One Other Directory -- a Tree • Popular Variant: All Directories Form a Tree, but a File Can Have Multiple Parents • Hence, Possibility of Graph for Directories • Dominance of Directory Structure in Other Tools • Email Folders in Netscape • Organizing Designs in Rational Rose

  39. Directory Implementation • Device Directory • A Device Can Contain a Collection of Files • Easier to Manage If There is a Root for Every File on the Device -- the Device Root Directory • File Directory • Typical Implementations Have Directories Implemented as a File With a Special Format • Entries in a File Directory are Handles for Other Files • Other Files Can be Files or Subdirectories • Symbolic Linkage Possible (Graphs) • Visibility of Directory Structure Controlled by Protection and Security Mechanism

  40. Apple, DOS, and Unix Directories • Apple • Finder/Multifinder is Directory Manager • Tree Structured Directories with DeskTop • Directories as Folders/Drag and Drop/Trash • DOS • Text Oriented Interface • Relative (autoexec.bat) and Absolute Paths • Path Operator (“\”) for Given Disk Drive • Unix • Text Oriented Upgraded to CDE/Linux/etc. • Subdirectories for User Accounts • Mount of Physical Devices within “/” or Other Directory

  41. UNIX mount Command • [Figure: Directory Trees Before and After a Mount; Node Labels: /, bin, usr, etc, foo, bill, nutt, bar, abc, cde, xyz, blah; Caption: Mount bar at foo]

  42. Concluding Remarks/Looking Ahead • Review of File Management Concepts/Functions • File Abstractions & Implementation Strategies that Impact File Management and Manager • Directory Structures to Facilitate Organization of Files by Users • Interesting Exercises in Section 13.6 • Problem 9: Suppose Disk Block Holds 2K Disk Addresses and 4K Blocks - Calculate Max File Size • Problem 10: Cause of Limitation of 32MB of Disk Capacity in DOS? • Looking Ahead to … • Protection and Security (Chapter 14) • Impact of Networks on OS (Chapter 15)
