CS246 Data & File Structures
Lecture 1: Introduction to File Systems
Instructor: Li Ma
Office: NBC 126
Phone: (713) 313-7028
Email: malx@tsu.edu
Webpage: http://itscience.tsu.edu/ma
Department of Computer Science, Texas Southern University, Houston
January, 2007
Motivation
• Why do we need file structure design?
• What are most computers used for?
• Examples:
  • Editing documents
  • Browsing the Internet or composing email
  • Programming, etc.
• Data processing: storage, organization, access, and operations on data
by Li Ma, TSU - cs344
Computer Architecture
[Diagram: Main Memory (data is manipulated here) <-> Data Transfer <-> Secondary Memory (data is stored here)]
Advantages of Memories
• Programs are executed in main memory because it is fast
• Data is stored in secondary memory because it is:
  • Stable: data is not lost during power failures
  • Affordable: large and cheap
Disadvantages of Memories
• Main memory is small and expensive
• Many programs are too large to fit in main memory
• Main memory is volatile: data is lost during a power failure
• Secondary storage is slow (roughly 10,000 times slower than main memory)
Solution
• Secondary storage provides reliable, long-term storage for large volumes of data
• The data we are currently interested in (a small portion of all data) is kept in main memory, where it can be manipulated and processed rapidly
• Data can be transferred automatically between main memory and secondary storage
Problem with the Approach
• Transferring data between main memory and secondary memory is slow because secondary memory is slow
• An important aspect of file system management is to minimize the amount of data transferred and to eliminate unnecessary transfers
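One common way to reduce the number of transfers is to read a whole block of records per request instead of one record per request. The sketch below is only an illustration: `CountingDisk`, `RECORD_SIZE`, and `BLOCK_SIZE` are hypothetical names and sizes, and an in-memory buffer stands in for the slow device.

```python
import io

RECORD_SIZE = 16   # hypothetical record size in bytes
BLOCK_SIZE = 64    # hypothetical block size: 4 records per block


class CountingDisk:
    """Stand-in for a slow device: counts every read request it serves."""

    def __init__(self, data):
        self._f = io.BytesIO(data)
        self.transfers = 0

    def read_at(self, offset, size):
        self.transfers += 1
        self._f.seek(offset)
        return self._f.read(size)


def scan_unbuffered(disk, n):
    # One transfer per record.
    return [disk.read_at(i * RECORD_SIZE, RECORD_SIZE) for i in range(n)]


def scan_buffered(disk, n):
    # One transfer per block; records are then sliced out of the in-memory block.
    records = []
    per_block = BLOCK_SIZE // RECORD_SIZE
    for first in range(0, n, per_block):
        block = disk.read_at(first * RECORD_SIZE, BLOCK_SIZE)
        for j in range(0, len(block), RECORD_SIZE):
            records.append(block[j:j + RECORD_SIZE])
    return records[:n]
```

With 16 records, the unbuffered scan issues 16 read requests and the buffered scan issues 4, while both return the same records.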
File Systems
• Examples
  • Telephone book (primary index: name?)
  • Library (primary index: number; secondary indices: author, title, and subject)
• A file system provides a convenient method for organizing and storing files
File Systems (cont)
• File systems are software programs that allow us to efficiently organize data and operate on data
• Data is organized into files on hard disks or other physical media
• A file is divided into records
• Records are composed of fields
Example of a File
• A student file may be a collection of student records, one record for each student
• Each student record may have several fields, such as:
  • Name
  • Address
  • Student number
  • Sex
  • Age
  • Grade point average (GPA)
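The student record above can be sketched as a fixed-length byte layout. The field widths and types below are assumptions chosen for illustration (the slides do not specify any), and `pack_student` / `unpack_student` are hypothetical helper names.

```python
import struct

# Assumed layout (not from the slides), little-endian with no padding:
# name 20 bytes, address 30 bytes, student number uint32,
# sex 1 byte, age uint8, GPA float32
STUDENT_FORMAT = "<20s30sIcBf"
RECORD_SIZE = struct.calcsize(STUDENT_FORMAT)


def pack_student(name, address, number, sex, age, gpa):
    """Turn the fields of one student into one fixed-length record."""
    return struct.pack(
        STUDENT_FORMAT,
        name.encode("ascii"),      # padded with zero bytes up to 20
        address.encode("ascii"),   # padded with zero bytes up to 30
        number,
        sex.encode("ascii"),
        age,
        gpa,
    )


def unpack_student(raw):
    """Split one fixed-length record back into its fields."""
    name, address, number, sex, age, gpa = struct.unpack(STUDENT_FORMAT, raw)
    return {
        "name": name.rstrip(b"\x00").decode("ascii"),
        "address": address.rstrip(b"\x00").decode("ascii"),
        "number": number,
        "sex": sex.decode("ascii"),
        "age": age,
        "gpa": gpa,
    }
```

Under this assumed layout every record occupies the same number of bytes, so a file of students is simply these records written one after another.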
Structure of a File
• Typically, each record in a file has the same fields
• Files are large and are stored in secondary storage
• The records we are currently interested in are copied to main memory
• Organizing the records of a file and retrieving the records we are interested in are the main topics of this course
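When every record has the same fields and the same length, the record of interest can be copied into main memory by computing its byte offset and seeking there, without reading the rest of the file. This is a minimal sketch under that assumption; the two-field layout and the function names are hypothetical.

```python
import struct

# Assumed layout (not from the slides): student number uint32 + 20-byte name
RECORD_FORMAT = "<I20s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def write_records(f, records):
    """Append (number, name) pairs to a binary file object as fixed-length records."""
    for number, name in records:
        f.write(struct.pack(RECORD_FORMAT, number, name.encode("ascii")))


def read_record(f, position):
    """Copy only the record at the given position (0-based) into memory."""
    f.seek(position * RECORD_SIZE)
    raw = f.read(RECORD_SIZE)
    if len(raw) < RECORD_SIZE:
        raise IndexError("no record at position %d" % position)
    number, name = struct.unpack(RECORD_FORMAT, raw)
    return number, name.rstrip(b"\x00").decode("ascii")
```

Any seekable binary file object works here, for example one opened with `open(path, "r+b")` or an `io.BytesIO` buffer for experimentation.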
File Properties
• Persistence: Data written into a file persists after the program stops, so the data can be used later
• Sharability: Data stored in files can be shared by many programs and users simultaneously
• Large size: Data files can be so large that they typically cannot fit into main memory
File Structure Design
• A file structure is a combination of representations for data in files and operations for accessing the data
• The tension between a disk’s relatively slow access time and its enormous, non-volatile capacity is the driving force behind file structure design
File Structures
• Sequential file structures: files on tape
• Indexed file structures: files on disk, with keys and pointers
• Tree structures: AVL tree, B-tree, B+ tree
• Hash file re-organization
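As an illustration of an indexed file structure, the sketch below builds an in-memory index that maps each key to a pointer, here the byte offset of the record in the data file. The record format (one record per line, fields separated by `|`, key in the first field) is an assumption made for this example, and `build_index` / `fetch` are hypothetical names.

```python
def build_index(f):
    """Scan the data file once and map each key to the byte offset of its record."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        key = line.split(b"|", 1)[0].decode("ascii")
        index[key] = offset
    return index


def fetch(f, index, key):
    """Follow the pointer stored in the index and read just that record."""
    f.seek(index[key])
    return f.readline().rstrip(b"\n").decode("ascii").split("|")
```

The index is small compared with the data file, so it can stay in main memory while the records themselves remain on disk.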
Course Content & Outline
• This course covers data processing from a computer science perspective:
  • Storage of data: storage devices (disk and tape)
  • Organization of data: file organization
    • sequential (tape)
    • direct (hashing)
    • indexed sequential (B-trees)
    • multi-key (secondary indices)
  • Access to data: file systems
  • Processing of data: database systems
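To give a first feel for direct (hashed) organization, the sketch below computes a record's home address from its key and resolves collisions by probing the next slot. The slot count, the modulo hash function, and linear probing are all assumptions chosen for illustration, and a Python list stands in for the fixed record slots of a file.

```python
NUM_SLOTS = 7  # hypothetical number of record slots in the file


def home_address(key):
    """Hash an integer key (e.g. a student number) to a slot number."""
    return key % NUM_SLOTS


def insert(slots, key, record):
    """Store the record at its home address, or the next free slot after it."""
    start = home_address(key)
    for step in range(NUM_SLOTS):
        addr = (start + step) % NUM_SLOTS
        if slots[addr] is None or slots[addr][0] == key:
            slots[addr] = (key, record)
            return addr
    raise RuntimeError("file is full")


def lookup(slots, key):
    """Probe from the home address until the key or an empty slot is found."""
    start = home_address(key)
    for step in range(NUM_SLOTS):
        addr = (start + step) % NUM_SLOTS
        if slots[addr] is None:
            return None
        if slots[addr][0] == key:
            return slots[addr][1]
    return None
```

Starting from `slots = [None] * NUM_SLOTS`, a lookup usually touches only one or two slots, which is what makes direct access attractive when each slot access costs a disk transfer.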