Connecting with Computer Science, 2e

wilmer

Presentation Transcript


  1. Connecting with Computer Science, 2e Chapter 10 File Structures

  2. Objectives • In this chapter you will: • Learn what a file system does • Understand the FAT file system and its advantages and disadvantages • Understand the NTFS file system and its advantages and disadvantages • Compare common file systems • Learn how sequential and random file access work • See how hashing is used • Understand how hashing algorithms are created

  3. Why You Need to Know About...File Structures • Knowledge of how an operating system stores and maintains data in a computer • Allows better comprehension of how a computer handles and manipulates files • Allows the computer to run as efficiently as possible

  4. What Does a File System Do? • Responsibilities • Creating, manipulating, renaming, copying, and removing files to and from a storage device • Organizing files into common storage units • Called directories • Keeping track of file and directory locations • Assisting users • Relate files and folders to the physical structure of the storage medium

  5. What Does a File System Do? (cont’d.) • Files used by operating systems and applications • Word-processing documents • Source code for programs you have written • Music files • Movie files • Spreadsheets • Photos • Operating systems use a file folder icon to represent a directory

  6. What Does a File System Do? (cont’d.) Figure 10-1, Files and directories in a file system are similar to documents and folders in a filing cabinet

  7. What Does a File System Do? (cont’d.) Figure 10-2, Folders and files in Windows

  8. What Does a File System Do? (cont’d.) • Hard disk • Most common storage medium for a file system • Physically organized into tracks and sectors • Read/write heads move over specified areas of the hard disk to store (write) or retrieve (read) data • Random access device • Reads or writes data directly at any location on the disk • Faster than sequential access, which reads and writes from beginning to end • Makes use of the file system to organize files

  9. File Systems and Operating Systems • File management system • Dependent on the operating system • FAT (File Allocation Table) • Used from MS-DOS to Windows ME • NTFS (New Technology File System) • Default for Windows • Unix and Linux support several file systems • XFS, JFS, ReiserFS, ext3, others • Mac OS X file system • HFS and HFS+

  10. FAT • Groups hard drive sectors into clusters • Increases performance by organizing blocks of sectors contiguously • Maintains a relationship between files and clusters • Clusters have two entries in the FAT • Current cluster information • Link to next cluster or special code indicating the last cluster • Keeps track of writable clusters and bad clusters
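
The "link to next cluster" idea on the FAT slide can be illustrated with a toy model. This is only a sketch under stated assumptions: the table is a plain Python dictionary, and `END_OF_CHAIN` and `cluster_chain` are hypothetical names chosen for illustration, not the on-disk FAT format.

```python
# Toy model of a file allocation table (not the real on-disk layout).
# Key = cluster number, value = next cluster used by the same file.
END_OF_CHAIN = -1  # hypothetical marker; real FAT variants use reserved codes


def cluster_chain(fat, first_cluster):
    """Follow the links from a file's first cluster to its last one."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            # A loop in the links would mean the table is corrupted.
            raise ValueError("corrupted table: cluster chain loops")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# A file stored in clusters 2, then 5, then 3.
fat = {2: 5, 5: 3, 3: END_OF_CHAIN}
print(cluster_chain(fat, 2))
```

Because each entry only names the next cluster, a file's clusters do not have to sit next to each other on the disk.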

  11. FAT (cont’d.) Figure 10-3, Sectors are grouped into clusters on a hard disk

  12. FAT (cont’d.) • Hard drive organization • Partition boot sector • Contains information on how to access volumes • Main and backup FAT • If error in reading the main FAT, backup copied to main to ensure stability • Root directory • Contains entries for every file and folder in the directory • Data area • Measured in clusters

  13. FAT (cont’d.) Figure 10-4, Typical FAT file system

  14. Disk Fragmentation • File clusters scattered in different locations on the storage medium • Windows provides the Disk Defragmenter utility • Reorganizes clusters contiguously • Improves performance • Minimizes movement of the read/write heads • Use regularly to ensure system runs at peak performance

  15. Disk Fragmentation (cont’d.) Figure 10-5, Files become fragmented as they’re stored in noncontiguous clusters; a defragmenting utility moves files to contiguous clusters and improves disk performance
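
One rough way to picture fragmentation is to count how many separate contiguous runs a file's clusters form. The sketch below assumes a file is described simply by the ordered list of its cluster numbers; `count_fragments` is a hypothetical helper for illustration and is not how the Disk Defragmenter utility itself works.

```python
def count_fragments(clusters):
    """Count the contiguous runs in a file's ordered cluster list."""
    if not clusters:
        return 0
    fragments = 1
    for previous, current in zip(clusters, clusters[1:]):
        # A new fragment starts whenever the next cluster is not adjacent.
        if current != previous + 1:
            fragments += 1
    return fragments


print(count_fragments([2, 5, 3]))  # scattered clusters
print(count_fragments([2, 3, 4]))  # the same file after defragmenting
```

A file stored in a single run needs the least read/write head movement, which is the improvement the slides describe.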

  16. Advantages of FAT • Efficient use of disk space • Does not have to use contiguous space for large files • File names up to 255 characters (FAT32) • Easy to recover deleted files • On deletion, the system places E5h in the first position of the filename • File remains on drive • To recover, replace E5h with the original first letter of the filename
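
The E5h deletion marker described above can be sketched with a byte array standing in for a directory entry's filename field. This is a simplified illustration: the 11-byte name used here and the helper names `mark_deleted` and `undelete` are assumptions, and a real recovery tool would also have to check that the file's clusters have not been reused.

```python
DELETED_MARKER = 0xE5  # the E5h value mentioned on the slide


def mark_deleted(entry):
    """Simulate deletion: overwrite the first byte of the filename."""
    entry[0] = DELETED_MARKER


def undelete(entry, first_letter):
    """Simulate recovery: restore the original first letter."""
    if entry[0] != DELETED_MARKER:
        raise ValueError("entry is not marked as deleted")
    entry[0] = ord(first_letter)


entry = bytearray(b"REPORT  TXT")  # hypothetical filename field
mark_deleted(entry)
print(entry)   # first byte is now 0xE5, the rest is untouched
undelete(entry, "R")
print(entry)
```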

  17. Disadvantages of FAT • Performance slows down as more files are stored on the partition • Hard drive fragments easily • Lack of security • NTFS provides access rights to files and directories • File integrity problems • Lost clusters • Invalid files and directories • Allocation errors

  18. NTFS • Overcomes FAT system limitations • “Journaling” file system • Keeps track of transactions performed • “Rolls back” transactions if errors are found • Uses a Master File Table (MFT) • Stores data about all files and directories • Similar to database table with records • Uses clusters • Reserves blocks of space to allow the MFT to grow

  19. Advantages of NTFS • File access is very fast and reliable • MFT allows system recovery from problems without losing significant amounts of data • Security is greatly increased over FAT • File encryption with EFS (Encrypting File System) • File compression reduces file size • Saves disk space

  20. Disadvantages of NTFS • Large overhead • Not recommended for volumes less than 4 GB • Cannot access NTFS volumes from: • MS-DOS • Windows 95 • Windows 98 • Linux (without an NTFS driver such as NTFS-3G)

  21. Comparing File Systems • Choosing correct file system • Operating system dependent • Rarely depends on hardware • NTFS: Windows XP or Vista • Supports drive sizes up to 16 TB (about 16,000 GB) • FAT: Windows 9x • Older small hard drives, small removable devices • UNIX/Linux • Many file system choices

  22. Comparing File Systems (cont’d.) Table 10-1, FAT16, FAT32, and NTFS compared

  23. Comparing File Systems (cont’d.) Table 10-2, Some UNIX/Linux file systems

  24. File Organization • Topics covered: • File characteristics • How files are stored on disks and other media

  25. Binary or Text • Text files • Consist of ASCII or Unicode characters • Typically read with word-processing programs or text editors • Easy to view and modify • Binary files • Computer readable (not human readable) • Coded and numeric information • More compact than text files • Examples: executable programs, applications, sound and image files
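
The claim that binary data is more compact than text can be checked with a small example. The sketch below stores the same number once as ASCII digits and once as a 4-byte unsigned integer using Python's standard `struct` module; the sample value and the choice of a 4-byte little-endian format are assumptions made for illustration.

```python
import struct

number = 1234567890

# Text form: one byte per decimal digit.
as_text = str(number).encode("ascii")

# Binary form: a 4-byte little-endian unsigned integer.
as_binary = struct.pack("<I", number)

print(len(as_text), "bytes as text:", as_text)
print(len(as_binary), "bytes as binary:", as_binary)

# The binary form is not readable by eye, but it decodes to the same value.
print(struct.unpack("<I", as_binary)[0])
```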

  26. Sequential or Random Access • Sequential storage • Data accessed one chunk after the other in order • Random storage • Data accessed in any order • Also called direct or relative access

  27. Sequential or Random Access (cont’d.) Figure 10-6, Sequential versus random access

  28. Sequential Access • Starts at the beginning and processes to the end of the file • Writing process is very fast • New data added to the end of a file • Retrieving, inserting, deleting, modifying data • Very slow • Stores data in rows like a database record • Field delimiters or specific fixed sizes for each field

  29. Sequential Access (cont’d.) Figure 10-7, A comma can be used as a field delimiter

  30. Sequential Access (cont’d.) Figure 10-8, Data can also be in fixed-length format
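
The two record layouts mentioned for sequential files, delimited fields and fixed-length fields, can be sketched as follows. The field widths and sample record are hypothetical and are not taken from Figures 10-7 or 10-8.

```python
FIELD_WIDTHS = (10, 10, 12)  # hypothetical widths: first name, last name, phone


def parse_delimited(line, delimiter=","):
    """Split a record whose fields are separated by a delimiter."""
    return line.rstrip("\n").split(delimiter)


def parse_fixed(line, widths=FIELD_WIDTHS):
    """Slice a record whose fields each occupy a fixed number of characters."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].rstrip())
        start += width
    return fields


delimited = "Ann,Lee,7025551234\n"
fixed = "Ann".ljust(10) + "Lee".ljust(10) + "7025551234".ljust(12)

print(parse_delimited(delimited))
print(parse_fixed(fixed))
```

Delimited records waste no space on padding, while fixed-length records make every record the same size, which is what random access relies on.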

  31. Random Access • Provides faster access to large amounts of data • Stores fixed-length records (relative records) • Ability to mathematically calculate the record’s position on disk surface and go right to it • Ability to update records in place • May waste disk space • Partial record or no data • Works well when sequential record number can easily identify records

  32. Random Access (cont’d.) Figure 10-9, Record organization and file access
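
With fixed-length records, a record's position is simply its record number multiplied by the record size. The sketch below uses an in-memory `io.BytesIO` object in place of a disk file; the 16-byte record size and the helper names are assumptions for illustration.

```python
import io

RECORD_SIZE = 16  # hypothetical fixed record length in bytes


def write_record(f, record_number, text):
    """Write one record directly at its calculated position."""
    data = text.encode("ascii")
    if len(data) > RECORD_SIZE:
        raise ValueError("record too long")
    f.seek(record_number * RECORD_SIZE)   # position = number x size
    f.write(data.ljust(RECORD_SIZE))      # pad with spaces to full length


def read_record(f, record_number):
    """Jump straight to a record without reading the ones before it."""
    f.seek(record_number * RECORD_SIZE)
    return f.read(RECORD_SIZE).decode("ascii").rstrip(" \x00")


f = io.BytesIO()
write_record(f, 3, "Carol")
write_record(f, 0, "Ann")
print(read_record(f, 3))
print(repr(read_record(f, 1)))   # an unused slot: space reserved, no data
print(len(f.getvalue()))         # four slots are allocated for two records
```

The unused slots show the wasted disk space the slide mentions: positions 1 and 2 are reserved even though they hold no data.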

  33. Hashing • Used for accessing relative record files • Uses unique value called a hash key • Widely used in database management systems • Involves a hashing algorithm to generate hash keys for each record • Combining hash keys establishes an index to rows or records of information

  34. Why Hash? • Allows a key field number not suited for relative file access to be converted into a relative record number • Example: phone numbers as keys in a customer information table • Divide highest possible phone number by the expected number of customers to get the hash key • 9999999999 / 2000 (estimated number of customers) = approximately 5,000,000 • Phone number 7025551234 / 5,000,000 gives the record number 1405 (dropping the remainder)

  35. Why Hash? (cont’d.) • Hashing may result in collisions • Same relative key is generated for more than one original key value • One solution: • Expand algorithm to add the sum of the digits of the phone number to the relative key • Sum of the digits in phone number 7025551234 is 34 • Original key 1405 + 34 = 1439 • Lessens collisions but does not eliminate them
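
The phone-number arithmetic on the two "Why Hash?" slides can be reproduced with integer division. In this sketch the divisor is the rounded quotient from the slide, and the function names and the second phone number are hypothetical, added only to show a collision.

```python
MAX_PHONE = 9_999_999_999
EXPECTED_CUSTOMERS = 2000

# 9999999999 / 2000 = 4999999.9995, which rounds to 5,000,000.
DIVISOR = round(MAX_PHONE / EXPECTED_CUSTOMERS)


def record_number(phone):
    """Basic hash: divide the phone number and drop the remainder."""
    return phone // DIVISOR


def record_number_with_digit_sum(phone):
    """Expanded hash: add the sum of the phone number's digits."""
    return record_number(phone) + sum(int(digit) for digit in str(phone))


print(record_number(7025551234))                 # 1405
print(record_number_with_digit_sum(7025551234))  # 1405 + 34 = 1439

# Two different phone numbers that collide under the basic hash...
print(record_number(7025550000))                 # 1405 again
# ...but land in different places once the digit sum is added.
print(record_number_with_digit_sum(7025550000))  # 1405 + 24 = 1429
```

Adding the digit sum separates these two keys, but other pairs can still collide, which is why a collision strategy is still needed.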

  36. Dealing with Collisions • Even the best hashing algorithms have collisions • One solution: create overflow area • Records with duplicate record numbers are placed in the overflow area at the end of the file • Record retrieval • Hash key is calculated, and record at the calculated position is retrieved • If the record at that location isn’t the correct one, the overflow area is searched sequentially

  37. Dealing with Collisions (cont’d.) Figure 10-10, An overflow area helps resolve collisions
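
The overflow-area approach can be sketched with two Python lists, one for the primary record slots and one for the overflow area. The class name `HashedFile`, the in-memory lists, and the modulo used to keep slot numbers in range are assumptions for illustration; an actual relative file would keep both areas on disk.

```python
class HashedFile:
    """Toy relative file with a primary area and an overflow area."""

    def __init__(self, slots, divisor):
        self.primary = [None] * slots   # one slot per relative record number
        self.overflow = []              # records whose slot was already taken
        self.divisor = divisor

    def _slot(self, key):
        # Modulo is an added safeguard so the slot always fits the table.
        return (key // self.divisor) % len(self.primary)

    def store(self, key, value):
        slot = self._slot(key)
        occupant = self.primary[slot]
        if occupant is None or occupant[0] == key:
            self.primary[slot] = (key, value)
            return
        # Collision: the slot belongs to a different key.
        for i, (existing_key, _) in enumerate(self.overflow):
            if existing_key == key:
                self.overflow[i] = (key, value)
                return
        self.overflow.append((key, value))

    def fetch(self, key):
        record = self.primary[self._slot(key)]
        if record is not None and record[0] == key:
            return record[1]
        # Not at the calculated position: search the overflow area in order.
        for existing_key, value in self.overflow:
            if existing_key == key:
                return value
        return None


customers = HashedFile(slots=2000, divisor=5_000_000)
customers.store(7025551234, "Ann")
customers.store(7025550000, "Bob")   # same slot as Ann, goes to overflow
print(customers.fetch(7025551234), customers.fetch(7025550000))
print(len(customers.overflow))
```

Lookups that hit the primary area take one step, while lookups that fall through to the overflow area get slower as that area grows.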

  38. Hashing and Computing • Efficient hashing algorithm • Important to companies producing database management systems • Many different hashing algorithms are used in computing • Encryption and decryption • Indexing • Many programming languages have specialized libraries of built-in hashing routines
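
As one example of a built-in hashing library, Python ships the standard `hashlib` module, and its dictionaries use hashing internally for indexing. The helper name `sha256_hex` and the sample data below are assumptions for illustration.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The same input always produces the same digest.
print(sha256_hex("7025551234"))
print(sha256_hex("7025551234") == sha256_hex("7025551234"))

# Indexing: a dictionary hashes each key to locate its value directly.
customers = {7025551234: "Ann", 7025550000: "Bob"}
print(customers[7025550000])
```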

  39. One Last Thought • Determining a computer system’s worth • Often measured in terms of data stored on hard drives • Data can be difficult to replace • Data storage dependent on file systems • Strong understanding of file systems allows greater data availability and protection

  40. Summary • Hard drive • Random access device • Stores information in tracks and sectors • Accesses data through read/write heads • File system • Responsible for creating, manipulating, renaming, copying, and removing files from a storage device • Windows uses either FAT or NTFS

  41. Summary (cont’d.) • FAT keeps track of which files are using specific clusters • Vulnerable to disk fragmentation • NTFS uses MFT to keep track of files and directories • Used with Windows • NTFS advantages over FAT • Better reliability and security, journaling, file encryption, and file compression

  42. Summary (cont’d.) • Linux can be used with many file systems • Files contain binary or text (ASCII) data • Data is usually stored and accessed either sequentially or randomly (relative access) • Hashing • Common method for accessing a relative file • Collisions occur when the same relative record location is generated for more than one key value
