Versioning File Systems
• Someone has typed: rm -r *
• However, they were in the wrong directory. What can be done?
• Typical UNIX and Windows systems have some tools for restoring deleted files, provided the file's blocks have not yet been reclaimed.
• Is this immediate release of storage by UNIX and Windows essential?
7.1 Advanced Operating Systems
The File System's Problem
• The key problem with the current approach is that user actions have an immediate and irrevocable effect on disk storage.
• Users are not protected against their own mistakes.
• This goes against the file system's objective of protecting data against failure.
• We can do better today.
Disk Capacity
• In 1995:
  • For $200 you could get a 0.54 GB disk.
  • Slackware Linux 2.2 (basic applications + X Window) takes 0.15 GB, which is 28% of the disk.
• In 2000:
  • For $200 you could get a 40 GB disk.
  • RedHat Linux 7 (basic applications + X Window) takes 1 GB, which is 2.5% of the disk.
Disk Capacity (Cont.)
• In 2004:
  • For $200 you could get a 300 GB disk.
  • RedHat Linux Advanced Workstation 2.1 (basic applications + X Window) for the Itanium processor takes 4.2 GB, which is 1.4% of the disk.
• In 2011:
  • For $200 you could get a 2 TB disk.
  • RedHat Enterprise Linux 5 (basic applications + X Window) takes 8.8 GB, which is 0.4% of the disk.
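The percentages on the two Disk Capacity slides can be rechecked with a few lines of arithmetic. This is only a sketch: the function name `install_share` is chosen for illustration, and 2 TB is assumed to mean 2000 GB.

```python
# (year, disk size in GB for about $200, install size in GB), as quoted on the slides.
DATA = [
    (1995, 0.54, 0.15),
    (2000, 40.0, 1.0),
    (2004, 300.0, 4.2),
    (2011, 2000.0, 8.8),  # assumption: 2 TB is taken as 2000 GB
]


def install_share(disk_gb, install_gb):
    """Return the installation size as a percentage of the disk size."""
    return 100.0 * install_gb / disk_gb


for year, disk_gb, install_gb in DATA:
    print(f"{year}: {install_share(disk_gb, install_gb):.1f}% of the disk")
```

The share of the disk taken by a basic installation drops by roughly two orders of magnitude over these sixteen years, which is the motivation for spending spare capacity on old versions.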
Old Solutions
• UNIX has RCS and CVS for maintaining versions of files.
• The main disadvantage of these tools is that they must be operated manually.
• In 1985 the Cedar file system was proposed.
• Cedar automatically retains the last few versions of a file in a copy-on-write fashion.
• The number of copies is limited; hence, when a new write is done, the oldest version is deleted.
• The user can explicitly delete a version, in which case the oldest version is not the victim.
• VMS uses a similar scheme of numbered file versions.
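The "last few versions" behaviour described for Cedar can be sketched as a bounded list. This is a toy model, not Cedar's implementation: the class name `BoundedVersionFile` and the limit of three versions are assumptions, and whole contents are stored instead of copy-on-write blocks.

```python
class BoundedVersionFile:
    """Toy model of a file that keeps only its most recent versions."""

    def __init__(self, max_versions=3):  # the limit of 3 is an arbitrary assumption
        self.max_versions = max_versions
        self.versions = []               # oldest first: (version_number, data)
        self.next_number = 1

    def write(self, data):
        """Store a new version; evict the oldest one if the limit is exceeded."""
        self.versions.append((self.next_number, data))
        self.next_number += 1
        if len(self.versions) > self.max_versions:
            self.versions.pop(0)

    def delete_version(self, number):
        """Explicitly delete one version, freeing a slot for the next write."""
        self.versions = [v for v in self.versions if v[0] != number]

    def version_numbers(self):
        return [number for number, _ in self.versions]
```

For example, after four writes with a limit of three, version 1 is gone. If the user then deletes version 3, the next write fits without evicting version 2.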
Snapshots
• Many systems regularly back up their data to snapshots kept on the same disk.
• The backup is usually incremental.
• Changes made between snapshots cannot be undone.
• Many users maintain multiple versions of their critical data.
• Snapshots treat all files equally.
Not All Files Are Created Equal
• Read-only files (like application executables) have no version history.
• Derived files (like object files) can be easily reconstituted.
• Cached files require no version history.
• Temporary files might benefit from a short-term history but not from a long-term history.
• User-modified files would benefit most from both a short-term and a long-term history.
The Elephant File System
• Elephant (1999) maintains multiple versions of user files, but not all versions of all files.
  • This requires a retention policy.
• Elephant involves the user in the retention/reclamation decisions. This means:
  • Less protection from user mistakes.
  • A retention policy that may be better suited to the users' needs.
• Elephant keeps a complete history of a file over a short period of time (one hour to one week), but keeps landmark versions of each file forever.
Elephant's Main Concepts
• Storage reclamation is separated from file write and delete.
• Files have a variety of retention policies.
• Policies are specified by the user, but implemented by the system.
• Undo requires a complete history for a limited period of time, but long-term histories should not retain all versions.
• The file system assists the user in deciding which versions to retain in the long-term history.
Landmark Versions
• Elephant detects landmark versions by looking at the timeline of updates to the file.
• It can identify groups of updates separated by long periods of stability.
• The last version of each group of updates is assumed to be a landmark version.
• The user's ability to recognize landmark versions of a file degrades with time.
• Thus, landmark versions are identified automatically by Elephant.
• Even so, the user can manually mark any version as a landmark version.
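The grouping idea above can be sketched as follows. This is an illustrative heuristic, not Elephant's actual algorithm: the function name `find_landmarks`, the fixed `stability_gap` threshold, and the choice to treat the newest version as a landmark are all assumptions.

```python
def find_landmarks(timestamps, stability_gap):
    """Return the update times treated as landmark versions.

    A version counts as a landmark when the next update arrives more than
    `stability_gap` time units later, i.e. it closes a group of updates.
    Assumption: the newest version also counts, as the end of the last group.
    """
    ordered = sorted(timestamps)
    landmarks = []
    for current, following in zip(ordered, ordered[1:]):
        if following - current > stability_gap:
            landmarks.append(current)
    if ordered:
        landmarks.append(ordered[-1])
    return landmarks
```

With updates at times 0, 5, 10, 1000, 1005 and 5000 and a gap of 100, the updates fall into three groups, and the landmarks are the versions written at 10, 1005 and 5000.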
Elephant's Versioning
• The user can set the boundary between the recent history (every version is saved) and the old history (only landmark versions are saved).
• File versions are named by combining the file pathname with a creation date and time.
• Directories can be versioned as well.
  • This allows recovery of deleted files.
• Previous versions of a file or a directory are read-only.
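A name that combines a pathname with a date and time could look like the sketch below. The slides do not give Elephant's real syntax, so the `@` separator, the ISO 8601 timestamp format, and the rule in `version_at` (pick the version that was current at the requested time) are assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import datetime

SEPARATOR = "@"  # hypothetical separator between pathname and timestamp


def versioned_name(path, when):
    """Build a version name from a pathname and a date and time."""
    return f"{path}{SEPARATOR}{when.isoformat(timespec='seconds')}"


def parse_versioned_name(name):
    """Split a version name back into (pathname, datetime)."""
    path, _, stamp = name.rpartition(SEPARATOR)
    return path, datetime.fromisoformat(stamp)


def version_at(creation_times, when):
    """Return the creation time of the version that was current at `when`.

    `creation_times` must be sorted in ascending order. Returns None if the
    file did not exist yet at that time.
    """
    index = bisect_right(creation_times, when)
    return creation_times[index - 1] if index else None
```

Resolving by "latest version not newer than the requested time" means the user does not need to remember exact creation times in order to name an old version.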
Retention Policies
• Keep One: keeps only the latest version of the file.
  • Identical to UFS and FAT.
• Keep All: keeps all versions of the file.
  • Useful for very important files.
• Keep Safe: keeps all versions of the file for a specific period.
  • Can be used for log files.
• Keep Landmarks: keeps all versions of the file for a specific period and only landmark versions after that.
  • Useful for ordinary user files.
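The four policies can be summarized as one selection function. This is a minimal sketch under stated assumptions, not Elephant's code: the names `Policy` and `versions_to_keep` are invented here, a version is modelled as a `(timestamp, is_landmark)` pair, and the current version is assumed to be retained under every policy.

```python
from enum import Enum


class Policy(Enum):
    KEEP_ONE = 1
    KEEP_ALL = 2
    KEEP_SAFE = 3
    KEEP_LANDMARKS = 4


def versions_to_keep(policy, versions, now, window):
    """Return the versions a cleaner would retain.

    `versions` is a list of (timestamp, is_landmark) pairs, oldest first.
    `window` is the length of the recent-history period.
    """
    if not versions:
        return []
    current = versions[-1]
    if policy is Policy.KEEP_ONE:
        return [current]
    if policy is Policy.KEEP_ALL:
        return list(versions)
    if policy is Policy.KEEP_SAFE:
        kept = [v for v in versions if now - v[0] <= window]
    else:  # Policy.KEEP_LANDMARKS
        kept = [v for v in versions if v[1] or now - v[0] <= window]
    if current not in kept:  # assumption: the current version always survives
        kept.append(current)
    return kept
```

Because the function only selects versions, it can run in a background cleaner rather than at write or delete time, in line with the idea that reclamation is separated from file write and delete.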
I-map
• The I-map is a new structure that points to the I-node of the current version and to the vector of the old versions (the I-node log). In addition, the I-map contains the file's policy.
• By default the policy is "keep one".
• Blocks common to several versions can be pointed to by several I-nodes.
• Changes are detected at the block level.
• New system calls have been added to handle the new file system's features.
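The relationship between the I-map, the I-node log, and shared blocks can be sketched in memory. The field and method names below are hypothetical and the on-disk layout is not modelled. Blocks are represented by their block numbers only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Inode:
    created: int       # creation time of this version
    blocks: List[int]  # block numbers that make up this version


@dataclass
class Imap:
    current: Inode                  # I-node of the current version
    policy: str = "keep one"        # default policy, as on the slide
    inode_log: List[Inode] = field(default_factory=list)  # old versions

    def write_block(self, index, new_block, now):
        """Create a new version in which only block `index` changes.

        The old I-node moves to the log. Unchanged block numbers are copied
        into the new I-node, so both versions point to the same blocks.
        """
        self.inode_log.append(self.current)
        blocks = list(self.current.blocks)
        blocks[index] = new_block
        self.current = Inode(created=now, blocks=blocks)


def shared_blocks(a, b):
    """Return the block numbers that two versions have in common."""
    return sorted(set(a.blocks) & set(b.blocks))
```

In this sketch a write always appends to the log, regardless of the policy. Deciding which logged versions to discard is left to a separate reclamation step.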
Elephant's Performance
• open() of an existing file and close() without flushing run in almost the same time as in traditional UFS.
• close() with flushing is slower.
• creat() in Elephant is slower.
  • It must allocate an I-map on the disk in addition to the I-node.
• unlink() in Elephant is faster.
  • Old blocks are not released.
• Elephant consumes much more disk space.
The Moraine File System
• In 2000 Yamamoto suggested compressing the versioning data.
• In addition, his versioning file system has software engineering tools:
  • Moraine has a version viewer tool that runs in a separate window.
  • Moraine can also tell how many lines and how many functions any version has.
The Version Viewer of Moraine
• Rev is the version ID.
• +n means n lines were added, while -n means n lines were deleted.
• The line bar indicates the size of the file.
• The user can put a remark in TAG.
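The +n and -n columns are line counts between consecutive versions. The slides do not say how Moraine computes them, so the sketch below simply uses Python's standard `difflib` module. The function name `line_delta` is an assumption.

```python
import difflib


def line_delta(old_text, new_text):
    """Return (added, removed) line counts between two versions of a file."""
    added = removed = 0
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    for line in diff:
        if line.startswith("+ "):    # line present only in the new version
            added += 1
        elif line.startswith("- "):  # line present only in the old version
            removed += 1
    return added, removed
```

A modified line is counted here as one deletion plus one addition, which is one common convention for such summaries.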
CVFS
• In 2003 Soules introduced the Comprehensive Versioning File System (CVFS).
• CVFS keeps the versions of all files in a journal-based style.
• CVFS saves the changes, not the new data.
• To recreate an old version, each change is undone backward through the journal until the desired version is reached.
• Rather than saving the blocks that have been changed, CVFS keeps the bytes that have been changed.
• CVFS is very efficient in disk space, but slow at recovering old versions.
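Undoing changes backward through a journal can be sketched with a byte-level undo log. This is a toy model under stated assumptions, not CVFS's actual journal format: the class name `JournaledFile` and the entry layout `(offset, old_bytes, new_length)` are invented for illustration, and writes may overwrite or extend the file but may not leave a gap past its end.

```python
class JournaledFile:
    """Toy file that stores its current contents plus a byte-level undo journal."""

    def __init__(self, data=b""):
        self.data = bytearray(data)
        self.journal = []  # one entry per write: (offset, old_bytes, new_length)

    def write(self, offset, new_bytes):
        """Overwrite bytes at `offset`, recording only the bytes that were replaced."""
        if offset > len(self.data):
            raise ValueError("writes past the end of the file are not modelled")
        end = offset + len(new_bytes)
        old = bytes(self.data[offset:end])  # shorter than new_bytes if the file grows
        self.journal.append((offset, old, len(new_bytes)))
        self.data[offset:end] = new_bytes

    def version(self, number):
        """Return the contents after the first `number` writes (0 = original).

        Starts from the current data and undoes the later writes, newest first.
        """
        data = bytearray(self.data)
        for offset, old, new_length in reversed(self.journal[number:]):
            data[offset:offset + new_length] = old
        return bytes(data)
```

Only the replaced bytes are stored, which keeps the journal small. Reaching an old version requires replaying every later entry in reverse, so recovery time grows with the number of changes since that version.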