Mobile File Systems
Mobility and Distributed File-systems • Examples of distributed file-systems – NFS (Network File System), AFS (Andrew File System), etc. • Key problems in a wireless/mobile environment? • Disconnections • Low bandwidth
File-systems … • CODA – supports disconnected operation • PFS – targets partially connected environments • Bayou – support for data-sharing among mobile users • others …
CODA • Three key components • Hoarding • Emulating • Reintegrating
Hoarding • Cache maintained locally at clients • Hoarding – fills in cache when mobile client is connected • Two types of hoarding • Usage based • User specified • Cache management • User driven • LRU
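The interplay between user-specified hoarding and LRU cache management can be sketched as follows. This is only a minimal illustration, not CODA's actual cache manager: the `HoardCache` class, its `capacity` measured in number of files, and the `pin` method are hypothetical names introduced for the example.

```python
from collections import OrderedDict


class HoardCache:
    """Toy client cache: user-pinned files are never evicted, others follow LRU."""

    def __init__(self, capacity):
        self.capacity = capacity        # maximum number of cached files (assumed unit)
        self.entries = OrderedDict()    # filename -> contents, least recently used first
        self.pinned = set()             # user-specified hoard set

    def pin(self, name):
        """Mark a file as user-specified so that eviction skips it."""
        self.pinned.add(name)

    def access(self, name, fetch):
        """Return the contents of `name`, calling `fetch` on a miss (while connected)."""
        if name in self.entries:
            self.entries.move_to_end(name)   # usage-based: refresh recency
            return self.entries[name]
        data = fetch(name)
        self.entries[name] = data
        self._evict()
        return data

    def _evict(self):
        # Evict least recently used, unpinned files until the cache fits.
        while len(self.entries) > self.capacity:
            victim = next((n for n in self.entries if n not in self.pinned), None)
            if victim is None:
                break                        # everything is pinned; nothing to evict
            del self.entries[victim]
```

With a capacity of two files and "a" pinned, accessing "a", "b", and "c" in turn evicts "b", the least recently used file that the user did not pin.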
Emulating • When the mobile client is disconnected, CODA enters the emulation mode • All requests are served from the local cache • Client modify log (CML) maintained based on user writes • CML used later in conflict detection and resolution • CML optimized periodically to reduce its size – unnecessary entries are deleted
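The kind of log optimization described above can be sketched as follows. This is a simplified illustration under assumed rules, not CODA's actual optimizer: the log is modeled as a list of hypothetical `(operation, path)` tuples with only three operations, `create`, `store`, and `remove`.

```python
def optimize_cml(log):
    """Return a shorter log with the same net effect on the server.

    Assumed rules:
    - a later store to a file supersedes earlier stores to the same file;
    - removing a file cancels all earlier logged operations on it;
    - if the file was also created in this log, the remove itself is dropped.
    """
    optimized = []
    for op, path in log:
        if op == "store":
            optimized = [e for e in optimized if e != ("store", path)]
            optimized.append((op, path))
        elif op == "remove":
            created_here = ("create", path) in optimized
            optimized = [e for e in optimized if e[1] != path]
            if not created_here:
                optimized.append((op, path))
        else:
            optimized.append((op, path))
    return optimized
```

A file that is created, written twice, and then removed while disconnected leaves no entry at all, so nothing about it has to be shipped at reintegration.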
Reintegrating • Transparent conflict resolution for directories • Except where automatic resolution is impossible – directory permission changes, addition/deletion of the same directories … • File conflict resolution • ASR – application specific resolvers • Possess information about the nature of files (say calendars) – resolve based on application semantics • Manual repair • User is shown a view of the conflicts and asked to resolve them
Optimizations • Rapid cache validations • Original mechanism in CODA for cache coherence based on callbacks • Callbacks cannot be used when clients are disconnected • Client validates cached copies explicitly upon reconnection • Potentially time-consuming • CODA uses cache coherence checks at multiple levels of granularity
Rapid cache validations - Illustration [Figure: volumes Vol 1 to Vol 4, each carrying a version stamp (Version x, Version y, Version z)] • Check volume version stamp • If the version stamps differ, check individual object version stamps
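The two-level check can be sketched as follows. The data layout is an assumption made for illustration: each side is a hypothetical dictionary mapping a volume name to its version stamp and to the version stamps of its objects.

```python
def validate_cache(client_volumes, server_volumes):
    """Return the set of (volume, object) pairs whose cached copies are stale.

    Both arguments map volume -> {"stamp": int, "objects": {name: stamp}}.
    """
    stale = set()
    for volume, cached in client_volumes.items():
        server = server_volumes[volume]
        if cached["stamp"] == server["stamp"]:
            continue  # one comparison validates every object in the volume
        # Volume stamps differ: fall back to per-object version stamps.
        for name, stamp in cached["objects"].items():
            if server["objects"].get(name) != stamp:
                stale.add((volume, name))
    return stale
```

Only volumes whose stamps differ pay the per-object cost, which is what keeps validation short after reconnection when most volumes are unchanged.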
Trickle Re-integration • When network is weakly connected, propagate updates to server asynchronously • Trade-offs • lower bandwidth efficiency as fewer CML optimizations are possible • More up to date copies at server – fewer conflicts possible • CODA allows users to dynamically set the period of the asynchronous updates based on requirements
User assisted cache miss handling • When a cache miss occurs, what should be done? • Option 1 – fetch from server • Option 2 – convey error message to user • Trade-offs? • Latency vs. availability • Patience vs. importance • CODA uses a “patience threshold”: on a cache miss, the file is fetched only if it can be retrieved within this threshold; otherwise the miss is reported to the user
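A minimal sketch of the patience-threshold decision follows. The function name, its parameters, and the simple size-over-bandwidth estimate are assumptions for illustration; the slides do not specify how CODA estimates the fetch time or sets the threshold.

```python
def handle_cache_miss(file_size_bytes, bandwidth_bps, patience_threshold_s):
    """Decide between fetching from the server and reporting the miss to the user."""
    estimated_fetch_s = file_size_bytes * 8 / bandwidth_bps
    if estimated_fetch_s <= patience_threshold_s:
        return "fetch"
    return "report_miss"
```

For example, over a 17,000 bit/s link with a 10 second threshold, a 1,000 byte file takes under half a second and is fetched transparently, while a 1,000,000 byte file would take several minutes and is reported to the user instead.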
Mimic: Raw Activity Shipping for File Synchronization in Mobile File Systems GNAN Research Group School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta, GA 30332, USA
File Synchronization in Mobile File Systems • File synchronization in distributed file systems is a consistency restoration process between a file server and a client • Over mobile networks such as a WWAN, file synchronization in traditional distributed file systems cannot be used effectively due to limited bandwidth • Bandwidth usage efficiency of file synchronization is important
Current File Synchronization Schemes (1) • Data compression • Compresses each block of data or meta-data • On-line data compression in a log-structured file system [Burrows’92] • Differential update • Exploits similarities between versions of the same file • Based on the diff scheme of the UNIX systems • Rsync [Tridgell’00] • Low-bandwidth network file systems [Muthi’01]
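The idea behind differential update can be illustrated with Python's standard `difflib` module. This sketch is not Rsync or any of the cited systems; it only shows, for a line-oriented text file, that a delta between two similar versions is far smaller than the full new version. The helper name `delta_size` is hypothetical.

```python
import difflib


def delta_size(old_text, new_text):
    """Size in bytes of a unified diff that turns old_text into new_text."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
    )
    return len("".join(diff).encode("utf-8"))


old = "".join(f"line {i}\n" for i in range(1000))
new = old + "one more line\n"
full_size = len(new.encode("utf-8"))
```

Here `delta_size(old, new)` covers only the appended line plus a few lines of context, whereas `full_size` grows with the whole file.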
Current File Synchronization Schemes (2) • Operation shipping • Logs and ships user operations that update the files • Session/application level logging and replaying • Need modification for GUI-based interactive applications • Corrects replaying errors by forward error correction (FEC) • Minor re-execution discrepancies caused by non-repeatable operations can be detected by fingerprint algorithms • Operation shipping for mobile file systems [Lee’02]
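Fingerprint-based detection of re-execution discrepancies can be sketched as follows. The choice of SHA-256 and the function names are assumptions for illustration; the slides do not say which fingerprint algorithm is used.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Fingerprint of a file's contents (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def verify_replay(replayed: bytes, client_fingerprint: str) -> bool:
    """True if the file produced by replaying matches the client's version."""
    return fingerprint(replayed) == client_fingerprint
```

The client ships only the short fingerprint of its updated file along with the operation log; the server replays the log and compares fingerprints, resorting to error correction or another update mode when they differ.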
Motivation • Update size comparison in Microsoft Word
Activity Description | Activity Events | Full File Size | diff Size | Activity Size
Insert a line | 98 keystrokes | 29341 B | 1543 B | 236 B
Insert a paragraph | 476 keystrokes | 29356 B | 2111 B | 848 B
Copy and paste a paragraph from the same file | 6 keystrokes + 12 mouse-clicks | 33449 B | 1119 B | 72 B
Change the font type of a paragraph | 7 mouse-clicks | 40611 B | 1660 B | 30 B
Goal and Overview • Goal • To design an application-unaware activity shipping scheme for file synchronization in interactive applications • Mimic is a file synchronization strategy that records raw activity at the client side, ships the records to the server during synchronization, and replays the activities at the server
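Mimic System Overview [Figure: on the client, input events pass from the input device through the input driver and the message queue to the application, which updates the file in the local file system, while the Mimic recorder captures the input messages into a record. The record, together with a file fingerprint/FEC, is sent over the network. On the server, the Mimic replayer feeds input messages through the message queue to the application, and the Mimic verifier checks the resulting file in the file system.]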
Mimic Design Elements • Mimic Client Design: recording raw input activity • Mapping filename to process/session • Record optimization • Mimic server design: replaying raw input activity • Environment synchronization • Replaying speed optimization • Integration of Mimic with file systems • Verification/error correction of replayed files • Presented in [Lee’02]
Mimic Client Design: Mimic Client to Window Manager Interface [Figure: client and server stacks (window manager, Mimic, OS, file system). For recording, the Mimic client captures system messages from the window manager through trapSystemMessage( ), getClipboard( ), getEnvironment( ), and processToWindowHandle( ). Mimic synchronization runs between the Mimic client and the Mimic server; native file synchronization runs between the two file systems.]
Mimic Client Design: Background • Translation of input activity • Device driver translates an input event into an input system message • Handle of the corresponding window is also written in the message • Delivery of input system messages • Interactive application has a set of application windows • System message queue demultiplexes system messages according to their window handle
Mimic Client Design: Window Handle Table (WHT) • WHT maps the process (or session) handle and filename of an application to the corresponding set of window handles • When a file is opened with a filename, the mapping information is acquired and added to the WHT • through processToWindowHandle( ) • When the file is closed, the mapping is removed from the WHT • Mimic recorder captures system messages having the window handles listed in the WHT or the system window handles • through trapSystemMessage( )
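The window handle table can be sketched as a small lookup structure. This is an illustrative model only: the class and method names are hypothetical, handles are plain integers, and the real Mimic recorder obtains them from the window manager rather than from the caller.

```python
class WindowHandleTable:
    """Toy WHT: maps an open file to its process handle and window handles."""

    def __init__(self, system_handles=()):
        self._entries = {}                          # filename -> (process_handle, handles)
        self._system_handles = set(system_handles)  # always captured

    def add(self, filename, process_handle, window_handles):
        """Register the mapping when a file is opened."""
        self._entries[filename] = (process_handle, set(window_handles))

    def remove(self, filename):
        """Drop the mapping when the file is closed."""
        self._entries.pop(filename, None)

    def should_capture(self, hwnd):
        """True if a system message addressed to hwnd must be recorded."""
        if hwnd in self._system_handles:
            return True
        return any(hwnd in handles for _, handles in self._entries.values())

    def files_for(self, hwnd):
        """Files whose activity record should receive a message for hwnd."""
        return [f for f, (_, handles) in self._entries.items() if hwnd in handles]
```

A message whose window handle is neither listed for an open file nor a system handle is simply ignored by the recorder.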
Mimic Client Design: Descriptors • Each captured message is translated into a descriptor • Activity Descriptor (AD) • Describes an individual user input activity for an application • Meta Descriptor (MD) • Captures the changes of system environment during recording • Corresponding system messages are generated • Includes screen resolution, color depth, keyboard layout, clipboard content, etc. • Environment Descriptor (ED) • Describes the initial system environment when recording begins • Same structure as that of an MD • through getEnvironment( ) and getClipboard( )
Mimic Client Design: Descriptor Structures [Figure: field layouts of the descriptors compared with the Windows EVENTMSG structure] • EVENTMSG in Windows, 20 bytes: message, lParam, wParam, time, hwnd (32 bits each) • Activity Descriptor (Keyboard Activity), 2 bytes: message type (bits 0-3), virtual-key code (bits 4-11), time (bits 12-15) • Activity Descriptor (Mouse Activity), 4 bytes: message type, x-coordinate, y-coordinate, time • Activity Descriptor (Meta Descriptor Link), 2 bytes: message type, MD link information, time • Meta Descriptor (Keyboard Layout Change), 2 bytes: message type, keyboard layout information, time • Meta Descriptor (Screen Resolution Change), 4 bytes: message type, x-coordinate, y-coordinate, time • Meta Descriptor (Color Depth Change), 2 bytes: message type, color information, time • Meta Descriptor (Clipboard Content), variable size: message type, clipboard size, identifier, time, clipboard information
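To make the size saving concrete, a 2-byte keyboard activity descriptor can be packed as sketched below. The field widths (4-bit message type, 8-bit virtual-key code, 4-bit time) are read from the figure; treating bit 0 as the least significant bit, using little-endian byte order, and the function names are all assumptions of this sketch.

```python
import struct


def pack_keyboard_ad(msg_type, vk_code, time_field):
    """Pack a keyboard activity descriptor into 2 bytes (assumed bit order)."""
    if not (0 <= msg_type < 16 and 0 <= vk_code < 256 and 0 <= time_field < 16):
        raise ValueError("field out of range")
    word = msg_type | (vk_code << 4) | (time_field << 12)
    return struct.pack("<H", word)


def unpack_keyboard_ad(data):
    """Inverse of pack_keyboard_ad: returns (msg_type, vk_code, time_field)."""
    (word,) = struct.unpack("<H", data)
    return word & 0xF, (word >> 4) & 0xFF, (word >> 12) & 0xF
```

Each keystroke then costs 2 bytes instead of the 20 bytes of a full EVENTMSG, at the price of a much coarser time field.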
Mimic Client Design: Records • Each descriptor is recorded in a record • File Activity Record (FAR) • Maintained per file • Consisting of file information, a set of environment descriptors (EDs), and a sequence of activity descriptors (ADs) • Linked to a meta descriptor (MD) of a meta activity record (MAR) when an environment change happens • Meta Activity Record (MAR) • Shared by file activity records (FARs) • Consisting of a sequence of meta descriptors (MDs)
Mimic Client Design: Mimic Client to File System Interface [Figure: same client and server stacks as in the window manager interface figure. In addition, the client file system calls the Mimic client through open( ), close( ), rename( ), delete( ), finish( ), and synch( ), which returns Synch_Status. Mimic synchronization runs between the Mimic client and the Mimic server; native file synchronization runs between the two file systems.]
Mimic Client Design: Integration with File Systems • open(filename, mode, process_handle) • Called when a shared file is opened • close(filename), rename(filename), delete(filename) • Called when a shared file is closed, renamed, or deleted • synch(filename, diff_size) • Called when a shared file needs to be synchronized • Decides update mode by comparing FAR_size with diff_size • Returns Synch_Status • SYNCH_FAIL if Mimic synchronization fails or diff is chosen • The file system updates again in diff mode when Mimic synchronization fails • finish() • Called when the synchronization process is terminated
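The update-mode decision made by synch( ) can be sketched as follows. This is a simplified model, not the Mimic implementation: the function takes the two sizes directly instead of a filename, `SYNCH_OK` is a hypothetical name (the slides name only SYNCH_FAIL), the tie-breaking rule is assumed, and shipping is delegated to caller-supplied callables.

```python
SYNCH_OK = "SYNCH_OK"      # hypothetical name; the slides only name SYNCH_FAIL
SYNCH_FAIL = "SYNCH_FAIL"


def synch(far_size, diff_size, ship_far):
    """Choose the update mode and return a Synch_Status value.

    far_size  -- size in bytes of the file activity record (FAR)
    diff_size -- size in bytes of the diff the file system would send
    ship_far  -- callable that ships and replays the FAR, True on success
    """
    if far_size >= diff_size:
        return SYNCH_FAIL          # diff is chosen (tie goes to diff, assumed)
    return SYNCH_OK if ship_far() else SYNCH_FAIL


def synchronize(far_size, diff_size, ship_far, ship_diff):
    """File-system side: fall back to diff mode whenever synch() fails."""
    if synch(far_size, diff_size, ship_far) == SYNCH_FAIL:
        ship_diff()
        return "diff"
    return "mimic"
```

With a 236-byte activity record against a 1543-byte diff, the sketch picks Mimic mode, and it falls back to diff if shipping or replaying the record fails.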
Mimic Server Design: Mimic Server to Window Manager/OS Interface [Figure: same client and server stacks as in the client interface figures. On the server, the Mimic server replays through the window manager calls playActivity( ), setEnvironment( ), and setClipboard( ), and through the OS calls executeDefaultApplication( ) and waitForProcessIdle( ). The client side keeps the recording calls trapSystemMessage( ), getClipboard( ), getEnvironment( ), and processToWindowHandle( ), and the file system calls open( ), close( ), rename( ), delete( ), finish( ), and synch( ) with Synch_Status.]
Mimic Server Design: Initialization • Environment synchronization • Based on the environment descriptor (ED) of the FAR • Initial system environment synchronization • through setEnvironment() • Clipboard content synchronization • through setClipboard() • Application synchronization • Opens a corresponding application of the file activity record • Based on the file extension of the filename • Sets the same environment such as window size and location • through executeDefaultApplication() • Moves the system focus to the application
Mimic Server Design: Replaying • System message/function generation • Activity descriptor (AD) is mapped into an input system message • Meta or environment descriptor (MD or ED) is mapped into a system function to set the environment • through playActivity() • System message/function re-execution • Deliver the messages to system message queue • Run the system functions
Mimic Server Design: Replaying Speed Control • User activity skipping and misinterpretation • Certain inputs are relevant to the application only in particular states • Replaying too fast may not give the application time to reach a particular state, and causes replaying errors • Replaying speed control in Mimic • Monitors the CPU utilization of the process after every message playback • Plays back the next AD only when the process is idle • through waitForProcessIdle( )
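The replay loop with speed control can be sketched as below. This is a portable model rather than the Windows implementation: `cpu_utilization` and `play_activity` are caller-supplied stand-ins for the real monitoring and playback calls, and the idle threshold, polling interval, and poll limit are assumed values.

```python
import time


def wait_for_process_idle(cpu_utilization, idle_below=0.01,
                          poll_interval=0.01, max_polls=1000):
    """Poll until the process's CPU utilization drops below idle_below."""
    for _ in range(max_polls):
        if cpu_utilization() < idle_below:
            return True
        time.sleep(poll_interval)
    return False                      # gave up: the process never became idle


def replay(descriptors, play_activity, cpu_utilization, poll_interval=0.01):
    """Play descriptors in order, waiting for idle after every playback."""
    played = 0
    for descriptor in descriptors:
        play_activity(descriptor)
        played += 1
        if not wait_for_process_idle(cpu_utilization, poll_interval=poll_interval):
            break                     # stop rather than risk skipped input
    return played
```

Waiting on idleness instead of replaying the recorded inter-event delays lets the server run as fast as the application allows without feeding it input before it is ready.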
Experiment Setup • Wireless wide area network (WWAN) • CDMA2000-1X cellular network • Effective data rate: about 17 Kbps • Round-trip time between the client and server: about 300ms • Operating system/application • Microsoft Windows 2000 Professional • Microsoft Office 2000 suite • Metrics • Transfer size • Synchronization latency • Includes shipping, replaying, and verification delays • Compared with the differential update (diff) • xDelta for Windows
Transfer Size Results (3) • Transfer size in Mimic is generally less than that in diff • Except when copying from outside the file • Bandwidth-inefficient clipboard structure (OLE) • Mimic overhead is generally proportional to activity size • Except when copying from outside the file • Transfer size depends on the size of the copied object • Diff overhead is not proportional to activity size • A single line insertion in diff may consume more bandwidth than a single paragraph insertion • Delete or modify operations in Mimic incur significantly smaller overheads than those in diff
Synchronization Latency Results (3) • Mimic still shows better latency performance for certain activities • For small insertions, deletions, internal copies, and meta data changes • Even though the latency in Mimic includes its playback time, its total update time does not exceed that of diff • Benefit of small transfer size for those operations is larger than playback overhead • However, for the other types of activities, diff performs better in terms of latency • For large insertions, modifications, and external copies
Conclusions • We propose an application-unaware and OS-independent approach called Mimic that relies on transferring raw user activity records to the server, where the file is updated through a playback of the raw user activity on the old copy of the file • We show that Mimic performs much better than diff in most scenarios in terms of the transfer file sizes • We conclude that Mimic can be used in tandem with diff to substantially improve file synchronization performance
Puzzle • Lateral thinking • E.g. • A man approaches the center of a field, frantically trying to open a package. When he reaches the center of the field, he dies. Why? • A man lives on the 11th floor of a building • The elevator is a small one (can accommodate only 1 person) • When he goes to work, he takes the elevator to the 1st floor • When he comes back from work, he takes the elevator to the 6th floor, and walks up 5 floors • When it rains, he takes the elevator up to the 11th floor • Why?