Mobile Replicated Data
By Shruti Shivashankaraiah Shubha Sundara Murthy
CST-594
Contents
• Introduction
• History
• System Models
• Replication Requirements
• Data Consistency
• Replicated Data Models
• Representing Updates
• Recording Updates
• Sending Updates
• Ordering Updates
• References
Introduction
• Technology is trending towards wireless.
• Constant access to personal data.
• Millions of users access online distributed databases.
Challenges
• Consistency and availability of data.
• Providing users with anywhere, anytime access.
• Coming up with a suitable replication protocol.
History
• The era of replicated services started in the 1980s.
• Grapevine, developed at Xerox, provided a replicated directory and email service.
• Many innovations came out of the Coda project at CMU.
• Around 1990, researchers articulated a vision called “ubiquitous computing”.
Key Considerations in Mobile Systems
• Portable devices with limited displays, CPU resources, storage, battery life, and security.
• Intermittent, low-bandwidth, high-latency network connections.
• Changing environmental conditions and contexts.
System Models
• Remote Data Access
• Device–Master Replication
• Peer-to-Peer Replication
• Publish–Subscribe Systems
Remote Data Access
• Example: Web access from a cell phone using WAP.
• Data resides on a central server.
• The server provides methods for querying or accessing the stored information over a network.
• Data consistency is not an issue since all updates are performed directly at the server.
Drawbacks of Remote Data Access
• Data is inaccessible if a network connection to the server cannot be established or if the server is temporarily unavailable.
• Access time to the data is limited by the round-trip communication latency between device and server.
• Communication consumes valuable battery life.
• Communication may incur network charges.
Device–Master Replication
• Example: Laptop caching files and Web pages.
• Data resides on a master site.
• Full or partial copies of the data reside on the portable devices (caching).
• Uses a weak consistency model and a cache replacement policy.
• Frequently accessed data or actively managed, user-visible data can be cached.
• Any updates are written to the master server to make them available to all devices.
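The caching behaviour described on this slide can be sketched roughly as follows. This is a minimal illustration, not a real protocol: the `Master` and `CachingDevice` classes and their methods are hypothetical names, updates are written through to the master immediately, and there is no cache replacement policy.

```python
class Master:
    """Hypothetical master site holding the authoritative copy."""
    def __init__(self):
        self.items = {}

    def read(self, key):
        return self.items[key]

    def write(self, key, value):
        self.items[key] = value


class CachingDevice:
    """Hypothetical portable device caching a subset of the master's items."""
    def __init__(self, master):
        self.master = master
        self.cache = {}

    def read(self, key):
        # Serve from the local cache when possible; the copy may be stale.
        if key not in self.cache:
            self.cache[key] = self.master.read(key)
        return self.cache[key]

    def write(self, key, value):
        # Update the local copy and write the update through to the master.
        self.cache[key] = value
        self.master.write(key, value)


master = Master()
laptop, phone = CachingDevice(master), CachingDevice(master)
laptop.write("note", "v1")
first = phone.read("note")   # fetched from the master and cached
laptop.write("note", "v2")
stale = phone.read("note")   # still served from the phone's cache
```

After the second write the phone still sees its cached copy, which is the weak consistency the slide refers to.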
Drawbacks of Device–Master Replication
• Attempts to read a data object fail if the data is not resident on the accessing device.
• Weaker consistency guarantees.
• Concurrent updates on the same or different objects lead to conflicts.
• The master is responsible for detecting when two devices produce conflicting updates.
• If the master fails, other devices cannot propagate updates.
Peer-to-Peer Replication
• Example: Peer-to-peer synchronization between mobile devices.
• No master replica; synchronization is pairwise.
• Protocols are more complicated but offer advantages:
  • Easy to add an external device by establishing local synchronization partnerships.
  • Tolerant of failed devices and network outages.
Drawbacks of Peer-to-Peer Replication
• Requires a complex protocol to ensure effective utilization of bandwidth during updates.
• Conflicts are more prevalent.
Publish–Subscribe System
• Example: Smart watch receiving news, weather reports, and other notices that are broadcast from a central publishing site.
• Publishers broadcast messages to subscribers.
• Information reaches subscribers directly from the publisher or through other subscribers.
• Typical content: news, weather, or sports scores.
• Information is created by the publisher and is read-only; items are usually discarded once read.
Replication Requirements
[Table: Replication requirements for basic data-oriented system models]
Data Consistency
• Best Effort Consistency
• Eventual Consistency
• Causal Consistency
• Session Consistency
• Bounded Inconsistency
• Hybrid Consistency
Best Effort Consistency
• Guarantees a best effort to deliver each update to all replicas.
• Despite reliable delivery, replicas will not converge if:
  • Updates are performed differently at different replicas.
  • Updates are applied in different orders at different replicas and are not commutative.
  • Replicas have different conflict resolution policies.
  • Metadata, such as deletion tombstones, is discarded too early.
  • Replicas lose data or have corrupt data.
  • The system is improperly configured, such as when the synchronization topology is not a well-connected graph.
Eventual Consistency
• Guarantees that replicas eventually converge to a mutually consistent state, that is, to identical contents.
• A mobile system provides eventual consistency if:
  (1) Each update operation is eventually received by each device.
  (2) Non-commutative updates are performed in the same order at each replica.
  (3) The outcome of applying a sequence of updates is the same at each replica.
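Condition (2) can be illustrated with a small sketch. It assumes each update carries a `(timestamp, replica_id)` pair that all replicas agree to sort by; that tuple layout and the function names are assumptions made for this example, not something the slides prescribe.

```python
def apply_in_arrival_order(updates):
    """Naive replica: applies overwrites in whatever order they arrive."""
    state = {}
    for _, _, key, value in updates:
        state[key] = value
    return state


def apply_in_agreed_order(updates):
    """Applies overwrites sorted by the agreed (timestamp, replica_id) key."""
    state = {}
    for _, _, key, value in sorted(updates):
        state[key] = value
    return state


updates = [(2, "B", "x", "blue"), (1, "A", "x", "red"), (3, "A", "y", "on")]
other_arrival = list(reversed(updates))

# Overwrites of "x" are not commutative, so arrival order matters.
naive_1 = apply_in_arrival_order(updates)
naive_2 = apply_in_arrival_order(other_arrival)

# With an agreed order, both replicas reach the same contents.
ordered_1 = apply_in_agreed_order(updates)
ordered_2 = apply_in_agreed_order(other_arrival)
```

Both replicas receive the same three updates, but only the ones that apply them in the agreed order end up identical.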
Causal Consistency
• Guarantees that if update U2 follows U1, then a replica never performs U2 without first performing U1.
Session Consistency
• Session guarantees provide a user with a view of a replicated database that is consistent with respect to the set of read and update operations performed by that user, while still allowing temporary divergence among replicas.
• Read your writes
• Monotonic reads
• Writes follow reads
• Monotonic writes
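As a rough illustration of the "read your writes" guarantee, the sketch below has a session remember the version of its latest write and refuse to read from a replica that has not yet applied it. The `Replica` and `Session` classes are hypothetical, and the single-writer version counter is a simplifying assumption.

```python
class Replica:
    def __init__(self):
        self.version = 0   # number of writes applied so far, in order
        self.data = {}

    def apply(self, version, key, value):
        self.data[key] = value
        self.version = version


class Session:
    """Tracks the highest write version produced in this user session."""
    def __init__(self):
        self.last_write = 0

    def write(self, replica, key, value):
        self.last_write = replica.version + 1
        replica.apply(self.last_write, key, value)

    def read(self, replica, key):
        # Read your writes: reject replicas that have not seen our writes yet.
        if replica.version < self.last_write:
            raise LookupError("replica is too stale for this session")
        return replica.data.get(key)


r1, r2 = Replica(), Replica()
session = Session()
session.write(r1, "cart", ["book"])
try:
    session.read(r2, "cart")
    stale_read_allowed = True
except LookupError:
    stale_read_allowed = False
r2.apply(1, "cart", ["book"])      # the update eventually propagates
value = session.read(r2, "cart")
```

The two replicas are allowed to diverge temporarily; the session only restricts which replica this particular user may read from.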
Bounded Inconsistency
• Bounds are placed on the timeliness or inaccuracy of items that are read from a device’s local replica.
Hybrid Consistency
• Applications can be presented with a choice of strong or weak read consistency.
• For strong consistency, read operations can be directed to the master (or publisher).
• If stale information is acceptable, the data can be read from the local replica.
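One way to picture the choice between local and master reads is a read function with a staleness bound. This is only a sketch: the `read` helper, the `max_age` parameter, and the `(value, fetched_at)` cache layout are assumptions for illustration, and times are passed in explicitly to keep the example deterministic.

```python
def read(key, local, master, max_age, now):
    """Return (value, source); local maps key -> (value, fetched_at)."""
    entry = local.get(key)
    if entry is not None and now - entry[1] <= max_age:
        return entry[0], "local"       # stale data within the bound is acceptable
    value = master[key]                # otherwise go to the master for a fresh copy
    local[key] = (value, now)
    return value, "master"


master = {"temp": 21}
local = {"temp": (20, 100.0)}

weak = read("temp", local, master, max_age=60, now=130.0)
strong = read("temp", local, master, max_age=0, now=130.0)
```

Setting `max_age` to zero behaves like a strong read directed to the master, while a larger bound trades freshness for a cheaper local read.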
Replicated Data Models
Issues faced by a designer of a replication protocol:
• Consistency
• Update format
• Change tracking
• Metadata
• Sync state
• Change enumeration
• Communication
• Ordering
• Filtering
Representing Updates
Schemes:
• Operation-Sending Protocols
• Item-Sending Protocols
Comparisons:
• Item-sending requires a common item layout, but replicas can run different applications with custom APIs for accessing the shared collection.
• Operation-sending requires common operations, but replicas may even have different physical schemas.
Recording Updates
Schemes:
• Log-Based Systems
• State-Based Systems
• Comparison of log-based and state-based systems
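The difference between the two recording schemes might be sketched like this. Both classes are hypothetical; the log-based one keeps every operation, while the state-based one keeps only current items plus a per-item version number, which is one possible form of change-tracking metadata.

```python
class LogBasedReplica:
    """Records every update operation in an append-only log."""
    def __init__(self):
        self.data, self.log = {}, []

    def update(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def changes_since(self, position):
        return self.log[position:]


class StateBasedReplica:
    """Keeps only the current items plus a per-item version number."""
    def __init__(self):
        self.data, self.versions, self.clock = {}, {}, 0

    def update(self, key, value):
        self.clock += 1
        self.data[key] = value
        self.versions[key] = self.clock

    def changes_since(self, version):
        return [(k, self.data[k]) for k, v in self.versions.items() if v > version]


log_replica, state_replica = LogBasedReplica(), StateBasedReplica()
for replica in (log_replica, state_replica):
    replica.update("a", "a1")
    replica.update("a", "a2")
    replica.update("b", "b1")

log_changes = log_replica.changes_since(0)      # every operation, in order
state_changes = state_replica.changes_since(0)  # only the latest value per item
```

The log retains the intermediate value of item "a", whereas the state-based replica reports only the final state of each changed item.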
Sending Updates
Direct Broadcast
• Simplest technique.
• A device that performs a local update operation immediately sends that update to all other replicas.
• Expensive and not totally reliable.
• Avoids the need to log such updates.
• The sender needs to know the complete set of replicas for a data collection.
• Devices that are currently unavailable miss updates.
• Example: Coda
Sending Updates
Full Replica or Log Exchange
• Simple and robust.
• Pairs of devices periodically exchange replicas or logs.
• Eventual consistency can be achieved.
• Wastes substantial bandwidth, consumes CPU resources, and reduces battery life.
• A more network-efficient protocol may be needed.
Sending Updates
Gossip Protocols
• A device sends a randomly selected update (or set of updates) to some other device.
• One technique: rumor mongering, which disperses hot rumors.
• Replica or logs.
• After many failed attempts to spread it, a rumor is taken off the hot list.
• Alternatively, adopt a fixed expiration period (epidemic algorithm).
• Simple to implement, no device connectivity constraints, small amount of metadata.
• But an update may not reach all replicas.
• Remedy: also send cold rumors.
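A small simulation can make rumor mongering more concrete. This is a simplified sketch under stated assumptions: a single update starts at device 0, a device gives up on the rumor after `max_failures` attempts that reach an already informed partner, and the function name and parameters are invented for this example.

```python
import random


def rumor_mongering(n_devices, max_failures, rng):
    """Spread one update from device 0; return the set of devices that received it."""
    if n_devices < 2:
        return {0}
    informed = {0}
    hot = {0: 0}                       # device -> failed attempts so far
    while hot:
        for sender in list(hot):
            partner = rng.randrange(n_devices)
            if partner == sender:
                continue               # pick again on the next round
            if partner in informed:
                hot[sender] += 1       # the partner already knew the rumor
                if hot[sender] >= max_failures:
                    del hot[sender]    # the rumor goes cold at this device
            else:
                informed.add(partner)
                hot[partner] = 0       # the partner now spreads the hot rumor
    return informed


informed = rumor_mongering(n_devices=20, max_failures=2, rng=random.Random(1))
missed = set(range(20)) - informed
```

Depending on the random choices, `missed` may be non-empty, which is the weakness the slide notes: the rumor can go cold everywhere before every replica has seen it.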
Sending Updates
Message Queue Protocols
• A reliable messaging system can propagate updates.
• Example: IBM’s MQSeries or Microsoft’s SQL Server Service Broker.
• The replica that performs an update operation simply places this update on queues.
• Delivery queues are managed by the messaging system.
• Each replica need not have a direct connection to each other replica.
• The simplest scheme is to have a single multicast tree; all updates are sent to the root of this tree.
• A well-connected replication topology facilitates eventual convergence.
• But broken or unreliable links must be handled.
Sending Updates
Modified Bit Protocol
• Scenario: A cell phone and a home PC share a copy of a person’s address book.
• Each device tracks updates with a modified bit for each item.
• Updating: perform the operation on the local replica and set the item's modified bit.
• During synchronization, exchange all items with a nonzero modified bit and reset the bits.
• Widely used for cell phones and PDAs.
• Provides eventual consistency.
• Simple, low-overhead synchronization technique for systems involving a small number of devices.
• But it works less well for rapidly changing replica sets and does not allow incremental update exchange between devices that do not have an established synchronization partnership.
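The address-book scenario might look roughly like the sketch below. The `Device` class and `synchronize` function are hypothetical, and the conflict rule (the first device wins when both sides changed the same item) is an assumption added only to keep the example complete.

```python
class Device:
    def __init__(self):
        self.items = {}         # key -> value
        self.modified = set()   # keys whose modified bit is set

    def update(self, key, value):
        self.items[key] = value
        self.modified.add(key)


def synchronize(a, b):
    """Exchange all items whose modified bit is set, then reset the bits."""
    from_a = {k: a.items[k] for k in a.modified}
    from_b = {k: b.items[k] for k in b.modified}
    # Assumption: if both sides changed the same item, device a's value wins.
    b.items.update(from_a)
    a.items.update({k: v for k, v in from_b.items() if k not in from_a})
    a.modified.clear()
    b.modified.clear()


phone, pc = Device(), Device()
phone.update("alice", "555-0100")
pc.update("bob", "555-0199")
synchronize(phone, pc)
```

Because the bits are reset after every exchange, the scheme only works between the two partners that share this synchronization state, which matches the limitation listed on the slide.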
Sending Updates
Device–Master Timestamp Protocols
• The device–master model allows a simpler replication protocol.
• The modified-bit protocol could be used, but it requires a lot of bookkeeping.
• The master assigns an update timestamp to each item on a local update or on receiving an update from a client device.
• Each client records the time of its last synchronization with the master.
• No client-specific synchronization state needs to be stored on the master, only per-item update timestamps.
• Alternatively, the master can save the last synchronization time of each replica in order to initiate synchronization.
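The download half of such a protocol could be sketched as follows. The class names are hypothetical, and a logical counter on the master stands in for a real clock; uploads from the client are omitted.

```python
class Master:
    def __init__(self):
        self.clock = 0
        self.items = {}   # key -> (value, update_timestamp)

    def write(self, key, value):
        self.clock += 1   # the master assigns the update timestamp
        self.items[key] = (value, self.clock)

    def changes_since(self, timestamp):
        changed = {k: v for k, (v, ts) in self.items.items() if ts > timestamp}
        return changed, self.clock


class Client:
    def __init__(self):
        self.items = {}
        self.last_sync = 0   # master time of the last synchronization

    def sync(self, master):
        changed, now = master.changes_since(self.last_sync)
        self.items.update(changed)
        self.last_sync = now
        return changed


master, client = Master(), Client()
master.write("a", 1)
master.write("b", 2)
first = client.sync(master)
master.write("a", 3)
second = client.sync(master)
```

The master keeps nothing about the client; the client's own `last_sync` value is enough to select the items it has not yet seen.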
Sending Updates
Device–Master Log-Based Protocol
• Clients can hold update logs.
• Clients store local updates in a log, which is exchanged during synchronization.
• No extra per-item metadata.
• A client can discard its complete log after synchronization, so log management is trivial.
• Variation: the master can hold update logs, but this adds the burden of a shared log.
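A minimal sketch of the client-side log follows. The classes are hypothetical, and the step where the client refreshes its whole copy from the master after uploading is a simplifying assumption, not part of the slide.

```python
class Master:
    def __init__(self):
        self.items = {}

    def apply(self, log):
        for key, value in log:
            self.items[key] = value


class Client:
    def __init__(self):
        self.items, self.log = {}, []

    def update(self, key, value):
        self.items[key] = value
        self.log.append((key, value))    # record the update; no per-item metadata

    def sync(self, master):
        master.apply(self.log)
        self.log.clear()                 # the whole log can be discarded
        self.items = dict(master.items)  # assumption: refresh from the master


master = Master()
phone, tablet = Client(), Client()
phone.update("todo", "buy milk")
tablet.update("memo", "call Bob")
phone.sync(master)
tablet.sync(master)
phone.sync(master)
```

After the final round both clients hold the same items and both logs are empty.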
Sending Updates
Anti-Entropy/Metadata Exchange Protocols
• Used in a peer-to-peer replication model, in which devices can obtain updated items directly from one another.
• Needs more metadata for efficient pairwise synchronization.
• Each device records, for each item, a unique identifier and version metadata (version vectors).
• The protocol is robust enough to operate over lossy wireless networks.
• But exchanging metadata can congest the network, and when updates are infrequent it wastes networking and processing resources.
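Version vectors let two devices decide, per item, which copy is newer or whether the copies conflict. The comparison below is a common formulation, shown as a sketch; the `compare` function and its return labels are names chosen for this example.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dicts mapping replica id -> update counter)."""
    replicas = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(r, 0) > vv_b.get(r, 0) for r in replicas)
    b_ahead = any(vv_b.get(r, 0) > vv_a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"   # concurrent updates: neither copy dominates
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


newer = compare({"A": 2, "B": 1}, {"A": 1, "B": 1})
concurrent = compare({"A": 2}, {"B": 1})
same = compare({"A": 1}, {"A": 1})
```

During pairwise synchronization, only items whose comparison is not "equal" need to be transferred, and "conflict" results go to whatever resolution policy the system uses.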
Sending Updates
Anti-Entropy With Checksums
• A device computes a checksum over the contents of its replica.
• The partner device compares checksums and goes on to a metadata exchange only if needed.
• Variations:
  • Exchange metadata first, checksum later.
  • Peel-back checksumming.
• Reduces network traffic but may increase the overall synchronization latency.
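The checksum comparison could be sketched as below, using SHA-256 from Python's standard `hashlib`. The helper names and the choice to hash `repr` of sorted key/value pairs are assumptions for illustration; a real system would need a canonical serialization.

```python
import hashlib


def replica_checksum(items):
    """Checksum over the replica's contents, independent of insertion order."""
    digest = hashlib.sha256()
    for key in sorted(items):
        digest.update(repr((key, items[key])).encode("utf-8"))
    return digest.hexdigest()


def needs_metadata_exchange(local_items, partner_checksum):
    # Only fall back to the costlier metadata exchange when checksums differ.
    return replica_checksum(local_items) != partner_checksum


phone = {"alice": "555-0100", "bob": "555-0199"}
pc_same = {"bob": "555-0199", "alice": "555-0100"}
pc_changed = {"alice": "555-0100", "bob": "555-0000"}

skip = needs_metadata_exchange(phone, replica_checksum(pc_same))
exchange = needs_metadata_exchange(phone, replica_checksum(pc_changed))
```

When replicas are usually identical, sending one short digest replaces the per-item metadata exchange, at the cost of an extra round trip when they differ.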
Sending Updates
Knowledge-Driven Log-Based Protocols
• Each device maintains knowledge about the set of versions or operations that have been incorporated into its local replica.
• A device’s knowledge is the set of unique identifiers for all operations that are stored in its local log.
• Updating: the operation is performed and its ID is added to the device's knowledge.
• More compact than the anti-entropy protocol.
• A knowledge vector is a data structure containing a set of <replica, accept-stamp> pairs.
• Efficient, robust, and flexible.
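A knowledge vector of <replica, accept-stamp> pairs can drive a one-way synchronization along these lines. The `Device` class and `sync` function are hypothetical, and the sketch assumes each device always receives a given replica's operations in accept-stamp order, so the highest stamp summarizes everything known from that replica.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.counter = 0
        self.log = []         # entries: (replica, accept_stamp, key, value)
        self.knowledge = {}   # replica -> highest accept-stamp known

    def update(self, key, value):
        self.counter += 1
        self.log.append((self.name, self.counter, key, value))
        self.knowledge[self.name] = self.counter

    def unknown_to(self, partner_knowledge):
        # Operations whose accept-stamp exceeds what the partner already knows.
        return [op for op in self.log if op[1] > partner_knowledge.get(op[0], 0)]

    def receive(self, ops):
        for op in ops:
            replica, stamp = op[0], op[1]
            if stamp > self.knowledge.get(replica, 0):
                self.log.append(op)
                self.knowledge[replica] = stamp


def sync(source, target):
    """One-way sync: the target's knowledge selects the operations it is missing."""
    ops = source.unknown_to(target.knowledge)
    target.receive(ops)
    return ops


a, b, c = Device("A"), Device("B"), Device("C")
a.update("x", 1)
a.update("y", 2)
first = sync(a, b)
second = sync(a, b)    # nothing new to send
relayed = sync(b, c)   # c learns A's operations through b
```

The knowledge vector is the only synchronization state exchanged, and it works between any pair of devices, including ones that have never synchronized with each other before.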
Sending Updates
Knowledge-Driven State-Based Protocols
• Suited to mobile devices that are not equipped to store or manage any sort of operation log.
• Each item in a replica is stored with a unique version number that is updated whenever the item is updated.
• Each entry in a device’s knowledge vector is a <replica, version number> pair.
• Knowledge exceptions are exchanged along with the knowledge vector.
• But a knowledge exception may stay indefinitely.
Ordering Updates
Schemes:
• Ordered Delivery
• Sequencers
• Update Timestamps
• Update Counters
• Version Vectors
• Operation Transformation
Other Ordering Issues
References
• http://pooh.poly.asu.edu/Mobile/ClassNotes/Papers/ReplicatedData/ReplicatedDataManagementForMobileComputing.pdf