Efficient Inconsistency Detection in Internet-Scale Systems

Explore a low-cost framework for real-time detection and resolution of data inconsistencies in large-scale systems. The proposed approach eliminates the need for predefined protocols, ensuring timely detection and versatile application support.

asheehan

Presentation Transcript


  1. An Efficient, Low-Cost Inconsistency Detection Framework for Data and Service Sharing in an Internet-Scale System Yijun Lu†, Hong Jiang†, and Dan Feng* †University of Nebraska-Lincoln, USA *Huazhong University of Science and Technology, China

  2. Introduction • Consistency control is important • Active replication is essential to data security • Systems need to handle updates • Thus, consistency needs to be maintained • Challenges • Consistency requirements are difficult to predict • Overhead of maintaining consistency is high • In Grid-like systems, the network is unreliable

  3. Two Flavors: • Inconsistency avoidance • Avoids inconsistency in the first place, but incurs a high maintenance cost and supports a specific application • Examples: • Strong consistency • NFS consistency • etc. • Optimistic consistency protocol? • Pre-defined • Inconsistency detection • Our new approach • There is no need to define consistency protocols

  4. Inconsistency Detection • Features • No need to pre-define a consistency level • Detect inconsistencies among nodes in a timely manner • Resolve inconsistencies based on application semantics • Advantages • Efficient: Timely inconsistency detection • Low-cost: No prohibitive cost associated with a given consistency protocol • Versatile: Several applications with different consistency requirements can run simultaneously

  5. Overview of IDF

  6. Efficient Detection • Focus of this paper

  7. Outline • Background • Design • Evaluation • Inconsistency resolution • Related work • Current status

  8. Background • RanSub • Locates disjoint content within a system • Two processes: collect/distribute • Used to exchange information among nodes • Gossip-based data dissemination • A node disseminates non-duplicate packets to a random set of neighbors every T seconds • Each message travels a certain number of hops • Used to distribute updates
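
The gossip-based dissemination on this slide can be sketched roughly as follows. This is a simplified, round-based simulation and not the authors' implementation: the function name `gossip`, the parameters `fanout` and `max_hops`, and the `neighbors` overlay are all hypothetical, and each loop iteration stands in for one period of T seconds.

```python
import random

def gossip(neighbors, origin, fanout=2, max_hops=4, seed=0):
    """Simulate spreading one update; return the set of nodes that received it.

    neighbors: dict mapping node -> list of neighbor nodes (hypothetical overlay).
    Each round stands in for one gossip period of T seconds.
    """
    rng = random.Random(seed)
    received = {origin}          # nodes that already hold the update
    frontier = [origin]          # nodes that first got it in the previous round
    for _ in range(max_hops):    # a message travels at most max_hops hops
        next_frontier = []
        for node in frontier:
            peers = neighbors.get(node, [])
            # pick a random subset of neighbors to send to
            for peer in rng.sample(peers, min(fanout, len(peers))):
                if peer not in received:   # duplicates are dropped, not re-forwarded
                    received.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
    return received
```

With a chain overlay `{0: [1], 1: [2], 2: [3], 3: [4], 4: []}`, a fanout of 1, and a hop limit of 2, the update started at node 0 reaches nodes 0, 1, and 2 only, which shows how the hop limit bounds the spread.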

  9. Design of Timely Detection • Basic idea • Two layers • Top layer captures most inconsistencies fast • Bottom layer catches all the inconsistencies that the top layer misses • Terms • Temperature: how frequently a user updates a certain file in a period of time.

  10. 1. Measure the Updating Patterns • Importance • Use nodes’ updating patterns as an indicator of their interest in a certain file, called temperature. • The higher the temperature, the more likely a node is the “trouble maker” that causes most inconsistencies. • Strategy • A node tracks its updating history for a certain file during a certain period of time.
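
One way a node might track its own updating history is a sliding-window counter per file. The sketch below is illustrative only: the class name `TemperatureTracker`, the `window_seconds` parameter, and the choice to count updates within a fixed window are assumptions, since the slides do not specify how temperature is computed.

```python
from collections import defaultdict, deque

class TemperatureTracker:
    """Tracks a node's own update history per file over a sliding window."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self._updates = defaultdict(deque)   # file name -> update timestamps

    def record_update(self, filename, now):
        """Remember that this node updated `filename` at time `now`."""
        self._updates[filename].append(now)

    def temperature(self, filename, now):
        """Number of updates to `filename` in the last `window` seconds."""
        stamps = self._updates[filename]
        while stamps and stamps[0] <= now - self.window:
            stamps.popleft()                 # forget updates outside the window
        return len(stamps)
```

A file that the node updates often within the window reports a high temperature; a file it never touched reports zero.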

  11. 2. Learning the Updating Patterns • Use RanSub • Collect nodes’ updating patterns • Each node learns a random disjoint set with each distribution • Possible improvement • RanSub uses a single multicasting tree • A single tree cannot tolerate the failure of even one interior node • Deploy a multicasting forest?

  12. 3. Temperature Collection/Dist. • Why does this matter? • Network bandwidth cost could be prohibitive • Consider the total number of files on a computer • Interest-group based approach • Nodes only report the temperature of files that they are interested in. • In distribution, an interior node only relays the temperature of files that nodes in its sub-tree are interested in • Result • It can be supported by any connectivity, including a dial-up connection.
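
The interest-group filtering above amounts to two small filters, one at the reporting node and one at each interior node of the tree. The following is a minimal sketch under assumed data shapes: temperatures are a dict from file name to a number, interests are sets of file names, and the function names `report` and `relay` are hypothetical.

```python
def report(own_temperatures, interests):
    """A node reports temperatures only for files it is interested in."""
    return {f: t for f, t in own_temperatures.items() if f in interests}

def relay(temperatures, subtree_interests):
    """An interior node forwards only entries that some node in its sub-tree wants.

    subtree_interests: iterable of interest sets, one per node in the sub-tree.
    """
    wanted = set().union(*subtree_interests)   # union of all sub-tree interests
    return {f: t for f, t in temperatures.items() if f in wanted}
```

Because entries nobody downstream cares about are dropped at each hop, the message size depends on the sub-tree's interests and not on the total number of files in the system.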

  13. 4. Two-layer detection • Two layers • Solid line: top layer • Dotted line: bottom layer • Version vector is used to detect inconsistencies • Mechanism • Traverse the top layer first • If no inconsistency is found in the top layer • Go to the bottom layer • An example: (figure omitted)
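
The mechanism can be sketched with a standard version-vector comparison followed by a top-then-bottom search. This is an illustration under stated assumptions, not the paper's algorithm: version vectors are modeled as dicts from node id to update counter, an inconsistency is taken to mean two concurrent (conflicting) vectors, and the names `compare` and `detect` are hypothetical.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dict: node id -> update counter).

    Returns "equal", "a_newer", "b_newer", or "conflict".
    """
    nodes = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(n, 0) > vv_b.get(n, 0) for n in nodes)
    b_ahead = any(vv_b.get(n, 0) > vv_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"        # concurrent updates: neither vector dominates
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"

def detect(local_vv, top_layer, bottom_layer):
    """Check the top layer first; fall back to the bottom layer.

    top_layer / bottom_layer: dict mapping node id -> that node's version vector.
    Returns (layer_name, node_id) for the first conflict found, else None.
    Assumption: only concurrent vectors count as an inconsistency here.
    """
    for name, layer in (("top", top_layer), ("bottom", bottom_layer)):
        for node, vv in layer.items():
            if compare(local_vv, vv) == "conflict":
                return (name, node)
    return None
```

If the high-temperature nodes in the top layer hold the conflicting replica, the search stops there and the bottom layer is never visited.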

  14. 5. Caching & Garbage Collection • Caching • Cache temperature information • Cache routing information for the top layer, so that smarter decisions can be made to save traversal time • Garbage collection • Keep the temperature fresh • Assign a time stamp to each piece of temperature information • A piece of temperature information expires when it is older than a threshold.
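
Timestamp-based expiry of cached temperature information might look like the sketch below. The class name `TemperatureCache`, the `max_age` threshold, and the key layout are hypothetical; the slides only state that each entry carries a time stamp and expires once it is older than a threshold.

```python
class TemperatureCache:
    """Caches (temperature, timestamp) per (node, file) and expires stale entries."""

    def __init__(self, max_age):
        self.max_age = max_age
        self._entries = {}       # (node, filename) -> (temperature, timestamp)

    def put(self, node, filename, temperature, now):
        """Store or refresh an entry, stamping it with the current time."""
        self._entries[(node, filename)] = (temperature, now)

    def get(self, node, filename, now):
        """Return the cached temperature, or None if missing or expired."""
        entry = self._entries.get((node, filename))
        if entry is None or now - entry[1] > self.max_age:
            return None
        return entry[0]

    def collect_garbage(self, now):
        """Drop every entry older than max_age; return how many were removed."""
        stale = [k for k, (_, ts) in self._entries.items() if now - ts > self.max_age]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

Checking the age in `get` keeps reads correct between garbage-collection passes, while `collect_garbage` bounds the memory the cache uses.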

  15. 6. Discussion • Until now, we have treated the term “update” generically • Only one kind of “update” • In fact, several forms of update exist • Creating • Modifying • Deleting • The distinction does not matter in the detection part, but does matter when we design the APIs for applications

  16. Evaluation 1: Failure rate • Why do we care about it? • Top layer detects inconsistencies much faster than bottom layer • It is desirable that most inconsistencies are captured by the top layer • Analysis result • In the worst-case scenario, two sub-cases exist • Case 1: failure rate 0.04% • Case 2: failure rate 18.9% • See paper for clarification • Main message • Top layer captures the vast majority of inconsistencies!

  17. Evaluation 2: Maintenance Cost • Metric • Number of messages each node receives as a result of the maintenance process • Simulation setup • 1000 nodes in the network. • Simulation runs for 800 seconds. • Result • Max bandwidth cost: < 6 KB/s

  18. Inconsistency Resolution • Overview • Utilize detection result • Support multiple applications with different requirements for consistency control • Semantic-based resolution (ongoing & future work) • Get semantics • Hint-based • Middleware detection • Resolution schemes • Middleware automatically resolves inconsistency • Ask users’ preference before reacting

  19. Related Work • TACT • Explores the trade-off between consistency level and performance • DENO • Peer-to-peer scheme, yet it maintains strong consistency • Lpbcast • Pure gossip-based protocol • Quorum system • Could fail in the presence of node failures

  20. Current Status • Dealing with inconsistency resolution • Support applications. • Implementing a prototype on PlanetLab • Investigating the implications of the new framework for large-scale distributed systems in general
