Efficient Inconsistency Detection in Internet-Scale Systems

An Efficient, Low-Cost Inconsistency Detection Framework for Data and Service Sharing in an Internet-Scale System Yijun Lu†, Hong Jiang†, and Dan Feng* †University of Nebraska-Lincoln, USA *Huazhong University of Science and Technology, China

Introduction • Consistency control is important • Active replication is essential to data security • Systems need to handle updates • Thus, consistency needs to be maintained • Challenges • Consistency requirements are difficult to predict • Overhead to maintain consistency is high • In Grid-like systems, the network is unreliable

Two Flavors: • Inconsistency avoidance • Avoids inconsistency in the first place, but incurs high maintenance cost and supports a specific application. • Examples: • Strong consistency • NFS consistency • etc. • Optimistic consistency protocol? • Pre-defined • Inconsistency detection • Our new approach • There is no need to define consistency protocols

Inconsistency Detection • Features • No need to pre-define consistency level • Detect inconsistency among nodes in a timely manner • Resolve inconsistencies based on application semantics • Advantages • Efficient: Timely inconsistency detection • Low-cost: No prohibitive cost associated with a given consistency protocol • Versatile: Several applications with different consistency requirements can run simultaneously

Overview of IDF

Efficient Detection • Focus of this paper

Outline • Background • Design • Evaluation • Inconsistency resolution • Related work • Current status

Background • RanSub • Locate disjoint content within a system • Two processes: collect/distribute • Used to exchange information among nodes • Gossip-based data dissemination • A node disseminates non-duplicate packets to a random set of neighbors every T seconds. • Each message travels a certain number of hops • Used to distribute updates
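The gossip step described above can be sketched as a small round-based simulation. This is a minimal illustration only, not the framework's implementation: the class `GossipNode`, the function `simulate`, and the parameters `fanout`, `max_hops`, and `seed` are hypothetical names, each loop iteration stands in for one period of T seconds, and every node is assumed to know every other node.

```python
import random


class GossipNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = []   # other GossipNode objects this node may contact
        self.seen = set()     # packet ids already received (duplicate filter)
        self.outbox = []      # (packet_id, hops_left) queued for the next round

    def receive(self, packet_id, hops_left):
        # Drop duplicates so only non-duplicate packets are forwarded.
        if packet_id in self.seen:
            return
        self.seen.add(packet_id)
        if hops_left > 0:
            self.outbox.append((packet_id, hops_left))

    def take_outbox(self):
        pending, self.outbox = self.outbox, []
        return pending

    def send(self, pending, fanout, rng):
        # Forward each queued packet to a random set of neighbors.
        for packet_id, hops_left in pending:
            count = min(fanout, len(self.neighbors))
            for peer in rng.sample(self.neighbors, count):
                peer.receive(packet_id, hops_left - 1)


def simulate(num_nodes=20, fanout=3, max_hops=4, seed=1):
    """Return how many nodes received one update after max_hops rounds."""
    rng = random.Random(seed)
    nodes = [GossipNode(i) for i in range(num_nodes)]
    for node in nodes:
        node.neighbors = [other for other in nodes if other is not node]
    nodes[0].receive("update-1", max_hops)
    for _ in range(max_hops):  # one iteration models one period of T seconds
        # Snapshot all outboxes first so a packet moves one hop per round.
        batches = [(node, node.take_outbox()) for node in nodes]
        for node, pending in batches:
            node.send(pending, fanout, rng)
    return sum("update-1" in node.seen for node in nodes)
```

Calling `simulate(fanout=3, max_hops=1)` reaches the origin plus three neighbors; larger hop limits trade more messages for wider coverage.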

Design of Timely Detection • Basic idea • Two layers • Top layer captures most inconsistencies fast • Bottom layer catches all the missed inconsistencies • Terms • Temperature: the frequency with which a user updates a certain file in a period of time.
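One way to picture the two layers is to rank nodes by temperature for a given file and place the hottest ones in the top layer. The slides do not state how the layers are formed, so this is an assumption made purely for illustration; `split_layers` and `top_size` are hypothetical names.

```python
def split_layers(temperatures, top_size):
    """Split node ids into (top_layer, bottom_layer) for one file.

    temperatures maps node id -> update frequency for that file.
    Assumption: the top layer is simply the top_size hottest nodes.
    Ties are broken by node id so the result is deterministic.
    """
    ranked = sorted(temperatures, key=lambda node: (-temperatures[node], node))
    return ranked[:top_size], ranked[top_size:]


top, bottom = split_layers({"n1": 5, "n2": 0, "n3": 9, "n4": 5}, top_size=2)
# top holds the two hottest nodes; bottom holds the rest.
```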

1. Measure the Updating Patterns • Importance • Use nodes’ updating patterns as an indicator of their interest in a certain file, called temperature. • The higher the temperature, the more likely a node is the “trouble maker” that causes most inconsistencies. • Strategy • A node tracks its updating history for a certain file during a certain period of time.
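The tracking strategy above can be sketched as a sliding-window counter per file. This is a minimal sketch under stated assumptions: temperature is taken to be the number of local updates within the last `window_seconds`, and the names `TemperatureTracker`, `record_update`, and `temperature` are hypothetical.

```python
from collections import defaultdict, deque


class TemperatureTracker:
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.history = defaultdict(deque)  # file id -> update timestamps

    def record_update(self, file_id, now):
        # Timestamps are assumed to arrive in non-decreasing order.
        self.history[file_id].append(now)

    def temperature(self, file_id, now):
        """Number of updates to file_id within the last window."""
        times = self.history[file_id]
        # Forget updates that have fallen out of the observation window.
        while times and times[0] <= now - self.window:
            times.popleft()
        return len(times)


tracker = TemperatureTracker(window_seconds=60)
for t in (0, 10, 50):
    tracker.record_update("report.txt", now=t)
```

With this history, the temperature of `report.txt` is 3 at time 55 and drops to 2 at time 65, once the first update leaves the window.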

2. Learning the Updating Patterns • Use RanSub • Collect nodes’ updating patterns • Each node learns a random disjoint set with each distribution • Possible improvement • RanSub uses a single multicasting tree • This cannot tolerate a single interior node failure • Deploy a multicasting forest?

3. Temperature Collection/Dist. • Why does this matter? • Network bandwidth cost could be prohibitive • Think of the total number of files on a computer • Interest-group based approach • Nodes only report the temperature of files that they are interested in. • In distribution, an interior node only relays the temperature of files that nodes in its sub-tree are interested in • Result • It can be supported by any connectivity, including a dial-up connection.
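The interest-group filtering during distribution can be sketched as follows. This is an illustrative model, not the RanSub protocol itself: `TreeNode`, `subtree_interests`, and `distribute` are hypothetical names, and the interest sets are recomputed on every call for brevity rather than cached.

```python
class TreeNode:
    def __init__(self, name, interests, children=()):
        self.name = name
        self.interests = set(interests)  # files this node cares about
        self.children = list(children)
        self.received = {}               # temperatures kept by this node

    def subtree_interests(self):
        # Union of the interests of this node and all of its descendants.
        result = set(self.interests)
        for child in self.children:
            result |= child.subtree_interests()
        return result

    def distribute(self, temperatures):
        # Keep only what this node is interested in.
        self.received = {f: t for f, t in temperatures.items()
                         if f in self.interests}
        # Relay to each child only what someone in its sub-tree wants.
        for child in self.children:
            wanted = child.subtree_interests()
            child.distribute({f: t for f, t in temperatures.items()
                              if f in wanted})


leaf = TreeNode("leaf", {"c"})
left = TreeNode("left", {"b"}, [leaf])
right = TreeNode("right", {"a"})
root = TreeNode("root", {"a"}, [left, right])
root.distribute({"a": 5, "b": 2, "c": 7, "d": 1})
```

In this example the temperature of file `d` never leaves the root, because no node in either sub-tree is interested in it.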

4. Two-layer detection • Two layers • Solid line: top layer • Dotted line: bottom layer • Version vector is used to detect inconsistencies • Mechanism • Traverse the top layer first • If no inconsistency is found in the top layer • Go to the bottom layer • Example: [figure]
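The mechanism above can be sketched with a standard version-vector comparison. This is a hedged illustration rather than the framework's code: it assumes that an inconsistency means two replicas hold concurrent (mutually non-dominating) version vectors, and the names `compare` and `detect` are hypothetical.

```python
def compare(vv_a, vv_b):
    """Compare two version vectors (dicts of node id -> update counter)."""
    nodes = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(n, 0) > vv_b.get(n, 0) for n in nodes)
    b_ahead = any(vv_b.get(n, 0) > vv_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither replica has seen all of the other's updates
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def detect(local_vv, top_layer, bottom_layer):
    """Check the top layer first; fall back to the bottom layer only if clean.

    Each layer maps peer id -> that peer's version vector for the file.
    Returns (layer_name, conflicting_peer_ids), or (None, []) if none found.
    """
    for layer_name, peers in (("top", top_layer), ("bottom", bottom_layer)):
        conflicts = [peer for peer, vv in peers.items()
                     if compare(local_vv, vv) == "concurrent"]
        if conflicts:
            return layer_name, sorted(conflicts)
    return None, []
```

Returning as soon as the top layer reports a conflict reflects the intent that most inconsistencies are caught without ever visiting the slower bottom layer.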

5. Caching & Garbage Collection • Caching • Cache temperature information • Cache routing information for the top layer so that smart decisions can be made to save traversal time • Garbage collection • Keep the temperature fresh • Assign a time stamp to each piece of temperature information • Temperature information expires when it is older than a threshold.
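The time-stamp-based expiry can be sketched as a small cache. This is a minimal sketch, assuming an entry expires when its age strictly exceeds a threshold `max_age`; the class `TemperatureCache` and its methods are hypothetical names, not part of the framework.

```python
class TemperatureCache:
    def __init__(self, max_age):
        self.max_age = max_age
        self.entries = {}  # (node id, file id) -> (temperature, time stamp)

    def put(self, node_id, file_id, temperature, now):
        self.entries[(node_id, file_id)] = (temperature, now)

    def get(self, node_id, file_id, now):
        """Return the cached temperature, or None if missing or expired."""
        entry = self.entries.get((node_id, file_id))
        if entry is None:
            return None
        temperature, stamp = entry
        if now - stamp > self.max_age:
            return None
        return temperature

    def collect_garbage(self, now):
        """Delete expired entries and return how many were removed."""
        expired = [key for key, (_, stamp) in self.entries.items()
                   if now - stamp > self.max_age]
        for key in expired:
            del self.entries[key]
        return len(expired)


cache = TemperatureCache(max_age=30)
cache.put("n1", "report.txt", temperature=4, now=0)
cache.put("n2", "report.txt", temperature=1, now=20)
```

Here `get` never mutates the cache, so stale entries linger until `collect_garbage` runs; a real system might instead evict on read.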

6. Discussion • Until now, we have treated the term “update” generically • Only one kind of “update” • In fact, several forms of update exist • Creating • Modifying • Deleting • The distinction does not matter in the detection part, but does matter when we design the APIs for applications

Evaluation 1: Failure rate • Why do we care about it? • Top layer detects inconsistencies much faster than bottom layer • It is desirable that most inconsistencies are captured by the top layer • Analysis result • In the worst-case scenario, two sub-cases exist • Case 1: failure rate 0.04% • Case 2: failure rate 18.9% • See paper for clarification • Main message • Top layer captures the vast majority of inconsistencies!

Evaluation 2: Maintenance Cost • Metric • # of messages received by each node as a result of the maintenance process • Simulation setup • 1000 nodes in the network. • Simulation runs for 800 seconds. • Result • Max bandwidth cost: < 6KB/s

Inconsistency Resolution • Overview • Utilize detection result • Support multiple applications with different requirements for consistency control • Semantic-based resolution (ongoing & future work) • Get semantics • Hint-based • Middleware detection • Resolution schemes • Middleware automatically resolves inconsistency • Ask users’ preference before reacting

Related Work • TACT • Explore trade-off between consistency level and performance • DENO • Peer-to-Peer scheme, yet to maintain strong consistency • Lpbcast • Pure gossip-based protocol • Quorum system • Could fail in the presence of node failures

Current Status • Dealing with inconsistency resolution • Support applications. • Implementing a prototype on PlanetLab • Investigating the implications of the new framework for large-scale distributed systems in general
