
A Scalable Distributed Information Management System (SDIMS)

A Scalable Distributed Information Management System (SDIMS). P. Yalagandula, M. Dahlin cs.utexas.edu SIGCOMM 2004. Outline. Introduction Goal : Aggregation Innovation Flexibility Scalability Robustness Implementation Evaluation Conclusions. Introduction. Why SDIMS ?


Presentation Transcript


  1. A Scalable Distributed Information Management System (SDIMS) P. Yalagandula, M. Dahlin cs.utexas.edu SIGCOMM 2004

  2. Outline • Introduction • Goal : Aggregation • Innovation • Flexibility • Scalability • Robustness • Implementation • Evaluation • Conclusions

  3. Introduction • Why SDIMS? • Monitoring, querying, and reacting to changes are core components of applications such as system management, service placement, data sharing, and caching. • An SDIMS in a networked system would provide a distributed operating system backbone and facilitate the development and deployment of new distributed services.

  4. Introduction (cont.) • Fundamental technique • Hierarchical aggregation • A node accesses detailed views of nearby information and summary views of global information. • A hierarchical system aggregates information through reduction trees.

  5. Introduction (cont.) • An SDIMS should have four properties. • Scalability • Flexibility • Administrative isolation • Robustness

  6. Scalability • SDIMS should accommodate large numbers of nodes. • SDIMS should allow applications to install and monitor large numbers of data attributes.

  7. Flexibility • SDIMS should accommodate a range of applications and attributes. • Read-dominated attributes (rarely change) • Number of CPUs • Write-dominated attributes (change often) • Number of processes • SDIMS should leave the policy decision of tuning replication to applications.

  8. Administrative isolation • Nodes can be arranged in an organizational or administrative hierarchy. • Domain-based control. • Monitor • Query

  9. Robustness • SDIMS should adapt to reconfigurations in a timely fashion when nodes fail or disconnect. • SDIMS should provide mechanisms so that applications can trade off the cost of adaptation against the consistency level of aggregated results when reconfigurations occur.

  10. Related Works • Astrolabe • A single logical aggregation tree that mirrors a system's administrative hierarchy. • A general interface for installing new aggregation functions. • An unstructured gossip protocol for disseminating information and replicating all aggregated attribute values for a sub-tree to all nodes in the sub-tree.

  11. Related Works (cont.) • Any node can answer queries using local information. • Not scalable (replication). • Not flexible (type of attribute). • Solution: P2P (DHT)

  12. Tree • For each level in the hierarchy, the agent maintains a record with the list of child zones (and their attributes), and which child zone represents its own zone (self).

  13. Gossip protocol • Periodically, each agent selects some other agent at random and exchanges state information with it. • If the two agents are in the same zone, the state exchanged relates to MIBs in that zone. • If the two agents are in different zones, they exchange state associated with the MIBs of their least common ancestor zone.
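
The zone-selection rule on this slide can be sketched as follows. This is a minimal illustration, not Astrolabe's actual code: zones are assumed to be written as paths from the root, and the function name and the example zone names are hypothetical.

```python
def zone_to_exchange(path_a, path_b):
    """Return the zone whose MIBs two gossiping agents exchange.

    Each path lists zone names from the root down to the agent's own zone
    (an assumed encoding). The result is the deepest zone that both paths
    share, i.e. the least common ancestor zone. When both agents sit in
    the same zone, that zone itself is returned.
    """
    common = []
    for zone_a, zone_b in zip(path_a, path_b):
        if zone_a != zone_b:
            break
        common.append(zone_a)
    return tuple(common)


# Two agents in different departments of the same university.
print(zone_to_exchange(("root", "univ", "cs"), ("root", "univ", "ee")))
```

Treating the same-zone case as a special case of the least-common-ancestor rule keeps the sketch to a single code path.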

  14. Related Works (cont.) • DHT • SkipNet, CAN, Pastry, Chord, Tapestry

  15. Problem • How to scalably map different attributes to different aggregation trees in a DHT mesh? (physical network vs. overlay network) • How to provide flexibility in the aggregation to accommodate different application requirements? (flexible API for installing and controlling the system)

  16. Problem (cont.) • How to adapt a DHT mesh to attain the administrative isolation property? (virtual organization) • How to provide robustness without unstructured gossip and total replication? (caching; pre-computed or on-demand re-aggregation)

  17. Aggregation Abstraction

  18. Aggregation Abstraction • Each physical node in the system is a leaf in the tree. • An internal non-leaf node, which we call a virtual node, is simulated by one or more physical nodes at the leaves of the sub-tree for which the virtual node is the root.

  19. Aggregation Abstraction (cont.) • Each physical node has local data stored as a set of (attributeType, attributeName, value) tuples. • The system associates an aggregation function $f_{\mathrm{type}}$ with each attribute type.

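  20. Aggregation Abstraction (cont.) • Each level-i sub-tree $T_i$ in the system has an aggregate value $V_{i,\mathrm{type},\mathrm{name}}$ for each (attributeType, attributeName) pair. • The aggregate value for a level-i sub-tree $T_i$ is the aggregation function for the type, $f_{\mathrm{type}}$, computed across the aggregate values of each of $T_i$'s k children: $V_{i,\mathrm{type},\mathrm{name}} = f_{\mathrm{type}}(V^{0}_{i-1,\mathrm{type},\mathrm{name}}, V^{1}_{i-1,\mathrm{type},\mathrm{name}}, \ldots, V^{k-1}_{i-1,\mathrm{type},\mathrm{name}})$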
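
A minimal sketch of this recursive definition, assuming an in-memory tree rather than a distributed one. The `Node` class, the `AGG_FUNCTIONS` table, and the attribute type names `"sum"` and `"max"` are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Leaves (physical nodes) hold local data: (attr_type, attr_name) -> value.
    local: dict = field(default_factory=dict)
    # Internal (virtual) nodes hold children instead.
    children: list = field(default_factory=list)


# One aggregation function per attribute type (assumed type names).
AGG_FUNCTIONS = {"sum": sum, "max": max}


def aggregate(node, attr_type, attr_name):
    """Compute the aggregate for the sub-tree rooted at `node`."""
    if not node.children:
        return node.local.get((attr_type, attr_name))
    child_values = [aggregate(c, attr_type, attr_name) for c in node.children]
    child_values = [v for v in child_values if v is not None]
    if not child_values:
        return None
    return AGG_FUNCTIONS[attr_type](child_values)


leaves = [Node(local={("sum", "numProcs"): n}) for n in (1, 2, 3, 4)]
root = Node(children=[Node(children=leaves[:2]), Node(children=leaves[2:])])
print(aggregate(root, "sum", "numProcs"))
```

Each internal node only sees its children's aggregates, never the raw leaf values, which mirrors the level-by-level definition on the slide.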

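  21. Aggregation Abstraction (cont.) • Examples of $f_{\mathrm{type}}$ • $\mathrm{AVG}(V_1, \ldots, V_n) = \frac{1}{n}\sum_{i=1}^{n} V_i$: incorrect • $\mathrm{SUM}(V_1, \ldots, V_n) = \sum_{i=1}^{n} V_i$: correct • Aggregation functions must satisfy the hierarchical computation property.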
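
The following sketch illustrates why a plain average breaks when applied level by level while a sum does not. It also shows one common workaround, carrying (sum, count) pairs up the tree; that workaround and the function names are illustrative assumptions, not something stated on the slide.

```python
def naive_avg(values):
    return sum(values) / len(values)


def merge_avg_partials(partials):
    """Combine (sum, count) pairs from children into one (sum, count) pair."""
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return (total, count)


# Two sub-trees of unequal size.
left, right = [1, 2, 3], [10]

# SUM: aggregating per-subtree sums equals the sum over all leaves.
assert sum([sum(left), sum(right)]) == sum(left + right)

# AVG: the average of per-subtree averages is not the global average.
avg_of_avgs = naive_avg([naive_avg(left), naive_avg(right)])
true_avg = naive_avg(left + right)
print(avg_of_avgs, true_avg)  # prints "6.0 4.0"

# Carrying (sum, count) pairs restores hierarchical computability.
total, count = merge_avg_partials([(sum(left), len(left)), (sum(right), len(right))])
print(total / count)  # prints "4.0"
```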

  22. Aggregation Abstraction (cont.) • [Figure: aggregation tree; legend: node, virtual node]

  23. Innovation • Flexibility • Scalability • Administrative isolation • Robustness

  24. Flexibility • Operation API • Install • Update • Probe

  25. Install Operation • The Install operation installs an aggregation function in the system.

  26. Probe Operation • Used to force reconfiguration and refresh all caches.

  27. Probe Operation (cont.) • When node A issues a continuous probe at level l for an attribute, updates for the attribute at any node in the subtree of A’s level-l ancestor are aggregated up to level l and propagated down along the path from the ancestor to A.

  28. Update and Probe Operation

  29. Update and Probe Operation (cont.)

  30. Update Operation API • Update-Up-k-Down-j: aggregates an update up to the k-th level and propagates the aggregate values of a node at level l (l ≤ k) downward for j levels.
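
As a rough illustration of the up-k, down-j rule above, the sketch below lists which tree levels are touched by one update. It assumes leaves are level 0 and that downward propagation stops at level 0; both function names are hypothetical.

```python
def up_levels(k):
    """Levels at which an update from a leaf is aggregated, 1 through k."""
    return list(range(1, k + 1))


def down_levels(l, j, k):
    """Levels that receive the aggregate of a level-l node (l <= k).

    The aggregate travels downward for j levels but never below level 0.
    """
    if l > k:
        raise ValueError("only levels up to k hold an aggregate to propagate")
    lowest = max(l - j, 0)
    return list(range(l - 1, lowest - 1, -1))


print(up_levels(4))          # prints "[1, 2, 3, 4]"
print(down_levels(3, 2, 4))  # prints "[2, 1]"
print(down_levels(3, 0, 4))  # prints "[]"
```

With j = 0 nothing is pushed down, so readers must fetch aggregates on demand; larger j trades more update messages for cheaper reads.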

  31. Operation API • [Figure: Update-Up-k-Down-j on a tree with Level-0 through Level-4, marking k, l, and j]

  32. Dynamic Adaptation • An SDIMS implementation can dynamically adjust its up/down strategies for an attribute based on its measured read/write frequency.

  33. Scalability • SDIMS defines the aggregation abstraction to mesh with its underlying scalable DHT system. • SDIMS refines the basic DHT abstraction to form an Autonomous DHT (ADHT) to achieve the administrative isolation properties

  34. Mapping to DHT • [Figure]

  35. Mapping to DHT • An attribute is aggregated along the aggregation tree corresponding to DHTtree_k, where k = hash(attribute type, attribute name). • Different attributes will be aggregated along different trees.
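
A minimal sketch of the key computation, assuming SHA-1 as the hash and a 32-bit key space. The slide names neither, and the function name and separator byte are also illustrative choices.

```python
import hashlib

KEY_BITS = 32  # assumed key width; the slide does not specify one


def attribute_key(attr_type, attr_name, bits=KEY_BITS):
    """Map an (attribute type, attribute name) pair to a DHT key k."""
    # A separator byte keeps ("a", "bc") and ("ab", "c") from colliding.
    data = f"{attr_type}\x00{attr_name}".encode("utf-8")
    digest = hashlib.sha1(data).digest()  # 160-bit digest
    return int.from_bytes(digest, "big") >> (160 - bits)


k_cpu = attribute_key("sum", "numCPUs")
k_proc = attribute_key("sum", "numProcs")
print(k_cpu != k_proc)
```

Because each attribute hashes to its own key, and each key selects its own tree, the aggregation load for different attributes is spread over different root nodes.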

  36. Administrative isolation • For security • Updates and Probes are not accessible outside the domain • For availability • Queries for values in a domain are not affected by failures of nodes in other domains • For efficiency • Domain-scoped queries can be simple and efficient.

  37. Administrative isolation • Autonomous DHT • Path Locality: Search paths should always be contained in the smallest possible domain. • Path Convergence: Search paths for a key from different nodes in a domain should converge at a node in that domain.

  38. Administrative isolation • [Figure: domains univ. and dept.; L0: host, L2: univ.; paths should merge; the isolation property is violated]

  39. Administrative isolation • [Figure: Autonomous DHT; domains dept. and univ.; L0: host, L2: dept.]

  40. Robustness • ADHT • Distributed Computing (?) • Aggregation Management Layer (AML) • Lazy re-aggregation • On-demand Re-aggregation • Replication in Space

  41. Two-layer architecture: ADHT and AML • The ADHT layer informs the AML layer about reconfigurations in the network. • NewParent • FailedChild • NewChild

  42. Implementation • Different overlay (?)

  43. MIB • Child MIBs containing raw aggregate values gathered from children. • Reduction MIB containing locally aggregated values across this raw information • Ancestor MIB containing aggregate values scattered down from ancestors.
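
The three MIB kinds might be held per attribute roughly as below. This is a sketch under assumptions: the class name, field names, and method names are hypothetical, and real nodes would exchange these values over the network rather than through direct calls.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeState:
    # Child MIBs: raw aggregate values gathered from children, by child id.
    child_mibs: dict = field(default_factory=dict)
    # Reduction MIB: the locally aggregated value across the child MIBs.
    reduction_mib: object = None
    # Ancestor MIBs: aggregate values scattered down from ancestors, by level.
    ancestor_mibs: dict = field(default_factory=dict)

    def on_child_update(self, child_id, value, aggregate):
        """Store a child's value and recompute the reduction MIB."""
        self.child_mibs[child_id] = value
        self.reduction_mib = aggregate(list(self.child_mibs.values()))
        return self.reduction_mib

    def on_ancestor_update(self, level, value):
        """Cache an aggregate pushed down from the ancestor at `level`."""
        self.ancestor_mibs[level] = value


state = AttributeState()
state.on_child_update("child-a", 3, sum)
state.on_child_update("child-b", 4, sum)
print(state.reduction_mib)  # prints "7"
```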

  44. Implementation • [Figure: parent and child]

  45. Implementation (cont.) • Attribute key: used by the aggregation function to retrieve data. • (attribute type, attribute name)

  46. Implementation (cont.) • A node acts • as a leaf for all attribute keys • as a level-1 subtree root for keys whose hash matches the node’s ID in the first b prefix bits • as a level-i subtree root for keys whose hash matches the node’s ID in the initial i * b bits • as the system’s global root for attribute keys whose hash matches the node’s ID in more prefix bits than any other node
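
The prefix-matching rules on this slide can be sketched as follows, assuming node IDs and key hashes are fixed-width integers. The function names and the 8-bit width used in the example are illustrative assumptions.

```python
def common_prefix_bits(a, b, bits):
    """Number of leading bits shared by two `bits`-wide integers."""
    diff = a ^ b
    return bits if diff == 0 else bits - diff.bit_length()


def subtree_root_level(node_id, key_hash, b, bits):
    """Highest level i for which the node is a level-i subtree root.

    The node roots a level-i subtree when the first i * b bits match.
    Level 0 means the node acts only as a leaf for this key.
    """
    return common_prefix_bits(node_id, key_hash, bits) // b


def global_root(node_ids, key_hash, bits):
    """The node whose ID shares the most prefix bits with the key hash."""
    return max(node_ids, key=lambda n: common_prefix_bits(n, key_hash, bits))


key = 0b10111111
nodes = [0b10110000, 0b10000000, 0b00000001]
print(subtree_root_level(nodes[0], key, b=2, bits=8))  # prints "2"
print(bin(global_root(nodes, key, bits=8)))            # prints "0b10110000"
```

When several nodes tie on prefix length, `max` returns the first of them; a real system would need a deterministic tie-breaking rule, which the slide does not describe.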

  47. Evaluation • [Figure annotations: update only the node's own MIB; update the MIBs of all nodes; Up-All, Down-0; monitored attribute changes rarely; monitored attribute changes often]

  48. Evaluation (cont.) • The session size (domain size) is set to 8 and the branching factor is set to 16. • [Figure axes: message size, nodes]

  49. Evaluation (cont.) • Bf: branching factor • [Figure: average path length to root]

  50. Evaluation (cont.) • Bf: branching factor
