
Managing Dynamic Metadata and Context in Distributed Systems

Explore the management of dynamic metadata and context in distributed systems for correlating activities, managing events, and enabling real-time capabilities. Discover the application use domains, motivation, and characteristics of different domains.





Presentation Transcript


  1. Managing Dynamic Metadata and Context • Mehmet S. Aktas <maktas@cs.indiana.edu>

  2. Outline • Introduction • Problem Statement, Hypothesis, Design Goals • Literature Survey • Research Issues • Milestones • Contributions • Summary

  3. Context • Def: "Context is any information that can be used to characterize the situation of an entity, where an entity can be a person, place, or computational object." (Dey A. et al., 1999) • Context is metadata associated with both services and their activities • Context can be • independent of any interaction • static context • Examples: type or endpoint of a service, less likely to change • dynamic context • Examples: throughput of a service, likely to change over time • generated as a result of interaction • information associated with an activity or session • Examples: session-id, URI of the coordinator of a session

  4. Gaggle of Services • Gaggles of Services • are sets of actively collaborating managed services put together for a particular functionality, such as collaboration, visualization, or a sensor Grid • collaborate toward a particular common goal • Example: emergency preparedness and response • actively generate events as a result of interactions • are a very small part of the whole Grid

  5. Motivation • Current Grid Information Services provide information describing services independent of their interactions. • We need management of all information associated with services for: • correlating activities of widely distributed services • workflow-style, SOA-based applications • management of events, especially in multimedia collaboration • distributed session management • for instance: audio, video, and audio/video meetings in the Chinese Olympics

  6. Motivation II • More reasons for management of Context • enabling uniform query capabilities over both dialog and monolog context information • “Give me the list of services satisfying C:{a,b,c..} QoS requirements and participating in S:{x,y,z..} sessions” • enabling real-time replay/playback capabilities in collaboration-based sessions • enabling session failure recovery
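
The uniform query quoted on this slide can be illustrated with a small sketch. It assumes a single in-memory record per service that carries both interaction-independent context (QoS values) and conversation-based context (session membership). The names `ServiceRecord`, `find_services`, the `bandwidth_mbps` key, and the example.org endpoints are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """Hypothetical record joining both kinds of context for one service."""
    endpoint: str
    qos: dict = field(default_factory=dict)     # interaction-independent context
    sessions: set = field(default_factory=set)  # conversation-based context


def find_services(records, min_qos, session_ids):
    """Return endpoints that meet every QoS minimum and belong to every session."""
    wanted = set(session_ids)
    return [
        r.endpoint
        for r in records
        if all(r.qos.get(k, 0) >= v for k, v in min_qos.items())
        and wanted <= r.sessions
    ]


records = [
    ServiceRecord("http://example.org/av1", {"bandwidth_mbps": 10}, {"x", "y"}),
    ServiceRecord("http://example.org/av2", {"bandwidth_mbps": 1}, {"x", "y"}),
    ServiceRecord("http://example.org/av3", {"bandwidth_mbps": 10}, {"x"}),
]

# "Services satisfying C:{bandwidth >= 2} and participating in S:{x, y}"
matches = find_services(records, {"bandwidth_mbps": 2}, ["x", "y"])
```

Only the first record passes both filters: the second fails the QoS minimum and the third is missing session "y". A distributed system would answer the same question across replicas instead of one list.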

  7. Application Use Domain • Multimedia Collaboration domain: Global MMCS • multiple A/V services talk to various collaboration clients and services • defines a general session collaboration protocol (XGSP) • XGSP enables different collaboration tools to talk to each other, e.g. AccessGrid, H.323 • needs a distributed session management system • Characteristics of the domain • widely distributed services • metadata of events (archival data) • mostly read-only • persistent, but lifetime is bound to the lifetime of events

  8. Application Use Domain - II • Workflow-style distributed application: Geographic Information System Grid • sensor grid data services generate events when an event of a certain magnitude occurs • firing off various codes, filtering, analyzing raw data, generating images, maps • needs distributed context management to correlate workflow activities • Characteristics of the domain • any number of widely distributed services can be involved • conversation metadata • transient • multiple writers

  9. What are the examples of dynamically generated metadata in a real-life example? • Context Information Service • session associated dynamic metadata • user profile • activity associated dynamic metadata • service associated dynamically generated metadata

  [Diagram: WMS GUI, WFS, HP Search, PI Code, two Data Filter services, and the Context Information Service, connected by numbered steps 1 to 10, with the files http://..../..../..txt and http://..../..../tmp.xml. Callout labels on the context documents: session, shared state, service associated, user profile, activity, SOAP header for Context.]

  Context documents shown in the diagram:

    <context xsd:type="ContextType" timeout="100">
      <context-id>http://../abcdef:012345</context-id>
      <context-service>http://.../HPSearch</context-service>
      <content>http://danube.ucs.indiana.edu:8080\x.xml</content>
    </context>

    <context xsd:type="ContextType" timeout="100">
      <context-service>http://.../HPSearch</context-service>
      <parent-context>http://../abcdef:012345</parent-context>
      <content> shared data for HPSearch activity </content>
      <activity-list mustUnderstand="true" mustPropagate="true">
        <service>http://.../DataFilter1</service>
        <service>http://.../PICode</service>
        <service>http://.../DataFilter2</service>
      </activity-list>
    </context>

    <context xsd:type="ContextType" timeout="100">
      <context-service>http://.../HPSearch</context-service>
      <parent-context>http://../abcdef:012345</parent-context>
      <content> profile information related to WMS </content>
    </context>

    <context xsd:type="ContextType" timeout="100">
      <context-service>http://.../WMS</context-service>
      <activity-list mustUnderstand="true" mustPropagate="true">
        <service>http://.../WMS</service>
        <service>http://.../HPSearch</service>
      </activity-list>
    </context>

    <context xsd:type="ContextType" timeout="100">
      <context-service>http://.../HPSearch</context-service>
      <content> HPSearch associated additional data generated during execution of workflow. </content>
    </context>

  SOAP header for Context:

    <?xml version="1.0" encoding="UTF-8"?>
    <soap:Envelope xmlns:soap="http://www.w3...">
      <soap:Header encodingStyle="WSCTX URL" mustUnderstand="true">
        <context xmlns="ctxt schema" timeout="100">
          <context-id>http..</context-id>
          <context-service> http.. </context-service>
          <context-manager> http.. </context-manager>
          <activity-list mustUnderstand="true" mustPropagate="true">
            <p-service>http://../WMS</p-service>
            <p-service>http://../HPSearch</p-service>
          </activity-list>
        </context>
      </soap:Header>
      ...

  Steps: • 3,4: WMS starts a session and invokes HPSearch, with a session id, to run the workflow script for PI Code • 5,6,7: HPSearch runs the workflow script and generates an output file in GML format (and PDF format) as the result • 8: HPSearch writes the URI of the output file into Context • 9: WMS polls the information from the Context Service • 10: WMS retrieves the output file generated by the workflow script and generates a map
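
Steps 3 to 9 of this slide can be walked through with a toy stand-in for the Context Information Service. This is a sketch under stated assumptions, not the system described in the presentation: the class `ContextService`, its method names, and the example.org output URI are hypothetical, and everything runs in one process with a plain dictionary as storage.

```python
import uuid


class ContextService:
    """Toy in-memory stand-in for the Context Information Service (hypothetical API)."""

    def __init__(self):
        self._contexts = {}

    def create_session(self, participants):
        """Create a session context and return its id."""
        context_id = uuid.uuid4().hex
        self._contexts[context_id] = {
            "activity-list": list(participants),
            "content": None,
        }
        return context_id

    def set_content(self, context_id, content):
        """Publish shared data into an existing session context."""
        self._contexts[context_id]["content"] = content

    def get_content(self, context_id):
        """Read the shared data; None means nothing has been published yet."""
        return self._contexts[context_id]["content"]


ctx = ContextService()

# Steps 3,4: WMS starts a session and hands the session id to HPSearch.
session_id = ctx.create_session(["WMS", "HPSearch"])
before = ctx.get_content(session_id)  # WMS polls too early: nothing there yet

# Step 8: HPSearch writes the URI of the workflow output into the context.
ctx.set_content(session_id, "http://example.org/output.gml")

# Step 9: WMS polls again and now finds the URI it needs for step 10.
after = ctx.get_content(session_id)
```

The point of the pattern is that WMS and HPSearch never exchange the output URI directly; both only know the session id and the context service.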

  10. Problem Statement What is a novel process for building Information Services that maintain dynamic session-related metadata of widely distributed services and provide a uniform interface to both interaction-independent and conversation-based context?

  11. Hypothesis • A fault-tolerant, high-performance, scalable information system • maintaining widely distributed, dynamically generated metadata for Gaggles of Services • providing a uniform interface to context information • utilization of existing Grid Information Services for interaction-independent context to improve search capabilities • enabling coordination of widely distributed services in Gaggles • workflow-style Grid applications • enabling distributed event management and various capabilities for A/V conferencing applications • discovery of entities in a session • enabling playback/replay capabilities • enabling session failure recovery

  12. Architectural Design Goals • Key Goals of our Design • scalability • with respect to the number of widely distributed services • performance • high responsiveness, reduced access latency • fault tolerance • high availability of information • robust to replica crashes • flexibility • accommodate a broad range of application domains • read-dominated, read/write-dominated

  13. Literature Survey • Mainstream Grid Information Services • MDS, R-GMA, UDDI (Grimoires) • Specifications for stateful service interactions • WS-CAF, WSRF, WS-Metadata Exchange • Linda TupleSpaces coordination model

  14. Mainstream Grid Information Services

  15. Limitations in Grid Information Services • Lack of support for session-related dynamic metadata • MDS4 adopts the WSRF approach, which does not scale when managing activities of multiple services sharing the same state • Lack of support for advanced query capabilities • ex: "Give me the list of WFS services participating in the 'fault displacement calculations' workflow session where the service is connected by a network path with over 2MB/sec of bandwidth and at most 100 msec of latency."

  16. WS-CAF WS-Context - Key Concepts • WS Composite Application Framework (WS-CAF) • WS-Context, WS-Coordination, WS-Transaction Management • WS-Context • defines context, context service, and mapping on SOAP • shared data to correlate service activities • context information depends on the type of the activity • transactional activity: the URI of the coordinator in a session • context service maintains associated context • participants of an activity register with the context service for the lifecycle of that activity

  17. Web Service Resource Framework - Key Concepts • defines standard interfaces and behaviors for distributed system integration • standard XML-based information model • standard interfaces for push and pull mode access to service data • enables every service to expose state data for query, update • monitoring shared state • models resource state as private to a service • supports resource oriented approach for stateful interactions • requires the identity of the resource to be passed in the SOAP message

  18. WS-Metadata Exchange - Key Concepts • WS Metadata is key to interactions • WS-Policy: capabilities, requirements, general characteristics of services • WSDL: describes message operations, supported network protocols used by services • WS-Metadata Exchange • provides a mechanism for sharing information about the capabilities of individual Web services • allows querying a WS Endpoint to retrieve the metadata needed to interact with it • defines request/response message pairs to retrieve WS metadata

  19. Limitations in Specifications for Service Communication • WSRF does not actually accomplish state management by just enabling access and update rights • heterogeneous service environment • workflow-style applications • WSRF and WS-Metadata Exchange model service metadata as private to a service • does not scale in managing activities of multiple services • WS-Metadata Exchange defines only how to access interaction-independent metadata • WS-Context is promising, but it has limitations • simple framework for context management • limited query capability • does not address distributed management aspects of context metadata

  20. TupleSpaces Paradigm • a communication paradigm • space-based asynchronous communication • first described in Linda project in 1982 at Yale • pioneered by David Gelernter • Linda is a coordination language using primitive operations on shared data in shared space • data-centric coordination model • communication units are tuples • data-structure consisting of one or more typed fields • a TupleSpace is an intermediary container
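
The data-centric coordination model on this slide can be sketched in a few lines. This is a minimal single-process illustration, not JavaSpaces or any existing library: the class `TupleSpace` and its template convention (a Python type in a template matches any value of that type, anything else must be equal) are assumptions made for the example. The operation names `out`, `rdp`, and `inp` follow Linda, using the non-blocking variants because nothing here runs concurrently.

```python
class TupleSpace:
    """Minimal in-memory tuple space with Linda-style non-blocking operations."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Write a tuple into the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(template, tup):
        """A template matches when lengths agree and every field matches."""
        if len(template) != len(tup):
            return False
        for want, have in zip(template, tup):
            if isinstance(want, type):
                if not isinstance(have, want):  # typed wildcard field
                    return False
            elif want != have:                  # literal field
                return False
        return True

    def rdp(self, template):
        """Read (without removing) the first matching tuple, or None."""
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def inp(self, template):
        """Take (read and remove) the first matching tuple, or None."""
        tup = self.rdp(template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


space = TupleSpace()
space.out(("session", "abc", 3))
space.out(("throughput", "http://example.org/svc", 12.5))

read = space.rdp(("throughput", str, float))  # associative lookup by content
taken = space.inp(("session", "abc", int))    # removes the session tuple
gone = space.rdp(("session", "abc", int))     # nothing left to match
```

Writers and readers never address each other: they only agree on the shape of the tuples, which is what makes the communication uncoupled in time and space.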

  21. JavaSpaces [Sun Microsystems] • JavaSpaces is an object-oriented TupleSpace implementation • strongly influenced by the Linda model • Java based, platform independent • spaces are transactionally secure • mutually exclusive access to objects • spaces are persistent • temporal, spatial uncoupling • spaces are associative • content-based search • limitations • centralized • inefficient reading/writing performance • dependent on a stack of different software layers

  22. Research Issues • Recap on key design goals: • scalability, performance, fault tolerance • research issues related to replicating dynamic metadata • deployment (dynamic vs. static replication) • Where to place replicas of given context metadata? • What properties must a new location meet? • How to know whether a replica location is stable? • How can we provide tailored replication based on R/W properties?

  23. Research Issues II • consistency • What is the appropriate consistency model? • How, and in what direction, do replicas exchange updates? • How can we utilize an ordering capability based on NTP (Network Time Protocol) to provide consistency on the replicated context metadata? • performance • efficient metadata access • How to choose a replica server to best serve a client request? • How to avoid performance degradation due to repetitive queries?
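
One possible answer to the timestamp-ordering question above is a last-writer-wins merge. The slide leaves the consistency model open, so this is only an illustration of the idea: the function `merge_replicas`, the entry layout `(timestamp, replica_id, value)`, and the sample keys are all hypothetical. Timestamps are assumed to come from NTP-synchronized clocks.

```python
def merge_replicas(local, remote):
    """Merge two replica states key by key, keeping the newest entry.

    Each state maps key -> (timestamp, replica_id, value). Ties on the
    timestamp are broken by replica_id so that every replica picks the
    same winner regardless of the direction of the exchange.
    """
    merged = dict(local)
    for key, entry in remote.items():
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged


replica_a = {
    "session-1/uri": (100.0, "A", "http://example.org/old.xml"),
}
replica_b = {
    "session-1/uri": (100.5, "B", "http://example.org/new.xml"),
    "session-1/owner": (99.0, "B", "WMS"),
}

a_then_b = merge_replicas(replica_a, replica_b)
b_then_a = merge_replicas(replica_b, replica_a)
```

Both merge directions produce the same state, so replicas converge no matter who pushes to whom. The cost is that a write with an older timestamp is silently discarded, which is acceptable only if clock skew is small compared to the gap between conflicting updates.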

  24. Research Issues III • scalability • load balancing strategies • How to manage load balancing? • other research issues • replay/playback capabilities • How to enable real-time replay/playback capabilities? • session recovery • How to enable session recovery? • uniform interface to context • How to provide a uniform interface to context?

  25. Milestones • Implementation of TupleSpaces paradigm • Uniform Update and Query (search, discovery) Services • Sequencer Service • ensures that an order is imposed on actions/events that take place in a session
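
The ordering role of the Sequencer Service can be sketched as a per-session counter. This is an assumption-laden illustration, not the presented design: the class `Sequencer`, its `stamp` method, and the event names are hypothetical, and a single process hands out the numbers.

```python
import itertools


class Sequencer:
    """Assigns increasing sequence numbers to events, per session (hypothetical API)."""

    def __init__(self):
        self._counters = {}

    def stamp(self, session_id, event):
        """Return (sequence_number, event) for the next event in the session."""
        counter = self._counters.setdefault(session_id, itertools.count(1))
        return (next(counter), event)


seq = Sequencer()
stamped = [seq.stamp("s1", e) for e in ["join", "speak", "leave"]]
other = seq.stamp("s2", "join")  # a different session has its own numbering

# Events may be delivered out of order; sorting by sequence number
# restores the order imposed by the sequencer, e.g. for replay.
shuffled = [stamped[2], stamped[0], stamped[1]]
replayed = [event for _, event in sorted(shuffled)]
```

With an agreed order per session, replay/playback and failure recovery reduce to re-reading the stamped events in sequence.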

  26. Milestones II • Storage (Replication) Service • decide the number and placement of replicas • enable autonomous behavior • support robust behavior for replica crashes • Access (Request Distribution) Service • distribute requests among object replicas • Expeditor Service • generalized caching mechanism • reduce storage access due to repetitive queries
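
The caching role of the Expeditor Service can be sketched as a read-through cache in front of storage. This is a minimal illustration under assumptions: the class `Expeditor`, the `storage_hits` counter, and the explicit `invalidate` call are hypothetical, and the slide does not say how stale entries are handled.

```python
class Expeditor:
    """Read-through cache that absorbs repeated queries (hypothetical API)."""

    def __init__(self, storage_lookup):
        self._lookup = storage_lookup  # callable that reads from storage
        self._cache = {}
        self.storage_hits = 0          # how often storage was actually read

    def query(self, key):
        """Serve from the cache, falling back to storage on a miss."""
        if key not in self._cache:
            self.storage_hits += 1
            self._cache[key] = self._lookup(key)
        return self._cache[key]

    def invalidate(self, key):
        """Drop a cached entry after the underlying metadata changes."""
        self._cache.pop(key, None)


store = {"session-1/uri": "http://example.org/a.xml"}
expeditor = Expeditor(store.get)

first = expeditor.query("session-1/uri")
second = expeditor.query("session-1/uri")  # served from cache, no storage read
hits_after_repeat = expeditor.storage_hits

store["session-1/uri"] = "http://example.org/b.xml"
expeditor.invalidate("session-1/uri")
third = expeditor.query("session-1/uri")   # cache miss, reads the new value
```

For transient, multi-writer metadata the hard part is the invalidation, which is why it is shown as an explicit step here instead of being hidden.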

  27. Evaluation of Hypothesis • Qualitative evaluation • Does the system deliver what it promises in terms of functionality? • Example test domains: Geographical Information System Grid, Global MMCS • How does the system function in case of replica crashes? • Quantitative evaluation • How well does the system deliver what it promises in terms of performance? • What are the performance costs and gains that come with scalability and fault tolerance? • trade-offs between fault tolerance, scalability, and performance • what limitations do the trade-offs impose on the practical use of my system? • what is the number of replicas needed for a certain availability? • what is the cost of fault tolerance? • what is the cost of scalability?
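
The question about the number of replicas needed for a certain availability has a simple back-of-the-envelope form. The sketch below assumes replicas fail independently and that the metadata is available whenever at least one replica is up, so availability with n replicas is 1 - (1 - p)^n. Both assumptions and the function name `replicas_needed` are illustrative and not taken from the presentation.

```python
def replicas_needed(replica_availability, target_availability):
    """Smallest n such that 1 - (1 - p)**n reaches the target.

    Assumes independent replica failures and that one live replica is
    enough to serve the metadata.
    """
    if not 0 < replica_availability < 1:
        raise ValueError("replica availability must be strictly between 0 and 1")
    if not 0 < target_availability < 1:
        raise ValueError("target availability must be strictly between 0 and 1")
    n = 1
    while 1 - (1 - replica_availability) ** n < target_availability:
        n += 1
    return n


n_reliable = replicas_needed(0.9, 0.995)  # replicas that are up 90% of the time
n_flaky = replicas_needed(0.5, 0.99)      # replicas that are up 50% of the time
```

With 90% replicas, two copies give 99% and three give 99.9%, so a 99.5% target needs three. With 50% replicas, a 99% target needs seven. Correlated failures, which are common in practice, make these figures optimistic.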

  28. Contribution of this Thesis • Identifies a novel approach for building Information Services managing session related context. • Identifies a novel approach for providing fault tolerance and scalability while providing high performance when managing dynamic metadata • Identifies a dynamic replication mechanism for widely distributed dynamic and transient metadata

  29. Summary • This thesis addresses the following problems • Lack of support in Grid Information Services for context (session-related dynamic metadata) management to correlate activities in workflow-style applications: • by providing a novel approach for management of widely distributed, shared session-related dynamic metadata • Lack of support in Grid Information Services for distributed session management: • by providing a distributed event management system enabling session failure recovery and replay/playback capabilities • Lack of search capabilities in Grid Information Services: • by providing a uniform search interface to both interaction-independent and conversation-based context, enabling service discovery through events
