Information Federation in Grid Information Services Mehmet S. Aktas Advisor: Prof. Geoffrey C. Fox Ph.D. Defense Exam May 3, 2007
Talk Outline • Use Cases and Challenges • Research Issues • Architecture • Hybrid Grid Information Service • Performance Evaluation • Conclusions • Contributions and Future Research Directions
Introduction • Grid Information Services in Service Oriented Architectures • 1) Large scale relatively static metadata as in catalog of all the world’s services • Interaction-independent, slowly-varying metadata • 2) Small scale highly dynamic metadata as in dynamic workflows for sensor integration and collaboration • Interaction-dependent, dynamic metadata • Dynamic Grid/Web Service Collections* • Dynamically assembled relatively small number of services (sub-grid) • Gathered at any one time to support a specific task • Generate dynamic metadata and have limited life-time [*] [ICCSE-05] Managing Dynamic Metadata as Context http://grids.ucs.indiana.edu/ptliupages/publications/maktas_iccse05.pdf
Motivating Use Cases • Geophysical Data Grids - CGL • Service Oriented Architecture for Geographical Information Systems Supporting Real Time Data Grids • Pattern Informatics (PI) - UC Davis • Earthquake forecasting code developed by Prof. John Rundle (UC Davis) and collaborators, uses seismic archives. • Interdependent Energy Infrastructure Simulation System (IEISS) - LANL • Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior, interdependencies between systems. • eSports System - CGL • Annotative collaboration application. Supports archive, replay, annotation of real-time video-conferencing streams.
A General Geographical Information System Grid Orchestration Scenario* [*] Building and Applying Geographical Information System Grids, Special Issue on Geographical information Systems and Grids based on GGF15 workshop, Concurrency and Computation: Practice and Experience http://grids.ucs.indiana.edu/ptliupages/publications/GISGrids_Concurrency_submitted.pdf
Background • Specifications for interaction-independent metadata • UDDI Specification • GLUE Specification • ebXML Specification • Web Registry Service Specification • Specifications for interaction-dependent metadata • Point-to-point approach • Web Service Resource Framework (WSRF) Specification • Third-party approach • WS-Context Specification
Challenges • Standardization and Unification Issues • Customized Grid Information Services • Fat clients • Performance and Centralization Issues • Low performance • Low fault tolerance • UDDI Specification Issues • Lack of up-to-date, metadata-oriented registry • Lack of domain-specific metadata management • WS-Context Specification Issues • Limited data model and communication protocol
Research Issues I • Unification • How to combine different information services? • Federation • How to federate different information services? • Flexibility • How to accommodate a broad range of specific application domains? • Interoperability • How to facilitate connection with a wide range of information service clients?
Research Issues II • Performance • How to provide efficient information management strategies? • high-performance, scalable in-memory storage • efficient request distribution • adaptation to instantaneous client-demand changes • Fault-tolerance • How to provide efficient replica-content placement strategies? • Consistency • How to provide efficient consistency enforcement strategies?
Hybrid Grid Information Service • Unification • Federation • Unified Schema • Query/Publish API • Flexibility • Interoperability • Extended UDDI • WS-Context • Glue • …
[Figure: UDDI instance, WS-Context instance, and Unified Schema instance]
Support for interaction-independent metadata: Extended UDDI Service • There are other extensions of UDDI • Supports different types of metadata • User-defined metadata • Functional metadata • Enables advanced query capabilities • Geo-spatial, metadata-oriented, domain-independent queries • Provides additional capabilities • Up-to-date service registry information (leasing) • Dynamic aggregation of capabilities of services • e.g. geospatial capabilities • [GGF16-Semantic Grid Workshop] Web Service Information Systems and Applications • http://www.semanticgrid.org/OGF/ggf16/papers/GGF16SemGrid-CGL.pdf • [SKG06 – IEEE Proceedings] XML Metadata Services • http://grids.ucs.indiana.edu/ptliupages/publications/SKG2006_CameraReady_FinalFix.pdf
Support for interaction-dependent metadata: WS-Context Service • OASIS Standard • Context Manager Service • Data model and communication protocol • Supports Dynamic Web Service Collections • Distributed state based systems • e.g. workflow-style grids • Session metadata management • e.g. real-time replay and session-failure recovery capabilities • Provides various capabilities • Notification capability • Up-to-date metadata registry (leasing) • [SKG05 – IEEE Proceedings] Information Services for Dynamically Assembled Semantic Grids • http://grids.ucs.indiana.edu/ptliupages/publications/skg05-56-maktas-ieee-version.pdf • [FGCS - 2007] Fault Tolerant High Performance Information Services for Dynamic Collections of Grid and Web Services • http://www.informatik.uni-trier.de/~ley/db/journals/fgcs/fgcs23.html#AktasFP07
Support for federated service metadata: Information Federation • Federating Grid Information Services • Unified Schema and communication protocol • Extended UDDI, WS-Context and Glue Schemas • Approach taken for Unified Schema [Schema Integration] • Schema Matching • Identify the overlapping information in two given schemas, S1 and S2 • Schema Merging • Use the identified overlapping information to guide the merge of S1 and S2 • Communication protocol • Publish: save_ (create, update), delete_ • e.g. save_service, delete_service • Inquiry: find_ , get_ • e.g. find_metadata, get_metadataDetail
Schema Matching: Identifying Matching Concepts • Extended UDDI concepts • serviceAttributeEntity: information about metadata associated with services • GLUE concepts • site: information about a site where services, computing elements and storage elements are aggregated • service: all information about a service • serviceData: information associated with a service • GLUE diagram labels: Site, Service, ComputingElement, StorageElement, ServiceData • Matching concepts (ExtUDDI to GLUE) • ExtUDDI.businessEntity 1:N GLUE.site • ExtUDDI.businessService 1:1 GLUE.service • ExtUDDI.serviceAttributeEntity 1:1 GLUE.serviceData
Schema Merging: Unifying Schemas • Unified Schema concepts • businessEntity: information about the party who publishes information about entities • publisherAssertions: defines relationships between two business entities • site: all information about a concept to aggregate services and resources • tModel: description of specifications for services or taxonomies • storageElement: all information required to manage storage resources • computingElement: all information required to manage computing resources • service: all information about a service • bindingTemplate: technical information about a service point • metadata: information about metadata associated with a service • Relationships • business contains one to n sites • business contains one to n services • site contains one to n computing elements • site contains one to n storage elements • site contains one to n services • service contains one to n technical information • service contains one to n metadata • "has references to" (two links in the diagram) • Example mappings (Extended UDDI => Unified Schema <= GLUE) • ExtUDDI.businessEntity => ExtUDDI&GLUE.businessEntity, ExtUDDI&GLUE.site <= GLUE.site • ExtUDDI.businessService => ExtUDDI&GLUE.service <= GLUE.service • ExtUDDI.serviceAttribute => ExtUDDI.metadata <= GLUE.serviceData
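The matching and merging steps on the two schema slides can be illustrated with a small sketch. It is not the thesis implementation: the `Match` record, the `merge_schemas` function and the `unified_name` field are hypothetical names introduced here, and the merge rule (collapse 1:1 matches into one concept, keep both sides of a 1:N match and record a containment relation) is an assumption read off the example mappings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One correspondence found by schema matching (hypothetical record)."""
    ext_uddi: str          # concept name in the Extended UDDI schema
    glue: str              # concept name in the GLUE schema
    cardinality: str       # "1:1" or "1:N", as listed on the matching slide
    unified_name: str = "" # name chosen for the merged concept (1:1 only)


def merge_schemas(ext_uddi, glue, matches):
    """Merge two concept lists, guided by the matches.

    Returns (unified, relations): unified maps each unified concept name to
    the source concepts it came from; relations lists containment links.
    """
    unified, relations, matched = {}, [], set()
    for m in matches:
        matched.add(("ExtUDDI", m.ext_uddi))
        matched.add(("GLUE", m.glue))
        if m.cardinality == "1:1":
            # Assumption: a 1:1 match collapses into a single unified concept.
            unified.setdefault(m.unified_name, []).extend(
                [f"ExtUDDI.{m.ext_uddi}", f"GLUE.{m.glue}"])
        else:
            # Assumption: a 1:N match keeps both concepts and links them.
            unified.setdefault(m.ext_uddi, []).append(f"ExtUDDI.{m.ext_uddi}")
            unified.setdefault(m.glue, []).append(f"GLUE.{m.glue}")
            relations.append((m.ext_uddi, "contains one to n", m.glue))
    # Concepts without a match are carried over unchanged.
    for source, concepts in (("ExtUDDI", ext_uddi), ("GLUE", glue)):
        for c in concepts:
            if (source, c) not in matched:
                unified.setdefault(c, []).append(f"{source}.{c}")
    return unified, relations


MATCHES = [
    Match("businessEntity", "site", "1:N"),
    Match("businessService", "service", "1:1", unified_name="service"),
    Match("serviceAttributeEntity", "serviceData", "1:1", unified_name="metadata"),
]
```

Calling `merge_schemas` with the Extended UDDI and GLUE concept names and `MATCHES` yields one `service` concept, one `metadata` concept, and a `businessEntity contains one to n site` relation, which mirrors the shape of the unified schema on the slide.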
Key Design Features • In-Memory storage • High performance metadata access/storage • Access distribution • Redirecting client requests to an appropriate replica server • Replica content placement for performance • Dynamic replication • Moving/replicating metadata to where they are demanded. • Replica content placement for fault-tolerance • Permanent replication • Replicating data on an appropriate replica server • Consistency enforcement • Ensuring that all replicas of a data item are identical
In-Memory Storage • Light-weight implementation of JavaSpaces • Data sharing, associative lookup • Integrated in-memory storage capability • Ex: UDDI-type, WS-Context-type • Today’s servers are capable of holding such small-size metadata in memory. • Persistency • Newly inserted or updated metadata is backed up into the appropriate information service back-end. • If physical memory is wiped out, the metadata from the last backup in the database is reloaded into the in-memory storage at bootstrap.
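A minimal sketch of the storage behavior described on this slide, assuming a plain dictionary as a stand-in for the database back-end. The `InMemoryStore` class and its `write` and `read` methods are hypothetical names for illustration; they are not the JavaSpaces API or the thesis code. Associative lookup is modeled as template matching on field values.

```python
class InMemoryStore:
    """Toy in-memory metadata store with associative lookup and backup."""

    def __init__(self, backend):
        # backend: dict-like stand-in for the information service back-end.
        self._backend = backend
        # Bootstrap: reload whatever the last backup contains.
        self._entries = {key: dict(entry) for key, entry in backend.items()}

    def write(self, key, entry):
        """Insert or update an entry, then back it up to the back-end."""
        self._entries[key] = dict(entry)
        self._backend[key] = dict(entry)

    def read(self, template):
        """Return copies of all entries whose fields match the template."""
        return [
            dict(entry)
            for entry in self._entries.values()
            if all(entry.get(field) == value for field, value in template.items())
        ]
```

For example, after `store.write("s1", {"type": "WS-Context", "session": "a"})`, a call to `store.read({"type": "WS-Context"})` returns that entry, and a fresh `InMemoryStore` built over the same back-end dictionary recovers it, which models the restart case on the slide.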
Access Distribution and Dynamic Replication • Broadcast-based request dissemination • Pub-sub system for message broadcast • Requests are broadcast only to those servers that can answer • No need to keep track of metadata locations • Replica-content placement for performance • Popular copies are moved/replicated where they are demanded • Dynamic migration/replication algorithm* • Self-adaptation to changing client demands [*] Rabinovich et al, A dynamic Object Replication and Migration Protocol for an Internet Hosting Service Proceedings of the 19th IEEE International Conference on Distributed Computing Systems , 1999 http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=880582
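The idea of moving or replicating popular copies to where they are demanded can be sketched with a simple threshold rule. This is a deliberately simplified illustration and not the protocol of Rabinovich et al.; the function name `placement_decision` and the parameters `replicate_threshold` and `migrate_ratio` are assumptions made for this sketch.

```python
from collections import Counter


def placement_decision(request_counts, holder, replicate_threshold, migrate_ratio):
    """Decide what to do with one metadata item after an observation window.

    request_counts: Counter mapping server name -> requests seen for the item.
    holder: the server that currently holds the copy.
    Returns ("keep" | "replicate" | "migrate", target_server).
    """
    total = sum(request_counts.values())
    if total == 0:
        return ("keep", holder)
    # Server that generated the most demand in this window.
    top_server, top_count = max(request_counts.items(), key=lambda kv: kv[1])
    if top_server == holder:
        return ("keep", holder)
    if top_count / total > migrate_ratio:
        # Demand is dominated by one remote server: move the copy there.
        return ("migrate", top_server)
    if top_count >= replicate_threshold:
        # Demand is significant but shared: add a replica near that server.
        return ("replicate", top_server)
    return ("keep", holder)
```

With `replicate_threshold=5` and `migrate_ratio=0.6`, a window in which server B issues 9 of 10 requests leads to a migration to B, while a window split 5/6/1 across A, B and C leads to a replica on B and the original copy stays on A.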
Access Distribution Experiment: Benchmark Methodology • Time = T1 + T2 + T3 • [Figure: timing segments T1, T2 and T3, shown for the one-broker and two-broker cases]
Experiment Results • The overhead of access distribution is only a few milliseconds. • Continuous access distribution operation does not degrade performance.
Experiment Results • The overhead of distribution remains the same regardless of the network distances between nodes.
Dynamic Replication Performance Experiment: Benchmark Methodology • Time = T1 + T2 + T3 • [Figure: timing segments T1, T2 and T3]
Experiment Results • The decrease in average latency shows that the algorithm manages to move replica copies to where they are demanded.
Replication and Consistency • Permanent replication for fault tolerance • Each node keeps information about other servers • Replica Server(s) Selection • Load and proximity metrics • Selection algorithm by Rabinovich et al • Unicast-based replica-content placement • Primary-copy approach • Updates are unicast to primary-copy • Updates are broadcast by the primary-copy holder to • a) permanent-copy holding servers • b) applications with high consistency requirements
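The primary-copy approach on this slide can be sketched as follows. Direct method calls stand in for the unicast and pub-sub broadcast messages, and the version counter used to discard stale updates is an assumption of this sketch, not something stated on the slide. The class names `ReplicaServer` and `PrimaryCopy` are hypothetical.

```python
class ReplicaServer:
    """A server holding a permanent copy of the metadata."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (value, version)

    def apply(self, key, value, version):
        """Apply an update unless a newer version is already stored."""
        current = self.store.get(key)
        if current is None or version > current[1]:
            self.store[key] = (value, version)


class PrimaryCopy(ReplicaServer):
    """Primary-copy holder: every update goes through this server."""

    def __init__(self, name, permanent_replicas, subscribers=()):
        super().__init__(name)
        self.permanent_replicas = list(permanent_replicas)
        # Callbacks for applications with high consistency requirements.
        self.subscribers = list(subscribers)
        self._version = 0

    def update(self, key, value):
        """Receive an update, apply it locally, then propagate it."""
        self._version += 1
        self.apply(key, value, self._version)
        for replica in self.permanent_replicas:   # a) permanent-copy holders
            replica.apply(key, value, self._version)
        for notify in self.subscribers:           # b) subscribed applications
            notify(key, value, self._version)
        return self._version
```

Routing all writes through one holder gives a single ordering of updates, so every permanent replica converges to the same value for a key as long as it applies the updates it receives.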
Fault-tolerance Experiment: Benchmark Methodology • Time = T1 + T2 + T3 • [Figure: timing segments T1, T2 and T3, shown for the one-broker and two-broker cases]
Experiment Results • The overhead of replica-content placement is only a few milliseconds. • The overhead of replica-content placement increases on the order of milliseconds as the fault-tolerance level increases.
Consistency Enforcement Experiment: Benchmark Methodology • Time = T1 + T2 + T3 • [Figure: timing segments T1, T2 and T3, shown for the one-broker and two-broker cases]
Experiment Results • The overhead of consistency enforcement is a few milliseconds. • The cost of consistency enforcement remains the same regardless of the distribution of the network nodes.
Contributions • Systems Research • Hybrid Grid Information Service Architecture • Unification, Federation and Interoperability of grid information services • Strategies for high-performance, scalable in-memory storage • Strategies for efficient distribution, replica-content placement, consistency enforcement by utilizing pub-sub based messaging schemes • Self-adaptation to changing client demands • Extensions to semantics of UDDI and WS-Context Web Service Specifications • Detailed evaluation of the system components and algorithms • Systems Software • An implementation of Extended UDDI Specification • Geographical Information Systems-specific, metadata-oriented • An implementation of WS-Context Specification • Session metadata management for collaboration grids, distributed state management for workflow-style grids • An implementation of Hybrid Grid Information Service Architecture
Future Research Directions • Use the proposed approach to solve the OGF Grid Interoperation Now (GIN) problem for information services • Investigate an information security mechanism for the decentralized Hybrid Service • Example motivating application case: Pattern Informatics application • Apply the Hybrid Service to a broader range of application use cases • Web 2.0/Folksonomy information services