From Interactive Applications and Knowledge Based Workflows to Transparent Semantic Grid (from X# and ~# to #). Marian Bubak, Institute of Computer Science and ACC CYFRONET AGH, Cracow, Poland, bubak@agh.edu.pl, and CrossGrid, K-WfGrid, and (future) GridSpace Collaborations.
Overview • Trends in applications and computing systems • CrossGrid - Environment for application steering • Workflow applications • K-WfGrid - knowledge-based environment • New proposal – GridSpace: transparent semantic grid
Trends in Applications • Large scale numerical simulations • Computationally demanding data analysis • Distributed computing and storage • Remote access to experimental equipment • A need to integrate heterogeneous environments into one application • Collaborative problem solving • Virtual organisations
Evolution in Distributed Computing • Distributed systems operate in heterogeneous environments • Large scale resource sharing • Interoperability • Communication via protocol stacks • Service oriented architectures • Open standard integration • Virtualisation of resources • Complexity of computing systems close to the limits of human capability
CrossGrid www.eu-crossgrid.org • 21 partners • 2002-2005, EC IST F2 • Coordinated by CYFRONET • Research areas • CrossGrid Applications • Grid Tool Environment • New Grid Services • International Testbed • Architecture
Main CrossGrid Objectives • New category of Grid-enabled applications • compute- and data-intensive • distributed • near-real-time response (a person in a loop) • layered • New programming tools • Grid more user friendly, secure and efficient • Interoperability with other Grids • Implementation of standards
Virtual Medical Support [Figure labels: Patient in an MRI Scanner; MR Image Storage; MR Image Segmentation; Medical Data; Bypass Placement and LB Mesh Generation; Blood Flow Simulation; LB Solver; Blood Flow Visualization; GVK; Blood Flow Rendering in VR; Portal; Login and Grid Proxy Creation; Virtual Node Navigation and Grid Data Transfer; Simulation Job Submission; Job Monitoring; D-VRE; ce2 (NIKHEF); ce (Linz); globus-lumc (Leiden); mn (Virtual Operating Theatre at the UvA)]
Flood Simulation • Data sources • Meteorological simulation • Hydrological simulation • Hydraulic simulation • Portal
High Energy Physics Applications • Interactive ANN training • MPI • Parallel Sleuth algorithm • ATLAS DAQ events remote processing • Feasibility study of using the Grid to process difficult events in one of the LHC experiments (DAQ) • Distributed data access prototype • Distributed filtering of ntuples (data files) distributed on CrossGrid Storage Elements • The output can be used in the ANN application.
Meteo / Pollution Application • Weather forecasting • Air pollution forecasting • Wave modeling • Data mining • Weather forecasting is the common link and provides input data for all other activities
CrossGrid Tools and Services [Figure labels: Applications (Flood Simulation, Meteo/Pollution, Particle Physics, Medical Support); Migrating Desktop; Portal; Performance Analysis; Performance Prediction; MPI Verification; Benchmarks; Roaming Access Server; Scheduler; Application Monitoring (OCM-G); Infrastructure Monitoring; Network Monitoring; Post-processing; Visualization Kernel; Data Access; MPI Library; DataGrid; Globus Toolkit; components linked through plugins, APIs, SOAP, OMIS and JMX] Testbed • 17 sites • 9 countries • over 200 CPUs • 4 TB of storage
Migrating Desktop Tools • Job Wizard • Job Monitor • Application Container and Application Plugin • GridFTP Commander • User Profile Manager • Private Storage Management • VNC/SSH console
Migrating Desktop Functionality Main Features: • Single sign-on / authorisation • Platform independent • Batch jobs • MPI jobs • Running interactive applications using java plugins or VNC • Monitoring grid applications • Flexible Application framework • User profile management • Easy application add on • Local and grid file management
Roaming Access Server • Well-defined set of web services • An interface for accessing HPC systems and services (based on various technologies) in a common and standardised way • Interconnection between various grid middleware and applications • Additional features: • Virtual Directory support • Plug-in for various grid middleware • Service groups: Job Submission Services, Interactive Session Services, File Management Services, Profile Management Services, Application Management Services
Interactivity • Interactive workflow - user interactivity on the Grid: a user can continuously interact with a Grid client without waiting for the submitted jobs to terminate • Online output control - user one-way interactivity: a user can see the output of an application running in the Grid testbed on an MD client, synchronously with the application • Runtime steering - user two-way interactivity: a user can steer the running application, either by providing input data online as requested by the application (also asynchronously), or by suspending the process, changing some input data, and resuming it
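The one-way and two-way interactivity modes can be illustrated with a small sketch. This is not CrossGrid code: the function `run_steerable`, the command names (`suspend`, `set`, `resume`) and the `gain` parameter are all hypothetical, chosen only to show a computation that reports output after every step and accepts steering commands between steps.

```python
import queue


def run_steerable(n_steps, commands, params, on_step=None):
    """Toy steerable loop (hypothetical API, not the CrossGrid interface).

    commands: queue.Queue of (name, value) pairs sent by the user.
    on_step:  optional callback that receives each output as it is produced
              (one-way interactivity: online output control).
    """
    outputs = []
    for step in range(n_steps):
        paused = False
        while True:
            try:
                # Poll without blocking while running; block while suspended
                # so the computation really waits for the user's "resume".
                name, value = commands.get(block=paused)
            except queue.Empty:
                break
            if name == "suspend":
                paused = True
            elif name == "resume":
                paused = False
            elif name == "set":
                params.update(value)  # changed input data takes effect next
        outputs.append(step * params["gain"])
        if on_step is not None:
            on_step(step, outputs[-1])
    return outputs


if __name__ == "__main__":
    q = queue.Queue()

    def user(step, value):
        # After step 1 the "user" suspends, changes an input, and resumes.
        if step == 1:
            q.put(("suspend", None))
            q.put(("set", {"gain": 10}))
            q.put(("resume", None))

    print(run_steerable(4, q, {"gain": 1}, on_step=user))
```

In a real Grid setting the queue would be replaced by a network channel between the desktop client and the worker node; the sketch only captures the control logic.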
Interactive Job Submission • The user submits an application job through the CG portal or the Migrating Desktop and the Roaming Access Server, which supports individual user environments • The job is handled by the Scheduler, which selects the appropriate computing resources • DataGrid software components are used for low-level Grid operations (submission for processing and delivery of results) • The system is based on Globus Toolkit v2 • Other CrossGrid tools and services can be used in conjunction with running jobs, as requirements dictate
Job Interactivity via MD [Figure labels: Migrating Desktop; Java Visualisation plug-in; Roaming Access Server; Job Submission Services; Job Shadow; JDL; RAS shadow port, RAS shadow host; CrossBroker; Logging & Bookkeeping; Computing Element; Gatekeeper; LRMS; WorkerNode; Console Agent; Condor ByPass; stdin, stdout, stderr; flows: In/Out/Err job data, interactive data, control data, submission flow, process launched]
Scheduler (CrossBroker) • Automatic job management for parallel applications: search and selection of available resources, job conditioning, job launching, job monitoring, job retry (in case of failures), and results retrieval • MPICH-P4 (intra-cluster) • MPICH-G2 (inter-cluster) • Computational workflows • Best-effort approach to dealing with failures/problems
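The resource selection and retry steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CrossBroker algorithm: the selection rule (pick the site with the most free CPUs), the dictionary fields, and the `launch` callback are hypothetical.

```python
def select_resource(resources, cpus_needed, excluded=()):
    """Pick the site with the most free CPUs that can host the job.

    Assumed rule for illustration only; the real broker's ranking
    criteria are not described in the slides.
    """
    candidates = [
        r for r in resources
        if r["free_cpus"] >= cpus_needed and r["name"] not in excluded
    ]
    return max(candidates, key=lambda r: r["free_cpus"], default=None)


def run_with_retry(job, resources, launch, max_attempts=3):
    """Best-effort execution: on failure, retry on another site."""
    failed = []
    for _ in range(max_attempts):
        site = select_resource(resources, job["cpus"], excluded=failed)
        if site is None:
            break  # nothing left to try
        try:
            return site["name"], launch(job, site)
        except RuntimeError:
            failed.append(site["name"])  # remember the failing site
    return None, None


if __name__ == "__main__":
    sites = [{"name": "A", "free_cpus": 8}, {"name": "B", "free_cpus": 4}]

    def launch(job, site):
        # Hypothetical launcher: site A is down in this example.
        if site["name"] == "A":
            raise RuntimeError("site unavailable")
        return "done"

    print(run_with_retry({"cpus": 2}, sites, launch))
```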
Performance Analysis Tool [Figure labels: G-PM (User Interface and Visualization Component, Performance Measurement Component, High-Level Analysis Component); Measurement Interface (OMIS); OCM-G Monitoring System (Main Service Manager, Service Managers, Local Monitors, Application Modules); Application Processes P1, P2, P3, ..., Pn]
Definition of Measurements • Metrics • Objects • Functions • Partner objects • Sites • Nodes • Processes • Aggregation in time • Aggregation in space
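To make "aggregation in time" and "aggregation in space" concrete, here is a small sketch. The sample layout (dictionaries with `site`, `node`, `process`, `t`, `value`) and both function names are assumptions for illustration; the slides do not specify how G-PM represents measurements internally.

```python
from collections import defaultdict


def aggregate_space(samples, level):
    """Sum a metric over all processes below `level` ('site' or 'node'),
    keeping one value per (object, timestamp)."""
    totals = defaultdict(float)
    for s in samples:
        totals[(s[level], s["t"])] += s["value"]
    return dict(totals)


def aggregate_time(samples, window):
    """Average each process's metric over consecutive windows of
    `window` time units, keyed by (process, window index)."""
    acc = defaultdict(lambda: [0.0, 0])
    for s in samples:
        key = (s["process"], s["t"] // window)
        acc[key][0] += s["value"]
        acc[key][1] += 1
    return {key: total / count for key, (total, count) in acc.items()}


if __name__ == "__main__":
    samples = [
        {"site": "A", "node": "n1", "process": "p1", "t": 0, "value": 10.0},
        {"site": "A", "node": "n1", "process": "p1", "t": 1, "value": 20.0},
        {"site": "A", "node": "n2", "process": "p2", "t": 0, "value": 5.0},
        {"site": "B", "node": "n3", "process": "p3", "t": 0, "value": 7.0},
    ]
    print(aggregate_space(samples, "site"))
    print(aggregate_time(samples, window=2))
```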
Definition of Visualizers • Visualization type: • Bar graph • Curve diagram • Histogram • Pie chart • Matrix diagram • Parameters: • Scales • Update interval
Application Monitoring • Tool: visualization • Standard interface (OMIS) • OCM-G: monitoring of application • System-specific interface • Running application
OCM-G – Features • On-line operation • Support for multi-site grid applications • Low perturbation • Techniques for data rate reduction • Lightweight and fast socket-based communication • Flexible, services-driven design • No fixed metrics but a set of flexible services to construct metrics with desired semantics • Enables custom metrics in G-PM • Extensible • Additional services can easily be added • Loaded dynamically at run-time • Secure • GSI-based security • Minimal security requirements • Autonomous and standardized • Standard interface • Minimized effort of porting OMIS-based tools across platforms • Enables interoperability of multiple tools monitoring a single application
GridBench: The Suite • Layered approach • Worker-node, site, and VO level • Micro-benchmarks • Micro-kernel Benchmarks • Application-kernel Benchmarks
Grid HLA Management System [Figure: a user and several Grid sites supporting HLA (Site A, Site B, Site C, ..., N-th site) hosting the services listed below] • HLA management services • HLA-speaking Service for managing federates • RTIExec Service for managing RTIExec (coordination process in RTI) • Broker for setting up a federation and making migration decisions • Broker decision services • Registry for storing the location of HLA-speaking services • Infrastructure Monitoring/Benchmarks for checking the environment of an HLA service • Migration support services • Application Monitoring for monitoring performance • Migration Service
CrossGrid Testbed • Testbed sites in 9 countries • 17 testbed sites • Three types of testbeds: production, development, test • Communication: national research networks and GEANT
CrossGrid: Innovation, Interactivity, Interoperability Features • Brings interactive applications to the Grid • Enables easy access to the Grid via Web Services • Extends and enhances DataGrid, GridLab, and EuroGrid • Developed according to GGF and software engineering standards Potential Customers • End-users: hospitals, environmental authorities, physicists • Companies developing compute-intensive software • Service and infrastructure providers Status • Stable version available since March 2004 as open source • Licensing: CrossGrid license based on EDG, GPL • CrossGrid Tutorial available for potential users
K-WfGrid www.kwfgrid.net • Fraunhofer FIRST, Berlin, Germany • Institute of Computer Science, University of Innsbruck, Innsbruck, Austria • Institute of Informatics of the Slovak Academy of Sciences, Bratislava, Slovakia • ACC CYFRONET AGH, Kraków, Poland • LogicDIS S.A., Athens, Greece • Softeco Sismat SpA, Genova, Italy
User Portal Workflow Service Workflow Knowledge Storage Service Hydrology Service Hydraulics Service Meteorology Service Meteorology Visualization Hydrology Visualization Hydraulics Visualization Flood Simulations - Workflow
Workflow Applications and Knowledge • Integrating services into coherent application scenarios • Enabling automatic construction and reuse of workflows with knowledge gathered during operation • Involving monitoring and knowledge acquisition services in order to provide added value for end users Technologies: service-oriented Grid architecture, software agents, ontologies, dynamic instrumentation [Figure: K-WfGrid cycle with the steps Construct workflow, Execute workflow, Monitor environment, Analyze information, Capture knowledge, Reuse knowledge]
Architecture of K-WfGrid • Capturing and reusing knowledge about Grid environments • Ontology-based optimization of workflows • Framework for collaborative knowledge reuse [Figure labels: Users; Knowledge Web Portal; User Assistant Agent; Grid Organizational Memory; Grid Application Building; <templates>, <workflows>, <users>, <components>, <resources>; Workflow Composition Tools; Automatic Application Builder; Grid Workflow Execution Service; Grid Service Invocation and Control; Grid Performance Analysis Service; Knowledge Builder Agent; Grid Performance Monitoring Service; Grid Middleware; Grid Resources]
Flow of Actions [Figure labels, components: User; Knowledge Web Portal; Grid Workflow User Interface; User Assistant Agent; Grid Organizational Memory (ontological store of knowledge); Workflow Orchestration and Execution; Automatic Application Builder; Workflow Composition Tool; Scheduler; Performance Analysis; Knowledge Assimilation Agent; Grid Workflow Execution Service; Grid Performance Monitoring and Instrumentation Service; Low Level Grid Middleware (WS-RF); Grid Resources. Flows: user interaction through the Portal; guidance for the user; information on available resources and their description; workflow composition and execution visualization; user's decisions at crucial points of execution; analysed and extracted knowledge; information about workflow execution; information about performance of particular resources; information about resources and environment; execution of chosen Grid services]
Stages of Workflow Construction • Initial, abstract grid job • Abstract workflow with service classes • Partially concretized workflow prior to execution • Fully concretized workflow after successful execution
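The progression from an abstract workflow to a fully concretized one can be sketched as late binding of service classes to service instances. The data layout, the `registry` of instances, and the instance names below are hypothetical; this is an illustration of the idea, not the K-WfGrid implementation.

```python
def concretize(workflow, registry, ready):
    """Bind service classes to instances, but only for activities that
    are ready to run; the remaining activities stay abstract."""
    result = []
    for activity in workflow:
        bound = dict(activity)  # do not mutate the caller's workflow
        if bound["instance"] is None and bound["name"] in ready:
            candidates = registry.get(bound["service_class"], [])
            if candidates:
                bound["instance"] = candidates[0]
        result.append(bound)
    return result


def stage(workflow):
    """Classify a workflow by how many activities have an instance."""
    bound = sum(1 for a in workflow if a["instance"] is not None)
    if bound == 0:
        return "abstract"
    if bound == len(workflow):
        return "fully concretized"
    return "partially concretized"


if __name__ == "__main__":
    # Hypothetical flood-forecasting chain with made-up instance names.
    workflow = [
        {"name": "meteo", "service_class": "Meteorology", "instance": None},
        {"name": "hydrology", "service_class": "Hydrology", "instance": None},
        {"name": "hydraulics", "service_class": "Hydraulics", "instance": None},
    ]
    registry = {
        "Meteorology": ["meteo@site-a"],
        "Hydrology": ["hydro@site-b"],
        "Hydraulics": ["hydra@site-c"],
    }
    partial = concretize(workflow, registry, ready={"meteo"})
    print(stage(workflow), "->", stage(partial))
```

Binding late, just before each activity runs, lets the choice of instance reflect the state of the Grid at that moment rather than at composition time.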
Circulation of Knowledge [Figure labels, components: User; User Assistant Agent; Knowledge Assimilation Agent; Performance analysis; On-line monitoring infrastructure; Scheduler; Event publishing subsystem; Grid wf Exec System. Knowledge items: hints and guidelines; initial conditions provided by user; service class functionality supplied by grid service providers; abstract workflow made by WCT; concrete workflow made by AAB; service instance properties; service instance performance; defined metrics predefined for the system; running workflow; services and resources monitoring data; events occurring during wf composition and execution]
Monitoring and Performance Analysis • Monitoring and instrumentation service (MIS) • Performance analysis service (PAS) • Data representations and service interfaces
Monitoring and Performance Analysis • Monitoring and Instrumentation Service • Instrument code regions and activities • Monitor infrastructure, code regions, activity execution status • Performance Analysis Service • Define performance metrics for workflows • Analyze monitoring data and relate the data to the workflow • Define performance properties and search for performance bottlenecks of workflows • Data Representations and Service Interfaces • XML schemas for describing CPU usage, TCP bandwidth, generic events, profiling data, workflow activity execution status, etc. • WIRL, PDQS, common service operations and specific service operations • WP Interdependencies • Scheduler and Grid Workflow Execution Service (GWES) - WP2 • Grid Organizational Memory - WP4 • Knowledge Assimilation Agent - WP5
Ontologies in GOM • Ontology scope: generic, specific • Ontology type: application, resources, data, service, workflow
Levels of Knowledge • Knowledge is separated into three levels • Generic – definitions of concepts, taxonomy • Domain specific – definition of domain specific topics • Data – individuals from a concrete application • Knowledge is gathered in different registries
Technologies and Standards • Java • Web Service, SOAP • Tomcat, Maven, JIBX • WSRF, Globus Toolkit 4 • XML, OWL, RDF, RDQL • OWL-DL (Description logic) • JENA (Java Semantic Web Toolkit)
Users of K-WfGrid User community • Environment: Flood decision crisis team support system • Business: Enterprise resource planning • Public sector: Coordinated traffic management Developer community • Grid software developers: Workflow and knowledge management tools • Application developers: Complex distributed application construction • Interested Institutions: • Municipality of Genova • Slovak Water Research Institute, Bratislava • Slovak Hydrometeorological Institute, Bratislava • Slovak Watermanagement Enterprise, Banska Stiavnica
Project Proposal for F2 Call 5 GridSpace Transparent Semantic Grid
Motivation • Programming Grid applications • Various Grid middleware platforms for uniform access to resources – difficult, complex to program • Recent initiatives on Grid programming do not address the dynamic nature of the Grid • Important features of the Grid • Grid is dynamic • Resource users do not instantiate resources on their own • Resources are in different administrative domains • Therefore, a Grid application has to be • More loosely coupled (composed of autonomous elements) • Flexible to overcome (and benefit from) the dynamic nature of the Grid • Adaptable to cross boundaries of various administration policies
Concept of the GridSpace • Features • Abstract, semantically rich layer between a user and the middleware • Set of tools for Grid application developer to make the Grid programming easier • Strong support for developing flexible and adaptable applications • Components • Grid programming language • Interpreter with dynamic ad-hoc binding capabilities • Runtime environment based on tuple space idea • Evolving language library to share and reuse applications • Technology: components, services, objects
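The "tuple space idea" behind the runtime environment can be shown with a minimal, single-process sketch in the style of the classic Linda model. The slides do not describe GridSpace's actual runtime, so the class and its methods are assumptions for illustration; `take` stands in for Linda's `in`, which is a reserved word in Python, and both lookups are non-blocking for simplicity.

```python
class TupleSpace:
    """Minimal Linda-style tuple space (illustrative, not GridSpace code)."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Publish a tuple into the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        # None in a pattern acts as a wildcard for that position.
        return len(pattern) == len(tup) and all(
            p is None or p == v for p, v in zip(pattern, tup)
        )

    def rd(self, pattern):
        """Return the first matching tuple without removing it."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Return and remove the first matching tuple."""
        tup = self.rd(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


if __name__ == "__main__":
    space = TupleSpace()
    # Hypothetical resource advertisements.
    space.out(("resource", "cpu", "site-a"))
    space.out(("resource", "storage", "site-b"))
    print(space.rd(("resource", "storage", None)))
    print(space.take(("resource", "cpu", None)))
```

Because producers and consumers only share tuple patterns, never direct references, components stay loosely coupled and can be bound ad hoc at run time, which is the property the slide emphasises.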
GridSpace - Additional Abstract Layer • Separates the developer from the ever-changing Grid resource layer • Seamlessly introduces dynamism into newly created applications • Provides unified access to resources by means of semantically described abstractions • Supports an evolving, well-organized library of applications that is kept up to date • Allows easy reuse of already built applications
From Resource Abundance to Programmable Grid • Global Grid Environment: computing power, network transfer, data storage space, sensors and devices, data, software • The environment provides everything for an application: vast space, multiplication of resources, multitude of access standards and protocols Issues • Plenty of resources to build sophisticated applications from • Each new application requires huge effort to overcome integration problems Common Semantic Description Layer • Similar semantic description of all the resources • Using common notions: dependency, requirement, capability • Based on growing Semantic Web/Grid achievements • With natural ability to be extended by multiple users Advantages • Helps build new applications • Common language for various tools and platforms • Does not enforce unification of underlying technology Data/events • Sources • Retainers • Transformers • Consumers
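The common notions named above (dependency, requirement, capability) can be illustrated with a short sketch. The representation is an assumption: capabilities and requirements are modelled as plain string sets and dependencies as a mapping from a component to its prerequisites, whereas a real semantic layer would use ontology terms. Resource and component names are hypothetical. The ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def matching_resources(requirements, resources):
    """Names of resources whose capabilities cover every requirement."""
    return sorted(
        name for name, capabilities in resources.items()
        if requirements <= capabilities
    )


def execution_order(dependencies):
    """Order components so that each one follows its dependencies.

    dependencies maps a component to the set of components it needs.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical resource descriptions.
    resources = {
        "cluster-a": {"mpi", "linux", "storage"},
        "cluster-b": {"linux"},
        "sensor-net": {"events"},
    }
    print(matching_resources({"mpi", "linux"}, resources))

    dependencies = {
        "hydraulics": {"hydrology"},
        "hydrology": {"meteo"},
        "meteo": set(),
    }
    print(execution_order(dependencies))
```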