Slides for the Caltech GAE Workshop June 2003. GAE (Grid Analysis Environment) Overview of Caltech effort. Overview. GAE crucial for LHC experiments Utility of Grids proven for production Their use for Analysis will be the Acid Test of Grids Large, Diverse, Distributed community of users
Slides for the Caltech GAE Workshop, June 2003. GAE (Grid Analysis Environment): Overview of Caltech effort
Overview
• GAE crucial for LHC experiments
• Utility of Grids proven for production
• Their use for Analysis will be the Acid Test of Grids
• Large, Diverse, Distributed community of users
• Support for hundreds/thousands of analysis tasks
• Widely varying requirements
• Need for Priority Schemes, robust authentication and security
• Operation in a severely resource-limited and constrained global system
• GAE is where the physics gets done
• Where physicists learn to collaborate on analysis at a distance
Scope
• Diagram shows “snapshot” in time of analysis activities
• Groups of individuals, geographically separated, work on specific analysis topics (e.g. Supersymmetry)
• Resources in the Grid system are shared between the groups
• Boundaries enclosing the groups move and change shape as the composition or requirements of the groups change
Architecture
• Several candidate computing system architectures have been proposed to support GAE
• At Caltech we have defined the “CAIGEE” Architecture, in collaboration with UCSD, UCR, FNAL and UCD
• Our work is focussed on developing critical missing components of the CAIGEE architecture, creating demonstration-grade applications to determine its validity, and working with other groups on integration of existing software into the CAIGEE scheme
CAIGEE (continued)
• Based on the use of Web Services or Portals to provide heterogeneous clients access to analysis tools and data
• An important feature is support for even semi-infinitely thin clients, such as PDAs with very limited CPU/Memory
• Grid Authentication and transport built in – mediates client/service (portal) traffic
Web Services
• Data/Processing services offered via the Web
• Widely adopted in the commercial world
• Good tools, de facto standard protocols, support etc.
• We have been confirming their usefulness for scientific data and services
  • Access to RDBMS-resident Tags and nTuples (Oracle, SQLServer, PostgreSQL)
  • Access to ROOT files
  • Access to Objectivity object collections
• To do this, we have updated existing tools to “talk” with Web Services:
  • ROOT
  • COJAC (3D event viewer)
  • Others
Web Services - Principles
• Publish makes the service description publicly available.
  • WSDL (Web Services Description Language) is the language used to create the service description.
• Find discovers the web service.
  • UDDI (Universal Description Discovery and Integration) is the directory technology used by service registries. The registries contain descriptions of web services, and support lookup.
• Bind allows the service to be used by the client.
  • SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.
[Diagram: the Service Provider (1) publishes to the Service Registry; the Service Requestor (2) finds the service in the Service Registry and (3) binds to the Service Provider]
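The publish/find/bind cycle above can be sketched with a toy in-memory registry. This is an illustration only, not the workshop's setup: `ServiceRegistry`, `ServiceDescription`, `bind` and the `tag_lookup` service are hypothetical names, the endpoint URL is a placeholder, and plain Python calls stand in for the WSDL, UDDI and SOAP machinery.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ServiceDescription:
    """Stands in for a WSDL document: a name, an endpoint, and the callable itself."""
    name: str
    endpoint: str
    handler: Callable[..., object]


class ServiceRegistry:
    """Stands in for a UDDI registry: stores descriptions and supports lookup."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Step 1 (Publish): the provider makes its description available.
        self._services[description.name] = description

    def find(self, name: str) -> Optional[ServiceDescription]:
        # Step 2 (Find): the requestor discovers the service by name.
        return self._services.get(name)


def bind(description: ServiceDescription) -> Callable[..., object]:
    # Step 3 (Bind): the requestor obtains something it can invoke.
    # A real client would build a SOAP proxy from the WSDL at this point.
    return description.handler


def tag_lookup(run: int) -> dict:
    # Hypothetical provider-side service returning made-up tag data.
    return {"run": run, "n_events": 0}


registry = ServiceRegistry()
registry.publish(ServiceDescription("tag_lookup", "http://example.org/tags", tag_lookup))
found = registry.find("tag_lookup")
service = bind(found)
result = service(42)
```

The point of the separation is that the requestor only needs the registry's address in advance; everything about the provider is learned at the Find step.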
Web Services: Experimental Setup
[Diagram of the experimental setup. Recoverable labels:]
• Service Registry: UDDI Registry Node; Oracle9i server data (metadata); provided at the authentication and security layer of the Grid; data replication through SSL
• Service Provider: WSDL file; available at the Connectivity and Resource layer of the Grid
• Service Requestor: client web application to connect with the database
• Other components: Oracle9i server data (metadata), available on the Fabric layer of the Grid; Java XML API to connect with the database server; proxy server; SOAP HTTP server; SOAP processor; server with master database; distributed database; MS-SQL data (metadata) server with materialized view database
• Connections: SOAP bind with the provided service; UDDI; SOAP request and response
GAE Tools (1) Clarens
• Our emphasis is on accommodating existing analysis tools in our CAIGEE architecture
• To facilitate this, we use the “Clarens Dataserver”
• Clarens is server software that makes datasets and services available to clients in a suitable lingua franca
• Clients initially Grid-authenticate with a Clarens server, and then are able to make use of a wide set of data and analysis services on offer
GAE Tools (2) Clarens
• Clarens uses an interpreted Python framework running inside Apache
• PKI security for CA certificates
• Commodity protocols (http/https) used to talk with clients
• Authorization of Web Service requests using hierarchical ACLs for Virtual Organisations
• Distributed administration of VO/ACLs
• Creating new Clarens services is straightforward: this was one of the design goals.
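One way to picture hierarchical ACLs for Virtual Organisations is the sketch below. The slides do not describe the actual Clarens ACL semantics, so everything here is an assumption: dotted method names such as `file.write`, dotted group names such as `cms.admins`, a "most specific entry wins" rule, and default deny. The table `ACLS` and both functions are hypothetical.

```python
from typing import Dict, Set

# Hypothetical ACL table: each dotted method prefix maps to the VO groups allowed there.
ACLS: Dict[str, Set[str]] = {
    "file": {"cms"},               # any member of the cms VO may call file.*
    "file.write": {"cms.admins"},  # only the admins subgroup may call file.write
}


def is_member(user_groups: Set[str], group: str) -> bool:
    # Membership of a subgroup ("cms.admins") implies membership of its parents ("cms").
    return any(g == group or g.startswith(group + ".") for g in user_groups)


def is_authorized(method: str, user_groups: Set[str]) -> bool:
    parts = method.split(".")
    # Walk from the most specific prefix to the least specific one;
    # the first ACL entry found decides the outcome.
    for depth in range(len(parts), 0, -1):
        prefix = ".".join(parts[:depth])
        allowed = ACLS.get(prefix)
        if allowed is not None:
            return any(is_member(user_groups, g) for g in allowed)
    return False  # no entry anywhere in the hierarchy: deny by default


reader_ok = is_authorized("file.read", {"cms"})
writer_denied = is_authorized("file.write", {"cms"})
admin_ok = is_authorized("file.write", {"cms.admins"})
```

Because entries attach to prefixes, an administrator of one branch of the hierarchy can manage its ACLs without touching the rest, which fits the distributed administration mentioned above.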
GAE Tools (4) Clarens
• Services include:
  • Access to SOCATS (next slide)
  • Storage Resource Broker interface
  • Application execution (submit jobs to cluster schedulers)
  • Proxy escrow
  • File access to files in server filesystem or SRB files
GAE Tools (5) SOCATS
• “STL Optimized Caching and Transport System”
• SOCATS is a general-purpose tool we have developed that is able to deliver large object collections (result sets) in response to an SQL query on an RDBMS
• Targeted at C++ clients who wish to send an SQL query to a remote RDBMS (using the Clarens dataserver) and receive back the database rows/result set as a collection of C++ objects
• Data delivered in binary format (avoids the heavy overhead of explicit XML encoding)
• Large result sets are streamed efficiently to the client, allowing client processing to begin as soon as the first data are available
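The streaming idea can be illustrated in a few lines. This is not SOCATS itself, which is C++/STL based and whose wire format is not given in the slides; it is a minimal Python sketch that assumes a fixed two-column row layout, uses an in-memory SQLite database in place of the remote RDBMS, and uses a generator in place of the network transport. The names `stream_rows`, `decode_batch` and the `tags` table are hypothetical.

```python
import sqlite3
import struct
from typing import Iterator, List, Tuple

# Assumed fixed row layout: a 32-bit event id and a 64-bit float, little-endian.
ROW = struct.Struct("<id")


def stream_rows(conn: sqlite3.Connection, query: str, batch_size: int = 2) -> Iterator[bytes]:
    """Server side: run the query and yield packed rows in small binary batches."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield b"".join(ROW.pack(event_id, pt) for event_id, pt in rows)


def decode_batch(chunk: bytes) -> List[Tuple[int, float]]:
    """Client side: unpack one batch without waiting for the rest of the result set."""
    return [ROW.unpack_from(chunk, offset) for offset in range(0, len(chunk), ROW.size)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (event_id INTEGER, pt REAL)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", [(1, 10.5), (2, 20.25), (3, 30.0)])

received: List[Tuple[int, float]] = []
for chunk in stream_rows(conn, "SELECT event_id, pt FROM tags ORDER BY event_id"):
    received.extend(decode_batch(chunk))  # processing can start on the first chunk
```

Each row costs 12 bytes in this layout, with no per-field tags; an XML encoding of the same row would carry element names and text-formatted numbers for every value.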
GAE Tools (6) GroupMan
• Developed in response to need for user-friendly administration of LDAP based “Virtual Organisations”
• Import to the LDAP server of certificates from CA
• User-friendly GUI allows ad hoc creation of user groups and VOs
• VO data stored to allow easy extraction by standard Grid-based tools
  • E.g. creation of Globus gridmap files
• Part of the DPE distribution
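As an illustration of the gridmap extraction step, the sketch below turns VO membership into Globus grid-mapfile lines, where each line pairs a quoted certificate distinguished name with a local account. How GroupMan lays out its LDAP entries is not described in the slides, so the membership is assumed to have been read into a plain dictionary already; the function `gridmap_lines`, the DNs and the account names are all hypothetical.

```python
from typing import Dict, List


def gridmap_lines(vo_members: Dict[str, List[str]], account_for_vo: Dict[str, str]) -> List[str]:
    """Build grid-mapfile lines of the form: "<certificate DN>" <local account>."""
    lines = []
    for vo in sorted(vo_members):
        account = account_for_vo[vo]
        for dn in sorted(vo_members[vo]):
            lines.append(f'"{dn}" {account}')
    return lines


# Made-up VO membership, standing in for data extracted from the LDAP server.
members = {
    "cms": [
        "/O=Example/OU=People/CN=Bob Example",
        "/O=Example/OU=People/CN=Alice Example",
    ],
}
accounts = {"cms": "cmsuser"}

lines = gridmap_lines(members, accounts)
gridmap_text = "\n".join(lines) + "\n"
```

Sorting the VOs and DNs keeps the generated file stable between runs, which makes changes in membership easy to spot with a plain diff.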
GAE Tools (5) PDA Client
• A handheld GAE client: fruits of collaboration between NUST and Caltech
• Software is Java Analysis Studio (JAS) ported to the Pocket PC 2002 OS
• Hardware is any Pocket PC 2002 device
• This tool is still under development and currently lacks authentication/security components
GAE Tools (6) Collaboration Desktop
• Four-screen desktop analysis setup
• Driven by a single server and single graphics card
• Four flat panel monitors
• Allows simultaneous work on:
  • Traditional analysis tools (e.g. ROOT)
  • Software development (e.g. VS.NET)
  • Event displays (e.g. IGUANA)
  • MonALISA monitoring displays
  • Persistent collaboration (e.g. VRVS)
  • Online event or detector monitoring
  • Web browsing, email
  • Chat windows, instant messaging
  • Shared whiteboards etc.