GEON Architecture: Systems Components Overview
Sandeep Chandra, SDSC
The Geosciences Network (GEON) Cyberinfrastructure Workshop
University of Auckland, New Zealand, 26-28 November 2007
www.geongrid.org
IT Goals
• Develop cyberinfrastructure to support the “day-to-day” conduct of science (e-science), not just “hero” computations
• Based on a Web/Grid services-based distributed environment
• Work closely with geoscientists to help create data sharing frameworks, best practices, and useful and usable capabilities and tools for information integration and knowledge discovery
• The “two-tier” approach
  • Use best practices, including commercial tools, while developing advanced technology in open source and doing CS research
• Leverage other similar cyberinfrastructure projects
Balancing “Empowering” and “Controlling”
• System Deployment
  • Standard reference systems for GEON PoP (point-of-presence) and GEON Portal middleware infrastructure
  • Additional resources can be attached to the PoP
• Software Deployment
  • Centralized software stack definition
  • Locally controlled extensions
• Application Development and Integration
  • Centralized web-based portal for access to core resources
  • Local portals provide customization into users' home environment and access to local expertise
• Security
  • Centralized user account policies
  • Locally defined “non-grid” user policies
Software Layers
Hardware Deployment
• Vendors
  • Dell (40 prod systems + devel systems)
    • PowerEdge 2950-based systems
    • Dual Core 2.8 GHz Intel Xeon
    • 750 GB SAS, 2-4 GB RAM
  • ProMirco (3 systems)
    • Dual Pentium 4 TB + RAID
  • HP Cluster donation (9 systems)
    • Rx2600-based dual 1.4 GHz
• PoPs (PI Institutes, Project Partners, International Partners)
  • 23 servers in 23 domains
• Compute and Data Clusters
  • 4 small clusters (3-4 nodes each)
  • 3 medium clusters (8-9 nodes)
  • 1 large cluster (30,000 SUs on TeraGrid)
• Data Storage
  • 3 data nodes (4 TB)
  • 12 TB online SAN
  • 10 TB tape archive
• Misc. Equipment
  • Switches, racks, etc.
Partner Sites
Deployment Architecture
• Hardware Deployment
  • Each site runs a PoP
  • Optional cluster and data nodes
• Users access resources through PoP
  • PoP provides point of entry
  • PoP provides access to global services in GEON
• Developers add services & data hosted on GEON resources
  • Portal Services, Application Services, Web services/Grid Services
GEON Hardware Facility
Systems Software
• Unified Software Stack definition
  • Custom GEON Roll
    • GEON Portal
    • Web/Grid Services software stack
    • Common GEON Applications and Services
• Focus on scalable systems management
  • Modified Rocks for wide-area cluster management
  • Mechanism to provide local extensions to base software stack definition
• Collaborations with partner sites
  • Identified appropriate contacts
  • Helping partner sites in systems development
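A minimal sketch, in Python, of how a centrally defined software stack might be combined with locally controlled extensions. The `resolve_stack` helper and the "site-local" version label are hypothetical and are not part of Rocks or the GEON roll; the rule that a site may add packages but not override centrally defined ones is an assumption made for illustration. The version numbers are the ones listed for the GEON software stack.

```python
# Hypothetical illustration: central stack definition plus site-local extensions.
BASE_STACK = {
    "globus": "4.0.2",
    "gridsphere": "2.0.2",
    "postgres": "8.0.3",
}


def resolve_stack(base, local_extensions):
    """Return the effective stack for one site.

    Assumption: a site may add packages, but may not override a
    centrally defined package.
    """
    conflicts = sorted(set(base) & set(local_extensions))
    if conflicts:
        raise ValueError(f"local extension overrides base packages: {conflicts}")
    effective = dict(base)            # copy so the central definition is untouched
    effective.update(local_extensions)
    return effective


# A site that adds a locally hosted GRASS roll on top of the base stack.
asu_stack = resolve_stack(BASE_STACK, {"grass": "site-local"})
print(sorted(asu_stack))
```

Keeping the base definition immutable and rejecting overrides is one way to preserve a common reference system while still letting sites extend it.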
GEON Software Stack
• Base OS
  • Rocks: highly programmatic software configuration management
• Development
  • Globus 4.0.2 (GSI, GridFTP, etc.)
  • Web Services (Jakarta-tomcat-5.0.28, axis-1.2, ant-1.6, jdk1.4.2, etc.)
  • GridSphere 2.0.2 Portal Framework
• Database
  • IBM DB2
  • Postgres 8.0.3
  • PostGIS 1.2 (Geos, Proj)
• Security
  • Tripwire, chkrootkit
• System Monitoring
  • INCA Testing and Monitoring framework (TeraGrid), with GRASP benchmarks
  • Network Weather Service (NWS)
  • Ganglia
• Job Submission and Monitoring
  • Condor, PBS
[Diagram: GEONGrid Software Stack. Labels: GridSphere Portal; GRASS (GDAL, NetCDF, Tiff); GMT; PBS; Condor; NWS; INCA/GRASP; Globus; OGSA-DAI; Pre-Web; Axis; Tomcat; Postgres; PostGIS; Geos; Proj; Ant; Samba; JDK; Tripwire; Rocks 4.2.1 based on RedHat Enterprise Linux]
Wide-Area Cluster Management
Federico Sacerdoti, Sandeep Chandra, and Karan Bhatia, “Grid Systems Deployment and Management using Rocks”, IEEE Cluster 2004, Sept. 20-23, 2004, San Diego, California.
GEON Rocks Central
• Local extensions to software stack
• Partner sites package and maintain locally hosted rolls.
• Provides easy installation and automatic configuration of software on nodes.
• A highly customized node.
[Diagram: central.<X>.geongrid.org is the <X> Central Server (<Y> Roll). SDSC Central Server (GEONGrid Roll) at central.sdsc.geongrid.org; ASU Central Server (GRASS Roll) at central.asu.geongrid.org; UTEP Central Server (GMT Roll) at central.utep.geongrid.org; GEON Frontend (GEONGrid+GRASS+GMT) with Compute nodes]
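The naming pattern central.<X>.geongrid.org can be sketched as a small lookup: given the rolls a frontend wants, find which site's central server hosts each one. This is an illustrative Python sketch only; `central_host` and `roll_sources` are hypothetical helpers, not Rocks commands, and the site-to-roll table simply restates the three examples from the slide.

```python
# Site -> roll hosted on that site's central server (from the slide's examples).
SITE_ROLLS = {"sdsc": "GEONGrid", "asu": "GRASS", "utep": "GMT"}


def central_host(site):
    """Build a central server name following central.<X>.geongrid.org."""
    return f"central.{site.lower()}.geongrid.org"


def roll_sources(wanted_rolls):
    """Map each wanted roll to the central server that hosts it."""
    by_roll = {roll: site for site, roll in SITE_ROLLS.items()}
    return {roll: central_host(by_roll[roll]) for roll in wanted_rolls}


# A frontend built from GEONGrid + GRASS + GMT pulls from three central servers.
frontend = roll_sources(["GEONGrid", "GRASS", "GMT"])
for roll, host in frontend.items():
    print(f"{roll}: {host}")
```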
Additional Infrastructure
• Production/Beta/Development servers
  • 8 Production servers used for various activities
  • 3 Beta servers
  • 1 Common Development server
  • 1 Central server for hosting Rocks and GEON stack
• Blogs, Forums, Calendar, RSS
• Bug tracking software (JIRA)
• CVS/SVN services
  • svn.geongrid.org
• GEON Certificate Authority
  • gama.geongrid.org
Grid/Web Core Middleware Services
• Goals
  • Evaluate core software infrastructure
  • Collaborate and engineer solutions as needed
  • Integrate or build as necessary
• Portal Middleware Infrastructure (GEON Portal)
• Security Infrastructure (GAMA)
• Naming and Discovery Infrastructure (Handle.net)
• Data Management and Replication (SRB, RLS)
• Generic Mediation
Core Services
• Authentication
  • GSI, CAS, SAML, MyProxy, CACL-CA, Naregi-CA, GAMA
• Monitoring
  • NWS, INCA
• Scheduling
  • Condor, CSF (Community Scheduling Framework)
• Cataloging
  • RLS (Replica Location Service), Handle.net
• Data Transfer and Management
  • GridFTP, SRB
• Replication
  • RLS, SRB
• Databases
  • Postgres, PostGIS, DB2
Portal Infrastructure
• GridSphere Portal Framework
  • Developed by GridLab (Jason Novotny and others)
  • Albert Einstein Institute, Berlin, Germany
  • Java/JSP Portlet Container
  • JSR 168 support, WSRP and JSF
• Supports
  • Collaboration (standard portlet API)
  • Personalization (e.g. my.yahoo.com)
  • Grid Services (GSI support)
  • Web Services
• Other Frameworks
  • Open Grid Computing Environments (OGCE)
  • Apache JetSpeed based on Sakai
GEON Portal
• GEON Portal provides:
  • Authenticated access to data and Web services
  • Registration of data sets, tools, and services with metadata
  • Search for data, tools, and services, using ontologies
  • Scientific workflow environment and access to HPC
  • Data and map integration capability
  • Scientific data visualization and GIS mapping
Distributed Portals
• Distributed portal architecture
  • Allows partner sites to “brand” their portal
  • Facilitates development by partners
  • Allows custom apps for each site
• Unified user login
  • GSI based, managed by GEON system
• Networking for local organizations
End-user access through Distributed Portals
• Local customization
  • Partner sites can customize the local portal to the specific needs of the users at that site.
• Supports integration of local resources
  • Each site may have significant local resources that can be integrated for local and external users.
• Supports code development
Challenges
• Managing distributed catalogs
• Integration of tools
• Complete automation of portal middleware deployment process
Data Portal Middleware
• Portal Server
  • Dual Core Xeon
  • 750 GB SAS
  • 4-8 GB RAM
  • Rocks, GEON
• Data Server
  • Dual Core Xeon
  • 1.5 TB RAID 5
  • 4-8 GB RAM
  • Rocks, SRB
• CA Server
  • Dual Core Xeon
  • 30 GB SCSI
  • 2 GB RAM
  • Rocks, GAMA
Data Portals Deployed
Security Infrastructure
• Problem
  • Portal users need access to various Grid-enabled resources for job submission, data management, instrument control, etc.
  • Standard security mechanism is GSI (Grid Security Infrastructure). Typically involves:
    • Creation of credentials for a new user
    • Storage of a proxy in MyProxy by user
    • Retrieval of proxy upon user login to portal
    • Configuration of resources to accept credentials
Security Infrastructure
• GSI based
• Collaboration with Telescience & BIRN
• GEON certificate authority: gama.geongrid.org
• SDSC CACL system
• Role-based access control by extending GridSphere capabilities
  • geonAdmin, geonPI, geonUser, public
• Portal Integration
  • Account requests, certificate management
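The four roles named above suggest a simple role check. The following Python sketch is only an illustration: it assumes the roles form a strict hierarchy (public < geonUser < geonPI < geonAdmin), which the slides do not state, and `has_access` is a hypothetical helper rather than a GridSphere API.

```python
# Assumed ordering, lowest privilege first. The slides list the roles but do
# not say they are strictly ordered; that is an assumption of this sketch.
ROLE_ORDER = ["public", "geonUser", "geonPI", "geonAdmin"]


def has_access(user_role, required_role):
    """Return True if user_role is at least as privileged as required_role."""
    return ROLE_ORDER.index(user_role) >= ROLE_ORDER.index(required_role)


print(has_access("geonPI", "geonUser"))   # a PI can use user-level tools
print(has_access("public", "geonUser"))   # anonymous visitors cannot
```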
GAMA: Grid Account Management Architecture
• A Solution
  • Install command-line security infrastructure on a dedicated, locked-down machine (GAMA server)
  • Wrap apps in Web Services on GAMA server
  • Construct GridSphere portlets and services for submitting and managing account requests from users on a portal server
  • Configure GridSphere to automatically retrieve a proxy from the GAMA server when a user logs on to the portal
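The flow described above (account request, approval, then automatic proxy retrieval at portal login) can be sketched with an in-memory stand-in. Everything here is hypothetical: `GamaServerSketch` and `portal_login` are illustrative names, not the GAMA web-service interface, and the "proxy" is a random token with an expiry rather than a real GSI proxy certificate.

```python
import hashlib
import os
import secrets
import time


class GamaServerSketch:
    """In-memory stand-in for the GAMA server; not the real GAMA API."""

    def __init__(self):
        self._pending = {}    # username -> (salt, password hash), awaiting approval
        self._accounts = {}   # username -> (salt, password hash), approved

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def request_account(self, username, password):
        salt = os.urandom(16)
        self._pending[username] = (salt, self._hash(password, salt))

    def approve(self, username):
        # The real system would create credentials at this point; the sketch
        # only moves the record into the approved set.
        self._accounts[username] = self._pending.pop(username)

    def retrieve_proxy(self, username, password, lifetime_s=3600):
        salt, stored = self._accounts[username]   # KeyError if not approved
        if not secrets.compare_digest(stored, self._hash(password, salt)):
            raise PermissionError("bad credentials")
        # Placeholder for a short-lived proxy credential.
        return {
            "subject": username,
            "token": secrets.token_hex(16),
            "expires": time.time() + lifetime_s,
        }


def portal_login(gama, username, password):
    """Portal-side login: fetch a proxy from the GAMA server automatically."""
    return gama.retrieve_proxy(username, password)


gama = GamaServerSketch()
gama.request_account("alice", "s3cret")
gama.approve("alice")
proxy = portal_login(gama, "alice", "s3cret")
print(proxy["subject"], proxy["expires"] > time.time())
```

Keeping credential handling on one server and exposing only a narrow retrieval call to the portal mirrors the "dedicated, locked-down machine" idea in the slide.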
GAMA Services
Kurt Mueller, Sandeep Chandra, and Karan Bhatia, “GAMA: Grid Account Management Architecture”, IEEE E-Science 2005, Melbourne, Australia, Dec. 2005.
GAMA Portal Components
• PortletServices
  • AccountRequestService: all object persistence methods, and many other account management methods
  • GAMAClientService: encapsulates all communication with GAMA server
• GAMAAuthModule: provides GridSphere login and automatic credential retrieval
• ActionPortlets
• AccountRequest objects
• RequestApprovalRule objects
• GridSphere DB (hibernate)
• Utility classes
  • FormInputValidation
  • SendMail
GAMA Supports
• GSI-based using best practices
• Global account acceptance policies
• Supports importing of grid accounts (privileged user)
• Supports non-grid local accounts (non-privileged user)
• Supports portals, clusters and rich clients
• Packaged as Rocks rolls
Authorization
• Extending GridSphere user db
• Users can be authorized for access to tools/services at various levels
  • Registration, LiDAR, SYNSEIS
Naming and Discovery
• Naming
  • All service instances, datasets and applications
  • Two-level naming scheme to support replication and versioning
  • Globally unique and resolvable
• Resolution
  • Handle system (evaluating)
  • Collaborating with EarthChem project
• Discovery
  • Discover resources in heterogeneous metadata repositories
    • MCAT, MCS, Geography Network (ESRI), OPeNDAP
  • UDDI
  • Replica Location Service (Globus)
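One way to read the two-level naming scheme is: a globally unique logical name resolves first to a version, and each version resolves to one or more replica locations. The slides do not spell out the scheme, so the Python sketch below is an assumption throughout; `NameRegistrySketch`, the example identifier and the example.org locations are all hypothetical and unrelated to the Handle System API.

```python
from collections import defaultdict


class NameRegistrySketch:
    """Hypothetical two-level registry: name -> version -> replica locations."""

    def __init__(self):
        self._entries = defaultdict(lambda: defaultdict(list))

    def register(self, logical_name, version, location):
        """Record one replica of one version of a named resource."""
        self._entries[logical_name][version].append(location)

    def resolve(self, logical_name, version=None):
        """Return (version, replicas); default to the highest version."""
        versions = self._entries.get(logical_name)
        if not versions:
            raise KeyError(logical_name)
        if version is None:
            version = max(versions)
        if version not in versions:
            raise KeyError((logical_name, version))
        return version, list(versions[version])


registry = NameRegistrySketch()
registry.register("geon/dataset-example", 1, "https://site-a.example.org/d/v1")
registry.register("geon/dataset-example", 2, "https://site-a.example.org/d/v2")
registry.register("geon/dataset-example", 2, "https://site-b.example.org/d/v2")

print(registry.resolve("geon/dataset-example"))
print(registry.resolve("geon/dataset-example", version=1))
```

Separating the stable logical name from versions and replica locations lets data move or be copied without changing the identifier users cite.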
Data Management & Replication
• Data Movement and Storage
  • GridFTP
  • SRB Server
• Caching and Replication
  • Replica Location Service (RLS)
• Data Services Metrics
  • GRASP
  • Inca
GIS Map Integration Mediation Services
System Monitoring and Benchmarking
• Inca for user-level monitoring of Grid functionality and performance
• Measure bandwidth, latency and other system metrics
• Use Globus, GRASP, INCA and NWS frameworks
• Archive results and display data continuously
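As a rough illustration of the measure-then-archive loop above, the Python sketch below turns raw transfer and latency samples into a summary record that could be appended to an archive. It does not use Inca, NWS or GRASP; `bandwidth_mbps`, `summarize` and `archive_record` are hypothetical helpers, and the sample numbers are made up for the example.

```python
import json
import statistics


def bandwidth_mbps(num_bytes, seconds):
    """Convert a timed transfer into megabits per second."""
    return num_bytes * 8 / seconds / 1_000_000


def summarize(samples):
    """Summarize a list of numeric samples (e.g. latency in ms)."""
    return {
        "count": len(samples),
        "min": min(samples),
        "mean": statistics.mean(samples),
        "max": max(samples),
    }


def archive_record(site, metric, samples):
    """Build one JSON line suitable for appending to an archive file."""
    return json.dumps({"site": site, "metric": metric, **summarize(samples)})


latencies_ms = [10.0, 20.0, 30.0]                 # made-up sample values
print(archive_record("sdsc", "latency_ms", latencies_ms))
print(bandwidth_mbps(1_000_000, 1.0))             # 1 MB in 1 s
```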
System Monitoring and Benchmarking
Summary
• Physical Layer
  • Deploy hardware
• Systems Layer
  • Developing management software and collaborations with partner sites
  • Developing and deploying GEON middleware
  • Collaborating with partner sites to develop local software stack extensions
• Grid Layer
  • Services for Portal & Security (Authentication & Authorization)
  • Naming & Discovery, Data Management & Replication, and Mediation
• Applications Layer
  • Apps ready, used as templates for how to build apps in GEON
iGEON Sites
• University of Hyderabad, India (iGEON India network with 2 more sites)
• Russian Academy of Sciences, Moscow
• Chinese Academy of Sciences, China
• AIST GeoGrid, Japan
• AuScope, Australia
• and now University of Auckland, New Zealand
Looking Ahead: GEON 2.0
• Goals:
  • Make existing GEON systems and middleware infrastructure more robust
  • Build new useful tools on top of the existing framework
  • Encourage software development and resource integration with partner sites
  • More data, more apps
Resources
• http://geongrid.org
• http://portal.geongrid.org
• http://grid-devel.sdsc.edu/gama
• www.rocksclusters.org
• www.globus.org
• www.gridsphere.org
Acknowledgements
• GEON Team
• Grid-Devel Group
• Rocks Group
• University of Auckland
• BeSTGRID
Questions or Feedback?
Mail: systems@geongrid.org