Australian Partnership for Advanced Computing. Partners: Australian Centre for Advanced Computing and Communications ( ac3 ) in NSW The Australian National University ( ANU ) Commonwealth Scientific and Industrial Research Organisation ( CSIRO )
Australian Partnership for Advanced Computing
Partners:
• Australian Centre for Advanced Computing and Communications (ac3) in NSW
• The Australian National University (ANU)
• Commonwealth Scientific and Industrial Research Organisation (CSIRO)
• Interactive Virtual Environments Centre (iVEC) in WA
• Queensland Parallel Supercomputing Foundation (QPSF)
• South Australian Partnership for Advanced Computing (SAPAC)
• The University of Tasmania (TPAC)
• Victorian Partnership for Advanced Computing (VPAC)
“providing advanced computing and grid infrastructure for eResearch”
Rhys Francis, Manager, APAC grid program
APAC Programs
• National Facility Program
  • a world-class advanced computing service
  • currently 232 projects and 659 users (27 universities)
  • major upgrade in capability (1650 processor Altix 3700 system)
• APAC Grid Program
  • integrate the National Facility and Partner Facilities
  • allow users easier access to the facilities
  • provide an infrastructure for Australian eResearch
• Education, Outreach and Training Program
  • increase skills to use advanced computing and grid systems
  • courseware project
  • outreach activities
  • national and international activities
Organisation chart labels: Steering Committee; Research Leader; Project Leader, Activities (two boxes); Research Activities; Development Activities; Engineering Taskforce, APAC Grid Development; Implementation Taskforce, APAC Grid Operation
Effort: 140 people, >50 full-time equivalents; $8M pa in people, plus compute/data resources
Projects
Grid Applications
• Astronomy
• High-Energy Physics
• Bioinformatics
• Computational Chemistry
• Geosciences
• Earth Systems Science
Grid Infrastructure
• Computing Infrastructure
  • Globus middleware
  • certificate authority
  • system monitoring and management (grid operation centre)
• Information Infrastructure
  • resource broker (SRB)
  • metadata management support (Intellectual Property control)
  • resource discovery
• User Interfaces and Visualisation Infrastructure
  • portals to application software
  • workflow engines
  • visualisation tools
Experimental High Energy Physics
• Belle Physics Collaboration
  • K.E.K. B-factory detector
  • Tsukuba, Japan
  • Matter/Anti-matter investigations
  • 45 Institutions, 400 users worldwide
  • 10 TB data currently
  • Australian grid for KEK-B data
  • testbed demonstrations
  • data grid centred on APAC National Facility
• Atlas Experiment
  • Large Hadron Collider (LHC) at CERN
  • 3.5 PB data per year (now 15 PB pa)
  • operational in 2007
  • Installing LCG (GridPP), will follow EGEE
Belle Experiment
• Simulated collisions or events
  • used to predict what we’ll see (features of data)
  • essential to support design of systems
  • essential for analysis
• 2 million lines of code
Belle simulations
• Computationally intensive
  • simulate beam particle collisions, interactions, decays
  • all components and materials: 10 x 10 x 20 m, 100 µm accuracy
  • tracking and energy deposition through all components
  • all electronics effects (signal shapes, thresholds, noise, cross-talk)
  • data acquisition system (DAQ)
• Need 3 times as many simulations as real events to reduce statistical fluctuations
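A rough illustration of the "3 times as many simulations" rule of thumb. This is a sketch under a stated assumption that is not in the slides: data and simulation are treated as independent counting samples, so comparing N real events with k·N simulated events (scaled back by 1/k) inflates the statistical uncertainty by sqrt(1 + 1/k) relative to the data alone. The function name is illustrative only.

```python
import math


def uncertainty_inflation(sim_to_data_ratio: float) -> float:
    """Growth in statistical uncertainty when data are compared with a
    simulated sample `sim_to_data_ratio` times larger than the data sample.

    Assumption (not from the slides): independent Poisson-like counts.
    Data variance ~ N; scaled simulation variance ~ N / k; sum ~ N * (1 + 1/k).
    """
    if sim_to_data_ratio <= 0:
        raise ValueError("ratio must be positive")
    return math.sqrt(1.0 + 1.0 / sim_to_data_ratio)


if __name__ == "__main__":
    for k in (1, 3, 10):
        print(f"k = {k:2d}: uncertainty x {uncertainty_inflation(k):.3f}")
```

Under this assumption, k = 1 costs about 41% extra uncertainty, k = 3 about 15%, and k = 10 about 5%, which shows why a factor of a few is a reasonable compromise against compute cost.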
Belle status
• Apparatus at KEK in Japan
• Simulation work done worldwide
• Shared using an SRB federation: KEK, ANU, VPAC, Korea, Taiwan, Krakow, Beijing… (led by Australia!)
• Previous research work used script-based workflow control; the project is currently evaluating LCG middleware for workflow management
• Testing in progress: LCG job management, APAC grid job execution (2 sites), APAC grid SRB data management (2 sites) with data flow using international SRB federations
• Limitation is international networking
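To see why international networking can be the limiting factor, here is a back-of-envelope sketch of bulk transfer time. The 10 TB figure is the Belle data volume quoted elsewhere in these slides; the link speeds and the helper name are assumptions chosen only for illustration, and protocol overhead is ignored unless an efficiency factor is supplied.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes: float, link_bits_per_s: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move `data_bytes` over a link of `link_bits_per_s`.

    `efficiency` (0..1] is the assumed fraction of the nominal link rate
    that is actually achieved end to end.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    belle_bytes = 10e12  # 10 TB, decimal terabytes
    for label, rate in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
        # Hypothetical link speeds, not taken from the slides.
        print(f"{label}: {transfer_days(belle_bytes, rate):.2f} days")
```

At a fully used 100 Mbit/s the transfer takes a little over nine days; a tenfold faster link brings it under one day.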
Earth Systems Science Workflow
• Access to Data Products
  • Intergovernmental Panel on Climate Change (IPCC) scenarios of future climate (3TB)
  • Ocean Colour Products of Australasian and Antarctic region (10TB)
  • 1/8 degree ocean simulations (4TB)
  • Weather research products (4TB)
  • Earth Systems Simulations
  • Terrestrial Land Surface Data
• Grid Services
  • Globus based version of OPeNDAP (UCAR/NCAR/URI)
  • Server side analysis tools for data sets: GRADS, NOMADS
  • Client side visualisation from on-line servers
  • THREDDS (catalogues of OPeNDAP repositories)
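OPeNDAP lets a client request only a slice of a remote dataset by appending a constraint expression to the dataset URL. The sketch below builds such a URL using the DAP2 hyperslab form `variable[start:stride:stop]`. The server address, file name, and variable name are hypothetical placeholders, not APAC endpoints, and some servers may require the square brackets to be percent-encoded.

```python
from typing import Sequence, Tuple


def opendap_subset_url(dataset_url: str, variable: str,
                       index_ranges: Sequence[Tuple[int, ...]],
                       response: str = "dods") -> str:
    """Build a DAP2 request URL for an index-based subset of one variable.

    Each entry of `index_ranges` is (start, stop) or (start, stride, stop),
    one entry per array dimension, using zero-based inclusive indices.
    `response` is the DAP2 suffix, e.g. "dods" (binary data) or "dds".
    """
    hyperslab = ""
    for rng in index_ranges:
        if len(rng) == 2:
            start, stop = rng
            stride = 1
        elif len(rng) == 3:
            start, stride, stop = rng
        else:
            raise ValueError("each range needs 2 or 3 integers")
        hyperslab += f"[{start}:{stride}:{stop}]"
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


if __name__ == "__main__":
    # Hypothetical dataset and variable: first time step, a lat/lon window.
    url = opendap_subset_url("http://example.org/opendap/sst.nc", "sst",
                             [(0, 0), (10, 20), (30, 40)])
    print(url)
```

Subsetting on the server in this way means only the requested window crosses the network, which matters for the multi-terabyte products listed above.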
Workflow Vision Analysis Toolkit Discovery Visualisation Crawler Job/Data Management OPeNDAP APAC NF VPAC AC3 SAPAC IVEC Digital Library
Workflow Components Analysis Toolkit Portlet Gridsphere Portal Discovery Portlet Visualisation Portlet Get Data Portlet Web Services Web Map Service Web Coverage Service Web Processing Service OAI Library API (Java) Live Access Server (LAS) OPeNDAP Server Processing App. Application Layer Data Layer Metadata Metadata Crawler Config Hardware Layer Digital Repository Compute Engine
OPeNDAP Services
• APAC NF (Canberra): International IPCC model results (10-50 TB); TPAC 1/8 degree ocean simulations (7 TB)
• AC3 Facility (Sydney): Land surface datasets
• Met Bureau Research Centre (Melbourne): Near real-time LAPS analyses products (<1 GB); Sea- and sub-surface temperature products
• CSIRO HPSC (Melbourne): IPCC CSIRO Mk3 model results (6 TB)
• CSIRO Marine Research (Hobart): Ocean colour products & climatologies (1 TB); Satellite altimetry data (<1 GB); Sea-surface temperature product
• TPAC & ACE CRC (Hobart): NCEP2 (150 GB), WOCE3 Global (90 GB); Antarctic AWS (150 GB), Climate modelling (4 GB); Sea-ice simulations, 1980-2000
query List of matches get( ) MCAT SRB Australian Virtual Observatory User SSA AVD SRB Registry query List of matches Data SSA SIA
APAC Grid Geoscience
• Conceptual models
• Databases
• Modeling codes
• Mesh generators
• Visualization packages
• People
• High Performance Computers
• Mass Storage Facilities
Mantle Convection
• Observational Databases
  • access via SEE Grid Information Services standards
  • Earthbytes 4D Data Portal
  • allows users to track observations through geological time and use them as model boundary conditions and/or to validate process simulations
• Mantle Convection
  • solved via Snark on HPC resources
• Modeling Archive
  • stores the problem descriptions so they can be mined and audited
• Trial application provided by:
  • D. Müller (Univ. of Sydney)
  • L. Moresi (Monash Univ./MC2/VPAC)
Workflows and services User Edit Problem Description Run Simulation Job Monitor Archive Search Local Repository Login Data Management Service Results Archive Resource Registry Job Management Service AAA Service Registry Rock Prop. W.A EarthBytes Service Geology W.A HPC Repository Snark Service Rock Prop. N.S.W Geology S.A
Key steps
• Implementation of our own CA
• Adoption of VDT middleware packaging
• Agreement to a GT2 base for 2005, GT4 in 2006
• Agreement on portal implementation technology
• Adoption of federated SRB as base for shared data
• Development of gateways for site grid architecture
• Support for inclusion of ‘associated’ systems
• Implementation of VOMS/VOMRS
• Development of user and provider policies
• Apache HTTPD, v2.0.54
• Apache Tomcat, v4.1.31
• Apache Tomcat, v5.0.28
• Clarens, v0.7.2
• ClassAds, v0.9.7
• Condor/Condor-G, v6.7.12
• VDT Condor configuration script
• DOE and LCG CA Certificates, v4 (includes LCG 0.25 CAs)
• DRM, v1.2.9
• EDG CRL Update, v1.2.5
• EDG Make Gridmap, v2.1.0
• Fault Tolerant Shell (ftsh), v2.0.12
• Generic Information Provider, v1.2 (2004-05-18)
• gLite CE Monitor, v1.0.2
• Globus Toolkit, pre web-services, v4.0.1 + patches
• Globus Toolkit, web-services, v4.0.1
• GLUE Schema, v1.2 draft 7
• Grid User Management System (GUMS), v1.1.0
• GSI-Enabled OpenSSH, v3.5
• Java SDK, v1.4.2_08
• jClarens, v0.6.0
• jClarens Web Service Registry, v0.6.0
• JobMon, v0.2
• KX509, v20031111
• Monalisa, v1.2.46
• MyProxy, v2.2
• MySQL, v4.0.25
• Nest, v0.9.7-pre1
• Netlogger, v3.2.4
• PPDG Cert Scripts, v1.6
• PRIMA Authorization Module, v0.3
• PyGlobus, vgt4.0.1-1.13
• RLS, v3.0.041021
• SRM Tester, v1.0
• UberFTP, v1.15
• Virtual Data System, v1.4.1
• VOMS, v1.6.7
• VOMS Admin (client 1.0.7, interface 1.0.2, server 1.1.2), v1.1.0-r0
VDT components
• DOE and LCG CA Certificates v4 (includes LCG 0.25 CAs)
• GriPhyN Virtual Data System (containing Chimera and Pegasus) 1.2.14
• Condor/Condor-G 6.6.7
• VDT Condor configuration script
• Fault Tolerant Shell (ftsh) 2.0.5
• Globus Toolkit 2.4.3 + patches
• VDT Globus configuration script
• GLUE Schema 1.1, extended version 1
• GLUE Information Providers CVS version 1.79, 4-April-2004
• EDG Make Gridmap 2.1.0
• EDG CRL Update 1.2.5
• GSI-Enabled OpenSSH 3.4
• Java SDK 1.4.2_06
• KX509 2031111
• Monalisa 1.2.12
• MyProxy 1.11
• PyGlobus 1.0.6
• UberFTP 1.3
• RLS 2.1.5
• ClassAds 0.9.7
• Netlogger 2.2
Our most important design decision
• Installing Gateway Servers at all grid sites, using VM technology to support multiple grid stacks
• High bandwidth, dedicated private networking between grid sites
• Gateways will support GT2, GT4, LCG/EGEE, Data grid (SRB etc), Production Portals, development portals, experimental grid stacks
Diagram labels: Cluster, Datastore, Gateway Server, V-LAN
Gateway Systems
• Support the basic operation of the APAC National Grid and translate grid protocols into site specific actions
  • limit the number of systems that need grid components installed and managed
  • enhance security as many grid protocols and associated ports only need to be open between the gateways
  • in many cases only the local gateways need to interact with site systems
  • support roll-out and control of production grid configuration
  • support production and development grids and local experimentation using Virtual Machine implementation
Grid pulse – every 30 minutes
• NG1 (globus toolkit 2 services): ANU, iVEC, VPAC
• NG2 (globus toolkit 4 services): iVEC, SAPAC (down), VPAC
• NGDATA (SRB & GridFTP): ANU, iVEC, VPAC (down)
• NGLCG (special physics stack): VPAC
• NGPORTAL (apache/tomcat): iVEC, VPAC
Status report:
Gateway Down grid_pulse@vpac.org
Gateway Up root@ng1.apac.edu.au
Gateway Up root@ng1.ivec.org
Gateway Up root@ng1.vpac.org
Gateway Up root@ng2.ivec.org
Gateway Down root@ng2.sapac.edu.au
Gateway Up root@ng2.vpac.org
Gateway Up root@ngdata.apac.edu.au
Gateway Up root@ngdata.ivec.org
Gateway Down root@ngdata.ngdata
Gateway Down root@ngdata.vpac.org
Gateway Up root@ngdom0.apac.edu.au
Gateway Up root@nggateway.sf.utas.edu.au
Gateway Up root@nglcg.vpac.org
Gateway Up root@ngportal.ivec.org
Gateway Up root@ngportal.vpac.org
Gateway Up root@xen-d.vpac.org
Gateway Down root@xen-t.vpac.org
Gateway Up root@xen.vpac.org
http://goc.vpac.org/
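A status report in the "Gateway Up/Down user@host" form shown on this slide is easy to summarise mechanically. The following is a minimal sketch that parses such text and lists the gateways reported down. The parsing approach and function names are assumptions for illustration; the slides do not describe how the grid pulse itself is implemented.

```python
import re
from typing import Dict, List

# Matches entries such as "Gateway Up root@ng1.vpac.org".
STATUS_RE = re.compile(r"Gateway\s+(Up|Down)\s+(\S+@\S+)")


def parse_pulse(report: str) -> Dict[str, str]:
    """Map each reporting address to its state ("Up" or "Down")."""
    return {address: state for state, address in STATUS_RE.findall(report)}


def down_gateways(statuses: Dict[str, str]) -> List[str]:
    """Return the addresses reported as Down, sorted alphabetically."""
    return sorted(addr for addr, state in statuses.items() if state == "Down")


if __name__ == "__main__":
    # A few entries copied from the slide's status report.
    sample = (
        "Gateway Up root@ng1.apac.edu.au "
        "Gateway Down root@ng2.sapac.edu.au "
        "Gateway Up root@ng2.vpac.org "
        "Gateway Down root@ngdata.vpac.org"
    )
    statuses = parse_pulse(sample)
    print(f"{len(statuses)} gateways reporting, down: {down_gateways(statuses)}")
```

Because the regular expression ignores line breaks, the same parser works whether the report arrives one entry per line or as a single run of text.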
A National Grid
+3500 processors, +3PB near line storage
Map labels: Townsville, QPSF, Brisbane, Perth, IVEC, CSIRO, Adelaide, SAPAC, Canberra, ANU, Sydney, ac3, Melbourne, VPAC, CSIRO, Hobart, TPAC, CSIRO
Network legend: GrangeNet Backbone, Centie/GrangeNet Link, AARNet Links
Significant Resource Base
Mass stores (15TB cache, 200+ TB holdings, 3PB capacity)
• ANU 5+1300 TB
• CSIRO 5+1300 TB
• plus several 70-100 TB stores
Compute Systems (aggregate 3500+ processors)
• Altix, 1,680 1.6 GHz Itanium-II, 3.6 TB, 120 TB disk
• NEC, 168 SX-6 vector cpus, 1.8 TB, 22 TB disk
• IBM, 160 Power 5 cpus, 432 GB
• 2 x Altix, 160 1.6 GHz Itanium-II, 160 GB
• 2 x Altix, 64 1.5 GHz Itanium-II, 120 GB, NUMA
• Altix, 128 1.3 GHz Itanium-II, 180 GB, 5TB disk, NUMA
• 374 x 3.06 GHz Xeon, 374 GB, Gigabit Ethernet
• 258 x 2.4 GHz Xeon, 258 GB, Myrinet
• 188 x 2.8 GHz Xeon, 160 GB, Myrinet
• 168 x 3.2 GHz Xeon, 224 GB, GigE, 28 with infiniband
• 152 x 2.66GHz P4, 153 GB, 16TB disk, GigE
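The "aggregate 3500+ processors" figure can be checked against the listed systems. The sketch below sums the processor counts as transcribed from this slide under two readings of the "2 x Altix" entries, since the slide does not say whether the quoted count is per system or shared across the pair. The labels and the tuple layout are illustrative, not part of the original.

```python
# (label, number of systems, processors as quoted on the slide)
SYSTEMS = [
    ("Altix, 1.6 GHz Itanium-II", 1, 1680),
    ("NEC SX-6", 1, 168),
    ("IBM Power 5", 1, 160),
    ("2 x Altix, 1.6 GHz Itanium-II", 2, 160),
    ("2 x Altix, 1.5 GHz Itanium-II", 2, 64),
    ("Altix, 1.3 GHz Itanium-II", 1, 128),
    ("Xeon 3.06 GHz", 1, 374),
    ("Xeon 2.4 GHz", 1, 258),
    ("Xeon 2.8 GHz", 1, 188),
    ("Xeon 3.2 GHz", 1, 168),
    ("P4 2.66 GHz", 1, 152),
]

# Reading A: the quoted count covers the whole "2 x" entry.
total_if_shared = sum(cpus for _, _, cpus in SYSTEMS)

# Reading B: the quoted count applies to each of the two systems.
total_if_each = sum(count * cpus for _, count, cpus in SYSTEMS)

if __name__ == "__main__":
    print(f"Reading A (count per entry):  {total_if_shared}")
    print(f"Reading B (count per system): {total_if_each}")
```

Reading A gives 3500 and reading B gives 3724, so either interpretation is consistent with the stated aggregate of 3500+.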
APAC National Grid: one virtual system of computational facilities
Sites: QPSF, AC3, IVEC, APAC National Facility, ANU, SAPAC, CSIRO, TPAC, VPAC
• Basic Services
  • single ‘sign-on’ to the facilities
  • portals to the computing and data systems
  • access to software on the most appropriate system
  • resource discovery and monitoring