This article discusses the importance of e-Science in Asia and the need for collaboration and sharing to bridge the gap between Asia and the world. It also introduces the EGEE project and its objectives in providing a large-scale grid infrastructure for e-Science. The article highlights the various applications of EGEE, including High Energy Physics, Life Sciences, Earth Sciences, Computational Chemistry, Astronomy & Astrophysics, Fusion, and the Grid Observatory. It also showcases the use of grid technology in drug analysis and modeling for complex grid data challenges. Lastly, the article mentions the involvement of EGEE in avian flu data challenges and the development and deployment of a new environment for drug discovery.
EGEE Grid in Asia
Simon C. Lin, Academia Sinica Grid Computing Centre, Taipei, Taiwan
16 November 2007, Do-Son ACGrid School in Hanoi, Vietnam
e-Science Reminder
• Definition
  • “e-Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it.” (John Taylor, http://www.e-science.clrc.ac.uk)
• Objectives
  • Support research through e-Science, in data-intensive sciences and cross-disciplinary collaboration
• Why e-Science is necessary in Asia
  • The global infrastructure is being established quickly
  • Take advantage of sharing and collaboration to bridge the gap between Asia and the world
  • Address the challenge of regional cooperation
Collaborating e-Infrastructures
(Figure: map of collaborating infrastructures, including TWGRID)
• “Production” = reliable, sustainable, with commitments to quality of service
• Potential for linking ~80 countries
The EGEE project
• Flagship European grid infrastructure project, now in its 2nd phase with 91 partners in 32 countries
• Objectives
  • Large-scale, production-quality grid infrastructure for e-Science
  • Attracting new resources and users from industry as well as science
  • Maintain and further improve the gLite Grid middleware
• Structure
  • EGEE: 1 April 2004 – 31 March 2006
  • EGEE-II: 1 April 2006 – 31 March 2008
• Leveraging national and regional grid activities worldwide
• Funded by the EC at a level of ~37 M Euros for 2 years
• Support of related projects for infrastructure extension, applications, and specific services
• EGEE-III: 1 April 2008 – 31 March 2009
• Reaching a self-sustainable state
240 sites • 45 countries • 41,000 CPUs • 5 PetaBytes • >10,000 users • >150 VOs • >100,000 jobs/day
Application domains: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
EGEE Applications
• HEP: scale & performance testing, 4000 users worldwide, ~10 PByte/year, strict deadlines
• Life Sciences: diverse community, secured access, data encryption, complex workflows
• Earth Sciences: large community, integration of geospatial services for diverse data sources and formats
• Computational Chemistry: development of license models, advanced MPI usage, liaison to GEMS project
• Astronomy & Astrophysics: access to vast databases and catalogs, large sensor networks, support for PLANCK, MAGIC & AUGER
• Fusion: liaison with major Fusion projects (e.g. ITER), EU initiatives (e.g. EUFORIA) and interoperability between grids and supercomputers
• Grid Observatory: engage computer science community and improve grid reliability/usage
High Energy Physics
Large Hadron Collider (LHC):
• One of the most powerful instruments ever built to investigate matter
• 40 million particle collisions per second
• 4 experiments: ALICE, ATLAS, CMS, LHCb
• ~15 PetaBytes/year from the 4 experiments
• First beams in 2007
(Figure labels: Mont Blanc (4810 m), downtown Geneva)
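To get a feel for the ~15 PetaBytes/year figure, the short sketch below converts it into an average sustained data rate. It is only a back-of-envelope illustration: the decimal petabyte convention and the assumption that data is spread evenly over a 365-day year are ours, not the slide's, and the function name is hypothetical.

```python
# Back-of-envelope: mean data rate implied by ~15 PB/year (figure from the slide).
# Assumptions (not from the slide): decimal petabytes (10**15 bytes) and data
# spread evenly over a 365-day year. Real accelerator running is bursty.

PETABYTE = 10**15                   # bytes, decimal convention (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Return the mean sustained rate in MB/s for a yearly data volume."""
    bytes_per_second = petabytes_per_year * PETABYTE / SECONDS_PER_YEAR
    return bytes_per_second / 10**6


lhc_rate = average_rate_mb_per_s(15)
print(f"~{lhc_rate:.0f} MB/s averaged over a year")
```

Under these assumptions the four experiments together would need somewhat under 500 MB/s written continuously, all year round, which gives a sense of why a distributed storage and computing infrastructure is attractive for this workload.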
Drug Analysis: Modeling Complex Grid Data Challenge
• Targets: 8 structures (including 1 original type)
• Compounds: 2D compound library → Lipinski’s RO5 “drug-like” selection → 308,585 compounds (6 known drugs)
• 3D structure library: 3D structure generation, ionization, tautomerization, energy minimization
• Molecular docking (Autodock)
  • translation step = 2.0 Å
  • quaternion step = 20 degrees
  • torsion step = 20 degrees
  • number of energy evaluations = 1.5 × 10^6
  • max. number of generations = 2.7 × 10^4
  • run number = 50
• Data challenge on EGEE, Auvergrid, TWGrid: ~6 weeks on ~2000 computers, ~137 CPU years, 600 GB data
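The numbers on this slide can be tied together with a small sketch. The first part shows what a Lipinski rule-of-five "drug-like" filter typically looks like; the thresholds are the standard rule, but the `Compound` class, the one-violation tolerance, and the example values are hypothetical and not taken from the data challenge itself. The second part estimates the workload, assuming (our assumption, not stated on the slide) that every selected compound was docked against every target structure.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record; field names are illustrative only."""
    name: str
    mol_weight: float       # daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def is_drug_like(c: Compound, max_violations: int = 1) -> bool:
    # Tolerating one violation is a common convention; the cutoff actually
    # used in the data challenge is not given on the slide (assumption).
    return lipinski_violations(c) <= max_violations


# Made-up example values, for illustration only.
small = Compound("small_example", 320.0, 2.1, 2, 5)
bulky = Compound("bulky_example", 780.0, 6.3, 7, 12)
print(is_drug_like(small), is_drug_like(bulky))

# Workload estimate from the slide's figures.
COMPOUNDS = 308_585   # selected compounds
TARGETS = 8           # target structures
CPU_YEARS = 137       # total compute reported
MACHINES = 2000       # approximate number of computers

# Assumption: each compound is docked once against each target.
dockings = COMPOUNDS * TARGETS
minutes_per_docking = CPU_YEARS * 365 * 24 * 60 / dockings
busy_days_per_machine = CPU_YEARS * 365 / MACHINES

print(f"{dockings:,} dockings, ~{minutes_per_docking:.0f} CPU-minutes each")
print(f"~{busy_days_per_machine:.0f} busy days per machine")
```

Under these assumptions each docking costs roughly half a CPU-hour, and ~2000 machines would each be busy for about 25 days, which is consistent with the reported ~6 weeks of wall-clock time once scheduling overhead and idle periods are taken into account.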
Modeling as a complement to HTS in drug discovery
(Figure: screening versus focused library, comparing hit rate and cost)
• To improve hit rate: focused library
• Can Grid help?
Modified from DDT vol. 3, 4, 160-178 (1998)
History of Grid Drug Discovery on Avian Flu
• 1st WISDOM data challenge on malaria (autumn 2005)
• Pre-activity before the 1st EGEE User Forum (1 month of work during the Christmas holiday in 2005)
  • DIANE/GANGA technology
  • Contacting biologists for the use case
• 1st EGEE User Forum (March 2006)
  • Where the biologists (application users) and grid engineers (resource providers) met
• 1st avian flu data challenge
  • 2 weeks of preparation
  • 6 weeks of real execution, starting in April 2006
  • Data analysis and post-processing
  • Long process of collecting the distributed data
  • In-vitro test
• 2nd avian flu data challenge
  • Development phase addressing the issues
  • Deployment and testing of the new environment
  • Production starting from the end of August 2007
ASGC
• Asia Pacific Regional Operation Center
• Worldwide Grid Infrastructure
• Grid Application Platform
• Avian Flu Drug Discovery
• Large Hadron Collider (LHC)
TWGrid Introduction
• Consortium initiated and hosted by ASGC in 2002
• Objectives
  • Gateway to the Global e-Infrastructure & e-Science Applications
  • Providing Asia Pacific Regional Operation Services
  • Fostering e-Science Applications collaboratively in AP
  • Dissemination & Outreach
• Taiwan Grid/e-Science portal
  • Providing the access point to the services and demonstrating the activities and achievements
• Integration of Grid Resources of Taiwan
• VO of general Grid applications in Taiwan
(Figure label: NTCU)
EGEE Asia Federation is
• Extending the gLite infrastructure, currently led by ASGC
• Engaging more user communities to join worldwide e-Science collaboration
• Building regional e-Infrastructure and e-Science applications
• Conducting and supporting a production e-Infrastructure
• Working together to provide better user support
• Pursuing more cooperation with business and industry for new business models and opportunities
Production Infrastructure
• Asia Pacific Regional Operation Center (APROC)
• Mission
  • Provide deployment support facilitating Grid expansion
  • Maximize the availability of Grid services
• Supports EGEE sites in Asia Pacific since April 2005
  • 21 production sites, 8 countries
  • 9 sites joined EGEE since last year
• Resources
  • 2,047 CPU cores and 500 TB disk space currently
  • Will have 3,500 CPU cores and close to 2 PB of disk by the end of 2007
  • Provided 3.5 million KSI2K-hours in the last 12 months
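The resource figures above can be put in perspective with a few lines of arithmetic. This is a rough sketch only: it assumes a 365-day year, decimal units (2 PB = 2000 TB), and takes "close to 2 PB" as exactly 2 PB; the variable names are ours.

```python
# Rough arithmetic on the APROC resource figures quoted on the slide.
# Assumptions (ours): 365-day year, decimal units, "close to 2 PB" taken as 2 PB.

HOURS_PER_YEAR = 365 * 24                 # 8,760 h

delivered_ksi2k_hours = 3.5e6             # delivered in the last 12 months
cores_now, cores_planned = 2047, 3500
disk_tb_now, disk_tb_planned = 500, 2000  # 2 PB expressed in TB

# Average computing power delivered around the clock over the year.
average_ksi2k = delivered_ksi2k_hours / HOURS_PER_YEAR

# Planned growth factors by the end of 2007.
core_growth = cores_planned / cores_now
disk_growth = disk_tb_planned / disk_tb_now

print(f"average delivered power: ~{average_ksi2k:.0f} KSI2K")
print(f"CPU cores grow by ~{core_growth:.2f}x, disk by ~{disk_growth:.0f}x")
```

In other words, the 3.5 million KSI2K-hours correspond to roughly 400 KSI2K delivered continuously for a year, while the planned expansion adds disk capacity considerably faster than CPU cores.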
Joining EGEE Infrastructure
• Contact APROC: http://www.twgrid.org/aproc/join/newrc/
• If a domestic CA is not available
  • Register as an ASGCCA RA
• Obtain user and host certificates
• Dedicate an administrator with Unix experience
• Allocate servers
• Study the user guide and installation manual
• Send the configuration file to APROC for review before deployment
• Complete the registration and certification process
Long Term Operations
• Establish domestic CA if none exists
• Increase availability and resource levels
• Establish domestic operations structure
  • Operations procedures
  • Tools: monitoring and notification, ticketing system
  • User and administrator support
  • Training for administrators and users
• Collaborate with APROC in regional operations
• Support VOs for application development and for production service separately
EUAsiaGrid
• Identify and engage scientific communities which can benefit from the use of state-of-the-art Grid technologies;
• Disseminate EGEE middleware in Asian countries by means of public events and written and multimedia material;
• Provide training resources and organise training events for potential and actual Grid users;
• Support the scientific applications and create a human network of scientific communities by building on and leveraging the e-Science Grid infrastructure.
Work Packages of EUAsiaGrid
• WP1: Project administrative and technical management
• WP2: Requirement capture and coordination policy definition
  • To collect from the scientific communities of the Asian countries their computing and storage requirements
  • To develop a model for the promotion of sustainable National Grid initiatives
  • To define a roadmap towards a common e-Science Asian Grid infrastructure
• WP3: Support of scientific applications
  • To give support to EGEE applications, selected on the basis of already existing collaborations between EU and Asian partners
  • To identify new user communities which could profit from the Asian e-Infrastructure
  • To provide support for adaptation of the regional applications on top of the gLite middleware
• WP4: Dissemination
  • To enhance awareness of the EUAsiaGrid project and the Grid paradigm in Asia
  • To facilitate the exchange of information and experience for potential new research communities and encourage them to use the e-Infrastructure for their applications
  • To promote EUAsiaGrid as a Grid service facilitator to user communities across Asia
• WP5: Training
  • To train the technical personnel to manage the e-Infrastructure and the user applications by using the Grid tools effectively
  • To foster the use of the Grid e-Infrastructure by the scientific communities in the Asian countries
Summary
• e-Science envisages a whole new way of doing collaborative science
• For a sustainable Grid e-Infrastructure, we have to focus more on community building rather than just offering technologies
• The Asia Pacific region has great potential to adopt the e-Infrastructure:
  • More and more Asian countries will deploy Grid systems and take part in the e-Science world
  • However, applications of and for Asia Pacific scientists, which are crucial, are still largely lacking
• Extending from the EGEE Asia Federation to EUAsiaGrid, we are widening the uptake of e-Science through close collaboration regionally and internationally