Development of e-Science Applications in Taiwan. Hsin-Yen Chen, Hurng-Chun Lee, Wei-Long Ueng and Simon C. Lin, ASGC. APAN 22, July 18-22, Singapore. Exponential Growth World. Gordon Bell Prize “price performance”. Gordon Bell Prize outpaces Moore’s Law. The Data Deluge.
Development of e-Science Applications in Taiwan
Hsin-Yen Chen, Hurng-Chun Lee, Wei-Long Ueng and Simon C. Lin
ASGC
APAN 22, July 18-22, Singapore
The Data Deluge
• A large novel: 1 Mbyte; the Bible: 5 Mbytes
• A Mozart symphony (compressed): 10 Mbytes
• A digital mammogram: 100 Mbytes
• OED on CD: 500 Mbytes
• Digital movie (compressed): 10 Gbytes
• Annual production of refereed journal literature (20k journals; 2M articles): 1 Tbyte
• Library of Congress: 20 Tbytes
• The Internet Archive (10 B pages; from 1996 to 2002): 100 Tbytes
• Annual production of information (print, film, optical & magnetic media): 1500 to 3000 Pbytes
• All worldwide telephone communication in 2002: 19.3 Exabytes
• Moore’s Law enables instruments and detectors to generate unprecedented amounts of data in all scientific disciplines
Taiwan’s Thinking
• International collaboration/participation with the global Grid Cyber- and e-Infrastructure is the key
• Our strategy is to use High Energy Physics to drive the Next Generation Grid Infrastructure. No national Grid organization has been established yet!
• We are setting up a new Academia Sinica Grid Computing Centre (ASGC), reorganised from ASCC
• ASGC will provide Grid-based infrastructure, services and e-Science application development, and promote Grid activities in Taiwan
• ASGC is fully supported by Academia Sinica and NSC in Taiwan
Taiwan signed the first WLCG MoU
• Jos Engelen (CERN CSO) and President YT Lee (Academia Sinica) signed the WLCG MoU on 9 Dec. 2005 as a Tier-1 Centre for ATLAS and CMS
Plan for Taiwan Tier-1 Network
[Network diagram. Labels: Backup Path to T0; Primary Path to T0 (plan to install 10GE in 2007)]
LCG and EGEE Grid Sites in the Asia-Pacific Region
[Map. Legend: LCG site; other site. Sites shown: IHEP Beijing; PAEC NCP Islamabad; KEK Tsukuba; ICEPP Tokyo; KNU Daegu; VECC Kolkata; Taipei - ASGC, IPAS, NTU, NCU; Tata Inst. Mumbai; GOG Singapore; Univ. Melbourne]
• 4 LCG sites in Taiwan
• 12 LCG sites in Asia/Pacific
• Academia Sinica Grid Computing Centre
  • Tier-1 Centre for the LHC Computing Grid (LCG)
  • Asian Operations Centre for LCG and EGEE
  • Coordinator of the Asia/Pacific Federation in EGEE
AP Federation now shares the e-Infrastructure with WLCG
Contributions of ASGC in WLCG
• WLCG Tier-1 Centre, collaborating with the ATLAS & CMS teams (NCU, NTU, IPAS) in Taiwan
• Regional Operation Centre and Core Infrastructure Centre
• Production CA Services
• LCG Technology Development
• Data Management
• Grid Technology
• Certification & Testing
• Application Software
• ARDA (Distributed Analysis)
• 3D (Distributed Deployment of Databases)
• Operation and Management
• Dissemination and Outreach
Education and Training
gLite and the development of EGEE were introduced at all events run by ASGC
Issues of Grid Applications
• Because the Grid is loosely coupled, distributing application jobs on it is not trivial
  • extra work is needed for efficient job handling and result gathering
  • effort is also needed to handle transient network or site problems
  • complexities should be hidden, and the interface to the end user should be application oriented
• The significant Grid system overhead means the Grid only benefits jobs with long computing times
  • not suitable for pilot jobs used for decision making
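The job-handling issues listed above can be illustrated with a minimal Python sketch. It is not the ASGC or gLite implementation: a local thread pool stands in for Grid submission, and `TransientGridError`, `run_with_retries`, `distribute` and `grid_efficiency` are hypothetical names introduced only for this illustration. The sketch retries tasks that hit a transient failure, gathers results behind one application-level call, and estimates how much of a job's wall time is useful computation once a fixed Grid overhead is added.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class TransientGridError(Exception):
    """Hypothetical stand-in for a temporary network or site failure."""


def run_with_retries(task, payload, max_attempts=3):
    """Run one task, retrying when it raises a transient error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(payload)
        except TransientGridError:
            if attempt == max_attempts:
                raise  # give up and report the failure to the caller


def distribute(task, payloads, workers=4, max_attempts=3):
    """Fan payloads out to workers and gather results and failures.

    A local thread pool is used here in place of real Grid submission.
    """
    results, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(run_with_retries, task, p, max_attempts): p
            for p in payloads
        }
        for future in as_completed(futures):
            payload = futures[future]
            try:
                results[payload] = future.result()
            except TransientGridError as exc:
                failed[payload] = str(exc)
    return results, failed


def grid_efficiency(compute_seconds, overhead_seconds):
    """Fraction of wall time spent computing, given a fixed overhead."""
    return compute_seconds / (compute_seconds + overhead_seconds)
```

Under an assumed overhead of 300 seconds per job, `grid_efficiency(60, 300)` is about 0.17 while `grid_efficiency(36000, 300)` is above 0.99, which is the sense in which only long-running jobs benefit.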
e-Science Applications in Taiwan
• High Energy Physics: WLCG
• Bioinformatics: mpiBLAST-g2
• Biomedicine: Distributing AutoDock tasks on the Grid using DIANE
• Digital Archive: Data Grid for Digital Archive Long-term Preservation
• Atmospheric Science
• Geoscience: GeoGrid for data management and hazards mitigation
• Seismology Science
• Ecology Research and Monitoring: EcoGrid
• SARS Grid: Access Grid Tech.
• BioPortal
• e-Science Application Framework Development
EGEE Biomed DC II – Large Scale Virtual Screening of Drug Design on the Grid
• Biomedical goal
  • accelerating the discovery of novel potent inhibitors by minimizing non-productive trial-and-error approaches
  • improving the efficiency of high throughput screening
• Grid goal
  • aspect of massive throughput: reproducing a grid-enabled in silico process (exercised in DC I) with a shorter preparation time
  • aspect of interactive feedback: evaluating an alternative light-weight grid application framework (DIANE)
• Grid Resources:
  • AuverGrid, BioinfoGrid, EGEE-II, Embrace, & TWGrid
• Problem Size: around 300 K compounds from the ZINC database and a chemical combinatorial library, needing ~137 CPU years in 4 weeks
  • a world-wide infrastructure providing more than 5,000 CPUs
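The problem-size figures above imply a level of sustained parallelism that is easy to check. The short Python sketch below is a back-of-the-envelope calculation, not part of the original talk: the function names are hypothetical, a year is taken as 365.25 days, and CPUs are assumed to be perfectly utilised with no scheduling overhead or failed jobs.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length in hours


def cpus_needed(cpu_years, wall_weeks):
    """CPUs that must run continuously to finish the work in time."""
    cpu_hours = cpu_years * HOURS_PER_YEAR
    wall_hours = wall_weeks * 7 * 24
    return cpu_hours / wall_hours


def cpu_hours_per_compound(cpu_years, compounds):
    """Average CPU time per compound over the whole campaign."""
    return cpu_years * HOURS_PER_YEAR / compounds


# Figures quoted on the slide: ~137 CPU years, 4 weeks, ~300 K compounds.
print(round(cpus_needed(137, 4)))                        # 1787
print(round(cpu_hours_per_compound(137, 300_000), 1))    # 4.0
```

Under these idealised assumptions roughly 1,800 CPUs would have to be busy around the clock, so the more than 5,000 CPUs quoted leave headroom for overhead, queuing and resubmission.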
T06 (wild type)
[Figure. Axes: compound numbers vs. energy (kcal/mol). Series: binding energy; docking energy. Annotations: 34.92% 1f8b; 13.06% 1f8c; 2.43% 2qwe]
T01 (E119A)
[Figure. Axes: compound numbers vs. energy (kcal/mol). Series: binding energy; docking energy. Annotations: 55% 1f8b, 1f8c; 11.58% 2qwe]
[Figure. Structure diagrams of residue 119 with the ligand GNA: E119 in T06 (wild type) and A119 in T01 (E119A)]
* Oseltamivir also has a 4-amine.
Web Services-based Grid Infrastructure
• Model the world as a collection of services
• Resource descriptions and aggregation
• Discovery
• Composition
• Adaptation & Evolution
• Quality of Service: security, performance, reliability, scalability, …
• Workflow (lifecycle management)
• Open Source Implementation
GAPortal Framework Architecture
• Layered architecture to improve the usability of Grid applications
• Two frameworks built on top of current Grid middleware to
  • provide a friendly graphical user interface
  • handle Grid application logic in an efficient way
  • reduce the effort of application gridification
Use LAS (Live Access Server) to access the dataset from the SRB System, and integrate with Google Earth
TEC DataGrid-based Digital Library
[Diagram. Labels: Quick Focal Mechanism Determination; Outputs; Waveform Simulation; TEC Community Library; Seismogram Retrieval; Inversion of Slip Distribution on Fault Plane; 1999 Chi-Chi Taiwan Earthquake]
DataGrid-based Digital Library
• 9 Terabytes of on-line disk
• More than 100 Terabytes of tape archive
SeisGrid Architecture
[Architecture diagram. Labels: Web-Based Portal User Interface; Job repository; User/Grid Proxy Manager; Virtual Queuing System; SRB-based Digital Library; IESAS; Sensors; Grid Agent; CWB; Grid Computing Element]
Industrial Program
• NSC-Quanta Collaboration
  • To help the Quanta Blade System achieve the best performance for HPC and Grid Computing
  • Quanta is the largest notebook manufacturer in the world
  • Participants: AS, NTU, NCTU, NTHU, NCHC
  • Scientific Research Disciplines: Material Science, Nano-Technology, Computational Chemistry, Bioinformatics, Engineering, etc.
• Microsoft Collaboration
  • Enable MS CCS to serve as the CE (gLite)
Conclusion
• Critical mass decides which Grid technology/system will prevail; collaboration, data and complexity reduction are the main themes
• We are about to witness the Data Deluge in all disciplines of e-Science
• Unprecedented ways to collaborate on a day-to-day basis will change the sociology of academic life, the eco-system of the business world and eventually everyone in society
• It is about a new paradigm of collaboration; data and computing will outgrow each of us. Together, we will achieve goals not possible individually!