
Grids, a new way to do science

Explore the innovative world of grids and their impact on modern science at the Do Son ACGRID school in 2007. The school integrates hands-on training sessions and tutorials, focusing on computational tools such as EGEE, BOINC, ROOT, TAVERNA, and GEANT4. Learn about the diverse applications driving grid development, from natural resources to healthcare. Attendees will develop skills to deploy grid services, use analysis tools, and leverage distributed computing for advanced simulations. Join us in Vietnam to shape the future of scientific research with grid technology.


Presentation Transcript


  1. Grids, a new way to do science V. Breton CNRS-IN2P3 V. Breton CUIC 2007

  2. What is the Do Son ACGRID school about? • The school is about grids • Grids of PC clusters: EGEE tutorial from Nov. 5th to 9th • Desktop grids: BOINC tutorial on Nov. 15th • The school is about computational tools that use the grid • For data analysis: ROOT on Nov. 12th and TAVERNA on Nov. 13th • For simulation: GEANT4 on Nov. 14th • The school will consist of courses and hands-on sessions • A grid has been deployed locally at IOIT for the duration of the school

  3. Our goals for the school • Train Asian engineers to install and operate grid services • Tutorial on grid installation (October 29th – Nov. 2nd) • Train Asian researchers to use the services offered by the EGEE grid • Train users to call the grid services • Train users to deploy analysis and simulation tools which take advantage of the grid • Deploy in Vietnam a grid infrastructure that researchers can use • Machines bought for the school will be distributed across 5 sites • IOIT in Hanoi and HCMC • Hanoi University of Technology • Maison des Sciences et Technologies • Institut Français d’Informatique

  4. What is the Grid? • The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations • In contrast, the Grid is a new computing infrastructure which provides seamless access to computing power, data and other resources distributed over the globe • The name Grid was chosen by analogy with the electric power grid: you plug in to computing power without worrying where it comes from, just as you plug in a toaster

  5. Two kinds of grids: volunteer computing vs grid infrastructures • Volunteer computing: BOINC tutorial on Nov. 15th • Grid infrastructures: EGEE grid tutorial from Nov. 5th to 9th

  6. What is driving grid development? • Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems: river floods and earthquake simulation) • Physics/Astronomy (data from different kinds of research instruments) • Bioinformatics (study of the human genome and proteome to understand genetic diseases) • Medical/Healthcare (imaging, diagnosis and treatment) • Nanotechnology (design of new materials from the molecular scale) • Engineering (design optimization, simulation, failure analysis, and remote instrument access and control) • Data- and compute-intensive sciences are next-generation applications that have extreme needs but are likely to become mainstream in the next 5 years

  7. Meteorology • Early warning and detection systems are needed, e.g. for hurricanes • Technology is advancing fast: • Infrared sensors on meteorological satellites now provide more and more detailed observations of the atmosphere • Research efforts continue the development of computer forecasting models capable of using satellite data to improve current weather-predicting skills • Meteorological studies are aided by the use of large computers for atmospheric modeling • With easier and faster access to data and models, prediction becomes continually more efficient

  8. Earth Observation • Long-term global observations of the land surface, biosphere, solid Earth, atmosphere, and oceans produce huge amounts of data: • not in homogeneous data formats • not easy to locate • no obvious user-friendly interface • Challenge: understanding the Earth as an integrated system • increased scope and more local detail mean ever more data • better understanding the interrelations of different components requires more analysing power • this translates into better forecasting

  9. Climate Simulation • Climate simulation already uses distributed computing • Example: the scientific experiment “Casino-21” tries to produce a forecast of the climate in the 21st century by a large-scale simulation • “Casino-21” uses a structure like the SETI@home project • Grid infrastructures will provide new and more powerful ways of using distributed computing for climate simulation

  10. Pollution • Satellite monitoring: • helps scientists to understand changes in the atmosphere, track them and plan ways to reduce our environmental impact • A wide variety of emissions is changing the chemistry and composition of our planet's atmosphere • The atmosphere is a very complex chemical system • So far, data is used selectively • Increased analysing power gives access to a wider spectrum of data and shortens turn-around times

  11. The Vision • An international network of scientists will be able to model a new flood of the Mekong river in real time, using meteorological and geological data from several centres across Europe • UNOSAT: • Internet-based service to provide high-quality maps to UN agencies, NGOs and other institutions of the humanitarian community • Grid technology allows raw satellite images to be reduced and processed into readable maps at a greater speed than would otherwise be possible • Access to a production-quality grid will change the way science and earth observation of all kinds are done

  12. How does the grid work? • The Grid relies on advanced software, called middleware, which ensures seamless communication between different computers and different parts of the world • The Grid search engine not only finds the data the scientist needs, but also the data processing techniques and the computing power to carry them out • It distributes the computing task to wherever in the world there is available capacity, and sends the result back to the scientist
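The matchmaking step described on this slide can be illustrated with a toy sketch. This is not gLite or any real middleware API: the `Site` and `Job` classes, the `match_site` and `submit` functions, and the site and dataset names are all hypothetical, and the policy (pick the site that holds the data and has the most free CPUs) is an assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A hypothetical grid site: a name, free CPUs, and the datasets it stores."""
    name: str
    free_cpus: int
    datasets: set


@dataclass
class Job:
    """A hypothetical job: how many CPUs it needs and which dataset it reads."""
    name: str
    cpus: int
    dataset: str


def match_site(job, sites):
    """Return the site that holds the job's dataset and has the most free CPUs.

    Returns None when no site can run the job.
    """
    candidates = [
        s for s in sites
        if job.dataset in s.datasets and s.free_cpus >= job.cpus
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_cpus)


def submit(job, sites):
    """Reserve CPUs on the matched site and return its name, or None if unmatched."""
    site = match_site(job, sites)
    if site is None:
        return None
    site.free_cpus -= job.cpus  # the capacity is now in use
    return site.name


# Hypothetical sites and datasets, for illustration only.
sites = [
    Site("site_a", free_cpus=4, datasets={"run1"}),
    Site("site_b", free_cpus=16, datasets={"run1", "run2"}),
    Site("site_c", free_cpus=64, datasets={"run3"}),
]
chosen = submit(Job("analysis", cpus=8, dataset="run1"), sites)
print(chosen)
```

A real broker would also weigh queue lengths, user credentials and site policies, which this sketch leaves out.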

  13. Grid Challenges • Share data between thousands of scientists with multiple interests • Need to support dynamic virtual organisations of geographically dispersed groups • Ensure all data is accessible anywhere, anytime • Petabyte range of data needs to be available on demand • Grow rapidly, yet remain reliable for more than a decade • Are we sure the current technologies will scale? • Transfer to industry to achieve economies of scale • Standardisation process still ongoing • Merge of web services (OASIS) and grids (GGF) into WSRF • Must progress to avoid incompatible proprietary grids • Cope with different management policies of grid sites • Link computer centres, not just single PCs, separately administered and owned • Need resource allocation policies and billing systems • Ensure security • Medical applications have legal/ethical restrictions on data access • Avoid becoming a target for hackers

  14. What is EGEE? • EGEE • 1 April 2004 – 31 March 2006 • 71 partners in 27 countries, federated in regional Grids • EGEE-II • 1 April 2006 – 31 March 2008 • 91 partners in 32 countries • 13 Federations • Objectives • Large-scale, production-quality infrastructure for e-Science • Attracting new resources and users from industry as well as science • Maintain and further improve the “gLite” Grid middleware

  15. Why did we choose to teach you about EGEE? • EGEE is an operational grid infrastructure • More than 100,000 jobs/day • EGEE offers real services to its user communities • Job and data management services are operational • The EGEE infrastructure is used to analyze LHC data • Joining EGEE makes it possible to take part in LHC data analysis • EGEE technology is well supported in Asia • Academia Sinica in Taiwan offers central services to user communities around Asia

  16. What does EGEE provide? • Simplified access (access to all the operational resources the user needs) • On-demand computing (fast access to resources by allocating them efficiently) • Pervasive access (accessible from any geographic location) • Large-scale resources (of a scale that no single computer centre can provide) • Sharing of software and data (in a transparent way) • Improved support (use the expertise of all partners to offer in-depth support for all key applications)

  17. Highlights of EGEE-II • 98k jobs/day • >200 VOs from several scientific domains • Astronomy & Astrophysics • Civil Protection • Computational Chemistry • Computational Fluid Dynamics • Computer Science/Tools • Condensed Matter Physics • Earth Sciences • Fusion • High Energy Physics • Life Sciences • Further applications under evaluation • Applications have moved from testing to routine and daily usage • ~80-90% efficiency

  18. EGEE-II middleware [Figure: timeline of LCG-2 and gLite from prototyping to product, 2004 to 2006, leading to gLite 3.0] • EGEE maintains and improves the gLite middleware distribution • gLite 3 • Publicly released on May 4, 2006 • Convergence with LCG-2 • Currently deploying version 3.1 • On Scientific Linux • Workload management system • Data management system • Information system • Resource brokering • Security

  19. EGEE Network [Diagram labels: Sites, NRENs, GÉANT2, ENOC, GGUS, Support Units, Users, Operations; 98k jobs/day] • Size of the infrastructure today: • 237 sites in 45 countries • ~36,000 CPUs • ~5 PB disk, plus tape MSS • Distributed operations • Copes well with the increase in size and usage
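As a rough back-of-the-envelope reading of the figures on this slide (98k jobs/day on about 36,000 CPUs), the sketch below works out the average number of jobs each CPU handles per day. The second number is only an upper bound on the average job length, and it rests on assumptions the slide does not state: one CPU per job and every CPU busy around the clock.

```python
# Figures quoted on the slide.
JOBS_PER_DAY = 98_000
CPUS = 36_000

# Average number of jobs each CPU handles per day.
jobs_per_cpu_per_day = JOBS_PER_DAY / CPUS

# Upper bound on the average job length, ASSUMING one CPU per job and
# full occupancy (an assumption made here, not a statement from the slide).
max_avg_job_hours = 24 / jobs_per_cpu_per_day

print(round(jobs_per_cpu_per_day, 2))  # 2.72
print(round(max_avg_job_hours, 2))     # 8.82
```

With real occupancy below 100%, the true average job length would be shorter than this bound.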

  20. Applications [Chart: CPU consumption per VO] • Total VOs: 204 • Total users: 5034 • Affected people: 10200

  21. The pilot applications • High Energy Physics, with the LHC Computing Grid (www.cern.ch/lcg), relies on a Grid infrastructure to store and analyse petabytes of real and simulated data. LCG is a major source of resources and requirements, with hard deadlines and no conventional solution available • In Biomedical Sciences, several communities are facing equally daunting challenges to cope with the flood of bioinformatics and healthcare data. They need access to large, distributed, non-homogeneous data and have important on-demand computing requirements

  22. LCG • LCG: a collaboration of • The LHC experiments • The Regional Computing Centres • Physics institutes • Mission: • Prepare and deploy the computing environment that will be used by the experiments to analyse the LHC data • Strategy: • Integrate thousands of computers at dozens of participating institutes worldwide into a global computing resource • Rely on software being developed in advanced grid technology projects, both in Europe and in the USA

  23. WISDOM • WISDOM: a collaboration of • Biology, Bioinformatics, Chemoinformatics laboratories • Grid infrastructure projects • Mission: • In silico drug discovery against emerging and neglected diseases • Strategy: • Centuries of CPU cycles used to dock millions of compounds during large-scale grid deployments • Secure data management of biochemical information

  24. Dissemination and Training [Chart: www.eu-egee.org monthly traffic, January to December: unique visitors and links from Internet search engines, scale 0 to 8000] • Comprehensive training programme in Europe, South America, Asia • 110 events, > 1600 participants • ACGRID is one of these events

  25. What is the Do Son ACGRID school about? • Grids are about sharing • Resources (CPU, storage) • Knowledge • The Do Son ACGRID school is about sharing knowledge • Sharing expertise in the installation and operation of grid services • Sharing expertise in the development and deployment of grid-enabled applications • The Do Son ACGRID school is about building long-term collaboration • We are here to help Vietnamese engineers to run grid services • We are here to help Vietnamese scientists to develop and deploy grid-enabled applications • We are here to present high-performance tools for data analysis and simulation • TAKE ADVANTAGE OF THIS OPPORTUNITY TO ADVANCE YOUR RESEARCH • Ask questions • Don’t hesitate to talk with the teachers

  26. What should happen after the school? • Grid services will be installed at several sites in Vietnam • In Hanoi: Hanoi University of Technology, IOIT, Institut Français d’Informatique • In HCMC: IOIT • You will be able to use your grid certificates to access the EGEE grid through these sites • Possibility to join any other Virtual Organization • You will benefit from the grid services like any other EGEE user

  27. What you get out of the school • Grids offer a unique opportunity to integrate research laboratories into international initiatives • Example: LHC • Grids offer opportunities to start collaborations • Example: Telemedicine • Installation of a grid-enabled medical imaging platform at IOIT in HCMC • Joint application deployment between the platforms in HCMC and Clermont-Ferrand • It all depends on you!

  28. Credits • IOIT in Hanoi: Vu Duc Thy, Luong Chi Mai, Ngo Tran Anh and collaborators • IOIT in HCMC: Do Van Long • ASGC: Min Tsai, Jinny Chen and collaborators • Our second-week speakers: Nicolas Maire, Sébastien Incerti, René Brun, Georgina Moulton • HealthGrid: Nicolas Spalinger, Nathanaël Verhaeghe • CNRS office in Hanoi: Bernard Mely, Le Tuyet Trinh • CNRS-IN2P3: Vincent Bloch, Vincent Breton, Géraldine Fettahi, Matthieu Reichstadt, Denis Perret-Gallix, Jean Salzemann • TEIN2: David West
