Overview • Slides 2 – 9 : What is CERN? • general overview of CERN • Slides 10 – 19: What is LHC? • LHC and LHC challenges in terms of data and CPU • Slides 20 – 36: The Grid • the GRID in general • Slides 37 – 50: The Grid @ CERN • grid projects at CERN (EDG, DataTAG, LCG, Grace, Mammogrid, Openlab) • Short video clips: • CERN in 2 minutes • Simulation of LHC collider • The Grid
What is CERN? • CERN is: • ~2500 staff scientists (physicists, engineers, …) • Some 6500 visiting scientists (half of the world's particle physicists), who come from 500 universities representing 80 nationalities • CERN is the world's largest particle physics centre • Particle physics is about: - elementary particles, of which all matter in the Universe is made - fundamental forces which hold matter together • Particle physics requires: - special tools to create and study new particles
CERN Site Mont Blanc, 4810 m Downtown Geneva
What is CERN? • The special tools for particle physics are: • ACCELERATORS, huge machines able to speed up particles to very high energies before colliding them into other particles • DETECTORS, massive instruments which register the particles produced when the accelerated particles collide
What is CERN? • Physicists smash particles into each other to: - identify their components - create new particles - reveal the nature of the interactions between them - create an environment similar to the one present at the origin of our Universe • What for? To answer fundamental questions like: How did the Universe begin? What is the origin of mass? What is the nature of antimatter?
What is CERN? CERN in 2 minutes Movie
What is CERN? The World Wide Web was invented here, to improve and speed up information sharing between physicists working all over the world!
What is CERN? • CERN has made many important discoveries, but our current understanding of the Universe is still incomplete! • Higher energy collisions are the key to further discoveries of more massive particles (E = mc²) • One particle predicted by theorists remains elusive: the Higgs boson
What is CERN? • To answer questions still open, CERN is building the Large Hadron Collider (LHC) • The LHC will be the most powerful instrument ever built to investigate elementary particles • If the Higgs boson exists, the LHC will almost certainly find it
What is LHC? • LHC is due to switch on in 2007 • Four experiments, with detectors “as big as cathedrals”: ALICE, ATLAS, CMS, LHCb • LHC will collide beams of protons at an energy of 14 TeV • Using the latest superconducting technologies, it will operate at about −271°C, just above absolute zero. • With its 27 km circumference, the accelerator will be the largest superconducting installation in the world.
The LHC Data Challenge • A particle collision = an event • The physicist's goal is to count, trace and characterize all the particles produced and fully reconstruct the process. • Among all tracks, the presence of “special shapes” is the sign of interesting interactions. • One way to find the Higgs boson: look for the characteristic decay pattern producing 4 muons
The LHC Data Challenge • Starting from this event… you are looking for this “signature” • Selectivity: 1 in 10¹³ • Like looking for 1 person in a thousand world populations! • Or for a needle in 20 million haystacks!
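The "thousand world populations" comparison can be checked with a line of arithmetic. The sketch below is only illustrative: the selectivity is the slide's figure, while the world population of about 6.5 billion is an assumption that does not appear on the slide.

```python
# Rough check of the "1 person in a thousand world populations" comparison.
SELECTIVITY = 1e13        # one interesting event in 10^13 (figure from the slide)
WORLD_POPULATION = 6.5e9  # ASSUMPTION: approximate world population, not from the slide


def world_populations(selectivity: float, population: float) -> float:
    """Number of world populations needed to total `selectivity` people."""
    return selectivity / population


if __name__ == "__main__":
    n = world_populations(SELECTIVITY, WORLD_POPULATION)
    print(f"1 in 10^13 is like 1 person in about {n:,.0f} world populations")
```

With this assumed population the ratio comes out at somewhat more than a thousand, which matches the order of magnitude quoted on the slide.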
LHC data • 40 million collisions per second • After filtering, 100 collisions of interest per second • A Megabyte of data digitised for each collision = recording rate of 0.1 Gigabytes/sec • 10¹⁰ collisions recorded each year = 10 Petabytes/year of data • Experiments: CMS, LHCb, ATLAS, ALICE • Scale of data volumes: 1 Megabyte (1 MB): a digital photo; 1 Gigabyte (1 GB) = 1000 MB: a DVD movie; 1 Terabyte (1 TB) = 1000 GB: world annual book production; 1 Petabyte (1 PB) = 1000 TB: annual production of one LHC experiment; 1 Exabyte (1 EB) = 1000 PB: world annual information production
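The recording rate and yearly volume quoted on this slide follow directly from the per-collision figures. A minimal sketch of that arithmetic, using the slide's decimal units (1 GB = 1000 MB, and so on); the function and constant names are made up for illustration.

```python
# Decimal units, as used on the slide.
MB_PER_GB = 1000
MB_PER_PB = 1000 ** 3  # 1 PB = 1000 TB = 10^6 GB = 10^9 MB


def recording_rate_gb_per_s(events_per_s: float, mb_per_event: float) -> float:
    """Data rate written to storage, in Gigabytes per second."""
    return events_per_s * mb_per_event / MB_PER_GB


def annual_volume_pb(events_per_year: float, mb_per_event: float) -> float:
    """Data recorded in one year, in Petabytes."""
    return events_per_year * mb_per_event / MB_PER_PB


if __name__ == "__main__":
    # Slide figures: 100 selected collisions/s, 1 MB each, 10^10 recorded per year.
    print(recording_rate_gb_per_s(100, 1), "GB/s")
    print(annual_volume_pb(10 ** 10, 1), "PB/year")
```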
LHC data • LHC data correspond to about 20 million CDs each year! • A CD stack holding 1 year of LHC data would be ~20 km high, compared with: Balloon (30 km), Concorde (15 km), Mt. Blanc (4.8 km) • Where will the experiments store all of these data?
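The height of the CD stack can be estimated from the number of discs. The sketch below assumes bare discs 1.2 mm thick with no cases; that thickness is not given on the slide. The comparison heights are the ones shown on the slide.

```python
CDS_PER_YEAR = 20_000_000  # figure from the slide
CD_THICKNESS_MM = 1.2      # ASSUMPTION: bare disc thickness, no case

# Comparison heights shown on the slide, in km.
LANDMARKS_KM = {"Mt. Blanc": 4.8, "Concorde": 15.0, "Balloon": 30.0}


def stack_height_km(n_discs: int, thickness_mm: float) -> float:
    """Height of a stack of discs in kilometres (1 km = 10^6 mm)."""
    return n_discs * thickness_mm / 1_000_000


def landmarks_below(height_km: float) -> list:
    """Names of the slide's landmarks that the stack would exceed."""
    return [name for name, h in LANDMARKS_KM.items() if h < height_km]


if __name__ == "__main__":
    height = stack_height_km(CDS_PER_YEAR, CD_THICKNESS_MM)
    print(f"Stack height: about {height:.0f} km")
    print("Taller than:", ", ".join(landmarks_below(height)))
```

Under this assumption the stack comes out in the low twenties of kilometres, consistent with the slide's "~20 km": above Concorde's cruising altitude but below the balloon.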
LHC processing • Simulation: start from theory and detector characteristics and compute what the detector should have seen • Reconstruction: transform signals from the detector into physical properties (energies, charge of particles, …) • Analysis: find collisions with similar features, using complex algorithms to extract physics…
LHC processing • LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC processors! • Where will the experiments find such computing power?
Computing at CERN • High-throughput computing based on reliable “commodity” technology • More than 1000 dual-processor PCs • More than 1 Petabyte of data on disk and tapes • Nowhere near enough!
Computing for LHC • Europe: 267 institutes, 4603 users • Elsewhere: 208 institutes, 1632 users • Problem: CERN alone can provide only a fraction of the necessary resources • Solution: Computing centers, which were isolated in the past, should now be connected, uniting the computing resources of particle physicists in the world!
Computing for LHC: a problem? The Grid: a possible solution!
What is the Grid? • The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations • In contrast, the Grid is an emerging infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe.
What is the Grid? • The term Grid was coined by Ian Foster and Carl Kesselman (Grid bible: “The Grid: Blueprint for a New Computing Infrastructure”). • The name Grid is chosen by analogy with the electric power grid: plug in to computing power without worrying where it comes from, as you would plug in a toaster. • The idea has been around under other names for a while (distributed computing, metacomputing, …). • This time, technology is in place to realise the dream on a global scale.
How will it work? • The Grid relies on advanced software, called middleware, which ensures seamless communication between different computers and different parts of the world • The Grid search engine will not only find the data the scientist needs, but also the data processing techniques and the computing power to carry them out • It will distribute the computing task to wherever in the world there is spare capacity, and send the result to the scientist
How will it work? • The GRID middleware: • Finds convenient places for the scientist's “job” (computing task) to be run • Optimises use of the widely dispersed resources • Organises efficient access to scientific data • Deals with authentication to the different sites that the scientists will be using • Interfaces to local site authorisation and resource allocation policies • Runs the jobs • Monitors progress • Recovers from problems • … and … • Tells you when the work is complete and transfers the result back!
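The first few middleware duties listed above (find a convenient place for the job, respect local authorisation, keep the job close to its data) can be illustrated with a toy matchmaking routine. This is a sketch only, not the interface of any real Grid middleware: the `Site` and `Job` classes, their fields and the "most free CPUs wins" rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Site:
    """Hypothetical description of one computing centre."""
    name: str
    free_cpus: int
    datasets: Set[str] = field(default_factory=set)        # data held locally
    allowed_groups: Set[str] = field(default_factory=set)  # local authorisation policy


@dataclass
class Job:
    """Hypothetical description of a scientist's computing task."""
    owner_group: str
    dataset: str
    cpus_needed: int


def choose_site(job: Job, sites: List[Site]) -> Optional[Site]:
    """Pick a site that authorises the owner, holds the data and has capacity.

    Among eligible sites, the one with the most free CPUs is chosen
    (an arbitrary rule for this sketch). Returns None if no site qualifies.
    """
    eligible = [
        s for s in sites
        if job.owner_group in s.allowed_groups
        and job.dataset in s.datasets
        and s.free_cpus >= job.cpus_needed
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.free_cpus)


if __name__ == "__main__":
    sites = [
        Site("site-a", 50, {"run-1"}, {"physics"}),
        Site("site-b", 200, {"run-1", "run-2"}, {"physics"}),
        Site("site-c", 500, {"run-2"}, {"biology"}),
    ]
    job = Job(owner_group="physics", dataset="run-1", cpus_needed=40)
    chosen = choose_site(job, sites)
    print("Job sent to:", chosen.name if chosen else "no suitable site")
```

A real system would also handle the later items on the list (running, monitoring, recovery and returning results), which this sketch leaves out.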
What are the challenges? • Must share data between thousands of scientists with multiple interests • Must link major computer centres, not just PCs • Must ensure all data accessible anywhere, anytime • Must grow rapidly, yet remain reliable for more than a decade • Must cope with different management policies of different centres • Must ensure data security: more is at stake than just money! • Must be up and running by 2007
Benefits for Science • More effective and seamless collaboration of dispersed communities, both scientific and commercial • Ability to run large-scale applications comprising thousands of computers, for wide range of applications • Transparent access to distributed resources from your desktop, or even your mobile phone • The term “e-Science” has been coined to express these benefits
Grid projects in the world • UK e-Science Grid • Netherlands – VLAM, PolderGrid • Germany – UNICORE, Grid proposal • France – Grid funding approved • Italy – INFN Grid • Eire – Grid proposals • Switzerland - Network/Grid proposal • Hungary – DemoGrid, Grid proposal • Norway, Sweden - NorduGrid • NASA Information Power Grid • DOE Science Grid • NSF National Virtual Observatory • NSF GriPhyN • DOE Particle Physics Data Grid • NSF TeraGrid • DOE ASCI Grid • DOE Earth Systems Grid • DARPA CoABS Grid • NEESGrid • DOH BIRN • NSF iVDGL • DataGrid (CERN, ...) • EuroGrid (Unicore) • DataTag (CERN,…) • Astrophysical Virtual Observatory • GRIP (Globus/Unicore) • GRIA (Industrial applications) • GridLab (Cactus Toolkit) • CrossGrid (Infrastructure Components) • EGSO (Solar Physics)
Grid Applications for Science • Medical/Healthcare (imaging, diagnosis and treatment) • Bioinformatics (study of the human genome and proteome to understand genetic diseases) • Nanotechnology (design of new materials from the molecular scale) • Engineering (design optimization, simulation, failure analysis and remote instrument access and control) • Natural Resources and the Environment (weather forecasting, earth observation, modeling and prediction of complex systems)
Medical/Healthcare Applications • Digital image archives • Collaborative virtual environments • On-line clinical conferences • “The Grid will enable a standardized, distributed digital mammography resource for improving diagnostic confidence” • “The Grid makes it possible to use large collections of images in new, dynamic ways, including medical diagnosis.” • “The ability to visualise 3D medical images is key to the diagnosis of pathologies and pre-surgical planning” • Quotes from: http://gridoutreach.org.uk
Bioinformatics • Capturing the complex and evolving patterns of genetic information, determining the development of an embryo • Understanding the genetic interactions that underlie the processes of life-form development, disease and evolution. “Every time a new genome is sequenced the result is compared in a variety of ways with other genomes. Each code is made of 3.5 billion pairs of chemicals…”
Nanotechnology • New and 'better' materials • Benefits in pharmaceuticals, agrochemicals, food production, electronics manufacture from the faster, cheaper discovery of new catalysts, metals, polymers, organic and inorganic materials • “The Grid has the potential to store and analyze data on a scale that will support faster, cheaper synthesis of a whole range of new materials.” • Quotes from: http://gridoutreach.org.uk
Natural Resources/Environment • Modeling and prediction of earthquakes • Climate change studies and weather forecast • Pollution control • Socio-economic growth planning, financial modeling and performance optimization • “Federations of heterogeneous databases can be exploited through the Grid to solve complex questions about global issues such as biodiversity.” • Quotes from: http://gridoutreach.org.uk
Precursors of the Grid • SETI@home: sharing of spare PC processing power to analyze radio signals • Napster: sharing of data (music) between computers • Entropia DCGrid: commercial solution for sharing workstations within a company The difference: The Grid CERN is developing will combine resources at major computer centers, and require dedicated equipment and sophisticated middleware to monitor and allocate resources
SETI@home: a grassroots Grid • >1 million years of computer processing time • >3.5 million have downloaded the screensaver • >30 Teraflops rating (ASCI White = 12 Teraflops)
Spinoff from SETI@home • Spawned a cottage industry: Xpulsar@home, Genome@home, Folding@home, evolutionary@home, FightAIDS@home, SARS@home... • Spawned a real industry: Entropia, United Devices, Popular Power... • Major limitations: only suitable for “embarrassingly parallel” problems; cycle scavenging relies on goodwill
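An “embarrassingly parallel” problem is one that splits into work units needing no communication with each other, which is why volunteer PCs can each process a chunk independently. The sketch below is a toy illustration using Python's standard thread pool; the "peak value" analysis is a made-up stand-in and is not how SETI@home processes signals.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def analyse_work_unit(samples: Sequence[float]) -> float:
    """Toy analysis of one independent chunk: return its peak value."""
    return max(samples)


def analyse_all(work_units: List[Sequence[float]], workers: int = 4) -> float:
    """Process every chunk independently, then combine the partial results.

    No work unit needs data from any other, so the chunks could just as
    well be handled by separate machines that never talk to each other.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        peaks = list(pool.map(analyse_work_unit, work_units))
    return max(peaks)


if __name__ == "__main__":
    chunks = [[0.1, 0.5, 0.2], [0.9, 0.3], [0.4]]
    print("Strongest signal:", analyse_all(chunks))
```

Problems whose pieces must exchange intermediate results frequently do not fit this pattern, which is the limitation the slide points out.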
Who will use Grids? • Computational scientists & engineers: large scale modeling of complex structures • Experimental scientists: storing and analyzing large data sets • Collaborations: large scale multi-institutional projects • Corporations: global enterprises and industrial partnership • Environmentalists: climate monitoring and modeling • Training & education: virtual learning rooms and laboratories
Grid at CERN Grid is a solution for LHC computing requirements CERN involved in many Grid development efforts worldwide
Grid @ CERN • CERN projects: • LHC Computing Grid (LCG) • EC funded projects led by CERN: • European DataGrid (EDG) • + others • Industry funded projects: • CERN openlab for DataGrid applications
LHC Computing Grid (LCG) • Mission: • Grid deployment project aimed at installing a functioning Grid to help the LHC experiments collect and analyse the data coming from the detectors • Strategy: • Integrate thousands of computers at dozens of participating institutes worldwide into a global computing resource • Rely on software being developed in advanced grid technology projects, both in Europe and in the USA
LHC Computing Grid (LCG) • People: • Over 150 physicists, computer scientists and engineers from partner research centres around the world • Timeline: • 2002: start project • 2003: service opened (Sept) • 2002 - 2005: prepare and deploy the environment for LHC computing • 2006 – 2008: acquire, build and operate the LHC computing service
European DataGrid (EDG) • Mission: • Develop the necessary middleware to run a Grid on a “testbed” involving computer centers in Europe • Key features: • Largest software development project ever funded by the EU (9.8 million euros) • Three year phased developments & demos (2001-2003) • Three application fields: High Energy Physics, Earth Observation and Genomic Exploration
European DataGrid (EDG) • People: • Total of 21 partners, over 150 programmers from research and academic institutes as well as industrial companies • Status: • Testbed including approximately 1000 CPUs at 15 sites • Several improved versions of middleware software (final release end 2003) • Several components of software integrated in LCG • Software used by partner projects: DataTAG, CROSSGRID
EGEE: Enabling Grids for e-Science in Europe • Mission: • Deliver 24/7 Grid service to European science; re-engineer and “harden” Grid middleware for production; “market” Grid solutions to different scientific communities • Be the first international multiscience production Grid facility • Key features: • 100 million euros over 4 years • >400 software engineers + service support • 70 European partners
The EGEE Vision Access to a production quality GRID will change the way science and much else is done in Europe An international network of scientists will be able to model a new flood of the Danube in real time, using meteorological and geological data from several centers across Europe. A team of engineering students will be able to run the latest 3D rendering programs from their laptops using the Grid. A geneticist at a conference, inspired by a talk she hears, will be able to launch a complex biomolecular simulation from her mobile phone.
DataTAG • Mission: • Develop advanced networking solutions for transatlantic Grid communications. • Status: • Recent land speed data transfer record: 1 Terabyte of data transferred in 1 hour between SLAC and CERN (equivalent to 200 DVD movies or one CD every 2.3 s).
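The equivalences quoted for the transfer record can be reproduced with simple arithmetic. The sketch below uses decimal units; the disc sizes (650 MB per CD, 5 GB per DVD movie) are assumptions chosen because they are consistent with the slide's comparison, and they do not appear on the slide.

```python
MB_PER_TB = 1_000_000
CD_MB = 650      # ASSUMPTION: capacity of one CD
DVD_MB = 5_000   # ASSUMPTION: size of one DVD movie


def transfer_rate_mb_per_s(terabytes: float, hours: float) -> float:
    """Average transfer rate in Megabytes per second."""
    return terabytes * MB_PER_TB / (hours * 3600)


def seconds_per_disc(disc_mb: float, rate_mb_per_s: float) -> float:
    """Time needed to move one disc's worth of data at the given rate."""
    return disc_mb / rate_mb_per_s


if __name__ == "__main__":
    rate = transfer_rate_mb_per_s(1, 1)  # slide: 1 Terabyte in 1 hour
    print(f"Average rate: {rate:.0f} MB/s ({rate * 8 / 1000:.1f} Gbit/s)")
    print(f"One CD every {seconds_per_disc(CD_MB, rate):.1f} s")
    print(f"DVD movies in 1 TB: {MB_PER_TB / DVD_MB:.0f}")
```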
GRACE • Background: • Today's search engines are extremely centralized. In order to index a document they must download it, process it and store its index, all in one central location. • Mission: • Develop a decentralized search engine providing dynamic categorisation of information, using Grid technology and semantic tools.
MammoGrid • Background: • Early diagnosis through mammography screening improves prognosis, BUT quality control in acquisition and diagnosis, and efficient data management, are vital. • Mission: • To provide a demonstrator for use in epidemiological studies, quality control and validation of computer-aided detection algorithms. • Status: • Building a Grid-enabled repository of mammography data for research and training that contains sufficiently large statistical samples.
CERN openlab for DataGrid applications • Mission: • Testbed for cutting edge Grid software and hardware • Industry consortium for Grid-related technologies of common interest • Training ground for a new generation of engineers to learn about Grid • Partners: • CERN • ENTERASYS • HP • IBM • INTEL
CERN openlab for DataGrid applications • CERN opencluster: • Build an ultrahigh performance computer cluster • Link it to the DataGrid and test its performance • Evaluate potential of future commodity technology for LHC • Student Program: • student teams get hands-on experience with some of the latest hardware and software technologies for the Grid • learn about how CERN and its partners are developing Grid technology for scientific and industrial purposes • external lab visits and special invited talks
Grid @ CERN The Grid Movie