GRID Computing Rabie A. Ramadan, PhD Cairo University http://www.rabieramadan.org rabie@rabieramadan.org
Table of Contents • What is Grid Computing? • Cousins of Grid Computing • An Illustrative Example • CERN Grid
Living in an Exponential World (1): Computing & Sensors • Moore’s Law: transistor count doubles roughly every 18 to 24 months
Living in an Exponential World (2): Storage • Storage density doubles every 12 months • Dramatic growth in online data (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes) • 2000: ~0.5 petabytes • 2005: ~10 petabytes • 2010: ~100 petabytes • 2015: ~1,000 petabytes?
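The growth figures above imply a doubling time that can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original slides: the helper name `doubling_time_years` is an assumption, and the inputs are the slide's approximate estimates of online data volume.

```python
import math


def doubling_time_years(v0, v1, years):
    """Doubling period implied by growth from v0 to v1 over `years` years."""
    return years * math.log(2) / math.log(v1 / v0)


# Approximate online data volumes quoted on the slide, in petabytes.
online_data_pb = {2000: 0.5, 2005: 10, 2010: 100}

# 2000 -> 2010 is a 200-fold increase over 10 years.
t_2000_2010 = doubling_time_years(online_data_pb[2000], online_data_pb[2010], 10)
print(f"Implied doubling time, 2000-2010: {t_2000_2010:.2f} years")
```

With the 2000 and 2010 estimates the implied doubling time is about 1.3 years. That is somewhat slower than the 12-month doubling quoted for storage density, which is a different quantity.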
Evolution of the Scientific Process • Pre-electronic: theorize and/or experiment, alone or in small teams; publish paper • Post-electronic: construct and mine very large databases of observational or simulation data; develop computer simulations and analyses; exchange information instantaneously within large, distributed, multidisciplinary teams
Evolution of Business • Pre-Internet: central corporate data processing facility; business processes not compute-oriented • Post-Internet: enterprise computing is highly distributed, heterogeneous, and inter-enterprise (B2B); outsourcing becomes feasible, giving rise to service providers of various sorts; business processes are increasingly computing- and data-rich
The Grid “Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
Computational Grids • A network of geographically distributed resources including computers, peripherals, switches, instruments, and data. • Each user should have a single login account to access all resources. • Resources may be owned by diverse organizations.
Computational Grids • Grids are typically managed by gridware. • Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability).
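Matching user requirements against resource attributes can be illustrated with a small sketch. This is not how any particular gridware works; the `Resource` and `Request` classes, the `match` function, and the site names are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cpus: int          # capacity attribute
    memory_gb: int     # capacity attribute
    available: bool    # availability attribute


@dataclass
class Request:
    min_cpus: int
    min_memory_gb: int


def match(request, resources):
    """Return available resources that satisfy the request, smallest first."""
    candidates = [
        r for r in resources
        if r.available
        and r.cpus >= request.min_cpus
        and r.memory_gb >= request.min_memory_gb
    ]
    # Prefer the smallest adequate resource so larger ones stay free.
    return sorted(candidates, key=lambda r: r.cpus)


# Hypothetical sites owned by different organizations.
resources = [
    Resource("site-a", cpus=64, memory_gb=256, available=True),
    Resource("site-b", cpus=16, memory_gb=64, available=True),
    Resource("site-c", cpus=128, memory_gb=512, available=False),
]

matches = match(Request(min_cpus=32, min_memory_gb=128), resources)
print([r.name for r in matches])
```

Only `site-a` qualifies here: `site-b` is too small and `site-c` is unavailable.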
Cousins of Grid Computing • Parallel Computing • Distributed Computing • Peer-to-Peer Computing • Cloud Computing • Many others: Cluster Computing, Network Computing, Client/Server Computing, Internet Computing, etc.
A Comparison • SERIAL: Fetch/Store; Compute • GRID: Fetch/Store; Discovery of resources; Interaction with remote application; Authentication/Authorization; Security; Compute/Communicate; etc. • PARALLEL: Fetch/Store; Compute/Communicate; Cooperative game
Distributed Computing • People often ask: Is Grid Computing a fancy new name for the concept of distributed computing? • In general, the answer is “no.” Distributed Computing is most often concerned with distributing the load of a program across two or more processes. Grid Computing, by contrast, also coordinates the sharing of resources owned by diverse organizations.
Peer-to-Peer Computing • Sharing of computer resources and services by direct exchange between systems. • Computers can act as clients or servers depending on which role is most efficient for the network.
Cluster Computing • A collection of workstations or PCs that are interconnected by a high-speed network. • Needs physical proximity and operating homogeneity. • Works as an integrated collection of resources. • Has a single system image spanning all its nodes.
Cloud Computing • Clouds: a new commercially supported data-center model replacing compute grids (and your general-purpose computer center)
Methods of Grid Computing • Distributed Supercomputing • High-Throughput Computing • On-Demand Computing • Data-Intensive Computing • Collaborative Computing
Distributed Supercomputing • Combines multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer. • Tackles problems that cannot be solved on a single system.
High-Throughput Computing • Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
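The pattern of farming out many independent tasks can be sketched on a single machine. The example below is only an analogy under stated assumptions: a local thread pool stands in for grid worker nodes, and `run_task` and `run_all` are hypothetical names, not part of any grid scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task_id):
    """Stand-in for one independent job, e.g. one parameter setting."""
    return task_id * task_id


def run_all(task_ids, workers=4):
    """Distribute independent tasks over a worker pool.

    Tasks share no state, so they can run in any order on any worker;
    `map` still returns the results in input order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_task, task_ids))


results = run_all(range(10))
print(results)
```

Because the tasks do not communicate, throughput scales with the number of idle workers, which is the property high-throughput computing exploits.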
On-Demand Computing • Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
Data-Intensive Computing • The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases. • Particularly useful for distributed data mining.
Collaborative Computing • Concerned primarily with enabling and enhancing human-to-human interactions. • Applications are often structured in terms of a virtual shared space.
Who Needs Grid Computing? • A chemist may utilize hundreds of processors to screen thousands of compounds per hour. • Teams of engineers worldwide pool resources to analyze terabytes of structural data. • Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands. • Physics experiments with high computing requirements.
LHC Data Every Year • 40 million collisions per second • After filtering, 1,000 collisions of interest per second • > 1 megabyte of data digitised per collision, giving a recording rate > 1 gigabyte/s • The collisions recorded each year require > 15 petabytes of storage per year
Scale of data units:
1 Megabyte (1 MB): a digital photo
1 Gigabyte (1 GB) = 1000 MB; 5 GB = a DVD movie
1 Terabyte (1 TB) = 1000 GB: world annual book production
1 Petabyte (1 PB) = 1000 TB: annual production of one LHC experiment
1 Exabyte (1 EB) = 1000 PB; 3 EB = world annual information production
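The rates on this slide follow from simple arithmetic, sketched below. The slide gives lower bounds (">"), so the numbers are treated as such; the steady-rate assumption used to estimate recording time is an illustration only and is not stated in the source.

```python
# Decimal units, as used on the slide.
MB = 10**6
GB = 10**9
PB = 10**15

collisions_per_s = 40_000_000      # before filtering
kept_per_s = 1_000                 # collisions of interest after filtering
bytes_per_collision = 1 * MB       # lower bound from the slide

# One collision is kept out of every `filter_ratio` collisions.
filter_ratio = collisions_per_s // kept_per_s

# Recording rate in bytes per second.
recording_rate = kept_per_s * bytes_per_collision

# Assumption for illustration: 15 PB recorded at a steady 1 GB/s.
yearly_volume = 15 * PB
seconds_recording = yearly_volume / recording_rate
days_recording = seconds_recording / 86_400

print(f"1 collision kept per {filter_ratio:,}")
print(f"Recording rate: {recording_rate / GB:.0f} GB/s")
print(f"Days of steady recording for 15 PB: {days_recording:.0f}")
```

Under that assumption, 15 petabytes corresponds to roughly 174 days of continuous recording at 1 gigabyte per second.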
LHC data correspond to about 20 million CDs each year. [Figure: height comparison. Balloon (30 km); CD stack with one year of LHC data (~20 km); Concorde (15 km); Mt. Blanc (4.8 km).] Where will the experiments store all of these data?
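The CD comparison can be sanity-checked against the 15 petabytes per year quoted earlier. This is a rough sketch using only the slides' round numbers; the variable names are illustrative.

```python
PB = 10**15
MB = 10**6

yearly_volume_bytes = 15 * PB   # from the "LHC Data Every Year" slide
cds_per_year = 20_000_000       # from this slide
stack_height_m = 20_000         # ~20 km, from this slide

# Data per CD and thickness per CD implied by the slide's figures.
bytes_per_cd = yearly_volume_bytes / cds_per_year
thickness_mm = stack_height_m * 1000 / cds_per_year

print(f"Implied capacity per CD: {bytes_per_cd / MB:.0f} MB")
print(f"Implied thickness per CD: {thickness_mm:.1f} mm")
```

The implied 750 MB and 1 mm per disc are close to the nominal 700 MB capacity and roughly 1.2 mm thickness of a real CD, so the slide's figures are mutually consistent to within rounding.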
LHC Data Processing • LHC data analysis requires computing power equivalent to ~100,000 of today's fastest PC processors. • Where will the experiments find such computing power?
Computing power available at CERN • High-throughput computers • More than 35,000 CPUs in about 6,000 boxes (Linux) • 14 petabytes on 14,000 drives (NAS disk storage) • 34 petabytes on 45,000 tape slots with 170 high-speed drives • Nowhere near enough!
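A back-of-the-envelope comparison shows why these resources fall short. The sketch below uses only figures from the slides (about 100,000 processors needed, more than 15 petabytes of new data per year); treating a CERN CPU as equivalent to one "fastest PC processor" is a simplifying assumption made here, not a claim from the source.

```python
cpus_available = 35_000      # "more than 35,000 CPUs"
cpus_needed = 100_000        # "~100,000 of today's fastest PC processors"

disk_pb, disk_drives = 14, 14_000
tape_pb, tape_slots = 34, 45_000
yearly_data_pb = 15          # "> 15 petabytes / year"

# Fraction of the required processing power available on site
# (assumes one CPU ~ one fast PC processor, which is optimistic).
cpu_fraction = cpus_available / cpus_needed

# Average capacity per disk drive, in terabytes.
tb_per_drive = disk_pb * 1000 / disk_drives

# How many years of LHC data the tape store could hold.
years_on_tape = tape_pb / yearly_data_pb

print(f"CPU fraction available: {cpu_fraction:.0%}")
print(f"Average disk drive size: {tb_per_drive:.1f} TB")
print(f"Years of data that fit on tape: {years_on_tape:.1f}")
```

On these numbers CERN alone covers about a third of the processing need, its disk store is smaller than one year of data, and its tape store fills in a little over two years.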
Computing for LHC • Problem: even with the Computer Centre upgrade, CERN can provide only a fraction of the necessary resources. • Solution: computing centers, which were isolated in the past, will be connected, uniting the computing resources of particle physicists worldwide. • Users of CERN: in Europe, 267 institutes and 4,603 users; outside Europe, 208 institutes and 1,632 users.
The GRID: a possible solution to CERN computing needs • The LHC Computing GRID (LCG) is a project funded by the European Union. Its objective is to build the next-generation computing infrastructure providing intensive computation and analysis. [Figure: the LHC Computing Centre tier model. Labels: CERN Tier 0; Tier 1 (CERN Tier 1, UK, USA, France, Italy, Germany, ...); Tier 2 (Lab a, Lab b, Lab c, Lab m, Uni a, Uni b, Uni n, Uni x, Uni y, regional group); Tier 3 (physics department); physics group; desktop.]
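The tiered layout in the diagram can be thought of as a tree rooted at CERN Tier 0. The sketch below models it with parent links. The tier labels come from the slide, but which site attaches to which centre is a hypothetical assignment made only for illustration, and the names `PARENT`, `path_to_tier0`, and `tier` are not from the source.

```python
# Parent links in the tier tree. Site-to-centre assignments are hypothetical.
PARENT = {
    # Tier 1 national centres report to Tier 0.
    "UK": "CERN", "USA": "CERN", "France": "CERN",
    "Italy": "CERN", "Germany": "CERN",
    # Tier 2 labs and universities (assignment invented for this example).
    "Lab a": "UK", "Uni a": "UK", "Lab b": "Germany", "Uni b": "Italy",
    # Tier 3 and below.
    "physics department": "Uni a",
    "desktop": "physics department",
}


def path_to_tier0(site):
    """Return the chain of sites from `site` up to the Tier 0 root."""
    path = [site]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tier(site):
    """Tier number, counted as hops from the Tier 0 root (CERN)."""
    return len(path_to_tier0(site)) - 1


for site in ("CERN", "UK", "Lab a", "physics department"):
    print(f"Tier {tier(site)}: {site}")
print(" -> ".join(path_to_tier0("desktop")))
```

Each level down the tree is one tier further from CERN Tier 0, and a physicist's desktop reaches it through its department, university or lab, and national centre.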
Thank you