Tutorial: Technology of the Grid
1. Definition
2. Components
3. Infrastructure
Kento Aida, Tokyo Institute of Technology
Goal of the Tutorial
• What is the grid?
  • definition
• What technology is needed to create the grid?
  • component technology
• How is the grid environment constructed?
  • infrastructure
1. Definition
Definition of the Grid
• Definition [http://www.jpgrid.org/about/index.html]
  • The grid is an infrastructure to dynamically organize a virtual organization (or a virtual computer) on demand by virtualizing and integrating resources such as computers, data, experimental devices, sensors, and people. (The original definition is written in Japanese.)
• What is the grid? A three point checklist [http://www.gridtoday.com/02/0722/100136.html]
  • coordinates resources that are not subject to centralized control
  • using standard, open, general-purpose protocols and interfaces
  • to deliver nontrivial qualities of service
What can we do using the grid?
We can use information resources (services) on the network securely (with security guaranteed), stably (using required resources on demand), and easily (without knowledge of networks, computers, …).
[Figure: a virtual organization formed over the Internet]
Examples of Virtual Organizations
• Members in a collaborative research project
  • Researchers in a collaborative research project share resources distributed over their sites, e.g. universities, institutes, laboratories, ….
  • large-scale scientific computing
  • large-scale distributed database
• Project team in a company
  • Members in a project team share resources distributed over multiple branches in a company.
  • business
  • transaction
Definition of the Grid (again)
• Definition
  • The grid is an infrastructure to dynamically organize a virtual organization (or a virtual computer) on demand by virtualizing and integrating resources … .
• What is the grid? A three point checklist
  • coordinates resources that are not subject to centralized control
    • dynamic organization of VO
  • using standard, open, general-purpose protocols and interfaces
    • access to resources by standardized protocols
  • to deliver nontrivial qualities of service
    • Users do not need knowledge about networks, computers, etc.
Grid?
• Grid = supercomputer + network?
• Grid = idle PCs + network?
• Grid = large-scale parallel processing on the internet?
• If we connect our resources to the grid, will anonymous users’ jobs run on our resources without the owners’ knowledge?
• If we submit jobs to the grid, will our jobs run on resources at unknown sites?
Classification of the Grid
• Computing Grid (high-performance computing)
• Data Grid (high-performance data processing)
• Sensor Grid (advanced sensing)
• Access Grid (support for collaboration)
• Business Grid (advanced web service)
• PC Grid (utilization of idle PCs)
[Figure: the grid types arranged along an axis from science to business]
Computing Grid
• Grid computing
  • high-performance computing service that utilizes computers on the grid
• Benefits for users
  • reducing computation time
  • expanding problem size
  • receiving computation service
• Component technology
  • security, resource management, job management, programming, problem solving environment (PSE), …
Data Grid
• Large-scale data processing/computing
  • large-scale distributed database on the internet
  • data processing service to access distributed data
• Benefits for users
  • high-speed access to distributed data
  • high-performance and reliable processing using large-scale data
• Component technology
  • security, high-speed data transfer, replica management, scheduling
Access Grid
• Communication support on the grid
• Example
  • remote conference
  • virtual laboratory
  • remote medical service
    • SARS Grid (NCHC)
  • entertainment
    • “KARAOKE” Grid (AIST)
Sensor Grid
• Advanced Monitoring
  • coordination of autonomous sensors connected by network
    • wired network, wireless network, satellite, …
  • advanced sensing, analysis, forecasting
• Example
  • meteorology (weather forecast), ecology, agriculture, …
Technical Issues of the Grid
• Component technology
  • security, information service, resource management
  • job management, scheduling
  • data management
  • programming
  • problem solving environment (PSE)
• Infrastructure
  • production grid
• Application
  • applying to big science
  • applying to business
2. Components
Component Technology of the Grid
[Figure: layered stack of component technologies, listed from top to bottom]
• application
• programming
• problem solving environment
• information service
• job management
• data management
• resource management
• security
• infrastructure (computer, network, experimental device, …)
Security
• Issues
  • authentication, encryption of communication
• Single sign on
  • user authentication on one host
  • Authentication on other hosts is automatically performed.
[Figure: the user authenticates once; authentication at Org. A, Org. B, and Org. C over the internet is then performed automatically.]
Resource Management
• Common interfaces to the grid
  • wrapping differences of commands/operations among different machines
[Figure: the user issues a common command over the internet; the gateway (GW) of each organization translates it into the local command: com. a on OS A (Org. A), com. b on OS B (Org. B), com. c on OS C (Org. C).]
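The wrapping idea on this slide can be sketched in a few lines. This is only an illustration: the command names com_a, com_b, and com_c are placeholders taken from the figure labels, and the `translate` helper is hypothetical, not the interface of any real grid middleware.

```python
# Hypothetical mapping from each organization's gateway to its local command.
# com_a, com_b, com_c mirror the figure labels; they are placeholders only.
LOCAL_COMMANDS = {
    "Org. A": "com_a",
    "Org. B": "com_b",
    "Org. C": "com_c",
}


def translate(org, job_script):
    """Translate one common 'run this job' request into a site-local command line."""
    if org not in LOCAL_COMMANDS:
        raise KeyError(f"no gateway registered for {org!r}")
    return [LOCAL_COMMANDS[org], job_script]


if __name__ == "__main__":
    # The user always issues the same request; only the gateway output differs.
    for org in LOCAL_COMMANDS:
        print(org, "->", " ".join(translate(org, "job.sh")))
```

The user-facing call stays the same for every site, which is the point of a common interface: differences between machines are confined to the lookup table at the gateway.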
Information Service
• Information about resources on the Grid
[Figure: the info. service collects resource information (CPU, memory, OS, …) and network monitoring data over the internet from the gateways (GW) of Org. A, Org. B, and Org. C.]
Big picture of the GT2
[Figure: GT2 components and message flow. Labels shown: CA, User Cert., Proxy Cert., grid-proxy-init, Client, GIIS, GRIS, gatekeeper, GridFTP Server, process, Site B, Site C. Arrows shown: Query Resource Status, Process invocation, Data Transfer, Return result.]
[source: Yoshio Tanaka, AIST]
Job Management
• Resource selection, Scheduling, Job control
[Figure: the user, a resource broker, and the info. service are connected over the internet to the gateways (GW) of Org. A, Org. B, and Org. C; the arrows are numbered (0) to (4).]
Condor
• High Throughput Computing
  • matching jobs and resources by ClassAds mechanism
  • fault tolerance by checkpointing
• Implementation on the Globus Toolkit
  • Condor-G
[Figure: a Client submits a job to Schedd; the Match maker matches it with a Startd. Example ClassAd: owner: aaa, CPU: 2 GHz or more, Memory: 512 MB or more, Disk: 10 GB or more, …]
[ http://www.cs.wisc.edu/condor/ ]
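The matching step can be illustrated with a minimal sketch. It assumes plain dictionaries in place of real ClassAds, and the attribute names (cpu_ghz, memory_mb, disk_gb) and machine names are made up to mirror the example requirements on the slide. Real ClassAd matching is richer (machines can also state requirements on jobs), so this shows one direction only.

```python
# A job advertises minimum requirements; each machine advertises its attributes.
# Attribute and machine names are assumptions chosen to mirror the slide example.
job_requirements = {"cpu_ghz": 2.0, "memory_mb": 512, "disk_gb": 10}

machines = [
    {"name": "node1", "cpu_ghz": 1.5, "memory_mb": 1024, "disk_gb": 40},
    {"name": "node2", "cpu_ghz": 2.4, "memory_mb": 512, "disk_gb": 20},
    {"name": "node3", "cpu_ghz": 3.0, "memory_mb": 256, "disk_gb": 80},
]


def matches(requirements, machine):
    """True if the machine meets or exceeds every minimum requirement."""
    return all(machine.get(attr, 0) >= minimum
               for attr, minimum in requirements.items())


def matchmake(requirements, machine_ads):
    """Return the names of all machines that satisfy the job's requirements."""
    return [m["name"] for m in machine_ads if matches(requirements, m)]


if __name__ == "__main__":
    print(matchmake(job_requirements, machines))
```

Here node1 fails on CPU speed and node3 fails on memory, so only node2 is a candidate for the job.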
Scheduling
• Application scheduling
  • Scheduling of a single application (job) on resources
    • How do we decompose an application program into tasks?
    • Where do we allocate tasks?
    • When do we start execution of tasks?
• Job scheduling
  • Scheduling of multiple jobs on resources
    • Where do we dispatch jobs on resources?
    • When do we start execution of jobs?
• Goal
  • minimizing the execution time, meeting the deadline, minimizing the cost, preserving fairness, …
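As a concrete instance of job scheduling with the goal of minimizing execution time, the sketch below uses a simple greedy heuristic: take the longest job first and dispatch it to the resource that becomes free earliest. The heuristic is an assumption chosen for illustration; the slide does not prescribe an algorithm, and the result is not guaranteed to be optimal.

```python
import heapq


def schedule_jobs(job_times, num_resources):
    """Greedy longest-job-first dispatch onto identical resources.

    Returns (makespan, assignment), where assignment[r] lists the run times
    of the jobs placed on resource r in execution order.
    """
    # Heap of (time at which the resource becomes free, resource index).
    free_at = [(0.0, r) for r in range(num_resources)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(num_resources)]

    for t in sorted(job_times, reverse=True):
        ready, r = heapq.heappop(free_at)    # "where": earliest-free resource
        assignment[r].append(t)              # "when": the job starts at time `ready`
        heapq.heappush(free_at, (ready + t, r))

    makespan = max(ready for ready, _ in free_at)
    return makespan, assignment


if __name__ == "__main__":
    print(schedule_jobs([4, 3, 3, 2, 2], 2))
```

For the five jobs in the usage example, the heuristic finishes at time 8, while the best possible split (4+3 and 3+2+2) finishes at time 7, which shows why scheduling quality depends on the chosen policy.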
Nimrod
• Job management system for parameter-survey applications
  • computational economy
  • deadline scheduling
• Implementation on the Globus Toolkit
  • Nimrod/G
[source: D. Abramson et al., “High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?,” IPDPS2000, 2000 ]
[ http://www.csse.monash.edu.au/~davida/nimrod.html/ ]
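To show how a deadline and a computational economy can interact, here is a minimal sketch. It is not Nimrod/G's actual algorithm: it assumes equally sized parameter-survey jobs, resources that run their jobs one after another, and a hypothetical (name, seconds per job, cost per job) description of each resource. It fills the cheapest resources first and uses more expensive ones only as far as the deadline requires.

```python
import math


def plan_within_deadline(num_jobs, deadline, resources):
    """Pick the cheapest resources that can finish all jobs by the deadline.

    resources: list of (name, seconds_per_job, cost_per_job) tuples.
    Returns (plan, total_cost), where plan maps resource name -> job count,
    or None if the deadline cannot be met with the given resources.
    """
    plan, total_cost, remaining = {}, 0.0, num_jobs
    # Cheapest resources first: pay for expensive ones only when needed.
    for name, seconds_per_job, cost_per_job in sorted(resources, key=lambda r: r[2]):
        if remaining == 0:
            break
        capacity = math.floor(deadline / seconds_per_job)  # jobs finished in time
        take = min(capacity, remaining)
        if take > 0:
            plan[name] = take
            total_cost += take * cost_per_job
            remaining -= take
    return (plan, total_cost) if remaining == 0 else None


if __name__ == "__main__":
    resources = [("slow", 100, 1.0), ("fast", 10, 5.0)]
    print(plan_within_deadline(20, 1000, resources))
    print(plan_within_deadline(20, 500, resources))
```

Tightening the deadline from 1000 to 500 seconds moves jobs from the cheap resource to the expensive one, so the total cost rises: the trade-off between time and money that a computational economy exposes to the user.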
Data Management
• Distributed file management, High-speed file transfer, Replica management
[Figure: the user reaches files through data management over the internet; files move by high-speed file transfer between the gateways (GW) of Org. A, Org. B, and Org. C and are replicated across sites.]
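One task of replica management is deciding which copy of a file to read. The sketch below assumes the only available information is an estimated bandwidth to each site holding a replica; the function name, the site names, and the numbers are hypothetical.

```python
def pick_replica(file_size_mb, replicas):
    """Choose the replica with the shortest estimated transfer time.

    replicas: dict mapping site name -> estimated bandwidth in MB/s.
    Returns (site, estimated_seconds).
    """
    if not replicas:
        raise ValueError("no replica available")
    # For a fixed file size, the highest bandwidth gives the shortest transfer.
    site = max(replicas, key=replicas.get)
    return site, file_size_mb / replicas[site]


if __name__ == "__main__":
    print(pick_replica(1000, {"Org. A": 10, "Org. B": 50, "Org. C": 25}))
```

A production system would also weigh site load and network distance; bandwidth alone is used here to keep the idea visible.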
Data Grid Applications
• High Energy Physics
• Earth Science, Astronomical Observation
• Bioinformatics
[source: Osamu Tatebe, AIST]
Grid Datafarm
• Peta-to-Exascale Global Filesystem on unified CPU/storage cluster
• Parallel I/O and parallel processing with local I/O scalability
[source: Osamu Tatebe, AIST]
Trans-Pacific Gfarm Datafarm testbed: Network and cluster configuration
• Trans-Pacific theoretical peak: 3.9 Gbps
• Gfarm disk capacity: 70 TBytes
• Disk read/write: 13 GB/sec
[Figure: network and cluster map. Sites: Indiana Univ, Titech, NII, Univ Tsukuba, KEK, AIST, SDSC, Kasetsart Univ (Thailand), SC2003 Phoenix. Networks and exchange points: SuperSINET, Abilene, APAN/TransPAC, APAN Tokyo XP, Tsukuba WAN, Maffin, New York, Chicago, Los Angeles. Link labels: 10G, 5G, 2.4G, 2.4G(1G), 1G, 622M (OC-12 ATM). Bracketed throughput labels: 950 Mbps, 500 Mbps, 2.34 Gbps. Cluster labels: 147 nodes / 16 TBytes / 4 GB/sec; 10 nodes / 1 TBytes / 300 MB/sec; 7 nodes / 3.7 TBytes / 200 MB/sec; 32 nodes / 23.3 TBytes / 2 GB/sec; 16 nodes / 11.7 TBytes / 1 GB/sec (two clusters).]
[source: Osamu Tatebe, AIST]
Programming
• MPI
  • programming with Message Passing Interface
  • MPICH-G2, GridMPI, …
• GridRPC
  • programming with Remote Procedure Call (RPC) mechanism
  • Ninf-G, OmniRPC, NetSolve, …
• Master Worker Template
  • template to develop master-worker programs
  • MW, AMWAT, …
GridRPC
[Figure: the user program (master) sends input data over the internet to library programs running on workers and receives output data. The master issues the calls in a loop: for (…) { grpc_call_async( ) }]
GridRPC (cont’d)
• Ninf-G [ http://ninf.apgrid.org/ ]
  • reference implementation of GridRPC
  • implementation on the Globus Toolkit
  • using the security functions of Globus (authentication, encrypted communication)
Sequential loop calling the library locally:
    for (i = start; i <= end; i++) {
        SDP_search(argv[1], i, &value[i]);
    }
The same loop with GridRPC:
    grpc_function_handle_init(&hdl, …, "SDP/search");
    for (i = start; i <= end; i++) {
        grpc_call_async(&hdl, argv[1], i, &value[i]);
    }
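The asynchronous call pattern can be tried without grid middleware. The sketch below is a local analogue only: it uses Python's standard thread pool instead of Ninf-G, and `sdp_search` is a hypothetical stand-in for the remote SDP/search routine. What it preserves is the structure of the GridRPC loop: issue all calls without blocking, then collect the results.

```python
from concurrent.futures import ThreadPoolExecutor


def sdp_search(problem, i):
    """Hypothetical stand-in for the remote SDP/search routine."""
    return len(problem) * i


def run_async(problem, start, end, max_workers=4):
    """Issue one asynchronous call per index, then wait for all results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, playing the role of grpc_call_async().
        futures = {i: pool.submit(sdp_search, problem, i)
                   for i in range(start, end + 1)}
        # result() blocks until that particular call has finished.
        return {i: f.result() for i, f in futures.items()}


if __name__ == "__main__":
    print(run_async("abc", 1, 3))
```

As in this sketch, a master that issues asynchronous calls must wait for them to complete before reading the output values; the loop on the slide shows only the call step.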
Problem Solving Environment (PSE)
• Portal
  • frontend to search, run, monitor, and control applications on the grid
  • Web page
  • cooperation with a workflow system
• Workflow
  • mechanism to run multiple applications following their dependencies
  • representing dependencies among applications by a graph
  • initiation of applications following the workflow by the workflow engine
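The core job of a workflow engine, starting applications in an order that respects the dependency graph, can be sketched with a topological sort. The application names below are hypothetical, and the sketch uses Python's standard graphlib module (Python 3.9 or later) rather than any particular grid workflow system.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each application maps to the applications it depends on.
workflow = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyze": {"simulate"},
    "visualize": {"simulate"},
    "report": {"analyze", "visualize"},
}


def execution_order(graph):
    """Return one order in which the applications can be started."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(workflow))
```

Applications with no dependency between them, such as analyze and visualize here, may appear in either order and could run in parallel on different grid resources; a dependency cycle makes `static_order()` raise `graphlib.CycleError`.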
Example of PSE (UNICORE) [source: http://www.unicore.org/unicore.htm]
3. Infrastructure
Resources in Grid Infrastructure
• Computer
  • PC, PC cluster, supercomputer, …
• Storage
  • HDD, RAID, …
[source: http://www.gsic.titech.ac.jp/Japanese/Service/R_System/Overview/index.html] [source: Matsuoka Lab, TITECH]
Resources in Grid Infrastructure (cont’d)
• Experimental device
  • microscope, accelerator, …
• Sensor
  • thermometer, camera, …
[Figure: Large Hadron Collider, CERN] [source: Osamu Tatebe, AIST]
[Figure: Ultra-High Voltage Electron Microscope, Osaka University] [source: http://www.biogrid.jp/]
[Figure: EcoGrid, NCHC] [source: Fang Pang Lin, NCHC]
Resources in Grid Infrastructure (cont’d)
• Network
  • LAN, WAN, internet, …
[source: http://www.noc.titech.ac.jp/titanet/supertitanet/index.ja.shtml] [ source: http://www.apan.net/]
Grid Infrastructure
• Classification by objectives
  • test bed: a grid environment constructed to perform experiments
    • temporarily available
  • production grid: a grid environment for production use, i.e. to run practical applications
    • permanently available
    • Resources are operated 24 hours a day.
• Classification by geographic sites
  • department grid, campus grid, national grid, international grid
ACT-JST Testbed
• Grid testbed for running applications to solve large-scale optimization problems
  • construction of a 1000-CPU-scale testbed
  • application development
  • collaboration among Grid researchers and application scientists
[Figure: participating sites: AIST, TDU, TITECH, Tokushima U.]
Grid Challenge Federation (GCF)
• Test bed constructed for the Grid Challenge event, a programming contest on the grid
• Resources
  • Grid Technology Research Center, AIST
  • HPCS Lab., U. Tsukuba
  • Yuba-Honda Lab., UEC
  • Matsuoka Lab., TITECH
  • Aida Lab., TITECH
  • Ono Lab., Tokushima U.
  • Hiraki Lab., U. Tokyo
  • Chikayama-Taura Lab., U. Tokyo
ApGrid/PRAGMA
• Grid partnership in the Asia-Pacific region
[ source: http://www.apgrid.org/]
Titech Grid [source: http://www.gsic.titech.ac.jp/index-j.html]
NAREGI [source: http://www.naregi.org/ ]
TeraGrid
• A 40 Gbps network connects the sites.
• 20 TeraFlops, 1 PB resources
• Sites: CalTech, ANL, SDSC, NCSA, PSC
[source: http://www.teragrid.org/]
Operation of Infrastructure
• Objectives
  • An organization/staff is required to stably provide a grid infrastructure to users.
  • The current internet is operated by experts (organizations) for network operation.
    • Network Operation Center (NOC)
• Grid Operation Center
  • organization to operate a grid infrastructure
  • providing information about grid resources
    • resources in VO
    • load on computing resources, traffic on networks, …
  • user support
    • accounting, document archives, help desk, troubleshooting, …
PRAGMA GOC
Network Weather Map
http://mrtg.koganei.itrc.net/mmap/grid.html
Thanks: Dr. Hirabaru and APAN Tokyo NOC team