Grid Computing (2) (Special Topics in Computer Engineering). Veera Muangsin, 30 January 2004. Outline: High-Performance Computing, Grid Computing, Grid Applications, Grid Architecture, Parallel Computer Architectures, Cluster Architecture, Grid Architecture, Grid Middleware, Grid Services.
Grid Computing (2) (Special Topics in Computer Engineering) Veera Muangsin 30 January 2004
Outline • High-Performance Computing • Grid Computing • Grid Applications • Grid Architecture • Parallel Computer Architectures • Cluster Architecture • Grid Architecture • Grid Middleware • Grid Services
Parallel Architecture Taxonomy • Single Instruction Single Data (SISD) • Multiple Instruction Single Data (MISD) • Single Instruction Multiple Data (SIMD) • Multiple Instruction Multiple Data (MIMD) • Shared Memory MIMD • Distributed Memory MIMD
SISD: A Conventional Computer • Speed is limited by the rate at which the computer can transfer information internally. • Ex: PC, Macintosh, workstations [Figure: a single processor takes one instruction stream and one data input stream and produces one data output stream.]
The MISD Architecture • More of an intellectual exercise than a practical configuration. • A few have been built, but none are commercially available. [Figure: Processors A, B, and C each receive their own instruction stream (A, B, C) and share a single data input stream and a single data output stream.]
SIMD Architecture • Ex: CRAY machine vector processing, Ci <= Ai * Bi [Figure: a single instruction stream drives Processors A, B, and C; each processor has its own data input stream and data output stream.]
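The vector operation Ci <= Ai * Bi can be sketched in a few lines. This is only a conceptual illustration: plain Python runs the loop sequentially, whereas a vector processor would apply the single multiply instruction to many elements at once. The function name `vector_multiply` and the sample values are assumptions made for this example.

```python
# Conceptual sketch of SIMD: one operation (multiply) applied to every
# element pair of two vectors. Plain Python evaluates this sequentially;
# SIMD hardware would process many elements per instruction.

def vector_multiply(a, b):
    """Return c where c[i] = a[i] * b[i] for every index i."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x * y for x, y in zip(a, b)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = vector_multiply(a, b)
print(c)  # [10.0, 40.0, 90.0, 160.0]
```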
MIMD Architecture • Unlike SIMD, whose processors execute one instruction stream in lockstep, a MIMD computer works asynchronously. • Shared memory (tightly coupled) MIMD • Distributed memory (loosely coupled) MIMD [Figure: Processors A, B, and C each have their own instruction stream, data input stream, and data output stream.]
Clusters • Distributed Memory MIMD • The most common architecture in the TOP500
Top 2-5 Clusters • #2 LANL’s ASCI Q • 13.88 TFlops • 8192-node cluster HP AlphaServer 1.25 GHz • #3 Virginia Tech’s System X • 10.28 TFlops • 1,100-node cluster, Apple G5
#4 NCSA’s Tungsten • 9.81 TFlops • 1,450-node cluster, dual-processor Dell PowerEdge 1750 • #5 PNNL’s MPP2 • 8.63 TFlops • 980-node cluster, HP Longs Peak, dual Intel Itanium-2 1.5 GHz
Our Parallel Computers [Figure: Apollo; Zeus and Athena]
Our Parallel Computers • Apollo Cluster • 6-node cluster • Athlon XP 2000+ processor, 512 MB memory • Linux + MPI + PBS (batch scheduling system) + Globus (Grid middleware) • Zeus and Athena • Two 4-processor Sun Enterprise 420R multiprocessor computers • 450 MHz UltraSPARC II processors, 1 GB memory • Solaris + Pthreads + MPI
Cluster Middleware • Resides between the OS and applications and offers an infrastructure for supporting: • Single System Image (SSI) • System Availability (SA) • SSI makes the collection of nodes appear as a single machine • SA is supported by checkpointing and process migration
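Checkpointing can be sketched as follows. This is a minimal, application-level illustration, not how any particular cluster middleware implements it: the job saves its state to a file after every step, and a restarted process (possibly on another node that can read the same file) resumes from the last saved step. The function name `run_with_checkpoints` and the `fail_at` parameter, which simulates a node failure, are hypothetical.

```python
import os
import pickle
import tempfile


def run_with_checkpoints(total_steps, checkpoint_path, fail_at=None):
    """Sum the integers 0..total_steps-1, saving state after every step.

    If a checkpoint file exists, the computation resumes from it.
    fail_at (hypothetical) simulates a crash just before the given step.
    """
    state = {"next_step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            state = pickle.load(f)

    for step in range(state["next_step"], total_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError(f"simulated node failure at step {step}")
        state["total"] += step
        state["next_step"] = step + 1
        # Write to a temporary file, then rename, so that a crash during
        # the write cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["total"]


checkpoint = os.path.join(tempfile.mkdtemp(), "job.ckpt")
try:
    run_with_checkpoints(10, checkpoint, fail_at=6)
except RuntimeError as err:
    print("job interrupted:", err)

result = run_with_checkpoints(10, checkpoint)  # resumes at step 6
print("resumed result:", result)  # resumed result: 45
```

Process migration builds on the same idea: the saved state is moved to another node and the process is restarted there.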
Single System Image Components • NFS (Network File System) • NIS (Network Information Service) • NTP (Network Time Protocol) [Figure: one server connected to three clients.]
Programming Environments • Threads (Cluster of SMPs) • POSIX Threads • Java Threads • Message Passing • MPI • PVM • Virtual Shared Memory • Batch Scheduling • PBS, Condor, etc.
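The difference between the thread model and the message-passing model can be sketched in Python. This is an illustration under stated assumptions: both versions run as threads inside one process, and a `queue.Queue` stands in for MPI-style send/receive; on a real cluster the message-passing version would use MPI or PVM across nodes. The function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(data, n_workers=4):
    """Thread model: workers add partial sums into one shared variable."""
    total = [0]               # shared state visible to every thread
    lock = threading.Lock()   # protects the shared total

    def worker(chunk):
        partial = sum(chunk)
        with lock:
            total[0] += partial

    threads = [threading.Thread(target=worker, args=(data[i::n_workers],))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(data, n_workers=4):
    """Message-passing model: workers share no variables and instead
    send their partial sums to the coordinator as messages."""
    mailbox = queue.Queue()   # stand-in for MPI send/receive

    def worker(chunk):
        mailbox.put(sum(chunk))          # "send"

    threads = [threading.Thread(target=worker, args=(data[i::n_workers],))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in range(n_workers))  # "receive"


numbers = list(range(1, 101))
print(shared_memory_sum(numbers), message_passing_sum(numbers))  # 5050 5050
```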
Batch Scheduling • Process distribution • Load balancing • Job scheduling • PBS, Condor, Sun Grid Engine, IBM LoadLeveler, LSF, DQS, …
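Load balancing in a batch scheduler can be sketched with a simple greedy policy: each job goes to the node with the least accumulated work. This is an illustrative policy chosen for the example, not the algorithm of PBS, Condor, or any other listed system; the function name, job names, and runtimes are hypothetical.

```python
import heapq


def schedule_jobs(jobs, node_names):
    """Assign each job to the node with the least accumulated work.

    jobs: list of (job_name, estimated_runtime) in submission order.
    Returns (assignment, load): the jobs per node and the total work per node.
    """
    heap = [(0, name) for name in node_names]   # (current load, node)
    heapq.heapify(heap)
    assignment = {name: [] for name in node_names}

    for job_name, runtime in jobs:
        load, node = heapq.heappop(heap)        # least-loaded node
        assignment[node].append(job_name)
        heapq.heappush(heap, (load + runtime, node))

    load = {node: node_load for node_load, node in heap}
    return assignment, load


jobs = [("a", 4), ("b", 2), ("c", 3), ("d", 1)]
assignment, load = schedule_jobs(jobs, ["node1", "node2"])
print(assignment)  # {'node1': ['a', 'd'], 'node2': ['b', 'c']}
print(load["node1"], load["node2"])  # 5 5
```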
Cluster Applications • Sequential • Parallel / Distributed (cluster-aware applications) • Grand Challenge applications • Weather Forecasting • Quantum Chemistry • Molecular Biology Modeling • Engineering Analysis (CAD/CAM) • … • Web servers, data mining
What is a Grid? • An infrastructure that dynamically couples • Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, and so on) • Software (e.g., renting special-purpose applications on demand) • Databases (e.g., transparent access to a human genome database) • Special instruments (e.g., radio) • People • across local/wide-area networks (enterprises, organisations, or the Internet) and presents them as a unified resource or problem-solving environment.
Grid Applications • Old and new applications are being Grid-enabled by coupling computers, databases, instruments, people, etc.: • (Distributed) supercomputing • Collaborative engineering • High-throughput computing • Large-scale simulation and parameter studies • Remote software access / renting software • Data-intensive computing • On-demand computing
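A parameter study, one of the application types listed above, runs the same model over many parameter combinations that are independent of each other, which is why it spreads well over many machines. The sketch below runs the combinations on a local thread pool as a stand-in for grid resources. The `simulate` function is a hypothetical placeholder for a real simulation, and the parameter values are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(params):
    """Hypothetical model: a stand-in for one simulation run."""
    x, y = params
    return x * x + y


def parameter_sweep(xs, ys, max_workers=4):
    """Run simulate() for every (x, y) combination and collect the results."""
    points = list(product(xs, ys))
    # Each point is independent, so the runs can execute concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(simulate, points))
    return dict(zip(points, results))


sweep = parameter_sweep([1, 2, 3], [0, 10])
best = min(sweep, key=sweep.get)
print(best, sweep[best])  # (1, 0) 1
```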
How can the Grid help me? • Provide access to a global distributed computing environment • via authentication, authorisation, negotiation, security • Identify and allocate appropriate resources • interrogate information services -> resource discovery • enquire about current status/loading via monitoring tools • decide strategy, e.g., move data or move application • (co-)allocate resources -> process flow
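The "move data or move application" decision can be sketched as a comparison of transfer times. This is a deliberately simplified model that assumes network transfer is the only cost and ignores differences in compute speed, licensing, and storage; the function names, sizes, and bandwidth are hypothetical.

```python
def transfer_seconds(size_mb, bandwidth_mbps):
    """Seconds to move size_mb megabytes over a bandwidth_mbps (megabit/s) link."""
    return size_mb * 8 / bandwidth_mbps


def choose_strategy(data_mb, app_mb, bandwidth_mbps):
    """Pick whichever is cheaper to ship: the data set or the application."""
    move_data = transfer_seconds(data_mb, bandwidth_mbps)
    move_app = transfer_seconds(app_mb, bandwidth_mbps)
    if move_app < move_data:
        return "move application", move_app
    return "move data", move_data


# A 5000 MB data set, a 20 MB application, and a 100 Mbit/s link.
strategy, seconds = choose_strategy(data_mb=5000, app_mb=20, bandwidth_mbps=100)
print(strategy, seconds)  # move application 1.6
```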
How can the Grid help me? (2) • Schedule tasks and analyse results • ensure required application code is available on the remote machine • transfer or replicate data and update catalogues • monitor execution and resolve problems as they occur • retrieve and analyse results, e.g., using local visualization
To make this happen you need … • agreed protocols (Grid protocols) • defined application programming interfaces (APIs) • distributed data management • availability of current status of resources • monitoring tools • accepted authentication procedures and policies • network traffic management
Grid Components (layers, top to bottom) • Grid Apps.: Applications and Portals (Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …) • Grid Tools: Development Environments and Tools (Languages, Libraries, Debuggers, Monitoring, Resource Brokers, Web tools, …) • Grid Middleware: Distributed Resources Coupling Services (Comm., Sign on & Security, Information, Process, Data Access, QoS, …) • Grid Fabric: Local Resource Managers (Operating Systems, Queuing Systems, Libraries & App Kernels, TCP/IP & UDP, …) and Networked Resources across Organisations (Computers, Clusters, Storage Systems, Data Sources, Scientific Instruments, …)
Before the Grid • independent sites • independent hardware and software • independent user ids • security policy requiring local connection to the machine • The user is responsible for resolving the complexities of the environment. [Figure: User and Application connect over the Network to Site A and Site B.]
First Step to the Grid • Metacenter • Two or more resources connected in a controlled user environment • Constraints • common architecture • single name space • common scheduler • A layer of abstraction is added that hides some of the complexities associated with running jobs in a distributed computing environment; however, limitations exist. [Figure: User and Application reach Site A and Site B over the Network through a centralized scheduler and file staging layer.]
The Grid Today • Common Middleware • abstracts independent hardware, software, and user ids into a service layer with defined APIs • comprehensive security • allows for site autonomy • provides a common infrastructure based on middleware • The underlying infrastructure is abstracted into defined APIs, thereby simplifying developer and user access to resources; however, this layer is not intelligent. • Interaction steps: (1) request info from the grid, (2) get response, (3) make selection and submit job. [Figure: User and Application sit on top of Grid Middleware, which sits on the Infrastructure; a Network connects Site A and Site B.]
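The three interaction steps on this slide (request info, get response, make selection and submit job) can be sketched as follows. Everything here is hypothetical: the `RESOURCES` list is an in-memory stand-in for a grid information service, and the function names do not belong to any real middleware API. Because the middleware layer "is not intelligent", the selection logic lives in the user's code.

```python
# Hypothetical in-memory stand-in for a grid information service.
RESOURCES = [
    {"site": "Site A", "free_cpus": 4, "load": 0.7},
    {"site": "Site B", "free_cpus": 16, "load": 0.2},
]


def request_info(min_cpus):
    """Steps 1 and 2: ask for resources with enough free CPUs, get the response."""
    return [r for r in RESOURCES if r["free_cpus"] >= min_cpus]


def select_and_submit(job_name, candidates):
    """Step 3: the user's code picks the least-loaded candidate and submits."""
    if not candidates:
        raise LookupError("no resource satisfies the request")
    chosen = min(candidates, key=lambda r: r["load"])
    # A real client would hand the job to the middleware here.
    return {"job": job_name, "site": chosen["site"], "status": "submitted"}


candidates = request_info(min_cpus=8)
ticket = select_and_submit("simulation-1", candidates)
print(ticket)  # {'job': 'simulation-1', 'site': 'Site B', 'status': 'submitted'}
```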
The Near Future Grid • Customizable Grid Services built on defined Infrastructure APIs • automatic selection of resources • information products tailored to users • accountless processing • flexible interface: web based, command line, APIs • Resources are accessed via various intelligent services that access infrastructure APIs. • The result: the scientist and application developer can focus on science and not on systems management. [Figure: User and Application sit on Intelligent, Customized Middleware, which sits on Grid Middleware - Infrastructure APIs (service oriented), above the Infrastructure; a Network connects Site A and Site B.]
How the User Sees a Grid • A set of grid functions that are available as • Application programming interfaces (APIs) • Command-line functions • After authentication, functions can be used to • Spawn jobs on different processors with a single command • Access data on remote systems • Move data from one processor to another • Support communication between programs executing on different processors • Discover the properties of computational resources available on the grid using the grid information service • Use a broker to select the best place for a job to run and then negotiate the reservation and execution (coming soon) (Tom Hinke)
Many GRID Projects and Initiatives • Public forums: Computing Portals, Grid Forum, European Grid Forum, IEEE TFCC, GRID’2000, and more • Australia: Nimrod/G, EcoGrid and GRACE, DISCWorld • Europe: UNICORE, MOL, METODIS, Globe, Poznan Metacomputing, CERN Data Grid, MetaMPI, DAS, JaWS, and many more • Public Grid Initiatives: Distributed.net, SETI@Home, Compute Power Grid • USA: Globus, Legion, JAVELIN, AppLeS, NASA IPG, Condor, Harness, NetSolve, NCSA Workbench, WebFlow, EveryWhere, and many more • Japan: Ninf, Bricks, and many more • http://www.gridcomputing.com/
Nimrod - A Job Management System http://www.dgs.monash.edu.au/~davida/nimrod.html
Nimrod/G Architecture • Nimrod/G Clients (three shown) connect to the Nimrod Engine • The Nimrod Engine works with the Schedule Advisor, Trading Manager (TM), Persistent Store, Dispatcher, and Grid Explorer (GE) • TM talks to the Trade Servers (TS); GE talks to the Grid Information Services (GIS) • Middleware Services link these components to the GUSTO Test Bed, where each resource runs an RM & TS • RM: Local Resource Manager, TS: Trade Server
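Compute Power Market • User side: Application and Resource Broker • Resource Broker: Job Control Agent, Schedule Advisor, Grid Explorer, Trade Manager, Deployment Agent • Grid Information Server (queried by the Grid Explorer) • A Resource Domain: Trade Server (Charging Alg., Accounting), Resource Reservation, Resource Allocation, Other services, and resources R1, R2, … Rn • Trading takes place between the Trade Manager and the Trade Server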
Globus Toolkit • Grid computing middleware • Software between the hardware and high-level services • Basic libraries, services, command-line programs • The most common middleware used in grids • Integrated with Web Services
Globus Software Architecture • Grid FTP: get and put files, 3rd party copy, interactive file management, parallel transfers • Grid SSH: login, execute commands, copy files • Globus Resource Allocation Manager (GRAM): execute remote applications; stage executable, stdin, stdout, stderr; sits on job management systems (PBS, LSF, fork/exec) • Monitoring and Discovery Service (MDS): information about resources and services; built on LDAP, a distributed directory service • Grid Security Infrastructure (GSI): authentication, secure communication, single sign on, delegation of credentials, authorization; built on X.509 certificates (credentials for users, services, hosts) and SSL/TLS
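A job request of the kind GRAM handles (an executable plus its stdin, stdout, and stderr) is commonly written as a short job description string. The sketch below builds such a string in the style of the Resource Specification Language (RSL) used by pre-Web-Services GRAM. Treat the exact syntax as an assumption to check against the Globus documentation for the installed version; the helper names `to_rsl` and `_quote` are hypothetical, and the quoting is simplified (values containing double quotes are not handled).

```python
def _quote(text):
    """Wrap a value in double quotes if it contains a space (simplified)."""
    return f'"{text}"' if " " in text else text


def to_rsl(job):
    """Render a job description dict as an RSL-style string.

    Assumed format: '&' followed by one '(attribute=value ...)' group per
    attribute; list values become space-separated values.
    """
    parts = []
    for key, value in job.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        rendered = " ".join(_quote(str(v)) for v in values)
        parts.append(f"({key}={rendered})")
    return "&" + "".join(parts)


job = {
    "executable": "/bin/hostname",   # program to run on the remote resource
    "count": 2,                      # number of processes requested
    "stdout": "hostname.out",        # file that receives standard output
}
print(to_rsl(job))  # &(executable=/bin/hostname)(count=2)(stdout=hostname.out)
```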
Globus Deployment Architecture • User: user application/tool or Web portal • Globus client system: Grid FTP Client, GRAM Client, Grid SSH Client, MDS Client (clients are programs and libraries) • MDS server system: MDS GIIS • Globus server systems (two shown): each runs a Grid FTP Server, GRAM Server, Grid SSH Server, and MDS GRIS; one uses PBS and the other LSF
For More Information • Globus Project™ • www.globus.org • Grid Forum • www.gridforum.org • Book (Morgan Kaufmann) • www.mkp.com/grids