Grid Computing Fateha Khanam Bappee Dept. of Mathematics, Statistics & Computer Science St. Francis Xavier University Antigonish, NS April 05, 2012
Presentation Outline • What is Grid Computing • How a Grid Works • A View on Grid Computing • Grid concepts and components • Grid Computing Projects • ACENET • Limitations of Grid Computing
What is Grid Computing? • Grid computing (or the use of a computational grid) applies the resources of many computers in a network to a single problem at the same time, usually a scientific or technical problem that requires a great number of processing cycles or access to large amounts of data. • It is a form of distributed computing in which a "virtual supercomputer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks.
How a Grid Works • The term "grid computing" suggests a computing paradigm similar to an electric power grid - a variety of resources contribute power into a shared "pool" for many consumers to access on an as-needed basis. • Ideally the user does not know or care where the computing operation is being performed; the process is invisible to the user. • Middleware handles security, authentication, authorization, resource selection and routing of input and output seamlessly.
A View on Grid Computing • Exploiting underutilized resources • Parallel CPU capacity • Virtual resources and virtual organizations for collaboration • Access to additional resources • Resource balancing • Reliability
Exploiting underutilized resources • The easiest use of grid computing is to run an existing application on a different machine. Two conditions apply: • The application must be executable remotely and without undue overhead. • The remote machine must meet any special hardware, software, or resource requirements imposed by the application. Example: a batch job. • Unused disk drive capacity can be utilized as well. • Data can be replicated at various points in the grid.
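The two conditions above can be sketched as a simple matching step. This is a minimal illustration only: the `Machine` and `BatchJob` classes, their fields, and the `busy_threshold` value are hypothetical names chosen for this example, not part of any real grid middleware.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    # Hypothetical description of a grid machine (fields are assumptions).
    name: str
    cpu_cores: int
    memory_gb: int
    software: set = field(default_factory=set)
    cpu_utilization: float = 0.0  # 0.0 = idle, 1.0 = fully busy


@dataclass
class BatchJob:
    # Hypothetical description of what a batch job needs.
    name: str
    min_cores: int
    min_memory_gb: int
    required_software: set = field(default_factory=set)


def can_run(job, machine):
    """True if the machine meets the job's hardware and software requirements."""
    return (
        machine.cpu_cores >= job.min_cores
        and machine.memory_gb >= job.min_memory_gb
        and job.required_software <= machine.software  # subset test
    )


def pick_underutilized(job, machines, busy_threshold=0.5):
    """Return the least-utilized machine that can run the job, or None."""
    candidates = [
        m for m in machines
        if can_run(job, m) and m.cpu_utilization < busy_threshold
    ]
    return min(candidates, key=lambda m: m.cpu_utilization, default=None)
```

A real grid would also weigh the cost of shipping the job and its data to the remote machine (the "undue overhead" condition), which this sketch leaves out.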
Parallel CPU capacity • The common attribute among such uses is that the applications have been written to use algorithms that can be partitioned into independently running parts. • A CPU-intensive grid application can be thought of as many smaller subjobs, each executing on a different machine in the grid. • Barriers often exist to perfect scalability: • The algorithm can only be split into a limited number of independently running parts. • The parts may not be completely independent, which causes contention.
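As a rough illustration of partitioning a job into independent subjobs, the sketch below splits a sum of squares into chunks and runs each chunk separately. Threads from Python's standard library stand in for grid machines here, and the function names are hypothetical. The last function uses Amdahl's law, which the slide does not name, as one common way to quantify the scalability barrier when only part of the work can be split.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Partition [start, stop) into up to `parts` contiguous chunks."""
    size, extra = divmod(stop - start, parts)
    chunks, lo = [], start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        if hi > lo:  # skip empty chunks when parts > number of items
            chunks.append((lo, hi))
        lo = hi
    return chunks


def subjob(chunk):
    """One independent part of the work: sum of squares over its chunk."""
    lo, hi = chunk
    return sum(n * n for n in range(lo, hi))


def run_on_grid(start, stop, workers):
    """Run the subjobs concurrently; threads stand in for grid machines."""
    chunks = split_range(start, stop, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(subjob, chunks))


def amdahl_speedup(parallel_fraction, machines):
    """Amdahl's law: ideal speedup when only a fraction of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)
```

Because the subjobs share no state, their results can simply be added together; any shared state would introduce the contention the slide warns about.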
Virtual resources and virtual organizations for collaboration • The users of the grid can be organized dynamically into a number of virtual organizations, each with different policy requirements. These virtual organizations can share their resources collectively as a larger grid. • Sharing is not limited to files, but also includes many other resources, such as equipment, software, services, licenses, and others. These resources are virtualized to give them a more uniform interoperability among heterogeneous grid participants.
Access to additional resources • In addition to CPU and storage resources, a grid can provide access to other resources in greater numbers and capacity, for example increased bandwidth. • Some machines may have expensive licensed software installed that a user requires. The user's jobs can be sent to such machines, exploiting the software licenses more fully. • Devices can be accessed remotely.
Resource balancing • A grid federates a large number of resources contributed by individual machines into a greater total virtual resource. For applications that are grid-enabled, the grid can offer a resource balancing effect by scheduling grid jobs on machines with low utilization. • The load balancing can happen in two ways: • An unexpected peak can be routed to relatively idle machines in the grid. • If the grid is already fully utilized, the lowest-priority work being performed on the grid can be temporarily suspended, or even cancelled and performed again later, to make room for the higher-priority work.
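The two balancing cases can be sketched as a single placement function. This is a toy model under stated assumptions: `Job`, `Node`, the `slots` field, and the `waiting` list are hypothetical names invented for the example, and a real scheduler would track far more state.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    priority: int  # assumption: a higher number means more important


@dataclass
class Node:
    name: str
    slots: int  # how many jobs the node can run at once (must be >= 1)
    running: list = field(default_factory=list)

    def utilization(self):
        return len(self.running) / self.slots


def place(job, nodes, waiting):
    """Place a job; return the node name, or None if the job has to wait."""
    # Case 1: route the work to the least-utilized node if it has room.
    target = min(nodes, key=lambda n: n.utilization())
    if target.utilization() < 1.0:
        target.running.append(job)
        return target.name

    # Case 2: the grid is full, so find the lowest-priority running job.
    victim_node, victim = min(
        ((n, j) for n in nodes for j in n.running),
        key=lambda pair: pair[1].priority,
    )
    if victim.priority >= job.priority:
        waiting.append(job)  # nothing less important to displace
        return None

    # Suspend the victim to make room; it is performed again later.
    victim_node.running.remove(victim)
    waiting.append(victim)
    victim_node.running.append(job)
    return victim_node.name
```

Jobs placed on the `waiting` list would be resubmitted once capacity frees up; that step is omitted to keep the sketch short.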
Reliability • High-end conventional computing systems use expensive hardware to increase reliability. This builds a reliable system, but at great cost, due to the duplication of high-reliability components. • An alternative approach to reliability relies more on software technology than on expensive hardware. A grid is just the beginning of such technology. The systems in a grid can be relatively inexpensive and geographically dispersed.
Grid components (Resources) • Computation • The most common resource is computing cycles provided by the processors of the machines on the grid. The processors can vary in speed, architecture, software platform, and other associated factors, such as memory, storage, and connectivity. There are three primary ways to exploit the computation resources of a grid: • To run an existing application on an available machine on the grid rather than locally. • To use an application designed to split its work in such a way that the separate parts can execute in parallel on different processors. • To run an application that needs to be executed many times on many different machines in the grid.
Grid components (Resources) • Storage • Each machine on the grid usually provides some quantity of storage (memory or secondary storage) for grid use, even if temporary. • Memory attached to a processor usually has very fast access but is volatile. It is best used to cache data or to serve as temporary storage for running applications. • Many grid systems use mountable networked file systems, such as Andrew File System (AFS), Network File System (NFS), Distributed File System (DFS), or General Parallel File System (GPFS). These offer varying degrees of performance, security features, and reliability features. • Capacity can be increased by using the storage on multiple machines with a unifying file system.
Grid components (Resources) • Communication Capacity • This covers both communication within the grid and external communication. • Communications within the grid are important for sending jobs and their required data to points within the grid. • External communication, such as access to the Internet, can be valuable, for example when building search engines. • Redundant communication paths are sometimes needed to better handle potential network failures and excessive data traffic.
Grid components (Resources) • Job scheduling, reservation, and scavenging • Advanced grid systems include a job scheduler of some kind that automatically finds the most appropriate machine on which to run any given job that is waiting to be executed. Schedulers react to the current availability of resources on the grid. • In a scavenging grid system, any machine that becomes idle typically reports its idle status to the grid management node. The management node then assigns to this idle machine the next job whose requirements are satisfied by the machine's resources. • Grid resources can be reserved in advance for a designated set of jobs. This is done to meet deadlines and guarantee quality of service. When policies permit, resources reserved in advance can also be scavenged to run lower-priority jobs when they are not busy during a reservation period, yielding to the jobs for which they are reserved.
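The scavenging and reservation behaviour described above can be sketched as follows. All names here (`ManagementNode`, `QueuedJob`, `report_idle`, the `reservation` label) are hypothetical, and the only resource modelled is memory. The sketch also assumes a reserved job runs only on machines reserved for it, and it does not model a scavenged job yielding when a reserved job arrives later.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedJob:
    name: str
    min_memory_gb: int
    reservation: Optional[str] = None  # label of the reservation it belongs to


class ManagementNode:
    """Toy grid management node that hands jobs to idle machines."""

    def __init__(self, allow_scavenging_reserved=True):
        self.queue = []  # jobs waiting to run, in submission order
        self.allow_scavenging_reserved = allow_scavenging_reserved

    def submit(self, job):
        self.queue.append(job)

    def _take(self, wanted):
        """Remove and return the first queued job matching `wanted`."""
        for job in self.queue:
            if wanted(job):
                self.queue.remove(job)
                return job
        return None

    def report_idle(self, machine_name, memory_gb, reserved_for=None):
        """Called when a machine becomes idle; returns its next job or None."""
        if reserved_for is not None:
            # Jobs holding the reservation get first claim on the machine.
            job = self._take(
                lambda j: j.reservation == reserved_for
                and j.min_memory_gb <= memory_gb
            )
            if job is not None:
                return job
            if not self.allow_scavenging_reserved:
                return None  # policy forbids scavenging reserved machines
        # Scavenge: next unreserved job whose requirements the machine meets.
        return self._take(
            lambda j: j.reservation is None and j.min_memory_gb <= memory_gb
        )
```

The `machine_name` argument is unused in this sketch; a fuller version would record which machine received which job.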
Grid Computing Project (SETI@home) What is SETI@home? • SETI@home ("Search for Extraterrestrial Intelligence at home") is a large-scale distributed computing project using Internet-connected computers. • SETI@home uses millions of computers in homes and offices around the world to analyze radio signals from space. • SETI@home was developed by the Space Sciences Laboratory at the University of California, Berkeley, in the United States. • SETI@home shares its data source with SERENDIP (Search for Extraterrestrial Radio Emissions from Nearby Developed Intelligent Populations) and distributes the data via the Internet.
How SETI@home works • Collecting the data: • SERENDIP uses the telescope at Arecibo to collect data from outer space. • SETI@home uses a data recorder to record the data from SERENDIP on removable tape. • Distributing the data: • SETI@home divides the data into fixed-size work units. • SETI@home distributes these work units via the Internet from its servers to a client program. • The client program computes a result, returns it to the server, and gets another work unit.
How SETI@home works (Contd.) • Collecting and analyzing the computed results: • The client program repeatedly gets a work unit from the data/result server and analyzes it. • The client program returns the result (a list of candidate signals) to the SETI@home server complex. • The results are recorded and analyzed at the server complex.
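The work-unit cycle described on these two slides can be sketched as a small loop. This is not SETI@home or BOINC code: the `Server` class, the `analyze` function, and the threshold test are hypothetical stand-ins, and a simple "sample above threshold" check replaces the real signal analysis.

```python
def make_work_units(samples, unit_size):
    """Divide recorded data into fixed-size work units (the last may be shorter)."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


class Server:
    """Toy data/result server: hands out work units and records results."""

    def __init__(self, samples, unit_size):
        self.pending = list(enumerate(make_work_units(samples, unit_size)))
        self.results = {}

    def get_work(self):
        """Return (unit_id, unit), or None when no work is left."""
        return self.pending.pop(0) if self.pending else None

    def report(self, unit_id, candidates):
        self.results[unit_id] = candidates


def analyze(unit, threshold):
    """Stand-in for signal analysis: flag samples above a threshold."""
    return [x for x in unit if x > threshold]


def client_loop(server, threshold):
    """Repeatedly fetch a work unit, analyze it, and return the result."""
    while True:
        work = server.get_work()
        if work is None:
            break
        unit_id, unit = work
        server.report(unit_id, analyze(unit, threshold))
```

In the real project many clients run this loop at once on different machines; the sketch runs a single client in one process.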
How the SETI@home project works on a personal computer • Ref: 1. http://setiathome.berkeley.edu/ 2. http://boinc.berkeley.edu/download.php • Download BOINC (Berkeley Open Infrastructure for Network Computing) and install it on the computer. BOINC is a program that lets us donate our idle computer time to science projects like SETI@home. • Then add a project by name: SETI@home. • Once BOINC is running, it does the project's work on my computer.
The BOINC Manager, or GUI, provides a graphical interface that lets me control the core client
SETI@home will use part of my computer's CPU power, disk space, and network bandwidth.
SETI@home display, showing the power spectrum being computed (bottom) and the best signal found so far (left).
Some Other Grid Computing Projects • CERN • Earth System Grid (ESG) • GRID.ORG • Access Grid
CERN • The LHC Computing Grid is a distribution network designed by CERN to handle the massive amounts of data produced by the Large Hadron Collider (LHC). The LHC is the world's largest and highest-energy particle accelerator, intended to collide opposing particle beams. • From 2005 onward, detectors at the Large Hadron Collider at CERN, the European Laboratory for Particle Physics, were projected to produce several petabytes of data per year, a million times the storage capacity of a desktop computer. • Just the basic data analysis requires 20 teraflops of computing power (by comparison, the fastest supercomputer when this estimate was made delivered 3 teraflops).
Earth System Grid (ESG) • The primary goal of ESG is to address the formidable challenges associated with enabling analysis of and knowledge development from global Earth System models. • High-resolution, long-duration simulations performed with advanced DOE SciDAC/NCAR climate models will produce tens of petabytes of output. To be useful, this output must be made available to global change impacts researchers nationwide, both at national laboratories and at universities, other research laboratories, and other institutions. • ESG-I was extremely successful in two regards: it developed a rich technology base that is now seeing use in multiple other disciplines, and it demonstrated the feasibility and power of using a Grid environment for climate analysis applications.
Access Grid • Access Grid is a collection of resources and technologies that enables large format audio and video based collaboration between groups of people in different locations. • It is a collection of resources, including multimedia large-format displays, presentation and interactive environments, and interfaces with grid computing middleware and visualization environments. In simple terms, it is advanced videoconferencing using big displays and with multiple simultaneous camera feeds at each node (site).
Grid - ACENET A grid example familiar to us is ACEnet ("Atlantic Computational Excellence Network"). It is made up of geographically dispersed clusters located at different universities in the Atlantic region. There are nine partner institutions: • Memorial University of Newfoundland, NL • Saint Francis Xavier University, NS • Saint Mary's University, NS • University of New Brunswick, NB • Dalhousie University, NS • Mount Allison University, NB • University of Prince Edward Island, PE • Acadia University, NS • Cape Breton University, NS
ACENET - Hardware Resources The ACEnet hardware resources are located at several universities and include the following clusters: • Brasdor (brasdor.ace-net.ca) at StFX • Fundy (fundy.ace-net.ca) at UNB • Mahone (mahone.ace-net.ca) at Saint Mary's • Placentia (placentia2.ace-net.ca) at MUN • Glooscap (glooscap.ace-net.ca) at Dal • Courtenay (courtenay.ace-net.ca) at UNBSJ
ACENET - Software Resources A large number of software packages in different categories are installed on the ACEnet grid. Some examples: • Scientific Computing Packages: DiVinE-mc, GAUSSIAN, Maple, MATLAB, Mathematica, Octave, Spin, etc. • Graphics and Visualization: feh, ferret, Molden, NCAR graphics, VTK • Parallel APIs: BSPonMPI, MPI, OpenMP, pyMPI, BLACS • Compilers and Languages: Portland Group compilers (C, C++, Fortran), Sun Studio 12 compilers (C, C++, Fortran), GNU compilers (C, C++, Fortran, Java), Java, 64-bit VM, Mono (.NET), Perl, Python, Ruby, and many others. • Scientific Libraries: ACML, PGI, BLAS, FFTW, GMP, GSL, HDF4, HDF5, NetCDF, Sun Performance Library, SS12, and szip
Limitations of Grid Computing • Most existing applications that access grid services require the user to type difficult commands, often through a command-line interface. • Not every application is suitable for running on a grid, and some kinds of application simply cannot be parallelized. • For others, it can take a large amount of work to modify them to achieve faster throughput. • The configuration of a grid can greatly affect the performance, reliability, and security of an organization's computing infrastructure.
References • http://www.grid.org • http://www.gridcomputingplanet.com/ • http://www.ibm.com/grid • IBM Redbooks Paper, Fundamentals of Grid Computing • Sample presentation of previous year • http://en.wikipedia.org/wiki/Grid_computing • http://en.wikipedia.org/wiki/ACEnet • http://www.ace-net.ca/