COMP3019 Coursework: Introduction to M-grid
Steve Crouch, s.crouch@software.ac.uk, stc@ecs
School of Electronics and Computer Science
Objectives
• To equip students to drive a lightweight grid implementation to solve a problem that can benefit from using grid technology.
• To develop an understanding of the basic mechanisms used to solve such problems.
• To develop a general architectural and operational understanding of typical production-level grid software.
• To develop the programming skills required to drive typical services on a production-level grid.
Overview
• Part 1: m-grid
  • m-grid: lightweight software illustrating grid concepts in use
  • Develop a program with m-grid’s Java API to solve a simple problem, submit it to m-grid with input data, collect results
• Part 2: Google MapReduce & GridSAM
  • MapReduce: framework for distributed processing of large datasets using many computers
  • GridSAM: job submission web service interface to a computational resource (e.g. compute cluster, single machine)
  • Extend code stubs to submit jobs to GridSAM and monitor them to completion
  • Extend pseudocode that implements a basic MapReduce framework
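To make the MapReduce bullet concrete, here is a minimal single-process sketch of the map and reduce phases, using word counting as the example problem. It is an illustration only: the class and method names (WordCountSketch, map, reduce, wordCount) are hypothetical, it does not use Google's or any other real MapReduce API, and it is not the coursework pseudocode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: everything runs in one process, no real framework is involved.
class WordCountSketch {
    /** Map phase: emit a (word, 1) pair for every word in one input chunk. */
    static List<Map.Entry<String, Integer>> map(String chunk) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : chunk.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    /** Shuffle and reduce phase: group the pairs by key and sum the values per key. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    /** Runs map over every chunk, then reduces all emitted pairs together. */
    static Map<String, Integer> wordCount(List<String> chunks) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String chunk : chunks) {
            // In a distributed setting each map call could run on a different machine.
            all.addAll(map(chunk));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("the grid", "the cluster and the grid")));
    }
}
```

The map calls are independent of each other, which is what would allow a real framework to spread them across many computers; only the reduce step needs to see all pairs for a given key.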
Where to get stuff/help?
• Can obtain coursework materials from website
  • Ready for Wednesday
• Software documentation
• Coursework help lecture 19th March
• Myself: s.crouch@software.ac.uk
  • Building 32: Level 4 lab 4067 Bay 23
The Problem
• Basically, want to run a compute-intensive task
• Don’t have enough resources to run the job locally
  • At least, not to return results within a sensible timeframe
• Would like to use another, more capable resource
Distributed Computing in Olden Times
• Small number of ‘fast’ computers
  • Very expensive
  • Centralised
  • Used nearly all the time
  • Time allocations for users
  • Not updated often
• Punched cards
  • Wait time huge
• MailNet, SneakerNet, TyperNet, etc…
• Mainframes
  • Cray-1 1976 - $8.8 million, 160 megaflops, 8MB memory
Images: Univac 1710; Cray X-MP (Cray-1 successor). Credits: Michael L. Umbricht and Carl R. Friend; brewbooks
The Present…
• Now… large number of slow computers:
  • Cheap
  • Distributed
    • Computation
    • Ownership
  • Not used all the time
  • Exclusive access to users
  • Updated often
  • e.g. desktop computers, PDAs, mobile phones
• Low utilisation of computing power
  • e.g. institutional/university resources…
It’s About Scaling Up…
• Then… the march towards localisation of computation, the Personal Computer
• Computational Science develops in laboratories
• Is this changing again?
• Compute and data – you need more, you go somewhere else to get it
Images: nasaimages, Extra Ketchup, Google Maps, Dave Page
The Grid - a Reminder
• The grid – many definitions!
  “Grid computing offers a model for solving massive computational problems by making use of the unused CPU cycles of large numbers of disparate, often desktop, computers treated as a virtual cluster embedded in a distributed telecommunications infrastructure” – Wikipedia
  “A service for sharing computer power and data storage capacity over the Internet.” – CERN (European Organisation for Nuclear Research)
• Two components of grid computing:
  • Computational/data resource – e.g. computational cluster, supercomputer, desktop machine
  • Infrastructure for externalising that resource to others
Some Examples…
• Grid (i.e. internet-accessible) examples:
  • SETI@Home - http://setiathome.ssl.berkeley.edu/
    • Process data from Arecibo Radio Telescope, Puerto Rico
    • 2 million volunteers installed software
  • Univa.org - http://www.univa.org/
    • Projects such as Cancer Research, Smallpox
    • 2.5 million volunteer systems
    • Sells processing time to organisations
• Computational resource (i.e. intranet-accessible):
  • Cluster managers, supercomputer, single machine
The Idea - as a Provider…
• Goal: I want others to access my resources & applications
• I want to provide secure controlled access to:
  • My applications:
    • Specify who can access which applications
  • My computational or data resources:
    • I can limit external usage of my resources
• Provides an interface that allows remote users to access my resources
• Enable collaboration with other partners
The Idea - as a User (or Client)
• Goal: I want to use other resources & applications
• Through a network of service providers I can…:
  • Gain access to applications that I do not have installed locally
  • Use remote machines [larger resource] with more CPU, memory or storage
  • Process larger problem sizes
  • Transparently switch between different service providers
  • No exposure to underlying OS, queuing policy, disk layout etc.
Cluster Computing & the Grid
• Grid is predominantly built on Cluster Computing solutions
[Diagram labels: Grid; Cluster Computing; University A; University B; University C]
The General Idea…
[Diagram labels: Client, Client, …; Coordinator; Executor, Executor, …]
• Abstract ‘virtualisation’ of local network resources
• Infrastructure manages many machines
• Visualisation as a single resource
• Submitted jobs get put on queue(s)
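The queueing idea above can be sketched in a few lines of Java. This is a toy, single-process illustration under stated assumptions: the names QueueSketch, Coordinator, Job, submit and dispatchTo are hypothetical and do not come from any real grid or cluster software, and executors are represented only by their names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical names throughout: an illustration, not any real grid API.
class QueueSketch {
    /** A job is a label plus some work that yields a result. */
    interface Job {
        String name();
        int run();
    }

    /** The coordinator owns the queue and hands the next job to whichever executor asks. */
    static class Coordinator {
        private final Queue<Job> queue = new ArrayDeque<>();
        private final List<String> log = new ArrayList<>();

        /** Clients call this; they never see which machine will run the job. */
        void submit(Job job) {
            queue.add(job);
        }

        /** Runs the oldest queued job on behalf of the named executor; false if the queue is empty. */
        boolean dispatchTo(String executorName) {
            Job job = queue.poll();
            if (job == null) {
                return false;
            }
            log.add(executorName + " ran " + job.name() + " -> " + job.run());
            return true;
        }

        List<String> log() {
            return log;
        }
    }

    /** Example workload: sum the integers 1..n. */
    static Job sumTo(String name, int n) {
        return new Job() {
            public String name() { return name; }
            public int run() {
                int total = 0;
                for (int i = 1; i <= n; i++) total += i;
                return total;
            }
        };
    }

    /** Three jobs shared between two executors in round-robin order. */
    static List<String> demo() {
        Coordinator coordinator = new Coordinator();
        coordinator.submit(sumTo("job-1", 10));
        coordinator.submit(sumTo("job-2", 100));
        coordinator.submit(sumTo("job-3", 4));
        String[] executors = {"executor-A", "executor-B"};
        int next = 0;
        while (coordinator.dispatchTo(executors[next % executors.length])) {
            next++;
        }
        return coordinator.log();
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The client-facing call is only submit; which executor ends up running a job is decided by the coordinator, which is the sense in which many machines appear as a single resource.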
Condor – Background
• Begun in 1988, based on Remote-Unix (RU) project
• Predominantly makes use of idle cycles on machines
Condor Components
• Four main machine ‘roles’ (daemons):
  • Submit Client (condor_schedd): used to submit resource requests, monitor, modify and delete jobs.
  • Central Manager, Server
    • condor_collector: collects information about pool resources.
    • condor_negotiator: negotiates (match-makes) between resources and resource requests.
  • Job Executor (condor_startd): executes jobs, advertises resources. Enforces local policy.
  • (Checkpoint Server (condor_ckpt_server): services requests to store and retrieve checkpoint files.)
Condor Architecture
[Diagram labels: Client: Submit client (condor_schedd, condor_shadow…), Queue; Server: Collector (condor_collector), Negotiator (condor_negotiator); Executor (condor_startd, condor_starter…), Executor, …; Queue; Shared Disk. Steps 1 to 7 are marked on the diagram.]
• (1) Client submits job (executable + input data) to local queue
• (2) Client schedd advertises job request to server collector
• (3) Server negotiator gets next priority request from collector
• (4) Negotiator negotiates with client schedd to match resource/job
• (5) Client removes job from queue and sends it to executor
• (6) Job runs on executor
• (7) Job output results returned to client
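The match-making in steps 3 and 4 can be illustrated with a toy Java sketch. It makes simplifying assumptions that are not Condor's actual behaviour or API: requests carry only a priority and a memory requirement, machines advertise only their memory, and each machine runs one job. The names MatchmakingSketch, JobRequest, MachineAd and negotiate are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of match-making; not Condor's ClassAd mechanism or API.
class MatchmakingSketch {
    /** What a submit client advertises about a queued job. */
    static final class JobRequest {
        final String name;
        final int priority;   // higher value is served first (an assumption of this sketch)
        final int memoryMb;   // minimum memory the job needs
        JobRequest(String name, int priority, int memoryMb) {
            this.name = name;
            this.priority = priority;
            this.memoryMb = memoryMb;
        }
    }

    /** What an executor advertises about itself. */
    static final class MachineAd {
        final String name;
        final int memoryMb;
        MachineAd(String name, int memoryMb) {
            this.name = name;
            this.memoryMb = memoryMb;
        }
    }

    /** Takes requests in priority order and pairs each with the first free machine that fits. */
    static List<String> negotiate(List<JobRequest> requests, List<MachineAd> machines) {
        List<JobRequest> pending = new ArrayList<>(requests);
        pending.sort((a, b) -> Integer.compare(b.priority, a.priority));
        List<MachineAd> free = new ArrayList<>(machines);
        List<String> matches = new ArrayList<>();
        for (JobRequest job : pending) {
            MachineAd chosen = null;
            for (MachineAd machine : free) {
                if (machine.memoryMb >= job.memoryMb) {
                    chosen = machine;
                    break;
                }
            }
            if (chosen == null) {
                matches.add(job.name + " -> (waiting)"); // stays queued until a suitable machine appears
            } else {
                free.remove(chosen);                     // one job per machine in this sketch
                matches.add(job.name + " -> " + chosen.name);
            }
        }
        return matches;
    }

    static List<String> demo() {
        List<JobRequest> requests = List.of(
            new JobRequest("render", 1, 512),
            new JobRequest("simulate", 5, 2048),
            new JobRequest("index", 3, 4096));
        List<MachineAd> machines = List.of(
            new MachineAd("node-1", 1024),
            new MachineAd("node-2", 2048));
        return negotiate(requests, machines);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this example "simulate" is matched first because it has the highest priority, "index" waits because no advertised machine has enough memory, and "render" takes the remaining machine.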
M-grid: An overview
Computational Grids - in General
[Diagram labels: Client, Client, …; Coordinator; Executor, Executor, …]
• Users supply tasks to be performed via client
• Execution nodes contribute processing power
• Coordinator node sends tasks to execution nodes, ensuring results returned
• Existing grid tech. sophisticated -> significant complexity
  • To what extent can this be reduced?
Java Applets?
• How about Java applets as a program unit?
  • Browsers could act as execution nodes
• Security concerns?
  • Web browsers execute foreign code
  • Java applets executed within a ‘sandbox’ virtual machine
    • Stringent security restrictions imposed
    • In-built security configuration in browsers
    • Applet can only contact originating server
  • Risk significantly reduced
M-grid: A Lightweight Grid I
[Diagram labels: Client, Client, …; Coordinator; Executor, Executor, …]
• M-grid:
  • Execution node = Java-applet enabled browser
  • Client = browser
  • Coordinator = web server
• Tasks distributed as Applets in web pages
  • Execution node browser opens web page on server
  • Runs task applet
  • Uploads results to server
M-grid: Overview
• Implemented on:
  • Microsoft’s IIS (Internet Information Server) using ASP
  • Apache Tomcat – we’ll use this one!
• Client
  • Develops applet class as extension to MGridApplet class
  • Can run applet locally in appletviewer for testing
  • Compiles and packages applet with input parameters file into a jar file
  • Submits jar to web server via JobSubmit web page
  • Eventually collects results via ViewJobs web page
• Execution node
  • Requests a job via JobRequest page
  • Applet submits results from job using SubmitResults page
• Security provided by session authentication
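The shape of that workflow (a task class plus an input parameters file, run by an execution node that hands back a result) can be sketched without applets. Everything here is an assumption made for illustration: Task is a hypothetical stand-in and is NOT the real MGridApplet class, whose methods are not shown in these slides; the parameter names start and end and the key=value file format are also invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical stand-in for the task/parameters/result pattern; not the m-grid API.
class TaskLifecycleSketch {
    /** Stand-in for a task base class. The real MGridApplet interface may look quite different. */
    static abstract class Task {
        abstract String compute(Properties params);
    }

    /** Example task: sums the integers from 'start' to 'end' inclusive. */
    static class RangeSum extends Task {
        @Override
        String compute(Properties params) {
            long start = Long.parseLong(params.getProperty("start"));
            long end = Long.parseLong(params.getProperty("end"));
            long total = 0;
            for (long i = start; i <= end; i++) {
                total += i;
            }
            return Long.toString(total);
        }
    }

    /** Plays the execution node: load the parameters, run the task, return the result to upload. */
    static String runAsExecutionNode(Task task, String parameterFileContents) {
        Properties params = new Properties();
        try {
            params.load(new StringReader(parameterFileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return task.compute(params);
    }

    public static void main(String[] args) {
        String result = runAsExecutionNode(new RangeSum(), "start=1\nend=1000\n");
        System.out.println("result to upload: " + result);
    }
}
```

Keeping the computation separate from parameter loading and result delivery mirrors the split on the slide: the client writes only the task and its parameters, while requesting jobs and uploading results is the job of the surrounding infrastructure.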