Virtual EZ Grid
AAA/Switch project
Nabil Abdennadher
HES-SO, hepia (Geneva)
Virtual EZ Grid at a glance
• AAA/Switch funding
• Partners: UniGE, USI, UniNE, HES-SO
• Start/End dates: February 2009 - July 2010
• Objectives:
  • Build a sustainable desktop Grid platform based on a Volunteer Computing (VC) middleware
  • Evaluate it in a real-world setting with two medical applications
Virtual EZ Grid objectives
• Infrastructure: build a grid platform with more than 1000 PCs
  • Non-dedicated nodes
  • Nodes belong to several institutions
• Reliability: manage the volatility of the “worker” nodes
  • Checkpointing, migration, restarting, etc.
• Security: guarantee the security of resource providers
• Negotiation Model: provide a resource-credit system
• Applications: two medical packages
  • NeuroWeb: build neuronal maps extracted from brain measurements
  • MedGift: medical image analysis application
PLAN
• Virtual EZ Grid ingredients
• JOpera
• XtremWeb-CH
• EZ Grid
• Applications
Virtual EZ Grid ingredients
• JOpera (developed at USI)
  • A grid workflow management system: http://www.jopera.org
• XtremWeb-CH (developed at HES-SO)
  • A volunteer computing platform: www.xtremwebch.net
• EZ Grid (developed at UniNE and UniGE)
  • Based on virtualization
  • Job checkpointing
  • Migration and restarting
• ARC
Virtual EZ Grid architecture
[Figure labels: JOpera, XtremWeb-CH (XWCH), EZ-Grid, ARC, Infrastructure]
JOpera
• Make it easy to build Grid and VC applications composed of multiple jobs
• “Provide the scientist with a platform that takes care of all data handling and record keeping chores so that the user can concentrate on the science and not computer science”
(Based on slides by Cesare Pautasso)
Drag, Drop and Connect
(Based on slides by Cesare Pautasso)
Run, Monitor, Steer and Debug
(Based on slides by Cesare Pautasso)
XtremWeb-CH at a glance
[Figure labels: Consumer (users’ applications) with an Application made of Job0, Job1, Job2 and Job3; Service Request; XtremWeb-CH coordinator; Work Request; Work Result; Providers (workers), shown as PCs]
Communication protocol
[Figure: messages exchanged between a Worker and the XtremWeb-CH coordinator: Work Request, Work Alive, Work Result]
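The three messages on this slide can be read as a pull-based loop: the worker asks for work, signals that it is alive, and sends back the result. The sketch below is only an illustration under that reading. It uses an in-memory coordinator, and the names `Coordinator`, `work_request`, `work_alive`, `work_result` and `requeue_silent_jobs` are hypothetical, not the XtremWeb-CH API. Re-queuing a job whose worker has gone silent is also an assumption.

```python
import time
from collections import deque


class Coordinator:
    """Toy in-memory stand-in for a coordinator (hypothetical, not XWCH code)."""

    def __init__(self, jobs, timeout=5.0):
        self.pending = deque(jobs)   # (job_id, payload) pairs waiting for a worker
        self.running = {}            # job_id -> time of the last alive signal
        self.payloads = {}           # job_id -> input of a running job
        self.results = {}            # job_id -> result
        self.timeout = timeout

    def work_request(self):
        """A worker pulls a job; returns (job_id, payload) or None."""
        if not self.pending:
            return None
        job_id, payload = self.pending.popleft()
        self.running[job_id] = time.monotonic()
        self.payloads[job_id] = payload
        return job_id, payload

    def work_alive(self, job_id):
        """Heartbeat from the worker that holds job_id."""
        if job_id in self.running:
            self.running[job_id] = time.monotonic()

    def work_result(self, job_id, result):
        """The worker returns the result of job_id."""
        self.running.pop(job_id, None)
        self.payloads.pop(job_id, None)
        self.results[job_id] = result

    def requeue_silent_jobs(self, now=None):
        """Assumed policy: re-queue jobs whose worker stopped signalling."""
        now = time.monotonic() if now is None else now
        for job_id, last_seen in list(self.running.items()):
            if now - last_seen > self.timeout:
                del self.running[job_id]
                self.pending.append((job_id, self.payloads.pop(job_id)))


def worker_loop(coordinator, compute):
    """Pull jobs until none are left, signalling alive before each computation."""
    while (work := coordinator.work_request()) is not None:
        job_id, payload = work
        coordinator.work_alive(job_id)
        coordinator.work_result(job_id, compute(payload))


coordinator = Coordinator([(0, 2), (1, 3)])
worker_loop(coordinator, lambda x: x * x)
print(coordinator.results)
```

Because the worker initiates every exchange, this style of protocol suits volunteer PCs that cannot accept incoming connections.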
XtremWeb-CH Coordinator
[Figure labels: Scheduler; XWCH DB; Worker and Warehouse services; User services; Admin. service; User application connected through the Python API, C/C++ API and Java API; Worker and Warehouse exchanging Work request, Work Alive and Work Result messages]
User application (jobs generator)
[Figure: graph of Job0, Job1, Job2 and Job3]

int main ()
{
    int Job0, Job1, Job2, Job3;
    …
    Job0 = addjob (…);
    /* wait for Job0 before submitting the next jobs */
    while (getJobStatus (Job0, …) != COMPLETE);
    Job1 = addjob (…);
    Job2 = addjob (…);
    /* keep waiting while either Job1 or Job2 is still incomplete */
    while (getJobStatus (Job1, …) != COMPLETE || getJobStatus (Job2, …) != COMPLETE);
    Job3 = addjob (…);
    while (getJobStatus (Job3, …) != COMPLETE);
    getJobFileOut (Job3, …);
}
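The intended ordering of the job generator above (Job0 first, then Job1 and Job2 side by side, then Job3 once both are complete) can be tried locally. The sketch below is a local analogue built on Python's standard `concurrent.futures` thread pool. It does not use XtremWeb-CH, and `run_workflow` and `toy_task` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_workflow(executor, task):
    """Run Job0, then Job1 and Job2 in parallel, then Job3."""
    job0 = executor.submit(task, "Job0", [])
    out0 = job0.result()                       # block until Job0 is complete

    job1 = executor.submit(task, "Job1", [out0])
    job2 = executor.submit(task, "Job2", [out0])
    wait([job1, job2])                         # block until BOTH are complete

    job3 = executor.submit(task, "Job3", [job1.result(), job2.result()])
    return job3.result()                       # plays the role of getJobFileOut


def toy_task(name, inputs):
    """Hypothetical job body: records which outputs it consumed."""
    return f"{name}({','.join(inputs)})"


with ThreadPoolExecutor(max_workers=2) as pool:
    final = run_workflow(pool, toy_task)
print(final)
```

Blocking on futures replaces the busy-wait loops on `getJobStatus`. Whether the real API offers a blocking call is not stated on the slide.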
EZ Grid
• Isolates XWCH jobs from “local” jobs
  • Why? Guarantee the privacy of the providers
  • How? Use virtualization technology
• Supports checkpointing and migration
  • Why? Manage the volatility of nodes
  • How? Remotely monitor the XWCH virtual machine
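EZ Grid checkpoints at the level of the virtual machine. The sketch below swaps in a much simpler technique, application-level checkpointing with `pickle`, only to illustrate the checkpoint, interrupt and restart cycle. The function name, the checkpoint file and the `stop_after` switch that simulates a vanishing node are all hypothetical.

```python
import os
import pickle
import tempfile


def run_with_checkpoints(n_steps, path, stop_after=None):
    """Sum 0..n_steps-1, saving state after every step; resume if a checkpoint exists."""
    state = {"step": 0, "total": 0}
    if os.path.exists(path):
        with open(path, "rb") as f:
            state = pickle.load(f)             # restart from the last checkpoint
    executed = 0
    while state["step"] < n_steps:
        if stop_after is not None and executed == stop_after:
            return None                        # simulate the node disappearing
        state["total"] += state["step"]
        state["step"] += 1
        executed += 1
        with open(path, "wb") as f:
            pickle.dump(state, f)              # checkpoint after each step
    return state["total"]


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "job.ckpt")
    first = run_with_checkpoints(10, ckpt, stop_after=4)   # interrupted run
    second = run_with_checkpoints(10, ckpt)                # restart elsewhere
print(first, second)
```

The second call does not repeat the four steps already done, which is the point of checkpointing on volatile nodes.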
EZ Grid on a worker node
[Figure labels: EZ Grid coordinator; XWCH coordinator; Warehouse; XWCH worker node with OS (Windows), Local application, EZ Grid module and VM Manager; XWCH VM1 and XWCH VM2, each with OS (Linux), XWCH and Application; messages Work request, Work Alive, Work Result]
PLAN
• Virtual EZ Grid ingredients
• JOpera
• XtremWeb-CH
• EZ Grid
• Applications
Applications
• NeuroWeb: build neuronal maps extracted from brain measurements
• MedGift: medical image analysis and retrieval application
NeuroWeb
• Objective
  • Reconstruction of the electromagnetic brain map
  • Which neuron is responsible for what?
    • Epileptic seizures (*)
    • Parkinson's disease
    • Alzheimer's disease
    • Etc.
• Why?
  • Avoid invasive surgeries
MEG scanner
• Magneto-EncephaloGraphy (MEG) scanner: provides temporal information (functional data)
• Number of sensors: 256
• dt = 1 millisecond
[Figure: neuronal activity recorded by the MEG scanner]
(Based on slides by Cédric Bilat)
NeuroWeb: a large-scale optimization problem?
• Number of sensors: 256
• Number of neurons: 60’000
(Based on slides by Cédric Bilat)
How does the system work?
• Mapping of the MEG signals onto voxels of the MRI image
[Figure labels: MEG, MRI]
(Based on slides by Cédric Bilat)
The Algorithm
• Start with a random matrix A_0
• In each step i, calculate A_i = F(A_{i-1}, functional data, anatomic data)
• Stop when A_i = A_{i-1}
[Figure labels: ~16 Mb, ~360 Mb]
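The loop described on this slide is a fixed-point iteration. The sketch below shows its shape only. The real function F is not given in the slides, so `toy_F` is a made-up contraction, a flat list stands in for the matrix, the data values are invented, and the exact test A_i = A_{i-1} is replaced by a tolerance because floating-point iterates rarely become exactly equal.

```python
import random


def iterate_to_fixed_point(F, a0, tol=1e-9, max_steps=1000):
    """Apply A_i = F(A_{i-1}) until two successive iterates agree within tol."""
    a_prev = a0
    for step in range(1, max_steps + 1):
        a_next = F(a_prev)
        diff = max(abs(x - y) for x, y in zip(a_next, a_prev))
        if diff <= tol:
            return a_next, step
        a_prev = a_next
    raise RuntimeError("no convergence within max_steps")


# Toy stand-ins for the functional and anatomic data (hypothetical values).
functional = [1.0, 2.0, 3.0]
anatomic = [0.5, 0.5, 0.5]


def toy_F(a):
    """Made-up contraction whose fixed point is functional[k] * anatomic[k]."""
    return [0.5 * x + 0.5 * f * w for x, f, w in zip(a, functional, anatomic)]


random.seed(0)
a0 = [random.random() for _ in functional]      # random start, as on the slide
a_star, steps = iterate_to_fixed_point(toy_F, a0)
print([round(x, 6) for x in a_star], steps)
```

Each step depends on the previous one, so the steps themselves cannot run in parallel. Any parallelism has to come from splitting the work inside a single evaluation of F.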
Gridification
[Figure: chain of iterations A_0, A_1, …, A_{i-1}, A_i]
Data persistence
• Why?
  • Avoid loading/storing data from/to the hard disk (HD)
• How?
  • Data remain in main memory even after the end of the task
[Figure labels: Worker node with HD, Main Memory, XWCH Worker, XWCH Job and Persistent server]
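The idea of a persistent server can be illustrated with a small in-memory cache that outlives individual tasks. This is a toy model, not the XWCH implementation: `PersistentServer`, `fake_disk_load` and the data set name are hypothetical, and the counter only shows how many disk reads are avoided.

```python
class PersistentServer:
    """Keeps loaded data in main memory across tasks (toy model)."""

    def __init__(self, load_from_disk):
        self._load = load_from_disk
        self._memory = {}
        self.disk_reads = 0

    def get(self, name):
        """Return the data set, reading it from disk only the first time."""
        if name not in self._memory:
            self._memory[name] = self._load(name)
            self.disk_reads += 1
        return self._memory[name]


def fake_disk_load(name):
    """Stand-in for an expensive read from the hard disk."""
    return list(range(5))


server = PersistentServer(fake_disk_load)
results = []
for task_id in range(3):                 # three successive tasks on one worker
    data = server.get("anatomic")
    results.append(sum(data) + task_id)
print(server.disk_reads, results)
```

For an iterative algorithm that reuses the same large input at every step, only the first task on a node pays the loading cost.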
Virtual EZ Grid today
• Infrastructure: around 500 workers
  • HES-SO (Geneva + Yverdon): ~250
  • UniGE: ~200
  • UniNE: ~20
  • Univ. Franche-Comté (France): ~30
• Reliability: the implementation of EZ Grid is in progress (50%)
• Security: fully operational, based on SWITCHaai (Shibboleth-based AAI)
• Negotiation Model: in progress (30%)
• Applications
  • NeuroWeb: a prototype (proof of concept) is already deployed
  • MedGift: gridification will start soon (January 2010)
Virtual EZ Grid: links with other projects
[Figure labels: Swiss Grid Portal, Virtual EZ-Grid, Swiss Multi-Science Computing Grid, Infrastructure]
Model 1
• Prices are fixed by a “central” agency.
• Each institution receives an initial credit, which:
  • depends on the quantity of resources provided by the institution
  • can be spent on usage of the platform
• The client (from a given institution) chooses the priority of its jobs. The goal is to:
  • minimize the cost of execution
  • execute the application as soon as possible
• The model determines the “best” prices, which optimize the use of the platform.
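A resource-credit scheme of this kind can be sketched as a simple ledger. Everything numeric below is an assumption made for illustration: the slides give no price, no credit formula and no priority levels, so `PRICE_PER_CPU_HOUR`, `CREDIT_PER_PC`, `PRIORITY_FACTOR` and the `Institution` class are hypothetical.

```python
from dataclasses import dataclass, field

PRICE_PER_CPU_HOUR = 1.0                      # fixed centrally (assumed value)
CREDIT_PER_PC = 100.0                         # assumed initial credit per PC provided
PRIORITY_FACTOR = {"low": 1.0, "high": 2.0}   # assumed: higher priority costs more


@dataclass
class Institution:
    name: str
    pcs_provided: int
    credit: float = field(init=False)

    def __post_init__(self):
        # Initial credit depends on the quantity of resources provided.
        self.credit = CREDIT_PER_PC * self.pcs_provided

    def charge(self, cpu_hours, priority):
        """Debit the cost of a job; refuse it if the credit is insufficient."""
        cost = PRICE_PER_CPU_HOUR * cpu_hours * PRIORITY_FACTOR[priority]
        if cost > self.credit:
            raise ValueError("insufficient credit")
        self.credit -= cost
        return cost


inst = Institution("institution A", pcs_provided=20)
cost = inst.charge(cpu_hours=50, priority="high")
print(cost, inst.credit)
```

The client's trade-off appears in `charge`: a higher priority runs sooner but consumes credit faster.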
Model 2
[Figure: exchange between a Worker and the XWCH coordinator]
• (1) Proposed price
• (2) Feedback
• (3) Price updating
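The three steps on this slide suggest a feedback loop on prices. The slides do not say how prices are updated, so the rule below is purely an assumption: the worker raises its price by 10% after an accepted offer and lowers it by 10% after a rejection, and the coordinator accepts any price within a per-unit budget. All names are hypothetical.

```python
def update_price(price, accepted, step=0.1, floor=0.01):
    """Assumed rule: raise the price after acceptance, lower it after rejection."""
    factor = 1 + step if accepted else 1 - step
    return max(floor, price * factor)


def negotiate(start_price, budget_per_unit, rounds):
    """Run several rounds of propose, feedback, update; return the history."""
    price = start_price
    history = []
    for _ in range(rounds):
        proposed = price                          # (1) proposed price
        accepted = proposed <= budget_per_unit    # (2) feedback
        history.append((proposed, accepted))
        price = update_price(price, accepted)     # (3) price updating
    return history


history = negotiate(start_price=1.0, budget_per_unit=0.8, rounds=4)
for proposed, accepted in history:
    print(f"{proposed:.3f} {'accepted' if accepted else 'rejected'}")
```

Unlike Model 1, no central agency fixes the price here. It drifts toward what the coordinator is willing to pay.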