1. Journée du LRI, 25 June 2003: ACI Masse de données
Data Grid Explorer
2. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
Large Scale Scheduling
Concluding remarks
3. Several types of GRID
4. Open issues in Grid/P2P
Security (GSI, CAS)
Data storage/consultation/movement
Multi-user / multi-application scheduling
Coordination (virtual, ephemeral infrastructure)
Programming
Fault Tolerance!
Scalability
Performance
Easy/efficient deployment techniques
Application characterization techniques
Etc.
5. What is Grid today?
Middleware: Globus, Legion, Netsolve, Unicore, DIET, Condor, XtremWeb, Boinc, NWS
-> they are actually working!
Testbeds: DataGRID, TeraGRID, e-Toile, Grads, XW, Boinc
-> Difficult to build (debugging, human factors, etc.)
Dedicated applications (SETI, Kazaa, Jabber, etc.)
-> they are already working (at large scale!),
-> BUT they address far fewer issues!
6. Difficulties with Grid/P2P!
Generic Grid/P2P systems are very complex (we still have problems with large-scale parallel computers, and we have less control over Grid resources!)
Too many issues are addressed simultaneously
-> We need a methodology enabling the study of Grid issues independently but realistically
7. What are the current approaches?
Simulators: SimGRID, MicroGRID, etc.
-> they have strong limitations (scalability, differences from the execution of real code, validation)
Experimental testbeds (are there any?)
-> Most testbeds are for production, each testbed is specific; how representative are they?
-> We have no way to test ideas independently, at a significant scale, with realistic parameters and behaviors!
8. Case study: MPICH-V
9. Case study: MPICH-V
10. Case study: MPICH-V
11. What is missing?
A full-fledged scientific environment
(reproducible, realistic experimental conditions)
Probes measuring the performance of real resources and networks (Ganglia, NWS, la grenouille)
A fully experimental testbed (GRID 5000 would fill this gap)
-> Not enough: we need instruments with parameterizable, reproducible experimental conditions
12. Existing Network Simulators/Emulators
NS, NS2
Network simulation (congestion, packet loss, etc.)
Modelnet
Application + network emulation (some nodes play the role of routers)
13. Existing Grid Simulators
SimGrid and SimGrid2
Discrete event simulation with trace injection
Originally dedicated to scheduling studies
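As a rough illustration of what "discrete event simulation with trace injection" can mean, the Python sketch below replays a recorded host availability trace as events in a priority queue and schedules fixed-length tasks on whichever hosts are up. It does not use the SimGrid API; the trace format `(time, host, state)`, the `simulate` function, and the host and task names are all hypothetical.

```python
import heapq


def simulate(trace, tasks, task_duration):
    """Replay (time, host, "up"/"down") trace events and run tasks on hosts that are up.

    Hypothetical model: a task completes only if its host stays up for
    task_duration; if the host goes down first, the task is requeued.
    Returns (completed, still_pending).
    """
    # Injected trace events and simulated events share one queue ordered by time.
    events = [(t, i, "trace", host, state) for i, (t, host, state) in enumerate(trace)]
    heapq.heapify(events)
    seq = len(events)          # tie-breaker so tuples never compare payloads
    pending = list(tasks)
    up = set()
    running = {}               # host -> (task, finish_time)
    done = []

    def start(now, host):
        nonlocal seq
        if pending and host in up and host not in running:
            task = pending.pop(0)
            finish = now + task_duration
            running[host] = (task, finish)
            heapq.heappush(events, (finish, seq, "finish", host, task))
            seq += 1

    while events:
        now, _, kind, host, payload = heapq.heappop(events)
        if kind == "trace" and payload == "up":
            up.add(host)
            start(now, host)
        elif kind == "trace" and payload == "down":
            up.discard(host)
            if host in running:
                task, _ = running.pop(host)
                pending.append(task)          # work is lost, requeue the task
                for other in sorted(up):      # offer it to any idle host
                    start(now, other)
        elif kind == "finish" and running.get(host) == (payload, now):
            # Stale finish events (host went down in between) fail the check above.
            del running[host]
            done.append((now, host, payload))
            start(now, host)
    return done, pending


trace = [(0, "h1", "up"), (0, "h2", "up"), (3, "h2", "down"), (4, "h2", "up")]
done, pending = simulate(trace, ["a", "b", "c"], task_duration=5)
```

Because the only source of variability is the injected trace, replaying the same trace always gives the same schedule, which is the reproducibility property the slides ask for.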
15. Grid eXplorer
16. Grid eXplorer: an instrument for understanding GRID and P2P systems
17. Grid eXplorer: analogy with physics instruments
18. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
19. Grid eXplorer (GdX): platform for eXperiments
"A) Build, for the computer science research community, a platform for emulating large-scale systems such as GRID, P2P, or distributed systems in general: Data Grid Explorer.
B) Carry out experiments on the platform on large-scale systems, studying in particular the issues raised by massive data (security, reliability, performance)."
1K CPU clusters
configurable network (Ethernet, Myrinet, others?)
configurable OS (kernel, distribution, etc.)
Multi-users
Located/managed by IDRIS
20. Grid eXplorer inside Grid 5000
29. A software architecture for studying the impact of large scale in distributed systems and networks
Our objective is to be able to execute existing or new MPI applications.
Since we want transparent fault tolerance, the programmer's view must remain unchanged with respect to the MPI specification.
The two main issues to face are:
the volatility issue,
firewall protection.
In Global Computing and P2P systems, participant machines may be protected by firewalls.
If two participants are both protected by firewalls, they may not be able to communicate.
We must deal with this possibility.
30. Grid eXplorer (GdX): eXperimental conditions database
A set of sensors (nodes, networks):
-> Academic networks (x K nodes, GRID 5000)
-> ADSL (la grenouille, 60 K nodes)
-> XW-probes (network performance evaluation on XW platforms)
A common format for traces
A tool set for accessing and managing traces
Tools for trace analysis
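To make the idea of a common trace format and an access tool set more concrete, here is a minimal Python sketch. The slides do not define the format, so everything here is an assumption: the `TraceRecord` fields, the CSV layout, the sensor labels, and the `dump`, `load`, and `select` helpers.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    """One measurement in a hypothetical common trace format."""
    timestamp: float   # seconds since the start of the measurement
    sensor: str        # assumed labels, e.g. "grid5000", "grenouille", "xw-probe"
    resource: str      # node or link identifier
    metric: str        # e.g. "bandwidth_kbps", "latency_ms"
    value: float


FIELDS = ["timestamp", "sensor", "resource", "metric", "value"]


def dump(records):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    for r in records:
        writer.writerow([r.timestamp, r.sensor, r.resource, r.metric, r.value])
    return buf.getvalue()


def load(text):
    """Parse CSV text produced by dump() back into records."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        TraceRecord(float(row["timestamp"]), row["sensor"], row["resource"],
                    row["metric"], float(row["value"]))
        for row in reader
    ]


def select(records, sensor=None, metric=None):
    """Return the records matching the given sensor and/or metric."""
    return [
        r for r in records
        if (sensor is None or r.sensor == sensor)
        and (metric is None or r.metric == metric)
    ]


records = [
    TraceRecord(0.0, "grenouille", "adsl-1", "bandwidth_kbps", 512.0),
    TraceRecord(1.5, "xw-probe", "n7", "latency_ms", 42.0),
]
```

A single record type shared by all sensors is what lets the same analysis tools work on academic-network, ADSL, and XW-probe traces alike.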
33. Grid eXplorer (GdX) tool set
Experimental condition injector,
Emulators (running real applications, systems, and middleware),
Parallel simulators (difficult!),
Virtual GRID environment (1k virtual nodes on 1k nodes),
Measurement tools,
Visualization tools,
Result analysis tools.
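One possible reading of an "experimental condition injector" is a component that replays recorded network conditions so that an emulated link sees the same latency and loss on every run. The Python sketch below is only an illustration under that assumption; the `ConditionInjector` class and the `(time, latency_ms, loss_rate)` sample format are hypothetical and not taken from the slides.

```python
import bisect


class ConditionInjector:
    """Replays recorded (time, latency_ms, loss_rate) samples for one link."""

    def __init__(self, samples):
        # Sort once so lookups can use binary search.
        self.samples = sorted(samples)
        self.times = [s[0] for s in self.samples]

    def conditions_at(self, t):
        """Return the most recent sample at or before time t, or None before the first."""
        i = bisect.bisect_right(self.times, t)
        if i == 0:
            return None
        _, latency_ms, loss_rate = self.samples[i - 1]
        return {"latency_ms": latency_ms, "loss_rate": loss_rate}


injector = ConditionInjector([(10, 80, 0.02), (0, 20, 0.0)])
```

An emulator would query `conditions_at` at each step and apply the result to its virtual link; since the answer depends only on the recorded samples, two runs with the same trace get identical conditions.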
39. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
40. Large Scale Scheduling
41. Methodology
42. Large Scale Scheduling Simulator
44. Three scheduling algorithms (task distribution, not execution)
45. Comparing the three algorithms
46. Random at 50% of task distribution
47. Pressure at 50% of task distribution
48. Distribution speed
49. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
53. Grid eXplorer (GdX)
A long-term effort
-> A medium-term milestone: 2 years for a fully functional prototype
Many scientific issues (large-scale emulation, injection of experimental conditions, distance to reality, etc.)
A tool for Grid users or potential users
A tool for Grid/P2P developers
A tool for Grid/P2P researchers