1. Journée du LRI 1 25 Juin 2003 ACI Masse de données
Data Grid Explorer
2. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
Large Scale Scheduling
Concluding remarks
3. Several types of GRID
4. Open issues in Grid/P2P
Security (GSI, CAS)
Data Storage/consultation/movement
Multi-user / multi-application scheduling
Coordination (virtual, ephemeral infrastructure)
Programming
Fault Tolerance!
Scalability
Performance
Easy/efficient deployment techniques
Application characterization techniques
Etc.
5. What is Grid today?
Middleware: Globus, Legion, Netsolve, Unicore, DIET, Condor, XtremWeb, Boinc, NWS
→ They actually work!
Testbeds: DataGRID, TeraGRID, e-Toile, Grads, XW, Boinc
→ Difficult to build (debugging, human factors, etc.)
Dedicated applications (SETI, Kazaa, Jabber, etc.)
→ They already work (at large scale!),
→ BUT they address far fewer issues!
6. Difficulties with Grid/P2P!
Generic Grid/P2P systems are very complex (we still have problems with large-scale parallel computers, and we have less control over Grid resources!)
Too many issues are addressed simultaneously
→ We need a methodology that enables the study of Grid issues independently but realistically
7. What are the current approaches?
Simulators: SimGRID, MicroGRID, etc.
→ They have strong limitations (scalability, differences from the execution of real codes, validation)
Experimental testbeds (are there any?)
→ Most testbeds are for production, each testbed is specific, and representativeness is an open question
→ We have no way to test ideas independently, at a significant scale, with realistic parameters and behaviors!
8. Case study: MPICH-V
9. Case study: MPICH-V
10. Case study: MPICH-V
11. What is missing?
A full-fledged scientific environment
(reproducible, realistic experimental conditions)
Probes measuring the performance of real resources and networks (Ganglia, NWS, la grenouille)
A fully experimental testbed (GRID 5000 would fill this gap)
→ Not enough: we need instruments with parameterizable, reproducible experimental conditions
12. Existing Network Simulators/Emulators
NS, NS2
Network simulation (congestion, packet loss, etc.)
Modelnet
Application + network emulation (some nodes are playing the role of routers)
13. Existing Grid Simulators
SimGrid and SimGrid2
Discrete event simulation with trace injection
Originally dedicated to scheduling studies
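The slide names the technique without detailing it. As a rough illustration only, the sketch below shows what "discrete event simulation with trace injection" can look like: recorded availability events seed a time-ordered event queue, and the simulator adds its own task-completion events as it runs. The trace format (time, node, state), the function name simulate, and the simple dispatch rule are assumptions made for this example; none of it is SimGrid's actual API or model.

```python
import heapq
import itertools


def simulate(trace, n_tasks, task_time):
    """Replay a node-availability trace while dispatching identical tasks.

    trace: iterable of (time, node, state), state being "up" or "down"
           (hypothetical format, chosen for this sketch).
    Returns the list of (finish_time, node) for completed tasks.
    """
    counter = itertools.count()   # unique sequence numbers break ties in the heap
    queue = []
    # Trace injection: every recorded event becomes a simulation event.
    for time, node, state in trace:
        heapq.heappush(queue, (time, next(counter), state, node))

    pending = n_tasks   # tasks waiting for a node
    running = {}        # node -> sequence number of its expected "finish" event
    idle = set()        # nodes that are up and have no task
    done = []

    while queue:
        now, seq, kind, node = heapq.heappop(queue)
        if kind == "up":
            idle.add(node)
        elif kind == "down":
            idle.discard(node)
            if node in running:       # the task is lost with the node: resubmit it
                del running[node]
                pending += 1
        elif kind == "finish":
            # A completion is valid only if the node did not fail in the meantime.
            if running.get(node) == seq:
                del running[node]
                done.append((now, node))
                idle.add(node)

        # Dispatch waiting tasks to idle nodes (smallest name first, for determinism).
        while pending and idle:
            target = min(idle)
            idle.remove(target)
            pending -= 1
            finish_seq = next(counter)
            running[target] = finish_seq
            heapq.heappush(queue, (now + task_time, finish_seq, "finish", target))
    return done
```

With a trace in which node "b" fails at time 3 and returns at time 4, the task running on "b" is lost and restarted, so its completion is delayed. Changing the injected trace while keeping everything else fixed is what makes such an experiment reproducible.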
15. Grid eXplorer
16. Grid eXplorer: an instrument for understanding GRID and P2P systems
17. Grid eXplorer: analogy with physics instruments
18. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
19. Grid eXplorer (GdX) platform for eXperiments
"A) Build, for the computer science research community, a platform for emulating large-scale systems such as GRID, P2P, or distributed systems in general: Data Grid Explorer.
B) Carry out experiments on large-scale systems using the platform, studying in particular the issues raised by massive data (security, reliability, performance)."
1K CPU clusters
Configurable network (Ethernet, Myrinet, others?)
Configurable OS (kernel, distribution, etc.)
Multi-user
Located at and managed by IDRIS
20. Grid eXplorer inside Grid 5000
29. A software architecture for studying the impact of large scale in distributed systems and networks
Our objective is to be able to execute existing or new MPI applications.
Since we are looking for transparent fault tolerance, the programmer's view must remain unchanged with respect to the MPI specification.
The two main issues to face are:
the volatility issue,
firewall protection.
In global computing and P2P systems, participant machines may be protected by firewalls.
If two participants are both protected by firewalls, they may not be able to communicate.
We must deal with this possibility.
So these are the objectives.
30. Grid eXplorer (GdX) eXperimental conditions database
A set of sensors (nodes, networks):
- Academic networks (x K nodes, GRID 5000)
- ADSL (la grenouille: 60 K nodes)
- XW probes (network performance evaluation on XW platforms)
A common format for traces
A tool set for accessing and managing traces
Tools for trace analysis
33. Grid eXplorer (GdX) Tool set:
Experimental condition injector,
Emulators (running real applications, systems, and middleware),
Parallel simulators (difficult!),
Virtual GRID environment (1k virtual nodes on 1k nodes),
Measurement tools,
Visualization tools,
Result analysis tools.
39. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
40. Large Scale Scheduling
41. Methodology
42. Large Scale Scheduling Simulator
44. Three scheduling algorithms (task distribution, not execution)
45. Comparing the three algorithms
46. Random at 50% of task distribution
47. Pressure at 50% of task distribution
48. Distribution speed
49. Outline
Introduction
Motivating a large scale instrument for Grid emulation/simulation
Status of Grid today
Other tools for large scale simulation/emulation
A large scale instrument for exploring Grid issues
in reproducible experimental conditions
Examples of experiments
MPICH-V
Large Scale Scheduling
Concluding remarks
53. Grid eXplorer (GdX)
A long-term effort
→ A medium-term milestone: 2 years for a fully functional prototype
Many scientific issues (large-scale emulation, injection of experimental conditions, distance to reality, etc.)
A tool for Grid users or potential users
A tool for Grid/P2P developers
A tool for Grid/P2P researchers