Interactive European Grid: An interoperable infrastructure targeting interactivity, visualization and parallelism
Dr. Isabel Campos Plasencia (on behalf of the inteugrid team)
Instituto de Física de Cantabria, IFCA (Santander), Consejo Superior de Investigaciones Científicas (CSIC)
EGEE User Forum, Manchester, 9th – 11th May 2007
The Interactive European Grid
• Project acronym: int.eu.grid
• Contract number: 031857
• Instrument: I3
• Duration: 2 years, May ´06 – April ´08
• Coordinator: Jesús Marco de Lucas, CSIC
“providing transparently the researcher’s desktop with the power of a supercomputer, using distributed resources”
http://www.interactive-grid.eu
Outline of the presentation
• Objectives & challenges of int.eu.grid
• Applications requirements
• Middleware versus Apps.
• MPI Support
• Interactive steering
• Visualization
• Example (Open-MPI, Grid, Visualization)
The challenge of a stable infrastructure: int.eu.grid
• From the middleware point of view
  • Parallel computing (MPI)
    • Support intracluster jobs with Open MPI
    • Support intercluster jobs with PACX-MPI
  • Advanced visualization tools allowing simulation steering
    • GVid, glogin
  • A job scheduler that supports it all
  • User-friendly interface to the grid supporting all these features
    • Integrating all the features in the Migrating Desktop
• From the infrastructure point of view
  • Operate a production-level infrastructure 24/7
  • Support Virtual Organizations at all levels
    • Running the VO (user support)
• From the applications point of view
  • Analyze requirements of reference applications
  • Ensure that the middleware copes with the demands of the reference applications
  • Application porting support
  • Promote collaborative environments like AccessGrid
This is our infrastructure
Applications Requirements
• Understanding the application (user input to the NA team)
  • Description in terms of
    • Area of knowledge and state of the art
    • Results expected and impact on the scientific community
  • Understanding the computational approach at the algorithmic level
  • Resources needed
    • Software & hardware
    • GRID services
• GRID added value
  • Why on the GRID?
  • Interactive environment
  • Graphics & visualization
  • Quality of service and network reliability
Project Pilot Applications
• Fusion
• Astrophysics
• Environment
• Medical Imaging
Applications in Environmental Research
• IMS Model Suite
• Evolution of pollution clouds in the atmosphere
[Figure: pollution cloud simulation; axis: height above surface in m]
Analysis of CMB maps (Astrophysics)
Pattern: Requirements for Middleware
• Distributing the task among N processors
  • MPI support
• The job should be started immediately on the user desktop
  • MPI interactive job scheduling
• The graphical interface should be forwarded to the user desktop
  • Graphical interface to the grid: Migrating Desktop
  • Supporting visualization: GVid
• The user should be able to steer the simulation
  • Real-time steering: glogin
MPI Support
• Why MPI support?
  • The standard API for distributed-memory parallelisation
  • Write once, run everywhere
  • This is what applications are
• What is MPI
  • It is an API
  • A description of the semantics, but NOT the implementation
  • Almost platform independent (modulo problems with MPI-IO)
• What is NOT MPI
  • There is no implementation
  • No specification of how to start the processes
  • How to get the binary onto the remote sites
  • How to start the binaries on the remote sites (ssh, PBS, …)
Many issues of handling MPI job types have already been worked out for Linux clusters, supercomputers, etc.; when running MPI on the Grid they have to be addressed in a particular way.
Problems of MPI Support on the Grid
• There is no standard way to start an MPI program
  • No common syntax for mpirun
  • MPI-2 defines mpiexec as the starting mechanism, but support for mpiexec is only optional
  • Resource Brokers should handle different MPI implementations
  • Different schedulers and different MPI implementations at each site have different ways to specify the machinefile
• Non-shared filesystems (Oh!)
  • Many Grid sites don't have support for a shared home directory
  • Many MPI implementations expect the executable to be available on the nodes where the process is started
  • Mixed setup in general: some sites have shared filesystems, some do not
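As an illustration of the machinefile problem: PBS lists one hostname per allocated slot in the file named by $PBS_NODEFILE, while an Open MPI hostfile accepts lines of the form "host slots=N". The sketch below converts one into the other. The function name is hypothetical and this is not code from int.eu.grid; it only shows the kind of translation that has to happen at each site.

```bash
# Hypothetical helper, written for illustration only.
# Reads a PBS-style node list on stdin (one hostname per allocated slot)
# and prints Open MPI hostfile lines of the form "host slots=N".
pbs_to_openmpi_hostfile() {
    # sort groups repeated hostnames, uniq -c counts them,
    # awk swaps the columns into "host slots=count".
    sort | uniq -c | awk '{ print $2 " slots=" $1 }'
}

# Typical use inside a PBS job (left as a comment, it needs $PBS_NODEFILE):
#   pbs_to_openmpi_hostfile < "$PBS_NODEFILE" > machines
```

Other scheduler and MPI combinations would need a different conversion, which is why doing this by hand in every job does not scale.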
MPI Support in Grid Environments
• In Grid environments there are two possible cases
  • Intra-cluster jobs
    • All processes run on the same cluster
  • Inter-cluster jobs
    • Processes are distributed across several clusters/sites
[Figure: MPI_COMM_WORLD with processes 1, 2, ..., SIZE; point-to-point (P2P) and collective communication]
Grid scheduler language needs “translation” to local scheduler syntax
[Diagram: UI, RB (general Grid scheduler), CE (local scheduler) and WN at local site B, annotated along the chain with “Translate? NO, of course!”, “Translate? NO”, “Translate? YES, but how?”]
• The RB cannot be updated often without compromising the whole job submission procedure
Problems of MPI Support on the Grid
• Our solution: an intermediate layer, mpi-start
[Diagram: mpi-start sits between the Resource Broker and the MPI implementation and scheduler]
mpi-start
• Goals
  • Hide differences between MPI implementations
  • Hide differences between local scheduler implementations
  • Support simple file distribution
    • Hides the filesystem details (shared or non-shared) from the user
  • Provide a single interface, simple but powerful enough, for the Resource Broker to specify MPI jobs
    • The Resource Broker does not need hardcoded MPI support
mpi-start: design
• Portable (bash scripting)
• mpi-start core with plug-ins for
  • Schedulers: PBS, SGE
  • MPI: Open MPI, PACX-MPI, MPICH
  • Hooks: filesystem
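The design can be pictured with a minimal sketch. This is not the real mpi-start code: both function names are hypothetical, and the only facts assumed are that PBS sets PBS_NODEFILE, that SGE sets PE_HOSTFILE inside a parallel environment, and that mpirun takes -hostfile (Open MPI) or -machinefile (MPICH).

```bash
# Hypothetical sketch of an mpi-start-like layer; not the actual implementation.

# Guess the local scheduler from the environment variables it exports.
detect_scheduler() {
    if [ -n "${PBS_NODEFILE:-}" ]; then
        echo pbs
    elif [ -n "${PE_HOSTFILE:-}" ]; then
        echo sge
    else
        echo unknown
    fi
}

# build_launch_command FLAVOR NP MACHINEFILE EXECUTABLE [ARGS...]
# Prints the launch command for the requested MPI flavor, so the caller
# (for example the Resource Broker) never needs flavor-specific syntax.
# Note: plain POSIX sh has no "local", so these variables are global.
build_launch_command() {
    flavor=$1
    np=$2
    machinefile=$3
    shift 3
    case "$flavor" in
        openmpi) echo "mpirun -np $np -hostfile $machinefile $*" ;;
        mpich)   echo "mpirun -np $np -machinefile $machinefile $*" ;;
        *)       echo "unsupported MPI flavor: $flavor" >&2; return 1 ;;
    esac
}
```

Adding a new MPI implementation or scheduler then means adding one branch at the site, with no change to the Resource Broker.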
MPI Job Example
    Executable = "IMB-MPI1";
    Arguments = "pingpong";
    JobType = "Parallel";
    JobSubType = "openmpi";
    NodeNumber = 16;
    StdOutput = "std.out";
    StdError = "std.err";
    OutputSandbox = {"std.out","std.err"};
    InputSandbox = {"IMB-MPI1"};
Resulting command:
    mpirun -machinefile $TMP/machines -np 16 IMB-MPI1 pingpong
MPI Job Example
Include the following in the JDL:
    InputSandbox = {"MyHooks.sh", ....};
    Environment = {"I2G_MPI_PRE_RUN_HOOK=./MyHooks.sh", "I2G_MPI_POST_RUN_HOOK=./MyHooks.sh"};
# cat MyHooks.sh
    pre_run_hook () {
        echo "pre run hook called"
        # fetch and build the sources before the MPI job starts
        wget www.myhome.xx/mysources.tgz
        tar xzvf mysources.tgz
        make
        …
        return 0
    }
Dissemination Effort
• School organized in Dublin at TCD
  • Course including Grids and MPI support
  • Hosted by TCD (Brian Coghlan)
  • Date: end of June 2007
MPI Support in Grid Environments
• For inter-cluster jobs we support PACX-MPI
  • A middleware to seamlessly run an MPI application on a network of parallel computers
  • PACX-MPI is an optimized, standard-conforming MPI implementation; the application just needs to be recompiled (!)
  • PACX-MPI uses locally installed, optimized vendor implementations for communication inside each cluster
[Diagram: the application runs on PACX-MPI, which spans an Open MPI job on Cluster 1 and an Open MPI job on Cluster 2]
PACX-MPI Design
• A grid site has in general the following topology
  • CE = Computing Element (head node), public IP
  • WN = Worker Nodes, private IP
• Requirements
  • Connectivity of CE to the clusters and start-up daemons
  • Files: application & input files
  • Start-up of daemons on the CE; ssh connectivity to the CE
• An MPI job requesting N processes per cluster spawns N+2 processes; two of them run on the CE as daemons, making the bridge between clusters
[Figure: a CE and its WNs, with process rank labels]
PACX-MPI Design
• External communication
  • Handled via the Computing Element, the only node with a public IP
  • TCP/IP daemons do the job
[Figure: process rank labels on Cluster 1 and Cluster 2]
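The N+2 layout can be made concrete with a small sketch of how local ranks might map to global ranks. The numbering is an assumption made for illustration: the two daemons are taken to occupy local ranks 0 and 1 in every cluster, and application processes are numbered globally cluster by cluster. The function name is hypothetical.

```bash
# Illustrative only: assumes daemons hold local ranks 0 and 1, and that
# global ranks are assigned consecutively, one cluster after another.
# pacx_global_rank CLUSTER LOCAL_RANK N
#   CLUSTER     zero-based cluster index
#   LOCAL_RANK  rank inside the cluster's own MPI job (0 .. N+1)
#   N           application processes requested per cluster
# Prints the global rank of an application process, or "daemon".
pacx_global_rank() {
    cluster=$1
    local_rank=$2
    n=$3
    if [ "$local_rank" -lt 2 ]; then
        echo daemon
    else
        echo $(( cluster * n + local_rank - 2 ))
    fi
}
```

Under these assumptions, with N = 4 and two clusters, each cluster runs 6 local processes and the application sees 8 global ranks, 0 to 7.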
Example: Visualization of plasma in fusion devices
• The application visualizes the behaviour of plasma inside a fusion device
• Runs are foreseen as part of a so-called Fusion Virtual Session
• The plasma is analyzed as a many-body system consisting of N particles
• Inputs
  • Geometry of the vacuum chamber
  • Magnetic field in the environment
  • Initial number, position, direction, velocity of particles
  • Possibility of collisions between particles
  • Density of particles inside the device
• Solves a set of stochastic differential equations with a Runge-Kutta method
• Outputs
  • Trajectories of the particles
  • Average of relevant magnitudes: densities, temperatures...
• Graphical display using OpenGL with interactive capabilities
[Figure: TJ-II Stellarator at CIEMAT (Spain)]
Porting the application to int.eu.grid
• Spread the calculation over hundreds of Worker Nodes on the Grid to increase the number of particles in the plasma
• Design of a Grid collaborative environment for fusion device design and analysis
• Uses most of the capabilities of the int.eu.grid middleware
  • N particles distributed among P processes: MPI
  • Particle trajectories are displayed graphically
  • Interactive simulation steering
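One simple way to distribute N particles among P processes is a block decomposition, sketched below. The slides do not say which scheme the fusion application uses, so this is an assumption for illustration, and the function name is hypothetical.

```bash
# Illustrative block decomposition; the application's real scheme is not
# described in the slides.
# particles_for_rank N P RANK
# Prints how many of N particles process RANK (0 .. P-1) handles:
# every rank gets N/P, and the first N%P ranks take one extra particle.
particles_for_rank() {
    n=$1
    p=$2
    rank=$3
    base=$(( n / p ))
    extra=$(( n % p ))
    if [ "$rank" -lt "$extra" ]; then
        echo $(( base + 1 ))
    else
        echo "$base"
    fi
}
```

With this split no process carries more than one particle above any other, so adding Worker Nodes lets the particle count grow while the work per process stays roughly constant.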
Global Schema: MPI + interactive + visualization
Middleware for Visualization & Steering
Our middleware is based on the combination of a Grid video streamer together with an interactive, grid-enabled login tool
• GVid
  • Grid Video Service
  • Visualization can be executed remotely on a grid resource
  • Transmits the visualization output to the user desktop
  • Communicates the interaction events back to the remote rendering machine
  • Uses glogin as bi-directional communication channel
• glogin
  • Lightweight tool for support of interactivity on the grid
  • Grid-authenticated shell access: “glogin host”
  • No dedicated daemon such as sshd needed
  • TCP port forwarding enables access to grid worker nodes with private IPs
  • X11 forwarding
Fusion Application MPI Schema
• MPI job distribution over processes P0, P1, P2, P3
• Independent processes: every process does its own I/O
• MPI synchronization
• Master P0 does the rendering
The user interacts with the master process for visualization and steering
[Diagram: on the user side, keyboard and mouse events are intercepted and received by the master process P0; the GVid encoder on P0 feeds the Java GVid decoder and the user screen; P1, P2, P3 are the other processes]
Using the Migrating Desktop
• Graphical interface to job submission
Running on the Migrating Desktop
• Job monitoring
• Job logs & details
See our DEMO: Simulation steering on the Migrating Desktop
Some related events
• GRIDS & E-SCIENCE, 24 – 29th September, Santander, Spain
  http://grid.ifca.unican.es