DIET Batch and Simbatch: a quick glance
Jean-Sébastien Gay
LIP, ENS Lyon, Université Claude Bernard Lyon 1
INRIA Rhône-Alpes, GRAAL Research Team
Joint work with the DIET team (DIET: Distributed Interactive Engineering Toolbox)
RPC and Grid Computing: Grid RPC
[Figure: GridRPC request flow. Labels: Client, AGENT(s), servers S1 to S4; request Op(C, A, B); agent reply "S2 !"; data A, B, C; answer (C).]
Outline
• Introduction
• DIET-Batch
• Simbatch
• Conclusion and perspectives
DIET Architecture
[Figure: DIET hierarchy. Labels: Client, Master Agent (MA), Local Agent (LA), server front end, JXTA, FAST library (application modeling, system availabilities), LDAP, NWS.]
Parallel and batch submissions - 1/2
• Parallel & sequential jobs → transparent for the user
• Submit a parallel job → system dependent
  • NFS: copy the code?
  • MPI: LAM, MPICH?
  • Batch system dependent
• Numerous batch systems (homogenization?)
• Batch schedulers behaviour (queues, scripts, etc.)
• Information about the internal scheduling process
• Monitoring & performance prediction
[Figure labels: MA, LA, SeD_seq, SeD_parallel, SeD_batch, GLUE, OAR, SGE, LSF, PBS, Loadleveler, Frontal, NFS.]
Parallel and batch submissions - 2/2
• 2 APIs
  • Client side
    • Request a sequential or a parallel (//) resolution, or let DIET choose the best
  • Server side
    • Script with generic mnemonics: DIET_NAME_FRONTALE, DIET_NB_NODES, DIET_BATCH_NODESFILE
    • A program that must end with a call to diet_submit_call()
• Experiments
Performance prediction with batch system
• During the submission stage
  • Need to know when the task will begin/end
  • Need to decide how many processors will be used
• Need performance prediction!
• Three means
  • Use a probabilistic tool
  • Ask the batch system (only available for MAUI and OAR 2.0)
  • Use a simulator
Batch scheduler overview
• Portable Batch System (PBS)
  • First Come First Served (FCFS)
• OAR (v. 1.6)
  • Conservative BackFilling (CBF)
• Torque + Maui
  • Torque alone: FCFS
  • Maui: 3 scheduling policies: BESTFIT, FIRSTFIT (CBF), GREEDY
• Sun Grid Engine (SGE)
  • FCFS
• Loadleveler
  • 3 scheduling policies: FCFS, CBF, GANG
  • Possibility to plug external schedulers
    • EASY
    • Maui (should soon become the standard scheduler)
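To make the difference between the two most common policies above concrete, here is a minimal Python sketch, not taken from any of the listed schedulers. It assumes every job is submitted at time 0 and that wall times are exact; the names `earliest_fit` and `schedule` and the (nodes, duration) job tuples are hypothetical. Strict FCFS never lets a job start before the job ahead of it in the queue, while conservative backfilling lets a later job use an earlier hole as long as no existing reservation is delayed.

```python
def earliest_fit(reservations, total_nodes, nodes, not_before, duration):
    """Earliest start >= not_before with `nodes` free during `duration`.

    `reservations` is a list of (start, end, nodes) tuples, `end` exclusive.
    """
    if nodes > total_nodes:
        raise ValueError("job larger than the cluster")
    # Free capacity only increases when a reservation ends.
    candidates = sorted({not_before} | {e for (_, e, _) in reservations if e > not_before})
    for t in candidates:
        # Usage only rises at reservation starts, so check those instants.
        points = [t] + [s for (s, _, _) in reservations if t < s < t + duration]
        if all(sum(n for (s, e, n) in reservations if s <= p < e) + nodes <= total_nodes
               for p in points):
            return t


def schedule(jobs, total_nodes, backfill):
    """Return the start time of each (nodes, duration) job, in queue order."""
    reservations, starts, previous_start = [], [], 0.0
    for nodes, duration in jobs:
        # FCFS: a job may not overtake the job queued before it.
        not_before = 0.0 if backfill else previous_start
        t = earliest_fit(reservations, total_nodes, nodes, not_before, duration)
        reservations.append((t, t + duration, nodes))
        starts.append(t)
        previous_start = t
    return starts


if __name__ == "__main__":
    queue = [(3, 100), (4, 50), (2, 80)]  # made-up jobs on a 5-node cluster
    print("FCFS:", schedule(queue, 5, backfill=False))
    print("CBF: ", schedule(queue, 5, backfill=True))
```

In this made-up queue the third job fits beside the first one, so backfilling starts it at time 0 without delaying the second job, whereas FCFS keeps it waiting until the second job has finished.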
Grid simulator overview
• Data replication
  • ChicSim: I. Foster; PARallel Simulation Environment for Complex Systems
  • OptorSim: W. H. Bell, D. G. Cameron, R. Carvajal-Schiaffino; Java
• Grid economy
  • GridSim: R. Buyya (Nimrod/G); Java; quite similar to Simgrid
• Non-specialized toolkit
  • Simgrid: H. Casanova, A. Legrand and M. Quinson; C
… and their drawbacks
• Minimal support for batch schedulers
  • Sometimes lack of functionalities to create them
  • Often difficult to reuse
• Example: OptorSim
  • No parallel tasks available
  • Backfilling impossible
• Lack of realism
Simbatch in a nutshell
• Goals
  • Cluster simulation for enhancing realism
  • Prediction tool for DIET
• API for clients
  • Description of the platform in XML files
  • Use of the API in the deployment.xml file
    • Example 1: creating a batch process on the host « Frontale »
        <process host="Frontale" function="SB_batch" />
    • Example 2: creating a resource
        <process host="Node1" function="SB_node" />
  • Each batch must be described in simbatch.xml
  • A specific load can be simulated for each batch
• API for developers
  • Algorithms are plug-ins
  • Reusable functions
    • Find the first matching slot in a Gantt chart
        slot_t * find_first_slot(cluster_t c, int nb_nodes, double start_time, double duration);
    • Empty queues and reschedule
        void generic_reschedule(cluster_t cluster, void (*schedule)(cluster_t cluster, m_task_t task));
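The idea behind "find the first matching slot in a Gantt chart" can be sketched in a few lines. The Python below is an illustration only, not Simbatch's C implementation: the `Reservation` class, the list-based Gantt chart and the `total_nodes` parameter are assumptions made for the example, and only the argument names `nb_nodes`, `start_time` and `duration` mirror the prototype above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    start: float
    end: float   # exclusive
    nodes: int


def nodes_in_use(gantt, t):
    """Number of nodes reserved at instant t."""
    return sum(r.nodes for r in gantt if r.start <= t < r.end)


def find_first_slot(gantt, total_nodes, nb_nodes, start_time, duration):
    """Earliest time >= start_time at which nb_nodes stay free for duration.

    Returns None when the request is larger than the cluster.
    """
    if nb_nodes > total_nodes:
        return None
    # A slot can only open at start_time or when some reservation ends.
    candidates = sorted({start_time} | {r.end for r in gantt if r.end > start_time})
    for t in candidates:
        end = t + duration
        # Usage can only increase when a reservation starts inside the window.
        checkpoints = [t] + [r.start for r in gantt if t < r.start < end]
        if all(nodes_in_use(gantt, c) + nb_nodes <= total_nodes for c in checkpoints):
            return t
    return None


if __name__ == "__main__":
    # Hypothetical chart on a 5-node cluster.
    chart = [Reservation(0, 100, 3), Reservation(50, 200, 2)]
    print(find_first_slot(chart, 5, 2, 0, 60))
```

With this chart, a 2-node request of 50 s fits at time 0, a 2-node request of 60 s has to wait until 100, and a 4-node request has to wait until 200. A scheduling plug-in in the spirit of the slide would call such a helper once per queued task.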
Experiment description
• 2 types of experiments
  • Validation by simulation: parameter variation
    • Topology, scheduling algorithm…
  • Comparison between simulated platform
• Task generation
  • Inter-arrival time: Poisson law, µ = 300 s
  • Number of resources: U(1, 5)
  • Run time: U(600, 1800)
  • Wall time: run time × U(1.1, 1.3)
• Experiment platform
  • 5-node cluster
  • Star topology
  • OAR v. 1.6
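The task generation parameters above can be turned into a small workload generator. The Python sketch below is not the generator used for the experiments. It assumes that "Poisson law, µ = 300 s" means a Poisson arrival process, i.e. exponentially distributed inter-arrival times with a mean of 300 s, that the number of resources is an integer drawn uniformly from 1 to 5, and that run times are in seconds. The `Task` class and `generate_tasks` function are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    submit_time: float  # seconds since the start of the experiment
    nb_nodes: int
    run_time: float     # actual execution time, seconds
    wall_time: float    # time requested from the batch scheduler, seconds


def generate_tasks(n, mean_interarrival=300.0, seed=None):
    """Draw n tasks following the distributions listed on the slide."""
    rng = random.Random(seed)
    tasks = []
    clock = 0.0
    for _ in range(n):
        # Assumption: Poisson process, hence exponential inter-arrival times.
        clock += rng.expovariate(1.0 / mean_interarrival)
        run_time = rng.uniform(600.0, 1800.0)
        wall_time = run_time * rng.uniform(1.1, 1.3)
        tasks.append(Task(clock, rng.randint(1, 5), run_time, wall_time))
    return tasks


if __name__ == "__main__":
    for task in generate_tasks(3, seed=0):
        print(task)
```

Passing a fixed `seed` makes a workload reproducible, which is useful when the same task list has to be submitted both to a simulator and to a real cluster.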
Simulation precision
• Number of tasks: 100
• Makespan: 23 h
• Error rate on the flow metrics around 1%
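The slide does not say how the error rate is computed. One plausible reading, sketched below in Python under that assumption, takes the flow of a task as its completion time minus its submission time and reports the relative difference between the total simulated flow and the total measured flow. The function name and the sample numbers are made up for illustration and are not experimental data.

```python
def flow_error_rate(real, simulated):
    """Relative error on the total flow.

    `real` and `simulated` are lists of (submit_time, end_time) pairs,
    one pair per task, in the same task order.
    """
    real_total = sum(end - submit for submit, end in real)
    simulated_total = sum(end - submit for submit, end in simulated)
    return abs(simulated_total - real_total) / real_total


if __name__ == "__main__":
    # Made-up timings in seconds, not measurements from the experiments.
    measured = [(0.0, 1000.0), (300.0, 2000.0)]
    predicted = [(0.0, 1010.0), (300.0, 2017.0)]
    print(f"{flow_error_rate(measured, predicted):.2%}")
```

A per-task average of relative errors would be an equally reasonable definition; the choice matters mostly when a few long tasks dominate the total flow.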
Conclusion and perspectives
• DIET-Batch
  • DIET is now able to handle batch schedulers
  • 3 SeD types: sequential, batch, parallel
  • Good performance improvements
• Simbatch
  • Standalone simulations show good results
  • Configuration file available to simulate Lyon's site
  • Excellent tool to replay load
• Next steps
  • Integrate Simbatch in DIET-Batch
Questions?
http://graal.ens-lyon.fr/DIET/
http://graal.ens-lyon.fr/simbatch/