Grid Scheduler: Plan & Schedule
Adam Arbree, Jang Uk In
Current System
Diagram components: User Request, VDC, RC, TC, Chimera Abstract Planner, Concrete Planner, DAGMan, Condor-G, Globus (gahp-server), Remote Site
Proposed System
Diagram components: User Request, VDC, TC, RC, Data Rep. Service, Chimera Abstract Planner, Scheduling Server, Grid Monitor Interface, Scheduling Client, PRDB, JDB, GDB, Condor-G, Globus (gahp-server), Remote Site
Scheduling Server
Diagram components: RC, JDB, Grid Mon., Data Replication Server, DAG Reducer, Tracking System, Message Interface, Planner, Prediction Engine, Scheduling Client, RC & TC, Grid Mon., PJDB
Chimera Abstract Planner
• Input
  • User virtual data request
• Output
  • Abstract production plan
• Queries VDC for full dependency graph
Scheduling Client
• Input
  • Parse abstract DAG
  • Read run messages from server
• Output
  • Send DAG to server
  • Build and send jobs for Condor-G
• Maintain local image of DAG progress
• Refresh the scheduler data by request
• Choose scheduling server
Scheduling Databases
• TC: Transformation Catalog
  • (LFN, site) → (PFN, env)
• RC: Replica Catalog
  • (LFN, site) → (PFN)
  • (LFN, site, copy) → (PFN)
• PRDB: Prediction DB
  • (job, params, site) → execution time, CPU use, disk use, bandwidth
• JDB: Job DB
  • (job) → job state, site, VO, user, params, prediction use, current use
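The key-to-value shape of these databases can be sketched with plain Python mappings. This is only an illustration: the record type `Prediction`, its field units, the helper `lookup_replica`, and all sample names and URLs are hypothetical and not part of the design in the slides.

```python
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    """One PRDB value; field names and units are assumptions."""
    execution_time_s: float
    cpu_use: float
    disk_use_mb: float
    bandwidth_mbps: float


# TC: (LFN, site) -> (PFN, env)
tc = {("reco", "site_a"): ("/opt/bin/reco", {"SCRATCH": "/tmp"})}

# RC: (LFN, site) -> PFN
rc = {("run1.dat", "site_a"): "gsiftp://site_a.example.org/data/run1.dat"}

# PRDB: (job, params, site) -> predicted resource use
prdb = {("reco", ("run1.dat",), "site_a"): Prediction(120.0, 0.9, 500.0, 10.0)}


def lookup_replica(lfn: str, site: str) -> Optional[str]:
    """Return the physical file name for a logical file at a site, if any."""
    return rc.get((lfn, site))
```

Tuples are used as keys so that each lookup mirrors the (key) → (value) pairs listed on the slide.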
Grid Monitor
• Input
  • Monitor data
• Output
  • Data to Data Rep. Service
  • Data to Server
  • Data to grid cache
• Monitors
  • Cost Function
  • VO limits table
  • CPU load (by job)
  • Disk Usage (by job)
  • Job List
  • Bandwidth (by job)
Message Interface
• Input
  • A-DAG (from client)
  • User status requests (from client)
  • Job run requests (from planner)
  • Job state request (from rep. server)
  • Job state (from tracking)
• Output
  • Job run requests (to client)
  • Status updates (to client)
  • Pruned DAG (to client)
  • Job state (to rep. server)
  • Job state request (to tracking)
• Manages client connections
• Provides incoming and outgoing message queues
• Checks connectivity of clients
DAG Reducer
• Input
  • Complete abstract DAG (from message int.)
  • Replica data (from RC)
• Output
  • DAG pruned for file existence (to message int.)
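Pruning for file existence can be sketched as a backward walk from the requested files: a job is kept only if some file that is still needed has no replica yet. This is one possible reading of the slide, not the authors' implementation; the function `reduce_dag`, the `(inputs, outputs)` job representation, and the sample job names are assumptions.

```python
def reduce_dag(jobs, existing, targets):
    """Return the set of job names that still have to run.

    jobs:     {job_name: (input_lfns, output_lfns)}
    existing: set of LFNs that already have a replica in the RC
    targets:  LFNs the user asked for
    """
    # Map every output file to the job that produces it.
    producer = {out: name for name, (_, outs) in jobs.items() for out in outs}
    keep, seen = set(), set()
    pending = list(targets)
    while pending:
        lfn = pending.pop()
        if lfn in seen or lfn in existing:
            continue  # already handled, or a replica exists: nothing to run
        seen.add(lfn)
        if lfn not in producer:
            raise KeyError(f"no replica and no producing job for {lfn!r}")
        job = producer[lfn]
        if job not in keep:
            keep.add(job)
            pending.extend(jobs[job][0])  # the job's inputs are now needed too
    return keep


jobs = {
    "gen": ([], ["a"]),
    "sim": (["a"], ["b"]),
    "reco": (["b"], ["c"]),
}
print(sorted(reduce_dag(jobs, existing={"b"}, targets=["c"])))
```

With a replica of `b` already registered, only `reco` remains, because `gen` and `sim` exist solely to produce `b`.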
Prediction Engine
• Input
  • Job description (from planner)
  • Updated history information (from tracking system)
  • History data (from PRDB)
• Output
  • Job prediction (to planner)
  • History information (to tracking sys.)
  • History data (to PRDB)
• Predict the time for a job on each site in the grid
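The slides do not say how predictions are computed. A minimal sketch, assuming the simplest possible estimator (the mean of past run times for the same job type on the same site), could look like the following; the class `PredictionEngine` and its methods are hypothetical, and an in-memory dictionary stands in for the PRDB.

```python
from collections import defaultdict
from statistics import mean


class PredictionEngine:
    """Per-site run-time estimates from recorded history (illustrative only)."""

    def __init__(self):
        # (job type, site) -> list of observed run times in seconds
        self._history = defaultdict(list)

    def record(self, job, site, runtime_s):
        """Store one finished run, as reported by the tracking system."""
        self._history[(job, site)].append(runtime_s)

    def predict(self, job, sites):
        """Return {site: estimated seconds} for every site with history."""
        estimates = {}
        for site in sites:
            runs = self._history.get((job, site))
            if runs:
                estimates[site] = mean(runs)
        return estimates


engine = PredictionEngine()
engine.record("reco", "site_a", 10.0)
engine.record("reco", "site_a", 20.0)
engine.record("reco", "site_b", 30.0)
print(engine.predict("reco", ["site_a", "site_b", "site_c"]))
```

Sites without history are left out of the result, so the caller decides how to treat an unknown site.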
Tracking System
• Input
  • Pruned DAG (from DAG reducer)
  • Job status (from planner)
  • Prediction information (from pred. engine)
  • Status req. (from message interface)
  • Job data (from JDB)
• Output
  • Job status (to planner)
  • New history information (to pred. engine)
  • Status information (to message interface)
  • Job data (to JDB)
• Periodically access grid monitor and update job status
Planner
• Input
  • Job status (from tracking system)
  • Job predictions (from pred. engine)
  • PFNs (from TC and RC)
  • Grid status (from grid mon.)
• Output
  • Job status (to tracking system)
  • Job run requests (to message interface)
• Scheduling process
  • Check grid status
  • Determine next job to run and its execution site
  • Transfer input files
  • Send message to client to run job
  • Update tracking
  • Transfer files to storage
  • Clean up
  • Update RC
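The "determine next job to run and its execution site" step can be sketched as follows. The selection rule used here (first ready job in name order, sent to the available site with the lowest predicted run time) is an assumption, since the slides leave the cost function open; `ready_jobs`, `choose_site`, and `next_assignment` are hypothetical names.

```python
def ready_jobs(parents, done, running):
    """Jobs whose parents have all finished and that are not yet started."""
    return sorted(
        job for job, deps in parents.items()
        if job not in done and job not in running
        and all(dep in done for dep in deps)
    )


def choose_site(predicted_s, site_up):
    """Pick the available site with the lowest predicted run time, or None."""
    candidates = {s: t for s, t in predicted_s.items() if site_up.get(s, False)}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


def next_assignment(parents, done, running, predictions, site_up):
    """Return (job, site) for the next job to submit, or None if none fits."""
    for job in ready_jobs(parents, done, running):
        site = choose_site(predictions.get(job, {}), site_up)
        if site is not None:
            return job, site
    return None


parents = {"gen": [], "sim": ["gen"], "reco": ["sim"]}
predictions = {"sim": {"site_a": 40.0, "site_b": 25.0, "site_c": 5.0}}
site_up = {"site_a": True, "site_b": True, "site_c": False}
print(next_assignment(parents, {"gen"}, set(), predictions, site_up))
```

Here `site_c` has the best prediction but is reported down by the grid status, so `sim` goes to `site_b`.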
Data Replication Service
• Input
  • Grid status
  • Job queue
• Output
  • Entries to RC
• Monitor grid and determine hot spots
• Select sites to replicate data
• Transfer data to replication sites
• Clean up unneeded data
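One possible reading of "hot spots" is files that many queued jobs are waiting to read. Under that assumption, the first two steps can be sketched as below. The demand threshold, the load metric, and the names `find_hot_files` and `pick_replication_site` are illustrative and do not come from the slides.

```python
from collections import Counter


def find_hot_files(job_queue, threshold):
    """LFNs requested as input by at least `threshold` queued jobs."""
    demand = Counter(lfn for job in job_queue for lfn in job["inputs"])
    return sorted(lfn for lfn, count in demand.items() if count >= threshold)


def pick_replication_site(lfn, replicas, site_load):
    """Least-loaded site that does not already hold a copy, or None."""
    holders = replicas.get(lfn, set())
    candidates = [site for site in site_load if site not in holders]
    if not candidates:
        return None
    # Break ties on site name so the choice is deterministic.
    return min(candidates, key=lambda site: (site_load[site], site))


job_queue = [
    {"name": "reco1", "inputs": ["run1.dat", "calib.db"]},
    {"name": "reco2", "inputs": ["run2.dat", "calib.db"]},
    {"name": "reco3", "inputs": ["run3.dat", "calib.db"]},
]
replicas = {"calib.db": {"site_a"}}
site_load = {"site_a": 0.2, "site_b": 0.7, "site_c": 0.4}

for lfn in find_hot_files(job_queue, threshold=3):
    print(lfn, "->", pick_replication_site(lfn, replicas, site_load))
```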
Grid Simulation
• Only two outside interfaces
  • Condor-G
  • Remote sites
• Condor-G emulator takes real Condor-G submit files and sends fake jobs to remote site emulators
• Remote site emulators sleep for a designated period for each job and send simulated data to the grid monitor
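A remote site emulator of this kind can be sketched in a few lines. The class `RemoteSiteEmulator`, the shape of the report dictionaries, and the injectable `sleep` function (which lets tests skip the real wait) are all assumptions made for illustration.

```python
import time


class RemoteSiteEmulator:
    """Pretends to run jobs by sleeping, then reports to a monitor callback."""

    def __init__(self, name, report, sleep=time.sleep):
        self.name = name
        self.report = report  # callable standing in for the grid monitor
        self.sleep = sleep    # replaceable so tests need not actually wait

    def run(self, job_id, duration_s):
        """Simulate one job that takes `duration_s` seconds."""
        self.report({"site": self.name, "job": job_id, "state": "running"})
        self.sleep(duration_s)
        self.report({"site": self.name, "job": job_id, "state": "done",
                     "wall_time_s": duration_s})


events = []
slept = []
site = RemoteSiteEmulator("site_a", report=events.append, sleep=slept.append)
site.run("job-1", 5)
print(events)
```

Passing `time.sleep` (the default) gives the real delay; the recording stub above makes the behaviour easy to check without waiting.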
Development Schedule
• Research ~ present to Jan 20th
  • Survey existing monitoring systems
  • Decide what must be monitored
• Initial framework ~ Jan 20th to end of Feb
  • Build grid monitor interface
  • Build grid simulator
  • Design scheduler and data replication service
• Build scheduler ~ March
• Build data replication service ~ April
• Grid testing ~ May