WP1 Status and Y2 plans
Massimo Sgaravatto, INFN Padova
EDG WP1 Status
• Implemented the first workload management system (1st release)
• Code delivered to EDG-IT and integrated with the software delivered by the other WPs
• Being tested by applications (validation)
• Working on bug fixes and improvements (also in response to validation feedback)
• Starting work on year 2 issues
WP1 1st release
• UI (User Interface)
  • Ability to submit a job, described via an appropriate Job Description Language (JDL) based on Condor ClassAds, to the DataGrid testbed from any user machine
  • The UI lets the user monitor and control (terminate) the job, and transfer a "small" amount of data between the client machine and the executing machine (Input/Output Sandbox) using GridFTP
  • Lightweight, Python-based client
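To make the idea of a ClassAd-style job description concrete, here is a minimal Python sketch that renders a dictionary as `Name = value;` lines. The attribute names (`Executable`, `StdOutput`, `InputSandbox`, `OutputSandbox`, `Requirements`), the `other.Architecture` expression, and the helper names `Expr` and `render_jdl` are assumptions chosen for illustration; the slides do not list the actual JDL attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A raw expression that must be emitted without quotes (hypothetical helper)."""
    text: str


def format_value(value):
    """Format one Python value in a ClassAd-like textual form."""
    if isinstance(value, Expr):
        return value.text
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(format_value(v) for v in value) + "}"
    if isinstance(value, str):
        return '"' + value.replace('"', '\\"') + '"'
    return str(value)


def render_jdl(attributes):
    """Render a dict as one 'Name = value;' line per attribute."""
    return "\n".join(f"{name} = {format_value(value)};"
                     for name, value in attributes.items())


# Illustrative job: attribute names are assumptions, not taken from the slides.
example_job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "InputSandbox": ["input.dat"],    # small files shipped to the executing machine
    "OutputSandbox": ["std.out"],     # small files shipped back to the client
    "Requirements": Expr('other.Architecture == "i686"'),
}
jdl_text = render_jdl(example_job)
print(jdl_text)
```

The sandbox lists correspond to the "small" amount of data moved between the client and the executing machine; larger data sets would be handled outside the sandbox.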
WP1 1st release
• RB (Resource Broker)
  • Responsible for choosing the “best” resources to which jobs are submitted, based on the constraints specified in the JDL and on the characteristics and status of resources (published in the Grid Information Service and Replica Catalog)
  • The strategy used for this first project release is to send the job to an appropriate CE (Computing Element):
    • where the submitting user has proper authorization
    • that matches the characteristics specified in the JDL (architecture, computing power, application environment, etc.)
    • where the specified input data (and possibly the chosen output SE) are determined to be "close enough" by the appropriate resource administrators
  • The RB Strategy module is designed to be easily replaceable
  • Matchmaking is performed using the Condor ClassAds library
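The three selection criteria above can be sketched as a filter followed by a ranking step. This is a plain-Python illustration, not the Condor ClassAds matchmaking the RB actually uses; the class names, field names, and the default ranking by free CPUs are all assumptions. In particular, "close enough" is modelled here as "at least one replica of the input data sits on an SE that the site declares close to the CE".

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    name: str
    authorized_users: set
    attributes: dict          # e.g. {"Architecture": "i686"}
    close_storage: set        # SEs declared "close" by the resource administrators
    free_cpus: int = 0


@dataclass
class Job:
    user: str
    requirements: dict        # attribute name -> required value
    input_replicas: set = field(default_factory=set)  # SEs holding the input data


def matches(job, ce):
    """Apply the three first-release criteria to one CE."""
    authorized = job.user in ce.authorized_users
    meets_requirements = all(ce.attributes.get(key) == value
                             for key, value in job.requirements.items())
    # Assumption: a job without input data is "close" to every CE.
    data_is_close = (not job.input_replicas
                     or bool(job.input_replicas & ce.close_storage))
    return authorized and meets_requirements and data_is_close


def choose_ce(job, elements, rank=lambda ce: ce.free_cpus):
    """Return the best-ranked matching CE, or None; 'rank' is the swappable strategy."""
    candidates = [ce for ce in elements if matches(job, ce)]
    return max(candidates, key=rank, default=None)


testbed = [
    ComputingElement("ce1", {"alice"}, {"Architecture": "i686"}, {"se1"}, 4),
    ComputingElement("ce2", {"alice", "bob"}, {"Architecture": "i686"}, {"se2"}, 10),
    ComputingElement("ce3", {"alice"}, {"Architecture": "ia64"}, {"se1"}, 50),
]
job = Job("alice", {"Architecture": "i686"}, {"se1"})
chosen = choose_ce(job, testbed)
print(chosen.name if chosen else "no matching CE")
```

Passing a different `rank` function is one simple way to picture a replaceable strategy module: the filtering stays fixed while the preference among matching CEs changes.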
WP1 1st release
• JSS (Job Submission Service)
  • Responsible for job management operations (issued on request of the RB) and for keeping track of submitted jobs
  • Wrapper of Condor-G
• II (Information Index)
  • First filter to the Grid Information Service
  • Specific application of the Globus GIIS
• LB (Logging & Bookkeeping)
  • Job status information
  • “State machine” view of each job
  • Push model
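A "state machine" view combined with a push model can be pictured as follows: components push events to the bookkeeping service, which accepts only legal transitions and keeps a per-job history. The state names, the transition table, and the `Bookkeeping` class are hypothetical; the slides do not enumerate the real L&B states.

```python
from enum import Enum


class State(Enum):
    # Hypothetical states for illustration only.
    SUBMITTED = "submitted"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE = "done"
    ABORTED = "aborted"


# Legal transitions: current state -> states that may follow it.
ALLOWED = {
    State.SUBMITTED: {State.SCHEDULED, State.ABORTED},
    State.SCHEDULED: {State.RUNNING, State.ABORTED},
    State.RUNNING: {State.DONE, State.ABORTED},
    State.DONE: set(),
    State.ABORTED: set(),
}


class Bookkeeping:
    """Receives events pushed by other components and tracks each job's state."""

    def __init__(self):
        self._history = {}  # job_id -> list of (source, State)

    def push(self, job_id, source, new_state):
        history = self._history.get(job_id)
        if history is None:
            # The first event for a job must register it as submitted.
            if new_state is not State.SUBMITTED:
                raise ValueError(f"{job_id}: first event must be SUBMITTED")
            self._history[job_id] = [(source, new_state)]
            return
        current = history[-1][1]
        if new_state not in ALLOWED[current]:
            raise ValueError(f"{job_id}: {current.name} -> {new_state.name} not allowed")
        history.append((source, new_state))

    def status(self, job_id):
        """Return the most recent state recorded for the job."""
        return self._history[job_id][-1][1]


lb = Bookkeeping()
lb.push("job-1", "UI", State.SUBMITTED)
lb.push("job-1", "RB", State.SCHEDULED)
print(lb.status("job-1").name)
```

With a push model the bookkeeping service never polls; the UI, RB, and JSS report events as they occur, and a status query just reads the latest recorded state.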
WP1 Y2 plans
• 1.1 (January 2002)
  • Bug fixes
  • Minor improvements
• 1.2 (March 2002)
  • Support for automatic proxy renewal
    • Use of Globus MyProxy (with proper customizations)
    • Requires some customizations to GRAM (to forward the “fresh” proxy to the jobmanager)
• 1.3 (May 2002)
  • Development of APIs for the applications
    • Currently only command-line tools are provided
  • Ability to submit MPI jobs
    • Starting with submission of an MPI job to a single CE
  • Tests of the use of WP3 R-GMA for L&B services
    • The date for actual integration cannot be foreseen yet
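The automatic proxy renewal planned for release 1.2 amounts to replacing a job's credential before it expires. The sketch below shows only that decision logic; it does not call MyProxy or GRAM, and the function name, the 30-minute safety margin, and the `fetch_fresh_proxy` callback are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def renew_expiring(proxies, now, fetch_fresh_proxy, margin=timedelta(minutes=30)):
    """Renew every proxy that expires within 'margin' of 'now'.

    proxies: dict mapping job id -> expiry time (updated in place).
    fetch_fresh_proxy: hypothetical callback returning the new expiry time;
    in a real system this step would obtain and forward a fresh credential.
    Returns the list of job ids whose proxy was renewed.
    """
    renewed = []
    for job_id, expires_at in proxies.items():
        if expires_at - now <= margin:
            proxies[job_id] = fetch_fresh_proxy(job_id)
            renewed.append(job_id)
    return renewed


now = datetime(2002, 3, 1, 12, 0)
proxies = {
    "job-1": now + timedelta(minutes=10),  # about to expire
    "job-2": now + timedelta(hours=6),     # still valid for a long time
}
renewed_jobs = renew_expiring(proxies, now, lambda job_id: now + timedelta(hours=12))
print(renewed_jobs)
```

Renewing some time before expiry, rather than at expiry, leaves room for the fresh proxy to be forwarded to the running job.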
WP1 Y2 plans
• 1.4 (June 2002)
  • Support for interactive jobs
    • Jobs running on some CE worker node where a channel to the submitting (UI) node is available for the standard streams (PROOF-like applications)
  • Specification of job dependencies
  • Triggering of file transfers
  • Integration of network information into the scheduling policy
  • Deployment of the accounting infrastructure over the testbed (HLR with command-line interface)
  • Development of a GUI
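Job dependencies are naturally described as a directed acyclic graph, and a valid submission order is a topological ordering of that graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the function name and the job names are hypothetical, and the slides do not say how WP1 intends to express or resolve dependencies.

```python
from graphlib import CycleError, TopologicalSorter


def submission_order(dependencies):
    """Return the jobs in an order that respects their dependencies.

    dependencies: dict mapping each job to the set of jobs that must
    finish before it can start. Raises ValueError on circular dependencies.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"circular job dependency: {exc.args[1]}") from exc


# Hypothetical workflow: job names are illustrative only.
workflow = {
    "analyse": {"simulate", "calibrate"},
    "simulate": {"generate"},
    "calibrate": set(),
    "generate": set(),
}
order = submission_order(workflow)
print(order)
```

Several orderings can be valid for the same graph, so the example prints one acceptable order rather than a unique answer.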
WP1 Y2 plans
• 1.4 (June 2002)
  • Advance reservation APIs for three use cases:
    • A job is scheduled on a CE when the requested storage space for output becomes available on some SE
    • A job is scheduled on a CE when the required input data has been replicated near the CE
    • A job is scheduled on a CE when some network service reservation slot becomes available to service the needs of the main job execution thread
• 2.0 (September 2002)
  • Support for job partitioning
  • Full integration of cost estimation/accounting into the scheduling policy
  • Integration of advance reservation/co-allocation into the RB