Integrated Genomics Applications Pipeline (iGAP) on the GRID
iGAP – Current Architecture
• Input Sequences are Split at the Client
• Designed for a Cluster Environment with a Local Scheduler
• Assumes the Database and Input Sequences are Shared among all Nodes (NFS)
• Job Monitoring is not Implemented
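The slide does not say how the client splits the input, so the following is only a minimal sketch. It assumes the input is a multi-record FASTA file and that records are dealt out in round-robin order; the function names `read_fasta_records` and `split_fasta` are hypothetical and not part of iGAP.

```python
from typing import Iterable, Iterator, List


def read_fasta_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record: List[str] = []
    for line in lines:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def split_fasta(lines: Iterable[str], n_chunks: int) -> List[List[str]]:
    """Distribute records over n_chunks lists in round-robin order.

    Assumption: one chunk per job; the real pipeline may split differently.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[str]] = [[] for _ in range(n_chunks)]
    for i, record in enumerate(read_fasta_records(lines)):
        chunks[i % n_chunks].append(record)
    return chunks
```

Each chunk can then be written to its own file and handed to one job. Splitting on record boundaries keeps every sequence intact within a single chunk.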
Porting iGAP to the Grid – Tasks Involved
• Security and Access to Computing Resources
• A Method of Publishing Available Grid Nodes
• Sending Input Sequences to the Nodes
• Scheduling Jobs Among Nodes
• Location of Binary and Databases
• Retrieving Output Files
• Monitoring Job Status
iGAP – Job Submission
• Authentication – Managed through Globus
• Resources – Available grid nodes and their resources are published in a text file
• Scheduling – Weighted Round Robin
• Input – Input files are copied to the nodes using globus-url-copy
• Execution – RSL is generated on the client and submitted to Globus
• Log – Database is updated with job information
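The submission steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the iGAP code: the node file format (`<host> <weight>` per line), the function names, and all paths are hypothetical. The sketch only builds the `globus-url-copy` argument list and the RSL string; it does not contact Globus.

```python
import itertools
from typing import List, Sequence, Tuple


def parse_nodes(text: str) -> List[Tuple[str, int]]:
    """Parse lines of the assumed form '<host> <weight>'; '#' starts a comment."""
    nodes = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        host, weight = line.split()
        nodes.append((host, int(weight)))
    return nodes


def weighted_round_robin(
    nodes: Sequence[Tuple[str, int]], jobs: Sequence[str]
) -> List[Tuple[str, str]]:
    """Assign jobs to hosts; a host with weight w gets w jobs per cycle."""
    schedule = [host for host, weight in nodes for _ in range(weight)]
    if not schedule:
        raise ValueError("no nodes with positive weight")
    return list(zip(jobs, itertools.cycle(schedule)))


def copy_command(local_path: str, host: str, remote_path: str) -> List[str]:
    """Build a globus-url-copy invocation (source URL, then destination URL).

    Both paths are assumed to be absolute.
    """
    return ["globus-url-copy", f"file://{local_path}", f"gsiftp://{host}{remote_path}"]


def build_rsl(executable: str, arguments: Sequence[str], stdout: str) -> str:
    """Build a simple RSL string; values are assumed to contain no double quotes."""
    args = " ".join(f'"{a}"' for a in arguments)
    return f'&(executable="{executable}")(arguments={args})(stdout="{stdout}")'
```

With weights 2 and 1, for example, the first node receives two of every three jobs. Expanding the weights into a repeating schedule is the simplest form of weighted round robin; a smoother interleaving is possible but is not implied by the slides.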
iGAP – Job Monitoring
• Retrieval – Search the job table for the user's active jobs
• Status – Check the status of these jobs with Globus
• If a job is complete, retrieve its output files
• Update the status of the job in the database
• Return to the retrieval step and repeat
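The monitoring cycle above could be sketched as a polling loop like the one below. The table layout (`jobs` with `id`, `user`, `contact`, `status`), the status strings, and the two callables `check_status` and `fetch_output` are all assumptions made for illustration; in the real pipeline they would wrap Globus status queries and file retrieval. SQLite is used here only to keep the example self-contained.

```python
import sqlite3
import time
from typing import Callable


def poll_once(
    db: sqlite3.Connection,
    user: str,
    check_status: Callable[[str], str],
    fetch_output: Callable[[str], None],
) -> int:
    """Run one monitoring pass and return how many jobs are still active."""
    rows = db.execute(
        "SELECT id, contact FROM jobs WHERE user = ? AND status = 'ACTIVE'",
        (user,),
    ).fetchall()
    for job_id, contact in rows:
        status = check_status(contact)  # stand-in for a Globus status query
        if status == "DONE":
            fetch_output(contact)  # stand-in for retrieving output files
        db.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
    db.commit()
    remaining = db.execute(
        "SELECT COUNT(*) FROM jobs WHERE user = ? AND status = 'ACTIVE'",
        (user,),
    ).fetchone()[0]
    return remaining


def monitor(db, user, check_status, fetch_output, interval: float = 30.0) -> None:
    """Repeat the pass until the user has no active jobs left."""
    while poll_once(db, user, check_status, fetch_output) > 0:
        time.sleep(interval)
```

Passing the status and retrieval functions in as parameters keeps the loop independent of any particular Globus client and makes it easy to exercise with stubs.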
Shortcomings
• Current resource discovery mechanism is inadequate
• Requires shared file systems within each cluster in the Grid
• Globus job status reporting is inadequate
• Requires prior knowledge of the database and binary locations
Future Implementation
• Integration with the data-grid using SRB
• Publishing database and binary location in the MDS
• Automated re-submission of “failed” jobs
• User notification on job completion