Example applications of e-Infrastructure: NGS UI/WMS
Jonathan Churchill - STFC/RAL
jonathan.churchill@stfc.ac.uk
Summary • Overview • Example case study. • Logging in: SSH and MyProxy. • Parametric jobs • Head node for the grid • Submitting simple jobs. • Submitting ‘real’ jobs. • Lab Session.
NGS UI/WMS
• User Interface / Workload Management System
• UMD (gLite) UI and WMS distributions
• NGS usability improvements:
  • ssh / proxy logins
  • Extensive proxy checking
  • GridFTP on UI
  • “Head node for the Grid”
• Works with NGS and gLite/GridPP sites
• 12,000+ CPU cores
Rapid take-up since Oct ‘09 startup.
[Diagram: NGS WMS ssh login. Components shown: happyuser@ngs.ac.uk, UI, MyProxy, WMS, Information Server (BDII), and sites RAL-NGS2, Scotgrid-Glasgow, Oxford-OERC, Manchester-NGS2, RAL-LCG etc. (12,000+ CPUs). Link labels: MEG, gsiFTP, SOAP WS.]
Transcriptome Analysis using the NGS UI/WMS
Jonathan Churchill - STFC/RAL
jonathan.churchill@stfc.ac.uk
Paul Wilkinson - Exeter University
Green Dock Beetle (Gastrophysa viridula) • Tobacco Hornworm Moth (Manduca sexta)
mRNA Transcript (image: genome.gov, National Human Genome Research Institute)
Bio Databases and Applications
• Database mirrors (via rsync / ftp):
  • EMBL
  • UNIPROT, TREMBL, SWISSPROT
  • PROSITE
  • PRINTS
  • REBASE
• Pre-installed applications:
  • BLAST, EMBOSS, FASTA
  • GROMACS, MrBayes
  • EXONERATE, NAMD, Siesta
WMS Parametric Case Study
• What proteins do these ‘contigs’/transcripts code for?
• NCBI BLAST search in the EBI UniProt database
• 1 x 55,000 contigs
  • 1 month elapsed annotation time.
• 1000 x 55 contigs + NGS + WMS
  • < 6 hours elapsed annotation time.
• Using WMS ‘Parametric’ JDL file
  • One JDL for 1000 jobs
  • One submission command
  • One status command
  • Outputs returned to UI automatically
JDL File

    Type = "Job";
    JobType = "Parametric";
    Executable = "/usr/ngs/BLAST-NCBI";
    Arguments = "blastall -p blastx -d uniprot -i contig-_PARAM_.fsa -a 1";
    StdOutput = "contig-_PARAM_.out";
    StdError = "contig-_PARAM_.err";
    Parameters = 997;
    ParameterStart = 0;
    ParameterStep = 1;
    MyProxyServer = "myproxy.ngs.ac.uk";
    InputSandbox = {"contig-_PARAM_.fsa"};
    InputSandboxBaseURI = "gsiftp://ngsui03.ngs.ac.uk:2811/home/ngs0055/ParamBlast/inputs";
    OutputSandbox = {"contig-_PARAM_.out","contig-_PARAM_.err"};
    OutputSandboxBaseDestURI = "gsiftp://ngsui03.ngs.ac.uk:2811/home/ngs0055/ParamBlast/outputs";
    Requirements = ( Member("NGS-UEE-BLAST-NCBI", other.GlueHostApplicationSoftwareRunTimeEnvironment) );
    Rank = other.GlueCEStateFreeCPUs;
    ShallowRetryCount = -1;
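To see which file names the parametric JDL produces, the substitution of _PARAM_ can be sketched in a few lines of Python. This is only an illustration, not WMS code: the function name expand_parametric is hypothetical, and it assumes that an integer Parameters value acts as an exclusive upper bound (values start, start+step, ... below Parameters), which is how the gLite JDL attribute documentation is usually read. Under that assumption the JDL above generates 997 jobs numbered 0 to 996, so it is worth checking the count against the number of input files.

```python
def expand_parametric(template, parameters, start=0, step=1):
    """Return one string per generated job, with _PARAM_ substituted.

    Assumption: an integer Parameters value is an exclusive upper bound,
    so the values are start, start + step, ... strictly below parameters.
    """
    return [template.replace("_PARAM_", str(value))
            for value in range(start, parameters, step)]


# Values taken from the JDL slide: Parameters = 997, ParameterStart = 0, ParameterStep = 1
inputs = expand_parametric("contig-_PARAM_.fsa", parameters=997)
print(len(inputs), inputs[0], inputs[-1])
```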
Submit & Monitor
• glite-wms-job-submit -a -o jobIDs blast.jdl
• glite-wms-job-status -i jobIDs
• One jobID for all 1000 jobs
• 1000 output files
• Input and output files copied from/to UI
• Jobs 2-3 hours each
• Head node for the grid.
Peak 320 jobs in flight; average 150 jobs in flight.
Summary • 55,000 contig analysis in < 6 hours vs 1 month • ssh username/password logins • 1000 jobs all managed as one ‘job’. • Input/Output on the UI. • Head node for the Grid.
Simple Example

[ Create your proxy <- the UI does this for you! ]
    voms-proxy-init --voms ngs.ac.uk
See what’s available
    lcg-infosites --vo ngs.ac.uk ce
Submit the job
    glite-wms-job-submit -a -o jobIDs.txt my_test.jdl
    https://ngswms01.ngs.ac.uk:9000/LHGIagvDl701_msz0jpIg
Check the status of your job
    glite-wms-job-status -i jobIDs.txt
Get the output
    glite-wms-job-output -i jobIDs.txt --dir ./outputs
Note: the UI/WMS can retrieve outputs automatically.
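For scripted use, the submit, status and output steps can be assembled as argument lists before being handed to a process runner. The sketch below is an assumption-laden illustration: the helper name wms_commands is hypothetical, the command names follow the "Submit & Monitor" slide (glite-wms-job-*), and only the flags shown on the slides (-a, -o, -i, --dir) are used. It only builds and prints the command lines; actually running them requires a UI with the gLite client tools and a valid proxy.

```python
import shlex


def wms_commands(jdl_file, ids_file="jobIDs.txt", output_dir="./outputs"):
    """Build the submit, status and output command lines as argv lists."""
    return {
        "submit": ["glite-wms-job-submit", "-a", "-o", ids_file, jdl_file],
        "status": ["glite-wms-job-status", "-i", ids_file],
        "output": ["glite-wms-job-output", "-i", ids_file, "--dir", output_dir],
    }


# Print the commands in the order they would be run on the UI.
for name, argv in wms_commands("my_test.jdl").items():
    print(f"{name}: {shlex.join(argv)}")
```

On a UI, each list could be passed to subprocess.run(argv, check=True); keeping the arguments as a list avoids shell quoting problems with file names.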
Simple JDL file
Some default parameters set on the UI
Simple example JDL:

    Type = "Job";
    JobType = "Normal";
    Executable = "settings.sh";
    StdOutput = "output.out";
    StdError = "output.err";
    InputSandbox = {"settings.sh"};
    OutputSandbox = {"output.err","output.out"};
    RetryCount = 1;
    Requirements = ( other.GlueCEUniqueID == "ngs.rl.ac.uk:2119/jobmanager-lsf-ngs" );
    Rank = other.GlueCEStateFreeCPUs;
    Requirements = other.GlueCEStateStatus == "Production";
Input/Output Files
• InputSandbox lists all input files
  • Includes binaries/scripts to run
  • Wildcards OK
• OutputSandbox lists output files to retrieve.
  • Wildcards not allowed.
  • Tutorial shows ‘Epilogue’ script.
• InputSandboxBaseURI
  • Avoids 3rd-party transfers via WMS server.
• OutputSandboxBaseDestURI
  • Outputs to UI or elsewhere.
  • Output dir must exist.
  • Files arrive before job “Done”.

    Type = "Job";
    JobType = "mpich";
    Executable = "/usr/ngs/DLPOLY2";
    CpuNumber = 8;
    StdOutput = "std.out";
    StdError = "std.err";
    MyProxyServer = "myproxy.ngs.ac.uk";
    InputSandbox = {"CONFIG","CONTROL","FIELD","REVCON"};
    InputSandboxBaseURI = "gsiftp://ngsui03.ngs.ac.uk:2811/home/ngsxxx/dlpoly";
    OutputSandbox = {"OUTPUT","STATIS","CONFIG", "CONTROL","FIELD","REVCON","REVIVE", "stdout.out","stderr.out" };
    OutputSandboxBaseDestURI = "gsiftp://ngsui03.ngs.ac.uk:2811/home/ngsxxx/dlpoly";
    Requirements = ( member("NGS-UEE-DLPOLY2", other.GlueHostApplicationSoftwareRunTimeEnvironment) );
    ShallowRetryCount = -1;
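Two of the rules above (no wildcards in OutputSandbox, and the output directory must already exist) are easy to check before submitting. The following is a minimal pre-flight sketch, not part of the NGS tooling: the function name check_output_sandbox is hypothetical, and it assumes the script runs on the UI so that the directory named in OutputSandboxBaseDestURI is available as a local path.

```python
import os

# Characters treated as shell-style wildcards for this check.
WILDCARD_CHARS = set("*?[]")


def check_output_sandbox(output_files, local_output_dir):
    """Return a list of problems found before submission (empty if none)."""
    problems = []
    for name in output_files:
        if WILDCARD_CHARS & set(name):
            problems.append(f"wildcard not allowed in OutputSandbox: {name}")
    if not os.path.isdir(local_output_dir):
        problems.append(f"output directory does not exist: {local_output_dir}")
    return problems


# Hypothetical example: one bad entry and a missing directory.
print(check_output_sandbox(["OUTPUT", "STATIS", "*.out"], "/no/such/dir"))
```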
Key Features Summary • ssh logins • Input/Output on the UI • Head node for the Grid. • Single jobs and parametric sweeps • Normal and MPI jobs • Example JDLs on www.ngs.ac.uk • Questions: support@grid-support.ac.uk
Further Information
• NGS web site UI-WMS page:
• http://www.ngs.ac.uk/uiwms
• Links to simple WMS tutorials (2) & app-specific ones (Gaussian, NAMD etc.)
• http://www.ngs.ac.uk/applications
• Tutorials
• http://www.ngs.ac.uk/ngs-workload-management-system-and-user-interface-tutorials
• http://wiki.ngs.ac.uk/index.php?title=UI-WMS_Tutorial
• http://wiki.ngs.ac.uk/index.php?title=UI-WMS_Tutorial2
• Parametric case study:
• http://www.ngs.ac.uk/mrna-analysis-using-the-ngs
• http://www.ngs.ac.uk/sites/default/files/file/newsletters/Dec%202009%20NGS%20news.pdf
• Links to guides and JDL attributes doc.
• http://wiki.ngs.ac.uk/index.php?title=UI-WMS_Tutorial#Further_Resources
Lab • http://wiki.ngs.ac.uk/index.php?title=UI-WMS-SeIUCCR-Tutorial • Login: • SSH username = “SummerUserXX” • XX is on your packs • SSH Password = “2012-SSXX” • Valid until Friday afternoon
Inputs

>whitefly_assembly.accurate.15_lrc1
GGTATCAACGCAGAGTKCGCGGGGAGTAGAACAAAGAGCGTCTGAGAGGACTTCGCGATA
GTGTTACGTTAATCGATAGCTCGTGTGTTAAAAAAATCTTTCAAGTCCTTCCTGTCTTTT
GACTACTTAATTAGTTAATTATTATTTTGATCGAGACAAGCAAAGAAAAATGAATTCCAT
ATTATCTTTGACCGTTTTCGTAACTTTCACAATTGTCTTGGCTCAAAGTGAACAATTAGA
CAAGAACTTCGGCGTGGGCGAAATCAAGACTCGCATCCAAGATAAAAAATTTGTTGAGAA
GCAGTTGGGCTGTGTCCTAGGGAAAGCCGATTGCGACACCTTAGGAAATCAGTTGAAAGT
TGCCATTCCAGAAGTCCTAGTTAAAGGCTGCAAGGATTGCACTCCGGAACAATCTGCAAA
TGCCAATCGATTAATAGCTTTTATAAAGATGAATTATCCAGCAGAATGGAGTCAAATTGC
TGCAAAATATGGTGTGAAAGGTGATGCTGTAAAGAGGCCACGACGACATATCAGAAGGTG
AAAGGAGTGATGCCAAAGATGTGATAAGTTTTTATTGTTAACTTTCGAGTCTTGACTTGA
TTTGATCATTGTGTACGTATGTATTTTAATTCTTCCAATTGTGAGCAGTATTTTAAGAGG
GTATTCTAAATAACAGCCGTCCAAAAAGTTTTGAACTGAAATTTAAACTGTTAAGTGTTG
ATGACTTTTACCAATATTTATTTTTTTATCACCGAACTGTTAGTAATACTGCGACCAATA
CAAATTTATCTTTAGTCAGCTTGATTTTTTATCAAGTTGATTCTTTTTTTTGGACAATTT
TTTTTTTATTATTATTCTTCCTCATTTAATGTATGTTTAAAATTGTTAATTGACCACCAT
TCGCATTTAATTGATTAAGTTTTTCTTATTTTTTTTTTATATGAACCAATGTTATAATTT
TGCTCTCATAAACCTACTGTAAAATATTGAGTGTCCAGTTAAAGCTTTAAACTTTATATA
TTTTAACAAAAAATTAATGAGCTATTTTATAGAACCTAATAA
>whitefly_assembly.accurate.15_lrc2
TCGGGGGAGTAAATTCATGAAAGATAATCTAATCGTGCAGCCTTTTTATGAGACGCGCTG
AAGTTTCGGATTAGGTTTTAGTCTTTACTAATTAATTGTATTTGTTTAGCTCATTAATTT
TAATTATTCCACATTTAAAGATGTCTAAGGAAGAAGCAGCAATCCCTCCTCCAATGATTT
GGGCCCAGAGATCTGGTGTTGTCTTTTTAACAATTAATGTAGAGGATTGTAAAGACCCCG
AAATTAAAATTGAAGAAGATAAATTTTCTTTTAAAAGTGTTGGTGGTGTTGAAAAGAAGA
AATATGAAGTCACAGTAAATCTATTTAAAGAAATAGACCCAGAAAAATCTGTAAAACATG
TTCGCGAACGACACATTGAGTTAGTCCTAAAAAAGAAAGAAGACAAAGCTCCTTACTGGC
CACAATTGACGAAAGAAAAGACTAAGCACCATTGGTTAAAAGTGGATTTCAATAAGTGGA
AGGATGAAGATGATAGCGAAGATGAAGCCGAAGGACAAGACTCAGATTTTGGTGATCTAA
TGCGGTCGATGGGTCAAGGAGGCGGTATGGGTGGTATGGGCGGTATGGGTGGTATGGGAG
GAATGGGAGGTATGGGTATGGGCGGTATGGGTGGTATGGGAATGGGTGGTTTAGGTGACA
AGCCCTCTTTCGAAGGAATGGAAGAAGAAGATTCGGACGACGAAGATTTGCCCGACCTCG
AAGAGTAATAGTGTTTTTATTACACCATATTCCATTTCCCTGTTATTGCATAAGGCCTCA
GAAGAAGATGAAAAAATTGAAGCTATGAACGGACAGTCAAATCGATCACGCAGTTCACTG
• ....55,000 more contigs
• Split up into ~1000 files of ~55 contigs each.
• Custom Perl script or BioPerl routines.
• contig-0.fsa ... contig-997.fsa
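The splitting step was done with a custom Perl script or BioPerl routines; the same idea can be sketched in Python for illustration. This is not the authors' script: the function names split_fasta and write_chunks are hypothetical, the chunk size of 55 and the contig-N.fsa naming are taken from the slide, and the input is assumed to be a list of FASTA lines with trailing newlines already stripped.

```python
def split_fasta(lines, records_per_file=55):
    """Group FASTA lines into chunks of at most records_per_file records.

    Each chunk is a list of lines; a record starts at a '>' header line.
    """
    chunks, current, count = [], [], 0
    for line in lines:
        if line.startswith(">"):
            # Start a new chunk once the current one is full.
            if count == records_per_file:
                chunks.append(current)
                current, count = [], 0
            count += 1
        current.append(line)
    if current:
        chunks.append(current)
    return chunks


def write_chunks(chunks, prefix="contig-", suffix=".fsa"):
    """Write chunk i to <prefix>i<suffix>, numbering from 0."""
    for index, chunk in enumerate(chunks):
        with open(f"{prefix}{index}{suffix}", "w") as handle:
            handle.write("\n".join(chunk) + "\n")


# Tiny made-up input: three records, two records per chunk.
demo = [">c1", "ACGT", ">c2", "GGTA", "TTAA", ">c3", "CCGG"]
print([len(chunk) for chunk in split_fasta(demo, records_per_file=2)])
```

Numbering the files from 0 keeps them aligned with a parametric JDL whose ParameterStart is 0.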
[Diagram: NGS WMS ssh login, with host names. Components shown: a.PhD@happyuser.ngs.ac.uk, ngsui03.ngs.ac.uk, myproxy.ngs.ac.uk, ngswms01.ngs.ac.uk, bdii.ngs.ac.uk, and sites RAL-NGS2, Scotgrid-Glasgow, Oxford-OERC, Manchester-NGS2, RAL-LCG etc. (12,000+ CPUs). Link labels: MEG, gridftp.]
Job types
• Single Job
  • Normal: simple batch job
  • MPICH: parallel jobs
  • Interactive: output streamed back to the client
• Parametric: set of similar jobs whose JDL attributes are parameterised
• Collections: group of jobs without dependencies
• DAG: group of dependent jobs