Graphs for workflow

lobo

Presentation Transcript


  1. Graphs for workflow

  2. Genome Workflow
     Legend:
     • Analysis steps (blue rectangle)
     • Analysis subflow (orange rectangle with round corners)
     • Compile time Include/Exclude, for example: Molecular Weight, Calculate Protein Seq, Include: Tbrucei927, Lmajor, Linfantum, Lbraziliensis
     Subflows: Dots assemblies, NRDB, Psipred, PDB, BlastX, Blast NR PDB, BlastP, MapCandAssemblySeqsToGenome, InterProScan
     Steps:
     • Extract Genome Seq
     • Molecular Weight Min/Max
     • Isoelectric point
     • Extract Protein Seq
     • Find tandem repeats
     • Make ORF
     • Make Protein Seq for NCBI
     • filtverSequences
     • load tandem repeats
     • load ORF
     • run SignalP
     • formatncbiBlastFile
     • run TMHMM
     • Copy Genomic Sequence to Cluster
     • loadLowComplexitySeq
     • createEpitopeMapFiles
     • Load SignalP
     • Load TMHMM
     • Copy Protein Seq to Cluster
     • LoadEpitope
     • extractNaSeqAltDefLine
     • runSplign
     • loadSplignResults
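A workflow graph like this one can be represented as a dependency map and ordered with a topological sort. The sketch below is illustrative only: the step and organism names come from the slide, but the edges between steps, the `DEPENDS_ON` and `INCLUDE` tables, and the `plan` function are assumptions, since the transcript does not preserve the arrows of the diagram or say how the include/exclude filter is implemented.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical dependency map: step -> steps it depends on.
# The edges are assumed for illustration; the slide's arrows are not in the transcript.
DEPENDS_ON = {
    "Extract Genome Seq": set(),
    "Make ORF": {"Extract Genome Seq"},
    "load ORF": {"Make ORF"},
    "Extract Protein Seq": set(),
    "Molecular Weight": {"Extract Protein Seq"},
    "Isoelectric point": {"Extract Protein Seq"},
}

# Hypothetical compile-time filter: step -> organisms it is included for.
# Steps that are not listed here are included for every organism.
INCLUDE = {
    "Molecular Weight": {"Tbrucei927", "Lmajor", "Linfantum", "Lbraziliensis"},
}


def plan(organism):
    """Return the steps for one organism in a valid dependency order."""

    def included(step):
        allowed = INCLUDE.get(step)
        return allowed is None or organism in allowed

    # Drop excluded steps, and drop them from the dependency sets of the rest.
    graph = {
        step: {dep for dep in deps if included(dep)}
        for step, deps in DEPENDS_ON.items()
        if included(step)
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(plan("Lmajor"))
```

This sketch simply removes an excluded step from the graph; a real workflow would also have to decide what happens to steps that depend on an excluded one.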

  3. NRDB/PDB Sub-flows (NRDB, PDB)
     • Move download file
         • NR.gz
         • gi_taxid_prot.dmp.gz
     • Find ProteinXRefs
     • Load DbXrefs
     • Shorten defLine (NR)
     • Move download file Pdb.fsa
     • Copy nr.fsa to cluster
     • Rename files
         • nr.fsa->nr_shortDef.fsa
         • nr->nr.fsa
     • Copy pdb.fsa to cluster
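The two renames in the "Rename files" step reuse the name nr.fsa, so their order matters. A minimal sketch, assuming both files sit in the same directory; the function name `rename_nr_files` and the placeholder file contents are hypothetical, and the slide does not say what each file holds.

```python
import tempfile
from pathlib import Path


def rename_nr_files(directory):
    """Apply the two renames from the slide, in the listed order."""
    d = Path(directory)
    # Renaming nr.fsa first frees that name, so the second rename
    # cannot overwrite an existing file.
    (d / "nr.fsa").rename(d / "nr_shortDef.fsa")
    (d / "nr").rename(d / "nr.fsa")


# Demo in a throwaway directory; the file contents are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "nr.fsa").write_text("first\n")
    Path(tmp, "nr").write_text("second\n")
    rename_nr_files(tmp)
    after = {p.name: p.read_text() for p in sorted(Path(tmp).iterdir())}
```

After the demo runs, `after` maps each remaining file name to its content, which makes it easy to check that no file was lost in the swap.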

  4. Blast Sub-flows
     Legend: Optional step (runtime test)
     • Create Similarity Dir
     • Copy Similarity dir to cluster
     • Start Blast on cluster
     • Wait for cluster
     • Copy results from cluster
     • Rename file blastSimilarity.out.gz->blastSimilarity.unfiltered.out.gz
     • Filter BLAST Results (BlastX)
     • Extract Ids From BLAST Results (BlastX & BlastP)
     • Load NRDB Subset (BlastX & BlastP)
     • Load Protein Blast
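The copy, start, wait, copy-back sequence in this sub-flow recurs in the Psipred, InterProScan, and genome alignment sub-flows. One way to express it is a small driver that takes the four actions as callables. This is a sketch under assumptions: `run_on_cluster` and its parameters are hypothetical names, and the demo uses a fake cluster rather than any real scheduler or transfer tool.

```python
import time


def run_on_cluster(copy_to, start, is_done, copy_from, poll_seconds=60, sleep=time.sleep):
    """Copy inputs to the cluster, start the job, poll until done, copy results back."""
    copy_to()
    job = start()
    while not is_done(job):
        sleep(poll_seconds)  # "Wait for cluster"
    return copy_from(job)


# Demo with a fake cluster that reports completion on the third poll.
calls = []
polls = iter([False, False, True])


def fake_is_done(job):
    calls.append("poll")
    return next(polls)


result = run_on_cluster(
    copy_to=lambda: calls.append("copy_to"),
    start=lambda: calls.append("start") or "job-1",
    is_done=fake_is_done,
    copy_from=lambda job: calls.append("copy_from") or f"results of {job}",
    poll_seconds=0,
    sleep=lambda seconds: calls.append("sleep"),
)
```

Passing the actions in as callables keeps the waiting logic in one place, so each sub-flow only has to supply its own directories and commands.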

  5. Psipred Subflow
     • Create psipred Data Dir
     • Fix protein IDs for psipred
     • Create psipred Task Dir
     • Copy files to cluster
         • Data Dir
         • AnnotatedProteinPsipred.fsa
     • Start psipred on cluster
     • Wait for cluster
     • Copy psipred files from cluster
     • Fix psipred file names
     • Make Alg Inv
     • Load Secondary Structures

  6. InterProScan Subflow
     • Create Iprscan dir
     • Copy files to cluster
         • Iprscan Dir
     • Start Iprscan on cluster
     • Wait for cluster
     • Copy Iprscan files from cluster
     • Load Iprscan results

  7. mapCandAssemblySeqsToGenome Subflow
     • Make Candidate Assembly Seqs
     • Extract Candidate Assembly Seqs
     • Extract Genomic Seqs Into Separate Fasta Files
     • Create Genome dir for GfClient
     • Create Repeat Mask dir
     • Mirror to cluster
         • Genome Dir
         • Repeatmask dir
     • Start GenomeAlign on compute cluster
     • Wait for cluster
     • Copy file from cluster
         • Results of Genome alignment
         • Results of repeatmask
     • Update gus table with xmi
     • Load contig alignments
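The step "Extract Genomic Seqs Into Separate Fasta Files" amounts to splitting a multi-record FASTA file into one file per sequence. A minimal sketch, assuming every header line carries an identifier that is safe to use as a file name; `split_fasta`, the `.fa` suffix, and the sample records are hypothetical and not taken from the workflow itself.

```python
import tempfile
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own <id>.fa file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    try:
        with open(fasta_path) as src:
            for line in src:
                if line.startswith(">"):
                    if out is not None:
                        out.close()
                    # The identifier is the first word after ">".
                    seq_id = line[1:].split()[0]
                    path = out_dir / f"{seq_id}.fa"
                    out = open(path, "w")
                    written.append(path)
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


# Demo with two made-up records in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "genome.fsa")
    source.write_text(">chr1 example\nACGT\nTT\n>chr2\nGG\n")
    files = split_fasta(source, Path(tmp, "split"))
    names = [p.name for p in files]
    first_record = files[0].read_text()
```

Lines that appear before the first header are ignored, and two records with the same identifier would overwrite each other, so a production version would need to handle both cases.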

  8. Dots Assemblies Subflow
     • clusterMultiEstSoursesByAlign
     • getNotAlignedEstAndAddOneCluster
     • splitCluster
     • AssembleTranscripts
     • extractAssembles
     • Create Genome dir for GfClient
     • Create Repeat Mask dir
     • Copy files to cluster
         • Genome Dir
         • Repeatmask dir
     • Start Genome Align on compute cluster
     • Wait for cluster
     • Copy file from cluster
         • Results of Genome alignment
         • Results of repeatmask
     • Load contig alignments
     • updateAssemblySourceId
