AHM 2002 Tutorial on Scientific Data Mediation Example 1
SCENARIO WORKFLOW
• Clusfavor (http://mbcr.bcm.tmc.edu/genepi/) -> wrap1 -> Accession Number : wrap2.xml
• pwrap1 -> NCBI: GenBank (http://www.ncbi.nlm.nih.gov/) -> wrap2 -> Sequence to search : wrap3.xml
• pwrap2 -> BLAST (http://www.ncbi.nlm.nih.gov/blast/) -> wrap3 -> The top match : wrap4.xml
• pwrap3 -> MatInspector (http://transfac.gbf.de/cgi-bin/matSearch/matsearch.pl) -> wrap4 -> Resulting sequences & similarity scores
• The results then go to one of: an external program to build a model, or back to BLAST to find additional matches, or to Clustal to determine a consensus sequence, which is then sent to BLAST.
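The chaining in the workflow above, where each source's wrapped output becomes the next source's input, can be sketched in Python. This is only an illustration under stated assumptions: the stage functions, the `Record` dictionary, and every value they return are hypothetical placeholders, not the tutorial's actual wrappers or real data.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Stage = Callable[[Record], Record]


def run_pipeline(stages: List[Stage], record: Record) -> Record:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        record = stage(record)
    return record


# Hypothetical stand-ins for the four sources; all values are placeholders.
def clusfavor_stage(record: Record) -> Record:
    # A real stage would cluster the expression data and pick a fragment.
    return {**record, "accession": "PLACEHOLDER_ACCESSION"}


def genbank_stage(record: Record) -> Record:
    # A real stage would look up record["accession"] in GenBank.
    return {**record, "sequence": "ACGTACGT"}


def blast_stage(record: Record) -> Record:
    # A real stage would submit record["sequence"] and keep the top match.
    return {**record, "top_match": record["sequence"]}


def matinspector_stage(record: Record) -> Record:
    # A real stage would search record["top_match"] for binding sites.
    return {**record, "sites": "PLACEHOLDER_SITES"}


result = run_pipeline(
    [clusfavor_stage, genbank_stage, blast_stage, matinspector_stage],
    {"expression_data": "PLACEHOLDER_INPUT"},
)
print(sorted(result))
```

Keeping every stage to the same "record in, record out" shape is one way to make the final branching step (model building, back to BLAST, or Clustal) a matter of appending a different stage.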
CLUSFAVOR
• CLUSFAVOR: CLUSter and Factor Analysis with Varimax Orthogonal Rotation
• A standalone program whose output consists of several clusters of named sequences that have similar expression characteristics in the current experiment.
• GOAL: Given gene expression data, arrive at a set of related sequences from which to build a model.
• INPUT: gene expression data
• OUTPUT: collection of clustered cDNA fragments
NCBI GenBank
• GOAL: Given the name (or, better, the accession number) of a cDNA fragment from the Clusfavor results, look it up in GenBank to obtain the cDNA sequence.
• INPUT: The accession number or the name of a cDNA fragment
• OUTPUT: The cDNA sequence for that fragment
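As an illustration of the accession-number lookup, the sketch below builds a request URL for NCBI's E-utilities `efetch` service and parses a FASTA reply. This is an assumption swapped in for clarity: the tutorial's wrapper worked against the NCBI web pages, and E-utilities is a present-day programmatic interface. The helper names and the sample FASTA text are hypothetical, and no network request is made.

```python
from typing import Tuple
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str) -> str:
    """Build an efetch URL asking for a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text: str) -> Tuple[str, str]:
    """Split a single FASTA record into (header, sequence)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:])


# Placeholder reply, invented for illustration only.
sample_reply = ">PLACEHOLDER_ACCESSION hypothetical cDNA\nACGTAC\nGTTGCA\n"
header, sequence = parse_fasta(sample_reply)
print(build_efetch_url("PLACEHOLDER_ACCESSION"))
print(header, sequence)
```

The returned `sequence` string is what the next step of the workflow would submit as the "sequence to search".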
BLAST
• BLAST: Basic Local Alignment Search Tool
• A set of similarity search programs designed to explore all of the available sequence databases, regardless of whether the query is protein or DNA.
• INPUT: The cDNA sequence returned by GenBank.
• OUTPUT: A set of similar sequences.
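Picking "the top match" out of a set of similar sequences can be sketched as follows. The sketch assumes the hits are available in BLAST's 12-column tabular output (a format of the command-line BLAST+ tools, not necessarily what the tutorial's wrapper parsed) and ranks them by bit score. The sample rows are invented placeholders, not real search results.

```python
from typing import Dict, List

# Column names of BLAST's standard 12-column tabular output.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_tabular(text: str) -> List[Dict[str, str]]:
    """Turn tab-separated hit lines into dictionaries keyed by column name."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError("expected 12 tab-separated fields: " + line)
        hits.append(dict(zip(COLUMNS, fields)))
    return hits


def top_match(hits: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the hit with the highest bit score."""
    return max(hits, key=lambda hit: float(hit["bitscore"]))


# Placeholder rows, invented for illustration only.
SAMPLE = "\n".join([
    "query1\tsubjA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-100\t360",
    "query1\tsubjB\t90.0\t180\t18\t0\t5\t184\t1\t180\t1e-60\t230",
])
best = top_match(parse_tabular(SAMPLE))
print(best["sseqid"], best["bitscore"])
```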
MatInspector V2.2 based on TRANSFAC
• MatInspector: Matrix Inspector
• TRANSFAC: The Transcription Factor Database
• Searches for potential transcription factor binding sites in user-supplied sequences and detects consensus matches in nucleotide sequence data using the TRANSFAC 4.0 matrices.
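The general idea of searching a sequence with a matrix can be illustrated with a simplified scan. This is not MatInspector's actual scoring scheme, and the count matrix below is a hypothetical 4-position example rather than a TRANSFAC matrix. Each window of the sequence is scored by summing the matrix counts for its bases and dividing by the best achievable sum, so a score of 1.0 means the window matches the consensus exactly.

```python
from typing import Dict, List, Tuple

# Hypothetical count matrix (one dict per position); consensus is ACGT.
MATRIX: List[Dict[str, int]] = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def window_score(matrix: List[Dict[str, int]], window: str) -> float:
    """Score a window as (sum of counts) / (best possible sum), in [0, 1]."""
    best = sum(max(column.values()) for column in matrix)
    got = sum(column.get(base, 0) for column, base in zip(matrix, window))
    return got / best


def scan(
    matrix: List[Dict[str, int]], sequence: str, threshold: float = 0.8
) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at least threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = window_score(matrix, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


print(scan(MATRIX, "TTACGTAA"))  # → [(2, 1.0)]
```

Lowering `threshold` admits weaker candidate sites, which is the usual trade-off between sensitivity and false positives in this kind of scan.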
GENBANK MEDIATION DEMO