Introduction to Bioinformatics
Lecturer: Dr. Yael Mandel-Gutfreund
Teaching Assistant: Martin Akerman
http://biology.technion.ac.il/courses/common/index.asp?CourseID=236523
Course Structure and Requirements
1. Class structure
• 2-hour lecture
• 1-hour tutorial
2. Homework
• Homework projects will be given every second week.
• Homework will be done in pairs.
• All five homework projects must be submitted (5/5).
• A final project will be conducted and submitted in pairs.
Grading
• 30% homework assignments
• 70% final project
Literature list
• Gibas, C., Jambeck, P. Developing Bioinformatics Computer Skills. O'Reilly, 2001.
• Lesk, A. M. Introduction to Bioinformatics. Oxford University Press, 2002.
• Mount, D. W. Bioinformatics: Sequence and Genome Analysis. 2nd ed., Cold Spring Harbor Laboratory Press, 2004.
Advanced reading
• Jones, N. C., Pevzner, P. A. An Introduction to Bioinformatics Algorithms. MIT Press, 2004.
Bioinformatics
• An approach to mining knowledge from biological data.
• A collection of methods that support biological research in the lab.
Example: the same motif (uppercase) is conserved in four species.
Human  catcgtagCTAGACTacgc
Mouse  ctagctgaCTAGACTatcg
Dog    tacctatcCTAGACTcgac
Horse  acctactcCTAGACTcgaa
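
The four example sequences share an uppercase stretch, and finding such a stretch computationally is a typical small bioinformatics task. The sketch below is a brute-force illustration only, not a method taken from the course: it searches for the longest substring present in every sequence. The function name `longest_common_substring` is hypothetical, and real motif discovery would use alignment tools rather than exact matching.

```python
def longest_common_substring(seqs):
    """Return the longest substring present in every sequence (case-insensitive)."""
    seqs = [s.upper() for s in seqs]
    shortest = min(seqs, key=len)
    # Try candidate lengths from longest to shortest, so the first hit is the longest.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Sequences copied from the slide.
sequences = {
    "Human": "catcgtagCTAGACTacgc",
    "Mouse": "ctagctgaCTAGACTatcg",
    "Dog": "tacctatcCTAGACTcgac",
    "Horse": "acctactcCTAGACTcgaa",
}

motif = longest_common_substring(list(sequences.values()))
print(motif)  # CTAGACT
```

Exact matching is enough here because the slide's motif is identical in all four species; sequences with mismatches or gaps would need an alignment-based approach.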
Tutorial 1: Biological Databases
• DNA, RNA & protein sequences
• RNA & protein structure
• Gene expression
• Protein localization
• Mutations
• Similarity between species
• Species-specific databases
• Literature
• Experimental support
http://www.ncbi.nlm.nih.gov/
http://www.genome.ucsc.edu/
Biological Sequences: RefSeq
A comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products.
• Genomic sequences.
• Known mRNAs.
• Predicted mRNAs:
  - Putative genes (homologous to known genes).
  - Orphan genes (look like ORFs but have no homologues).
• Known proteins.
• Predicted proteins (putative & orphan).
RefSeq is comprehensive: it covers a wide variety of sequences. How is each kind of sequence identified? By its accession number.
RefSeq
Accession number: a unique identifier given to a sequence.
Complete table: http://www.ncbi.nlm.nih.gov/RefSeq/key.html
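
A RefSeq accession number begins with a two-letter prefix and an underscore that tell what kind of sequence it identifies. The sketch below is illustrative and covers only a subset of prefixes; the complete table is at the URL above. The function name `classify_accession` is hypothetical, and the example accession strings are used only for their prefixes.

```python
# A subset of RefSeq accession prefixes; see the complete table for the rest.
REFSEQ_PREFIXES = {
    "NC_": "complete genomic molecule",
    "NG_": "genomic region",
    "NM_": "mRNA (known)",
    "NR_": "non-coding RNA (known)",
    "NP_": "protein (known)",
    "XM_": "mRNA (predicted)",
    "XR_": "non-coding RNA (predicted)",
    "XP_": "protein (predicted)",
}


def classify_accession(accession):
    """Classify an accession by its three-character prefix (e.g. 'NM_')."""
    prefix = accession[:3].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown prefix")


# Illustrative strings only; the function looks at nothing but the prefix.
for acc in ("NM_000546", "NP_000537", "XM_123456", "AB123456"):
    print(acc, "->", classify_accession(acc))
```

The N/X first letter separates known records from predicted ones, which mirrors the known/predicted split in the RefSeq content list.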
ENTREZ
RefSeq is integrated: it is linked to other databases through ENTREZ, an NCBI interface that connects different databases.
• OMIM (genetic disorders)
• PubMed (literature)
• RefSeq
• PDB (protein structure)
• Swiss-Prot (protein sequences)
• GenBank (genomic data)
• GEO (gene expression)
ENTREZ: http://www.ncbi.nlm.nih.gov/Entrez/
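
Besides the web interface, NCBI exposes Entrez programmatically through its E-utilities service. The sketch below only builds ESearch query URLs and sends nothing over the network. The database identifiers (`pubmed`, `nucleotide`, `protein`, `omim`) and the search term `TP53` are assumptions chosen for illustration and should be checked against the E-utilities documentation; `build_esearch_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Base address of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=5):
    """Build (but do not send) an ESearch URL for one Entrez database."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_ESEARCH}?{query}"


# The same term queried against several linked databases.
urls = {
    db: build_esearch_url(db, "TP53")
    for db in ("pubmed", "nucleotide", "protein", "omim")
}
for db, url in urls.items():
    print(db, url)
```

Using one term across several databases reflects the idea of Entrez as a single entry point to linked resources.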
Integrated database: Entrez links experimental support, literature, disease, sequences, protein structure, gene expression, and similarity between species.
Finally, RefSeq is non-redundant: each sequence is represented only once.
But what is redundancy in biological databases?
• Are two alleles of the same locus redundant?
• Are the same loci in two closely related organisms redundant?
• Are two gene copies redundant?
It depends on the kind of database. In RefSeq:
• Two alleles of the same locus are considered redundant.
• The same locus in two closely related organisms is not redundant.
• Two gene copies are not redundant.
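
The three redundancy rules above can be expressed as a choice of key: records that share a key are treated as redundant. The sketch below is a toy model under that assumption, not a description of how RefSeq is actually curated. The `Record` fields and the gene names are hypothetical.

```python
from collections import namedtuple

# Hypothetical record layout: one entry per submitted sequence.
Record = namedtuple("Record", "organism locus allele")


def non_redundant(records):
    """Keep the first record for each (organism, locus) pair.

    Alleles of one locus share a key and collapse into one entry;
    different organisms or different gene copies get separate keys.
    """
    seen = {}
    for rec in records:
        seen.setdefault((rec.organism, rec.locus), rec)
    return list(seen.values())


records = [
    Record("human", "geneA", "allele1"),
    Record("human", "geneA", "allele2"),        # second allele, same locus: redundant
    Record("mouse", "geneA", "allele1"),        # same locus, related organism: kept
    Record("human", "geneA_copy2", "allele1"),  # second gene copy: kept
]

kept = non_redundant(records)
print(len(kept))  # 3
```

A database with a different notion of redundancy would simply use a different key, for example one that includes the allele.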
Genome Browser
• A bioinformatics navigator that gathers information from various sources.
• It enables visualization of a large amount of information at the same time.
Chromosome Coordinates
[Figure labels: chromosome position, mRNAs, evolutionary conservation, 5’ UTR, ORF…]
Display options (from most to least detail): Full > Pack > Squish > Dense > Hide