Basics of bioinformatics
Vítězslav Kříž, vkriz@med.muni.cz
Department of Biology, Medical Faculty, MU
Bioinformatics: definition
Bioinformatics draws on applied mathematics, informatics, statistics and biology to solve formal and practical problems arising from the management and analysis of biological data.
The main efforts in bioinformatics
• Gene finding
• Sequence alignment
• Genome assembly
• Gene expression, microarrays
• Protein structure alignment
• Modeling of evolution
Gene finding
Example: Breast cancer
From a known phenotype (disease) to gene identification
http://www.nature.com/jid/journal/v116/n3/fig_tab/5601010f1.html?url=/jid/journal/v116/n3/full/5601010a.html
http://www.ncbi.nlm.nih.gov/sites/entrez?db=OMIM&itool=toolbar
Database: OMIM (Online Mendelian Inheritance in Man)
• Name of the gene
• Locus
• Protein function
• Frequent mutations
What has been published?
PubMed (MEDLINE)
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
5890: the number of publications that include the word "brca1"
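A count like this can also be requested programmatically. The sketch below only builds the query URL and does not send it. It assumes the NCBI E-utilities ESearch endpoint and its `db`, `term` and `rettype=count` parameters, none of which appear on the slides; the helper name `pubmed_count_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service (an assumption;
# the slides only show the PubMed web interface).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term: str) -> str:
    """Build (but do not send) a URL asking PubMed how many records match `term`."""
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


print(pubmed_count_url("brca1"))
```

The number returned today will differ from the 5890 quoted on the slide, because the database keeps growing.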
30 citations
PubMed: bibliographic database
MeSH: topic specification
How do normal and tumour tissue differ?
• Gene expression comparison
• Humans have around 30 000 genes
• Genes are grouped according to their function
https://courses.stu.qmul.ac.uk/SMD/kb/pathology/funmedpics/pathtes2.htm
Microarrays: RNA expression
• Chips are commercially available
• Covalently bound ssDNA
• Each spot carries cDNA from one gene
• One chip covers more than 10 000 genes
Microarrays: a huge amount of data is processed
Functional groups of genes:
• Signalling
• Transcription factors
• Extracellular matrix
• DNA and RNA metabolism
• Proteolytic degradation
http://bloodjournal.hematologylibrary.org/cgi/content/full/103/3/868/TBL1
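A comparison of normal and tumour tissue is often summarized per gene as a log2 ratio of the two signals. The following is a minimal sketch of that idea, not the processing pipeline behind the slides: the gene names, intensities, pseudocount and threshold are all invented for illustration.

```python
import math


def log2_fold_change(tumour: float, normal: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of tumour to normal signal; the pseudocount avoids division by zero."""
    return math.log2((tumour + pseudocount) / (normal + pseudocount))


def classify(lfc: float, threshold: float = 1.0) -> str:
    """Label a gene by its log2 fold change (the threshold is an arbitrary choice)."""
    if lfc >= threshold:
        return "up in tumour"
    if lfc <= -threshold:
        return "down in tumour"
    return "unchanged"


# Hypothetical spot intensities: gene -> (tumour, normal)
intensities = {
    "gene_A": (799.0, 99.0),
    "gene_B": (99.0, 399.0),
    "gene_C": (150.0, 150.0),
}

for gene, (tumour, normal) in intensities.items():
    lfc = log2_fold_change(tumour, normal)
    print(gene, round(lfc, 2), classify(lfc))
```

With more than 10 000 genes per chip, real analyses also need normalization and statistical testing, which this sketch leaves out.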
Structural bioinformatics
• What is the protein function?
• What is the protein structure?
• Protein visualization
• Protein classification
• New drug design
• Building PDB databases: biopolymers in their 3D structure
X-ray crystallography
Nuclear magnetic resonance
http://www.physiologie.uni-freiburg.de/nmr_spectroscopy.html
http://www.rcsb.org/pdb/home/home.do
3D protein structure determination
• The central parts of the proteins are identical in only 28-35% of their sequence
• 3D structure is evolutionarily better conserved than the sequence
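An identity figure of this kind is computed from two already aligned sequences. The sketch below is one common convention, assumed here: columns containing a gap ("-") are skipped and the rest are compared position by position. The two sequences are invented and are not the proteins shown on the slide.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical residues over the gap-free columns of an alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned protein fragments
print(round(percent_identity("MKT-AYLV", "MRTGAHLV"), 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values from different tools are not always directly comparable.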
The PDB database provides resources for studying the structure of biopolymers and their relationship to sequence and function
Similarities and differences in the primary sequence
• Identification of mutations
• Genomic alignment: What makes us different from other animals?
• Homolog identification: Is the gene also present in other organisms?
• Construction of an evolutionary tree
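Distance-based tree construction starts from pairwise differences between aligned sequences. The following sketch computes the proportion of differing sites (the p-distance) for every pair and picks the closest pair, which is the first grouping a simple clustering method would make. The species labels and sequences are invented, and no complete tree-building algorithm is implemented.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of positions at which two aligned sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal in length and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical aligned sequences
aligned = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTAA",
    "species_C": "ACCTAGGTTA",
}

# Distance for every unordered pair of species
distances = {
    pair: p_distance(aligned[pair[0]], aligned[pair[1]])
    for pair in combinations(aligned, 2)
}

# The pair with the smallest distance would be joined first
closest = min(distances, key=distances.get)
print(closest, distances[closest])
```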
To find a gene homolog, we have to know the gene structure (exon, intron, promoter etc.) and the gene sequence
• GenBank-Entrez nucleotide
• http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=nucleotide&cmd=search&term
• Ensembl Genome Browser
• http://www.ensembl.org/index.html
GenBank-Entrez nucleotide
• Publicly available sequences of nucleic acids and proteins (primary structure)
• Prokaryotes and eukaryotes
• Exons and introns
Ensembl Genome Browser
• Up-to-date annotation of certain eukaryotic genomes (human, mouse, chicken, fugu, drosophila, yeast...)
• Primary sequence of proteins and genes
• Exons, introns, localization on the chromosome
• Orthologs
Sequence alignment (Zvelebil, 2006)
• Searching for similarities between a query sequence (primary structure) and sequences published in publicly available databases (polynucleotides, peptides)
• Ortholog identification
• Identification of evolutionarily conserved domains
Global alignment - an attempt to align every residue in every sequence
• Example: ClustalW http://www.ebi.ac.uk/clustalw/index.html
Local alignment - an attempt to find similar parts of the sequences; it does not necessarily find the best alignment of the entire sequences
• Example: NCBI Blast Search http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi
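The difference between global and local alignment can be illustrated with the textbook dynamic-programming recurrences (Needleman-Wunsch for global, Smith-Waterman for local). This is a sketch under simple assumptions: a linear gap penalty and match/mismatch scores chosen arbitrarily. It returns only the score, and it is not how ClustalW or BLAST work internally, since both rely on faster heuristic strategies.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best pairwise alignment score: global by default, local if `local` is True."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)
    # Global: score of aligning both full sequences; local: best cell anywhere.
    return best if local else score[-1][-1]


# Hypothetical sequences sharing only the core "ACGT"
a, b = "GGGACGTCCC", "TTTACGTAAA"
print("global:", alignment_score(a, b))
print("local: ", alignment_score(a, b, local=True))
```

For these two sequences the global score is dragged down by the unrelated flanks, while the local score reflects only the shared core. This is why local search is the usual choice when looking for conserved domains inside otherwise dissimilar sequences.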
FASTA (format)
Sequences can be imported from other databases
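A FASTA record is a header line starting with ">" followed by one or more lines of sequence. The minimal reader below assumes that the first word of the header serves as the record identifier; the records in the example are invented.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each record identifier to its sequence, joining wrapped sequence lines."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Identifier = first word after ">" (empty string if the header is bare).
            words = line[1:].split()
            current = words[0] if words else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical two-record FASTA text
example = """>seq1 hypothetical nucleotide example
ACGT
TTGA
>seq2
MKV
"""
print(parse_fasta(example))
```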
BLAST search makes it possible to align polynucleotide or protein sequences
Access to the databases of more than 100 eukaryotes
Evolutionary biology: Homolog / analog
• homolog – same origin, but the function can be different
• analog – similar function, different origin
Evolutionary biology: Ortholog / paralog
• ortholog – acquired by speciation, the function is probably the same
• paralog – acquired by gene duplication, the function does not need to be the same
• ohnolog – acquired by whole genome duplication
• xenolog – acquired by lateral gene transfer
http://www.stanford.edu/group/pandegroup/folding/education/orthologs3.gif
Animal models
• Animal studies
• Mouse Genome Informatics (MGI)
• http://www.informatics.jax.org/
Mouse models for human diseases
• Structure of murine genes
• Location of mouse and human orthologs
• Mouse strains and polymorphisms