
Lecture #4 : Comparing genes

Lecture #4 : Comparing genes. 9/14/09. This week. Homework #2 due on Wed Email with questions Email me answers or hand in in class Wed - I will be at Dept of Biology retreat Lecture will be given by Kelly O’Quin - expert in phylogenetics


Presentation Transcript


  1. Lecture #4 : Comparing genes 9/14/09

  2. This week • Homework #2 due on Wed • Email with questions • Email me answers or hand in in class • Wed - I will be at Dept of Biology retreat • Lecture will be given by Kelly O’Quin - expert in phylogenetics • He will go over homework so it must be done before class

  3. Questions for today 0. More BLAST • Where do we get high quality gene sequences? • How do genes evolve? • How do we compare genes?

  4. How to find genes • Start with genes which are known from model organisms • Use these to pull out genes from genomes • Compare genes to learn about sensory evolution

  5. Blast - Genbank • What database do you want to search? • What do you want to compare? • What program do you want to do the searching?

  6. Types of blast queries

  7. Defaults • Database • Program • Confirm

  8. Nucleotide BLAST = DNA nucleotide query vs nucleotide database

  9. Choices for programs • Megablast: highly similar sequences (>95%), word length 28 • Discontiguous megablast: pretty similar seqs, word length 11 • Blastn: dissimilar seqs, word length 11
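
To see why word length matters, here is a minimal sketch of the seeding idea, assuming that a nucleotide search starts from short exact word matches between query and subject. The function names and the two 12-base sequences are illustrative only; this is not NCBI's implementation.

```python
def words(seq, w):
    """Return every overlapping word (substring) of length w in seq."""
    return [seq[i:i + w] for i in range(len(seq) - w + 1)]


def shared_words(query, subject, w):
    """Words of length w that occur exactly in both sequences."""
    return set(words(query, w)) & set(words(subject, w))


# Two illustrative sequences that differ at a single base.
query = "ATGCCGTGACGT"
subject = "ATGCCTTGACGT"

for w in (4, 11):
    print(w, sorted(shared_words(query, subject, w)))
```

With a word length of 4 the two sequences still share several exact words, but with a word length of 11 they share none, because every 11-base window spans the mismatch. Longer words are faster but only suit very similar sequences.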

  10. Translated blast = protein query vs translated database

  11. BLAST a genome Request ID AWJ4D4B7012

  12. BLASTing is fun • This is meant to be enjoyable • Be a genome explorer • Find out what kind of data is out there • Find out what kind of data isn’t there • QUESTIONS?????

  13. Q1. • There is so much data in Genbank. How do you find GOOD data? • Example • Bovine rhodopsin - 1st G protein coupled receptor to be sequenced • Search Genbank with text • 49 entries

  14. Bovine opsin

  15. Bovine rhodopsin

  16. Searching for genes • Searching by text is fraught with peril • Genbank has too many links • Pull up many things that are not what you want • BLAST is better approach • NCBI has also made records which combine all similar sequences into one

  17. NCBI has done some of the work • They have hand-curated data for some species to make a set of reference sequences (RefSeq) • Nucleotide sequences - NM_xxxxxx • Protein sequences - NP_xxxxxx • For human rhodopsin • NM_000539 • NP_000530 • These are the gold standard for sequences
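
As a small illustration of the accession convention, the sketch below sorts an accession into mRNA or protein by its prefix. It assumes the RefSeq form of a two-letter prefix, an underscore, digits, and an optional version suffix, and it only handles the two prefixes named on this slide; `refseq_type` is a hypothetical helper, not an NCBI tool.

```python
import re

# Only the two prefixes mentioned on the slide are handled here.
REFSEQ_TYPES = {"NM": "mRNA (nucleotide)", "NP": "protein"}

# Prefix, underscore, digits, optional ".version" (for example NM_000539.3).
_PATTERN = re.compile(r"(NM|NP)_(\d+)(?:\.(\d+))?")


def refseq_type(accession):
    """Return the record type for an NM_/NP_ accession, or None otherwise."""
    match = _PATTERN.fullmatch(accession.strip())
    return REFSEQ_TYPES[match.group(1)] if match else None


for acc in ("NM_000539", "NP_000530"):
    print(acc, "->", refseq_type(acc))
```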

  18. Homologene

  19. Homologs • Two genes which arise in the common ancestor of two organisms and are passed down • Implies genes perform same function in two organisms • Therefore they can be compared to learn about evolution

  20. These 4 primates have many genes which are homologs and have been passed down from the primate ancestor • Human • Chimp • Macaque • Bushbaby

  21. Homologene search for rhodopsin

  22. Homologene

  23. Three primary sequence portals: 1. NCBI

  24. 3. DNA Data Bank of Japan (DDBJ)

  25. 2. Ensembl - European Bioinformatics Institute (EBI)

  26. Select just genes

  27. Scroll down to find the gene you want

  28. Location • Orthologues are predicted and linked • Links to transcript and protein

  29. OMIM - Online Mendelian Inheritance in Man

  30. Good places to find genes • Model organisms: NCBI homologene • Genes from models and other organisms: Sanger Ensembl gene families • NOTE: These are often predicted from genome sequences • If there is a sequence in NCBI homologene, it may be different (and more accurate) than Sanger predictions • OMIM is a good reference

  31. Q2. How do genes change through time? • Change in actual sequence • Mutation • Recombination • Change in frequency of a sequence • Selection - “survive” better • Drift - get passed on by chance • Migration - move between populations
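
Drift, a variant "getting passed on by chance", can be sketched with a simple resampling model. This is a Wright-Fisher-style toy that the slides do not name: each generation, every one of `pop_size` gene copies is drawn at random from the previous generation's frequency. The population size, number of generations, and seed are arbitrary assumptions.

```python
import random


def drift(p0, pop_size, generations, seed=0):
    """Track a variant's frequency under pure chance (no selection)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each copy in the next generation carries the variant with probability p.
        copies = sum(rng.random() < p for _ in range(pop_size))
        p = copies / pop_size
        trajectory.append(p)
    return trajectory


print(drift(p0=0.5, pop_size=20, generations=30))
```

In small populations the frequency wanders widely and can reach 0 or 1, after which it stays there, since no mutation or migration is modeled.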

  32. Mutation vs selection • Mutation = sequence change • ATGCCGTGACGT • ATGCCTTGACGT • Selection/drift/migration = sequence frequency changes across a number of individuals • ATGTG ATGTG ATGTG ATGTG ATGTG ATGTG • ATGTG ATGTG ATGTG ATGTG ATGTG ATGTT • ATGTG ATGTG ATGTG  ATGTT ATGTT ATGTT • ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT • ATGTT ATGTG ATGTG ATGTT ATGTT ATGTT
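
The two ideas on this slide can be checked directly with the slide's own sequences. The sketch below first locates the mutated site, then computes the frequency of the ATGTT variant in each of the five population snapshots; the function names are illustrative.

```python
before = "ATGCCGTGACGT"
after = "ATGCCTTGACGT"


def differences(a, b):
    """List (position, base in a, base in b) for each mismatch, 0-based."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# The five population snapshots from the slide, six individuals each.
snapshots = [
    "ATGTG ATGTG ATGTG ATGTG ATGTG ATGTG",
    "ATGTG ATGTG ATGTG ATGTG ATGTG ATGTT",
    "ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT",
    "ATGTG ATGTG ATGTG ATGTT ATGTT ATGTT",
    "ATGTT ATGTG ATGTG ATGTT ATGTT ATGTT",
]


def frequency(snapshot, variant):
    """Fraction of individuals in a snapshot that carry the variant."""
    individuals = snapshot.split()
    return individuals.count(variant) / len(individuals)


print(differences(before, after))
print([round(frequency(s, "ATGTT"), 2) for s in snapshots])
```

The mutation is a single G to T change, while the snapshots show the same variant rising in frequency with no new sequence change.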

  33. Evolution as tinkerer • Changes are typically small • Mutation is source of new sequence • Not all mutations are created equal • Some occur more often than others • Other forces shift frequency of particular sequence

  34. Triplet amino acid code

  35. Mutation causes nucleotide change • What about AA sequence? • Synonymous change • Syn = same • AA stays same • Nonsynonymous change • Not same • AA changes
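
A short sketch of how a single-codon change can be classified, assuming the standard genetic code. The table is built from the conventional compact string of 64 amino acids with bases ordered T, C, A, G; `classify` is a hypothetical helper name.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(codon_before, codon_after):
    """Label a codon change as synonymous or nonsynonymous."""
    same = CODON_TABLE[codon_before] == CODON_TABLE[codon_after]
    return "synonymous" if same else "nonsynonymous"


# CCG and CCT both encode proline; GAT (Asp) and GAA (Glu) differ.
print(classify("CCG", "CCT"))
print(classify("GAT", "GAA"))
```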

  36. Amino acid code

  37. Amino acid (AA) types • Non-polar: A, F, G, I, L, M, P, V, W • Polar: N, Q, S, T, Y • Charged, +: H, K, R • Charged, -: D, E • Other: C • Often changing AA within a group does not affect protein function
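
The grouping on this slide can be turned into a quick lookup for asking whether a substitution stays within a group. The sketch uses exactly the groups listed above; the names `AA_GROUPS` and `same_group` are illustrative.

```python
# Groups exactly as listed on the slide (one-letter amino acid codes).
AA_GROUPS = {
    "non-polar": "AFGILMPVW",
    "polar": "NQSTY",
    "charged +": "HKR",
    "charged -": "DE",
    "other": "C",
}

# Invert the mapping: amino acid -> group name.
GROUP_OF = {aa: group for group, members in AA_GROUPS.items() for aa in members}


def same_group(aa1, aa2):
    """True if both amino acids fall in the same group."""
    return GROUP_OF[aa1] == GROUP_OF[aa2]


print(same_group("V", "I"))  # both non-polar
print(same_group("D", "K"))  # negative vs positive charge
```

A within-group change such as V to I is the kind that often leaves protein function intact, whereas D to K swaps the charge.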

  38. Selection • Stabilizing selection - acts to keep protein function the same • Synonymous changes are more frequent than nonsynonymous • Amino acid changes within a group are much more common than changes between groups • Non-polar → non-polar • Polar → polar

  39. Similarity matrix A = alanine C = cysteine D = aspartic acid E = glutamic acid F = phenylalanine G = glycine H = histidine

  40. Comparing sequences • Can do at either nucleotide or AA level • Gather sequences from a bunch of different organisms • Need to align them so that sites which perform the same function can be compared

  41. Aligning sequences • Sequences may differ in length • Often have differences at amino- or carboxy- terminus of the protein • Need a way to align parts of protein that are performing the same function

  42. Example - RH2 opsin in fishes • Goldfish MNGTEGNNFYVPLSNR • Medaka MENGTEGKNFYIPMNNR • Zebrafish MNGTEGSNFYIPMSNR • Killifish MGYGPNGTEGNNFYIPMSNK • Trout MQNGTEGSNFYIPMSNR • Halibut MVWDGGIEPNGTEGKNFYIPMSNR • Cod MRMEANGTEGKNFYIPMSNR • Tetraodon MVWDGGIEPNGTEGKNFYIPMSNR
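
In these eight fragments all of the length variation sits at the amino terminus, so a deliberately simple shortcut lines them up: pad each sequence on the left with gap characters to a common width. This is only a sketch that works for this particular example. Real alignment programs place gaps by scoring, not by right-justifying, and the `identity` helper is an illustrative assumption.

```python
rh2 = {
    "Goldfish": "MNGTEGNNFYVPLSNR",
    "Medaka": "MENGTEGKNFYIPMNNR",
    "Zebrafish": "MNGTEGSNFYIPMSNR",
    "Killifish": "MGYGPNGTEGNNFYIPMSNK",
    "Trout": "MQNGTEGSNFYIPMSNR",
    "Halibut": "MVWDGGIEPNGTEGKNFYIPMSNR",
    "Cod": "MRMEANGTEGKNFYIPMSNR",
    "Tetraodon": "MVWDGGIEPNGTEGKNFYIPMSNR",
}

# Right-justify: gaps ("-") fill the missing amino-terminal positions.
width = max(len(seq) for seq in rh2.values())
aligned = {name: seq.rjust(width, "-") for name, seq in rh2.items()}


def identity(a, b):
    """Fraction of identical residues over columns where neither has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x == y for x, y in pairs) / len(pairs)


for name, seq in aligned.items():
    print(f"{name:<10}{seq}")
print(identity(aligned["Goldfish"], aligned["Zebrafish"]))
```

After padding, the shared NGTEG motif falls in the same columns for every species, so each column can be compared site by site.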
