Biology/Computer Science 251: Introduction to Bioinformatics What is bioinformatics? Definition: “Bioinformatics is nothing but good, sound, regular biology appropriately dressed so that it can fit into a computer” -- Claverie & Notredame, Bioinformatics for Dummies
Class Web Site
• http://cs.gettysburg.edu/~leinbach/Bio_CS251/
• This site will contain all important documents related to the class:
• PowerPoint slides of all lectures
• Labs
• Exam answer keys (posted after the exam)
• Homework assignments and answer keys
• Note the updated syllabus posted at this site. It supersedes the one distributed on paper.
>DNA [250000, 270000]: Aspergillus nidulans contig 1.29 ACTGTCTTTTTTCCAACCCCCTCCGGTGCTCCTCCCATCTCTTTTTGGTC CAGAACCATTCATCCGGCTTGAACTCCAAACCCCACTCACGAGCTAGTTC TTCGATATACGGCCCCGCGAGCTGCGAATGCATGCTCGCATTTCGGCTTC CATCTCCACCCTTCGGATCCCCATTCTCATCCTGCCCAGTGCGGGTAAGA ACTTCCCTAGCTAATTGTATAGGATGCCGCACGTTTCTGATACCGATGAC TCCCGCACCGCGTTTCATGGTGTCACTATTATTTGGCACATCGTCCTCGC CTCGCACCGTCGTAACCATAACTGAAGCTTCCATTTCGATTGTTCCCGCA GTTGTGAACACGCTACCGCGACCGCAGTTGAACAGCTCGTCGTCTTCCAT GAGCGAGACCGCATGGATGGCGGCATCAAGAGCGCGGGCACCATTGTCCA ACAAGTGCCTTGTGGCGCGGAGGTAGGTGAGGAGGGACGCTTCGTATTTC GCCCAAAGCTCTGGTGGGAGCGTCGAGCGGTGGATGGCGCCTGCGCCTCC ATGAAGGATAAGGGTCGGCTTGAAAGTTGGTTTGAGCGTCATTTATATAC CGTTTATTCGTTCGTCCGGTAGACACGAACGATTTAGCGCGTAGAAGAGC AATCGACCGTTCTCGATTATGAGATGGTCGGAAGATGTGGGGAAGATGCG TCGAAATGCTGTCATGTCTAGCGGCAAATCAGCCTTATCTTGGTAACAGC AGACTAGCCCATAGCGCACGATAACCGTCAAAATTCGCAGAATTCAGAAA CAAGCCGAGCATCGCATTAATCCTGGTATTGGCAGTAGTAATGTACAAGA AATTAGAAAACATATTGATAACATGCCCCGGTCGCCAATCGAGGATCTGC GTCAGAGACAAAACAGATTTCTTTTCGCCCCCTAAAAAAGTATCAGGTAA AGAGCAAGGCTGTGTATCCAAAAAAGATCGCCGGTGAAAACCAGTTCTTG TTGGCAATAGTTAAGAGCGGAATAACCAGCATGCTTCGAAAATAAAAGAG AAAACCTCTATACCATGCACGCATCCTGATAATATGACCAGTTCCTAACA CCAGAGAAGAGATGAGATGAGATGTAATGTAATATAGCAGCAACGTTTAT GGGCATACAGCTGGACGCAAGAATCATAGGCCTCGGGGGAAGTTGGGATA GAAATACAATGTGGCTTAGGCAACGGCCTTGTCGGAGGGTGGAGCAGGAG GGAGCTGTGGACGTAAAGAATTAGAAGCAGAAAAAAATATACGTCGGGGT TGAAAGCGCTTACCTCTAGTTTTGTTTGGTCGGCCATCATCCTCTGGAAA CCAATGCCGACACCCTCGATAACAGCCAGGAAGACAGCGCACATGATGGC ACCGTTTCGGGCAGCCTTGTAACCACCACGGATAGCCAACGATCCACCAG TGAAGAAACCGGCAATGACTGGGACACAATCAATTCGCGTTCACCGACAG AATTTCACTCTGGACAGAACATACTAGCATTATAAGGATCCTCCTTCTTC CGAATACCCTTGACAGCACAGTCAAATGTTGAGAAAAGTCCACCCCAGAC ACCAAAGTTACCACCAAGTACAGGCGCGCGAGCCTTGATGGCTGTGATGG CTCCTATCCGTCGCTCACCGTACGGACTGTTTCGAAATCCCTTGACACCA TGCCATACAGCACCACCGATGGCCTATCAAAAACATATTGTCAGTTCTCA ATTCATATAGCCAGGCAGGATTCAGGTTGTCAAGACTTACACCCATACAA AACGCACCACCGAAGTCGCTCAGGGCAACCCAAGGGCAGGGATCTCTGGA GTGATCCATGTTATGTGCGTATGTGATATATGAGAAGAGTGTTGGAGGAT 
TGAGCGAAACGAGGGCGGAGGAATAGCTTCGTCCAGCTCAAAGTGGTTCA CGGTCAGAGAGACGTCACTCGGGGCAAGGTGAGGCGAGCTTGCACAAACG CCGAAAGGAGAGACAACGCGAGGTCCGGTCGATCGTAGAAGCTCTTGAAA GAGACACGAGAATTGAGAGATATCAGCAGCGGGACTGAAAAGGCGATATG GTTGAGGAGGTTTAAAAACTGGGGTCCGATGGAGAAATGACGGCTCGTCT GGTTTCGAAGGCTTTTTTTCGCCTTGAACGTGACCAAGGCTGTCACGTGA TTGCGGCGACTCCGGCATTCCCCTAAGCCTCAGGCATCCTTGGCATACAT GAATGCCAGAAATCCGTTATGGTCTCGGGACATGTAGGGATCTAGTTCTT TTTTTATTCTTTCTTTATACGGAGTACTTCTTTTACAAACTCGATCGACA GGGTTTATATTTATTGCGGTACAGAGGGCCCGCTTTATTCAACATGAATA TTTCTCTCAAATTTTCGTGAACAGTATCAGCAACCAAGCCCAGTCCACCC TACTTAAGGCTCATTGCACCAGTCTTTCCGAAGACCTCGTCGACACTGTT GACATGCGTCAGAGACACTTCTGACATTGGAGCATAAACATACAGTGCTC CGCGTCACACCTCTCACATATGCATGGACATCATCGCCCTTCGACCGTCG AAGACGTCGATTATACCTACACTGCGCACGATAATCTCCAAGTGATCTTC TGGGGCTACATCAGGTCGCATACAATCTCATCTCTTTTAATTGAAAGCAG TTACATGTCTTCCATAAACCGCAGATAAATTTGGGCCCCACAATTGCAAC TGCCATGGGTATTAATGACCATGTCTTTGAGATTAACAGATATTTGGAGA AACCTATTTGGAGAAACCTGGACAACATACCTCATGTGGTAACAGCCTGT GCTCAACTCCACTAAAGCCCTATATGCGTAGCAACGTCGCCGCACATGTT CTTCGCCAGAGTGAACGTGGCTTGAAGCTCAGTGTGTCTGCAGTGCAATC ATCTTTGTCGTGGAGGCACTTCGATACATCACACATGTGATCGAGCCGCA ATCAGGGCATCTTGCTCATCCGCTTCAATACTCTTGTAAGGTACGAACTT GCCGGAGTCCGCGTTGCCGTAATATGTCTTCTAGGTAGTTGAGAACCCAT CTTCAGCATTTTTGAATTCATCATATCATGCTGCGACATCTCGCCTGAAC AGGGTAAAGAGGTATTGGCGGAGATAGCATTTTGTCCGCAATAACCTTCA AGAACAGGCTCTTAAGGCAGCAATGACAATAACAACGGGCGCAATCCCCA ACTTTTGACCCTATTCAAAGGAAAACTGACGTAGCAAGCACTGCACGTTT CGGATCTCTAAAGCCTTCTCGCTGTCTCTTCCTGTCGCTGCGTGTATGTC ATGAAAGGTCCTGCATCGTTTTCTGACTCCGTCACTGGCTGTTCATGAAG CGATCATTTAGGCTGTGACAGAGCCGCTGGAAATATGAACAGTCACTGCC GGTGTCATCATTAAATGTCGAATAGCTTCTTAACTGGGGTCATCGTGTTG TCATCCAATGCTAGTCGCTGATCCTCTTCGACTTGGTCTCTTCTTGCATA CTGGCTTAATAGCTTGAGCGTCCAATTCGATAGCGTGCGCAAGACTTTGA GCAAACTTCAAGTCAACGATAATGGCGAGCACGCTGAGTTTCATTGTCCA GTAGATCATTAGCACGATGATTCCCTTGATCATGTTTGTGGGTATATCCA ATCTCAAGTTATTATGCGCGAGTTTCGATCTCGCAATCCAGAACCAAGCC ACTCGCTTCGTATTCCTGCGGCTGGAGATGCTTCATAGATGTATGTGCTC AAAGAAGGCCGAGTCAATTGGGGGACGATATCGGAAAGTTTTGTACTTGA 
CTTGGCTGGGATATCACTTTCGCCATTGCGAGCCCCGCTACGAAACATTC AGCTTCCTTGCCCCGGAAGTGGCGGAGATCGCTTGCCATGGTCTATCGAT TTCGTAACGATGATCCTAGACGAACCCACCGCATTGTCAAACCACCCGTT ATACAACACCTGCCCTCAACCACACTCTTACGTTCAAGAAGGTAAGTTCC AGATAAAGTTCAAGGCACCTCCTTCGTAAGCTGGGAATTGTCCTCCGCCA CTCATCCCAAATGGTGCCATAGAAGGATCAAAAGGCAAAGAAAGCCAATC GTCAGCAGCCATATTCTGAAATGTTGGAACAGATCCCAGTTCGGGTACTA TTTGCCCAGCAGGAACACCGATTGTTGGCGTACTGTCTTCTGCGCCGGTA TCTGCCAATGGTGACCGCACCGTATACCCATTGGTGCTTCTTGTAGCCTT AGAGGCGAGCAAATTGCGTCGTAGGCGCGACACATGTGTTTCCAACAGGG CAGCGTAGCGGGTGCTCAAGTGAATATCGTCCAGTGCATTGGACTTAAGC GCCTGTATACAGCGCTCAATGATATCTAGAGACTCACGCAATTTCGCATG GCGGGTCCCAATGCTCAATGCCTTCATCAAGAAGATGGAGGACGTCGTGA TCCGAAGGAAGATACGGACTGGGGAAAACTTCAGAGCCCCGGCTTCGCCC AGCTCACTTGCTTTCTGTAAGACCTGGCAACAACCGTCGATGACTTCTTG GATGTATTCATAATCAACTTGGTCAATGCTCGCCGCGCGGATTTCAACGT CCGGGTCGCTATCCGCAAGAACACGCTCCACTACCGCTTGCATACCGATC GAGTGGGTGTAAGCTCGGACGAAGTGATACTCGATGAACAGGTCGTTATA AAAGCGTTTGTTGAGTACTGCAATGTATTAGCTGACAAGTGTTCTGTGGA TGCATCATATGGCTAGTAGCTTACCTTCCGGCCGGAGATGATCTTCCTCC CATTTTAGTAGTAGAGGACGGAAGTGGTCCAACATATCAATATACCGACC ACTACGTAACTGTTGCCGAGCAAAGTCTGCGGACGGAAAAAAAACATCTG TGACGGATTTAGCCAGCTTTGTGAGGTCCATCCAAGAGTCCATAAATGCA AGCCATTGGTCGTTATAGTGCTTCAAGTTCCTTGATGTTTGCCTATTTAA GATGGCATGATTCAAGCTCTGAGGCATAAGAGATACACATCCTATTCTCC ATGCTAGTTGGTTGATGTAAACATAGAGCAGTCGCTGAACCCGCTGTCGT CGAAGTTTGATCTGGTCGCTAGGGATATACCCCTCATACGCGAGGGACCA GCTCCGCTTGTCCTCGTTGACCTCATATATTCCCAGCTCGTGGGCGAGGG ACAATGCAGCCCCAAGCAGCATCCATGACATTTGGTCGGATCTCTTCGCC GGTTCAACCATGTCCTCCAGGAACCGGTTTTTCGATGACACATCCTCATC TCTGGGCTGGGGAGCTTTGATGACTAGGTCAGAATCCCATCCATCGCTTT CTGGGGGAAAGTGAAGAGATCTTGGATGCCACTCCGACATCAATAGGAGA GCCTCAATTGTGCCAATTCTTCGAAGTTTTGAGTTGGAGGTCCGTTCTTG CCCAAATATAAGCCTTGTAACTAGCTGTTGGCAATGCTGCCACAGACGAT GATGGATAAAGAAGTTTCTTGACTCGCCACCTGGCCCCGGTAATATATGG TACCTCGAAGATAACATGAGCATCGTGCAGCACAGGACGGGATCACAGGT GATCAATTCGCGGTGATTTCGGTGGTCCGCATAGAAGTCGGTCAGGATTG GGGACAATGAGGACATGTTCTCGTAAAATCTGGCTCGCAAGAGATCAACC ACTATGCTGACCACATACTTAAACATAGGGCACTTACAGGTCAATGAGTG 
TAACGGCTTCTCTGGAGGTAAACCATCCCATCATCACAAATCGGCACGTT TCCCAAGTGGTGAGTACCTCTTTTGACGCATTGGAGATTTCCACCGGCCG GATTGCCATTGAAAATATTCCGGTCGGGACTGCAGAGTGCATTTGACTGA ATGCATTCTCGTAGCTCATCGCGTTAACTGCACGGCTGGTGTCCCTCGAA TCCATTCTCGCCTCGGCCAGGTCCGCTTCCTGGCTATGGGCAGCTGCCTC AAAGAGGATATTGAGTGCATCATTTCCGCTCGATACTACAGTGCGCATCA TGGAGTTTTCCAATGTGGATGCTGACAGTTGGCGGGTGTGGGAGTTCTTT GCGGTAGGTTGGTGAGAACTTAAATTACTTTCAGTGTACCCGTCGACATT GTCACCTGAGGTCAGATCGCGCTGTCGAATCTCCTGCCCCTCATCATCAT GAGATCGTACGTTACTTGACAATTTTCTGCGAGTATTCGCGGGCGAGGGA GTGCTCGTCTGCGACGCTGTAGATTTTGTCAACATGGAAAATACGGTGAT CGTTGTCCGGGCACCTACCTCGCTTTTTCCTCCTCTCCCATGCCCTTGTC TCGCTAAACACACATTCTCTCTGCTCTCTCCGACATTTTGCACATGGTGG GCCGATGGGAGACCCGTCTGGGAGTTCGCCCAGGTCACATTTGGCCTTCC TTTGCCGGCAAGGGATACAAGCCTTGTAAGTACGCTGGTACGTGCGCTCT GACGAATTGCCGGGCCTTGAATGCCTTCTGGCTTCCATGACAATCAAATT CGACCAATGACCGTTGGTTCAGTATATAAGATAATCATGCAATCTAGATT CAGCTCACGCGGATGCAACCGAGACAGCGGCTGGCTGACCGGGTGTTGTA AAGTGAAAGAATGCCTGAGGTTGGGAATAAAACAGCCGCGACTAGCCGAA GTGAAAAATATCCGCCAGCCCAAGTTGACCTGGGCCAGGTTTTCCTTCAA TCTGCATTCAAAGAATGAACAACGCAATGGTTTTAGCGGTATTGTGCAGA GCTATGGTCACAGGATATATAAATCTCTCTCTTAGGGATTGGAGTGAAGT CGGATTTAGAAAACCCGTGAGAAAAAAAAAAAATATGACAAGGGTGGGAT TCGAACCCACGCCAAATTAATGACGCGGAAACTTGATAGATCAAGAGGAG GTTCTAATTAGATACCTTAACCGCGCGCCTTAGACCGCTCGGCCACCTTG CCAATTTGTTGAAATATGAATTATTAAAAAAATATATAAGAAAGTTCACC ACTGGTGCTCACGAATCACGTGCCTTTGGACATTGTATTCACGATGCATC ATTTGGCATATGTACCAGTATAGTAATGGGCTATCATAATTAATTACTAA CTATGCTATGTCTGGTGTTGTAAGTATTGGTGCATCGTCACTGGCGGTAG GGTATGCTCACGAACCCTTATACCGTCACCAATATATATTTATAATGATT GCGTGCTGGCTGGAGCACCCGAGTGTTATGGGTCCTTTGCCTATACAAGG ACCTTAGACCTTAGTGACTCGGCCAAGGCCTGCGCTGTCCTGAAGGCGGT GAGCCACCTACAAGACTTCCTTGCAACAACAATCCTTCTTTCTCATTTCT TCTTTAGCGATTCCTTCTTGTACGTACGGCACGTCTAGATAGGAAGATCC ATCTAAATACGTCCCTTAACACCGAGGGTCGAAAAATTACCCACAACGAT CCTATTCAGATCGACTCCTTTCGCACTACATATAAGCCAGAGACAACTAA TTAGACACAGCGTCTGGAATGCAGCAAAACATTGGCAGATATGTGCATAA TGTGCCTGATAGTGCAGGGCTGTCCCCTACTTGTTAGGTCTACAAAGTCT CGACCCTCTCTCCCCGCTGCAGATATCACTAACTATATACAGCTAGTACA 
CCCCACCTGGCAGATTGTGGCTTAGATCTGCGCCCAAGATAATAATCCTA AAGGCCTCGGAACCGTGGACGAGAAGCGGAGCCCTGGAGACCGGAGGATA ATTCCGAGGAAGACAAATGGTACCACTACAGGCACCTAACTTGATCCCGT TTCCCCTTTTCTTGTTTCTAGGTCGACCAAACAAGAAATCCGGGACCCGC GCGACCTAGATATTGGCTGCAGGGTTGAATTTCACCCATGCAGCATGCTA ACAAAGACGATGCGTAGTCTAGACACGGGCCAGGGAATCCCCCGGGTCCA GATTGCTTCTTCAAAGAGTGAAGAGACAAACTGTAGAGCCAAAATGCCAC CACATGAGACCCAGCTCGGCCGAAGAGGGGTTGATTCATTGGACGGTTAA TAAAGCCGACACTAGAAAATGACTGGAAAGGCCGAAAAGTGGGCAGTGGC CGCGTTGAGGAACTAGACTCTGGTTCAACCTGCGGTGATCTGCTGTGCAC TCTCGGCCGCGACGGAGTGCGATCGGGAGAAAAGTCGCTGCCCTGGCACA ATTCGCTCGACGAACGCCTTAGATCAGAGCTCAACCTAAGCTATTTATCA GGCCATCCGTTGCCACTACTTCTCTAGCCTCCCGGGAAACACCTTTCTCT CGCAAGACGCGACCTAGAGAGCGCCGCTTTAGCCGTCATTATCAGCGGGC TGGGTTGCCTACTCGCCGCCCCCCAGCAAGAGCATGTTGTTTGCGGACCT GCCTGAGGAGCTTCTGTGGCATATCTTCATCACCTACTTTGGTCGGGACT CGCACACCCTGGTTCTCCTCGCCACCCTCAACAAGAGGTTTCACCGCATC ACGACCCCGATCCTCTATTCGCACGTTACGCTAAGTCTTGTTGACGGCGA TGAGTCGCGCAAGGTACCGCGACCCGCCCCTCTCTTTGGAGGTCTGCTTG CTGACACCGTCTCTCTCACAAGGTCCGGCGATTTATCATGTCGGTATTTT CCAGCCCCTACCTCGCGCAGTGCGTCCGATCGCTCGACCTGAACGAACTC TCCTGGGTCCCGCACCAATCCCTTTCCCGCCGCCGCAAAGAGCTAGTAGC CCGGATGATGAGAGGTGACATCCTCGGTCGCCCCGATAGACTCGACATGT TCAAGCTCGTGACCGTCGTTCGGCGCCTCCCGTTATCGGATAAACTCAAA CACAAGTGGTGCGCGGAGCTGCAGGAGGTTGCGCCGAGCTTGGACTCTCT TATCGCGCTTTTATTTGTGTTCCTCCCTTCTCTGGAAAAATTAGAGTCCA ACTGGTCTCTGGATCCAATGTACATCTGGCACCTGTTGCCTCAGGCTGAT ATGGGAAAGGTAGAATCGTCGCGCTCGCCCTCGCCTTTTGTCCTGCGGAA CTTGACCCATCTGAAAGTCAACTCCGAGACCCCCTGCGGCAACTCCTCTG AAATACTACCCCTTCTTCAGATGCCTTCACTGACCCACTTTTTTGGCAGC AATTGGGGGCCCATCAGGTGGGATGGTGGAGTTGCGGGAGTTGATATTCA GGCAAAGGGTCTACGAAGCGGCAAAACGACTTCGTCGATAACGCACCTCG AGCTCCGTTACTGCAACGTCGATATGCAGAGTCTGCGAACCATTCTCAGA TCCTGTCGCATGGTCAAAACATTCATCTTCCATCGTGATTGGGACCCCCG GGTTCATGTGAAGCTTTCTGGGGCTTCACTCTCCAAGGCACTTCGCCCAC TTCGAAAAACACTGGAAAATATCGCTTTACATTTCGAGCCGGGGCTCTAC ATCCATCAGGAAGGCGAGATACATCCGCTGGATTTCTCTCAATTCTCTGT ACTCGCCAATATAAACGTTGCCGCTGGTTACCTCATTCACGATCCGGAGG ATTTCGATAGTTATGAGTTTTCGAAGACGTACGGTTCGGAAGACGAACAG 
GAACCGATAAATGTTCCTCTTCACGATCGGCTGCCGGAATCGCTCGAAAT CTTACGCATCACAGGCTTCAGCACTCCGCAACAGCTGGAATTCCTCTTAA AGGACTGCTGCGGAATGCTTCAGCATCGCTCCCGCTTCCCACGACTGCGT GAACTCTCTATCGAGGCCAATTTCGACGAGGCTGATGCTGTATTCGACAC GAGCGCGCTTCAACAGGAGGCAGACCGCGCGGAGGTGGTATTCCGCAAAA TCAACACCGCGGATTATCCTGACGATGGAATCGACTTACTCACCCCTGCA GGACGCAATTGGGGCATGGATGGCGAGTTTAAGTGGAGCACGAAGTTATT TTGATAACACAGTGGTGGCAATGCTAGGCAAACCGTCACGGTGGACGGAA ATGACAGCCTGACGAGAAGCAGAATCGTGACAGGCAAGATGATTTAGCGT TTTCTGTGTGATACGAAGTCATGACCATGAATATGCACGGCAAATTATCT CCAACTAGAGATACCCTATGGTTTACCCGGCTGATGCGGTGACCGATGGA TGACAGCCATTGTAGTGCATCAAGAAAAAAGGGTACAGGGGCCAGACTTC GCAACGATTAGCTTGCAGCTCTGGACAACCAAGCCTGGCCCTACTTCCTA GTCCGGCGGCAGAAACGGTGAGTGCACCTAGTTCGACATTCTTGACCTTG GGATGGACGAATGGCCAGCGTATGGGCATGTCGCCGGTTTTCTTCTTCGA AGTTACAGAATGTACTGGTTGTTTGACGGCTGTTCAATGAGCATCTGCGA CGGGAATGCGCCATCTCAGGCACCAAGAGGCAAGTCTCAACTCACCTTTG GGAGCCCCATACATGCTTTGCACTCCTTGGGATACTTCACGGGCAAACAG GATTGCTCCGTGTGAGCCGACGTATCGCCAATGTAGGAAGGCAGGGTTGG TGTGGCTCAGAACCAGGTACCCGGGACAATAGTCCACAAACTGGACACAG GGCTTGTACTCCGGGAGACTGACCGGCAGCAGTCTTTTTGTACAAAATTT CCTCGTACTTCCCCGGTGCTAGGCGCATCTTAAAGAGTAACCTTTTCTGA ATGGGTTAGAAATAGAATCATGGACTGCTCTCCAGGCTCGATTAGGGAAG TTCTTGCAATGTCTATTGCATGCACTTCAAGCGACCTAGTCACCTGTTAC CAATGGGAATTTCTCGTACAATACGCCGTCATGAATATACTCCTTTAAGT TCGGTAAGCACATGAACTGGGATTGCTACCCTACCTTCGTACGGTACTCC GAACAAAAGTGTCATAGGAGTGACGAAGTCACGGAAATCCGCGGGAATTT GCGCGTCCAGCAGCGGATATTCCATCGCGGAATCCCTTCCGCCACCCCGC ATCTATCATTTTTCGAACGTGCTCGCCGCAAAAAGAATCACGTAGCGAAA CTAGGAATGAATAGGAAATCGCCGTGGATACCCTTGGCCAGAGCCTTGAA TACGAGCGATTCCGAGTTTATCCTCTAGATAGAGAACTGCATCGATGAAC ACTTCCAACGCACCACCTTTTCCCTCTAGCAGACTCAAACCGTTTCCAAG CAACAAACCCTAGTCAAGACTGCAATATCTACTTAAAGGGACACCGCACC GGTCAAGATGTCTACAAATTCAGACCTCAGAGTGACACTCTACACCTACT TTCGCTCCTCGTGCTCCGCGCGCCTTCGTATCGCCCTCGCCCTTAGGTCA ATCTCCTATACCTCAGTCCCTATCAACCTGCTGAAGGGCGAGCAGTCGAG CACAAAAAACACGGCCGTAAACCCATCGGCCACCGTTCCCACGCTCATCA TCGAGCATGTAGACCGGAGCCAGTCTCCAATAACAATAACCCAGTCCCTC GCAGCGCTCGAATACCTGGATGAAGCATTCCCCGACAATCCCAACCCACT 
CCTTCCACCTATTTCTAACCCGCAACAACGCGCGCTCGTGAGGTCTCTAG CATCAATTATAGCCTGCGACATTCAGCCAGTCACGAACCTACGGATTCTT CAGCGCGTTGCACCATTCGGCGTCGACCGGGCCGCCTGGTCAAAGGATCT CATTGAAGCTGGGTTTGCGGCGTATGAGGCTATTGCTAGAGACTCTGCTG GAGTATTCAGTGTTGGCGATACAATCACGATGGCGGATGTTTGTTTGATC CCCGCTGTTTGGGGCGCAGAGAGAGCGGGGGTAAATCTGGGACAGTATCC TACAATTAAAAGGGTTGCGGAGGCCCTGGAGAAGGAGAATGCAGTTAAGG AGGGGCATTGGAGAACCCAGCAGGATACACCAACGGAATTCAGGTGTTGA ATCTGCTGCTTGGCGACCGATGTTCCATTGTCGGAACGAAAATGTTATAC AGACCGGTCTTGAATGATGGACGGGGTTTACTGCGAGAGTAAAAGGTTAT CGACAAGTTCGGACAGGAGTCTCTGCGTACCCTTGCGCTAAATAAGAGTA ATGTAAATGACCTGCAAAGCATCTCCAAAGAAGACCCGAGTAGCAAATAA AAAAGTCAAAAAAAAGTATGGTCCCGGAGGGGATCGAACCCTCAGCCTTG GCGTTATTAGCACCACGCGCTAACCAATTGCGCCACGGAACCCGCTTGCT GAAATATTTCAAATTACTACTAATCTATATTCTTAGTGAAAAAGGAGCGC AGTAATGTCCCCAGAAGCCGCCAATTTTTGCCTGAAAGTGTTGCCAAATA TAACTACTCAGCTCTGGGGTAAAAAAAAACCTAGCATTGCCATCAAGCTT GAAATATCTCATTTTATTTTAACTATTGCATGCGAATCACAGATCCAGCT GCACAGGCGGCAAAATTGTTCCGGCACAATCGCCAAACCCTACATAGTTG CCTTCAGTTCCCGCCATACCACGCAGAATAACAGTGTCACCGTCCTCTAA GAACAAGCGCTCGGAGCCATCTGCGAGCTTGATAGGATTCTTTCCATTGG TTTGCTCGAGGAAACTGCCTTCGGTCTGATTCTCCTTGCCAGAGATCGTA CCACTGCCGAGGAGATCACCGGTGTTCAGGTTGCAGCCCGTTATGGTGTG GTGGGCCAGCATCTGAGGGAAAGAGTACAGGAGGTTCTTTGCATTGCTGT GGGAGATAACAGTAGGCTCGCCGCCTGCATTGGTGACTTCGACCTCCAGC GGGATATCATAGGCAGTGTCAGCCCGCTTCTCCCGCAGATACGGTAGGAG AGACTCCCTGTTTCCGGGTTCTAGACCGACGGTGCGGAAGGGCTCGAGAG CATCAATCAGGACAACCCAGGGGGTGATCGTTGTGCCAAAGTTCTTTGCG TTAAAGGGACCAAGAGGAACGTACTCCCATGCCTGGATGTCACGGGCAGA CCAGTCGTTCATAAGCACCACTCCGAAGATATGGTCTTCGGCTTTGTCGA TATGAACCGGATGCCCTAGATCATTGGGAGTTGAGACGAAAAACGCTAAC TCCAGTTCGATATCGAGCTTCTTGCAAGGAGAGAAGGTGGGTAATTTAGG GTTTGCCGCGGGATTCGTCAAGATCTGGCCCTGTGGACGGTGGAGAGGCG TGCCGGAGGTGACCACAGACGAGGCACGGCCATGGTAAGCCACGGGGAGA TGTTTATAATTTGGCTGTAGGGCATTGTCGGGTCCCCTGAAGAGGACTCC AACGTTGTAAGCATGGTTAAGGCCGGCGTAGAAATCTGTGTAATCGCCGA TCTGCATGGGGAGATGGTTTGTAACCTCAGACAGCGGCAGGAGGGCCTCT TTCTGGAGCGCTGCATTGTCTCTGAGGATTTGAGGAAAAGGCGTCTCTGT GCTGAAGACCTTCTGAATGTATTCACGGACTTGTCGGTGCACTGGTCGAC 
CAAGAGCTGCGAACGCGTTCAGGGTGGATTGGTTGAAGACATTCAGATGA GGCTGGATGACGGGAAGCTGGGAGAAACCGCCAGACGAGGCAAATTTGCT CAAGTCTAAGGCATAATCTCCGATCGCAATAGCGGGAACTCGCGACGAGA GCTTCGATGATGAAATGATACCGAATGGGATGTTTGCCAGCGAGAAAGGC GAGTTCTTGGGGATTTGTAACCACGAAGCCATTGTGACGGGAGCTAGCAA AAATTAGGAGAGAATGGTAGATGAATGATTCAGGAGACAACGAAGACCAA ATAACAGATAAAACGCTGCTGAGAGAGATGTCGACATAGAACGGGGCAGG AAGGATTCCGTACGAGGAGCTGCAGCACCGCGGAATGAAGCCGCACCCGC TTATCACCGGCTGTTGTGTTTCCCCAGTGTGATCCCGACTTTCCGCAGCC CCTCTTTGTCGCCCTGAAGTGAAGGTCGCCGCTGGCCGGGCGGATTGCGG GAATTCAGCTTTCTGGGAAGTCATGGGTTGATCTGGAGTTCTGTCACATT TCTTCTGCCTATGTGCTAGCTAAGTATATAGGCCGAGTCCTCACTGTCCC TGCAGCAGCACCCGATATCCAACGTGACAAACCATACACTATCAACATGC CAGTCACCGAGTTCTCCTTCAAAGATCCTTACACCTATCAGAATGGGTTC GATTCTTACCATGAGTATGCTAGTGGGTACTGAATCGATGCTACTGAGCT AACTTGACTTAGATCTGAAGCCATCGAAGGCGCCCTCCCCGTAGGGCACA ACTCTCCCCAGAAAGCTCCTTACGGCTTATATGCCGAAAAGCTCTCTGGG ACCGCTTTCACTGCTCCCCGGCACGAGAACAAACAGACTTGGGTCTACCG TATCCTTCCCGCCGCAGCGCACGAGAACTTCGTTGAAGAGGATGCGAGCT CTTACCACACGCTCTCAGATGCCAAAAAGCTTCAGCATATCCCCAACCAG CTACGGTGGGATCCATTCGATCTGGACGAGACCGTCGACTGGGTTCATGG GTTGCACCTTGTGGCTGGTTCTGGAGACCCTACCGTAAAACAAGGCCTGG GCATCTTGCTGTATGCGGCGGGGAAAGATATGGGAAAGGAAGCTTTCTAC TCCGCGGATGGCGACTTCCTGATCGTAGCACAGCATGGCGTGCTCGATAT TCAGACTGAGCTTGGCCGCTTGCTGGTTCGGCCGAACGAGATTTGCGTGA TTCCTCGTGGTGTAAGGTTTGTACTACGCGCTGACAAACCATATTTGAGA TCATGAGCTTACGGAATAAGATACCGTGTTACCCTACCCGACGGTCCTGT GAGAGGCTACATTTGCGAGCTCTACCAGGGCCACTATCAGCTACCCGAAT TAGGACCTATTGGATCCAACGGTCTTGCGAATGCACGAGACTTTCAAGCA CCAGTGGCTGCCTTTGACGACGAGGAAGGACCGACGGAATACCGACTGTA CAGCAAGTTCAACAACCACCTCTTCTCCGCGCGCCAGGATCACACCCCTT TTGACATTGTCGCATGGCATGGCAACTACTACCCCTACAAATATGACCTT GGCCGCTTCAACACCATGGGCTCCGTATCATTCGACCACCCTGACCCCTC CATCTACACGGTGCTCACCGGCCCGTCCGATCATGTGGGCACTGCTATTG CTGACTTTGTTATTTTTCCGCCGCGCTGGTTGGTTGCGGAGAAAACTTTC CGGCCTCCCTGGTATCACCGCAATACTATGTCAGAATTCATGGGTTTGAT CACCGGAAACTATGATGCGAAGACTGGTGGCGGCTTCCAACCTGCTGGCG CGAGTCTGCACAACATCATGAGTGCACATGGGCCGGACATGCATGCATTC GAGGGAGCAAGCAACGCGGACCTGAAACCTACGAAGATTGGGGATGGCAG 
CATGGCTTTCATGTTTGAAAGGTAGGTCAACCTGTCTTTTTTCCAAGTTC GAAATACTAACTTTGCATAGCTCCCTCATGGTTGGCGTGTCGGAGTGGGG TCTTAAGACATGCCAGAAGGTGCAGGAGGAGTATAACGAACATAGCTGGC AGCCCTTGAAGAGGCACTTCAAGGATCCAAGGAAGGCCCAGTGATTCTGA GTAACTTTTATCTTGGAGAAAAACTGTGATATACCAGCATGAATTACTGA AGCGCTTTGTAATAGGTTTAAAACATTATTTTTGACAAGTTCAAAGCCAA ACATATCCTGGAAGTAGCCACATTAGGTTTCACTTCTACTCAAAAAACTA TACGCCCTCAACCTCCTTCCTCATCGCCTCCTTCACGATTGCCCCACACA TCTTCGCATACGGATTCGTTTGCACAGCCATCTGCTGCGCAACTTTATTC CCATACCGGCCGCCAAAGAAGAAGTCACGGCTTGCCATTTCTTCTAGACC CTCGGAACCGATCTCCTTCAATGTATCAGGGATAAACTTGATTTCATGCT CGTCACCGTTCCTGAGTTGCCGCGATATCTTCAGGAGAAGGTAGGCGCGG TGTGTATGAGTGTCTGCTAAGAGCCTTGCCTGACTCGTGGAGACAGCATC TGCGGGTGAGAGAGGAGTGGCGAGAGTAATTGCGTTGCCTAGATCTGAGA GGAGAGACGAGAGACGGGAGGCTGTTTCTGAAGCGGATTTCTGCTTAAGC TCGTTTTCGTCATCGATCAACATTCGCAGAGCCTGAGCTCGATTTGCGTA AGCGGATGGGTATCTTGGGTATTCATTGATAATGGCATCGAGGGCATTGA TTGCGGATTGAACCTGGCTGGTTTCCGGTGTCGGCTGTGTTGTCGATGCT AATGGGGTGATAATCTCTCTCTCGCGAGCTTGCAGAGCCGTGTGTTCTTC AAGAGAGATACCCAGGTGTGCTGGGAGCTGAGGAAGGCTTAAGTCAACTG TCACTGCTGAGGATGGTGAGGATTCGGCGTCGAAGAGGGCTTGTAGAACG GCGGAGTCGTTGGTTGTTAGGGAGGGTTTGGAGGAGTTGATTGTCGTTGC GGACATTTTGCTAGCTTAGTGTTGGTATAACCTAATCTCTGGATGTGAGA AGAAAAGATCTCGGTGTAATTTGATATCTGGAGAAGAGATACTGGCACAA ACCATGTGCTTAATCTCGTGTGTAATGGCGATGGCTGCTGCCGGATACTT GGCGGGCGGCCCCGGGTGTTTCGAAATAGCCGTCACTTTCCTCGCCCAAT ATCCGTCTCCAGCCGTCTTCAACCTCGTCCTCCCGGAACTTTAACATCTC GGTCACGATGAATAATGCGCAGCGAATATACGAAAGATAGAGGAATACCA TATCACACATCATTATTACTCGATATCCGTATTCTGGGTTTATAACCACA GAAACCCTCGCACAAACCATTCCGCGTCAACCATCGATATTCAGTTTCTA CTCAAGATACCCTTTAGATATCAGGTTCCAGATATACTTACTACCACAAG ATGGCGCCCTCCGCAATCGACCCTCCGACCTCAACATCTCTGCCCTCCTC TATCTCTAGCTATCGTGGCTATGACCACGTTCATTGGTATGTCGGCAATG CCAAACAAGCCGCCTCATACTACATCACCCGCATGGGTTTCAAGCGCATC GCCTACCGCGGCCTGGAGACAAACTCCCGCGCCATCTGCTCCCACGTCAT TCGTAACGGCGACATTACATTTGTACTCACTTCGCCCCTCCGCTCGCTAG ACCAAATATCCCGGTTCACACCTGAAGAGCAAGACCTCCTGAGAGAGATC CACCACCATCTCGAGCAGCACGGGGACGCGGTCAAGGACGTTGCGTTTGA GGTCGACTCTGTCGATGCGGTGTTCAATGCAGCGGTGGCGAACGGAGCCA 
AGGTTGTGAAAGGCTTGACAACAGTGGAGGATGAGAATGGGAAAGTGACG ATGGCAACAATCCAAACGTACGGGCAGACTACACATACGCTTATTGAACG AGGCGCGTATAGAGGAACTTTCTTGCCCGGGTACAGAGTCGAAACAAGCC TCGAAGACCCGATTTCGGCTTTGTTACCGGGTGTCCTGTTGAACCGAATT GACCACTGCGTTGGTAACCAGGATTGGAATGAGATGGATAAGATTTGTGA ATAGTATGTCCTCTCCCTGACTAAAAAAAATATGTACAGAGGCTAATGAG AGCAGCTACGAAAAAGCTCTTGGATTCCACCGATTCTGGTCCGTTGACGA TAAGCAGATCTGCACGTAAGTCACCTTTAAGGCAAATGTCACAGCCACGA TGCTAATTCGCAGCAGCGAGTTTTCCGCCCTAAAGAGTATTGTCATGGCC TCCCCCAATGATATCGTCAAAATGCCCATCAATGAGCCCGCCAAAGGCAA AAAGCAATCACAGATTGAAGAGTATGTCGATTTTTACAACGGCGCCGGAG TACAGCACATTGCCCTCCTAACAGACGACATCATCCGCGACATCACGAAT CTTAAGGCGCGCGGCGTGGAGTTTATCAAGGTCCCTGATACATATTATGA GGACATCAAGATTAGACTGAAGAAGGCTGGATTAACTCTACATGAGGACT TTGAGACAATCCGGGCTTTGGATATCTTGATTGATTTCGACGAGAATGGG TATTTGTTGCAATTGTTTACAAAGGTAACCACCCACTCTCCTTGTTCCTA CCTATATGCAAGGTAACGGAAGCTGACCGATGAAACTAGCACCTTATGGA CCGTCCTACAGTGTTTATTGAAATCATTCAACGACACAACTTTTCTGGAT TCGGAGCTGGAAATTTTAAGTCATTGTTTGAGGCTATTGAGAGGGAGCAG GCGCTGCGAGGCAATCTTGTGTAAACTTGAATTCCTTCAGACTATTGTAA TCCTATTATCATTGACCGCAGACATTTCCGTCTCTTCTTACTATACCCAG GAGTGAGCAATGCTAAGGATTAAAATCTGACTGGGAGCGTTGGTGCTCCG TTCCTAAGATTAAGGCCTTTAGACATAAGAGCCCTCGGATATTAAGCTAA GGATATGGCACCAGGGGAGAATGAAATAACGTCAATCTCCTATGCCACAT AGCGTTACCGTATAACGCCACAGGCAGAAATATTTTCGTAGGGGGATTGG ATCCGAGCCCTAGGGGGTTGTCTGAGACATTCTTTTTGCCCTTTGCTACA GTAATGTCTGGTTTGCAGTACTTTCACCGAGTAACATAACCTGCTTTATT TAAACAATATCCTCCAAGACTGTAATCTCCCTGTGGCGGTTTTCATTTAC TTCTTTTTTGAAGAGCTGGTACCCCCAGGCGTAGAACCTGGCCGGCGCTT GTACTTCCGAGGGTCTGGTAACATCAGCAGAGTTTCAGCGAAGACAGCCC ACGCCTCTTACCTTTCCTGCTCATCAGGGTCTTATTCCTGGTCCCCTTTC CTCCTTCCCAGAACCCGCTCCCTGCTTTTCCAATTAGGTATGAGCGACAG ATACCGCAGTTCTCAGGACGGTATATTTGTTCGCGAGAACAAAAGCCACT GAATAAATATAATCCGTTAGACTAATTCAGACAAATTAAGTAGATATGAG ATAGGGATACGCATACCAGATCATGCGGTTTGCGTGCTCGTTTGGATGGT CTGTGGCTGCATCGTGGCATCTATTGGCCGTTAGGGTCTGAAAGTCGAAT TCTCAGAGTTCTACATACTTGTCACAAGGAAAGACTTTCTGACAGCAGCT GAATCTGTAAGGTGTAAGAAGACGGCCATGTATGAAGAGAAGTAGATGAG GTACCTGAACCAGCGGTAGCTTTTGCTATAGTGCATGCATCGACCGCGTC 
GGGGTAGTTCCTGGCCTGCCACGATACCAAGAACTTCCTTTGGCTGTTAT GACTGTCAGTAATTGTGATACCGAGATCGAAGAGTAATGTACCTTTTTTC GCGCTGGAGCACGTTCTCGTGACGTGACTGCGAAATTAGTACTCAAGAGT CTTAACCAATGGTCAGAGCTAACTCACATGCTGCTGATCCTACTTTGAGG AATTTCACTTCAGGAATTTTGAACACTGCGATGATTAGTGAAAGTGTAAT TGATTCACAGGGCACCATCATACCTAATTTACGATGGCATTGGCGGCAGT TGGCCATGGCGCTCTCCCCTCGGACTGCGACTACCCCTGGTCCTGGAAAT GGGGTCGAACACTCCGCGCACGTTGGGATGAATCCGCTATCATGGGGTAA GTATGCAGACTAATAAGCCAGATTAATAACGTACCTAGGCAGAAGATCCA CTACTGTGCAGCCGTCCAGGTCCAGGTAGCCTGCACGGTTAGAGTTGGGG TGCATCAACAATCGACGGAACCCTACAAAGTGAGCAAGTCATCCATGCAA A
Where to start: A brief history of Bioinformatics
Gregor Mendel, 1866: first described a set of mathematical rules by which the appearance of an organism (its PHENOTYPE) could be related to its inherited genetic makeup (GENOTYPE).
Figure: the 7 pea traits, or characters, studied by Mendel.
Two Scientists Who In ~900 Words Reshaped the Way In Which We View Life on Earth
Red blood cells undergo sickling due to a single base change in the DNA of the beta-globin gene. This base change in DNA changes one residue in the protein.
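To make the single-base change concrete, here is a minimal Python sketch. It assumes the first nine codons of the human beta-globin (HBB) coding sequence and the commonly cited sickle-cell substitution (GAG to GTG, glutamate to valine). The fragment, the names `translate`, `NORMAL`, and `SICKLE`, and the position constant are illustrative choices, not part of the lecture.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence, reading whole codons from position 0."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Assumed illustrative fragment: start codon plus the next eight codons of HBB.
NORMAL = "ATGGTGCATCTGACTCCTGAGGAGAAG"

# Middle base of the seventh codon (0-based index 19): A -> T turns GAG into GTG.
MUTATION_INDEX = 19
SICKLE = NORMAL[:MUTATION_INDEX] + "T" + NORMAL[MUTATION_INDEX + 1:]

print(translate(NORMAL))  # MVHLTPEEK
print(translate(SICKLE))  # MVHLTPVEK
```

Only one letter of the DNA differs between the two strings, and only one residue of the translated peptide differs (E becomes V), which is the point the slide makes.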
Bioinformatics allows the study of gene origins and the evolution of new genes (Science, 12-22-06)
We are now moving into the post-genomic era, as entire genomes are sequenced and made available with individual genome databases
Bioinformatics was born out of databases, constructed to collect protein and DNA sequences
How fast is the GenBank database growing? http://www.ncbi.nlm.nih.gov/GenBank/genebankstats.html
Proteins were first:
• 1960s through 1970s - Margaret Dayhoff, National Biomedical Research Foundation (NBRF), established PIR, the Protein Information Resource, which grew into the PIR-International Protein Sequence Database: http://www-nbrf.georgetown.edu/pir
DNA came second:
• 1977 - reliable DNA sequencing developed
• 1980 - European Molecular Biology Laboratory (EMBL) Data Library established
• 1982 - GenBank established
• 1984 - DNA Data Bank of Japan (DDBJ)
Today, GenBank is under the National Center for Biotechnology Information (NCBI). GenBank, EMBL, and DDBJ formed the International Nucleotide Sequence Database Collaboration; data are exchanged on a daily basis.
GenBank: http://www.ncbi.nlm.nih.gov
The Net (no pun intended) Result is an Astounding Growth of Biological Information Source: Scrabanek op. cit.
Whole-genome structure/organization can be compared between species. This is a human karyogram, or idiogram. This is a human-mouse synteny map.
Bioinformatic and genomic approaches have exploded the study of evolution (molecular evolution)
Bioinformatic and genomic approaches allow discovery of new and unsuspected species of organisms that cannot be detected using conventional approaches. Example: Archaeal Richmond Mine Acidophilic Nanoorganism (ARMAN)
Bioinformatics can be applied to study interactions among proteins within the cell, i.e., proteomics. For example, take a look at the Saccharomyces Genome Database (SGD): http://www.yeastgenome.org/
Protein-protein interaction map for budding yeast
The color of a node indicates the effect of deleting the corresponding protein:
• Red: lethal
• Green: non-lethal
• Orange: slow growth
• Yellow: unknown
Jeong et al. 2001. Nature 411: 41-2
Novel insights from the layering of 3-D protein structure with Protein Interaction Networks
DNA chips (DNA microarrays) can be used to unravel the genetic basis of cancer. Figure labels: cancer cells, normal cells.
One form of non-Hodgkin lymphoma, Diffuse Large B-Cell Lymphoma (DLBCL), is actually two types: the two varieties of DLBCL exhibit very different microarray RNA-expression profiles and very different survival outcomes.
What is bioinformatics?
Definition I: “Bioinformatics is nothing but good, sound, regular biology appropriately dressed so that it can fit into a computer” -- Claverie & Notredame, Bioinformatics for Dummies
Definition II: “A field that involves the building and manipulation of biological databases. In the context of genomics, this means managing massive amounts of sequencing data and providing useful access to and interpretation of the data” -- Weaver, Molecular Biology, 3rd ed.
Definition III: “A field that extracts biological information from large datasets such as sequences, protein interactions, microarrays, etc. This field also includes the area of data visualization” -- Campbell & Heyer, Genomics, Proteomics, and Bioinformatics
Carl’s Definition of Bioinformatics: A study of the algorithms and programs that are used by Molecular Biologists and others in the Biological and Medical Sciences in their quest for understanding protein structure and function in living organisms. This is just one of many definitions that may be found in textbooks, scientific papers, and on the web. The simplest definition is that it is an interdisciplinary subject drawing on material from Biology, Mathematics, and Computer Science. To me this is like saying that E = mc² has something to do with relativity theory.
Some Implications of this Definition
• An individual studying Bioinformatics needs to have some understanding of the basic ideas of Molecular Biology research.
• They also need to have a familiarity with DNA sequences and how they contribute to 3D Protein Structure as well as gene identification and phylogenetics.
• They need to be familiar with the many “in silico” tools that are used and the parameters that control the output of the programs or algorithmically controlled devices.
• It is important for them to understand the objectives and limitations of both Computer Science and Molecular Biology.
• They need to have some experience with collecting biological data for analysis.
Diagram labels: Computer Science; Biology; Micro Biology & Medical Science; Computational Biology; Bioinformatics. (Note the two-way arrow.)
Early Prehistory (diagram): Computer Science, Micro Biology, Bioinformatics
Late Prehistory (diagram): Computer Science, Micro Biology, Bioinformatics
Recent History (diagram): Computer Science, Micro Biology, Bioinformatics
As a result, DNA sequencing and Proteomics have had an increasing number of important applications in the life, medical, and social sciences. Pick up any scientific journal that deals with the life or medical sciences, any popular scientific magazine, or, for that matter, any daily newspaper, and you will find an article where DNA or related issues play an important role. Why, it even makes the comic section:
For example: a fungus, Aspergillus nidulans
http://www.broad.mit.edu/annotation/genome/aspergillus_nidulans/Home.html
What can we do with these databases? What is the purpose of bioinformatics?
Answer: to make sense out of this: AN 1.29 nt 250000 - 270000
Preview a few tools:
• A.n. genome database: “Browse Regions”, “Feature Map”, “Get DNA sequence”
• GenBank: “ORF Finder”, “Blastp search”, “Conserved domain”, “Format Results”
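As a rough illustration of what an ORF-finding tool does with a raw sequence like the contig shown earlier, the Python sketch below reads one FASTA record and scans all six reading frames for ATG-to-stop runs. It is a toy under stated assumptions, not NCBI's ORF Finder: the function names (`parse_fasta`, `find_orfs`) and the `min_codons` threshold are hypothetical. Because A. nidulans is a eukaryote whose genes can contain introns, a naive scan of genomic DNA is only a first look.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]
        elif line:
            # Drop the spaces that separate blocks of bases in some dumps.
            chunks.append(line.replace(" ", ""))
    return header, "".join(chunks)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List (strand, frame, start, end) for each ATG...stop run.

    Coordinates are 0-based, end-exclusive offsets on the strand that was
    scanned; min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ATG in this frame
    return orfs


header, sequence = parse_fasta(">toy record\nCCATGGCTGC TGCTGCTTAA G\n")
print(header, find_orfs(sequence, min_codons=5))
```

Each candidate found this way could then be translated and compared against known proteins, which is the step the “Blastp search” tool on the slide stands for.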