Genome Annotation. GTTGCAATCTGAGACACATATTTTTGATATTCCAGTTGTTGCAATC GAATGTAAAACATATTTAGATCTTTAAATGTATGGTAC ATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAAC GTTGCAA TTAGGTTTTGTGATTTGTTTGCAGGGGCAGGAGGCTTTGGTTTAGGTT
GTTGCAATCTGAGACACATATTTTTGATATTCCAGTTGTTGCAATCGAATGTAAAACATATTTAGATCTTTAAATGTATGGTACATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAACGTTGCAAGTTGCAATCTGAGACACATATTTTTGATATTCCAGTTGTTGCAATCGAATGTAAAACATATTTAGATCTTTAAATGTATGGTACATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAACGTTGCAA TTAGGTTTTGTGATTTGTTTGCAGGGGCAGGAGGCTTTGGTTTAGGTT AAATGGCAGGCTTCTCTGTACCTTTATCTGTTGAAATTGATACCTGGGCTTGTGATACACTACGCTACAACCGCCCTGATTCAACAGTTATTCAAAATGATATCGGTAACTTTAGTACAGAAAATGACGTTAAGAACACTACGCTACAACCGCCCATATCTGCAACTTTAAACCTGATATTATTATTGGCGGGCCTCCATGCCAGGGATTTAGAGATCCTAGAAATGGTATTGCTGGGCCAGCCCAAAAAGATCCTAAAGATCCTAGAAATG GTTTAAATCCTCATAAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCA TTTATTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTCATGGAAAACGAAGGTTTTAAAGTTATAGATAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAG TTATGCAGAAAAATTTGACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAA ACCCACCGTTTGGGCAAAAAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTT TTATTAAGAAAACATTTGAAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTGCGGAATATGGCATTCCGCAAATTAGAGAACGTATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACTAGGTATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACCAGAACTTAATGCGCGTGAAGGAAGTGAAGAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGCTACGCTTTACAATCATGTTGCAATGGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAACATGGAGCTAGACGACGTAGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTGTCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAAAAACTGTCGTATCTCATAAACTATTGCATCGAGAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGAGTTATGCCAACAACTGATAGAAATCCTCTAGTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGTTGGACCAAAATCAGAAATTACTGACCAAGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCATAATACG
AGTCATAACGGCATATATGGCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTCAACTTTTTCCAGTAATTCTGGACGCTCTTCTAACAGTTCCATCAAAGTATAGGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACTTACAACCACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAAACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAATTCGTAATTGGTGCAACTGTTCAAGCATCGCTTAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAGAAGCAGCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGA
TAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT TGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATTGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATTGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAATAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT GenomeAnnotation GenomeAnnotation
A Walk in the Forest * Photo courtesy of www.webshots.com
Observation * Photos courtesy of www.webshots.com and Peter Smallwood
Experiment * Photos courtesy of www.webshots.com and Peter Smallwood
Filters: Information reducers
English: Red, Orange, Yellow, Green, Blue, Purple
Mayan: Red, Orange, Yellow, Grue, Purple
Filters: Information reducers
Sequence filter
TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA TATGAGGCAA TCACAGCATC AGGTGACCTT AGTATCTATT CTCGGGAGCG CACGGCTCTA AAGAGGCCCA TATCCAGGCA CCTTTAGATG CAAGAAGGAG GAAACAGCTC GAAATCCCTG AGGCCGGAGG GTCAAGAACT CTCCACCGGC GGCAGCGGCC CCCCGGCCTA AGGCTGCCTG TGCTATAAAT ACGCGGCCCA TTCCCTGGGC TCGGCGGGAC AGATAACATG AATGTGCCCT CTCCGTAAAC CTCTAAC...
Display of gene context by TIGR/CMR (or Kazusa) Nostoc sp. PCC 7120: Position Search and Segment Retrieval >Nostoc sp. PCC 7120 3455501-3456500 ttagtagatgggcttgatgtacagtactttgaaatcaattccctccgccgcaaaatggct gtagttagtcaagatacatttattttcaacacttctattagagacaatatcgcctacggt acatctggggcgagtgaagcggaaattagagaagtagcgcggctagcaaatgcgttgcaa tttatcgaagaaatgcccgaagggtttgatactaagttaggcgatcgcggtgtccgttta tctggaggacagagacaacggattgcgatcgctcgtgcattactccgagatcccgaaatc ctcattcttgacgaagccaccagcgccctagattcagtctccgagcgattaattcaggag tctatagaaaaactttccgtgggtagaacagtaattgcgatcgctcacagactctccaca attgccaaagcagataaggttgtggtgatggaacaagggcgaattgttgagcagggaaat tatcaagaacttctagaacaacgcggaaagctctggaaatatcaccagatgcaacacgaa tcaggacagactaattcgtaatatcaattcaaaattcaaaattcaaaattcaaaattagg gaagccgagcagaatcatggttttggggtatgtatctgtcccattcttttttcaaatcgg tataactccccaatccccaatccccaatctccagtccccaatccccaatccccaatcccc aatccccagtccccaatccccaatcccatgaaaatttccgtcatcatctcgaattacaac tatgctcgttatctttctagagcaatcaactctgttctcgctcaaactcactcagacatt gaaatcgttatcgtagatgatggttctacagataacagccgtgatgttattacccaactg caagaacaagcaccggataaaatcaagcccatctttcaagcaaatcaaggacagggaggc gctttcaatgcggggtttgcggcggcgactggcgaagtcg
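The segment retrieval shown on this slide can be sketched in a few lines of Python. This is only an illustration, not how TIGR/CMR or Kazusa implement it: the function names `parse_segment_header` and `retrieve_segment` are hypothetical, and the sketch assumes the coordinates in the header are 1-based and inclusive, which matches the 1000-base segment 3455501-3456500.

```python
import re

def parse_segment_header(header):
    """Split a header such as '>Nostoc sp. PCC 7120 3455501-3456500'
    into (organism, start, end). Coordinates are assumed 1-based, inclusive."""
    match = re.fullmatch(r">(.+?)\s+(\d+)-(\d+)", header.strip())
    if match is None:
        raise ValueError(f"unrecognized header: {header!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

def retrieve_segment(chromosome, start, end):
    """Return the bases from start to end (1-based, inclusive) of a sequence string."""
    return chromosome[start - 1:end]

organism, start, end = parse_segment_header(">Nostoc sp. PCC 7120 3455501-3456500")
segment_length = end - start + 1
print(organism, segment_length)

# Toy chromosome (made up) to show the coordinate convention.
toy_segment = retrieve_segment("ttagtagatgggcttgatgt", 4, 9)
print(toy_segment)
```

The point of the sketch is the coordinate arithmetic: a request for positions 3455501 through 3456500 returns exactly 1000 bases.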
Anabaena Chromosome (6413771 bp): 3455501-3456500 .........|.........|.........|.........|.........| alr2835hepA: ABC transport3454238 -> 3456061 alr2836 glycosyltransferase 3456248 -> 3457216 TTAGTAGATGGGCTTGATGTACAGTACTTTGAAATCAATTCCCTCCGCCG CAAAATGGCTGTAGTTAGTCAAGATACATTTATTTTCAACACTTCTATTA GAGACAATATCGCCTACGGTACATCTGGGGCGAGTGAAGCGGAAATTAGA GAAGTAGCGCGGCTAGCAAATGCGTTGCAATTTATCGAAGAAATGCCCGA AGGGTTTGATACTAAGTTAGGCGATCGCGGTGTCCGTTTATCTGGAGGAC AGAGACAACGGATTGCGATCGCTCGTGCATTACTCCGAGATCCCGAAATC CTCATTCTTGACGAAGCCACCAGCGCCCTAGATTCAGTCTCCGAGCGATT AATTCAGGAGTCTATAGAAAAACTTTCCGTGGGTAGAACAGTAATTGCGA TCGCTCACAGACTCTCCACAATTGCCAAAGCAGATAAGGTTGTGGTGATG GAACAAGGGCGAATTGTTGAGCAGGGAAATTATCAAGAACTTCTAGAACA ACGCGGAAAGCTCTGGAAATATCACCAGATGCAACACGAATCAGGACAGA CTAATTCGTAATATCAATTCAAAATTCAAAATTCAAAATTCAAAATTAGG GAAGCCGAGCAGAATCATGGTTTTGGGGTATGTATCTGTCCCATTCTTTT TTCAAATCGGTATAACTCCCCAATCCCCAATCCCCAATCTCCAGTCCCCA ATCCCCAATCCCCAATCCCCAATCCCCAGTCCCCAATCCCCAATCCCATG AAAATTTCCGTCATCATCTCGAATTACAACTATGCTCGTTATCTTTCTAG AGCAATCAACTCTGTTCTCGCTCAAACTCACTCAGACATTGAAATCGTTA TCGTAGATGATGGTTCTACAGATAACAGCCGTGATGTTATTACCCAACTG CAAGAACAAGCACCGGATAAAATCAAGCCCATCTTTCAAGCAAATCAAGG ACAGGGAGGCGCTTTCAATGCGGGGTTTGCGGCGGCGACTGGCGAAGTCG 3455501 3455551 3455601 3455651 3455701 3455751 3455801 3455851 3455901 3455951 3456001 3456051 3456101 3456151 3456201 3456251 3456301 3456351 3456401 3456451 Contig GoTo Block Find Display PgUp/PgDnHelp Quit
Anabaena Chromosome (6413771 bp): 3455501-3456500 .........|.........|.........|.........|.........| alr2835hepA: ABC transport3454238 -> 3456061 alr2836 glycosyltransferase 3456248 -> 3457216 TTAGTAGATGGGCTTGATGTACAGTACTTTGAAATCAATTCCCTCCGCCG CAAAATGGCTGTAGTTAGTCAAGATACATTTATTTTCAACACTTCTATTA GAGACAATATCGCCTACGGTACATCTGGGGCGAGTGAAGCGGAAATTAGA GAAGTAGCGCGGCTAGCAAATGCGTTGCAATTTATCGAAGAAATGCCCGA AGGGTTTGATACTAAGTTAGGCGATCGCGGTGTCCGTTTATCTGGAGGAC AGAGACAACGGATTGCGATCGCTCGTGCATTACTCCGAGATCCCGAAATC CTCATTCTTGACGAAGCCACCAGCGCCCTAGATTCAGTCTCCGAGCGATT AATTCAGGAGTCTATAGAAAAACTTTCCGTGGGTAGAACAGTAATTGCGA TCGCTCACAGACTCTCCACAATTGCCAAAGCAGATAAGGTTGTGGTGATG GAACAAGGGCGAATTGTTGAGCAGGGAAATTATCAAGAACTTCTAGAACA ACGCGGAAAGCTCTGGAAATATCACCAGATGCAACACGAATCAGGACAGA CTAATTCG■□□■□■■□□■■■□□□□■■■□□□□■■■□□□□■■■□□□□■■□□□ □□□□■■□□□■□□□□■■□■□□■■■■□□□□■□■□■□■■■□■■■■□■■■■■■■ ■■■□□□■■□□■□■□□■■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□ □■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■■□□■■■■ATG AAAATTTCCGTCATCATCTCGAATTACAACTATGCTCGTTATCTTTCTAG AGCAATCAACTCTGTTCTCGCTCAAACTCACTCAGACATTGAAATCGTTA TCGTAGATGATGGTTCTACAGATAACAGCCGTGATGTTATTACCCAACTG CAAGAACAAGCACCGGATAAAATCAAGCCCATCTTTCAAGCAAATCAAGG ACAGGGAGGCGCTTTCAATGCGGGGTTTGCGGCGGCGACTGGCGAAGTCG 3455501 3455551 3455601 3455651 3455701 3455751 3455801 3455851 3455901 3455951 3456001 3456051 3456101 3456151 3456201 3456251 3456301 3456351 3456401 3456451 Contig GoTo Block Find Display PgUp/PgDnHelp Quit
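The gene coordinates in this display are enough to work out which part of the window lies between the two genes. A minimal sketch, assuming the coordinates shown are 1-based and inclusive; the helper `intergenic_regions` is a hypothetical name, not part of the viewer:

```python
def intergenic_regions(window_start, window_end, genes):
    """Return the (start, end) stretches of the window not covered by any gene.
    All coordinates are 1-based and inclusive."""
    gaps = []
    cursor = window_start  # first position not yet accounted for
    for gene_start, gene_end in sorted(genes):
        if gene_end < cursor:
            continue  # gene lies entirely before the uncovered part
        if gene_start > window_end:
            break  # gene lies entirely after the window
        if gene_start > cursor:
            gaps.append((cursor, gene_start - 1))
        cursor = max(cursor, gene_end + 1)
    if cursor <= window_end:
        gaps.append((cursor, window_end))
    return gaps

# Gene coordinates as shown on the slide.
genes = {
    "alr2835 (hepA)": (3454238, 3456061),
    "alr2836": (3456248, 3457216),
}
gaps = intergenic_regions(3455501, 3456500, genes.values())
print(gaps)  # [(3456062, 3456247)]
```

The single gap, 186 bases long, ends just before the ATG that begins alr2836.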
Old World: We ran the process
New World: We get the information
  1 aagctttgaa agcactacag gatttacctt
 61 aacaactaag tcgctctcaa gttacttctc
121 aatctgaatc tatcgcgggt gtggcaaaag
181 tctcaatgga agacttatta actcaaattc
241 gagtggcgag gattagcgtc aatctatagg
301 taaaatgctt atactgtcat ggcttgagtc
361 cgcctgaacc tttgctagag tatctttttc
421 ccaatggctg aaaagctacc ttagtttcag
481 acagtaagcc ttctagactc aggcagtttt
The New World TGAGACACATATTTTTGATATTCCAGTTGTTGCAATC GAATGTAAAACATATTTAGATCTTTAAATGTATGGTAC ATTCAAGATCCAACCTTCATTCTAGTGTTTAAAGAGAAC TGATTTGTTTGCAGGGGCAGGAGGCTTTGGTTTAGGTTTTG AAATGGCAGGCTTCTCTGTACCTTTATCTGTTGAAATTGATACCTGGGCTTGTGATACACTACGCTACAACCGCCCTGATTCAACAGTTATTCAAAATGATATCGGTAACTTTAGTACAGAAAATGACGTTAAGAATATCTGCAACTTTAAACCTGATATTATTATTGGCGGGCCTCCATGCCAGGGATTTAGTATTGCTGGGCCAGCCCAAAAAGATCCTAAAGATCCTAGAAATGGTTTATTCATCAACTTTGCACAATGGATAAAATTTCTTGAACCTAAAGCGTTTGTCATGGAAAACGTAAAAGGATTGCTATCAAGGAAAAATGCAGAAGGTTTTAAAGTTATAGATATTATTAAGAAAACATTTGAAGAACTTGGTTATTTTGTCGAAGTATGGGTTTTAAATGCTGCGGAATATGGCATTCCGCAAATTAGAGAACGTATTTTTATTGTTGGCAATAAAAAAGGTAAAGTACTAGGTATTCCTAAAAAAACACATTCTCTGCAATTTTTAAATTTAAATAGGTCTCAATTATCGATCTTCGATGATATGAGTATTATACCTGCACTAACTTTGTGGGACGCAATATCAGACTTACCAGAACTTAATGCGCGTGAAGGAAGTGAAGAGCAACCCTATCATTTAAAACCTCAAAATACTTATCAGACTTGGGCTAGAAATGGTAGTGCTACGCTTTACAATCATGTTGCAATGGAACATTCTGACCGTTTAGTAGAACGTTTCCGGCATATAAAATGGGGTGAATCCAGTTCGGATGTATCTAAAGAACATGGAGCTAGACGACGTAGTGGTAATGGTGAATTATCAAACAAATCATATGATCAGAATAATCGCCGTTTAAATCCTCATAAACCGTCTCACACTATTGCTGCGTCATTCTATGCTAATTTTGTCCATCCTTTTCAACATCGAAATTTAACAGCCCGTGAAGGAGCTAGAATCCAATCTTTTCCAGATAACTATAGATTTTTTGGAAAAAAAACTGTCGTATCTCATAAACTATTGCATCGAGAAGAAAGATTTGATGAAAAATTTCTTTGTCAATATAATCAAATCGGTAATGCTGTACCCCCTCTTCTCGCTAAAGTAATTGCACATCATCTTCTAGAGAAATTAGAGTTATGCCAACAACTGATAGAAATCCTCTAGTGCATGGATCAAATCTTGAACAAAAAGAGAATCATCGTACAAAATACAGAGATACTGAAAGCAGGACTTTCCTTAGAGAAATCAGAACTGAATATGACAAATGGCATAAAGCAAATATGAACCTGGTTGGACCAAAATCAGAAATTACTGACCAAGATGATTCAATTATTACTCAAAGAGTGGAACTTCTCACTAAATATAAAGATTTTTTAGATCAGCAGCATTATGCAGAAAAATTTGATTCAAGATCCAACCTTCATTCTAGTGTTTTAGAGACCATTTATAAAGTAAATCTTTAGACGACTAGACGACGTAGCATAATACGAGTCATAACGGCATATATGGCAGCCTCACTCATTTCTGGGAGACGCTCATAATCCTTACTGAGACGACGGTACTGGTTTAACCAGCCAAATGTTCTTTCTACTACCCACCGTTTGGGCAAAACCTGAAATTCTTGATTAGTACGCCGGATTACCTCAACATGAGCTTGAATCATCAGCCAAACAGAGAGCGCAAATTTATCACCGTCATAGCCGGAATCAACCCAGATGACTTCAACTTTTTCCAGTAATTCTGGACGCTCTTCTAACAGTTCCATCAAAGTATAGGCGGCAAGTAATCTTTCTCCAGCATTTGCTTCACTTACAAC
CACTTTTAACAAAAGTCCCAGACTATCAACCAAAGTTTGCCGCTTTCGTCCTTTTACCTTCTTGCCACCATCAAAACCGTACACATCCCCCTTTTTTCAGTCGTTTTTACCGACTGGCTGTCTGCCGCGATCGCCGTGGGTTGAGTTGACTTCCCCATTTTTTGACGAACTTGATCGCGCAAAGTATGATTCATTTCAGTTGAACTAGGAGGAAAATCCCCTGGAAGCATATCCCACTGACAACCTGTTTTCAGATGGTAGTAGATAGCGTTGCATACTTCTCGCATATCAGTTGTTCGGGGATGCCCACCGCATTTAGCGGGTGGAATCAAAGGAGCTAAAATTGCCCATTCTGAGTCATTAAGGTCTGTAGAATAAGACTTTCGTCTCATTGTTTCCTATGTAAATACACTCTACAAACAGTATCTTATCGCTGCCTTTTTATCTTAGCTCTCCTTTAGATTTACTTTATAAATAGCCTCTTAGAAGAATTTCTTTATTATTTATTTAAAGATTTAGTACAAGATTTCGGGCAGAACGCTCTTATTGGTAAGTCACACACGTTCAAAGATATTTTCTTCGTACCACCAAAATATTCTGAAATGCTCAAGCGACCTTATGCGCGAATTGAGAGAAAAGATCATGATTTCGTAATTGGTGCAACTGTTCAAGCATCGCTTGAAGCAGCACCTCCTCCAGAACAAAACCATGCTTGAGGGATCTTCACGCGCAGCAGAGGATTTAAAAGCGAGAAATCCTAACAGTTTATACCTTGTGGTTATGGAATGGATAAAACTGACCAATGATGTAAATTTACGAAAATATAAAGTTGATCAAATTTATGTACTACGTCAGCAAAAAAATACTGATAGAGAGTTTAGGTATGAGTCAACTTACATAAAAAAT
AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGAGGGCATCTTGAAATGTATCAGGATAACCAACCTGGTCTCCAGGAGCAAAATAAGCAACTTTTTTGCCGATGAAGTCAATGTTATCTAACTCATCATAAAAATTTTCCCAATCACTTTGCAATTCTCCAACATTCCAGGTAGGACAACCAACAACGATATAATCGTAGTTATTGAAATCACTTGGTTCAGCTTGTGAAATATCATATAAAGTTACAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGACATTTTTACTCCTTTTATGTATTTGCAAAATTATTTCAATTAAAATATTTAGTAATAATTAATTGTTAGCTAGCTAATAATTAAATTTTTATTACAATCATTGTAAAAGGCATTGAAAAAGTAAATAAAAATTTTTATTCTACGTTATTTCAAAAATATTTACTTACATATACTTAACCTTTATAGTGATGTAATATACTCTAATTCCTATTTTACTTATAAATACCATCTCAGCTTAATGTAACGAATTTTTCTGTTTATCTTTAAATACAAAAAATTCAACAAAACTACAGAAAATTAATCTTAATAACACAAAACAAGTATCAATCTGTAATACAACTAAGCTTAAATAAATTAATAGAAAGCTTCATCTATCTAATAGGTTGAGAATAGTTTATGTCTAATGACATAAATTCATTCGTGTTGATTTCATTTGGGTATATTCATCTGATTTAGGATTTACTCCATTAAGTTTGTACTCATCAATGCCCGCCTGTTGGTATCCACAATTCTCATACAGTGCGCGAGCAAAGTAATCAATCGTTCGTCGCCATATCTAACTTTGAGTCAAACAAACCAGTTGGATTACCAACCCTCAACTAATCGCTTCTTTAAGGCGAGCGATCGCACATTTAACTGTTGGTTGTCACAAGAGAACTAATACTACAGCAGTATATTTAACAACTAAGGGTGGTTCAACTTTCGCTGCGACTCCTCCAACGCGCTGAAATACACAGGACTGATGCGATCGCAAACTCTTTGACTAAATTCCATACATTATCATGACCATCTCCCAAACAAACAAGTGGGTTAACCAGATGCTGACTATTAACATCCCCTGAGTTCGGAGTTGTAGGTCTATTTGACTGGTTCAAAGCGATGATGGAACGGCTTTGTTGCATGAATTAAAAAAAGACACACCATCACCTACTTCTAGGATAGACACATCAAACGTCCCACCGCCTAAGTCAAATACCAAGATAATTTCGTTAGTTTTCTTGTCAAGTCCGTAAGCGAGGGCCGCCGCCGTGGGCTAGTTGATAATTCGCAGAACTTTAATCCCGGCAATTCTACTGGCATCTTTGGTAGCCTGCCGTTGAGAGTCATTGAAATAGGCAGGGGTGGTAATTACCGCTTGCCTCACTGGTTCCCCCAGATATGTGCTGGCATCATCTATCAGCTTGCGGACTACCTCATACCATTTCACGAAAAACCTGATACACATGTAAACTCTGAAACCCTTGCTGTATCAAAGTTTTGTAATTACGAATTACGAATTACGAATTGATATCAGCCGAGATTTCTTCGGGTGAAAATTCCTTGTTCAGAGCGGGACAGTGTAGCTTGACATTGCCATTACTGTCACGTACCACTTTGTAAGTAACTTGTTTTGCCTCTTGCGTAACTTCATCATACCTGCGCCCGATGAACCGCTTCACAGAATAAAAAGTGTTTTCTGGGTTCATTACACCCTGGCGCTT
 The New World What have we lost?
From Sequence to Organism: How does Nature do it?
TCTACTTATA TTCAATCCAC AGGGCTACAC CTAGTTCTTG AAGAGTCTGT TGAATGAACA CATACATGGT TTATCTGTTT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC CACTAGTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC TTAGATAAAC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCACGCCC CTCCGTAAAC CTCTAACATG ATGTCAGCAA ATATTAAAAA TGAATAAACT TTGTTAAAGG TACAAATGAA AATTAGCAAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT CATTCTAGGG AAACCTGTAT GGTTACATGA ACTGCCTAAA AAACAAGCTA TTATATATTT TAAGAAATTA ATTGCAATTA ATTTCCTGGG CCCCAGCTGT CATTAAAAAG AGGCAAATAC AGCCAAGGAC GACAGCACTG ACCCTCAAGA AGGCACCGGC TGACAGACAG GCTGAAATTC CGCTGAGAGC AGAGTGGTAC ATTGAACCCT CCCTGCACCA GGTCTTTCCT GTGGGCACTG AGTGCAGACA ATGAATGACT GAACGAACGA TTGAATGAAA AGAAATGAGA
3%
ATGACTTATGATCAACGCACAGGGCTA
Genetic code
Met-Thr-Tyr-Asp-Gln-Arg-Thr-Gly-Leu...
?
Rules of transcriptional and post-transcriptional control
• Begin transcription
• End transcription
• Splice transcript
• Begin translation
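The genetic-code step on this slide, from ATGACTTATGATCAACGCACAGGGCTA to Met-Thr-Tyr-Asp-Gln-Arg-Thr-Gly-Leu, can be reproduced with a short Python sketch. It assumes the standard genetic code and a reading frame that starts at the first base; the names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def translate(dna):
    """Translate DNA codon by codon from the first base, stopping at a stop codon."""
    residues = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":  # stop codon ends the protein
            break
        residues.append(THREE_LETTER[amino_acid])
    return "-".join(residues)

peptide = translate("ATGACTTATGATCAACGCACAGGGCTA")
print(peptide)  # Met-Thr-Tyr-Asp-Gln-Arg-Thr-Gly-Leu
```

Translation is the easy part. The rules listed on the slide (where transcription begins and ends, how the transcript is spliced, where translation begins) are not captured by a lookup table like this one.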
TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA What genes are in my organism? Interpolated Markov model Candidate genes Predicted genes How do Biologists use Bioinformation? Gene finder
How do Biologists use Bioinformation?
What genes are in my organism?
TCTACTTATA TTCAATCCAC AGGGCTACAC AAGAGTCTGT TGAATGAACA CATACATGGT TTCTGTCTGC TCTGACCTCT GGCAGCTTTC TGGATTTCGG AACTCTAGCC TGCCCCACTC GAACCTTAGT GACTTCTGCT ATACCAAAGT CTCCGTAAAC CTCTAACATG ATGTCAGCAA TGAATAAACT TTGTTAAAGG TACAAATGAA AAGAGTTTAA AGTTAAAAAC GAATTGCAGT AAACCTGTAT GGTTACATGA ACTGCCTAAA TTATATATTT TAAGAAATTA ATTGCAATTA CCCCAGCTGT CATTAAAAAG AGGCAAATAC GACAGCACTG ACCCTCAAGA AGGCACCGGC GCTGAAATTC CGCTGAGAGC AGAGTGGTAC CCCTGCACCA GGTCTTTCCT GTGGGCACTG ATGAATGACT GAACGAACGA TTGAATGAAA
Gene finder
Interpolated Markov model
Candidate genes
Conform to standard model
Challenge accepted beliefs
Predicted genes
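To make the "candidate genes" step concrete, here is a deliberately simple Python sketch that lists open reading frames on the forward strand. It is not an interpolated Markov model and not the gene finder the slide refers to: a real gene finder scores candidates like these against a statistical model before calling them predicted genes. The function name `candidate_orfs` and the `min_codons` parameter are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def candidate_orfs(dna, min_codons=2):
    """List forward-strand open reading frames as (start, end) pairs,
    1-based and inclusive, running from an ATG through the next in-frame
    stop codon. min_codons counts the stop codon as well."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3))
                start = None
    return sorted(orfs)

# Toy sequence (made up): one ORF, ATG AAA TAG, at positions 3-11.
orfs = candidate_orfs("CCATGAAATAGCC")
print(orfs)  # [(3, 11)]
```

A filter this crude accepts every ATG-to-stop stretch and ignores the reverse strand, which is why the slide separates candidate genes from predicted genes.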
globin • Highly filtered output • Easy to grasp • High-level insights Filters are powerful
globin • Highly filtered output • Easy to grasp • High-level insights • Unfiltered output • Confusing • Basic insights Filters Constrain New Discovery
Filters are tempting Globin
1. Need high-level filters Current State of Affairs
Current State of Affairs 1. Need high-level filters 2. Need access to raw phenomena AATAAAGCTTTACAAACCAAACTCTGGCTTCAATTGTGTAACCCAAGCTTTGATTCTTTCCTCTGTTAAATCGGATTGATTATCTTCATCAAGGGCAAGACCTACAAATTTACCATCACGAACAGCTTTAGACTCACTGAATTCATAACCTTCTGTAGGCCAATAGCCAACTGTTTCACCACCATTTTCTGAAATTTTTTCCTCTAGAATACCGCAACACTATCACCACCAAACTCCTTCTGAATTATTTCTGATTCAGTTTGGGTATTGCCTGTTTGAGTACCAAAAAATAAACCAATATTAGAC
Current State of Affairs
1. Need high-level filters
2. Need access to raw phenomena
3. Need ability to build new tools

ASSIGN K12-set FROM Gene-finder (K12-DNA)
ASSIGN O157-set FROM Gene-finder (O157-DNA)
CONSIDER EACH protein IN O157-set
    WHEN Constituent-of (K12-set, protein) = FALSE
    COLLECT protein
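The pseudocode on this slide amounts to a set difference: collect every protein predicted in O157 that is not also predicted in K12. A minimal Python sketch, assuming the gene finder has already been run and its output is available as collections of protein identifiers. The function name `proteins_unique_to` and the toy identifiers are illustrative, not taken from the presentation.

```python
def proteins_unique_to(target_set, reference_set):
    """Collect each protein of target_set that is not a constituent of
    reference_set, keeping the order in which the proteins were listed."""
    reference = set(reference_set)
    return [protein for protein in target_set if protein not in reference]

# Toy stand-ins for the output of Gene-finder (K12-DNA) and Gene-finder (O157-DNA).
k12_set = ["thrA", "thrB", "lacZ"]
o157_set = ["thrA", "stx1A", "lacZ", "eae"]

only_in_o157 = proteins_unique_to(o157_set, k12_set)
print(only_in_o157)  # ['stx1A', 'eae']
```

In practice, deciding whether a protein is a constituent of the other set would rest on sequence similarity rather than matching names, which this sketch does not attempt.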
We need… Biologists . . . . . . and Programmers
Current State of Affairs 1. Need high-level filters 2. Need access to raw phenomena 3. Need ability to build new tools Need biologist programmers
Why hasn’t this happened?
Part of bioinformatic program written in C

    if (pcInFile == NULL)
        pfInFile = stdin;
    else
        pfInFile = fopen(pcInFile, "r");
    pfOutFile = fopen( pcOutFile, "w" );
    if (pfInFile == NULL) {
        fprintf( stderr, "ERROR opening %s\n", pcInFile );
        exit(1);
    }
    if (pfOutFile == NULL) {
        fprintf( stderr, "ERROR opening %s\n", pcOutFile );
        exit(1);
    }
    fputc( fgetc(pfInFile), pfOutFile ); /* deal with first '>' in file */
    for ( ; ; ) {
        if (processIdentifier( pfInFile, pfOutFile )) {
        } else {
            break;
        }
        if (processSequence( pfInFile, pfOutFile )) {
        } else {
            break;
        }
    }
    fclose( pfInFile );
    fclose( pfOutFile );
Why hasn’t this happened?
Part of bioinformatic program written in Perl

    sub match_positions {
        my $pattern;
        local $_;
        ($pattern, $_) = @_;
        my @results;
        local $matchStart;
        my $instrumentedPattern = qr/(?{ $matchStart = pos() })$pattern/;
        while (/$instrumentedPattern/g) {
            my $nextStart = pos();
            push @results, "[$matchStart..$nextStart)";
            pos() = $matchStart+1;
        }
        return @results;
    }
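For readers who find the Perl hard to follow, the routine reports every position where a pattern matches, including overlapping matches, as half-open intervals. A rough Python counterpart is sketched below. It swaps in a different technique (a lookahead assertion instead of resetting the match position) and may differ from the Perl in edge cases such as patterns that can match the empty string.

```python
import re

def match_positions(pattern, text):
    """Return '[start..end)' for each position in text where pattern matches,
    allowing matches to overlap."""
    # The lookahead consumes no characters, so the scan resumes one position
    # after each match start; the inner group captures the actual match.
    lookahead = re.compile(f"(?=({pattern}))")
    return [f"[{m.start(1)}..{m.end(1)})" for m in lookahead.finditer(text)]

positions = match_positions("AA", "AAAA")
print(positions)  # ['[0..2)', '[1..3)', '[2..4)']
```

Even in this shorter form the logic depends on regular-expression details that have little to do with biology, which is the point the slide is making.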
Why hasn’t this happened? Biologists will not come to programming Programming must come to biologists
BioLingua • Provides knowledge in accessible form • Provides tools accessed in common way • Provides results that can be manipulated • Provides a programming language that speaks to biologists
Jeff Elhai Center for the Study of Biological Complexity Virginia Commonwealth University Phone: 828-0794 E-mail: ElhaiJ@VCU.Edu BioLingua http://ramsites.net/~biolingua/help