
Phylogenetic Inference

Phylogenetic Inference. Construction of Phylogenetic Trees II. Ana Margarida Sousa, Instituto Gulbenkian de Ciência, Grupo de Biologia Evolutiva. amsousa@igc.gulbenkian.pt.



  1. Phylogenetic Inference: Construction of Phylogenetic Trees II. Ana Margarida Sousa, Instituto Gulbenkian de Ciência, Grupo de Biologia Evolutiva, amsousa@igc.gulbenkian.pt

  2. True tree [Figure: tree with the labels A, B, C, wt, D, E, F, G]

  3. Data I - RESTRICTION DATA [Figure: physical restriction maps of taxa A, B, C, D, E, F and G, with sites labelled BclI, BglII, BamHI and Sau3AI]

  4. Data matrix for taxa A, B, C, D, E, F, G:
     7 53 1
     A 11011100100000111010000001110110101000001110000001110
     B 11111010100100111010000001110100101000101110000001010
     C 00011000110011111010000001101100101000001100000011010
     D 00011000100000011010000011110111101110001100000011010
     E 10000001101001111111110100100100111000111101010110100
     F 10000110101000011110111111100110101001111101101001011
     G 10010000100000111110010100100100101000011101001001010
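To make the layout of this matrix concrete, here is a minimal Python sketch that parses it and counts pairwise mismatches. In the PHYLIP restriction-site format the header gives the number of taxa, the number of sites and the number of enzymes. The mismatch count is a plain Hamming distance chosen for illustration; it is not the distance that restdist computes, and the function names are hypothetical.

```python
# Restriction-site matrix from slide 4 (1 = site present, 0 = site absent).
PHYLIP_TEXT = """7 53 1
A 11011100100000111010000001110110101000001110000001110
B 11111010100100111010000001110100101000101110000001010
C 00011000110011111010000001101100101000001100000011010
D 00011000100000011010000011110111101110001100000011010
E 10000001101001111111110100100100111000111101010110100
F 10000110101000011110111111100110101001111101101001011
G 10010000100000111110010100100100101000011101001001010
"""


def parse_restriction_matrix(text):
    """Return {taxon: '0101...'} after checking the rows against the header."""
    rows = [line.split() for line in text.strip().splitlines()]
    n_taxa, n_sites = int(rows[0][0]), int(rows[0][1])
    matrix = {name: sites for name, sites in rows[1:]}
    if len(matrix) != n_taxa:
        raise ValueError("number of rows does not match the header")
    if any(len(sites) != n_sites for sites in matrix.values()):
        raise ValueError("row length does not match the header")
    return matrix


def hamming(a, b):
    """Number of sites at which two taxa differ (illustrative only)."""
    return sum(x != y for x, y in zip(a, b))


matrix = parse_restriction_matrix(PHYLIP_TEXT)
distances = {(p, q): hamming(matrix[p], matrix[q]) for p in matrix for q in matrix}
print(distances[("A", "B")])
```

A real analysis would pass the file to restdist instead; the sketch only shows what a pairwise comparison over the 0/1 rows looks like.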

  5. Data II - NUCLEOTIDE SEQUENCES
     7 553
     A GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGATTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGATAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
     B GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGAGAACTCATGATCAAGAGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGTTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGTTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
     C GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGTTGGAAGCGTTTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGATCTACGATATGGTAGAATTATAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
     D GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTTTACTGAACGATGACCGTTGCTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGGCGTAGACTTTAACACAGGGTCCGGCCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGTCTCTGCTACAGGAATACTTCCCTGAAATCAAATGGCAAGCGACTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAGAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
     E GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAACTCATGATCAAGGACGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATACTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAACAAAAGCGAGCAGAAGGCCATGAGGGTTTCATTGTGAAAGACCC
     F GGAACATCTGCGTAGACAATACTGCTAACAGTTATTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATATTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAAAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
     G GGAACATCTGCGTAGACAATACTGCTAACAGTTACTGGCTCTCTCGTGTATCTAAAACGATTCCGGCACTGGAACACTTAAACGGGTTTGATGTTCGCTGGAAGCGTCTACTGAACGATGACCGTTGTTTCTACAAAGATGGCTTTATGCTTGATGGGGAATTCATGATCAAGGGCGTAGATTTTAACACAGGGTCCGACCTACTGCGTACTAAATGGACTGACACGAAGAACCAAGAGTTTCATGAAGAGTTATTCGTTGAACCAATCCGTAAGAAAGATAAAGTTCCCTTTAAGCTGCACACTGGACACCTTCACATAAAACTGTACGCTATCCTCCCGCTGCACATCGTGGAGTCTGAAGAAGACTGTGATGTCATGACGTTGCTCATGCAGGAACACGTTAAGAACATGCTGCCTCTACTACAGGAATACTTTCCTGAAATCAAATGGCAAGCGGCTGAATCTTACGAGGTCTACGATATGGTAGAATTACAGCAATTGTACGAGCAAAAGCGAGCAGAAGGCCATGAGGGTCTCATTGTGAAAGACCC
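As a rough illustration of how aligned sequences like these are turned into a pairwise distance, the Python sketch below computes the proportion of differing sites (p-distance) and applies the Jukes-Cantor correction. dnadist offers several substitution models; Jukes-Cantor is used here only because it is the simplest. The two 20-base strings are hypothetical toy data, not the slide sequences.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of aligned positions that differ; gaps and ambiguities are skipped."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical toy sequences that differ at 2 of 20 positions.
toy_a = "GGAACATCTGCGTAGACAAT"
toy_b = "GGAACATTTGCGTAGGCAAT"

p = p_distance(toy_a, toy_b)
d = jukes_cantor(p)
print(f"p-distance = {p:.3f}, Jukes-Cantor distance = {d:.4f}")
```

The corrected distance is always at least as large as the raw proportion, because it accounts for multiple substitutions at the same site.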

  6. Physical maps (restriction data) → data matrix (0/1) → distance matrix → UPGMA, NJ, ME; or data matrix (0/1) → MP, ML directly.
     Nucleotide sequences (alignment) → distance matrix → UPGMA, NJ, ME; or alignment → MP, ML directly.
     Boolean data (0/1)
     Method                 Program
     Distance calculation   restdist
     UPGMA                  neighbor
     NJ                     neighbor
     ME                     fitch
     MP                     pars
     ML                     restml
     Sequence data
     Method                 Program
     Distance calculation   dnadist
     UPGMA                  neighbor
     NJ                     neighbor
     ME                     fitch
     MP                     dnapars
     ML                     dnaml
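The two tables on this slide can be read as a lookup from data type and method to the PHYLIP programs to run. A minimal Python sketch of that lookup follows; the dictionary keys and the function name `programs_for` are hypothetical, while the program names are the ones listed on the slide.

```python
# Program names as listed on the slide, keyed by data type and method.
PHYLIP_PROGRAMS = {
    "restriction": {"distances": "restdist", "UPGMA": "neighbor", "NJ": "neighbor",
                    "ME": "fitch", "MP": "pars", "ML": "restml"},
    "sequence": {"distances": "dnadist", "UPGMA": "neighbor", "NJ": "neighbor",
                 "ME": "fitch", "MP": "dnapars", "ML": "dnaml"},
}

DISTANCE_METHODS = ("UPGMA", "NJ", "ME")


def programs_for(data_type, method):
    """Return the programs to run, in order, for one data type and method."""
    table = PHYLIP_PROGRAMS[data_type]
    if method in DISTANCE_METHODS:
        # Distance methods need the distance matrix computed first.
        return [table["distances"], table[method]]
    return [table[method]]


print(programs_for("restriction", "NJ"))
print(programs_for("sequence", "ML"))
```

Distance methods return two programs because the tree-building program takes a distance matrix as input, whereas MP and ML work on the data matrix or alignment directly.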

  7. BOOTSTRAP
     Data matrix (0/1) → generate 100 pseudo-replicates → 100 distance matrices → 100 NJ trees → majority-rule consensus tree.
     Nucleotide sequences (alignment) → generate 100 pseudo-replicates → 100 distance matrices → 100 NJ trees → majority-rule consensus tree.
     Method              Program
     Pseudo-replicates   seqboot
     Consensus tree      consense
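The pseudo-replicate step can be sketched in a few lines of Python: each replicate is built by drawing columns (sites) from the original matrix with replacement until it has the original number of sites. This is a conceptual illustration, not seqboot itself; the function name is hypothetical and the toy matrix uses only the first ten sites of taxa A, B and C from slide 4.

```python
import random


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Resample columns with replacement; every replicate keeps the original width."""
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        columns = [rng.randrange(n_sites) for _ in range(n_sites)]
        replicates.append(
            {name: "".join(alignment[name][c] for c in columns) for name in names}
        )
    return replicates


# First ten sites of taxa A, B and C from the restriction matrix.
toy = {"A": "1101110010", "B": "1111101010", "C": "0001100011"}
reps = bootstrap_replicates(toy, n_replicates=100)
print(reps[0])
```

Each replicate would then go through the same distance and NJ steps as the original data, and the 100 resulting trees would be summarized as a majority-rule consensus.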

  8. Steps for using any of the programs in the Phylip package.
     • Copy the input file (.txt format) to the folder containing the executable you are going to use (e.g. restdist.exe).
     • Double-click the executable to open the program.
     • Type the name of the input file (do not forget the ‘.txt’ extension).
     • Change the options you need, as indicated in the menu.
     • Type ‘y’. An ‘outfile’ and/or a ‘treefile’ is generated automatically.
     • Move these files to another folder and rename them.
     • Open the ‘treefile’ with the TreeView program to examine the resulting tree.
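The interactive steps above could also be scripted by feeding the same keystrokes to the program on standard input. The Python sketch below is one way to do that, under stated assumptions: the program prompts for the input file name first (which PHYLIP does when no file called ‘infile’ is present), no old ‘outfile’ or ‘treefile’ is left in the folder, and the executable path, file name and option key are hypothetical placeholders.

```python
import subprocess


def build_menu_input(infile_name, option_keys=()):
    """Keystrokes in the order of the slide: file name, menu options, then 'Y'."""
    lines = [infile_name, *option_keys, "Y"]
    return "\n".join(lines) + "\n"


def run_phylip(executable, infile_name, option_keys=(), workdir="."):
    """Run one PHYLIP program non-interactively (assumes the prompts described above)."""
    return subprocess.run(
        [executable],
        input=build_menu_input(infile_name, option_keys),
        text=True,
        capture_output=True,
        cwd=workdir,
        check=True,
    )


# Hypothetical example: file 'restriction.txt' and one menu option key 'N'.
print(repr(build_menu_input("restriction.txt", ["N"])))
```

Calling `run_phylip` with the executable path would leave the ‘outfile’ and/or ‘treefile’ in `workdir`, which should then be renamed before the next program is run, as in the manual procedure.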

  9. Bayesian inference using the MrBayes program. Mixed data: restriction data + sequence data
     #NEXUS
     begin data;
     dimensions ntax=14 nchar=5128;
     format datatype=mixed (Restriction:1-304,DNA:305-5128) interleave=yes gap=- missing=?;
     matrix
     A 0000100010100?01001000000000001000100001000001011000000000000001010010000000000010000011100000000110000000000000001000010100000100000100000000001000000000000000000000000001010001000000010001010000000001000010100000000100010000000001000001000101000010000010001000001000100000000010100101010000010100000100
     B 0100010000000?0?001101001000001110000100100001001000000000000001011110000010110000000000101000010001100000010000010000001000000010000101000100000100001100000010001000011001000000100100000001100000000000000001100010000101100000001001010101000000000001000100000100000010100001000000100001000000010011101000
     C 0010010000000?0?001101101000001110000100000100001000000000000001111010000010111000011000101000010001000000000000010000001000000010000101000100000100001100000010000000000111000001100100000001100000000000010001100010000101100000000001000101000000000001000100000100000010100001000100100001000000010000101000

  10. End of the interleaved matrix, followed by the MrBayes block:
     B11P10 TAAAAATCTGAGTGACTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
     B13P10 TAAAAATCTGAGTGATTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
     B14P10 TAAAAATCTGAGTGATTATCTCACAGTGTACGGAC-CTAAAGTTCCCCCA
     [ 4810 4820 ]
     [ * * ]
     a5P10 TAGGGGGTACCTAAAGCCCAGCCA
     a7P10 TAGGGGGTACCTAAAGCCCAGCCA
     a8P10 TAGGGGGTACCTAAAGCCCAGCCA
     a9P10 TAGGGGGTACCTAAAACCCAGCCA
     a11P10 TAGGGGGTACCTAAAGCCCAGCCA
     a13P10 TAGGGGGTACCTAAAGCCCAGCCA
     a14P10 TAGGGGGTACCTAAAGCCCAGCCA
     B3P10 TAGGGGGTACCTAAAGCCCAGCCA
     B7P10 TAGGGGGTACCTAAAGCCCAGCCA
     B9P10 TAGGGGGTACCTAAAGCCCAGCCA
     B10P10 TAGGGGGTACCTAAAGCCCAGCCA
     B11P10 TAGGGGGTACCTAAAACCCAGTCA
     B13P10 TAGGGGGTACCTAAAACCCAGTCA
     B14P10 TAGGGGGTACCTAAAACCCAGTCA
     ;
     end;
     begin mrbayes;
     delete 1 4 6 7 12 13 14; [exclude these taxa from the analysis]
     charset Restriction=1-304;
     charset DNA=305-5128;
     partition Names=2: Restriction, DNA;
     set partition=Names;
     lset applyto=(2) nst=6 rates=gamma; [GTR model with gamma rate variation for the DNA partition]
     unlink shape=(all) pinvar=(all) statefreq=(all) revmat=(all); [estimate these parameters separately for each partition]
     prset ratepr=variable; [let the partitions evolve at different overall rates]
     mcmcp ngen=1000 printfreq=100 samplefreq=100 nchains=4 savebrlens=yes filename=Allenz+Allseqs0;
     mcmc;
     end;
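A mixed-data partition only makes sense if the character sets cover the whole matrix without gaps or overlaps. The short Python sketch below checks this for the ranges used in the block above (1-304 and 305-5128, with nchar=5128); the function name `check_partition` is hypothetical and the check is not part of MrBayes.

```python
def check_partition(nchar, charsets):
    """Verify that 1-based inclusive ranges tile 1..nchar; return the size of each set."""
    expected_start = 1
    for name, (start, end) in sorted(charsets.items(), key=lambda kv: kv[1][0]):
        if start != expected_start or end < start:
            raise ValueError(f"charset {name} leaves a gap or overlaps another set")
        expected_start = end + 1
    if expected_start != nchar + 1:
        raise ValueError("charsets do not reach the last character")
    return {name: end - start + 1 for name, (start, end) in charsets.items()}


sizes = check_partition(5128, {"Restriction": (1, 304), "DNA": (305, 5128)})
print(sizes)
```

For these ranges the restriction partition has 304 characters and the DNA partition has 4824, which together account for all 5128 characters declared in the data block.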

  11. Steps for using the MrBayes program
     • Save the input file in the same location as the program executable.
     • Start the program.
     • Type the command ‘execute’ followed by the name of the input file (do not forget the ‘.txt’ extension).
     • Increase the number of generations to 1 000 000.
     • Check whether, after this number of generations, the standard deviation between the chains (reported by MrBayes as the average standard deviation of split frequencies) is ≤ 0.01.
     • If so, the program can be stopped.
     • Type the command ‘sump burnin = 2500’ (summarizes the parameter values).
     • Type the command ‘sumt burnin = 2500’.
     • Check the result by opening the file with the ‘.con’ extension in the TreeView program.
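The burn-in value in the steps above is easier to judge once it is related to the number of samples. The Python sketch below does the arithmetic for 1 000 000 generations and samplefreq=100, under two assumptions: the burn-in is counted in samples rather than generations, and the starting state is also written as a sample. The function name is hypothetical.

```python
def burnin_summary(ngen, samplefreq, burnin_samples):
    """Return (total samples, samples kept, fraction discarded as burn-in)."""
    total = ngen // samplefreq + 1  # assumption: the starting state is sampled too
    if burnin_samples >= total:
        raise ValueError("burn-in would discard every sample")
    kept = total - burnin_samples
    return total, kept, burnin_samples / total


total, kept, fraction = burnin_summary(ngen=1_000_000, samplefreq=100, burnin_samples=2500)
print(f"{total} samples, {kept} kept, {fraction:.1%} discarded")
```

Under these assumptions a burn-in of 2500 discards roughly the first quarter of the samples from each run.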
