ATGGTTAACTAGTTAGGAATCGCGCATTATGTCCACGGTAACGTTAGGTTGAACGGCAGGTTTAAATCGATTCCCAGATACGTTATGAAATTGGGGCAGGTTTAACGCGCCC
Peptide fragments shown on the slide: M V N, M S T, M K L G Q V
Legend: M = Methionine, V = Valine, N = Asparagine, S = Serine, T = Threonine, K = Lysine, P = Proline, G = Glycine, L = Leucine, Q = Glutamine, STOP = stop codon
mRNA: AUGGUUAACUAGUUAGGAAUCGCGCAUUAUGUCCACGGUAACGUUAGGUUGAACGGCAGGUUUAAAUCGAUUCCCAGAUACGUUAUGAAAUUGGGGCAGGUUUAACGCGCCC
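The transcription and translation step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own procedure: it assumes the standard genetic code, and the function names (`transcribe`, `translate_from`, `open_reading_frames`) are hypothetical.

```python
# Standard genetic code, codons enumerated in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Rewrite a DNA coding strand as mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate_from(mrna, start):
    """Translate codon by codon from `start` until a stop codon or the end."""
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[pos:pos + 3]]
        if amino == "*":
            break
        peptide.append(amino)
    return "".join(peptide)


def open_reading_frames(mrna):
    """Translate from every AUG start codon found in the sequence."""
    return [
        translate_from(mrna, i)
        for i in range(len(mrna) - 2)
        if mrna[i:i + 3] == "AUG"
    ]


print(open_reading_frames(transcribe("ATGGTTAACTAG")))
```

Reading from the first AUG of the slide's sequence, the codons AUG GUU AAC UAG give M V N followed by a stop.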
Is CCATG present in ATTACGGCCATGCGGAGCCGGAAG?
There is an algorithm that requires a number of comparisons equal to the length of ATTACGGCCATGCGGAGCCGGAAG.
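The slide does not name the algorithm. One well-known method whose number of comparisons grows linearly with the text length is Knuth-Morris-Pratt, sketched below as an assumption about what is meant; the function names are hypothetical.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


print(find("ATTACGGCCATGCGGAGCCGGAAG", "CCATG"))
```

The text pointer `i` never moves backwards, which is what keeps the work proportional to the length of the text.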
APPROXIMATE STRING COMPARISON: ALIGNMENT

TGTACGGAATCGGA
TCTCCGACCATCGGA

TG-TA-CGGA--ATCGGA   (4 gaps)
T-CT-CCG-ACCATCGGA   (3 gaps)
4 + 3 = 7

Merged: TGCTACCGGACCATCGGA
[Figure: alignment grid with TCTCCGACCATCGGA along one axis and TGTACGGAATCGGA along the other]
TG-TA-CGGA--ATCGGA   (4 gaps)
T-CT-CCG-ACCATCGGA   (3 gaps)
4 + 3 = 7

A second alignment, also with 7 gaps:
T-GTAC-GGA--ATCGGA
TC-T-CC-GACCATCGGA
Shortest path: how many operations?
N.B.: the number of paths is very large, so evaluating them explicitly is impossible!
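RECURSION! The cost of a node is the minimum, over its incoming arcs, of the predecessor's cost plus the arc cost (+1 for each gap arc).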
Number of operations = number of arcs: each arc is considered exactly once.
Two sequences of 1000 bases require a million operations.
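The shortest-path recursion can be sketched as a dynamic program over the alignment grid. This sketch assumes the model suggested by the alignments above: a gap costs 1, matching characters cost 0, and mismatched characters cannot be paired. The function name is hypothetical.

```python
def indel_distance(s, t):
    """Minimum number of gaps needed to align s and t (no substitutions allowed)."""
    # prev[j] = cost of aligning the current prefix of s with t[:j]
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        cur = [i]  # aligning s[:i] with the empty string takes i gaps
        for j, b in enumerate(t, 1):
            best = min(prev[j] + 1, cur[j - 1] + 1)  # vertical / horizontal arc
            if a == b:
                best = min(best, prev[j - 1])  # diagonal arc, only on a match
            cur.append(best)
        prev = cur
    return prev[-1]


print(indel_distance("TGTACGGAATCGGA", "TCTCCGACCATCGGA"))
```

Each grid cell is filled once from at most three neighbours, so the work is proportional to the product of the two lengths, in line with the million operations quoted for two 1000-base sequences.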
Different model: substitutions allowed

TGTACGGAATCGGA
TCTCCGACCATCGGA

TGTACGGA--ATCGGA
TCTCCG-ACCATCGGA

Costs shown on the slide: 4, 2, 2; total 8
[Figure: alignment grid for TCTCCGACCATCGGA and TGTACGGAATCGGA, with the values 14 and 6 marked]
TGTACGGA-ATCGGA
TCTCCGACCATCGGA

TGTACGGA--ATCGGA
TCTCCG-ACCATCGGA
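When substitutions are allowed, the same grid recursion gains a diagonal arc for mismatched characters. The sketch below treats the gap and substitution costs as parameters, because the slide's exact weights are not recoverable from the transcript; the function name and default costs are assumptions.

```python
def alignment_cost(s, t, gap=1, substitution=1):
    """Minimum total cost of aligning s and t with gaps and substitutions."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * gap]
        for j, b in enumerate(t, 1):
            diagonal = prev[j - 1] + (0 if a == b else substitution)
            cur.append(min(prev[j] + gap, cur[j - 1] + gap, diagonal))
        prev = cur
    return prev[-1]


# With a substitution priced like two gaps, the gap-only optimum is recovered.
print(alignment_cost("TGTACGGAATCGGA", "TCTCCGACCATCGGA", gap=1, substitution=2))
```

Changing the relative costs changes which alignment is optimal, which is why the two alignments above can both be of interest.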
MULTIPLE ALIGNMENT

TGTACGGAATCGGA
TCTCCGACCATCGGA
ACTCAGACAATGA

TGTACG-GAATCGGA
TCTCCGACCATCGGA
ACTCAGACAAT--GA
Number of comparisons = product of the string lengths.
For 3 strings of length 1000: a billion operations!
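The cubic growth can be seen in a three-dimensional dynamic program. As a simplified stand-in for a full multiple alignment score, this sketch computes the longest common subsequence of three strings; the choice of objective and the function names are assumptions made for illustration.

```python
from math import prod


def cell_count(*strings):
    """Number of comparisons: the product of the string lengths."""
    return prod(len(s) for s in strings)


def lcs3(a, b, c):
    """Length of the longest common subsequence of three strings."""
    table = [[[0] * (len(c) + 1) for _ in range(len(b) + 1)]
             for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            for k in range(1, len(c) + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    table[i][j][k] = table[i - 1][j - 1][k - 1] + 1
                else:
                    table[i][j][k] = max(table[i - 1][j][k],
                                         table[i][j - 1][k],
                                         table[i][j][k - 1])
    return table[-1][-1][-1]


print(cell_count("A" * 1000, "C" * 1000, "G" * 1000))  # 1000 * 1000 * 1000
print(lcs3("ACGT", "AGT", "AT"))
```

Every additional string multiplies the table size by its length, so the exact approach quickly becomes impractical.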
[Figure: tree with leaves TACA, TAGA, CTGA, AGGA and ATGA; the internal nodes, first marked "?", are then labelled with the sequences CTAGA, ATGA, TAGA and CTGA]

-TACA
-TAGA
CT-GA
AG-GA
AT-GA

ATAGA
AUGCCGAUUCAACGGUCCUACUCGGACUUUACC
M P I Q R S Y S D F T
M R I S R S D S D Y T
Score (M<->M, P<->R, ...) based on mutation probabilities
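Scoring two aligned peptides position by position can be sketched as follows. The slide's actual scores come from mutation probabilities and are not given in the transcript, so `toy_score` below is a purely hypothetical placeholder (+1 for identical residues, -1 otherwise); a real substitution matrix would be passed in its place.

```python
def alignment_score(p, q, pair_score):
    """Sum the score of each aligned residue pair (M<->M, P<->R, ...)."""
    if len(p) != len(q):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(p, q))


def toy_score(a, b):
    """Hypothetical placeholder scores, NOT derived from mutation probabilities."""
    return 1 if a == b else -1


# 7 identical positions and 4 differing ones.
print(alignment_score("MPIQRSYSDFT", "MRISRSDSDYT", toy_score))
```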
ACGTTACG TTACGGAT CGGATTCA CGGCGATT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAGG ACGTTACG TTACGGAT CGGATTCA CGGCAATT AACAAGCTT CGGAATAG TTACCGGAT CGGTTAGG ACGTTACG TTACTGAT CGGATTCA CGGCGATT AACAAGCGT CGGAATCG TTACCGGAT CGGTTAGG ACGTTACG TTACGGAT CGGATTTA CGGCGATT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAGG ACATTACG TTACGGAT CGGATTCA CGGCGACT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAAG CGAATTAG TGGCGAA GGCCTTAA ACGACGTT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA AGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA CGAATTAG TGGCGAA GGCCTTAA ACGACGAT GTATTCGA ATATCGAT CGCGCGAA TGTGCATA CGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA AGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTGGA ATATCGAT CGCGCGAA TGTGCATA AACGGAC TGTCGCGA CGCGCGAT GTGTAGAG CTTGTTCT CGGATATA CGCGATAT TGTGAATA ACCGGAC TCTCGCGA CGCGCGAT GTGTAGAG CTTGATCT CGGATATA CGCGCTAT TGTGAATA ACCGGAC TGTCGCGA CGCGCGAT GTGTAGAG CTTGATCT CGGATATA CGCGATAT TGTGAATA ACCGGAC TGTCGCGA CGCGCGAT TTGTAGAG CTTGATCT CGGATATA CGCAATAT TGTGAATA ACCGGAC TGTCGCGA CGCTCGAT GTGTAGAG CTTGATCT AGGATATA CGCGATAT TGTGAATA ACGTTACG TTACGAAT CGGATTCA CGGCGATT AACCAGCTT CGGAATCG TTACCGGAT CGGTTAGG CGAATTAG TGGCGAA AGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TGTCGCGA CGCGCGAT GTGCAGAG CTTGATCT CGGATATA CGCGATAT TGTGAATA
ACGTTACG TTACGGAT CGGATTCA CGGCGATT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAGG CGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TGTCGCGA CGCGCGAT GTGTAGAG CTTGATCT CGGATATA CGCGATAT TGTGAATA ACGTTACG TTACGAAT CGGATTCA CGGCGATT AACCAGCTT CGGAATCG TTACCGGAT CGGTTAGG CGAATTAG TGGCGAA AGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TGTCGCGA CGCGCGAT GTGCAGAG CTTGATCT CGGATATA CGCGATAT TGTGAATA ACGTTACG TTACGGAT CGGATTTA CGGCGATT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAGG AGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TCTCGCGA CGCGCGAT GTGTAGAG CTTGATCT CGGATATA CGCGCTAT TGTGAATA ACATTACG TTACGGAT CGGATTCA CGGCGACT AACAAGCTT CGGAATCG TTACCGGAT CGGTTAAG CGAATTAG TGGCGAA GGCCTTAA ACGACGTT GCATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TGTCGCGA CGCGCGAT TTGTAGAG CTTGATCT CGGATATA CGCAATAT TGTGAATA ACGTTACG TTACTGAT CGGATTCA CGGCGATT AACAAGCGT CGGAATCG TTACCGGAT CGGTTAGG AGAATTAG TGGCGAA GGCCTTAA ACGACGAT GCATTGGA ATATCGAT CGCGCGAA TGTGCATA AACGGAC TGTCGCGA CGCGCGAT GTGTAGAG CTTGTTCT CGGATATA CGCGATAT TGTGAATA ACGTTACG TTACGGAT CGGATTCA CGGCAATT AACAAGCTT CGGAATAG TTACCGGAT CGGTTAGG CGAATTAG TGGCGAA GGCCTTAA ACGACGAT GTATTCGA ATATCGAT CGCGCGAA TGTGCATA ACCGGAC TGTCGCGA CGCTCGAT GTGTAGAG CTTGATCT AGGATATA CGCGATAT TGTGAATA
Fragments: ACCGT, CGTGC, TTAC, TACCGT

--ACCGT--
----CGTGC
TTAC-----
-TACCGT--
TTACCGTGC

Ordered by starting position:
TTAC-----
-TACCGT--
--ACCGT--
----CGTGC
1 + 1 + 2 = 4
[Figure: graph on the fragments TAGG, AGGT, GTCG and CGTC, with arcs weighted from 1 to 4]

CGTC------
-GTCG-----
-----TAGG-
------AGGT
CGTCGTAGGT   length 10

TAGG-----
-AGGT----
---GTCG--
-----CGTC
TAGGTCGTC   length 9
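The two layouts above differ only in the order in which the fragments are chained. A minimal sketch of the idea, assuming exact (error-free) fragments, is to try every ordering and merge consecutive fragments on their largest overlap. This exhaustive search is only feasible for a handful of fragments, and the function names are hypothetical.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a, b):
    """Append b to a, reusing the overlapping part; skip b if a already contains it."""
    if b in a:
        return a
    return a + b[overlap(a, b):]


def shortest_superstring(fragments):
    """Try every ordering of the fragments and keep the shortest merged string."""
    best = None
    for order in permutations(fragments):
        merged = order[0]
        for fragment in order[1:]:
            merged = merge(merged, fragment)
        if best is None or len(merged) < len(best):
            best = merged
    return best


print(shortest_superstring(["TAGG", "AGGT", "GTCG", "CGTC"]))
print(shortest_superstring(["ACCGT", "CGTGC", "TTAC", "TACCGT"]))
```

The number of orderings grows factorially with the number of fragments, which is why practical assemblers rely on heuristics over the overlap graph instead.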
PHYLOGENETIC TREES
[Figure: tree with leaves A B C D E F]
   a b c d e
A  0 0 1 0 1
B  1 0 0 1 1
C  1 1 0 1 0
D  0 0 1 0 1
E  0 1 0 1 1
F  1 0 1 0 0
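[Figure: tree whose 11 nodes carry the character vectors 00110 00010 00100 10010 00011 00101 00100 01011 00010 10011 10010]
Third character at the 11 nodes: 1 0 1 0 0 1 1 0 0 0 0
Fifth character at the 11 nodes: 0 0 0 0 1 1 0 1 0 1 0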
   a b c d e
A  0 0 1 0 1
B  1 0 0 1 1
C  1 1 0 1 0
D  0 0 1 0 1
E  0 1 0 1 1
F  1 0 1 0 0

Is there a perfect phylogenetic tree with A, B, C, D, E, F as nodes?
[Figure: tree shapes on the leaves A, B, C]
2 leaves, 3 leaves, 4 leaves (12 + 6 = 18), 5 leaves (60 + 30 + 30 = 120)
   a b c d e
A  0 0 1 0 1
B  1 0 0 1 1
C  1 1 0 1 0
D  0 0 1 0 1
E  0 1 0 1 1
F  1 0 1 0 0

Ordered characters: only 0 --> 1 is allowed. Easy problem.
   a b c d e
A  0 0 1 0 1
B  1 0 0 0 0
C  1 1 0 1 0
D  0 0 1 0 0
E  0 0 0 0 0
F  1 1 0 0 0

[Figure: the corresponding tree, with arcs labelled a, b, c, d, e and nodes B, E, C, F, A, D]
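For ordered characters (only 0 --> 1 allowed, all-zero root), a standard criterion is that a tree exists exactly when, for every pair of characters, the sets of species carrying them are either disjoint or nested. The slides do not state which test they use, so the sketch below is an assumption, and the function name is hypothetical.

```python
from itertools import combinations


def has_directed_perfect_phylogeny(matrix):
    """matrix: one string of '0'/'1' per species, one position per character."""
    n_chars = len(matrix[0])
    # For each character, the set of species (row indices) that carry it.
    carriers = [
        frozenset(r for r, row in enumerate(matrix) if row[c] == "1")
        for c in range(n_chars)
    ]
    for x, y in combinations(carriers, 2):
        # Two characters conflict if their carrier sets overlap without nesting.
        if x & y and not (x <= y or y <= x):
            return False
    return True


second_matrix = ["00101", "10000", "11010", "00100", "00000", "11000"]  # rows A..F
print(has_directed_perfect_phylogeny(second_matrix))
```

Checking all pairs of characters takes time polynomial in the size of the matrix, in keeping with the "easy problem" remark.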
   a b c d e f g        a b c d e f g
A  0 1 0 1 0 0 0     A  1 0 0 0 1 0 1
B  0 1 0 1 0 0 1     B  0 1 1 0 0 1 1
C  0 1 0 1 0 0 1     C  0 0 0 1 1 0 0
D  0 1 1 0 0 0 1     D  0 0 1 0 1 0 0
E  0 1 0 1 1 1 0     E  1 0 1 0 1 0 0
F  1 0 0 1 0 0 1     F  0 0 0 1 0 0 1

Unordered characters (perfect phylogeny)
[Figure: tree with leaves E, B, C, F, D, A and nodes labelled with the character vectors 1001011, 1101011, 1001010, 0101011, 0100011, 1101001]
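For unordered binary characters, a commonly used criterion is the four-gamete test: a perfect phylogeny exists exactly when no pair of characters shows all four combinations 00, 01, 10, 11 across the species. The slides do not name this test, so the sketch below is an assumption; the function name and the small example matrices are hypothetical.

```python
from itertools import combinations


def passes_four_gamete_test(matrix):
    """matrix: one string of '0'/'1' per species, one position per character."""
    n_chars = len(matrix[0])
    for x, y in combinations(range(n_chars), 2):
        # Collect the combinations of states seen for characters x and y.
        gametes = {(row[x], row[y]) for row in matrix}
        if len(gametes) == 4:
            return False  # 00, 01, 10 and 11 all occur: no perfect phylogeny
    return True


print(passes_four_gamete_test(["00", "01", "11"]))        # three combinations only
print(passes_four_gamete_test(["00", "01", "10", "11"]))  # all four combinations
```

Unlike the ordered case, the test is symmetric in 0 and 1, so relabelling the states of any character does not change the answer.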