Dynamic Programming: How to match up sequences and have the matches make sense and be quantitative
The question is
• How does a specific sequence compare to one other specific sequence?
• Is it similar?
• If so, at what level?
• Can’t compare every base to every other base: too complex
You are in the driver’s seat
• What is most important?
• Exact nucleotide match?
• One-for-one (no gaps)?
• Length?
Mathematical model
• Derive an equation for each position, based on your value system
• Methodically go through each base of each sequence and calculate the value
• At the end, find the optimal path
Starting point: three possible scenarios for each position in sequences X and Y
• At a given position, the bases (Xm and Yn) are aligned with each other (identical or mismatched)
• At a given position, the base (Xm) in X is aligned with a gap in Y (and Yn appeared earlier)
• At a given position, the base (Yn) in Y is aligned with a gap in X (and Xm appeared earlier)
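The three scenarios can be written as one recurrence. This is a sketch, not taken from the slides: F(m, n) is assumed to be the best score for aligning the first m bases of X with the first n bases of Y, s(Xm, Yn) the value for pairing two bases (covering both an identical pair and a mismatch), and g the (negative) value for a gap.

```latex
F(m, n) = \max
\begin{cases}
F(m-1,\, n-1) + s(X_m, Y_n) & \text{($X_m$ paired with $Y_n$)} \\
F(m-1,\, n) + g             & \text{($X_m$ aligned with a gap in $Y$)} \\
F(m,\, n-1) + g             & \text{($Y_n$ aligned with a gap in $X$)}
\end{cases}
\qquad
F(0, 0) = 0,\quad F(m, 0) = m\,g,\quad F(0, n) = n\,g
```

Each cell depends only on its three neighbors, which is why every base does not have to be compared with every other base.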
Assign a value to each situation
• Identical: +5
• Mismatch: -2
• Insertion or deletion: -6
(Could have others; could choose different values)
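As an illustration, the values above can be plugged into a small global-alignment routine. This is a minimal sketch of the standard table-filling and traceback approach (commonly called Needleman-Wunsch), not code from the slides; the function name `align` and the test sequences are made up for the example.

```python
def align(x, y, match=5, mismatch=-2, gap=-6):
    """Globally align x and y; return (score, aligned_x, aligned_y)."""
    rows, cols = len(x) + 1, len(y) + 1

    # score[i][j] = best value for aligning x[:i] with y[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap          # x[:i] against nothing: all gaps
    for j in range(1, cols):
        score[0][j] = j * gap          # y[:j] against nothing: all gaps

    # Methodically fill in the value for every pair of positions.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if x[i - 1] == y[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # bases paired
                score[i - 1][j] + gap,       # base in x against a gap in y
                score[i][j - 1] + gap,       # base in y against a gap in x
            )

    # Trace the optimal path back from the bottom-right corner.
    ax, ay = [], []
    i, j = len(x), len(y)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if x[i - 1] == y[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                ax.append(x[i - 1])
                ay.append(y[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            ax.append(x[i - 1])
            ay.append("-")
            i -= 1
        else:
            ax.append("-")
            ay.append(y[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(ax)), "".join(reversed(ay))


if __name__ == "__main__":
    total, top, bottom = align("ACGT", "ACT")
    print(total)   # 9  (three identical pairs at +5, one gap at -6)
    print(top)     # ACGT
    print(bottom)  # AC-T
```

Changing the three keyword values changes which alignment wins, which is the sense in which the choice of values is up to you.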
Alpha-glucosidase in plants:
• Enzymes sharing WIDMNE signature sequence
• alpha-glucosidase (all groups)
• alpha-xylosidase (plant, bacteria, archaea)
• Sucrase/Isomaltase (animal)
• Related sequences with broad substrate specificity
[Figure: phylogenetic tree of alpha-glucosidase family sequences, with groups labeled Plantae, Fungi, Protista, Archaea, Animalia, and Bacteria; scale bar 0.1]
Plant α-amylases are located in different cellular compartments
• Plastids (chloroplasts, amyloplasts)
• Cytosol
• Apoplast (cell wall space)
• What is the function of the non-plastid forms?
[Figure: phylogenetic tree of plant α-amylases. Clade I: secreted, 421-445 aa. Clade II: cytosolic, 407-414 aa. Clade III: plastidic, 877-906 aa. Sequences shown: Arabidopsis AMY1, AMY2, and AMY3; rice 2A, 3B, 3E, XP_472377, and NP_916641; barley A and B; morning glory; dodder; maize; adzuki bean; apple 8, 9, and 10; cassava; kiwifruit; potato; plantain]
Homologous sequences (homologues)
• Share a common ancestor
Paralogs
• Homologues derived by gene duplication
• Functions may vary
• Look for differences
Orthologs
• Homologues derived by speciation
• Common function
• Look for similarities
Use alignments to look for:
• Structures important for common functions (orthologs)
• Structures important for unique functions (paralogs)
• Unusual structures
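One simple way to start looking is to label each column of a finished alignment. The sketch below is an assumption about how that might be done, not a method from the slides: the function name `classify_columns`, the three labels, and the toy sequences are all hypothetical. Columns that are identical in every sequence are candidates for shared function, variable columns for differences, and gapped columns for insertions or deletions.

```python
def classify_columns(alignment, gap_char="-"):
    """Label each column of an alignment (equal-length strings).

    Returns a list with one label per column:
      "conserved" - every sequence has the same residue
      "gap"       - at least one sequence has a gap
      "variable"  - no gaps, but the residues differ
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    labels = []
    for column in zip(*alignment):       # one tuple of residues per column
        if gap_char in column:
            labels.append("gap")
        elif len(set(column)) == 1:
            labels.append("conserved")
        else:
            labels.append("variable")
    return labels


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration only.
    toy = ["ACG-T", "ACGAT", "ACCAT"]
    for position, label in enumerate(classify_columns(toy), start=1):
        print(position, label)
```

Real analyses usually score conservation by degree rather than all-or-nothing, so this only marks where to look more closely.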
AMY1 has a three-amino-acid deletion
[Figure: structure with N and C termini marked; AtAMY1 deletion labeled]
Barley α-amylase
• Red: NHDTGST
• Blue: VAEIW
• Active site residues
Variation in the active site loop among plant and bacterial α-amylases (AtAMY1 labeled)