Construction of Substitution matrices • BLOSUM • BLOCKS SUBSTITUTION MATRIX • PAM • POINT ACCEPTED MUTATIONS
Substitution matrices • A substitution matrix contains values proportional to the probability that amino acid A mutates into amino acid B, for all pairs of amino acids, over a period of evolution • Substitution matrices are constructed from a large and diverse sample of sequence alignments
How to construct substitution matrices • Multiple alignment of well-studied gene sequences from different species • use orthologs: functionally similar • observed substitutions tend to preserve function • minimal gaps
How to construct substitution matrices? • Tabulate substitutions • A to A: 9867 times • A to R: 2 times • A to N: 9 times • etc.
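The tabulation step can be sketched as counting residue pairs column by column in a multiple alignment. This is a minimal illustration, not the procedure used to build any published matrix: the function name `tabulate_substitutions` and the toy alignment are hypothetical, and pairs are counted as unordered (A/R is the same as R/A), which is a simplifying assumption.

```python
from collections import Counter
from itertools import combinations

def tabulate_substitutions(alignment):
    """Count unordered residue pairs seen in each alignment column."""
    counts = Counter()
    length = len(alignment[0])
    for col in range(length):
        column = [seq[col] for seq in alignment]
        for a, b in combinations(column, 2):
            if a == "-" or b == "-":
                continue  # skip any pair that involves a gap
            counts[tuple(sorted((a, b)))] += 1
    return counts

# Toy alignment (hypothetical data): three aligned sequences of equal length.
toy_alignment = ["ARND", "ARNE", "AKND"]
pair_counts = tabulate_substitutions(toy_alignment)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

Identical pairs such as ("A", "A") count conserved positions, and mixed pairs such as ("K", "R") count observed substitutions.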
How to construct substitution matrices? • Substitution matrix score = log (observed mutation rate in alignment / expected random mutation rate)
The random mutation rate • compute the overall occurrence of an amino acid in a protein database http://www.ebi.ac.uk/swissprot/sptr_stats/index.html
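Computing the overall occurrence of each amino acid could look like the sketch below. The two-sequence "database" is made-up data for illustration, and `background_frequencies` is a hypothetical helper name; a real calculation would run over a full protein database.

```python
from collections import Counter

def background_frequencies(sequences):
    """Return the relative frequency of each residue across all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)  # count every residue in the sequence
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Toy database (hypothetical data).
toy_database = ["ARNDA", "AAKRW"]
freqs = background_frequencies(toy_database)
for aa in sorted(freqs):
    print(aa, round(freqs[aa], 2))
```

These frequencies give the chance of seeing a residue at random, which is the basis for the expected random mutation rate.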
The random mutation rate • Example: • Expected random mutation rate is 1 in 10000 and observed mutation rate of W to R is 1 in 10 • Score = log10 (0.1/0.0001) = log10 (1000) = +3
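The worked example can be reproduced in a few lines. This sketch assumes a base-10 logarithm, which is what the arithmetic on the slide implies; `log_odds_score` is a hypothetical helper name.

```python
import math

def log_odds_score(observed_rate, expected_rate):
    """Log (base 10) of the ratio of observed to expected mutation rate."""
    return math.log10(observed_rate / expected_rate)

# Values from the example: observed 1 in 10, expected 1 in 10000.
score = log_odds_score(0.1, 0.0001)
print(round(score, 6))
```

A positive score means the substitution is seen more often than chance predicts; a negative score means it is seen less often.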
PAM matrices • Point Accepted Mutations • [1 point mutation per 100 amino acids] • does not take into account different evolutionary rates between conserved and non-conserved regions
PAM1 is 1% average change in amino acids • PAM 250:??
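In Dayhoff's model, a PAM-n matrix is obtained by multiplying the PAM1 mutation probability matrix by itself n times. The sketch below shows the mechanics on a made-up two-letter alphabet; the numbers are not real PAM1 values, and `mat_mul` and `mat_power` are hypothetical helper names.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(m, exponent):
    """Raise a square matrix to a positive integer power by repeated multiplication."""
    result = m
    for _ in range(exponent - 1):
        result = mat_mul(result, m)
    return result

# Toy "PAM1" (hypothetical values): each residue changes with probability 0.01.
toy_pam1 = [[0.99, 0.01],
            [0.01, 0.99]]
toy_pam250 = mat_power(toy_pam1, 250)
for row in toy_pam250:
    print([round(x, 4) for x in row])
```

Each row still sums to 1, but after 250 steps the chance of keeping the original residue is far below the 99% of a single step, because multiple and back mutations accumulate.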
Why use substitution matrices? • Database searches
Database searching • Query Sequence; Database sequences
Database searching: Filtering • Dynamic programming (DP) is computationally expensive • Apply DP only to sequence pairs that are likely to be similar • find short words shared by query and database sequences • DNA 7-28 bases (BLAST?) • PROTEIN 3 amino acids (BLAST?)
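The filtering idea can be sketched as follows: keep only database sequences that share at least one short word with the query, then pass those on to dynamic programming. This is a simplified illustration with exact word matches only; it is not how BLAST itself is implemented, and the function names and sequences are hypothetical.

```python
def words(seq, k):
    """Return the set of all overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shares_word(query, subject, k=3):
    """True if query and subject have at least one word of length k in common."""
    return bool(words(query, k) & words(subject, k))

def filter_database(query, database, k=3):
    """Keep only the database sequences worth aligning with DP."""
    return [s for s in database if shares_word(query, s, k)]

# Toy protein data (hypothetical); word size 3 as on the slide.
query = "MKTAYIAK"
database = ["GGTAYGG", "PPPPPPP", "AKMKTLL"]
candidates = filter_database(query, database)
print(candidates)
```

Only the sequences in `candidates` would then be aligned with the expensive DP step.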
BLAST • Basic Local Alignment Search Tool • Heuristic method?
BLAST output parameter: E value
E value • Number of alignments one can expect to see by chance • Number of alignments having the same or greater score • Dependent on size of database and length of query sequence
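The dependence on database size and query length can be illustrated with the Karlin-Altschul form E = K · m · n · e^(−λS), where m is the query length, n the database length and S the alignment score. In the sketch below the values of K and λ are placeholders chosen for illustration, not parameters of any real scoring system, and `expected_hits` is a hypothetical helper name.

```python
import math

def expected_hits(score, query_len, db_len, k_param=0.1, lam=0.3):
    """E = K * m * n * exp(-lambda * S); k_param and lam are placeholder values."""
    return k_param * query_len * db_len * math.exp(-lam * score)

# Same score and query, two database sizes (hypothetical numbers).
e_small_db = expected_hits(score=50, query_len=100, db_len=1_000_000)
e_large_db = expected_hits(score=50, query_len=100, db_len=2_000_000)
e_higher_score = expected_hits(score=60, query_len=100, db_len=1_000_000)
print(e_small_db, e_large_db, e_higher_score)
```

Doubling the database size doubles the E value, while a higher score lowers it exponentially.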