Protein Structure Alignment Using a Genetic Algorithm. By Szustakowski et al., Proteins: Structure, Function, and Genetics 38:428-440, 2000. Presented by Nannan Li
Introduction • To establish evolutionary relationships between proteins
Biological problem • For many protein pairs, distinct alignments could be generated that are indistinguishable in terms of number of equivalent residues and root mean square error of superposition • Protein structures are more conserved in the core than in exposed loops and turns
Motivation • To develop a structure alignment algorithm that generates high-quality, biologically meaningful alignments by first aligning the proteins' cores (secondary structure elements, SSEs)
Method • Target function: should result in the correct pairing of SSEs. An “elastic similarity score” is adopted to simultaneously maximize the number of equivalent residue pairs and minimize the distance between these pairs • Each protein is treated as a collection of SSEs to avoid an exhaustive search for regions of similarity shared by two distance matrices
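The trade-off in the target function can be illustrated with a small sketch. The slides do not give the formula, so the version below assumes the elastic score in the form commonly quoted for distance-matrix alignment: each pair of matched residue pairs contributes a threshold `theta` minus the relative deviation of their intra-protein distances, damped by a Gaussian envelope. The names `elastic_score`, `theta` and `alpha`, and the default values 0.2 and 20 Å, are assumptions for illustration and are not taken from the presentation.

```python
import math

def elastic_score(coords_a, coords_b, pairs, theta=0.2, alpha=20.0):
    """Sum an elastic similarity term over all ordered pairs of matched residues.

    coords_a, coords_b: lists of (x, y, z) coordinates, e.g. C-alpha atoms.
    pairs: list of (i, j) meaning residue i of A is matched to residue j of B.
    theta, alpha: assumed threshold and envelope width (not from the slides).
    """
    total = 0.0
    for ia, ib in pairs:
        for ja, jb in pairs:
            if (ia, ib) == (ja, jb):
                total += theta  # a matched pair compared with itself
                continue
            d_a = math.dist(coords_a[ia], coords_a[ja])
            d_b = math.dist(coords_b[ib], coords_b[jb])
            d_avg = (d_a + d_b) / 2.0
            if d_avg == 0.0:
                total += theta  # coincident points: no deviation to measure
                continue
            deviation = abs(d_a - d_b) / d_avg
            envelope = math.exp(-(d_avg / alpha) ** 2)
            total += (theta - deviation) * envelope
    return total

# Three residues on a line, compared with themselves and with a stretched copy.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
matches = [(0, 0), (1, 1), (2, 2)]
same_score = elastic_score(chain, chain, matches)
stretched_score = elastic_score(chain, stretched, matches)
```

Adding a well-superposed residue pair raises the sum, while a pair whose distances disagree by more than `theta` lowers it, which is how a single score can reward both more equivalences and tighter geometry.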
Genetic Algorithm • A genetic algorithm is used to search for an optimal solution to the target function • The algorithm starts from a population of randomly generated alignments and proceeds in generations. Multiple SSE alignments are stochastically selected from the current population and modified (mutated or recombined) to form a new population, which becomes the current population in the next iteration of the algorithm
Genetic Algorithm Steps • Generate an initial population of possible SSE alignments • Alter each alignment using the “mutate”, “hop”, and “swap” operators • Carry out “recombination” between randomly assigned pairs of alignments using the “crossover” operator • Accept or reject the alterations made to each alignment • Exit if certain conditions are met. Otherwise go to step 2
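The steps above can be sketched as a generic loop. This is not KENOBI's implementation: the acceptance rule (keep an altered individual only if its score does not drop) and the exit condition (a fixed number of generations) are assumptions, since the slides do not specify either. The function names and the bit-string toy problem are hypothetical stand-ins for SSE alignments and the elastic score.

```python
import random

def run_ga(population, score, alter, recombine, generations=50, seed=0):
    """Generic generational loop following the five steps on the slide."""
    rng = random.Random(seed)
    population = list(population)  # step 1: the caller supplies the initial population
    for _ in range(generations):   # step 5 (assumed): stop after a fixed number of generations
        # Step 2: alter every individual.
        candidates = [alter(ind, rng) for ind in population]
        # Step 3: recombine randomly assigned pairs.
        order = list(range(len(candidates)))
        rng.shuffle(order)
        for i, j in zip(order[::2], order[1::2]):
            candidates[i], candidates[j] = recombine(candidates[i], candidates[j], rng)
        # Step 4 (assumed rule): accept a change only if the score does not decrease.
        population = [new if score(new) >= score(old) else old
                      for old, new in zip(population, candidates)]
    return max(population, key=score)

# Toy stand-ins: individuals are bit lists and the score is the number of ones.
def flip_one(bits, rng):
    k = rng.randrange(len(bits))
    return bits[:k] + [1 - bits[k]] + bits[k + 1:]

def one_point(x, y, rng):
    cut = rng.randrange(1, len(x))
    return x[:cut] + y[cut:], y[:cut] + x[cut:]

start = [[0] * 8 for _ in range(6)]
best = run_ga(start, sum, flip_one, one_point)
```

With this acceptance rule the score in each population slot never decreases, so the best individual returned is at least as good as the best initial one.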
Initial Population • Since the SSE alignment search space is very large, we biased the initial population toward SSE pair doublets • Similarity scores are then calculated for all SSE pair doublets based on the target function (population size is set to 100)
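One way to read this slide is sketched below, with several assumptions labeled: a "doublet" is taken to mean two SSE pairs that use distinct SSEs from each protein, only SSEs of the same type are paired, and the population is drawn with probability proportional to each doublet's score. None of these details are stated in the slides, and `doublets`, `initial_population` and `toy_score` are hypothetical names.

```python
import itertools
import random

def doublets(sse_a, sse_b):
    """Enumerate doublets for two proteins given as strings of SSE types.

    Example input: "HEH" (H = helix, E = strand). Pairing only same-type
    SSEs is an assumption made for this sketch.
    """
    pairs = [(i, j) for i, ta in enumerate(sse_a)
             for j, tb in enumerate(sse_b) if ta == tb]
    return [(p, q) for p, q in itertools.combinations(pairs, 2)
            if p[0] != q[0] and p[1] != q[1]]

def initial_population(candidates, score, size=100, seed=0):
    """Sample `size` doublets, favoring those with higher scores."""
    rng = random.Random(seed)
    # Clamp negative scores and add a small floor so every weight is positive.
    weights = [max(score(d), 0.0) + 1e-9 for d in candidates]
    return rng.choices(candidates, weights=weights, k=size)

# Toy score standing in for the target function: favor nearby SSE indices.
def toy_score(doublet):
    return sum(1.0 / (1 + abs(a - b)) for a, b in doublet)

all_doublets = doublets("HEH", "HHE")
population = initial_population(all_doublets, toy_score)
```

The population size of 100 matches the slide; the weighting scheme is only one plausible way to bias the sample.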
Genetic algorithm operators • “mutate”: with a given mutation probability, the individual SSE pairs of an alignment are mutated at the residue-pair level • “hop”: with a given hop probability, two SSE pairs in one selected alignment trade places • “swap”: with equal probability, an alignment is swapped with its partner
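A rough sketch of two of these operators on a toy representation, where an alignment is a list of `(sse_in_A, sse_in_B)` index pairs. The details are assumptions: here "mutate" reassigns the B-side SSE of a pair to an unused one rather than adjusting residue pairs, and "hop" is read as two SSE pairs exchanging their B-side partners. "swap" is left out because the slide does not give enough detail to sketch it.

```python
import random

def mutate(alignment, n_sse_b, p_mutate, rng):
    """With probability p_mutate per SSE pair, move it to an unused SSE of B."""
    result = list(alignment)
    for k, (a, _) in enumerate(result):
        if rng.random() < p_mutate:
            used = {b for _, b in result}
            free = [j for j in range(n_sse_b) if j not in used]
            if free:  # keep the mapping one-to-one
                result[k] = (a, rng.choice(free))
    return result

def hop(alignment, p_hop, rng):
    """With probability p_hop, two SSE pairs exchange their B-side partners."""
    result = list(alignment)
    if len(result) >= 2 and rng.random() < p_hop:
        i, j = rng.sample(range(len(result)), 2)
        (a1, b1), (a2, b2) = result[i], result[j]
        result[i], result[j] = (a1, b2), (a2, b1)
    return result

rng = random.Random(1)
alignment = [(0, 0), (1, 2), (2, 3)]
mutated = mutate(alignment, n_sse_b=6, p_mutate=1.0, rng=rng)
hopped = hop(alignment, p_hop=1.0, rng=rng)
```

Both functions return a new list, so the caller can compare scores before and after and decide whether to accept the change.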
Genetic algorithm operators (cont'd) • “crossover”: each alignment is randomly assigned a crossover partner from the rest of the population
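Partner assignment as described here can be sketched in a few lines. The recombination itself is not specified on the slide, so the `crossover` function below assumes a simple one-point scheme on SSE pairs ordered by their A-side index, and drops partner pairs that would reuse a B-side SSE. Both function names are hypothetical.

```python
import random

def assign_partners(n, rng):
    """For each of n alignments (n >= 2), pick a partner index from the other n - 1."""
    return [rng.choice([j for j in range(n) if j != i]) for i in range(n)]

def crossover(parent, partner, cut):
    """Assumed one-point crossover: head from parent, tail from partner.

    Pairs with A-side index below `cut` come from `parent`; pairs at or above
    `cut` come from `partner`, unless their B-side SSE is already taken.
    """
    child = [(a, b) for a, b in parent if a < cut]
    used_b = {b for _, b in child}
    for a, b in partner:
        if a >= cut and b not in used_b:
            child.append((a, b))
            used_b.add(b)
    return child

rng = random.Random(2)
partners = assign_partners(5, rng)
child = crossover([(0, 0), (1, 1)], [(0, 1), (2, 0), (3, 3)], cut=2)
```

In the example, the partner's pair `(2, 0)` is dropped because SSE 0 of protein B is already matched in the head, so the child keeps a one-to-one mapping.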
Availability • C++ program called KENOBI http://zlab.bu.edu/k2/documents.shtml