Segment Alignment (SEA) • Yuzhen Ye • Adam Godzik • The Burnham Institute
Outline • A new look at local structure prediction • Network matching problem • Practical issues • Applications
Description of local structure: one or many answers? [Figure: the sequence GSDKKGNGVALMTTLFADN shown with several alternative, partly overlapping predicted segments (strings of H, E and L) labeled "A prediction", next to the segments of the "Real structure".]
Motivation • A natural description of local structures: keep the segment information of local structures • Keep the uncertainties in local structure predictions: they come both from the drawbacks of prediction programs and from the intrinsic uncertainty of local structures in the absence of global interactions • Incorporating protein local structure into protein sequence comparison may help to detect distant homologies and to improve their alignments (for homology modeling)
Proteins are described as a network of PLSSs (predicted local structure segments)
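A minimal Python sketch of what such a network could look like. The `PLSS` class, the rule that one segment may follow another when it starts after the other ends, and the toy segments are all assumptions made for illustration; they are not taken from the SEA implementation.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class PLSS:
    start: int  # first residue covered (0-based, inclusive)
    end: int    # last residue covered (inclusive)
    kind: str   # 'H' helix, 'E' strand, 'L' loop


def build_network(segments):
    """Map each segment index to the indices of segments that may follow it.

    Assumed rule (hypothetical): v may follow u if v starts after u ends,
    so overlapping, contradictory segments never sit on the same path.
    """
    return {
        u: [v for v, t in enumerate(segments) if t.start > s.end]
        for u, s in enumerate(segments)
    }


def count_paths(successors):
    """Count source-to-sink paths, assuming the source links to every
    segment and every segment links to the sink."""

    @lru_cache(maxsize=None)
    def paths_from(u):
        # Either stop here (go to the sink) or continue through a successor.
        return 1 + sum(paths_from(v) for v in successors[u])

    return sum(paths_from(u) for u in successors)


# Toy, made-up predictions for a 19-residue sequence.
segments = [
    PLSS(0, 7, "H"),
    PLSS(2, 9, "H"),
    PLSS(8, 12, "L"),
    PLSS(10, 15, "E"),
    PLSS(13, 18, "H"),
]
network = build_network(segments)
n_paths = count_paths(network)
```

Under these assumptions the five toy segments yield 12 alternative paths, each one a self-consistent description of the protein's local structure.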
The protein comparison problem is equivalent to a network matching problem. Given two networks of PLSSs, find one optimal path from the source to the sink in each network, such that the corresponding PLSSs on the two paths are most similar to each other. This does not follow the typical position-by-position alignment mode.
Solving the network matching problem: dynamic programming. [Figure: two PLSS networks with current segments i and j; the score V(i,j) is shown together with the scores of preceding segment pairs, V(i1,j1), V(i1,j2), V(i3,j1) and V(i3,j2).]
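The recurrence can be sketched in Python under simplifying assumptions: V(i,j) is the similarity of segments i and j plus the best V over all pairs of their predecessors. The `similarity` scoring, the predecessor rule, the lack of gap penalties and the toy segments are hypothetical choices for illustration, not the scoring used by SEA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PLSS:
    start: int  # first residue covered (0-based, inclusive)
    end: int    # last residue covered (inclusive)
    kind: str   # 'H' helix, 'E' strand, 'L' loop


def similarity(a, b):
    """Hypothetical score: shared length if the types agree, else a penalty."""
    if a.kind != b.kind:
        return -2.0
    return float(min(a.end - a.start, b.end - b.start) + 1)


def predecessors(segs):
    """Assumed rule: p may precede s if p ends before s starts."""
    return [[k for k, p in enumerate(segs) if p.end < s.start] for s in segs]


def match_networks(segs_a, segs_b):
    """Return (best score, matched (i, j) pairs along the two chosen paths)."""
    pred_a, pred_b = predecessors(segs_a), predecessors(segs_b)
    # Sorting by start guarantees predecessors are scored before successors.
    order_a = sorted(range(len(segs_a)), key=lambda i: segs_a[i].start)
    order_b = sorted(range(len(segs_b)), key=lambda j: segs_b[j].start)
    V, back = {}, {}
    for i in order_a:
        for j in order_b:
            best, arg = 0.0, None  # 0.0 means "start a new pair of paths here"
            for pi in pred_a[i]:
                for pj in pred_b[j]:
                    if V[(pi, pj)] > best:
                        best, arg = V[(pi, pj)], (pi, pj)
            V[(i, j)] = similarity(segs_a[i], segs_b[j]) + best
            back[(i, j)] = arg
    if not V:
        return 0.0, []
    cur = max(V, key=V.get)
    score, pairs = V[cur], []
    while cur is not None:  # trace back the matched segment pairs
        pairs.append(cur)
        cur = back[cur]
    return score, pairs[::-1]


# Toy, made-up networks for two proteins.
net_a = [PLSS(0, 7, "H"), PLSS(2, 9, "H"), PLSS(8, 12, "L"),
         PLSS(10, 15, "E"), PLSS(13, 18, "H")]
net_b = [PLSS(0, 5, "H"), PLSS(6, 8, "L"), PLSS(9, 14, "H"), PLSS(7, 12, "E")]
score, pairs = match_networks(net_a, net_b)
```

For these toy networks the best match pairs helix, loop and helix segments (`pairs == [(0, 0), (2, 1), (4, 2)]`, score 15.0). The indices on each side of the pairs form the chosen path in the corresponding network, so the alignment and the choice of PLSSs come out of the same computation.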
Example: 1e68A vs. 1nkl. Each protein is represented as a collection of potentially overlapping and contradictory PLSSs (a network). SEA finds an optimal alignment between these two proteins. Simultaneously, SEA identifies the optimal subset of PLSSs (a path in the network) describing each protein. 1e68A: Bacteriocin As-48; 1nkl: Nk-lysin
General performance of SEA incorporating different local structure diversities
Keeping local structure diversity helps improve alignment quality. Example: alignment between λ-repressor from E. coli (1lliA) and 434 repressor (1r69)
Local structure information is crucial for improving alignments, especially in the more divergent regions. [Figure labels: variable region, stable region.] 1esfA: staphylococcal enterotoxin; 2tssA: toxic shock syndrome toxin-1
Practical issue: local structure prediction • Searching the I-sites database (web server or standalone program) • Our solution: FragLib • Using the sensitive profile-profile alignment program FFAS to predict local structures
Applications • Distant homology detection • Local structure prediction • Improving alignments for protein modeling
Reference: A segment alignment approach to protein comparison (Bioinformatics, April issue) • Web server: http://ffas.ljcrf.edu/sea
Related work • Spliced sequence alignment • Gelfand et al., 1996, PNAS; Novichkov et al., 2001 • Assembling genes from alternative exons • Jumping alignment • Spang R, Rehmsmeier M, Stoye J. JCB, 2002 • Computes a local alignment of a single sequence and a multiple alignment • The sequence is at each position aligned to one sequence of the multiple alignment (reference sequence) instead of a profile • Partial order alignment • Lee C, Grasso C, Sharlow MF, Bioinformatics, 2002 • Multiple alignment
Acknowledgements • Dariusz Plewczyński • Iddo Friedberg • Łukasz Jaroszewski • Weizhong Li • This project is supported by SPAM grant GM63208