A World of New Discovery. STRUCTFAST: Novel Dynamic Programming and Profile Scoring. CASP6 & CAFASP4 Servers EXPM, BNMX & SFST. Aleksandar Poleksic, Joseph F. Danzer, & Derek A. Debe. Pictured: Derek A. Debe, Aleksandar Poleksic, Joseph F. Danzer, respectively.
Outline: What's Different About Our Methods?
1. Bridge-Bulge Dynamic Programming
2. Analytical Profile-Profile Scoring
3. Convergent Island Statistics p-Values
STRUCTFAST: Structure Realization Utilizing Cogent Tips From Aligned Structural Templates. Basic Principle: Gaps known to exist should not be strongly penalized. Structure-structure alignments give us clues... [Figure: Structure Alignment of Homologous Crystal Structures, with two regions labeled "Known Gap"]
Bridge-Bulge Dynamic Programming
[Figure: score matrices for aligning the toy sequences BIGTOWNSOWN and BIGBROWNTOWNOWN, one for standard dynamic programming and one for STRUCTFAST]
Dynamic Programming:
BIG-TOWNSOWN---
BIGBROWNTOWNOWN
STRUCTFAST:
BIG-----TOWNSOWN
BIGBROWNTOWN-OWN
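The basic principle above (gaps known to exist should not be strongly penalized) can be illustrated with a minimal sketch. This is not the STRUCTFAST implementation, which the slides do not spell out; it is an ordinary global alignment with linear, position-specific gap costs. The function name `align`, the scoring values, and the choice of which positions count as "known" gaps are all assumptions made for the toy sequences on the slide.

```python
def align(query, template, bridge_cost, bulge_cost, match=2, mismatch=-1):
    """Global alignment with position-specific linear gap costs (illustrative only).

    bridge_cost[j]: penalty for skipping template[j] (gap in the query).
    bulge_cost[j]:  penalty for inserting a query residue after j template
                    residues have been consumed (gap in the template).
    """
    n, m = len(query), len(template)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - bridge_cost[j - 1]
        move[0][j] = "bridge"
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - bulge_cost[0]
        move[i][0] = "bulge"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if query[i - 1] == template[j - 1] else mismatch
            # max() on tuples: ties in score are broken by move name (arbitrary).
            score[i][j], move[i][j] = max(
                (score[i - 1][j - 1] + pair, "pair"),
                (score[i][j - 1] - bridge_cost[j - 1], "bridge"),
                (score[i - 1][j] - bulge_cost[j], "bulge"),
            )
    # Trace back from the bottom-right corner to recover the alignment.
    i, j, q_aln, t_aln = n, m, [], []
    while i > 0 or j > 0:
        if move[i][j] == "pair":
            q_aln.append(query[i - 1]); t_aln.append(template[j - 1])
            i, j = i - 1, j - 1
        elif move[i][j] == "bridge":
            q_aln.append("-"); t_aln.append(template[j - 1])
            j -= 1
        else:
            q_aln.append(query[i - 1]); t_aln.append("-")
            i -= 1
    return score[n][m], "".join(reversed(q_aln)), "".join(reversed(t_aln))


query, template = "BIGTOWNSOWN", "BIGBROWNTOWNOWN"
m = len(template)

# Uniform gap penalty everywhere (assumed value: 2 per gapped position).
uniform = align(query, template, [2] * m, [2] * (m + 1))

# Assumed "known" gaps: skipping BROWN (template positions 3..7) and
# inserting one query residue after TOWN (12 template residues consumed).
bridge = [2] * m
for j in range(3, 8):
    bridge[j] = 0
bulge = [2] * (m + 1)
bulge[12] = 0
known = align(query, template, bridge, bulge)
print(known)  # (20, 'BIG-----TOWNSOWN', 'BIGBROWNTOWN-OWN')
```

With the known gaps made free, the optimal alignment is the one labeled STRUCTFAST on the slide. Under the uniform penalty, that alignment only ties with alternatives, so the result depends on tie-breaking.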
CASP Target T0247. Template: 1pj5A. “Bridge” taken from 1wooA.
Query:   2 AQ---QTPLYEQHTLCGARMVDFHGWMMPLHYGS-------------------------- 32
Sbjct: 427 PrnlrVSPFHARHKELGAFFLEAGGWERPYWFEAnaallkempaewlppardawsgmfss 486
Query:  33 --QIDEHHAVRTDAGMFDVSHMTIVDLRGSRTREFLRYLLANDVAKLtkSGKALYSGMLN...
Sbjct: 487 piAAAEAWKTRTAVAMYDMTPLKRLEVSGPGALKLLQELTTADLAKK--PGAVTYTLLLD...
Query:  91 ASGGVIDDLIVYYFTEDFFRLVVNSATREKDLSWITQHAEPF-----GIEI-TVRDDLSM 144
Sbjct: 545 HAGGVRSDITVARLSEDTFQLGANGNIDTAYFERAARHQTQSgsatdWVQVrDTTGGTCC 604
Query: 145 IAVQGPNAQAKAATLFN-DAQRQAV-EGMKPFFGVQAGDLFIATTGYTGEAGYEIALPNE 202
Sbjct: 605 IGLWGPLARDLVSKVSDdDFTNDGLkYFRAKNVVIGGIPVTAMRLSYVGELGWELYTSAD 664
Query: 203 KAADFWRALVEAG----VKPCGLGARDTLRLEAGMNLYGQEMDETISPLAANMGWTIAWE 258
Sbjct: 665 NGQRLWDALWQAGqpfgVIAAGRAAFSSLRLEKGYRSWGTDMTTEHDPFEAGLGFAVKMA 724
Query: 259 paDRDFIGREALEVQREHG-TEKLVGLVMTE-KGVLRNELPVRFtdaqGNQHEGIITSGT 316
Sbjct: 725 --KESFIGKGALEGRTEEAsARRLRCLTIDDgRSIVLGKEPVFY----KEQAVGYVTSAA 778
Query: 317 FSPTLGYSIALARVPEGI--GETAIVQIRNREMPVKVTKPVFV-RNGKAV 363
Sbjct: 779 YGYTVAKPIAYSYLPGTVsvGDSVDIEYFGRRITATVTEDPLYdPKMTRL 828
[Figure: structures 1wooA and 1pj5A, with the 1pj5A “Bulge” labeled]
CASP Target T0263. Template: 1iujB. “Bridge” taken from 1a7gA.
Query:  1 MaTLLQLHFAFN-GpFGDAMAEQLKPLAESINQEPGFLWKVWTESEknhEAGGIYL---- 55
Sbjct:  1 M-FVTMNRIPVRpE-YAEQFEEAFRQRARLVDRMPGFIRNLVLRPK---NPGDPYVvmtl 55
Query: 56 FTDEKSALAYLE--KHTARLKNLG-------VEEVVAKVFDV 88
Sbjct: 56 WESEEAFRAWTEspAFKEGHARSGtlpkeafLGPNRLEAFEV 97
[Figure: structures 1a7gA and 1iujB, with the 1iujB “Bulge” labeled]
Outline: What's Different About Our Methods?
1. Bridge-Bulge Dynamic Programming
2. Analytical Profile-Profile Scoring
3. Convergent Island Statistics p-Values
STRUCTFAST Scoring: Profile-Profile Case. Score = ? Seq1: VLKRAKAKGDRLILEADGEKNKLNVIF Seq2: FRGVTLSAKEFKKIVDTFSQLSDSVNFQVDKEGIKL [Figure labels: L, DB]
STRUCTFAST Scoring: Like Throwing Dice. [Figure: two dice, L and B, whose faces show residue letters] Which die is more likely to generate D? Score = log(P_L(D) / P_B(D)). We solve this analytically. Details to be published...
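The dice analogy can be made concrete with a small sketch. It covers only the single-residue log-odds formula from the slide, Score = log(P_L(D) / P_B(D)); the analytical profile-profile solution is not described in the slides and is not attempted here. The face lists `die_l` and `die_b` and the function names are hypothetical and are not read off the figure.

```python
import math
from collections import Counter


def face_probability(die, residue):
    """Probability that a fair throw of `die` (a list of faces) shows `residue`."""
    return Counter(die)[residue] / len(die)


def log_odds(die_l, die_b, residue):
    """log(P_L(residue) / P_B(residue)) for two dice L and B."""
    p_l = face_probability(die_l, residue)
    p_b = face_probability(die_b, residue)
    if p_b == 0:
        raise ValueError("residue never appears on die B; log-odds undefined")
    if p_l == 0:
        return -math.inf  # die L can never generate this residue
    return math.log(p_l / p_b)


# Hypothetical six-sided dice: D appears once on L and twice on B.
die_l = list("IDKTAW")
die_b = list("KDNGSD")
score = log_odds(die_l, die_b, "D")
print(score)  # negative: die B is the more likely source of D
```

A positive score favors die L, a negative score favors die B, and zero means the two dice are equally likely to have generated the residue.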
How do we evaluate the statistical significance of our STRUCTFAST alignments?
The Challenge: Secondary Structure Profiles, Parameters (Gaps, etc.), Bridges & Bulges. Goal: Rigorous Statistical Significance of the Alignment (p-Value).
Island Statistics (Altschul et al.): not fast enough for a full database search.
Convergence Trick. [Figure: two distribution plots, each marking the true K, with regions labeled "Lower p-values" and "Higher p-values"] We shuffle until there is evidence that the sequences are not related.
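The idea of shuffling only until there is evidence that the sequences are not related can be sketched as a sequential Monte Carlo test with early stopping. This is a swapped-in, generic technique in the style of Besag and Clifford's sequential p-values, not the Convergent Island Statistics method itself, whose details the slides do not give. The function names, the stopping threshold `h`, the cap `max_draws`, and the identity-count score are all assumptions.

```python
import random


def sequential_p_value(observed, sample_null_score, h=10, max_draws=1000):
    """Draw null scores until h of them reach `observed`, or max_draws is hit.

    Returns (p_estimate, draws_used). Stopping early means the shuffled
    sequences score as well as the real pair, i.e. evidence of no relation.
    """
    exceed = 0
    for n in range(1, max_draws + 1):
        if sample_null_score() >= observed:
            exceed += 1
            if exceed == h:
                return h / n, n  # early stop: large p-value, few shuffles
    return (exceed + 1) / (max_draws + 1), max_draws


def identity_score(a, b):
    """Toy alignment score: number of identical positions (assumed scorer)."""
    return sum(x == y for x, y in zip(a, b))


def shuffled_scorer(query, template, rng):
    """Build a null sampler that shuffles the query and rescores it."""
    residues = list(query)

    def draw():
        rng.shuffle(residues)
        return identity_score(residues, template)

    return draw


rng = random.Random(0)
query, template = "BIGTOWNSOWN", "BIGBROWNTOWN"
observed = identity_score(query, template)
p, draws = sequential_p_value(observed, shuffled_scorer(query, template, rng))
print(observed, p, draws)
```

Unrelated pairs stop after a handful of shuffles, while strongly related pairs run to the cap. That is the speed and sensitivity trade-off the cap and threshold control in this sketch.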
Benefits of CIS (Convergent Island Statistics)
- Analytical e-values
- No need for distribution fitting
- Analytical (yet tunable) trade-off between speed and sensitivity
- Allows one to change any algorithm parameters on the fly