TEXTAL Progress
• Basic modeling of side-chain and backbone coordinates seems to be working well
• even for experimental MAD maps at 2.5-3A resolution
• using pattern recognition with a feature-extracted database
• assuming the C-alpha coordinates are correct
• Use sequence alignment to match fragments after prediction and correct residue identities (see the sketch below)
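The slides do not spell out how the fragment-to-sequence matching works; as one illustration, a standard Needleman-Wunsch global alignment can map a chain of predicted residue identities onto the known sequence and adopt the aligned true identities. The match/mismatch/gap scoring below is a placeholder assumption, not TEXTAL's actual code.

# Sketch only: align predicted residue identities against the known
# sequence with Needleman-Wunsch, then overwrite each prediction with
# the aligned true residue. Scoring scheme is a placeholder assumption.
def align_and_correct(predicted, sequence, match=2, mismatch=-1, gap=-2):
    n, m = len(predicted), len(sequence)
    # DP table of best alignment scores
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if predicted[i - 1] == sequence[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back, replacing predicted identities with the aligned true ones
    corrected, i, j = list(predicted), n, m
    while i > 0 and j > 0:
        s = match if predicted[i - 1] == sequence[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + s:
            corrected[i - 1] = sequence[j - 1]   # adopt sequence identity
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return "".join(corrected)

print(align_and_correct("AVKLG", "AVRLG"))  # -> "AVRLG"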
CAPRA Progress
• Picks C-alphas using a neural network, then connects them into chains (see the sketch below)
• Re-implement based on the new tracing routine
• Does a good job with 2Fo-Fc maps: secondary structure is apparent, RMS < 0.8A
• Has a harder time with low-quality maps
• Secondary-structure recognition from trace geometry
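A minimal sketch of that pipeline's shape, assuming a trained network that maps density features at a grid point to a predicted distance from the nearest true C-alpha, plus a greedy linker at the canonical ~3.8 A Ca-Ca spacing. The architecture, features, weights, and tolerances here are all assumptions for illustration, not CAPRA's actual implementation.

# Sketch only: the flavor of the Ca-picking pipeline, not CAPRA's code.
import numpy as np

def predict_ca_distance(features, W1, b1, W2, b2):
    """One-hidden-layer MLP: density features -> predicted distance
    from this grid point to the nearest true C-alpha (trained weights
    W1, b1, W2, b2 are assumed given)."""
    h = np.tanh(features @ W1 + b1)
    return float(h @ W2 + b2)

def chain_candidates(coords, spacing=3.8, tol=0.5):
    """Greedily connect candidate Ca's whose separation is ~3.8 A."""
    coords = [np.asarray(c) for c in coords]
    chains = []
    while coords:
        chain = [coords.pop(0)]
        grew = True
        while grew:
            grew = False
            for i, c in enumerate(coords):
                if abs(np.linalg.norm(c - chain[-1]) - spacing) < tol:
                    chain.append(coords.pop(i))
                    grew = True
                    break
        chains.append(chain)
    return chains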
Prelim. Design for Xtal Agent
• Decision-making in structure solution
• Which program to use? Which parameters? How many iterations?
• PHASES, SOLVE, SHARP, DM, TNT, CNS, TEXTAL, WARP…
• Local decision-making: input parameters and when to stop iterating, for one program at a time
• Try a statistical approach (Terwilliger)
Global Decision-Making
• When to backtrack? What to make of information gained by exploring one path?
• Example: select an initial, conservative mask for solvent flattening; if it doesn't lead to a good model, go back and re-flatten
• When to "throw out" data (e.g., low FOM)?
• Use NCS or not? Alternative paths compete
AI Search Problem
• Choice-points form branches in a tree, with the initial data collection at the root
• Try to find a path (a sequence of computational actions) that produces a solved structure
• Question: when to continue down one path versus restart from a previous branch point? (see the sketch below)
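One natural reading of this question is best-first search: keep a frontier of all open branch points scored by estimated utility and always expand the most promising one, which makes backtracking to an earlier branch point automatic. The score, expand, and solved interfaces below are assumptions for illustration, not part of the slides.

# Sketch of the branch-point question as best-first search, assuming a
# score(state) heuristic, an expand(state) generator of successor
# states, and a solved(state) test.
import heapq, itertools

def best_first_solve(root, score, expand, solved, max_steps=1000):
    tie = itertools.count()                       # tie-breaker for the heap
    frontier = [(-score(root), next(tie), root)]  # highest utility first
    for _ in range(max_steps):
        if not frontier:
            return None
        _, _, state = heapq.heappop(frontier)     # may be an old branch
        if solved(state):                         # point: popping it here
            return state                          # is the backtrack
        for child in expand(state):
            heapq.heappush(frontier, (-score(child), next(tie), child))
    return None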
Sequential Decision Procedures
• A branch of Decision Theory
• Focus on the utility of information gained in earlier steps to make better choices later
• Attempt to optimize "long-term payoff"
• Define a target utility function that measures model goodness, e.g. a combination of Rfree, completeness, consistency… (a sketch follows)
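As one hypothetical instance of such a target function, a weighted combination of the three quantities named above; the weights and the linear form are assumptions, not taken from the slides.

# Sketch of one possible target utility v(model); higher is better,
# so Rfree is inverted since lower Rfree means a better model.
def model_utility(rfree, completeness, consistency,
                  w_rfree=0.5, w_comp=0.3, w_cons=0.2):
    return (w_rfree * (1.0 - rfree)
            + w_comp * completeness
            + w_cons * consistency)

print(model_utility(rfree=0.28, completeness=0.9, consistency=0.8))  # 0.79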
Parameter Estimation
• Need quantitative estimates of the probabilistic effect of running a program on the quality of the model
• Fit equations from synthetic experiments, e.g.:
• Prob[Rfree(S') = x | FOM(S) = y], where S' is the result of running the program on S
• Prob[Rfree*(flatten(S, 50%)) = x | Rfree*(flatten(S, 40%)) = y]
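One simple way to realize such a fit: regress the Rfree obtained after a run on the FOM before it, and model the residual scatter as Gaussian, which yields a usable conditional density. The numbers below are fabricated purely for illustration.

# Sketch of fitting Prob[Rfree(S')=x | FOM(S)=y] from synthetic runs.
import numpy as np

fom = np.array([0.35, 0.45, 0.55, 0.65, 0.75])         # FOM(S) before run
rfree_after = np.array([0.42, 0.38, 0.33, 0.29, 0.26])  # Rfree(S') after

slope, intercept = np.polyfit(fom, rfree_after, 1)      # linear mean model
residuals = rfree_after - (slope * fom + intercept)
sigma = residuals.std()                                 # residual scatter

def prob_rfree_given_fom(x, y):
    """Gaussian density of Rfree(S')=x given FOM(S)=y under the fit."""
    mu = slope * y + intercept
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))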
Utility and Risk
• The utility of an action A is the expected value of the outcomes, weighted by their probabilities: U(S,A) = Σ_S' v(S') × Prob(S'|S)
• Can be used to compare different actions and states, provided v measures "final model quality"
• Risk aversion: modify the values in the sum to weigh the possibility of a higher reward against the average loss, as a way of handling uncertainty (see the sketch below)
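A minimal sketch of the expected-utility computation, using an exponential (concave) transform as one standard way to encode risk aversion; the transform choice and the example numbers are assumptions, not from the slides.

# Sketch of U(S,A) = Σ_S' v(S') × Prob(S'|S) with optional risk aversion.
import math

def expected_utility(outcomes, risk_aversion=0.0):
    """outcomes: list of (value, probability) pairs over successor
    states S'. risk_aversion > 0 applies an exponential (concave)
    transform, penalizing spread in outcomes relative to their mean."""
    def u(v):
        if risk_aversion == 0.0:
            return v
        return (1.0 - math.exp(-risk_aversion * v)) / risk_aversion
    return sum(u(v) * p for v, p in outcomes)

safe  = [(0.60, 1.0)]               # certain, moderate model quality
risky = [(0.90, 0.5), (0.30, 0.5)]  # same mean, higher variance
# With risk_aversion=2.0 the certain outcome scores higher (~0.35 vs ~0.32)
print(expected_utility(safe, 2.0), expected_utility(risky, 2.0))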
Computational Cost
• Why not just run all programs with all parameters? We want to minimize CPU time.
• At any given moment, pick the action that produces the state with the highest expected utility minus the estimated cost of its runtime:
• gain: G(A,S) = U(A,S) − f(T(A,S))
• where T(A,S) is the estimated time to run A on S
• and f(·) maps computational effort onto the model-quality scale (a selection sketch follows)
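Putting the pieces together, a sketch of greedy gain-based selection; utility_of, runtime_of, and the linear cost function f are assumed interfaces standing in for the estimates above, not functions defined in the slides.

# Sketch of gain-based action selection: G(A,S) = U(A,S) - f(T(A,S)).
def pick_action(state, actions, utility_of, runtime_of, cost_per_hour=0.01):
    def f(t_hours):                  # map CPU effort onto the quality scale
        return cost_per_hour * t_hours
    def gain(action):
        return utility_of(action, state) - f(runtime_of(action, state))
    return max(actions, key=gain)    # best utility net of runtime cost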