
SAXS Scatter Performance Analysis

SAXS Scatter Performance Analysis. Chris Wilcox 2/6/2008. Scatter Status. Prototype of basic algorithm, arbitrary number of atoms and topology. Atom types: C, N, O, H, P, S, Zn, and very easy to add more. Matches results with original R prototype from Stefan, for several small molecules.


Presentation Transcript


  1. SAXS Scatter Performance Analysis Chris Wilcox 2/6/2008

  2. Scatter Status
     • Prototype of the basic algorithm; handles an arbitrary number of atoms and arbitrary topology.
     • Atom types: C, N, O, H, P, S, Zn; adding more is very easy.
     • Matches the results of Stefan's original R prototype for several small molecules.
     • Computes the intensity function over a specified number of steps.

  3. Scatter Performance (Current)
     • Original algorithm, no optimization, debug version: 5000 atoms = ~60 hours.
     • Original algorithm, no optimization, release version: 5000 atoms = ~4 hours.
     • Obvious restructuring, pre-computed factors, release version: 5000 atoms = ~39 minutes.
     • Redundant work avoided, compiler flags, release version: 5000 atoms = ~19 minutes.
     Pentium Core Duo, mobile CPU, 166 MHz
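
The gain from each step is easier to see as ratios. The short Python sketch below only does arithmetic on the four timings quoted in the slide. The labels and the function name `stepwise_speedups` are illustrative and do not come from the original program.

```python
# Timings from the slide, converted to minutes (5000 atoms in every case).
TIMINGS_MINUTES = [
    ("debug, unoptimized", 60 * 60),
    ("release, unoptimized", 4 * 60),
    ("release, pre-computed factors", 39),
    ("release, redundant work removed", 19),
]


def stepwise_speedups(timings):
    """Return (label, speedup vs. previous step, speedup vs. first step)."""
    first = timings[0][1]
    rows = []
    previous = first
    for label, minutes in timings:
        rows.append((label, previous / minutes, first / minutes))
        previous = minutes
    return rows


if __name__ == "__main__":
    for label, step, total in stepwise_speedups(TIMINGS_MINUTES):
        print(f"{label:35s} x{step:6.1f} vs previous, x{total:6.1f} overall")
```

On these figures, switching from the debug to the release build gives the largest single gain (15x), and the four steps combined are a little under 190x.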

  4. Scatter Performance (Analysis)
     • Scatter factors are pre-computed; they require ~0% of the fastest calculation.
     • Distance calculations are step independent; they require ~3%, and only because of the SQRT function.
     • The FSIN function appears to consume ~60% of processor cycles. Is there an alternative?
     • The intensity calculation itself uses ~86% of the cycles; this needs to be verified again on the latest calculation.
     No real optimization yet, compiler wins anyway!
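
The slides do not give the intensity formula. The sketch below assumes the standard Debye sum, I(q) = Σ_i Σ_j f_i(q) f_j(q) sin(q r_ij) / (q r_ij), which fits the structure described above: scatter factors pre-computed per type and step, distances computed once, and one sine per atom pair per step. The function name, argument layout, and the caller-supplied `factors` table are all hypothetical; no real form-factor coefficients are included.

```python
import math


def debye_intensity(coords, types, factors, q_values):
    """Assumed Debye-sum intensity curve (not taken from the original code).

    coords   : list of (x, y, z) tuples, one per atom
    types    : list of atom-type keys, one per atom (e.g. "C", "N")
    factors  : dict mapping type -> list of pre-computed f(q), one per step
    q_values : list of q values, one per step
    """
    n = len(coords)

    # Step-independent part: one distance (one sqrt) per unordered pair i < j.
    pairs = []
    for i in range(n):
        xi, yi, zi = coords[i]
        for j in range(i + 1, n):
            xj, yj, zj = coords[j]
            r = math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)
            pairs.append((types[i], types[j], r))

    # Step-dependent part: one sine per pair per step.
    intensity = []
    for s, q in enumerate(q_values):
        # Self terms (i == j), where sin(x)/x -> 1.
        total = sum(factors[t][s] ** 2 for t in types)
        for ti, tj, r in pairs:
            x = q * r
            sinc = math.sin(x) / x if x != 0.0 else 1.0
            # Factor 2 because each unordered pair stands for (i, j) and (j, i).
            total += 2.0 * factors[ti][s] * factors[tj][s] * sinc
        intensity.append(total)
    return intensity
```

Visiting only pairs with i < j and doubling the contribution is one way to get the halved pair count; whether the original program does the same is not stated.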

  5. Scatter Performance (Model)
     N = # of atoms, S = # of steps, A = # of types
     • Scatter factors are O(S·A) * (4 exp + 4 pow + 4 fmul), i.e. 10K iterations for 1000 steps, 10 types.
     • Distance math is O(N²/2) * (1 sqrt + 3 fmul + 2 fadd), i.e. 12.5M iterations for 1000 steps, 5000 atoms.
     • Intensity math is O(S·N²/2) * (1 fsin + 9 fmul + 2 fadd), i.e. 12.5G iterations for the same case.
     • Operations shown are based on code reading; actual floating point instructions are ~2X more frequent.
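
The iteration counts in this model can be checked with a few lines of Python. The function name and dictionary keys are illustrative. The pair count uses N²/2 as written in the slide rather than the exact N(N-1)/2.

```python
def iteration_counts(n_atoms, n_steps, n_types):
    """Iteration counts for the three phases of the slide's cost model."""
    pairs = n_atoms * n_atoms // 2  # N^2 / 2, as in the slide
    return {
        "scatter_factors": n_steps * n_types,  # O(S * A)
        "distances": pairs,                    # O(N^2 / 2)
        "intensity": n_steps * pairs,          # O(S * N^2 / 2)
    }


if __name__ == "__main__":
    for phase, count in iteration_counts(5000, 1000, 10).items():
        print(f"{phase:16s} {count:>16,d}")
```

For 5000 atoms, 1000 steps, and 10 types this reproduces the 10K, 12.5M, and 12.5G figures. The intensity loop outweighs the distance loop by a factor of S.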

  6. Scatter Performance (Future)
     • Complete the optimizations and convert the sine function to a lookup table: 5000 atoms = ~500 seconds?
     • Find faster floating point performance, not hard to beat by 8x: 5000 atoms = ~60 seconds?
     • Intensity calculations are independent, so use more processors: 5000 atoms = ~10 seconds?
     • Question: How many molecules need to be run to represent a non-rigid structure?
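
A sine lookup table, as proposed in the first bullet, could look like the sketch below. The table size (4096 entries) and the use of linear interpolation are assumptions; the slides specify neither. In Python the table is unlikely to be faster than `math.sin`, so the code illustrates the technique and its accuracy, not the expected speedup in compiled code.

```python
import math

TABLE_SIZE = 4096  # assumed resolution; not specified in the slides
TWO_PI = 2.0 * math.pi

# One extra entry so that interpolation can always read index k + 1.
SINE_TABLE = [math.sin(TWO_PI * k / TABLE_SIZE) for k in range(TABLE_SIZE + 1)]


def table_sin(x):
    """Approximate sin(x) by linear interpolation in a pre-computed table."""
    pos = (x % TWO_PI) / TWO_PI * TABLE_SIZE
    # Clamp in case rounding makes x % TWO_PI equal to TWO_PI itself.
    k = min(int(pos), TABLE_SIZE - 1)
    frac = pos - k
    return SINE_TABLE[k] + frac * (SINE_TABLE[k + 1] - SINE_TABLE[k])


if __name__ == "__main__":
    worst = max(abs(table_sin(0.001 * i) - math.sin(0.001 * i))
                for i in range(100_000))
    print(f"largest error over the sample: {worst:.2e}")
```

With linear interpolation the error is bounded by h²/8 for table spacing h = 2π/4096, which is roughly 3e-7. Whether that accuracy is sufficient for the intensity curve would need to be checked against the reference results.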

  7. Next Steps (Short Term)
     • Add precise timing and develop a model to predict performance for an arbitrary number of atoms.
     • Analyze the instructions in the inner loop of scatter, though it may be impossible to improve on the compiler.
     • Extend to read the .pdb file format, or integrate with the existing Python code.
     • Try on a processor with better floating point, or on a parallel machine. What is required to do this?
     Project setup takes precedence for several weeks.
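
As a starting point for the first bullet, the sketch below pairs a simple wall-clock timer with a first-order prediction. It assumes that run time scales as N² at a fixed number of steps (the intensity loop dominates) and calibrates on the single measured point quoted earlier, 5000 atoms in about 19 minutes. Both function names are hypothetical, and a real model would need more than one calibration point.

```python
import time


def time_call(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def predict_seconds(n_atoms, ref_atoms=5000, ref_seconds=19 * 60):
    """Scale one measured run time quadratically with the atom count.

    Assumption: time ~ N^2 at a fixed number of steps.
    """
    return ref_seconds * (n_atoms / ref_atoms) ** 2


if __name__ == "__main__":
    for n in (1000, 5000, 20000):
        print(f"{n:6d} atoms: ~{predict_seconds(n) / 60:.1f} minutes")
```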

  8. Next Steps (Long Term)
     • Close the loop with experimental data on a known molecule, changing the algorithm as necessary.
     • Develop a streaming version of the program that accepts multiple molecules and averages the results.
     • New program for modeling elastic topology, previously called the “parametric” model.
     • Investigate a change to a streaming architecture; may prototype a simple framework user interface.
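
The streaming version in the second bullet could average intensity curves as they arrive, without holding all of them in memory. The sketch below assumes that each molecule yields one curve with the same number of steps and that the average is a plain unweighted mean; the slides state neither. The function name is hypothetical.

```python
def streaming_average(curves):
    """Running mean of equal-length intensity curves from any iterable.

    Returns (mean_curve, count); mean_curve is None if no curve was seen.
    """
    mean = None
    count = 0
    for curve in curves:
        count += 1
        if mean is None:
            mean = [float(v) for v in curve]
            continue
        if len(curve) != len(mean):
            raise ValueError("all curves must have the same number of steps")
        # Incremental update: mean += (value - mean) / count
        for s, value in enumerate(curve):
            mean[s] += (value - mean[s]) / count
    return mean, count


if __name__ == "__main__":
    # A generator stands in for a stream of per-molecule results.
    stream = (curve for curve in [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    print(streaming_average(stream))
```

Because only the running mean and a counter are kept, memory use does not depend on how many molecules are processed.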
