
XRD data analysis software development


dana-holman

Presentation Transcript


  1. XRD data analysis software development

  2. Outline • Background • Reasons for change • Conversion challenges • Status

  3. X-Ray Diffraction (XRD) • What is an XRD experiment for? • Provides information on the relative positions of atoms in a crystal • Allows individual crystalline structures to be identified • Detects strains in the crystals as well • What is XRD data? • Digital images collected by a CCD camera while the synchrotron X-ray beam scans a sample area • Image data sizes are large • One 2D image is collected for each scan point • 8MB/image at CLS, 2084X2084 pixels/image • Hundreds or even thousands of images can be collected in one experiment, depending on • the size of the sample area to be scanned • the step size used to move the sample during the scan • More scan points provide more detailed information for the analysis
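
As a rough worked example of the data volumes described on this slide, the sketch below estimates the image count and raw storage for a rectangular scan. Only the 8MB/image figure comes from the slide; the function name, the regular-grid assumption and the sample dimensions are illustrative.

```python
import math

MB_PER_IMAGE = 8  # per-image size at CLS, taken from the slide


def scan_volume(width_um, height_um, step_um, mb_per_image=MB_PER_IMAGE):
    """Return (number_of_images, total_megabytes) for a rectangular raster scan.

    Assumes one image per scan point on a regular grid that includes both
    edges of the scanned area (a hypothetical layout, for illustration only).
    """
    if step_um <= 0:
        raise ValueError("step size must be positive")
    points_x = math.floor(width_um / step_um) + 1
    points_y = math.floor(height_um / step_um) + 1
    n_images = points_x * points_y
    return n_images, n_images * mb_per_image


# Hypothetical 200 um x 200 um area, scanned at two different step sizes.
coarse = scan_volume(200, 200, 10)
fine = scan_volume(200, 200, 5)
```

Under these assumptions, halving the step size roughly quadruples the number of images and the raw storage needed.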

  4. XRD data analysis • Deals with a large amount of image data • Several procedures are applied to each image • Peak searching: identify regions of interest • includes threshold finding, blob searching and 2D curve fitting on each blob • Indexing: identify possible/known crystalline structures • Strain analysis: detect strains in the material (?) • Existing XRD data analysis software was written in IDL • A proprietary scripting language • Carries out processing sequentially only • It is very time consuming! • e.g. processing a whole package of data normally takes days
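
The first two peak-searching steps named above (threshold finding and blob searching) can be sketched as follows. This is a minimal illustration, not the IDL software's actual method: the mean-plus-k-sigma threshold rule, the 4-connected flood fill and all function names are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def find_threshold(image, k=3.0):
    """Threshold = mean + k * standard deviation of all pixels (assumed rule)."""
    pixels = [p for row in image for p in row]
    return mean(pixels) + k * pstdev(pixels)


def find_blobs(image, threshold):
    """Return a list of blobs; each blob is a list of (row, col) pixels
    whose value is strictly above the threshold.

    Uses a 4-connected flood fill; the connectivity rule of the original
    software is not stated in the slides, so this is an assumption.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            blob, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs
```

Each blob returned here would then be handed to the 2D curve-fitting step.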

  5. Reasons for change • Needed for incorporation into Science Studio (SS) • SS aims to give remote users feedback during experimental runs; XRD analysis is one example • Existing code is in a scripting language and relies on sequential processing • Existing software is written in IDL • Peak searching is in IDL • Indexing and strain analysis are in IDL, calling external routines in Fortran • Needed to have versions for streaming data analysis • Stream processing -- taking a stream of input data, processing the data in a series of steps, and streaming the results out, achieving real-time or close to real-time performance • Needed to solve the data storage problem • Accumulating large amounts of raw image data over a long time could cause storage problems • Only the peaks in each image are useful for the analysis • If the peaks can be found in real time during data collection, it might not be necessary to keep the raw images • E.g. a typical raw image is 8MB at CLS, while the peak data for that image is only about 10KB
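
The stream-processing idea on this slide (input stream, a series of steps, results streamed out, raw images not retained) can be sketched with chained generators. All stage names are hypothetical, and "peaks" here are simply above-threshold pixels, standing in for the real peak search.

```python
def image_stream(images):
    """Yield (index, image) pairs one at a time, as a detector would deliver them."""
    for index, image in enumerate(images):
        yield index, image


def peak_stage(stream, threshold):
    """Reduce each image to its above-threshold pixels; the raw image is not kept."""
    for index, image in stream:
        peaks = [(r, c, value)
                 for r, row in enumerate(image)
                 for c, value in enumerate(row)
                 if value > threshold]
        yield index, peaks  # only the compact peak list moves downstream


def result_stage(stream):
    """Stream out one summary record per image."""
    for index, peaks in stream:
        yield {"image": index, "n_peaks": len(peaks), "peaks": peaks}


def run_pipeline(images, threshold):
    """Chain the stages; each image flows through all steps before the next is read."""
    return list(result_stage(peak_stage(image_stream(images), threshold)))
```

With the figures quoted on the slide (8MB raw versus about 10KB of peak data), keeping only the peak records would cut storage per image by a factor of roughly 800.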

  6. How to make the change • Our development target • Port the existing XRD data analysis software to a Cell system at SHARCNET to achieve stream processing for XRD data analysis • SHARCNET’s Cell system • Includes 8 Cell blades (QS22) -- 2 Cell processor chips on each blade, i.e. 16 Cell processors in total • Cell processor -- a heterogeneous multi-core architecture • Two types of cores optimized for different tasks • 1 Power Processing Element (PPE) and 8 Synergistic Processing Elements (SPEs) • PPE -- Power PC architecture, acts as a controller and performs control-intensive tasks • SPEs -- simpler cores that devote more resources to computation and perform computation-intensive tasks • Cell processor can be programmed to achieve stream processing
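
The division of labour between the PPE (controller) and the SPEs (compute workers) can be illustrated by analogy with a worker pool. This is only a structural sketch in Python: real Cell code is compiled C with separate PPE and SPE programs, Python threads do not give true parallelism for this kind of work, and every name below is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

N_WORKERS = 8  # one worker per SPE on a single Cell processor, per the slide


def compute_task(block):
    """Stand-in for a computation-intensive kernel that an SPE would run."""
    return sum(value * value for value in block)


def controller(blocks, n_workers=N_WORKERS):
    """Stand-in for the PPE: dispatch blocks to workers and collect results in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(compute_task, blocks))
```

The controller does no heavy arithmetic itself; it only hands out work and gathers results, which mirrors the role the slide assigns to the PPE.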

  7. Basic Cell Programming Model • [Figure; labels: Diffraction pattern, XRD data analysis procedures, Resultant Maps, Orientation, Strain]

  8. Challenges • Cell only runs Linux and compiled C/C++ code • PPE and SPE execute different instruction sets • Code for the PPE and the SPE is compiled with different compilers • Existing software is written in IDL • Peak searching is in IDL • Indexing and strain analysis are in IDL, calling external routines in Fortran • Challenges • No algorithm description provided: the code must be rewritten in C using only the IDL source code • Programming on Cell is new and challenging because of Cell’s special architecture • Needs knowledge of assembly-level programming • Limited function libraries are available for Cell’s SPE

  9. Development plan • Rewrite the code in C • Validate the results produced by the C code • Compare with results from the existing software • Make the code run on Cell’s PPE • Design for parallel processing on Cell • Identify a strategy for parallel computation • Identify what should be executed on Cell’s SPEs • Implement the design • Validate the results produced on Cell • Measure performance
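
The validation step in this plan (comparing the new code's results with those of the existing software) might look like the sketch below. The peak representation as (x, y, intensity) tuples, the tolerance value and the function name are assumptions; the slides do not describe how the comparison was actually done.

```python
def peaks_match(reference, candidate, tol=1e-3):
    """Return True if every reference peak has a distinct candidate peak
    whose x, y and intensity all agree within `tol`.

    Peaks may be listed in different orders by the two programs, so each
    reference peak is matched greedily against the remaining candidates.
    """
    if len(reference) != len(candidate):
        return False
    unused = list(candidate)
    for ref in reference:
        for i, cand in enumerate(unused):
            if all(abs(a - b) <= tol for a, b in zip(ref, cand)):
                del unused[i]  # each candidate peak may be matched only once
                break
        else:
            return False  # no remaining candidate is close enough
    return True
```

A tolerance is used rather than exact equality because a C port and the original code can differ slightly in floating-point rounding.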

  10. Progress Report • Peak searching and indexing procedures have been rewritten in C • Results produced by the C code for both procedures have been validated • at least with our limited data set • Peak searching has been ported to Cell successfully • Threshold finding and blob searching are carried out by the PPE • 2D curve fitting (Lorentz fitting) for each blob is carried out by the SPUs • The typical number of blobs found in each image is about 100 to 200, depending on the threshold setting • Some preliminary performance measurements have been made on the Cell system for the peak searching procedure
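
For the per-blob 2D Lorentz fitting mentioned above, the sketch below shows one common form of a 2D Lorentzian, a starting estimate for the fit, and the quantity a fit would minimise. The exact model used by the software is not given in the slides, so the functional form, parameter names and helper functions are assumptions, and no optimiser is included.

```python
def lorentzian_2d(x, y, amplitude, x0, y0, wx, wy, background=0.0):
    """Assumed model: background + amplitude / (1 + ((x-x0)/wx)^2 + ((y-y0)/wy)^2)."""
    dx = (x - x0) / wx
    dy = (y - y0) / wy
    return background + amplitude / (1.0 + dx * dx + dy * dy)


def initial_guess(blob):
    """Estimate (amplitude, x0, y0) for a blob given as (x, y, intensity) tuples.

    Uses the brightest pixel for the amplitude and the intensity-weighted
    centroid for the peak position.
    """
    total = sum(i for _, _, i in blob)
    x0 = sum(x * i for x, _, i in blob) / total
    y0 = sum(y * i for _, y, i in blob) / total
    amplitude = max(i for _, _, i in blob)
    return amplitude, x0, y0


def sum_squared_residuals(blob, params):
    """Objective a least-squares fit would minimise over `params`
    (amplitude, x0, y0, wx, wy[, background])."""
    return sum((i - lorentzian_2d(x, y, *params)) ** 2 for x, y, i in blob)
```

Because each blob is fitted independently, the 100 to 200 blobs per image are natural units of work to distribute across the SPUs.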

  11. Some preliminary performance measurements (2) • Peak searching on CLS XRD data: 8MB/image, 2084X2084 pixels/image • Desktop speed: 9.34 sec./image

  12. More work to do • Continue rewriting the strain analysis code in C • Port the indexing and strain analysis procedures onto Cell • Design a programming model for Cell to achieve stream processing for all procedures in XRD data analysis • Implement the design • Integrate the stream processing of XRD data with Science Studio
