Prototyping the Next EPICS Archiver

This presentation evaluates SciDB as a candidate storage backend for the next EPICS Channel Archiver. It introduces SciDB's array-oriented data model, its query languages (AQL and AFL) and operators, and related array database systems. It then reports benchmarks of the SciDB and Channel Archiver backends and of the Channel Archiver engine, proposes a chunk-based design for both, and outlines an integrated environment that combines EPICS 3, EPICS 4, and HDF5 data sources. SciDB's Science Advisory Board spans domains such as genomics, astronomy, and fusion.

Presentation Transcript


  1. Prototyping the Next EPICS Archiver. Nikolay Malitsky, EPICS Collaboration Meeting, PSI, Switzerland, October 5, 2011

  2. Content • Background • SciDB in Three Slides • Benchmark and Proposal of Channel Archiver Backend • Benchmark and Proposal of Channel Archiver Engine • Integrated Environment

  3. Background: Archiver
     • Backends: original approach, RDBs
     • New requirements:
       • 1 M channels and 1 M samples/s (BNL, SLAC, Diamond, PSI, …)
       • New data types (EPICS 4, DAQ, ITER, …)
     • New solutions: Protocol Buffers (SLAC), MDSplus (Consorzio RFX), Hypertable (INFN), SciDB (BNL), …

  4. SciDB: Open Source Data Management and Analytics System http://www.scidb.org
     • Consistent array-oriented model and formalism:
       • Generalization of the OLAP models
       • Natural and fundamental data type of scientific software (e.g. MATLAB)
     • Strong team of database experts led by Mike Stonebraker
     • Wide range of domains represented by the Science Advisory Board: Genomics, Astronomy, Environmental Observing Systems, Earth Science, Fusion, Remote Sensing, High Energy Physics, Atmospheric Sciences, Oceanography, Control System

  5. SciDB Array-Oriented Model
     CREATE ARRAY Example < a1:integer, a2:float, a3:MyType > [Dim1=0:5, Dim2=0:4]
     • AQL: Array Query Language, similar to SQL
     • AFL: Array Functional Language, reminiscent of APL, etc., including a collection of operators such as join, filter, slice, apply, and others
     • Other approaches:
       • Y. Zhao, et al., Array-Based Evaluation of Multi-Dimensional Queries in Object-Relational Database Systems, 1997
       • P. Baumann, et al., The Multidimensional Database System RasDaMan, 1998
       • Barrodale Company, DBXten DataBlade, 2010 (thanks to Lana Abadie for this reference)

  6. SciDB Array-Oriented API • Chunk-based partitioning • Column-oriented partitioning
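Chunk-based partitioning can be illustrated by mapping a cell coordinate to the chunk that holds it. The following is a minimal sketch, assuming regular fixed-length chunks along a dimension and no overlap between chunks; `Dimension`, `chunk_index`, and `chunk_origin` are hypothetical names introduced here and are not part of the SciDB API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one array dimension: the inclusive range
// [low, high] is split into regular chunks of chunk_len cells (no overlap).
struct Dimension {
    std::int64_t low;
    std::int64_t high;
    std::int64_t chunk_len;
};

// Zero-based index of the chunk that holds coordinate pos.
inline std::int64_t chunk_index(const Dimension& d, std::int64_t pos) {
    assert(d.chunk_len > 0 && pos >= d.low && pos <= d.high);
    return (pos - d.low) / d.chunk_len;
}

// First coordinate stored in the chunk that holds pos.
inline std::int64_t chunk_origin(const Dimension& d, std::int64_t pos) {
    return d.low + chunk_index(d, pos) * d.chunk_len;
}
```

With a chunk length of 1000 along a time-like dimension, for example, sample number 2500 would fall into chunk 2, whose first cell is sample 2000.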

  7. Benchmark of the SciDB 0.75 backend (May 17, 2011) http://sourceforge.net/projects/epics-archbench/
     struct Complex { double re; double im; };
     • Scenario:
       • 1000 channels
       • 100 chunks/channel
       • 1000 samples/chunk
       • 1 sample: status (int16), severity (int16), time stamp (int64), and Complex
     • Writing of 1000 channels: 127 s (~0.8 M samples/s)
       REGISTER_TYPE(complex, 0); // 0: variable-size data
       …
       Complex v; v.re = 1.0*ic; v.im = 1.0*(ie+1);
       value.setData(&v, 16);
       chunkIterator->writeItem(value);
     • Reading of 4 channels: 0.12 s (~3 M samples/s)
       Value& v = chunkIterator->getItem();
       Complex* data = (Complex*) v.data();
     • Nice results, but the existing in-memory indexing approach does not scale for historical data
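The quoted rates follow from the scenario sizes. The sketch below reproduces the arithmetic, assuming that each of the 4 channels in the read test is read in full (100 chunks of 1000 samples); the helper names `total_samples` and `samples_per_second` are introduced here for illustration only.

```cpp
#include <cassert>

// Total samples in a run: channels x chunks per channel x samples per chunk.
inline double total_samples(long channels, long chunks_per_channel,
                            long samples_per_chunk) {
    return static_cast<double>(channels) * chunks_per_channel * samples_per_chunk;
}

// Average throughput over the whole run.
inline double samples_per_second(double samples, double seconds) {
    assert(seconds > 0.0);
    return samples / seconds;
}
```

Writing: `samples_per_second(total_samples(1000, 100, 1000), 127.0)` is about 0.79 M samples/s, consistent with the rounded ~0.8 M. Reading: `samples_per_second(total_samples(4, 100, 1000), 0.12)` is about 3.3 M samples/s, consistent with ~3 M.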

  8. Benchmark of the SciDB and Channel Archiver backends
     CREATE ARRAY Ch001 < dbr: dbr_time_double > [ ts(epicsTimeStamp) = 0:*, 1000, 0 ]
     struct dbr_time_double {
       dbr_short_t status; /* status of value */
       dbr_short_t severity; /* severity of alarm */
       epicsTimeStamp stamp; /* time stamp */
       dbr_long_t RISC_pad; /* RISC alignment */
       dbr_double_t value; /* current value */
     };
     • Scenario:
       • 1000 channels
       • 1000 - 100 chunks/channel
       • 100 - 1000 samples/chunk
       • 1 sample: dbr_time_double
     • Charts (not reproduced in this transcript): Writing, Reading
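The size of one `dbr_time_double` sample determines the payload of a chunk. The sketch below mirrors the struct with stand-in typedefs so that it compiles without the EPICS headers (it should not be combined with them, since the names would clash). The 24-byte size holds on common ABIs where `double` is 8 bytes; `chunk_payload_bytes` is a hypothetical helper, and any per-chunk overhead added by a backend is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for the EPICS typedefs (assumption: same widths as in EPICS base).
typedef std::int16_t dbr_short_t;
typedef std::int32_t dbr_long_t;
typedef double       dbr_double_t;

struct epicsTimeStamp {
    std::uint32_t secPastEpoch;  // seconds since the EPICS epoch (1990-01-01 UTC)
    std::uint32_t nsec;          // nanoseconds within the second
};

struct dbr_time_double {
    dbr_short_t    status;    // status of value
    dbr_short_t    severity;  // severity of alarm
    epicsTimeStamp stamp;     // time stamp
    dbr_long_t     RISC_pad;  // padding so that value starts at offset 16
    dbr_double_t   value;     // current value
};

// Layout check: 2 + 2 + 8 + 4 + 8 bytes with no hidden padding.
static_assert(sizeof(dbr_time_double) == 24, "unexpected sample size");
static_assert(offsetof(dbr_time_double, value) == 16, "unexpected value offset");

// Raw payload of one chunk, excluding any backend bookkeeping.
inline std::size_t chunk_payload_bytes(std::size_t samples_per_chunk) {
    return samples_per_chunk * sizeof(dbr_time_double);
}
```

Under these assumptions, a 1000-sample chunk carries 24,000 bytes of sample data and a 100-sample chunk carries 2,400 bytes.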

  9. SciDB - EPICS Driver
     • Diagram labels: chunk size, SciDB API, SciDB-EPICS, EPICS backend
     • Similar approach: Barrodale Company, Universal File Interface (again, thanks to Lana Abadie for this reference)

  10. Benchmark of the Channel Archiver Engine
      • Engine: 2-10 K monitors @ 10 Hz, i.e. 20-100 K dbr_time_double/s
      • M. Kraimer, et al., EPICS Application Developer's Guide, Ch. 2.2 Example IOC Application, 2010
      • G. Manduchi, et al., New EPICS Channel Archiver based on MDSplus Data System, Proc. IEEE RT, 2011

  11. Chunk-based solution
      • Herb Sutter, Writing a Generalized Concurrent Queue, Dr. Dobb's, 2008
      • Diagram labels: CA, PV, CircularBuffer (one per PV), Queue, Engine; 10,000 channels
      • Engine log:
        09/30/2011 14:43:33 Engine: writing done, channels: 10000, count: 1000000, time: 3.419260e+00 s
        09/30/2011 14:43:42 Engine: writing done, channels: 10000, count: 1000000, time: 2.950856e+00 s
        09/30/2011 14:43:52 Engine: writing done, channels: 10000, count: 1000000, time: 2.758174e+00 s
        09/30/2011 14:44:03 Engine: writing done, channels: 10000, count: 1000000, time: 3.164271e+00 s
        09/30/2011 14:44:12 Engine: writing done, channels: 10000, count: 1000000, time: 2.884632e+00 s
        09/30/2011 14:44:22 Engine: writing done, channels: 10000, count: 1000000, time: 2.759355e+00 s
        09/30/2011 14:44:32 Engine: writing done, channels: 10000, count: 1000000, time: 2.911632e+00 s
        09/30/2011 14:44:42 Engine: writing done, channels: 10000, count: 1000000, time: 2.809776e+00 s
      • Buffer: 10 s (chunk size = 10 Hz * 10 s = 100 samples)
      • Writing rate: 1 M / 3 s ≈ 300 K samples/s
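The chunk-based hand-off can be sketched as follows: each PV collects samples in its own buffer and passes a complete chunk to a shared queue that the engine drains. This is a simplified illustration, not the presented implementation. It uses a plain fill-and-flush buffer rather than a true circular buffer, and a mutex-protected queue rather than the concurrent queue from the Sutter article; `Sample`, `ChunkQueue`, and `PvBuffer` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Reduced sample type for the sketch: time stamp and value only.
struct Sample { long long stamp; double value; };
using Chunk = std::vector<Sample>;

// Mutex-protected queue shared by all PV buffers and the engine writer.
class ChunkQueue {
public:
    void push(Chunk c) {
        std::lock_guard<std::mutex> lock(mutex_);
        chunks_.push_back(std::move(c));
    }
    // Moves the oldest chunk into out; returns false if the queue is empty.
    bool try_pop(Chunk& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (chunks_.empty()) return false;
        out = std::move(chunks_.front());
        chunks_.pop_front();
        return true;
    }
private:
    std::mutex mutex_;
    std::deque<Chunk> chunks_;
};

// Per-PV buffer: collects samples and hands each full chunk to the queue.
class PvBuffer {
public:
    PvBuffer(std::size_t chunk_size, ChunkQueue& queue)
        : chunk_size_(chunk_size), queue_(queue) {
        assert(chunk_size_ > 0);
        current_.reserve(chunk_size_);
    }
    void add(const Sample& s) {
        current_.push_back(s);
        if (current_.size() == chunk_size_) {
            queue_.push(std::move(current_));
            current_ = Chunk();             // reset the moved-from vector
            current_.reserve(chunk_size_);
        }
    }
private:
    std::size_t chunk_size_;
    ChunkQueue& queue_;
    Chunk current_;
};
```

With a chunk size of 100 (10 Hz over a 10 s buffer), 10,000 such buffers would each deliver one chunk per 10 s, which matches the 1,000,000 samples per write cycle in the log above.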

  12. A Bigger Picture (architecture diagram, not reproduced in this transcript)
      • Storage: SciDB Nodes, SciDB API, Archiver
      • Drivers: EPICS 3 Driver, EPICS 4 Driver, HDF5 Driver
      • Sources (V 3): EPICS 3 Control System (Magnets, BPMs, etc.), Beamline Experiments, Detectors
      • Sources (V 4): EPICS 4 Infrastructure (Model Service, Another Service), DAQ, Magnets, BPMs, etc., Detectors

  13. Thank You
      • BNL: B. Dalesio, D. Dohan
      • Diamond Light Source: J. Rowland
      • INFN: M. Giacchini
      • SciDB: J. Becla, P. Brown
      • SLAC: M. Shankar
      • Stony Brook University: Y. Kulinich
