200 likes | 324 Views
Oracle Archiver Past Experience. Lessons Learned for Future EPICS Channel Archiver RDB Development. Topics. Background System Elements Key Technologies Retrieval Considerations Table Structure Summary. Background.
E N D
Oracle Archiver Past Experience Lessons Learned for Future EPICS Channel Archiver RDB Development EPICS Collaboration Mtg Oct. 13-14, 2007
Topics • Background • System Elements • Key Technologies • Retrieval Considerations • Table Structure • Summary EPICS Collaboration Mtg Oct. 13-14, 2007
Background • Designed and developed from late 2001 through 2002 by Lee Ann Yasukawa and Bob Hall utilizing version 1 of the EPICS Channel Archiver • Thomas Birke of BESSY developed the Oracle LibIO C++ classes used to support retrieval of Oracle data • Was in production use at SLAC from February 2003 until the end of 2004 EPICS Collaboration Mtg Oct. 13-14, 2007
Factors in Discontinuing Use • The new version 2 of the Channel Archiver became available with many improvements. We could also use the Archiver Viewer with this version, which gave our users an improved browser. • The retrieval speed of the file-based version 2 channel archiver was better than the Oracle archiver. • Discovered in late 2004 that the Stanford license for Oracle did not cover our use of the Oracle partitioning feature at SLAC, which would cost our group approximately $ 10 K. EPICS Collaboration Mtg Oct. 13-14, 2007
System Elements • Oracle Channel Archiver Engine based on the version 1 EPICS Channel Archiver • Channel Archiver Monitor program running every 5 minutes to check the health of the Archiver Engine by reading the Engine log. • Archive Browser written in Python and using SWIG to be able to call LibIO C++ routines. Capable of browsing scalar data in a plot window and storing data of interest into Matlab files for further analysis. • Data may also be viewed using the version 1 EPICS Channel Archiver CGIExport facility. • Matlab Archiver Retrieval facility written in C using Matlab MEX to retrieve Oracle Archiver data via function calls from within Matlab. EPICS Collaboration Mtg Oct. 13-14, 2007
Oracle Archiver SCS AFS File Server MCC Computer Room Web Browsers (for CGI Export) Web Browsers (for CGI Export) Web Server 2-T3 Disks Retrieve Oracle 9i Client RAID Controller Oracle 9i Instance SQL Listener Apache Listener Archive Browser Archive Browser TARF Archiver (opi00gtw04) PEPII Archiver (slcs2) 8 Pack Archiver (opi00gtw05) EPICS Collaboration Mtg Oct. 13-14, 2007
Key Technologies • Oracle Application Direct Path Storage • Oracle Partitioning EPICS Collaboration Mtg Oct. 13-14, 2007
Direct Path Storage • The Oracle direct path mechanism bypasses the normal, conventional-path Oracle buffering mechanism used with conventional SQL INSERT statements to write directly to Oracle data files. • Results in dramatic increases in data storage rates when loading large amounts of data into Oracle. • During scalar EPICS data performance testing over 16,000 samples per second could be stored. EPICS Collaboration Mtg Oct. 13-14, 2007
Direct Path Storage • Oracle Archive Engine calls C routines that call Oracle Call Interface (OCI) routines to store data into Oracle tables using the direct path mechanism. • OCI routines are low-level, C callable routines optimized for performance and provide the only API available for using direct path storage. • OCI routines are quite low-level and require considerable programmer time and effort to call correctly. EPICS Collaboration Mtg Oct. 13-14, 2007
Direct Path Storage • Blocks of data (e.g., 2 K rows) are stored together, not individual rows. This means that data is not stored until a block is full or logic is added to flush partially filled blocks periodically. • Direct path loading locks the storage table for any other Oracle operations other than non-indexed SELECT statements. Therefore, active data storage tables cannot be indexed and other methods must be used to improve data retrieval performance for these tables (e.g. partitioning). • A restricted number of column datatypes are supported for direct path load operations. EPICS Collaboration Mtg Oct. 13-14, 2007
Partitioning • Partitioning allows active data storage and archive data storage tables to be subdivided by time ranges. • This feature can dramatically speed data retrieval requests due to partition elimination. That is, requests for data for a specified time span from a partitioned table subdivided by time ranges allows Oracle to search only those partitions that may contain data for the specified time range rather than the whole table. • This is critical to retrieval performance when data tables become very large. EPICS Collaboration Mtg Oct. 13-14, 2007
Partitioning • Partitioning can allow simplified data maintenance. Maintenance operations that need to be performed on only a subset of partitions do not need to be applied to an entire table. • Data from old partitions may be conveniently stored to table and dropped from online storage. EPICS Collaboration Mtg Oct. 13-14, 2007
Partitioning • At SLAC relatively small 10 minute partitions were used for the active storage data tables to allow fast retrieval of the most recent data, which tends to be the data most requested by users. • After midnight each night a job ran which effectively moved the data in the 10 minute partitions of the previous day for each data storage table into a day long archive partition, which was then indexed to improve data retrieval performance. • This strategy allowed reasonably fast data retrieval while keeping the number of partitions manageable. EPICS Collaboration Mtg Oct. 13-14, 2007
Daily Processing of Partitions • This was accomplished by copying all partitions for the previous day into a temporary table, validating the data in the table and creating indexes, and then exchanging the temporary table (along with the indexes) with a partition in the archived data table. • This method allowed the use of the Oracle PL/SQL “alter table exchange partition” command, which is very fast and does not involve the physical copying of data. EPICS Collaboration Mtg Oct. 13-14, 2007
Daily Processing of Partitions(Scalar Data) arch_data_f arch_data_i temp_data_f temp_data_i archive_data_f archive_data_i daily partitions ~10 min partitions The temporary table, along with indexes, is exchanged with a partition in the archived data tables All partitions for yesterday are copied to a temporary table The data is validated and indexes are created EPICS Collaboration Mtg Oct. 13-14, 2007
Retrieval Considerations • Data retrieval is accomplished by SQL SELECT statements made with OCI routine calls. • The SELECT statement contains the channel identification and the time range of interest. • There are two tables for each supported data type: an active data table and an archive data table. • Direct path loading is done in an active data table, where indexing is not allowed. EPICS Collaboration Mtg Oct. 13-14, 2007
Retrieval Considerations • Since there are two tables for each supported data type, an Oracle view is created to retrieve data for each data type. An Oracle view is a logical table comprised of one or more physical tables. • The retrieval SQL queries the view instead of querying the active and archive data tables directly. The views are created as read-only. • Views allow flexibility as to what data the user has access to. That is, the definition of the view may be changed to retrieve data in a different way without changes to retrieval code. EPICS Collaboration Mtg Oct. 13-14, 2007
Retrieval Considerations • In addition to the use of timestamp range partitioning, the use of indexes for archive data is crucial to retrieval performance. • The archive data tables are indexed by a combination of timestamp and channel identification for best retrieval performance. EPICS Collaboration Mtg Oct. 13-14, 2007
Table Structure Overview pv_name pv_starttime pv_id pv_endtime pv_active arch_pvname pv_id pv_id pv_id data_start data_table data_end arch_meta arch_table_data arch_data_i (scalar integer) arch_data_f (scalar float) archive_wave_f (waveform float) archive_wave_i (waveform integer) pv_id stat value sevr timestamp ostat nanosecs EPICS Collaboration Mtg Oct. 13-14, 2007
Summary • There were some limitations to the Oracle Archiver that we did not have time to address (e.g., the ability to store strings). • Also there were many ideas for new features that were not implemented. A database related example is the creation of an Oracle management web interface to specify the partition scheme and manage user connections. • The SLAC Oracle Archiver project was successful. However, the retrieval performance was only “acceptable” and not nearly as fast as version 2 of the EPICS file-based Channel Archiver. This was the area of greatest concern. At the time the project was abandoned, we did not run out of ideas to experiment with to improve retrieval performance, such as creating better indexes and changing the partition characteristics. EPICS Collaboration Mtg Oct. 13-14, 2007