Oracle: A Role in LHC Data Handling? Jamie Shiers, IT-DB. Based on work with early releases of Oracle 9i by IT-DB + experiments.
The Story So Far… • 1992: CHEP – DB panel, CLHEP kick-off, CVS … • 1994: start of OO projects • 1997: proposal of ODBMS+MSS; BaBar • 2001: CMS changes baseline away from Objectivity • 2003: POOL ready for production • POOL production plans include use of the EDG Replica Location Service • Deployment on Oracle 9iAS + 9iRAC/DB at Tier0/1?
ODBMS • BaBar (SLAC) claims probably the largest DB in the world: 681.8 TB stored in 473,205 files • CERN: 300TB (COMPASS) + 30TB (HARP) + 300TB (CMS) + others • Recently migrated 300TB at 120MB/s out of the ODBMS to an Oracle + ‘flat file’ solution • Many high-level similarities to the LHC proposal • Time pressure required a pragmatic solution – could not wait for POOL
Migration History - Data Rates http://lxshare075d:8888/
Data Processing Diagram [Diagram: migration pipeline – data staged from Castor (9940 tape) into 2x200GB input disk pools, processed on nodes (with local LOG), written to an output disk pool, then to Oracle and back to Castor (9940B tape). ~10 MB/s overall data throughput per node; sustained rates of 120MB/s over 24-hour periods.]
Oracle for LHC? • Numerous concrete examples for non-Physics data • Machine construction / controls • Detector construction / assembly • Physics infrastructure (book-keeping, catalogues, etc.)
Oracle & LHC – What is ~Clear • Will continue to be used as part of the EDMS service • Will continue to be used “à la LEP” for logging, monitoring and control of LHC • Will continue to be used for detector construction / assembly / monitoring • Total data: ~10TB • 2nd Sun cluster for physics apps: ~300GB disk, growing maybe to 10TB by LHC startup
Oracle & LHC – What is Likely • Will continue to be used as part of the EDMS service • Will continue to be used “à la LEP” for logging, monitoring and control of LHC • Will continue to be used for detector construction / assembly / monitoring • Total data: ~10TB • 2nd Sun cluster for physics apps: ~300GB disk • EDMS = CERN Engineering Data Management System: ~300,000 documents, many related to LHC construction
Oracle Usage: Some Examples • Detector DB • Conditions DB • Run Catalogues • The Grid (details in hidden slides)
The Grid Example: POOL file catalogue (based on EDG-RLS)
POOL File Catalogue • Require ~10^6 entries / expt now • Rising to ~10^8 / 10^9 (?) in 2008 / 2020 • A few kB / entry; a few TB total • Implementation based on EDG-RLS • Deployed at Tier0/Tier1 on Oracle 9iAS / Oracle 9iRAC • Have to demonstrate it can meet requirements (# concurrent users / transaction rate / manageability / cost of ownership) • Fall-back: 9iAS + non-RAC (Tomcat/MySQL at Tier2/3) • Open question about event-level meta-data • COMPASS / HARP: 100-200 bytes/event • LEP “collaboration Ntuple”: 200 columns ≈ 1kB/event • Could result in 100TB – 1PB data volumes
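Conceptually, a file catalogue of this kind maps a file's unique identifier (GUID) to a logical file name plus one or more physical replicas. A minimal C++ sketch of the idea follows; the struct and all example names are invented for illustration and are not the actual POOL or EDG-RLS API:

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative catalogue entry: one logical file, many physical replicas.
struct CatalogueEntry {
    std::string lfn;                 // logical file name
    std::vector<std::string> pfns;   // physical file names (replicas)
};

// GUID -> entry. The production service backs this mapping with
// Oracle 9iAS/9iRAC, with a Tomcat/MySQL fall-back at Tier2/3.
using FileCatalogue = std::map<std::string, CatalogueEntry>;

int main() {
    FileCatalogue cat;
    cat["hypothetical-guid-0001"] = {
        "/lhc/expt/run1234/raw.0001",
        { "castor://cern.ch/lhc/raw.0001", "rfio://tier1/lhc/raw.0001" }
    };
    // GUID lookup is the hot path: at ~10^8-10^9 entries of a few kB
    // each, the catalogue stays within the few-TB volume quoted above.
    return 0;
}
```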
Oracle for Physics Data • Focus on scalability issues: • The current Very Large Database (VLDB) market is in the 1-50TB range • Can we really extend this by 3 orders of magnitude?
Oracle for Physics Data – Key Issues • Complexity of data • Oracle’s support for Objects? • C++ binding (OCCI) • Volume of data • Several hundred PB • Oracle 9i technologies: • VLDB support • 9iRAC
Oracle for Physics Data – Key Issues • Complexity of data • Oracle’s support for Objects? • C++ binding • Oracle C++ Call Interface (OCCI) • Object Type Translator (OTT) • Volume of data • Oracle 9i technologies
OCCI / OTT • Can handle HEP data models • Define data model using SQL • Generate C++ definitions & code using OTT • Add user attributes & code in classes that inherit from the generated ones • Tested for a variety of non-trivial data models • Objects embedded by value and/or reference • Arrays of … • Polymorphic tables • Templated transient classes with multiple inheritance on the transient side
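To make the workflow concrete, here is a minimal hand-written sketch of the OTT round trip. The SQL type and all class names are invented for illustration, and the generated class is heavily simplified (real OTT output derives from oracle::occi::PObject and also emits the readSQL/writeSQL marshalling code):

```cpp
// Step 1 - define the data model in SQL, e.g.:
//   CREATE TYPE RawEvent_t AS OBJECT (
//     runNumber   NUMBER,
//     eventNumber NUMBER
//   );
//
// Step 2 - OTT generates a persistent C++ class from that type
// (sketched by hand here, greatly simplified).

#include <occi.h>

class GeneratedRawEvent : public oracle::occi::PObject {
public:
    oracle::occi::Number runNumber;
    oracle::occi::Number eventNumber;
    // ... OTT-generated readSQL()/writeSQL() marshalling omitted ...
};

// Step 3 - user code inherits from the generated class, adding
// transient attributes and behaviour without touching the mapping.
class RawEvent : public GeneratedRawEvent {
public:
    double transientWeight = 0.0;  // transient only, never stored
    // user methods and further transient state go here
};
```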
Oracle for Physics Data – Key Issues • Complexity of data • Extensive use of Oracle’s support for Objects • C++ binding (OCCI) • Volume of data • Several hundred PB • Oracle 9i technologies: • VLDB support • 9iRAC
Data [Diagram: pyramid of LHC data tiers – RAW 1PB/yr (~1PB/s prior to reduction!) held at Tier0 with sequential access; ESD 100TB/yr; AOD 10TB/yr; TAG 1TB/yr with random access by users; reduced tiers served via Tier1.]
LHC Data Volumes
Data Category                Annual     Total
RAW                          1-3PB      10-30PB
Event Summary Data (ESD)     100-500TB  1-5PB
Analysis Object Data (AOD)   10TB       100TB
TAG                          1TB        10TB
Total per experiment         ~4PB       ~40PB
Grand totals (15 years)      ~16PB      ~250PB
Divide & Conquer • Split data from different experiments • Split different data categories • Different schema, users, access patterns,… • Focus on mainstream technologies & low-risk solutions • VLDB target: 100TB databases • How do we build 100TB databases? • How do we use 100TB databases to solve 100PB problem?
Why 100TB DBs? • Possible today • Expected to be mainstream within a few years • Vendors must provide support • (See also hidden slides)
Oracle for Physics Data – Key Issues • Complexity of data • Extensive use of Oracle’s support for Objects • C++ binding (OCCI) • Volume of data • Several hundred PB • Oracle 9i technologies: • 9iRAC • VLDB support
Potential Benefits of 9iRAC • Scalability • Allows 100TB databases to be supported using commodity hardware: Intel/Linux server nodes • Manageability • A small number of RACs is manageable with foreseeable resources; tens to hundreds of smaller single instances are not • Better resource utilization • Shared-disk architecture avoids hot-spots and idle / overworked nodes • Shared cache improves performance for frequently accessed read-only data
LHC Data Volumes
Data Category                Annual     Total
RAW                          1-3PB      10-30PB
Event Summary Data (ESD)     100-500TB  1-5PB
Analysis Object Data (AOD)   10TB       100TB
TAG                          1TB        10TB
Total per experiment         ~4PB       ~40PB
Grand totals (15 years)      ~16PB      ~250PB
100TB DBs & LHC Data • Analysis data: 100TB ok for ~10 years • One 9iRAC per experiment • Intermediate: 100TB ~1 year’s data • ~40 9iRACs • RAW data: 100TB = 1 month’s data • 400 9iRACs to handle all RAW data • 10 RACs / year, 10 years, 4 experiments
RAW Data: a few PB / year • Access pattern: sequential • Access frequency: ~once per year • Use time partitioning + offline tablespaces • Historic data copied to “tape”, possibly dropped from the DB catalogue, and restored on demand • 100TB = 10-day time window • Current data on one RAC; historic data on a 2nd RAC
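A sketch of this scheme in OCCI-flavoured C++ follows. The table, column and tablespace names are invented for illustration, only two daily partitions are shown, and error handling is omitted:

```cpp
#include <occi.h>
#include <string>
using namespace oracle::occi;

// Daily range partitions, each in its own tablespace, so that a whole
// day of RAW data can be aged out (or restored) as a unit.
void createRawTable(Connection* conn) {
    Statement* stmt = conn->createStatement(
        "CREATE TABLE raw_events ("
        "  event_time DATE,"
        "  event_data BLOB"
        ") PARTITION BY RANGE (event_time) ("
        "  PARTITION p20080101 VALUES LESS THAN"
        "    (TO_DATE('2008-01-02','YYYY-MM-DD')) TABLESPACE ts_20080101,"
        "  PARTITION p20080102 VALUES LESS THAN"
        "    (TO_DATE('2008-01-03','YYYY-MM-DD')) TABLESPACE ts_20080102"
        ")");
    stmt->executeUpdate();
    conn->terminateStatement(stmt);
}

// Once a day's data has been copied to tape, its tablespace is taken
// offline; it can later be dropped from the catalogue and restored
// on demand, as described above.
void ageOutDay(Connection* conn, const std::string& ts) {
    Statement* stmt = conn->createStatement(
        "ALTER TABLESPACE " + ts + " OFFLINE NORMAL");
    stmt->executeUpdate();
    conn->terminateStatement(stmt);
}
```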
Partitions & Files: Limits • Currently limited to 2^16 (65,536) • 179 years at 1 partition / day • ~500TB DBs with ~10GB files • Current practical limit is 38,000 files / DB • Sufficient to build 100TB DBs • The limits need to be raised at some stage in the future…
Event Summary Data (ESD) • ~100-500TB / experiment / year • Yotta-byte DBs (10^24 bytes) predicted by 2020! • Can RAC capabilities grow fast enough to permit just 1 RAC / experiment? • Growing at +500TB / year • An open question …
Oracle Deployment [Diagram: the DAQ cluster holds current data only (no history) and exports tablespaces to the RAW cluster; reconstruction feeds an ESD cluster (1/year?) and analysis an AOD/TAG cluster (1 in total?); clusters exchange data to/from MSS and to/from the Regional Centres (RCs).]
VLDB Issues • Oracle is addressing the limits of the current architecture • Already permits 2EB databases in theory… • Limits on e.g. # files, partitions etc. are expected to be significantly increased beyond Oracle 9i • Limited to 2^16 architecturally; 38K measured • An area of work, but not of concern…
Storage Issues • Oracle Number format (up to 22 bytes) provides greater precision than IEEE double (8 bytes) • Mapping 1000 classes with numeric data members to Oracle Number requires effort! • Solutions being investigated to allow efficient storage of floats / doubles / ints without the user specifying precision / range • Target: next major Oracle release?
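A small plain-C++ illustration of the precision and size gap behind this point (the 38-digit figure is Oracle's documented NUMBER precision, not something computed here):

```cpp
#include <cfloat>
#include <cstdio>

int main() {
    // IEEE 754 double: 8 bytes, 53-bit mantissa -> ~15 decimal digits
    // guaranteed to round-trip.
    std::printf("double: %d bytes, %d guaranteed decimal digits\n",
                (int)sizeof(double), DBL_DIG);

    // Oracle NUMBER: variable-length decimal, 1-22 bytes of storage,
    // up to 38 significant digits - more precision and explicit range
    // control than a double, which is why a naive per-member mapping
    // of HEP numeric attributes to NUMBER costs space and conversion
    // effort.
    std::printf("Oracle NUMBER: 1-22 bytes, up to 38 decimal digits\n");
    return 0;
}
```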
If You Want to Know More… http://cern.ch/LCG/ http://cern.ch/db/ http://cern.ch/hep-proj-database/
Summary – Oracle for LHC • A clear & important role to play • Likely to be used for non-event data • Hybrid solution (POOL) is the baseline for physics data • RDBMS backend to POOL in progress