Physics Analysis inside the Oracle DB



  1. Physics Analysis inside the Oracle DB
  Progress report, 10 October 2013

  2. Performance with parallel scaling
  • The original result showed that the DB version of the H+Z benchmark was faster than serial execution of the root-ntuple analysis on the test3 setup
  • Parallelism with root can be mimicked by running multiple simultaneous jobs, each running on a subset of the data
  • The ntuple version improves more with parallel execution, as the DB is limited by I/O
    • more data needs to be read by the DB compared to the column storage in the ntuples
  • The test3 setup has its datafiles stored on NFS, resulting in a relatively slow I/O speed of ~250 MB/s

  3. Ttbar cutflow analysis
  • A cutflow analysis for the top-pair production cross-section measurement was implemented as a new benchmark
  • The original "RootCore" packages used by the ATLAS top physics group are compared to a modified set of packages that retrieve data from the DB via an SQL query
  • More realistic than the Higgs+Z benchmark:
    • uses 262 variables (compared to 40 in Higgs+Z)
    • also uses data from photon and primary-vertex objects (the photon table is 114 GB!)
    • another 3 external libraries added, used to call functions for corrections on electron and photon objects
  • The selection of electron and photon objects cannot be done as a single-table select: the corrections depend on the number of vertices in the event (pile-up), so an inner join with the vertex table is required (see the SQL sketch after this slide)
  • Only the "electron" channel is implemented so far, but the "muon" channel is very similar…
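
  To illustrate the last point, here is a minimal Oracle SQL sketch of a pile-up-dependent electron selection. All table and column names (electron, vertex, el_pt, vtx_ntracks, …) and the el_pt_corrected function (a PL/SQL wrapper calling the external correction library) are hypothetical placeholders, not the actual benchmark schema.

    -- Sketch only: table/column names and the el_pt_corrected wrapper
    -- (PL/SQL calling Java calling C++) are hypothetical placeholders.
    SELECT e.runnumber,
           e.eventnumber,
           el_pt_corrected(e.el_pt, e.el_eta, v.n_vertices) AS el_pt_corr
    FROM   electron e
           INNER JOIN (
               -- pile-up estimate: count primary vertices with enough tracks
               SELECT runnumber,
                      eventnumber,
                      COUNT(*) AS n_vertices
               FROM   vertex
               WHERE  vtx_ntracks >= 2
               GROUP  BY runnumber, eventnumber
           ) v
           ON  v.runnumber   = e.runnumber
           AND v.eventnumber = e.eventnumber
    WHERE  el_pt_corrected(e.el_pt, e.el_eta, v.n_vertices) > 25000;

  The same pattern applies to the photon selection: without the join to the vertex table, the pile-up-dependent correction cannot be evaluated per event.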

  4. Performance with parallel scaling
  • The ttbar cutflow analysis from root-ntuples is very slow in serial!
  • But when running multiple simultaneous root jobs it again becomes faster than the DB version
  • … this is on the test3 setup, where I/O is relatively slow

  5. Mapred cluster
  • The mapred cluster (described in the previous report) has better I/O reads
    • 5 nodes connected to 5 disk arrays with a total of 60 disks -> up to 2500 MB/s
  • As shown last time, the root-ntuple analysis is faster on this cluster for the Higgs+Z benchmark (a sketch of the "SQL parallel 40" setting follows this slide):
    • 40 root-jobs: 71 seconds
    • SQL parallel 40: 135 seconds
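
  For readers unfamiliar with the notation: "SQL parallel 40" presumably means that the analysis query is executed with an Oracle parallel degree of 40, matching the 40 simultaneous root jobs. The slides do not say how the degree was set; a minimal sketch, assuming it is requested with a PARALLEL hint and using a hypothetical electron table:

    -- Hypothetical example: request degree-40 parallel query execution via a hint
    SELECT /*+ PARALLEL(e, 40) */ COUNT(*)
    FROM   electron e
    WHERE  e.el_pt > 25000;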

  6. Mapred cluster
  • But the root-ntuple analysis is slower than the DB for the ttbar cutflow analysis:
    • SQL parallel 40: 372 seconds
    • 40 root-jobs: 588 seconds
  • Optimistic conclusion: the Oracle DB beats the root-ntuple analysis for a realistic physics analysis, when given fast enough I/O reads!

  7. Ttbar cutflow: why so slow?
  • But WHY is the ttbar cutflow analysis so much slower than Higgs+Z for the ntuple analysis on the "mapred" cluster?
  • As a test I rewrote the ttbar analysis packages to return the result only for a specific sub-selection, and similarly broke the SQL version down to produce the same result
  • This makes it possible to compare the timing of DB vs ntuple for different types of selections!
  • The DB analysis was always faster, even though far fewer branches need to be loaded for a separate object selection than for the full ttbar cutflow (= 262 branches)
  • This looked suspicious to me!

  8. Ttbar cutflow: why so slow?
  • I wrote a single root macro to reproduce the same results as the single-electron selection from the ttbar packages
  • This macro was much faster: 55 seconds instead of 214 seconds
  • I think the difference in time is due to the ttbar packages disabling branches and calling Tree->GetEntry(), while my root macro calls branch->GetEntry(); this turns out to make a large difference in CPU:
    • good-electron, simple macro w. tree->GetEntry(), 40 root-jobs: Time = 156 seconds
    • good-electron, simple macro w. load branches, 40 root-jobs: Time = 156 seconds

  9. Summary
  • Real analysis code is not always optimized to run as fast as possible!
  • The ttbar cutflow is an interesting case study, and it uses many different object selections that can be studied separately
  • Experience from converting the ttbar cutflow to SQL (some of the constructs mentioned below are sketched after this slide):
    • Using a materialized view for the goodrunlist selection
    • Adding external libraries (PL/SQL calling Java calling C++) is easy once you know how to! :-)
    • It is more difficult when a selection requires cross-table selection with an INNER JOIN (the photon/electron selection requiring info from the vertex table for the pile-up correction), but not impossible
    • I'm using lots of MATERIALIZE hints; I don't trust the SQL optimizer, as it goes crazy with all those JOINs…
    • I still have a single analysis query to run the entire cutflow; eventually it might be better/easier to write the code by explicitly creating tables to hold the intermediate selections (using the NOLOGGING option and parallel DML hints to speed up table creation)
  • To do:
    • Finish the ttbar cutflow; there are still some cuts requiring cross-table matching that are not yet implemented
    • Add the muon channel: the ntuple analysis produces the two cutflows for the muon and electron channels in a single analysis, as they reuse the results after the good-object selection; I can do the same in SQL, but I will need to create intermediate tables to hold the temporary selections
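
  To make the SQL-side points above concrete, here is a minimal sketch of three of the Oracle constructs mentioned: a materialized view for the goodrunlist selection, the MATERIALIZE hint, and an intermediate table created with NOLOGGING and parallel execution. All table and column names (goodrunlist, eventinfo, electron, lumiblock, el_pt, …) are hypothetical placeholders, not the actual benchmark schema.

    -- (1) Materialized view caching the goodrunlist selection
    --     (hypothetical table/column names).
    CREATE MATERIALIZED VIEW goodrunlist_mv NOLOGGING AS
    SELECT runnumber, lumiblock
    FROM   goodrunlist
    WHERE  isgood = 1;

    -- (2) Forcing the optimizer to materialize a subquery with the
    --     MATERIALIZE hint, instead of letting it merge the subquery
    --     into the surrounding joins.
    WITH good_events AS (
        SELECT /*+ MATERIALIZE */ e.runnumber, e.eventnumber
        FROM   eventinfo e
               INNER JOIN goodrunlist_mv g
               ON  g.runnumber = e.runnumber
               AND g.lumiblock = e.lumiblock
    )
    SELECT COUNT(*)
    FROM   electron el
           INNER JOIN good_events ge
           ON  ge.runnumber   = el.runnumber
           AND ge.eventnumber = el.eventnumber;

    -- (3) Holding an intermediate selection in an explicit table:
    --     NOLOGGING plus parallel hints to speed up its creation.
    CREATE TABLE good_electrons NOLOGGING PARALLEL 40 AS
    SELECT /*+ PARALLEL(el, 40) */ el.runnumber, el.eventnumber, el.el_index
    FROM   electron el
    WHERE  el.el_pt > 25000;

  Such intermediate tables would let the muon and electron channels reuse the same good-object selection, as planned in the to-do list above.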
