280 likes | 397 Views
Physics analysis & databases M. Limper. Part 1: SQL analytics (live physics analysis demo) Part 2: Chopped-up tables and partition-wise joins. 20 /03/2014. Physics Analysis demo. SQL analysis on data stored in database VS PROOF-analysis on data stored in root- ntuples (on same hardware)
E N D
Physics analysis & databasesM. Limper Part 1: SQL analytics (live physics analysis demo) Part 2: Chopped-up tables and partition-wise joins 20/03/2014
Physics Analysis demo SQL analysis on data stored in database VS PROOF-analysis on data stored in root-ntuples (on same hardware) • I prepared some ‘easy’ SQL statements to demonstrate SQL analysis • Easy to discover the Z-boson, W-boson and various quarkonium states Z-boson W-boson Charmonium J/ψ Ψ(3686)
ROOT analysis via SQL Basic DB-user-account used in DEMO: • Has limited QUOTA (1024 MB) on USERS-tablespace • Has permission to execute functions analysistools-schema (and loadLibrary-permissions for C++ classes loaded via JNI by analysistools) • Has permission to select on DATA12_8TEV-schema (the data-owner) • Datat stored in DB running Oracle RAC on 5-node cluster SQLPlotter-interface used in DEMO: • SQLPlotter=my pre-compiled C++ macro using ROOT-classes • Executes queries using TSQLServer-interface provided by ROOT • Creates a cached table (‘QUERY_RESULT’) holding results for given SQL • Query can be provided on command-line or provided via SQL-file (=easier for long queries are easier debug) • User can plot any column from the ‘QUERY_RESULT’-table, with the option of filtering the data with an additional predicate
ROOT analysis via PROOF PROOF: Parallel Root Framework • Use PROOF-on-demand on my 5-node cluster to start 8 PROOF-workers per node • 127 root-ntuples (same dataset as in DB) distributed over 5 nodes distributed files access through xrootd My PROOF-analysis DEMO: • .x open_proof_session connect to the 40 proof-workers • .x define_testdata.C makes TDataSet out of 127 ntuples • testdata.Process(MySelector.C++) • MySelector=Example implementation of standard TSelector-class • Macro compiled on each worker, PROOF distributes events over workers • User defines analysis (per event) in Process-function • Histograms filled per worker and later summed • My implementation only loads the relevant branches per event • This means significantly more code, but makes the analysis much faster
ROOT analysis via SQL Analysis via SQL on DB advantages: • No need to re-compile macro when adjusting analysis • SQL is less code than root C++-macro • True for simple demo-example, more complex to combine results of multiple queries? • The DB holds intermediate query-results • Can decided what to plot afterwards • No need to store filtered results in ntuples • User can analyse the data anywhere with ROOT and connection to DB Analysis via SQL on DB disadvantages: • Administrator needs to prevent users from tying up all resources in badly-written SQL • Analysis SQL is maybe not that easy to read/write/debug • Complex functions from C++ to be made available centrally • User can store PL/SQL functions in own schema • But only dba can ensure C++ called via JNI
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2;
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; This is the view defining the “good”-muon pre-select
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; Calculate opening-angle between the two muons
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; Calculate invariant mass of the two muons
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; The “good-lumi-block” selection
Analytics in SQL Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; Require exactly 2 good-muons per event
Analytics in SQL, example Writing analysis SQL, di-muon pair select example: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt") select "RunNumber","EventNumber",mu_sel_n, muon0."muon_i" as mu_id0, muon1."muon_i" as mu_id1, muon0."pt"/1000. as pt0, muon1."pt"/1000. as pt1, muon0."eta" as eta0, muon1."eta" as eta1, (case when abs(muon0."phi"-muon1."phi")<acos(-1.) then sqrt(POWER(abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) else sqrt(POWER( 2.*acos(-1.) - abs(muon0."phi"-muon1."phi"),2)+POWER(abs(muon0."eta"-muon1."eta"),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(muon0."E",muon1."E",muon0."px",muon1."px",muon0."py",muon1."py",muon0."pz",muon1."pz")/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN (select "RunNumber","EventNumber",COUNT(*) as mu_sel_n from "sel_muon" group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon0 USING ("RunNumber","EventNumber") INNER JOIN "sel_muon" muon1 USING ("RunNumber","EventNumber") where muon0."muon_i"<muon1."muon_i" and muon0."charge" != muon1."charge" and mu_sel_n=2; Self-join of the “sel_muon”-view to select 2 good muons in the same event, with opposite charge
Analytics in SQL Execution plan for query on previous slide: Pre-selection to TEMP • (Temp Usage: 190 MB) GROUP BY and SELF-JOIN
Analytics in SQL I can rewrite this query using analytics functions: with "sel_muon" as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from DATA12_8TEV_2."muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt" ), "sel_2muon" as ( select "RunNumber","EventNumber", FIRST_VALUE("muon_i") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_id0, FIRST_VALUE("charge") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_charge0, FIRST_VALUE("E") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_E0, FIRST_VALUE("px") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_px0, FIRST_VALUE("py") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_py0, FIRST_VALUE("pz") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_pz0, FIRST_VALUE("phi") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_phi0, FIRST_VALUE("eta") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_eta0, FIRST_VALUE("pt") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_pt0, LAST_VALUE("muon_i") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_id1, LAST_VALUE("charge") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_charge1, LAST_VALUE("E") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_E1, LAST_VALUE("px") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_px1, LAST_VALUE("py") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_py1, LAST_VALUE("pz") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_pz1, LAST_VALUE("phi") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_phi1, LAST_VALUE("eta") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_eta1, LAST_VALUE("pt") OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_pt1, ROW_NUMBER() OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i") rn, COUNT(*) OVER (PARTITION BY "RunNumber","EventNumber" ORDER BY "muon_i" RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) mu_sel_n from "sel_muon") Select "RunNumber","EventNumber",mu_sel_n,mu_id0,mu_id1,mu_pt0,mu_pt1,mu_eta0,mu_eta1, (case when abs(mu_phi0-mu_phi1)<acos(-1.) then sqrt(POWER(abs(mu_phi0-mu_phi1),2)+POWER(abs(mu_eta0-mu_eta1),2)) else sqrt(POWER( 2.*acos(-1.) - abs(mu_phi0-mu_phi1),2)+POWER(abs(mu_eta0-mu_eta1),2)) end) as DELTAR, ANALYSISTOOLS.PHYSANALYSIS.INV_MASS_LEPTONS(mu_E0,mu_E1,mu_px0,mu_px1,mu_py0,mu_py1,mu_pz0,mu_pz1)/1000. as INV_MASS from DATA12_8TEV_2."periodAllYear_v47-pro13-01" INNER JOIN "sel_2muon" USING ("RunNumber","EventNumber") where mu_charge0 != mu_charge1 and mu_sel_n=2 and rn=1 ; This query is FASTER: 24 seconds instead of 32 seconds More SQL to write: need to define the window (“OVER (PARTITION BY …”) for every variable!
Analytics in SQL Execution plan for query on previous slide: NOTE: Window-functions won’t for every analysis, a query involving combinatoricsstill require self-joins (for example: find the best muon-pair from all possible pair combinations in one event) Partition-wise-join! window sort WINDOW SORT= no need for group by and self-join : first and last muon selected in window which contained two muons (Temp Usage: 4 MB)
Analytics in SQL How to make this SQL more readable/easier to write • If the GROUP/PARTITION BY clause could auto-vectorize[]any variable not used as group-by-column, my SQL would look something like this: select “RunNumber”, “EventNumber”, “muon_i”[] , “charge”[], “pt”[], “phi”[], “eta”[] , INV_MASS(“E”[0] , “px”[0] , “py”[0], “pz”[0] , “E”[1] , “px”[1], “py”[1], “pz”[1] ) from sel_muon group by (“RunNumber”,“EventNumber”) HAVING COUNT(*)=2 ) • SQL-analysis would be easier if I could hide JOIN USING and GROUP/PARTITION BY (“RunNumber”, “EventNumber”) • Maybe use an interface that translate a physics-data request into SQL?
Part 2: Chopped-up tables and partition-wise joinsshort version
JOINs & Physics Analysis Summary of my project so far: • A lot of variables are involved to describe a physics analysis event • A “big data challenge”: how to provide access to all this information in the fastest and easiest way? • My approach was to divided data in separate tables per physics object • Allows quick per-object predicate filtering • Object irrelevant for specific analysis can be ignored, saving I/O • JOIN USING (“RunNumber”, “EventNumber”) combine sub-selection to create an analysis • PL/SQL or PL/SQL calling C++ via JNI used to make calculations not easily written in SQL Problem: • Tables still have too much unused columns per row, wasting I/O in typical analysis • JOIN USING (“RunNumber”, “EventNumber” ) is fast when sub-selection yields few results but does not scale well with increasing number of events Store data per object over multiple sub-tables, to reduce number of columns per table? • Requires efficient joining of tables • HASH JOIN of many tables uses a lot of memory • Solved by increasing table-partitioning and ensure parallelism equal to number of partitions to trigger partition-wise joins Partition-wise joins don’t work with self-joins and/or group by statements (see next slides)
Partition-wise JOIN & sub-tables Single table select: 59 s Single-table select: with sel_electron as (select /*+ PARALLEL(40) */ "electron_i","RunNumber","EventNumber“ from DATA12_8TEV."electron“ where PREDICATE5) select count(*) from sel_electron; IO reads: 94 GB
Partition-wise JOIN & sub-tables 6 sub-table select without partitions: 334 s Sub-table select: with sel_electron as (select /*+ PARALLEL(40) USE_HASH (a b c d e) */ "electron_i","RunNumber","EventNumber“ from “el_IOT_main” a LEFT OUTER JOIN “el_IOT_track" b USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_cone” c USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest1” d USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest3” e USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest2a” e USING ("RunNumber","EventNumber","electron_i") where PREDICATE5) ) select count(*) from sel_electron; IO reads: 114 GB IO writes: 44 GB
Partition-wise JOIN & sub-tables 6 sub-table select with partition-wise-join: 75 s Sub-table select: with sel_electron as (select /*+ PARALLEL(40) USE_HASH (a b c d e) */ "electron_i","RunNumber","EventNumber“ from “el_IOT_main” a LEFT OUTER JOIN “el_IOT_track" b USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_cone” c USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest1” d USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest3” e USING ("RunNumber","EventNumber","electron_i") LEFT OUTER JOIN “el_IOT_rest2a” e USING ("RunNumber","EventNumber","electron_i") where PREDICATE5) ) select count(*) from sel_electron; IO reads: 84.2 GB IO writes: 3.6 GB
Partition-wise JOIN & self-join Find electrons selected from 6 sub-tables join with HASH40-tables and plot inverse-mass of any selected-pair combination in one event (92 s) partion-wise-join Query result shows Z ee production! self-join • Runs in 92 s, selects 149547 electron-pairs • SELF-JOIN outside of PX PARTITION HASH ALL • Could SELF-JOIN not have been done inside the partition-loop?
Partition-wise JOIN: Wμν selection create table QUERY_RESULT CACHE AS with sel_muon as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from "muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt" ), w_candidates as (select "RunNumber","EventNumber",mu_pt,deltaPhi,M_T,Wcand_pt from “periodAllYear_v47-pro13-01” INNER JOIN sel_muonUSING("RunNumber","EventNumber") INNER JOIN "MET_p40" MET USING ("RunNumber","EventNumber") ) select * from w_candidates ; PLAN without partitions (85s): PLAN with 40 HASH PARTITIONS (45s): Partition-wise join with 1 bloom filter (Temp Usage: 320 MB) Uses Hash joins with multiple bloom filters (Temp Usage: 1.1 GB)
Partition-wise JOIN: Wμν selection create table QUERY_RESULT CACHE AS with sel_muon as (select "muon_i","RunNumber","EventNumber","E","px","py","pz","charge","pt","phi","eta" from "muon_p40" where "pt" > 10000. and abs("eta") < 2.7 and "tight" = 1 and ("id_d0"<10. or abs("eta")>2.5) and "ptcone20"<0.1*"pt" ), w_candidates as (select "RunNumber","EventNumber",mu_pt,deltaPhi,M_T,Wcand_pt from “periodAllYear_v47-pro13-01” INNER JOIN sel_muonUSING("RunNumber","EventNumber") INNER JOIN "MET_p40" MET USING ("RunNumber","EventNumber") INNER JOIN (select "RunNumber","EventNumber",count(*) as mu_sel_n from sel_muon group by ("RunNumber","EventNumber")) USING ("RunNumber","EventNumber") where mu_sel_n=1 ) ) select * from w_candidates ; GROUP BY kills the partition-wise join! Time to run query is 74 s (Temp Usage: 6 GB)
JOINs & Physics Analysis Partition-wise join is only half the solution, why not run the entire query per-partition? • Ideallyeach parallel server would perform all operations on a sub-set of events! • Example ZH-benchmark: select events with two good muon or two good electrons and two good jets: • Requires pre-select to determine good-object • Requires self-join to find two good objects • Requires join between multiple tables electron muon jet output P1 10k events SELF JOIN JOIN JOIN SELF JOIN SELF JOIN 10k events output P2 SELF JOIN JOIN JOIN SELF JOIN SELF JOIN output P3 10k events SELF JOIN JOIN JOIN SELF JOIN SELF JOIN Query result=sum ouput parallel servers
Divide & Conquer In the fight against I/O we need a grand divide & conquer strategy! • Smart “horizontal” partitioning, to be defined by the data-owner, group related data together • “vertical” partitioning by groups of events • Distributed data analysis system to pick-out “horizontal” blocks based on analysis requirement and run one parallel server per group of events • And popular data cached using in-memory columnar? Data A Data B Data C Data D Data E Data F X events select A.*, B.*,C.*,D.*,E.*,F.* + X events select A.*, B.*,C.*,D.*,E.*,F.* + X events select A.*, B.*,C.*,D.*E.*,F.* + X events select A.*, B.*,C.*,D.*,E.*,F.* + X events select A.*, B.*,C.*,D.*,E.*,F.*
Divide & Conquer In the fight against I/O we need a grand divide & conquer strategy! • Smart “horizontal” partitioning, to be defined by the data-owner, group related data together • “vertical” partitioning by groups of events • Distributed data analysis system to pick-out “horizontal” blocks based on analysis requirement and run one parallel server per group of events • And popular data cached using in-memory columnar? Data A Data B Data C Data D Data E Data F X events select A.*, C.*,E.*,F.* + X events select A.*, C.*,E.*,F.* + X events select A.*, C.*,E.*,F.* + X events select A.*, C.*,E.*,F.* + X events select A.*, C.*,E.*,F.*
Summary/Outlook Divide & Conquer model: • Ideas on how to improve horizontal and vertical partitioning for physics analysis data: - vertical: make 1 partition per lumi-block - horizontal: use object-detail-levels • Requires DB to run entire query on one parallel per partition and sum the results • Feels like it should be possible within Oracle via partition-wise-join mechanism! • Can somebody tell me how to do this? SQL analysis: also study performance after filtering? • Compare speed and comfort of doing analysis with mini-ntuples vs mini-tables using previously demonstrated analysis-interfaces (PROOF vs database server) • I expect mini-tables cached as in-memory columnar to be super-fast! • Not really a “big data”-problem, but SQL analysis could provide an easy-to-use analysis interface