Introduction to PAT
PAT Tutorial – CERN – December 2011
Felix Höhle, RWTH Aachen
Content
Part I: Brief review of CMSSW
• Framework Essentials
• Event Data Model (EDM)
Part II: Introduction to PAT
• The PAT Data Format
• The PAT Workflow
Framework Essentials
The Framework offers you:
• One executable, cmsRun, which is configured with Python files
• These files contain configurations and parameters for modules written in C++
• You can compose your analysis from these modules
The Python config file defines:
• Which data are used
• Which modules are executed, with which parameters and in which order (path)
• How these paths are connected to output files
Framework Essentials
The Framework has predefined types of modules:
• EDAnalyzer: Reads collections and creates histograms
• EDFilter: Reads collections and returns a boolean
• EDProducer: Reads a collection and writes a new collection into the Event
There are routines to create skeletons for these modules:
• mkedanlzr, mkedfltr, mkedprod
These create the necessary substructure for the modules:
• BuildFile.xml: Needed for compilation
• myanalyzer_cfi.py: Demo Python configuration
• Myanalyzer.h: Header file
• Myanalyzer.cc: Definition file
Compilation with: $ scram b
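The division of labour between the three module types can be sketched as a toy model in plain Python. This is not CMSSW code: the event is a plain dict, and `produce_good_jets`, `filter_two_jets`, `make_analyzer` and `run_path` are hypothetical stand-ins for an EDProducer, an EDFilter, an EDAnalyzer and a path. The sketch assumes that a filter returning false stops the remaining modules of the path for that event.

```python
from collections import Counter


def produce_good_jets(event):
    """EDProducer-like: reads one collection and writes a new one into the event."""
    event["goodJets"] = [pt for pt in event["jets"] if pt > 30.0]


def filter_two_jets(event):
    """EDFilter-like: reads a collection and returns a boolean."""
    return len(event["goodJets"]) >= 2


def make_analyzer(histogram):
    """EDAnalyzer-like: reads a collection and fills a histogram (10 GeV bins)."""
    def analyze(event):
        for pt in event["goodJets"]:
            histogram[int(pt // 10) * 10] += 1
    return analyze


def run_path(modules, events):
    """Run the modules in order for each event; stop the path if a filter fails."""
    passed = 0
    for event in events:
        for module in modules:
            if module(event) is False:
                break  # filter rejected the event, skip the rest of the path
        else:
            passed += 1
    return passed


# Jet pT values in GeV (made-up numbers for illustration).
events = [{"jets": [45.0, 32.0, 12.0]}, {"jets": [50.0, 8.0]}]
jet_pt_hist = Counter()
n_passed = run_path(
    [produce_good_jets, filter_two_jets, make_analyzer(jet_pt_hist)], events
)
```

Only the first toy event reaches the analyzer; the second is rejected by the filter, so its jets never enter the histogram.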
Framework Essentials
The Framework contains a lot of very useful tools:
• ROOT is able to read your data files:
  $ root -l myfile.root
• Have a look into data files with edmDumpEventContent and edmEventSize -v
• Config files can be checked and inspected with python -i:
  $ python -i myfile_cfg.py
  >>> process.mypath
  cms.Path(MyJetAnalyzer)
  >>> process.MyJetAnalyzer
  cms.EDAnalyzer("MyAnalyzer", jetTag = cms.InputTag("myjets") )
• Config files can also be inspected interactively with the edmConfigEditor
Event Data Model (EDM)
Basics of the Event Data Model:
• The EDM is centered around the concept of an Event
• An edm::Event is a C++ container for RAW and reconstructed data of a particular collision
• It is built up of several independent ROOT trees; each entry corresponds to a particular collision
• One ROOT tree for one object class
• They are connected via smart pointers: edm::Ref, edm::Ptr, …
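The idea behind such a smart pointer can be sketched in plain Python. This is a toy model, not the edm::Ref API: the class `ToyRef` and the dict-based event layout are hypothetical. The assumption is that a reference stores the identity of a collection plus an index into it, rather than a copy of the object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRef:
    """Toy reference: names a collection and an index, holds no object copy."""
    collection: str
    index: int

    def get(self, event):
        # Resolved against the event only at access time.
        return event[self.collection][self.index]


# Made-up event: one muon pointing at the first track.
event = {
    "tracks": [{"pt": 41.5}, {"pt": 7.2}],
    "muons": [{"track": ToyRef("tracks", 0)}],
}
muon_track_pt = event["muons"][0]["track"].get(event)["pt"]

# Dropping the track collection leaves the muon's reference dangling.
slimmed = {name: coll for name, coll in event.items() if name != "tracks"}
try:
    slimmed["muons"][0]["track"].get(slimmed)
    dangling = False
except KeyError:
    dangling = True
```

The second half shows why slimming an event is delicate: an object that is kept can still point into a collection that was dropped.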
• Modules can only communicate via the Event
• The Event can be extended by modules which can add collections (EDProducer)
• These collections are identified within the Event by four quantities: C++ class type, module label, sublabel within the module, and process name
• These are shown by the edmDumpEventContent command
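How four quantities pin down one collection can be sketched in plain Python. The sketch assumes the usual EDM branch naming, in which the four quantities are joined by underscores and followed by a trailing dot; the helper `parse_branch_name`, the tuple `ProductKey` and the branch name used here are illustrative, not part of any CMSSW API.

```python
from collections import namedtuple

# The four identifying quantities of a collection in the Event.
ProductKey = namedtuple("ProductKey", "class_type module_label sublabel process")


def parse_branch_name(branch):
    """Split an assumed 'type_label_sublabel_process.' branch name into a key."""
    parts = branch.rstrip(".").split("_")
    if len(parts) != 4:
        raise ValueError("expected four underscore-separated fields: %r" % branch)
    return ProductKey(*parts)


# An empty sublabel shows up as two consecutive underscores.
key = parse_branch_name("patMuons_cleanPatMuons__PAT.")

# Two collections of the same class stay distinct through their module label.
store = {
    key: "clean muons",
    ProductKey("patMuons", "selectedPatMuons", "", "PAT"): "selected muons",
}
```

Using the full four-part key as a dictionary key mirrors the point of the slide: the class type alone is not enough to pick a collection.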
FWLite: A Light Version of EDM
This is ROOT with known data formats
• PAT is fully compatible with (and even especially supports) FWLite
• No writing to the event content!
• Full framework ↔ FWLite: this is not an exclusive or!
• Python configuration, edm::Handle, TFileService, data access equivalent to EDM
• Very useful for plotting and interactive analysis
• Have a look at: WorkBookFWLite
Event Data Model (EDM)
Difficulties with the EDM:
• Retrieving high-level information for an analysis requires complicated pointer arithmetic! (What do I need? Where do I find it?)
• Reducing the Event to the data needed for a high-level analysis is complicated by the many connections between collections. (Where is the dropped data used throughout the Event?)
Top 5 Analyst's Problems
PAT can help you with these problems!
Part II: Introduction to PAT
• The PAT Data Format
• The PAT Workflow
What is the Physics Analysis Toolkit?
• PAT is a toolkit which is an integral part of the CMSSW Framework
• It is an interface between the sometimes complicated EDM and the simple mind of a common user
• It serves as a well-tested and supported basis for user and group analyses
• It facilitates the reproducibility and comprehensibility of an analysis
• If another CMS analyst describes a PAT analysis to you, you can easily understand what he/she is talking about
Three main aspects of PAT:
Interface:
• Between RECO expertise and analysis contacts
• Simplifies access via data formats
• Channels the expertise of the POGs and PAGs
Common Tool:
• Approved algorithms & sensible defaults
• Synergy (everybody can profit from recent developments)
• Quick start into analysis for beginners
Common Format:
• Facilitates transfer & comparisons
• PAG common configurations
• Sustained provenance
Facilitated Access to Event Information
PAT summarizes information for you:
• The reco::Candidate is a base class common to all kinds of "particles"
• It carries a lot of information from different subdetectors and reconstruction algorithms
• PAT objects summarize this information, which is distributed over different collections
• With PAT, getting this information is just a member function call!
PAT Data Formats
Concept of PAT Data Formats:
• All pat::Objects inherit from their corresponding reco::RecoCandidates
• Additional information (e.g. overlap with other objects) is accessible
• A PAT Candidate is a Reco Candidate + more
• All reco::Candidate information is accessible; you don't need to know the details!
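The "Reco Candidate + more" idea can be sketched with ordinary inheritance in plain Python. The classes `ToyRecoMuon` and `ToyPatMuon`, the method `track_iso` and all numbers are hypothetical; the real pat:: classes are C++ and carry far more information.

```python
class ToyRecoMuon:
    """Toy stand-in for a reco-level candidate: kinematics only."""

    def __init__(self, pt, eta, phi):
        self.pt, self.eta, self.phi = pt, eta, phi


class ToyPatMuon(ToyRecoMuon):
    """Toy stand-in for a PAT candidate: reco content plus embedded extras."""

    def __init__(self, reco, track_iso):
        super().__init__(reco.pt, reco.eta, reco.phi)
        self._track_iso = track_iso

    def track_iso(self):
        # Extra information is reached through a member function.
        return self._track_iso


# Without the toy PAT layer: two index-matched collections to keep in sync.
reco_muons = [ToyRecoMuon(35.0, 0.4, 1.2)]
isolation_values = [1.8]

# With it: one object that carries both the reco content and the extras.
pat_muons = [
    ToyPatMuon(mu, iso) for mu, iso in zip(reco_muons, isolation_values)
]
first = pat_muons[0]
```

Because the toy PAT muon is also a reco muon, code written against the reco interface keeps working unchanged.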
PAT Data Formats Have a look in the online documentation: https://twiki.cern.ch/twiki/bin/view/CMSPublic/WorkBookPATDataFormats
PAT Data Formats
The PAT Data Formats are configured by the user via the _cfi.py files.
Size: 14 kB/event (for ttbar)
The PAT Workflow
Steps of the PAT Workflow:
• Candidate Creation (aodReco): collection of information which is not in AOD/RECO, e.g. isolation variables, overlaps, …
• Candidate Production (patCandidates): translation of the collected information into pat::Objects, e.g. pat::Muon, pat::Electron, pat::Jet
• Candidate Selection (selectedPatCandidates): selection of interesting objects with specific properties, e.g. pT > 30 GeV
• Candidate Disambiguation (cleanPatCandidates): due to the way objects are reconstructed in CMS there are ambiguities, e.g. two objects sharing an energy deposit or track
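The selection and disambiguation steps can be sketched in plain Python. This is a toy model, not the PAT implementation: candidates are dicts, the functions `select` and `clean` are hypothetical, and the overlap criterion (a jet within ΔR < 0.3 of an electron) is an assumption chosen for illustration. Overlapping jets are simply dropped here; a real configuration could equally keep the object and only record the overlap.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def delta_r(a, b):
    """Angular distance in the eta-phi plane."""
    return math.hypot(a["eta"] - b["eta"], delta_phi(a["phi"], b["phi"]))


def select(candidates, min_pt=30.0):
    """Selection step: keep candidates with pT above a threshold (GeV)."""
    return [c for c in candidates if c["pt"] > min_pt]


def clean(jets, electrons, max_dr=0.3):
    """Disambiguation step: drop jets that overlap with any electron."""
    return [j for j in jets if all(delta_r(j, e) >= max_dr for e in electrons)]


# Made-up candidates; the first jet is really the electron seen twice.
electrons = [{"pt": 42.0, "eta": 0.50, "phi": 1.00}]
jets = [
    {"pt": 45.0, "eta": 0.52, "phi": 1.01},
    {"pt": 60.0, "eta": -1.20, "phi": -2.50},
    {"pt": 20.0, "eta": 2.00, "phi": 0.30},
]

selected_jets = select(jets)
clean_jets = clean(selected_jets, select(electrons))
```

The order matters: selecting first keeps the cleaning step from comparing against objects that would be discarded anyway.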
The Code Location
• DataFormats/PatCandidates: Definition of all PAT Candidates (pat::Photon, pat::Electron, pat::Muon, pat::Tau, pat::Jet, pat::MET, …)
• PhysicsTools/PatAlgos: Implementation and filling of all data formats; definition of the common workflow and PAT tools
• PhysicsTools/PatUtils: Definition of common tools and helper functions used in PatAlgos
• PhysicsTools/PatExamples: Location of many examples, e.g. all non-trivial examples used during this tutorial
Documentation
• SWGuidePAT and WorkBookPAT: main documentation pages
• WorkBookPATDataFormats: description of all PAT Candidates
• WorkBookPATWorkflow: description of the PAT workflow
• WorkBookPATConfiguration: description of the configuration of PAT
• SWGuidePATTools: description of all PAT tools
• WorkBookPATTutorial: tutorials and examples to get started
• SWGuidePATRecipes: installation recipes
• SWGuidePATEventSize: tools for event size estimates
• And last but not least: this tutorial and/or former tutorials...
Exercises
By now you should be prepared to do the following exercises on WorkBookPATTutorial:
• Exercise 1 (WorkBookPATDocNavigationExercise): The PAT documentation is one of the most carefully maintained parts of the WorkBook. Knowing the documentation and how to use it can speed up your learning enormously. Learn more about the PAT documentation and how to make effective use of it.
• Exercise 2 (WorkBookTupleCreationExercise): Learn how the default PAT tuple is produced.
• Exercise 3 (SWGuidePATConfigExercise): Learn how to configure PAT and its tools.
Have fun!