This workshop presentation discusses the architectural styles used in High Energy Physics (HEP) experiments, focusing on the transition to object-oriented systems and the integration with various software tools. It covers topics such as software architecture, frameworks, algorithm/event data styles, persistency, user interface, and physical design.
Architectural Styles of HEP Experiments Presented by RD Schaffer LCB Workshop Marseilles, September 29 1999 Architecture Session – Rapporteur talk
Architectural Styles of HEP experiments
• HEP experiments have been moving their software to object-oriented systems for a few years
• We would like to have a look at the variety of architectural styles which have been evolving
• Understanding our software systems in architectural terms should help us both
  • improve the designs of the systems we need, and
  • allow for better integration with various software tools which are shared across experiments
Architecture: Why are we interested in it?
• Each LHC experiment needs to develop a framework to be used in its event data processing applications
  • physics/detector simulation, high level triggers, reconstruction, analysis, visualization, etc.
• The experiment frameworks will incorporate external frameworks/toolkits: e.g. GUI, persistency, simulation
• Since it is the architecture which a framework implements, a good starting point is to share ideas on architecture
Software Architecture
• Outline of talk:
  • Architecture: what is it?
  • Definitions of the terminology/vocabulary
  • Architecture styles: examples from the literature
  • Architectural styles in HEP
  • An example architecture: LHCb’s GAUDI
  • Variations of algorithm/event data styles
  • Architectural issues of persistency
  • User interface (UI) and visualization
  • Implementation and physical design
Bibliography
Notation: […] is used as cross-reference in slides.
• Architecture:
  • [G&S]: D. Garlan and M. Shaw, “An Introduction to Software Architecture”, 1994
    http://www.cs.cmu.edu/afs/cs/project/vit/ftp/pdf/intro_softarch.pdf
  • [S&C]: Shaw and Clements, “Preliminary classification of architectural styles for software systems”, 1996
    http://www.cs.cmu.edu/afs/cs.cmu.edu/project/vit/ftp/pdf/Boxology.pdf
  • [USDP]: I. Jacobson, et al., “The Unified Software Development Process”, Addison-Wesley 1999
  • [Booch]: G. Booch, “Object Solutions”, Addison-Wesley 1996
Bibliography (2)
• Frameworks:
  • [Gamma]: E. Gamma, et al., “Design Patterns”, Addison-Wesley 1995
  • [IBM1]: Building Object Oriented Frameworks (html)
    http://www.ibm.com/java/education/oobuilding/index.html
  • [IBM2]: Leveraging Object Oriented Frameworks (html)
    http://www.ibm.com/java/education/ooleveraging/index.html
• Physical design:
  • [Martin]: R. Martin, “Designing O-O C++ Applications using the Booch Method”, Prentice Hall 1995
  • [Lakos]: J. Lakos, “Large Scale C++ Software Design”, Addison-Wesley 1998
Architecture: what is it?
• Definitions:
• architecture: [USDP]
  • Set of significant decisions about the organization of the software system
  • Selection of the structural elements and their interfaces which compose the system
  • Their behavior: collaboration among the structural elements
  • Composition of these structural and behavioral elements into progressively larger subsystems
  • The architectural style that guides this organization
Software architecture is also concerned with functionality, performance, resilience, reuse, comprehensibility, economic and technology constraints and trade-offs, and aesthetic concerns.
Architecture: definitions (2)
• toolkits: [Gamma] set of related and reusable classes designed to provide useful, general-purpose functionality
Examples
  • C++ I/O stream library,
  • containers/iterators/algorithms library,
  • CLHEP,
  • GEANT4
Comments
  • they do not impose a particular design on one’s application
  • they provide functionality to help one’s application do its job
Architecture: definitions (3)
• framework: [Booch] + [Gamma]
  • A kind of micro-architecture that codifies a particular domain
  • Provides the suitable knobs, slots and tabs that permit clients to use and adapt to specific applications within a given range of behavior
A framework is generally composed of two elements:
  • a set of classes that capture the vocabulary of a particular domain
  • a control policy that orchestrates the instances of those classes
A framework realizes an architecture
A large O-O system is constructed from several cooperating frameworks
Architecture: definitions (4)
• design pattern: [Gamma]
  • Description of communicating objects and classes that are customized to solve a general design problem in a particular context
A design pattern
  • is more abstract than a framework:
    • a framework itself can be embodied in code
    • a pattern can only have examples embodied in code
  • is a smaller architectural unit than a framework
  • is less specialized than a framework
Architecture: definitions (5)
• component: [USDP]
  • A physical or replaceable part of a system that conforms to and provides the realization of one or more interfaces.
• interface: [USDP]
  • A collection of operations that are used to specify a service of a class or component.
Architecture Styles: Outline
• Part I
  • General categorization of systems [Booch]:
    • user-centric
    • data-centric
    • computation-centric
• Part II
  • Further classification of architectural styles [S&C]:
    • Constituent parts: components and connectors
    • Examples of architectural patterns:
      • pipe-and-filter systems
      • data abstraction (object-oriented)
      • implicit invocation
      • data-centered repository
Architecture Styles: Part I
• General categorization of systems [Booch]:
  • user-centric:
    • focus on the direct visualization and manipulation of the objects that define a certain domain.
  • data-centric:
    • focus on preserving the integrity of the persistent objects in a system.
  • computation-centric:
    • focus on the transformation of objects that are interesting to the system.
Our applications have elements of all three. The interesting question is: which one dominates?
Architecture Styles: Part I (2)
• Booch describes three dominant layers for each system type
  • user-centric
    • classes that provide the system’s look-and-feel
    • classes that map the GUI layer to the domain model
    • classes that denote the domain model
  • data-centric
    • classes that access and manipulate the domain model
    • classes that denote the domain model
    • classes that provide persistence for the domain model
Architecture Styles: Part I (3)
  • computation-centric
    • classes whose objects act as agents responsible for carrying out algorithms that involve the collaboration of several other objects
    • classes that model the objects transformed by the system
    • classes to which higher level objects delegate certain more primitive responsibilities, so that common behaviors can be localized and thus reused
    • examples: STL algorithms, decomposing reconstruction algorithms into track finders, cluster finders, etc.
In my opinion it is the computation-centric architectural style that is at the heart of reconstruction, analysis, and high-level triggers
Architecture Styles: Part II
Further classification of architectural styles [S&C]
• Constituent parts - the building blocks of architecture
  • components - functional units of software
    • e.g. objects, processes, filters
  • connectors - mechanisms that mediate communication, coordination, or cooperation of components
    • e.g. shared representations, data streams, or data format converters
• To uniquely identify a style, one must also specify
  • the control discipline, the data organization, and the interaction of control and data
Architecture Styles: Part II (2)
• Control issues
  • How control passes among components
    • e.g. control topology - linear or acyclic, hierarchical/tree-like, star/hub-and-spoke, or arbitrary
  • How components work together in time
    • e.g. lockstep (sequential or parallel), synchronous, or asynchronous
• Data issues
  • Data topology (as for control)
  • Continuity - continuous/sporadic, high/low volume
  • Mode - passed, shared, or broadcast
Architecture Styles: Part II (3)
• Control/data interaction issues
  • Are control-flow and data-flow topologies isomorphic?
  • If isomorphic, is the direction the same or opposite?
• Useful examples of architectural patterns
  • Data flow styles: pipe-and-filter systems
  • Call-and-return styles: data abstraction (object-oriented)
  • Interacting process styles: implicit invocation
  • Data-centered repository styles: blackboard
• Note: few systems are purely any one of these!
Architecture Styles: Part II (4)
• Data flow styles: pipe-and-filter systems
Filters transform input into output
  • components: filters - computational, i.e. retain minimal state
  • connectors: data streams
  • control: asynchronous
  • data: passed
  • topologies: arbitrary, acyclic (no feedback), fanout, pipeline (linear)
  • control and data flow: same topology and direction
[Figure: filters connected by pipes]
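The pipe-and-filter style can be sketched in a few lines of C++. This is a minimal illustration, not code from any experiment: the names `Stream`, `Filter`, `runPipeline`, `aboveThreshold` and `scaleBy` are hypothetical, the pipeline is linear only, and for brevity each filter processes the whole stream synchronously, whereas the style as described above allows asynchronous control.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A "pipe" carries a stream of values; a "filter" maps one stream to another
// and keeps no state between calls. (Illustrative types, assumed for this sketch.)
using Stream = std::vector<double>;
using Filter = std::function<Stream(const Stream&)>;

// Linear pipeline: the output of each filter becomes the input of the next.
Stream runPipeline(const std::vector<Filter>& filters, Stream data) {
    for (const Filter& f : filters) {
        data = f(data);
    }
    return data;
}

// Hypothetical filter: keep only the values strictly above a threshold.
Filter aboveThreshold(double cut) {
    return [cut](const Stream& in) {
        Stream out;
        for (double x : in) {
            if (x > cut) out.push_back(x);
        }
        return out;
    };
}

// Hypothetical filter: multiply every value by a constant factor.
Filter scaleBy(double factor) {
    return [factor](const Stream& in) {
        Stream out;
        out.reserve(in.size());
        for (double x : in) out.push_back(x * factor);
        return out;
    };
}
```

A call such as `runPipeline({aboveThreshold(1.0), scaleBy(2.0)}, {0.5, 1.5, 3.0})` passes the data through both filters in order. Neither filter knows about the other, which is the property the style is meant to provide.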
Architecture Styles: Part II (5)
• Call-and-return styles: data abstraction
Localized state maintenance (encapsulation)
  • components: managers
  • connectors: method invocation
  • control: decentralized, usually single thread
  • data: passed
  • topologies: arbitrary
  • control and data flow: same topology and direction
[Figure: managers connected by method calls]
Architecture Styles: Part II (6)
• Interacting process styles: implicit invocation
Independent reactive objects (or processes)
  • components: objects that register interest in “events” and objects that “signal events”
  • connectors: automatic method invocation
  • control: decentralized, de-coupling of sender and receiver
  • data: passed with event, may also require a shared repository
  • topologies: arbitrary
[Figure: “event” signals and the method invocations they trigger]
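Implicit invocation can be sketched as a small event registry. This is only an illustration under stated assumptions: the class `EventBus`, its methods `subscribe` and `signal`, and the integer payload are all hypothetical and not taken from any of the frameworks discussed in this talk.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event registry: receivers register interest in a named event;
// senders signal the event without knowing who, if anyone, is listening.
class EventBus {
public:
    using Handler = std::function<void(int payload)>;

    // Register a handler to be invoked whenever eventName is signalled.
    void subscribe(const std::string& eventName, Handler handler) {
        handlers_[eventName].push_back(std::move(handler));
    }

    // Invoke every handler registered for eventName, passing the data with
    // the event. Returns the number of handlers that were invoked.
    std::size_t signal(const std::string& eventName, int payload) const {
        auto it = handlers_.find(eventName);
        if (it == handlers_.end()) return 0;
        for (const Handler& h : it->second) h(payload);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Handler>> handlers_;
};
```

The sender only calls `signal`; the method invocation on each receiver happens automatically, so sender and receiver stay decoupled.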
Architecture Styles: Part II (7)
• Data-centered repository styles: blackboard or DB
Centralized data, usually structured
  • components: central data store, many computational objects
  • connectors: computational objects interact with central store directly or via method invocation
  • control: may be external, predetermined or internal
  • data: shared or passed
  • topology: star
[Figure: computational objects around a central blackboard (shared data) in memory]
Architectural Styles in HEP: Outline
• Global structure
• Foundation libraries
• An example architecture: LHCb’s GAUDI
• Variations of algorithm/event data styles
• Architectural issues of persistency
• User interface (UI) and visualization
Architectural Styles of HEP experiments
• I believe that most people would agree with a global software structure as expressed by LHCb’s GAUDI
[Figure: three-layer software structure, top to bottom]
  • Applications (high level triggers, reconstruction, simulation, analysis): a series of data processing applications built on top of the frameworks and implementing the required physics algorithms.
  • Frameworks and toolkits: a series of frameworks and toolkits. One main framework: GAUDI; various specialized frameworks: visualization, persistency, interactivity, simulation (Geant4), etc.
  • Foundation libraries: a series of basic libraries widely used: STL, CLHEP, etc.
Foundation Libraries
• These form a basic vocabulary which is used throughout the code
• ATLAS, CMS and LHCb all propose to use
  • C++ Standard library (STD)
    • collections, iterators, algorithms, strings, streams, numerics
  • For missing or HEP-specific pieces, they are working (via LHC++) on a common set of foundation libraries
    • CLHEP - random number generators, physics vectors, geometry and linear algebra
    • open questions: more linear algebra (Blitz++ or CL++), use of G4 or Fermilab’s SIunits package, missing: error logger, exception handling
Foundation Libraries (2)
• ALICE is following a different policy (based on using ROOT as their underlying framework):
  • Use ROOT containers
  • Forbid use of STL and templates
  • Do not use CLHEP
  • Rely on CINT (ROOT’s C++ interpreter)
An example architecture: LHCb’s GAUDI
• LHCb has specified the high level view of their software system’s architecture.
• The specification consists of:
  • scenarios and requirements
  • Overall system design:
    • set of major design criteria
    • identification of the major components and their interactions
    • physical design of their system (i.e. packaging)
  • More detailed specification of the individual components
    • e.g. purpose, interface, dependencies
• Currently GAUDI is being implemented and deployed
GAUDI architecture (2)
• Major design criteria:
  • Clear separation between data and algorithms
  • Three basic types of data: event, detector, statistics
  • Clear separation between persistent and transient data
    • Isolation of user’s code.
    • Different/incompatible optimization criteria.
    • Transient as a bridge between various representations.
  • Data store-centered architectural style
  • User code encapsulated in few specific places: algorithms and converters
  • All components with well defined interfaces and as generic as possible
GAUDI architecture (3)
GAUDI’s use of interfaces
• GAUDI makes use of Java-like interfaces:
  • any component implements one or more interfaces
  • clients of a component hold references to interfaces, not to the concrete component
  • ensures minimal coupling between components
  • there is an interface query mechanism to allow interface discovery and interface evolution
[Figure labels: ConcreteAlgorithm (IAlgorithm, IProperty) as client of EventDataService (IDataProvider), DetectorDataService (IDataProvider), HistogramService (IHistogramSvc), MessageService (IMessageSvc)]
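The idea of holding references to interfaces and discovering further interfaces at run time can be sketched as follows. This is a simplified illustration and not GAUDI's actual API: the interface names are borrowed from the figure labels, but their member functions, the free function `queryInterface`, and its `dynamic_cast` implementation are assumptions made for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical root interface; every other interface derives from it.
struct IInterface {
    virtual ~IInterface() = default;
};

// Illustrative interfaces (member functions are assumed, not GAUDI's).
struct IAlgorithm : virtual IInterface {
    virtual bool execute() = 0;
};

struct IProperty : virtual IInterface {
    virtual void setProperty(const std::string& name, double value) = 0;
    virtual double property(const std::string& name) const = 0;
};

struct IHistogramSvc : virtual IInterface {
    virtual int bookedHistograms() const = 0;
};

// Interface query: ask a component whether it also offers interface T.
// Returns nullptr when the component does not implement T.
template <typename T>
T* queryInterface(IInterface* component) {
    return dynamic_cast<T*>(component);
}

// A concrete component implementing two interfaces. Clients never name this
// class; they hold an IAlgorithm* or an IProperty* only.
class ConcreteAlgorithm : public IAlgorithm, public IProperty {
public:
    bool execute() override { return cut_ > 0.0; }
    void setProperty(const std::string& name, double value) override {
        if (name == "cut") cut_ = value;
    }
    double property(const std::string& name) const override {
        return name == "cut" ? cut_ : 0.0;
    }

private:
    double cut_ = 0.0;
};
```

A client holding only an `IAlgorithm*` can call `queryInterface<IProperty>(alg)` to configure the component and gets a null pointer for interfaces the component does not offer. New interfaces can therefore be added to a component without touching existing clients.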
Gaudi algorithms
• Each Algorithm only knows what data (type and name) it expects as input and creates as output.
• The only coupling is through the data.
• Scheduling of sub-algorithms is the responsibility of the parent algorithm.
[Figure: apparent dataflow between Algorithms A, B and C versus real dataflow through the Transient Event Data Store (data objects T1 to T5); control flows from a parent algorithm to sub-algorithms A, B and C]
GAUDI’s use of architectural styles
• Interaction of top level components:
  • call-and-return styles: data abstraction
    • components: managers
    • connectors: method invocation
    • control: decentralized, usually single thread
    • data: passed
  • Issues: “interacting objects” are coupled because they know each other’s identity (i.e. class definition)
    • changes in one class affect the others that use it
    • GAUDI deals with this through the use of interfaces
    • This has an impact on the physical software design (see later)
GAUDI’s use of architectural styles (2)
• Algorithm and event data:
  • logical view – follows pipe-and-filter style
    • algorithms – filters
    • event data – data stream
  • data flow is implemented as a data-centered repository
    • this implies that data is effectively shared between all filters
    • issues: policy on shared data needed, e.g. who can modify what?
  • control flow follows the logical data flow
    • no general mechanism for control (who calls whom)
    • control is left up to the application manager and each “parent” algorithm
  • data and control flow: in same direction, but different paths
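The combination described above, a logical pipe-and-filter view implemented on top of a shared data store with control left to a parent algorithm, can be sketched as follows. All names here (`TransientStore`, `Algorithm`, `Calibrate`, `Select`, `Sequence`, and the data names "raw", "calibrated", "selected") are hypothetical and chosen for illustration; the store holds only `std::vector<double>` to keep the sketch short.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical transient store: data objects are registered and retrieved by name.
class TransientStore {
public:
    using Data = std::vector<double>;
    void put(const std::string& name, Data data) { objects_[name] = std::move(data); }
    // Returns nullptr if nothing is registered under this name.
    const Data* get(const std::string& name) const {
        auto it = objects_.find(name);
        return it == objects_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, Data> objects_;
};

// An algorithm knows only the names of its inputs and outputs in the store.
struct Algorithm {
    virtual ~Algorithm() = default;
    virtual bool execute(TransientStore& store) = 0;
};

// Reads "raw", writes "calibrated" (every value doubled).
class Calibrate : public Algorithm {
public:
    bool execute(TransientStore& store) override {
        const TransientStore::Data* raw = store.get("raw");
        if (!raw) return false;
        TransientStore::Data out;
        for (double x : *raw) out.push_back(2.0 * x);
        store.put("calibrated", std::move(out));
        return true;
    }
};

// Reads "calibrated", writes "selected" (values above 3 only).
class Select : public Algorithm {
public:
    bool execute(TransientStore& store) override {
        const TransientStore::Data* in = store.get("calibrated");
        if (!in) return false;
        TransientStore::Data out;
        for (double x : *in) {
            if (x > 3.0) out.push_back(x);
        }
        store.put("selected", std::move(out));
        return true;
    }
};

// Parent algorithm: it alone decides the order in which sub-algorithms run.
class Sequence : public Algorithm {
public:
    void add(std::unique_ptr<Algorithm> child) { children_.push_back(std::move(child)); }
    bool execute(TransientStore& store) override {
        for (auto& child : children_) {
            if (!child->execute(store)) return false;  // stop at first failure
        }
        return true;
    }

private:
    std::vector<std::unique_ptr<Algorithm>> children_;
};
```

`Calibrate` and `Select` never call each other: data flows through the store while control flows from the parent, so the two flows have the same direction but different paths. The sketch deliberately leaves open the policy question raised above, since any algorithm could overwrite any named object.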
Variations of algorithm/event data styles
• Many of today’s HEP experiments use a similar style
• BaBar and CDF share the AC++ framework:
  • framework is a state machine: loops over modules at transitions
  • module interface: Init, BeginRun, Event, EndRun, EndJob, TalkTo
  • follows pipe-and-filter style:
    • allows parallel data flow paths, filters can stop flow
    • data flow is the event: central repository
    • control flow (done by framework) is separate but parallel
[Figure: control framework driving an input module, a chain of app modules and an output module, all sharing the event; input/output modules do event selection]
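A framework that loops over modules at each transition can be sketched as below. This is not the AC++ code: the transition names follow the module interface listed above (TalkTo is omitted), but the classes `Module`, `Framework`, `EnergyCut` and `Counter`, the `Event` struct, and the rule that a run change is detected from the event's run number are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical event: a run number plus one measured quantity.
struct Event {
    int run;
    double energy;
};

// Hypothetical module base class with one hook per framework transition.
class Module {
public:
    virtual ~Module() = default;
    virtual void init() {}
    virtual void beginRun(int /*run*/) {}
    // Returning false stops the flow: later modules do not see this event.
    virtual bool event(const Event& e) = 0;
    virtual void endRun(int /*run*/) {}
    virtual void endJob() {}
};

// The framework owns the control flow and loops over modules at each transition.
class Framework {
public:
    void add(Module* m) { modules_.push_back(m); }  // modules are not owned

    void run(const std::vector<Event>& events) {
        for (Module* m : modules_) m->init();
        bool inRun = false;
        int currentRun = 0;
        for (const Event& e : events) {
            if (!inRun || e.run != currentRun) {
                if (inRun) {
                    for (Module* m : modules_) m->endRun(currentRun);
                }
                currentRun = e.run;
                inRun = true;
                for (Module* m : modules_) m->beginRun(currentRun);
            }
            for (Module* m : modules_) {
                if (!m->event(e)) break;  // a filter module rejected the event
            }
        }
        if (inRun) {
            for (Module* m : modules_) m->endRun(currentRun);
        }
        for (Module* m : modules_) m->endJob();
    }

private:
    std::vector<Module*> modules_;
};

// Filter module: passes only events above an energy cut.
class EnergyCut : public Module {
public:
    explicit EnergyCut(double cut) : cut_(cut) {}
    bool event(const Event& e) override { return e.energy > cut_; }

private:
    double cut_;
};

// Counts the runs it was told about and the events that reached it.
class Counter : public Module {
public:
    void beginRun(int) override { ++runs; }
    bool event(const Event&) override {
        ++events;
        return true;
    }
    int runs = 0;
    int events = 0;
};
```

With an `EnergyCut` placed before a `Counter`, the counter sees every run transition but only the events that pass the cut, which illustrates how a filter can stop the flow while the framework keeps control.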
Variations of algorithm/event data styles (2)
• Object Networks under consideration by ATLAS
  • follows both pipe-and-filter and implicit invocation styles:
    • data may be any C++ type, includes event, tracks, calo clusters
    • control flows in same direction as data flow (and done by framework)
    • module execution is triggered by an input which has changed (uses observer pattern)
    • allows finer granularity of algorithm decomposition
    • filters can stop flow, as for CDF/BaBar
[Figure: network of modules between an input module and an output module, connected through the event and intermediate data objects]
Variations of algorithm/event data styles (3)
• CMS CARF: Reconstruction via notification and action-on-demand
  • follows implicit invocation style
    • No central ordering of actions, no explicit control of data flow: only implicit dependencies
    • External dependencies managed through an Event Driven Notification to “subscribers”
    • Internal dependencies through an Action on Demand mechanism
  • how does it work?
    • reconstruction algorithms register themselves saying what they produce and their name
    • a client asks the event for something, e.g. a list of tracks, and may specify which algorithm is to be used
    • iterating over the list induces the algorithms to be run
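The action-on-demand mechanism can be sketched as an event that runs a registered algorithm only when a client first asks for its product. This is an illustration under stated assumptions, not CARF code: `LazyEvent`, `registerAlgorithm`, `get`, and the product names "hits" and "tracks" are hypothetical, products are plain `std::vector<double>`, and the algorithm runs on the first request rather than on iteration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event that reconstructs a product only when a client asks for it.
class LazyEvent {
public:
    using Product = std::vector<double>;
    // An algorithm receives the event so it can request its own inputs on demand.
    using Reconstructor = std::function<Product(LazyEvent&)>;

    // An algorithm registers itself under the name of what it produces.
    void registerAlgorithm(const std::string& product, Reconstructor algo) {
        algorithms_[product] = std::move(algo);
    }

    // The first request runs the producing algorithm; later requests reuse
    // the cached result. Throws std::out_of_range if nobody produces it.
    const Product& get(const std::string& product) {
        auto cached = cache_.find(product);
        if (cached != cache_.end()) return cached->second;
        Product result = algorithms_.at(product)(*this);
        ++runs_;
        return cache_.emplace(product, std::move(result)).first->second;
    }

    // Number of algorithm executions so far.
    int algorithmRuns() const { return runs_; }

private:
    std::map<std::string, Reconstructor> algorithms_;
    std::map<std::string, Product> cache_;
    int runs_ = 0;
};
```

If a "tracks" algorithm asks the event for "hits" inside its own body, requesting "tracks" triggers both algorithms in dependency order without any central schedule, and a second request triggers nothing. The ordering is therefore implied by the requests alone.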
Variations of algorithm/event data styles (4)
• CMS CARF: example
[Figure labels: Event, Detector Element, Hits, Rec Hits, Rec T1, T1, Rec T2, T2, Analysis]
Architectural issues of persistency
• The questions of persistency or I/O are primarily treated in another session
• Which issues are pertinent to software architecture?
  • What is the desired coupling of the system to the persistency mechanism?
  • How does one design an architecture which allows
    • access to the “advantages” of a desired I/O system, e.g. ODBMS?
    • access to different types of I/O systems?
    • migration to new technologies?
  • Are there performance implications?
  • What are the implications of wide-area distribution of the data and the users?
Architectural issues of persistency (2)
• From the application’s point of view, clients (e.g. algorithms) want to see objects as if they were transient. They
  • do not want to “know” about persistency details
  • just want de-referencing semantics: ptr->object
  • want to be able to traverse relationships
[Figure labels: TrackSet, Track, HitsOnTrack, Hit, with one-to-many (*) associations]
• Where is the knowledge about
  • what is in memory or on disk, tape or WAN?
  • the I/O implications when one follows an association or de-references a pointer?
Architectural issues of persistency (3)
• Example architecture 1): forced conversion when bringing objects in - do not let application see “persistent object”
  • advantages: application insulation, future migration
  • disadvantages:
    • potential loss of ODBMS benefits - may need to implement on transient side
    • number of classes or converters
[Figure labels: disk, memory, ODBMS knowledge, persistent shape, transient shape, converted shape (optimized for a specific purpose), algo 1, algo 2]
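Architecture 1) can be sketched with a converter that is the only code aware of both shapes. The sketch is illustrative only: `PersistentHit`, `TransientHit`, `HitConverter`, the field layout, and the gain-based calibration are hypothetical and do not describe any experiment's data model or any ODBMS API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical persistent shape: compact and storage-oriented.
struct PersistentHit {
    std::uint16_t channel;
    std::uint16_t adc;  // raw counts
};

// Hypothetical transient shape: what algorithms want to work with.
struct TransientHit {
    int channel;
    double energy;  // calibrated value, arbitrary units
};

// The converter is the only place that knows both shapes. Algorithms see
// TransientHit only, so the storage technology can change behind it.
class HitConverter {
public:
    explicit HitConverter(double gain) : gain_(gain) {}

    TransientHit toTransient(const PersistentHit& p) const {
        return TransientHit{static_cast<int>(p.channel), p.adc * gain_};
    }

    std::vector<TransientHit> toTransient(const std::vector<PersistentHit>& stored) const {
        std::vector<TransientHit> out;
        out.reserve(stored.size());
        for (const PersistentHit& p : stored) out.push_back(toTransient(p));
        return out;
    }

private:
    double gain_;
};
```

The cost named on the slide is visible even here: every persistent class needs a transient partner and a converter, and anything the database would have provided for the persistent objects has to be redone on the transient side.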
Architectural issues of persistency (4)
• Example architecture 2): allow application to access persistent objects directly
  • reduces to architecture 1) if
    • conversions are used everywhere
    • paged-in from disk with non-trivial conversion/streamer
  • advantages: ODBMS benefits - avoids reimplementing object manager, avoids extra classes where unnecessary
  • disadvantages: must take care not to contaminate application with ODBMS knowledge, future migration
[Figure labels: disk, memory, ODBMS knowledge, persistent shape, no transient shape, converted shape (optimized for a specific purpose), algo 1, algo 2]
Architectural issues of persistency (5)
• ROOT and Objectivity can be used with either architecture, RDBMS requires architecture 1) – needs transient shape
• Who chose what? (columns: Arch 1, Arch 2)
  • BaBar: X
  • CDF (non-trivial streamers): X
  • D0: X
  • ALICE: X
  • CMS: X
  • ATLAS: ?
[Table: the column in which each X was placed is not recoverable from the extracted text]
Architectural issues of persistency (6)
• What was the motivation?
• What price was paid?
• (see the persistency session)
User interface (UI) and visualization
Implementation and physical design
• Large software systems need to be decomposed into small and manageable components
• A decomposition has significant implementation-related consequences
  • compile-time coupling, link-time dependencies, size of executables, etc.
• Physical design focuses on grouping classes together into packages
  • Minimizing the number of package dependencies is a key issue
• Because of the consequences, system architects should be involved in the physical design
Implementation and physical design (2)
• Many HEP experiments find “guidance” in [Lakos] and [Martin]
• Example use of abstract interfaces ...