Ariadne: Prima Facie
Illya Shapoval (CERN, KIPT, UNIFE, INFN-FE)
2nd LHCb Computing Workshop, 4th-8th November 2013, CERN
Content
• Objectives
• Approach
• Choice of DBMS
• Ariadne
• Use cases
LHCb Computing Workshop, 4th-8th November, CERN
Introduction
• LHCb data processing implies handling heterogeneous metadata entities:
  • Versions of applications (in 2 dimensions)
  • Conditions Database (CondDB) states (in 2x3 dimensions)
  • Real data reconstruction types
  • MC data simulation types
  • Many others: trigger configurations, architecture specifiers, etc.
Objectives: how it started
• What entities are compatible?
• How are entities related?
• What entities are compatible with an application?
• What entities are compatible with a data processing type?
• What entities are compatible with a CondDB state?
• Is a CondDB state consistent?
Requirements
• An operational space with a generic way of:
  • Expressing relationship constraints
  • Tracking relationships
  • Extracting solutions
• Ease of data management
• Flexibility (to extend the area of application)
Approach: property graphs
• Modeling structured metadata:
  • as nodes with attributes (key + value)
  • with typed connections
• Tracking?
• Extracting?
[Figure: example property graph. Nodes carry attributes such as A:1, R:"T", S:99, A:2, B:"W" and A:3; edges carry types such as "compatible" and "xyz".]
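As a rough illustration of the model above, the sketch below keeps nodes as attribute dictionaries and relationships as typed edges in plain Python. The names `PropertyGraph`, `add_node`, `relate` and `neighbours`, the grouping of attributes into nodes, and the edge topology are all assumptions made for the example; they are not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    """Minimal in-memory property graph (illustrative only)."""
    nodes: list = field(default_factory=list)   # each node is a dict of key/value attributes
    edges: list = field(default_factory=list)   # (source_id, relationship_type, target_id)

    def add_node(self, **attrs):
        """Store a node and return its integer id."""
        self.nodes.append(dict(attrs))
        return len(self.nodes) - 1

    def relate(self, source, rel_type, target):
        """Add a typed connection between two existing nodes."""
        self.edges.append((source, rel_type, target))

    def neighbours(self, node_id, rel_type):
        """Tracking: ids one hop away over edges of the given type, in either direction."""
        found = set()
        for source, kind, target in self.edges:
            if kind != rel_type:
                continue
            if source == node_id:
                found.add(target)
            elif target == node_id:
                found.add(source)
        return found


# Hypothetical graph loosely following the figure's labels.
g = PropertyGraph()
a1 = g.add_node(A=1)
rs = g.add_node(R="T", S=99)
ab = g.add_node(A=2, B="W")
g.relate(a1, "compatible", rs)
g.relate(rs, "compatible", ab)
g.relate(a1, "xyz", ab)

print(g.neighbours(rs, "compatible"))
```

Tracking then amounts to following edges of one type, and extracting amounts to collecting the nodes reached.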
Choice of DBMS model
[Table: comparison of DBMS models; the table body is not preserved in this transcript, only its footnotes.]
[1] C. Ireland et al., A Classification of Object-Relational Impedance Mismatch, DBKDA '09.
[2] Correlated with the "Performance of structural queries" row of the table.
[3] C. Vicknair et al., A Comparison of a Graph Database and a Relational Database, ACM SE '10, NY.
[4] See the next slide.
Other NoSQL DBMS models
[Figure: DBMS models placed on axes of data size versus data complexity: key-value dbs, column dbs, document dbs, relational dbs and graph dbs. Annotation: "Still billions of nodes".]
Choice of graph DBMS
• Neo4j
[Figure: ranking of graph DBMS, taken from the "Knowledge Base of Relational and NoSQL DBMS" at http://db-engines.com/en/ranking/graph+dbms]
Neo4j customers
Are we alone?
Ariadne: system design
A tracking system for relationships in LHCb metadata
[Figure: leveled data flow diagram, shown in the Yourdon-DeMarco DFD notation. Context diagram: metadata primitives, Ariadne, topological solutions. Level 0: Ariadne UI, Publish (Ariadne Python API, CL tools, web FE), Query (CL tools, web FE), Neo4j database.]
Ariadne: previous security model
• Neo4j Jetty server
• Ariadne API (py2neo)
• Admin CL tools
• Neo4j admin web interface
• Public Ariadne XMLRPC server
• Users':
  • CL tools
  • LHCb job (?)
[Figure: access paths between the components above, labelled RW, RW and RO.]
Problem: administration access was secured by IP-based rules, and was therefore very limited and inconvenient.
Ariadne: new security model
• Neo4j Jetty server
• Ariadne API (py2neo)
• Admin CL tools
• Neo4j admin web interface
• Apache server with SSO IAA
• Public Ariadne XMLRPC server
• Users':
  • CL tools
  • LHCb job (?)
[Figure: access paths between the components above, labelled RW and RO.]
Evolution: all administration components of Ariadne are integrated into the CERN SSO IAA infrastructure (no LDAP!).
Ariadne: current knowledge graph
• Metadata entities that the current graph contains (>500):
  • Applications (A)
  • CondDB tags (T; or TD, TC and TQ to specify partitions)
  • DetectorTypes (DataTypes) (D), RecoTypes (R), SimTypes (S)
  • Platforms (P), GRID sites (G)
• Relationships between those entities (~50k):
  • A-T, A-D, A-R, A-S
  • D-T, R-T, S-T
Tracking Relationships: Matching Patterns (1)
• What is the full compatible set of entities for a concrete real data processing type?
[Figure: a query pattern is submitted to Ariadne, which returns the solution; both are drawn as graphs over entities of types A, R, S, D, TD, TC and TQ.]
Tracking Relationships: Matching Patterns (2)
• What is the full compatible set of entities for a concrete application and MC data processing type?
[Figure: a query pattern is submitted to Ariadne, which returns the solution; both are drawn as graphs over entities of types A, S, D, TD, TC and TQ.]
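The two pattern-matching queries above can be sketched as a search for one entity per requested type such that every tracked pair in the result is compatible. This is a brute-force illustration in plain Python, not Ariadne's implementation; the entity names, the `TRACKED` type pairs and the `COMPATIBLE` relationships are made-up data, and `solutions` is a hypothetical helper.

```python
from itertools import combinations, product

# Hypothetical entities, keyed by name, with their type letter.
TYPES = {"app-1": "A", "app-2": "A", "reco-x": "R",
         "tc-1": "TC", "tc-2": "TC", "td-1": "TD"}

# Type pairs between which compatibility is tracked (assumed for the example).
TRACKED = {frozenset(p) for p in [("A", "R"), ("A", "TC"), ("A", "TD"),
                                  ("R", "TC"), ("R", "TD")]}

# Undirected "compatible" relationships (made-up data).
COMPATIBLE = {frozenset(p) for p in [
    ("reco-x", "app-1"), ("reco-x", "app-2"),
    ("reco-x", "tc-1"), ("reco-x", "tc-2"), ("reco-x", "td-1"),
    ("app-1", "tc-1"), ("app-1", "td-1"),
    ("app-2", "tc-2"), ("app-2", "td-1"),
]}


def is_consistent(entities):
    """True if every pair whose types are tracked is marked compatible."""
    for a, b in combinations(entities, 2):
        tracked = frozenset((TYPES[a], TYPES[b])) in TRACKED
        if tracked and frozenset((a, b)) not in COMPATIBLE:
            return False
    return True


def solutions(fixed, wanted_types):
    """Yield one entity per wanted type, consistent with the fixed query entities."""
    candidates = [[n for n, t in TYPES.items() if t == w] for w in wanted_types]
    for choice in product(*candidates):
        if is_consistent(list(fixed) + list(choice)):
            yield dict(zip(wanted_types, choice))


# Query: which applications and CondDB tags go with reconstruction type "reco-x"?
found = list(solutions(["reco-x"], ["A", "TC", "TD"]))
for solution in found:
    print(solution)
```

A graph database answers the same question by matching the query pattern against stored relationships instead of enumerating every combination, which is what makes the approach practical at the ~50k-relationship scale.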
Tracking Relationships: filtering multiple solutions (3)
• What is the full compatible set of entities for a concrete application and MC data processing type?
• Ariadne filters multiple results according to the criterion provided by a user (for example, "Latest").
[Figure: a query pattern is submitted to Ariadne, which returns a set of solutions; the "Latest" criterion selects one of them.]
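Filtering by a user-supplied criterion can be pictured as ranking each solution and keeping the best-ranked ones. The sketch below assumes solutions are dictionaries keyed by entity type and that application names carry a version of the form `vXrY`; both are assumptions for illustration, as are the helper names `version_of`, `latest` and `filter_solutions`.

```python
import re


def version_of(name):
    """Parse a 'vXrY' version (assumed naming scheme) into a sortable tuple of ints."""
    match = re.search(r"v(\d+)r(\d+)", name)
    return tuple(int(part) for part in match.groups())


def latest(solution):
    """Example criterion: rank a solution by its application version."""
    return version_of(solution["A"])


def filter_solutions(solutions, criterion):
    """Keep only the solutions that rank highest under the criterion."""
    best = max(criterion(s) for s in solutions)
    return [s for s in solutions if criterion(s) == best]


# Hypothetical set of solutions returned by a query.
candidates = [
    {"A": "app v9r2", "S": "sim-x", "TC": "tc-1"},
    {"A": "app v10r0", "S": "sim-x", "TC": "tc-2"},
]
chosen = filter_solutions(candidates, latest)
print(chosen)
```

Comparing versions as integer tuples rather than strings matters here: as text, "v9r2" would sort after "v10r0".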
Other application domains
• What entities are compatible?
• How are entities related?
• What entities are compatible with an application?
• What entities are compatible with a data processing type?
• What entities are compatible with a CondDB state?
• Is a CondDB state consistent?
• What does an application depend on?
• Where is an entity deployed?
• What platform was an application built for?
• …?
Summary
• Ariadne: a generic tracking system for relationships in LHCb metadata
  • Provides a generic UI layer for heterogeneous metadata;
  • Based on the novel Neo4j graph database;
  • Provides powerful expressiveness when dealing with complex data (lots of relationships);
  • A scalable and high-performance solution for complex data.
Ariadne developers: Illya Shapoval, Marco Clemencic, Marco Cattaneo
Many thanks to Joel Closier for the admin support of the Ariadne hosting machine.
Special thanks to Regina Hunyadi for the amazing painting "Ariadne" and the authorization to use a copy of it.
The system is named after Ariadne, a character of ancient Greek legend who was in charge of the Cnossian Labyrinth and who helped Theseus, with a ball of thread, to find his way back out of the labyrinth after killing the Minotaur.
BACKUP
Ariadne: system requirements (these reduce to Neo4j's requirements)
Compatibility of Entities: Definition
• Two entities are declared compatible if a job that uses them simultaneously:
  • never fails because of the combination
  • is configured by the combination to work exactly as a developer anticipated.
Implication:
• A set of entities is declared compatible if and only if every pair of entities in the set between which compatibility is tracked is compatible.
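The implication above can be written compactly. The symbols are introduced here for illustration only: $E$ is the set of all entities, $T$ the set of unordered entity pairs between which compatibility is tracked, and $C \subseteq T$ the set of pairs declared compatible.

```latex
% S is any set of entities; T and C are the assumed sets of tracked and compatible pairs.
\[
  S \subseteq E \text{ is compatible}
  \iff
  \forall\, a, b \in S,\ a \neq b:\quad
  \{a, b\} \in T \;\Rightarrow\; \{a, b\} \in C .
\]
```

Pairs outside $T$ impose no constraint, so a set whose members share no tracked pair is compatible by this definition.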