Graph Databases (GDB)
Adrian Silvescu, Doina Caragea, Anna Atramentov
Problems and Motivations
• The need to represent, store and manipulate complex data makes RDBMS somewhat obsolete
• [P1] Problem 1: Violations of 1NF
  • Multi-valued attributes
  • Complex attributes
  • Complex combinations of the previous two
• [P2] Problem 2: Accommodating change
  • Appears when acquiring data from autonomous, dynamic sources or the Web (e.g. genexp & restaurants)
  • RDBMS may require schema renormalization
Problems and Motivations (contd)
• [P3] Problem 3: Unified representation for:
  • Data
  • Knowledge (schemas are a subset of this)
  • Queries (more generally: goals) [results+def]
  • Models (concepts are a particular example)
• The aim is to facilitate the application of learning and reasoning methods to these structures
Existing Approaches
• RDBMS – may need schema renormalization
• Approaches that try to fix the above-mentioned problems:
  • OO Databases [P1], [P2] – graphs [but procedural]
  • XML Databases [P1] (somewhat [P3]) – trees
  • OORDBMS [P1] – graphs with foreign keys
• Others
  • Datalog – a more efficient, crippled Prolog
  • Network Models – graphs
  • Hierarchical Models – trees
• Hence the motivation for Graph Databases
Outline
• Graph Databases
• Examples
• DDL
• DML – Queries
• DML – Updates
• Informal Semantics
• DB => GDB
• GDB vs. OO, XML, OR, …
• Conclusions and Further Work
Graph Databases
• We propose a new kind of database, Graph Databases (GDB), as a solution to Problems [P1], [P2] and [P3].
• To define the GDB we will specify:
  • The Data Definition Language (DDL)
  • The Query Language (more generally, the DML)
  • Informal semantics of the above languages
• We will also show how to convert existing DBs (RDBMS) into the GDB DDL to facilitate the transition to GDBs
Goals and Design Choice
• Goals
  • Declarativity
  • Change
• Design choice: unique instance identifiers instead of foreign keys
  • Close in spirit to OO
  • Makes it easier to cope with change
  • Declarativity is an issue in OO, but not in GDB, as we will show
Database Representation
• Sailors(sid:integer, sname:char(10), rating:integer, age:real)
• Boats(bid:integer, bname:char(10), color:char(10))
• Reserves(sid:integer, bid:integer, day:date)

Sailors
sid | sname  | rating | age
22  | dustin | 7      | 45.0
31  | lubber | 8      | 55.5
58  | rusty  | 10     | 35.0

Reserves
sid | bid | day
22  | 101 | 10/10/96
58  | 103 | 11/12/96

Boats
bid | bname     | color
101 | Interlake | red
102 | Clipper   | green
103 | Marine    | red
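Graph Representation
[Figure: the Sailors, Reserves and Boats tables drawn as a graph. Each row is a node (ID1, ID2, …, ID8) with labeled edges to its attribute values, e.g. ID1 with sid 22, name dustin, rating 7, age 45.0 and ID2 with sid 31, name lubber, rating 8, age 55.5. Each row node has an IOF edge to its table node (Sailors, Reserves or Boats), and each table node has an IOF edge to TBL.]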
Foreign Keys
[Figure: three nodes linked only through matching key values. ID1 (Sailor): sid 22, name dustin, rating 7, age 45.0. ID4: sid 22, day 10/10/96, bid 101. ID6 (Boat): bid 101, bname Interlake, color red.]
Data Representation in the GDB DDL
• ID:(Name1=Val1,…,NameN=ValN)
[Figure: a node ID with edges labeled Name1, Name2, …, NameN pointing to the values Val1, Val2, …, ValN]
Examples:
ID1:(sid=22, name=“Dustin”, rating=7, age=45.0)
ID4:(sailor=ID1, day=“10/10/96”, boat=ID6)
ID6:(bid=101, bname=“Interlake”, color=“red”)
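The three example facts can be mirrored in ordinary code to make the "links instead of foreign keys" idea concrete. This is a minimal sketch, not the authors' implementation: the dictionary encoding and the helper name `follow` are assumptions introduced here for illustration.

```python
# Hypothetical in-memory encoding of GDB facts (an assumption, not from the slides):
# node ID -> {attribute name: value}. A value is a literal or another node's ID.
facts = {
    "ID1": {"sid": 22, "name": "Dustin", "rating": 7, "age": 45.0},
    "ID4": {"sailor": "ID1", "day": "10/10/96", "boat": "ID6"},
    "ID6": {"bid": 101, "bname": "Interlake", "color": "red"},
}


def follow(facts, node_id, attr):
    """Dereference a link attribute and return the target node's attributes."""
    return facts[facts[node_id][attr]]


# The reservation ID4 reaches its boat and sailor directly through IDs,
# without comparing sid or bid values.
reserved_boat = follow(facts, "ID4", "boat")
reserving_sailor = follow(facts, "ID4", "sailor")
```

Here `reserved_boat["bname"]` is `"Interlake"` and `reserving_sailor["name"]` is `"Dustin"`; no join on key columns is needed.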
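Defining New Concepts in GDB DDL – Grandson
[Figure: the rule drawn as two graphs. Head: _ID1 with a GrSon edge to _ID2. Body: _ID1 with a Son edge to _ID3, and _ID3 with a Son edge to _ID2; each of the three nodes has an IOF edge to Person.]
_ID1:(GrSon=_ID2) :- _ID1:(IOF=“Person”, Son=_ID3), _ID3:(IOF=“Person”, Son=_ID2), _ID2:(IOF=“Person”).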
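One way to read the Grandson rule is as a two-step walk along Son links between Person nodes. The sketch below hand-codes that single rule; the function name `derive_grandsons`, the list encoding of a multi-valued Son attribute, and the sample people are all assumptions made for illustration.

```python
def derive_grandsons(facts):
    """Return {grandparent_id: [grandson_id, ...]} following the GrSon rule."""

    def is_person(node_id):
        return facts.get(node_id, {}).get("IOF") == "Person"

    derived = {}
    for id1, attrs in facts.items():
        if not is_person(id1):
            continue
        for id3 in attrs.get("Son", []):           # _ID1:(Son=_ID3)
            if not is_person(id3):
                continue
            for id2 in facts[id3].get("Son", []):  # _ID3:(Son=_ID2)
                if is_person(id2):
                    derived.setdefault(id1, []).append(id2)
    return derived


# Hypothetical sample data: p1 -> p2 -> p3 along Son links.
people = {
    "p1": {"IOF": "Person", "Son": ["p2"]},
    "p2": {"IOF": "Person", "Son": ["p3"]},
    "p3": {"IOF": "Person"},
}
grandsons = derive_grandsons(people)
```

With this data, `grandsons` is `{"p1": ["p3"]}`: only p1 has a two-step Son path.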
[DML-QL] Writing Simple Queries
• The names of all sailors who have reserved a red boat
[Figure: the query drawn as graphs. Head: _ID with a Name edge to _X. Body: _ID with IOF Sailors, Name _X and a Boat edge to _ID1; _ID1 with IOF Boats and Color Red.]
_ID:(Name = _X) :- _ID:(IOF = Sailors, Boat = _ID1, Name = _X), _ID1:(IOF = Boats, Color = Red).
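Evaluating such a query amounts to matching a small pattern graph against the stored facts. The following is a minimal backtracking matcher, offered as a sketch only: the encoding of patterns as `(node_term, {attribute: term})` pairs, the convention that strings starting with `_` are variables, and the toy data (which stores `Boat` directly on the sailor node, as the query above does) are assumptions, not part of the slides.

```python
def is_var(term):
    """Variables are strings that start with an underscore, e.g. _ID or _X."""
    return isinstance(term, str) and term.startswith("_")


def unify(term, value, binding):
    """Bind term to value; return the extended binding, or None on conflict."""
    if is_var(term):
        if term in binding:
            return binding if binding[term] == value else None
        return {**binding, term: value}
    return binding if term == value else None


def match(patterns, facts, binding=None):
    """Yield every variable binding under which all patterns match the facts."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    (node_term, attrs), rest = patterns[0], patterns[1:]
    for node_id, node in facts.items():
        b = unify(node_term, node_id, binding)
        for name, term in attrs.items():
            if b is None:
                break
            b = unify(term, node[name], b) if name in node else None
        if b is not None:
            yield from match(rest, facts, b)


# Hypothetical toy facts.
facts = {
    "s1": {"IOF": "Sailors", "Name": "dustin", "Boat": "b1"},
    "s2": {"IOF": "Sailors", "Name": "lubber", "Boat": "b2"},
    "b1": {"IOF": "Boats", "Color": "Red"},
    "b2": {"IOF": "Boats", "Color": "Green"},
}
# Body of the query: the two node patterns to be matched.
query = [
    ("_ID", {"IOF": "Sailors", "Boat": "_ID1", "Name": "_X"}),
    ("_ID1", {"IOF": "Boats", "Color": "Red"}),
]
names = sorted({b["_X"] for b in match(query, facts)})
```

With this data `names` is `["dustin"]`. The matcher scans all nodes for each pattern, which is fine for a sketch but would need indexing for real data.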
Informal Semantics
• Three kinds of definitions
  • Facts:
    • G1. [Extensional definition]
  • Definitions:
    • G1 :- G2. [Intensional definition]
    • G1 :- PROC f(x1,…,xn). [Procedural definition]
• Queries = Graphs to be Matched = QG
  • The same as a definition: Query :- QG.
Informal Semantics – Picture
[Figure: a Query graph is matched against the Extended Graph, which includes the Facts.]
RDBMS => GDB
• Sailors(sid:integer, sname:char(10), rating:integer, age:real)
• Boats(bid:integer, bname:char(10), color:char(10))
• Reserves(sid:integer, bid:integer, day:date)

Sailors
sid | sname  | rating | age
22  | dustin | 7      | 45.0
31  | lubber | 8      | 55.5
58  | rusty  | 10     | 35.0

Reserves
sid | bid | day
22  | 101 | 10/10/96
58  | 103 | 11/12/96

Boats
bid | bname     | color
101 | Interlake | red
102 | Clipper   | green
103 | Marine    | red
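A conversion from tables to graph facts can be sketched in two passes: give every row a fresh ID and an IOF attribute naming its table, then replace each foreign-key value with the ID of the row it refers to. The slides do not spell out the procedure, so the function `tables_to_gdb`, its parameters, and the link names `sailor` and `boat` (borrowed from the ID4 example) are assumptions.

```python
from itertools import count


def tables_to_gdb(tables, primary_keys, foreign_keys):
    """Turn each row into a node with a fresh ID; replace foreign keys by links.

    tables:       {table: [row dict, ...]}
    primary_keys: {table: key column}
    foreign_keys: {table: {column: (referenced table, link name)}}
    """
    ids = count(1)
    facts, key_to_id = {}, {}
    # Pass 1: one node per row, tagged with its table through IOF.
    for table, rows in tables.items():
        for row in rows:
            node_id = f"ID{next(ids)}"
            facts[node_id] = {"IOF": table, **row}
            pk = primary_keys.get(table)
            if pk is not None:
                key_to_id[(table, row[pk])] = node_id
    # Pass 2: swap each foreign-key value for a link to the referenced node.
    for node in facts.values():
        for column, (target, link) in foreign_keys.get(node["IOF"], {}).items():
            node[link] = key_to_id[(target, node.pop(column))]
    return facts


tables = {
    "Sailors": [
        {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
        {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
        {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
    ],
    "Reserves": [
        {"sid": 22, "bid": 101, "day": "10/10/96"},
        {"sid": 58, "bid": 103, "day": "11/12/96"},
    ],
    "Boats": [
        {"bid": 101, "bname": "Interlake", "color": "red"},
        {"bid": 102, "bname": "Clipper", "color": "green"},
        {"bid": 103, "bname": "Marine", "color": "red"},
    ],
}
gdb = tables_to_gdb(
    tables,
    primary_keys={"Sailors": "sid", "Boats": "bid"},
    foreign_keys={"Reserves": {"sid": ("Sailors", "sailor"), "bid": ("Boats", "boat")}},
)
```

With the tables listed in this order, the first reservation becomes `ID4` with `sailor="ID1"` and `boat="ID6"`, in line with the DDL example earlier in the deck. The key columns `sid` and `bid` stay on the Sailors and Boats nodes; dropping them is a separate design decision.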
DML – Updates
• Inline query
  • _X : [ _ID:(IOF = Sailor, sname = lubber, rating = _X) ]
• Updates: MODIFY (QryGraph, UpdList)
  • Add to all grandsons the money of the grandparent as a potential inheritance.
  • MODIFY ( _ID:(GrSon = _ID1), (=> NEWID:(IOF = POT_INHER, BENEFICIARY = _ID1, AMOUNT = _AMNT: [ _ID:(Money = _AMNT) ]) ))
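The MODIFY example can be read as "for every match of the query graph, create one new node". The sketch below hand-codes that reading for this single update; it is an interpretation, not the defined semantics of MODIFY. The function name, the `NEW…` ID scheme, the list encoding of GrSon, and the choice to skip grandparents without a Money attribute are all assumptions.

```python
from itertools import count


def add_potential_inheritance(facts):
    """For every GrSon link, add a POT_INHER node carrying the grandparent's Money."""
    fresh = count(1)
    new_nodes = {}
    for attrs in facts.values():
        if "Money" not in attrs:       # assumed: the inline query finds no _AMNT
            continue
        for grandson in attrs.get("GrSon", []):
            new_nodes[f"NEW{next(fresh)}"] = {
                "IOF": "POT_INHER",
                "BENEFICIARY": grandson,
                "AMOUNT": attrs["Money"],
            }
    # Insert after the scan so the facts are not changed while being iterated.
    facts.update(new_nodes)
    return new_nodes


# Hypothetical sample data.
family = {
    "g1": {"IOF": "Person", "Money": 1000, "GrSon": ["c1", "c2"]},
    "c1": {"IOF": "Person"},
    "c2": {"IOF": "Person"},
}
created = add_potential_inheritance(family)
```

After the call, `family` holds two extra POT_INHER nodes, one per grandson, each with AMOUNT 1000.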
Change
[Figure: a node _ID with IOF Book, Title Databases, and two Author edges, to Ramakrishnan and to Gehrke.]
Change
[Figure: nodes _ID, _ID1, _ID2, …, _IDn with IOF edges to Gene_ex and Exp and value edges x1, x2, …, xN and 0.7, related by an Aggregate Operation.]
Higher-Order Queries
• Find all the fields of tables that contain the name John.
_ID:(Name=_X) :- _ID1:(IOF=Tables), _ID2:(IOF=_ID1, _X=“John”).
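What makes this query higher-order is that the variable _X ranges over attribute names rather than values. A minimal sketch of that idea follows, assuming the dictionary encoding used for illustration here; the function `fields_with_value` and the sample tables are hypothetical.

```python
def fields_with_value(facts, value):
    """Return {(table_id, field_name)} for rows of any table holding `value`."""
    # _ID1:(IOF=Tables): every node that is itself a table.
    tables = {nid for nid, attrs in facts.items() if attrs.get("IOF") == "Tables"}
    hits = set()
    for attrs in facts.values():
        table = attrs.get("IOF")
        if table not in tables:
            continue
        # _ID2:(IOF=_ID1, _X=value): here the attribute NAME is the unknown.
        for name, v in attrs.items():
            if name != "IOF" and v == value:
                hits.add((table, name))
    return hits


# Hypothetical sample data.
facts = {
    "Employees": {"IOF": "Tables"},
    "Pets": {"IOF": "Tables"},
    "e1": {"IOF": "Employees", "first_name": "John", "dept": "R&D"},
    "p1": {"IOF": "Pets", "owner": "John", "pet": "Rex"},
    "x1": {"IOF": "Scratch", "note": "John"},
}
john_fields = fields_with_value(facts, "John")
```

Here `john_fields` is `{("Employees", "first_name"), ("Pets", "owner")}`; the `Scratch` node is ignored because it is not an instance of a table.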
GDB vs. OO, XML, OR, CG, …
• GDB are close in spirit to OO but not the same (GDB: no encapsulation + more IDs).
• Close to Datalog, but with IDs (links) instead of foreign keys
• The same holds for ORDBMS and, to some extent, XML
• Close to Conceptual Graphs, but CG do not have IDs
• We can also use foreign keys: _ID:[ _ID:(IOF = Sailors, sname = lubber) ].
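OO vs. GDB
[Figure: the same data in both models. OO: an object of Class: Person with age: 42, name: john and a car reference to an object of Class: Car with color: red. GDB: node ID1 with IOF Person, age 42, name John and a car edge to ID2; node ID2 with IOF Car and color red.]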
Conclusions and Further Work
• Advantages & disadvantages of GDB
• Conclusions
  • Proposed a new database model, GDB, that copes with problems existing in previous approaches
  • Showed how to link existing DBs with GDB
  • Showed advantages of GDB over existing database approaches
• Further work
  • Use GDB for learning