
8. Special database types



Presentation Transcript


  1. 8. Special database types
     Distributed databases
     • Distribution of data:
       • Several host sites
       • Availability and reliability: replicated data
       • Distributed concurrency control
     • Distribution of users:
       • Client-server architecture
       • Web databases; three-tier architecture
     AdvDB-8 J. Teuhola 2015

  2. Distributed databases: Requirements
     • Replication and partitioning of data
     • Maintenance of a location map for data
     • Query optimization for multiple hosts
     • Maintenance of consistency among replicas after update operations
     • Recovery from network failures
     • Partial usability when some hosts are down
     • Management and control of access rights

  3. Distributed databases: Advantages
     • Improved efficiency by replication: data is close to users, preferably in the local host.
     • Improved reliability by replication: when one host is down, others continue to operate. Data remains accessible as long as at least one copy is available.
     • Transparency: the user does not need to know the location of data / replicas / partitions.
     • Extensibility: new nodes can be added to the network.

  4. Example: distributed join
     • Relation R(X, Y, Z) stored in host A
     • Relation S(Z, W) stored in host B
     • Steps of natural join R * S for host A:
       • Send column R(Z) from A to B
       • Compute semijoin T(Z, W) = R(Z) * S(Z, W) in B
       • Send relation T back to A
       • Compute the final join R * T
     • Note: the last step can be replaced by concatenation if duplicates are maintained in W and T
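
     The four steps can be sketched in Python. This is only a single-process illustration: the two "hosts" are plain lists, the sample tuples are made up, and the function names (project_z, semijoin_at_b, final_join_at_a) are hypothetical, not part of the slides. It assumes distinct Z values are shipped in step 1, so the final join (not the concatenation shortcut from the note) is used.

```python
# Made-up sample data standing in for the two hosts' relations.
R = [(1, "a", 10), (2, "b", 20), (3, "c", 30)]     # R(X, Y, Z), stored at host A
S = [(10, "p"), (10, "q"), (30, "r"), (40, "s")]   # S(Z, W), stored at host B


def project_z(r):
    """Step 1 (at A): the distinct Z values of R, to be sent to B."""
    return {z for (_x, _y, z) in r}


def semijoin_at_b(z_values, s):
    """Step 2 (at B): keep only the S tuples whose Z value occurs in R."""
    return [(z, w) for (z, w) in s if z in z_values]


def final_join_at_a(r, t):
    """Step 4 (at A): natural join of R with the reduced relation T."""
    w_by_z = {}
    for z, w in t:
        w_by_z.setdefault(z, []).append(w)
    return [(x, y, z, w) for (x, y, z) in r for w in w_by_z.get(z, [])]


z_sent = project_z(R)            # shipped A -> B (step 1)
T = semijoin_at_b(z_sent, S)     # computed at B, shipped B -> A (steps 2 and 3)
result = final_join_at_a(R, T)   # computed at A (step 4)
```

     The point of the semijoin is that only R(Z) and the matching part of S cross the network; the S tuple with Z = 40 is never sent to A.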

  5. Deductive (logic) databases
     Main features:
     • ‘Data’ consists of facts and rules.
     • Declarative language to define them
     • Inference engine = deduction mechanism for solving queries
     Related areas:
     • Relational data model (esp. relational calculus)
     • Logic programming (Prolog)
     • Datalog: Subset of Prolog

  6. Deductive databases: Example in Datalog
     Facts: parent(x, y) means that y is x’s parent
       parent(peter,mary).
       parent(peter,paul).
       parent(mary,john).
       parent(paul,joan).
     Rules: ancestor(x, y) means that y is x’s ancestor
       ancestor(X,Y) :- parent(X,Y).
       ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).
     Queries: (1) ancestors of Peter, (2) descendants of Joan
       ?- ancestor(peter,?).
       ?- ancestor(?,joan).
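
     How an inference engine might answer these queries can be sketched in Python with naive bottom-up (fixpoint) evaluation. This is one possible evaluation strategy chosen for illustration, not necessarily what a given Datalog system does; the function name derive_ancestors is hypothetical.

```python
# The four parent facts from the slide, as (x, y) pairs: y is x's parent.
parent = {
    ("peter", "mary"),
    ("peter", "paul"),
    ("mary", "john"),
    ("paul", "joan"),
}


def derive_ancestors(parent_facts):
    """Apply the two ancestor rules repeatedly until no new facts appear."""
    ancestor = set(parent_facts)  # rule 1: ancestor(X,Y) :- parent(X,Y).
    while True:
        # rule 2: ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
        derived = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in ancestor
            if z == z2
        }
        if derived <= ancestor:  # fixpoint reached
            return ancestor
        ancestor |= derived


ancestor = derive_ancestors(parent)

# Query 1: ?- ancestor(peter,?).
ancestors_of_peter = {y for (x, y) in ancestor if x == "peter"}
# Query 2: ?- ancestor(?,joan).
descendants_of_joan = {x for (x, y) in ancestor if y == "joan"}
```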

  7. Data warehouses
     • Support for decision making.
     • Derived, integrated and refined from operational databases.
     • No transaction processing, not quite up-to-date.
     • Multidimensional view of data (data cube)
     • OLAP = On-Line Analytical Processing.
     • Summary and multidimensional data.
     • Statistical analysis tools.
     • Data mining tools.

  8. Example: data cube on sales
     • Sales values per salesman, product and date
     [Figure: data cube with axes Sales-person, Date, Product]

  9. Example: ‘Star’ schema for data warehouse
     ‘Fact table’: Sales
       SalesTable(ProdNo, AreaNo, Date, Amount, Value)
     ‘Dimension tables’: Prod, Area, Time
       ProdTable(Prod-no, Name, Descr, Group)
       AreaTable(AreaNo, Name, Seller)
       TimeTable(Date, DayOfWeek)
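
     A typical OLAP-style query over such a star schema joins the fact table with some dimension tables and aggregates. The sketch below uses Python's built-in sqlite3 module. The rows are made-up sample data, and two column names are adapted as an assumption: Prod-no becomes ProdNo and Group becomes ProdGroup, because a hyphen and the keyword GROUP are awkward as SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProdTable (ProdNo INTEGER PRIMARY KEY, Name TEXT, Descr TEXT, ProdGroup TEXT);
CREATE TABLE AreaTable (AreaNo INTEGER PRIMARY KEY, Name TEXT, Seller TEXT);
CREATE TABLE TimeTable (Date TEXT PRIMARY KEY, DayOfWeek TEXT);
CREATE TABLE SalesTable (
    ProdNo INTEGER REFERENCES ProdTable(ProdNo),
    AreaNo INTEGER REFERENCES AreaTable(AreaNo),
    Date   TEXT    REFERENCES TimeTable(Date),
    Amount INTEGER,
    Value  REAL
);

-- Made-up sample rows, for illustration only.
INSERT INTO ProdTable VALUES (1, 'Pen', 'Ballpoint pen', 'Office'),
                             (2, 'Desk', 'Writing desk', 'Furniture');
INSERT INTO AreaTable VALUES (1, 'North', 'Ann'), (2, 'South', 'Bob');
INSERT INTO TimeTable VALUES ('2015-01-05', 'Monday'), ('2015-01-06', 'Tuesday');
INSERT INTO SalesTable VALUES (1, 1, '2015-01-05', 10, 20.0),
                              (1, 1, '2015-01-06', 5, 10.0),
                              (2, 1, '2015-01-05', 1, 150.0),
                              (1, 2, '2015-01-06', 4, 8.0);
""")

# Roll up the fact table: total sales value per seller and product group.
rows = conn.execute("""
    SELECT a.Seller, p.ProdGroup, SUM(s.Value)
    FROM SalesTable s
    JOIN AreaTable a ON s.AreaNo = a.AreaNo
    JOIN ProdTable p ON s.ProdNo = p.ProdNo
    GROUP BY a.Seller, p.ProdGroup
    ORDER BY a.Seller, p.ProdGroup
""").fetchall()
conn.close()
```

     Each GROUP BY over a subset of the dimensions corresponds to one aggregated view of the data cube.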

  10. XML databases: ‘semi-structured data’
      • Storage and retrieval of XML documents: structured using nested pairs of tags
      • Flexible, hierarchical schema
      • Alternative implementations for XML databases:
        • Relational database: various alternatives
        • Object database: more direct mapping of the structure
        • Native XML database: built from scratch, tailored especially for this data type
      • Query Language: XQuery

  11. Example document collection: 2 courses

      <?xml version="1.0"?>
      <course>
        <cname>Adv DB</cname>
        <teacher>Timo</teacher>
        <audience>
          <student>Pasi</student>
          <student>Pirjo</student>
        </audience>
      </course>

      <?xml version="1.0"?>
      <course>
        <cname>C++</cname>
        <teacher>Esa</teacher>
        <audience>
          <student>Pasi</student>
          <student>Pia</student>
        </audience>
      </course>

  12. Illustration as tree structures
      [Figure: Course document 1 and Course document 2 drawn as trees]

  13. Relational alternative 1: XML data type for a column
      Courses-relation

      cid  course document
      c1   <?xml…?><course><cname>Adv DB</cname><teacher>Timo</teacher><audience><student>Pasi</student><student>Pirjo</student></audience></course>
      c2   <?xml version="1.0"?><course><cname>C++</cname><teacher>Esa</teacher><audience><student>Pasi</student><student>Pia</student></audience></course>

  14. Relational alternative 2: Non-typed nodes
      Nodes-relation

      node-id  element   parent  text-value
      n1       course    -       -
      n2       cname     n1      Adv DB
      n3       teacher   n1      Timo
      n4       audience  n1      -
      n5       student   n4      Pasi
      n6       student   n4      Pirjo
      n7       course    -       -
      n8       cname     n7      C++
      …        …         …       …
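
      Producing such a Nodes-relation from a document can be sketched in Python with the standard xml.etree.ElementTree module. The function name shred and the use of None for the "-" entries are assumptions made for this illustration; node ids are assigned in document (pre-order) order, which reproduces rows n1 to n6 of the table.

```python
import xml.etree.ElementTree as ET

# Course document 1 from the example collection (XML declaration omitted).
DOC1 = (
    "<course><cname>Adv DB</cname><teacher>Timo</teacher>"
    "<audience><student>Pasi</student><student>Pirjo</student></audience>"
    "</course>"
)


def shred(xml_text, first_id=1):
    """Return (node-id, element, parent, text-value) rows for one document."""
    rows = []
    next_id = first_id

    def visit(elem, parent_id):
        nonlocal next_id
        node_id = f"n{next_id}"
        next_id += 1
        # Elements without text of their own get None (shown as "-" in the table).
        text = (elem.text or "").strip() or None
        rows.append((node_id, elem.tag, parent_id, text))
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)
    return rows


nodes = shred(DOC1)
```

      Because the table is generic (any element fits), no schema change is needed for new document types, but reassembling a document requires following the parent links.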

  15. Relational alternative 3: Typed nodes

      Courses
      cid  cname   teacher
      c1   Adv DB  Timo
      c2   C++     Esa

      Audience
      student  cid
      Pasi     c1
      Pirjo    c1
      Pasi     c2
      Pia      c2
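
      The typed mapping can be sketched the same way: each element type gets its own relation, so the mapping code must know the document structure. The function name to_typed_rows and the DOCS dictionary are hypothetical; the cid values c1 and c2 are taken from the tables above.

```python
import xml.etree.ElementTree as ET

# The two course documents from the example collection, keyed by cid.
DOCS = {
    "c1": "<course><cname>Adv DB</cname><teacher>Timo</teacher><audience>"
          "<student>Pasi</student><student>Pirjo</student></audience></course>",
    "c2": "<course><cname>C++</cname><teacher>Esa</teacher><audience>"
          "<student>Pasi</student><student>Pia</student></audience></course>",
}


def to_typed_rows(cid, xml_text):
    """Map one course document to a Courses row and its Audience rows."""
    course = ET.fromstring(xml_text)
    course_row = (cid, course.findtext("cname"), course.findtext("teacher"))
    audience_rows = [(s.text, cid) for s in course.findall("audience/student")]
    return course_row, audience_rows


courses, audience = [], []
for cid, doc in DOCS.items():
    course_row, audience_rows = to_typed_rows(cid, doc)
    courses.append(course_row)
    audience.extend(audience_rows)
```

      Compared with the non-typed alternative, queries become ordinary joins on cid, at the cost of fixing the schema in advance.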

  16. Digital libraries
      • Organized collection of information ( web)
      • Close to multimedia databases, but more focused on information retrieval features
      • Two types of users:
        • End users make retrievals
        • Librarians select, organize and maintain the collection.
      • Important: Metadata and annotations
      • Hard job: digitization of ‘real’ libraries

  17. Spatial databases
      • Representations: Solid (2D, 3D), boundary, abstract (‘above’, ‘near’, ‘under’, ...)
      • Objects: points, line segments, rectangles
      • Spatial operations (intersection, nearest neighbor, spatial join, ...)
      • Important application area: GIS = Geographic Information System (objects on maps).
      • Temporal dimension may be included (movement, order of events)
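
      The three spatial operations named above can be illustrated with brute-force Python versions for axis-aligned rectangles and 2D points. This is only a sketch: real spatial databases use spatial indexes rather than exhaustive scans, and the names Rect, intersects, nearest_neighbor and spatial_join are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle given by its lower-left and upper-right corners."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a, b):
    """Intersection test: true if the rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def nearest_neighbor(query, points):
    """Nearest neighbor: the point with the smallest Euclidean distance to query."""
    return min(points, key=lambda p: math.dist(query, p))


def spatial_join(left, right):
    """Spatial join: index pairs (i, j) where left[i] intersects right[j]."""
    return [(i, j)
            for i, a in enumerate(left)
            for j, b in enumerate(right)
            if intersects(a, b)]


overlap = intersects(Rect(0, 0, 2, 2), Rect(1, 1, 3, 3))
closest = nearest_neighbor((0, 0), [(3, 4), (1, 1), (-2, 0)])
pairs = spatial_join([Rect(0, 0, 2, 2), Rect(5, 5, 6, 6)],
                     [Rect(1, 1, 3, 3), Rect(10, 10, 11, 11)])
```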

  18. Scientific databases
      • Large amounts of observed data (raw, calibrated, validated, derived, interpreted)
      • Seldom updated; transaction processing is not needed.
      • One form of data warehouse.
      • Metadata is crucial
      • Example of scientific database: genome and protein data in bioinformatics (sequences, 3D-structures)

  19. Multimedia databases
      • Text, hypertext, images, graphics, audio, video
      • Applications: Media servers, audio/video-on-demand, document management, educational services, marketing, intelligent systems, digital libraries, medical information systems, etc.
      • Issues: Modeling (complex objects), design, storage of large objects (LOBs), compression, retrieval (indexes), performance (critical for audio/video).

  20. Multimedia databases: Required features
      • Supports the main types of multimedia (MM) data
      • Can handle a very large number of MM objects
      • Supports high-performance, high-capacity storage management
      • Offers DB capabilities: Persistence, transactions, concurrency control, recovery from failures, querying with high-level declarative constructs, versioning, integrity constraints, security.
      • Offers information-retrieval capabilities: Exact-match retrieval, probabilistic (best-match) retrieval, content-based retrieval, ranking of results

  21. Multimedia databases: Functional considerations
      • Interactive querying
      • Relevance feedback
      • Query refinement
      • Automatic feature extraction and indexing
      • Content- and context-based indexing of different media
      • Single- and multidimensional indexing

  22. Multimedia databases: Functional considerations (cont.)
      • Clustering of media data on storage devices
      • Support for efficient access to very large media objects
      • Optimization of multimedia queries and retrieval, supported by sophisticated indexing
      • Replication, parallelism, distribution, scalability
      • Recent approach: NoSQL databases, with relaxed consistency requirements compared to traditional ACID (see Chapter 3)

  23. NoSQL databases
      • "Not only SQL"
      • "Big Data" applications, e.g. search engines, social media, data streams, observation data
      • Traditional relational technology does not scale well to huge amounts of data.
      • Typical of NoSQL systems:
        • Requirement for very efficient retrieval
        • Real-time updating can be relaxed
        • Large-scale distribution is required

  24. NoSQL approaches
      • Key–value stores
        E.g. DynamoDB (Amazon)
      • Column stores
        E.g. BigTable (Google), Cassandra (Apache)
      • Graph databases
        E.g. Neo4j (Open-source, Java-based)
      • Document stores
        E.g. Native XML databases

  25. End of slides – Remember also the exercises!
