Advanced Database Applications: CS562 -- Fall 2011
George Kollios
Boston University
Prof. George Kollios
Office: MCS 288
Office Hours: Monday 2:30pm-4:00pm, Thursday 11:00am-12:30pm
Web: http://www.cs.bu.edu/faculty/gkollios/ada11
History of Database Technology
• 1960s: Data collection, database creation, IMS and network DBMS
• 1970s: Relational data model, relational DBMS implementation
• 1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)
• 1990s-2000s: Data mining and data warehousing, multimedia databases
• 2010s-: Data on the cloud, privacy, security. Social network data (Facebook, Twitter, etc.), Web 3.0 and more
Structure of an RDBMS
• A DBMS is an OS for data!
• A typical RDBMS has a layered architecture (top to bottom):
  • Query Optimization and Execution
  • Relational Operators
  • Files and Access Methods
  • Buffer Management
  • Disk Space Management
  • DB
• Modern database systems extend these layers
Index Methods for RDBMS
• Hashing methods:
  • Linear Hashing, Extensible Hashing
• B-tree family:
  • B+-trees and variations
• Both of them are one-dimensional
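To illustrate what "one-dimensional" means in practice, here is a minimal sketch that uses a sorted Python list with `bisect` as a stand-in for a B+-tree on a single key. The point data and the `range_query` function are hypothetical and not part of the course material. The index answers the range on x directly, but the condition on y has to be checked by scanning every candidate.

```python
import bisect

# Hypothetical 2-D points (x, y); the "index" orders them by x only,
# mimicking a B+-tree built on a single attribute.
points = sorted([(1, 9), (3, 2), (4, 7), (6, 1), (8, 5), (9, 3)])
xs = [p[0] for p in points]

def range_query(x_lo, x_hi, y_lo, y_hi):
    """Return (results, candidates_scanned) for a 2-D window query."""
    # The one-dimensional index narrows the search on x ...
    start = bisect.bisect_left(xs, x_lo)
    end = bisect.bisect_right(xs, x_hi)
    candidates = points[start:end]
    # ... but the y predicate must be tested on every candidate.
    results = [p for p in candidates if y_lo <= p[1] <= y_hi]
    return results, len(candidates)

results, scanned = range_query(3, 8, 4, 8)
print(results, scanned)
```

Four candidates are scanned to return two points. Avoiding this extra work is one motivation for the multi-dimensional access methods used in spatial databases.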
Overview of the course
• Spatial Database Systems
  • GIS, CAD/CAM, NASA's EOSDIS project
  • Manage points, lines and regions
• Temporal Database Systems
  • Billing, medical records
• Spatio-temporal Databases
  • Moving objects, changing regions, etc.
Overview of the course
• Multimedia databases
  • A multimedia system can store and retrieve objects/documents with text, voice, images, video clips, etc.
• Time series databases
  • Stock market, ECG, trajectories, etc.
Multimedia databases
• Applications:
  • Digital libraries, entertainment, office automation
  • Medical imaging: digitized X-rays and MRI images (2- and 3-dimensional)
• Query by content (or QBE, query by example) should be:
  • Efficient
  • ‘Complete’ (no false dismissals)
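One common way to get both efficiency and completeness is filter-and-refine with a lower-bounding distance. The slide does not name this technique, so treat the following as an illustrative sketch under that assumption. The feature vectors, the function names and the choice of "first k dimensions" as the cheap distance are all hypothetical. Because the cheap distance never exceeds the exact one, the filter step can produce false alarms but no false dismissals.

```python
import math

def full_distance(a, b):
    """Exact Euclidean distance over all dimensions (the expensive step)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cheap_distance(a, b, k=2):
    """Euclidean distance over the first k dimensions only.

    Dropping non-negative terms can only shrink the sum, so this value
    never exceeds full_distance(a, b).
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:k], b[:k])))

def range_search(query, objects, eps, k=2):
    """Return objects within eps of query, plus the number of exact checks."""
    # Filter: discard only objects whose lower bound already exceeds eps.
    candidates = [o for o in objects if cheap_distance(query, o, k) <= eps]
    # Refine: remove false alarms with the exact distance.
    answers = [o for o in candidates if full_distance(query, o) <= eps]
    return answers, len(candidates)

# Hypothetical 4-dimensional feature vectors.
objects = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 5, 5), (9, 9, 0, 0)]
answers, checked = range_search((0, 0, 0, 0), objects, eps=2.5)
print(answers, checked)
```

The filter keeps three of the four objects, and the refine step removes the one false alarm, so the answer matches an exhaustive scan.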
Database Outsourcing
• Owner(s): publish database
• Servers: host database and provide query services
• Clients: query the owner’s database through servers
• Security issues: untrusted or compromised servers
[H. Hacigumus, B. R. Iyer, and S. Mehrotra, ICDE02]
Security Issues
• Query authentication and verification
• Data privacy and confidentiality
• Access control
Databases on the Cloud
• Cloud computing is a new trend
• Data are stored “in the cloud” and accessed from everywhere
• System should maximize utility, minimize response time
• Use of large clusters (data centers)
• MapReduce
Semantic Web: A lot of data on the web…
• There is a lot of data on the web…
• Need to make it more accessible and useful
• Machines should understand some of the semantics of the web data
• Semantic Web: “a web of data that can be processed directly and indirectly by machines.” (Tim Berners-Lee)
Semantic Web
• From document sharing to data sharing
• Issues/Challenges:
  • Vastness: more than 24B pages
  • Vagueness and uncertainty: meaning of “young”, “cheap”, “close”, etc.
  • Inconsistency: contradictions in data and semantics
  • Deceit: a user may want to mislead or deceive
Probabilistic (or Uncertain) Databases
• Another approach to model many real-world applications
• Data records are probabilistic or uncertain
• Need to formally model and query (correctly and efficiently) => Prob DBs
What is a Probabilistic Database?
• “An item belongs to the database” is a probabilistic event
  • Tuple-existence uncertainty
  • Attribute-value uncertainty
• “A tuple is an answer to the query” is a probabilistic event
Two Types of Probabilistic Data
• Database is deterministic, query answers are probabilistic
  • E.g., IR-style/“fuzzy-match” queries
  • Approximate query answers
• Database is probabilistic, query answers are probabilistic
Prob DB Models
• The database is a probability distribution over possible instances of (deterministic) databases
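This possible-worlds view can be written down as follows. The symbols W (a set of deterministic instances), P, Q and t are notation introduced here for illustration and do not come from the slides.

```latex
% A probabilistic database assigns a probability to each deterministic instance W
\[
  P : \mathcal{W} \to [0,1], \qquad \sum_{W \in \mathcal{W}} P(W) = 1
\]
% The probability that tuple t answers query Q sums over the worlds where it does
\[
  \Pr[\, t \in Q \,] \;=\; \sum_{W \in \mathcal{W} \,:\, t \in Q(W)} P(W)
\]
```

The second formula is what makes “a tuple is an answer to the query” a probabilistic event: the query is evaluated in each possible world, and the probabilities of the worlds that return t are added up.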
Example: x-relations [Trio]
• Each x-tuple represents a discrete probability distribution over tuples (its alternatives)
• x-tuples are mutually independent; the alternatives within an x-tuple are disjoint (mutually exclusive)
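The x-relation model can be made concrete by enumerating possible worlds. The sketch below is a toy illustration, not Trio's implementation or API. The relation contents and the names `possible_worlds` and `answer_probability` are hypothetical. It also assumes that when an x-tuple's probabilities sum to less than 1, the remainder is the probability that the x-tuple contributes no tuple.

```python
from itertools import product

# Hypothetical x-relation: each x-tuple lists mutually exclusive alternatives
# as (tuple, probability). If the probabilities sum to less than 1, the
# remainder is the chance that the x-tuple contributes no tuple at all.
x_relation = [
    [(("Amy", "Honda"), 0.6), (("Amy", "Toyota"), 0.4)],
    [(("Bob", "Honda"), 0.7)],
]

def possible_worlds(x_rel):
    """Yield (world, probability) pairs, assuming independent x-tuples."""
    choices = []
    for alternatives in x_rel:
        options = list(alternatives)
        missing = 1.0 - sum(p for _, p in alternatives)
        if missing > 1e-12:
            options.append((None, missing))  # None: x-tuple is absent
        choices.append(options)
    # Independence: pick one option per x-tuple and multiply probabilities.
    for combo in product(*choices):
        world = frozenset(t for t, _ in combo if t is not None)
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob

def answer_probability(x_rel, predicate):
    """Probability that at least one tuple in the world satisfies predicate."""
    return sum(p for w, p in possible_worlds(x_rel)
               if any(predicate(t) for t in w))

worlds = list(possible_worlds(x_relation))
p_honda = answer_probability(x_relation, lambda t: t[1] == "Honda")
print(len(worlds), round(p_honda, 2))
```

Enumeration grows exponentially with the number of x-tuples, so it only serves to show the semantics. Querying such data efficiently is the hard part.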
Back to reality…
• Grading:
  • 4 homeworks: 0.2
  • 1 term project: 0.3
    • You need to talk to me and get a problem
    • Project proposal due in a couple of weeks
  • Midterm: probably on Oct 26, in class: 0.2
  • Final: Dec 20 at 12:30pm (?): 0.3