220 likes | 403 Views
Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D. Database Systems {week 01}. What is a database?. What is a Database ? A collection of organized information that persists over a long period of time
E N D
Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D. Database Systems{week 01}
What is a database? • What is a Database? • A collection of organized informationthat persists over a long period of time • Such information (or data) is managedby a DataBase Management System • DBMS: • Software used to create and manage large amounts of data efficiently and securely
DBMS characteristics • A typical DBMS is expected to: • Provide a means by which users can create databases and specify their schemas • Give users the ability to query the database(and efficiently return results) • Store huge amounts of data • Support durability and reliability, recovering from failures and errors • Control user access to data (i.e. security)
Data modeling • Database schemas abstract elements of the real world; to do so, we use a data model • A data model describes: • Structure of the data • Operations on the data (reads and writes) • Constraints on the data
Take the ACID test (if you dare) • Transactional (and other) databases must meet the ACID test: • Atomicity: all-or-nothing execution • Consistency: relationships between data elements must adhere to defined constraints • Isolation: each transaction appears as if it occurs without other database activity • Durability: the data itself is durable in the sense that the data must never be lost
Relational data modeling • A relation is a two-dimensional data structure that consists of a set of attributes and (zero or more) tuples or rows of data • Each attribute takes only simple values • i.e. strings, numbers, boolean values, dates, etc.
Relational data modeling • The relation schema consists of: • The name of the relation • The set of attributes • The name (and type) of each attribute • Other constraints • An example: Profs( firstname, lastname, office, nuttiness )
Relational data modeling • A relation contains a set of tuples • Each tuple contains values for all the attributes in the relation schema that are drawn from the domain of that attribute • Example tuples: possibly empty As a set, tuple order is not significant. ( 'Mark', 'Goldberg', 'AE 108', 6 ) ( 'Mukkai', 'Krishnamoorthy', 'Lally 305', 30 ) ( 'Sibel', 'Adali', 'Lally 313', 6 )
Keys • A key for a relation is a set of attributessuch that no pair of tuples has the same value for the key • Examples: • Social Security Number • RIN (Rensselaer ID Number) • First and last name (would this one work???) Given the key, we can query the relationand expect exactly one result (or zero!). Profs( firstname, lastname, office, nuttiness )
Primary keys • In practice, keys are used to improve efficiency of queries using such keys • And note that not all keys provide “uniqueness” • Since relations may have multiple keys,a primary key is selected • The primary key might be a separate(unused?) numeric field What would be the use of this?
Using SQL to store relations • To store a relation, we can use SQL to create a table in a relational database system • Example attribute (data) types include: • CHAR, VARCHAR, TEXT • BIT, INT, INTEGER, FLOAT, DOUBLE, REAL • DECIMAL, NUMERIC • DATE, DATETIME, TIMESTAMP • BLOB, MEDIUMBLOB, LONGBLOB
Table creation other attribute constraints might be included here create table tablename ( attribute1_name attribute1_type, attribute2_name attribute2_type, ... attributeN_nameattributeN_type, constraints ); might also have table options here
Example table creation create table student ( id int, name varchar(255), major char(4), gender char(1), dob date, constraint student_pk primary key (id) ); why did we specifythese attribute types? student_pk is an arbitrary name
Modifying table structure what’s the difference? • Removing a table from the schema: • Adding a new attribute to a table: • Removing an attribute from a table: drop table tablename; truncate table tablename; alter table tablename add attributenameattributetype; alter table tablename drop attributename;
Relational algebra and sets • Relational algebra consists of a set of simple operators that can be used to query the database • Each operator takes as input two relations and produces as output a relation • Think of a relation as a set of tuples • The input and output relations all musthave the same schema
Set operators • Given two relations R and S that havethe same schema, set operators include: • Union: • R S { tuples that are in R or S (or both) } • Intersection: • R S { tuples that are in both R andS } • Set difference: • R – S { tuples that are in R but not in S } remember that a set does not contain duplicates
Projection (attribute filtering) • The projection of a relation R onattributes A1, A2, ..., An is given by: • A1,...,An(R) = { t | t is a tuple in R and t only contains values for attributes A1, A2, ..., Aniff the schema of R contains attributes A1, A2, ..., An } • We use projection to remove existing attributes from R (by selecting a subset of them) duplicate tuples are omitted!
Selection • Find and select all tuples from relation Rthat satisfy some set of conditions • Forms the basis of querying a database • The selectionC (R) is based on Boolean condition C over attributes of relation R • Example conditions include: • A = e, A > e, A >= e, A < e, A <= e, A <> e • A1 = A2, A1 <> A2 • Any combination of conditions using AND, OR, NOT A, A1, and A2 are attributes e is a constant or expression
Selection • Selection selects a subset of tuples in relation R (with the schema unchanged) • C(R) = { t | t is a tuple in R and t satisfies the condition C on relation R } • Selection conditions can only refer toattributes in the given relation R • For conditions spanning multiple relations, we first must combine those relations (i.e. join)
Homework – Part One • Download and install both the Oracle and MySQL database packages noted on the course Web site
Homework – Part Two • Design a full schemato store informationabout celebrities,including: • Basic information • Relationships (e.g. marriages, flings, etc.) • Issues (e.g. drugs, affairs, addictions, etc.)