E N D
Preview • Businesses of every size organize data records into collections called databases. At one extreme, small businesses use databases to keep track of customers; at the other extreme, huge corporations such as Dell and Amazon.com use databases to support complex sales, marketing, and operations activities. In between, we have businesses like FlexTime that use databases as a crucial part of their operations, but they don’t have a trained and experienced staff to manage and support the databases. To obtain answers to the one-of-a-kind queries he needs, Neil needs to be creative and adaptable in the way that he accesses and uses his database. • Users have a crucial role in the development of database applications. Specifically, the structure and content of the database depends entirely on how users view their business activity. To build the database, the developers will create a model of that view using a tool called the entity-relationship model. You need to understand how to interpret such models, because the development team might ask you to validate the correctness of such a model when building a system for your use. Finally, we describe the various database administration tasks.
What Is the Purpose of a Database? • Purpose: to keep track of things • If structure of a list is simple, i.e., one theme, no need to use database technology (video)
General Rule • Lists of data involving a single theme can be stored in a spreadsheet. • Lists that involve data with multiple themes require a database.
Database • Database: • A self-describing collection of integrated records • In databases, bytes are grouped into columns, such as Student Number and Student Name. Columns are also called fields. Columns or fields, in turn, are grouped into rows, which are also called records.
Metadata Describes Structure of Database Components of a Database
What Are Relationships Among Rows? First row of the Email Table is related to Andrea Baker in Student Table Last row in Office_Visit Table related to Adam Verberra in Student Table
Relationship Special Terms • Key – Primary Key • A column or group of columns that identifies a unique row in a table. • Student Number is the key of the Student table. Given a value of Student Number, you can determine one and only one row in Student. Only one student has the number 1325. • Every table must have a key. • Sometimes more than one column is needed to form a unique identifier. In a table called City, for example, the key would consist of combination of columns (City, State). • Email_Num is the key of Email Table. • VisitID is the key of Office_Visit Table.
Relationship Special Terms • Foreign keys • These are primary keys of a different (foreign) table than the table in which they reside. • Relational databases • Relationships among tables are created by using foreign keys. • Relation • Formal name for a table
Metadata • Database: • A database is a self-describing collection of integrated records. • Metadata • Data that describe data
Components of a Database Application System • Applications make database data more accessible and useful. • Users employ a database application that consists of forms, formatted reports, queries, and application programs. • Database management system (DBMS) processes database tables for applications.
What Is a Database Management System (DBMS)? • DBMS • A program (software) used to create, process, and administer a database • Companies license DBMS products from vendors: • IBM, Microsoft, Oracle, and others • Popular DBMS products are: • DB2 from IBM • Access and SQL Server from Microsoft • Oracle from the Oracle Corporation • MySQL—an open-source DBMS product that is license-free for most applications
Creating the Database and Its Structures • Database developers use the DBMS to create and modify tables, relationships, and other structures in the database. • Below, the developer has added a new column called Response?. This new column has data type Yes/No.
Processing the Database • DBMS operations • Read, insert, modify, delete data • Applications call DBMS in different ways • From a form, when the user enters new or changed data, a computer program behind the form calls the DBMS to make the necessary database changes. • From an application program, the program calls the DBMS directly to make the change.
Structured Query Language (SQL) • SQL—“see-quell” • International standard language for creating databases and database structures, and processing databases • All five of the most popular DBMS products accept and process SQL. • Following SQL statement inserts a new row into the Student table: INSERT INTO Student ([Student Number], [Student Name], HW1, HW2, MidTerm) VALUES (1000, ’Franklin, Benjamin’, 90, 95, 100);
Administering the Database • DBMS provides tools to assist in administration of the database. • Used to set up a security system involving user accounts, passwords, permissions, and limits for processing the database • Backing up database data, adding structures to improve performance of database applications, removing data no longer wanted or needed, and similar tasks • Most organizations dedicate one or more employees to the role of database administration.
What Are Forms, Reports, and Queries? Reports show data in a structured context.
What Are Forms, Reports, and Queries? Sample query form used to enter phrase for search Sample query results of query operation
Why Are Database Application Programs Needed? • Forms, reports, and queries work well for standard functions. However, most applications have unique requirements that a simple form, report, or query cannot meet. • Application programs process logic that is specific to a given business need. • Application programs serve as an intermediary between the Web server and database. • Responds to events, such as when a user presses a submit button; also reads, inserts, modifies, and deletes database data
Four Database Application Programs Running on a Web Server Computer
Multi-User Processing Problem • Lost-update problem • Process A reads a customer record from a file containing account information, including the customer’s account balance and phone number. • Process B now reads the same record from the same file so it has its own copy. • Process A changes the account balance in its copy of the customer record and writes the record back to the file. • Process B—which still has the original stale value for the account balance in its copy of the customer record—updates the customer’s phone number and writes the customer record back to the file. • Process B has now written its stale account balance value to the file, causing the changes made by process A to be lost. (Source: http://en.wikipedia.org/wiki/File_locking)
Enterprise DBMS vs. Personal DBMS • Enterprise DBMS • Process large organizational and workgroup databases • Support many, possibly thousands, of users and many different database applications • Support 24/7 operations and can manage databases that span dozens of different magnetic disks with hundreds of gigabytes or more of data • IBM’s DB2, Microsoft’s SQL Server, and Oracle’s Oracle are examples of enterprise DBMS products. • Personal DBMS • Designed for smaller, simpler database applications • Used for personal or small workgroup applications that involve fewer than 100 users (normally fewer than 15), single user
Access: A DBMS and an Application Development Product Before building a database, developers construct a logical representation of database data called a data model to describe the data and relationships to be stored in database.
Database Development Process (video link needed)
What Is the Entity-Relationship Data Model? • Entity-relationship (E-R) data model • A tool for constructing data models • Developers use it to describe the content of a data model by defining entities that will be stored in database and relationships among those entities • Unified Modeling Language (UML), less popular, tool for data modeling
Entities • Some thing that the users want to track • Examples of entities: • Order, Customer, Salesperson, and Item. Some entities represent a physical object, such as Item or Salesperson; others represent a logical construct or transaction, such as Order or Contract. • Entity names are always singular. • Attributes • Describe characteristics of an entity. • Examples: order attributes are OrderNumber, OrderDate, SubTotal, Tax, Total, and so forth. • Identifier • An attribute (or group of attributes) whose value is associated with one and only one entity instance.
Sample Relationship (Version 1) Crow’s Feet 1:N N:M 1:N = many-to-many relationships One department can have many advisers, but an adviser has at most one department. N:M = many-to-many relationships One adviser can have many students and one student can have many advisers.
Sample Relationships (Version 2) Advisers may advise in more than one department, but a student may have only one adviser, representing a policy that students may not have multiple majors.
Crow’s-Foot Diagram Version Maximum cardinality—maximum number of entities that can be involved in a relationship. Vertical bar on a line means that at least one entity of that type is required. Minimum cardinality—minimum number of entities that can be involved in a relationship. Small oval means that the entity is optional; the relationship need not have an entity of that type.
Database Design • Database design is the process of converting a data model into tables, relationships, and data constraints. • Database design team transforms entities into tables and expresses relationships by defining foreign keys. • Two important database design concepts: normalization and the representation of two kinds of relationships. • Normalization is a foundation of database design. • Representation of relationships will help you understand important design considerations.
Normalization Normalization is the process of converting a poorly structured table into two or more well-structured tables. Problem with these tables, have two independent themes: employees and departments.
Data Integrity Problems • In previous figure, some rows show Dept. 100 is “Accounting and Finance” and others show Dept. 100 is “Accounting.” Which one is correct? • A table with data integrity problems will produce incorrect results and inconsistent information. • Data integrity problems happen when data are duplicated. • Users will lose confidence in the information, and system will develop a poor reputation. Information systems with poor reputations become serious burdens to the organizations that use them.
Normalizing for Data Integrity • Normalized tables eliminate data duplication, but they can be slower to process. • General goal of normalization is to construct tables such that every table has a single topic or theme.
Normalizing for Data Integrity • The way to correct the problem is to split the table into two tables, each with its own theme.
Summary of Normalization • Database practitioners classify tables into various normal forms according to the kinds of problems they have. • Transforming a table into a normal form to remove duplicated data and other problems is called normalizing the table. • Normalization is just one criterion for evaluating database designs. Normalized designs can be slower to process, database designers sometimes choose to accept nonnormalized tables. The best design depends on the users’ processing requirements.
Data Aggregators • Laws that limit the data that federal and other governmental agencies can acquire and store. • Some legal safeguards on data maintained by credit bureaus and medical facilities. • No such laws that limit data storage by most companies (nor are there laws that prohibit governmental agencies from buying results from companies like Acxiom.
How Will this Change by 2020? • Absent any public outcry for legislation to limit such activity, aggregator data storage will continue to grow exponentially and companies will have even more data about you, the state of your health, your wealth, your purchase habits, your family, your travel, your driving record, and, well, anything you do. • Query, reporting, and data mining technology will improve and Moore’s law will make computer operations that are too slow to be practical today, feasible tomorrow. • The picture of you will become more and more detailed.
Why Do You Care? • Data could be stolen and used for criminal activity against you. • Data might not be accurate. • No organization is required by law to tell you the data that it stores about you and what it does with it.
What If… • You enroll in a “healthy eaters” medical insurance program, similar to “safe drivers” auto insurance. Your premiums are lower because you eat well, except that the insurance company notes from last month’s data that you bought four large packages of Cheetos, and your health insurance premium is increased, automatically. You have no idea why. • Could this actually happen – or something like this?