Database Management Character, file, field, record, database???
What’s “File Processing”? • The “old” way of doing things; still often used in practice. • Separate information stored on separate files.
File Processing Example: • Sales: knows how many of Products A, B, and C have been sold. File stores Prod. Name, Production Schedule, and Sales. • Production: knows how much of Products A, B, and C have been produced. File stores Prod. Name, Production Schedule, and Number Produced. • Marketing: knows the price of Products A, B, and C. File stores Prod. Name and Product Price.
Any problems here? • Duplication (redundancy). • Inconsistency. • Does anyone know how much money we made? No integration. • Set format. Data dependence. Y2K!!
Database Management: Database Management System (DBMS) • Provides one integrated repository for data to be stored and queried. • Standards for data can be defined and enforced. • Reports and queries are easy(er). • SQL, etc.
Database Management Ex.: [Figure: the Sales, Production, and Marketing application programs all go through the DBMS to a single database that stores Prod. Name, Production Schedule, Sales, Number Produced, and Product Price]
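To make the integration point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `products`, its column names, and every number in it are assumptions made up for illustration (Production Schedule is left out for brevity). It shows that once the figures kept by Sales, Production, and Marketing sit in one repository, the "how much money did we make?" question becomes a single query.

```python
import sqlite3

# Hypothetical data: product names, quantities, and prices are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        prod_name       TEXT PRIMARY KEY,
        number_sold     INTEGER,  -- formerly kept only in the Sales file
        number_produced INTEGER,  -- formerly kept only in the Production file
        product_price   REAL      -- formerly kept only in the Marketing file
    )
    """
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [("A", 10, 12, 5.0), ("B", 4, 9, 20.0), ("C", 0, 3, 7.5)],
)

# One query combines what used to live in separate departmental files.
(revenue,) = conn.execute(
    "SELECT SUM(number_sold * product_price) FROM products"
).fetchone()
print(revenue)
```

With separate files, the same answer would need someone to match Sales records against Marketing prices by hand.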
DATABASE MANAGEMENT SYSTEMS • Four components of a DBMS
BUT... • Expensive. • Difficult. • Slow / inefficient.
Another Look (thanks to John Gallaugher, Boston College) • Database: a collection of related data. Usually organized according to topics: e.g. customer info, products, transactions. • Database Management System (DBMS): a program for creating & managing databases; ex. Oracle, MS-Access, Sybase. [Figure: the Client makes requests of the DBMS server; the Server responds to client requests; the DBMS is the program that manages interaction with databases; the database is the collection of data, created and defined to meet the needs of the organization. A request flows from client to server and a response flows back.]
A Simple Database • File/Table • Customers • Field/Column • 5 shown: CUSTID, FIRST, LAST, CITY, STATE • Record/Row • 5 shown: one for each customer
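The table/field/record vocabulary can be sketched with Python's built-in sqlite3 module. The column names come from the slide; the two customer rows are hypothetical, since the slide's own five rows are not reproduced in this transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# File/Table: Customers. Field/Column: the five names listed on the slide.
# FIRST and LAST are quoted because they are also SQL keywords.
conn.execute(
    'CREATE TABLE Customers ('
    'CUSTID INTEGER PRIMARY KEY, "FIRST" TEXT, "LAST" TEXT, CITY TEXT, STATE TEXT)'
)

# Record/Row: one tuple per customer (hypothetical values).
customers = [
    (1, "Ann", "Lee", "Boston", "MA"),
    (2, "Raj", "Patel", "Austin", "TX"),
]
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?, ?)", customers)

cursor = conn.execute("SELECT * FROM Customers ORDER BY CUSTID")
field_names = [col[0] for col in cursor.description]  # the fields/columns
records = cursor.fetchall()                           # the records/rows

print(field_names)
for record in records:
    print(record)
```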
A More Complex Example • Entry & Maintenance is complicated • redundant data exists, increases chance of error, complicates updates/changes, takes up space
Normalize Data: Remove Redundancy • Customer Table (the “one” side) is related to Transaction Table (the “many” side).
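A minimal sketch of the one-to-many split, again with Python's sqlite3 module. The table and column names (`customers`, `transactions`, `custid`, and so on) and all values are assumptions for illustration. Customer details are stored once, each transaction carries only the customer's key, and the DBMS can refuse a transaction that points at no customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces keys only when asked

# "One" side: each customer is stored exactly once.
conn.execute(
    "CREATE TABLE customers (custid INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
# "Many" side: each transaction refers to its customer by key only.
conn.execute(
    """
    CREATE TABLE transactions (
        transid INTEGER PRIMARY KEY,
        custid  INTEGER NOT NULL REFERENCES customers(custid),
        amount  REAL
    )
    """
)

conn.execute("INSERT INTO customers VALUES (1, 'Ann Lee', 'Boston')")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(100, 1, 25.0), (101, 1, 40.0)],
)

# A change of city is made in one place, however many transactions exist.
conn.execute("UPDATE customers SET city = 'Chicago' WHERE custid = 1")
city_rows = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city = 'Chicago'"
).fetchone()[0]

# A transaction for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO transactions VALUES (102, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(city_rows, orphan_rejected)
```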
Key Terms • Relational DBMS • manages databases as a collection of files/tables in which all data relationships are represented by common values in related tables (referred to as keys). • a relational system has the flexibility to take multiple files and generate a new file from the records that meet the matching criteria (join). • SQL - Structured Query Language • Most popular relational database standard. Includes a language for creating & manipulating data.
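The join described above can be sketched as follows, using Python's sqlite3 module. The tables, the shared `custid` key, and the rows are hypothetical. The query builds a new result set from the records in two tables whose key values match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (custid INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE transactions (
        transid INTEGER PRIMARY KEY, custid INTEGER, amount REAL
    );
    INSERT INTO customers VALUES (1, 'Lee'), (2, 'Patel');
    INSERT INTO transactions VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.0);
    """
)

# The relationship is carried only by the common value in custid (the key).
joined = conn.execute(
    """
    SELECT c.last_name, t.transid, t.amount
    FROM customers AS c
    JOIN transactions AS t ON t.custid = c.custid
    ORDER BY t.transid
    """
).fetchall()

for row in joined:
    print(row)
```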
Using SQL for Querying • SQL (Structured Query Language) • Data language • English-like, nonprocedural, very user-friendly language • Free format • Example: SELECT Name, Salary FROM Employees WHERE Salary > 2000
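The example query can be tried with Python's built-in sqlite3 module. The `Employees` table layout and its three rows are assumptions invented for this sketch; only the SELECT statement comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary INTEGER)")

# Hypothetical employees, chosen to sit above, below, and exactly on the cutoff.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?)",
    [("Ann", 2500), ("Raj", 1800), ("Mia", 2000)],
)

# Nonprocedural: the query states what is wanted, not how to find it.
high_earners = conn.execute(
    "SELECT Name, Salary FROM Employees WHERE Salary > 2000"
).fetchall()

print(high_earners)
```

Because the comparison is strictly greater than, an employee earning exactly 2000 is not returned.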
THE VALUE OF QUALITY INFORMATION • Five common characteristics of high-quality information
THE VALUE OF QUALITY INFORMATION • Low-quality information example
THE VALUE OF QUALITY INFORMATION • The four primary sources of low-quality information include: • Online customers intentionally enter inaccurate information to protect their privacy • Information from different systems that have different information entry standards and formats • Call center operators enter abbreviated or erroneous information by accident or to save time • Third party and external information contains inconsistencies, inaccuracies, and errors
Understanding the Costs of Low-quality Information • Potential business effects resulting from low-quality information • Inability to accurately track customers • Difficulty identifying valuable customers • Inability to identify selling opportunities • Marketing to nonexistent customers • Difficulty tracking revenue due to inaccurate invoices • Inability to build strong customer relationships – which increases buyer power
Structures • Hierarchical: The old way. “Tree”. • Access elements by moving down tree. • One-to-many. • Network: Criss-cross patterns. • Many-to-many. • Relational: a common element relates “tables” to one another. Permits “ad hoc”. • Object-oriented: “objects” have data, processes, and properties “encapsulated” in them.
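Database Structures [Figure: three panels, Hierarchical Structure, Network Structure, and Relational Structure. The relational panel shows a relation between two tables through a common Dept column: one table lists Dept A, B, C; the other lists Empno and Dept as (1, A), (2, B), (3, C).]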
Pros and Cons [Figure: Obj., Relat., Net., and Hier. structures placed along two axes, Ad Hoc Flexibility and Speed]
Data Dictionaries: The Data Dictionary • A reference work of data about data (metadata) compiled by the systems analyst to guide analysis and design. • As a document, the data dictionary collects, coordinates, and confirms the meaning of data terms to various users throughout the organization. Uses of the Data Dictionary • Documentation, elimination of data redundancy • Validate the data flow diagram for completeness and accuracy • Provide a starting point for developing screens and reports • Determine contents of data stored in files • Develop the logic for data flow diagram processes
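As a rough illustration of "data about data", the sketch below reads a table's metadata back out of SQLite with `PRAGMA table_info` and arranges it as dictionary entries. This is only a starting point under stated assumptions: the `Customers` table is reused from the earlier slide, the entry layout (`table`, `field`, `type`, `key`) is invented here, and a real data dictionary compiled by an analyst would also record meanings, owners, and allowed values that no DBMS catalog can supply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Customers ('
    'CUSTID INTEGER PRIMARY KEY, "FIRST" TEXT, "LAST" TEXT, CITY TEXT, STATE TEXT)'
)

# PRAGMA table_info yields one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position)
data_dictionary = [
    {"table": "Customers", "field": name, "type": col_type, "key": bool(pk)}
    for _cid, name, col_type, _notnull, _default, pk in conn.execute(
        "PRAGMA table_info(Customers)"
    )
]

for entry in data_dictionary:
    print(entry)
```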
Data Flow Diagrams (“DFD”) • Symbols: Data Flow • Process • File or Data Store • Source or Entity
DFD Example: Apartment Rental [Figure: data flow diagram. Processes: 1 Lease Process, 2 Payments Collection Process, 3 Delinquent Process. Data store: D1 Tenant File. Sources/entities: New Tenant, Tenant, Bank, Ext. Mgr. Data flows: Tenant Info, Copy of lease, Receipt, Bank Deposit, Notice, Cash Report, Unpaid Charges, Delinquencies, Delinquency Report.]
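Entity Relationship Diagrams [Figure: entity relationship diagram with entities Dept., Employee, and Project; attribute labels Name, Title, Address, Deadline, and Resources; a “works on” relationship; and a legend of cardinality symbols for “zero”, “one”, and “many”]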
New Names, Same Ideas • Data Mining, OLAP • Data Warehousing
Data Mining • automated information discovery process, uncovers important patterns in existing data • can use neural networks or other approaches. Requires ‘clean’, reliable, consistent data. Historical data must reflect the current environment. • e.g. “What are the characteristics that identify when we are likely to lose a customer?” • OLAP is user-driven discovery
Warehouses & Marts • Data Warehouse • a database designed to support decision-making in an organization. It is batch-updated and structured for fast online queries and exploration. Data warehouses may aggregate enormous amounts of data from many different operational systems. • Data Mart • a database focused on addressing the concerns of a specific problem or business unit (e.g. Marketing, Engineering). Size doesn’t define data marts, but they tend to be smaller than data warehouses.
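Data Warehouses & Data Marts [Figure: 3rd party data and TPS & other operational systems feed the Data Warehouse, which is shown alongside Data Mart (Marketing) and Data Mart (Engineering). Legend: one symbol marks operational clients, the other marks query, OLAP, mining, etc. clients.]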