550 likes | 681 Views
CHAPTER 6. DATABASES AND DATA WAREHOUSES Opening Case Searching for Revenue - Google. Chapter Six Overview. SECTION 6.1 – DATABASE FUNDAMENTALS Understanding Information Database Fundamentals Database Advantages Relational Database Fundamentals Database Management Systems
E N D
CHAPTER 6 DATABASES AND DATA WAREHOUSES Opening Case Searching for Revenue - Google
Chapter Six Overview • SECTION 6.1 – DATABASE FUNDAMENTALS • Understanding Information • Database Fundamentals • Database Advantages • Relational Database Fundamentals • Database Management Systems • Integrating Data Among Multiple Databases • SECTION 6.2 – DATA WARAEHOUSE FUNDAMENTALS • Accessing Organizational Information • History of Data Warehousing • Data Warehouse Fundamentals • Business Intelligence • Data Mining
SECTION 6.1 DATABASE FUNDAMENTALS
UNDERSTANDING INFORMATION • Information is everywhere in an organization • Employees must be able to obtain and analyze the many different levels, formats, and granularities of organizational information to make decisions • Successfully collecting, compiling, sorting, and analyzing information can provide tremendous insight into how an organization is performing
UNDERSTANDING INFORMATION • Information granularity – refers to the extent of detail within the information (fine and detailed or coarse and abstract) • Levels • Formats • Granularities
Information Quality • Business decisions are only as good as the quality of the information used to make the decisions • Characteristics of high quality information include: • Accuracy • Completeness • Consistency • Uniqueness • Timeliness
Information Quality • Low quality information example
Understanding the Costs of Poor Information • The four primary sources of low quality information include: • Online customers intentionally enter inaccurate information to protect their privacy • Information from different systems have different entry standards and formats • Call center operators enter abbreviated or erroneous information by accident or to save time • Third party and external information contains inconsistencies, inaccuracies, and errors
Understanding the Costs of Poor Information • Potential business effects resulting from low quality information include: • Inability to accurately track customers • Difficulty identifying valuable customers • Inability to identify selling opportunities • Marketing to nonexistent customers • Difficulty tracking revenue due to inaccurate invoices • Inability to build strong customer relationships
Understanding the Benefits of Good Information • High quality information can significantly improve the chances of making a good decision • Good decisions can directly impact an organization's bottom line
DATABASE FUNDAMENTALS • Information is everywhere in an organization • Information is stored in databases • Database – maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
DATABASE FUNDAMENTALS • Database models include: • Hierarchical database model – information is organized into a tree-like structure (using parent/child relationships) in such a way that it cannot have too many relationships • Network database model – a flexible way of representing objects and their relationships • Relational database model – stores information in the form of logically related two-dimensional tables
DATABASE ADVANTAGES • Database advantages from a business perspective include • Increased flexibility • Increased scalability and performance • Reduced information redundancy • Increased information integrity (quality) • Increased information security
Increased Flexibility • A well-designed database should: • Handle changes quickly and easily • Provide users with different views • Have only one physical view • Physical view – deals with the physical storage of information on a storage device • Have multiple logical views • Logical view – focuses on how users logically access information
Increased Scalability and Performance • A database must scale to meet increased demand, while maintaining acceptable performance levels • Scalability – refers to how well a system can adapt to increased demands • Performance – measures how quickly a system performs a certain process or transaction
Reduced Redundancy • Databases reduce information redundancy • Redundancy – the duplication of information or storing the same information in multiple places • Inconsistency is one of the primary problems with redundant information
Increased Integrity (Quality) • Information integrity – measures the quality of information • Integrity constraint – rules that help ensure the quality of information • Relational integrity constraint – rule that enforces basic and fundamental information-based constraints • Business-critical integrity constraint – rule that enforce business rules vital to an organization’s success and often require more insight and knowledge than relational integrity constraints
Increased Security • Information is an organizational asset and must be protected • Databases offer several security features including: • Password – provides authentication of the user • Accesslevel – determines who has access to the different types of information • Accesscontrol – determines types of user access, such as read-only access
RELATIONAL DATABASE FUNDAMENTALS • Entity – a person, place, thing, transaction, or event about which information is stored • The rows in each table contain the entities • In Figure 6.5 CUSTOMER includes Dave’s Sub Shop and Pizza Palace entities • Entity class (table) – a collection of similar entities • In Figure 6.5 CUSTOMER, ORDER, ORDER LINE, DISTRIBUTOR, and PRODUCT entity classes
RELATIONAL DATABASE FUNDAMENTALS • Attributes (fields, columns) – characteristics or properties of an entity class • The columns in each table contain the attributes • In Figure 6.5 attributes for CUSTOMER include: • Customer ID • Customer Name • Contact Name • Phone
RELATIONAL DATABASE FUNDAMENTALS • Primary keys and foreign keys identify the various entity classes (tables) in the database • Primary key – a field (or group of fields) that uniquely identifies a given entity in a table • Foreign key – a primary key of one table that appears an attribute in another table and acts to provide a logical relationship among the two tables
DATABASE MANAGEMENT SYSTEMS • Database management systems (DBMS) – software through which users and application programs interact with a database
DATABASE MANAGEMENT SYSTEMS • Four components of a DBMS
Data Definition Component • Data definition component – creates and maintains the data dictionary and the structure of the database • The data definition component includes the data dictionary • Data dictionary – a file that stores definitions of information types, identifies the primary and foreign keys, and maintains the relationships among the tables
Data Definition Component • Data dictionary essentially defines the logical properties of the information that the database contains
Data Manipulation Component • Data manipulation component – allows users to create, read, update, and delete information in a database • A DBMS contains several data manipulation tools: • View – allows users to see, change, sort, and query the database content • Reportgenerator – users can define report formats • Query-by-example (QBE) – users can graphically design the answers to specific questions • Structured query language (SQL) – query language
Data Manipulation Component • Sample report using Microsoft Access Report Generator
Data Manipulation Component • Sample report using Access Query-By-Example (QBE) tool
Data Manipulation Component • Results from the query in Figure 6.10
Data Manipulation Component • SQL version of the QBE Query in Figure 6.10
Application Generation and Data Administration Components • Application generation component – includes tools for creating visually appealing and easy-to-use applications • Data administration component – provides tools for managing the overall database environment by providing faculties for backup, recovery, security, and performance • IT specialists primarily use these components
INTEGRATING DATA AMONG MULTIPLE DATABASES • Integration – allows separate systems to communicate directly with each other • Forward integration – takes information entered into a given system and sends it automatically to all downstream systems and processes • Backward integration – takes information entered into a given system and sends it automatically to all upstream systems and processes
INTEGRATING DATAAMONG MULTIPLE DATABASES • Forward and backward integration
INTEGRATING DATAAMONG MULTIPLE DATABASES • Building a central repository specifically for integrated information
SECTION 6.2 DATA WAREHOUSE FUNDAMENTALS
HISTORY OF DATA WAREHOUSING • Data warehouses extend the transformation of data into information • In the 1990’s executives became less concerned with the day-to-day business operations and more concerned with overall business functions • The data warehouse provided the ability to support decision making without disrupting the day-to-day operations
DATA WAREHOUSE FUNDAMENTALS • Data warehouse – a logical collection of information – gathered from many different operational databases – that supports business analysis activities and decision-making tasks • The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes
DATA WAREHOUSE FUNDAMENTALS • Extraction, transformation, and loading (ETL) – a process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse • Data mart – contains a subset of data warehouse information
Multidimensional Analysis • Databases contain information in a series of two-dimensional tables • In a data warehouse and data mart, information is multidimensional, it contains layers of columns and rows • Dimension – a particular attribute of information
Multidimensional Analysis • Cube – common term for the representation of multidimensional information
Multidimensional Analysis • Data mining – the process of analyzing data to extract information not offered by the raw data alone • To perform data mining users need data-mining tools • Data-mining tool – uses a variety of techniques to find patterns and relationships in large volumes of information and infers rules that predict future behavior and guide decision making
Information Cleansing or Scrubbing • An organization must maintain high-quality data in the data warehouse • Information cleansing or scrubbing – a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
Information Cleansing or Scrubbing • Contact information in an operational system
Information Cleansing or Scrubbing • Standardizing Customer name from Operational Systems
Information Cleansing or Scrubbing • Accurate and complete information
BUSINESS INTELLIGENCE • Business intelligence – information that people use to support their decision-making efforts • Principle BI enablers include: • Technology • People • Culture
DATA MINING • Data-mining software includes many forms of AI such as neural networks and expert systems