1 / 57

Databases and Data Warehouses: Supporting the Analytics-Driven Organization

Databases and Data Warehouses: Supporting the Analytics-Driven Organization. LEARNING OUTCOMES. List and describe the key characteristics of a relational database. Define the 5 software components of a DBMS. List and describe the key characteristics of a data warehouse.

sara-landry
Download Presentation

Databases and Data Warehouses: Supporting the Analytics-Driven Organization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Databases and Data Warehouses: Supporting the Analytics-Driven Organization

  2. LEARNING OUTCOMES List and describe the key characteristics of a relational database. Define the 5 software components of a DBMS. List and describe the key characteristics of a data warehouse. Define the 5 major types of data-mining tools. List key considerations in information ownership.

  3. INTRODUCTION • Businesses use many IT tools to manage and organize information • Online transaction processing (OLTP) – gathering and processing information and updating existing information to reflect the processed information • Online analytical processing (OLAP) – manipulation of information to support decision making

  4. INTRODUCTION • OLTP • Supports operational processing • Sales orders, accounts receivable, etc • Supported by operational databases & DBMSs • OLAP • Helps build business intelligence • Supported by data warehouses and data-mining tools

  5. OLTP, OLAP, and Business Intelligence

  6. How are OLTP and OLAP possible? Database Structures At its most basic level, a database is a collection of individual data items that are stored in a highly structured way that represents a model of reality. This collection of data items can be stored in a single location or hard drive or distributed across large networks with components in many physical locations. Regardless of where the data is physically stored, the database acts as a single entity to anyone or any system that makes use of it.

  7. What are some of the advantages to having databases? A database can help in a variety of ways including: Sharing: different departments share the same data Security: databases can be used to restrict access Less redundancy: multiple copies of files are stored in one database Better integrity: changes are made in one place, improving the accuracy and relevancy of the data

  8. Data Administration Subsystem • Backup and recovery – for backing up information and restarting (recovering) from a failure • Backup – copy of information on a computer • Recovery – process of reinstalling the backup information in the even the information was lost

  9. Data Administration Subsystem Security management – for CRUD access – create, read, update, and delete Query optimization – to minimize response times for large, complex queries Reorganization – for physically rearranging the structure of the information according to how you most often access it

  10. Database Characteristics Collections of information Created with logical structures Include logical ties within the information Include built-in integrity constraints

  11. Database

  12. Database

  13. Interoperability • Interoperability is technology dependent • Accuracy • Standardized • Dependability • Accessibility

  14. Interoperability The need for interoperability systems is evident in every organization.

  15. Interoperability • Interoperability requires: • Creation • Acceptance • Implementation of data

  16. Interoperability • Interoperability depends upon two important concepts: • Syntax • Semantics

  17. Categories of Standards

  18. Categories of Standards These standards are set by the organization who use them.

  19. Standards

  20. Standards Reality Use national standards Be informed leaders Build a collaborative culture Explore alternative financing models Use incentives to drive adoption Cultivate staff and knowledge

  21. Database Management System

  22. DATABASE MANAGEMENT SYSTEM TOOLS Database management system (DBMS) – helps you specify the logical requirements for a database and access and use the information in a database

  23. DBMS Engine • DBMS engine – accepts logical requests and converts them into the physical equivalents, and access the database and data dictionary on a storage device • Physical view – how information is physically arranged, stored, and accessed on a storage device • Logical view – how you need to arrange and access information to meet your needs

  24. 5 Components of a DBMS DBMS engine Data definition subsystem Data manipulation subsystem Application generation subsystem Data administration subsystem

  25. Types of DBMS Structures Hierarchical Database Network Database Relational Database Multidimensional Database Object-Oriented Database

  26. Relational Database

  27. RELATIONAL DATABASE MODEL • Database – collection of information that you organize and access according to the logical structure of the information • Relational database – series of logically related two-dimensional tables or files for storing information • Relation = table = file • Most popular database model

  28. Database – Collection of Information

  29. Database – Created with Logical Structures Before you can enter information into a database, you must define the data dictionary for all the tables and their fields. For example, when you create the Truck table, you must specify that it will have three pieces of information and that Date of Purchase is a field in Date format. Data dictionary – contains the logical structure for the information in a database

  30. Database – Logical Ties within the Information Customer Number is the primary key for Customer and appears in Order as a foreign key Primary key – field (or group of fields) that uniquely describes each record Foreign key – primary key of one file that appears in another file

  31. Database – Logical Ties within the Information

  32. Databases – Built-In Integrity Constraints • Integrity constraints – rules that help ensure the quality of information • Data dictionary, for example, defines type of information – numeric, date, and so on • Foreign keys – must be found as primary keys in another file • E.G., a Customer Number in the Order Table must also be present in the Customer Table

  33. Data Definition Subsystem • Data definition subsystem – helps you create and maintain the data dictionary and structure of the files in a database • The data dictionary helps you define… • Field names • Data types (numeric, etc) • Form (do you need an area code) • Default value • Is an entry required, etc

  34. Data Manipulation Subsystem Data manipulation subsystem – helps you add, change, and delete information in a database and query it to find valuable information Most often your primary interface Includes views, report generators, query-by-example tools, and structured query language

  35. View View – allows you to see the contents of a database file, make changes, and query it to find information

  36. Report Generator Report generator – helps you quickly define formats of reports and what information you want to see in a report

  37. Query-by-Example Tool QBE tool – helps you graphically design the answer to a question

  38. Structured Query Language SQL – standardized fourth-generation query language found in most DBMSs Sentence-structure equivalent to QBE Mostly used by IT professionals

  39. Application Generation Subsystem • Application generation subsystem – contains facilities to help you develop transaction-intensive applications • Mainly used by IT professionals

  40. Data Administration Subsystem • Data administration subsystem – helps you manage the overall database environment by providing facilities for… • Backup and recovery • Security management • Query optimization • Reorganization • Concurrency control • Change management

  41. Data Administration Subsystem Concurrency control – what happens if two people attempt to make changes to the same record Change management – how will structural changes impact the overall database

  42. DATA WAREHOUSES AND DATA MINING Help you build and work with BI and some forms of knowledge Data warehouse – collection of information (from many places) that supports business analysis activities and decision making

  43. Data Warehouse Characteristics • Multidimensional • Rows, columns, and layers • Support decision making, not transaction processing • Contain summaries of information • Not every detail

  44. The Tool Set of the Analytics Professional • Data-mining tools – software tools you use in a data warehouse environment • Query-and-reporting tools • Artificial intelligence • Multidimensional analysis tools • Digital dashboards • Statistical tools

  45. The Tool Set of the Analytics Professional

  46. The Tool Set of the Analytics Professional • Query-and-reporting tools – similar to QBE tools, SQL, and report generators • Artificial intelligence – tools to help you “discover” information and trends (more in Chapter 4) • Multidimensional analysis (MDA tools) – slice-and-dice techniques for viewing multidimensional information

  47. The Tool Set of the Analytics Professional • Digital dashboard – displays key information on a computer screen tailored to the needs and wants of an individual • Key performance indicator (KPI) – most essential information used in any analytics initiative • Statistical tools – for applying mathematical models to data warehouse information

  48. Digital Dashboard

  49. The Analytics Life Cycle

  50. Extraction, Transformation, and Loading (ETL) • ETL is a three-step process • Extract needed information from its source • Transform the data into a standardized format • Load the transformed data into a data warehouse

More Related