790 likes | 1.07k Views
Chapter 9 Business Intelligence Systems. Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA 99258 chen@jepson.gonzaga.edu. “We’re Sitting On All This Data. I Want to Make It Pay.”. Anne wants membership data to:
E N D
Chapter 9Business Intelligence Systems Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA 99258 chen@jepson.gonzaga.edu
“We’re Sitting On All This Data. I Want to Make It Pay.” Anne wants membership data to: • Combine membership data and publicly available data • Enable target marketing • Increase wedding revenue
Study Questions Q1: How do organizations use business intelligence (BI) systems? Q2: What are the three primary activities in the BI process? Q3: How do organizations use data warehouses and data marts to acquire data? Q4: How do organizations use typical reporting applications? Q5: How do organizations use typical data mining applications? Q6: What is the role of knowledge management systems? Q7: What are the alternatives for publishing business intelligence? Q8: 2022?
BUSINESS INTELLIGENCE • Business intelligence – information that people use to support/improve their decision-making efforts • Principle BI enablers include: • Technology • People • Culture
Working , Not Just Harder Smarter • Overlapping Human/Organizational (Culture, Process)/ Technological factors in BI/KM: PEOPLE ORGANIZATIONAL PROCESSES TECHNOLOGY • Knowledge N
CRM and BI Example • A Grocery store in U.K. with the following “patterns” found: • Every Thursday afternoon • Young Fathers (why?) shopping at store • Two of the followings are always included in their shopping list • Diapers and • Beers • What other decisions should be made as a store manager (in terms of store layout)? • Short term vs. Long term • This is an example of cross-selling • Other types of promotion: up-sell, bundled-sell • IT (e.g., BI) helps to find valuable information then decision makers make a timely/right decision for improving/creating competitive advantages.
Q/A • Can the “patterns” in the grocery store example be produced from its Database? • Y/N • Why? • It only can be produced from its “Data Warehouse” using a kind of “data mining” software.
Q1: How Do Organizations Use Business Intelligence (BI) Systems? • Information systems generate enormous amounts of operational data that contain patterns, relationships, clusters, and trendsabout customers, suppliers, business partners, and employees that can facilitate management, especially planning and forecasting. • Business intelligence (BI) systems produce such information from operational data. • Data communications and data storage are essentially free, enormous amounts of data (Big Data) are created and stored every day. • 12,000 gigabytes per person of data, worldwide in 2009
Why do organizations need business intelligence? • BI systems are computer programs provide valuable information for decision making. • Three primary BI systems: • __________ tools read data, process them, and format the data into structured reports (e.g., sorting, grouping, summing, and averaging) that are delivered to users. They are used primarily for assessment. RFM is one of the tool for reporting. • ___________ tools process data using statistical, regression, decision tree, and market basket techniques to discover hiddenpatterns and relationships, and make predictionsbased on the results • _______________________ tools store employee knowledge, make it available to whomever needs it. These tools are distinguished from the others because the source of the data is human knowledge. Reporting Data-mining Knowledge management
[1] [2] [3] Fig 9-1: Structure of a Business Intelligence System
Q/A Which of the following is true of source data for a BI system? A) It refers to the organization's metadata. B) It refers to data that the organization purchases from data vendors. C) It refers to the level of detail represented by the data. D) It refers to the hierarchical arrangement of criteria that predict a classification or a value. Answer: B
Tools vs. Applications vs. Systems • BI tool (e.g., decision-tree analysis) is one or more computer programs. BI tools implement the logic of a particular procedure or process. • BI application is the use of a tool on a particular type of data for a particular purpose. • BI system is an information system having all five components (what are they?) that delivers results of a BI application to users who need those results.
Example Uses of Business Intelligence [4] [3] [2] (Decision Support Systems) [1] Fig 9-2:Example Uses of Business Intelligence
Q2: What Are the Three Primary Activities in the Business Intelligence Process? • The primary activities in the BI process are: • 1. ______________ • The process of obtaining, cleaning, organizing relating, and cataloging source data. • 2. __________ • The process of creating BI analysis: reporting, data mining, and knowledge management. • 3. ____________ • The process of delivering BI to the knowledge workers who need it. Data acquisition BI analysis Publish results
What Are the Three Primary Activities in the Business Intelligence Process? I P O [1] [2] [3] The principle is the same as the “simple” model we learned before. What is it? Fig 9-3: Three Primary Activities in the BI Process
Using BI for Problem-solving at GearUp: Process and Potential Problems • Obtain commitment from vendor • Run sales event • Sell as many items as possible • Order amount actually sold • Receive partial order and damaged items • If received less than ordered, ship partial order to customers • Some customers cancel orders
Tables Used for BI Analysis at GearUp Fig 9-4: Tables Used for BI Analysis at GearUp
GearUp Analysis: Item Summary andLost Sales Summary Reports Fig 9-5: Extract of the Item_Siummary_Data Fig 9-6: Lost Sales Summary Report
Short and Damaged Shipments Details Report Fig 9-7: Lost Sales Detail Report
Publish Results Options • Print and distribute via email or collaboration tool • Publish on Web server or SharePoint • Publish on a BI server • Automate results via Web service
3: How Do Organizations Use Data Warehouses and Data Marts to Acquire Data? • Why extract operational data for BI processing? • Security and control • Operational not structured for BI analysis • BI analysis degrades operational server performance T/F: Placing BI applications on operational servers can dramatically increase system performance. Answer: FALSE Operational data is structured for fast and reliable “transaction processing” (e.g., payroll).
Data Base, Data Warehouse and Data Marts • Data base: An organized collection of logically related (current) data files. • Data Warehouse: A data warehouse stores data from current and previous years (historical data) that have been extracted from the various operational and management database of an organization. • Data mart: a subset of data warehouse that holds specific subsets of data for one particular functional area or project.
Components of a Data Warehouse • Data warehouses and data marts address the problems companies have with missing data values and inconsistent data. They also help standardize data formats between operational data and data purchased from third-party vendors. • These facilities prepare, store, and manage data specifically for data mining and analyses. Fig 9-11 Components of a Data Warehouse operational data ETL ETL: Extract, Transformation, Load
Legacy Systems Sales Data Mart Finance Data Mart Operational Data Store Marketing Data Mart Accounting Data Mart Operational Data Store Operational Data Store Operational Data Store Data Marts and the Data Warehouse Legacy systems feed data to the warehouse. The warehouse feeds specialized information to departments (data marts). ETL Organizational Data Warehouse ETL ETL: Extract, Transformation, Load
Examples of Consumer Data that Can Be Purchased Fig 9-12 Examples of Consumer Data for Sale
Possible Problems with Source (Operational) Data Fig 9-13 Possible Problems with Source (Operational) Data
Information Cleansing or Scrubbing • Standardizing Customer name from Operational Systems
Data Warehouses vs. Data Marts • Here’s the difference between a data warehouse and a data mart: • A data warehouse stores operational data and purchased data. It cleans and processes data as necessary. It serves the entire organization. • A data mart is smaller than a data warehouse and addresses a particular component or functional area of an organization. Fig 9-14 Data mart Examples
4. How Do Organizations Use Typical Reporting Applications • Four Basic operations: • Sorting • Filtering • Grouping • Calculating • Formatting • We will use a ‘reporting application’ to analyze and rank customers based on their purchasing patterns to help company make better decision for increasing company’s revenue.
What are typical reporting applications? • RFM Analysis allows you to analyze and rank customers according to purchasing patterns as this figure shows. • Recency: How recently a customer purchased items? => leads and opportunities • Frequency: How frequently a customer purchased items? => retention • Monetary Value: How much a customer spends on each purchase? => profitability • RFM Analysis • Sort the data by date (for recency), times (for frequency), and purchase amount (for money), respectively • Divide the sorted data into five groups • Assign 1 to top 20%, 2 to next 20%, 3 to the third 20%, 4 to the fourth 20% and 5 to the bottom 20%. • The the score, the better the customer. lower
What does RFM analysis Tell?Example RFM Scores • RFM Analysis allows you to analyze and rank customers according to purchasing patterns as this figure shows. • R = how recently a customer purchased your products • F = how frequently a customer purchases your products • M = how much money a customer typically spends on your products • The the score, the better the customer, and, consequently, the more profit the company will be. lower Fig 9-15 Example of RFM Score Data
Interpreting RFM Score Results • Ajax has ordered recently and orders frequently. M score of 3 indicates it does not order most expensive goods. • A good and regular customer but need to attempt to up-sell more expensive goods to Ajax • Bloominghams has not ordered in some time, but when it did, ordered frequently, and orders were of highest monetary value. • May have taken its business to another vendor. Sales team should contact this customer immediately. • Caruthers has not ordered for some time; did not order frequently; did not spend much. • Sales team should not waste any time on this customer. • Davidson in middle • Set up on automated contact system or use the Davidson account as a training exercise 80/20 Rule (Pareto Principle)
Q/A U.S. Grocery Corp. is a large grocery chain store. FOODFARM, one of the customers of U.S. Grocery Corp. holds an RFM score of 111. Which of the following characteristics relates FOODFARM with its RFM score? A) FOODFARM has ordered recently and orders frequently, but it orders the least expensive goods. B) FOODFARM has not ordered in some time, but when it did order in the past it ordered frequently, and its orders were of the highest monetary value. C) FOODFARM has not ordered for some time, it did not order frequently, and, when it did order, it bought the least-expensive items. D) FOODFARM has ordered recently and orders frequently, and it orders the most expensive goods. Answer: D
OLAP and its Applications • Online Analytical Processing (OLAP), a second type of reporting tool, is more generic than RFM. • OLAP provides you with the dynamic ability to sum, count, average, and perform other arithmetic operations on groups of data. Reports, also called OLAP cubes. • What software and function that enable you to create OLAP and its applications? • ANSWER • EXCEL with • Pivot table
Online Analytical Processing (OLAP) • Online Analytical Processing (OLAP) cubes, use • Measures which are data items of interest. In the figure below a measure is Store Sales Net . • Dimensions which are characteristics of a measure. In the figure below a dimension is Product Family. Fig 9-16 Example Grocery Sales OLAP Report OLAP Product Family by Store Type
Example Expanded Grocery Sales OLAP Report Figure 9-17 Fig 9-17: Example of Expanded Grocery Sales OLAP Report
Example of Drilling Down into Expanded Grocery Sales OLAP Report Fig 9-18: Example of Drilling Down into Expanded Grocery Sales OLAP Report
OLAP servers are special products that 1) read data from an operational database, 2) perform some preliminary calculations, and then3) store the results in an OLAP database Fig 9 (Extra): Role of OLAP Server & OLAP Database Third-party vendors provide software for more extensive graphical displays
On-Line Analytic Processing (OLAP) • Enables mangers and analysts to interactively examine and manipulate large amounts of detailed and consolidated data from different dimensions. • Analytical Processing: • Drill-up (Consolidation)– ability to move from detailed data to aggregated data • Profit by Product >>> Product Line >>> Division • Drill-down – ability to move from summary/general to lower/specific levels of detail • Revenue by Year >>> Quarter >>>>Week >>>Day • Slice and Dice– ability to look across dimensions • Sales by Region Sales • Profit and Revelers by Product Line
REGION CUSTOMER • Slicing a data cube
Data Base, Data Warehouse and Data Marts • Data base: An organized collection of logically related (current) data files. • Data Warehouse: A data warehouse stores data from current and previous years (historical data) that have been extracted from the various operational and management database of an organization. • Data mart: a subset of data warehouse that holds specific subsets of data for one particular functional area or project.
Database vs. Datawarehouse Database DBMS Datawarehouse ???
Database vs. Datawarehouse Database DBMS Datawarehouse Data Mining
How do BI Tools Obtain Data? ETL ETL: Extract, Transformation, Load
Data-mining Applications • Businesses use statistical techniques to find patterns and relationships among data and use it for classification and prediction. Data mining techniques are a blend of statistics and mathematics, and artificial intelligence (AI) and machine-learning. Fig 9-19 Data Mining Origins Data Warehouse
Unsupervised vs. Supervised Data Mining • Data mining is an automated process of discovery and extraction of hidden and/or unexpected patterns of collected data in order to create models for decision making that predict future behavior based on analyses of past activity. • There are two types of data-mining techniques: • Unsupervised data-mining characteristics: • No model or hypothesis exists before running the analysis • Analysts apply data-mining techniques and then observe the results • Analysts create a hypotheses after analysis is completed • Cluster analysis (and decision tree), a common technique in this category groups entities together that have similar characteristics • Supervised data-mining characteristics: • Analysts develop a model prior to their analysis • Apply statistical techniques such as Market Basket Analysis to estimate parameters of a model • Regression analysis is a technique in this category that measures the impact of a set of variables on another variable • Neural networks predict values and make classifications. • Used for making predictions
Unsupervised vs. Supervised Data Mining Unsupervised Supervised Model created before analysis Hypotheses created before analysis Regression analysis: make predictions • No model before running analysis • Hypotheses created after analysis • Cluster analysis to find groups
Neural Networks • Used for predicting values and making classifications • Complicated set of nonlinear equations • Go to http://kdnuggets.com and search for “neural network”
Probability for BI – Market Basket Analysis (Upselling and Cross-selling) Support - The probability of two items (A&B) will be purchased together. P(A&B) = P(A&B)/Total # of transactions Confidence - Conditional probability is the probability that an event (A) will occur, when another event (B) is known to occur or to have occurred. If the events are A and B respectively, this is said to be "the probability of A given B. P(A | B) = P(A&B)/P(B)
#times an item will be purchased when a customer entering the store Market Basket Analysis at a Dive Shop(Total # of Transactions (TOT)= 400) (s1) Purchase Mask and Fins together, A: Fins B: Mask P(A&B)/TOT P(Fins & Mask) = 250/400=0.625 (c1) Proportion of customers who bought a mask also bought fins (buying fins given s/he bought mask) P(A | B) = P(A&B)/P(B) P(Fins | Mask)= P(Fins&Mask)/P(Mask)=250/270 = .926 P(A | B) /P(A) the lift of fins and mask P(fins | mask)/P(fins)= confidence/base probability=.926/.7=1.32 Fig 9-20 Market-Basket Analysis at a Dive Shop