320 likes | 711 Views
Spreadsheet and Database Decision Support. Power Excel and Access for Business Analysts. High Level Structure. Data Warehousing (Access). Data Cleaning and Transformation (Excel). Online Analytical Processing (OLAP) (Excel). Session 1. Advanced Data Querying and Analysis
E N D
Spreadsheet and Database Decision Support Power Excel and Access for Business Analysts
High Level Structure Data Warehousing (Access) Data Cleaning and Transformation (Excel) Online Analytical Processing (OLAP) (Excel) Session 1 Advanced Data Querying and Analysis (Excel and Access) Session 2 Models and Modeling (Excel) Building DSS With VBA (Excel) Take MIS 436
Module Overview and Session Themes • Data is good. • Data is often not enough, need models too. • Models+Data+VBA = Decision support system
More Themes • Spreadsheets are de facto the most widely used platform for modeling and analysis in business today • Excel has rich set of modeling and analysis tools • End user DSS development • A wide open opportunity for stardom • "Things should be made as simple as possible, but not any simpler." Albert Einstein
Data Extraction, Transformation, and Loading Playing in the Mud
The Many Roles of Knowledge Workers Brilliant IS idea
The Gist of the Problem • Getting data out of some system to: • Analyze it (e.g. Excel, Access, stats package) • Get it into another system (e.g. ERP or Data Warehouse) • “Smart” manipulation of electronic reports with embedded data • Don’t want to do it manually (why?)
ACD Report Example Report Header Date Split Blank lines Data! Totals Lines Next report
Why Talk About This? • Deloitte & Touche – Data Quality and Integrity • Beaumont Oracle ERP project • Huge amount of time wasted doing manual processing • Really useful spreadsheet and database skills (and mindset) • It’s fun to play in the mud sometimes
OK, Let’s start with a little toy data extraction problem • Download PatientLocationLog.txt from the ATiB Downloads section (right click – Save Target As…) • Somehow, get it into Excel so you can analyze it • This is a table of entries of patients to various locations in a clinic. Note that each patient may visit multiple locations.
About the Data A patient type code Location ID When patient entered and exited the location. Unique Patient Arrival ID
Your Challenge • How many times did a patient with PatientType=2 enter Location 38? • If you get that, how might you quickly find how many patients of each type entered each location? • What date, time and day of the week did the very first patient arrive to location 31? • How much total time (in minutes) did the patient with ArrivalID=16 spend in the clinic?
Data Warehousing and OLAP Data is good A Dimensional Modeling Manifesto An Overview of Data Warehousing and OLAP Technology
What is a Data Warehouse? • Subject oriented • Integrated • Time-Variant • Nonvolatile • Data enters DW from operational environment, transaction processing systems (TPS). Data is: Source: Inmon, W.H., “What is a Data Warehouse?”, PRISM, Vol 1. No. 1 http://www.cait.wustl.edu/cait/papers/prism/vol1_no1/
Source: Inmon, W.H., “What is a Data Warehouse?”, PRISM, Vol 1. No. 1 1. Subject Orientation • TPS organized around processes, functions • billing, banking, purchasing, payroll, etc. • DW organized around subjects • customers, vendors, encounters, sales • Transactions • TPS processes transactions • DW stores summary info related to transactions • TPS - keeps data needed for transaction • DW - keeps data needed for analysis
Source: Inmon, W.H., “What is a Data Warehouse?”, PRISM, Vol 1. No. 1 2. Integration • DW must integrate data from different apps • Create consistency across applications • naming conventions • measurement of variables (units) • data types • encoding • DSS analyst - use the data, not worry about credibility/consistency of data • often best person to find subtle data problems
Source: Inmon, W.H., “What is a Data Warehouse?”, PRISM, Vol 1. No. 1 3. Time Variancy • TPS accurate at moment of access • DW accurate as of some moment in time Operational - current value data Data warehouse - snapshot data • Time horizon 5-10 years • Key contains an element of time • Once snapshot made, data cannot be updated • Time horizon 60-90 days • Key may or may not have an element of time • Data can be updated
4. Nonvolatility Change Replace Insert Insert Load Access Delete Change Operational Data warehouse Source: Inmon, W.H., “What is a Data Warehouse?”, PRISM, Vol 1. No. 1
Multi-dimensional Data Modeling • Designed to facilitate analysis (not transactions) • Common in data warehousing • Intuitive concept of many dimensions or perspectives on business measures or facts • view sales from customer, product and time perspective • Conceptual model • Hypercube – an “n-sided cube”
Data (Hyper) cubes 2-d to 3-d cube Rotating the cube
A Call Center ExampleTech Support for MS Office • Technology enabled business processes • Massive amount of data captured by ACD • Some data analysis done by ACD • Difficult operational questions related to staffing/scheduling impact on service level • Created simulation model to generate lots o’ data • Many call centers in many industries
Steps in Multi-dimensional ModelingCall Center Example • Choose business process • Servicing technical support calls • Choose grain of process • Individual phone calls • Choose dimensions • Customer, application, problem, time • Choose measured facts • time on hold, service time of call
Time dimension Customer dimension Fact table Problem dimension Application dimension The Star SchemaA multi-dimensional data model Non-normalized Non-normalized Non-normalized Normalized Non-normalized
Transaction focused Focus on many linked, normalized tables One big complex data model Very little redundancy Analysis focused Normalized fact table joined to a few highly non-normalized dimension tables Many simple, intuitive data models Lots of redundancy Data Models Relational vs. Multi-dimensional
One E-R vs. Many Stars Analysis focus Transaction focus One E-R model for all the business process. One star per modeled business process.
What is OLAP? • Software tool providing multi-dimensional view of data for business analysis • Example of “Decision Support” or “Business Intelligence” tool • Fast data access and fast computations • Interactive, flexible user interface • “Slice, dice, drill-down” • Excel Pivot Table and Pivot Chart
Data Warehousing and OLAPWWW Resources • A Dimensional Modeling Manifesto – Kimball, R. http://www.dbmsmag.com/9708d15.html • Kimball and Associates http://www.ralphkimball.com./html/articles.html • DSS Resourceshttp://dssresources.com/ • Data Warehousing Information Center http://www.dwinfocenter.org/ • Intelligent Enterprise http://www.intelligententerprise.com/
Some Good Books • The Data Warehouse Toolkit – Kimball, R. • Definitive • OLAP Solutions – Thomsen, E. • Definitive • Unlocking OLAP with Microsoft SQL Server and Excel 2000 – Freeze • For newbies • Microsoft OLAP Unleashed – Peterson and Pinkelman • For techasauruses
CallCenter_Scenario01_DW.zip CallCenterPivot.zip Let’s OLAP • Download and unzip the following from the ATiB page accessible from my home page: http://www.sba.oakland.edu/faculty/isken/isken.htm http://www.sba.oakland.edu/faculty/isken/ATiB406_Downloads.htm • Let’s look at Excel Pivot Tutorial