Advanced Analytics Business Intelligence with Data Mining
Data Mining • What’s important • Association/Binning • Clustering • Classification • Segmentation • What to expect • What-if • Estimation • Curve Fitting • Fill in Sparse Matrix • Prediction • Probability • Quantitative
Methodology [diagram labels: Statistical Analyst – Business Modeling; DBA; Collected Sample Data Store; Predictive Metrics & Segments; Business interpretation; Marts; Optimize data marts; Warehouse]
Methodology - EDMDAPA • Extract • Integrate disparate data systems • Build holistic business view • Group and organize large sets of categories • Discretize/Classify • Grouping and Segmentation • Simplify large flat dimensions • Model • Create predictive estimation functions • Deploy • Build/score data marts and cubes with predictive probability and quantitative metrics and simplified dimensional categories • Analyze, Visualize, Scorecard • Identify KPIs, identify business problems • Plan • Predict (Forecast)/Test (What-If) • Apply performance rules on KPIs • Act • Campaigns, personalization, optimization
Extract • DecisionStream unites information from disparate data sources for sampling the enterprise • 80% of the work involved in analytics is collecting, cleansing, and preparing data
Classification with Scenario • Segment and Classify combinations of stores, regions, divisions, customers or products • Benchmark against last month! Path of success
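As a rough illustration of segmenting a large flat dimension such as stores, the sketch below ranks stores by a measure and splits them into equal-sized tiers (equal-frequency binning). This is a generic technique, not a description of how Scenario works; the store codes, revenue figures, and the function name `equal_frequency_segments` are all made up for the example.

```python
def equal_frequency_segments(values, labels):
    """Assign each key in `values` to a segment so segments hold roughly equal counts."""
    ranked = sorted(values, key=values.get)  # keys ordered from lowest to highest measure
    n, k = len(ranked), len(labels)
    # The i-th ranked key falls into bin floor(i * k / n), capped at the last bin.
    return {key: labels[min(i * k // n, k - 1)] for i, key in enumerate(ranked)}

# Hypothetical monthly revenue per store (illustrative numbers only).
store_revenue = {
    "S01": 12000, "S02": 45000, "S03": 30000,
    "S04": 8000, "S05": 52000, "S06": 27000,
}
segments = equal_frequency_segments(store_revenue, ["low", "mid", "high"])
```

Running the same function on last month's figures and comparing the two results would show which stores moved between tiers, which is one simple way to benchmark against last month.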
Model with 4Thought • Avoids over-fitting • Works well with data that is • Noisy • Collinear • Scarce or sparse • Factor Analysis • What-if
Filling in the sparse matrix – e.g. #1 • Revenue estimation: • Dimensional intersect: • Red shoes, southwest, women, springtime: • $50,000 • Black shoes, northeast, men, summer: • $38,000 • Black shoes, southwest, women, summer: • $43,000 • Black shoes, northeast, men, springtime: • ???? • Once a model is built against historical data, the resultant function can productively fill in the question marks
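To make the idea of "filling in the question marks" concrete, here is a minimal sketch that estimates the missing cell from the three known cells on the slide. It uses a naive additive estimate (the grand mean plus, for each dimension, how far that level's average sits from the grand mean). This is a stand-in chosen for brevity, not the model 4Thought builds; the function name and dictionary keys are assumptions.

```python
from statistics import mean

def additive_estimate(known, query):
    """Grand mean plus, per dimension, (mean of matching rows - grand mean)."""
    grand = mean(value for _, value in known)
    estimate = grand
    for dim, level in query.items():
        matches = [value for cell, value in known if cell[dim] == level]
        if matches:  # a level never seen in the known cells adds no adjustment
            estimate += mean(matches) - grand
    return estimate

# The three known intersections from the slide.
known = [
    ({"color": "red", "region": "southwest", "gender": "women", "season": "springtime"}, 50000),
    ({"color": "black", "region": "northeast", "gender": "men", "season": "summer"}, 38000),
    ({"color": "black", "region": "southwest", "gender": "women", "season": "summer"}, 43000),
]
# The empty intersection marked "????".
query = {"color": "black", "region": "northeast", "gender": "men", "season": "springtime"}
estimate = additive_estimate(known, query)
```

With only these three rows the sketch returns about 35,500. That figure reflects the toy data and the crude method only; a model trained on real history would be expected to give a different, better-supported estimate.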
Filling in the sparse matrix – e.g. #2 • Insurance cost estimation: • Dimensional intersect: • Age 38, southwest, female, non-smoker, married: • $1,800 • Age 24, northeast, male, smoker, single: • $2,300 • Age 32, southwest, female, smoker, single: • $3,000 • Age 28, southwest, male, non-smoker, married: • ???? • Once a model is built against historical data, the resultant function can productively fill in the question marks
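When a dimension is numeric, as age is here, one simple alternative is a similarity-weighted average: known cells that look more like the empty cell count for more. The sketch below is only an illustration of that idea, not the slide author's or 4Thought's method; the distance formula, the `age_scale` parameter, and the key names are assumptions.

```python
def similarity_weighted_estimate(known, query, age_scale=10.0):
    """Average the known costs, weighting each row by its closeness to the query."""
    def distance(a, b):
        d = abs(a["age"] - b["age"]) / age_scale  # numeric gap, rescaled (assumed scale)
        d += sum(a[k] != b[k] for k in ("region", "sex", "smoker", "status"))
        return d

    weights = [1.0 / (1.0 + distance(cell, query)) for cell, _ in known]
    total = sum(w * cost for w, (_, cost) in zip(weights, known))
    return total / sum(weights)

# The three known intersections from the slide.
known = [
    ({"age": 38, "region": "southwest", "sex": "female", "smoker": "non-smoker", "status": "married"}, 1800),
    ({"age": 24, "region": "northeast", "sex": "male", "smoker": "smoker", "status": "single"}, 2300),
    ({"age": 32, "region": "southwest", "sex": "female", "smoker": "smoker", "status": "single"}, 3000),
]
# The empty intersection marked "????".
query = {"age": 28, "region": "southwest", "sex": "male", "smoker": "non-smoker", "status": "married"}
estimate = similarity_weighted_estimate(known, query)
```

Because the result is a weighted average, it always lies between the lowest and highest known cost. It cannot extrapolate beyond what has been observed, which is one reason a fitted model is preferable once enough history is available.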
Deploy with DecisionStream • DecisionStream uses predictive function from 4Thought as UDF for derivation • Deploy data marts, cubes, and metadata
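The pattern on this slide, a predictive function applied row by row to derive a new column during the load, can be sketched in a few lines. The code below is not DecisionStream's UDF interface; `score_mart`, `predict_revenue`, the column names, and the numbers inside the stand-in function are all hypothetical.

```python
def score_mart(rows, predict, target="predicted_revenue"):
    """Return copies of `rows` with one derived column filled in by `predict`."""
    return [{**row, target: predict(row)} for row in rows]

def predict_revenue(row):
    """Hypothetical stand-in for a function exported from a modeling tool."""
    base = {"southwest": 45000.0, "northeast": 38000.0}  # illustrative values only
    uplift = 1.1 if row["season"] == "springtime" else 1.0
    return base[row["region"]] * uplift

rows = [
    {"region": "northeast", "season": "springtime"},
    {"region": "southwest", "season": "summer"},
]
scored = score_mart(rows, predict_revenue)
```

Keeping the scoring function separate from the load step means the model can be retrained and swapped in without changing the code that builds the mart.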
Plan • Determine Business Goals and apply • NoticeCast Agents • KPI Business Pack • Exception highlighting with reports • Forecast with 4Thought • Access forecasted results with ETL
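Exception highlighting comes down to comparing each KPI with its goal and flagging the ones that miss by more than an agreed margin. The sketch below shows that rule in its simplest form; it is not how NoticeCast agents or the KPI Business Pack are configured. It assumes higher values are better and targets are non-zero, and the KPI names, figures, and 10% tolerance are invented for the example.

```python
def flag_exceptions(actuals, targets, tolerance=0.10):
    """Return KPIs whose actual value falls short of target by more than `tolerance`."""
    flagged = {}
    for kpi, target in targets.items():
        shortfall = (target - actuals[kpi]) / target  # fraction of the target missed
        if shortfall > tolerance:
            flagged[kpi] = round(shortfall, 3)
    return flagged

# Hypothetical goals and results (illustrative numbers only).
targets = {"revenue": 100000, "units_sold": 2000, "new_customers": 150}
actuals = {"revenue": 95000, "units_sold": 1600, "new_customers": 120}
exceptions = flag_exceptions(actuals, targets)
```

Here revenue misses its goal by 5% and stays within tolerance, while the other two KPIs miss by 20% and are flagged. The same check could be run against forecasted values instead of actuals to raise an exception before the period closes.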
Keys to Mining • Usefulness • Can the information discovered be considered knowledge? • Certainty • How viable is the discovered knowledge? • Expressiveness • Can the discovered knowledge be represented in a meaningful way?
Problems for Mining • Missing data • Inconsistent categories • Too much data • Difficult to focus • Not enough data • Nothing meaningful • Too many patterns • Hard to discern knowledge from garbage • Complexity of discoveries • Knowledge is too complex to be used • Unavailable data
The Cognos BI Solution • Integrating touch-points leads to a 360-degree view of your business. • Many scored metrics are loaded via predictive models. • Segmentation is useful for simplifying large flat dimensions.