Predictive Analytics with Oracle Data Mining. Vinay Deshmukh, Senior Director, Oracle Applications Labs, vinay.deshmukh@oracle.com. Bryan Hodge, Global Leader, Customer Intelligence, Customer Support Services, bryan.hodge@oracle.com.
Oracle Confidential – Internal/Restricted/Highly Restricted
We run the applications that run Oracle. We drive enhancements based on our experience. We share best practices with our customers.
Value Chain Opportunities and Risks
• Large and diverse customer base
• Transition to the cloud
• Opportunity & risk assessment
• Complex global hardware value chain
• 600+ global spares warehouses
Opportunity and Risk Assessment Using Oracle Data Mining (ODM)
• Discover hidden/subtle data patterns
• Augment Value Chain Planning, both forward and reverse
• Identify inter-relationships among data elements
• Quantify likelihood of opportunity/risk
• Rewind the clock and compare model accuracy against actuals
Oracle Advanced Analytics Evolution: Analytical SQL in the Database
Timeline milestones (most recent first):
• New algorithms (EM, PCA, SVD)
• SQLDEV/Oracle Data Miner 4.0 “work flow” GUI launched with SQL script generation and SQL Query node (R integration)
• OAA/ORE 1.3 + 1.4 launched, adding several new scalable R algorithms
• Oracle Advanced Analytics for Hadoop Connector launched with scalable BDA algorithms
• ODM 11g & 11gR2 add AutoDataPrep (ADP), text mining and performance improvements
• SQLDEV/Oracle Data Miner 3.2 “work flow” GUI launched
• Integration with “R” and introduction of Oracle R Enterprise
• Product renamed “Oracle Advanced Analytics” (ODM + ORE)
• Oracle Data Mining 10g & 10gR2 introduce SQL dm functions, 7 new SQL dm algorithms and the new Oracle Data Miner “Classic” wizard-driven GUI
• Oracle Data Mining 9.2i launched with 2 algorithms (NB and AR) via Java API
• Oracle acquires Thinking Machines Corp.’s development team + “Darwin” data mining software
• 7 Data Mining “Partners”
Years marked on the timeline: 1998, 1999, 2002, 2004, 2005, 2008, 2011, 2015
Oracle Advanced Analytics: Performance and Scalability with Low Total Cost of Ownership
Traditional Analytics (hours, days or weeks): Data Extraction, Data Import, Data Prep & Transformation, Data Mining Model Building, Data Mining Model “Scoring”
Oracle Advanced Analytics (secs, mins or hours, shown as “Savings” against the traditional timeline): Embedded Data Prep, Data Preparation, Model Building, Model “Scoring”
Data remains in the Database
• Scalable, parallel Data Mining algorithms in SQL kernel
• Fast parallelized native SQL data mining functions, SQL data preparation and efficient execution of R open-source packages
• High-performance parallel scoring of SQL data mining functions and R open-source models
Fastest way to deliver enterprise-wide predictive analytics
• Integrated GUI for Predictive Analytics
• Database scoring engine
Lowest TCO
• Eliminate data duplication
• Eliminate separate analytical servers
• Leverage investment in Oracle IT
Oracle Data Miner
• SQL Developer 4.0 extension; free OTN download
• Easy to use: Oracle Data Miner GUI for data analysts; “work flow” paradigm
• Powerful: multiple algorithms & data transformations; runs 100% in-DB; build, evaluate and apply models
• Automate and deploy: save and share analytical workflows; generate SQL scripts for deployment
More Data Variety: Better Predictive Models
• Model with “Big Data” and hundreds to thousands of input variables, including:
• Demographic data
• Purchase POS transactional data
• “Unstructured data”, text & comments
• Spatial location data
• Long-term vs. recent historical behavior
• Web visits
• Sensor data
• etc.
• Increasing sources of relevant data can boost model accuracy
[Chart: Responders vs. Population Size, 0% to 100%; curves: Naïve Guess or Random, Model with 20 variables, Model with 75 variables, Model with 250 variables]
Oracle Advanced Analytics Algorithms [Figure; only the labels A1 to A7 and F1 to F4 survive]
In-Database Advanced Analytics: Independent Samples T-Test
• The query compares the mean of AMOUNT_SOLD between men and women, grouped by CUST_INCOME_LEVEL ranges
• It returns the observed t value and its related two-sided significance (< .05 = significant)
    SELECT substr(cust_income_level,1,22) income_level,
           avg(decode(cust_gender,'M',amount_sold,null)) sold_to_men,
           avg(decode(cust_gender,'F',amount_sold,null)) sold_to_women,
           stats_t_test_indep(cust_gender, amount_sold, 'STATISTIC','F') t_observed,
           stats_t_test_indep(cust_gender, amount_sold) two_sided_p_value
    FROM sh.customers c, sh.sales s
    WHERE c.cust_id=s.cust_id
    GROUP BY rollup(cust_income_level)
    ORDER BY 1;
Case Study: Support Cancellation Early Warning. Bryan Hodge.
Case Study: Support Cancellation Early Warning
• Premier Support Revenue: $21B
• Contracts to be Renewed: 550K
• Product Lines: 8M
• Very diverse customer base
• Broad range of products
Challenge: Predict the small percentage of contracts/lines that are at risk, in order to focus resources appropriately and minimize losses.
Business Solution
• Developed a cancellation early warning system
• Embedded the system-generated risk assessment into the Forecasting Tool
• Sales reps use it to help forecast and engage management
• Managers use it to inform forecast judgement
Two-Phase Approach
Tribal Knowledge:
• Used sales rep experience to identify risk attributes, e.g. age of product, size of deal
• Profiled contract base
• Established thresholds for Low, Medium & High risk per attribute
• Algorithm to balance attributes into an overall risk assessment
Oracle Data Miner:
• Analyzed one year of outcomes (Cancelled or Renewed) to train a decision tree model
• ‘Wound the clock back’ on six months more data
• Scored the six months of data to generate predictions
• Assessed accuracy at 85%, where accuracy = (True Positives + True Negatives) / Number of Observations
Oracle Data Miner - Details
• Warehouse star schema with enhanced attributes
• Attribute Importance Analysis
• ODM Analysis
• Trained decision tree model & assessed accuracy
• Saved results in warehouse fact for use in Forecasting Tool
Business Benefits
• Identified hidden relationships across many attributes
• Improved quality of risk assessment
• Early intervention for customer satisfaction
• Reduced cancellation rates
• Bottom line improvements
Challenges
• Ensuring statistically significant data volumes in tree branches
• Preparation of data to ‘wind the clock back’
• Avoiding bias during data preparation
• Handling partially populated attributes
Case Study: Predicting Spare Parts@risk. Vinay Deshmukh, Senior Director, Oracle Applications Lab, vinay.deshmukh@oracle.com.
Case Study: Identify Spare Parts @risk of Short Supply
• Hardware Service Revenue: $2.3B
• Global Spares Warehouses: 600+
• Large deployment of Value Chain Planning (VCP); augment VCP capabilities with Oracle Data Mining
• Very diverse customer base
• Broad range of products
• 1.5 million part-location pairs
• Supersessions & substitutions
Problem Statement: Ensure a high level of service to our hardware customers by identifying the parts at risk of short supply at the warehouses closest to them and taking proactive steps to remedy the shortage risk. Augment current Value Chain Planning capabilities to provide risk assessment of parts@risk.
Oracle Value Chain Planning Solution
Transformational tools: Service Parts Planning; Demand Signal Management; Global Order Promising; Demand Management and Advanced Forecasting; Network Design and Risk Management; Production Scheduling; Planning Analytics; Sales and Operations Planning; Collaborative Planning and VMI; Trade Promotion Planning and Optimization; Supply and Distribution Planning and Event-driven Simulation
• Single source of truth
• Integrated with ERP
[Diagram: seven modules carry a “Deployed” marker; the extracted text does not preserve which ones]
Solution Approach • Augment Value Chain Planning using ODM Model • Exception Reporting • Customer Report
Model Attributes
Supply:
• On hand
• External repair orders
• Projected available balance
• Days of supply
• Safety stock mean/std dev
Demand:
• Forecast mean/std dev
• Shipments
• Backorders
Forecast Accuracy:
• MAPE
• Volatility
• Intermittency
Item Attributes:
• Cost
• PLC Code
Prototype Assumptions for the Model: 1 of 2
• 4 models, one each for AMER (US), AMER (Non-US), EMEA and APAC
• Data used for training the model: Feb, Mar and Apr 2014, with May 2014 as the target
• Input data used in the model: Apr, May and Jun 2014, with Jul 2014 as the target
• Average and standard deviation used for time-phased inputs to the model (Forecast and Safety Stock); latest value used for projected available balance
• Current backorders and on-hand quantities considered
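The handling of time-phased inputs described above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the function name, the dictionary keys and the sample numbers are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the slide does not say which variant was used.

```python
from statistics import mean, stdev

def summarise_inputs(forecast, safety_stock, pab):
    """Collapse three months of time-phased values into model inputs.

    forecast, safety_stock and pab are lists ordered oldest to newest.
    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return {
        "forecast_mean": mean(forecast),
        "forecast_std": stdev(forecast),
        "ss_mean": mean(safety_stock),
        "ss_std": stdev(safety_stock),
        # Only the latest projected available balance is kept
        "pab_latest": pab[-1],
    }

# Hypothetical three-month history for one part-location pair
example = summarise_inputs(
    forecast=[10, 12, 14],
    safety_stock=[4, 4, 4],
    pab=[3, 1, -2],
)
```
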
Prototype Assumptions for the Model: 2 of 2
• For the remaining parameters, the 3-month average value is used
• Item is marked ‘at RISK’ if (backorder > 0 OR pab_qty < 0 OR ss_qty > oh_qty)
• Item is marked ‘Not at RISK’ if (pab_qty between 0 and 0.25) OR ((oh_qty - ss_qty) between 0 and 0.25)
• Remaining records were deleted; this tolerance logic was applied to restrict the count of ‘NO’ records in the training data
• The final output shows the items at risk along with the orgs where planning exceptions are generated in the latest run of the corresponding Value Chain Plan for spares
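The labelling rules on this slide can be read as a small function. The sketch below assumes that the at-risk test is applied first when both rules match, and that "between" is inclusive as in SQL; neither point is stated on the slide. The function name and the 'YES'/'NO' return values are illustrative.

```python
def label_item(backorder, pab_qty, ss_qty, oh_qty, tolerance=0.25):
    """Return 'YES' (at risk), 'NO' (not at risk) or None (record dropped).

    backorder: current backorder quantity
    pab_qty:   projected available balance
    ss_qty:    safety stock quantity
    oh_qty:    on-hand quantity
    """
    # Assumption: the at-risk rule takes precedence when both rules match
    if backorder > 0 or pab_qty < 0 or ss_qty > oh_qty:
        return "YES"
    # Inclusive bounds, mirroring SQL BETWEEN
    if 0 <= pab_qty <= tolerance or 0 <= (oh_qty - ss_qty) <= tolerance:
        return "NO"
    # Everything else is removed from the training data
    return None
```

Records that return None correspond to the "remaining records" that the slide says were deleted to limit the number of 'NO' cases.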
Accuracy Analysis
Region          Total Cases   False YES   False NO    Accuracy
EMEA            2097          0 (0%)      204 (10%)   90%
AMER            2097          1 (0%)      433 (21%)   79%
APAC            1902          0 (0%)      127 (7%)    93%
Latin America   2358          1 (0%)      443 (19%)   81%
Exception Reporting
• OBIEE report shows priority exceptions generated for the parts-at-risk predicted by the ODM model (built on Oracle Advanced Planning Command Center, Oracle Value Chain Planning Suite)
• ODM output is stored in the Value Chain Planning data model by specifying the region and organization
• VCP (Advanced Planning Command Center) reports the latest exceptions for the respective plan for the predicted parts-at-risk
Customer Report
• This report shows the contract, base model and party impacted by the parts-at-risk predicted by the ODM model
• Based on the part-at-risk, the base model is fetched using Demantra data
• 'EXPIRED', 'CANCELLED' and 'TERMINATED' contracts are filtered out
• Premier customers impacted by the parts @ risk are identified
Case Study: Predicting Hardware Opportunities. Vinay Deshmukh, Senior Director, Oracle Applications Lab, vinay.deshmukh@oracle.com.
Problem Statement
• Predict the outcome of (non-Cloud) hardware opportunities whose expected revenue is greater than $1 million
• Includes both Direct and Indirect sales channels
• For the opportunities predicted to be won, provide early visibility to suppliers and contract manufacturers by leveraging the capabilities of the Value Chain Planning Suite
Solution Approach
• Train the ODM model with historical data: opportunities that were Won or Lost between 31-Mar-2011 and 31-Aug-2013
• Additional computed attributes used: product weight, customer weight, partner weight
• Predict the likely outcome of opportunities open as of 1-Sep-2013 using the model
• Test the prediction by comparing against actual wins and losses for the predicted opportunities
• Future: use the predicted opportunities in Value Chain Planning as causal factors to improve forecast accuracy
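The date-based split described on this slide can be sketched as below. This is an illustration only: the record layout (status, opened_on, closed_on) is hypothetical, and treating an opportunity as "open as of 1-Sep-2013" when it was opened by that date and closed later (or never) is an assumption about how the clock was rewound.

```python
from datetime import date

TRAIN_START = date(2011, 3, 31)
TRAIN_END = date(2013, 8, 31)
SCORE_AS_OF = date(2013, 9, 1)

def split_opportunities(opportunities):
    """Split records into a training set and a set to be scored."""
    train, to_score = [], []
    for opp in opportunities:
        closed = opp.get("closed_on")
        decided = opp["status"] in ("Won", "Lost")
        if decided and closed is not None and TRAIN_START <= closed <= TRAIN_END:
            # Outcome known inside the training window
            train.append(opp)
        elif opp["opened_on"] <= SCORE_AS_OF and (closed is None or closed > SCORE_AS_OF):
            # Still open on the scoring date; any later outcome is held
            # back and used only to check the prediction
            to_score.append(opp)
    return train, to_score

# Hypothetical records
sample = [
    {"id": 1, "status": "Won", "opened_on": date(2012, 1, 10), "closed_on": date(2012, 6, 1)},
    {"id": 2, "status": "Open", "opened_on": date(2013, 7, 1), "closed_on": None},
    {"id": 3, "status": "Lost", "opened_on": date(2013, 8, 1), "closed_on": date(2013, 11, 15)},
    {"id": 4, "status": "Won", "opened_on": date(2010, 1, 1), "closed_on": date(2010, 12, 1)},
]
train_set, score_set = split_opportunities(sample)
```
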
Model Attributes
Attribute categories: Customer, Product, Industry, Geography, Partner, Opportunity
• Product Line
• Primary Competitor
• Product Group
• Product Weight
• Account Weight
• Annual Revenue Category
• Number of Employees
• Top-level Industry
• LOB Code
• Region
• Country
• Sales Method
• Channel Type
• Partner Type
• Partner Weight
• Cycle Time
• Opportunity Status
• Expected Revenue
• Opened Date
Calculating weights using a Bayesian approach
• Customer weight for customer x = ((Customer x Won Opty / Total Won Opty) * (Customer x Total Opty / Total Opty)) / Σ over all customers ((Customer Won Opty / Total Won Opty) * (Customer Total Opty / Total Opty))
• Product weight for product y = ((Product y Won Opty / Total Won Opty) * (Product y Total Opty / Total Opty)) / Σ over all products ((Product Won Opty / Total Won Opty) * (Product Total Opty / Total Opty))
• Partner weight for partner z = ((Partner z Won Opty / Total Won Opty) * (Partner z Total Opty / Total Opty)) / Σ over all partners ((Partner Won Opty / Total Won Opty) * (Partner Total Opty / Total Opty))
• Here x, y and z index individual customers, products and partners, and each Σ runs over all customers, products or partners respectively
• Note: Direct and Indirect partner weights are calculated separately. Customer weights equal to 0 are replaced by the median of the customer weights
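A minimal sketch of the weight calculation, reading the denominator as a sum of the same product over all entities so that the weights add up to 1. That reading, the function names and the sample counts are assumptions; the slide also does not say whether the median is taken before or after excluding zero weights, and the sketch takes it over all weights.

```python
from statistics import median

def bayesian_weights(won, total):
    """Compute normalised weights per entity (customer, product or partner).

    won:   dict mapping entity -> number of won opportunities
    total: dict mapping entity -> number of all opportunities
    """
    total_won = sum(won.values())
    total_opty = sum(total.values())
    if total_won == 0 or total_opty == 0:
        raise ValueError("need at least one won opportunity and one opportunity")
    # Share of all wins times share of all opportunities, per entity
    raw = {k: (won.get(k, 0) / total_won) * (total[k] / total_opty) for k in total}
    norm = sum(raw.values())
    return {k: v / norm for k, v in raw.items()}

def replace_zero_with_median(weights):
    """Replace zero weights by the median weight (median over all weights)."""
    med = median(weights.values())
    return {k: (med if w == 0 else w) for k, w in weights.items()}

# Hypothetical counts for three customers
won = {"A": 3, "B": 1, "C": 0}
total = {"A": 4, "B": 4, "C": 2}
weights = bayesian_weights(won, total)
adjusted = replace_zero_with_median(weights)
```

With these sample counts, customer C has no wins and therefore a zero weight, which the second function replaces with the median of the three weights.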
The Metrics
• False Positives: the model predicted that an event will occur, but the event did not occur over the risk horizon
• False Negatives: the model predicted that an event will not occur over the risk horizon, but the event did occur
• Model accuracy = 1 - [(false positives + false negatives) / total observations]
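As a quick check of the accuracy formula, the sketch below applies it to the four sets of counts reported on the Accuracy Analysis slide (total cases, False YES, False NO). The function name is illustrative; the counts are taken from that slide.

```python
def model_accuracy(false_positives, false_negatives, total_observations):
    """Accuracy = 1 - (false positives + false negatives) / total observations."""
    return 1 - (false_positives + false_negatives) / total_observations

# (total cases, False YES, False NO) as listed on the Accuracy Analysis slide
reported = [
    (2097, 0, 204),
    (2097, 1, 433),
    (1902, 0, 127),
    (2358, 1, 443),
]
accuracies = [f"{model_accuracy(fp, fn, n):.0%}" for n, fp, fn in reported]
print(accuracies)
```

Rounded to whole percentages, the results agree with the accuracy figures quoted on that slide (90%, 79%, 93% and 81%).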
Accuracy Achieved (Direct + Indirect channels)
• Average accuracy: 73.0%
• Overall accuracy: 78.2%
• Accuracy of winning the deal: 84.0%
[Figure: Actual vs. Predicted outcomes]
Conclusion: Predictive Analytics with Oracle Data Mining (Oracle Applications Lab)
Challenge:
• Predict contract lines@risk
• Predict spare parts@risk
• Predict H/W opportunity wins
Solution:
• Oracle Data Mining for predictive analytics
• Augment Oracle Value Chain Planning capabilities provided by Oracle Demantra and Oracle Advanced Planning Command Center
• OBIEE
Benefits:
• Reduced cancellation rates
• Improved service delivery performance to hardware spares customers
• Early demand visibility to suppliers