Artificial Intelligence for Data Mining in the Context of Enterprise Systems Thesis Presentation by Real Carbonneau
Overview • Background • Research Question • Data Sources • Methodology • Implementation • Results • Conclusion
Background • Information distortion in the supply chain • Difficult for manufacturers to forecast
Current solutions • Exponential Smoothing • Moving Average • Trend • Etc. • Wide range of software forecasting solutions • M3 Competition research tests most forecasting solutions and finds that the simplest work best
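As a minimal sketch of the two simplest traditional methods named above, the following shows a moving average and simple exponential smoothing producing a one-step-ahead forecast. The function names, the `demand` values, and the parameter settings (`window`, `alpha`) are illustrative assumptions, not taken from the thesis.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("series is shorter than the averaging window")
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha=0.3):
    """Simple exponential smoothing: level = alpha * y + (1 - alpha) * level.

    The final smoothed level is used as the one-step-ahead forecast.
    """
    level = series[0]  # initialise the level with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


# Hypothetical demand history, for illustration only.
demand = [100, 120, 90, 110, 130, 105]
ma = moving_average_forecast(demand, window=3)
es = exponential_smoothing_forecast(demand, alpha=0.5)
print(ma, es)
```

A larger `alpha` makes exponential smoothing react faster to recent demand, while a smaller one smooths out more of the noise.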
Artificial Intelligence • Universal Approximators • Artificial Neural Networks (ANN) • Recurrent Neural Networks (RNN) • Support Vector Machines (SVM) • Theoretically should be able to match or outperform any traditional forecasting approach.
Neural Networks • Learns by adjusting weights of connections • Based on empirical risk minimization • Generalization can be improved by: • Cross Validation based early stopping • Levenberg-Marquardt with Bayesian Regularization
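The early-stopping idea above can be sketched with a deliberately tiny model. This is an assumption-laden illustration: a one-input linear model trained by batch gradient descent stands in for a neural network, a single hold-out set stands in for full cross validation, and the data, learning rate, and `patience` value are hypothetical. Training stops once the validation error has not improved for `patience` epochs, and the best weights seen are kept.

```python
def mse(data, w, b):
    """Mean squared error of the model y = w * x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, valid, lr=0.1, max_epochs=500, patience=10):
    """Gradient descent on `train`; keep the weights with the lowest error on `valid`."""
    w, b = 0.0, 0.0
    best = {"val": mse(valid, w, b), "w": w, "b": b, "epoch": 0}
    stale = 0  # epochs since the validation error last improved
    for epoch in range(1, max_epochs + 1):
        # Empirical risk minimization step: gradient of the training MSE.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b
        val = mse(valid, w, b)
        if val < best["val"]:
            best = {"val": val, "w": w, "b": b, "epoch": epoch}
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation error stopped improving
    return best


# Hypothetical noisy samples of y = 2x + 1, for illustration only.
train = [(0.0, 1.1), (0.2, 1.3), (0.4, 1.85), (0.6, 2.15), (0.8, 2.7), (1.0, 2.9)]
valid = [(0.1, 1.2), (0.5, 2.0), (0.9, 2.8)]
result = train_with_early_stopping(train, valid)
print(result)
```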
Support Vector Machine • Learns by separating data in a different feature space with support vectors • Feature space can often be a higher- or lower-dimensional space than the input space • Based on structural risk minimization • Training optimality guaranteed (convex problem with a single global optimum) • Complexity constant controls the power of the machine
Support Vector Machine CV • 10-fold Cross Validation based optimization of Complexity Constant • More effective than NN because of guaranteed optimality
SVM Complexity Example • SVM Complexity Constant optimization based on 10-Fold Cross Validation
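The selection loop behind a cross-validated complexity constant can be sketched as follows. To stay self-contained, a one-parameter ridge regressor stands in for the SVM (this is a swapped-in technique, not the mySVM setup used in the thesis): its penalty is `1 / c`, so a larger `c` means weaker regularization, loosely analogous to the SVM complexity constant. The data, candidate values, and function names are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, c):
    """Fit y ~ w * x by minimizing sum((y - w*x)^2) + (1 / c) * w^2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + 1.0 / c)


def cv_error(xs, ys, c, k=10):
    """Mean squared error over all held-out points across k folds."""
    total = 0.0
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        w = fit_ridge_1d(tr_x, tr_y, c)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in held_out)
    return total / len(xs)


def select_complexity(xs, ys, candidates, k=10):
    """Return the candidate with the lowest k-fold cross-validation error."""
    return min(candidates, key=lambda c: cv_error(xs, ys, c, k))


# Hypothetical data: y = 1.5 * x plus a small deterministic wobble.
xs = [i / 10 for i in range(30)]
ys = [1.5 * x + (0.2 if i % 2 == 0 else -0.2) for i, x in enumerate(xs)]
candidates = [0.01, 0.1, 1.0, 10.0, 100.0]
best_c = select_complexity(xs, ys, candidates, k=10)
print(best_c)
```

With a real SVM, only `fit_ridge_1d` and the prediction step would change; the fold construction and the selection by lowest held-out error stay the same.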
Research Question • For a manufacturer at the end of the supply chain who is subject to demand distortion: • H1: Are AI approaches better on average than traditional approaches (error) • H2: Are AI approaches better than traditional approaches (rank) • H3: Is the best AI approach better than the best traditional
Data Sources • Chocolate Manufacturer (ERP) • Toner Cartridge Manufacturer (ERP) • Statistics Canada Manufacturing Survey
Methodology • Experiment • Using top 100 from 2 manufacturers and random 100 from StatsCan • Comparison based on out-of-sample testing set
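An out-of-sample comparison of this kind can be sketched as below. The slide does not say how the testing set was carved out, so the chronological split, the 20% test fraction, the one-step-ahead evaluation, and the two toy forecasters are all assumptions made for illustration.

```python
def chronological_split(series, test_fraction=0.2):
    """Hold out the most recent observations as the testing set."""
    n_test = max(1, int(round(len(series) * test_fraction)))
    return series[:-n_test], series[-n_test:]


def one_step_mae(train, test, forecast):
    """Mean absolute error of one-step-ahead forecasts over the testing set."""
    history = list(train)
    errors = []
    for actual in test:
        errors.append(abs(forecast(history) - actual))
        history.append(actual)  # the actual is revealed only after forecasting it
    return sum(errors) / len(errors)


def naive_forecast(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def mean_forecast(history):
    """Forecast the next value as the mean of all observed values."""
    return sum(history) / len(history)


# Hypothetical series, for illustration only.
series = [10, 12, 11, 13, 12, 14, 13, 15, 14, 16]
train, test = chronological_split(series, test_fraction=0.2)
naive_mae = one_step_mae(train, test, naive_forecast)
mean_mae = one_step_mae(train, test, mean_forecast)
print(naive_mae, mean_mae)
```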
Implementation • Experiment programmed in MATLAB • Using existing toolboxes where possible (e.g., NN, ARMA, etc.) • Programming the missing ones • SVM implemented using mySVM called from MATLAB
Super Wide model • Time series are short • Very noisy because of supply chain distortion • Super Wide model combined data from many products • Much larger amount of data to learn from • Assumes similar patterns occur in the group of products.
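One way to picture the pooling step is the sketch below, which turns several short product series into a single training set of lagged windows. The slides do not give the exact construction, so the window length, the per-series scaling by the mean, and all names and numbers here are assumptions.

```python
def lagged_examples(series, n_lags):
    """Turn one series into (previous n_lags values, next value) pairs.

    Each series is scaled by its own mean so that products with different
    sales volumes become comparable before pooling (an assumption).
    """
    scale = sum(series) / len(series) or 1.0  # fall back to 1.0 if the mean is 0
    scaled = [v / scale for v in series]
    return [(scaled[t - n_lags:t], scaled[t]) for t in range(n_lags, len(scaled))]


def build_super_wide(series_by_product, n_lags=3):
    """Pool the lagged examples of every product into one training set."""
    inputs, targets = [], []
    for name in sorted(series_by_product):
        for window, target in lagged_examples(series_by_product[name], n_lags):
            inputs.append(window)
            targets.append(target)
    return inputs, targets


# Hypothetical monthly demand for two products, for illustration only.
series_by_product = {
    "product_a": [100, 120, 90, 110, 130, 105],
    "product_b": [10, 8, 12, 9, 11],
}
inputs, targets = build_super_wide(series_by_product, n_lags=3)
print(len(inputs), "pooled training examples")
```

A single model trained on `inputs` and `targets` sees every product's history, which is what makes the assumption of similar patterns across the group necessary.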
Results Discussion • AI provides a lower forecasting error on average. (H1=Yes) • However, this is only because of the extremely poor performance of trend-based forecasting • Traditional ranked better than AI. (H2=No) • Extreme trend error has no impact on rank. • SVM Super Wide performed better than the best traditional (ES). (H3=Yes) • However, exponential smoothing was found to be the best traditional technique, and no non-super-wide AI technique reliably performed better.
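Why H1 and H2 can point in opposite directions is easiest to see with numbers. The sketch below uses made-up error values (not results from the thesis) in which one extreme trend error inflates the traditional group's mean error while leaving its ranks largely intact. Method names and groupings are illustrative, and ties in rank are not handled.

```python
def ranks_within_series(method_errors):
    """Rank methods on one series: 1 = lowest error (ties not handled)."""
    ordered = sorted(method_errors, key=method_errors.get)
    return {method: position + 1 for position, method in enumerate(ordered)}


def group_mean(per_series, group):
    """Average a score over every (series, method) pair in a method group."""
    values = [scores[m] for scores in per_series for m in group]
    return sum(values) / len(values)


# Hypothetical forecast errors per series; NOT results from the thesis.
errors = [
    {"ES": 10.0, "MA": 11.0, "Trend": 500.0, "ANN": 12.0, "SVM": 13.0},
    {"ES": 10.0, "MA": 11.0, "Trend": 9.0, "ANN": 12.0, "SVM": 13.0},
]
traditional = ["ES", "MA", "Trend"]
ai = ["ANN", "SVM"]

ranks = [ranks_within_series(e) for e in errors]

trad_error = group_mean(errors, traditional)  # inflated by the single 500.0
ai_error = group_mean(errors, ai)
trad_rank = group_mean(ranks, traditional)    # the outlier costs only one rank position
ai_rank = group_mean(ranks, ai)

print("mean error:", trad_error, ai_error)
print("mean rank:", trad_rank, ai_rank)
```

In this toy data the AI group wins on mean error but loses on mean rank, which mirrors the pattern described for H1 and H2.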
Results SVM Super Wide details • SVM Super Wide performed better than all others • Isolated to SVM / Super Wide combination only • Other Super Wide did not reliably perform better than ES • Other SVM models did not perform better than ES • Dimensionality augmentation/reduction (non-linearity) is important • Super Wide SVM performed better than Super Wide MLR
Conclusion • When unsure, use Exponential Smoothing; it is the simplest and second best. • Super Wide SVM provides the best performance • Cost-benefit analysis by a manufacturer should help decide if the extra effort is justified. • If implementations of this technique prove useful in practice, it should eventually be built into ERP systems, since building it may not be feasible for SMEs.
Implications • Useful for forecasting models which should include more information sources / more variables (Economic indicators, product group performances, marketing campaigns) because: • Super Wide = More observations • SVM+CV = Better Generalization • Not possible with short and noisy time series on their own.