Learn about the data mining process, defining data models, applying results, and interpreting outcomes in information systems. Discover methodologies, tools like SAS EM, and key data mining tasks for business intelligence.
Predictive Analytics David Douglas & Paul Cronan Information Systems ddouglas@walton.uark.edu
The Data Mining Process
1. The data model needs to be defined
2. The mining data store needs to be filled
3. The mining runs need to be defined
4. The results need to be interpreted
5. The results need to be applied
[Figure: process flow with the labels Business Problem, Data Warehouse, Select, Selected Data, Transform, Mine, Extracted Information, Visualize, Understand, Apply Results]
Source: Adapted from Building the Data Warehouse, IBM
One view of BI • We will be working in the Discovery Mode
[Figure: Information Analysis divided into Verification Mode and Discovery Mode; tools shown: Q & R, OLAP, Statistical Analysis, Data Mining; labels shown: Track Answers Verify Analyze Discover]
Why SAS EM? • Focus is on developing useful models, not programming • Flow driven: no programming for most models • Industrial strength: heavily used and tested • One of the most popular commercial DM products • Faster, which allows coverage of more models
Why not R or Python? • Both are highly recommended, but they are programming languages, so the modeler must do a good deal of programming as well as modeling • Code versus setting properties
Core points • Supervised/directed vs. unsupervised/undirected • Variables: statistics terminology (dependent, independent) vs. data mining terminology (target, predictor) • Data issues • Training, validation, and test samples • Overfitting
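The last two core points can be illustrated with a small sketch. It is not taken from the slides: the synthetic data, the 60/20/20 partition, the helper names (`partition`, `one_nn_predict`, `error_rate`), and the choice of a 1-nearest-neighbor model are all assumptions made for illustration. The model memorizes the training sample, so its training error is zero while its validation error is clearly higher, which is the signature of overfitting.

```python
import random

def partition(rows, train=0.6, valid=0.2, seed=42):
    """Shuffle rows and split them into training, validation, and test samples."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_valid = round(len(rows) * valid)
    return (rows[:n_train],
            rows[n_train:n_train + n_valid],
            rows[n_train + n_valid:])

def one_nn_predict(train_rows, x):
    """Return the target of the training row whose predictor is closest to x."""
    nearest = min(train_rows, key=lambda row: abs(row[0] - x))
    return nearest[1]

def error_rate(train_rows, rows):
    """Share of rows whose target is predicted incorrectly."""
    wrong = sum(1 for x, y in rows if one_nn_predict(train_rows, x) != y)
    return wrong / len(rows)

# Synthetic data (assumption): target is 1 when the predictor exceeds 0.5,
# and 20% of the targets are flipped to simulate noise.
rng = random.Random(0)
data = []
for _ in range(500):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.2:
        y = 1 - y
    data.append((x, y))

train_rows, valid_rows, test_rows = partition(data)
train_error = error_rate(train_rows, train_rows)   # model has memorized these rows
valid_error = error_rate(train_rows, valid_rows)   # honest estimate on unseen rows
print(f"training error:   {train_error:.2f}")
print(f"validation error: {valid_error:.2f}")
```

The test sample is deliberately left unused here; it would be held back for a final check once a model has been chosen using the validation sample.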
SAS Methodology • SEMMA • Sample • Explore • Modify • Model • Assess
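To make the five SEMMA stages concrete, here is a rough code analogue in which each stage is one small function. This is only a sketch under stated assumptions: it is not how SAS EM implements its nodes, and the toy data, the mean imputation, the midpoint-threshold model, and every function name are hypothetical.

```python
import random
from statistics import mean

def sample(rows, fraction=0.5, seed=1):
    """Sample: draw a random subset of the available rows."""
    return random.Random(seed).sample(rows, round(len(rows) * fraction))

def explore(rows):
    """Explore: summarize the predictor and count missing values."""
    present = [x for x, _ in rows if x is not None]
    return {"n": len(rows),
            "missing": len(rows) - len(present),
            "mean_x": mean(present)}

def modify(rows, fill):
    """Modify: replace missing predictor values with a fill value."""
    return [(fill if x is None else x, y) for x, y in rows]

def model(rows):
    """Model: classify with a cut point midway between the two class means."""
    mean_0 = mean(x for x, y in rows if y == 0)
    mean_1 = mean(x for x, y in rows if y == 1)
    cut = (mean_0 + mean_1) / 2
    return lambda x: int(x > cut)

def assess(predict, rows):
    """Assess: share of rows classified correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

# Toy data (assumption): target is 1 for i >= 50; every tenth predictor is missing.
raw = [(None if i % 10 == 0 else float(i), int(i >= 50)) for i in range(100)]

sampled = sample(raw)
summary = explore(sampled)
cleaned = modify(sampled, fill=summary["mean_x"])
predict = model(cleaned)
accuracy = assess(predict, cleaned)
print(summary, f"accuracy={accuracy:.2f}")
```

For brevity the model is assessed on the same rows it was fitted to; a real assessment would use a validation or test sample.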
Data Mining Tasks Source: Data Mining for Business Intelligence – Shmueli, Patel, Bruce
Data Mining Tasks • Supervised (directed): Prediction • Estimation • Classification • Difference: the target variable is numeric or categorical • Unsupervised (undirected): Clustering • Association
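The distinctions on this slide can be sketched in a few lines of code. The example below is illustrative only: the toy records, the function names, and the specific techniques (a least-squares line for estimation, nearest class mean for classification, one-dimensional k-means for clustering) are assumptions, not methods named in the slides. The two supervised functions take a target variable, numeric in one case and categorical in the other; the clustering function takes no target at all.

```python
from statistics import mean

# Hypothetical toy records: (income in $1000s, spend in $1000s, bought "yes"/"no")
records = [
    (20, 2.0, "no"), (25, 2.4, "no"), (30, 3.1, "no"),
    (60, 6.2, "yes"), (70, 6.9, "yes"), (80, 8.1, "yes"),
]
incomes = [r[0] for r in records]
spends = [r[1] for r in records]
bought = [r[2] for r in records]

def fit_estimator(xs, ys):
    """Supervised, numeric target: least-squares line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def fit_classifier(xs, labels):
    """Supervised, categorical target: assign the class with the nearest mean."""
    centers = {lab: mean(x for x, l in zip(xs, labels) if l == lab)
               for lab in set(labels)}
    return lambda x: min(centers, key=lambda lab: abs(centers[lab] - x))

def cluster_two_groups(xs, rounds=10):
    """Unsupervised: 1-D k-means with k = 2; no target variable is used."""
    low, high = min(xs), max(xs)
    for _ in range(rounds):
        groups = [[x for x in xs if abs(x - low) <= abs(x - high)],
                  [x for x in xs if abs(x - low) > abs(x - high)]]
        low, high = mean(groups[0]), mean(groups[1])
    return groups

estimate = fit_estimator(incomes, spends)    # estimation: numeric target
classify = fit_classifier(incomes, bought)   # classification: categorical target
groups = cluster_two_groups(incomes)         # clustering: predictors only

print(round(estimate(50), 2), classify(40), classify(65), groups)
```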