A Guide to Predictive Analytics and Data Mining in Information Systems

Learn about the data mining process, defining data models, applying results, and interpreting outcomes in information systems. Discover methodologies, tools like SAS EM, and key data mining tasks for business intelligence.


Presentation Transcript


  1. Information Systems – Walton College of Business 2017 SWDSI

  2. Predictive Analytics. David Douglas & Paul Cronan, Information Systems. ddouglas@walton.uark.edu

  3. Predictive Analytics

  4. The Data Mining Process
     1. The data model needs to be defined
     2. The mining data store needs to be filled
     3. The mining runs need to be defined
     4. The results need to be interpreted
     5. The results need to be applied
     Diagram labels: Business Problem, Data Warehouse, Select, Selected Data, Transform, Mine, Extracted Information, Visualize, Understand, Apply Results
     Source: Adapted from Building the Data Warehouse, IBM

  5. One view of BI
     We will be working in the Discovery Mode.
     Diagram labels: Information Analysis; Verification Mode; Discovery Mode; Q & R; OLAP; Statistical Analysis; Data Mining; Track Answers; Verify; Analyze; Discover

  6. Why SAS EM?
     • Focus is on developing useful models, not programming
     • Flow driven: no programming for most models
     • Industrial strength: heavily used and tested
     • Among the most popular commercial DM products
     • Faster, which allows coverage of more models

  7. Why not R or Python?
     • Both are highly recommended, but they are programming languages: the modeler must do a good bit of programming as well as modeling.
     • Code versus setting properties

  8. Core points
     • Supervised/directed vs unsupervised/undirected
     • Variables
       • Statistics terminology (dependent, independent)
       • Data mining (target, predictor)
     • Data Issues
     • Training, validation, test sample
     • Overfitting
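The training, validation, and test samples named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the presentation: the function name `partition`, the 60/20/20 proportions, and the fixed seed are all assumptions made for the example.

```python
import random

def partition(rows, train_frac=0.6, valid_frac=0.2, seed=42):
    """Shuffle rows and split them into training, validation, and test samples.

    The fractions and seed are illustrative defaults, not values from the slides.
    """
    if not 0 < train_frac + valid_frac < 1:
        raise ValueError("train_frac + valid_frac must be strictly between 0 and 1")
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for a repeatable split
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]      # whatever remains is the test sample
    return train, valid, test

rows = list(range(100))                      # stand-in for 100 records
train, valid, test = partition(rows)
print(len(train), len(valid), len(test))
```

The training sample is used to fit models, the validation sample to compare or tune them, and the test sample to estimate performance on data that played no part in either step.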

  9. Overfitting
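The slide gives only the term, so the following is a hedged sketch of what overfitting looks like in practice. It uses a k-nearest-neighbor classifier, which is a technique chosen for this illustration and not one named in the presentation; the synthetic data, the 20% label noise, and all function names are assumptions.

```python
import random

def make_data(n, rng, noise=0.2):
    """x is uniform on [0, 1]; the true class is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label                # inject label noise
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, sample, k):
    """Share of rows in `sample` that a k-NN model built on `train` classifies correctly."""
    hits = sum(knn_predict(train, x, k) == label for x, label in sample)
    return hits / len(sample)

rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(200, rng)

for k in (1, 15):
    print(f"k={k:2d}  train acc={accuracy(train, train, k):.2f}  "
          f"valid acc={accuracy(train, valid, k):.2f}")
```

With k=1 every training point is its own nearest neighbor, so the model reproduces the training labels, noise included, and its training accuracy is perfect. Accuracy on the validation sample is the honest measure, and a gap between the two figures is the usual symptom of overfitting.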

  10. SAS Methodology: SEMMA
      • Sample
      • Explore
      • Modify
      • Model
      • Assess
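For readers working in code instead of a flow diagram, the five SEMMA steps can be mapped onto a short script. This is only a rough sketch under stated assumptions: the spend and sales data are synthetic, mean imputation and a least-squares line are stand-ins chosen for brevity, and none of it reproduces what SAS EM does internally.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: (ad_spend, sales) records, about 5% with a missing spend.
population = []
for _ in range(1000):
    spend = rng.uniform(0, 100)
    sales = 3.0 * spend + 20 + rng.gauss(0, 10)
    population.append((None if rng.random() < 0.05 else spend, sales))

# Sample: draw a manageable subset and hold part of it out for assessment.
sample = rng.sample(population, 300)
train, holdout = sample[:200], sample[200:]

# Explore: summarize the data before modeling.
known = [s for s, _ in train if s is not None]
mean_spend = statistics.mean(known)
print(f"missing spends: {len(train) - len(known)}, mean spend: {mean_spend:.1f}")

# Modify: impute missing predictor values with the training mean.
def impute(rows):
    return [(mean_spend if s is None else s, y) for s, y in rows]

train, holdout = impute(train), impute(holdout)

# Model: fit a least-squares line, sales = intercept + slope * spend.
xs, ys = zip(*train)
x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Assess: measure error on the holdout rows the model never saw.
rmse = (sum((y - (intercept + slope * x)) ** 2 for x, y in holdout)
        / len(holdout)) ** 0.5
print(f"slope={slope:.2f} intercept={intercept:.2f} holdout RMSE={rmse:.2f}")
```

Each comment marks where one SEMMA step begins, which is the "code versus setting properties" trade-off in miniature: every step that a flow-driven tool exposes as a node has to be written out by hand here.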

  11. Data Mining Tasks. Source: Data Mining for Business Intelligence (Shmueli, Patel, Bruce)

  12. Data Mining Tasks
      • Supervised (directed)
        • Prediction
        • Estimation
        • Classification
      • Unsupervised (undirected)
        • Clustering
        • Association
      • Difference: target variable, numeric or categorical
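The split between supervised and unsupervised tasks comes down to whether a target variable is available. The sketch below contrasts the two on the same toy data: a nearest-centroid classifier that learns from a categorical target, and a two-cluster k-means pass that sees no target at all. The spend values, the labels, and both function names are hypothetical, and the two algorithms are simple stand-ins, not methods taken from the slides.

```python
import statistics

# Hypothetical customer spend values; labels are used only in the supervised case.
spend = [5.0, 7.0, 6.0, 8.0, 40.0, 42.0, 38.0, 45.0]
labels = ["low", "low", "low", "low", "high", "high", "high", "high"]

# Supervised (directed): learn from a categorical target, then classify a new case.
centroids = {
    cls: statistics.mean(x for x, y in zip(spend, labels) if y == cls)
    for cls in set(labels)
}

def classify(x):
    """Assign x to the class whose training mean is closest."""
    return min(centroids, key=lambda cls: abs(centroids[cls] - x))

# Unsupervised (undirected): no target; 2-means clustering groups by spend alone.
def two_means(values, iterations=10):
    centers = [min(values), max(values)]     # simple deterministic starting centers
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = min((0, 1), key=lambda i: abs(centers[i] - v))
            groups[nearest].append(v)
        # Recompute each center as its group's mean; keep the old center if a group is empty.
        centers = [statistics.mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

print(classify(10.0))
centers, groups = two_means(spend)
print(centers, groups)
```

With a numeric target in place of the categorical `labels`, the supervised half would become an estimation task, for example a regression, while the clustering half would be unchanged because it never looks at a target.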
