1 / 6

Lab 0

Lab 0. High-Performance Data Mining: Problems and Current Tools. Monday, May 15, 2000 William H. Hsu Department of Computing and Information Sciences, KSU http://www.cis.ksu.edu/~bhsu. Data Mining Software Practicum. Lab and Implementation Project Objectives

lalo
Download Presentation

Lab 0

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lab 0 High-Performance Data Mining: Problems and Current Tools Monday, May 15, 2000 William H. Hsu Department of Computing and Information Sciences, KSU http://www.cis.ksu.edu/~bhsu

  2. Data Mining Software Practicum • Lab and Implementation Project Objectives • Learn to work with some KDD software tools • Understand modern ML-based object oriented HP-KDD systems • Main tools: MLC++/MineSet and NCSA D2K • Implementation Project Milestones • Project plan • Due Monday, May 22, 2000 (Homework 1) • Short writeup: 1-2 pages design plus simple specification document • Project interim report • Due Wednesday, May 31, 2000 (Homework 2) • Preliminary itinerary and documentation • Comparative results • Using software tools • Due Thursday, June 8, 2000 (machine problem; Homework 3) • Exposure to: MineSet; SNNS, NS3; Hugin, BKD;GPSys, Genesis

  3. Software Environments for KDD:NCSA D2K

  4. Rapid KDD Development Environment KDD and Software Engineering:D2K Framework

  5. Environment (Data Model) Knowledge Base Performance Element Learning Element Performance Element:Decision Support Systems (DSS) • Model Identification (Relational Database) • Specify data model • Group attributes by type (dimension) • Define queries • Prediction Objective Identification • Identify target function • Define hypothesis space • Transformation of Data • Reduce data: e.g., decrease frequency • Selectrelevant data channels (given prediction objective) • Integrate models, sources of data (e.g., interactively elicited rules) • Supervised Learning • Analysis and Assimilation: Performance Evaluation using DSS

  6. 6500 news stories from the WWW in 1997 NCSA D2K - http://www.ncsa.uiuc.edu/STI/ALG Cartia ThemeScapes - http://www.cartia.com Database Mining Reasoning (Inference, Decision Support) Destroyed Normal Extinguished Ignited Fire Alarm Engulfed Flooding Planning, Control DC-ARM - http://www-kbs.ai.uiuc.edu Some Interesting Industrial Applications

More Related