
CS 551 Research Track: Filtering and Comparing of Classification Trees using XML - Sachin Singh


Presentation Transcript


  1. CS 551 Research Track Filtering and Comparing of Classification trees using XML - Sachin Singh

  2. Data Mining - Concepts • Extracting meaningful knowledge from large volumes of ‘raw’ data • Types • Association • Classification • Temporal

  3. Classification Method • Prediction model • The C4.5 Tree algorithm
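The slide only names C4.5. As a rough illustration of how such a tree picks its tests, the sketch below computes the gain ratio, the split criterion C4.5 is known for, on a toy categorical attribute. The function names and the sample data are assumptions made for this example and are not taken from the presentation.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many small partitions.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): one attribute separates the classes, the other does not.
play = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["true", "false", "true", "false"]

print(gain_ratio(outlook, play))
print(gain_ratio(windy, play))
```

The attribute with the highest gain ratio would be chosen as the test at the current node, and the procedure repeats on each partition.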

  4. Classification Tree

  5. Analysis of Trees • Current work focuses largely on the generation of trees • Efficient algorithms • Very large disk-resident data sources • Improving the accuracy of the generated models • Motivation • Current research area: need for analysis

  6. Areas of Analysis • Two Sub Problems • Filtering Sub Problem • Comparison Sub Problem

  7. Filtering Sub Problem • Typical data warehouses are huge • This leads to the generation of “bushy” trees • Not all outcomes are significant • Need to filter trees based on the required outcomes
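One way to picture filtering by required outcomes is a recursive prune that keeps only the branches leading to a wanted leaf. The sketch below is a minimal illustration under assumed names: the `Node` class, `filter_tree`, and the weather-style sample tree are hypothetical and are not the presentation's algorithm.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One node of a classification tree (hypothetical structure)."""
    test: Optional[str] = None        # attribute tested at an internal node
    outcome: Optional[str] = None     # class label at a leaf
    children: dict = field(default_factory=dict)  # partition value -> Node


def filter_tree(node, wanted):
    """Return a pruned copy keeping only paths that end in a wanted outcome."""
    if not node.children:  # leaf
        return Node(outcome=node.outcome) if node.outcome in wanted else None
    kept = {}
    for value, child in node.children.items():
        pruned = filter_tree(child, wanted)
        if pruned is not None:
            kept[value] = pruned
    # An internal node with no surviving branch is dropped as well.
    return Node(test=node.test, children=kept) if kept else None


# Sample tree (hypothetical data).
full = Node(test="outlook", children={
    "sunny": Node(test="humidity", children={
        "high": Node(outcome="no"),
        "normal": Node(outcome="yes"),
    }),
    "overcast": Node(outcome="yes"),
    "rain": Node(test="windy", children={
        "true": Node(outcome="no"),
        "false": Node(outcome="yes"),
    }),
})

filtered = filter_tree(full, {"no"})
print(sorted(filtered.children))
```

The original tree is left untouched, so the full and filtered versions can be kept side by side.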

  8. Filtering Sub Problem • [Figure: Filtered Classification Tree and Full Classification Tree]

  9. Filtering Sub Problem • Advantages • Efficient querying, faster results • Easily managed • Useful for the comparison sub problem

  10. Comparison Sub Problem • Need to monitor changes in data trends by comparing the classification trees • Levels of changes identified • Change in test (partition) value • Change in the partitions • Change in node levels • Change in outcomes (leaves)
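The four levels of change listed above can be illustrated by walking two trees in parallel and recording where they differ. This is only a sketch under stated assumptions: the nested-dict layout, the `compare_trees` name, and the mapping of each check to a level (threshold difference as "test value", different attribute or branch set as "partition", leaf versus subtree as "node level") are one possible reading of the slide, not the author's method.

```python
def compare_trees(old, new, path="root"):
    """List (path, kind, old_value, new_value) tuples for every difference found."""
    changes = []
    old_is_leaf = "outcome" in old
    new_is_leaf = "outcome" in new
    if old_is_leaf and new_is_leaf:
        if old["outcome"] != new["outcome"]:
            changes.append((path, "outcome", old["outcome"], new["outcome"]))
        return changes
    if old_is_leaf != new_is_leaf:
        # A leaf grew into a subtree, or a subtree collapsed into a leaf.
        changes.append((path, "node level",
                        "leaf" if old_is_leaf else "subtree",
                        "leaf" if new_is_leaf else "subtree"))
        return changes
    if old["attribute"] != new["attribute"]:
        # Different test attribute: branches are no longer comparable one by one.
        changes.append((path, "partition", old["attribute"], new["attribute"]))
        return changes
    if old.get("threshold") != new.get("threshold"):
        changes.append((path, "test value",
                        old.get("threshold"), new.get("threshold")))
    old_branches, new_branches = set(old["children"]), set(new["children"])
    if old_branches != new_branches:
        changes.append((path, "partition",
                        sorted(old_branches), sorted(new_branches)))
    for branch in sorted(old_branches & new_branches):
        child_path = f"{path}/{old['attribute']}={branch}"
        changes += compare_trees(old["children"][branch],
                                 new["children"][branch], child_path)
    return changes


# Two snapshots of a tree (hypothetical data).
old_tree = {"attribute": "outlook", "children": {
    "sunny": {"attribute": "humidity", "threshold": 75, "children": {
        "<=": {"outcome": "yes"}, ">": {"outcome": "no"}}},
    "overcast": {"outcome": "yes"},
    "rain": {"outcome": "yes"},
}}
new_tree = {"attribute": "outlook", "children": {
    "sunny": {"attribute": "humidity", "threshold": 80, "children": {
        "<=": {"outcome": "yes"}, ">": {"outcome": "no"}}},
    "overcast": {"attribute": "windy", "children": {
        "true": {"outcome": "no"}, "false": {"outcome": "yes"}}},
    "rain": {"outcome": "no"},
}}

for change in compare_trees(old_tree, new_tree):
    print(change)
```

Stopping the descent when the test attribute differs is a design choice of this sketch: below that point the two subtrees partition the data differently, so branch-by-branch comparison is no longer meaningful.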

  11. Comparison Sub Problem • Issues • Structure of trees is unpredictable • Comparing two trees that share no standard structure

  12. Solution • XML Trees • Convert the tree structure into XML files • XML is inherently tree-structured • Take advantage of existing XML-related technologies • Standard specs
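As a rough illustration of the conversion, the sketch below serialises a small classification tree with Python's standard `xml.etree.ElementTree` module and then queries it with a path expression. The element and attribute names (`node`, `leaf`, `attribute`, `branch`, `outcome`) are assumptions for this example; the file format actually proposed in the presentation is not reproduced in the transcript.

```python
import xml.etree.ElementTree as ET


def tree_to_xml(node, branch=None):
    """Convert a nested-dict tree into an ElementTree element (hypothetical schema)."""
    if "outcome" in node:
        elem = ET.Element("leaf", outcome=node["outcome"])
    else:
        elem = ET.Element("node", attribute=node["attribute"])
        for value, child in node["children"].items():
            elem.append(tree_to_xml(child, branch=value))
    if branch is not None:
        # Record which partition of the parent's test leads to this element.
        elem.set("branch", branch)
    return elem


# Sample tree (hypothetical data).
tree = {"attribute": "outlook", "children": {
    "sunny": {"attribute": "humidity", "children": {
        "high": {"outcome": "no"}, "normal": {"outcome": "yes"}}},
    "overcast": {"outcome": "yes"},
    "rain": {"outcome": "no"},
}}

root = tree_to_xml(tree)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Existing XML tooling then applies directly, e.g. selecting leaves by outcome.
no_leaves = root.findall(".//leaf[@outcome='no']")
print([leaf.get("branch") for leaf in no_leaves])
```

Once the tree is in XML, filtering by outcome reduces to a path query like the one above, which is one way to read the slide's point about reusing existing XML technologies.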

  13. Solution – Proposed File format

  14. Approach • Devise algorithms to solve the filtering and comparison problems • Analyze the results of comparison in logical terms • Measure the efficiency of the algorithms through their time and space complexities

  15. Progress

  16. Suggestions preferred over questions! • Thank you!
