Visual Data Mining with MineSet™

lisaz

Presentation Transcript


  1. Visual Data Mining with MineSet™ MineSet Web site URL: www.sgi.com/software/mineset

  2. What is MineSet™? • Visual data mining technology that helps your business quickly turn large amounts of data into actionable business insights • Diagram: Data Warehouses or Business Data → MineSet™ → Business Insights

  3. Why is MineSet™ so important? • Business Intelligence Solutions • Visualization is essential (IDC, 1998) • 80% of users find visualization to be desirable • 51% find it very or extremely desirable • Data mining algorithms (IDC, 1998) • Important to over 80% of data warehousing users • Explosive data growth (Meta Group, 1997) • Data warehouses double in size every 12 to 18 months • Scalability, CPU performance and I/O bandwidth (The Data Warehousing Institute, 1998) • Most important factors in selecting a data mining or data warehouse hardware platform

  4. How does MineSet™ fit into Business Intelligence Solutions? • Action/Feedback Layer: Visualizations for Decision Makers • Analytical Layer: Visualization & Analytic Model Development for Business Analysts (Descriptive, Predictive, Visual) • IT Infrastructure Layer: Operational System(s), ECTL Processes, Enterprise Data Warehouse, Mart 1, Mart 2, ... Mart N

  5. Data Mining Discovery Process • Data Warehouse or Business Data • Selection • Denormalized Data Subset • Transformations • MineSet Server: Visual Data Mining, Analytical Data Mining • API to OLAP and Other Mining Algorithms • MineSet Client GUI • Business Insights & Understanding

  6. Data Mining Discovery Process with MineSet™ • MineSet Clients: Windows, SGI IRIX; launch via COM, ActiveX • Tool Manager (Controls, Visualizations) • MineSet Servers: NT, Linux (32-bit, single-threaded); SGI IRIX (64-bit, parallel) • RDBMS Connections: Oracle, Sybase, Informix, ODBC • Flat Files • Server components: Analytical Data Mining Engine, Data Transformations, API to OLAP & Other Mining Algorithms • Source: Data Warehouse & Business Data

  7. MineSet™ 3.1 Key Features • Powerful Visual Data Mining • Visualizations launched from MineSet™ Clients or any Windows application or WEB browser • Insightful Analytic Data Mining • Classification, Regression, Association and Clustering data mining model development • Software Development Toolkit (SDK) • Plug-in APIs for Analytics, Visual, Transformations and Functions • Application Toolkit Extensions • APIs to facilitate the writing of Web enabled applications

  8. Visual Data Mining with MineSet™ • Statistics Visualizer: Mean, Min/Max, Std. Dev. analysis • Histogram Visualizer: Distinct count & range analysis
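
The quantities named on this slide can be illustrated with a few lines of Python. This is only a sketch of the underlying statistics, not MineSet code: the function names and the sample `ages` column are hypothetical, and equal-width binning is an assumption about how a range histogram might be built.

```python
import statistics
from collections import Counter

def summarize(values):
    """Return the per-column summary the slide lists: mean, min/max, std. dev."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }

def histogram(values, bins):
    """Count values in `bins` equal-width ranges between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # fall back to 1 when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the max into the last bin
        counts[idx] += 1
    return counts

def distinct_counts(values):
    """Count how often each distinct value occurs."""
    return Counter(values)

# Hypothetical sample column, used only for illustration.
ages = [23, 35, 35, 41, 52, 52, 52, 67]
summary = summarize(ages)
range_counts = histogram(ages, 4)
value_counts = distinct_counts(ages)
```

Here `summary` holds the statistics view of the column, `range_counts` the range histogram, and `value_counts` the distinct-value histogram.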

  9. Visual Data Mining with MineSet™ • Scatter Visualizer: Multi-dimensional analysis • Splatter Visualizer: Multi-dimensional analysis for large data sets

  10. Visual Data Mining with MineSet™ • Map Visualizer: Spatial trend analysis • Tree Visualizer: Hierarchical trend analysis

  11. Analytic Data Mining with MineSet™ • Unsupervised Mining (descriptive & unlabeled columns) • Association: Correlations (one-to-one, multi-way) • Clustering: Segmentations (k-means, iterative k-means) • Supervised Mining (predictive & labeled columns) • Regression (continuous columns): Regression trees • Classification (discrete columns): Decision trees *, Evidence *, Option trees, Decision tables • Column Importance (column selection) • * boosting option
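
As an illustration of the k-means segmentation named on this slide, here is a minimal textbook k-means in Python. It is not MineSet's implementation, which the slides do not describe; seeding the centroids with the first k points is an assumption made to keep the sketch deterministic, and the sample points are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples.

    Initial centroids are the first k points (an assumption for determinism).
    """
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, assignment

# Hypothetical two-column records forming two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, assignment = kmeans(points, k=2)
```

Each entry of `assignment` is the segment index for the matching record, which is the kind of label a cluster visualization would color by.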

  12. Visualizing Analytic Data Mining Models with MineSet™ • Decision Tree Visualizer for Decision Tree data mining analysis • Regression Tree Visualizer for Regression Tree data mining analysis

  13. Visualizing Analytic Data Mining Models with MineSet™ • Evidence Visualizer for Naïve-Bayes data mining results, interactive scoring & analysis • Decision Table Visualizer for Decision Table data mining analysis
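
To make the Naïve-Bayes scoring mentioned on this slide concrete, the sketch below trains a small categorical Naïve-Bayes model and scores one record. It is a generic textbook version, not MineSet's Evidence classifier; the Laplace smoothing, the function names, and the toy rows and labels are all assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count label frequencies and per-column value frequencies for each label."""
    label_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (column index, label) -> value counts
    vocab = defaultdict(set)             # column index -> distinct values seen
    for row, label in zip(rows, labels):
        for col, value in enumerate(row):
            value_counts[(col, label)][value] += 1
            vocab[col].add(value)
    return label_counts, value_counts, vocab

def score(model, row):
    """Return normalized posterior probabilities per label for one record."""
    label_counts, value_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        log_p = math.log(n / total)  # prior probability of the label
        for col, value in enumerate(row):
            # Laplace smoothing so unseen values do not zero out the product.
            count = value_counts[(col, label)][value]
            log_p += math.log((count + 1) / (n + len(vocab[col])))
        log_scores[label] = log_p
    # Normalize in log space for numerical stability.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}

# Hypothetical training data: (billing level, weekend claims) -> label.
rows = [("high", "yes"), ("high", "no"), ("low", "no"), ("low", "no")]
labels = ["flag", "flag", "ok", "ok"]
model = train(rows, labels)
posterior = score(model, ("high", "yes"))
```

Each column contributes one multiplicative factor per label, which is why the per-column "evidence" for a label can be inspected one column at a time.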

  14. Visualizing Analytic Data Mining Models with MineSet™ • Cluster Visualizer for Cluster data mining results analysis • Association Visualizer for LHS/RHS association, i.e., “market basket” analysis
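
The LHS/RHS association idea can be sketched with the standard support, confidence, and lift measures for a rule LHS -> RHS. The slides do not say which measures MineSet reports, so these three are an assumption, and the baskets below are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def rule_metrics(transactions, lhs, rhs):
    """Support, confidence, and lift for the rule LHS -> RHS."""
    both = support(transactions, set(lhs) | set(rhs))
    lhs_support = support(transactions, lhs)
    rhs_support = support(transactions, rhs)
    confidence = both / lhs_support if lhs_support else 0.0
    lift = confidence / rhs_support if rhs_support else 0.0
    return {"support": both, "confidence": confidence, "lift": lift}

# Hypothetical market baskets, each a set of purchased items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
metrics = rule_metrics(baskets, lhs={"bread"}, rhs={"milk"})
```

A lift below 1, as in this toy data, means the RHS item is slightly less likely in baskets containing the LHS than in baskets overall.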

  15. MineSet™ Tool Manager -- Integrating it all together • Data Sources: MineSet Server & DB connections • Data Transformations: Remove, add, change or bin columns; Filter, Aggregate or sample columns; Apply Model or Plug-in Data Source • Data Destinations: Visualization Tools, Mining Tools, Data export
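
The column transformations listed on this slide (bin, filter, aggregate) can be sketched on plain Python records. These helpers are hypothetical stand-ins, not the Tool Manager's actual operations, and the `claims` table and bin edges are made up for illustration.

```python
from collections import defaultdict

def bin_column(rows, column, edges, labels):
    """Add `<column>_bin`; `labels` needs one more entry than `edges`."""
    out = []
    for row in rows:
        label = labels[-1]  # default: at or above the last edge
        for edge, name in zip(edges, labels):
            if row[column] < edge:
                label = name
                break
        out.append({**row, column + "_bin": label})
    return out

def filter_rows(rows, predicate):
    """Keep only the rows for which `predicate` is true."""
    return [row for row in rows if predicate(row)]

def aggregate(rows, key, column):
    """Sum `column` for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[column]
    return dict(totals)

# Hypothetical input table.
claims = [
    {"provider": "A", "amount": 120.0},
    {"provider": "B", "amount": 980.0},
    {"provider": "A", "amount": 40.0},
]
binned = bin_column(claims, "amount", edges=[100, 500], labels=["small", "medium", "large"])
large_enough = filter_rows(claims, lambda row: row["amount"] >= 100)
per_provider = aggregate(claims, key="provider", column="amount")
```

Each helper returns a new table rather than modifying its input, so the steps can be chained in any order before the result is sent to a visualization or mining tool.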

  16. Visualization Deployment Options with MineSet™ • Visualizations can be launched via MineSet™ Clients from • Any Windows application through the COM protocol as ActiveX components • Any MineSet™ client on Windows or SGI IRIX • WEB browsers through the COM protocol as ActiveX components • WEB browsers by recording the visualization in a web-media format or saving a snapshot

  17. Data Mining Wave -- Business Intelligence Solutions • Timeline: 1985, 1990, 1995, 2000, 2005 • Data Warehousing Wave: Internally Focused; Reactive Problem Solving; Build data repositories to understand the past; Re-engineer infrastructures to support business operations and process workflow • Data Mining Wave: Customer Focused; Proactive Solutions & New Market Development; Predict customer behavior, market trends, and competitive environments; Leverage IT infrastructure with Business Intelligence Solutions

  18. Closed-Loop Business Model • Diagram labels: Obtain Data; Identify Business Indicators; Measure Results; Investigate & Drill Down; Implement Action Plan; Develop Visual/Analytical Models & Business Scenarios; Approve Action Plan; Create Action Plan • Data Mining Pilot/Project: MineSet; SGI PSO or Data Mining Partner

  19. MineSet™ Business Intelligence Solution Examples • State of Texas Medicaid Fraud & Abuse Detection System • Situation Analysis: • About 25% of the Texas state budget goes to medical welfare programs • An estimated 10% of the $7.3B in Medicaid transactions are fraudulent • The previous system, the Surveillance and Utilization Review Subsystem (SURS), detected only 14% of fraudulent providers • The Requirements • More accurate fraud detection and higher prosecution rates

  20. State of Texas Welfare Fraud Management • The Business Intelligence Solution • ITC Fraud Spotlight data mining and fraud case management application • MineSet™ software for visualization • EDS System integration and management • SGI Origin 2000 high-performance IRIX server running Oracle 8i, ITC & MineSet™ Server • The Results • Detection rate of suspected fraud doubled to 38% • Solution paid for itself in less than 6 months • The fraud detection application is being leveraged into other state programs

  21. Clustering and Data Visualization with MineSet™ • Data on 14,000 providers analyzed by unsupervised neural networks • Neural networks clustered providers based on 100+ columns • Visualization tool displays clustering, showing known fraudulent providers • Subset of 100 providers with similar patterns investigated: Hit rate > 70% Source: ITC 9-22-98

  22. State of Texas Welfare Fraud Management • “These savings could help provide preventive care for several hundred thousand additional children” - Robin Herskowitz, Senior Policy Analyst with the Texas Office of the Comptroller of Public Accounts

  23. MineSet™ Business Intelligence Solution Examples • Bioinformatics scientists are using MineSet visual and analytical data mining to gain insights and understanding into genetic data. • Business Intelligence Solutions Examples • Analysis of Gene Expression Chip Data at Roche • Visualizing Gel Electrophoresis Data at the University of Michigan • Visualizing Sequence Comparisons at the EBI • Predicting Splice Junction Points

  24. Analysis of Gene Expression Chip Data at Roche • Researchers at Roche use SGI MineSet to analyze and understand gene expression chip data. • Visualization: Gene expression chip data displayed using MineSet Scatter Visualizer. Image Courtesy Roche Group.

  25. Visualizing Gel Electrophoresis Data at the University of MI • Professor Philip Andrews & Peter Ulintz, Biological Chemistry Dept., are using the MineSet Splat Visualizer to view the large amount of electrophoresis data. • Visualization: Electrophoresis data displayed in 2-dimensions using MineSet Scatter Visualizer. Image courtesy the University of Michigan.

  26. Visualizing Sequence Comparisons at the EBI • The European Bioinformatics Institute uses MineSet to visualize the partial results of a segment-wise "all-against-all" FASTA sequence comparison between two completed genomes. • Visualization: Genome comparisons using MineSet Map Visualizer. Image courtesy the EBI.

  27. Predicting Splice Junction Points from the GenBank Primate DB • Using splice junction information, the MineSet Decision Tree classifier is used to predict and identify splice junction points in other unknown primate sequences • Visualization: Decision Tree Visualization of DNA Splice Junction Data.

  28. Predicting Splice Junction Points from the GenBank Primate DB • Using splice junction information, the MineSet Evidence Classifier is used to predict and identify splice junction points in other unknown primate sequences • Visualization: Evidence Visualization of DNA Splice Junction Data.

  29. Sample list of Business Intelligence Solutions using MineSet™ SGI Proprietary and Confidential

  30. MineSet™ 3.1 Summary • Powerful Visual Data Mining • Visualizations launched from MineSet™ Clients or any Windows application • Insightful Analytic Data Mining • Classification, Regression, Association and Clustering data mining model development • Scalable Client/Server Architecture • Windows and SGI IRIX MineSet™ clients • NT/Linux MineSet™ Servers (Single Threaded 32 bit) • IRIX MineSet™ Servers (Parallel 64-bit) • Software Development Environment • APIs and plug-in interface and integration into other OLAP tools

  31. MineSet™ 4.X Client Enhancements • Future Client Enhancements • Visualization Environment • 2D Visualizations • Data Mining Control Environment • Data Mining Project and Model management • Tool Manager Wizards for common Data Mining Tasks • Control API • Client API that provides access to all MineSet™ capabilities • Web enabling of server function • Enhanced configuration and launching of Data Visualization tools in WEB browser environments • Application Authoring Support

  32. MineSet™ 4.X Server Enhancements • Future Server Enhancements • Data Mining Analytics • New Analytics • Time Sequenced Data (e.g., Association sequences) • Source Code export of MineSet™ Data Mining Models • Performance • Parallel Clustering • Parallel RDBMS, ODBC, Flat File, etc. to MineSet™ operations • Connectivity • Enhanced Direct RDBMS connectivity on NT, Linux and SGI IRIX MineSet™ Servers

  33. MineSet™ Client/Server Platforms from SGI • SGI Series 2000: SGI IRIX Servers • SGI Series 1000: NT & Linux Servers
