Previously in class about information visualization...

keegan

Presentation Transcript


  1. Previously in class about information visualization... Defined the goal of information visualization and discussed the visualization tasks for BI. Identified methods of enhancing understanding and amplifying cognition: reduce search time and enhance recognition of patterns (using pre-attentive processing); provide focus/emphasis through affordances. Reviewed heuristics from Tufte and Nielsen. Saw an example of a multivariate visualization for the task of communication.
  2. Previously in the readings... Understand quantitative relationships (optional review): nominal vs. ordinal vs. interval vs. hierarchical relationships; ranking vs. ratio vs. correlation; measures of average and distribution. Concepts of tables and graphs: tables are used to see individual values; graphs are used to reveal relationships among multiple values. Tables and graphs should be sorted to highlight the key message. Relative use of pie charts, bar charts, line charts, sparklines, small multiples, box plots... Showing relationships vs. deviation vs. correlation vs. ranking vs. time-series vs. part-to-whole vs. distribution. Importance of sorting tables and graphs.
  3. What’s up for today? Finish evaluating a few sample individual visualizations. Explain how visualizations fit within the overall BI architecture. Discuss the differences between OLAP and data mining. Present dashboards as the most common OLAP visualization tool. Begin discussion of data mining.
  4. The purpose is to compare one product’s sales to other products. Good or bad visualization?
  5. The purpose is to display sales revenue in the state of Kansas associated with 12 products across the four quarters of a year. How would you improve this visualization?
  6. BI – what is it again??? “Business Intelligence” is making purposeful use of data in decision making. The goals of BI are: To support human decision making by providing as much understandable, complete, relevant, well-organized information as necessary and helpful. To automate some decisions to relieve humans of routine decision making tasks. To discover new issues/relationships/correlations that may not be able to be readily conceived by humans.
  7. BI Architecture
  8. Overall Components of BI Architecture: Data sources available for input. ETL tools to bring input data into an integrated data source. Integrated data source (usually a data warehouse): structured and unstructured data; internal and external data. Metadata repository: data definitions and meanings; business rules and process decisions. Analytical tools: OLAP (Online Analytical Processing); statistical analysis; data mining. Data visualization: graphs, tables, pictures.
  9. Online analytical processing tools: The vast majority of output from BI is OLAP-related. Provide information to support both ad-hoc and consistent queries for managerial decision making. Provide multi-dimensional data analysis techniques. Work primarily with data aggregation (data mart/derived data model). Provide advanced statistical analysis. Support access to very large databases through additional data structures such as cubes (SQL Server Analysis Services). Contain enhanced query optimization algorithms to speed up query processing (SQL Server Analysis Services).
  10. OLAP Results: Generates relatively standardized reports in response to ad-hoc queries. Answers questions such as: Which products sold the most quantity, by type of product and geographic region? Which stores are currently most profitable? Which are least profitable? Used frequently to support short- and long-term managerial decision making. OLAP Visualization: Presented in standard displays that are accessed frequently. Dashboard format used to provide a quick and comprehensive overview of business status. Presented in Excel or other spreadsheet format. Display the output using either a standard report generator (Crystal Reports, Access, etc.) or graphically.
  11. Data mining tools Data mining is the set of activities used to find new, hidden or unexpected patterns in data. Data mining tools: use large sets of data; uncover patterns based on statistical and artificial intelligence algorithms; form computer models based on the findings; and use the models to predict business behavior. Common synonyms for data mining include knowledge discovery, information harvesting, & pattern analysis. Proactive tools, used for discovery and prediction.
  12. Data Mining Results: Generates information about patterns in data. Data mining provides answers to previously ambiguous questions, but a question area must be defined. May produce information such as: Which products should be promoted to a pre-defined type/category of customer? Which patients have the greatest likelihood of being hospitalized within the next year? Which securities are the most profitable to buy/sell in a particular environment? Data Mining Visualization: Focus is on discovery and analysis, rather than reporting, monitoring or communicating a message. Uses primarily graphical output to display the patterns. Included as part of the data mining tool. Can also incorporate the results in standardized reporting tools and/or dashboards, but information is already “discovered” by that time.
  13. Is it OLAP or Data Mining?? How many people between the ages of 15-30 are diagnosed with type 2 diabetes? What is the quantity breakdown by county in the U.S. for people diagnosed with type 2 diabetes? What is the relationship between weight, exercise, age, smoking, and the prevalence of type 2 diabetes? What demographic factors are related to type 2 diabetes?
  14. Is it OLAP or data mining?? (TEC) How many different customers did we serve? How many applicants did we place? Which customer was our most profitable? Which customers have the greatest likelihood of increasing their number of temporary employees next year? Which geographic region was our most profitable last quarter? Which geographic region has the fastest growth rate measured by number of employees placed over the last 3 years?
  15. Dashboards: Most common visualization method for OLAP. Visual display, not printed. Must have metrics. What is a metric, again?? Key information: most important information to monitor one or more objectives; usually related directly to key performance indicators. Consolidated: fits on one screen (no scrolling!); designed to be monitored at a glance.
  16. Dashboard examples galore! http://www.infosol.com/business%20intelligence/library-dashboards.aspx http://www.dundas.com/dashboard/online-examples/ http://www.tableausoftware.com/ http://www.exceluser.com/dash/samples.htm http://dashboardsbyexample.com/ http://www.dashboardzone.com/
  17. Dashboard videos abound! (mostly from vendors of dashboard products...) http://www.it-performs.com/services/dashboard-centre/dashboard-videos http://www.youtube.com/watch?v=3Stuh7-RyuE http://www.youtube.com/watch?v=EJ9CNhgh8EY http://www.dminebi.com/dmine-dashboard-videos/ http://www.youtube.com/watch?v=V9GMCS-WjyI&feature=related http://www.youtube.com/watch?v=0AS9TIK1QFk&feature=related
  18. Dashboards are not new... Derived from the work on executive information systems (late 1980s through the 1990s). Further roots in the work on the “balanced scorecard” concept to broaden perspective beyond financials alone. Uses the dashboard metaphor to develop fast recognition and appeal.
  19. Always need to know the goal
  20. Typical dashboard data
  21. Common mistakes. Overall design: Exceeding the boundaries of a single screen. Limiting design to the dashboard metaphor. Choosing ineffective or inappropriate visualization methods. Poor flow/arrangement of presentation of data. Content: Choosing a deficient, inappropriate or ineffective measure. Supplying inadequate context for the data. Displaying excessive detail or precision. Detailed design (look and feel): Misusing or overusing color; meaningless variety of color and shape. Poor highlighting of important data. Cluttering the display with useless decoration.
  22. Well-designed dashboard: Delivers information that is: Exceptionally well-organized. Condensed (provides summaries and exceptions). Specific to the requirements of the audience. Presented on the medium of choice for the audience (computer, phone, tablet, etc.). Flexible (able to be pursued in more detail beyond the dashboard).
  23. Key Goals (Tufte, 1980s; Few, 2010s): Understand and make best use of screen real estate. Maximize the data-ink/total-ink ratio (or data pixels/total pixels ratio...). Eliminate all unnecessary non-data pixels. De-emphasize all remaining non-data pixels and make them slip into the background of the overall design. Highlight the most important data pixels.
  24. [Figure: dashboard regions labeled as emphasized, neither emphasized nor de-emphasized, and de-emphasized]
  25. Maximize data pixels/total pixels ratio
  26. Junk pixels: Grid lines in graphs that don’t need precision. Backgrounds that don’t provide delineation of sections on the dashboard. 3-D that doesn’t provide additional variables or layers of analysis. Drawings that are not part of the data, including detailed logos. Colors that don’t highlight or emphasize data. Meters and gauges that don’t incorporate pre-attentive processing.
  27. Good design: Arrange the overall design to reflect how the intended audience “thinks” about the decisions to be made. Group related data. Arrange the data in a meaningful order (low to high; high to low). Use bright colors sparingly and judiciously. Avoid use of a colored background. White space is an effective delimiter. Use fonts with good legibility and readability.
  28. So, what about data mining visualization? Also graphical, but designed for an analyst to discover patterns, not to communicate information for managerial decision making. Must understand a bit more about data mining while discussing visualization.
  29. Opening Vignette: Data Mining Goes to Hollywood! A Typical Classification Problem. [Figure labels: Dependent Variable; Independent Variables]
  30. The DM Process Map in IBM SPSS Modeler
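
As a rough illustration of the OLAP versus data mining distinction drawn in slides 9 through 14, the sketch below contrasts the two kinds of question. It is not taken from the slides and does not use SQL Server Analysis Services or IBM SPSS Modeler: the data sets `SALES` and `HISTORY` are invented, the function names are hypothetical, and the single-threshold "decision stump" is only a toy stand-in for the statistical and AI algorithms a real data mining tool would apply.

```python
from collections import defaultdict

# Hypothetical sales records: (product_type, region, quantity). Invented data.
SALES = [
    ("beverage", "east", 120),
    ("beverage", "west", 80),
    ("snack", "east", 60),
    ("snack", "west", 150),
    ("beverage", "east", 30),
]

# Hypothetical customer history: (growth_rate_pct, increased_temp_staff). Invented data.
HISTORY = [(2, False), (3, False), (5, False), (8, True), (9, True), (12, True)]


def olap_rollup(records):
    """OLAP-style question: total quantity by product type and region.

    This only aggregates data that already exists; nothing is predicted.
    """
    totals = defaultdict(int)
    for product_type, region, quantity in records:
        totals[(product_type, region)] += quantity
    return dict(totals)


def fit_stump(examples):
    """Data-mining-style step: learn a model from past examples.

    Tries each observed value as a threshold and keeps the one for which
    the rule 'value >= threshold' misclassifies the fewest examples.
    """
    best_threshold, best_errors = None, len(examples) + 1
    for candidate, _ in sorted(examples):
        errors = sum((value >= candidate) != label for value, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


def predict(threshold, value):
    """Apply the learned rule to a new, unseen customer."""
    return value >= threshold


if __name__ == "__main__":
    print(olap_rollup(SALES))        # reporting on what happened
    threshold = fit_stump(HISTORY)   # discovering a pattern
    print(threshold, predict(threshold, 10))  # predicting future behavior
```

The first function answers a "how much, by which dimension" question of the kind a dashboard would display; the second and third build and apply a model, which is the discovery-and-prediction role the slides assign to data mining.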