A comprehensive model for Data Quality, Value of Data, and User Interface Design

A comprehensive model for Data Quality, Value of Data, and User Interface Design. Andrew U. Frank, Geoinformation, TU Vienna, frank@geoinfo.tuwien.ac.at. What are the most important problems hindering the wide use of GIS today? Gueting said: support for temporal data. Spaccapietra said: semantics.

  1. A comprehensive model for Data Quality, Value of Data, and User Interface Design • Andrew U. Frank • Geoinformation • TU Vienna • frank@geoinfo.tuwien.ac.at Andrew Frank

  2. What are the most important problems hindering the wide use of GIS today? • Gueting said: Support for temporal data • Spaccapietra said: Semantics

  3. What are the most important practical problems for the GI industry? • Consider that the market for GI in Europe is only 1/10 of the comparable industry in the USA (with approximately the same population). • Impediments to business: • User Interface • Value of Data • Data Quality

  4. Comprehensive model of GI use • Different applications of GIS operate with very different concepts of what the GIS produces: • Produce maps (for decision makers) • Analyze situations • Explore data • Each time, a different user interface must be learned, which is costly and a large impediment.

  5. Economic value of information • (Geographic) information can only be used to improve decisions. • This is the only situation in which data can produce economic value. • Read: • Varian & Shapiro: Network economy

  6. Model of rational decision making: • A rational man (a.k.a. homo economicus) chooses between actions such that his well-being is optimized.

  7. Multiple critiques: • Not just economic (monetary) optimization, but general well-being. • Bounded rationality: neither the information nor the inference resources are available to make the optimal decision. • …

  8. Model of rational decision making is (only) a model • Descriptive model: it is often used when we rationalize our behavior after the fact. • We explain our actions in terms of optimizing our utility. • Prescriptive model: for administrative decisions the model is used to justify a decision and to communicate the arguments to others.

  9. Core model of rational decision making • Produce all candidate actions. • Exclude actions that fail the non-compensatory criteria. • Evaluate the utility of the remaining candidate actions using compensatory criteria and weights. • Select the best action (i.e. the action with the highest utility).
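
The four steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` type, the `choose_action` function, the hotel names, the property values, the threshold, and the weights are all hypothetical and not taken from the presentation, and utility is assumed to be a plain weighted sum of already normalized property values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    properties: dict  # property name -> normalized value (assumed scale 0..10)


def choose_action(candidates, thresholds, weights):
    """Return the candidate with the highest utility, or None if none qualifies."""
    # Step 2: exclude candidates that fail any non-compensatory (K.O.) criterion.
    remaining = [
        c for c in candidates
        if all(c.properties[k] >= minimum for k, minimum in thresholds.items())
    ]
    if not remaining:
        return None

    # Step 3: utility as a weighted sum over the compensatory criteria.
    def utility(c):
        return sum(w * c.properties[k] for k, w in weights.items())

    # Step 4: select the candidate with the highest utility.
    return max(remaining, key=utility)


# Step 1: the candidate actions (hypothetical hotels and scores).
hotels = [
    Candidate("A", {"beach": 8, "quiet": 3, "price": 5}),
    Candidate("B", {"beach": 5, "quiet": 9, "price": 6}),
    Candidate("C", {"beach": 9, "quiet": 9, "price": 1}),
]
thresholds = {"price": 4}  # hotel C fails this K.O. criterion
weights = {"beach": 0.5, "quiet": 0.3, "price": 0.2}

best = choose_action(hotels, thresholds, weights)
print(best.name if best else "no candidate left")
```

Hotel C scores highest on two properties but is removed before the weighted sum is ever computed, which is the difference between a non-compensatory and a compensatory criterion.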

  10. Actions change the state of the world:

  11. Hotel for a weekend: candidates

  19. My Criteria • Distance to beach • Classification of hotel • Restaurant • Garden • Trail access • Noise • Price

  20. Collection of data for these criteria

  21. Normalize data • Data is collected on different measurement scales (cf. Stevens' paper in Science, 1946). • Make it comparable by normalizing it, for example on a scale 0..10 (or 0..1), but allow positive and negative utility.
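
A minimal sketch of one way to normalize a criterion to the 0..10 scale mentioned on this slide. Min-max rescaling is an assumption here (the slide does not name a method), it is only meaningful for interval or ratio data, and the `normalize` function and the sample distances are hypothetical.

```python
def normalize(values, higher_is_better=True, scale=10.0):
    """Min-max rescale raw values to 0..scale; flip if lower raw values are better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are equal on this criterion: give each the midpoint.
        return [scale / 2 for _ in values]
    scores = [(v - lo) / (hi - lo) * scale for v in values]
    return scores if higher_is_better else [scale - s for s in scores]


# Hypothetical distances to the beach in metres; a shorter distance is better.
distances_m = [50, 300, 550]
scores = normalize(distances_m, higher_is_better=False)
print(scores)
```

A property that lowers utility, such as noise, could alternatively keep its natural direction and receive a negative weight, which is one reading of "allow positive and negative utility".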

  22. Non-compensatory criteria • Non-compensatory criteria (a.k.a. K.O. criteria) must be fulfilled for a candidate to be acceptable.

  23. Compensatory criteria • These criteria list the contribution of each property of the candidate actions to utility. • Weights indicate the contribution to utility per unit of the property.

  24. Unifying criteria

  25. Interaction with the spreadsheet: • The weights are not well determined; this is one of the major critiques of the method. • Too many non-compensatory criteria: no candidates are left. • Reduce the non-compensatory criteria. • Many similar solutions: reduce the weight for the common criterion.

  26. User interaction style • The user interface must be “direct manipulation”, not requiring a rational analysis, • but giving a ‘feeling’ for the connections between criteria and the optimal selection.

  27. User Interface Consideration • Shneiderman has pointed out that the only interface style that works consistently is direct manipulation. Such interfaces exploit human abilities which are not based on verbal (rational) understanding, but use the connection between actions and reactions. • Direct manipulation: • The user has some controls and the result reacts immediately to changes.

  28. Emotional aspects • Experience shows that users play with the weights until the solution feels right. • This means that it is emotionally acceptable. • Modern neurophysiology has observed that actual decision making in human brains is not rational, but emotionally controlled. • Insert a property ‘likable’ and assess each candidate. The weight given to this property then indicates the emotional influence.

  29. What are the controls in the rational decision model? • Non-compensatory criteria: • Threshold for fulfillment. • Compensatory criteria: • Weight • Which data is considered: for each property, either a threshold or a weight is set.

  30. A first sketch of an interface: • Very simple interface. • Interface is completely in the language of the user.

  31. General user interface because the model is general • The rational decision model is general; EVERY decision can be modeled with it. • Users have to learn only one conceptual model, not many different ones.

  32. Decision model links directly to user task • Intermediate elements are excluded, which simplifies the conceptualization (less is better!) • Compare with the standard approach: the GIS produces a map, which is used as input to the decision process. • Many details of the map form must be fixed, although they are not relevant for the decision process. • The user interface must have controls for these.

  33. Value of decision • In the model of rational decision making, the value of data can be estimated: • The value of the data is the improvement of the decision compared to deciding with no information. • For decisions between actions that have a cost, the difference between the highest and the lowest cost can be used as an estimate for the value of the decision.

  34. Value of data • Properties which have more weight contribute more to the decision. The value of the decision can be distributed to the data according to the weights.
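
As a worked example of the two estimates above (value of the decision as the spread between the highest and lowest cost, then distributed over the data according to the weights), here is a small sketch. The function names, the costs, and the weights are hypothetical, and splitting the value in proportion to the absolute weights is one straightforward reading of "according to the weights".

```python
def value_of_decision(costs):
    """Estimate the value of the decision as highest cost minus lowest cost."""
    return max(costs) - min(costs)


def value_per_property(decision_value, weights):
    """Distribute the decision value over the properties in proportion to |weight|."""
    total = sum(abs(w) for w in weights.values())
    return {k: decision_value * abs(w) / total for k, w in weights.items()}


# Hypothetical costs of the candidate actions (e.g. price of a weekend stay).
costs = [120, 180, 200]
weights = {"beach": 2, "quiet": 1, "price": 1}

decision_value = value_of_decision(costs)
shares = value_per_property(decision_value, weights)
print(decision_value, shares)
```

With these numbers the decision is worth 80, of which half is attributed to the beach-distance data because that property carries half of the total weight.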

  35. Price of data • The value of the data is not the price at which it can be sold: • Deduct the cost of obtaining and using it. • The price must be set for many users, whereas the value is specific to a decision. • Opportunities for specialized user interfaces, connections to data collections and thus BUSINESS.

  36. Data Quality • Quality of the data is typically measured from the perspective of the data producer. Metadata standards codify this approach. • Observations indicate that users are not using metadata. How should a user decide on the usability of data from metadata?

  37. Data quality from a user perspective: • Data is good if it leads to the best decision. It is bad if it makes me take the wrong decision. • Data quality is the risk of me making the wrong decision.

  38. Can we translate a producer's assessment of data quality to the risk of the user making the wrong decision? • Example: Precision • The producer of the data states that the distance to the beach is 100 m ± 50 m (one standard deviation, meaning that about 68% of all values lie between 50 and 150 m).
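
To illustrate how the stated precision could be turned into a risk, the sketch below assumes the error is normally distributed (the slide's 68% figure suggests this, but it remains an assumption) and that the user has a hypothetical K.O. threshold of 150 m for the distance to the beach. It uses `statistics.NormalDist` from the Python standard library (Python 3.8+).

```python
from statistics import NormalDist

# Producer's statement: 100 m with a standard deviation of 50 m (normality assumed).
stated = NormalDist(mu=100.0, sigma=50.0)

# Share of values within one standard deviation, i.e. between 50 m and 150 m.
share_within_one_sigma = stated.cdf(150.0) - stated.cdf(50.0)

# Hypothetical user threshold: the hotel is unacceptable beyond 150 m.
max_acceptable_m = 150.0
risk_too_far = 1.0 - stated.cdf(max_acceptable_m)

print(f"within one sigma: {share_within_one_sigma:.3f}")
print(f"risk that the true distance exceeds the threshold: {risk_too_far:.3f}")
```

Under these assumptions, a hotel that passes the threshold on paper still has roughly a 16% chance of actually violating it, which is the kind of user-side risk the slide asks for.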

  39. Translation of completeness to risk • Incomplete data can make us miss the best solution. The risk is proportional to the amount of missing data.

  40. Example: • 50% of the data are missing (realistic in the selection of hotels based on web browsing). • Reduce the value of the data in proportion to the risk.
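
The proportional reduction in this example amounts to simple arithmetic, sketched below. The function name and the starting value of 80 are hypothetical; only the 50% missing share comes from the slide.

```python
def value_after_completeness(value, missing_fraction):
    """Reduce the value of the data in proportion to the share of missing data."""
    if not 0.0 <= missing_fraction <= 1.0:
        raise ValueError("missing_fraction must be between 0 and 1")
    return value * (1.0 - missing_fraction)


# Hypothetical value of the data for this decision, with 50% of the hotels missing.
reduced = value_after_completeness(80.0, 0.5)
print(reduced)
```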

  41. Temporal currency • Temporal currency is a standard data quality element. • Temporal currency is not ‘separable’ from other criteria.

  42. Effects of temporal currency • Time passed since collection reduces: • Precision • Completeness (omissions, commissions).

  43. Data does not change, but quality is diluted with time:

  44. Estimate movement per period: • Reduce precision proportionally. • Estimate appearance/disappearance of objects: • Reduce completeness proportionally.
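
A minimal sketch of how a data provider might apply these two reductions. The linear model is an assumption (the slide only says "proportionally"), and the function name, parameter names, and all numbers are hypothetical.

```python
def aged_quality(sigma_m, completeness, periods, drift_m_per_period, turnover_per_period):
    """Convert the age of the data into reduced precision and completeness.

    Assumes both effects grow linearly with the number of periods elapsed.
    """
    # Objects move: the positional standard deviation grows with elapsed time.
    aged_sigma = sigma_m + drift_m_per_period * periods
    # Objects appear and disappear: completeness shrinks, but not below zero.
    aged_completeness = max(0.0, completeness * (1.0 - turnover_per_period * periods))
    return aged_sigma, aged_completeness


# Hypothetical: data collected 4 years ago, 5 m drift and 5% turnover per year.
sigma, complete = aged_quality(
    sigma_m=50.0, completeness=1.0, periods=4,
    drift_m_per_period=5.0, turnover_per_period=0.05,
)
print(sigma, complete)
```

The aged values can then be fed into the same precision and completeness calculations used for fresh data, so temporal currency needs no separate treatment in the decision model.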

  45. Decision model translates data quality to risk • The decision model translates • data quality to risk and • risk to a reduction in the value of the data.

  46. Conclusion • The model of rational decision making gives a single conceptual framework in which three important practical problems of today's use of Geographic Information can be discussed:

  47. User Interface • Decisions can be modeled as the selection of the action which optimizes utility, given some conditions. • The user must select: • the elements which influence the decision (selection of data layers, themes, ...) • the candidate actions • the minimal requirement for each property • his preferences, translated to weights for each property. • This is the same for many (all?) decision situations.

  48. Value of data • The value of the data is in the improvement of the decision. The contribution of each data element is proportional to the weight of its property.

  49. Data quality from a user perspective • Better data reduces the risk of taking a wrong decision. • Precision and completeness can be translated directly to the risk of taking a wrong decision, and this risk reduces the value of the data. • Temporal currency is first converted to reduced precision and completeness (this should be done by the data provider).

  50. Closed loop semantics • My answer to the problem of semantics: • Link observation semantics in the database to action semantics in the decision.
