Activating Metadata – Case Study 1
Land Cover and the REVIGIS project
Richard Wadsworth (1), Lex Comber (2) & Peter Fisher (3)
1. CEH Monks Wood, UK; 2. Leicester University; 3. University of London
What was REVIGIS?
Revision of Uncertain Geographic Information, an EU project (under IST-FET, 2000 to 2004).
It developed methods to process information that was uncertain in “what” and uncertain in “where”.
Case studies included:
• Cadastre – strong “what”, strong “where”
• Sand Dunes – strong “what”, weak “where”
• Land Cover – weak “what”, weak “where”
Underlying Problem
Historically, maps supported the description of the phenomenon; now the documentation supports the map.
The user has less information on the origins and meanings of the data, and less motivation for understanding them.
“A lecture is the process by which information goes from the notes of the lecturer to the notes of the student without passing through the consciousness of either.”
How did we get there?
Technology, science and public policy are always changing.
Repeated natural resource inventories often use different methods and record different categories from earlier studies.
It is difficult to distinguish changes in the phenomenon from changes in the method, classes, technology, objectives, etc.
Problem Domain
Two land cover maps (LCMGB, LCM2000):
• Produced by the same organisation
• Funded from the same sources (mostly)
• Produced from similar data (Landsat & ground survey)
But … issued with the caveat that they should not be used to estimate change. Why the caveat?
Reasons for the Caveat
• Objectives: scientific v. policy support
• Conceptualisations: Target Classes v. Broad Habitats
• Representation: pixel v. parcel
• Technology: GIScience and GISystems developments
• Metadata: aspatial v. object-level metadata
Commissioning context: LCM1990
[Figure: network of actors and their links – money, information, skills, control]
Commissioning context: LCM2000
[Figure: network of actors and their links – money, information, skills, control]
Classes, labels & meanings
We all have “prototypes” in our heads, and we match a class label with that prototype; there may be a mis-match.
There are many examples in land cover (and every other type of geographic information), e.g.:
• What is “unimproved grassland”?
• What is a “forest”?
• What is a “bog”?
Uncertainty in “what”
• Improved Grassland
– Improved by reseeding and/or fertiliser
– Includes fertile pastures with Juncus effusus
– May be confused with semi-natural swards where abandoned or little-managed
• Acid Grassland
– Generally not reseeded or fertiliser-treated
– Pastures with Juncus effusus are included
– Management may obscure distinctions from Improved Grassland
What is a forest?
http://home.comcast.net/~gyde/DEFpaper.htm – 150 pages of definitions of “forest”!
[Figure: forest definitions plotted by tree height (m) against canopy cover (%), one point per country or organisation – from Kyrgyzstan and Kenya to Zimbabwe, including the United Nations FRA 2000 definition.]
What is a Bog?
In LCMGB (1990) Bog was defined as:
… permanent waterlogging, …
… permanent or temporary standing water …
… Myrica gale and Eriophorum spp. …
… water-logging, perhaps with surface water …
In LCM2000 Bog was defined as:
… areas with peat >0.5 m deep …
Consequences in one 100 x 100 km square:
1990: 12 pixels of bog (<1 ha); 2000: 120,728 pixels of bog (~75 km2).
Land Cover Maps of GB
[Figure: the LCMGB 1990 and LCM 2000 maps side by side]
Land Cover Maps of GB
[Figure: detail of the same area in LCMGB 1990 and LCM 2000]
But the fragments are obviously of the same area …
The “traditional” approach to inconsistency
Aggregate to a reduced number of common “super classes” (thematic simplification).
An extreme case (Wulder et al. 2004): land cover maps of Canada are aggregated to just 2 classes, “forest” and “not forest”!
(If necessary, data sets are also aggregated spatially.)
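As an illustration only (not from the REVIGIS materials), a minimal sketch of this many-to-one relabelling in Python; the class names and the simplify helper are hypothetical, echoing the Wulder et al. (2004) forest / not-forest extreme:

```python
# Many-to-one thematic simplification: every source class maps to exactly
# one "super class". Class names are hypothetical, for illustration only.
SUPER_CLASS = {
    "Coniferous woodland": "forest",
    "Broadleaved woodland": "forest",
    "Improved grassland": "not forest",
    "Acid grassland": "not forest",
    "Bog": "not forest",
}

def simplify(pixel_classes: list[str]) -> list[str]:
    """Relabel each pixel with its aggregated super class."""
    return [SUPER_CLASS[c] for c in pixel_classes]

print(simplify(["Bog", "Coniferous woodland"]))
# ['not forest', 'forest']
```

Note that the mapping is single-valued: once “Bog” has been relabelled “not forest”, nothing distinguishes it from grassland – exactly the information loss criticised on the next slide.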
Problems with the “traditional” approach
• Reduces what can be said about the phenomenon.
• The effectiveness of the process is rarely tested.
• Sensitivity is rarely discussed.
• Can increase variability (aggregation is subjective).
Semantic-Statistical Approach
• Extend “many-to-one” aggregation to a “many-to-many” look-up table (LUT).
• Extend “belongs / doesn’t belong” (1/0) to “expected, uncertain, unexpected” (+1/0/−1).
• Identifies inconsistencies between representations (and one form of inconsistency is change).
A hypothetical segment
For class A:
• expected score = 18;
• uncertain score = 7 (4 class B pixels + 3 class C pixels);
• unexpected score = 1 (the single pixel of class D).
The hypothetical segment under a second classification
For class A:
• expected score = 19 (class X);
• uncertain score = 2 (class Z);
• unexpected score = 5 (class Y).
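A minimal sketch of how these score triples arise from a segment’s pixel counts and a many-to-many LUT. The +1/0/−1 relations below are our reading of the hypothetical example (class A is expected under A and X, uncertain under B, C and Z, unexpected under D and Y); in the real method the LUTs come from expert opinion, as later slides discuss.

```python
# Score one segment against a target class using a LUT of +1/0/-1
# relations. Class names follow the hypothetical example above.
LUT1_TO_A = {"A": +1, "B": 0, "C": 0, "D": -1}  # first classification
LUT2_TO_A = {"X": +1, "Z": 0, "Y": -1}          # second classification

def score(pixel_counts: dict[str, int], lut: dict[str, int]) -> tuple[int, int, int]:
    """Sum pixel counts into (expected, uncertain, unexpected) scores."""
    expected = sum(n for c, n in pixel_counts.items() if lut[c] == +1)
    uncertain = sum(n for c, n in pixel_counts.items() if lut[c] == 0)
    unexpected = sum(n for c, n in pixel_counts.items() if lut[c] == -1)
    return expected, uncertain, unexpected

print(score({"A": 18, "B": 4, "C": 3, "D": 1}, LUT1_TO_A))  # (18, 7, 1)
print(score({"X": 19, "Z": 2, "Y": 5}, LUT2_TO_A))          # (19, 2, 5)
```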
Combining Scores
Scores are treated as if they were probabilities and combined using Dempster-Shafer:
Belief = (Bel1·Bel2 + Unc1·Bel2 + Unc2·Bel1) / β, where β = 1 − Bel1·Dis2 − Bel2·Dis1
Bel1 & Bel2 = the beliefs (expected), Unc1 & Unc2 = the uncertainties (uncertain), Dis1 & Dis2 = the disbeliefs (unexpected).
For class A:
Bel1 = 18/26 = 0.692, Unc1 = 7/26 = 0.269, Dis1 = 1/26 = 0.038
Bel2 = 19/26 = 0.731, Unc2 = 2/26 = 0.077, Dis2 = 5/26 = 0.192
Therefore: β = 1 − 0.692·0.192 − 0.731·0.038 = 0.839
Belief = (0.692·0.731 + 0.269·0.731 + 0.077·0.692) / 0.839 = 0.901
The belief has increased, so we consider the segment consistent for class A.
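A minimal sketch of this combination rule in Python; it reproduces the worked numbers for class A. The function names are ours, not from the REVIGIS materials.

```python
# Dempster-Shafer combination of two (belief, uncertainty, disbelief)
# triples, as on the slide. Score triples are normalised by the
# segment's pixel total so they behave like probabilities.

def normalise(expected: int, uncertain: int, unexpected: int):
    total = expected + uncertain + unexpected
    return expected / total, uncertain / total, unexpected / total

def combine(bel1, unc1, dis1, bel2, unc2, dis2) -> float:
    beta = 1 - bel1 * dis2 - bel2 * dis1  # normalising constant
    return (bel1 * bel2 + unc1 * bel2 + unc2 * bel1) / beta

b1, u1, d1 = normalise(18, 7, 1)  # 0.692, 0.269, 0.038
b2, u2, d2 = normalise(19, 2, 5)  # 0.731, 0.077, 0.192
print(round(combine(b1, u1, d1, b2, u2, d2), 3))  # 0.901
```

Because 0.901 exceeds both input beliefs, the two classifications reinforce each other and the segment is judged consistent for class A.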
Object-level metadata!
“D_c,57:Dm_a,13:Dm_c,10:Dm_b,7:D_b,6”
PPL: the top 5 per-pixel maximum-likelihood (MLC) classes.
Spectral heterogeneity information – produced in parallel with the classification process, not as part of it.
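A minimal sketch of reading such a PPL string. The format assumed here (colon-separated “code,percent” pairs, percentages of the parcel’s pixels) is inferred from the example above, and parse_ppl is our name, not part of the LCM2000 documentation.

```python
# Parse an LCM2000-style per-pixel list (PPL) string into
# {spectral-variant code: fraction of the parcel's pixels}.
# Assumed format: colon-separated "code,percent" pairs.

def parse_ppl(ppl: str) -> dict[str, float]:
    scores = {}
    for pair in ppl.split(":"):
        code, percent = pair.split(",")
        scores[code] = int(percent) / 100.0
    return scores

print(parse_ppl("D_c,57:Dm_a,13:Dm_c,10:Dm_b,7:D_b,6"))
# {'D_c': 0.57, 'Dm_a': 0.13, 'Dm_c': 0.1, 'Dm_b': 0.07, 'D_b': 0.06}
```

Fed through a Broad Habitat / spectral-variant LUT, these fractions yield the first of the two score triples described on the next slide.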
Expected, Unexpected and Uncertain
With the LCMGB and LCM2000 maps we need two LUTs, to calculate the expected, uncertain and unexpected scores twice:
• once from the “PerPixList” (object-level metadata);
• once by intersecting the segments with the LCMGB pixels.
Semantic relations expressed in a LUT
Relating Broad Habitats to the spectral variants in the PerPixList.
Semantic relations expressed in a LUT
Expert opinion of the relationship between Broad Habitats (LCM2000) and Target Classes (LCMGB).
Good things about the Semantic-Statistical approach
• Doesn’t destroy or “throw away” any data.
• Few (2%) of the consistent segments appear to have changed.
• Most (80%) of the inconsistencies are due to “error”, not change.
Problems with the Semantic-Statistical approach
Semantic relations (the LUTs) between LCMGB and LCM2000 are based on expert opinion, but:
• You might not have an expert.
• Experts need to make lots of decisions.
• The allocation to +1/0/−1 is not very subtle.
• Decisions are “opaque” (why do “A” and “X” have an expected relationship?).
• {Knowledge-based corrections – what to trust?}
Unanswered questions
• How to record the “commissioning context”? (Especially the compromise between what is desirable and what is feasible.)
• How to describe the data conceptualisations (i.e. more than the class labels)?
• Who is responsible for specifying the relationships between conceptualisations?
• How is that information to be communicated to the user?
• Different experts have different “views” – how do we capture these?
Acknowledgements and Thanks
We wish to thank all our collaborators on REVIGIS as well as our colleagues in our host Institutes.
REVIGIS: European Commission IST-1999-14189; Coordinator Robert Jeansoulin, Université de Provence, France.