Safeguarding GIS Data through Metadata Christopher Cialek & Nancy Rader Minnesota Geospatial Information Office September 2004, updated May 2014
What’s Metadata? If you had two cans without labels, which would you eat? Without a label, how would you know which was tuna and which was cat food? Tuna? Cat food?
By the end of the Workshop . . . You will: • Understand what metadata is; appreciate its value • Become familiar with metadata standards • Create your own metadata records • Use a search engine to find data • Know where to go for help
What’s Metadata? “the information that makes data sets understandable, usable and sharable.” International Organization for Standardization (ISO)
What’s Metadata? FDA Food Label We often use metadata without knowing it -- even a food label is an example of metadata! • Structured format • Specific content • Necessary information
What’s Metadata? Search metadata to find resources in the library • Library community has developed metadata systems to describe books • Dublin Core • Allows you to search by title, author, subject…
What are Metadata Used For? • MANAGING DATABASES • COMPARING DATA SETS • FACILITATING DATA SHARING • PROVIDING TECHNICAL SPECS • FINDING DATA
Standards • Content Standard for Digital Geospatial Metadata (CSDGM) • Established by the FGDC in 1994 • Foundation standard for the NSDI • “Mandatory” for federal agencies • www.fgdc.gov/metadata/geospatial-metadata-standards#csdgm
Standards • Minnesota Geographic Metadata Guidelines (MGMG) • Derived from the federal standard in 1998 • Simplified, but retains all required fields • Became state standard in 1999 • www.mngeo.state.mn.us/committee/standards/mgmg/metadata.htm
Standards • ISO 19115 Geographic Information: Metadata • International geospatial metadata standard • More information: www.fgdc.gov/metadata/geospatial-metadata-standards#fgdcendorsedisostandards
Metadata Structure Examples: Lineage in three standards
• CSDGM Lineage: Source Information (Source Citation [Citation Information], Source Scale, Type of Source Media, Source Time Period of Content [Time Period Info, Source Currentness], Source Citation Abbrev, Source Contribution); Process Step (Process Description, Source Used Citation, Process Date & Time, Source Produced, Process Contact [Contact Information])
• MGMG Lineage: no sub-elements shown
• ISO Lineage: Statement; Process Step (Description, Rationale, Date & Time, Processor, Source); Source (Description, Scale, Spatial Reference Sys, Source Citation, Source Extent, Source Step)
• Legend: mandatory; mandatory, if applicable; optional
The MGMG • SUBSET OF A FEDERAL STANDARD • MADE UP OF SEVEN SECTIONS • DRIVES WEB SEARCH TOOLS • USED BY OVER 100 ORGS IN MN • MN STANDARD; FGDC RECOGNIZED
A Walk Through the Guidelines: 1 Identification • Elements: Originator; Title; Identifier (optional); Abstract; Purpose; Content Date; Currentness; Progress; Maintenance and Update Frequency; Spatial Extent Description; Bounding Coordinates; Keywords; Constraints; Contact Information; Browse Graphic Information; Associated Data Sets • Originator: name of organization or individual that developed the data • Title: name by which the data set is known
1 Identification: Title • Too cryptic: niclcpy3 • Too general: Landuse • Acronyms: Wisconsin DOPs • Too detailed: Wetland Polygon Coverage Overlay for St. Cloud (USGS Quad) • Just right (includes theme, area, date): Minnesota Telecommunications Service Area Boundaries, 2007
1 Identification • Abstract: summary of what’s in the data set. Example: “This land cover data set was derived from 30 meter resolution LANDSAT Thematic Mapper (TM) satellite imagery. Classification is divided into 15 classes with source imagery dates ranging from September 1991 to August 1996. Both a raster and a vector version are available.” • Purpose: why the data set was developed. Example: land use planning, natural resource monitoring
1 Identification • Time Period of Content: single date that best describes when the data are current. Example: 08/2004 • Currentness: text describing what the Time Period of Content date refers to, e.g., the range of dates of aerial photography. Example: “Date of source imagery (LANDSAT-5 TM, bands 3, 4, and 5) ranges from September 20, 1991 to August 30, 1996.”
1 Identification • Spatial Extent: description of the geographic area covered. Example: Lyon County, Minnesota • Bounding Coordinates: the extreme north, south, east and west limits of coverage, expressed in latitude and longitude values. Example: W -95.4, E -89.5, N 49.4, S 45.5
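The bounding-coordinate rule above lends itself to a mechanical sanity check. The following is a minimal Python sketch, assuming coordinates are recorded in decimal degrees and the area does not cross the 180th meridian. The class and function names are hypothetical and are not part of MGMG or of any tool named in this workshop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Bounding coordinates in decimal degrees (hypothetical helper)."""
    west: float
    east: float
    north: float
    south: float


def bounding_box_problems(box: BoundingBox) -> list[str]:
    """Return a list of problems; an empty list means the box looks plausible."""
    problems = []
    if not (-180.0 <= box.west <= 180.0 and -180.0 <= box.east <= 180.0):
        problems.append("longitude must be between -180 and 180")
    if not (-90.0 <= box.south <= 90.0 and -90.0 <= box.north <= 90.0):
        problems.append("latitude must be between -90 and 90")
    if box.south > box.north:
        problems.append("south limit is north of the north limit")
    # Assumption: the covered area does not cross the 180th meridian.
    if box.west > box.east:
        problems.append("west limit is east of the east limit")
    return problems


# Values taken from the slide example.
slide_box = BoundingBox(west=-95.4, east=-89.5, north=49.4, south=45.5)
print(bounding_box_problems(slide_box))
```

A check like this only catches swapped or out-of-range values; it cannot tell whether the box actually matches the area named in Spatial Extent.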
1 Identification • Keywords: words or phrases that summarize the theme and location of the data set, together with the name of any formal list of keywords (thesaurus). Too general: GIS, layer, survey. Just right: feedlot, animal agriculture, hog • Constraints: any restrictions to the access or use of the data set. Access example: “Due to increased security measures taken after 9/11/01, this data set is no longer available online.” Use example: “. . . right to use these data for any internal purpose”
1 Identification • Contact Information: the person who can answer questions about the content or development of the data set
1 Identification • Browse Graphic: a sample illustration of the data set
1 Identification • Associated Data Sets: information about other, related data sets that may be of interest (“If you’re interested in this data set, here are others that may also interest you.”). This is NOT a list of source materials; those are described in Lineage. Example: “For information on other air photos available for Minnesota, see www.mngeo.state.mn.us/chouse/airphoto/”
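As a rough illustration of how the Identification elements fit together in one record, here is a Python sketch that lists which elements are still blank. The element names come from the slides above; the dictionary layout and function name are hypothetical. It reports blanks only and does not say which elements the guidelines require. Identifier is skipped by default because the slides mark it optional.

```python
# Element names as they appear in the Section 1 slides.
IDENTIFICATION_ELEMENTS = [
    "Originator", "Title", "Identifier", "Abstract", "Purpose",
    "Time Period of Content", "Currentness", "Spatial Extent",
    "Bounding Coordinates", "Keywords", "Constraints",
    "Contact Information", "Browse Graphic", "Associated Data Sets",
]


def blank_elements(record: dict[str, str],
                   skip: frozenset[str] = frozenset({"Identifier"})) -> list[str]:
    """Return the Identification elements that are missing or empty."""
    return [
        name for name in IDENTIFICATION_ELEMENTS
        if name not in skip and not record.get(name, "").strip()
    ]


# A partly written record, using example values from the slides.
draft = {
    "Title": "Minnesota Telecommunications Service Area Boundaries, 2007",
    "Time Period of Content": "08/2004",
    "Spatial Extent": "Lyon County, Minnesota",
    "Abstract": "   ",  # whitespace only, so it counts as blank
}
for name in blank_elements(draft):
    print("Still to write:", name)
```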
A Walk Through the Guidelines: 2 Data Quality • Elements: Attribute Accuracy; Logical Consistency; Completeness; Positional Accuracy; Lineage; Source Scale • Attribute Accuracy: qualitative or quantitative explanation of how accurately features in the data set have been described, including procedures used to assess accuracy (examples: field-checking, checkplots, frequency counts to find invalid codes)
2 Data Quality • Completeness: information about selection criteria, omissions, generalization, etc. Example (geographic exclusion): “Data was not available for Smith Township.”
2 Data Quality • Completeness: information about selection criteria, omissions, generalization, etc. Example (categorical exclusion): “Municipalities with population under 1000 not included.”
2 Data Quality • Positional Accuracy: an explanation of what’s known about the horizontal and vertical accuracy of the data set (can be qualitative or quantitative) • Qualitative example: “Data was collected in the field and plotted on a variety of base maps. Archaeological properties visited in the past 30 years are located on USGS maps. Almost all site locations are accurate to the quarter section. Most site locations are accurate to within a quarter-quarter section. Site boundaries are poorly defined, as are site centroids.” Source: Minnesota State Historic Preservation Office Archaeological Inventory
2 Data Quality • Positional Accuracy: an explanation of what’s known about the horizontal and vertical accuracy of the data set • Quantitative example: “Using the National Standard for Spatial Data Accuracy, this data set tested 1 foot horizontal accuracy at 95% confidence level.” Source: City of Minneapolis (from Positional Accuracy Handbook)
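A quantitative statement like the one above comes from comparing data set coordinates with higher-accuracy check points. The sketch below shows the horizontal calculation in Python, assuming the errors in x and y are roughly equal, in which case the National Standard for Spatial Data Accuracy (NSSDA) multiplies the radial RMSE by 1.7308. The function name and sample coordinates are invented for illustration.

```python
import math


def nssda_horizontal_accuracy(test_points, check_points):
    """Horizontal accuracy at the 95% confidence level, in the units of the input.

    test_points:  (x, y) coordinates read from the data set being tested
    check_points: (x, y) coordinates of the same features from an
                  independent source of higher accuracy
    """
    if len(test_points) != len(check_points):
        raise ValueError("each test point needs a matching check point")
    if not test_points:
        raise ValueError("at least one point pair is required")
    # NSSDA calls for a minimum of 20 check points; this sketch does not enforce it.
    squared_errors = [
        (xt - xc) ** 2 + (yt - yc) ** 2
        for (xt, yt), (xc, yc) in zip(test_points, check_points)
    ]
    rmse_r = math.sqrt(sum(squared_errors) / len(squared_errors))
    # 1.7308 applies when RMSE in x and RMSE in y are approximately equal.
    return 1.7308 * rmse_r


# Invented sample: every test point is off by 0.3 in x and 0.4 in y.
tested = [(100.3, 200.4), (150.3, 250.4)]
checked = [(100.0, 200.0), (150.0, 250.0)]
print(f"Tested {nssda_horizontal_accuracy(tested, checked):.2f} feet "
      "horizontal accuracy at 95% confidence level")
```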
2 Data Quality • Lineage: information about the sources of data used to construct the data set and the processing steps applied
2 Data Quality: Lineage Recipe • Source Information: data set reference; scale; media; time period of content; source contribution; source metadata reference • Processing Step: process description; software used; organization doing the processing; process date • Miscellaneous Notes
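The lineage recipe above can be sketched as a small data structure that is filled in while the work is done and then written out as text. This Python sketch is one possible layout; the class names, field names and output format are hypothetical, and the sample values are drawn from the land cover example earlier in the walk-through. Fields with no value are simply left out of the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SourceInformation:
    data_set_reference: str
    scale: Optional[str] = None
    media: Optional[str] = None
    time_period_of_content: Optional[str] = None
    contribution: Optional[str] = None
    metadata_reference: Optional[str] = None


@dataclass
class ProcessingStep:
    description: str
    software_used: Optional[str] = None
    organization: Optional[str] = None
    process_date: Optional[str] = None


@dataclass
class Lineage:
    sources: list[SourceInformation] = field(default_factory=list)
    steps: list[ProcessingStep] = field(default_factory=list)
    notes: str = ""


def lineage_to_text(lineage: Lineage) -> str:
    """Write the recipe out as plain text, skipping anything not recorded."""
    lines = []
    for number, source in enumerate(lineage.sources, start=1):
        lines.append(f"Source {number}: {source.data_set_reference}")
        details = [
            ("Scale", source.scale),
            ("Media", source.media),
            ("Time period of content", source.time_period_of_content),
            ("Contribution", source.contribution),
            ("Metadata reference", source.metadata_reference),
        ]
        lines += [f"  {label}: {value}" for label, value in details if value]
    for number, step in enumerate(lineage.steps, start=1):
        lines.append(f"Step {number}: {step.description}")
        details = [
            ("Software used", step.software_used),
            ("Organization", step.organization),
            ("Process date", step.process_date),
        ]
        lines += [f"  {label}: {value}" for label, value in details if value]
    if lineage.notes:
        lines.append(f"Notes: {lineage.notes}")
    return "\n".join(lines)


example = Lineage(
    sources=[SourceInformation(
        data_set_reference="LANDSAT Thematic Mapper (TM) satellite imagery",
        time_period_of_content="September 1991 to August 1996",
    )],
    steps=[ProcessingStep(description="Classified imagery into 15 land cover classes")],
)
print(lineage_to_text(example))
```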
A Walk Through the Guidelines: 3 Spatial Data Organization • Elements: Native Data Set Environment; Geographic Reference (Tabular); Spatial Object Types; Tiling Scheme • This section is no longer used.
A Walk Through the Guidelines: 4 Spatial Reference • Elements: Horizontal Coordinate Scheme; Ellipsoid; Horizontal Datum & Units; Resolution; Altitude Datum & Units; Depth Datum & Units • Conditional elements: If Raster; If Geographic; If UTM; If State Plane; If County Coordinate; If User Specified Projection; If Other
A Walk Through the Guidelines: 5 Entity and Attribute • Elements: Entity and Attribute Overview; Entity and Attribute Detailed Citation • Entity & Attribute Overview: description of the information content of the data set: the features it represents (entities) and details about them (attributes). An entity might be a road, and the attributes that describe it might include interstate, 6 lanes, concrete. • Entity & Attribute Detailed Citation: reference to other sources of detailed information on the content of the data set; pointer to a data dictionary
5 Entity and Attribute • Examples: Land Use Codes • Useless: 21, 22, 23 • Slightly better: AGRICULTURAL LAND: 21 - Cultivated Land; 22 - Pasture Land; 23 - Transitional Agricultural Land
5 Entity and Attribute • Much better: AGRICULTURAL LAND • 21 - Cultivated Land: Cultivated land includes those areas under intensive cropping or rotation, including periods when a parcel may be fallow. It represents land planted to forage or cover crop. The units exhibit linear or other patterns associated with current or relatively recent tillage. • 22 - Pasture Land: Land in active pasture use. This class was discontinued and combined into 23. • 23 - Transitional Agricultural Land: This category includes areas that show evidence of past tillage but do not now appear to be continuously cropped or in a crop rotation. Parcels in this unit include fields that are idle or abandoned and may or may not have been planted to a cover crop. In addition to displaying some evidence of past tillage, they usually are relatively uniform in vegetation.
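The “much better” version above is in effect a data dictionary. As a hedged illustration, the Python sketch below stores the three codes from the slide with shortened definitions and flags any attribute value that has no entry. The variable and function names are hypothetical.

```python
# Code -> (name, definition). Definitions are shortened from the slide text.
LAND_USE_CODES = {
    21: ("Cultivated Land",
         "Areas under intensive cropping or rotation, including periods "
         "when a parcel may be fallow."),
    22: ("Pasture Land",
         "Land in active pasture use. This class was discontinued and "
         "combined into 23."),
    23: ("Transitional Agricultural Land",
         "Areas that show evidence of past tillage but do not now appear "
         "to be continuously cropped or in a crop rotation."),
}


def describe(code: int) -> str:
    """Return 'code - name: definition' for a documented code."""
    name, definition = LAND_USE_CODES[code]
    return f"{code} - {name}: {definition}"


def undocumented_codes(values) -> list[int]:
    """Return the distinct attribute values that have no dictionary entry."""
    return sorted(set(values) - set(LAND_USE_CODES))


print(describe(22))
print(undocumented_codes([21, 22, 99, 23, 99]))
```

Running a check like `undocumented_codes` against the real attribute table is also one way to do the frequency counts for invalid codes mentioned under Attribute Accuracy.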
A Walk Through the Guidelines: 6 Distribution • Elements: Publisher; Publication Date; Distributor Information; Distribution Liability; Transfer Format; Transfer Size; Ordering Instructions; Online Linkage • Publisher: organization or individual that distributes the data set • Distributor Information: person who can answer questions about the distribution of the data set • Distribution Liability: statement of any liability assumed by the distributor. Possible topics: limitations; warranty; liability; redistribution conditions; data delivery terms
6 Distribution • Ordering Instructions: instructions for obtaining the data set. If applicable, include instructions for acquiring data through the Online Linkage element below • Online Linkage (optional): when the data set is available online, this is the link to the Internet site where it can be downloaded
6 Distribution • Two reports are available to aid in determining GIS data distribution policy (1 of 2): Mapping the Risks: Assessing the Homeland Security Implications of Publicly Available Geospatial Information, RAND National Defense Research Institute, 2004. http://www.rand.org/pubs/monographs/MG142.html • RAND researchers found no publicly accessible federal geospatial information deemed critical to meeting attackers’ information needs. The researchers found only four publicly available federal databases that had information that is both useful to potential attackers and could not be obtained from other widely available sources. The four federal databases are no longer being made public by federal agencies.
6 Distribution • Two reports are available to aid in determining GIS data distribution policy (2 of 2): Guidelines for Providing Appropriate Access to Geospatial Data in Response to Security Concerns. http://www.fgdc.gov/policyandplanning/Access%20Guidelines.pdf • A Federal Geographic Data Committee Homeland Security Working Group study investigating restrictions to geospatial data access that are reasonable, sensible and cost effective
Purpose of a Metadata Entry Tool • Organizes metadata content • Provides help • Formats results: printed reports, webpages, Clearinghouse searches • Can it write the whole record for you? (No)
Tool • Minnesota Metadata Editor (MME) • Customized from the EPA’s metadata editor • Standalone • Requires only Microsoft .NET 3.5 • Microsoft Access is required to edit the database • More information and download: www.mngeo.state.mn.us/chouse/mme/
How to Make this Easier… • Value metadata • Create metadata during your project • Prioritize legacy data • Use existing resources • Writing tips • Use your judgment • Share the task
Value metadata • Establish its value for yourself and for your organization • Short-term investment, long-term payoff • Metadata is no longer optional; it is part of being a GIS and IT professional [Image: can labeled “Cat Food”]
Create metadata during your project • When you create new data • When you change existing data • If you write metadata as you go along, at the end, it is done! [Image: can labeled “Tuna”]
Prioritize legacy data • What is most critical? • What are you asked about the most? • What may be lost soonest? • Information that is quickly forgotten • Information that only one person or organization knows
Use existing resources • Guidelines and tools • Starter templates • Existing documents • Other people’s metadata