
Data Archiving & Preservation: Best Practices for GIS

Presentation to ARGIS - Atlanta Region GIS User Group, October 30, 2013. Jennifer Doty | jennifer.doty@emory.edu, Data Management Specialist, Emory Center for Digital Scholarship.


Presentation Transcript


  1. Data Archiving & Preservation: Best Practices for GIS. Presentation to ARGIS - Atlanta Region GIS User Group, October 30, 2013. Jennifer Doty | jennifer.doty@emory.edu, Data Management Specialist, Emory Center for Digital Scholarship

  2. Overview Best practices for managing geospatial data: • File formats • Naming conventions • Folder structure • Storage and backup • Documentation Trends in geospatial data archiving: • Federal funding agencies’ requirements • State initiatives for preservation

  3. Best Practices: File Formats UK Data Archive File Formats guide, http://www.data-archive.ac.uk/create-manage/format/formats-table

  4. Best Practices: File Formats GeoMAPP Geospatial Data File Formats Reference Guide: • provides quick reference of common geospatial raster and vector dataset types • serves as tool to identify geospatial format types based on file extensions • also includes information on standards and specifications for documenting geospatial data http://www.geomapp.net/docs/GeoMAPP_Geospatial_data_file_formats_FINAL_20110701.xls
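
As a rough illustration of identifying geospatial format types from file extensions, the sketch below uses a small lookup table. The table is an illustrative subset written for this example, not a copy of the GeoMAPP guide, and the function name `identify_format` is hypothetical.

```python
from pathlib import Path

# Illustrative subset only (an assumption for this sketch); the GeoMAPP
# reference guide covers many more raster and vector formats.
GEOSPATIAL_EXTENSIONS = {
    ".shp": ("vector", "Esri shapefile (main geometry file)"),
    ".shx": ("vector", "Esri shapefile (shape index)"),
    ".dbf": ("vector", "Esri shapefile (attribute table, dBASE)"),
    ".prj": ("vector", "Esri shapefile (projection definition)"),
    ".kml": ("vector", "Keyhole Markup Language"),
    ".gml": ("vector", "Geography Markup Language"),
    ".geojson": ("vector", "GeoJSON"),
    ".tif": ("raster", "TIFF / GeoTIFF"),
    ".tiff": ("raster", "TIFF / GeoTIFF"),
    ".img": ("raster", "ERDAS IMAGINE"),
    ".sid": ("raster", "MrSID"),
}


def identify_format(filename):
    """Return (data type, description) for a known extension, else None."""
    ext = Path(filename).suffix.lower()  # compare case-insensitively
    return GEOSPATIAL_EXTENSIONS.get(ext)


if __name__ == "__main__":
    for name in ("Census2010_blockgroups_GA.shp", "elevation.TIF", "notes.txt"):
        print(name, "->", identify_format(name))
```

An extension is only a hint: a .tif file may or may not carry georeferencing, so a lookup like this is a starting point for an inventory rather than a validation step.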

  5. Best Practices: Naming Conventions • Create meaningful but brief naming conventions for your project • Use file names to classify broad types of files • Avoid using spaces and special characters • Begin names with letters, not numbers, e.g. Census2010_blockgroups_GA, not 2010Census… • Avoid very long file names
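
The rules on this slide can be checked automatically. The following is a minimal sketch, assuming that "special characters" means anything other than ASCII letters, digits, underscores, and hyphens, and that "very long" means more than 40 characters; both thresholds are assumptions for illustration, not part of the original guidance.

```python
import re
from pathlib import Path

MAX_STEM_LENGTH = 40  # assumed cutoff for a "very long" name


def check_file_name(filename, max_len=MAX_STEM_LENGTH):
    """Return a list of naming problems; an empty list means the name passes."""
    stem = Path(filename).stem  # name without the final extension
    problems = []
    if not re.match(r"[A-Za-z]", stem):
        problems.append("does not begin with a letter")
    if re.search(r"[^A-Za-z0-9_-]", stem):
        problems.append("contains spaces or special characters")
    if len(stem) > max_len:
        problems.append(f"longer than {max_len} characters")
    return problems


if __name__ == "__main__":
    for name in ("Census2010_blockgroups_GA.shp", "2010 Census data!.shp"):
        print(name, "->", check_file_name(name) or "ok")
```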

  6. Best Practices: Naming Conventions Example: keyword_steward_extent_date.ext • Keyword (essential)—be as descriptive of the contents of the data as possible by using a word or short phrase • Steward (essential)—either the creator of the dataset or the last one to make a significant modification to a dataset • Extent (optional)—may be included to indicate resolution of the data (e.g. county, state, or international) • Date (optional)—may be used to indicate the date of creation or the age range of the content. Recommended format is YYYYMMDD Indiana Geographic Information Council, http://www.igic.org/standards/namingstandard.pdf
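
The keyword_steward_extent_date.ext pattern can be assembled with a small helper. This is a sketch only: the function name `build_file_name` and the example values ("roads", "gdot", "state") are hypothetical, and the helper assumes that the individual parts contain no underscores.

```python
from datetime import date


def build_file_name(keyword, steward, ext, extent=None, created=None):
    """Join the parts as keyword_steward[_extent][_YYYYMMDD].ext."""
    parts = [keyword, steward]      # essential elements
    if extent:
        parts.append(extent)        # optional element
    if created:
        parts.append(created.strftime("%Y%m%d"))  # recommended date format
    return "_".join(parts) + "." + ext.lstrip(".")


if __name__ == "__main__":
    print(build_file_name("roads", "gdot", "shp",
                          extent="state", created=date(2013, 10, 30)))
    print(build_file_name("parcels", "fulton", ".shp"))
```

Because extent and date are both optional, a name built this way cannot always be parsed back unambiguously; projects that need round-tripping may prefer to make all four elements mandatory.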

  7. Best Practices: Naming Conventions Versioning: • useful to indicate file revisions or edits, especially in collaborations • can use discrete or continuous numbering, depending on whether revisions are major or minor • think of software versioning: ArcGIS 10 was a significant change from 9.x, but ArcGIS 10.1 was a (relatively) minor change to 10
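
One way to apply major/minor versioning to file names is sketched below. The "major.minor" string format and the `_v` suffix are assumptions chosen for this example; the slide does not prescribe a particular scheme.

```python
def bump_version(version, major=False):
    """Increment a 'major.minor' version string.

    A major revision resets the minor number to 0; a minor revision
    leaves the major number unchanged.
    """
    major_n, minor_n = (int(part) for part in version.split("."))
    if major:
        return f"{major_n + 1}.0"
    return f"{major_n}.{minor_n + 1}"


def versioned_name(stem, version, ext):
    """Append the version to a file stem, e.g. roads_v1.2.shp (assumed style)."""
    return f"{stem}_v{version}.{ext.lstrip('.')}"


if __name__ == "__main__":
    v = "1.0"
    v = bump_version(v)              # minor edit
    print(versioned_name("roads", v, "shp"))
    v = bump_version(v, major=True)  # significant revision
    print(versioned_name("roads", v, "shp"))
```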

  8. Best Practices: Folder Structure • Separate directories for scratch workspace and final data • Hierarchy—is deep or shallow best for your project?

  9. Tape library, CERN, Geneva by Cory Doctorow / CC BY-SA 2.0

  10. Best Practices: Storage & Backup Storage Considerations: • Accessibility • Read/Write speed • Size limits—overall vs. file size Options: • Local—PC drive, flash drive, external hard drive • Server—department/organization server space • Cloud—Dropbox, Google Drive, etc.

  11. Best Practices: Storage & Backup Backup Considerations: • Accessibility (local, server, cloud) • Redundancy (rule of thumb—here, near, far) Options: • Incremental/Snapshot • Automated
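
To make the incremental and automated options concrete, here is a minimal sketch that copies only files that are new or whose contents have changed since the last run, using SHA-256 checksums to detect changes. It is an illustration under simple assumptions (the backup folder sits outside the source folder, and deleted files are not pruned), not a substitute for dedicated backup software; the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Checksum a file in 1 MiB chunks so large rasters are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_backup(source_dir, backup_dir):
    """Copy new or changed files; return their relative paths."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        # Skip files whose backup copy already has identical contents.
        if dst.exists() and sha256_of(dst) == sha256_of(src):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(rel.as_posix())
    return copied
```

Running the same function against several destinations (for example a local drive, a department server, and a synced cloud folder) is one simple way to approximate the here, near, far rule of thumb; scheduling it with the operating system's task scheduler covers the automated part.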

  12. Metadata is a love note… by sarah0s / CC BY-NC-ND 2.0

  13. Best Practices: Documentation “When thoughtfully populated, geospatial metadata can be a critical resource for understanding and managing geospatial data for current and future GIS practitioners and those trying to preserve the data.” -Utilizing Geospatial Metadata to Support Data Preservation Practices, January 2011, GeoMAPP (http://www.geomapp.net/publications_categories.htm)

  14. Best Practices: Documentation Metadata represents the who, what, when, where, why and how. Standards: • CSDGM (FGDC) • ISO 19115:2003 / ISO 19139

  15. FGDC’s Content Standard for Digital Geospatial Metadata (CSDGM) http://www.fgdc.gov/csdgmgraphical/index.html

  16. CSDGM Fields for Preservation

  17. Checklist: CSDGM Fields for Preservation Identification Information - basic info about data set, including: • party responsible: usually creator • publication date: date the data set is completed and ready for use • title: “where” “what” “when” • maintenance/update frequency: annually, as needed, based on census, etc. • bounding coordinates • keywords (theme and place) • access and use constraints: any restrictions, disclaimers, or guidance on data set attribution • contact details GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf

  18. Checklist: CSDGM Fields for Preservation Data Quality Information – provides historical lineage and source descriptions for the data used in the creation of the data set, including: • originator • publisher, publication date & place • “currentness” of source data • process description GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf

  19. Checklist: CSDGM Fields for Preservation Spatial Reference Information - description of the reference frame for, and the means to encode, coordinates in the data set, including: • map projection name • coordinate system name • unit of measure • geodetic model—datum, ellipsoid GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf

  20. Checklist: CSDGM Fields for Preservation Entity and Attribute Information - details about content of the data set—the entities, their attributes, and domains from which attribute values may be assigned, including: • entity label • attribute label and description GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf

  21. Checklist: CSDGM Fields for Preservation Metadata Reference Information - information on the party responsible for creating the metadata and the currentness of the metadata: • metadata standard name • metadata standard version GeoMAPP, Utilizing Geospatial Metadata to Support Data Preservation Practices http://www.geomapp.net/docs/GeoMetadata_Items_for_Preservation_2011_0110.pdf
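
The checklist on the preceding slides can be turned into a quick completeness check on a CSDGM XML record. The sketch below is an assumption-laden illustration: the element paths use the CSDGM short tag names as I understand them (for example `idinfo/citation/citeinfo/title`) and should be verified against the standard, only a subset of the checklist is covered, and only leaf elements with text are tested.

```python
import xml.etree.ElementTree as ET

# Checklist label -> path below the <metadata> root. Paths are assumed from
# the CSDGM short names; confirm them against the FGDC standard before use.
CHECKLIST = {
    "originator": "idinfo/citation/citeinfo/origin",
    "publication date": "idinfo/citation/citeinfo/pubdate",
    "title": "idinfo/citation/citeinfo/title",
    "update frequency": "idinfo/status/update",
    "west bounding coordinate": "idinfo/spdom/bounding/westbc",
    "theme keyword": "idinfo/keywords/theme/themekey",
    "access constraints": "idinfo/accconst",
    "use constraints": "idinfo/useconst",
    "process description": "dataqual/lineage/procstep/procdesc",
    "attribute label": "eainfo/detailed/attr/attrlabl",
    "metadata standard name": "metainfo/metstdn",
    "metadata standard version": "metainfo/metstdv",
}


def missing_fields(xml_text):
    """Return checklist labels whose element is absent or has no text."""
    root = ET.fromstring(xml_text)
    missing = []
    for label, path in CHECKLIST.items():
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            missing.append(label)
    return missing


if __name__ == "__main__":
    # Hypothetical, deliberately incomplete record for demonstration.
    sample = """<metadata>
      <idinfo><citation><citeinfo>
        <origin>Example County GIS</origin>
        <title>Example County Roads 2013</title>
      </citeinfo></citation></idinfo>
    </metadata>"""
    print(missing_fields(sample))
```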

  22. Data Management Initiatives Federal agency mandates for sponsored research: • NSF & NIH requirements for DM plans • GIS Inventory (Ramona) & Federal Grants data sharing plans (gisinventory.net) Other related initiatives: • USGS DM working group • DM training for early career researchers

  23. FGDC Geospatial Data Lifecycle Model http://www.fgdc.gov/policyandplanning/a-16/stages-of-geospatial-data-lifecycle-a16.pdf

  24. State & National Initiatives in Geospatial Data Archiving GeoMAPP - Geospatial Multistate Archive and Preservation Partnership (www.geomapp.net): • federally funded partnership between the Library of Congress and state geospatial and archives staff from North Carolina, Kentucky, Montana, and Utah National Digital Stewardship Alliance (NDSA), Geospatial Content Team (www.digitalpreservation.gov/ndsa): • report identifying appraisal and selection activities as they affect decisions defining geospatial content of enduring value for the nation

  25. Open GeoPortal @ Emory (image: NASA Goddard Photo and Video / CC BY)

  26. Green Question Mark by mikecogh on Flickr / CC BY

  27. Contact Information: Jennifer Doty | jennifer.doty@emory.edu Data Management Specialist Michael Page | michael.page@emory.edu Geographer & Geospatial Data Librarian Emory Center for Digital Scholarship digitalscholarship.emory.edu
