Sustaining Engineering Informatics. Towards Methods and Metrics for Digital Curation Joshua Lubell, Sudarsan Rachuri, Eswaran Subrahmanian, Mahesh Mani National Institute of Standards and Technology {lubell, sudarsan, eswaran, mahesh}@nist.gov. NIST Workshop April 24-25, 2007.
Sustaining Engineering Informatics: Towards Methods and Metrics for Digital Curation
Joshua Lubell, Sudarsan Rachuri, Eswaran Subrahmanian, Mahesh Mani
National Institute of Standards and Technology
{lubell, sudarsan, eswaran, mahesh}@nist.gov
NIST Workshop April 24-25, 2007
• “Long Term Sustainment of Digital Information for Science and Engineering: Putting the Pieces Together”
• Over 30 participants
  • Implementers of OAIS
  • Government (NARA, Library of Congress, Navy, Govt. Printing Office)
  • Universities
• Third in a series of workshops
  • NIST (March 2006)
  • Bath, UK (February 2007)
Breakout Issues: Archival Information and Technology
Issue: Don’t know enough about who the end users are and how they will use the data.
Recommendation: Collect end-user use cases.
Issue: Lack of rationale and of the rich documentation required for archival packaging of data for future use.
Recommendation: Create guidelines/requirements for creators to follow, while recognizing that they won’t be universally followed. Potential enforcement through contractual mechanisms and peer review. Support creators through semi-automated tools.
Issue: Different levels of detail in metadata are needed depending on format and context.
Recommendation: Follow guidelines and best practices for creation of structured as well as informal metadata. Classify different kinds of data objects and their metadata requirements.
Breakout Issues: Standards and Specific Domains
Issue: Hard to find archiving software tools and to figure out how they all fit together.
Recommendation: Better consolidated registries and a clearinghouse for available archival technologies; promote educational awareness.
Issue: Lack of domain-specific guidance (metrics, potential use cases, service level agreements).
Recommendation: Understand the needs of the community through industry-specific requirements analysis.
Issue: No comprehensive methods exist to capture design history explicitly and, even when it is captured, it is distributed.
Recommendation: Research context-sensitive digital formats. Identify methods for meaningful and efficient extraction of design rationale information.
Issue: Diverse digital formats, their proliferation, and transformation between digital formats.
Recommendation: Include the cost (current and estimated) of transforming digital formats in the total cost of ownership of the digital object.
Overall Workshop Conclusions
• Facilities for archiving should be available at the source of information creation
• Archival systems must deliver the right information for the task at hand to the end user
• Archival system design is a socio-technical problem
Archival Challenges Unique to Engineering Design
From Kopena, Shaffer, Regli: “CAD Archives Based on OAIS,” Proceedings of ASME Computers and Information in Engineering Conference, DETC2006-99675, Philadelphia, September 2006
• Capturing all aspects of a design project
• Preserving data generated by software tools
• Predicting how data will be used over long term
• Package schemas tailored to CAD preservation needed
• Standards for representation information helpful, but not a silver bullet
Library of Congress Digital Format Sustainability Factors
See digitalpreservation.gov/formats
• Disclosure: availability of documentation, validation software
• Adoption: popularity
• Transparency: analysis possible without specialized tools?
• Self-documentation: metadata included in digital object?
• External dependencies: specialized software needed to use objects?
• Impact of patents
• Technical protection mechanisms: access restrictions
Sustainability Factors Applied to ISO 10303 (STEP)
• Disclosure
  • International standard
  • Third party documentation
  • Validation software
• Adoption
  • CAD vendors
  • Other domains
• Transparency
  • ASCII and graphical formats
• Self-documentation
  • Rich model-based representation
• External dependencies
  • None as long as software does import/export
• Impact of patents
  • None
• Technical protection mechanisms
  • None
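The factor-by-factor assessment above can be recorded as a small data structure, which makes it easy to check that every factor has been considered for a given format. This is only a minimal sketch: the `FormatAssessment` class and its `add` and `missing` methods are hypothetical names introduced for illustration, not part of any Library of Congress or NIST tool. The factor names and the STEP notes are taken from the slides.

```python
from dataclasses import dataclass, field

# The seven Library of Congress sustainability factors, as named on the slides.
FACTORS = (
    "Disclosure",
    "Adoption",
    "Transparency",
    "Self-documentation",
    "External dependencies",
    "Impact of patents",
    "Technical protection mechanisms",
)


@dataclass
class FormatAssessment:
    """Hypothetical record of qualitative notes per sustainability factor."""
    format_name: str
    notes: dict = field(default_factory=dict)

    def add(self, factor, *evidence):
        # Reject anything that is not one of the seven named factors.
        if factor not in FACTORS:
            raise ValueError(f"unknown factor: {factor}")
        self.notes.setdefault(factor, []).extend(evidence)

    def missing(self):
        # Factors that have not been assessed yet, in slide order.
        return [f for f in FACTORS if f not in self.notes]


# Notes for STEP, copied from the slide.
step = FormatAssessment("ISO 10303 (STEP)")
step.add("Disclosure", "International standard",
         "Third party documentation", "Validation software")
step.add("Adoption", "CAD vendors", "Other domains")
step.add("Transparency", "ASCII and graphical formats")
step.add("Self-documentation", "Rich model-based representation")
step.add("External dependencies", "None as long as software does import/export")
step.add("Impact of patents", "None")
step.add("Technical protection mechanisms", "None")

print(step.missing())  # prints [] because all seven factors are covered
```

The notes stay qualitative on purpose: the slides do not define numeric scores for these factors, so the sketch does not invent any.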
So What's Next?
• Sustainability factors not domain-specific
• Potential future access scenarios not taken into account
• We need more metrics for STEP and other engineering digital objects
• Quality and functionality factors a start
Question: How would you measure the quality of an engineering archiving and/or records management strategy?
Access Scenarios: The Three Rs
• Reference
  • Preserve information in its original state
  • Example (product data engineering): 3D visualization
• Reuse
  • Allow for future modification, re-engineering
  • Example: ISO 10303-203:1994 (STEP AP203)
• Rationale
  • Encode construction history, design intent, tolerancing info, lifecycle management info, etc.
  • Example: STEP AP203 ed.2 ++
  • Ontologies and/or other representations needed
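One way to make the three access scenarios concrete is to tag each archived object with the scenarios it is meant to support and then check it against what a future user needs. The sketch below is illustrative only: `AccessScenario`, `ArchivedObject`, `unmet`, and the file name `bracket.stp` are hypothetical, and the sketch assumes the scenarios are independent tags, since the slides do not say whether one scenario implies another.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessScenario(Enum):
    REFERENCE = "Reference"   # preserve information in its original state
    REUSE = "Reuse"           # allow future modification, re-engineering
    RATIONALE = "Rationale"   # encode construction history, design intent, etc.


# Example representation for each scenario, as listed on the slide.
SLIDE_EXAMPLES = {
    AccessScenario.REFERENCE: "3D visualization",
    AccessScenario.REUSE: "ISO 10303-203:1994 (STEP AP203)",
    AccessScenario.RATIONALE: "STEP AP203 ed.2 ++",
}


@dataclass
class ArchivedObject:
    """Hypothetical archived item tagged with the scenarios it supports."""
    name: str
    scenarios: set = field(default_factory=set)

    def supports(self, scenario):
        return scenario in self.scenarios


def unmet(obj, required):
    """Return the required scenarios the object does not support, sorted by name."""
    return sorted((s for s in required if not obj.supports(s)),
                  key=lambda s: s.name)


# Hypothetical object archived for reference and reuse only.
bracket = ArchivedObject("bracket.stp",
                         {AccessScenario.REFERENCE, AccessScenario.REUSE})
needed = {AccessScenario.REUSE, AccessScenario.RATIONALE}
print([s.value for s in unmet(bracket, needed)])  # prints ['Rationale']
```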
Extended Functional Model
EI = Engineering Informatics
DOP = Digital Object Prototype
METS = Metadata Encoding and Transmission Standard
Future Goals
• Formalize 3Rs
• Define EI sustainability metrics
• Create EI implementation framework
  • More generic than LOTAR (a specification for long-term archival of 3D-CAD and associated product data), but more domain-specific than vanilla OAIS
• Develop EI archival testbed