Data Design Implementation and Support for Build 2b • November 30, 2011 • Steve Hughes
Topics • Overview • Key Requirements and Drivers • Build 2b Deliverables • Build 2b Deployment • Issues • Next Steps
Data Architecture Concepts [Diagram: relates the Planetary Science Data Dictionary, Data Element, Class, Information Model, Label Schema, Product, Tagged Data Object (Information Object), and Data Object through the relations "has", "Used to Create", "Validates", "Expressed As", "Extracted/Specialized", and "Describes".]
Topics • Overview • Key Requirements and Drivers • Build 2a Deliverables • Build 2b Deployment • Issues • Next Steps
Build 2a Scope • Begin supporting PDS4 label design for LADEE and MAVEN; begin planning and testing migration • Support the Policy on Acceptable PDS4 Data Formats • Support transition of the central catalog to the registry infrastructure • Deploy early PDS4 software tools and services
PDS4 Documents in Context [Diagram: shows how the PDS4 documents and generated artifacts relate. Node labels: Big Picture; Introduction to PDS4; Documentation; Standards Reference; Concepts Document; PDS4 Information Model Specification; Data Dictionary; Data Provider’s Handbook; PDS4 Product Labels; XML Schemas; Registry Configuration File; Data Dictionary Tutorial; Jumpstart; Glossary; Registry; Cookbook. Relation labels: references, derive, generates, instruct, creates / validates, configures. Category labels: Requirements, User Friendly, Product Tracking and Cataloging, Definitions, Blueprints, Deliverables, Object Descriptions, Engineering Specification. Legend: Complete, Some, TBD, Informative.]
PDS4 Data Formats [Diagram labels: Base; Extensions/Restrictions.]
PDS4 Observational Product [Class diagram: components Data_Standards, Identification_Area, Subject_Area, Cross_Reference_Area, Bibliographic_Reference, Observing_System, Reference_Entry, Observation_Area, Mission_Area, Node_Area, File_Area, and Digital_Object, each annotated with a cardinality of [1], [0..1], [0..*], or [1..*]. The pairing of cardinalities with components is not recoverable from the extracted text.]
Data Standards Development Process • Domain expertise was captured in the PDS4 Information Model as an ontology. • The model represents a consensus of the domain experts. • The model is the single source for the PDS4 Data Standards, for example the generated XML Schemas. [Diagram labels: Domain Knowledge; Information Modeling Tool; PDS4 Information Model; Filter and Translator; XML Schema (Generic), shown four times.]
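The single-source idea on this slide can be illustrated with a minimal sketch. This is not the PDS4 filter and translator; it only shows the general pattern of generating an XML Schema from a model rather than writing the schema by hand. The `MODEL` dictionary, the class name `Example_Class`, its attributes, and the function `model_to_schema` are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, heavily simplified stand-in for an information model:
# class name -> list of (attribute name, XSD type, minOccurs, maxOccurs).
MODEL = {
    "Example_Class": [
        ("title", "xs:string", 1, 1),
        ("comment", "xs:string", 0, 1),
    ],
}


def model_to_schema(model):
    """Translate the toy model into an XML Schema element tree."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    for class_name, attributes in model.items():
        # One complexType per model class.
        ctype = ET.SubElement(schema, f"{{{XS}}}complexType", name=class_name)
        seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
        for name, xsd_type, min_occurs, max_occurs in attributes:
            # One element per model attribute, carrying its cardinality.
            ET.SubElement(
                seq,
                f"{{{XS}}}element",
                name=name,
                type=xsd_type,
                minOccurs=str(min_occurs),
                maxOccurs=str(max_occurs),
            )
    return schema


schema = model_to_schema(MODEL)
xsd_text = ET.tostring(schema, encoding="unicode")
print(xsd_text)
```

Because the schema text is derived from `MODEL`, a change to the model propagates to every generated artifact on the next run, which is the consistency property the slide describes.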
Build 2b Deployment • Resolve Build 2a liens (to be discussed) and generate a Build 2b deployment • Generate a release of the information model, companion documents, and supporting tutorial material • Generate new schemas • Generate registry configuration information • Post key documents to the PDS website
Chart of Review Comments Total: 1173
Build 2b Actions – Jan ’12 • Finalize and freeze the information model for Build 2b, incorporating high-priority changes identified in Build 2a. • Use existing capabilities to support local data dictionary validation and the creation of schemas and human-readable definition lists. • Baseline the current documentation. • Add any additional information or changes to an online resource (e.g., a wiki). • Finalize and freeze the XML Schema for Build 2b, incorporating the extension schemas currently under testing by the DDWG.
Conclusion • The PDS4 Information Model represents the DDWG consensus. • A large number of decisions resulting from much discussion were captured in the model. • Everyone had a say; not everyone got their way. • On the scheduled date the model will be frozen and the PDS4 Data Standards will be generated and deployed. • The schemas, the dictionary, and all other generated artifacts will be consistent with the model. • The current consensus, as reflected in the model, will be operational.
Acknowledgements* • Peter Allan, David Heather, Michel Gangloff, Santa Martinez, Thomas Roatsch, Alain Sarkissian, Ed Bell, Richard Chen, Dan Crichton, Amy Culver, Patty Garcia, Ed Grayzeck, Ed Guinness, Mitch Gordon, Sean Hardman, Lyle Huber, Steve Hughes, Chris Isbell, Steve Joy, Ronald Joyner, Debra Kazden, Todd King, Joe Mafi, Mike Martin, Thomas Morgan, Lynn Neakrase, Paul Ramirez, Anne Raugh, Mark Rose, Elizabeth Rye, Boris Semenov, Dick Simpson, Susie Slavney • * Anyone who sat through a DDWG 2-hour telecon or provided useful input.
Too Many {objects, classes, schemas, …} • Abstract (vacuous) classes are used for organizational purposes. • These are not included in the schemas, and many are being deleted. • Subclasses of the four fundamental structures are used to partition the set of allowed structures, for example the Array_2D_Image subclass of Array_Base. • Open question: does the PDS want to provide software specific to Array_2D_Image? • All Array_Base software works for any Array_2D_Image.
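The last point on this slide is ordinary subclass substitutability, sketched below. The Python classes `ArrayBase` and `Array2DImage` and the function `describe` are hypothetical stand-ins that borrow the slide's class names; they are not PDS software, and the two-axis restriction is an assumption made for illustration.

```python
class ArrayBase:
    """Hypothetical stand-in for a generic n-dimensional array structure."""

    def __init__(self, axis_lengths):
        self.axis_lengths = list(axis_lengths)

    def element_count(self):
        # Total number of elements is the product of the axis lengths.
        count = 1
        for length in self.axis_lengths:
            count *= length
        return count


class Array2DImage(ArrayBase):
    """Hypothetical subclass that restricts the base to exactly two axes."""

    def __init__(self, axis_lengths):
        if len(axis_lengths) != 2:
            raise ValueError("Array2DImage requires exactly two axes")
        super().__init__(axis_lengths)


def describe(array):
    # Written against ArrayBase only; it never checks for the subclass.
    return f"{len(array.axis_lengths)} axes, {array.element_count()} elements"


image = Array2DImage([1024, 512])
summary = describe(image)
print(summary)
```

The subclass narrows what is allowed without changing the interface, so tools written for the base structure keep working; subclass-specific software is only needed where image-specific behavior is wanted.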
Too Many {objects, classes, schemas, …} • Subclasses of a product component are used to provide specificity, for example, the subclass Bundle_Member_Entry. • There are three methods: change the name, change the namespace (new file), or use optional attributes. • Some specific subclasses are used for special purposes, for example Table_Field_Checksum in an Inventory. • Consider using Schematron assert statements to validate.
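The rule-based validation suggested here can be sketched as follows. This is not Schematron itself; it mimics the Schematron pattern of a context, an assertion, and a failure message in plain Python with the standard library. The XML fragments and the rule (an Inventory holds exactly one Table_Field_Checksum) are made up for illustration and are not PDS4 requirements.

```python
import xml.etree.ElementTree as ET

# Hypothetical label fragments; element nesting is assumed for the example.
GOOD_LABEL = "<Product><Inventory><Table_Field_Checksum/></Inventory></Product>"
BAD_LABEL = "<Product><Inventory/></Product>"

# Each rule: (context path, test on the context node, failure message).
RULES = [
    (
        ".//Inventory",
        lambda node: len(node.findall("Table_Field_Checksum")) == 1,
        "Inventory must contain exactly one Table_Field_Checksum",
    ),
]


def validate(xml_text, rules):
    """Return the messages of all assertions that fail for the document."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, test, message in rules:
        # Apply the assertion to every node matching the rule's context.
        for node in root.findall(context):
            if not test(node):
                failures.append(message)
    return failures


good_failures = validate(GOOD_LABEL, RULES)
bad_failures = validate(BAD_LABEL, RULES)
print(good_failures, bad_failures)
```

Expressing such special cases as assertions over a general class, rather than as additional subclasses, is one way to keep the class count down while still checking the constraint.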
Too Many {objects, classes, schemas, …} • Some classes result from the process of normalization, for example array_axis and array_element. • Emperor Joseph II: …And there are simply too many notes, that's all. Just cut a few and it will be perfect. Mozart: Which few did you have in mind, Majesty?
By the numbers • Fundamental data structures: 4 • Lines of schema code: flat 18K, master 4K-6K • Classes dropped (master): nn • SimpleTypes dropped (master): 200 • Actionable items closed: 1.5K • Actionable items open: < 50 • Issues from reviews: 1K+
Post Build 2b – Summer ’12 • Develop discipline-level classes for the next phase of data set migration. • Refine the document suite and its organization. • Support development of tools scheduled for the next build. • Support development of data dictionary and local data dictionary services.