Data Model & DDWG Update
Management Council Face-to-Face
Flagstaff, Arizona
August 22-23, 2011
Topics • Design Process • Builds • Calendar • Build 1b Review Issues
"Build" • What exactly has to happen?
"Build" • Freeze the Information Model
"Build" • Freeze the Information Model • Finalize the System • Generate Schema • Freeze the Document Set
"Build" • Freeze the Information Model • Finalize the System • Generate Schema • Freeze the Document Set • Introduction • Concepts Document • Glossary • Jump Start • Data Provider's Handbook • Standards Reference • Dictionary Tutorial • Data Dictionary • Example Set
"Build" • Freeze the Information Model • Finalize the System • Generate Schema • Freeze the Document Set • Introduction • Concepts Document • Glossary • Jump Start • Data Provider's Handbook • Standards Reference • Dictionary Tutorial • Data Dictionary • Example Set • Reasonably Stable
"Build" • Freeze the Information Model • Finalize the System • Generate Schema • Freeze the Document Set • Introduction • Concepts Document • Glossary • Jump Start • Data Provider's Handbook • Standards Reference • Dictionary Tutorial • Data Dictionary • Example Set • Reasonably Stable Generated
"Build" • Freeze the Information Model • Finalize the System • Generate Schema • Freeze the Document Set • Introduction • Concepts Document • Glossary • Jump Start • Data Provider's Handbook • Standards Reference • Dictionary Tutorial • Data Dictionary • Example Set • Reasonably Stable Generated Human Intervention
"Build" • What this translates to is "lead time". • Right now we're looking at two to three weeks lead time from "freeze the model" to "flip the switch" on the build. • Let's look at a calendar.
Internal Review Issues
• The 1b review produced more than 200 separate issues/comments.
• Issues fell into two broad categories:
    • Documentation issues: clarity, consistency, completeness, integration.
    • Concerns about the model contents & implementation.
• Each review issue has one of two statuses:
    • Open
    • Closed
Internal Review Issues
Open
• Still working for Build 2.
• Will address after Build 2.
• Have not decided whether or not to implement.
Closed
• We have implemented.
• Model-related issue arose from misunderstanding some aspect of PDS4.
• We disagree:
    • Incompatible with PDS4 requirements.
    • Incompatible with the model approach we're using.
    • Not possible to implement within our time & budget constraints.
Internal Review: Some Closed Issues
Implemented
• Document set integration.
• Need analogs for PDS3 spreadsheet & container.
Misunderstanding
• New Structures don't support qubes.
• Volatile metadata in a static archive (redelivery issue).
Disagree
• Labels that describe multiple data objects don't really work.
• Do away with character tables.
• Other space science archives: Consider using VOTABLE, CDM & OPeNDAP approach, class="variable" & named "dimension".
Internal Review Some Open Issues • Documentation issues – still working many of them. • Need robust, global metadata. • New Structures don't support some EDRs, Telemetry, DSN data. • Use a standard bundle entry (bundle index.html) • Consider a nomenclature review. • There is a proposed alternate XML implementation • Starts with XML Schema 1.0 or 1.1? • Perceived complexity. • Too many subclasses.
Open Issue: Too many Subclasses (1)
• Going back to the original reviews, the issue is the number of variations expanded from the four base structural types. The underlying concerns are overhead and confusion.
• There have been a lot of changes since Build 1b. As we look at this issue now, we have to ask three questions:
    • What do we count?
    • Are there too many?
    • If the numbers are reasonable, do we have the right ones?
Open Issue: Too many Subclasses (2-4)
• What do we count?
    • Count what the data providers and end users see.
    • Schema – specifically the Product_* schema.
    • We have 40 Product schemas. Wait for it …
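As a rough sketch of the counting rule, the snippet below selects file names that match the Product_* pattern from a list. The `.xsd` extension, the helper name `select_product_schemas`, and every file name in the example are assumptions made for illustration; they are not the actual PDS4 schema list.

```python
from fnmatch import fnmatchcase


def select_product_schemas(filenames, pattern="Product_*.xsd"):
    """Return the names that match the Product_* pattern (case-sensitive), sorted."""
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


# Made-up file names for illustration only; not the real schema set.
example_files = [
    "Product_Example_A.xsd",
    "Product_Example_B.xsd",
    "Supporting_Types.xsd",
    "product_lowercase.xsd",
]

matches = select_product_schemas(example_files)
print(len(matches), "Product schemas:", matches)
```

Run against a real schema directory listing, the same filter would give the count that the following slides break down by function.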
Open Issue: Too many Subclasses (5-10)
• 40 Product schemas – by function.
    • Aggregations – 2 (probably will be 3)
    • Observational Data – 10 (probably will add 1 or 2)
    • Observational Support – 10 (e.g., browse, document)
    • Context – 5
    • Operations – 13 (includes 5 PDS3 Context)
• Providers see 27, end users see 22.
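The tally can be checked with a short Python sketch. The category counts come from the slide; which categories providers and end users do not see is not stated there, so the exclusions below are an assumption: one reading that happens to reproduce the 27 and 22 figures.

```python
# Product schema counts by function, as listed on the slide.
counts = {
    "Aggregations": 2,
    "Observational Data": 10,
    "Observational Support": 10,
    "Context": 5,
    "Operations": 13,
}

total = sum(counts.values())

# Assumption: providers do not work with the Operations products.
providers_see = total - counts["Operations"]

# Assumption: end users additionally do not see the Context products.
end_users_see = providers_see - counts["Context"]

print(total, providers_see, end_users_see)
```

Under these assumptions the five categories sum to 40, and removing Operations and then Context gives the provider and end-user counts quoted on the slide.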
Open Issue: Too many Subclasses (11)
• Are there too many?
• Comparing to PDS3 tends to be an apples-and-oranges situation, but the number of
    • PDS4 observational data products is roughly equivalent to the corresponding subset of PDS3 Data Objects.
    • PDS4 context products is roughly equivalent to the corresponding subset of PDS3 Catalog Objects.
    • PDS4 observational data support products is substantially greater than the corresponding subset of PDS3 Data Objects.
Open Issue: Too many Subclasses (12) • Do we have the correct set? • We're close, but will probably add and subtract a few. • May be significantly affected by the potential change in the XML Schema implementation.
Acknowledgements*
Peter Allan, David Heather, Michel Gangloff, Santa Martinez, Thomas Roatsch, Alain Sarkissian, Ed Bell, Richard Chen, Dan Crichton, Amy Culver, Patty Garcia, Ed Grayzeck, Ed Guinness, Mitch Gordon, Sean Hardman, Lyle Huber, Steve Hughes, Chris Isbell, Steve Joy, Ronald Joyner, Debra Kazden, Todd King, Joe Mafi, Mike Martin, Thomas Morgan, Lynn Neakrase, Paul Ramirez, Anne Raugh, Mark Rose, Elizabeth Rye, Boris Semenov, Dick Simpson, Susie Slavney
* Anyone who sat through a DDWG 2-hour telecon or provided useful input.
PDS4 Documents and their Relationships
[Diagram; only its labels survive. Documents and artifacts: Big Picture, Introduction to PDS4 Documentation, Concepts Document, Glossary, Jumpstart, Data Provider’s Handbook, Standards Reference, Data Dictionary, Data Dictionary Tutorial, PDS4 Information Model Specification, XML Schemas, PDS4 Product Labels, Registry Configuration File, Registry, Cookbook. Relationship labels: references, derive, generates, instruct, creates / validates, configures. Other annotations: Requirements, User Friendly, Product Tracking and Cataloging, Engineering Specification, Definitions, Blueprints, Object Descriptions, Deliverables. Legend: Complete, Some, TBD, Informative.]
PDS4 Information Model and Generated Documents
[Diagram; only its labels survive: Requirements & Domain Knowledge, Information Modeling Tool, PDS4 Information Model, PDS4 Data Dictionary (ISO/IEC 11179), Filter and Translator, Information Model Specification, Registry Configuration Parameters, PDS4 Data Dictionary (Doc and DB), XML Schema (Specific), XML Schema (Generic), Query Models, XMI/UML, XML Document (Label).]