Data Collection File Standards Gerrit de Bolster, 11 May 2005
Topics • E-Priorities • Interchange formats • Developments at Statistics Netherlands • Envelopes • Management of Standards
Priorities at CBS • Secondary data collection • Primary data collection • Exports using standard files • Interactive questionnaires • Webforms • HTML/upload • Off-line tools
Interchange formats (1) • Transportable • Pass through firewalls • Not open to viruses • Not inviting providers to modify the message • Recognizable • Process meta • Content meta • Platform independent
Interchange formats (2) • Open • Free to use • Vendor independent • Stable • Managed • Available • Easy to process
Examples used at CBS • ASCII • Fixed • CSV • EDIFACT • XML • XBRL Other file formats in use: • Excel • Blaise bdb • Access • …
ASCII • International Trade • International Services • Consumers Price-Index • Scanner data • Traffic & Transport • Most EDR messages • etc. • Limited content meta • No process meta • No meta tags • Easy to process (difficult to detect)
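The "limited content meta" point can be illustrated with a minimal Python sketch. It assumes a hypothetical record layout and field names (`period`, `commodity`, `value`); none of these come from an actual CBS message. A fixed-width record can only be read with a layout agreed outside the file, whereas a CSV header row carries at least the field names.

```python
import csv
import io

# Hypothetical layout (field name, start, end); in practice this is agreed
# outside the file, because a fixed-width record carries no metadata itself.
LAYOUT = [("period", 0, 6), ("commodity", 6, 14), ("value", 14, 24)]


def parse_fixed(line, layout=LAYOUT):
    """Slice one fixed-width record using the external layout."""
    return {name: line[start:end].strip() for name, start, end in layout}


def parse_csv(text):
    """Read CSV with a header row, which gives limited content metadata."""
    return list(csv.DictReader(io.StringIO(text)))


# Invented sample data, for illustration only.
fixed_record = "200505" + "84073000" + "1250".rjust(10)
csv_text = "period,commodity,value\n200505,84073000,1250\n"

print(parse_fixed(fixed_record))
print(parse_csv(csv_text))
```

Both readers are only a few lines long, which matches "easy to process", but neither file says which survey or provider it belongs to. That process metadata has to come from somewhere else.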
EDIFACT • International Trade • CUSDEC / INSTAT • CUSDEC / COSTAR • Traffic & Transport • IFTDGN / PROTE5 • Content meta • Process meta • Limited meta tags • Difficult to process (CBS converter built in Blaise)
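To give a feel for why EDIFACT is harder to process than plain ASCII, here is a rough Python sketch of a tokenizer. It assumes the default EDIFACT service characters (`'` ends a segment, `+` separates data elements, `:` separates components, `?` is the release character). It is not the CBS converter, and the sample message is invented rather than a valid CUSDEC.

```python
def split_escaped(text, sep, release="?"):
    """Split on sep, skipping separators escaped by the release character."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(ch + text[i + 1])  # keep the escape for later passes
            i += 2
            continue
        if ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def parse_segments(message):
    """Return a list of (segment tag, elements); each element is a component list."""
    segments = []
    for raw in split_escaped(message, "'"):
        raw = raw.strip()
        if not raw:
            continue
        elements = [
            [unescape(component) for component in split_escaped(element, ":")]
            for element in split_escaped(raw, "+")
        ]
        segments.append((elements[0][0], elements[1:]))
    return segments


# Invented example message, for illustration only.
sample = "UNH+1+CUSDEC:D:96B:UN'NAD+DT+Smith ?+ Co'UNT+3+1'"
for tag, elements in parse_segments(sample):
    print(tag, elements)
```

Even this sketch ignores the optional `UNA` service string, repetition separators, and the message implementation guide needed to interpret each element, which is where most of the real effort goes.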
XML • International Trade • Instat62 (Baan) • Traffic & Transport • Road transport • Producers Price-Index • EDR message • Content meta • Process meta • Meta tags • Easy to process with modern software (CBS converter built in Blaise to support legacy systems)
XBRL • EMU reports • Water boards • Municipalities • Provinces • International Services (SAP, Adobe) • Short term indicators (turnover) • National Taxonomy Project • In co-operation with Tax, Justice, Chamber of Commerce • Financial Economic data • XML++ • Difficult to process because of its richness
Envelope • Contains process meta • Content independent • Transport medium independent • Provides consistency control • Should be (an open) standard Now: • CBS XML envelope (EDR and CBSsend) • EDIFACT Future: • SOAP • ebMS (SOAP+)
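The envelope idea can be sketched in Python as follows. The element names (`Envelope`, `Header`, `Sender`, `Survey`, `ContentType`, `Length`, `Sha256`, `Body`) are hypothetical and are not the CBS XML envelope, SOAP, or ebMS. The sketch only shows the three properties listed: process metadata in a header, a payload treated as opaque bytes (content independent), and a consistency check on receipt.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap(payload: bytes, sender: str, survey: str, content_type: str) -> bytes:
    """Put any payload in an envelope that carries process metadata."""
    env = ET.Element("Envelope")
    header = ET.SubElement(env, "Header")
    ET.SubElement(header, "Sender").text = sender
    ET.SubElement(header, "Survey").text = survey
    ET.SubElement(header, "ContentType").text = content_type
    # Consistency control: length and digest of the payload.
    ET.SubElement(header, "Length").text = str(len(payload))
    ET.SubElement(header, "Sha256").text = hashlib.sha256(payload).hexdigest()
    # Base64 keeps the envelope independent of the payload format.
    ET.SubElement(env, "Body").text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(env, encoding="utf-8")


def unwrap(envelope: bytes):
    """Return (header dict, payload); raise ValueError if the check fails."""
    env = ET.fromstring(envelope)
    header = {child.tag: child.text for child in env.find("Header")}
    payload = base64.b64decode(env.findtext("Body") or "")
    same_length = len(payload) == int(header["Length"])
    same_digest = hashlib.sha256(payload).hexdigest() == header["Sha256"]
    if not (same_length and same_digest):
        raise ValueError("payload does not match envelope header")
    return header, payload


# Invented sample values, for illustration only.
message = wrap(b"200505;84073000;1250\n", "provider-001",
               "international-trade", "text/csv")
header, payload = unwrap(message)
print(header["Survey"], len(payload))
```

Because routing needs only the header, a dispatcher can hand the message to the right conversion tool without parsing the payload, whether that payload is ASCII, EDIFACT, XML, or XBRL.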
Envelope: manage data flows • [Diagram labels: Tool1, Tool2, Tool3, Tool4; SMTP; FTP; Post Office; MesDesk; Distribution/conversion; surveys] • 2004: 275,000 messages
Management of Standards • Technical standards • UN/CEFACT • OASIS • XBRL.org • ISO • etc. • Content (Taxonomy, MIG) • Suggestions? • I do!
Content Standards • Are there any for statistical purposes? • If so, what is the scope (local, national, European)? • Are they necessary? • And on what scope? • Can they be created? • How and by whom should they be managed? • What role should the NSIs play in this? • And what about Eurostat?