Developing Statistical Information Systems and XML Information Technologies - Possibilities and Practicable Solutions. Heikki Rouhuvirta, Statistical Methodology R&D. heikki.rouhuvirta@stat.fi. Geneva, 8-10 May 2007. Approaches to Statistics Production.


Presentation Transcript


  1. Developing Statistical Information Systems and XML Information Technologies - Possibilities and Practicable Solutions. Heikki Rouhuvirta, Statistical Methodology R&D, heikki.rouhuvirta@stat.fi. Geneva, 8-10 May 2007.

  2. Approaches to Statistics Production
     • Sources to statistics – Data Processing
     • Sources to statistics – Statistical Methodology
     • Statistics as Information

  3. IT in Statistics Production (flow diagram; labels listed in extraction order)
     • registers, inquiries, other statistical data
     • compilation / combining of data
     • logical verifications
     • datum, statistical data set
     • dirty data
     • processing into statistical concepts, imputation, etc.
     • quality control and approval of data for the purpose of statistics compilation
     • protection of unit-level data
     • reporting, analyses, further processing
     • reporting, release

  4. Methodological Processing of Statistical Data in Statistics Production

  5. Statistical Information

  6. Challenge:
     • create solutions that unite the foregoing points of view
     • the solutions offer the services that statistics production needs
     • the solutions are easily recognizable by a user and
     • offer an adequate informative basis for each individual task
     • through the solutions, the entirety of tasks is manageable for the statistician
     Key for Solution:
     • exploitation of XML technology

  7. Basics of XML. XML Specification for Statistical Information: Common Structure of Statistical Information (CoSSI)

  8. … the result from a statistics standpoint …

  9. Statistics Production and Statistical Information
     Stages of Processing:
     • 0. Defining
     • Collecting
     • Editing
     • Producing public statistics
     • Using
     Model of Data Organisation (diagram labels):
     • condensed format: table and description
     • basic format: data matrix and description
     • descriptions in different documents
     • condensing, interpreting
     • matrix model including statmeta
     • table model including statmeta
     • statistical metadata model
     • matrix module, table module, statmeta module

  10. … case studies of XML in statistics production …

  11. XML Database and Statistical Information

  12. Retrieval of Statistical Metadata for a Variable - Simple User Interface

  13. Turn over the Documents in XML Database

  14. Saving Documents to XML Database

  15. Event log of XML Database
      Collection listing (name, owner, group, permissions):
      /db
      /system admin dba
      /config admin dba
      users.xml admin dba rwurwu---
      /Tilastot admin dba
      /logs admin dba
      contents.xml admin dba rwurwur--

      /db/logs/contents.xml
      ...
      <event timestamp="2007-03-02T10:57:47.941+02:00">
        <type>STORE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4.xml</path>
      </event>
      <event timestamp="2007-03-02T10:57:48.235+02:00">
        <type>STORE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_001.gif</path>
      </event>
      <event timestamp="2007-03-02T10:57:48.898+02:00">
        <type>STORE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_002.gif</path>
      </event>
      <event timestamp="2007-03-02T10:57:49.89+02:00">
        <type>STORE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_002.png</path>
      </event>
      <event timestamp="2007-03-02T10:58:35.741+02:00">
        <type>STORE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4_eq_00.gif</path>
      </event>
      <event timestamp="2007-03-02T11:26:28.432+02:00">
        <type>UPDATE</type>
        <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu1.xml</path>
      </event>
      </events>
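
An event log of this shape is easy to summarise programmatically. The following is a minimal sketch in Python, not the tooling used in the presentation. It assumes the log is a well-formed XML document whose root element is `events` (the slide shows only the closing tag, so the root and the shortened sample are assumptions), and the helper name `summarise_events` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Sample modelled on the slide. The <events> root element is an assumption,
# because the slide elides the beginning of the file.
SAMPLE_LOG = """<events>
  <event timestamp="2007-03-02T10:57:47.941+02:00">
    <type>STORE</type>
    <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu4.xml</path>
  </event>
  <event timestamp="2007-03-02T11:26:28.432+02:00">
    <type>UPDATE</type>
    <path>/db/Tilastot/Arbortext-koulutus/Julkaisut/Julkaisu1.xml</path>
  </event>
</events>"""


def summarise_events(xml_text):
    """Return (count per event type, list of (timestamp, type, path) tuples)."""
    root = ET.fromstring(xml_text)
    rows = []
    for event in root.iter("event"):
        rows.append((
            event.get("timestamp"),   # attribute on <event>
            event.findtext("type"),   # STORE, UPDATE, ...
            event.findtext("path"),   # document path in the database
        ))
    counts = Counter(event_type for _, event_type, _ in rows)
    return counts, rows


if __name__ == "__main__":
    counts, rows = summarise_events(SAMPLE_LOG)
    for timestamp, event_type, path in rows:
        print(timestamp, event_type, path)
    print(dict(counts))
```

Counting events per type is only one possible summary; the same loop could just as well filter by path prefix or by time window.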

  16. Tabulation Application Architecture in SAS

  17. Tabulation Wizard User Interface in SAS EG

  18. SAS Data Editing Process

  19. Logical schema of an XML file: Statistical data

  20. Archiving and Backing Up to XML

  21. Example of XQuery/SQL

  22. Content of XML file

  23. Production and Dissemination of Tables in Publishing Process

  24. XML Publication Editor - User Interface

  25. Retrieval of Statistical Information

  26. … and statistical information in tables

  27. Table 1. Statistical Metadata in an informative statistical table (I)
      Table skeleton: Variable 1, Variable 2, Variable 3; Class value 1, Class value 2; Statistical figures 1-8
      Callouts:
      • Statistical metadata: title, subtitle, footnote, metadata reference (quality declaration)
      • Document metadata elements: subject, keywords, content description, date, identifier
      • Statistical metadata elements: name, specification, concept definition, concept definition description, operational definition, operational definition description, calculation name, calculation formula, calculation description, measurement unit, measurement description
      • Statistical metadata elements: note
      • Register metadata elements: name, concept definition, formation instruction, law, interpretation of law, law cases, etc.
      • Statistical metadata elements: code, name, description
      • Document metadata elements: classification id, type, author, date
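
To make the idea of an informative table more concrete, the sketch below builds a small XML fragment in which a variable and a statistical figure carry their own metadata, and then retrieves the metadata of one variable. It is only an illustration in Python: the element names (`table`, `variable`, `conceptDefinition`, `measurementUnit`, `figure`, `note`) and all sample values are hypothetical and are not taken from the CoSSI specification.

```python
import xml.etree.ElementTree as ET


def build_table():
    """Build a toy table whose parts carry their own metadata (hypothetical schema)."""
    table = ET.Element("table")
    ET.SubElement(table, "title").text = "Table 1"

    # A variable together with its statistical metadata elements.
    variable = ET.SubElement(table, "variable", {"id": "v1"})
    ET.SubElement(variable, "name").text = "Variable 1"
    ET.SubElement(variable, "conceptDefinition").text = "Placeholder definition"
    ET.SubElement(variable, "measurementUnit").text = "persons"

    # A statistical figure that refers to the variable and carries a note.
    figure = ET.SubElement(table, "figure", {"variable": "v1", "class": "Class value 1"})
    ET.SubElement(figure, "value").text = "42"
    ET.SubElement(figure, "note").text = "Placeholder note"
    return table


def metadata_for(table, variable_id):
    """Return the metadata stored with a variable as a dict of tag -> text."""
    for variable in table.iter("variable"):
        if variable.get("id") == variable_id:
            return {child.tag: child.text for child in variable}
    return {}


if __name__ == "__main__":
    table = build_table()
    print(metadata_for(table, "v1"))
    print(ET.tostring(table, encoding="unicode"))
```

Keeping the metadata inside the same document as the figures means that any application reading the table can reach the definitions without a separate lookup service; whether the actual specification nests or references its metadata is not stated on the slide.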

  28. Table 1. Statistical Metadata in an informative statistical table (II)
      Table skeleton: Variable 1, Variable 2, Variable 3; Class value 1, Class value 2; Statistical figures 1-8
      Callouts:
      • Quality declaration
      • Quality Indicators: Coefficient of Variation Value=0.92
      • Quality Indicators: Coefficient of Variation Value=0.87

  29. Table 1. Statistical Metadata in an informative statistical table (III)
      Table skeleton: Variable 1, Variable 2, Variable 3; Class value 1, Class value 2; Statistical figures 1-8
      Callouts:
      • Quality declaration
      • Quality Indicators: Coefficient of Variation Value=0.92
      • Quality Indicators: Coefficient of Variation Value=0.87
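
The slides do not say how the coefficient of variation attached to the figures is computed. One common definition is the sample standard deviation divided by the mean; in survey practice it is often the estimated standard error of an estimate divided by the estimate itself. The short Python sketch below assumes the first definition and uses made-up numbers; the function name is hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (one common definition)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean


if __name__ == "__main__":
    # Made-up observations, not data from the presentation.
    print(coefficient_of_variation([1.0, 2.0, 3.0]))
```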

  30. Conclusions: XML-Based Service Environment in Statistics Production
      • The statistics production solution briefly described above indicates the kinds of services that a statistical information system could produce in future, both for statisticians and for users of statistical data. The foundation for statistics production is an XML-based information architecture and standard applications exploiting it.
      • Basing the implementation of the information architecture on XML allows utilisation of standard and standard-like specifications, but the special characteristics of statistical information should be taken into consideration in their application and implementation. If, for instance, the possibilities of a semantic structural specification are not exploited in the structural analysis and the final structure of statistical data, the solutions become complicated from the point of view of information management and ineffective in practice. From the perspective of application development, it seems especially important that the information architecture itself does not contain application-specific data specifications, because we are unlikely to see a situation where just one monolithic application serves both statistics production and information service provision.
      • A semantically relevant structure helps the statistician and the user of statistics to control the correctness of contents.

  31. Thank you for your attention!