A DISTRIBUTED ARCHITECTURE FOR STATISTICAL DATA PROCESSING AND DISSEMINATION
G. Pongas and A. Wroński, Eurostat
Meeting on the Management of Statistical Information Systems, MSIS 2009, Oslo
Outline
• Our idea of a distributed system
• System components
• Collaboration levels
• Conclusion
Purpose of the paper
• Present the architecture of a distributed statistical system able to
  • Promote collaboration among Statistical Agencies (SAs)
  • Decrease the overall burden on SAs
  • Increase data quality
  • Improve timeliness
  • Increase efficiency in data exchange
• Key message: all the functions and data components must be described through metadata
Assumptions
• Such a system should
  • Evolve from / integrate existing systems
  • Permit different collaboration levels
  • Not enforce a unique system
Ultimate Distributed System Requirements
• Fast Data Access
• Distributed Process Automation
• Common Logical Data Model Acceptance
• Access Security
• Autonomy of Statistical Agencies’ Systems
System Components
• Distributed Agents (DA)
• Central Agents (CSA)
• Local Systems of the Statistical Agencies
• Web Clients
• Firewalls
Component functionality
• Distributed Agent (DA)
  • Provides agreed processing on top of the common logical model
  • Provides the connection between the common logical model and the Statistical Agency data model
  • Ensures secure access to the data of the Statistical Agency
• Central Agent (CSA)
  • Connects to Distributed Agents
  • Provides for data unification by using:
    • Common model
    • Unique data identification
    • Global metadata and the metadata of the DAs
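The division of labour between a DA and a CSA described above can be illustrated with a minimal sketch. All class names, field names, and sample values here are hypothetical and are not taken from the presentation; the sketch only assumes that each DA holds a mapping from its agency's local field names to the common logical model, and that the CSA identifies each observation by agency, indicator, and period.

```python
from dataclasses import dataclass, field


@dataclass
class DistributedAgent:
    """Hypothetical DA: exposes one agency's data in the common logical model."""
    agency: str
    field_map: dict   # local field name -> common-model field name (assumed metadata)
    local_rows: list  # stand-in for the agency's local system

    def fetch(self):
        """Return local rows re-expressed in the common model.

        Fields without a mapping are not exposed, which loosely mimics
        the DA's role of controlling access to agency data.
        """
        return [
            {self.field_map[k]: v for k, v in row.items() if k in self.field_map}
            for row in self.local_rows
        ]


@dataclass
class CentralAgent:
    """Hypothetical CSA: connects to DAs and unifies their data."""
    agents: list = field(default_factory=list)

    def unified(self):
        """Merge rows from all DAs under a unique (agency, indicator, period) key."""
        result = {}
        for da in self.agents:
            for row in da.fetch():
                key = (da.agency, row["indicator"], row["period"])
                result[key] = row["value"]
        return result


# Two agencies with different local data models (illustrative values only).
da1 = DistributedAgent(
    "SA1",
    {"ind": "indicator", "yr": "period", "val": "value"},
    [{"ind": "GDP", "yr": 2008, "val": 100.0, "internal_note": "x"}],
)
da2 = DistributedAgent(
    "SA2",
    {"code": "indicator", "year": "period", "obs": "value"},
    [{"code": "GDP", "year": 2008, "obs": 250.0}],
)
csa = CentralAgent([da1, da2])
unified = csa.unified()
```

In this sketch the agencies keep their own data models; only the mapping held by each DA has to agree with the common model.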
Component functionality, continued
• Local System
  • Provision of data
  • Production functions as usual
• Firewall
  • Intermediary between DAs and Local Systems; provides security and filtering
• Web Client
  • Dissemination clients
  • Data editing and imputation clients
  • Complex (analysis-oriented) clients
  • Content maintenance client
Architecture options presented from the simplest to the ultimate

Data Dispatch Collaboration
[Diagram: SA1, SA2, SA3]
• Data transmission occurs according to predefined formats and content
• Use of common transmission software
Pull Data Collaboration
[Diagram: SA1, SA2, SA3; DA1, DA2, DA3; Client1, Client2]
• The SA provides access to selected files stored in the DA's database (download facility)
• The SA provides query software on top of the DA's database
• The above corresponds to the classical dissemination schema
Push Data Collaboration
[Diagram: SA1, SA2, SA3; DA1, DA2, DA3; SA4 subscribed to DAs]
• The SA provides through the DA:
  • Data and metadata database (warehouse)
  • Query facilities
  • Subscription facilities for automating data distribution
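A subscription facility of the kind listed above could look roughly like the following sketch. The class `PushAgent`, its methods, and the dataset names are assumptions made for illustration; the presentation does not specify an interface. The idea shown is only that a subscriber registers once and is then notified automatically whenever the DA publishes a dataset it subscribed to.

```python
class PushAgent:
    """Hypothetical DA offering a warehouse plus subscription-based push."""

    def __init__(self, agency):
        self.agency = agency
        self.warehouse = {}    # dataset name -> rows (data and metadata store)
        self.subscribers = {}  # dataset name -> list of callbacks

    def subscribe(self, dataset, callback):
        """Register a callback to be invoked on every new release of `dataset`."""
        self.subscribers.setdefault(dataset, []).append(callback)

    def publish(self, dataset, rows):
        """Store a release and push it to every subscriber of that dataset."""
        self.warehouse[dataset] = rows
        for callback in self.subscribers.get(dataset, []):
            callback(self.agency, dataset, rows)


# SA4 subscribes to one dataset of SA1's agent (illustrative values only).
received = []
push_da = PushAgent("SA1")
push_da.subscribe(
    "trade",
    lambda agency, dataset, rows: received.append((agency, dataset, len(rows))),
)

push_da.publish("trade", [{"period": 2008, "value": 1.0}, {"period": 2009, "value": 2.0}])
push_da.publish("prices", [{"period": 2008, "value": 3.0}])  # no subscriber, stored only
```

After these calls, `received` holds a single notification for the "trade" release, while "prices" is stored in the warehouse without being pushed to anyone.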
Advanced (Ultimate) Collaboration
[Diagram: SA1, SA2, SA3; DA1, DA2, DA3; CSA1, CSA2; Web client1, Web client2]
• Requires an intermediary level between DAs and clients (the CSAs)
• Unifies selected subsets of data coming from different DAs
• Can offer processing services using data coming from different DAs
• Avoids unnecessary data replication
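One way to read "avoids unnecessary data replication" is that the CSA queries the DAs at request time and keeps no copy of their data. The sketch below illustrates that reading only; `make_agent`, `FederatingAgent`, and the sample figures are hypothetical and do not come from the presentation.

```python
def make_agent(rows):
    """Hypothetical DA: a query function over one agency's data in the common model."""
    def query(indicator, period):
        # Only the selected subset leaves the agency.
        return [r for r in rows if r["indicator"] == indicator and r["period"] == period]
    return query


class FederatingAgent:
    """Hypothetical CSA: computes over several DAs on demand, storing nothing."""

    def __init__(self, agents):
        self.agents = agents  # agency name -> DA query function

    def total(self, indicator, period):
        """A simple processing service: sum one indicator across all agencies."""
        return sum(
            row["value"]
            for query in self.agents.values()
            for row in query(indicator, period)
        )


# Illustrative data only.
federation = FederatingAgent({
    "SA1": make_agent([
        {"indicator": "GDP", "period": 2008, "value": 100.0},
        {"indicator": "GDP", "period": 2009, "value": 110.0},
    ]),
    "SA2": make_agent([
        {"indicator": "GDP", "period": 2008, "value": 250.0},
    ]),
})
gdp_2008 = federation.total("GDP", 2008)
gdp_2009 = federation.total("GDP", 2009)
```

The trade-off in this design is that every request reaches the DAs, so response time depends on their availability, which is why the requirements list pairs fast data access with agency autonomy.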
Summary of functionality
[Table: functionality by collaboration type; contents not recoverable]
Conclusions
• Data federation can be gradual (à la carte)
• Data federation can coexist with the current legacy systems
• Data federation is based on a common logical model and common / shared metadata
• Not only data and metadata but also functionalities can be shared
• Key message: all the functions and data components must be described through metadata
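To make the key message concrete, the sketch below shows one possible way a shared function could be described through metadata and invoked by name. The registry layout, the `register` and `invoke` helpers, and the `growth_rate` function are all assumptions introduced for illustration, not part of the architecture as presented.

```python
REGISTRY = {}  # function name -> metadata record (hypothetical layout)


def register(name, inputs, output, description):
    """Decorator that records a function together with its descriptive metadata."""
    def wrap(fn):
        REGISTRY[name] = {
            "inputs": inputs,
            "output": output,
            "description": description,
            "call": fn,
        }
        return fn
    return wrap


@register(
    "growth_rate",
    inputs=["current", "previous"],
    output="percent",
    description="Period-on-period growth of an indicator",
)
def growth_rate(current, previous):
    return (current - previous) / previous * 100.0


def invoke(name, **kwargs):
    """Call a shared function by name after checking arguments against its metadata."""
    meta = REGISTRY[name]
    missing = [p for p in meta["inputs"] if p not in kwargs]
    if missing:
        raise ValueError(f"missing inputs for {name}: {missing}")
    return meta["call"](**kwargs)


result = invoke("growth_rate", current=150.0, previous=100.0)
```

Because a client only needs the metadata record to discover and call the function, the same mechanism could in principle be used to share functionality between agencies.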