
Welcome to the 1st GLOWA-Volta Database Workshop



Presentation Transcript


  1. Welcome to the 1st GLOWA-Volta Database Workshop

  2. Agenda • Aims of the workshop • Deficits in the data stocks and data management of the GVP • Data management • Lifecycle of data • Conclusions for the GVP • Need to integrate the data users into database development • Role of the disciplines in data management • Steps toward optimized data management

  3. Aims of the workshop • Initiation of a dialogue with the GVP members about their requirements for efficient data management • this dialogue is a process in which the following items should be discussed • data • use and access • database structure • metadatabase • web presence • database team and division of work

  4. Aims of the workshop • These points should be discussed within the working groups as far as possible. In this workshop we focus on the items • data (data flow) • data use and access • setting up a „database team“ and division of work • Technical implementation, structure and type of the databases, including ways of access, should be developed by a team made up of members of the departments as well as computer scientists and project leaders!

  5. Current deficits in the data stocks and data management of the GVP

  6. The current situation • Data server • data stock is incomplete • searching for data by criteria is not possible • arrangement of data is unclear • relation to the project is unclear • there are no rules for data uploading (location, topic, etc.)

  7. The current situation • Data media • what is their content? • to which project/thesis do they belong?

  8. The current situation • Metadatabase • data stock representation is incomplete

  9. The current situation • Metadatabase • if you are looking for data, you have to ask your colleagues inside and outside of ZEF! • the contact person may not be available

  10. The current situation • Metadatabase • dead links

  11. The current situation • Datasets • inconsistency

  12. The current situation • Datasets • lack of data description • which methodological background? • are the values correct?

  13. Data management • Avoiding such problems requires data management • Definition (by the „Data Management Association“): • „Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise“ • Normally the data management processes should be implemented in a project right from its start!

  14. Lifecycle of Data and Aspects of its Management (diagram) • Procurement: own investigation, own processing, supply from other institution/project • Structuring (data modeling) and Storing: categorisation, sorting, description • Administration: user access rights, security • Use and Processing: data processing, content management, quality assurance, data preparation (for others) • Distribution: access, delivery • Disposal: update or erasure

  15. Lifecycle of Data: Procurement of Data • can come from • own investigations • other institutions • other (sub-)projects within the main project • serves • to provide the operating processes with input data • needs • defined data sources and formats • quality • application interfaces (import)

  16. Lifecycle of Data: Structuring and Storing • means • sorting of data according to a classification schema • by themes • by projects/subprojects • by formats • by applications • by spatial research area • .....

  17. Lifecycle of Data: Structuring and Storing • and/or • by a conceptual data model • it contains the data entities and their relationships within the scope of a system • the entities have properties (attributes) • it is independent of the storage in a database and of other technical requirements • it can be designed in different forms (relational, network, hierarchical) • the target system for data storage can be a relational database as well as a file system
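A conceptual data model of this kind can be sketched without any reference to a database. The following is a minimal illustration only: the entity names `Project` and `Dataset`, their attributes and the sample values are assumptions made for the example, not the GVP's actual model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical entity with three attributes."""
    title: str
    theme: str
    file_format: str


@dataclass
class Project:
    """Hypothetical entity; 'datasets' expresses a one-to-many relationship."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)


def add_dataset(project: Project, dataset: Dataset) -> None:
    """Record the relationship 'project owns dataset'."""
    project.datasets.append(dataset)


# Sample values are invented for illustration.
land_use = Project(name="L1")
add_dataset(land_use, Dataset("Household survey", "Social Economy", "SPSS"))
themes = sorted({d.theme for d in land_use.datasets})
```

Nothing here says whether the data end up in a relational database or a file system, which is the point of keeping the conceptual level separate.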

  18. Lifecycle of Data: Structuring and Storing • serves • easy searching, finding and use of data • needs • consensus among data producers and users within an organization about • conceptual data model • data needed and not needed • rules about data updating and archival storage • standards for metadata content • control of compliance with the structure criteria
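As a rough sketch of what a metadata standard and a compliance check could look like, the snippet below validates records against a list of required fields and searches them by criteria. The field names in `REQUIRED_FIELDS` and the sample records are assumptions for illustration; the actual standard would have to be agreed on by data producers and users.

```python
# Assumed minimal metadata standard; the real field list is a project decision.
REQUIRED_FIELDS = ("title", "project", "theme", "format", "contact")


def missing_fields(record: dict) -> list:
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def search(records: list, **criteria) -> list:
    """Return the records whose metadata match all given criteria exactly."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


# Invented sample records.
records = [
    {"title": "Rainfall series", "project": "P1", "theme": "Hydrology",
     "format": "CSV", "contact": "A. Researcher"},
    {"title": "Soil map", "project": "P2", "theme": "Pedology",
     "format": "Shapefile", "contact": ""},
]

incomplete = [r["title"] for r in records if missing_fields(r)]
hydrology = [r["title"] for r in search(records, theme="Hydrology")]
```

A check like `missing_fields` could run whenever data are uploaded, so that incomplete descriptions are caught before they reach the shared stock.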

  19. Lifecycle of Data: Structuring and Storing • means • the physical storing of data • needs • storage places for the databases (central/distributed) • physical data model • derived from the conceptual/logical data model • takes into account the facilities and constraints of a given database management system • database management system with • interfaces for applications • query and search services • backup and security functions
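To illustrate how a physical data model is derived for one particular database management system, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is only an assumption chosen because it needs no server; the table and column names are hypothetical.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Physical model: tables, keys and types as this DBMS expresses them.
conn.executescript("""
CREATE TABLE project (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE dataset (
    id         INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    title      TEXT NOT NULL,
    theme      TEXT NOT NULL
);
""")

# Invented sample rows.
conn.execute("INSERT INTO project (id, name) VALUES (1, 'L1')")
conn.execute(
    "INSERT INTO dataset (project_id, title, theme) VALUES (?, ?, ?)",
    (1, "Household survey", "Social Economy"),
)

# Query service: find datasets by theme, together with their project.
rows = conn.execute(
    "SELECT p.name, d.title FROM dataset d "
    "JOIN project p ON p.id = d.project_id WHERE d.theme = ?",
    ("Social Economy",),
).fetchall()
```

The same conceptual entities could be mapped differently on another system; only this layer would change.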

  20. Lifecycle of data: Administration • means • on a technical basis • installation and maintenance of the database system (database + database management system) • user access constraints (rights) • backup and archiving tasks • security • performance • on a content basis • integrity: verifying or helping to verify • control of data delivery • control of data input • metadata

  21. Lifecycle of data: Administration • needs • cooperation between data producers/users and administrators for • maintaining and upgrading the database (schema) • definition of the authorization concept for database access (read only, read/write, database schema modification, etc.)
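An authorization concept with the three levels named above might be sketched as follows. The assumption that the levels are strictly ordered (each level includes the rights of the one below) and the operation names are illustrative choices, not something the slides specify.

```python
from enum import IntEnum


class Access(IntEnum):
    """Access levels; assumed to be hierarchical for this sketch."""
    READ_ONLY = 1
    READ_WRITE = 2
    SCHEMA_MODIFICATION = 3


# Hypothetical mapping from database operation to the minimum level it needs.
REQUIRED_LEVEL = {
    "select": Access.READ_ONLY,
    "insert": Access.READ_WRITE,
    "update": Access.READ_WRITE,
    "alter_table": Access.SCHEMA_MODIFICATION,
}


def is_allowed(level: Access, operation: str) -> bool:
    """Return True if a user with 'level' may perform 'operation'."""
    return level >= REQUIRED_LEVEL[operation]


reader_can_insert = is_allowed(Access.READ_ONLY, "insert")
admin_can_update = is_allowed(Access.SCHEMA_MODIFICATION, "update")
```

In practice such rules would usually be enforced by the database management system itself; the sketch only makes the agreed concept explicit.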

  22. Lifecycle of data: Use and Processing • means • use of data for analysis • processing of data inside and outside of models • production of new or modified (output) data • control of data accuracy • preparation of data for other processes/projects

  23. Lifecycle of data: Distribution • means • delivery of data • inside an organization/project • by storing in a database (access by the recipient) • transfer by portable media • by publishing the metadata • outside an institution/project • by direct access to a database • Web Services • publishing the metadata • data extract service from a database • data downloads • Map Services (geodata)

  24. Lifecycle of data: Distribution • serves • inside an organization/project • for providing work processes with adjusted data • outside an organization/project • for providing work processes with adjusted data • for providing data for public information about the projects • needs • knowledge about the requirements on the demand side concerning • further use of data • formats • clients • ...

  25. Lifecycle of data: Disposal • means • updating the data • selecting and deleting or archiving data that are • out of date • no longer in use • serves • to prevent data overflow in the databases • to maintain the quality of data • needs • cooperation between the data producers/users and the database administrator
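The selection step of disposal can be sketched as a simple rule over a "last used" date. The one-year threshold, the `last_used` field and the sample datasets are assumptions for illustration; the real criteria would be agreed between data producers/users and the database administrator.

```python
from datetime import date, timedelta


def select_for_disposal(datasets, today, max_unused_days=365):
    """Split datasets into (keep, archive) by the date they were last used."""
    limit = today - timedelta(days=max_unused_days)
    keep, archive = [], []
    for ds in datasets:
        # Anything not used since the limit date becomes an archive candidate.
        (archive if ds["last_used"] < limit else keep).append(ds)
    return keep, archive


# Invented sample data.
datasets = [
    {"title": "Rainfall series", "last_used": date(2006, 1, 10)},
    {"title": "Old test export", "last_used": date(2004, 3, 1)},
]
keep, archive = select_for_disposal(datasets, today=date(2006, 6, 1))
```

The function only proposes candidates; whether they are deleted or archived remains a decision for the people responsible for the data.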

  26. Conclusions for the GVP • Conditions • GVP is divided into a range of projects and subprojects • e.g. in Phase II „Land Use“ with subprojects L1, L2 etc. • e.g. in Phase III „Analysis of Long-Term Environmental Change“ with the subprojects E1, E2 etc. • with their own processing, models, input and output data (formats), data flows and data storage • with specific integrations and dependencies among each other and within „use case“ frameworks • Projects and their models are also supplied with data from different scientific disciplines like Hydrology, Pedology, Social Economy, Ecology etc.

  27. Conclusions for the GVP • Conditions • in Phase III the main objective is the „Integration of Phase I and II research results, knowledge, data and tools“* • in Phase III the DSS will be realized as the GVP's primary output • The subprojects are connected by data flow (transfer) • The data flow should be adjusted to the GVP and DSS requirements. This means there must be transparent data management that is centralized and standardized * GVP Phase III Proposal, p. 8

  28. Need for integration of the data users during development and setup of a GVP data management • Each researcher (or on a higher level: each project) is a kind of data manager in his own work space. He has • his own (local) database • his own input and output data and data procurement requirements • his own usage and processing • his own distribution of data (to other users/projects) • and therefore his own (short) lifecycle of data • and is integrated in the data flow between the projects and also in their lifecycle of data

  29. (Diagram) Data flow between the projects (Project 1, Project 2, Project 3, Project 4) and the Central Database

  30. Role of disciplines in developing data management concepts: project members ... • have to decide, together with other project members and the database developers, which data should be stored centrally in order to share them, and which can be stored locally or at other places • have to decide which structure of data storage is most convenient for optimized use • have to give information about their data (create metadata) • and • they are responsible for the data management in their own work area, before the data are coordinated across disciplines by the database administrator

  31. Role of disciplines in developing data management concepts: developers of a database ... • have the responsibility to consult with the project members about the requirements of data management • have to organize the data flow concerning the (technical) way of data storage and access. The activities must be adjusted to the operating processes/projects and their interfaces • have to develop the data management standards together with the project members

  32. Steps toward optimized data management (within this workshop) • My request to you • Step 1: analyze the data stock (data dictionary) • Step 2: analyze the data flows • Step 3: develop the logical data model for data storage

  33. Basis for working groups: Data flow model combined with data dictionary • Notation: • Terminator: data producers (data source) or users (data sink) outside the system (external partners, public) • Process: transformation of input data into output data, e.g. by algorithms • Data storage unit as data pool (not local). Creation time differs from usage time. „A“ → dictionary A • Data flow „a“: direction for dataset „a“ → dictionary a • Data flow: relay in two directions (processes)
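A data flow model combined with a data dictionary can also be written down as plain data, which makes simple consistency checks possible. In the sketch below the node names, the dictionary entries and both helper functions are hypothetical and serve only to show the idea.

```python
# Data dictionary: one description per named data flow (invented content).
dictionary = {
    "a": "satellite scenes delivered by the vendor",
    "b": "classified land-cover map",
}

# Data flows as (source, target, dataset name); node names are illustrative.
flows = [
    ("Vendor of remote sensing data", "Classification", "a"),
    ("Classification", "Central Database", "b"),
    ("Central Database", "Land-use model", "c"),
]


def undocumented(flows, dictionary):
    """Names of data flows that have no entry in the data dictionary."""
    return sorted({name for _, _, name in flows if name not in dictionary})


def one_way_nodes(flows):
    """Nodes that only send or only receive data: candidate sources and sinks."""
    sources = {source for source, _, _ in flows}
    targets = {target for _, target, _ in flows}
    return sorted(sources ^ targets)


gaps = undocumented(flows, dictionary)
edges = one_way_nodes(flows)
```

Here `gaps` lists flow "c" because it has no dictionary entry yet, which is the kind of finding Step 1 and Step 2 of the workshop are meant to produce.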

  34. Basis for working groups: Context diagram GLOWA-Volta (diagram labels: Decision Makers, External Partner, External Partner, Public)

  35. Basis for working groups: Diagram 1: GLOWA-Volta DSS (diagram labels: External Partner, External Partner, Water Supply and Distribution, Water Demand and Management, Analysis of Long-Term Env. Change)

  36. Basis for working groups: Diagram 2: Analysis of Long-Term Environmental Change (diagram labels: Automated Classification of Remotely Sensed Imagery (E1); Cellular automata (E2); Land-use Change Predictions and LU Policy; Vendor of remote sensing data; GVP LUDAS (E3); S 1)

  37. Basis for working groups: Diagram 3: GVP-LUDAS (diagram labels: E 1; GVP-LUDAS; working group: natural scientists; working group: social economists; Elicitation Ghana; Evaluation of Elicitation Results (Household Survey); E 4; data labels a, b, c, d, e, f, g, A)

  38. To Do • Please try to draw a general overview of the data flows and stocks • And relate data management options to the individual data flows or storages • In the afternoon I would like to discuss the requirements for a data management system from your point of view. Take it all as a form of brainstorming! Thank you!

  39. How to organize (sort) the data in the database? Central Database • Project 1: • theme 1 • format 1 • format 2 • theme 2 • .... • Region 1 • Project • subproject • theme • format • Formats • SPSS • project 1 • project 2 • remote sensing • .... • Project 1: • theme 1 • format 1 • format 2 • theme 2 • ....
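The alternative orderings on this slide (by project, by region, by format) can be compared by deriving a storage location from the same metadata under each schema. The schema names, the metadata keys and the sample values below are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Each schema is an ordered list of metadata keys, outermost level first.
SCHEMAS = {
    "by_project": ("project", "theme", "format"),
    "by_region": ("region", "project", "subproject", "theme", "format"),
    "by_format": ("format", "project"),
}


def storage_path(meta: dict, schema: str) -> PurePosixPath:
    """Build the storage location of a dataset under the chosen schema."""
    return PurePosixPath(*(meta[key] for key in SCHEMAS[schema]))


# Invented sample metadata.
meta = {
    "project": "project1",
    "subproject": "sub1",
    "region": "region1",
    "theme": "theme1",
    "format": "SPSS",
}
by_project = str(storage_path(meta, "by_project"))
by_format = str(storage_path(meta, "by_format"))
```

Whichever ordering is chosen, the same metadata keys are needed, so agreeing on the metadata comes before agreeing on the folder or table layout.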

  40. Basis for working groups: Data flow model combined with data dictionary • Notation II: • Data flow „a“: relay in two directions (processes) • Data flow: division of dataset „a“ into datasets „b“ and „c“ • Data flow: „a“ originates from „b“ and „c“ • Data flow „a“: updating of data in a storage
