Universal Access Layer. Presented by: G. Manduchi. TF Leader: G. Falchetto. Deputies: R. Coelho, D. Coster. EFDA CSU Contact Person: D. Kalupin. 01/12/2010
The need for data layer abstraction • Within the ITM framework, a unique data interface is defined. • Many data access tools are used in the fusion community: • Different data formats; • Different views of data structures. • No data access tool has been selected to be used directly in ITM. • Rather, a unique data interface is exposed to users, hiding the actual implementation. • This layer is therefore the only way users can deal with data.
Database abstraction • A Data Model is presented to simulation programs • The model is decoupled from its actual implementation • In object-oriented terminology: program to interfaces • Interface decoupling allows: • Changing the underlying implementation whenever technology provides a better solution; • Using different solutions, possibly mixed in distributed systems.
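The "program to interfaces" idea can be sketched in a few lines of Python. This is only an illustration: the class names, the `put`/`get` methods and the path `equilibrium/global_param/i_plasma` are assumptions made for the example, not the actual UAL API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class DataBackend(ABC):
    """Hypothetical storage interface (not the real UAL API)."""

    @abstractmethod
    def put(self, path: str, value: Any) -> None:
        """Store a value under a slash-separated path."""

    @abstractmethod
    def get(self, path: str) -> Any:
        """Return the value stored under a slash-separated path."""


class FlatBackend(DataBackend):
    """First stand-in implementation: one flat dictionary keyed by path."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def put(self, path: str, value: Any) -> None:
        self._store[path] = value

    def get(self, path: str) -> Any:
        return self._store[path]


class NestedBackend(DataBackend):
    """Second stand-in implementation: nested dictionaries, one per level."""

    def __init__(self) -> None:
        self._root: Dict[str, Any] = {}

    def put(self, path: str, value: Any) -> None:
        *parents, leaf = path.split("/")
        node = self._root
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value

    def get(self, path: str) -> Any:
        node: Any = self._root
        for name in path.split("/"):
            node = node[name]
        return node


def store_plasma_current(backend: DataBackend, ip: float) -> float:
    """User code: depends only on the interface, never on a backend class."""
    backend.put("equilibrium/global_param/i_plasma", ip)
    return backend.get("equilibrium/global_param/i_plasma")


# The same user code runs unchanged on either implementation.
results = [store_plasma_current(b, 1.5e6) for b in (FlatBackend(), NestedBackend())]
```

Because `store_plasma_current` only sees `DataBackend`, swapping or mixing implementations requires no change in user code, which is the property the slide attributes to interface decoupling.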
A Bus-Like View of the UAL • [Diagram labels: Grid; Simulation Interface (KEPLER); Batch processor; HPC server; HPC server; UAL Bus; HDF5 Files; MDSplus data server]
The Data Model • Users normally have access to basic data types such as integers, floats, doubles, strings and n-dimensional arrays. • Physical entities are represented by several pieces of information, which together contribute to their complete definition. • In the ITM framework, physical entities are represented by Consistent Physical Objects (CPOs). • Every CPO is represented by a (possibly complex) hierarchical data structure. • In essence: only CPOs can be read and written in the ITM framework.
A neutral language-independent definition of CPOs • XML has been chosen to provide an abstract definition of the hierarchical structure of Consistent Physical Objects. • XML appears to be the best candidate for the generic definition of hierarchical data structures. • XML is therefore the common language for establishing how data are organized in ITM. • Several tools are available for analysis and graphical display of XML descriptions. • Such a graphical view is published on the ITM web page.
Consistent Physical Objects and Time • CPOs can describe time-independent information • E.g. geometry description • In other cases a CPO describes phenomena that evolve over time • Every CPO instance represents a snapshot of the described physical quantity. • Every time-dependent CPO has the field « time », i.e. the time that snapshot refers to. • The time evolution of a given physical object is represented by an array of CPOs in the ITM database.
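A minimal sketch of the "array of snapshots" idea follows. The `CoreProfileSlice` class, its `te` field and the nearest-time lookup are illustrative assumptions; only the `time` field comes from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoreProfileSlice:
    """Hypothetical time-dependent CPO: one snapshot of a physical object."""
    time: float          # the time this snapshot refers to
    te: List[float]      # illustrative payload, e.g. a temperature profile


def slice_at(slices: List[CoreProfileSlice], t: float) -> CoreProfileSlice:
    """Return the snapshot whose time is closest to t (illustrative policy)."""
    if not slices:
        raise ValueError("no time slices available")
    return min(slices, key=lambda s: abs(s.time - t))


# Time evolution is simply an array of snapshots, each carrying its own time.
evolution = [
    CoreProfileSlice(time=0.0, te=[1.0, 0.5]),
    CoreProfileSlice(time=0.1, te=[1.2, 0.6]),
    CoreProfileSlice(time=0.2, te=[1.4, 0.7]),
]
nearest = slice_at(evolution, 0.12)
```

Here `nearest` is the snapshot taken at `time=0.1`; a time-independent CPO would simply be a single object without the `time` field.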
From Abstract Representation to Data access • XML schemas are an effective way of agreeing on widely accepted data structures. • The UAL provides an Application Programming Interface (API) for reading/writing structured data for a variety of languages (Fortran, C, Java, Matlab, Python). • A one-to-one mapping exists between the XML structure description and the actual language-specific implementation. • The language-specific API represents the exclusive interface between applications and the database implementation.
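The one-to-one mapping between an XML description and a language-specific structure can be illustrated as below. The XML vocabulary (`cpo`, `struct`, `field`, the `type` attribute) and the field names are invented for this sketch; the real ITM definitions are XML schemas and are considerably richer.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace

# Hypothetical, heavily simplified CPO description (not an ITM schema).
CPO_XML = """
<cpo name="equilibrium">
  <field name="time" type="float"/>
  <struct name="global_param">
    <field name="i_plasma" type="float"/>
    <field name="beta_pol" type="float"/>
  </struct>
</cpo>
"""

# Default value assigned to each leaf, by declared type.
DEFAULTS = {"float": 0.0, "int": 0, "string": ""}


def build(node: ET.Element) -> SimpleNamespace:
    """Create one attribute per XML child, recursing into nested structs."""
    obj = SimpleNamespace()
    for child in node:
        name = child.get("name")
        if child.tag == "struct":
            setattr(obj, name, build(child))
        else:
            setattr(obj, name, DEFAULTS[child.get("type")])
    return obj


# The Python object mirrors the XML hierarchy field by field.
equilibrium = build(ET.fromstring(CPO_XML))
equilibrium.global_param.i_plasma = 1.5e6
```

Every element of the description becomes exactly one attribute with the same name and nesting, so a field added to the XML would appear in the Python structure without hand-written code.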
UAL implementation - Architecture • The UAL interface is split into two levels: • The high level interface provides the user with a language-specific API • The low level interface provides a system-independent API composed of a set of low level data access routines • The mapping between the high level and low level layers is carried out by code generated from the XML interface definition • Two implementations of the low level API are currently available: one for MDSplus and one for HDF5.
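The split between the two levels might look roughly like the following sketch. All names here (`LowLevel`, `put_equilibrium`, `EQUILIBRIUM_FIELDS`, the field paths) are hypothetical, and the field list is written by hand where the UAL would rely on code generated from the XML definition.

```python
from typing import Any, Dict, List


class LowLevel:
    """Hypothetical low level layer: reads and writes single elements only."""

    def __init__(self) -> None:
        self.store: Dict[str, Any] = {}
        self.calls: List[str] = []   # records each element-level write

    def put(self, path: str, value: Any) -> None:
        self.calls.append(path)
        self.store[path] = value

    def get(self, path: str) -> Any:
        return self.store[path]


# Stand-in for generated code: the leaf fields of one CPO type.
EQUILIBRIUM_FIELDS = ["time", "global_param/i_plasma", "global_param/beta_pol"]


def put_equilibrium(low: LowLevel, cpo: Dict[str, Any]) -> None:
    """High level write: one low level call per leaf of the structure."""
    for field in EQUILIBRIUM_FIELDS:
        node: Any = cpo
        for name in field.split("/"):
            node = node[name]
        low.put("equilibrium/" + field, node)


def get_equilibrium(low: LowLevel) -> Dict[str, Any]:
    """High level read: rebuild the nested structure from single elements."""
    cpo: Dict[str, Any] = {}
    for field in EQUILIBRIUM_FIELDS:
        *parents, leaf = field.split("/")
        node = cpo
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = low.get("equilibrium/" + field)
    return cpo


low = LowLevel()
original = {"time": 0.5, "global_param": {"i_plasma": 1.5e6, "beta_pol": 0.8}}
put_equilibrium(low, original)
restored = get_equilibrium(low)
```

Only the high level functions know the shape of the CPO; `LowLevel` could be replaced by another implementation with the same `put`/`get` methods without touching them.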
MDSplus and HDF5 • MDSplus is used in the fusion community as a common format for data exchange • HDF5 is used to provide efficient machine-independent storage and access of hierarchical databases • The abstract data structures defined in the UAL are mapped onto specific data structures for both systems • The hierarchical view of CPOs in the UAL fits well in both systems since both support a hierarchical data view. • A comparison between the two systems is provided in: • “Commonalities and differences between MDSplus and HDF5 data systems”, Fusion Engineering and Design 85 (3-4), 583-590
UAL structure • Structured in modular layers: one layer can be changed without modifying the others • High level: • Knows about CPO structure • Language specific • Dynamically generated • Low level: • Deals with single elements: GET/PUT scalars, vectors, arrays, … • Can use multiple storage methods • Transport: managed by MDS+ • Storage: MDS+, memory, HDF5 • [Diagram labels: F90; Java; C, C++; Matlab; Python; C; MDSip, TDI func; MDSplus (file or memory); HDF5]
UAL data access within the ITM framework • When programs are executed within the Kepler framework, they perform no data access themselves. • Programs receive the current input and output CPO references as arguments when they are called by the framework. • The Kepler framework provides all the required data access, and prepares the language-specific CPO structures to be used by programs. • Two Kepler modules will provide the source and the sink of the data used in a simulation • During a Kepler simulation, CPOs can be stored in memory using the same interface • This is achieved using the ability of MDSplus to map pulse files into a memory cache
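The division of work described above, with the framework handling data access and the program seeing only CPOs, can be sketched as follows. The function names, the dictionary used as a database and the CPO fields are assumptions for illustration and do not reflect the Kepler or UAL interfaces.

```python
from typing import Any, Callable, Dict

Cpo = Dict[str, Any]


def scale_current(cpo_in: Cpo) -> Cpo:
    """Physics program: receives a CPO, returns a CPO, does no data access."""
    return {"time": cpo_in["time"], "i_plasma": 2.0 * cpo_in["i_plasma"]}


def run_actor(actor: Callable[[Cpo], Cpo], database: Dict[str, Cpo],
              source: str, sink: str) -> None:
    """Framework side (hypothetical): read input, call program, store output."""
    cpo_in = database[source]        # data access happens here ...
    cpo_out = actor(cpo_in)          # ... the program only sees structures ...
    database[sink] = cpo_out         # ... and here.


# A plain dictionary stands in for the in-memory store used during a run.
database: Dict[str, Cpo] = {"equilibrium/in": {"time": 0.0, "i_plasma": 1.0e6}}
run_actor(scale_current, database, "equilibrium/in", "equilibrium/out")
```

Since `scale_current` never touches the store, the same program could be driven by a different framework or storage system as long as it is handed the same structures.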
UAL and HDF5 in highly data-intensive HPC programs • The HDF5 UAL implementation is used for simulations handling a huge amount of data; • Parallel I/O is being integrated in the UAL for HDF5 • HDF5 files produced on HPC systems will be transferred via (Grid)FTP to the gateway orchestrating the simulation • This approach differs from the MDSplus one, where data are exported by a data server and not stored in local files • It is possible to mix UAL access to MDSplus pulse files and HDF5 files in a transparent way, even from the same program
Collecting experiment data for the UAL • Simulations must work on real data • Even if experiments use different data systems, most of them also provide an MDSplus « access point » • Using MDSplus remote data access, experiment-specific data can be collected and, through the UAL interface, ITM experimental databases can be created • MDSplus remote data access is also used to implement remote UAL access • Exceptions are data-intensive applications that require local storage and produce very large databases, which are then transferred via FTP.
Conclusions • The UAL represents the Data Bus for ITM-TF applications. • It is not tied to the Kepler framework used for simulation in ITM-TF • It provides an abstract high level view of structured data, offered in 5 different languages • High level data view is decoupled from low level data management. • Currently data storage is provided by MDSplus and HDF5; data transport and memory mapping by MDSplus.