Gateway Implementation 4/30/2008 ESG-CET Meeting, Boulder, CO, April 2008
Overview
• Implementation Technologies / Tools
• Science Metadata Implementation
• Browse Interface
• RDF Search Integration
• Data Downloading
• Metrics Integration

Database Driven Approach
• All metadata and associated elements stored in a single database
• Data integrity for all elements enforced at the database level
• Normalization reduces the amount of duplicated data compared with the previous system
• Concurrency and transaction control spanning all related elements
• Hot backups supported

Database Implementation
• PostgreSQL 8.3 selected as the database engine
• Better performance and scalability than MySQL
• Feature rich, with good SQL standard compliance
• Full transactional support
• BSD-style license (the PostgreSQL License), no dual licensing issues

Gateway Implementation
• Java-based
• Spring Framework:
  • Lightweight Inversion of Control (IoC) container
  • Acegi (Spring Security)
  • Web application support
  • Database access abstractions (transactions, exception handling, etc.)
  • Full application support, integration of many useful libraries

Gateway Implementation
• Hibernate: Object Relational Mapping
  • Maps Java objects to the database
  • Greatly reduces the amount of database code that needs to be written
  • Built-in caching, optimized join lookups, and other performance enhancements

Database Schema
• Still under very active development
• Currently 92 tables
• Database is separated into 4 logical schemas
  • Metadata
  • Metrics
  • Security
  • Workspace

Science Metadata Schema (subset)

Browse Interface
• Driven completely from the database
• Efficient queries and data structures
• Straightforward to cache queries and results
• Relatively static structures involved

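The caching point above can be illustrated with a small sketch. It is not the gateway's actual code: the class name `BrowseQueryCache`, the `loader` function standing in for a database query, and the choice of a least-recently-used eviction policy are all assumptions made for illustration. Because the browse structures are described as relatively static, a simple in-memory cache with explicit invalidation is one plausible fit.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: caches browse-query results keyed by query string. */
class BrowseQueryCache {
    private final int capacity;
    // Stands in for the real database query (an assumption, not the gateway's API).
    private final Function<String, List<String>> loader;
    private final Map<String, List<String>> cache;
    private int loads = 0;

    BrowseQueryCache(int capacity, Function<String, List<String>> loader) {
        this.capacity = capacity;
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                // Evict the least recently used entry once the capacity is exceeded.
                return size() > BrowseQueryCache.this.capacity;
            }
        };
    }

    /** Returns the cached result, running the loader only on a cache miss. */
    synchronized List<String> get(String query) {
        List<String> hit = cache.get(query);
        if (hit != null) {
            return hit;
        }
        loads++;
        List<String> result = loader.apply(query);
        cache.put(query, result);
        return result;
    }

    /** Drops all cached results, e.g. after the underlying metadata changes. */
    synchronized void invalidate() {
        cache.clear();
    }

    /** Number of times the loader was actually invoked. */
    synchronized int loadCount() {
        return loads;
    }
}
```

Repeated calls to `get` with the same query string hit the cache, and `invalidate` would be called whenever the underlying browse data changes.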
Future Features
• Annotations
  • User-submitted comments on resources
  • Can be applied to collections and logical files
  • Notifications sent to resource owners and admins for review
• Tagging
  • User-defined and user-assigned keywords
  • Can be assigned at the collection level
  • Browsable and searchable
  • Notifications sent to resource owners and admins for review

RDF Integration
• Database is the authoritative source for the RDF search data
• Event mechanism to trigger RDF updates when the underlying database changes
• Database contains detailed information beyond what is stored in RDF

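The slides do not say how the event mechanism is built, so the following is only a sketch of the general idea using a plain observer pattern. All names here (`MetadataChangeEvent`, `MetadataChangeListener`, `MetadataEventBus`, `RdfUpdateListener`) are hypothetical, and the listener merely records what it would re-export to the RDF store rather than talking to a real one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical event describing a committed change to a metadata record. */
final class MetadataChangeEvent {
    final String resourceId;
    final String changeType; // e.g. "CREATED", "UPDATED", "DELETED"

    MetadataChangeEvent(String resourceId, String changeType) {
        this.resourceId = resourceId;
        this.changeType = changeType;
    }
}

/** Anything that wants to react to database changes implements this. */
interface MetadataChangeListener {
    void onChange(MetadataChangeEvent event);
}

/** Minimal in-process event bus; the database layer would publish to it after a commit. */
final class MetadataEventBus {
    private final List<MetadataChangeListener> listeners = new CopyOnWriteArrayList<>();

    void register(MetadataChangeListener listener) {
        listeners.add(listener);
    }

    void publish(MetadataChangeEvent event) {
        for (MetadataChangeListener listener : listeners) {
            listener.onChange(event);
        }
    }
}

/** Stand-in for the RDF side: records which resources need to be re-exported. */
final class RdfUpdateListener implements MetadataChangeListener {
    final List<String> pending = new ArrayList<>();

    @Override
    public void onChange(MetadataChangeEvent event) {
        // A real implementation would regenerate the RDF for this resource from the database.
        pending.add(event.changeType + " " + event.resourceId);
    }
}
```

The design keeps the database as the authoritative source: the RDF side never writes back, it only reacts to change events and rebuilds its own view.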
Data Download
• Data can be retrieved directly from data nodes or the gateway when data is local
• Files can be directly downloaded through the gateway interface
• Bulk data retrieval scripts can be created through the user interface
  • WGET is currently supported
  • Additional options such as DML to come
• Deep storage retrieval requests generated from the same interface

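As a rough illustration of bulk retrieval script generation, the sketch below turns a list of file URLs into a shell script of `wget` commands. The class name `WgetScriptBuilder` and the script layout are assumptions; the slides do not show what the generated scripts actually look like. The only `wget` option used is `-c`, which resumes partially downloaded files.

```java
import java.util.List;

/** Hypothetical sketch: builds a shell script that downloads each URL with wget. */
class WgetScriptBuilder {

    static String build(List<String> urls) {
        StringBuilder script = new StringBuilder("#!/bin/sh\n");
        for (String url : urls) {
            // Single-quote the URL for the shell and escape any embedded single quotes.
            String quoted = "'" + url.replace("'", "'\\''") + "'";
            // -c lets wget continue a partially downloaded file on re-run.
            script.append("wget -c ").append(quoted).append('\n');
        }
        return script.toString();
    }
}
```

For example, `WgetScriptBuilder.build(Arrays.asList("http://example.org/a.nc"))` yields a two-line script: the `#!/bin/sh` header followed by one `wget -c` command for that URL.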
Authorization Tokens
• Lightweight tokens are used to allow users to download restricted files using standard tools, such as standard HTTP clients
• Limited lifetime
• Grants a particular user access to only a specific resource
• Currently implemented for direct gateway downloads and appropriately configured TDS servers

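The token properties listed above (limited lifetime, bound to one user and one resource) can be sketched as follows. The slides do not describe the token format, so everything here is an assumption: the class name `DownloadToken`, the `user|resource|expiry` payload, and the use of an HMAC-SHA256 signature with a server-side secret are one common way to build such tokens, not necessarily what the gateway does.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of a signed, short-lived token bound to one user and one resource. */
class DownloadToken {
    private final byte[] key; // server-side secret (assumed; never sent to clients)

    DownloadToken(byte[] key) {
        this.key = key.clone();
    }

    /** Issues a token that is valid until expiresAtEpochSec. */
    String issue(String user, String resource, long expiresAtEpochSec) {
        if (user.contains("|") || resource.contains("|")) {
            throw new IllegalArgumentException("'|' is reserved as the field separator");
        }
        String payload = user + "|" + resource + "|" + expiresAtEpochSec;
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(payload.getBytes(StandardCharsets.UTF_8))
                + "." + enc.encodeToString(sign(payload));
    }

    /** Accepts the token only for the same user and resource, and only before expiry. */
    boolean verify(String token, String user, String resource, long nowEpochSec) {
        String[] parts = token.split("\\.");
        if (parts.length != 2) {
            return false;
        }
        try {
            Base64.Decoder dec = Base64.getUrlDecoder();
            String payload = new String(dec.decode(parts[0]), StandardCharsets.UTF_8);
            byte[] signature = dec.decode(parts[1]);
            // Constant-time comparison of the presented and the recomputed signature.
            if (!MessageDigest.isEqual(signature, sign(payload))) {
                return false;
            }
            String[] fields = payload.split("\\|");
            if (fields.length != 3) {
                return false;
            }
            long expiresAt = Long.parseLong(fields[2]);
            return fields[0].equals(user) && fields[1].equals(resource) && nowEpochSec < expiresAt;
        } catch (IllegalArgumentException e) {
            // Covers malformed Base64 and a non-numeric expiry field.
            return false;
        }
    }

    private byte[] sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

A token of this kind can be appended to a download URL or sent as a header, which is what lets ordinary HTTP clients fetch restricted files without an interactive login.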
Authorization Tokens

Metrics System
• Metrics data integrated with access control and metadata schemas
• Associated with user accounts and inventory metadata
• Accurate associations of activities without duplication of data
• Use of JasperReports to allow more flexible options for creating new metrics reports in the system
• Evaluating the use of star schemas to allow for better report query performance / options