GSAF: Grid Storage Access Framework • Salvatore Scifo, INFN of Catania • EGEE User Forum, Manchester, UK, 10th-11th May 2007
Partnership • Grid Storage Access Framework • The project is a collaboration between INFN and IR&T engineering s.r.l. (an SME located in Catania, Italy). • The work is set in the context of the e-Infrastructure Trinacria Grid Virtual Laboratory Project and the ADAT Project (“Archivi Digitali Antico Testo”), which aims to implement a Digital Archive for Cultural Heritage Data (antique manuscripts) with a Digital Repository based on EGEE grid services. • Resources • INFN • S. Scifo (s.scifo@ct.infn.it) • A. Calanducci (a.calanducci@ct.infn.it) • On behalf of the GILDA Team • IR&T engineering (http://www.irt-engineering.it) • V. Milazzo (v.milazzo@irt-engineering.it) • A. Magrì (a.magri@irt-engineering.it)
Web Integration Requirements • Main objectives of the web application • Infrastructure side • Organize and handle large amounts of information • Share documents among several organizations • Security side • Define and apply Access Control Policies • Development side • Build applications without specific technical knowledge of the adopted infrastructure (high-level API) • Build and maintain dynamic web content (simple tools to manage repositories for provisioning purposes) • User side • Manage Groups and Users (administrative user profiles) • Manage Digital Resources (authoring user profiles) • Access, search and retrieve files and data easily (end user profiles)
Grid as Digital Repository • Repository Virtualization provided by • interfaces to manage DATA • interfaces to manage METADATA • Data Management capabilities • Handling of large and numerous files, also in distributed environments • Ubiquity: data can be accessed independently of its location • Security capabilities • Centralized access control mechanism based on X.509 certificates • System capabilities • Availability, Scalability, Fault Tolerance
Classic Web Application • Data Presentation Layer consists of all the graphical interfaces that let users interact with the application • Data Business Layer collects all the software components that implement the behavior of the given application • Data Access Layer is made up of the software components that allow the application to manage data (ASCII files, XML files, digital objects, metadata, SQL data) • Data Access Layer components interact with several types of data sources • File System (for data stored in files) • Relational Database Management System (for data organized into SQL tables)
Grid Web Application • Aspects of porting to a Grid environment • files are stored inside a Storage Element (SE) • files can be replicated on several SEs for ubiquity, security and sharing needs; the relationships among file locations, replicas and their identifiers are kept in a dedicated File Catalogue Service • each file can be associated with descriptive metadata, organized through a dedicated Metadata Catalogue Service • Technical Approach • replace the Data Access Layer with an appropriate interface that permits: • business components to manage data stored within the Grid Data Management Services (DMS) • presentation objects to search and retrieve data from the DMS
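As a rough illustration of the relationships described on this slide, the Python sketch below models a file catalogue that maps one logical file name to an identifier and to several replica locations, alongside a separate metadata catalogue. It is not GSAF or grid middleware code: the class names, method names, the logical file name and the `srm://...example.org` locations are all hypothetical, and both catalogues are plain in-memory dictionaries.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FileCatalogue:
    """Hypothetical in-memory stand-in for a File Catalogue Service."""
    guid_by_lfn: dict = field(default_factory=dict)       # logical name -> identifier
    replicas_by_guid: dict = field(default_factory=dict)  # identifier -> SE locations

    def register(self, lfn, storage_url):
        # The first registration creates the identifier; later ones add replicas.
        guid = self.guid_by_lfn.setdefault(lfn, str(uuid.uuid4()))
        self.replicas_by_guid.setdefault(guid, []).append(storage_url)
        return guid

    def replicas(self, lfn):
        return list(self.replicas_by_guid[self.guid_by_lfn[lfn]])


@dataclass
class MetadataCatalogue:
    """Hypothetical in-memory stand-in for a Metadata Catalogue Service."""
    entries: dict = field(default_factory=dict)  # logical name -> attributes

    def describe(self, lfn, **attributes):
        self.entries.setdefault(lfn, {}).update(attributes)

    def attributes_of(self, lfn):
        return dict(self.entries.get(lfn, {}))


LFN = "/grid/demo/manuscript-001.tif"  # made-up logical file name

files = FileCatalogue()
metadata = MetadataCatalogue()

# One logical file, two replicas on two (made-up) Storage Elements.
first_guid = files.register(LFN, "srm://se1.example.org/data/a1")
second_guid = files.register(LFN, "srm://se2.example.org/data/b7")

# Descriptive metadata lives in a different service, keyed by the same name.
metadata.describe(LFN, century="XV", language="latin")
```

Note that nothing in this sketch keeps the two catalogues consistent with each other; an application using them directly would have to do that itself.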
Designers' point of view • Developing applications (web or desktop) is not easy • Grid Data Services are independent of one another • They work in “stand-alone” mode • No coherence among them is ensured • This fragmentation forces software engineers and web designers to adopt a vertical architecture • The application must take care of the atomicity, coherence and synchronization of data manipulation
Users' point of view • Users can use only Command Line Tools • These tools are installed on specific machines called User Interfaces (UI), located inside the Grid network boundaries • Users encounter several network access problems • If a user has a personal UI, who ensures its security? • The user must keep all the logical relationships among data and metadata in mind
GSAF solution • GSAF is an Object Oriented Framework • built on top of the Grid Metadata Service and the Grid Data Service; it exposes classes and related methods to the applications located above it • Main objectives • hide the complexity and fragmentation of the several underlying APIs • group functional requirements shared among applications • ensure atomicity across different data manipulations
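The idea of a single framework layer that hides two independent services and keeps them consistent can be sketched as a small facade. The Python example below is only an assumption about how such a layer might behave, not GSAF's actual design or API: `InMemoryDataService`, `InMemoryMetadataService`, `StorageFacade` and `MetadataError` are hypothetical names, and the "services" are dictionaries. The point it shows is the compensation step: if registering metadata fails, the file that was already stored is removed again, so the caller never sees a half-completed upload.

```python
class MetadataError(Exception):
    """Raised by the toy metadata service when an entry is rejected."""


class InMemoryDataService:
    """Hypothetical stand-in for a grid data service."""

    def __init__(self):
        self.files = {}

    def put(self, name, content):
        self.files[name] = content

    def delete(self, name):
        self.files.pop(name, None)


class InMemoryMetadataService:
    """Hypothetical stand-in for a grid metadata service."""

    def __init__(self, required_keys):
        self.required_keys = set(required_keys)
        self.entries = {}

    def add_entry(self, name, attributes):
        missing = self.required_keys - set(attributes)
        if missing:
            raise MetadataError(f"missing attributes: {sorted(missing)}")
        self.entries[name] = dict(attributes)


class StorageFacade:
    """One entry point for applications; hides both services."""

    def __init__(self, data_service, metadata_service):
        self._data = data_service
        self._metadata = metadata_service

    def upload(self, name, content, attributes):
        self._data.put(name, content)
        try:
            self._metadata.add_entry(name, attributes)
        except Exception:
            # Compensate: undo the first step so both services stay coherent.
            self._data.delete(name)
            raise


data = InMemoryDataService()
meta = InMemoryMetadataService(required_keys=["title"])
facade = StorageFacade(data, meta)

facade.upload("ok.txt", b"hello", {"title": "Greeting"})

try:
    facade.upload("bad.txt", b"oops", {})  # no title, so metadata is rejected
    failed = False
except MetadataError:
    failed = True
```

After the failed call, `bad.txt` is present in neither service, while `ok.txt` is present in both.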
GSAF System Architecture
GSAF Functional Requirements • Managing Metadata Schemas • Managing ACLs to access Metadata • Managing ACLs to access Data • Uploading files to the SE • Browsing the Metadata Catalogue / File Catalogue • Searching files by Metadata • Deleting files
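Two of the requirements listed here, ACLs on metadata and search by metadata, interact: a search should only return entries the caller is allowed to read. The Python sketch below illustrates that combination under simplifying assumptions. It does not reproduce how GSAF or the underlying catalogues represent ACLs; `CatalogueEntry`, `search_by_metadata`, the group names and the attribute names are invented for the example, and an ACL is reduced to a set of groups with read permission.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical metadata entry with a simplified read ACL."""
    attributes: dict
    read_groups: set = field(default_factory=set)  # groups allowed to read


def search_by_metadata(entries, user_groups, **criteria):
    """Return names of readable entries whose attributes match all criteria."""
    groups = set(user_groups)
    hits = []
    for name, entry in entries.items():
        if not (entry.read_groups & groups):
            continue  # caller has no read permission on this entry
        if all(entry.attributes.get(k) == v for k, v in criteria.items()):
            hits.append(name)
    return sorted(hits)


entries = {
    "scan-001": CatalogueEntry({"kind": "manuscript", "century": "XV"},
                               {"curators", "readers"}),
    "scan-002": CatalogueEntry({"kind": "manuscript", "century": "XV"},
                               {"curators"}),
    "scan-003": CatalogueEntry({"kind": "image", "century": "XX"},
                               {"readers"}),
}

reader_hits = search_by_metadata(entries, ["readers"], kind="manuscript")
curator_hits = search_by_metadata(entries, ["curators"], kind="manuscript")
```

With this data a member of `readers` finds only `scan-001`, while a member of `curators` finds both manuscripts.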
GSAF Web Interface • The GSAF Web Interface manages data and their metadata remotely • Initially, the main purpose of this application was to serve as a natural test bed for the basic functionalities of the framework • it is also a useful tool to administer the Grid Storage over the internet • A Web Interface is the easiest approach • for new users who do not have specific knowledge of the Grid environment. • no syntax rules are required, and users do not lose the high-level view of data or of metadata schemas. • immediate interaction thanks to guided, user-friendly procedures that make training and learning faster. • the web application needs only a simple internet connection, so it avoids any dependency on the Grid UI machines.
GSAF Web Interface
ADAT project
Use Cases • ADAT Project • embeds GSAF within the Digital Archive Software • Physics Department of the University of Catania (PI2S2 project) • aims to implement a Grid Oriented Digital Archive for DICOM images. • BM Portal project (Bio-Lab, DIST, University of Genoa) • embeds the GSAF framework as a plug-in • GILDA Team • adopts the GSAF web interface for dissemination and training purposes.
Outlook • File Replica support • VOMS Integration • ACLs at Disk Pool Manager Level • for coherence between File Catalogue permissions and DPM permissions • Transaction Manager • Serialization levels • Transaction pattern • Execute() • Commit() • Rollback() • All we need to integrate applications….
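The slide names only the three operations of the planned transaction pattern. One common way to realize such a pattern over services that have no native transactions is to record an undo action for every executed step and replay those actions in reverse order on rollback. The Python sketch below shows that approach as an assumption, not as the design GSAF adopted: the `Transaction` class, its method signatures, and the `storage` and `catalogue` dictionaries are all hypothetical.

```python
class Transaction:
    """Hypothetical execute/commit/rollback helper based on undo actions."""

    def __init__(self):
        self._undo_stack = []
        self._finished = False

    def execute(self, action, undo):
        """Run `action` now and remember `undo` in case of rollback."""
        if self._finished:
            raise RuntimeError("transaction already finished")
        result = action()
        self._undo_stack.append(undo)
        return result

    def commit(self):
        """Make the executed steps permanent by discarding the undo actions."""
        self._undo_stack.clear()
        self._finished = True

    def rollback(self):
        """Undo the executed steps, most recent first."""
        while self._undo_stack:
            undo = self._undo_stack.pop()
            undo()
        self._finished = True


# Toy stand-ins for a storage service and a metadata catalogue.
storage, catalogue = {}, {}

# A transaction that is rolled back leaves both services untouched.
tx = Transaction()
tx.execute(lambda: storage.update({"f1": b"data"}),
           lambda: storage.pop("f1", None))
tx.execute(lambda: catalogue.update({"f1": {"title": "Draft"}}),
           lambda: catalogue.pop("f1", None))
tx.rollback()
state_after_rollback = (dict(storage), dict(catalogue))

# A committed transaction keeps its changes.
tx2 = Transaction()
tx2.execute(lambda: storage.update({"f1": b"data"}),
            lambda: storage.pop("f1", None))
tx2.execute(lambda: catalogue.update({"f1": {"title": "Final"}}),
            lambda: catalogue.pop("f1", None))
tx2.commit()
```

This sketch does not address the serialization levels mentioned on the slide; concurrent transactions would need additional locking or ordering rules.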
Conclusions • GSAF means • A useful API to develop Grid Storage based applications • A useful and simple web interface to access Data Management Services remotely • extremely flexible, multi-platform and multi-user • to be a cross-application-domain plug-in • comfortable usage of the Web Interface • to be a simple Content Management Tool to manage data remotely • a candidate for the EGEE RESPECT Program • to become a recommended external software package for the EGEE middleware
References • GSAF wiki pages • https://grid.ct.infn.it/twiki/bin/view/TRIGRID/GSAF • AMGA Web Interface wiki pages • https://grid.ct.infn.it/twiki/bin/view/TRIGRID/AMGAWI • AMGA Service and Java API • http://project-arda-dev.web.cern.ch/project-arda-dev/metadata/index.html • GFAL Java API • http://grid-deployment.web.cern.ch/grid-deployment/gis/GFAL/gfal.3.html • https://grid.ct.infn.it/twiki/bin/view/GILDA/APIGFAL • LFC Java API • http://wiki.egee-see.org/index.php/SEE-GRID_File_Management_Java_API • IR&T engineering s.r.l. • http://www.irt-engineering.com • TriGrid VL • http://www.trigrid.it