EGRID Project: Experience Report Implementation of a GRID Infrastructure for the Analysis of Economic and Financial data
Econophysics GRID
• Italian Ministry of Education (MIUR) funded project.
• Purpose: pilot project for a future Italian National GRID facility for Economics and Finance.
• Serves the computing needs of two select research projects, both applying models from Physics to Economics/Finance:
  • INFM’s High frequency dynamics in financial markets.
  • AREA Trieste’s Softcomputing techniques applied to modern finance.
Summary:
• User requirements
• The EGRID facility
• Deficiencies of EDG middleware
• EGRID solutions/workarounds
• Next steps of EGRID
User requirements: AREA
• Large DB of corporate budget analysis, to be exported to both the GRID and the WEB.
• Access must be secure, authenticated and authorised.
• No real need for computing power.
User requirements: INFM
• Management services for 2 TB of Stock Exchange data (NYSE, Milan, etc.).
• Data privacy and security: the data is covered by legally binding contracts.
• Processing facility for raw data.
The EGRID facility: Physical Infrastructure
• Non-partner centre INFN Padova supplies all bulk computing power and storage.
• Resources: 2.6 TB storage + 4 exclusive CPUs + 100 CPUs on a best-effort basis.
• INFN Padova is already part of the national High Energy Physics GRID (INFN-GRID).
• Our users provide limited local GRID-enabled buffer storage to offset bandwidth problems.
The EGRID facility
[Diagram: the Padova site hosts the RB, a CE, an SE (2.6 TB) and WNs (100 CPUs); the peripheral sites (Firenze, Trieste, Palermo, ...) each run CE+SE+WN.]
The EGRID facility: Software Infrastructure
• Peripheral sites run the same middleware as INFN-GRID: GLOBUS 2.2/2.4 based EDG/LCG2.
• EGRID software layer on top of EDG/LCG2 to simplify data management:

    egrid-upload /nyse-2002-01.tar.gz lfn:/fonti/cd/nyse-2002-01.tar.gz

  instead of

    edg-replica-manager --vo=egrid copyAndRegisterFile \
        file:///home/usr/nyse-2002-01.tar.gz \
        -d sfn://egrid-10.egrid.it/flatfiles/SE00/egrid/fonti/cd/nyse-2002-01.tar.gz \
        -l lfn:/fonti/cd/nyse-2002-01.tar.gz
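A wrapper of this kind mostly fills in the parts of the edg-replica-manager call that never change for a given VO. The following is a minimal Python sketch of that idea, not the actual egrid-upload code: the helper name `build_upload_command` is hypothetical, and the VO name, SE host and SE root path are simply copied from the one example on the slide (whether the real tool hard-codes them is an assumption).

```python
import shlex
from pathlib import PurePosixPath

# Defaults copied from the single example on the slide; treating them as
# fixed per-VO settings is an assumption made for this sketch.
VO = "egrid"
SE_HOST = "egrid-10.egrid.it"
SE_ROOT = "/flatfiles/SE00/egrid"


def build_upload_command(local_path, lfn, vo=VO, se_host=SE_HOST, se_root=SE_ROOT):
    """Return the edg-replica-manager argument list for one upload (hypothetical helper)."""
    if not lfn.startswith("lfn:/"):
        raise ValueError("logical file name must start with 'lfn:/'")
    local = PurePosixPath(local_path)
    if not local.is_absolute():
        raise ValueError("local path must be absolute")
    logical_path = lfn[len("lfn:"):]  # e.g. /fonti/cd/nyse-2002-01.tar.gz
    return [
        "edg-replica-manager", f"--vo={vo}", "copyAndRegisterFile",
        f"file://{local}",                                # source on the user's machine
        "-d", f"sfn://{se_host}{se_root}{logical_path}",  # physical destination on the SE
        "-l", lfn,                                        # logical name to register
    ]


if __name__ == "__main__":
    cmd = build_upload_command(
        "/home/usr/nyse-2002-01.tar.gz", "lfn:/fonti/cd/nyse-2002-01.tar.gz"
    )
    print(shlex.join(cmd))  # print only; the sketch never runs the command
```

The sketch only builds and prints the command line, so it can be tried on a machine without any GRID middleware installed.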
The EGRID facility: Software Infrastructure
• EGRID software for raw data processing: converts Stock Exchange format into a more usable research format.
• Ad-hoc solution for AREA DB access: web-enabling techniques (CGI, JSP, etc.) + GSI security (Apache MOD_GRIDSITE) + GRID Information System integration.
Deficiencies of EDG Middleware: Data privacy and security
• The GSIFTP protocol moves data around the GRID, but the GridFTP daemon only enforces access restrictions by way of the standard UNIX permission triple.
• The pool account mechanism on the SE does not allow access rights to be partitioned within the same VO.
• Neither authentication nor authorization is enforced on the RLS: the replica catalogue is easily corrupted!
Deficiencies of EDG Middleware: Middleware deployment
• EDG is based on Red Hat Linux 7.3.
• No complete installation instructions.
• The LCFGng installation tool is poorly documented, needs a dedicated machine, and does not allow useful software combinations (e.g. no CE+SE+WN on the same machine).
• The UI needs a dedicated machine: it cannot be installed on the user’s own workstation.
EGRID solutions/workarounds: Data privacy and security
Data resides in the SE, so that is where security must be guaranteed; no ACLs are available (a Red Hat 7.3 limitation).
• Pool account mechanism disabled in the SE.
• Each GRID user is mapped to his/her own corresponding local account.
• UNIX groups are formed by gathering users according to their contractual rights to data access.
• Files on the SE are protected by group ownership rights.
• A nested directory structure allows read access to the whole group and write access to a subset of the group.
• A central LDAP server publishes the user/group account maps and propagates them to the SE.
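How nested directories can give a whole group read access while only a subset may write, using nothing but the UNIX permission triple, can be modelled in a few lines. The sketch below is one possible layout, not the actual EGRID configuration: the group names (`nyse_read`, `nyse_write`), the owners and the mode bits are all hypothetical. It applies the standard UNIX rule that exactly one triple (owner, group or other) is consulted per file or directory.

```python
from dataclasses import dataclass

R, W, X = 4, 2, 1  # read / write / search bits inside one permission triple


@dataclass(frozen=True)
class Node:
    owner: str
    group: str
    mode: int  # e.g. 0o750


def triple_for(user, groups, node):
    """Select the owner, group or other bits, exactly one triple as UNIX does."""
    if user == node.owner:
        return (node.mode >> 6) & 7
    if node.group in groups:
        return (node.mode >> 3) & 7
    return node.mode & 7


def can(user, groups, path_nodes, want):
    """Check `want` on the last node, requiring search (x) on every parent directory."""
    *parents, target = path_nodes
    if any(not (triple_for(user, groups, d) & X) for d in parents):
        return False
    return (triple_for(user, groups, target) & want) == want


# Hypothetical layout: the outer directory admits only contract holders;
# inside it, "other" effectively means "the remaining contract holders".
OUTER = Node(owner="root", group="nyse_read", mode=0o750)
INNER = Node(owner="root", group="nyse_write", mode=0o775)
DATA = Node(owner="alice", group="nyse_write", mode=0o664)

if __name__ == "__main__":
    reader = {"nyse_read"}
    writer = {"nyse_read", "nyse_write"}
    print(can("bob", reader, [OUTER, INNER, DATA], R))    # reader may read the file
    print(can("bob", reader, [OUTER, INNER], W))          # reader may not create files
    print(can("carol", writer, [OUTER, INNER], W))        # writer subset may create files
    print(can("eve", set(), [OUTER, INNER, DATA], R))     # outsider is stopped at OUTER
```

In this model the write group must be a subset of the read group, otherwise its members cannot traverse the outer directory; each new contract also needs its own pair of groups and directories.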
EGRID solutions/workarounds: Middleware Deployment
• Painstaking job of: tracking down documentation + deriving explicit procedures for single GRID elements from the LCFGng installation + interpreting obscure error messages + trial and error.
• Knoppix-based LiveCD technology for UI and SuperNode: can be run on the fly from the CD, or can be installed on a machine.
• A script installs the UI on any workstation, with no need to re-install the machine and no need for Red Hat 7.3.
EGRID next steps
• The present security mechanism is only a temporary solution (scalability issues)! EGRID is working with INFN to develop the StoRM SRM server, which features ACL-enforced security for GRID files.
• Portal for user applications to replace the CLI.
• Porting of parallel applications.