Polish Infrastructure for Supporting Computational Science in the European Research Space
Status and Current Achievements of the PL-Grid Project

Jacek Kitowski and Łukasz Dutka
ACK CYFRONET AGH, Cracow, Poland
Institute of Computer Science AGH-UST

In collaboration with PL-Grid representatives:
Michał Turała, Kazimierz Wiatr, Marian Bubak, Tomasz Szepieniec, Marcin Radecki, Alex Kusznir, Zofia Mosurska, Mariusz Sterzel, Piotr Bała, Wojciech Wiślicki, Norbert Meyer, Krzysztof Kurowski, Józef Janyszek, Bartłomiej Balcerek, Jaroslaw Rybicki, Rafał Tylman

Cracow Grid Workshop, CGW 2009, October 12-14, 2009
Outline
• National Grid Initiative in Poland in a nutshell
• Motivation
  • E-Science approach to research
  • Integration activities ongoing in the world
• Rationales and Foundations
• Status – Current Results
• PL-Grid Project – Current Results
• Summary
Demand from the Community for an e-Science Approach, in spite of complexity: Computing, Storage, Infrastructure
• E-Science: collaborative research supported by advanced distributed computations
  • Multi-disciplinary, multi-site and multi-national
  • Building with, and demanding advances in, Computing/Computer Sciences
  • Goal: to enable better research in all disciplines
• System-level Science: beyond individual phenomena, components interact and interrelate, experiments in silico
• To generate, interpret and analyse rich data resources
  • From experiments, observations and simulations
  • Quality management, preservation and reliable evidence
• To develop and explore models and simulations
  • Computation and data at all scales
  • Trustworthy, economic, timely and relevant results
• To enable dynamic distributed collaboration
  • Facilitating collaboration with information and resource sharing
  • Security, trust, reliability, accountability, manageability and agility

I. Foster, System Level Science and System Level Models, Snowmass, August 1-2, 2007
M. Atkinson, e-Science (...), Grid 2006 & 2nd Int. Conf. on e-Social Science 2006, National e-Science Centre UK
Rationales behind the PL-Grid Consortium
• The Consortium consists of five Polish High Performance Computing centres representing their communities, chosen due to:
  • Participation in international and national projects and collaborations
    • ~35 international FP5, FP6, FP7 projects on Grids (50% in common)
    • ~15 Polish projects (50% in common)
  • Needs of Polish scientific communities
    • ~75% of publications come from 5 communities
  • Computational resources to date
    • Top500 list
  • European/worldwide integration activities
    • EGEE I-III, EGI_DS, EGI, e-IRG, PRACE, DEISA, OMII, EU Unit F3 "Research Infrastructure" experts
• National network infrastructure ready (thanks to the PIONIER national project; GEANT2 connectivity)
Motivation
• E-Science approach to research
• EGI initiative ongoing, in collaboration with NGIs

Milestones
• Creation of the Polish Grid (PL-Grid) Consortium: http://plgrid.pl
  • Consortium Agreement signed in January 2007
  • Consortium made up of the five largest Polish supercomputing and networking centres (founders), with ACK CYFRONET AGH (Cracow) as Coordinator
• PL-Grid Project (2009-2012)
  • Application submitted to the Operational Programme Innovative Economy, Activity 2.3 (September 2008)
  • Funding granted on March 2, 2009 (via European Structural Funds)

PL-Grid Foundations – Summary
• Polish Infrastructure for Supporting Computational Science in the European Research Space
• A response to the needs of Polish scientists and to ongoing Grid activities in Poland, other European countries and all over the world
PL-Grid Base Points
[Architecture figure: advanced service platforms (applications and Domain Grids) built on top of the common Grid infrastructure (Grid services), which runs on PL-Grid clusters, high performance computers and data repositories, all interconnected by the national computer network PIONIER.]
• Assumptions
  • Polish Grid is developing a common base infrastructure, similar to solutions adopted in other countries.
  • Specialized, domain Grid systems – including services and tools focused on specific types of applications – will be built upon this infrastructure.
  • These domain Grid systems can be further developed and maintained in the framework of separate projects.
  • Such an approach should enable efficient use of the available financial resources.
• Creation of a Grid infrastructure fully compatible and interoperable with European and worldwide Grids.
• Plans for HPC and Scalability Computing enabled.
• Tightly-coupled activities.
Elements and Functionality
[Architecture figure: users access the infrastructure through a Grid application programming interface, Grid portals and development tools; virtual organization and security systems sit above the Grid services layer (UNICORE for DEISA, LCG/gLite for EGEE, basic Grid services, other Grid systems), which in turn uses the Grid resources: distributed computational resources, distributed data repositories and the national computer network.]
• PL-Grid software will comprise:
  • user tools (portals, systems for application management and monitoring, result visualization and other purposes, compatible with the lower-layer software used in PL-Grid);
  • software libraries;
  • virtual organization systems: certificates, accounting, security, dynamic …;
  • data management systems: metadata catalogues, replica management, file transfer (a toy replica-catalogue sketch follows this slide);
  • resource management systems: job management, monitoring of applications, grid services and infrastructure, license management, local resource management.
• Three Grid structures will be maintained:
  • production,
  • research,
  • development/testing.
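One of the data-management elements listed above, replica management, maps a logical file name onto its physical copies stored at different sites. The toy Python sketch below illustrates that mapping only; it is not the actual catalogue used in EGEE/PL-Grid, and every name in it is invented.

```python
from collections import defaultdict

class ReplicaCatalogue:
    """Toy logical-file-name -> physical-replica catalogue (illustration only)."""

    def __init__(self):
        self._replicas = defaultdict(list)   # LFN -> list of storage URLs

    def register(self, lfn, surl):
        """Record that a physical copy of `lfn` exists at storage URL `surl`."""
        if surl not in self._replicas[lfn]:
            self._replicas[lfn].append(surl)

    def lookup(self, lfn):
        """Return all known physical replicas of a logical file name."""
        return list(self._replicas.get(lfn, []))

# Usage sketch with made-up site and file names:
cat = ReplicaCatalogue()
cat.register("lfn:/grid/plgrid/user/data.root", "srm://se.site-a.example/data.root")
cat.register("lfn:/grid/plgrid/user/data.root", "srm://se.site-b.example/data.root")
print(cat.lookup("lfn:/grid/plgrid/user/data.root"))
```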
Operational Center: Tasks
• Coordination of operations
• Management and accounting
• EGI and DEISA collaboration (Scalability and HPC Computing)
• Analysis of users' requirements for operational issues
• Running the infrastructure for:
  • Production
  • Developers
  • Research
• Future considerations:
  • Computational Cloud
  • Data Cloud
  • Internal and external Clouds
  • Virtualization aspects
(EGI Group; EGI production middleware; EGI testing middleware)
Operational Center: Current Results
• PL-Grid VO (vo.plgrid.pl) fully operational
• Virtual production resources for PL-Grid users in 4 centres (Gdańsk, Kraków, Poznań, Wrocław)
• Infrastructure monitoring (failure detection) tools based on Nagios & EGEE SAM in place (a probe sketch follows this slide)
• Support teams, tools and EGEE/EGI-compliant operations procedures established, aimed at achieving high availability of resources
• Knowledge sharing tool for operations problems
• Problem tracking tool to be released in November
• Provisional user registration procedure
• User registration portal in preparation (1st prototype to be released end of October)
• Accounting integrated with EGEE
(PCSS, WCSS, Cyfronet)
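The failure-detection bullet above mentions Nagios-based probes. As a rough illustration of what such a probe does, here is a minimal Nagios-style check in Python; the host name, port and thresholds are placeholders, and this is not an actual PL-Grid or SAM probe. Nagios only requires that a plugin print a one-line status and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).

```python
#!/usr/bin/env python
"""Minimal Nagios-style probe sketch: checks that a grid service port answers."""
import socket
import sys
import time

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3   # standard Nagios plugin exit codes

def check_endpoint(host, port, warn_s=2.0, crit_s=5.0, timeout=10.0):
    """Open a TCP connection and classify the result by connection time."""
    start = time.time()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.time() - start
    except OSError as exc:
        return CRITICAL, "CRITICAL - cannot connect to %s:%d (%s)" % (host, port, exc)
    if elapsed > crit_s:
        return CRITICAL, "CRITICAL - connect took %.2fs" % elapsed
    if elapsed > warn_s:
        return WARNING, "WARNING - connect took %.2fs" % elapsed
    return OK, "OK - %s:%d answered in %.2fs" % (host, port, elapsed)

if __name__ == "__main__":
    # Placeholder host; 2119 is the classic Globus gatekeeper port.
    host = sys.argv[1] if len(sys.argv) > 1 else "ce.example.plgrid.pl"
    status, message = check_endpoint(host, 2119)
    print(message)
    sys.exit(status)
```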
Operational Center: Next Plans
• User registration
  • Access for any person registered as a researcher in Poland – www.opi.org.pl
  • Registration portal already under development (to be released end of October 2009)
  • Access to resources based on grants, giving rights to use X CPU-hours and TB of disk space (a toy accounting sketch follows this slide)
  • Access services – the way of accessing resources in a particular computing centre
• PL-Grid Research Infrastructure – "resources on demand"
  • Allows requesting machines for different purposes:
    • installing a grid middleware for testing purposes
    • building a testing environment for developers
    • any other research requiring grid resources
  • Procedures are being defined
• Enable access to UNICORE resources
  • A small number of dedicated UNICORE machines
  • Integrate UNICORE and gLite – make use of the same resources
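To make the grant idea above concrete, below is a small, entirely hypothetical Python sketch of a CPU-hour budget check; the `Grant` class, its fields and the user name are invented for illustration and do not come from the PL-Grid registration portal.

```python
from dataclasses import dataclass

@dataclass
class Grant:
    """Hypothetical resource grant limiting a user's CPU-hour and storage budget."""
    user: str
    cpu_hours_limit: float
    disk_tb_limit: float
    cpu_hours_used: float = 0.0
    disk_tb_used: float = 0.0

    def can_run(self, requested_cpu_hours: float) -> bool:
        """True if the requested job still fits within the CPU-hour budget."""
        return self.cpu_hours_used + requested_cpu_hours <= self.cpu_hours_limit

    def charge(self, cpu_hours: float) -> None:
        """Account the consumed CPU-hours against the grant."""
        if not self.can_run(cpu_hours):
            raise ValueError("grant exhausted for user %s" % self.user)
        self.cpu_hours_used += cpu_hours

# Usage sketch: a 5000 CPU-hour grant, charged after a job finishes.
grant = Grant(user="jkowalski", cpu_hours_limit=5000, disk_tb_limit=2)
if grant.can_run(requested_cpu_hours=128):
    grant.charge(128)
```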
Software and Tools Development: In Numbers
• Research developers: 20+
• Main research teams: 8
• Supercomputing centers involved: 5
• Dedicated physical and virtual machines: 30+
• Tested infrastructures: EGEE, DEISA
• New infrastructures and extensions: QosCosGrid, GridSpace
• Applications currently under integration and testing procedures: 10+
• User tools currently under integration and testing procedures: 5+
• Active links to new user communities in Poland: www.plgrid.pl/ankieta
Software and Tools Development: Next Plans
• Closer collaboration with user and developer communities representing various scientific domains (based on feedback received from our questionnaire)
• Adaptation of virtualization techniques improving usability, fault tolerance, checkpointing and cluster management
• Web GUIs based on Liferay and the Vine Toolkit, accessing existing and extended e-Infrastructure services
• Performance tests of various large-scale cross-cluster parallel applications using co-allocation and advance reservation techniques (see the co-allocation sketch after this slide)
• Further analysis of grid management, monitoring and security solutions
• Common software repository and deployment rules
• Example services, tools and applications available as virtual boxes with different configurations
• Main checkpoints and timeline: VI 09, IX 09, IX 10, IV 11, I 12 (as of this talk, between IX 09 and IX 10)
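Co-allocation with advance reservation, mentioned in the performance-testing bullet above, boils down to finding a time window that is free on every participating cluster at once. The following toy Python sketch illustrates only that idea; real co-allocating schedulers such as QosCosGrid handle priorities, partial reservations and failures, none of which appear here, and the cluster data below is made up.

```python
def common_window(free_slots_per_cluster, duration):
    """Return the earliest (start, end) window of `duration` that is free on
    every cluster simultaneously, or None if no such window exists.

    free_slots_per_cluster: list of lists of (start, end) free intervals.
    """
    # Candidate start times: every slot start on any cluster is enough to
    # consider, since the earliest feasible window starts at one of them.
    candidates = sorted({start for slots in free_slots_per_cluster for start, _ in slots})
    for t in candidates:
        window = (t, t + duration)
        if all(any(s <= window[0] and window[1] <= e for s, e in slots)
               for slots in free_slots_per_cluster):
            return window
    return None

# Usage sketch (times in hours): two clusters, one 3-hour cross-cluster job.
cluster_a = [(0, 8), (10, 20)]
cluster_b = [(2, 8), (12, 18)]
print(common_window([cluster_a, cluster_b], duration=3))   # -> (2, 5)
```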
Users' Support and Training
• General tasks:
  • Running a Help Desk and users' training and support activities
  • Supporting users in effectively exploiting the developed infrastructure, including selection of the right applications for users' problems and seamless running of computations on the grid
  • PL-Grid is a user-oriented infrastructure; the support and training team will stay close to the users
• At the moment:
  • The Support Team is organized
  • Internal training for the support team has already begun
  • Selection of licensed applications required by the users' community, to be installed on the whole PL-Grid infrastructure
  • First training for the users delivered
• Plans:
  • Many, many trainings for beginner and advanced grid users
PL-Grid Security Group: Current Results
• Defined security guidelines for architecture planning
• Defined security policy for local site installations
• Review of incident reporting systems to choose the best solution for PL-Grid:
  • GGUS – Global Grid User Support
  • RTIR – Request Tracker for Incident Response
  • DIHS – Distributed Incident Handling System
• Architecture and prototype version of an on-line user mapping and credentials distribution system (a grid-mapfile parsing sketch follows this slide)
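The user-mapping prototype mentioned above has to translate certificate subjects (DNs) into local accounts; in classic Globus/gLite setups this is expressed in a grid-mapfile. A minimal Python sketch of reading such a file is shown below; the DN and the `.plgrid` pool-account name are made-up examples, and the slide's prototype distributes mappings on-line rather than through a static file like this.

```python
import shlex

def parse_gridmap(path):
    """Parse a Globus-style grid-mapfile into {certificate DN: local account}.

    Accounts starting with '.' denote pool accounts in the grid-mapfile
    convention; this is an illustration only, not the PL-Grid prototype.
    """
    mapping = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # Lines look like: "/C=PL/O=GRID/CN=Jan Kowalski" .plgrid
            parts = shlex.split(line)
            if len(parts) >= 2:
                dn, account = parts[0], parts[1]
                mapping[dn] = account
    return mapping
```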
PL-Grid Security Group: Next Plans
• Establish an NGI-wide CERT
• Introduce security policies at local sites
• Develop and deploy a monitoring system for local sites' conformity with the security policy (a setuid/setgid scan sketch follows this slide):
  • check that software (kernel, services) is up to date
  • check listening ports and firewall rules
  • monitor suid/sgid lists and their integrity
• Develop a distributed alert correlation system built on top of network and host sensors
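As one concrete piece of the conformity monitoring listed above, the sketch below scans a directory tree for setuid/setgid binaries and reports anything not present in a previously recorded baseline. The scan root and baseline file path are placeholder assumptions; a production scanner would also hash the binaries and cover the checks for open ports and outdated packages.

```python
import os
import stat

def find_setuid_files(root="/usr"):
    """Walk `root` and return the set of regular files with the suid/sgid bit set."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # unreadable or vanished file; skip it
            if stat.S_ISREG(mode) and mode & (stat.S_ISUID | stat.S_ISGID):
                found.add(path)
    return found

def diff_against_baseline(current, baseline_file="/var/lib/suid-baseline.txt"):
    """Return suid/sgid files that were not present in the recorded baseline."""
    with open(baseline_file) as fh:
        baseline = {line.strip() for line in fh if line.strip()}
    return sorted(current - baseline)
```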
Summary – Activities
• Short term
  • Establish the PL-Grid VO using Partners' local and EGEE resources
  • Provide resources for covering operational costs
  • Provide resources for keeping the international collaboration ongoing
• Long term – continuously
  • Software and tools implementation
  • Users' support and training
  • Provide, keep and extend the necessary infrastructure
  • Be prepared for / work on approaching paradigms and integration development:
    • HPCaaS and Scalable Computing (Capability and Capacity Computing)
    • Cloud Computing (internal/external, computing clouds, data clouds)
    • SOA paradigm, knowledge usage, ...
    • "Future Internet" as defined by the EC in its Work Programme
http://plgrid.pl
Thank you