SCoPE Scientific Results (I risultati scientifici del progetto SCoPE). L. Merola. Final workshop of the Grid projects of PON "Ricerca" 2000-2006, Avviso 1575. Catania, 10-12 February 2009.
Summary
• The “University of Napoli Federico II”
• The SCoPE Project and Research Areas
• The SCoPE Data Center & Metropolitan Network
• Results from Material Sciences
• Results from Life Sciences
• Results from MicroCosm and MacroCosm Sciences
• Results from Middleware applications
• Beyond SCoPE
The “University of Napoli Federico II”
• Founded in 1224
• Second largest university in Italy
• More than 3,000 professors and researchers
• More than 10,000 new students per year
• Involved in the most strategic areas of scientific research, e-Science and technology
The SCoPE Project and Research Areas
SCoPE: Sistema Cooperativo distribuito ad alte Prestazioni per Elaborazioni Scientifiche Multidisciplinari (Distributed Cooperative High Performance System for Multidisciplinary Applications)
Objectives:
• Innovative and original software for fundamental scientific research.
• High performance Data & Computing Centre for multidisciplinary applications.
• Grid infrastructure and middleware INFNGRID LCG/gLite:
• Compatibility with EGEE middleware
• Interoperability with the other three PON 1575 projects and SPACI in GRISU’
• Integration in the Italian and European Grid Infrastructure.
Research areas:
• MicroCosm and MacroCosm Sciences
• Materials and Environment Sciences
• Life Sciences
• Social Sciences
• Middleware
People and institutions:
• 19 Departments and Research Institutes
• 128 professors and senior researchers from UniNA, plus many others from Research Institutes (INFN, etc.)
• 35 young researchers (research fellowships, "assegni di ricerca")
• 28 technology specialists (contract staff, "co.co.co.")
[Organization chart] Central structure: CSI. Areas: Mathematical, Physical and Natural Sciences (MM.FF.NN., Campus Grid); Medical Sciences and Biotechnology; Humanities and Social Sciences; Engineering Sciences; external but connected organizations. Units shown: Dip. Biochimica e Biotecnologie mediche, Dip. Matematico-Statistico, Dip. di Scienze Fisiche, Dip. Informatica e Sistemistica, INFN Sezione di Napoli, Dip. Ingegneria Elettrica, Dip. di Matematica, CNR-SPACI Napoli, Dipartimento di Sociologia, Centro di Eccellenza per lo Studio delle Malattie Genetiche, Dip. Ingegneria Elettronica e delle Telecomunicazioni, Dip. di Chimica, INFM Unità di Napoli, CEINGE, Dip. Ingegneria Chimica, CRIAI, Dip. di Analisi e Progettazione Strutturale, ARPA, CINI.
Astrophysics group
• Search for gravitational waves
• Data mining and visualization of massive astronomical data sets
• http://virgo.infn.it and http://www.eurovotech.org/
Particle Physics (subnuclear physics) group
• Study of proton-proton interactions at the CERN Large Hadron Collider (LHC) and implementation of a Tier2 Data Centre for large-scale, data-intensive Monte Carlo simulations and data analysis
• http://lxatlas.na.infn.it
Bioinformatics group
• Genome sequence analysis and image analysis for cell motility and other dynamical phenomena
• http://bioinfo.ceinge.unina.it/ and http://www.ceinge.unina.it
Numerical Mathematics and Scientific Computing group
• Study and design of algorithms for distributed scientific applications and their implementation on HPC infrastructures
Statistical Mechanics group
• Applications of statistical mechanics to complex systems
• http://smcs.na.infn.it/
Electromagnetism and Telecommunication group
• Models and measurements of the electromagnetic field in the Napoli metropolitan area
• http://www.diet.unina.it/gruppoCE/GruppoEl.html
Material Science group
• Molecular dynamics and optical properties of nano-structured materials
• http://lsdm.campusgrid.unina.it/ and http://www.nanomat.unina.it
Soft Matter Engineering group
• Models and simulations of the flow of micro-structured materials
• http://wpage.unina.it/p.maffettone
The SCoPE Data Center
• 33 racks (of which 10 for the ATLAS Tier2)
• 304 servers for a total of 2,432 processors
• 130 TB of storage
• 2 remote sites:
• Fac. Medicine: 60 TB of storage
• Dep. Chemistry: 8 multi-CPU servers (4 processors each)
[Site map: entrance, Physics, Biology, SCoPE Data Center, SCoPE Control Room]
SCoPE Data Center & ATLAS Tier2. [Photo label: 1 MW electrical substation]
[Diagram: control room, Data Center, Chemistry and Medicine sites, low-latency network, 2432 cores]
The Metropolitan Network
S.Co.P.E. Campus Grid at Monte Sant’Angelo. [Diagram: optical fibre links among DSF (Dipartimento di Scienze Fisiche), the INFN section, DMA, DiChi, C.S.I. and the S.Co.P.E. centre with its control room and 1 MW electrical substation (G.E.); GARR link at 2.4 Gb/s]
Results from Material Sciences (Dip. Ingegneria Chimica – Dip. Matematica e Applicazioni)
Suspension of particles in liquids: intensive simulations in 2D & 3D FEM
Motivation
• Suspensions of particles in liquids are a class of materials relevant in a huge variety of applications, e.g.
• Polymer melts with fillers
• Biomedical materials
• Food
• Cosmetics
• Detergents
• This wide spectrum is due to differences in particle concentration, mechanical properties, shape, and size.
• Particles suspended in viscoelastic media are known to develop structures when sheared at sufficiently high shear rates.
Aim
Characterize the flow behavior of dilute and semidilute suspensions of rigid spheres in viscoelastic media in confined geometries.
• Software libraries:
• BLAS/LAPACK
• METIS
• To solve the linear systems arising from FEM, the following solvers have been used and compared:
• HSL library (commercial, direct solver, sequential)
• MKL GMRES (included in MKL, iterative solver, sequential)
• SPARSKIT (free, iterative solver, sequential)
• PARDISO (included in MKL, direct solver, OpenMP)
• MUMPS (free, direct solver, MPI)
• PETSc (free, iterative solver, MPI)
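The difference between the direct and iterative solvers listed above can be illustrated on a toy problem. The sketch below is not project code and uses none of the listed libraries: it assumes a small tridiagonal, stiffness-like system (2 on the diagonal, -1 off it) and solves it once by direct elimination (Thomas algorithm) and once by unpreconditioned conjugate gradient. All function names are hypothetical.

```python
def matvec(sub, diag, sup, x):
    """Multiply a tridiagonal matrix, stored as three length-n diagonals, by x."""
    n = len(x)
    y = []
    for i in range(n):
        value = diag[i] * x[i]
        if i > 0:
            value += sub[i] * x[i - 1]   # sub[i] multiplies x[i-1]; sub[0] is unused
        if i < n - 1:
            value += sup[i] * x[i + 1]   # sup[i] multiplies x[i+1]; sup[n-1] is unused
        y.append(value)
    return y


def solve_direct(sub, diag, sup, rhs):
    """Direct solve: forward elimination and back substitution (Thomas algorithm)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = sup[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - sub[i] * c[i - 1]
        c[i] = sup[i] / denom
        d[i] = (rhs[i] - sub[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_iterative(sub, diag, sup, rhs, tol=1e-12, max_iter=1000):
    """Iterative solve: conjugate gradient, valid for symmetric positive definite matrices."""
    x = [0.0] * len(rhs)
    r = list(rhs)                        # residual for the zero initial guess
    p = list(r)
    rs = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs ** 0.5 <= tol:             # stop once the residual norm is small enough
            break
        ap = matvec(sub, diag, sup, p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rs_new = sum(v * v for v in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


n = 10
sub, diag, sup, rhs = [-1.0] * n, [2.0] * n, [-1.0] * n, [1.0] * n
x_direct = solve_direct(sub, diag, sup, rhs)
x_iterative = solve_iterative(sub, diag, sup, rhs)
```

The direct solver does a fixed amount of work and returns the answer up to rounding; the iterative solver only needs matrix-vector products and stops at a chosen tolerance, which is generally what makes iterative methods attractive for large sparse FEM systems.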
3D two-particle flow in confined geometries. [Figure panels: Newtonian fluid; non-Newtonian fluid]
Results from Material Sciences (Dip. Scienze Fisiche)
Tight-binding software package to study the optical properties (absorbance, reflectivity, refraction index, photoluminescence) of semiconductor nanocrystals as a function of shape and size, starting from single-particle energy levels.
Intensive computing and RAM requirements (diagonalization of matrices with 10^3-10^5 rows).
Medintz et al., Quantum dot bioconjugates for imaging, labelling and sensing, Nat. Mat. 4, 435 (2005)
Problem
Tight binding = wave functions as linear combinations of atomic orbitals.
Advantages:
1) Hamiltonian matrix of small dimension: only 4 orbitals per Si atom;
2) High level of sparsity (>95%): the diagonalization time scales linearly with dimension;
3) Symmetries: the Hamiltonian matrix is decomposed into independent blocks according to the irreducible representations of the symmetry group.
[Figure panels: nanocrystal (real), from H. Hofmeister, F. Huisken and B. Kohn, Eur. Phys. J. D 9, 137 (1999); nanocrystal (model), 3 nm]
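The sparsity figure can be checked with a back-of-the-envelope count. The sketch below assumes nearest-neighbour coupling only and 4 neighbours per atom (tetrahedral coordination); neither assumption is stated on the slide, and the function name is hypothetical. Each atom then contributes one on-site block and one block per neighbour, each of size 4 x 4.

```python
def hamiltonian_sparsity(n_atoms, orbitals_per_atom=4, neighbours_per_atom=4):
    """Return (matrix dimension, estimated fraction of zero entries).

    Assumes nearest-neighbour coupling only. Surface atoms have fewer
    neighbours, so the real matrix is at least this sparse. The estimate is
    only meaningful when n_atoms is larger than 1 + neighbours_per_atom.
    """
    dimension = n_atoms * orbitals_per_atom
    # One on-site block plus one block per neighbour, each orbitals x orbitals.
    nonzero = n_atoms * (1 + neighbours_per_atom) * orbitals_per_atom ** 2
    return dimension, 1.0 - nonzero / dimension ** 2


# 250 atoms give a 1000-row matrix, the low end of the 10^3-10^5 range.
dimension, sparsity = hamiltonian_sparsity(250)
```

Under these assumptions the filled fraction is 5/N for N atoms, so the matrix is more than 95% sparse as soon as the nanocrystal has more than 100 atoms.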
Results
Absorbance of InAs colloidal nanocrystals vs. dimensions
• Line: this model (Trani et al., Phys. Rev. B 76, 085302 (2007))
• Points: experiment (Yu et al., J. Phys. Chem. B 109, 7084 (2005))
Software
• Language: Fortran 90
• Libraries: BLAS, LAPACK, math library (MKL) for matrix manipulation
• Package tested on: Linux AMD Athlon at 1.6 GHz, AMD64 at 3.2 GHz, Alpha Tru64, Intel Xeon
• Compilers: Intel, PGI, Gfortran
• HPC needed: parallelization under development
• Web interface: http://www.nanomat.unina.it, SCoPE portal
SCoPE Workshop: status of the project and of the Work Packages, Sala Azzurra, Complesso universitario Monte Sant’Angelo, 21-2-2008
Results from Material (and Life) Sciences (Dip. Chimica)
Computational modelling of molecular and supra-molecular systems
• Recent developments: theory, both classical (MM) and quantum (QM); algorithms (linear scaling methods); technology.
• Intensive simulation processes.
• Efficient description for:
• Medium-sized structures
• Periodic systems
• Problematic areas:
• Large non-periodic systems:
- Nanoparticles
- Biomacromolecules
- “Defects” of materials
- Soft matter
Application (example)
ADMP (Atom centered Density Matrix Propagation) / ONIOM (Our own N-layered Integrated molecular Orbital + Molecular Mechanics) dynamics on the gramicidin A ion channel. In ADMP the electronic degrees of freedom have a fictitious mass and are propagated from step to step.
• In classical dynamics, the positions and momenta of the nuclei are propagated from one step to the next: the force field gives the energy and the accelerations point by point.
• In ADMP dynamics (as in the related Car-Parrinello dynamics), the electronic degrees of freedom also carry a fictitious mass and are propagated from one step to the next (extended Lagrangian).
• In ADMP, the electronic degrees of freedom are encoded by a density matrix expressed in terms of atom-centred basis functions.
Results from Material & Environment Sciences (Dip. Ingegneria Elettronica e delle Telecomunicazioni)
Study of the environmental impact of the e.m. field
Real-time forecast of the e.m. field over the metropolitan area (Napoli)
• Numerical solvers for optimizing the planning of mobile phone networks.
• Interpolation of the e.m. exposure over the metropolitan area, starting from a limited number of sensors.
• Application to a project for a public-information call center.
[Figure labels: P.zza Plebiscito, Palazzo Reale; field level]
Interest of UniNA in a Virtual Organization in MATerIal Science Simulation and Engineering (MATISSE) (ref. Prof. Domenico Ninno)
Results from Life Sciences (Fac. Medicine & CEINGE - Biotecnologie Avanzate)
Large database of genomic sequences of bacteria, vertebrates, trees.
• CPU- and data-intensive applications:
• Identification and characterization of nucleotide sequences
DG-CST (Disease Gene Conserved Sequence Tags), a database of human-mouse conserved elements associated with disease genes. More than 60,000 disease-associated sequences identified.
[Figure labels: H. sapiens, M. musculus, CSTs]
• CPU- and data-intensive applications:
• Gene mining
[Diagram: DNA, mRNA, protein, structured RNA, biological function]
Gene mining: execution time on the Grid. [Plot: 1 WN vs. Grid]
Results
Automatic annotation of genomic sequences: search for functional RNA structures.
Potentially transcribed sequences have been split into overlapping fragments of 150 bp length, giving 290,904 sequences.
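The fragmentation step can be sketched as a sliding window. Only the 150 bp fragment length comes from the slide; the 75 bp step between consecutive windows, the handling of the tail, and the function name are assumptions made for illustration.

```python
def split_into_fragments(sequence, length=150, step=75):
    """Split a sequence into overlapping windows of `length` bases.

    `step` is the offset between consecutive windows (an assumed value).
    Sequences shorter than one window are returned whole.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [sequence] if sequence else []
    fragments = [sequence[start:start + length]
                 for start in range(0, len(sequence) - length + 1, step)]
    # If the regular windows stop short of the end, add one window aligned to the end.
    last_start = len(sequence) - length
    if last_start % step != 0:
        fragments.append(sequence[last_start:])
    return fragments


fragments = split_into_fragments("ACGT" * 100)   # a 400-base toy sequence
```

Each fragment is an independent unit of work, which is what makes this kind of scan straightforward to distribute over Grid worker nodes.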
IPROC architecture. [Diagram: data + images, processing steps, page (iPage) with image areas (iPane), gateway, HPC on cluster nodes]
Results from Middleware for applications (Dip. Matematica e Applicazioni) (see talk by Vania Boccia)
Problem Solving Environment MedIGrid: Grid-aware HPC for medical images
• management
• processing
• visualization
Multilevel adaptive algorithms on MP multicore architectures (poster; abstract no. 92) (Dip. Matematica e Applicazioni)
• 1st level (message passing level): message passing among the CPUs of a blade server
• 2nd level (multithreading level): multithreading among the cores of a single multicore CPU
  While (local error > local tolerance)
    refine subdomains on the cores
    rearrange subdomains among CPUs
  Endwhile
• Parallel out-of-order task scheduling without synchronization and idle time: better efficiency on the single CPUs
• Subdomain reorganization without global communications: better scalability on the blade system
[Diagram: four CPUs, each with its cores and memory]
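The refine-and-rearrange loop can be emulated sequentially to show its structure. This is only a sketch under stated assumptions, not the authors' algorithm: the "subdomains" are 1D intervals, the local error is the difference between a one-panel and a two-panel trapezoid estimate, each list stands in for one CPU, and rebalancing moves intervals between neighbouring lists only (to mimic the absence of global communication). All names are hypothetical, and no real message passing or threading is involved.

```python
def local_error(f, a, b):
    """Difference between one-panel and two-panel trapezoid estimates on [a, b]."""
    mid = 0.5 * (a + b)
    coarse = 0.5 * (b - a) * (f(a) + f(b))
    fine = 0.5 * (mid - a) * (f(a) + f(mid)) + 0.5 * (b - mid) * (f(mid) + f(b))
    return abs(fine - coarse)


def refine(f, subdomains, tol):
    """Split every subdomain whose local error exceeds the local tolerance."""
    refined = []
    for a, b in subdomains:
        if local_error(f, a, b) > tol:
            mid = 0.5 * (a + b)
            refined.extend([(a, mid), (mid, b)])
        else:
            refined.append((a, b))
    return refined


def rebalance(per_cpu):
    """One sweep of neighbour-to-neighbour transfers to even out the load."""
    for i in range(len(per_cpu) - 1):
        left, right = per_cpu[i], per_cpu[i + 1]
        while len(left) > len(right) + 1:
            right.insert(0, left.pop())
        while len(right) > len(left) + 1:
            left.append(right.pop(0))
    return per_cpu


def adapt(f, a, b, n_cpus, tol, max_sweeps=50):
    """Refine until every local error is within tol; return the subdomains per CPU."""
    width = (b - a) / n_cpus
    per_cpu = [[(a + i * width, a + (i + 1) * width)] for i in range(n_cpus)]
    for _ in range(max_sweeps):
        worst = max(local_error(f, lo, hi) for dom in per_cpu for lo, hi in dom)
        if worst <= tol:
            break
        per_cpu = [refine(f, dom, tol) for dom in per_cpu]   # work done "on the cores"
        rebalance(per_cpu)                                   # rearrangement "among CPUs"
    return per_cpu


partition = adapt(lambda x: x * x, 0.0, 1.0, n_cpus=4, tol=1e-4)
```

In this emulation the termination check scans all subdomains at once; a distributed version would replace it with per-CPU checks.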
Results from MacroCosm Sciences (Dip. Scienze Fisiche & INAF)
Virtual Observatory
• Objective: federation and interoperability of worldwide astronomical data archives according to the standards of the International Virtual Observatory Alliance (IVOA).
• Large astronomical surveys (from 100 TB to 1000 TB).
• Requirements: finding patterns, trends etc. in high-dimensional parameter spaces.
VO-Neural Project
Implementation of a web application (WA) providing data mining and visualization methodologies for complex scientific data in distributed systems. The WA is intended as a service for both the astronomical and the bioinformatics international communities.
S.Co.P.E. at Caltech
DAME: Data Mining & Exploration
• International collaboration:
• Università Federico II
• INAF - Napoli
• Caltech
• Pennsylvania State University
• Pune IUCAA - India
• Mirror sites:
• SCOPE - UNINA
• NESSSI - Caltech
• Applications:
• Astrophysics
• Biology and Bioinformatics
• Enterprises
http://nesssi.cacr.caltech.edu/dmtest/index.html
DAME offers a user-friendly environment for data mining tasks.
• DM models now available:
• MLP: Multi Layer Perceptron
• SVM: Support Vector Machines
• PPS: Probabilistic Principal Surfaces
• DM models under development:
• Bregman co-clustering
• SVM-C: SVM for clustering
• Bayesian networks
• PCA & ICA
• Access to the Grid through robot certificates (e-Token)
• Specific applications are offered to the user as web applications:
• Photometric redshifts for galaxies and quasars
• Search for quasar candidates
• Automatic classification of AGN (Active Galactic Nuclei) from multiband photometric surveys
Talk by M. Brescia. Poster by Laurino. Poster by Riccio.
Ex: Automatic classification of AGN
• Spectroscopic knowledge base (for SVM training): 30380 objects
• SVM parameter surface obtained on 110 S.Co.P.E. nodes. [Plot axes: lg2(gamma), lg2(C)]
• e = 79.69%
• e Seyfert: e_Sey = 74.76%
• e LINER: e_LIN = 81.09%
• c Seyfert: c_Sey = 52.77%
• c LINER: c_LIN = 91.69%
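A parameter surface over lg2(C) and lg2(gamma) is typically built by training one SVM per grid point, and the grid points are independent jobs. The sketch below shows only that bookkeeping. The grid ranges and the round-robin assignment are assumptions chosen for illustration; the slide states neither the ranges used nor how jobs were mapped to the 110 nodes. Function names are hypothetical and no SVM is actually trained.

```python
def parameter_grid(log2_c_values, log2_gamma_values):
    """Return all (C, gamma) pairs for the given base-2 exponents."""
    return [(2.0 ** lc, 2.0 ** lg)
            for lc in log2_c_values
            for lg in log2_gamma_values]


def assign_to_nodes(jobs, n_nodes):
    """Distribute independent jobs over nodes in round-robin order."""
    buckets = [[] for _ in range(n_nodes)]
    for index, job in enumerate(jobs):
        buckets[index % n_nodes].append(job)
    return buckets


# Assumed exponent ranges: lg2(C) from -5 to 15 and lg2(gamma) from -15 to 3, in steps of 2.
grid = parameter_grid(range(-5, 16, 2), range(-15, 4, 2))
jobs_per_node = assign_to_nodes(grid, n_nodes=110)
```

Each node would then train and validate the SVM for its own (C, gamma) pairs, and the collected scores form the surface.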
Results from Gravitational Waves research (Dip. Scienze Fisiche & INFN – Istituto Nazionale di Fisica Nucleare)
VIRGO interferometer at Cascina (PI)
Data-intensive algorithms to search for gravitational waves: more than 1 TB/day to be analyzed.
Signals from:
- periodic systems (pulsars)
- coalescing binary systems (chirp)
- impulsive systems (burst)
Very low SNR (signal-to-noise ratio): a highly demanding computational challenge.
A Grid-based Evolution of Merlino
The online analysis in Virgo has been tackled via Merlino, an SMFT-based (Static Matched Filter Technique) framework.
Merlino: Data Analysis via Matched Filters
Bosi’s Merlino computes the correlation between the data series and a number of signal templates, obtained by simulating the chirp signals emitted by pairs of coalescing stars whose masses (in solar masses) lie within a given range. Precision and efficiency are strongly influenced by the number of working points considered, i.e. the granularity of the search in the space of the star masses.
[Figure: Merlino architecture]
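The matched-filter idea described above (correlate the data with every template in a bank and keep the best match) can be sketched in a few lines. This is a toy illustration, not Merlino: the templates are simple linear chirps rather than simulated binary-coalescence waveforms, the correlation is a plain time-domain sliding inner product, and all names and parameters are hypothetical.

```python
import math


def chirp_template(n_samples, f0, f1):
    """Toy linear chirp sweeping from f0 to f1 (cycles per sample), scaled to unit norm."""
    samples = []
    for n in range(n_samples):
        # Phase of a linear frequency sweep: 2*pi*(f0*n + (f1 - f0)*n^2 / (2*N)).
        phase = 2.0 * math.pi * (f0 * n + 0.5 * (f1 - f0) * n * n / n_samples)
        samples.append(math.sin(phase))
    norm = math.sqrt(sum(s * s for s in samples))
    return [s / norm for s in samples]


def correlate(data, template):
    """Sliding inner product of the data with the template, one value per lag."""
    m = len(template)
    return [sum(data[lag + k] * template[k] for k in range(m))
            for lag in range(len(data) - m + 1)]


def best_match(data, templates):
    """Return (peak, template_index, lag) maximizing |correlation| over the whole bank."""
    best = (0.0, None, None)
    for index, template in enumerate(templates):
        for lag, value in enumerate(correlate(data, template)):
            if abs(value) > best[0]:
                best = (abs(value), index, lag)
    return best


# A bank of three templates (three "working points") and noise-free data
# containing the second template at lag 40.
bank = [chirp_template(64, 0.01, f1) for f1 in (0.05, 0.10, 0.15)]
data = [0.0] * 200
for k, sample in enumerate(bank[1]):
    data[40 + k] = sample
peak, index, lag = best_match(data, bank)
```

The cost grows with the number of templates times the number of lags, which is why the granularity of the template bank drives both precision and computing time.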
Adaptive Filters for Detection of Gravitational Waves in Virgo
Data Size and Computational Cost of the Analysis
Data from VIRGO are characterized by a very low SNR, and thus need to be accurately filtered to actually detect the presence of the signal. However, an on-line analysis requires roughly 300 Gflops to recover 90 per cent of the SNR.
Approach to the Analysis via Adaptive Filters
The aim is to implement a rough analysis with:
• small signal losses (w.r.t. the use of matched filters);
• robustness against false detections;
• low computational costs (for use in real time).
The idea is to use the adaptive IIR ALE filter (infinite impulse response adaptive line enhancer) to reconstruct at the output the “coherent” component of the input. The “reconstruction” can then be used as a (noisy) template for building a correlation detector for the analysis.
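The line-enhancer principle can be shown with a much simpler filter than the one named above. The sketch below swaps in an FIR predictor updated by the LMS rule in place of the IIR ALE, purely for brevity: the filter predicts the current sample from delayed past samples, so only the coherent (predictable) part of the input survives at the output. Tap count, delay, step size and function names are all assumptions.

```python
import math


def sinusoid(n_samples, frequency):
    """Unit-amplitude test tone; frequency in cycles per sample."""
    return [math.sin(2.0 * math.pi * frequency * n) for n in range(n_samples)]


def fir_lms_ale(x, n_taps=16, delay=1, mu=0.005):
    """FIR adaptive line enhancer with LMS updates (a simpler stand-in for an IIR ALE).

    Returns (enhanced, errors): the predicted coherent component and the
    prediction error at each sample.
    """
    weights = [0.0] * n_taps
    enhanced, errors = [], []
    for n in range(len(x)):
        # Reference vector x[n-delay], x[n-delay-1], ... with zeros before the data start.
        ref = [x[n - delay - k] if n - delay - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(w * r for w, r in zip(weights, ref))
        e = x[n] - y
        weights = [w + 2.0 * mu * e * r for w, r in zip(weights, ref)]
        enhanced.append(y)
        errors.append(e)
    return enhanced, errors


signal = sinusoid(2000, 0.05)
enhanced, errors = fir_lms_ale(signal)
head_energy = sum(e * e for e in errors[:100])    # before adaptation
tail_energy = sum(e * e for e in errors[-100:])   # after adaptation
```

With noisy input, the `enhanced` series would play the role of the (noisy) template mentioned above; the step size `mu` must be small enough for the update to remain stable.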
A Genetic Parallel Evolution of Price's Controlled Random Search Algorithm
Like most of the others, Price’s algorithm is based on a matched filter approach. However, instead of adopting a fixed grid of templates, it heuristically explores the search space via a controlled random search.
Parallel Genetic Price
Price's algorithm has been modified to improve performance and to better ensure a thorough exploration of the search space, without having to consider too many working points. To this end we parallelized the software and introduced a genetic modification of the search procedure, which introduces some randomness in the generation of new working points, thus making the software more resilient to local minima. Furthermore, more than one trial point is now generated at each step. A number of different population members are randomly chosen, and each of them is reflected through the centroid of the others, generating a new trial point. These trial points are then compared with the worst population members, and replace them if they behave better.
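The multi-trial reflection step described above can be sketched as follows. This is a sequential toy version under stated assumptions: the cost function is a simple quadratic instead of a matched-filter figure of merit, the additional genetic randomness is not modelled, trial points are not clipped to the search bounds, and the pairing of best trial against worst member is one possible reading of the replacement rule. All names are hypothetical.

```python
import random


def crs_step(population, cost, n_chosen, rng):
    """One step: reflect each chosen member through the centroid of the other chosen ones."""
    if n_chosen < 2:
        raise ValueError("n_chosen must be at least 2")
    dim = len(population[0])
    chosen = rng.sample(range(len(population)), n_chosen)
    trials = []
    for i in chosen:
        others = [population[j] for j in chosen if j != i]
        centroid = [sum(p[d] for p in others) / len(others) for d in range(dim)]
        trials.append([2.0 * centroid[d] - population[i][d] for d in range(dim)])
    # Compare the best trial with the worst member, the second best with the
    # second worst, and so on; replace a member only when the trial is better.
    trials.sort(key=cost)
    population.sort(key=cost)
    for k, trial in enumerate(trials):
        worst_index = len(population) - 1 - k
        if cost(trial) < cost(population[worst_index]):
            population[worst_index] = trial
    return population


def minimize(cost, bounds, pop_size=30, n_chosen=4, steps=300, seed=0):
    """Run the search; return the best point and the best cost after every step."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = [min(cost(p) for p in population)]
    for _ in range(steps):
        crs_step(population, cost, n_chosen, rng)
        history.append(min(cost(p) for p in population))
    return min(population, key=cost), history


def sphere(point):
    """Toy cost function with its minimum at the origin."""
    return sum(v * v for v in point)


best_point, history = minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because a member is replaced only by a better trial, the best cost in the population never gets worse from one step to the next. The trial points of one step are independent of each other, which is what makes the step easy to parallelize.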
Results from Subnuclear Physics @CERN-LHC (Dip. Scienze Fisiche & INFN) (see talk by G. Carlino)
Large Hadron Collider (proton-proton interactions)
• Centre-of-mass energy: 14 TeV
• Accelerator circumference: 27 km
Physics objectives:
• Particle physics in the TeV energy domain
• Search for the Higgs boson
• Search for physics beyond the Standard Model (supersymmetry etc.)
• Precision measurements for known (and unknown) physics
• Bunch crossing every 25 ns with 25 interactions per bunch crossing: interaction rate 10^9 Hz
• High-granularity detectors: 10^8 electronic channels, event size ~1 MB
• Input data rate: 1 PB/s, but only ~100 MB/s to tape
• Highly selective trigger system (rejection power 10^7)
• Maximum size of a single file 2 GB; 16k files/day; 10 TB/day
Napoli takes part in the ATLAS and CMS experiments. [Figure labels: CMS, LHCb, ALICE, ATLAS]
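The quoted rates follow from one another by simple arithmetic, as the sketch below shows. It takes the slide's round numbers at face value, multiplies the interaction rate by the event size exactly as the slide's figures imply, and assumes decimal units (1 MB = 10^6 bytes). The function and key names are hypothetical.

```python
def lhc_data_rates(bunch_spacing_s=25e-9, interactions_per_crossing=25,
                   event_size_bytes=1e6, rejection_power=1e7):
    """Reproduce the order-of-magnitude data rates from the slide's round numbers."""
    crossing_rate_hz = 1.0 / bunch_spacing_s                             # 4e7 crossings/s
    interaction_rate_hz = crossing_rate_hz * interactions_per_crossing   # 1e9 Hz
    input_rate_bytes_s = interaction_rate_hz * event_size_bytes          # 1e15 B/s = 1 PB/s
    recorded_rate_hz = interaction_rate_hz / rejection_power             # 100 Hz after trigger
    tape_rate_bytes_s = recorded_rate_hz * event_size_bytes              # 1e8 B/s = 100 MB/s
    per_day_bytes = tape_rate_bytes_s * 86400                            # 8.64e12 B per day
    return {
        "interaction_rate_hz": interaction_rate_hz,
        "input_rate_bytes_s": input_rate_bytes_s,
        "recorded_rate_hz": recorded_rate_hz,
        "tape_rate_bytes_s": tape_rate_bytes_s,
        "per_day_bytes": per_day_bytes,
    }


rates = lhc_data_rates()
```

A continuous 100 MB/s corresponds to about 8.6 TB per day, the same order as the ~10 TB/day quoted.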
ATLAS: a huge international effort (scientific and technological)
• 37 nations
• 167 institutions
• 2000 scientists
[Figure labels: 22 m, 46 m]
First events at the LHC (10-09-2008): events reconstructed with HPC & Grid computing. [Figure panels: ATLAS, CMS]
Dissemination
Meetings, workshops and events for dissemination of the results to the research and industry communities:
• Le idee della ricerca al lavoro (26-27/2/08)
• Networking Day (15/4/08)
• Incontro con le imprese (meeting with companies, 16/4/08)
• Italian e-Science 2008 (27-29/5/08)
• Inauguration of SCoPE (1/12/08)
Beyond SCoPE
Scientific and industrial applications:
• Aerospace, Automobile
• Telecommunications, Informatics, Electronics
• Security
• Chemistry, Pharmaceutics, Biomedicine
• Transportation and logistics
• Finance and Economy
• Services for the Public
• High bandwidth network and services
• Cloud computing
• High Performance Computing and Grid Computing
• Data Mining
• Development of algorithms and software
• Cooperation with Science and Industry
• Interoperability and integration in GRISU’, IGI, EGI
GRISU’
• Each infrastructure implements at least one site (CE+SE+WN) and replicates the collective services.
• 1 VO per project: cybersar, cresco, cometa, scope, spaci.
• The collective and core services of each infrastructure support all VOs.
[Diagram labels: SPACI, GARR, PI2S2, other bodies and organizations]
[Diagram: EGI (European GRID Infrastructure) and DEISA; IGI (Italian GRID Infrastructure), including INFN-GRID, ENEA-GRID (Portici, Brindisi, Lecce, Trisaia) and GRISU’]