AMS Data Handling and INFN
P.G. Rancoita, Perugia, 11/12/2002
AMS Ground Segment: Data flow in AMS-02
• High Rate (Scientific + Calibration): 3-4 Mbit/s
• Slow Rate (Housekeeping): 16 kbit/s
• NASA ancillary data: 1 kbit/s
• Total volume: 30-41 GB/day, 11-15 TB/year
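As a cross-check of the quoted volumes, the sketch below (not part of the original slides) converts the high-rate downlink figures into daily and yearly volumes; it assumes the slide's GB/TB are binary units (GiB/TiB), which reproduces the 30-41 GB/day and 11-15 TB/year numbers.

```python
# Back-of-the-envelope check of the downlink volumes quoted above (a sketch,
# not from the original slides; assumes binary GiB/TiB for the GB/TB figures).

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365

def daily_volume_gib(rate_mbit_s: float) -> float:
    """Sustained downlink rate in Mbit/s -> GiB per day."""
    bytes_per_day = rate_mbit_s * 1e6 / 8 * SECONDS_PER_DAY
    return bytes_per_day / 2**30

for rate in (3.0, 4.0):  # high-rate (scientific + calibration) stream
    gib_day = daily_volume_gib(rate)
    tib_year = gib_day * DAYS_PER_YEAR / 1024
    print(f"{rate:.0f} Mbit/s -> {gib_day:.0f} GiB/day, {tib_year:.1f} TiB/year")

# 3 Mbit/s -> 30 GiB/day, 10.8 TiB/year
# 4 Mbit/s -> 40 GiB/day, 14.3 TiB/year   (consistent with 30-41 GB/day, 11-15 TB/year)
```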
AMS Ground Segment: Data volume in AMS-02
• Archived data
  1. Event Summary Data: 44 TB/year
  2. Event Tag: 0.6 TB/year
  3. Total (+ raw and ancillary): 56-60 TB/year
• Data on direct access
  1. Event Summary Data: 8.3 TB/year
  2. Event Tag: 0.6 TB/year
• Total data volume (3 years): 180 TB, namely about 180 GB/day
Events and event rate
• Expected average accepted-event rate of about 200 Hz, which over 3 years gives about (1.5-2)x10^10 events
• Typical reconstructed event length of less than about 6.5-7 kB
• Total storage for ESD: about 130 TB
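The numbers on this slide are mutually consistent; the short check below (illustrative only, not from the presentation) recomputes the event count and ESD volume from the 200 Hz rate and a ~7 kB event size, assuming full live time, which puts the result at the upper end of the quoted range.

```python
# Rough cross-check of the event statistics quoted above (illustrative;
# the ~7 kB event size and full live time are assumptions).

SECONDS_PER_YEAR = 3.156e7
rate_hz = 200        # average accepted-event rate
event_kb = 7         # typical reconstructed event length (6.5-7 kB)
years = 3

n_events = rate_hz * SECONDS_PER_YEAR * years
esd_tb = n_events * event_kb * 1e3 / 1e12

print(f"events: {n_events:.2e}")              # ~1.9e+10
print(f"ESD: {esd_tb:.0f} TB total, "
      f"{esd_tb / years:.0f} TB/year")        # ~133 TB total, ~44 TB/year
```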
AMS Ground Segment: Data budget in AMS-02 (TB)

| Data / Year | 1998 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Raw | 0.20 | --- | --- | --- | --- | 0.5 | 15 | 15 | 15 | 0.5 | 46.2 |
| ESD | 0.30 | --- | --- | --- | --- | 1.5 | 44 | 44 | 44 | 1.5 | 135.3 |
| Tags | 0.05 | --- | --- | --- | --- | 0.1 | 0.6 | 0.6 | 0.6 | 0.1 | 2.0 |
| Total | 0.55 | --- | --- | --- | --- | 2.1 | 59.6 | 59.6 | 59.6 | 2.1 | 183.5 |
| MC | 0.11 | 1.7 | 8.0 | 8.0 | 8.0 | 8.0 | 44 | 44 | 44 | 44 | 210.4 |
| Grand Total | 0.66 | 1.7 | 8.0 | 8.0 | 8.0 | 10.1 | 104 | 104 | 104 | 46.1 | ~400 |
[Diagram: AMS Ground Centers data flow. The POIC @ MSFC (HOSC web server, TReK workstations, voice loop, video distribution, external communications) exchanges real-time data, commanding and monitoring with the POCC; commands are archived, and AMS science data, flight ancillary data and metadata are buffered and retransmitted to the Science Operations Center (GSE, production farm, NRT data processing, primary storage, archiving, distribution, science analysis, MC production, data server); AMS remote centers and AMS stations host analysis facilities, MC production and data mirror/archiving.]
AMS Ground Segment: AMS-02 Ground Facilities
• POIC @ Marshall Space Flight Center (MSFC)
• POCC @ JSC / MSFC / MIT / CERN
• (A)SOC @ CERN
• Remote Center - Italian Ground Segment
• Laboratories
AMS Ground Segment: Payload Operations and Integration Center (POIC)
• POIC @ Marshall SFC (Huntsville, AL)
• Receives data from the ISS
• Buffers data until retransmission to the (A)SOC
• Forwards monitoring and meta-data to the POCC
• Transmits commands from the POCC to AMS
• Runs unattended 24 h/day, 7 days/week
• Must buffer ~2 weeks of data: ~600 GB
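The ~600 GB buffer requirement follows directly from the daily volume quoted earlier; the short check below is a sketch that reuses the 30-41 GB/day figure from the data-flow slide.

```python
# POIC buffer sizing check (sketch; assumes the 30-41 GB/day downlink volume).
daily_gb = (30, 41)
buffer_days = 14  # ~2 weeks of autonomy before retransmission to the (A)SOC
print([d * buffer_days for d in daily_gb])  # [420, 574] GB -> ~600 GB with margin
```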
AMS Ground Segment: Payload Operations Control Center (POCC)
• POCC @ JSC, MSFC, MIT, CERN
• Receives data from the POIC @ MSFC
• Monitors data and runs the quality-control program
• Processes ~10% of the data in near real time
• Originates and transmits commands to AMS through the POIC
• Requires scientists on shift
AMS Ground Segment: (AMS) Science Operations Center [(A)SOC]
• Complete data repository (raw + reconstructed)
• Production of reconstructed data
• Re-processing / re-calibration of data
• Meta-data repository and command archive
• Production and management of MC events
• Monte Carlo repository
• Scientific data analysis facility
[Diagram: AMS Science Operations Center computing facilities. A production farm of dual 2 GHz Linux PCs organised in cells (each cell served by a Linux server, 2x2 GHz with SCSI RAID), connected through Gigabit (1 Gbit/s) switches to disk servers and tape servers for archiving and staging; a data server holding AMS data, NASA data and metadata; an MC/simulated-data server with dual-SMP disk servers; and the analysis facilities.]
AMS Ground Segment: AMS Italian Ground Segment (IGS)
• Gets data (raw + reconstructed + meta-data) from the (A)SOC
• Complete mirror and meta-data repository: master copy of the full data set
• Monte Carlo production (20%)
• Supports the local users' community for data analysis
AMS Ground Segment: Italian Ground Segment facilities
• Italian Ground Segment Data Storage (IGSDS): complete mirror data and meta-data repository, namely the MASTER COPY of the full AMS data set
• Data Transfer Facility (DTF)
• Data Transfer Management and Survey (DTMS)
• Monte Carlo contribution: 20%
AMS Ground Segment: Data transfer to the IGS
• Involved: DTF, IGSDS, DTMS
• DTF (CERN): accesses data at the (A)SOC and transfers them to the IGSDS
• IGSDS (site TBD): receives and stores the data
• DTMS (Milano): watches over the data transfer
• Network required: 32 Mbit/s
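One plausible reading of the 32 Mbit/s requirement (my interpretation, not stated on the slide) is that the link must sustain the full ~180 GB/day volume with roughly a factor-two margin for backlog recovery:

```python
# Sustained rate needed to move ~180 GB/day (figure taken from the data-volume
# slide); 32 Mbit/s then leaves roughly a factor-two margin for catch-up.
daily_bytes = 180e9
sustained_mbit_s = daily_bytes * 8 / 86_400 / 1e6
print(f"{sustained_mbit_s:.1f} Mbit/s sustained")  # ~16.7 Mbit/s
```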
Data transfer
The new release of the Data Transfer system has been running for 20 weeks. The only stops were due to power outages at CERN.
Data transfer
• "Production rate" = 2.2 Mbit/s
• Sustainable production rate = 8 Mbit/s (80% of the available bandwidth)
• This is achieved thanks to a forking mechanism and bbftp's efficient bandwidth usage (see the sketch after this slide)
• Milano and CERN Data Transfer DBs consistency = 100%
• Data that have to be retransmitted = 0.2%
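The forking mechanism mentioned above can be pictured as a driver that launches several bbftp clients in parallel so that one slow file does not idle the link. The sketch below is purely illustrative: the host name, account and file layout are placeholders, and the real AMS transfer code is not reproduced here.

```python
# Illustrative parallel-transfer driver (hypothetical; not the AMS code).
# Several bbftp clients are forked so the link stays busy even if one file stalls.

import subprocess
from concurrent.futures import ProcessPoolExecutor

BBFTP_SERVER = "dtf.example.org"   # placeholder host
BBFTP_USER = "amsdt"               # placeholder account

def transfer(path: str) -> int:
    """Push one file with bbftp and return its exit code for bookkeeping."""
    cmd = ["bbftp", "-u", BBFTP_USER, "-e", f"put {path} {path}", BBFTP_SERVER]
    return subprocess.run(cmd, check=False).returncode

def transfer_all(paths: list[str], workers: int = 4) -> dict[str, int]:
    """Run up to `workers` transfers at a time; non-zero codes are retried later."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(transfer, paths)))
```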
Data transfer: present work
• Test bbftp's variable TCP parameters (done)
• Release a new version of "our" bbftp (minor changes to authorization and error reporting) (done)
• Test the system in a more reliable environment (no power outages...)
• Implement automatic recovery
• Set up a GUI (graphical user interface) to start/stop the system
• Complete the Web monitoring tools
AMS Italian Ground Segment: Data storage at the IGSDS
• Place: TBD
• Archived data: 180 TB (3 years)
• On-line data: ~2 TB (1-2 weeks)
Cost breakdown
• Costs for the central AMS Ground Segment (POIC + POCC + (A)SOC)
Central Production Facility
• The Central Production Facility will be dedicated to data reconstruction.
• The CPF will be physically hosted at CERN and is part of the (A)SOC.
• The CPF requirements are split into storage and CPU (and DB servers).
AMS Data Handling HW and costs: Central Production Facility
In terms of computing power, the equivalent of the following will be needed:
• 50 dual 1.5 GHz boxes, 1 GB RAM
• Processing storage: 10 TB
Central Production Facility
At current prices and with the current understanding of cost evolution, the foreseen cost of the facility over 2004-2006 is:
• CPF: 350 kUS$
• DB servers: 50 kUS$
• Event storage: 200 kUS$
POCC, Marshall (POIC), Analysis
At current prices and with the current understanding of cost evolution, the foreseen cost is:
• Marshall: 55 kUS$
• POCC (x2): 150 kUS$
• Analysis: 55 kUS$
Additional expenses
• 2000-2001 expenses for prototypes and initial set-up: 150 kUS$
• Running costs & upgrades 2007-2008: 150 kUS$
Total (personnel excluded): 1160 kUS$
About 20% of this amount, plus VAT, is expected to come from INFN: 277 k€
Personnel estimates for AMS Data Handling
• The expenditure for personnel (beyond the physicists) dedicated to data handling for the period 2003-2008 is being formalised.
• The personnel consists of system administrators and SW and HW engineers. The estimates in person-years are:
• POCC: about 8.5
• (A)SOC: about 15.3
• Users' support group: about 15.6 (including personnel dedicated to specific items such as storage)
• Total: about 39.4 person-years
• Assuming a cost of 50 k€ per person-year, this gives about 1970 k€, of which 20% (about 390 k€) should be an INFN contribution
Cost breakdown
• Costs for the Italian Ground Segment, relating to the DTF, DTMS and IGSDS
DTF
The DATA TRANSFER system will have its own INFN front-end at CERN, with a dedicated system that "fetches" the data and transfers them to Italy, to the MASTER COPY repository. The system is based on (see the bookkeeping sketch below):
• Client/server architecture (SSL)
• bbftp
• MySQL
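A minimal sketch of the transfer bookkeeping that such a client/server + MySQL design implies is shown below. It uses sqlite3 only to stay self-contained; the table layout and column names are illustrative, not the actual AMS schema.

```python
# Minimal transfer-catalogue sketch (illustrative; the real system uses MySQL,
# sqlite3 is used here only to keep the example self-contained).

import sqlite3

def open_catalogue(path: str = "transfers.db") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS transfers (
                      filename TEXT PRIMARY KEY,
                      size     INTEGER,
                      checksum TEXT,
                      status   TEXT CHECK (status IN ('queued', 'done', 'retry'))
                  )""")
    return db

def mark_done(db: sqlite3.Connection, filename: str, size: int, checksum: str) -> None:
    """Record a completed transfer; CERN/Milano consistency is then a simple
    comparison of (filename, checksum) pairs between the two catalogues."""
    db.execute("INSERT OR REPLACE INTO transfers VALUES (?, ?, ?, 'done')",
               (filename, size, checksum))
    db.commit()
```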
DTF (cont.)
This system will require:
• 1 server, AMD 1.5 GHz
• 1.5 TB on RAID disks (SCSI)
• 32 Mb/s CERN → IGS
• Cost, including maintenance and server replacement: about 50 k€ + VAT over the period 2004-2008
Bandwidth requirements: (4 R + 8 NT) + (2 R + 4 NT) rt + 2 (SR + CAS) = 20 Mb/s
DTMS
High-performance server, with fast CPU and high I/O throughput.
• I/O buffer: capacity equivalent to 7 days of data taking, to recover from any connectivity failure: 1.5 TB
• Network: high-speed network connection to the CPF, consistent with a flux of 3 days' worth of data: 32 Mb/s
• Each facility (DTF and DTMS) costs about 27 k€ + VAT up to 2008
DATA STORAGE: Italian MASTER COPY
• 2 high-performance servers, with fast CPU and high I/O throughput
• I/O buffer: capacity equivalent to about 3 days of data taking, to recover from any connectivity failure (0.5 TB)
• On-line storage: RAID system (1 TB)
• Off-line storage: tapes or similar (e.g. LTO), 180 TB
• Off-line robotics staging area: depending on the robot solution adopted, it varies between a few percent and 10% of the stored data (10 TB)
• Network: high-speed network connection to the CPF, consistent with a flux of 3 days' worth of data (32 Mb/s)
• Cost (2002 prices, based on LTO): 355 k€ + VAT
Cost summary of the INFN contribution to the central Ground Segment (CERN) and to the IGS parts related to the Data Transfer and Master Copy, for the period 2003-2008
• HW for the AMS central ground segment: 277 k€
• Personnel for (A)SOC, POCC, etc.: 394 k€
• Total cost: 671 k€ (VAT included)
• HW (IGSDS) for 200 TB storage: 428 k€
• HW for DTF and DTMS: 63 k€
• Total cost: 491 k€
• Grand total (2003-2008): 1162 k€
• No cost for the IGSDS facility (infrastructure and personnel) is included