European Laboratory for Particle Physics
LHC Distributed Data Management
Eva Arderiu Ribera
CERN, Geneva 23, Switzerland
IT Division, ASD/RD45
Eva.Arderiu@cern.ch
Introduction
• RD45
• LHC distributed scenario
• Motivation and implications of data replication
• Two choices for LHC data distribution
  • based on one federated database
  • based on independent federated databases
• Basic differences between the two models
• Availability/performance parameters
  • number of replicas
  • nature of transaction
  • frequency of synchronization
  • bandwidth required on the links
• Calibration DB distribution and event tag data distribution
• Summary
RD45 project
• Proposed in late 1994 and approved in 1995
• Searching for solutions to the problem of making LHC physics data persistent (objects)
  • also calibration data, histograms, etc.: any persistent object
• LHC experiments will store huge amounts of data
  • 1 PB of data per experiment per year
  • 100 PB over the whole lifetime
• Distributed, heterogeneous environment
  • some 100 institutes distributed world-wide
  • (nearly) any available hardware platform
  • data at regional centers
• Existing solutions do not scale
• Solution suggested by RD45: an ODBMS coupled to a mass storage system, Objectivity/DB and HPSS
Motivation and Implications of Data Replication
• Motivation
  • improve performance by having data locally (a query may be resolved locally)
  • improve availability by having data at several sites (a query may access data at another site)
  • no bottlenecks for data access
• Implications
  • KEEP CONSISTENCY: replicas must be kept consistent
  • PROVIDE CONTINUOUS AVAILABILITY: offer a distributed, fault-tolerant environment
Two Choices for LHC Data Distribution
The Objectivity/DB client/server architecture offers enough flexibility to adopt different system configurations depending on distribution requirements and constraints.
TWO CHOICES:
I. One federated database with autonomous partitions
II. Independent federated databases
Fault Tolerant System with one FDB
[Diagram: a single federated database spread over three autonomous partitions connected via the WAN; each partition runs its own lock server and AMS servers on a local LAN, and one partition includes an AMS/HPSS server.]
Fault Tolerant System with independent FDBs
[Diagram: the same topology as above, but with three independent federated databases (FDB 1, FDB 2, FDB 3) instead of partitions of a single federation; each FDB has its own lock server and AMS servers on a LAN, connected over the WAN, and one of them includes an AMS/HPSS server.]
Relevant Availability and Performance Parameters
• Number of replicas
• Nature of transaction
• Frequency of synchronization
• Link bandwidth
Number of replicas
• H/W testbed: 91 interlinked workstations running AIX 4.1, composed of several types of IBM RS/6000 nodes; the nodes are connected via an internal high-speed network, but the tests use the Ethernet interface.
• S/W testbed: a replica generator and an update program (see the sketch below).
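The update program is only named on the slide; as a rough illustration, the sketch below shows what a minimal update driver could look like. All function names and parameters here are hypothetical; the actual RD45 test programs used the Objectivity/DB C++ API.

```python
# Hypothetical sketch of an update driver in the spirit of the S/W
# testbed above; the real RD45 programs used the Objectivity/DB C++
# API, so every name and parameter here is illustrative only.
import random
import time

def send_update(host, payload):
    """Placeholder for the real transport (an AMS server call in Objectivity/DB)."""
    pass

def run_update_driver(replica_hosts, payload_bytes=1024, period_s=300, n_updates=10):
    """Issue periodic dummy updates and record how long each round takes."""
    timings = []
    for _ in range(n_updates):
        payload = bytes(random.getrandbits(8) for _ in range(payload_bytes))
        start = time.time()
        for host in replica_hosts:       # the replication protocol fans the
            send_update(host, payload)   # update out to every replica host
        timings.append(time.time() - start)
        time.sleep(period_s)             # e.g. one update every 5 minutes
    return timings
```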
Nature of transaction
One FDB with partitions
• Read
  • use of the Objectivity/DB cache
  • LOAD ONCE VIA NETWORK, READ MANY TIMES FROM LOCAL CACHE MEMORY
  • some small protocol overhead to contact remote lock servers
• Update
  • with R replicas of DB1, the size of an update transaction on DB1 grows by a factor of R
  • replica synchronization increases the transaction time, which in turn increases locking time and waiting time
Independent FDBs
• Read
  • the same as with one FDB, but no contact with remote lock servers
• Update
  • the update transaction size does not depend on the number of replicas
(A rough cost model is sketched below.)
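To make the difference concrete, here is a back-of-the-envelope model of the two update cases; the base and per-replica times are made-up numbers, not measurements from the testbed.

```python
# Back-of-the-envelope model of the two update cases above; the base
# and per-replica times are illustrative assumptions, not measurements.
def update_time_one_fdb(base_s, per_replica_s, n_replicas):
    # One FDB with partitions: the transaction grows with the number of
    # replicas R, lengthening locking and waiting time as well.
    return base_s + per_replica_s * n_replicas

def update_time_independent_fdbs(base_s, per_replica_s, n_replicas):
    # Independent FDBs: the local transaction does not depend on how
    # many copies exist in other federations.
    return base_s

for r in (1, 3, 10):
    print(r,
          update_time_one_fdb(0.05, 0.02, r),
          update_time_independent_fdbs(0.05, 0.02, r))
```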
Frequency of replica synchronization
One FDB with partitions
• Immediate synchronization
  a) creation of the replica
    • via network: oonewdbimage [-remoteHost …]
    • via tape:
      1. oonewdbimage [-localHost]
      2. oochangedb -catalogonly [new location]
      3. send the replica via tape to the new location
  b) synchronization of replicas when updating them
    • objects are automatically synchronized by the replication protocol: no replica inconsistencies!
• Any-time synchronization
  • the same as with independent FDBs
(A scripting sketch of the replica-creation steps follows.)
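For illustration, the replica-creation steps above could be scripted as follows. The tool names and options (oonewdbimage -remoteHost/-localHost, oochangedb -catalogonly) are taken from this slide; the argument order and the wrapper itself are assumptions to be checked against the Objectivity/DB documentation.

```python
# Scripting sketch of the replica-creation steps above; only the tool
# names and options come from the slide, the argument order and the
# wrapper itself are assumptions.
import subprocess

def create_replica_via_network(db_name, remote_host):
    # a) creation of the replica over the network
    subprocess.run(["oonewdbimage", "-remoteHost", remote_host, db_name],
                   check=True)

def create_replica_via_tape(db_name, new_location):
    # a) creation of the replica via tape
    subprocess.run(["oonewdbimage", "-localHost", db_name], check=True)   # step 1
    subprocess.run(["oochangedb", "-catalogonly", new_location, db_name],
                   check=True)                                            # step 2
    # step 3, shipping the replica to the new location by tape, happens off-line
```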
Frequency of replica synchronization
Independent FDBs
• Immediate or any-time synchronization
  a) creation of the copy
    1. oocopydb [localhost]
    2. send the file by tape or network
    3. ooattachdb [new id] [remote host] remote_boot_file
  b) synchronization of copies
    • databases are synchronized by an administrator procedure
    • inconsistencies arise if copies are not synchronized immediately!
(A companion scripting sketch follows.)
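A companion sketch for the independent-FDB copy procedure; as above, only the tool names (oocopydb, ooattachdb) are taken from the slide, and the argument order is illustrative.

```python
# Companion sketch for the independent-FDB copy procedure; only the
# tool names come from the slide, the rest is illustrative.
import subprocess

def copy_and_attach(db_name, new_id, remote_host, remote_boot_file):
    # 1- copy the database in the local federation
    subprocess.run(["oocopydb", db_name], check=True)
    # 2- send the file by tape or network (outside this script)
    # 3- attach the copy to the remote federation under a new id
    subprocess.run(["ooattachdb", new_id, remote_host, remote_boot_file],
                   check=True)
```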
Bandwidth required on the links
Replication protocol test via WAN:
• frequency of update: every 5 minutes
• data updated: 1 KB
[Diagram: three partitions connected over the WAN, each with its own lock server and AMS data server on a LAN: CERN partition 1 holding DB1 and DB2, CERN partition 2 holding a DB2 image, and a Caltech partition holding a DB1 replica. The data servers are an HP 712/60 (HP-UX 10.20), an RS/6000 POWER2 (AIX 4.1) and a Pentium Pro 200 MHz (Windows NT 4.0).]
Bandwidth required on the links
[Plot: replication-protocol throughput for the updates generated during one day]
• non-saturated hours: ~1 Mbit/s
• saturated hours: ~10 kbit/s
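A quick sanity check on these numbers, assuming a single 1 KB update gets the full measured rate and ignoring protocol overhead:

```python
# Rough arithmetic on the measured rates for the 1 KB / 5 min test;
# protocol overhead is ignored, so these are lower bounds only.
UPDATE_BITS = 1024 * 8   # 1 KB payload, as in the test above

for label, rate_bps in [("non-saturated (~1 Mbit/s)", 1e6),
                        ("saturated (~10 kbit/s)", 10e3)]:
    print(f"{label}: {UPDATE_BITS / rate_bps:.3f} s per update")
# ~0.008 s vs ~0.8 s: both well inside the 5-minute update period.
```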
Use Cases
Calibration DB distribution
• ONE FEDERATED DATABASE
  • calibration packages (e.g. the BaBar one) are based on versioning; all users can access the latest version at any time
• INDEPENDENT FDBs
  • common versioning is not possible; calibration data versions made in independent FDBs have to be merged "manually"
Event tag data distribution
• ONE FDB
  • allows links to raw data from event tag objects
• INDEPENDENT FDBs
  • no links between objects from different FDBs
Merging the Two Models: a possible solution
[Diagram: a central FDB made of autonomous partitions, surrounded by satellite FDBs.]
Merging the Two Models: a possible solution
CENTRAL FDB
• an FDB with few partitions, to reduce the number of replicas to synchronize and the transaction time
• updates from anywhere, at any time, are immediately synchronized across the rest of the federation
• partitions belong to centers with the required links and to administration centers (good QoS)
SATELLITE FDBs
• synchronization based on requirements: every hour, every month, ...
• they can be disconnected most of the time
• new databases are attached to them
• the schema must be kept synchronized with the central one
(A policy sketch follows below.)
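A hedged sketch of this central/satellite policy; all class and attribute names are made up, and the dictionary copy stands in for the oocopydb/ooattachdb-based procedure an administrator would actually run.

```python
# Hypothetical sketch of the central/satellite policy: the central FDB
# stays consistent via the replication protocol, while each satellite
# refreshes its copies only on its own schedule.
import time

class SatelliteFDB:
    def __init__(self, name, sync_period_s):
        self.name = name
        self.sync_period_s = sync_period_s
        self.last_sync = 0.0
        self.local_state = {}

    def maybe_sync(self, central_state, now=None):
        """Refresh the local copy only when this satellite's period has elapsed."""
        now = time.time() if now is None else now
        if now - self.last_sync >= self.sync_period_s:
            self.local_state = dict(central_state)   # stand-in for oocopydb/ooattachdb
            self.last_sync = now

satellites = [SatelliteFDB("regional-center-A", 3600),        # hourly sync
              SatelliteFDB("regional-center-B", 30 * 86400)]  # monthly sync
```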
Summary
• The Objectivity/DB client/server architecture offers enough flexibility to adopt different system configurations depending on the experiment's distribution requirements and constraints
• Three possible distributed architectures:
  • one FDB with partitions: partitions located in remote institutes must offer a good QoS; if updating anytime, anywhere is required, the asynchronous replication protocol should be applied to only a few partitions
  • independent FDBs: they offer completely independent administration, but schema and replica synchronization must be done by the DB administrator
  • merging the two previous solutions: offers a scalable distributed replica solution with a mixture of administration centers
• The distributed architecture of Objectivity/DB has been tested; it is not yet fully autonomous (to be solved), and more tests on use cases need to be done