TeraGrid Archival Migration of Data to the XD Era Phil Andrews et al. Users' view: when they give us their data, they expect it to be available even when the original recipient is not. The Moving Finger writes; and having writ, / Moves on; nor all your Piety nor Wit / Shall lure it back to cancel half a Line, / Nor all your Tears wash out a Word of it. – Omar Khayyam
Significant Archival Data (~20 PB) is at TG RP sites unfunded in XD • What to do about data at current TeraGrid RP sites that do not yet have funds for the XD era? • Do we have a communal obligation to continue data availability past the funding of the centers that accepted it? The NSF thinks so! • It's Later Than You Think! – A Tale of Two Cities, Charles Dickens TeraGrid Quarterly, Dec'08
Task Force to consider the issue • Members from most (maybe all) sites • Send me email (andrewspl@utk.edu) if you want to participate • NSF wants to see a plan, and is encouraging the idea of a general replication approach at remote sites • If we replicate data at currently unfunded sites, then we are covered whatever happens • Awkward funding implications TeraGrid Quarterly, Oct'08
More than one approach possible • In the past, we moved one archive (CTC) physically and another (PSC) across the network. Both moves were successful, but we have never replicated an entire archive. • Network moves require several months; physical moves are very risky. • Data is offline or frozen during a move. A merry road, a mazy road, and such as we did tread / The night we went to Birmingham by way of Beachy Head! – G. K. Chesterton TeraGrid Quarterly, Dec'08
How much data are we talking about? • Approximately 10 PB total at each of SDSC and NCSA • Other sites also have significant data • At 10 Gb/s, moving 10 PB takes roughly 100 days • Only TACC, NICS, and PSC are continually funded into the XD era at the moment • NCSA -> Track 1 funding TeraGrid Quarterly, Oct'08
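The 10 Gb/s figure above is a back-of-the-envelope line-rate estimate; a quick sketch of the arithmetic (ignoring protocol overhead, retransmits, and link contention, all of which only lengthen the transfer):

```python
# Verify the slide's estimate: a 10 Gb/s link moves ~10 PB in ~100 days.

link_gbps = 10                        # link speed, gigabits per second
data_pb = 10                          # archive size, petabytes

bytes_total = data_pb * 10**15        # PB -> bytes (decimal units)
bytes_per_sec = link_gbps * 10**9 / 8 # Gb/s -> bytes/s

days = bytes_total / bytes_per_sec / 86400
print(f"{days:.0f} days")             # ~93 days at full line rate
```

In practice, sustained wide-area throughput rarely reaches line rate, so "100 days" per 10 PB is an optimistic floor, which is why the network option below is limited to 2-3 PB of real data per site.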
Option 1: Physical move • Advantages: can wait until the last minute; possibly funding-neutral; doesn't stress the network; keeps physical resources in TG • Disadvantages: risky; data unavailable for weeks; the site could regain funding later; the new host must handle the format • The nuclear option: very awkward, mixed data tapes, lays waste to an existing archive. Forced upon us if we wait too long! Out of this nettle, danger, we pluck this flower, safety. – Shakespeare TeraGrid Quarterly, Dec'08
Option 2: Network transfer • Cannot move 20 PB in any reasonable time • Must rely on moving only 2-3 PB of real data per site • Advantages: data is checked during transfer; no danger of data loss; the site can recover • Disadvantages: ties up network and people resources; a long process; doubles archival requirements For though his body's under hatches, / His soul has gone aloft. – Charles Dibdin TeraGrid Quarterly, Oct'08
Option 3: Archival replication • Advantages: a more general approach; increases TeraGrid's value added; intellectually stimulating rather than maudlin • Disadvantages: a more involved process; could lead to drastically increased archival requirements There is a tide in the affairs of men / Which, taken at the flood, leads on to fortune – Shakespeare TeraGrid Quarterly, Oct'08
Replication approaches: • General middleware: iRODS can do replication, but it must manage the data itself; it can't import general or SRB data. SRB is slow. • Infrastructure: HPSS archives can be connected via wide-area GPFS (HPSS 6.2.2, June '08) TeraGrid Quarterly, Oct'08
What to do now? • The clock is ticking; if we are to investigate options, we must do so soon • SDSC, TACC, and NCSA are looking at iRODS • SDSC runs HPSS as one archive, and exports GPFS • Propose trying the GPFS-HPSS Integration (GHI) approach for replication between HPSS archives TeraGrid Quarterly, Oct'08
Will other file systems work? • Can we use other approaches for Lustre? • pNFS does have a proposed mechanism for replication via caching: Panache • Will global file systems and HPSS come in pairs? • Is a more general (but less efficient) middleware approach (iRODS?) preferable? Pay no attention to that man behind the curtain – L. Frank Baum TeraGrid Quarterly, Oct’08
GHI status • Some features are already released • Support for multiple HPSS archives is not there yet; due next year • Timing could be tricky, but we could start with pre-release software • Due for beta testing at NERSC and NCSA • The GPFS and HPSS teams are interested (spoke at SC) TeraGrid Quarterly, Oct'08
Discussion: • Is replication worth the effort? • Will sites be prepared for a physical move, if necessary? • If there is no physical move, how do we fund resources? • Do we let users say: "move everything"? • We need an inventory of data! • Are users rendering this discussion moot? TeraGrid Quarterly, Oct'08
Philosophy: • The current funding approach allows a continual ebb and flow of RP sites: we can handle the Computational impact, but not the Archival one! • We need an Archival organization that allows for frequent gain and loss of Data RPs • It is hard to wait for XD to solve this problem I must go in and out – Bernard Shaw TeraGrid Quarterly, Oct'08
Need to know what data is where: • We don't know which site has how much data, or on what media • Different media can have a major impact on how quickly data can be moved or replicated • We need a good story to take to the NSF with a funding request for a better Archival organization • We need a Data census! In those days a decree went out from Caesar Augustus that the whole world should be counted – Luke 2:1 TeraGrid Quarterly, Oct'08
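The data census called for above could start as nothing more than a shared table of (site, media, volume) records, tallied per site and per media type. A minimal sketch; the site names are real TeraGrid RPs but every number here is a placeholder, not a real figure:

```python
from collections import defaultdict

# Hypothetical census records: (site, media, petabytes).
# Volumes are illustrative placeholders only.
records = [
    ("SDSC", "tape", 9.0),
    ("SDSC", "disk", 1.0),
    ("NCSA", "tape", 10.0),
    ("PSC",  "tape", 2.5),
]

by_site = defaultdict(float)   # total PB held at each site
by_media = defaultdict(float)  # total PB on each media type
for site, media, pb in records:
    by_site[site] += pb
    by_media[media] += pb

print("per site: ", dict(by_site))
print("per media:", dict(by_media))
print("total PB: ", sum(by_site.values()))
```

Even a table this crude answers the two questions the slide raises: how much is at risk at each unfunded site, and how much of it sits on media (tape vs. disk) that constrains how fast it can be moved or replicated.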