Near-real-time Backup of Large Seismic Waveform Datasets with the Storage Resource Broker. Kent Lindquist, Jennifer Eakins, Frank L. Vernon, Arcot Rajasekar. February 2, 2006.
The EarthScope Project
• Structure and Evolution of the North American Continent
• NSF, IRIS
• Three components:
  • San Andreas Fault Observatory at Depth (SAFOD)
  • Plate Boundary Observatory (PBO; Geodetics)
  • USArray (Seismic)
The EarthScope Project Image Credit: EarthScope.org
Permanent Seismic Stations Image Credit: EarthScope.org
Transportable Seismic Stations Image Credit: EarthScope.org
USArray Instrumentation GPS Socorro, NM Photo Credit: IRIS PASSCAL Instrument Center
Transportable Array Seismometer Photo Credit: EarthScope.org
EarthScope/USArray Transportable Array
• Fixed-design transportable array, “bigfoot”
• 400 broadband seismometers
• ~70 km spacing
• ~1500 x 1500 km grid
• 50 magnetotelluric field systems
• ~2 year deployments at each site
• Rolling deployment over more than 10 years
The EarthScope Transportable Array Image Credit: EarthScope.org
USArray Data Flow
• 0.6 Terabytes of data, Apr. 2004 – Oct. 2005
• > 265,000 individual files
• > 1.3 million seismic waveform segments
• As of October 2005:
  • 1.2 GB/day ingestion rate
  • 102 seismic stations
  • 612 seismic channels
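The figures on this slide can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original talk. It assumes decimal terabytes and assumes the period runs from April 1, 2004 to October 31, 2005 (the slide gives only months). The file and segment counts are lower bounds, so the derived averages are approximate.

```python
from datetime import date

# Figures quoted on the slide.
total_bytes = 0.6e12        # 0.6 TB, assuming decimal terabytes
n_files = 265_000           # lower bound ("> 265,000")
n_segments = 1_300_000      # lower bound ("> 1.3 million")
n_stations = 102
n_channels = 612

# Assumed period endpoints: the slide names months only, not days.
start, end = date(2004, 4, 1), date(2005, 10, 31)
n_days = (end - start).days

channels_per_station = n_channels / n_stations
mean_file_mb = total_bytes / n_files / 1e6          # approximate upper bound
segments_per_file = n_segments / n_files
mean_rate_gb_per_day = total_bytes / n_days / 1e9   # average over the whole period

print(f"channels per station: {channels_per_station:.1f}")
print(f"mean file size:       {mean_file_mb:.2f} MB")
print(f"segments per file:    {segments_per_file:.1f}")
print(f"mean ingestion rate:  {mean_rate_gb_per_day:.2f} GB/day over {n_days} days")
```

Under these assumptions the period average comes out a little below the 1.2 GB/day quoted for October 2005, which would be consistent with the number of stations growing over the period.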
The Array Network Facility
• Acquisition of data from
  • Transportable Array
  • Flexible Array
• Hosted at Institute for Geophysics and Planetary Physics (IGPP), UCSD
• Maintenance of station metadata
• Quality control of incoming seismic data
• Control of the running stations
Waveform Archival and Review
Banda Sea, Indonesia; Magnitude 7.5; Jan 27, 2006
USArray ANF Processing
Sun 1/29/06, 4:09 am PST; 5.0 ML; 18 km deep
USArray SRB Usage
• Datascope used as primary real-time buffer database and initial processing database
• Proximal goal:
  • Immediate backup and archiving of incoming data, protection against loss at operations facility
• Additional benefits:
  • Resource virtualization, distributed access
  • May be distinct from long-term organizational archiving in the community (IRIS, etc.)
  • Opportunity for cross-resource interconnections
USArray data-backup context
[Diagram labels: ANF Ops (Sun Solaris); Rtbackup_srb; Mercali SRB (Linux); SDSC MCAT, SRB, Storage Resources]
Datascope ‘wfsrb’ database table SRB location fields
Rtbackup_srb
• S-command transfer of raw waveform files to SRB
• Antelope RDBMS tracking of archive files
• Periodic, version-timestamped snapshot of parametric tables (origin, arrival, etc.), i.e. vital-statistic metadata about earthquakes
• Triggered replication of archive data
• Perl
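The transfer-and-track steps listed above can be sketched roughly as follows. This is a hypothetical illustration in Python, not the authors' Perl implementation. `ArchiveRecord`, `plan_backup`, the file paths, and the collection name are all invented for the example, and the real `wfsrb` schema is not given in the slides. The command form `Sput <local file> <target collection>` is an assumption about the S-command invocation. The sketch only builds the commands and tracking rows; it does not run anything.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical tracking row; the real wfsrb fields are not shown in the slides."""
    local_path: str
    srb_collection: str
    srb_object: str


def plan_backup(local_paths, srb_collection, already_archived):
    """Return (commands, records) for files not yet tracked as archived."""
    commands, records = [], []
    for path in local_paths:
        if path in already_archived:
            continue  # already backed up, nothing to do
        name = PurePosixPath(path).name
        # Assumed invocation: Sput <local file> <target collection>.
        commands.append(["Sput", path, srb_collection])
        records.append(ArchiveRecord(path, srb_collection, name))
    return commands, records


# Hypothetical paths and collection name, for illustration only.
commands, records = plan_backup(
    ["/data/2005/300/A04A.msd", "/data/2005/300/B05A.msd"],
    "/home/example/usarray/2005/300",
    already_archived={"/data/2005/300/A04A.msd"},
)
for cmd in commands:
    print(" ".join(cmd))
```

Separating planning from execution keeps the sketch testable without an SRB server. In a real run each command list could be handed to `subprocess.run`, with the tracking row written only after the transfer succeeds.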
Get_archive_srb
• S-command script retrieval of archived data
• Extraction based on subsetting commands
• Recreates seismic “wfdisc” waveform database and external BLOB files of raw waveforms
• Perl
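Retrieval by subsetting, as described above, might look something like the sketch below. It is a hypothetical Python illustration, not the authors' Perl script. The field names `sta`, `chan`, `time`, and `endtime` follow the usual wfdisc convention, while `srb_path`, `ArchivedSegment`, `select_segments`, `plan_retrieval`, and the sample rows are invented. The command form `Sget <SRB object> <local target>` is an assumption about the S-command invocation. Nothing is executed; the sketch only selects rows and builds the commands.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchivedSegment:
    """Hypothetical row describing one archived waveform segment."""
    sta: str
    chan: str
    time: float      # segment start, epoch seconds
    endtime: float   # segment end, epoch seconds
    srb_path: str    # hypothetical SRB location field


def select_segments(rows, sta=None, chan=None, t0=None, t1=None):
    """Keep rows matching station/channel that overlap the window [t0, t1)."""
    selected = []
    for row in rows:
        if sta is not None and row.sta != sta:
            continue
        if chan is not None and row.chan != chan:
            continue
        if t0 is not None and row.endtime <= t0:
            continue  # segment ends before the window starts
        if t1 is not None and row.time >= t1:
            continue  # segment starts after the window ends
        selected.append(row)
    return selected


def plan_retrieval(rows, local_dir):
    """Build one assumed 'Sget <SRB object> <local target>' command per row."""
    return [["Sget", row.srb_path, local_dir] for row in rows]


# Invented sample rows, for illustration only.
rows = [
    ArchivedSegment("A04A", "BHZ", 0.0, 100.0, "/srb/a04a_bhz_000"),
    ArchivedSegment("A04A", "BHZ", 100.0, 200.0, "/srb/a04a_bhz_100"),
    ArchivedSegment("B05A", "BHZ", 100.0, 200.0, "/srb/b05a_bhz_100"),
    ArchivedSegment("A04A", "BHN", 150.0, 250.0, "/srb/a04a_bhn_150"),
]
subset = select_segments(rows, sta="A04A", chan="BHZ", t0=100.0, t1=200.0)
get_commands = plan_retrieval(subset, "./restore")
```

After the files are fetched, the selected rows carry enough information to rebuild a local waveform table pointing at the restored files, which is the role the slide assigns to the recreated “wfdisc” database.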
Conclusions
• 0.6 Terabytes of data, Apr. 2004 – Oct. 2005
• > 265,000 individual files
• > 1.3 million seismic waveform segments
• As of October 2005:
  • 1.2 GB/day ingestion rate
  • 102 seismic stations
  • 612 seismic channels
• Backed up in SRB
• SRB has contributed to at least one large operational-robustness event