SEIS-UK Data Management Procedures Alex Brisbourne
SEIS-UK Data Management Procedures • Overview of SEIS-UK • Data conversion and file structure • Verify data conversion and station performance • Dataless SEED production with make_dlsv • Shipping data to IRIS DMC • The final IRIS archive • Active source datasets • Real-time data and satellite modem SOH reports
Overview • SEIS-UK is the seismic node of NERC’s Geophysical Equipment Facility • 3.2 F.T.E. supporting ~12 field projects concurrently • Undertaking: instrument procurement, testing and preparation; shipping; field support; training; data management support and data archiving • Established in 2000, now with: • 29 x CMG-3TD with DCM/SAM • 15 x CMG-3T with NMX Taurus • 20 x CMG-40TD with DCM • 20 x CMG-ESPCD • 127 x CMG-6TD • 15 x ISSI SAQS HF systems • 28 x LE-3Dlite • Solaris and Linux data management servers with ~20 TByte RAID storage
Overview • The data management system at SEIS-UK is available for remote use by all users, along with initial training if required • Due to low staff levels, SEIS-UK have the ethos of using software available from other sources wherever possible • So that users can emulate the SEIS-UK system in house, only software which does not require licensing is used • It is the user’s responsibility to ensure data are made available for permanent archive • In practice, the user readies the data for archive but SEIS-UK send the data to IRIS DMC • The NERC loan agreement states that data must become open-access 3 years after the end of the experiment • SEIS-UK undertake the shipping of data to IRIS DMC once users have data in miniseed format and a final dataless SEED volume • SEIS-UK have no real-time data capability: all data are archived at the end of the experiment
Data Conversion and File Structure • Obtain unique network code from FDSN • www.fdsn.org/getcode.html • Record in proprietary format • Use instrument manufacturer’s software to convert to Steim-1 miniseed • Populate all miniseed headers at conversion • This then allows for independent verification of meta-data upon completion, as the dataless volume is produced separately • Use a simple / flat file structure for archive • > Project Directory • > Day Directories • > Component-Hour files • (4, 8, 12 or 24 hour files may actually be optimal) • Maintain GPS/SOH data separately • All station quality control is carried out by users in the field immediately after data download, e.g. GPS checks; mass positions; continuity • Use PASSCAL utilities and/or qmerge for miniseed data manipulation • Use GOAT at IRIS to verify data format conversion • Produce GOAT text file which is ftp’d to IRIS (uses seed2sync utility) • View data continuity/gaps/overlaps via web interface
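The flat Project / Day / Component-Hour layout described above can be sketched in a few lines of Python. This is only an illustration: the directory and file naming pattern (`YYYY_DDD` day directories, `STATION.CHANNEL.YYYY.DDD.HH.mseed` files) and the function name `archive_path` are assumptions, not the naming actually used at SEIS-UK.

```python
from datetime import datetime
from pathlib import PurePosixPath


def archive_path(project, station, channel, start, hours_per_file=1):
    """Build a Project/Day/Component-Hour path for one miniseed file.

    The naming pattern is hypothetical; only the three-level layout
    comes from the slide.
    """
    if hours_per_file < 1 or 24 % hours_per_file != 0:
        raise ValueError("hours_per_file must divide 24 (e.g. 1, 4, 8, 12, 24)")
    # Round the start hour down to the beginning of its file block.
    block_hour = (start.hour // hours_per_file) * hours_per_file
    day_dir = start.strftime("%Y_%j")  # year and day-of-year
    file_name = f"{station}.{channel}.{start:%Y.%j}.{block_hour:02d}.mseed"
    return PurePosixPath(project) / day_dir / file_name


if __name__ == "__main__":
    t = datetime(2006, 4, 27, 9, 46, 53)
    print(archive_path("norway06", "N6008", "BHZ", t))
    print(archive_path("norway06", "N6008", "BHZ", t, hours_per_file=4))
```

Keeping the block length as a parameter makes it easy to compare hourly files against the 4, 8, 12 or 24 hour alternatives mentioned above.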
Verify data conversion and station performance • GOAT: Gap/Overlap Analysis Tool
Dataless SEED production with make_dlsv • Previously used PDCC (previous incarnation – v.2?) • GUI based, time consuming for large numbers of instruments • SEIS-UK now use an adaptation of the make_dlsv package of Winfried Hanka (GFZ) • Unsupported package in C/Fortran • Text file / command line based system to produce dataless SEED volumes from 3 user-supplied text files • The absence of a GUI allows rapid SEED builds and error trapping • Relies on a suite of text files representing instrument calibration data, PAZs, FIRs, repair dates etc. • Use C-Shell scripts as wrappers to produce input files for make_dlsv • Currently running on Solaris only at SEIS-UK • The hard part is setting it up initially • The easy part comes when the user creates their dataless volume • Adding new instruments is straightforward • Calibration data etc. must be kept up-to-date
Instrumentation Database • As well as the software: • 12 text files are maintained which must be updated when there are any changes in instrumentation supported, e.g., new digitisers or sensors. • Changes in calibration data are made as soon as any changes occur (*) • Tables of FIRs and the sequence w.r.t. sample rates required (**) • Files shown: comments_temp.cfg, fir_temp_header.cfg, analogresp_temp.cfg, digit_resp_lookup.txt, template_header.cfg, channel_coeffs.txt, sensor-gains.txt (*), digitresp_temp.cfg, formats_temp.cfg, units_temp.cfg, channel_id.txt, dmc.cfg, instruments_temp.cfg, FIRs/SPS Lookups (**)
Example Sensor Response File
analogresp_temp.cfg
::::::::::::::
# Response Dictionaries
#
# Analog stages (seismometer & analog filters): Poles & zeros representation (PAZ)
# Parameters: 1 - response lookup key, 2 - response name, 3 - no of zeros,
# 4 - no of poles, 5 - norm. factor, 6 - norm. freq., 7 - stage gain,
# 8 - gain freq., 9 - input units key, 10 - output units key,
# 11 - zeros, 12 - poles
#
# CMG-3T PAZ response
Resp_paz> 1 CMG-3T-PAZ 2 5 571507691.8 1.00 3000.0 1.00 2 3 ->
-> (0.0,0.0) (0.0,0.0) ->
-> (-1005.31,0.0) (-502.6548,0.0) (-1130.973,0.0) ->
-> (-0.037008,0.037008) (-0.037008,-0.037008)
# CMG-40T PAZ response
Resp_paz> 2 CMG-40T-PAZ 2 5 571507691.8 1.0 1600.0 1.0 2 3 ->
-> (0.0,0.0) (0.0,0.0) ->
-> (-1005.31,0.0) (-502.6548,0.0) (-1130.973,0.0) ->
-> (-0.148597,0.148597) (-0.148597,-0.148597)
# CMG-6T PAZ response
Resp_paz> 3 CMG-6T-PAZ 3 6 491139422.6 1.0 1100.0 1.0 2 3 ->
-> (-31.6,0.0) (0.0,0.0) (0.0,0.0) ->
-> (-0.148597,0.148597) (-0.148597,-0.148597) ->
-> (-2469.3609,0.0) (-47.06357,0.0) ->
-> (-336.7655,-136.656) (-336.7655,136.656)
# CMG-EDU-V PAZ response
Resp_paz> 4 CMG-EDUV-PAZ 2 6 9129959284.0 1.0 1100.0 1.0 2 3 ->
-> (0.0,0.0) (0.0,0.0) ->
-> (-0.148597,0.148597) (-0.148597,-0.148597) ->
-> (-391.9552,850.693) (-391.9552,-850.693) ->
-> (-471.239,0.0) (-2199.1149,0.0)
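A quick sanity check on a PAZ entry is to recompute the normalisation factor from the listed poles and zeros and compare it with the value stored in the file. The Python sketch below does this for the CMG-3T entry. It assumes the poles and zeros are Laplace-domain values in rad/s, which the file itself does not state, and the function name `paz_normalisation` is made up for the illustration.

```python
import math


def paz_normalisation(zeros, poles, freq_hz):
    """Return A0 such that |A0 * H(s)| = 1 at freq_hz.

    H(s) = prod(s - z) / prod(s - p), evaluated at s = i*2*pi*f.
    Assumes poles and zeros are given in rad/s (an assumption here).
    """
    s = 1j * 2.0 * math.pi * freq_hz
    numerator = 1.0 + 0.0j
    for z in zeros:
        numerator *= s - z
    denominator = 1.0 + 0.0j
    for p in poles:
        denominator *= s - p
    return abs(denominator / numerator)


# Values copied from the CMG-3T entry (2 zeros, 5 poles, norm. freq. 1 Hz).
CMG3T_ZEROS = [0j, 0j]
CMG3T_POLES = [
    complex(-1005.31, 0.0),
    complex(-502.6548, 0.0),
    complex(-1130.973, 0.0),
    complex(-0.037008, 0.037008),
    complex(-0.037008, -0.037008),
]
CMG3T_A0_IN_FILE = 571507691.8

if __name__ == "__main__":
    a0 = paz_normalisation(CMG3T_ZEROS, CMG3T_POLES, 1.0)
    rel_diff = abs(a0 - CMG3T_A0_IN_FILE) / CMG3T_A0_IN_FILE
    print(f"computed A0 = {a0:.1f}, relative difference = {rel_diff:.2e}")
```

A small relative difference is expected, since the stored factor appears to be rounded to the product of the three high-frequency poles. The same routine can also confirm that the number of zeros and poles on the continuation lines matches the counts in fields 3 and 4.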
Example Data Format Description File
formats_temp.cfg
::::::::::::::
# copy_seed Configuration File
#
# Abbreviation Dictionaries
#
# Format_abbr: Abbreviation dictionary for formats of data records
# Parameter: 1 - format identifier code, 2 - format name, 3 - family type,
# 4 - ddl keys
# Here: DDL description of Steim-1 data compression scheme
# (see SEED Manual Vers.2.3, pp 151)
#
Format_abbr> 1 Steim-1_Integer_Compression_Format 50 ->
-> F1_P4_W4_D_C2_R1_P8_W4_D_C2 ->
-> P0_W4_N15_S2,0,1 ->
-> T0_X_W4 ->
-> T1_Y4_W7_D_C2 ->
-> T2_Y2_W2_D_C2 ->
-> T3_N0_W4_D_C2
#
Format_abbr> 2 Steim-2_Integer_Compression_Format 50 ->
-> F1_P4_W4_D_C2_R1_P8_W4_D_C2 ->
-> P0_W4_N15_S2,0,1 ->
-> T0_X_W4 ->
-> T1_Y4_W1_D_C2 ->
-> T2_W4_I_D2 ->
-> K0_X_D30 ->
-> K1_N0_D30_C2 ->
-> K2_Y2_D15_C2 ->
-> K3_Y3_D10_C2 ->
-> T3_W4_I_D2 ->
-> K0_Y5_D6_C2 ->
-> K1_Y6_D5_C2 ->
-> K2_X_D2_Y7_D4_C2 ->
-> K3_X_D30
Example Digitiser FIR sequence File
DMTAURUS_lookup.txt
:::::::::::::::::::::::::::::::::::::
DMTAURUS 10 30000:DMTAURUSXFIR101X20/1500:DMTAURUSXFIR102X15/100:DMTAURUSXFIR103X5/20:DMTAURUSXFIR104X2/10
DMTAURUS 20 30000:DMTAURUSXFIR201X15/2000:DMTAURUSXFIR202X10/200:DMTAURUSXFIR203X5/40:DMTAURUSXFIR204X2/20
DMTAURUS 40 30000:DMTAURUSXFIR401X15/2000:DMTAURUSXFIR402X5/400:DMTAURUSXFIR403X5/80:DMTAURUSXFIR404X2/40
DMTAURUS 50 30000:DMTAURUSXFIR501X20/1500:DMTAURUSXFIR502X15/100:DMTAURUSXFIR503X2/50
DMTAURUS 80 30000:DMTAURUSXFIR801X15/2000:DMTAURUSXFIR802X5/400:DMTAURUSXFIR803X5/80
DMTAURUS 100 30000:DMTAURUSXFIR1001X15/2000:DMTAURUSXFIR1002X10/200:DMTAURUSXFIR1003X2/100
DMTAURUS 120 30000:DMTAURUSXFIR1201X5/6000:DMTAURUSXFIR1202X5/1200:DMTAURUSXFIR1203X5/240:DMTAURUSXFIR1204X2/120
DMTAURUS 200 30000:DMTAURUSXFIR2001X15/2000:DMTAURUSXFIR2002X5/400:DMTAURUSXFIR2003X2/200
DMTAURUS 250 30000:DMTAURUSXFIR2501X15/2000:DMTAURUSXFIR2502X4/500:DMTAURUSXFIR2503X2/250
DMTAURUS 500 30000:DMTAURUSXFIR5001X10/3000:DMTAURUSXFIR5002X3/1000:DMTAURUSXFIR5003X2/500
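Each lookup entry appears to describe a decimation chain: an input rate, then a series of FIR stages written as `NAME` + `X` + decimation factor + `/` + output rate. That reading is inferred from the numbers rather than from any make_dlsv documentation, so the Python sketch below should be taken as one possible interpretation. It parses an entry and checks that every stage divides the rate consistently and that the chain ends at the advertised sample rate. The function name `parse_fir_chain` is hypothetical.

```python
def parse_fir_chain(line):
    """Parse 'DIGITISER SPS RATE:STAGE/RATE:...' and validate the chain.

    Returns (digitiser, sps, [(fir_name, decimation, output_rate), ...]).
    The field interpretation is an assumption inferred from the example file.
    """
    digitiser, sps_text, chain = line.split()
    sps = int(sps_text)
    parts = chain.split(":")
    rate = int(parts[0])  # input rate to the first FIR stage
    stages = []
    for part in parts[1:]:
        stage_text, out_text = part.split("/")
        # The decimation factor follows the last 'X' in the stage label.
        fir_name, decim_text = stage_text.rsplit("X", 1)
        decimation, out_rate = int(decim_text), int(out_text)
        if rate != decimation * out_rate:
            raise ValueError(f"{fir_name}: {rate} / {decimation} != {out_rate}")
        stages.append((fir_name, decimation, out_rate))
        rate = out_rate
    if rate != sps:
        raise ValueError(f"chain ends at {rate} sps, expected {sps}")
    return digitiser, sps, stages


if __name__ == "__main__":
    entry = ("DMTAURUS 10 30000:DMTAURUSXFIR101X20/1500:DMTAURUSXFIR102X15/100:"
             "DMTAURUSXFIR103X5/20:DMTAURUSXFIR104X2/10")
    digitiser, sps, stages = parse_fir_chain(entry)
    for name, decimation, out_rate in stages:
        print(f"{name}: decimate by {decimation} -> {out_rate} sps")
```

Running a check like this over the whole lookup table would catch a mistyped decimation factor before it reaches the dataless volume.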
What the user sees … • Three user-supplied text files • Network file • XX project full_project_name • Station file – one line per sensor/site/time-window: • stat_id inst_type sens_serial dig_type dig_serial lat lon elev start_time end_time sps1 sps2 sps_code site_name • Gains file (where required) • dig_serial Vpp software_gain start_date end_date • Run a script to produce input file to make_dlsv from the 3 input text files • Run make_dlsv to produce the dataless volume • Verify dataless volume • The user is now ready to extract fully-populated event files from their data set
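The station file is a whitespace-separated table with the fourteen columns listed above, which makes it straightforward to read and sanity-check before running the wrapper script. The Python sketch below is an illustration only: the example line, the time format and all values in it are invented, and the real wrapper is a C-Shell script whose checks are not described here.

```python
from dataclasses import dataclass

FIELD_COUNT = 14  # number of columns in the station file, per the slide


@dataclass
class StationRecord:
    stat_id: str
    inst_type: str
    sens_serial: str
    dig_type: str
    dig_serial: str
    lat: float
    lon: float
    elev: float
    start_time: str  # kept as text: the time format is not given in the source
    end_time: str
    sps1: str
    sps2: str
    sps_code: str
    site_name: str


def parse_station_line(line):
    """Split one station-file line into a StationRecord, with basic range checks."""
    fields = line.split(None, FIELD_COUNT - 1)  # site_name may contain spaces
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    lat, lon, elev = (float(x) for x in fields[5:8])
    if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
        raise ValueError(f"{fields[0]}: latitude/longitude out of range")
    return StationRecord(*fields[:5], lat, lon, elev, *fields[8:13], fields[13].strip())


# Hypothetical example line; every value here is invented for illustration.
EXAMPLE_LINE = ("STA01 CMG-3T T0000 DMTAURUS 0000 60.0 10.0 100.0 "
                "2006,117,00:00:00 2007,246,00:00:00 50 1 BH Example site north")

if __name__ == "__main__":
    print(parse_station_line(EXAMPLE_LINE))
```

Catching a missing column or an out-of-range coordinate at this stage is cheaper than finding it later in the dataless volume.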
Verifying data before archive • Prior to data being sent to IRIS DMC for permanent archive the facility verify: • Validity of the dataless volume with verseed • All miniseed headers are complete • Check for typos etc in text files / dataless • All miniseed files are correctly described by the dataless: • No huddle test data • No inconsistencies in the dataless regarding site swaps etc • Deployment/decommissioning times correct • Script based utility using verseed to compare miniseed and dataless volume
Shipping data to IRIS DMC • Once all data from a project have been retrieved from the field and verified: • Send dataless volume to IRIS via ftp • Compile and then ftp station-day volumes to IRIS in chronological order, one day at a time • Send cksum data for each file sent • Again, an automated script to send the entire project dataset • Upon completion of the ftp, create seed2sync file listing all data available at SEIS-UK (done separately to ftp to ensure redundancy)
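The checksum sent with each file lets the receiving end confirm that the transfer was complete. Assuming "cksum" refers to the POSIX `cksum` utility (the slide does not say which tool is used), the same value can be reproduced in pure Python as sketched below. The function names are made up for this example.

```python
CRC_POLY = 0x04C11DB7  # generator polynomial used by POSIX cksum


def _build_table():
    """Precompute the 256-entry table for the MSB-first CRC-32."""
    table = []
    for index in range(256):
        crc = index << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = (crc << 1) ^ CRC_POLY
            else:
                crc <<= 1
            crc &= 0xFFFFFFFF
        table.append(crc)
    return table


_TABLE = _build_table()


def crc_update(crc, data):
    """Feed bytes into the running CRC (no final complement)."""
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ byte]
    return crc


def posix_cksum(data):
    """Return (checksum, length) as POSIX cksum would report for these bytes."""
    crc = crc_update(0, data)
    length = len(data)
    # POSIX appends the length, least significant octet first.
    while length:
        crc = crc_update(crc, bytes([length & 0xFF]))
        length >>= 8
    return crc ^ 0xFFFFFFFF, len(data)


if __name__ == "__main__":
    checksum, size = posix_cksum(b"")
    print(checksum, size)
```

For large station-day volumes, `crc_update` can be called repeatedly on chunks read from the file, with the length and complement applied once at the end.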
seed2sync
DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
YF|N6008||BHE|2006,117,09:46:53|2006,118,00:00:00||50.00|2559350||||||2007,246||
YF|N6008||BHN|2006,117,09:46:56|2006,118,00:00:00||50.00|2559200||||||2007,246||
YF|N6008||BHZ|2006,117,09:46:53|2006,118,00:00:00||50.00|2559350||||||2007,246||
DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
YF|N6006||BHE|2006,117,16:11:42|2006,118,00:00:00||50.00|1404900||||||2007,246||
YF|N6006||BHN|2006,117,16:11:47|2006,118,00:00:00||50.00|1404650||||||2007,246||
YF|N6006||BHZ|2006,117,16:11:47|2006,118,00:00:00||50.00|1404650||||||2007,246||
DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
YF|N6007||BHE|2006,117,15:34:43|2006,118,00:00:00||50.00|1515850||||||2007,246||
YF|N6007||BHN|2006,117,15:34:49|2006,118,00:00:00||50.00|1515550||||||2007,246||
YF|N6007||BHZ|2006,117,15:34:45|2006,118,00:00:00||50.00|1515750||||||2007,246||
DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
YF|N6004||BHE|2006,118,14:27:40|2006,118,14:31:43||50.00|12150||||||2007,246||
YF|N6004||BHE|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
YF|N6004||BHN|2006,118,14:27:40|2006,118,14:31:43||50.00|12150||||||2007,246||
YF|N6004||BHN|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
YF|N6004||BHZ|2006,118,14:27:40|2006,118,14:31:45||50.00|12250||||||2007,246||
YF|N6004||BHZ|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
YF|N6008||BHE|2006,118,00:00:00|2006,119,00:00:00||50.00|4320000||||||2007,246||
YF|N6008||BHN|2006,118,00:00:00|2006,119,00:00:00||50.00|4320000||||||2007,246||
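Because each data line in the listing is a pipe-delimited record with network, station, location, channel, start time and end time in the first six fields, gaps can be found locally before anything is sent. The Python sketch below groups segments by channel and reports any positive gap between consecutive segments. It is an illustration written against the sample lines above, not a replacement for GOAT, and the function names are made up.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%Y,%j,%H:%M:%S"  # year, day-of-year, time, as in the listing


def find_gaps(sync_lines):
    """Return [(channel_key, gap_start, gap_end, seconds), ...] for each gap."""
    segments = defaultdict(list)
    for line in sync_lines:
        fields = line.strip().split("|")
        if len(fields) < 9 or not fields[4]:
            continue  # header lines have only two fields
        key = tuple(fields[0:4])  # (network, station, location, channel)
        start = datetime.strptime(fields[4], TIME_FORMAT)
        end = datetime.strptime(fields[5], TIME_FORMAT)
        segments[key].append((start, end))

    gaps = []
    for key, spans in segments.items():
        spans.sort()
        for (_, previous_end), (next_start, _) in zip(spans, spans[1:]):
            seconds = (next_start - previous_end).total_seconds()
            if seconds > 0:
                gaps.append((key, previous_end, next_start, seconds))
    return gaps


if __name__ == "__main__":
    sample = [
        "DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246",
        "YF|N6004||BHE|2006,118,14:27:40|2006,118,14:31:43||50.00|12150||||||2007,246||",
        "YF|N6004||BHE|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||",
    ]
    for key, gap_start, gap_end, seconds in find_gaps(sample):
        print(".".join(key), gap_start, gap_end, f"{seconds:.0f} s")
```

In the N6004 sample, the two BHE segments are separated by a break of about 44 minutes on day 118, which is the kind of gap this check would flag.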
The final IRIS archive • IRIS then verify that all data listed in the seed2sync file reside at IRIS DMC • Any missing data are then sent manually to complete the archive • Once all data are at IRIS they are migrated to the permanent repository • Upon completion, the project PI (Principal Investigator) is the owner of the data at IRIS and determines data release dates and accessibility • 3 year release date is part of the NERC loan agreement • IRIS are flexible regarding data release times • The archive at SEIS-UK is non-permanent
Active source datasets • The end result of the experiment is SEGY gathers • Standard station QC utilities are used in the field • Once all raw data have been collected: • Convert to SEGY and reformat filenames and headers so the data look like Reftek Texan output (day directories; single-component files; specific filename structure; headers populated) • Then use the PASSCAL utilities segygather/txn2segy to produce receiver or shot gathers as required • Can be scripted to automate production of all gathers for the experiment • Use the package plotsec for verification of raw gathers • Users then ftp gathers to their own system for analysis • A network code is obtained from IRIS DMC, and assembled datasets are ftp’d to IRIS by SEIS-UK for permanent archive • Controlled source data sets are complex. In reality most of the gather production is done by SEIS-UK once the meta-data have been compiled
Real-time data • SEIS-UK do not use, nor do we envisage using, real-time data for mobile stations in the near future • Site hardware is expensive • Base station hardware is expensive • Airtime is expensive • Power consumption is high • Site deployment is much more complex, requiring highly trained personnel at each installation • For 1-2 year deployments it is difficult to justify the extra cost
Satellite Modem SOH data only • Guralp DCMs allow Iridium satellite modem communication for system status verification • Two automatic 90 s calls per week made from central base-station • Remote DCM modem on for 2 x 8 hour periods per week (~50 mA when idle, or <1 Ah/week) • Base station collects SOH summary report from remote DCM • Call costs £0.50 per minute or ~£0.80 each • Hardware costs of ~£1,000 per unit • Manual calls can also be made to the remote DCM allowing reconfiguration of system and system updates • Transmission is at 2400 bits per second • The modems can operate at temperatures ranging from -30°C to +60°C
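The power and cost figures above follow from simple arithmetic, sketched here in Python. The inputs are the values quoted on the slide. The per-call figure computed pro rata comes out slightly below the quoted ~£0.80, and the difference may reflect billing increments or connection time, which is a guess rather than something stated in the source.

```python
IDLE_CURRENT_A = 0.050       # ~50 mA when idle
ON_PERIODS_PER_WEEK = 2      # modem switched on twice a week
HOURS_PER_PERIOD = 8         # for 8 hours each time
CALLS_PER_WEEK = 2           # automatic calls from the base station
CALL_DURATION_S = 90         # 90 s per call
COST_PER_MINUTE_GBP = 0.50   # airtime tariff


def weekly_idle_charge_ah():
    """Charge drawn per week by the idle modem, in amp-hours."""
    return IDLE_CURRENT_A * ON_PERIODS_PER_WEEK * HOURS_PER_PERIOD


def cost_per_call_gbp():
    """Pro-rata airtime cost of one automatic call, in pounds."""
    return COST_PER_MINUTE_GBP * CALL_DURATION_S / 60.0


def annual_airtime_cost_gbp(weeks=52):
    """Pro-rata airtime cost per station over a year of automatic calls."""
    return cost_per_call_gbp() * CALLS_PER_WEEK * weeks


if __name__ == "__main__":
    print(f"idle charge: {weekly_idle_charge_ah():.2f} Ah/week")
    print(f"cost per call: GBP {cost_per_call_gbp():.2f}")
    print(f"annual airtime: GBP {annual_airtime_cost_gbp():.2f}")
```

On these numbers the idle draw stays under the quoted 1 Ah/week, and a year of automatic calls costs far less than the ~£1,000 hardware outlay per unit.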
Why the MiChroSat 2400? • The MiChroSat 2400 transmits using the Iridium Satellite Network – this is a Low Earth Orbit network of 66 satellites that gives total global coverage. • Initial set-up is low-cost in comparison to the large dishes required for High Earth Orbit constellations such as VSAT. • The power consumption is low due to the minimal hardware required – this is essential for remote solar-powered sites. • Bandwidth is smaller than that for VSAT (which is generally used for data transmission) but for state-of-health communications it is ideal. • The system is low-cost to run since air-time is on a pre-paid pay-as-you-go basis and by running a modem at SEIS-UK transmissions will be Iridium to Iridium thus avoiding the large fees for calling to other networks.
Satellite Modem Hardware • HuBLE-UK site in Hudson Bay, Canada, summer 2007