
SEIS-UK Data Management Procedures Alex Brisbourne


Presentation Transcript


  1. SEIS-UK Data Management Procedures, Alex Brisbourne

  2. SEIS-UK Data Management Procedures • Overview of SEIS-UK • Data conversion and file structure • Verify data conversion and station performance • Dataless SEED production with make_dlsv • Shipping data to IRIS DMC • The final IRIS archive • Active source datasets • Real-time data and satellite modem SOH reports

  3. Overview • SEIS-UK is the seismic node of NERC’s Geophysical Equipment Facility • 3.2 F.T.E. supporting ~12 field projects concurrently • Undertaking: instrument procurement, testing and preparation; shipping; field support; training; data management support and data archiving • Established in 2000, now with: • 29 x CMG-3TD with DCM/SAM • 15 x CMG-3T with NMX Taurus • 20 x CMG-40TD with DCM • 20 x CMG-ESPCD • 127 x CMG-6TD • 15 x ISSI SAQS HF systems • 28 x LE-3Dlite • Solaris and Linux data management servers with ~20 TByte RAID storage

  4. Overview • The data management system at SEIS-UK is available for remote use by all users, along with initial training if required • Due to low staff levels, SEIS-UK have the ethos of using software available from other sources wherever possible • So that users can emulate the SEIS-UK system in house, only software which does not require licensing is used • It is the user’s responsibility to ensure data are made available for permanent archive • In practice, the user readies the data for archive but SEIS-UK send the data to IRIS DMC • The NERC loan agreement states that 3 years after the end of the experiment data must become open-access • SEIS-UK undertake the shipping of data to IRIS DMC once users have data in miniseed format and a final dataless seed volume • SEIS-UK have no real-time data capability; all data are archived at the end of the experiment

  5. Data Conversion and File Structure • Obtain unique network code from FDSN • www.fdsn.org/getcode.html • Record in proprietary format • Use instrument manufacturer’s software to convert to Steim-1 mseed • Populate all miniseed headers at conversion • This then allows for independent verification of meta-data upon completion as the dataless volume is produced separately • Use a simple / flat file structure for archive • > Project Directory • > Day Directories • > Component-Hour files • (4, 8, 12 or 24 hour files may actually be optimal) • Maintain GPS/SOH data separately • All station quality control is carried out by users in the field immediately after data download e.g. GPS checks; mass positions; continuity • Use PASSCAL utilities and/or qmerge for miniseed data manipulation • Use GOAT at IRIS to verify data format conversion • Produce GOAT text file which is ftp’d to IRIS (uses seed2sync utility) • View data continuity/gaps/overlaps via web interface

  6. Verify data conversion and station performance: GOAT - Gap/Overlap Analysis Tool

  7. Dataless SEED production with make_dlsv • Previously used PDCC (previous incarnation – v.2?) • GUI based, time consuming for large numbers of instruments • SEIS-UK now use an adaptation of the make_dlsv package of Winfried Hanka (GFZ) • Unsupported package in C/Fortran • Text file / command line based system to produce dataless seed volumes from 3 user-supplied text files • Working without a GUI allows rapid SEED builds and error trapping • Relies on a suite of text files representing instrument calibration data, PAZs, FIRs, repair dates etc. • Use C-Shell scripts as wrapper to produce input files for make_dlsv • Currently running on Solaris only at SEIS-UK • The hard part is setting it up initially • The easy part comes when the user creates their dataless volume • Adding new instruments is straightforward • Calibration data etc. must be kept up-to-date

  8. Instrumentation Database • As well as the software: • 12 text files are maintained which must be updated when there are any changes in instrumentation supported, e.g., new digitisers or sensors. • Changes in calibration data are made as soon as any changes occur (*) • Tables of FIRs and the sequence w.r.t. sample rates required (**) • Files shown on the slide: comments_temp.cfg, fir_temp_header.cfg, analogresp_temp.cfg, digit_resp_lookup.txt, template_header.cfg, channel_coeffs.txt, sensor-gains.txt (*), digitresp_temp.cfg, formats_temp.cfg, units_temp.cfg, channel_id.txt, dmc.cfg, instruments_temp.cfg, FIRs/SPS Lookups (**)

  9. Example Sensor Response File: analogresp_temp.cfg
     # Response Dictionaries
     #
     # Analog stages (seismometer & analog filters): Poles & zeros representation (PAZ)
     # Parameters: 1 - response lookup key, 2 - response name, 3 - no of zeros,
     # 4 - no of poles, 5 - norm. factor, 6 - norm. freq., 7 - stage gain,
     # 8 - gain freq., 9 - input units key, 10 - output units key,
     # 11 - zeros, 12 - poles
     #
     # CMG-3T PAZ response
     Resp_paz> 1 CMG-3T-PAZ 2 5 571507691.8 1.00 3000.0 1.00 2 3 ->
     -> (0.0,0.0) (0.0,0.0) ->
     -> (-1005.31,0.0) (-502.6548,0.0) (-1130.973,0.0) ->
     -> (-0.037008,0.037008) (-0.037008,-0.037008)
     # CMG-40T PAZ response
     Resp_paz> 2 CMG-40T-PAZ 2 5 571507691.8 1.0 1600.0 1.0 2 3 ->
     -> (0.0,0.0) (0.0,0.0) ->
     -> (-1005.31,0.0) (-502.6548,0.0) (-1130.973,0.0) ->
     -> (-0.148597,0.148597) (-0.148597,-0.148597)
     # CMG-6T PAZ response
     Resp_paz> 3 CMG-6T-PAZ 3 6 491139422.6 1.0 1100.0 1.0 2 3 ->
     -> (-31.6,0.0) (0.0,0.0) (0.0,0.0) ->
     -> (-0.148597,0.148597) (-0.148597,-0.148597) ->
     -> (-2469.3609,0.0) (-47.06357,0.0) ->
     -> (-336.7655,-136.656) (-336.7655,136.656)
     # CMG-EDU-V PAZ response
     Resp_paz> 4 CMG-EDUV-PAZ 2 6 9129959284.0 1.0 1100.0 1.0 2 3 ->
     -> (0.0,0.0) (0.0,0.0) ->
     -> (-0.148597,0.148597) (-0.148597,-0.148597) ->
     -> (-391.9552,850.693) (-391.9552,-850.693) ->
     -> (-471.239,0.0) (-2199.1149,0.0)
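The normalisation factor (parameter 5) can be cross-checked against the listed poles and zeros. The following is a minimal Python sketch, not part of make_dlsv. It assumes the poles and zeros are given in rad/s (the file itself does not state units) and that the factor is defined as 1/|H(f)| at the normalisation frequency (parameter 6). The function name `normalization_factor` is illustrative only.

```python
import math


def normalization_factor(zeros, poles, freq_hz):
    """Return A0 = 1 / |H(j*2*pi*f)| for a poles-and-zeros response.

    Assumes zeros and poles are complex values in rad/s (an assumption;
    the config file does not state the units).
    """
    s = complex(0.0, 2.0 * math.pi * freq_hz)
    numerator = complex(1.0, 0.0)
    for z in zeros:
        numerator *= s - z
    denominator = complex(1.0, 0.0)
    for p in poles:
        denominator *= s - p
    return abs(denominator / numerator)


# CMG-3T values copied from the Resp_paz entry with lookup key 1.
cmg3t_zeros = [complex(0.0, 0.0), complex(0.0, 0.0)]
cmg3t_poles = [
    complex(-1005.31, 0.0),
    complex(-502.6548, 0.0),
    complex(-1130.973, 0.0),
    complex(-0.037008, 0.037008),
    complex(-0.037008, -0.037008),
]
cmg3t_listed_a0 = 571507691.8

a0 = normalization_factor(cmg3t_zeros, cmg3t_poles, 1.0)
relative_difference = abs(a0 - cmg3t_listed_a0) / cmg3t_listed_a0
```

Under these assumptions the computed value for the CMG-3T entry agrees with the listed factor to within 0.1%, which makes this a cheap sanity check when a new sensor is added to the file.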

  10. Example Data Format Description File: formats_temp.cfg
      # copy_seed Configuration File
      #
      # Abbreviation Dictionaries
      #
      # Format_abbr: Abbreviation dictionary for formats of data records
      # Parameter: 1 - format identifier code, 2 - format name, 3 - family type,
      # 4 - ddl keys
      # Here: DDL description of Steim-1 data compression scheme
      # (see SEED Manual Vers.2.3, pp 151)
      #
      Format_abbr> 1 Steim-1_Integer_Compression_Format 50 ->
      -> F1_P4_W4_D_C2_R1_P8_W4_D_C2 ->
      -> P0_W4_N15_S2,0,1 ->
      -> T0_X_W4 ->
      -> T1_Y4_W7_D_C2 ->
      -> T2_Y2_W2_D_C2 ->
      -> T3_N0_W4_D_C2
      #
      Format_abbr> 2 Steim-2_Integer_Compression_Format 50 ->
      -> F1_P4_W4_D_C2_R1_P8_W4_D_C2 ->
      -> P0_W4_N15_S2,0,1 ->
      -> T0_X_W4 ->
      -> T1_Y4_W1_D_C2 ->
      -> T2_W4_I_D2 ->
      -> K0_X_D30 ->
      -> K1_N0_D30_C2 ->
      -> K2_Y2_D15_C2 ->
      -> K3_Y3_D10_C2 ->
      -> T3_W4_I_D2 ->
      -> K0_Y5_D6_C2 ->
      -> K1_Y6_D5_C2 ->
      -> K2_X_D2_Y7_D4_C2 ->
      -> K3_X_D30

  11. Example Digitiser FIR sequence File: DMTAURUS_lookup.txt
      DMTAURUS 10 30000:DMTAURUSXFIR101X20/1500:DMTAURUSXFIR102X15/100:DMTAURUSXFIR103X5/20:DMTAURUSXFIR104X2/10
      DMTAURUS 20 30000:DMTAURUSXFIR201X15/2000:DMTAURUSXFIR202X10/200:DMTAURUSXFIR203X5/40:DMTAURUSXFIR204X2/20
      DMTAURUS 40 30000:DMTAURUSXFIR401X15/2000:DMTAURUSXFIR402X5/400:DMTAURUSXFIR403X5/80:DMTAURUSXFIR404X2/40
      DMTAURUS 50 30000:DMTAURUSXFIR501X20/1500:DMTAURUSXFIR502X15/100:DMTAURUSXFIR503X2/50
      DMTAURUS 80 30000:DMTAURUSXFIR801X15/2000:DMTAURUSXFIR802X5/400:DMTAURUSXFIR803X5/80
      DMTAURUS 100 30000:DMTAURUSXFIR1001X15/2000:DMTAURUSXFIR1002X10/200:DMTAURUSXFIR1003X2/100
      DMTAURUS 120 30000:DMTAURUSXFIR1201X5/6000:DMTAURUSXFIR1202X5/1200:DMTAURUSXFIR1203X5/240:DMTAURUSXFIR1204X2/120
      DMTAURUS 200 30000:DMTAURUSXFIR2001X15/2000:DMTAURUSXFIR2002X5/400:DMTAURUSXFIR2003X2/200
      DMTAURUS 250 30000:DMTAURUSXFIR2501X15/2000:DMTAURUSXFIR2502X4/500:DMTAURUSXFIR2503X2/250
      DMTAURUS 500 30000:DMTAURUSXFIR5001X10/3000:DMTAURUSXFIR5002X3/1000:DMTAURUSXFIR5003X2/500
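Each lookup entry appears to describe a decimation chain. Read as `digitiser output_sps input_rate:FIR_name/next_rate:FIR_name/.../final_rate`, with the number after the last `X` in each FIR name taken as that stage's decimation factor, every entry on this slide is arithmetically self-consistent (for example 30000/20 = 1500, 1500/15 = 100, 100/5 = 20, 20/2 = 10). That reading is an inference from the slide, not a documented format. Under that assumption, a minimal Python sketch of a consistency check could look like this; the function names are hypothetical.

```python
def parse_fir_chain(line):
    """Split one lookup line into (digitiser, output_sps, stages, final_rate).

    Each stage is (input_rate, fir_name, decimation). The decimation is
    assumed to be the integer after the last 'X' in the FIR name.
    """
    digitiser, sps, chain = line.split()
    segments = chain.split("/")
    stages = []
    for segment in segments[:-1]:
        rate_in, fir_name = segment.split(":")
        decimation = int(fir_name.rsplit("X", 1)[1])
        stages.append((int(rate_in), fir_name, decimation))
    final_rate = int(segments[-1])
    return digitiser, int(sps), stages, final_rate


def chain_is_consistent(line):
    """True if every stage decimates exactly to the next listed rate."""
    _, sps, stages, final_rate = parse_fir_chain(line)
    next_rates = [stage[0] for stage in stages[1:]] + [final_rate]
    for (rate_in, _, decimation), rate_out in zip(stages, next_rates):
        if rate_in % decimation != 0 or rate_in // decimation != rate_out:
            return False
    return final_rate == sps


example = (
    "DMTAURUS 10 30000:DMTAURUSXFIR101X20/1500:DMTAURUSXFIR102X15"
    "/100:DMTAURUSXFIR103X5/20:DMTAURUSXFIR104X2/10"
)
ok = chain_is_consistent(example)
```

A check of this kind could be run over the whole lookup table whenever a new digitiser or sample rate is added.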

  12. What the user sees … • Three user-supplied text files • Network file • XX project full_project_name • Station file – one line per sensor/site/time-window: • stat_id inst_type sens_serial dig_type dig_serial lat lon elev start_time end_time sps1 sps2 sps_code site_name • Gains file (where required) • dig_serial Vpp software_gain start_date end_date • Run a script to produce input file to make_dlsv from the 3 input text files • Run make_dlsv to produce the dataless volume • Verify dataless volume • The user is now ready to extract fully-populated event files from their data set
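The station file layout listed above maps naturally onto a small record type. The sketch below is a hypothetical Python reader, not the SEIS-UK C-Shell wrapper. It assumes the fields are whitespace-separated in the order shown on the slide and that only the last field (`site_name`) may contain spaces. Times are kept as plain strings because the slide does not give their format, and the sample line is invented purely for illustration.

```python
from collections import namedtuple

# Field names taken from the slide, in the order listed there.
StationEntry = namedtuple(
    "StationEntry",
    "stat_id inst_type sens_serial dig_type dig_serial lat lon elev "
    "start_time end_time sps1 sps2 sps_code site_name",
)


def parse_station_line(line):
    """Parse one station-file line into a StationEntry.

    Assumes whitespace-separated fields; everything after the 13th
    field is treated as the site name.
    """
    parts = line.split(None, len(StationEntry._fields) - 1)
    if len(parts) != len(StationEntry._fields):
        raise ValueError("expected 14 fields, got %d" % len(parts))
    entry = StationEntry(*parts)
    # Convert only the coordinates; other fields stay as strings
    # because their exact formats are not given on the slide.
    return entry._replace(
        lat=float(entry.lat), lon=float(entry.lon), elev=float(entry.elev)
    )


# Hypothetical line: every value here is made up for the example.
sample = "STA01 CMG-6TD S0001 DCM D0001 60.0 10.0 250 t_start t_end 50 1 BH Example Site"
entry = parse_station_line(sample)
```

Rejecting lines with the wrong field count early is one way to catch typos in the text files before the dataless volume is built.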

  13. Verifying data before archive • Prior to data being sent to IRIS DMC for permanent archive the facility verify: • Validity of the dataless volume with verseed • All miniseed headers are complete • Check for typos etc in text files / dataless • All miniseed files are correctly described by the dataless: • No huddle test data • No inconsistencies in the dataless regarding site swaps etc • Deployment/decommissioning times correct • Script based utility using verseed to compare miniseed and dataless volume

  14. Shipping data to IRIS DMC • Once all data from a project have been retrieved from the field and verified: • Send dataless volume to IRIS via ftp • Compile and then ftp station-day volumes to IRIS in chronological order, one day at a time • Send cksum data for each file sent • Again, an automated script to send the entire project dataset • Upon completion of the ftp, create seed2sync file listing all data available at SEIS-UK (done separately to ftp to ensure redundancy)

  15. seed2sync
      DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
      YF|N6008||BHE|2006,117,09:46:53|2006,118,00:00:00||50.00|2559350||||||2007,246||
      YF|N6008||BHN|2006,117,09:46:56|2006,118,00:00:00||50.00|2559200||||||2007,246||
      YF|N6008||BHZ|2006,117,09:46:53|2006,118,00:00:00||50.00|2559350||||||2007,246||
      DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
      YF|N6006||BHE|2006,117,16:11:42|2006,118,00:00:00||50.00|1404900||||||2007,246||
      YF|N6006||BHN|2006,117,16:11:47|2006,118,00:00:00||50.00|1404650||||||2007,246||
      YF|N6006||BHZ|2006,117,16:11:47|2006,118,00:00:00||50.00|1404650||||||2007,246||
      DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
      YF|N6007||BHE|2006,117,15:34:43|2006,118,00:00:00||50.00|1515850||||||2007,246||
      YF|N6007||BHN|2006,117,15:34:49|2006,118,00:00:00||50.00|1515550||||||2007,246||
      YF|N6007||BHZ|2006,117,15:34:45|2006,118,00:00:00||50.00|1515750||||||2007,246||
      DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
      YF|N6004||BHE|2006,118,14:27:40|2006,118,14:31:43||50.00|12150||||||2007,246||
      YF|N6004||BHE|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
      YF|N6004||BHN|2006,118,14:27:40|2006,118,14:31:43||50.00|12150||||||2007,246||
      YF|N6004||BHN|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
      YF|N6004||BHZ|2006,118,14:27:40|2006,118,14:31:45||50.00|12250||||||2007,246||
      YF|N6004||BHZ|2006,118,15:15:46|2006,119,00:00:00||50.00|1572700||||||2007,246||
      DMC/seed2sync -f /data1/IRIS_SYNC/norway06//seed_2_sync.YF.2006.tmp|2007,246
      YF|N6008||BHE|2006,118,00:00:00|2006,119,00:00:00||50.00|4320000||||||2007,246||
      YF|N6008||BHN|2006,118,00:00:00|2006,119,00:00:00||50.00|4320000||||||2007,246||
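Each data line in this listing can be checked for internal consistency: the sample count should equal the time span multiplied by the sample rate. The Python sketch below assumes the pipe-separated fields are, in order, network, station, location, channel, start time, end time, maximum clock drift, sample rate and number of samples, with times written as year, day-of-year and hh:mm:ss. That field order is read off the listing and should be treated as an assumption; the function names are hypothetical.

```python
from datetime import datetime

# Assumed time layout: year, day-of-year, hh:mm:ss (e.g. 2006,117,09:46:53).
SYNC_TIME_FORMAT = "%Y,%j,%H:%M:%S"


def parse_sync_line(line):
    """Extract the fields needed for a sample-count check from one sync line."""
    fields = line.strip().split("|")
    return {
        "network": fields[0],
        "station": fields[1],
        "location": fields[2],
        "channel": fields[3],
        "start": datetime.strptime(fields[4], SYNC_TIME_FORMAT),
        "end": datetime.strptime(fields[5], SYNC_TIME_FORMAT),
        "sample_rate": float(fields[7]),
        "n_samples": int(fields[8]),
    }


def expected_samples(record):
    """Samples implied by the time span and the nominal sample rate."""
    span_seconds = (record["end"] - record["start"]).total_seconds()
    return round(span_seconds * record["sample_rate"])


line = "YF|N6008||BHE|2006,117,09:46:53|2006,118,00:00:00||50.00|2559350||||||2007,246||"
record = parse_sync_line(line)
matches = expected_samples(record) == record["n_samples"]
```

For the first N6008 BHE line, the span of 51187 s at 50 samples per second gives 2559350 samples, which matches the count in the listing. A mismatch on any line would point to a gap, an overlap or a header error in the corresponding miniseed file.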

  16. The final IRIS archive • IRIS then verify that all data listed in the seed2sync file reside at IRIS DMC • Any missing data are then sent manually to complete the archive • Once all data are at IRIS they are migrated to the permanent repository • Upon completion, the project PI (Principal Investigator) is the owner of the data at IRIS and determines data release dates and accessibility • 3 year release date is part of the NERC loan agreement • IRIS are flexible regarding data release times • The archive at SEIS-UK is non-permanent

  17. Active source datasets • The end result of the experiment is SEGY gathers • Standard station QC utilities are used in the field • Once all raw data have been collected: • Convert to SEGY and reformat filenames and headers so the data look like Reftek Texan output (day directories; single-component files; specific filename structure; headers populated) • Then use the PASSCAL utilities segygather/txn2segy to produce receiver or shot gathers as required • Can be scripted to automate production of all gathers for the experiment • Use the package plotsec for verification of raw gathers • Users then ftp gathers to their own system for analysis • Network code obtained from IRIS DMC and assembled datasets are ftp’d to IRIS by SEIS-UK for permanent archive. • Controlled source data sets are complex. In reality most of the gather production is done by SEIS-UK once the meta-data have been compiled

  18. Real-time data • SEIS-UK do not use real-time data for mobile stations, nor do we envisage doing so in the near future • Site hardware is expensive • Base station hardware is expensive • Airtime is expensive • Power consumption is high • Site deployment is much more complex, requiring highly trained personnel at each installation • For 1-2 year deployments it is difficult to justify the extra cost

  19. Satellite Modem SOH data only • Guralp DCMs allow Iridium satellite modem communication for system status verification • Two automatic 90 s calls per week made from central base-station • Remote DCM modem on for 2 x 8 hour periods per week (~50 mA when idle or <1 Ah/week) • Base station collects SOH summary report from remote DCM • Call costs £0.50 per minute or ~£0.80 each • Hardware costs of ~£1,000 per unit • Manual calls can also be made to the remote DCM allowing reconfiguration of system and system updates • Transmission is at 2400 bits per second • The modems can operate at temperatures ranging from -30 °C to +60 °C
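The power and cost figures on this slide follow from simple arithmetic, sketched below in Python using only the numbers quoted above. The variable names are illustrative, and the calculation ignores any extra current drawn while a call is in progress, which the slide does not quantify.

```python
# Figures quoted on the slide.
idle_current_a = 0.050          # ~50 mA when idle
on_hours_per_week = 2 * 8       # modem on for 2 x 8 hour periods per week
call_length_s = 90              # automatic calls last 90 s
calls_per_week = 2
cost_per_minute_gbp = 0.50

# Weekly charge drawn by the idle modem, in amp-hours.
weekly_charge_ah = idle_current_a * on_hours_per_week

# Airtime cost of one call and of a week of automatic calls, in pounds.
cost_per_call_gbp = cost_per_minute_gbp * call_length_s / 60
weekly_call_cost_gbp = cost_per_call_gbp * calls_per_week
```

This gives 0.8 Ah per week, consistent with the "<1 Ah/week" figure, and £0.75 of airtime per 90 s call, in line with the "~£0.80 each" quoted on the slide.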

  20. Why the MiChroSat 2400? • The MiChroSat 2400 transmits using the Iridium Satellite Network, a Low Earth Orbit network of 66 satellites that gives total global coverage. • Initial set-up is low-cost in comparison to the large dishes required for geostationary satellite systems such as VSAT. • The power consumption is low due to the minimal hardware required, which is essential for remote solar-powered sites. • Bandwidth is smaller than that for VSAT (which is generally used for data transmission) but for state-of-health communications it is ideal. • The system is low-cost to run since air-time is on a pre-paid pay-as-you-go basis, and by running a modem at SEIS-UK transmissions will be Iridium to Iridium, thus avoiding the large fees for calling to other networks.

  21. Satellite Modem Hardware: HuBLE-UK site in Hudson Bay, Canada, summer 2007

  22. The base station

  23. The base station

  24. Remote System Status Summary

  25. Remote System Status Summary
