WDSS-II Training Module II Manipulating Data Sources
Data files • Data are stored in files • Separate files for each elevation scan, table, LatLonGrid slice, contour, etc. • Each file corresponds to one time-stamp • New files are written as the data stream in • Usually organized by product name • Usually NetCDF or XML format • Details in data format section
Data Sources • The data files are organized • Based on the sensor(s) they come from • Radar, satellite, model, METAR, etc. • Combined from these 6 radars • Each such group is called a data source • Each data source has an “index” file • Lists which data are available • Says how to get hold of the data • Usually all the data are in one directory • But they may be distributed • The index, however, will be in one “well-known” spot
Indices: XML versus XMLLB • All index records are in XML format in one of two file types: • XML LB files (code_index.lb): can have up to 32,500 records; handle notification; binary file, must be accessed with lb_cat, etc. • Algorithms that use an xmllb: index as input will never exit! • XML text files (code_index.xml): user-editable, can have an unlimited number of records • Algorithms that use an xml: index as input will exit when they reach the end of the file.
Example of an index record <item><time fractional="0.0"> 799875952 </time><params>netcdf {indexlocation} Velocity 00.50 19950507-194552.netcdf.gz</params><selections>19950507-194552 Velocity 00.50 </selections> </item> More details in session on data format …
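The record above can be read with any XML parser. The following is a minimal Python sketch, not part of WDSS-II: the function name parse_record is hypothetical, and it assumes the <time> value is seconds since the Unix epoch in UTC, which is consistent with the time-stamp in the file name of this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# The index record from the slide, copied verbatim.
RECORD = (
    '<item><time fractional="0.0"> 799875952 </time>'
    '<params>netcdf {indexlocation} Velocity 00.50 '
    '19950507-194552.netcdf.gz</params>'
    '<selections>19950507-194552 Velocity 00.50 </selections> </item>'
)


def parse_record(xml_text):
    """Split one <item> record into its time, params and selections.

    Hypothetical helper for illustration only. Assumes <time> holds
    seconds since the Unix epoch (UTC).
    """
    item = ET.fromstring(xml_text)
    epoch = int(item.findtext("time").strip())
    when = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return {
        "epoch": epoch,
        "utc": when.strftime("%Y%m%d-%H%M%S"),
        # Whitespace-separated fields: how to fetch the file.
        "params": item.findtext("params").split(),
        # Whitespace-separated fields: time, product name, subtype.
        "selections": item.findtext("selections").split(),
    }


record = parse_record(RECORD)
print(record["utc"], record["selections"])
```

Under that assumption, the decoded time matches the first entry of <selections>, which suggests the two fields carry the same time-stamp in different forms.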
Creating an XML index file • For files that follow the WDSS-II naming convention, • makeIndex.pl /my/input/dir code_index.xml (very fast) • Directory should have data in the directory structure: DataType/SubType/Time, like: “Reflectivity/00.50/20030511-123400.netcdf.gz” • Otherwise, use “makeNetcdfIndex” (slow but sure) • makeNetcdfIndex can be run in real-time • Monitors a directory and updates index
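Before running makeIndex.pl it can help to check that file paths really follow the DataType/SubType/Time layout. The sketch below is a hypothetical Python helper, not a WDSS-II tool; the regular expression encodes only what the example path on this slide shows (a YYYYMMDD-HHMMSS time-stamp followed by an extension) and is an assumption about the convention.

```python
import re

# Assumed layout, taken from the slide's example path:
#   DataType/SubType/YYYYMMDD-HHMMSS.<extension>
PATTERN = re.compile(
    r"^(?P<datatype>[^/]+)/(?P<subtype>[^/]+)/"
    r"(?P<time>\d{8}-\d{6})\.(?P<ext>.+)$"
)


def split_product_path(relative_path):
    """Return the path components as a dict, or None if the path
    does not match the assumed DataType/SubType/Time layout."""
    match = PATTERN.match(relative_path)
    return match.groupdict() if match else None


parts = split_product_path("Reflectivity/00.50/20030511-123400.netcdf.gz")
print(parts)
```

Paths for which the helper returns None would be candidates for makeNetcdfIndex rather than makeIndex.pl.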
Cropping the index • w2simulator is a useful tool for playing back data. We can also use it to “crop” a time section out of an index file and put it in another: • w2simulator -i xmllb:/data/mydir/code_index.lb -l new_index.lb -b 20010228-160000 -e 20010228-163000 -f • Accessing code_index.lb will give you all the data. Accessing new_index.lb will give you a 30-minute subsection.
Backing up real-time data • The easiest way, if you want all of the data: • Just copy the entire directory of interest, including the code_index.lb file (cp -r <source dir> <target dir>; or scp). Connect to the LB in the new directory. • If the first record of the code_index.lb file is too late (the index doesn’t list all the data files), then create an XML index file.
Backing up real-time data • archiveProducts.pl: saves a subset of sources or a time period from a real-time data stream. • Must have access to the machine’s data disk • Usage: archiveProducts.pl <start date> <start hour> <end date> <end hour> <source directory> <target directory> <sources (“list in quotes” or “all”)> • archiveProducts.pl 20021110 18 20021111 04 /data/realtime/radar /data/mytargetdir “KLSX KILX KEAX multi” • Creates a new code_index.xml and code_index.lb for each data source directory (with makeIndex.pl & w2simulator)
XMLLB to XML to XMLLB • XML to XMLLB: • w2simulator -i xml:/data/mydir/code_index.xml -l code_index.lb -f • XMLLB to XML: • replaceIndex -i xmllb:/data/mydir/code_index.lb -o /data/mydir/code_index.xml
Referencing Data Locations (URLs) • Indextype:machine:/directory/index • Indextype: xml: or xmllb: • machine: (optional) • directory: path to data • index: usually code_index.lb or code_index.xml • Examples: • xmllb:anubis:/data/realtime/radar/KTLX/code_index.lb • xml:/data/archive/multi/code_index.xml
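The URL layout can be illustrated with a small parser. This is a minimal Python sketch, not WDSS-II code: parse_index_url is a hypothetical name, and it assumes the machine part is the optional one (as in the second example above) and that paths contain no colons.

```python
import posixpath


def parse_index_url(url):
    """Split 'Indextype:machine:/directory/index' into its parts.

    Hypothetical helper. Assumes the machine field may be omitted
    and that neither the machine name nor the path contains ':'.
    """
    fields = url.split(":")
    if len(fields) == 3:
        index_type, machine, path = fields
    elif len(fields) == 2:
        index_type, path = fields
        machine = None  # local index, no machine given
    else:
        raise ValueError("unrecognized index URL: " + url)
    if index_type not in ("xml", "xmllb"):
        raise ValueError("unknown index type: " + index_type)
    return {
        "index_type": index_type,
        "machine": machine,
        "directory": posixpath.dirname(path),
        "index": posixpath.basename(path),
    }


remote = parse_index_url("xmllb:anubis:/data/realtime/radar/KTLX/code_index.lb")
local = parse_index_url("xml:/data/archive/multi/code_index.xml")
print(remote)
print(local)
```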
End of WDSS-II Training Module II Next: Real-time system setup Archived data playback