Data Retrieval and Software Configuration
Brian Jackel
University of Calgary
Data Retrieval and Software
• Imager Data
  • Hours of operation
  • Image size
  • Data volume
  • Compression
• Site Software
  • Configuration
  • Imaging programs
  • System status
  • Compression
• Central Operations
  • Monitoring
  • Network
  • Project data flow
  • Data products
Image Frame Cropping
[Figure: image frame cropped to 270 × 270, 250 × 250, and 230 × 230 pixels]
Image Cadence
• Currently acquiring one image every 5 seconds (mission goal)
• Possibility of increasing integration time or increasing cadence
• Imaging every 5 seconds produces 90 megabytes of data per hour (uncompressed), for an average of 700 megabytes per site per day, and a peak of 1.4 gigabytes per day at the highest latitude site during midwinter.
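The hourly figure above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the frame size quoted on the Image Compression slide (128 kilobytes plus a 1 kilobyte header, taken here as 1024-byte kilobytes); the function names are hypothetical.

```python
# Assumed inputs, both taken from the slides: 5 s cadence and a
# 128 KiB image with a 1 KiB header (kilobyte assumed to mean 1024 bytes).
CADENCE_S = 5
FRAME_BYTES = (128 + 1) * 1024


def bytes_per_hour(cadence_s=CADENCE_S, frame_bytes=FRAME_BYTES):
    """Uncompressed bytes written during one hour of continuous imaging."""
    return (3600 // cadence_s) * frame_bytes


def imaging_hours(megabytes_per_day, megabytes_per_hour=90.0):
    """Hours of imaging per day implied by a daily volume at the quoted hourly rate."""
    return megabytes_per_day / megabytes_per_hour


if __name__ == "__main__":
    print("per hour: %.1f MiB" % (bytes_per_hour() / 2**20))
    print("average night: %.1f h" % imaging_hours(700))
    print("midwinter peak: %.1f h" % imaging_hours(1400))
```

Under these assumptions the hourly volume comes out close to the quoted 90 megabytes, and the daily figures correspond to roughly 8 hours of imaging on an average night and roughly 16 hours at the midwinter peak.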
Total Data Volume
• Solar zenith angle = 102°
• data/year/site: 220 to 290 gigabytes (uncompressed)
• total data/year: 5 terabytes (uncompressed)
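The annual totals can be cross-checked against the per-site figures. This sketch assumes 20 sites (the number given on the Data Products slide) and decimal units (1 terabyte = 1000 gigabytes); the helper names are hypothetical.

```python
SITES = 20  # assumption: site count taken from the Data Products slide


def annual_total_tb(gb_per_site_per_year, sites=SITES):
    """Project-wide uncompressed volume per year, in terabytes."""
    return gb_per_site_per_year * sites / 1000


def annual_site_gb(mb_per_day):
    """Per-site yearly volume implied by an average daily volume."""
    return mb_per_day * 365 / 1000


if __name__ == "__main__":
    print("project total: %.1f to %.1f TB/year"
          % (annual_total_tb(220), annual_total_tb(290)))
    print("700 MB/day average: %.1f GB/year/site" % annual_site_gb(700))
```

With these inputs the project total spans about 4.4 to 5.8 terabytes per year, bracketing the quoted 5 terabytes, and the 700 megabyte daily average from the Image Cadence slide falls inside the 220 to 290 gigabyte per-site range.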
Image Compression
• “raw” file size: 128 kilobytes + 1 kilobyte header
• DEFLATE compression (used by gzip and PNG) is not very effective on 16-bit images, even with pre-filtering
• bzip2 compression of raw image files is slower but often gives significantly smaller files
• use PGM files with bzip2 for the first season
• revisit the topic next spring
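A comparison of this kind can be reproduced with Python's standard `zlib` (DEFLATE) and `bz2` modules. The sketch below uses a synthetic 256×256 frame of 16-bit Gaussian noise, which is purely a stand-in: real imager frames will compress differently, so the printed sizes say nothing about the actual THEMIS data.

```python
import bz2
import random
import struct
import zlib


def synthetic_frame(width=256, height=256, seed=0):
    """Hypothetical test frame: 16-bit big-endian samples (PGM byte order)."""
    rng = random.Random(seed)
    pixels = [min(65535, max(0, int(rng.gauss(1000, 30))))
              for _ in range(width * height)]
    return struct.pack(">%dH" % len(pixels), *pixels)


def compare(raw):
    """Return the raw size and the size after each compressor, in bytes."""
    return {
        "raw": len(raw),
        "zlib": len(zlib.compress(raw, 9)),
        "bz2": len(bz2.compress(raw, 9)),
    }


if __name__ == "__main__":
    for name, size in compare(synthetic_frame()).items():
        print("%-5s %7d bytes" % (name, size))
```

To evaluate real data, replace `synthetic_frame()` with the bytes of an actual image file and time each call as well, since the slide's concern is both ratio and CPU cost.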
Site Software
• Linux operating system
  • currently RedHat 9.0 upgraded to kernel 2.4.25
  • minimal set of system services to reduce CPU load
  • separate users for imager, magnetometer, and system monitoring
  • strict firewall and other access restrictions
Imager Data Flow: On-Site
[Diagram; recoverable labels: imagerd, recent.pgm, pgm_reshape, thumbnail.pgm, udp_sendfile, udp_receiverd, system disk, backup, archive disk (twice)]
Imager Daemon
• Tcl program is always running, waiting for solar zenith angles > 102° (nautical twilight)
• images acquired every 5 seconds, nearly simultaneously (within approximately 0.1 seconds) at all sites
• detailed status information provided in text format:
    2004-04-05 13:50:32 site calgu at latitude 51.7, longitude 245.8
    solar zenith angle 45.4, threshold 0.0, imager should be ON
    imager themis01 acquired 1 images with schedule ASAP
     PPID   PID ELAPSED %CPU %MEM     TIME  VSZ COMMAND
    22474 12360   00:03  3.0  0.5 00:00:00 4448 tclsh /usr/local/bin/imagerd
    2004-04-05 13:50:30 [debug] exposing 2230 ms, converting 90 ms (waited 2588 ms)
    2004-04-05 13:50:30 [debug] image 20040405_1950_calgu_themis01_full_1000ms
    2004-04-05 13:50:30 [debug] current time 1081194629.
    2004-04-05 13:50:29 [debug] imager on at sun angle 45.425991
    2004-04-05 13:50:29 [notice] turning on camera and opening shutter
    2004-04-05 13:50:29 [notice] imaging started
    2004-04-05 13:50:29 [debug] executing option run
    2004-04-05 13:50:29 [debug] using uniqueID calgu_themis01
    2004-04-05 13:50:29 [debug] using schedule: ASAP
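The two decisions the daemon makes (whether to image, and when to start the next exposure) can be sketched as follows. This is not the Tcl implementation: it assumes that near-simultaneous acquisition is obtained by starting exposures on shared 5-second clock boundaries, which the slides do not state, and both function names are hypothetical.

```python
import math

CADENCE_S = 5          # seconds between exposures (from the slide)
THRESHOLD_DEG = 102.0  # solar zenith angle threshold (from the slide)


def should_image(solar_zenith_deg, threshold_deg=THRESHOLD_DEG):
    """True when the sun is far enough below the horizon to image."""
    return solar_zenith_deg > threshold_deg


def next_exposure_time(now_s, cadence_s=CADENCE_S):
    """First cadence boundary strictly after now_s (seconds since the epoch).

    Assumption: every site aligns to the same boundaries, so sites with
    synchronized clocks would expose at nearly the same instant.
    """
    return (math.floor(now_s / cadence_s) + 1) * cadence_s


if __name__ == "__main__":
    # Values from the sample log: sun angle 45.4 with the threshold set to 0.0.
    print(should_image(45.4, threshold_deg=0.0))
    print(next_exposure_time(1081194629))
```

The sample log was produced with the threshold set to 0.0 and the schedule set to ASAP, so it shows a daytime test run rather than the aligned night-time schedule sketched here.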
Image meta-data
Essential information attached to each image on site:
    #"Site name" University of Calgary
    #"Site uniqueid" calgu
    #"Site description" Physics building roof
    #"Geodetic latitude" 51.7N
    #"Geodetic longitude" 245.8E
    #"Geodetic altitude" 1140m
    #"Site information revised" 2004-03-17
    #"Imager orientation" yaw=0.0 pitch=0.0 roll=0.0
    #"Imager uniqueid" themis01
    #"Imager type" starlight Xpress MX716
    #"Expose pgm" /usr/local/bin/starlight_expose_pgm
    #"Ccd width" 752 pixels
    #"Ccd height" 290 pixels
    #"Pixel depth" 16 bits
    #"Pixel aspect ratio" 0.5238
    #"Optical type" THEMIS all-sky
    #"Optical projection" a1=1.5 a3=0.0 b2=0.0 b4=0.0
    #"Optical center" x0=128 y0=128
    #"Imager information revised" 2004-03-17
    #"Image size" 270 270
    #"CCD device" /dev/ccdA
    #"Exposure options" XOFFSET=130 YOFFSET=10 WIDTH=540 HEIGHT=270 XBIN=2 YBIN=1 MSEC=1000
    #"Exposure start time" 2004-03-25 12:24:00.182015 UTC
    #"Exposure duration" 1119.64 ms
Slide annotations, in order:
• Site
• Requires star frames after installation
• Instrument
• Can be determined before installation, should be similar for all THEMIS imagers
• Image
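Header lines in this `#"Key" value` form are easy to read back by machine. The sketch below is a hypothetical parser, not part of the site software; it also splits the `KEY=VALUE` pairs in the exposure options, which lets the binned frame size be checked against the "Image size" entry.

```python
import re

# Matches lines such as: #"Site uniqueid" calgu
_META_LINE = re.compile(r'^#"([^"]+)"\s*(.*)$')


def parse_metadata(lines):
    """Collect #"Key" value comment lines into a dictionary of strings."""
    meta = {}
    for line in lines:
        match = _META_LINE.match(line.strip())
        if match:
            meta[match.group(1)] = match.group(2).strip()
    return meta


def parse_options(text):
    """Split a 'KEY=VALUE KEY=VALUE' string into a dictionary of strings."""
    return dict(item.split("=", 1) for item in text.split())


if __name__ == "__main__":
    sample = [
        '#"Site uniqueid" calgu',
        '#"Image size" 270 270',
        '#"Exposure options" XOFFSET=130 YOFFSET=10 WIDTH=540 HEIGHT=270 '
        'XBIN=2 YBIN=1 MSEC=1000',
    ]
    meta = parse_metadata(sample)
    opts = parse_options(meta["Exposure options"])
    # Binned frame size: width / x-binning by height / y-binning.
    print(int(opts["WIDTH"]) // int(opts["XBIN"]),
          int(opts["HEIGHT"]) // int(opts["YBIN"]))
```

For the sample values, a 540-pixel width binned by 2 and a 270-pixel height binned by 1 both give 270, consistent with the recorded image size.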
Imager Geometry and Orientation
• Hundreds of stars are visible in a 1 second exposure when skies are clear.
• Minimization techniques allow estimation of parameters related to camera optics and imager orientation.
• Resulting angular uncertainties are less than ½° over most of the field of view.
Image Thumbnails
Full resolution 256×256 pixel image reduced to a 20×20 pixel “thumbnail”:
    pgm_reshape reshape_filename image_filename > thumb_filename
reshape_filename is a standard PGM image whose pixel values are the indices of the corresponding thumbnail cells (a look-up table).
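The look-up-table idea can be illustrated with plain lists. The slide does not say how pixels sharing a thumbnail index are combined, so this sketch assumes a simple integer mean; `reshape` and its arguments are hypothetical and do not reproduce the `pgm_reshape` program itself.

```python
def reshape(image, lookup, n_cells):
    """Reduce an image to n_cells values using a per-pixel index table.

    image   -- flat sequence of pixel values
    lookup  -- flat sequence of the same length; lookup[i] is the thumbnail
               cell that pixel i contributes to (out-of-range values skipped)
    n_cells -- number of thumbnail cells, e.g. 400 for a 20x20 thumbnail

    Assumption: pixels mapped to the same cell are averaged.
    """
    sums = [0] * n_cells
    counts = [0] * n_cells
    for value, cell in zip(image, lookup):
        if 0 <= cell < n_cells:
            sums[cell] += value
            counts[cell] += 1
    # Cells that no pixel maps to are left at zero.
    return [s // c if c else 0 for s, c in zip(sums, counts)]


if __name__ == "__main__":
    # Four pixels folded into two cells.
    print(reshape([10, 20, 30, 40], [0, 0, 1, 1], 2))
```

Because the mapping lives in an ordinary image file, the same program can produce square-grid, polar, or any other binning by swapping the look-up image.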
Real-Time Data: Transmit
• UDP packets
  • connectionless (low overhead)
  • best effort (unreliable, but losses rarely exceed 1%)
  • up to 64 kilobytes of arbitrary data
Transmit roughly 500 bytes per “thumbnail” image every 5 seconds:
    udp_sendfile <target_hostname> <target_port> <filename>
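Sending one small payload per datagram takes only a few lines with the standard socket API. The sketch below is a hypothetical Python analogue, not the `udp_sendfile` source; it rejects payloads above 65507 bytes, the largest UDP payload that fits in a single IPv4 datagram.

```python
import socket

MAX_UDP_PAYLOAD = 65507  # 65535 minus 20-byte IPv4 and 8-byte UDP headers


def send_bytes(payload, host, port):
    """Send payload as one UDP datagram; return the number of bytes sent."""
    if len(payload) > MAX_UDP_PAYLOAD:
        raise ValueError("payload does not fit in a single UDP datagram")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def send_file(filename, host, port):
    """Read a whole file and send its contents as one datagram."""
    with open(filename, "rb") as handle:
        return send_bytes(handle.read(), host, port)
```

There is no acknowledgement or retry, matching the best-effort approach on the slide: a lost thumbnail is simply superseded by the next one 5 seconds later.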
Real-Time Data: Receive
• nearly 2 packets/second of imager thumbnails, roughly 16 kilobits/second
• xinetd used to limit number of connections per second, preventing accidental or malicious flooding
• data in directory tree ordered by IP address and date:
    udp_receiverd <port> <root_directory>

    |-- ip_address
    |   |-- 136.159.51.24
    |   |   `-- 2004
    |   |       `-- 03
    |   |           `-- 01
    |   `-- 136.159.51.33
    |       `-- 2004
    |           `-- 03
    |               `-- 01
    |-- host_name
    |   |-- space.phys.ucalgary.ca -> ../ip_address/136.159.51.24
    |   `-- wham.phys.ucalgary.ca -> ../ip_address/136.159.51.33
    `-- site_code
        |-- calgu1 -> ../ip_address/136.159.51.24
        `-- calgu2 -> ../ip_address/136.159.51.33
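The directory layout can be expressed as a small path-building function. This sketch is hypothetical and covers only the `ip_address/<address>/<year>/<month>/<day>` branch of the tree; using the UTC date of arrival is an assumption, and the `host_name` and `site_code` symbolic links are not created here.

```python
import os
from datetime import datetime, timezone


def archive_dir(root, ip_address, when):
    """Directory for a packet from ip_address received at datetime `when`."""
    return os.path.join(
        root,
        "ip_address",
        ip_address,
        when.strftime("%Y"),
        when.strftime("%m"),
        when.strftime("%d"),
    )


if __name__ == "__main__":
    stamp = datetime(2004, 3, 1, 12, 0, tzinfo=timezone.utc)
    print(archive_dir("/data", "136.159.51.24", stamp))
```

Keying the primary tree on the sender's address means data can be filed before the site is identified, with the name-based links added afterwards.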
On-Site Monitoring
• Information about hardware and software state gathered on site
  • written to text files (included in daily backup)
  • entered in RRD database
Real-Time Status: Transmit
• Syslog
  • Unix infrastructure for sending and logging arbitrary text strings
  • messages tagged with severity code (e.g. debug, info, error)
  • logs stored locally as text files; “logrotate” used to ensure log files don’t exceed size limit
  • messages can also be forwarded across the network to other computers, as in this excerpt from /etc/syslog.conf:
      local7.*    /var/log/boot.log
      *.info      @realtime.phys.ucalgary.ca
Real-Time Status: Receive
• Syslog-ng
  • extended version of the standard syslog daemon, run on the receiving host
  • flexible filtering scheme allows messages to be filtered by sender, contents, etc.
  • messages stored in a directory tree ordered by IP address and date; they could also be entered in a database
  • browsing via standard web and command line tools (e.g. “find”, “grep”)
Other Monitoring Tools
• Standard logging will be sufficient for the majority of cases. However, additional information and intervention will occasionally be required:
  • direct log-in using SSH
  • file transfer using RSYNC or SCP
  • status polling using SNMP
Satellite Internet
• Telesat HSi satellite internet coverage is available at all THEMIS sites.
• Bandwidth is asymmetric, with peak downlink of 1 megabit/second and peak uplink of 100 kilobits/second (temporarily throttled to 10 kilobits/second after short-term limits are exceeded).
• Monthly bandwidth cap of 1.5 gigabytes results in an effective average uplink bandwidth of roughly 4.6 kilobits/second.
• Long latency times (>240 ms one way) are obvious during an interactive session, and cause unreliable remote NTP synchronization.
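The effective average uplink follows directly from the monthly cap. This sketch assumes a 30-day month and decimal gigabytes (1 gigabyte = 10^9 bytes), and for comparison computes the rate of one site's thumbnail stream using the 500 bytes per 5 seconds quoted on the Transmit slide; the function names are hypothetical.

```python
def average_uplink_bps(cap_bytes=1.5e9, days=30):
    """Sustained bit rate that exactly uses up a monthly byte cap."""
    return cap_bytes * 8 / (days * 86400)


def thumbnail_bps(bytes_per_image=500, cadence_s=5):
    """Bit rate of one site's thumbnail stream, ignoring packet headers."""
    return bytes_per_image * 8 / cadence_s


if __name__ == "__main__":
    print("cap-limited average: %.1f kbit/s" % (average_uplink_bps() / 1000))
    print("thumbnail stream:    %.1f kbit/s" % (thumbnail_bps() / 1000))
```

Under these assumptions the thumbnail stream needs about 0.8 kilobits/second while imaging, which leaves most of the cap-limited average for syslog traffic and occasional interactive sessions.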
Project Data Flow
[Diagram; recoverable labels: thumbnail and syslog streams (five pairs), realtime.phys.ucalgary.ca, themis.phys.ucalgary.ca, RAID (twice), public data, Off-line Archive, Integration, THEMIS team, shipping, Disks from sites]
Data Products
• Merged maps and movies for quick-look science and public outreach. Lower resolution, lossy formats, 1 minute cadence.
• CDF files with luminosity (8 bit) from 20×20 grid at each of 20 sites, plus corresponding azimuth, zenith angle, geographic latitude/longitude (110km and 150km), PACE latitude/longitude (110km and 150km).
• File size would be roughly 10 megabytes for a 24 hour sequence at 1 minute cadence.
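The file size estimate can be reproduced from the grid dimensions. This sketch counts only the 8-bit luminosity samples and assumes the geometry arrays (azimuth, zenith angle, latitude/longitude) are stored once per file rather than once per time step, which the slide does not specify; the function name is hypothetical.

```python
def luminosity_bytes(grid=20, sites=20, minutes=24 * 60, bytes_per_sample=1):
    """Bytes of luminosity data for one file of per-minute thumbnails."""
    return grid * grid * sites * minutes * bytes_per_sample


if __name__ == "__main__":
    total = luminosity_bytes()
    print("%d bytes, about %.1f megabytes" % (total, total / 1e6))
```

With the defaults this gives about 11.5 megabytes for 24 hours at 1 minute cadence, in line with the "roughly 10 megabytes" on the slide.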