This presentation describes a scalable distributed datastore for bioimaging applications, examining latency optimization, network effects, and the relative performance of network file systems over LAN and WAN connections, along with the hardware, software, and design choices that affect research workflows.
A Scalable Distributed Datastore for BioImaging • R. Cai, J. Curnutt, E. Gomez, G. Kaymaz, T. Kleffel, K. Schubert, J. Tafas • {jcurnutt, egomez, keith, jtafas}@r2labs.org • Renaissance Research Labs, Department of Computer Science, California State University, San Bernardino, CA 92407 • Supported by NSF ITR #0331697
Background • CSUSB Institute for Applied Supercomputing • Low Latency Communications • UCSB Center for BioImage Informatics • Retinal images • Texture map searches • Distributed consortium (UCB, CMU)
Retina Images • Laser scanning confocal microscope images of the retina: normal (n), 3 day detachment (3d), 3 month detachment (3m), and 1 day detachment followed by 6 days reattached with increased oxygen (1d+6dO2)
Environment • Diagram of the UCSB–CSUSB setup: image and metadata servers (BISQUE) at both sites, the local Hammer/Nail cluster, the Raven cluster, and Lustre storage; local feature searches run over the LAN, analysis runs over the WAN, with both internal and external connections
Software • Open source: OME, PostgreSQL 7, Bisque • Distributed datastore: clustering, NFS, Lustre • Benchmark: OSDB (timing sketch below)
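OSDB exercises a relational database with mixed query workloads. A minimal sketch of the kind of timing loop such a benchmark runs against PostgreSQL, assuming a hypothetical host, database, user, and table (none of these are the actual OME/Bisque schema):

```python
# Minimal sketch of an OSDB-style query timing loop.
# Host, database, user, and table names are hypothetical placeholders.
import time
import psycopg2

conn = psycopg2.connect(host="db.example.org", dbname="ome", user="benchmark")
cur = conn.cursor()

def time_query(sql, repeats=100):
    """Run a query repeatedly and report the mean wall-clock time per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        cur.execute(sql)
        cur.fetchall()
    return (time.perf_counter() - start) / repeats

# A small metadata lookup: latency-bound, returns only a few rows.
print("metadata lookup:", time_query("SELECT id, name FROM images WHERE id < 10"))
```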
Hardware - Raven • 5-year-old dual-processor Compaq ProLiant DL-360 servers • 1.4 GHz Pentium III • 256 MB RAM • 60 GB SCSI • Raven has been latency tuned
Hardware – Hammer/Nail • Hammer head node and 5 Nail nodes • Quad-CPU 3.2 GHz Xeon • 4 GB RAM • 140 GB SCSI • Dell servers • Bandwidth tuned (default)
Outline • Effect of node configuration • Comparison of network file systems • Effects of a wide area network (WAN)
Design Effects? • A few expert users • Metadata searches: small results to user • Texture searches: heavy calculation on cluster, small results to user • Latency tuning (one illustrative example below)
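The slides do not specify what latency tuning was applied; one common example is disabling Nagle's algorithm so small query and result packets are sent immediately instead of being coalesced. A purely illustrative sketch, assuming a hypothetical metadata server address:

```python
# Illustrative only: one common latency tuning, disabling Nagle's algorithm
# (TCP_NODELAY) so small request/response packets are not held back for coalescing.
# The host and port are hypothetical; the actual tuning used is not specified.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # send small writes immediately
sock.connect(("metadata-server.example.org", 5432))
```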
Outline • Effect of node configuration • Comparison of network file systems • Effects of a wide area network (WAN)
NFS / Lustre Performance • NFS: well-known standard, but configuration problems with OME motivated a performance comparison against the Lustre file system • Lustre: journaling, striping across multiple computers, data redundancy and failover
Relative Performance on LAN • NFS and Lustre compared to a local DB over a 1 Gb/s LAN • Two significant differences
Significant Differences • NFS caching helps bulk deletes and bulk modifies • Lustre stripes across computers to increase bandwidth (see the sketch below)
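A back-of-the-envelope model of why striping raises throughput: reads striped over N object servers are limited by the smaller of the client's link and the combined server bandwidth. The bandwidth figures below are illustrative assumptions, not measurements from this work:

```python
# Back-of-the-envelope model of striped reads; numbers are illustrative only.
def striped_read_time(size_mb, stripe_servers, server_mbps=400, client_link_mbps=1000):
    """Estimate read time (seconds) for a file striped over several servers."""
    aggregate = min(stripe_servers * server_mbps, client_link_mbps)  # Mb/s
    return size_mb * 8 / aggregate

for n in (1, 2, 4):
    print(f"{n} server(s): {striped_read_time(1000, n):.1f} s for a 1000 MB file")
```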
Outline • Effect of node configuration • Comparison of network file systems • Effects of a wide area network (WAN)
Effect of a Wide Area Network (WAN) • Compared three connections: local; switched, high-speed LAN (1 Gb/s); WAN between UCSB and CSUSB (~50 Mb/s) • NFS only: UCSB did not have Lustre installed, and active research prevented reinstalling the OS
Effect of a Wide Area Network (WAN) • The most significant effect is on latency-intensive operations, not bandwidth-intensive ones • A next-generation WAN will not solve the problem • Frequently used data must be kept locally: a database cluster with a daily sync of remote databases (cost model below)
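To see why latency rather than bandwidth dominates, consider a simple per-operation cost model: each of many small metadata queries pays one round trip plus a tiny transfer. The round-trip times, result size, and query count below are illustrative assumptions, not measured values:

```python
# Illustrative cost model: many small metadata queries over three links.
# RTTs, result size, and query count are assumed values, not measurements.
def total_time(queries, rtt_s, result_kb, bandwidth_mbps):
    per_query = rtt_s + (result_kb * 8 / 1000) / bandwidth_mbps  # seconds
    return queries * per_query

links = {
    "local":        (0.0001, 10000),   # (RTT in seconds, bandwidth in Mb/s)
    "1 Gb/s LAN":   (0.0005, 1000),
    "~50 Mb/s WAN": (0.0150, 50),
}
for name, (rtt, bw) in links.items():
    print(f"{name:>12}: {total_time(10000, rtt, result_kb=4, bandwidth_mbps=bw):.1f} s")
# The WAN total is dominated by round trips, so raising bandwidth barely helps.
```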
Conclusions • Scientific researchers should latency tune their network, not bandwidth tune it • The latency of a WAN is too large: replicate data locally and update it • Bisque/OME had NFS issues • Lustre handles high-bandwidth operations well: stripe Lustre across systems
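A minimal sketch of one way a daily sync of remote databases could be done with standard PostgreSQL tooling (pg_dump piped into psql); the hosts and database name are hypothetical, and the replication mechanism actually used is not specified in the slides:

```python
# Hypothetical nightly sync of a remote metadata database to a local replica
# using standard PostgreSQL tools; hosts and database names are placeholders.
# In practice the local replica would be dropped and recreated before restoring.
import subprocess

def sync_remote_db(remote_host, local_host, dbname):
    dump = subprocess.run(
        ["pg_dump", "-h", remote_host, dbname],
        capture_output=True, check=True, text=True)
    subprocess.run(
        ["psql", "-h", local_host, "-d", dbname],
        input=dump.stdout, check=True, text=True)

sync_remote_db("bisque.ucsb.example.org", "localhost", "ome_metadata")
```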
Future directions: • Agent-based texture search engine: loosely coupled cluster over an unreliable WAN connection, fault tolerant, parallelized jobs (see the sketch below) • Open source components: Scilab, converting NSF-funded Matlab algorithms, simple interface • Superior caching scheme for Lustre
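A minimal sketch of the fault-tolerant dispatch a loosely coupled, agent-based search might use, retrying each texture-search job on another node when an unreliable connection fails; all node and function names are hypothetical, not the authors' implementation:

```python
# Hypothetical fault-tolerant dispatch of texture-search jobs to loosely
# coupled worker nodes over an unreliable link; all names are placeholders.
import random
import time

def run_on_worker(node, image_id):
    """Stand-in for a remote texture search; fails randomly like a flaky WAN link."""
    if random.random() < 0.3:
        raise ConnectionError(f"{node} unreachable")
    return f"features({image_id})"

def dispatch(jobs, nodes, max_retries=3):
    results = {}
    for image_id in jobs:
        for attempt in range(max_retries):
            node = nodes[attempt % len(nodes)]   # try a different node on failure
            try:
                results[image_id] = run_on_worker(node, image_id)
                break
            except ConnectionError:
                time.sleep(0.1)                  # back off, then retry elsewhere
    return results

print(dispatch(jobs=[1, 2, 3], nodes=["nail1", "nail2", "nail3"]))
```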