Learn about the performance improvements in the BaBar project at SLAC, including scalability achievements, a major code redesign, and future optimization plans. Discover the challenges, solutions, and statistics of a fast-growing multi-TB database.
Improving Performance of Object Oriented Databases, BaBar Case Studies Jacek Becla Stanford Linear Accelerator Center
The BaBar Project • Headquartered @ SLAC • 3 regional centers (France, Italy, UK) • 300 physicists/engineers • Data sample size • 32 MB/sec (100 events/sec) from the detector • ~300 TB per year, >10 years lifetime • In production since May’99
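The data-sample figures above can be cross-checked with a little arithmetic. The sketch below derives the average event size from the quoted rates, then asks what fraction of a year the detector would have to run at 32 MB/sec to produce ~300 TB. The slides state neither the duty cycle nor whether ~300 TB counts only detector output, so the "implied live fraction" is an illustrative assumption, as are the decimal units (1 TB = 10^6 MB).

```python
# Back-of-the-envelope check of the data-sample figures on the slide.
# Assumptions (not from the slides): decimal units, a 365-day year, and
# that the ~300 TB/year figure is made up of detector output only.

SECONDS_PER_YEAR = 365 * 24 * 3600

rate_mb_per_sec = 32.0     # from the slide
events_per_sec = 100.0     # from the slide
annual_volume_tb = 300.0   # from the slide (approximate)

# Average size of one event as it leaves the detector.
event_size_mb = rate_mb_per_sec / events_per_sec

# Volume produced if the detector ran at full rate all year.
continuous_tb_per_year = rate_mb_per_sec * SECONDS_PER_YEAR / 1e6

# Fraction of the year at full rate that would yield ~300 TB.
implied_live_fraction = annual_volume_tb / continuous_tb_per_year

print(f"event size: {event_size_mb:.2f} MB")
print(f"continuous running: {continuous_tb_per_year:.0f} TB/year")
print(f"implied live fraction: {implied_live_fraction:.2f}")
```

Under these assumptions each event is about 0.32 MB, and ~300 TB per year corresponds to running at full rate for roughly 30% of the year.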
Performance Requirements • Online Prompt Reconstruction (OPR) • 200 computing nodes • 100 events/sec processing rate on average • likely to be augmented in the future • Physics Analysis • DST creation: 2 users at 10^9 events per month • DST Analysis: 20 users at 10^8 events per month • Interactive analysis: 100 users at 100 events/sec
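To compare the analysis requirements with the OPR rate, it helps to put everything in events/sec. The sketch below reads the monthly figures as 10^9 and 10^8 events and assumes they are per-user numbers spread evenly over a 30-day month. Neither assumption is stated on the slide, so the totals are indicative only.

```python
# Convert the monthly analysis requirements into sustained events/sec.
# Assumptions (not from the slides): a 30-day month, evenly spread load,
# and per-user (rather than aggregate) event counts.

SECONDS_PER_MONTH = 30 * 24 * 3600

def events_per_sec(events_per_month: float) -> float:
    """Average sustained rate needed to process the given monthly volume."""
    return events_per_month / SECONDS_PER_MONTH

# (number of users, sustained rate per user in events/sec)
workloads = {
    "DST creation": (2, events_per_sec(1e9)),
    "DST analysis": (20, events_per_sec(1e8)),
    "Interactive analysis": (100, 100.0),  # already given per second
}

# Aggregate rate per workload if all users run at once.
totals = {name: users * rate for name, (users, rate) in workloads.items()}

for name, (users, rate) in workloads.items():
    print(f"{name}: {rate:.1f} events/sec per user, "
          f"{totals[name]:.0f} events/sec for {users} users")
```

Under these assumptions both DST workloads come to roughly 770 events/sec in aggregate, several times the 100 events/sec OPR rate.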
Achieving Scalability & Improving Performance • Dedicated test-bed since Aug ’99 • initially 100 nodes, 2 data servers, 2 secondary servers, 2 file systems • expanded to 4 data servers, 3 secondary servers, 6 file systems • and up to 230 nodes, scheduled access • Tests focused on OPR • fully controlled environment • Powerful monitoring system developed
Major Improvements (1) • Redesigned code • reduced lock collisions and lock traffic • many optimizations for speed • without losing robustness • Increased #data servers & #file systems • random write: 8 MB/sec limitation • Increased #server processes per host • reduces #open file descriptors • measured in thousands
Major Improvements (2) • Pre-sized containers • Increased & randomized transactions • Increased #used databases • database=file=serialization point • segregating clients • load balancing, conditions spread
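Two of the ideas above, randomized transactions and segregating clients across databases, can be illustrated with a minimal sketch. This is not BaBar or Objectivity code. The function names, the jitter scheme, and the round-robin assignment are hypothetical and only show why the approach helps: clients that commit at slightly different times and write to different databases are less likely to queue on the same lock or file.

```python
import random

def commit_points(n_events, base_interval, jitter, rng):
    """Event counts at which one client commits its transaction.

    Each transaction spans base_interval +/- jitter events, so clients
    started together drift apart instead of committing in lock-step.
    (Hypothetical scheme, for illustration only.)
    """
    if not 0 <= jitter < base_interval:
        raise ValueError("jitter must be in [0, base_interval)")
    points = []
    position = 0
    while True:
        position += rng.randint(base_interval - jitter, base_interval + jitter)
        if position >= n_events:
            break
        points.append(position)
    points.append(n_events)  # final commit once all events are processed
    return points

def assign_database(client_id, n_databases):
    """Round-robin placement: spread clients evenly over output databases."""
    return client_id % n_databases

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is repeatable
    for client in range(3):
        db = assign_database(client, n_databases=8)
        print(client, db, commit_points(500, 100, 20, rng))
```

With 200 clients and 8 databases, round-robin placement puts 25 clients on each database. A production system would also have to handle uneven event sizes and node failures, which this sketch ignores.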
Miscellaneous • Current limitation • Mostly due to Lock Server CPU saturation • over 10K entries in the lock table • significant improvements coming in the next two releases • Very active cooperation with Objectivity
Optimizing Physics Analysis (1) • Environment much more difficult to understand • uncontrolled • over a hundred simultaneous users • Understanding the system • monitoring production FD • profiling jobs
Optimizing Physics Analysis (2) • Knowledge from the testbed applied • added more data servers and file systems • increased number of server processes • optimized data placement • Optimized access to tag & micro data • 35 Hz -> 2 kHz (iterating over tag benchmarks) • expected 25x bandwidth relative to August ’99
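To put the tag-iteration improvement in perspective, the sketch below computes the speedup from 35 Hz to 2 kHz and the wall-clock time a single job would need to iterate over 10^8 tags at each rate. The sample size of 10^8 is borrowed from the DST Analysis requirement for illustration and is not part of the benchmark itself.

```python
# Speedup of the tag-iteration benchmark and its effect on job duration.
# The 1e8-event sample size is an illustrative assumption.

rate_before_hz = 35.0     # from the slide
rate_after_hz = 2000.0    # from the slide (2 kHz)
n_events = 1e8            # assumed sample size

speedup = rate_after_hz / rate_before_hz

# Wall-clock time for one job to iterate over the whole sample.
days_before = n_events / rate_before_hz / 86400
hours_after = n_events / rate_after_hz / 3600

print(f"speedup: {speedup:.1f}x")
print(f"before: {days_before:.1f} days, after: {hours_after:.1f} hours")
```

Under these assumptions a pass that took about 33 days at 35 Hz takes about 14 hours at 2 kHz, a speedup of roughly 57x.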
Future Improvements • Optimization process will continue • new file systems / data servers • faster client computing nodes • reducing payload per event • next Objy releases • focused on performance • new features we asked for • e.g. read-only dbs • ?
Some Statistics • Servers • 12 primary (db servers) • 15 secondary (lock servers, catalog & journal servers) • DB personnel • 2 administrators • 5 primary developers • Data • ~33TB accumulated data • ~14K databases • ~28K collections • Disk Space • ~10TB • 513 persistent classes
Conclusions • Initial performance problems overcome • Robustness improved • Objectivity/DB is able to scale • if used wisely • We seem to have the largest DB in the world • and it is growing fast • has anybody else entered the multi-TB region yet?