C-JDBC: Flexible Database Clustering Middleware • Emmanuel Cecchet – INRIA • Julie Marguerite – ObjectWeb • Willy Zwaenepoel – EPFL
Motivations • Database tier should be • scalable • highly available • without modifying the client application • database vendor independent • on commodity hardware
Scaling the database tier – Alternative 1 (master-slave) • Cons • failover time on master failure • scalability • [Diagram labels: Internet, Web frontend, App. server, Master, Database]
Scaling the database tier – Alternative 2 (SMP) • Cons • Cost • Scalability limit • [Diagram labels: Internet, Web frontend, App. server; placeholders "Well-known database vendor here" and "Well-known hardware + database vendors here"]
Scaling the database tier – Alternative 3 (shared disks) • Cons • still expensive hardware • availability • [Diagram labels: Internet, Web frontend, App. server, Database, Disks; placeholder "Another well-known database vendor"]
Outline • C-JDBC architecture • High availability • Use cases • Conclusion
C-JDBC architectural overview • [Diagram labels: Application server, C-JDBC JDBC driver, C-JDBC controller, MySQL JDBC driver, MySQL database; the boxes are marked JVM]
Basic concepts • 2 components • C-JDBC driver • C-JDBC controller • 100% Java implementation • Read-one, write-all approach • Tunable replication • full partitioning • full replication • partial replication
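The read-one, write-all approach can be illustrated with a toy router. This is only a sketch under stated assumptions: the `ReadOneWriteAll` and `Backend` classes are hypothetical, full replication is assumed, reads are spread round-robin, and a statement counts as a read if it starts with SELECT. None of this is taken from the C-JDBC code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy request router: a read goes to one backend, a write goes to all of them. */
class ReadOneWriteAll {

    /** Hypothetical stand-in for a database backend; it only records what it was asked to run. */
    static class Backend {
        final String name;
        final List<String> executed = new ArrayList<>();

        Backend(String name) {
            this.name = name;
        }

        void execute(String sql) {
            executed.add(sql);
        }
    }

    private final List<Backend> backends;
    private int next = 0; // round-robin cursor used for reads

    ReadOneWriteAll(List<Backend> backends) {
        this.backends = backends;
    }

    /** Crude classification (an assumption): a statement starting with SELECT is a read. */
    static boolean isRead(String sql) {
        return sql.trim().toUpperCase().startsWith("SELECT");
    }

    /** Routes the statement and returns the names of the backends that executed it. */
    List<String> execute(String sql) {
        List<String> targets = new ArrayList<>();
        if (isRead(sql)) {
            // Read-one: pick a single backend, then advance the cursor.
            Backend b = backends.get(next);
            next = (next + 1) % backends.size();
            b.execute(sql);
            targets.add(b.name);
        } else {
            // Write-all: every backend receives the update.
            for (Backend b : backends) {
                b.execute(sql);
                targets.add(b.name);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Backend> backends = new ArrayList<>();
        backends.add(new Backend("db1"));
        backends.add(new Backend("db2"));
        ReadOneWriteAll router = new ReadOneWriteAll(backends);
        System.out.println(router.execute("SELECT * FROM t"));
        System.out.println(router.execute("INSERT INTO t VALUES (1)"));
    }
}
```

With partial replication or partitioning, the write loop would presumably be restricted to the backends that host the affected table; that refinement is left out here.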
Functional overview • connect myDB • connect login, password • execute SELECT * FROM t
Functional overview • execute INSERT INTO t …
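From the application's side, the connect and execute steps are ordinary JDBC calls. The sketch below assumes a controller reachable on `localhost` (a hypothetical host) and a driver class named `org.objectweb.cjdbc.driver.Driver`; the class name is an assumption to check against the C-JDBC documentation. `myDB`, the login/password pair and table `t` come from the slides.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Client-side view: plain JDBC, only the driver class and the URL are C-JDBC specific. */
class CjdbcClientSketch {
    // Assumed driver class name; verify against the C-JDBC documentation.
    static final String DRIVER = "org.objectweb.cjdbc.driver.Driver";
    // Hypothetical controller host; "myDB" is the database name used on the slide.
    static final String URL = "jdbc:cjdbc://localhost/myDB";

    /** Connects, runs the SELECT from the slide and counts the returned rows. */
    static int countRows(String url, String login, String password)
            throws SQLException, ClassNotFoundException {
        Class.forName(DRIVER); // fails if the driver jar is not on the classpath
        try (Connection c = DriverManager.getConnection(url, login, password);
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM t")) {
            int n = 0;
            while (rs.next()) {
                n++;
            }
            return n;
        }
    }
}
```

The point of the sketch is that the application code contains no clustering logic: switching from a native driver to the cluster should only change the two constants.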
Outline • C-JDBC architecture • High availability • Use cases • Conclusion
Failures • No 2-phase commit • parallel transactions • failed nodes are automatically disabled • execute INSERT INTO t …
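A minimal sketch of "failed nodes are automatically disabled", under assumptions that are not from the slides: the `Backend` class is hypothetical, the write is sent sequentially rather than in parallel, and the write is reported as successful as long as at least one backend executed it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy write path without 2-phase commit: a backend that fails a write is simply disabled. */
class FailureHandlingSketch {

    /** Hypothetical backend with a switch to simulate a crashed node. */
    static class Backend {
        final String name;
        boolean enabled = true;
        boolean broken = false; // set to true to simulate a failure
        final List<String> executed = new ArrayList<>();

        Backend(String name) {
            this.name = name;
        }

        void execute(String sql) {
            if (broken) {
                throw new IllegalStateException(name + " is down");
            }
            executed.add(sql);
        }
    }

    /** Sends the write to every enabled backend and returns how many succeeded. */
    static int writeAll(List<Backend> backends, String sql) {
        int succeeded = 0;
        for (Backend b : backends) {
            if (!b.enabled) {
                continue; // already disabled, skipped until it is restored
            }
            try {
                b.execute(sql);
                succeeded++;
            } catch (RuntimeException e) {
                b.enabled = false; // no rollback on the others: the failed node is taken out
            }
        }
        if (succeeded == 0) {
            throw new IllegalStateException("no backend could execute: " + sql);
        }
        return succeeded;
    }
}
```

A disabled backend then misses every later update, which is why it has to be resynchronized before it can be enabled again.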
Restoring a backend • Updates stored in the recovery log • Database dumps associated with checkpoints
Synchronization • Replay missing updates from log
Healed Cluster • Re-enable backend when done
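The restore, synchronize and re-enable steps can be sketched as follows. Everything here is hypothetical and simplified: a backend's state is modeled as the list of statements it has applied, a dump is a copy of that list taken at a checkpoint, and the log is kept in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy recovery log: ordered updates plus named checkpoints pointing into the log. */
class RecoveryLogSketch {
    private final List<String> updates = new ArrayList<>();
    private final Map<String, Integer> checkpoints = new HashMap<>();

    void logUpdate(String sql) {
        updates.add(sql);
    }

    /** Marks a checkpoint at the current end of the log; a dump taken now matches this position. */
    void checkpoint(String name) {
        checkpoints.put(name, updates.size());
    }

    /** Returns the updates logged after the named checkpoint, in their original order. */
    List<String> updatesSince(String name) {
        Integer position = checkpoints.get(name);
        if (position == null) {
            throw new IllegalArgumentException("unknown checkpoint: " + name);
        }
        return new ArrayList<>(updates.subList(position, updates.size()));
    }

    /** Hypothetical backend whose state is the list of statements applied so far. */
    static class Backend {
        List<String> applied = new ArrayList<>();
        boolean enabled = false;
    }

    /** Restore from the dump, replay the missing updates, then re-enable the backend. */
    static void restore(Backend b, List<String> dump, RecoveryLogSketch log, String checkpointName) {
        b.enabled = false;
        b.applied = new ArrayList<>(dump);                  // 1. load the dump
        b.applied.addAll(log.updatesSince(checkpointName)); // 2. replay missing updates
        b.enabled = true;                                   // 3. re-enable when done
    }
}
```

The sketch ignores updates that arrive while the replay is running; a real implementation has to handle that window before re-enabling the backend.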
Vertical scalability • Addresses JVM scalability issues • Distributes a large number of connections over many backends
Controller replication • Prevent the controller from being a single point of failure • Group communication for controller synchronization
Controller replication • Driver URL: jdbc:cjdbc://node1,node2/myDB • Total order reliable multicast
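The URL above lists several controllers for one database. A minimal sketch of how a client could split such a URL and fall back to the next controller is shown below. The class and method names are hypothetical, ports and URL parameters are not handled, and the "first reachable controller" policy is an assumption, not the driver's documented behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Toy parser for URLs of the form jdbc:cjdbc://host1,host2/database. */
class ControllerUrlSketch {
    static final String PREFIX = "jdbc:cjdbc://";

    /** Returns the controller list, e.g. [node1, node2]. */
    static List<String> controllers(String url) {
        String rest = stripPrefix(url);
        return Arrays.asList(rest.substring(0, rest.indexOf('/')).split(","));
    }

    /** Returns the database name, e.g. myDB. */
    static String database(String url) {
        String rest = stripPrefix(url);
        return rest.substring(rest.indexOf('/') + 1);
    }

    private static String stripPrefix(String url) {
        if (!url.startsWith(PREFIX) || url.indexOf('/', PREFIX.length()) < 0) {
            throw new IllegalArgumentException("not a C-JDBC URL: " + url);
        }
        return url.substring(PREFIX.length());
    }

    /** Assumed failover policy: take the first controller that is not known to be unreachable. */
    static String firstReachable(List<String> controllers, Set<String> unreachable) {
        for (String c : controllers) {
            if (!unreachable.contains(c)) {
                return c;
            }
        }
        throw new IllegalStateException("no controller reachable");
    }
}
```

For the URL on the slide, `controllers` yields `node1` and `node2` and `database` yields `myDB`; if `node1` is marked unreachable, `firstReachable` falls back to `node2`.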
Current limitations • Replication granularity is the table • No distributed joins • Network partition with replicated controllers • JDBC only • support of PHP, Perl, ODBC through wrappers or bridges • partial support of JDBC 3.0
Other features • SSL support • Support for heterogeneous databases • SQL monitoring • JMX based administration console • Request player
Outline • C-JDBC architecture • High availability • Use cases • Conclusion
Budget High Availability • High availability infrastructure “on a budget” • Typical eCommerce setup • http://www.budget-ha.com
OpenUSS: University Support System • eLearning • High availability • Portability • Linux, HP-UX, Windows • InterBase, Firebird, PostgreSQL, HypersonicSQL • http://openuss.sourceforge.net
Flood alert system • Disaster recovery • Independent nodes synchronized with C-JDBC • VPN for security issues • http://floodalert.org
J2EE benchmarking • Large-scale J2EE clusters • http://jmob.objectweb.org • [Diagram labels: Internet, emulated users]
Outline • C-JDBC architecture • High availability • Use cases • Conclusion
Conclusion • C-JDBC: Flexible Database Clustering Middleware • scalable • highly available • without modifying the client application • database vendor neutral • on commodity hardware • LGPL software hosted by ObjectWeb
Q&A • Thanks to all users and contributors ... • http://c-jdbc.objectweb.org
TPC-W benchmark (Amazon.com) • Nearly linear speedups with the shopping mix
Result cache • Cache contains a list of SQL->ResultSet • Policy defined by queryPattern->Policy • 3 policies • EagerCaching: variable granularities for invalidations • RelaxedCaching: invalidations based on timeout • NoCaching: never cached
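The three policies can be sketched with a small in-memory cache. The assumptions are mine, not the slides': invalidation for eager entries is done at table granularity only, the caller passes in the policy (the queryPattern matching is not modeled), the clock is passed explicitly as `now`, and a result set is represented as a list of strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy SQL -> result cache with eager, relaxed and no-caching policies. */
class ResultCacheSketch {
    enum Policy { EAGER, RELAXED, NO_CACHING }

    private static class Entry {
        final List<String> rows;
        final String table;
        final Policy policy;
        final long expiresAt;

        Entry(List<String> rows, String table, Policy policy, long expiresAt) {
            this.rows = rows;
            this.table = table;
            this.policy = policy;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long timeoutMillis;

    ResultCacheSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Stores a result unless the policy forbids caching. */
    void put(String sql, String table, Policy policy, List<String> rows, long now) {
        if (policy == Policy.NO_CACHING) {
            return;
        }
        // Only relaxed entries expire on a timeout; eager entries live until invalidated.
        long expiresAt = (policy == Policy.RELAXED) ? now + timeoutMillis : Long.MAX_VALUE;
        entries.put(sql, new Entry(rows, table, policy, expiresAt));
    }

    /** Returns the cached rows, or null on a miss or an expired entry. */
    List<String> get(String sql, long now) {
        Entry e = entries.get(sql);
        if (e == null) {
            return null;
        }
        if (now >= e.expiresAt) {
            entries.remove(sql);
            return null;
        }
        return e.rows;
    }

    /** Called after a write to the table: eager entries are dropped, relaxed ones wait for their timeout. */
    void invalidateTable(String table) {
        entries.values().removeIf(e -> e.policy == Policy.EAGER && e.table.equals(table));
    }
}
```

The trade-off shows in `invalidateTable`: eager entries never serve stale data but are dropped on every write to their table, while relaxed entries may be stale for up to the timeout.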
Recovery log • All updates are stored in the recovery log • Database dumps associated with checkpoints
Making new checkpoints • Disable one backend to have a coherent snapshot • Mark the new checkpoint entry in the log • Use Octopus to store the dump
Making new checkpoints • Replay missing updates from log
Making new checkpoints • Re-enable backend when done
Handling a backend failure • A node fails! • The backend is disabled automatically, but an administrator must repair or replace it
Fault tolerant recovery log • [Diagram label: UPDATE statement]