Cofax Scalability Document Version 1.0
Scaling Cofax in General
• The scalability of Cofax is directly related to the system software, hardware, and network environment in which it is installed.
• The analysis did not find any scalability bottlenecks in the Cofax design or code itself. It confirmed that Cofax was designed to scale.
• The process did, however, result in significant reliability and performance improvements to our server hosting facility and database setup.
• Cofax was developed to allow the rapid deployment of new hardware resources according to demand.
• Cofax is not dependent on any one system installation architecture. It scales by reconfiguring the environment to meet current needs.
Scaling Cofax at PNI
• Description of Installation v2.0.atPNI
• Strengths of Installation v2.0.atPNI
• Weaknesses of Installation v2.0.atPNI
• Description of Installation v3.0.atPNI
• Strengths of Installation v3.0.atPNI
• Addressing concerns about Installation v3.0.atPNI
Installation v2.0.atPNI
[Diagram: a File Server; HTTP Server, Cofax Server, and File Storage nodes; additional Cofax Servers; one Database Server]
Strengths of Installation v2.0.atPNI
• HTTP Servers are distributed.
• Cofax Application Servers are distributed.
• Load is balanced between multiple servers.
• Load can be distributed as necessary.
• Once the hardware and OS are in place, additional servers can be manually configured and running in minutes.
• Very suitable for an ISP that can automatically add or remove servers on the fly.
Weaknesses of Installation v2.0.atPNI
• The database server is a single point of failure:
  • on the serving side, and
  • on the updating side.
• There is a limit to how much hardware we can add to the single database server.
  • The incremental performance gain from adding more computing resources (CPUs, memory, disk space) to the single server diminishes beyond a certain point.
• A single machine, no matter how powerful, can still fail from common types of problems (locked data in a table, runaway processes, memory leaks, etc.).
Weaknesses of Installation v2.0.atPNI (continued)
• There is a practical limit to how much optimization we can do on a single machine.
• There is also a limit to how much optimization people are willing to do.
• The required optimizations change over time:
  • when the server setup changes, and
  • when the usage patterns change.
Installation v3.0.atPNI
[Diagram: a File Server; HTTP Server, Cofax Server, and File Storage nodes; Editing Database Servers; Serving Database Servers]
Strengths of Installation v3.0.atPNI
• The model is already tested.
• Only reasonable optimizations are required.
• The serving database is replicated across multiple physical servers.
• There is no single point of failure on the serving side.
• Data transformation is isolated from data retrieval.
Implementing the Distributed Database
• Requires no design changes to the Cofax framework.
• Requires no changes to the Java code or software application.
• Requires configuration changes only.
• Requires the addition of new hardware resources: database servers, Tomcat servers, and web servers.
Upgrading from the Single Database Model to the Distributed Database Model
• The “editing” and “serving” databases are separated.
• The front-end (serving) database can be replicated across multiple physical servers.
• Additional databases can be brought online as needed.
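The separation above can be sketched as a small router that sends writes to the editing database and rotates reads across the serving databases. This is only an illustration of the idea: the `DatabaseRouter` class and the host names are hypothetical, and the slides state that the real change is made through configuration rather than code.

```python
from itertools import cycle


class DatabaseRouter:
    """Hypothetical router: one editing database, many serving databases."""

    def __init__(self, editing_host, serving_hosts):
        if not serving_hosts:
            raise ValueError("at least one serving database is required")
        self.editing_host = editing_host
        self._serving_hosts = list(serving_hosts)
        self._next_serving = cycle(self._serving_hosts)

    def host_for(self, operation):
        """Return the host that should handle a 'read' or a 'write'."""
        if operation == "write":
            return self.editing_host
        if operation == "read":
            return next(self._next_serving)
        raise ValueError(f"unknown operation: {operation!r}")

    def add_serving_host(self, host):
        """Bring an additional serving database online."""
        self._serving_hosts.append(host)
        # Restart the rotation so the new host is included.
        self._next_serving = cycle(self._serving_hosts)


# Host names are made up for the example.
router = DatabaseRouter("edit-db-1", ["serve-db-1", "serve-db-2"])
reads = [router.host_for("read") for _ in range(4)]
```

Here `reads` alternates between the two serving hosts, while every write goes to `edit-db-1`. Calling `add_serving_host` models bringing another database online without touching the calling code.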
A proven model that is able to serve very high traffic
• Additional database servers can be added to handle growing web site traffic.
  • E.g. 100 million or more dynamic page views a day.
• Can house large amounts of content.
  • Disk storage continues to become cheaper.
  • E.g. 10 years’ worth of content from 100 daily newspapers.
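To put the 100 million page views a day in perspective, the figure can be converted to a per-second rate and then to a rough count of serving databases. The peak factor and the per-database capacity below are assumptions chosen only for illustration; they are not measurements from the PNI installation.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def average_views_per_second(views_per_day):
    """Average request rate if traffic were spread evenly over the day."""
    return views_per_day / SECONDS_PER_DAY


def serving_databases_needed(views_per_day, peak_factor, views_per_second_per_db):
    """Rough sizing: peak rate divided by one database's capacity, rounded up."""
    peak_rate = average_views_per_second(views_per_day) * peak_factor
    return math.ceil(peak_rate / views_per_second_per_db)


avg = average_views_per_second(100_000_000)
# Assumed values: peak traffic is 3x the average, and one serving
# database sustains 500 page views per second.
needed = serving_databases_needed(
    100_000_000, peak_factor=3.0, views_per_second_per_db=500
)
```

With these assumed inputs, the average is about 1,157 page views per second and the estimate comes to 7 serving databases. Changing either assumption changes the count, which is the point of being able to add servers as needed.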
Replication Issues Addressed
• The database replication model is based on two observations:
• The number of “reads” from the data store outweighs the number of “writes”.
  • E.g. a data store that has 10 million records read from it in an hour is likely to have no more than 10 thousand records written to it.
• The number of new records being added or deleted outweighs the number of current records being updated.
  • E.g. a data store that has 10 thousand new records added to it in an hour is likely to have between one hundred and two thousand existing records updated in that time.
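The read-heavy ratio is what makes replication pay off: reads are shared among the serving databases, while every write must be applied on each of them. A minimal sketch using the figures from the example above, under the simplifying assumption that a read and a write cost the same:

```python
def per_replica_load(reads_per_hour, writes_per_hour, replicas):
    """Reads are shared across replicas; every replica applies every write."""
    reads_each = reads_per_hour / replicas
    return reads_each + writes_per_hour


reads, writes = 10_000_000, 10_000
read_write_ratio = reads // writes

# Operations per hour handled by each replica, for several replica counts.
loads = {n: per_replica_load(reads, writes, n) for n in (1, 2, 4, 8)}
```

At a 1000:1 read/write ratio, going from one replica to eight cuts the per-replica load from roughly 10 million to about 1.26 million operations an hour. The replicated writes only start to dominate once the replica count approaches the read/write ratio itself.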
Latency Issues Addressed
• Updates from the Editing databases to the Serving databases are transactional. As transactions occur on the editing database tables, they are replicated on the serving machines.
• The transactional model means there is almost no latency between the editing and serving machines.
• Data is de-normalized and optimized for fast serving on the Editing databases. These fast-access tables are then sent to the Serving databases.
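The transactional flow can be illustrated with an in-memory sketch: the editing side records each committed transaction in order, and a serving replica applies the transactions it has not yet seen, each one as a whole. The slides do not name the database product or its replication mechanism, so the classes below are hypothetical stand-ins for the idea, not the actual setup.

```python
class EditingDatabase:
    """Hypothetical editing database that logs committed transactions."""

    def __init__(self):
        self.rows = {}
        self.log = []  # committed transactions, in commit order

    def commit(self, changes):
        """Apply a list of (key, value) changes; a value of None deletes the key."""
        for key, value in changes:
            if value is None:
                self.rows.pop(key, None)
            else:
                self.rows[key] = value
        self.log.append(list(changes))


class ServingReplica:
    """Hypothetical serving database that replays the editing log."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already applied

    def catch_up(self, log):
        """Apply every transaction not yet seen, one whole transaction at a time."""
        for changes in log[self.applied:]:
            staged = dict(self.rows)  # work on a copy
            for key, value in changes:
                if value is None:
                    staged.pop(key, None)
                else:
                    staged[key] = value
            self.rows = staged  # swap in the completed transaction
            self.applied += 1


editing = EditingDatabase()
replica = ServingReplica()
editing.commit([("article:1", "Draft headline")])
editing.commit([("article:1", "Final headline"), ("article:2", "Sports recap")])
editing.commit([("article:2", None)])
replica.catch_up(editing.log)
```

After `catch_up`, the replica holds the same rows as the editing database. Because each transaction is staged on a copy and swapped in at once, readers of the replica never see a half-applied transaction.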
Conclusion
• Because of its flexible framework, Cofax can scale to meet any demand.
• Scaling requires only the addition of hardware resources and minor configuration changes.
• The current installation changes took only a few days to implement and bring online.