Prepare to Scale
Bill O’Connor, CTO
d.o.: csevb10
Basic Infrastructure
• Single-Server
  • Database & Application on the same server
• Start optimizing what you have
  • Apache
  • Drupal
  • PHP
  • Database
• Optimizations you make for the first server will be applicable for future servers
• Strategy: Optimize what you have, then divert traffic through caching and specialization.
Drupal
• 1-word: Pressflow
  • Support for Database Replication
  • Support for Squid/Varnish
  • MySQL optimizations
  • PHP5 optimizations
• http://fourkitchens.com/pressflow-makes-drupal-scale/downloads
DB
• MyISAM
  • Default storage engine for <= Drupal 6
  • Good for selects
  • Suits mostly read-only websites
  • Poor read/write performance for large websites
DB, Cont.
• Falcon
  • Beta-stage project for MySQL
  • Different performance characteristics than other engines (both + & -)
  • Not ready for primetime, but worth watching
DB, Cont.
• InnoDB is your friend in most scenarios.
  • Row-level locking (vs. table-level locking in MyISAM)
  • Improves read/write performance
  • Does slow pure reads to some degree
  • Easier to do it right from the start than to revisit the issue later when you have users and traffic
  • Default storage engine of Drupal 7+
  • Best bet at the moment for allowing your site to scale
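As a rough sketch of what "doing it right from the start" can look like, the snippet below only generates the MySQL `ALTER TABLE ... ENGINE=InnoDB` statements for a list of tables. The function name and the table names are illustrative assumptions, not taken from the slides. Review the output and back up the database before running any of it.

```python
def innodb_conversion_statements(tables):
    """Return one ALTER TABLE statement per table name.

    The table names passed in are assumptions for illustration; take the
    real list from your own database (e.g. from SHOW TABLE STATUS).
    """
    statements = []
    for name in tables:
        # Backtick-quote the identifier, doubling any embedded backtick.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} ENGINE=InnoDB;")
    return statements


# Hypothetical table names, printed for review rather than executed.
for statement in innodb_conversion_statements(["node", "users", "comments"]):
    print(statement)
```

Generating the statements instead of executing them keeps the sketch free of any database driver and lets you convert one table at a time.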
PHP
• Opcode caching
  • Sort of like having a compiled version of your application
  • Optimizes your application
  • Stores the compiled PHP bytecode in shared memory for execution
  • Result: Smaller PHP memory footprint (read: more users with less hardware) and faster execution of code.
  • Virtually a necessity for any large-scale/high-volume Drupal deployment
PHP, Cont.
• Opcode caching
  • eAccelerator
    • Off & on maintenance
    • Only works with threadsafe PHP
    • Has, in my experience, led to some strange crashing, WSOD, etc.
  • XCache
    • Reasonable performance improvement, though it tends to benchmark slowest of the 3
    • Actively maintained.
    • Stable, but still prone to cache corruption, WSOD, etc.
PHP, Cont.
• Opcode caching, cont.
  • APC
    • Current opcode cache of choice.
    • Most actively updated.
    • Most stable of the 3.
    • Usually the winner in performance benchmarks.
    • Maintained by core PHP developers (Rasmus).
Static Caching
• Static Caching Modules
  • Creating and storing rendered versions of the HTML
    • Rather than building the page on request
  • Avoids having to load any aspect of your application, depending on the implementation
  • Acts as a layer between the user and actual execution of your program
  • Alleviates DB issues since the DB is no longer involved
  • Simplifies any PHP execution
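The layer described above can be sketched as a small cache that stores rendered HTML by path and only calls the application on a miss. This is a minimal sketch, not how any particular Drupal module works. `StaticPageCache`, its methods, and the rule of bypassing the cache for logged-in requests are all assumptions for illustration.

```python
class StaticPageCache:
    """Serve stored HTML when possible; render only on a cache miss."""

    def __init__(self, render):
        self._render = render  # callable(path) -> HTML; stands in for the app
        self._pages = {}       # path -> rendered HTML

    def get(self, path, logged_in=False):
        if logged_in:
            # Assumption: user-specific pages always bypass the cache.
            return self._render(path)
        if path not in self._pages:
            self._pages[path] = self._render(path)  # build once, then store
        return self._pages[path]

    def invalidate(self, path):
        """Drop a stored page, e.g. after its content has been edited."""
        self._pages.pop(path, None)


render_log = []


def render_page(path):
    render_log.append(path)  # record each time the "application" runs
    return f"<html><body>Page at {path}</body></html>"


cache = StaticPageCache(render_page)
cache.get("/about")
cache.get("/about")  # served from the cache; the application is not called
print(len(render_log))
```

The second request never reaches `render_page`, which is the point: neither the database nor the application code is involved in serving it.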
Static Caching, Cont.
• Static Caching Modules, Cont.
  • Boost Module
    • Static file caching
    • Good for Anonymous traffic only
    • Great performance for small sites
    • Ideal for shared hosts
  • AuthCache Module
    • Static file caching
    • Attempts to handle logged-in traffic
    • Plays nice with and/or can utilize multiple caching engines (more on those later)
    • Can be a bit of a pain for user-specific content as you have to write particular cases for each user-specific area
Static Caching, Cont.
• Static Caching Modules, Cont.
  • Shameless plug: Ajaxify Regions
    • Aptly named... or not.
    • Actually pulls Blocks, not Regions, via ajax
    • Early release w/ plenty of work to do, needs more real-world testing, etc.
    • Automatically handles all user-specific block content based on block-caching settings
      • BLOCK_NO_CACHE
      • BLOCK_CACHE_PER_USER
      • BLOCK_CACHE_PER_ROLE
    • Concept: ajax load anything that can’t be cached for everyone.
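The concept on this slide can be sketched as a simple split: blocks whose cache setting is one of the three listed above are loaded via ajax, and everything else stays in the statically cached page. This is a sketch of the idea only, not the module's code. The enum, the function, and the block names are assumptions. `BLOCK_CACHE_PER_PAGE` and `BLOCK_CACHE_GLOBAL` are other Drupal block-cache constants that the slide does not mention.

```python
from enum import Enum


class BlockCache(Enum):
    """Labels mirroring Drupal's block cache constants (by name only)."""
    NO_CACHE = "BLOCK_NO_CACHE"
    PER_USER = "BLOCK_CACHE_PER_USER"
    PER_ROLE = "BLOCK_CACHE_PER_ROLE"
    PER_PAGE = "BLOCK_CACHE_PER_PAGE"  # not listed on the slide
    GLOBAL = "BLOCK_CACHE_GLOBAL"      # not listed on the slide


# The three settings from the slide: content that differs per visitor.
AJAX_LOADED = {BlockCache.NO_CACHE, BlockCache.PER_USER, BlockCache.PER_ROLE}


def split_blocks(blocks):
    """Split {block name: BlockCache} into (cached for everyone, ajax-loaded)."""
    cached, ajax = [], []
    for name, setting in sorted(blocks.items()):
        (ajax if setting in AJAX_LOADED else cached).append(name)
    return cached, ajax


# Hypothetical block names for illustration.
cached, ajax = split_blocks({
    "footer": BlockCache.GLOBAL,
    "user_menu": BlockCache.PER_USER,
    "poll": BlockCache.NO_CACHE,
})
print("in static page:", cached)
print("loaded via ajax:", ajax)
```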
Object-level Caching
• Object-level caching
  • Provides a way to store fully generated objects
    • Can be the amalgam of many queries
    • Think of all the queries run on a node_load vs retrieving all that information in 1 query.
  • Stores the information in memory for fast access
  • Performance characteristics not significantly different than MySQL when MySQL can handle the load
    • BUT can handle a much higher load
  • Protects the DB (the area most likely to inhibit performance for Drupal) from becoming overwhelmed
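The node_load comparison can be sketched as follows: the first load runs several queries and stores the assembled object, and later loads are a single cache lookup. This is a minimal sketch and not Drupal's implementation. `FakeDatabase`, the in-process dictionary standing in for the cache, and the table names are illustrative assumptions.

```python
class FakeDatabase:
    """Stand-in for MySQL that only counts how many queries were run."""

    def __init__(self):
        self.queries = 0

    def query(self, table, nid):
        self.queries += 1
        return {"table": table, "nid": nid}


object_cache = {}  # stands in for APC or memcache


def node_load(db, nid):
    """Return the fully built node, hitting the database only on a miss."""
    key = f"node:{nid}"
    if key in object_cache:
        return object_cache[key]  # one lookup instead of several queries
    node = {
        "base": db.query("node", nid),
        "revision": db.query("node_revisions", nid),
        "terms": db.query("term_node", nid),
    }
    object_cache[key] = node  # store the assembled object for next time
    return node


db = FakeDatabase()
node_load(db, 42)
node_load(db, 42)
print("queries run:", db.queries)
```

A real deployment also needs to clear the cached entry whenever the node is saved, which the sketch leaves out.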
Object-level Caching, Cont.
• Object-level caching, Cont.
  • APC
    • Not a typo.
    • APC can handle object caching as well as opcode caching.
    • It’s fast: everything is stored in local memory.
    • It caches only for one server.
      • This means that you could have synchronization issues between servers if you have more than one.
      • If that’s not an issue, it’s a quick and easy solution.
    • Ideal for single-server implementations or when synchronization isn’t an issue.
Object-level Caching, Cont.
• Object-level caching, Cont.
  • Memcache
    • Utilized by most high-profile sites.
      • Facebook, for instance, makes tremendous use of lots and lots of memcache servers.
      • Drupal.org uses it.
    • Provides an object cache that can be used by multiple servers.
    • Slower in the single-server case than APC, but keeps servers synchronized.
    • Multiple silos/buckets can be created for information, so you can distribute information across multiple servers.
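One simple way to picture distributing cached information across servers is to hash each cache key and use the result to pick a server. The sketch below uses plain modulo hashing, which is an assumption for illustration. Real memcache client libraries may use other schemes, such as consistent hashing. The server addresses and key names are hypothetical.

```python
import zlib


def pick_server(key, servers):
    """Map a cache key to one server; the same key always maps the same way."""
    checksum = zlib.crc32(key.encode("utf-8"))
    return servers[checksum % len(servers)]


# Hypothetical memcache servers.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

for key in ("node:1", "node:2", "user:7"):
    print(key, "->", pick_server(key, servers))
```

Because every web server computes the same mapping, they all read and write a given key on the same memcache server, which is what keeps them in sync.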
Advanced Infrastructure (ex)
[Diagram: Load Balancer, Static-Caching, Application, Database, Solr, Memcache, Deployment, Slave DB]
Specialization
• Specialized Servers/Services
  • DB Server
  • Solr
  • Memcache
  • Static-caching
  • CDN
Specialization
• MySQL Server
  • One of the fastest ways to improve performance is to separate your MySQL DB from your application
  • This allows both your application and your DB to make full use of independent hardware
  • The change is basically transparent at the application layer: just a single change to settings.php
Specialization
• Search
  • Problem: Search is incredibly hard on the system
    • Particularly w/ multiple search terms
  • Drupal search works but, despite great efforts, is still not as quick or useful as an outside solution
  • Search is particularly hard on the DB, Drupal’s traditional bottleneck
    • In other words, search makes a bad problem worse
Specialization
• Search, Cont.
  • Solution: Solr
    • Communication layer between the website and the Lucene search index
    • Offloads all of the complex processing to a separate box
      • More power for searches (search faster!)
      • Doesn’t lock up your website DB
      • Website can focus on what it does, search can focus on what it does
    • Additional benefit: faceting (filtering), sorting
      • Ability to search content based on specific criteria (content type, author, taxonomy terms) and sort based on criteria (title, date, author, content type)
    • Hosted model (Acquia Search) or can be installed on a server in your infrastructure
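To make faceting and sorting concrete, the sketch below builds a Solr select URL using Solr's standard `q`, `fq`, `facet`, `facet.field`, and `sort` request parameters. It only constructs the URL and sends nothing. The base URL, the field names (`type`, `created`), and the helper function are assumptions for illustration. The real field names depend on your Solr schema.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, filters=None, facet_fields=None, sort=None):
    """Build a Solr select URL with optional filters, facets, and sorting."""
    params = [("q", text), ("wt", "json")]
    for field, value in (filters or {}).items():
        params.append(("fq", f"{field}:{value}"))  # filter by a criterion
    if facet_fields:
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))  # count results per value
    if sort:
        params.append(("sort", sort))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical Solr location and field names.
url = build_solr_query(
    "http://localhost:8983/solr",
    "drupal scaling",
    filters={"type": "article"},
    facet_fields=["type"],
    sort="created desc",
)
print(url)
```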
Specialization
• Static Caching
  • Static-caching on the same server as the website provides performance improvement
    • Downside: there’s still a lot of wasted overhead. Apache loads everything it needs for a full website, not just for serving HTML; PHP also has to load.
  • Static-caching elsewhere provides the opportunity to optimize the server for static-caching
    • Side effect: your web server now has more memory free to handle requests that require PHP processing.
Specialization
• Static Caching, Cont.
  • Squid
    • Free
    • Not specifically designed just for HTTP acceleration
    • Difficult to set up/configure
    • Performance improvement, but less than the competition
Specialization
• Static Caching, Cont.
  • Varnish
    • Free (to download)
    • Pressflow built to work w/ Varnish
    • Varnish servers set up for Drupal are available on Amazon EC2 (developed by Chapter 3) ($.34/hr + $.17/GB)
    • Designed from the ground up for HTTP acceleration
    • Can take time/expertise to get the performance you want
    • Can create a significant performance improvement once configured correctly
Specialization
• Static Caching, Cont.
  • aiCache
    • Best performance of the bunch
    • Simple configuration
    • Provides additional features for caching
      • header recognition
      • session caching
    • Drop-in solution
    • Not free
    • Amazon EC2 instance is available ($.68/hr + $.20/GB)
Specialization
• CDN
  • Cache content that is static (outside of full pages)
    • Images
    • Video
    • CSS
    • JS
  • Popular examples
    • Akamai
    • Limelight
    • Amazon CloudFront
  • Separate domains, more bandwidth, geographic servers all equal faster loading
  • Can be an expensive option
Summary
• Start small and make the easy optimizations:
  • Pressflow
  • InnoDB
  • APC
• Add servers and services as necessary and based on individual traffic:
  • MySQL
  • Solr
  • Memcache
  • Static Cache
  • CDN
The End.
• Questions?