Scalable Web Site Antipatterns Justin Leitgeb Stack Builders Inc.
Overview • Based on architectures that have caused significant downtime and pain • Similar to the examples in Nygard's book, but with more emphasis on essential rather than accidental properties of the system
Anti-pattern 1: Monotonically increasing data set with rapid growth • The system relies on querying all historical data • Requires joins against mega-tables (hundreds of millions of rows) • The data is often produced by automatic aggregation
Detection • Slow query log • SHOW FULL PROCESSLIST • SHOW ENGINE INNODB STATUS • vmstat
Anti-solutions • Partitioning • Pre-caching (cron jobs) • Switching to MyISAM • NoSQL?
NoSQL • Out-of-the-box NoSQL solutions (e.g., MongoDB) help with data modeling • Trade ACID guarantees for the availability and partition-tolerance side of the CAP trade-off • May make it easier to distribute algorithms • But: • Their engines have not yet had as much effort invested in them as MySQL (InnoDB) • They often use the same algorithms (e.g., B-tree indexes) • They can require more dev time (e.g., Cassandra and a good implementation of distributed algorithms)
Stop the bleeding • Cut off long queries • Turn off site sections • Fail whale
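Cutting off long queries can be sketched roughly as follows. This is a minimal illustration, not a production tool: it assumes the rows of SHOW FULL PROCESSLIST have already been fetched as dictionaries keyed by that command's `Id`, `Command`, `Time`, and `Info` columns. The function name `kill_statements` and the 30-second threshold are hypothetical.

```python
from typing import Dict, List

# Hypothetical threshold: reads running longer than this are cut off.
MAX_QUERY_SECONDS = 30


def kill_statements(processlist: List[Dict], max_seconds: int = MAX_QUERY_SECONDS) -> List[str]:
    """Return KILL QUERY statements for SELECTs running longer than max_seconds."""
    statements = []
    for row in processlist:
        # Skip idle connections and other non-query threads.
        if row["Command"] != "Query":
            continue
        sql = (row.get("Info") or "").lstrip().upper()
        # Only target reads; killing a write mid-flight can cause a long rollback.
        if not sql.startswith("SELECT"):
            continue
        if row["Time"] > max_seconds:
            statements.append(f"KILL QUERY {row['Id']};")
    return statements
```

The returned statements would then be run against the server by an operator or a watchdog script. Restricting the cut-off to SELECTs is a design choice of this sketch, not something the slides prescribe.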
Band-aids • The obvious ones: adding app servers, memcached, a bigger DB server • But adding app servers puts more pressure on the DB server • HTTP caching (Varnish) • MySQL tuning (look for things like "Using filesort" in EXPLAIN output) • Read slaves
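The memcached band-aid usually takes the form of a read-through cache in front of the expensive query. The sketch below uses an in-process dictionary as a stand-in for memcached so that it runs without a server; `TTLCache` and `get_or_load` are hypothetical names, and a real deployment would use a memcached client instead.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for memcached: entries expire after ttl seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: the database is not touched
        value = loader()  # cache miss or expired entry: run the expensive query
        self._store[key] = (now + self.ttl, value)
        return value
```

This only postpones the problem: every expiry still runs the full query against the growing data set, which is why the slides file it under band-aids rather than solutions.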
Solutions • Hard-limit data volume - look for cases where data decreases in value with time • Add features related to scale • Distributed algorithms and data stores • Data warehousing
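One way to hard-limit data volume is a retention job that deletes rows older than a cutoff in small batches, so that no single transaction holds locks for long. The sketch below uses SQLite from the Python standard library purely so that it runs anywhere; the `events` table, its `id` and `created_at` columns, and the function name `prune_old_rows` are assumptions made for illustration.

```python
import sqlite3


def prune_old_rows(conn: sqlite3.Connection, cutoff: str, batch_size: int = 1000) -> int:
    """Delete events with created_at < cutoff in batches; return rows deleted."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM events WHERE id IN ("
            "  SELECT id FROM events WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # commit per batch to keep each transaction short
        if cur.rowcount == 0:
            return total  # nothing older than the cutoff remains
        total += cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT)")
    conn.executemany(
        "INSERT INTO events (created_at) VALUES (?)",
        [("2023-01-01",), ("2023-06-01",), ("2024-03-01",)],
    )
    print(prune_old_rows(conn, "2024-01-01"))
```

The cutoff is compared as an ISO-8601 date string, which sorts chronologically. Whether old rows are deleted outright or first moved to a data warehouse depends on how much value they retain.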
Anti-pattern 2: Allowing "risky" writes to block HTTP responses • Symptoms: • Slow requests • Servers hitting MaxClients and returning 500 errors
Possible Causes • Database-backed analytics tracking • Session management • Any SQL DML (INSERT, UPDATE, DELETE)
Risk increases with: • The number of requests invoking the write operation • Traffic • Concurrent background operations • The algorithmic complexity of the write • Slow AWS I/O on EBS
Solutions • Asynchronize! • Write to a queue • Write to memcached or other non-ACID store • Later bring to data warehouse for advanced analytics
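The "write to a queue" idea can be sketched as below: the request handler only enqueues the write and returns immediately, while a background worker performs the slow write later. This is a minimal in-process illustration using Python's standard `queue` and `threading` modules; the names `handle_request`, `start_worker`, and `stored` are hypothetical, and the `stored` list stands in for the real database or data warehouse.

```python
import queue
import threading

write_queue: queue.Queue = queue.Queue()
stored = []  # stand-in for the slow, risky backing store


def handle_request(page: str) -> str:
    """Enqueue the analytics write and respond without waiting for it."""
    write_queue.put({"page": page})
    return "200 OK"


def _worker() -> None:
    while True:
        item = write_queue.get()
        try:
            if item is None:  # sentinel: shut the worker down
                return
            stored.append(item)  # the slow write happens off the request path
        finally:
            write_queue.task_done()


def start_worker() -> threading.Thread:
    """Start the background writer and return its thread."""
    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```

An in-process queue loses pending writes if the process dies, so a durable queue or message broker would likely be preferable wherever the data matters. The point of the sketch is only that the HTTP response no longer waits on the write.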
More info • Nygard, Michael T. Release It!: Design and Deploy Production-Ready Software. Raleigh, NC: Pragmatic, 2007. • Fowler, Martin. Patterns of Enterprise Application Architecture. Boston: Addison-Wesley, 2003. • Kimball, Ralph. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. John Wiley & Sons, 2010. • Schwartz, Baron. High Performance MySQL. O'Reilly, 2008.