670 likes | 791 Views
Web Performance Optimization: scaling the witchcrafts. Xiaoliang “David” Wei www.facebook.com/DavidWei. Agenda. A web site is a distributed system. A web page request can involve many parties: Client browser Web server (and backend caches, DBs ) Content distributor.
E N D
Web Performance Optimization: scaling the witchcrafts Xiaoliang “David” Wei • www.facebook.com/DavidWei
A web site is a distributed system • A web page request can involve many parties: • Client browser • Web server (and backend caches, DBs) • Content distributor
Life of a web page request • From a user clicking event to the presentation of the page at the browser: Browsers Web Servers
Life of a web page request • From a user clicking event to the presentation of the page at the browser: • time to send to and receive from web server through Internet Browsers • ½ NetTime Web Servers
Life of a page request • From a user clicking event to the presentation of the page at the browser: • time to send to and receive from web server through Internet • time to generate HTML at web server Browsers • ½ NetTime Web Servers • GenTime
Life of a page request • From a user clicking event to the presentation of the page at the browser: • time to send to and receive from web server through Internet • time to generate HTML at web server Browsers • NetTime Web Servers • GenTime
Life of a page request • From a user clicking event to the presentation of the page at the browser: • time to send to and receive from web server through Internet • time to generate HTML at web server • time to download multimedia contents and to render the page at browser Browsers Content Distribution Network (CDN) Render Time • NetTime Web Servers • GenTime
Web service: a distributed system • Questions: • Scalability • Consistency • Performance Browsers Content Distribution Network (CDN) Render Time • NetTime Web Servers • GenTime
Performance impacts business • Google: +400ms => -0.6% daily search per user • Bing: +500ms => -1% click, +1.2 sec time to click (2x) • Facebook: user session time stays the same, faster => more page views
Status of web performance optimization • Decade-old protocols: TCP, HTTP, HTML • Fast-changing industry: browser war • Optimization is mostly witchcraft • Some rules emerging (e.g. Steve Souder’s books) • Large scale optimization is difficult
Web performance: TCP / HTTP / HTML • TCP overhead: 3-way handshake, slow start => persistent conn • TCP performance: depends on loss rate and distance => multiple conn • HTTP persistent connection: maximum 2 per server • HTTP Pipeline:failed, sequential loading • HTML: very limited description for priority/concurrency • => Big resources, multiple connections are preferred
Web performance: Browsers • most browsers blocked on JavaScript download / execution; some blocked on Stylesheet • Stylesheet application might be slow • JS execution might be slow • Maximum connections per browser • => The less the better
Web performance: “best practice rules” • Minimize HTTP requests • Gzip resources • Preload Components • Reduce DNS Lookup • Split Requests Across Domain
Web performance: trade-offs • Minimize HTTP requests Increased unused bytes • Gzip resources Increased CPU utilization • Preload Components Increased bandwidth cost • Reduce DNS Lookup Split Requests Across Domain • Split Requests Across Domain Reduce DNS Lookup
Web performance optimization • Improve speed • Reduce cost (both latency penalty and $ cost) • Operate on the equilibrium • Easy for a single page, hard for large scale systems
Performance optimization: challenges • Large scale • Viral growth • Agile development
Challenges: Large scale • Thousands of servers in multiple locations • Millions of users per minute • Billions of page views served per day • Hundreds of thousands of developers • Thousands of web resource components
Challenges: Viral growth • Concept of “Platform”: A single web page can have many features from many developers • A feature could be adopted by millions of users in a day • Features come and go
Challenges: Agile development • Fast release, allow mistake, fix quickly • Frequent code base changes • Continuous release
Web performance: the gap • Large scale • Viral growth • Agile development “Reduce # of HTTP Requests” Package JS/CSS; Sprite Image Large Scale Operation
Web performance: fill the gap • Large scale • Viral growth • Agile development “Reduce # of HTTP Requests” Package JS/CSS; Sprite Image Interface, Model, and Infrastructure Large Scale Operation
Adaptive Performance optimization • Example: “Best Practices for Speeding Up You Web Site” – YUI Rule 1: Minimize HTTP Request • Combined files: package Javscript and Stylesheets • Sprite Images • …
Performance optimization: Case Study • Day 1: Some smart engineers start a project! <Print css tag for feature A> <Print css tag for feature B> <Print css tag for feature C> <print HTML of feature A> <print HTML of feature B> <print HTML of feature C> … “Let’s write a new page with features A, B and C!”
Day 2: Some smart engineers read YUI blog and says… <Print css tag for feature A> <Print css tag for feature B> <Print css tag for feature C> <print HTML of feature A> <print HTML of feature B> <print HTML of feature C> … “A & B & C are always used; let’s package them together!”
Day 2: Awesome! <Print css tag for feature A&B&C> <print HTML of feature A> <print HTML of feature B> <print HTML of feature C> …
Day 3: feature C evolves… <Print css tag for feature A & B & C> <print HTML of feature A> <print HTML of feature B> If (users_signup_for_C()) { <print HTML of feature C>} …
Day 3: <Print css tag for feature A & B & C> <print HTML of feature A> <print HTML of feature B> If (users_signup_for_C()) { <print HTML of feature C>} … A&B are always used, while C is not. ..
Day 4: feature C is deprecated <Print css tag for feature A & B & C> <print HTML of feature A> <print HTML of feature B> // no one uses C { <print HTML of feature C>} …
Day 4: we start to send unused bits <Print css tag for feature A & B & C> <print HTML of feature A> <print HTML of feature B> // no one uses C { <print HTML of feature C>} … It is hard to remember we should remove C here.
One months later… <Print css tag for feature A & B & C & D & E & F & G…> if (F is used) <print HTML of feature F> <print HTML of feature G> if (F is not used) { <print HTML of feature E>} … Thousands of dead CSS bytes in the package.
Performance Optimization: the gap • One months later… <Print css tag for feature A & B & C & D & E & F & G…> if (F is used) <print HTML of feature F> <print HTML of feature G> if (F is not used) { <print HTML of feature E>} … “Reduce # of HTTP Requests” Package JS/CSS; Sprite Image Large Scale Operation
Adaptive Static Resource Optimization Fill the gap: • Build interface to separate requirement declaration and delivery of static resources • Model the trade-off of optimization • Infrastructure to analyze requirement logs and globally optimize the delivery performance “Reduce # of HTTP Requests” Package JS/CSS; Sprite Image Interface, Model, and Infrastructure Large Scale Operation
Interfaces • Back to Day 1: require_static(A_css); <render HTML of feature A> require_static(B_css); <render HTML of feature B> require_static(C_css);<render HTML of feature C> <deliver all required CSS> <print all rendered HTML> Separate Declaration from actual Delivery Requirement Declaration lives with HTML Global Optimization on Delivery
Packager: Global JS/CSS Optimization Offline analysis Online process require_static(A_css); <render HTML of feature A> require_static(B_css); <render HTML of feature B> require_static(C_css); <render HTML of feature C> <deliver all required CSS> <print all rendered HTML> Usage Pattern logs Packaging algorithm “Optimal” packages
Packager: Usage Pattern Logs Usage Pattern logs
Packager: Cost/Benefit Model • To package two files A & B: • “Cost”: for page requests that only uses A, we waste the bytes of B, vice versa • “Benefit”: for page requests that uses both A and B: we save one round trip • Bytes / Bandwidth ~ Latency • “Profit” to be maximized: Benefit – Cost
Packager: Cost/Benefit Model • Assume: latency = 40ms, and bandwidth = 1 Mbps • A+B: 40ms * 11.131M • No cost, pure gain. • Definitely package
Packager: Cost/Benefit Model • Assume: latency = 40ms, and bandwidth = 1 Mbps • B+C: 40ms * 11.1M • – 300B / 1Mbps * 0.031M • Benefit larger than cost • OK to package
Packager: Cost/Benefit Model • Assume: latency = 40ms, and bandwidth = 1 Mbps • B+D: 40ms * 1K • – 2K / 1Mbps * 11.13M • Cost larger than benefit • Don’t package
Packager: Optimal packages • Pkg 1: A, B, C • Pkg 2: E, F, G Usage Pattern logs Packaging algorithm “Optimal” packages
Packager: Cost/Benefit Model The weight (count) of each page load The usage matrix (0/1) on the left for m page loads and n resources The package definition (0/1 matrix). P[i,j] = 1 if resource j is in package i. The byte size of each resource
Packager: Cost/Benefit Model The usage matrix (0/1) on the left for m page loads and n resources The weight (count) of each page load Packages for each page load The byte size of each resource Byte size of each package The package definition (0/1 matrix)
Packager: Cost/Benefit Model The usage matrix (0/1) on the left for m page loads and n resources Byte Cost The weight (count) of each page load RT Gain The byte size of each resource The package definition (0/1 matrix)
Packager: Cost/Benefit Model The usage matrix (0/1) on the left for m page loads and n resources The weight (count) of each page load The byte size of each resource The package definition (0/1 matrix)
Packager: Algorithms • Greedy algorithm • Simulated Annealing • Clustering algorithms • … Packaging algorithm
Adaptive Static Resource Optimization Adaptive Packaging / Spriting • Cross-feature optimizations (e.g. search + ads) • Adaptive to change of user behaviors and code developments • Same methodology works for image spriting