Understanding Website Complexity: Measurements, Metrics, and Implications. Michael Butkiewicz, Harsha Madhyastha (UC Riverside); Vyas Sekar (Intel Labs). Websites today are very complex, with content from servers such as doubleclick.net, cdn.turner.com, and ads.cnn.com.
Understanding Website Complexity: Measurements, Metrics, and Implications • Michael Butkiewicz, Harsha Madhyastha (UC Riverside) • Vyas Sekar (Intel Labs)
Websites today are very complex! • Diverse content from many servers and third-party services • Domains shown: cnn.com, facebook.com, doubleclick.net, cdn.turner.com, ads.cnn.com
Users see slow-loading websites! • Load time: 30% > 3 seconds • Median = 2 seconds • 67% of users encounter “slow” sites once a week (gomez.com)
Why does load time matter? • Implications for: • Website owners • End users • Browser developers • Customization • Source: gomez.com
Our work • Comprehensive study of website complexity • Analysis of sites across rank and category • Content and Service level metrics • Key metrics that impact performance
Roadmap • Introduction • Measurement Setup • Complexity • Performance Implications • Discussion and Summary
Measurement Setup • Selecting websites • 1,700 websites from Quantcast top 20k • Primary focus on landing (home) page • Annotated with Alexa categories • Tools • Firefox + Firebug • No local caching • Approach • 4 vantage points (3 EC2, 1 UCR) • One page loaded every 60 seconds • ~30 measurements per site per vantage point over 9 weeks
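The schedule on this slide can be sanity-checked with a little arithmetic. The sketch below assumes each vantage point cycles through all 1,700 sites round-robin at one load per 60 seconds; the slides do not state the actual scheduling, so the result is only an upper bound on the number of measurements per site.

```python
# Back-of-the-envelope check of the measurement schedule.
# Assumption (not stated on the slide): each vantage point visits
# every site once per pass, in round-robin order.
SITES = 1_700
SECONDS_PER_LOAD = 60
WEEKS = 9


def full_pass_hours(sites=SITES, seconds_per_load=SECONDS_PER_LOAD):
    """Hours needed to load every site once from one vantage point."""
    return sites * seconds_per_load / 3600


def max_passes(weeks=WEEKS, sites=SITES, seconds_per_load=SECONDS_PER_LOAD):
    """Upper bound on complete passes that fit in the study period."""
    total_hours = weeks * 7 * 24
    return int(total_hours // full_pass_hours(sites, seconds_per_load))


hours = full_pass_hours()
passes = max_passes()
print(f"One pass takes about {hours:.1f} hours; at most {passes} passes in {WEEKS} weeks")
```

Under this assumption a full pass takes a little over a day, so the reported ~30 measurements per site per vantage point sits comfortably below the upper bound.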
Example Site Download • [Figure: timeline of objects A, B, C, and D downloaded over time; legend: HTML, CSS, Image, Script]
Example Firebug Log
"log": {
  "browser": {"name": "Firefox"},
  "pages": [{
    "startedDateTime": "18:12:59.702-04:00",
    "title": "WiredNews",
    "pageTimings": {"onLoad": 4630}
  }],
  "entries": [{
    "startedDateTime": "18:12:59.702-04:00",
    "time": 75,
    "request": {
      ...
      "headers": [{"name": "Host", "value": "www.wired.com"}]
    },
    "response": {
      "content": {"mimeType": "text/html", "size": 186013, ...}
      ...
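A log of this shape can be reduced to per-page numbers with a few lines of Python. This is a minimal sketch, not the authors' tooling: the first entry reuses the values from the slide, the second entry (`img.example.com`, its size and time) is hypothetical, and the function names `host_of` and `summarize` are made up for illustration.

```python
import json

# Minimal HAR-style log. The first entry uses the values shown on the slide;
# the second entry (host, size, time) is hypothetical, added for illustration.
SAMPLE_LOG = """
{"log": {
  "browser": {"name": "Firefox"},
  "pages": [{"title": "WiredNews", "pageTimings": {"onLoad": 4630}}],
  "entries": [
    {"time": 75,
     "request": {"headers": [{"name": "Host", "value": "www.wired.com"}]},
     "response": {"content": {"mimeType": "text/html", "size": 186013}}},
    {"time": 40,
     "request": {"headers": [{"name": "Host", "value": "img.example.com"}]},
     "response": {"content": {"mimeType": "image/png", "size": 2048}}}
  ]
}}
"""


def host_of(entry):
    """Return the value of the Host request header, or None if absent."""
    for header in entry["request"]["headers"]:
        if header["name"].lower() == "host":
            return header["value"]
    return None


def summarize(har):
    """Extract load time, object count, total bytes, and contacted hosts."""
    log = har["log"]
    entries = log["entries"]
    return {
        "on_load_ms": log["pages"][0]["pageTimings"]["onLoad"],
        "num_objects": len(entries),
        "total_bytes": sum(e["response"]["content"].get("size", 0) for e in entries),
        "hosts": sorted({host_of(e) for e in entries if host_of(e)}),
    }


summary = summarize(json.loads(SAMPLE_LOG))
print(summary)
```

Each entry corresponds to one fetched object, so the entry count, the summed sizes, and the set of Host headers give the raw material for the object-level and server-level metrics that follow.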
Roadmap • Introduction • Measurement Setup • Complexity • Content-level • Service-level • Performance Implications • Discussion and future work
Number of Objects: Across Categories • Median 125 Objects! • Median Site = 57 Objects!
Number of Objects: Across Ranks • 20% of sites > 100 objects! • Less difference across rank ranges than across categories
Types of Content • Median site: 33 Images, 10 JavaScript, 3 CSS, 0 Flash
Normalized Types of Content • [Figure: percentage of objects by type] • Mostly homogeneous; Flash skewed
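Absolute and normalized per-type counts like the ones above can be derived from the `mimeType` field of each logged response. The following is a rough sketch under an assumed MIME-to-type mapping; the slides do not say how objects were classified, and `content_type`, `count_types`, and `normalized` are hypothetical helper names.

```python
from collections import Counter


def content_type(mime):
    """Map a MIME type to a coarse content class (assumed mapping)."""
    mime = mime.split(";")[0].strip().lower()  # drop parameters such as charset
    if mime.startswith("image/"):
        return "image"
    if "javascript" in mime or "ecmascript" in mime:
        return "javascript"
    if mime == "text/css":
        return "css"
    if mime == "application/x-shockwave-flash":
        return "flash"
    if mime in ("text/html", "application/xhtml+xml"):
        return "html"
    return "other"


def count_types(mimes):
    """Absolute number of objects per content class."""
    return Counter(content_type(m) for m in mimes)


def normalized(counts):
    """Fraction of objects per content class."""
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


# Hypothetical page with four objects.
mimes = ["text/html; charset=utf-8", "image/png", "image/gif", "application/javascript"]
counts = count_types(mimes)
shares = normalized(counts)
print(counts, shares)
```

Normalizing by the page's total object count is what allows sites of very different sizes to be compared on the same axis.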
Roadmap • Introduction • Measurement Setup • Complexity • Content-level • Service-level • Performance Implications • Discussion and Summary
Number of Servers • Median site requires contacting 8 servers • Median News site: 30 servers!
Number of Origins • Median site requires contacting 6 origins • 20% of sites > 13 origins • Median News site: 20 origins!
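The difference between servers and origins can be illustrated by grouping hostnames. The slides do not define how servers were grouped into origins; the sketch below assumes, for illustration only, that an origin is the last two labels of the hostname. That rule is naive (it mishandles suffixes such as `co.uk`), and `origin_of` is a hypothetical helper.

```python
def origin_of(hostname):
    """Naive origin: the last two DNS labels (an assumption, see lead-in)."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def count_servers_and_origins(hostnames):
    """Return (number of distinct servers, number of distinct origins)."""
    servers = {h.lower() for h in hostnames}
    origins = {origin_of(h) for h in servers}
    return len(servers), len(origins)


# Hostnames taken from the introductory slide.
hosts = ["cnn.com", "ads.cnn.com", "cdn.turner.com", "doubleclick.net", "facebook.com"]
num_servers, num_origins = count_servers_and_origins(hosts)
print(num_servers, num_origins)
```

Here `cnn.com` and `ads.cnn.com` are two servers but collapse into a single origin, which is why origin counts are always at most the server counts.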
Popular non-origin providers • Not just the usual suspects! • bluekai.com, imrworldwide.com, invitemedia.com: > 5% of sites each! • Most common services: Analytics & Advertising • Most common objects: Images (small!) and JavaScript
Contribution of non-origin services • 20% of sites: > 80% from 3rd party • Median site, Objects / Bytes: 30% / 35%
Contribution of non-origin services • 20% of sites: > 80% from 3rd party • Median site, Objects / Bytes: 30% / 35% • Time: only 15%
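The object and byte fractions above can be computed per page once each object is attributed to an origin. This is a sketch under the same naive last-two-labels origin rule (an assumption, not the authors' stated method), with hypothetical hostnames and sizes. The time fraction is left out because the slides do not say how download time is attributed to each provider.

```python
def origin_of(hostname):
    """Naive origin: the last two DNS labels (an assumption)."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def non_origin_fractions(objects, site_host):
    """Fraction of objects and of bytes served from outside the site's origin.

    `objects` is a list of (hostname, size_in_bytes) pairs.
    """
    site_origin = origin_of(site_host)
    third_party = [(h, size) for h, size in objects if origin_of(h) != site_origin]
    total_bytes = sum(size for _, size in objects)
    return {
        "objects": len(third_party) / len(objects),
        "bytes": sum(size for _, size in third_party) / total_bytes,
    }


# Hypothetical page: two first-party and two third-party objects.
page = [
    ("example.com", 60_000),
    ("static.example.com", 10_000),
    ("doubleclick.net", 10_000),
    ("facebook.com", 20_000),
]
fractions = non_origin_fractions(page, "example.com")
print(fractions)
```

In this toy page, half of the objects but only 30% of the bytes come from third parties, showing how the two fractions can diverge when third-party objects are small.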
Roadmap • Introduction • Measurement Setup • Complexity • Performance Implications • Discussion and Summary
Metric Review • Content-Level Characteristics • Total Objects • Object Type: Number, Size • Absolute & Normalized • Service-Level Characteristics • Number: Servers & Origins • Non-Origin Fraction: Servers / Objects / Time • 33 metrics in total
Roadmap • Introduction • Measurement Setup • Complexity • Content-level • Service-level • Performance Implications • Discussion and Summary
Discussion: Many other variables • Client-side plugins • NoScript reduces # objects by half! • Mobile-specific customizations • Mobile version reduces # objects to a quarter • Landing vs. non-landing pages • Non-landing pages seem less complex
Conclusions • Comprehensive study of website complexity • Median site: 57 objects, contacts 8 servers across 6 origins • Categories show more differences than popularity ranks • Non-origin content: • Analytics & advertising popular • Large share of objects, bytes, and servers, but not of time • Key performance indicators: • Load time: # objects • Variability: # servers • Data: www.cs.ucr.edu/~harsha/web_complexity