#16 Application Measurement Presentation by Bobin John
1st paper: Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper)
KaZaa paper • P2P file sharing accounts for a dominant share of Internet traffic • This paper deals with KaZaa • A 200-day trace is collected • A model of the workload is developed • Locality-awareness can improve KaZaa performance
KaZaa paper • Trace Methodology • KaZaa trace summary statistics • KaZaa “usernames” used • KaZaaLite … IPs used • Easy to distinguish KaZaa-specific HTTP headers • Auto-update transactions filtered out
KaZaa paper • User Characteristics • KaZaa users are patient
KaZaa paper • User Characteristics • Users slow down as they age • 2 reasons: attrition & slowing down over time
KaZaa paper • Client Activity
KaZaa paper • Object Characteristics • Diverse workload
KaZaa paper • Object Characteristics • Object Dynamics • Clients fetch objects at most once • Popularity of objects is often short-lived • Most popular objects tend to be recently born objects • Most requests are for old objects
KaZaa paper • Object Characteristics • KaZaa object popularity is NOT Zipf-like • Web access patterns, in contrast, follow the Zipf property
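One way to see how "clients fetch objects at most once" and "NOT Zipf-like" could fit together is a small simulation. This is only a sketch under stated assumptions: the function names, the Zipf exponent, and the client and object counts are all made up for illustration and are not taken from the trace. Every client draws requests from the same Zipf distribution; in the fetch-at-most-once case a client never requests the same object twice, so no object can collect more requests than there are clients and the head of the popularity curve is flattened.

```python
import random
from collections import Counter


def zipf_weights(n_objects, alpha=1.0):
    """Unnormalized Zipf weights: the object of rank r gets weight 1 / r**alpha."""
    return [1.0 / (rank ** alpha) for rank in range(1, n_objects + 1)]


def simulate(n_clients, n_objects, requests_per_client, fetch_at_most_once, seed=0):
    """Return a Counter mapping object index (0 = most popular) to request count."""
    if fetch_at_most_once and requests_per_client > n_objects:
        raise ValueError("a client cannot fetch more distinct objects than exist")
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    weights = zipf_weights(n_objects)
    objects = range(n_objects)
    counts = Counter()
    for _ in range(n_clients):
        already_fetched = set()
        for _ in range(requests_per_client):
            obj = rng.choices(objects, weights=weights)[0]
            if fetch_at_most_once:
                # Redraw until the client picks an object it does not hold yet.
                while obj in already_fetched:
                    obj = rng.choices(objects, weights=weights)[0]
                already_fetched.add(obj)
            counts[obj] += 1
    return counts


if __name__ == "__main__":
    web_like = simulate(200, 500, 50, fetch_at_most_once=False)
    p2p_like = simulate(200, 500, 50, fetch_at_most_once=True)
    print("requests for the most popular object, repeated fetches allowed:", web_like[0])
    print("requests for the most popular object, fetch-at-most-once:", p2p_like[0])
```

With repeated fetches allowed, the top object keeps attracting requests from the same clients; with fetch-at-most-once its count is capped at the number of clients.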
KaZaa paper • Model
KaZaa paper • Model for P2P file-sharing workloads • Model Description
KaZaa paper • Model for P2P • File-Sharing effectiveness diminishes with client age
KaZaa paper • Model for P2P • New Object Arrivals improve performance
KaZaa paper • Model for P2P • New clients cannot stabilize performance
KaZaa paper • Model for P2P • Model validation
KaZaa paper • New idea! • How to reduce bandwidth cost? • Use a proxy cache • Legal & political problems • Locality-aware request routing • Centralized request redirection • redirector • Decentralized request redirection • supernodes
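The centralized redirector idea can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the `Redirector` class, its methods, and the peer and object names are hypothetical. The redirector remembers which internal peers hold which objects, sends a request to an internal peer when one is online (a hit), and otherwise lets the request go to an external peer (a miss).

```python
from collections import defaultdict


class Redirector:
    """Hypothetical centralized redirector for locality-aware request routing."""

    def __init__(self):
        self.holders = defaultdict(set)  # object id -> internal peers holding a copy
        self.online = set()              # internal peers that are currently available
        self.hits = 0                    # requests served by an internal peer
        self.misses = 0                  # requests that had to go to an external peer

    def set_online(self, peer, is_online=True):
        """Record whether an internal peer is currently available."""
        if is_online:
            self.online.add(peer)
        else:
            self.online.discard(peer)

    def route(self, requester, obj):
        """Return the internal peer to fetch from, or None for an external fetch."""
        candidates = sorted(
            peer for peer in self.holders[obj]
            if peer in self.online and peer != requester
        )
        if candidates:
            self.hits += 1
            source = candidates[0]
        else:
            self.misses += 1
            source = None
        # After the download the requester holds a copy as well.
        self.holders[obj].add(requester)
        return source


if __name__ == "__main__":
    redirector = Redirector()
    redirector.set_online("peer-a")
    redirector.set_online("peer-b")
    print(redirector.route("peer-a", "object-1"))  # no internal copy yet
    print(redirector.route("peer-b", "object-1"))  # peer-a now holds a copy
    print(redirector.hits, redirector.misses)
```

A miss caused by an offline holder is different from a miss caused by an object nobody inside has fetched yet; the sketch counts both as misses for simplicity.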
KaZaa paper • Locality awareness • Methodology • Benefits
KaZaa paper • Locality awareness • Accounting for Hits & Misses
KaZaa paper • Locality awareness • Availability
KaZaa paper • Conclusion • KaZaa workload is different • Does not follow Zipf • Can be improved with locality awareness • Drawbacks • A trace from a university ought not to be generalized to all KaZaa/P2P applications • Further implementation details of locality-awareness? • Scope of use for such a locality awareness tool? • I don’t think universities would like this
2nd paper: An analysis of Internet Chat systems
Chat paper • Why is chat a worthwhile target for traffic characterization? • Chat offers computer mediated communication • Used by a large number of people … potential of being habit-forming
Chat paper • Different types of chat systems: • Internet Relay Chat [IRC] • Web-based chat systems • ICQ & AIM • Gale
Chat paper • Problem in analyzing chat traffic • Multitude & diversity of systems & protocols • Chat protocol realized on top of HTTP protocol … difficult to separate chat traffic • Resource limitations due to filtering demands
Chat paper • IRC • Set of connected servers • Client connection requests on port 6667 • Unique nicknames • Discussion channels • Channel operators • Medium to share data • IRC operator
Chat paper • Web-chat • Not tty-based … Web browser interface • A single server to connect to • 3 classes of chat systems: • HTML-Web-Chat • Applet-Web-Chat • Applet-IRC-Chat • Difference between IRC & Web-chat is only “social”
Chat paper • Identifying IRC chat traffic • Packet monitor that captures all TCP traffic involving port 6667 • Can only capture text & control messages • Data/file transfers cannot be captured as they run on other TCP connections • IRC’s packet size distribution is mainly dominated by small packets • IRC session should last more than a few minutes • IRC sends keep-alive messages
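The IRC heuristics on this slide can be sketched as a simple flow check. The `Flow` record, the function name, and both thresholds (a 180-second minimum duration and a 200-byte median packet size) are assumptions chosen for illustration; the slide only says that sessions last "more than a few minutes" and that small packets dominate.

```python
from dataclasses import dataclass
from statistics import median

IRC_PORT = 6667  # port named on the slide


@dataclass
class Flow:
    """Hypothetical summary of one TCP connection."""
    src_port: int
    dst_port: int
    start: float        # seconds
    end: float          # seconds
    packet_sizes: list  # bytes per packet


def looks_like_irc(flow, min_duration_s=180.0, max_median_size=200):
    """Apply the three slide heuristics: port 6667, long session, small packets."""
    if IRC_PORT not in (flow.src_port, flow.dst_port):
        return False
    if not flow.packet_sizes:
        return False
    long_enough = (flow.end - flow.start) >= min_duration_s
    mostly_small = median(flow.packet_sizes) <= max_median_size
    return long_enough and mostly_small


if __name__ == "__main__":
    session = Flow(40000, 6667, 0.0, 600.0, [60, 80, 70, 90, 1500])
    print(looks_like_irc(session))
```

The median is used so that an occasional large packet does not disqualify a flow that is otherwise made up of small text and keep-alive messages.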
Chat paper • Identifying Web-chat traffic • HTML-Web-chat: • Appropriate cache-control-headers • Adding state information • Cache-Control: Must-revalidate & Cache-Control: Private indicate non-chat traffic • Use of scripting languages, e.g., JavaScript • Use of applet windows, e.g., Java
Chat paper • Identifying Web-chat traffic • Applet-Web-chat: • User would have accessed a Java file or a script or even a page like “xxxchatyyy” … “chat” could occur even in the path
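Two of the Web-chat hints from these slides can be sketched as a request check: the Cache-Control directives that the slides treat as a sign of non-chat traffic, and the string "chat" appearing in the host name or path. The function name and the example URLs are hypothetical, and a real classifier would need the other hints (state information, scripts, applets) as well.

```python
from urllib.parse import urlsplit

# Directives that the slides treat as indicating non-chat traffic.
NON_CHAT_DIRECTIVES = {"must-revalidate", "private"}


def may_be_web_chat(url, headers):
    """Return True if the request passes both slide heuristics."""
    cache_control = headers.get("Cache-Control", "")
    directives = {d.strip().lower() for d in cache_control.split(",") if d.strip()}
    if directives & NON_CHAT_DIRECTIVES:
        return False
    parts = urlsplit(url)
    # "chat" may occur in the host name or anywhere in the path.
    return "chat" in parts.netloc.lower() or "chat" in parts.path.lower()


if __name__ == "__main__":
    print(may_be_web_chat("http://example.org/xxxchatyyy/room.html", {}))
    print(may_be_web_chat("http://example.org/chat/", {"Cache-Control": "private"}))
```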
Chat paper • Overall strategy for extracting chat traffic
Chat paper • Overall strategy for extracting chat traffic • Repeat this process • Identify traffic that cannot be chat traffic • Remove it • Steps that filter out more non-chat traffic have to be applied earlier • Steps that need more processing or pre-processing should be applied later
Chat paper • Overall strategy for extracting chat traffic • Eliminate traces from ports < 1024 except port 80 • Also eliminate traces from well-known application ports (e.g., Gnutella - 6346) • Group packets into flows • Mark & filter them according to the previous table
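The port-based elimination and flow grouping steps might look roughly like this. It is a sketch under assumptions: packets are represented as plain dictionaries with hypothetical field names, the set of well-known application ports contains only the Gnutella example from the slide, and the later marking step is left out.

```python
from collections import defaultdict

# Only the example given on the slide; a real list would be longer.
WELL_KNOWN_APP_PORTS = {6346}  # Gnutella


def is_excluded_port(port):
    """Ports below 1024 (except 80) and well-known application ports cannot be chat."""
    if port == 80:
        return False
    return port < 1024 or port in WELL_KNOWN_APP_PORTS


def flow_key(packet):
    """Direction-independent key so both directions of a connection share a flow."""
    a = (packet["src_ip"], packet["src_port"])
    b = (packet["dst_ip"], packet["dst_port"])
    return (a, b) if a <= b else (b, a)


def group_into_flows(packets):
    """Drop packets on excluded ports, then group the remainder into flows."""
    flows = defaultdict(list)
    for packet in packets:
        if is_excluded_port(packet["src_port"]) or is_excluded_port(packet["dst_port"]):
            continue
        flows[flow_key(packet)].append(packet)
    return dict(flows)


if __name__ == "__main__":
    packets = [
        {"src_ip": "10.0.0.1", "src_port": 40000, "dst_ip": "10.0.0.9", "dst_port": 80},
        {"src_ip": "10.0.0.9", "src_port": 80, "dst_ip": "10.0.0.1", "dst_port": 40000},
        {"src_ip": "10.0.0.2", "src_port": 40001, "dst_ip": "10.0.0.9", "dst_port": 22},
    ]
    for key, members in group_into_flows(packets).items():
        print(key, len(members))
```

The cheap port test runs before the grouping, which matches the rule that steps removing the most non-chat traffic come first.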
Chat paper • Experiment • At University of Saarland • Resource partitioning • Traces were generated after filtering • 950GB > 1.2GB > 238MB (WEBCHAT1) • 192MB (IRC1) • 350MB (WEBCHAT2)
Chat paper • Validation • 2 aspects: • Recall – ability of a system to present all relevant items • Precision – ability of a system to present only relevant items
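Recall and precision can be written as set ratios: recall is the share of real chat connections that were found, and precision is the share of identified connections that really are chat. The sketch below uses made-up connection ids purely for illustration; they are not data from the paper.

```python
def recall(identified, actual):
    """Fraction of the real chat connections that the filter found."""
    if not actual:
        return 0.0
    return len(identified & actual) / len(actual)


def precision(identified, actual):
    """Fraction of the identified connections that really are chat."""
    if not identified:
        return 0.0
    return len(identified & actual) / len(identified)


if __name__ == "__main__":
    actual_chat = {1, 2, 3, 4}         # illustrative ground truth
    identified_chat = {2, 3, 4, 5, 6}  # illustrative filter output
    print(recall(identified_chat, actual_chat))     # 3 of 4 real connections found
    print(precision(identified_chat, actual_chat))  # 3 of 5 identified are real
```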
Chat paper • Validation • Lots of calculations: “we can expect to locate about 91.7% of all real chat connections and that we expect that at least 93.1% of all connections we identify are indeed chat connections.”
Chat paper • Results • Session durations
Chat paper • Results • Interarrival times of sessions
Chat paper • Results • Packet sizes
Chat paper • Results • Sent & Received bytes
Chat paper • Conclusion • Chat-traffic was successfully filtered out • Accuracy was above 90% • Drawbacks • Use of this work?