Web Analysis
Introduction
• Web analysis includes the study of users’ behavior on the web
• Traffic analysis – Usage analysis
• Behavior at a particular website or across the entire web
• Web server log files, page tagging, cookies, network packet sniffing, route tracing
• Web usage mining
Cookies
• Cookies are unique identifiers, usually stored as small text files, placed on users’ computers by the web servers hosting websites
• They make the otherwise stateless HTTP ‘stateful’
• Used to authenticate users, to optimize and personalize the presented information, to serve custom advertisements, or simply to make browsing more convenient for the user
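As a minimal sketch of how a cookie carries state across requests, the Python below builds a `Set-Cookie` header value holding a unique identifier and then reads that identifier back from a later `Cookie` header. The cookie name `visitor_id`, the one-hour lifetime and both function names are assumptions made for illustration; they do not come from the slides.

```python
import uuid
from http.cookies import SimpleCookie


def issue_visitor_cookie():
    """Build a Set-Cookie header value carrying a fresh, unique visitor ID."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = uuid.uuid4().hex   # hypothetical cookie name
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = 3600    # assumed lifetime: one hour
    return cookie["visitor_id"].OutputString()


def read_visitor_id(cookie_header):
    """Return the visitor ID from a request's Cookie header, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel else None


# First response: the server sends the identifier to the browser.
set_cookie_value = issue_visitor_cookie()

# Later request: the browser returns only the name=value pair.
returned_pair = set_cookie_value.split(";")[0]
print(read_visitor_id(returned_pair))
```

Because the browser returns the same identifier on every later request, the server can link those requests to one visitor even though each HTTP exchange is independent.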
Log files
• Web servers record their transactions in log files
• Information available includes click streams, page views and visits
• Other useful information - number of unique visitors, least and most visited pages, average visit length, top referring sites, top search words
• Log files are mined to provide data for analysis
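The kind of mining described above can be sketched in a few lines of Python. This sketch assumes the server writes the common or combined access-log format used by Apache-style servers; the sample lines, the addresses in them and the `summarize` function are made up for illustration. Counting distinct client addresses only approximates unique visitors, since several users can share one address.

```python
import re
from collections import Counter

# Common Log Format, with the optional referrer and user-agent fields
# of the combined format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def summarize(lines):
    """Count page views, referring pages and distinct client addresses."""
    page_views = Counter()
    referrers = Counter()
    visitors = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        visitors.add(match["host"])
        page_views[match["path"]] += 1
        referrer = match["referrer"]
        if referrer and referrer != "-":
            referrers[referrer] += 1
    return {
        "unique_visitors": len(visitors),
        "page_views": page_views,
        "referrers": referrers,
    }


# Illustrative log lines (not real traffic).
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://example.com/start" "Mozilla/4.08"',
    '192.0.2.2 - - [10/Oct/2000:13:56:01 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "-" "Mozilla/4.08"',
    '192.0.2.1 - - [10/Oct/2000:13:57:12 -0700] "GET /about.html HTTP/1.0" '
    '200 512 "http://example.com/index.html" "Mozilla/4.08"',
]

summary = summarize(SAMPLE_LOG)
print(summary["unique_visitors"], summary["page_views"].most_common(1))
```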
Page tagging
• Also known as “web bugs” or “web beacons”
• Information retrieved from log files may not be accurate in the presence of web caches, because a page served from a cache never reaches the origin server’s log
• A web bug or web beacon is a small object, usually a transparent graphic image, embedded in a web page to track and identify the users visiting that page
• Usually used by third‐party servers
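To make the mechanism concrete, here is a hedged Python sketch of both halves of a web beacon: the tag a page would embed, and what a tracking server could record when the browser fetches the image. The host `tracker.example`, the `page` query parameter and the function names are hypothetical. Marking the response as non-cacheable is a common way to make sure every view reaches the tracker.

```python
import base64
import html
from urllib.parse import parse_qs, urlencode, urlsplit

TRACKER_URL = "https://tracker.example/pixel.gif"  # hypothetical tracking host

# A 1x1 transparent GIF, the usual payload of a web beacon.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def beacon_tag(page_id):
    """Return the invisible <img> tag a page embeds to report a view."""
    url = f"{TRACKER_URL}?{urlencode({'page': page_id})}"
    return f'<img src="{html.escape(url)}" width="1" height="1" alt="">'


def handle_beacon_request(url, headers):
    """Record what the tracking server learns from one beacon request."""
    params = parse_qs(urlsplit(url).query)
    record = {
        "page": params.get("page", [None])[0],
        "visitor": headers.get("Cookie"),       # identifier set on an earlier visit
        "referrer": headers.get("Referer"),     # page that embedded the beacon
        "user_agent": headers.get("User-Agent"),
    }
    response_headers = {
        "Content-Type": "image/gif",
        "Cache-Control": "no-store",  # keep caches from hiding later views
    }
    return record, response_headers, PIXEL_GIF


print(beacon_tag("home page"))
```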
Network packet sniffing
• Similar to a wiretapping device
• A sniffer is a passive device that captures each packet as data streams flow across the network, then decodes and analyzes its content
• Network management advantages - analyzing network problems, detecting intrusion attempts, debugging network protocols and filtering suspect content from network traffic
• Vulnerable to misuse - monitoring network usage, reporting statistics and collecting sensitive information
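The "decode" step can be illustrated without capturing live traffic. The Python sketch below unpacks the fixed 20-byte part of an IPv4 header from raw bytes; it assumes the captured bytes start at the IP header (any link-layer framing already removed), and the function name and sample addresses are made up for the example.

```python
import socket
import struct

PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}


def decode_ipv4_header(packet):
    """Decode the fixed 20-byte portion of an IPv4 header."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (version_ihl, _tos, total_length, _ident, _flags_fragment,
     ttl, protocol, _checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_length": (version_ihl & 0x0F) * 4,  # IHL counts 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": PROTOCOLS.get(protocol, str(protocol)),
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# A hand-built header standing in for a captured packet.
sample_header = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"))
print(decode_ipv4_header(sample_header))
```

The same decoded fields serve both the management uses and the misuse listed above: source, destination and protocol are enough to build usage statistics for every host on the network.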
Uses - Personalization
• ‘Personalization is the provision to an individual of tailored products, services, information or information relating to products or services’
• Information is gathered implicitly or explicitly, then aggregated
• Memorization and Customization
• Applications - customizing access to information sources, filtering news or e‐mails, targeted marketing and advertising, recommendation services for the browsing process
Uses - Advertising
• Custom advertising based on information gathered
• Example – DoubleClick
• A user’s “profile” includes demographic attributes (age, income, etc.) and preferences that may be gathered explicitly or implicitly
• Marketing – web analysis data can be analyzed to determine how visitors to the website react to marketing offers and to improve marketing strategies
Privacy Implications
• Privacy is the control of personal information
• Anonymity on the web
• Privacy vs. convenience
• Many websites do not ask for the user’s permission before placing a tracking device; tracking is the default behavior, and functionality is affected if it is disabled
• Examples - Toysmart, AOL
• Third-party cookies, email web bugs, route tracing and identity theft
Technological support
• Secure cookies - messages are encrypted by passing them over SSL
• Email clients that block web bugs
• Platform for Privacy Preferences Project (P3P)
• Anonymizing tools:
• Anonymous proxy (Anonymizer)
• Lucent Personalized Web Assistant (LPWA)
• Crowds
• Anonymous Routing (Onion routers) – provide layers of encryption
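The "layers of encryption" idea behind onion routing can be sketched as follows. The sender wraps the message once per router, innermost layer first, and each router removes only its own layer to learn the next hop. This is an illustration under loud assumptions: `toy_cipher` is a hash-based XOR stand-in that is NOT secure, and the router names, keys and `|` separator are hypothetical. Real onion routers use proper cryptography and key exchange.

```python
import hashlib


def _keystream(key, length):
    """Derive a pseudo-random byte stream from a key (toy construction)."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(stream[:length])


def toy_cipher(key, data):
    """XOR data with the keystream; applying it twice restores the data.

    Illustration only: this is not a secure cipher.
    """
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def wrap(message, route, destination=b"destination"):
    """Sender side: add one layer per router, innermost layer first.

    `route` is a list of (router_name, router_key) pairs in travel order;
    names must not contain the b"|" separator.
    """
    onion = message
    next_hop = destination
    for name, key in reversed(route):
        onion = toy_cipher(key, next_hop + b"|" + onion)
        next_hop = name
    return onion  # hand this to the first router in the route


def peel(onion, key):
    """Router side: remove one layer, revealing only the next hop."""
    plain = toy_cipher(key, onion)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop, inner


# Hypothetical three-router route.
ROUTE = [(b"router1", b"key-1"), (b"router2", b"key-2"), (b"router3", b"key-3")]
packet = wrap(b"hello", ROUTE)
for _name, router_key in ROUTE:
    hop, packet = peel(packet, router_key)
    print(hop)
```

Each router sees only the previous and next hop, so no single router can connect the sender to the final destination and message.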