
Analyzing Web Logs

Analyzing Web Logs. Sarah Waterson, Group for User Interface Research. 18 April 2002, SIMS 213. Talk outline: What is a web log? Where do they come from? Why are they relevant? How can we analyze them? Study. Discussion.


Presentation Transcript


  1. Analyzing Web Logs • Sarah Waterson, Group for User Interface Research • 18 April 2002 • SIMS 213

  2. Talk Outline • What is a web log? • Where do they come from? • Why are they relevant? • How can we analyze them? • Study • Discussion

  3. What is a web log? A record of a visit to a web page • Visitor (IP address) • URL • Time of visit • Time spent on a page • Browser used • Referring URL • Type of request • Reply code • Number of bytes in the reply • etc.

  4. What is a clickstream? A record of a path through web pages • Visitor (IP address) • URL • Time of visit • Time spent on a page • Browser used • Referring URL • Type of request • Reply code • Number of bytes in the reply • Next URL • etc.
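The step from individual log records to a clickstream can be sketched as follows. This is a minimal illustration, not a method from the talk: it assumes each hit is already reduced to a `(visitor, unix_time, url)` tuple, and it assumes a 30-minute inactivity gap (`SESSION_GAP`, a hypothetical threshold) ends a visit. Time spent on a page is approximated as the gap until the same visitor's next request, so the last page of a visit has no duration.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60  # seconds; assumed inactivity threshold, not from the talk


def build_clickstreams(hits, gap=SESSION_GAP):
    """Group hits into per-visitor sessions.

    hits: iterable of (visitor, unix_time, url) tuples.
    Returns a list of (visitor, path) pairs, where path is a list of
    (url, seconds_on_page) and seconds_on_page is None for the last page.
    """
    by_visitor = defaultdict(list)
    for visitor, t, url in hits:
        by_visitor[visitor].append((t, url))

    sessions = []
    for visitor, visits in by_visitor.items():
        visits.sort()  # order each visitor's requests by time
        current = []
        for i, (t, url) in enumerate(visits):
            next_t = visits[i + 1][0] if i + 1 < len(visits) else None
            if next_t is not None and next_t - t <= gap:
                # The next request is close enough: same session.
                current.append((url, next_t - t))
            else:
                # Last request, or a long pause: close the session here.
                current.append((url, None))
                sessions.append((visitor, current))
                current = []
    return sessions


hits = [("a", 0, "/"), ("a", 40, "/x"), ("a", 5000, "/y"), ("b", 10, "/")]
sessions = build_clickstreams(hits)
```

Identifying a visitor by IP address alone is a simplification; proxies and shared machines can merge several people into one apparent visitor.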

  5. What is a Web Log? Apache web log:
     205.188.209.10 - - [29/Mar/2002:03:58:06 -0800] "GET /~sophal/whole5.gif HTTP/1.0" 200 9609 "http://www.csua.berkeley.edu/~sophal/whole.html" "Mozilla/4.0 (compatible; MSIE 5.0; AOL 6.0; Windows 98; DigExt)"
     216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] "GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"
     202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/indextop.html HTTP/1.1" 200 3510 "http://www.csua.berkeley.edu/~tahir/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
     202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/animate.js HTTP/1.1" 200 14261 "http://www.csua.berkeley.edu/~tahir/indextop.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
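Entries like these follow Apache's combined log format: host, identity, user, timestamp, request line, status, bytes, referrer, and user agent. A minimal parsing sketch is shown below; the names `LOG_PATTERN` and `parse_line` are illustrative, and the regular expression is an assumption that covers well-formed lines only, not every edge case a production log can contain.

```python
import re
from datetime import datetime

# host ident user [time] "method url protocol" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Timestamps look like 29/Mar/2002:03:59:40 -0800
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    # Apache writes "-" when no bytes were sent.
    rec["size"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


sample = (
    '216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] '
    '"GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" '
    '"Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"'
)
record = parse_line(sample)
```

In this sample the referrer is "-" and the user agent identifies a crawler (Slurp), which is the kind of traffic an analysis of human behavior would usually want to filter out first.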

  6. Where do they come from? Servers • Done on most web servers • Standard formats Clients • Browsers, loggers on client machine • Must send data back Proxies • Similar to servers • Hang out in between client and server (Figure: proxy log)

  7. Why are web logs relevant? • Lots of data • Quantitative analysis is much more fun! • User behavior, patterns • Real users, tasks • Or at least more realistic users and tasks • Leaving the usability lab • Testing effect • Fast, easy, cheap • Automatic or almost-automatic

  8. Ed Chi asks… Usage: • How has information been accessed? • How frequently? • What’s popular? What’s not? • How do people enter the site? Exit? • Where do people spend time? • How long do they spend there? • How do people travel within the site? • Who are the people visiting?
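Several of these usage questions reduce to simple counting once visits are available as ordered lists of URLs. The sketch below is one possible way to answer "what's popular?" and "how do people enter and exit?"; the function name `usage_summary` and the input shape (one list of URLs per visit) are assumptions made for illustration.

```python
from collections import Counter


def usage_summary(sessions):
    """Count page popularity, entry pages, and exit pages.

    sessions: list of visits, each a list of URLs in click order.
    Returns three Counters: (popular, entries, exits).
    """
    popular = Counter(url for visit in sessions for url in visit)
    entries = Counter(visit[0] for visit in sessions if visit)
    exits = Counter(visit[-1] for visit in sessions if visit)
    return popular, entries, exits


sessions = [["/", "/a", "/b"], ["/a", "/b"], ["/", "/b"]]
popular, entries, exits = usage_summary(sessions)
top_page = popular.most_common(1)
```

Counts like these describe what happened but not why; the later slides on remote usability testing address that gap by tying clickstreams to known tasks.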

  9. Ed Chi asks… Structural: • What information has been added, deleted, modified, moved? • Usage + Structural • What happens when the site changes? (Google) • Does navigation change? • Does popularity change? • What about missing data?

  10. How do you analyze web logs? • Data Mining: task or intent unknown • “Automated extraction of hidden predictive information from (large) databases” – Kurt Thearling • Server log analysis • Remote Usability Testing: task or intent known • Similar to traditional lab usability testing • Clickstream analysis What are people doing? How well does the site support what people are doing?

  11. How? Data Mining Statistics and numbers galore! • Gazillions of tools for server log analysis (Computers > Software > Internet > Site Management > Log Analysis) • Usually charts, graphs, numbers galore • Analog & NetTracker typical statistics • In 3D too (eBizinsights)

  12. How? Data Mining cont’d Other interesting work: • Web Ecologies (Chi) • Development over time • Information scent (Chi) • Behavior patterns • Understand how to organize info • “Information scent is made of cues that people use to decide whether a path is interesting.” • Useful for web designers?

  13. Web Ecologies (Chi 1998)

  14. How? Remote Usability Testing • Analyze clickstream in the context of the task and user intentions • Can be gathered on client, server, and via proxy • Varied granularities of interaction • From mouse movements to page access • Varied levels of user awareness • From interactive to invisible • Varied levels of access • From a single site to the entire web

  15. How? Remote Usability Testing WebVip and VisVip (NIST) • Server side logging • Javascript instrumentation • Individual paths within context of site • Animation/replay sessions Questions: • What part of site used for a task? Not used? • How long to finish task? Per page? • What sorts of behavior for task?

  16. How? Remote Usability Testing ClickViz (Blue Martini) • Server side logging • Custom instrumentation • Aggregate paths based on file system • Include demographics, purchase history • Filtering Questions: • How does visitor of type X compare to type Y? • Success vs. “failure”

  17. How? Remote Usability Testing NetRaker Clickstream and Vividence ClickStreams • Not restricted to servers • Testing suites • Interesting aggregation methods

  18. How? Remote Usability Testing WebQuilt (GUIR) Logging Design Goals: • Extensible, Scalable • Allow for unobtrusive, “naturalistic” user interaction • Multi-platform, multi-device compatibility • Fast and easy to deploy on any website Solution: • Proxy-based logger rewrites links • Nearly invisible to user • Independent of client browser • Infer actions (e.g. back button clicks) • Stand alone or use with other tools
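The idea of a proxy that rewrites links, so that every subsequent click passes back through the logger, can be sketched as below. This is not WebQuilt's actual implementation: the proxy address `PROXY_PREFIX`, the `url` query parameter, and the regex-based rewriting of double-quoted `href` attributes are all simplifying assumptions made for illustration.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical proxy endpoint; every rewritten link points here.
PROXY_PREFIX = "http://proxy.example/log?url="

# Matches double-quoted href attributes only (a deliberate simplification).
HREF = re.compile(r'href="([^"]*)"', re.IGNORECASE)


def rewrite_links(html, page_url, prefix=PROXY_PREFIX):
    """Rewrite each href so the browser requests it through the proxy."""

    def repl(match):
        # Resolve relative links against the page they appeared on.
        absolute = urljoin(page_url, match.group(1))
        # Percent-encode the target so it survives as a query parameter.
        return 'href="' + prefix + quote(absolute, safe="") + '"'

    return HREF.sub(repl, html)


page = '<a href="cars/list.html">Cars</a>'
rewritten = rewrite_links(page, "http://example.com/index.html")
```

Because the rewriting happens on the proxy, nothing has to be installed on the client or the target server, which is what makes this approach independent of the browser. A real proxy would also need to handle forms, scripts, redirects, and single-quoted or unquoted attributes.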

  19. How? Remote Usability Testing WebQuilt (GUIR) Visual Analysis Tool: • Put data within context of the design • Show deviations from expected paths • Interactive graph
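One simple way to detect deviations from an expected path is to compare the designer's intended page sequence with what a participant actually clicked. The sketch below is an assumption about how such a comparison might look, not the tool's algorithm; `path_deviation` and the example URLs are hypothetical.

```python
def path_deviation(expected, actual):
    """Compare an actual click path with the expected one.

    Returns (diverge_at, off_path):
      diverge_at: index of the first step where the paths differ,
                  or None if they are identical.
      off_path:   pages the user visited that are not on the expected path.
    """
    diverge_at = None
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            diverge_at = i
            break
    else:
        # No mismatch in the shared prefix; one path may simply be longer.
        if len(expected) != len(actual):
            diverge_at = min(len(expected), len(actual))

    expected_pages = set(expected)
    off_path = [page for page in actual if page not in expected_pages]
    return diverge_at, off_path


expected = ["/", "/new", "/model", "/safety"]
actual = ["/", "/new", "/dealers", "/new", "/model", "/safety"]
result = path_deviation(expected, actual)
```

Here the participant wandered to a hypothetical "/dealers" page at step 2 and then returned, which is the kind of detour a visualization would highlight for the designer.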

  20. Study: Purpose • Exploratory comparison of lab and remote usability testing with mobile devices • What types of usability issues can we: • find with either method? • find with one that we can’t find with the other? • Design implications • testing tools • testing strategies

  21. Study: The Mobile Web • Limited and/or new interaction methods • Small screens • Graffiti, keypads, thumb-pads • Beyond the desktop • Driving, traveling, walking • Noisy, public Gathering good usability data is vital to making these interfaces, and subsequently these devices, successful.

  22. Study: Design • 10 users asked to find: • Anti-lock brake information on the latest Nissan Sentra • The closest Nissan dealer • http://pda.edmunds.com • Handspring Visor Edge with OmniSky wireless modem • 5 users in the lab • 5 users in the wild • Web-based questionnaires

  23. Study: Identifying Usability Issues Lab Data • Tester observations • Participant comments • Questionnaire Remote Data • Clickstream analysis • Questionnaire Severity Levels • 0 indicates a comment • 1 to 5 (minor to critical) Four Categories • Device • Browser • Site Design • Test Design

  24. Study: Caveats • Analysis and observation for both tests done by same person • Issues identified from remote tests first • Avoids biasing remote analysis tools • Looking for potential problem areas

  25. Study: Results Totals: • 18 unique issues • 7 found remotely • 1/3 device or browser related Site Design • 5 of the 9 issues • 3 of the 4 with severity level > 3 Test Design • 2 of the 6 issues • 2 of the 4 with severity level > 3

  26. Study: Process Observations Remote usability testing can capture some usability issues that lab testing already discovers Lab testing gets me: • Qualitative observations • Thinking aloud comments • Non-content usability issues

  27. Study: Process Observations What can remote testing get us that labs can’t? • Lab effect • Quitting a task is easier when not in lab • Network problems more realistic • With more users • Patterns emerge • Can reduce uncertainty • Faster

  28. Study: Conclusions Remote usability testing is a promising technique for capturing realistic usage data for mobile web site design Main concerns • Gathering user feedback on mobile devices is even more difficult because of limited input • Understanding users can be ambiguous • Potentially alleviated by ability to test larger number of users

  29. Discussion • Comments • Questions • Where does web log analysis fit into a design cycle? • Understanding what methods to use when and where • Experiences? • These or other tools? (Figure: design cycle of Design, Prototype, Evaluate)
