Analyzing Web Logs • Sarah Waterson, Group for User Interface Research • 18 April 2002 • SIMS 213
Talk Outline • What is a web log? • Where do they come from? • Why are they relevant? • How can we analyze them? • Study • Discussion
What is a web log? A record of a visit to a web page • Visitor (IP address) • URL • Time of visit • Time spent on a page • Browser used • Referring URL • Type of request • Reply code • Number of bytes in the reply • etc…
What is a clickstream? A record of a path through web pages • Visitor (IP address) • URL • Time of visit • Time spent on a page • Browser used • Referring URL • Type of request • Reply code • Number of bytes in the reply • Next URL • etc…
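One way to turn per-page log records into clickstreams is sketched below. It is an illustration only, not a method from the talk: it assumes each record has already been reduced to a (visitor, timestamp in seconds, URL) tuple, that an IP address stands in for a visitor, and that a 30-minute gap ends a visit. The names `SESSION_GAP` and `build_clickstreams` are hypothetical.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60  # assumption: 30 minutes of inactivity ends a visit


def build_clickstreams(records, gap=SESSION_GAP):
    """Group (visitor, timestamp_seconds, url) records into per-visit paths.

    Each path is a list of (url, seconds_spent). The last page of a path has
    seconds_spent None, because no later request shows when the visitor left.
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in records:
        by_visitor[visitor].append((timestamp, url))

    streams = []
    for visitor, hits in by_visitor.items():
        hits.sort()  # order each visitor's requests by time
        path = []
        for i, (timestamp, url) in enumerate(hits):
            next_time = hits[i + 1][0] if i + 1 < len(hits) else None
            if next_time is not None and next_time - timestamp <= gap:
                # The next request tells us how long this page was viewed
                path.append((url, next_time - timestamp))
            else:
                # Last request of the visit: close the path
                path.append((url, None))
                streams.append((visitor, path))
                path = []
    return streams


# Made-up records for illustration
records = [
    ("10.0.0.1", 0, "/index.html"),
    ("10.0.0.1", 40, "/products.html"),
    ("10.0.0.2", 10, "/index.html"),
    ("10.0.0.1", 5000, "/index.html"),
]
streams = build_clickstreams(records)
for visitor, path in streams:
    print(visitor, path)
```

Time spent is only an estimate here: it is the gap between consecutive requests, so it cannot be known for the final page of a visit.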
What is a Web Log? Apache web log:
205.188.209.10 - - [29/Mar/2002:03:58:06 -0800] "GET /~sophal/whole5.gif HTTP/1.0" 200 9609 "http://www.csua.berkeley.edu/~sophal/whole.html" "Mozilla/4.0 (compatible; MSIE 5.0; AOL 6.0; Windows 98; DigExt)"
216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] "GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"
202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/indextop.html HTTP/1.1" 200 3510 "http://www.csua.berkeley.edu/~tahir/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/animate.js HTTP/1.1" 200 14261 "http://www.csua.berkeley.edu/~tahir/indextop.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
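The entries above follow the Apache combined log format (host, identity, user, time, request, status, bytes, referrer, user agent). A minimal parsing sketch follows; the regular expression and the names `LOG_PATTERN` and `parse_line` are illustrative assumptions, not part of any tool mentioned in the talk, and a production parser would need to handle malformed requests and escaped quotes.

```python
import re
from datetime import datetime

# Combined log format:
# host ident user [time] "method url protocol" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # Apache writes "-" when no bytes were sent
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # A referrer of "-" means the client sent none
    record["referrer"] = None if record["referrer"] == "-" else record["referrer"]
    return record


sample = (
    '216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] '
    '"GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" '
    '"Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"'
)
record = parse_line(sample)
print(record["host"], record["url"], record["status"], record["size"])
```

In this sample the missing referrer and the "Slurp" user agent suggest a crawler rather than a person, which is the kind of traffic an analysis usually filters out before studying user behavior.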
Where do they come from? Servers • Done on most web servers • Standard formats Clients • Browsers, loggers on client machine • Must send data back Proxies • Similar to servers • Hang out in between client and server
Why are web logs relevant? • Lots of data • Quantitative analysis is much more fun! • User behavior, patterns • Real users, tasks • Or at least more realistic users and tasks • Leaving the usability lab • Testing effect • Fast, easy, cheap • Automatic or almost-automatic
Ed Chi asks… Usage: • How has information been accessed? • How frequently? • What’s popular? What’s not? • How do people enter the site? Exit? • Where do people spend time? • How long do they spend there? • How do people travel within the site? • Who are the people visiting?
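Several of the usage questions above reduce to counting over visitor paths. The sketch below assumes each visit has already been reduced to an ordered list of URLs; the function name `usage_summary` and the sample paths are made up for illustration.

```python
from collections import Counter


def usage_summary(paths):
    """Count page views, entry pages, exit pages, and page-to-page transitions."""
    page_views = Counter()
    entries = Counter()
    exits = Counter()
    transitions = Counter()
    for path in paths:
        if not path:
            continue
        page_views.update(path)                  # What's popular? What's not?
        entries[path[0]] += 1                    # How do people enter the site?
        exits[path[-1]] += 1                     # Where do they exit?
        transitions.update(zip(path, path[1:]))  # How do people travel within it?
    return page_views, entries, exits, transitions


# Made-up visits for illustration
paths = [
    ["/index.html", "/products.html", "/cart.html"],
    ["/index.html", "/about.html"],
    ["/products.html", "/cart.html"],
]
page_views, entries, exits, transitions = usage_summary(paths)
print(page_views.most_common())
print(entries.most_common(1), exits.most_common(1))
print(transitions.most_common(1))
```

Counts like these answer "what" and "how often"; they do not say who the visitors are or why they left, which is where the task-based methods later in the talk come in.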
Ed Chi asks… Structural: • What information has been added, deleted, modified, moved? Usage + Structural: • What happens when the site changes? (Google) • Does navigation change? • Does popularity change? • What about missing data?
How do you analyze web logs? • Data Mining: task or intent unknown • “Automated extraction of hidden predictive information from (large) databases” – Kurt Thearling • Server log analysis • Remote Usability Testing: task or intent known • Similar to traditional lab usability testing • Clickstream analysis What are people doing? How well does the site support what people are doing?
How? Data Mining Statistics and numbers galore! • Gazillions of tools for server log analysis (Computers > Software > Internet > Site Management > Log Analysis) • Usually charts, graphs, numbers galore • Analog & NetTracker typical statistics • In 3D too (eBizinsights)
How? Data Mining cont’d Other interesting work: • Web Ecologies (Chi) • Development over time • Information scent (Chi) • Behavior patterns • Understand how to organize info “Information scent is made of cues that people use to decide whether a path is interesting.” • Useful for web designers?
How? Remote Usability Testing • Analyze clickstream in the context of the task and user intentions • Can be gathered on client, server, and via proxy • Varied granularities of interaction • From mouse movements to page access • Varied levels of user awareness • From interactive to invisible • Varied levels of access • From a single site to the entire web
How? Remote Usability Testing WebVip and VisVip (NIST) • Server-side logging • JavaScript instrumentation • Individual paths within context of site • Animation/replay of sessions Questions: • What part of the site is used for a task? Not used? • How long to finish a task? Per page? • What sorts of behavior for a task?
How? Remote Usability Testing ClickViz (Blue Martini) • Server-side logging • Custom instrumentation • Aggregate paths based on file system • Include demographics, purchase history • Filtering Questions: • How does a visitor of type X compare to type Y? • Success vs. “failure”
How? Remote Usability Testing NetRaker Clickstream and Vividence ClickStreams • Not restricted to servers • Testing suites • Interesting aggregation methods
How? Remote Usability Testing WebQuilt (GUIR) Logging Design Goals: • Extensible, Scalable • Allow for unobtrusive, “naturalistic” user interaction • Multi-platform, multi-device compatibility • Fast and easy to deploy on any website Solution: • Proxy-based logger rewrites links • Nearly invisible to user • Independent of client browser • Infer actions (e.g. back button clicks) • Stand alone or use with other tools
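The idea of a proxy that rewrites links can be sketched as follows. This is not WebQuilt's implementation: the proxy address `PROXY_URL`, the query parameter names, and the function `rewrite_links` are all hypothetical, and the regular expression only handles double-quoted `href` attributes, where a real proxy would use an HTML parser. Each link on a served page is pointed back at the proxy, which can log the click and then fetch the real target.

```python
import re
from urllib.parse import quote, urljoin

PROXY_URL = "http://proxy.example.org/log"  # hypothetical proxy address

# Simplification: matches double-quoted href attributes only
HREF = re.compile(r'href="([^"]*)"', re.IGNORECASE)


def rewrite_links(html, page_url, session_id):
    """Point every link at the proxy, carrying the original target as a parameter."""
    counter = 0

    def redirect(match):
        nonlocal counter
        counter += 1
        # Resolve relative links against the page they appeared on
        target = urljoin(page_url, match.group(1))
        return 'href="{}?session={}&amp;link={}&amp;url={}"'.format(
            PROXY_URL, session_id, counter, quote(target, safe="")
        )

    return HREF.sub(redirect, html)


page = '<a href="sentra.html">Sentra</a>'
rewritten = rewrite_links(page, "http://www.example.com/new/index.html", "s1")
print(rewritten)
```

Because the logging happens at the proxy, nothing has to be installed on the client or changed on the server, which matches the goals listed above of browser independence and easy deployment.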
How? Remote Usability Testing WebQuilt (GUIR) Visual Analysis Tool: • Put data within context of the design • Show deviations from expected paths • Interactive graph
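Deviations from an expected path can also be computed directly before they are drawn. The sketch below is a simple illustration under assumed inputs, not the analysis WebQuilt performs: `expected` is the designer's intended sequence of pages, each session is the sequence a user actually visited, and the function names and page names are made up.

```python
from collections import Counter


def first_deviation(expected, actual):
    """Return (step, expected_page, actual_page) at the first mismatch, else None."""
    for step, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return step, want, got
    return None  # actual follows expected for as far as both paths go


def off_path_counts(expected, sessions):
    """Count visits to pages that are not on the expected path at all."""
    on_path = set(expected)
    counts = Counter()
    for path in sessions:
        counts.update(page for page in path if page not in on_path)
    return counts


# Made-up task path and sessions for illustration
expected = ["/index", "/new", "/nissan", "/sentra", "/safety"]
sessions = [
    ["/index", "/new", "/nissan", "/sentra", "/safety"],
    ["/index", "/used", "/index", "/new", "/nissan", "/sentra", "/safety"],
    ["/index", "/new", "/nissan", "/dealers", "/nissan", "/sentra"],
]
deviations = [first_deviation(expected, path) for path in sessions]
print(deviations)
print(off_path_counts(expected, sessions))
```

Pages that many users wander into off the expected path are candidates for a closer look, since they may point to misleading labels or navigation.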
Study: Purpose • Exploratory comparison of lab and remote usability testing with mobile devices • What types of usability issues can we: • find with either method? • find with one that we can’t find with the other? • Design implications • testing tools • testing strategies
Study: The Mobile Web • Limited and/or new interaction methods • Small screens • Graffiti, keypads, thumb-pads • Beyond the desktop • Driving, traveling, walking • Noisy, public Gathering good usability data is vital to making these interfaces, and subsequently these devices, successful.
Study: Design • 10 users asked to find: • Anti-lock brake information on the latest Nissan Sentra • The closest Nissan dealer • http://pda.edmunds.com • Handspring Visor Edge with OmniSky wireless modem • 5 users in the lab • 5 users in the wild • Web-based questionnaires
Study: Identifying Usability Issues Lab Data • Tester observations • Participant comments • Questionnaire Remote Data • Clickstream analysis • Questionnaire Severity Levels • 0 indicates a comment • 1 to 5 (minor to critical) Four Categories • Device • Browser • Site Design • Test Design
Study: Caveats • Analysis and observation for both tests done by same person • Issues identified from remote tests first • Avoids biasing remote analysis tools • Looking for potential problem areas
Study: Results Totals: • 18 unique issues • 7 found remotely • 1/3 device or browser related Site Design • 5 of the 9 issues • 3 of the 4 with severity level > 3 Test Design • 2 of the 6 issues • 2 of the 4 with severity level > 3
Study: Process Observations Remote usability testing can capture some usability issues that lab testing already discovers Lab testing gets me: • Qualitative observations • Thinking aloud comments • Non-content usability issues
Study: Process Observations What can remote testing get us that labs can’t? • Lab effect • Quitting a task is easier when not in lab • Network problems more realistic • With more users • Patterns emerge • Can reduce uncertainty • Faster
Study: Conclusions Remote usability testing is a promising technique for capturing realistic usage data for mobile web site design Main concerns • Gathering user feedback on mobile devices is even more difficult because of limited input • Understanding users can be ambiguous • Potentially alleviated by ability to test larger number of users
Discussion • Comments • Questions • Where does web log analysis fit into a design cycle (design, prototype, evaluate)? • Understanding what methods to use when and where • Experiences? • These or other tools?