Automatic Data Collection: Server Logs
As with all methods, we have to ask: • What are the goals for your system? • What constitutes success, or good quality service? • How can you conceptualize and operationalize quality? • What information can you get using this method? • How will this info help you evaluate performance?
Sources of data about visits and visitors • Provided by users • Registration, and whatever demographics and preferences are asked about • Captured by system • Server log files • Cookies
Benefits of monitoring data • Can yield lots of data for relatively low investment • Unobtrusive; “outcroppings” • Numbers communicate well • Numbers are useful for comparisons • “hits are up 20% over this time last year”
One example of simple stats • Compare January (with DWR photos) to March (DWR photos removed) • http://elib.cs.berkeley.edu/webstats/Mar2002.html
Common measures • “According to Forrester Research, many companies still use hits as the primary measurement of website success, followed by page views and session length.”
Hit “The retrieval of any item, like a page or a graphic, from a Web server. For example, when a visitor calls up a Web page with four graphics, that's five hits, one for the page and four for the graphics. For this reason, hits often aren't a good indication of Web traffic. See page view.” http://www.webopedia.com/TERM/h/hit.html
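The difference between hits and page views in the definition above can be sketched in a few lines of Python. This is only an illustration: the file names are made up, and treating `.html`/`.htm` requests as "pages" is an assumption, not a standard rule.

```python
import os

# Hypothetical requests generated when one visitor loads one page
# that contains four graphics (file names invented for illustration).
requests = [
    "/index.html",
    "/images/logo.gif",
    "/images/banner.gif",
    "/images/photo1.jpg",
    "/images/photo2.jpg",
]

# Assumption: only these extensions count as a "page".
PAGE_EXTENSIONS = {".html", ".htm"}


def count_hits_and_page_views(paths):
    """Return (hits, page_views) for a list of requested paths."""
    hits = len(paths)  # every retrieved item is a hit
    page_views = sum(
        1 for p in paths
        if os.path.splitext(p)[1].lower() in PAGE_EXTENSIONS
    )
    return hits, page_views


hits, page_views = count_hits_and_page_views(requests)
print(hits, page_views)  # 5 hits, 1 page view
```

One page with four graphics yields five hits but a single page view, which is why hit counts overstate traffic.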
Measuring success • “Companies sometimes make the mistake of buying elaborate software packages that analyze data a million ways, and then neglect to look at the most basic, day-to-day measurements of how a site is doing in its primary function…. • For an e-commerce site, those basic measurements are conversion rate—that is, the ratio of buyers to visitors—and average order size. • For sites that make money via advertising banners… the number of ad banners viewed; • other sites can measure traffic from return visitors versus traffic from new visitors. • Remember one of the most basic elements of delivering a good customer experience: making sure that pages load quickly, even when the site is barraged with traffic.” http://www.cio.com/archive/051500_parade.html
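The two e-commerce measurements named in the quote are simple ratios. A minimal sketch, assuming one order per buyer and using invented numbers:

```python
def conversion_rate(buyers, visitors):
    """Ratio of buyers to visitors; 0.0 when there were no visitors."""
    return buyers / visitors if visitors else 0.0


def average_order_size(order_totals):
    """Mean order value; 0.0 when there were no orders."""
    return sum(order_totals) / len(order_totals) if order_totals else 0.0


# Hypothetical figures for one reporting period.
visitors = 2000
orders = [25.0, 40.0, 55.0, 80.0]  # assumption: one order per buyer

rate = conversion_rate(len(orders), visitors)
print(f"conversion rate: {rate:.2%}")                       # 0.20%
print(f"average order size: {average_order_size(orders)}")  # 50.0
```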
Server log contents • Time • IP Address • Server • Action • Object • Result code and size • Browser version and platform • Referring URL
Server log contents
Time | IP Address | Server | Action | Object | Result code and size | Browser version and platform | Referring URL
• 01:50:17 216.126.148.89 - ICICWEB1 GET /images/pdq.gif - 200 793 290 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+98) http://128.231.164.190/pdq.html
• 01:50:18 216.126.148.89 - ICICWEB1 GET /images/banner1.gif - 200 4067 294 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+98) http://128.231.164.190/pdq.html
• 01:50:18 216.126.148.89 - ICICWEB1 GET /images/news.gif - 200 1054 291 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+98) http://128.231.164.190/pdq.html
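A log line like the samples above can be split into named fields with a few lines of Python. This is a sketch under assumptions: the slide does not say what the two `-` placeholders or the two numbers after the result code mean, so the names `user`, `query`, `size_out`, and `size_in` are guesses made for illustration.

```python
from collections import namedtuple

# Field names follow the slide where it names them; "user", "query",
# "size_out" and "size_in" are assumed labels for unnamed columns.
LogEntry = namedtuple(
    "LogEntry",
    "time ip user server action object query status "
    "size_out size_in browser referrer",
)


def parse_line(line):
    """Split one whitespace-delimited log line into a LogEntry."""
    fields = line.split()
    if len(fields) != len(LogEntry._fields):
        raise ValueError(f"expected 12 fields, got {len(fields)}")
    entry = LogEntry(*fields)
    return entry._replace(
        status=int(entry.status),
        size_out=int(entry.size_out),
        size_in=int(entry.size_in),
        # Spaces in the browser string are logged as "+".
        browser=entry.browser.replace("+", " "),
    )


sample = (
    "01:50:17 216.126.148.89 - ICICWEB1 GET /images/pdq.gif - 200 793 290 "
    "Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+98) "
    "http://128.231.164.190/pdq.html"
)
entry = parse_line(sample)
print(entry.ip, entry.action, entry.object, entry.status)
print(entry.browser)
```

Real log formats vary by server and configuration, so the field list should be checked against the server's own log header before relying on it.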
What you can get from server logs • http://www.reportmagic.org/sample/index.html • http://www.mach5.com/products/analyzer/index.php
Some issues in using log data • Differentiating users from machines or proxies • Cookies and registration • Relating IP addresses, user locations, user characteristics • Identifying sessions • Cookies; assumptions about nature of sessions • Measuring hits • Cached pages? • Interpreting results relative to your goals
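The "assumptions about nature of sessions" can be made concrete. The sketch below treats each IP address as one visitor and starts a new session after 30 minutes of inactivity. Both choices are assumptions for illustration, and both are exactly the kind of simplification the issues above warn about (proxies share an IP; the timeout is arbitrary). The sample timestamps are invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (ip, timestamp) pairs into per-IP lists of sessions."""
    by_ip = defaultdict(list)
    for ip, ts in records:
        by_ip[ip].append(ts)

    sessions = {}
    for ip, stamps in by_ip.items():
        stamps.sort()
        groups = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - groups[-1][-1] > timeout:
                groups.append([ts])    # long gap: start a new session
            else:
                groups[-1].append(ts)  # same session continues
        sessions[ip] = groups
    return sessions


# Hypothetical requests (dates are made up).
records = [
    ("216.126.148.89", datetime(2002, 3, 1, 1, 50, 17)),
    ("216.126.148.89", datetime(2002, 3, 1, 1, 50, 18)),
    ("216.126.148.89", datetime(2002, 3, 1, 3, 10, 0)),
    ("10.0.0.7", datetime(2002, 3, 1, 2, 0, 0)),
]
result = split_sessions(records)
print({ip: len(groups) for ip, groups in result.items()})
```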
One source recommends: • Who is visiting your site • Unique visitor identification so you know whether a visitor is returning to your site. • The path visitors take through your pages -- “visitor trails” • Knowing each page a visitor viewed and the order, you can identify trends in how visitors navigate through your pages. • What element (link, icon) a visitor clicked on each page to go to the next page. • How much time visitors spend on each page • They say: “A pattern of lengthy viewing time on a page might lead you to deduce the page is very interesting or very confusing.” • But… how do you know what (else) the user is doing?
Recommendations, cont. • Where visitors are leaving your site • The last page a visitor viewed before leaving your site might be a logical place to end the visit, or it might be a place where the visitor bailed out. • The success of users’ experiences at your site • Purchases transacted, downloads completed, and information viewed are concrete indicators of tasks accomplished. From Tec-Ed, Inc., "Assessing Web Site Usability from Server Log Files" on Tec-Ed, Inc. Web site http://www.teced.com/c_and_p.html#WU
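Visitor trails, time per page, and exit pages can all be derived once page views are grouped by visitor. A minimal sketch with invented data; the visitor IDs, page names, and timings are hypothetical:

```python
from collections import Counter

# Hypothetical sessions: visitor id -> ordered (seconds since start, page).
sessions = {
    "visitor-a": [(0, "/index.html"), (20, "/search.html"), (95, "/results.html")],
    "visitor-b": [(0, "/index.html"), (10, "/search.html")],
}


def trail(views):
    """Pages in the order the visitor viewed them."""
    return [page for _, page in views]


def time_on_pages(views):
    """Seconds between each page view and the next one.

    The last page has no following request, so its viewing time
    cannot be computed from the log and is left out.
    """
    return [(page, nxt - t) for (t, page), (nxt, _) in zip(views, views[1:])]


def exit_pages(all_sessions):
    """Count how often each page was the last one viewed."""
    return Counter(views[-1][1] for views in all_sessions.values() if views)


print(trail(sessions["visitor-a"]))
print(time_on_pages(sessions["visitor-a"]))
print(exit_pages(sessions))
```

The time figures only measure the gap between requests. They cannot say what the user was doing during that gap, and an exit count alone cannot distinguish a completed task from an abandoned one.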
Another example promises statistics about: • Web server activity • number of visitors, the number of unique IPs, bandwidth used, number of hits they received, broken down by Time Increment, Day of the Week, and Hour of the Day • Type of data visitors access on your site • Web pages viewed, files downloaded, directories accessed, images accessed during a time period. Broken down by Page Views, Browsing Sequences, Downloaded Files, Accessed Directories, Accessed Images. • Referrer information • Referring Domains and Referring URLs. (Referrers are sites with links to your site. )
Promises, cont. • Search engine performance • the search engines which referred visitors to the site, the phrases and keywords visitors searched for broken down by Top Search Engines, Keywords, and Each Search Engine. • Visitors' geographic region • Displays a Most Active Countries graph and a table showing which Countries your visitors come from. • Browsers and platforms visitors used • Errors visitors encountered at the site
Promises, cont. • Advanced visitor filters • Visitors who accessed specific pages or files. • Visitors who came from specific referring URLs. • Day of Week (Example: see what happened on a specific day); Hour of Day. • Visitors whose first visit is a specific page. • Visitors' countries or regions. • Visitors who make purchases on your web site: see information on visitors who actually buy something from your web site. Source: http://www.123loganalyzer.com/features.htm
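Referring domains and search phrases of the kind these tools report are usually extracted from the referring URL. A rough sketch using Python's standard `urllib.parse`; the example URLs are invented, and the list of query parameter names is an assumption, since each search engine chooses its own:

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumption: the search phrase is carried in one of these parameters.
QUERY_PARAMS = ("q", "query", "p")


def referring_domain(referrer):
    """Host part of a referring URL, lowercased."""
    return urlparse(referrer).netloc.lower()


def search_phrase(referrer):
    """Return the search phrase in a referring URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0]
    return None


# Hypothetical referring URLs taken from a log.
referrers = [
    "http://search.example.com/search?q=server+log+analysis",
    "http://search.example.com/search?q=cookies",
    "http://portal.example.org/links.html",
]
print(Counter(referring_domain(r) for r in referrers))
print([search_phrase(r) for r in referrers])
```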
Cookies • Simulate continuous connection, session • Identify user • Store info about user, preferences, past activity http://www.netscape.com/newsref/std/cookie_spec.html
Cookies • “the server nytimes.com wishes to set a cookie that will be sent to any server in the domain nytimes.com • The name and value of the cookie are nytime-s … • The cookie will persist until Tues April 8 14:25:04 2003”
Set-Cookie: NAME=VALUE; expires=DATE; path=PATH; domain=DOMAIN_NAME; secure • NAME=VALUE : a sequence of characters. The only required attribute. • expires=DATE : valid lifetime of the cookie. Once the date is reached, the cookie is no longer stored or given out. • domain=DOMAIN_NAME : when searching the cookie list for valid cookies, the domain attribute of each cookie is compared with the domain name of the host from which the URL will be fetched. Default is the host name of the server which generated the cookie response. • path=PATH : the subset of URLs in a domain for which the cookie is valid. If not specified, it is assumed to be the same path as the document described by the header which contains the cookie. • secure : the cookie will only be transmitted if the communications channel with the host is a secure one.
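A header with these attributes can be built and read back with Python's standard `http.cookies` module. This is a sketch only: the cookie name `visitor_id`, its value, the domain, and the GMT time zone on the expiry date are all made-up assumptions.

```python
from http.cookies import SimpleCookie

# Build a Set-Cookie header (name, value, and domain are hypothetical).
cookie = SimpleCookie()
cookie["visitor_id"] = "abc123"                                   # NAME=VALUE
cookie["visitor_id"]["expires"] = "Tue, 08 Apr 2003 14:25:04 GMT"  # expires=DATE
cookie["visitor_id"]["path"] = "/"                                # path=PATH
cookie["visitor_id"]["domain"] = ".example.com"                   # domain=DOMAIN_NAME
cookie["visitor_id"]["secure"] = True                             # secure flag

header = cookie.output()
print(header)

# On a later request the browser sends the value back in a Cookie header,
# which lets the server recognize the returning visitor.
incoming = SimpleCookie()
incoming.load("visitor_id=abc123")
print(incoming["visitor_id"].value)
```

The module may print the attributes in a different order or capitalization than the template above; the meaning is the same.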
Other methods • Analyses of queries on site search engines • Emails: • Customer queries and requests for more information • Customer complaints • Suggestion boxes
Analyses • Frequencies • Cross tabulations • Page visited by IP address • Correlations • Beware of assumptions about causality • Graphics • Exponential distributions
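Frequencies and a "page visited by IP address" cross tabulation need nothing more than counting. A minimal sketch with the standard library; the second IP address and the page names are invented for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical (ip, page) pairs extracted from a log.
page_hits = [
    ("216.126.148.89", "/pdq.html"),
    ("216.126.148.89", "/index.html"),
    ("216.126.148.89", "/pdq.html"),
    ("10.0.0.7", "/index.html"),
]


def page_frequencies(pairs):
    """How often each page was requested."""
    return Counter(page for _, page in pairs)


def crosstab_by_ip(pairs):
    """Cross tabulation: for each IP address, a count per page."""
    table = defaultdict(Counter)
    for ip, page in pairs:
        table[ip][page] += 1
    return table


print(page_frequencies(page_hits).most_common())
for ip, counts in crosstab_by_ip(page_hits).items():
    print(ip, dict(counts))
```

Counts like these describe what was requested, not why; any causal reading still has to be argued separately.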