Saturation of Web Usage - on Cognitive Processing Limits and Loyalty of Web Users (work in progress) Mario Christ, Steffan Baron Humboldt University Berlin, Germany
Agenda • Objectives of the study • Description of data source and method • Results • Implications • Future Work
WWW Growth • [Figure: number of domains over time] • Source: Internet Domain Survey, January 2000
Objectives of the Study • Study individual level of usage of the Internet with two alternative measures • Identify distinct groups of trajectories • Identify demographic characteristics that distinguish groups • Identify loyal groups
Data Source: The HomeNet Project • Understanding people’s use of the Internet at home • Provided families with hardware and Internet connections • Number of users: n=139 (24) • Period of observation: 2 years (95-97, 6 months)
The HomeNet Project • Five sources of data • Questionnaires (demographics) • Archive of HomeNet newsgroup messages • Log of help requests • Home interviews • Computer-generated use records
The Data
Labeled fields: User ID, Timestamp, Site, Path
197022;01.08.1996 15:15:00;GET;http;home.netscape.com;/escapes/search/search5.html
197022;01.08.1996 15:16:00;GET;http;www.netzone.com;/~dburns/zizza2.htm
197022;01.08.1996 15:20:01;GET;http;www.webcrawler.com;/
197022;01.08.1996 15:21:01;GET;http;www.mindspring.com;/%7Engan/9playboy.html
197022;01.08.1996 15:21:03;GET;http;www.compass-ent.com;/playboy/1991cal.html
197022;01.08.1996 15:22:05;GET;http;www.playboy.com;/
197022;01.08.1996 15:23:03;GET;http;www1.playboy.com;/entertainment/playboy/
197022;01.08.1996 15:24:02;GET;http;www.playboy.com;/magazine-toc.html
197022;01.08.1996 15:24:03;GET;http;www.playboy.com;/sep96/toc.html
197022;01.08.1996 15:25:01;GET;http;www.playboy.com;/playmates/playmate-sep96.html
197022;01.08.1996 15:26:04;GET;http;www.playboy.com;/playmates/datasheet-sep96.html
197022;01.08.1996 15:57:03;GET;http;www.playboy.com;/playmates/imx/sep96/nude.html
161901;01.08.1996 15:58:02;GET;http;home.netscape.com;/home/whats-cool.html
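A minimal sketch of how one such record could be parsed, assuming semicolon-separated fields in the order user ID, timestamp, method, protocol, site, path, and assuming the date is written day.month.year. The `Request` type and `parse_record` function are hypothetical names introduced here for illustration; they are not part of the HomeNet tooling.

```python
from datetime import datetime
from typing import NamedTuple


class Request(NamedTuple):
    """One logged request (hypothetical container for illustration)."""
    user_id: str
    timestamp: datetime
    method: str
    protocol: str
    site: str
    path: str


def parse_record(line: str) -> Request:
    # maxsplit=5 keeps any ';' that might occur inside the path intact.
    user_id, stamp, method, protocol, site, path = line.strip().split(";", 5)
    # Assumption: dates are day.month.year (German convention).
    timestamp = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")
    return Request(user_id, timestamp, method, protocol, site, path)


record = parse_record("197022;01.08.1996 15:20:01;GET;http;www.webcrawler.com;/")
print(record.user_id, record.timestamp.isoformat(), record.site, record.path)
```

Keeping the user ID as a string avoids accidental arithmetic on an identifier.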
Measure: Number of Sessions • Measure: number of Web sessions over time • Why not distinct domains? • Users may converge to domains • Why not page views? • Users may learn how to use the Web efficiently
Statistical Method • Averages (e.g. number of sessions over the entire period of observation) may conceal heterogeneity in developmental patterns • Group-based approach • Assumes that the population is composed of a finite mixture of distinct groups defined by their trajectories • Identifies heterogeneity in developmental patterns • Identifies distinctive trajectories of use
Trajectories • A trajectory defines the developmental behavior over time • Examples: • Increasers • Decreasers • No changers • [Figure: average usage plotted against time t]
Statistical method: Advantages • Maximum likelihood procedure based on mixture modeling • Identifies distinctive trajectories of development • Provides a formal basis for determining the number of groups • Provides an explicit metric for evaluating the precision of an individual's assignment to a group • Provides group percentages • Uses the data themselves
Model Estimation • λ: expected number of occurrences (sessions) • j: group • i: individual • t: time • month: month (at time t) • β0, β1, β2, β3: parameters of the model
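The equation on this slide did not survive extraction. A form consistent with the symbols listed, assuming the cubic Poisson specification commonly used in group-based trajectory modeling (an assumption, not confirmed by the slide text), writes the expected count as \lambda and the four parameters as \beta_0 to \beta_3:

```latex
\log \lambda_{it}^{j}
  = \beta_0^{j}
  + \beta_1^{j}\,\mathit{month}_{it}
  + \beta_2^{j}\,\mathit{month}_{it}^{2}
  + \beta_3^{j}\,\mathit{month}_{it}^{3}
```

Under this reading, each group j has its own set of four parameters, so groups can differ in both level and shape of their session trajectory.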
Selection of the best Model • determination of the optimal number of groups to compose the mixture • determination of the appropriate order of the polynomial used to model each group’s trajectory • use the Bayesian information criterion (BIC) as a basis for selecting the optimal model
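A minimal sketch of BIC-based model selection, using the common "smaller is better" form BIC = k ln(n) - 2 ln(L). The log-likelihoods below are hypothetical placeholders, not results from this study; the parameter counts assume four polynomial parameters per group plus the free mixing proportions, and n = 139 is assumed to be the number of individuals.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC in the 'smaller is better' form: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: number of groups -> (log-likelihood, number of parameters).
# Parameter count assumed as 4 per group plus (groups - 1) mixing proportions.
candidates = {
    2: (-1520.0, 9),
    3: (-1480.0, 14),
    4: (-1476.0, 19),
}
n_obs = 139  # assumption: sample size taken as the number of users

scores = {groups: bic(ll, k, n_obs) for groups, (ll, k) in candidates.items()}
best_groups = min(scores, key=scores.get)
print(scores, best_groups)
```

Some trajectory-modeling software reports BIC with the opposite sign convention (larger is better), so the direction of comparison should be checked against the tool in use.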
Further Information • http://www.stat.cmu.edu/~bjones/ • http://ibiza.heinz.cmu.edu/trajectories/
Method • Measure number of sessions/month • What’s a session? • Do not use log-on time! • Users log on to do mail! • Users can walk away from the computer • Analyze usage of active users only • Allow for intertemporal comparison • Increase reliability of our measures of loyalty • Identify what distinguishes low-rate, moderate, and heavy users
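The slide does not state how a session is delimited. One common approach, shown here purely as an assumption, starts a new session whenever a user has made no Web request for more than 30 minutes, then counts sessions per user and month. The function name and the 30-minute threshold are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: a gap of more than 30 minutes between requests ends a session.
SESSION_GAP = timedelta(minutes=30)


def sessions_per_month(requests, gap=SESSION_GAP):
    """Count sessions per (user, year, month) from (user_id, timestamp) pairs."""
    counts = Counter()
    last_seen = {}
    for user, ts in sorted(requests):
        previous = last_seen.get(user)
        # First request of a user, or a long pause, opens a new session.
        if previous is None or ts - previous > gap:
            counts[(user, ts.year, ts.month)] += 1
        last_seen[user] = ts
    return counts


requests = [
    ("197022", datetime(1996, 8, 1, 15, 15, 0)),
    ("197022", datetime(1996, 8, 1, 15, 16, 0)),
    ("197022", datetime(1996, 8, 1, 15, 26, 4)),
    ("197022", datetime(1996, 8, 1, 15, 57, 3)),
    ("161901", datetime(1996, 8, 1, 15, 58, 2)),
]
counts = sessions_per_month(requests)
print(counts)
```

Because this works from Web requests rather than log-on time, a user who logs on only to read mail, or who walks away from the computer, does not inflate the session count.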
Web usage of individuals, t0-t6 • [Figure: trajectories of the three groups; group shares 33.4%, 12.4%, 54.1%]
Saturation appears not only when usage is measured in number of distinct domains • Exploration period confirmed • No convergence
On individual page view capacity per session • Do low-rate users consume bigger chunks of the Web at a time? • It is reasonable to expect that people who go online more often view fewer pages per session • Is there a page view / session constant? • Individual cognitive processing limits?
When do users go online? • Do low-rate users go online less often because of limited time available?
What determines time available and the related number of sessions?
Poisson Model
Poisson regression                              Number of obs   =         22
                                                LR chi2(5)      =      85.90
                                                Prob > chi2     =     0.0000
Log likelihood = -93.127504                     Pseudo R2       =     0.3156
------------------------------------------------------------------------------
 sessionmean |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       white |  -.0180425   .2604593    -0.07   0.945    -.5285333    .4924483
        male |   .7243205    .165975     4.36   0.000     .3990156    1.049625
     signage |   .0120343   .0053179     2.26   0.024     .0016115    .0224572
     mailout |   .0609818   .0393237     1.55   0.121    -.0160912    .1380547
      mailin |   .1670439   .0701686     2.38   0.017     .0295159    .3045719
       _cons |   1.526198   .3135868     4.87   0.000     .9115796    2.140817
------------------------------------------------------------------------------
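Poisson coefficients are on the log scale, so exponentiating them gives multiplicative effects on the expected count (incidence rate ratios). The sketch below applies that standard transformation to the coefficients reported above; it is an illustration of how to read the table, not part of the original analysis.

```python
import math

# Coefficients copied from the regression output above (constant omitted).
coefficients = {
    "white": -0.0180425,
    "male": 0.7243205,
    "signage": 0.0120343,
    "mailout": 0.0609818,
    "mailin": 0.1670439,
}

# exp(coef) is the factor by which the expected session mean changes
# for a one-unit increase in the predictor, other predictors held fixed.
rate_ratios = {name: math.exp(coef) for name, coef in coefficients.items()}

for name, ratio in rate_ratios.items():
    print(f"{name:8s} {ratio:.3f}")
```

Read this way, the coefficient for male corresponds to roughly twice the expected session mean, holding the other predictors fixed.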
On Loyalty • Two measures: • Page views / domain • Sequences (former main focus of this research, measure influenced by the user’s willingness to follow links) • Hypothesis: number of sequences decreases because users learn over time which paths to follow in the Web
Mint Query
select t
from node as a b,
template a * b as t
where a.accesses > 4
and b.accesses > 4
and a.support > 0
and b.support > 0
Implications and Conclusions • Saturation in sessions • No group follows an upward path • Web users have a limited overall capacity (12.4%: 8, 33.4%: 16, 54.1%: 40) • Constant page views / session • Web users also have a limited capacity per session … the Web will not be growing forever. • Demographic factors • Heavy users are younger, male, heavy mail users … might be targeted in Internet marketing • More important: Who’s loyal? • Number of sequences declining … increasing loyalty? • Page views / domain only increasing for group of moderate users (moms, minorities, female)
Future Work • Find alternative measures of loyalty • Count the number of sequences with a confidence > x • Add session identifiers to the Web logs and measure • the number of distinct domains per session (distraction & exploration issue) • page views per domain in given sessions • Do research on the ‘purpose’ of starting a Web session (loyalty issue)
Future work on loyalty • Session1, 3 page views • Yahoo • Lycos • Google • Session10, 3 page views • CNN • CNN • CNN
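The contrast between the two example sessions can be made concrete with a small sketch that computes two of the measures proposed under Future Work: distinct domains per session and mean page views per domain within a session. The function name is hypothetical, and the domain labels are taken from the example above.

```python
from collections import Counter


def loyalty_measures(page_views):
    """Return (distinct domains, mean page views per domain) for one session."""
    per_domain = Counter(page_views)
    distinct = len(per_domain)
    return distinct, len(page_views) / distinct


session_1 = ["Yahoo", "Lycos", "Google"]
session_10 = ["CNN", "CNN", "CNN"]

early = loyalty_measures(session_1)
late = loyalty_measures(session_10)
print(early, late)
```

Both sessions have three page views, but the first spreads them over three domains while the second concentrates them on one, which is the pattern the loyalty measures are meant to pick up.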
Thanks. Questions? Comments? christ@wiwi.hu-berlin.de (Bettina, Steffan: Please suggest conferences for this research, Bled’02 might not be appropriate)
Acknowledgements HomeNet is funded by grants from Apple Computer, AT&T, Bell Atlantic, Bellcore, Intel, Carnegie Mellon University’s Information Networking Institute, Interval, the Markle Foundation, the NPD Group, the U.S. Postal Service, and US West. Farallon Computing and Netscape Communications contributed software. The work of Daniel S. Nagin was supported by the National Science Foundation under Grant No. SBR-9513040 to the National Consortium on Violence and also by separate National Science Foundation grants SBR-9511412 and SES-9911370. Mario Christ was supported by the German Research Society, Berlin-Brandenburg, Graduate School in Distributed Information Systems (DFG grant no. GRK 316). This research was also supported by the TransCoop program of the Alexander von Humboldt Foundation, Bonn, Germany. The work of Ramayya Krishnan was funded in part by NSF grant CISE/IIS/KDI 9873005.