F2: Data Beyond Numbers: Using Data Creatively for Research • Room: Lachine • Session Chair: Mary Luebbe, University of British Columbia • IASSIST 2007, Montreal • kbr
The truth is out there • Karsten Boye Rasmussen • associate professor, kbr@sam.sdu.dk • Department of Marketing and Management • University of Southern Denmark • Campusvej 55, DK-5230 Odense M, Denmark • +45 6550 2115 fax: +45 6593 1766 • organization and information technology • business intelligence • strategic organization design • 'it, communication and organization' www.itko.dk • Editor of the IASSIST Quarterly
The truth is out there - The data is out there • Point of departure • Use of data => documentation • Use of Internet for research • Relying on the Stimulus-Response contract • The potential of utilizing non-reactive data: • e-mails • blogs • web-logs (on hits, visits, users, etc.) • web-sites and links • paradata • Mixing methods • Having complete, reliable, and accessible resources
Data, use, metadata and documentality • "if information is data plus context, knowledge is information plus experience" (Levy & Powell, 2005) • data is description - of the world or objects in the world • description of data - is metadata • DDI 'The Data Documentation Initiative' • The quality measures of validity, reliability, accuracy, precision, bias, representativity, etc. • only available through the documentation of the data • the metadata • High documentality means the dataset is 'pattern' & 'model'
Errors in survey data • Acquiring primary data • survey is the "ability to estimate with considerable precision the percentage of a population that has a particular attribute by obtaining data from only a small fraction of the total population" (Dillman, 2007)
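The precision Dillman refers to can be illustrated with the standard normal-approximation margin of error for a proportion. This is a minimal sketch, assuming simple random sampling from a large population; the function name and the sample sizes are illustrative and not taken from the talk.

```python
import math

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a
    proportion p_hat estimated from a simple random sample of size n.
    z = 1.96 corresponds to roughly 95 per cent confidence (assumption:
    the population is large compared with the sample)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

if __name__ == "__main__":
    # Quadrupling the sample size halves the margin of error.
    for n in (100, 400, 1600):
        print(n, round(margin_of_error(0.5, n), 4))
```

The formula depends on the sample size, not on the size of the population, which is why a small fraction can suffice. It says nothing about bias from self-selection or non-response.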
Internet & research • a shift in the medium for data collection • self-administered internet surveys • web surveys • e-mail surveys • e-mail with links • the link points to a web-questionnaire • a mixed-mode within the Internet medium • e-mail with attached questionnaire • the questionnaire in software formats (Word) • e-mail text without attachments or links - answering mail • 3-5 questions • PLUS • non-reactive data: web, e-mail
Web survey - some problems • respondents have uneven accessibility to the Internet • unevenness in regard to the technical abilities: • bandwidth, computing power, and software (web-browsers) • however general web-site competences do exist • and telephone ownership is now too widespread • - another medium is needed
Web survey - the many pros • some reliable e-mail registers do exist • random selection - but not randomly generated ;-) • CAxI (computer-assisted interviewing in any mode, e.g. CATI for telephone interviewing) • more complicated structures possible in the answering • software will enforce consistent rule following • experiments using different sequencing of questions • the use of paradata in web surveys (later)
Web survey - the respondent • Internet coverage, sampling, and the right respondent • sampling is not secured by a large number of respondents • the problem of self-selection • a systematic bias • have to secure the right - or at least only one - respondent on the inquiry • the new problem of a 150 per cent response rate • log-in procedure with a PIN-code is recommended
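The PIN-code recommendation can be sketched as a simple acceptance rule: one stored response per issued PIN. This is a minimal illustration; the function names, the data layout, and the choice to keep the first submission rather than the last are all assumptions, not something prescribed in the talk.

```python
from typing import Dict, Iterable, List, Set, Tuple

def accept_responses(
    submissions: Iterable[Tuple[str, dict]],
    issued_pins: Set[str],
) -> Tuple[Dict[str, dict], List[Tuple[str, str]]]:
    """Keep the first submission per issued PIN and report the rest.

    submissions: (pin, answers) pairs in order of arrival.
    Assumption: the first submission wins; a real survey might prefer
    the last or the most complete one.
    """
    accepted: Dict[str, dict] = {}
    rejected: List[Tuple[str, str]] = []
    for pin, answers in submissions:
        if pin not in issued_pins:
            rejected.append((pin, "unknown PIN"))
        elif pin in accepted:
            rejected.append((pin, "duplicate submission"))
        else:
            accepted[pin] = answers
    return accepted, rejected

def response_rate(accepted: Dict[str, dict], issued_pins: Set[str]) -> float:
    """Share of invited respondents with an accepted response (at most 1.0)."""
    return len(accepted) / len(issued_pins)
```

Counting raw submissions against invitations is what can push the rate above 100 per cent; counting accepted PINs cannot.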
Web survey - success and hazard • quicker turnaround than through the postal or face-to-face questionnaire • raising the data quality by securing timely data • Internet surveys have a much lower 'marginal cost' • with the Internet and supportive software for web surveys • many more surveys are taking place • maybe too many • respondents tend to be more reluctant to participate in surveys • low response rates • as shown in surveys
Secondary data – a richness of data • The data is already out there - ready to use • data is being made available and retrievable • raising the data quality through a higher documentation level • ... a long list ... of pros for secondary data • for some areas the complete data is available • as the data in the operational system of the company • who bought what, when, and where? • the electronic traces left by human behavior
Online behavior / traces / Non-reactive • Investigating the sources • e.g. e-mails • e-mail fields • sender, date, subject, response - a network • content analysis • e-mail • Sender as node • Receiver as node • Response and initiator on a web-list • Subject as id of a thread
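The mapping from e-mail fields to a network (sender and receiver as nodes, subject as thread id) might be sketched as follows with Python's standard `email` package. The function names, the reply-prefix pattern, and the use of the normalized subject as thread id are assumptions made for illustration only.

```python
import re
from collections import Counter, defaultdict
from email import message_from_string
from email.utils import getaddresses, parseaddr

def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes so replies share a thread id.
    Assumption: only 're', 'fw' and 'fwd' prefixes are handled."""
    stripped = re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return stripped.strip().lower()

def build_email_network(raw_messages):
    """Return (edges, threads) from raw RFC 822 style messages.

    edges:   Counter of (sender, receiver) address pairs
    threads: normalized subject -> list of senders in input order
    """
    edges = Counter()
    threads = defaultdict(list)
    for raw in raw_messages:
        msg = message_from_string(raw)
        sender = parseaddr(msg.get("From", ""))[1].lower()
        recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
        for _name, address in recipients:
            if sender and address:
                edges[(sender, address.lower())] += 1
        threads[normalize_subject(msg.get("Subject", ""))].append(sender)
    return edges, dict(threads)
```

The first sender listed for a thread is its initiator and later senders are responders, provided the messages are passed in chronological order.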
Link Analysis – Graph Theory • nodes & edges • node: has names and properties: phones, doctors, web-pages, e-mailers • edge: pair of nodes connected by a relationship • often communication • fully connected graph • path: • an ordered sequence of nodes connected by edges
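These graph notions can be made concrete in a few lines. The sketch below assumes an undirected graph without self-loops and reads 'fully connected' as 'every pair of nodes shares an edge'; both are assumptions, as are the function names.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Undirected adjacency sets from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

def shortest_path(graph, start, goal):
    """Breadth-first search: a shortest ordered sequence of nodes joined
    by edges, or None if no path exists."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

def is_fully_connected(graph):
    """True if every node is adjacent to every other node
    (assumption: no self-loops)."""
    n = len(graph)
    return all(len(neighbours) == n - 1 for neighbours in graph.values())
```

For an e-mail network the nodes would be addresses and an edge would mean that at least one message passed between the pair.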
Online behavior / traces / Non-reactive • Investigating the sources • e-mails (mentioned earlier) • sender, date, subject, response - a network • blogs • the web-sites and the web • all these have ethical as well as legal implications (Allen) • Research into the virtual • Logs of behavior • web-log • paradata • ISP-log (internet service provider)
Web-log analysis • hits, pages, visits, users of a web-site • cookies and explicit user log-in • 'click-stream analysis' CLF (Common Log Format) • pages where the session stops? • patterns of web-movements that explain the stops • going in circles on a web site? • behavior of non-buyers and buyers
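Hits, pages, visits, and users can be derived from Common Log Format lines roughly as below. This is a sketch under stated assumptions: a visit ends after 30 minutes of inactivity, users are approximated by distinct hosts (cookies or log-ins would be better), and pages are successful requests that do not end in a known asset suffix. None of these thresholds come from the talk.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)
ASSET_SUFFIXES = (".gif", ".png", ".jpg", ".css", ".js", ".ico")  # assumption
SESSION_GAP = timedelta(minutes=30)  # assumption: common rule of thumb

def parse_line(line):
    """Parse one CLF line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    parts = match.group("request").split()
    return {
        "host": match.group("host"),
        # %b expects English month abbreviations (C locale).
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "path": parts[1] if len(parts) >= 2 else "",
        "status": int(match.group("status")),
    }

def summarize(lines):
    """Count hits, pages, visits, users, and the last page of each visit."""
    records = [r for r in map(parse_line, lines) if r is not None]
    records.sort(key=lambda r: (r["host"], r["time"]))
    pages = [r for r in records
             if r["status"] == 200 and not r["path"].lower().endswith(ASSET_SUFFIXES)]
    visits = 0
    exit_pages = Counter()
    previous = None
    for rec in pages:
        new_visit = (previous is None
                     or rec["host"] != previous["host"]
                     or rec["time"] - previous["time"] > SESSION_GAP)
        if new_visit:
            visits += 1
            if previous is not None:
                exit_pages[previous["path"]] += 1  # previous visit stopped here
        previous = rec
    if previous is not None:
        exit_pages[previous["path"]] += 1
    return {"hits": len(records), "pages": len(pages), "visits": visits,
            "users": len({r["host"] for r in records}), "exit_pages": exit_pages}
```

The `exit_pages` counter addresses the question of where sessions stop; comparing it between buyers and non-buyers would require linking hosts or log-ins to orders, which this sketch does not attempt.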
Paradata in surveys • web-log of the process of answering a web survey • timing of the respondent's progression from one web page to the next • paradata is data about the process of data collection (Couper) • collection at the client-side (Heerwegh) • JavaScript can trace - with timing - different types of answering mechanisms: • drop-down lists • radio-buttons • click-items • give value etc. • and client-side can also track how the respondent has changed the answers
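Once client-side events have been collected, they can be summarized per question. The sketch below covers only the analysis step, in Python, and assumes a hypothetical event format of (timestamp in milliseconds, question id, value) for one respondent on one page; the capture itself would happen in JavaScript, as the slide notes.

```python
from collections import defaultdict

def summarize_paradata(page_loaded_ms, events):
    """Summarize answer events for one respondent on one page.

    events: iterable of (timestamp_ms, question_id, value) tuples
            (hypothetical format, assumed for this sketch).
    Returns elapsed time attributed to each question, the number of
    answer changes, and the final answers.
    """
    ordered = sorted(events, key=lambda event: event[0])
    elapsed = defaultdict(int)
    history = defaultdict(list)
    last_ts = page_loaded_ms
    for ts, question, value in ordered:
        # Attribute the time since the previous event to this question.
        elapsed[question] += ts - last_ts
        last_ts = ts
        # Record the value only when it differs from the previous one.
        if not history[question] or history[question][-1] != value:
            history[question].append(value)
    return {
        "elapsed_ms": dict(elapsed),
        "changes": {q: len(values) - 1 for q, values in history.items()},
        "final": {q: values[-1] for q, values in history.items()},
    }
```

Attributing each gap to the question answered next is one simple convention among several; it cannot tell reading time from thinking time.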
With focus on the customer • Customizing • specially made for the customer, at the customer's request • mass-customization • Personalizing • built around the customer's needs • as read automatically • through the customer's transactions • and other customers' transactions • Discriminizing?
Analyzing virtual communities • Amazon • first among communities of customers • making other customer comments and evaluations available • using their behavior – linking the books bought and customers • many more sites of communities • e-mail lists • blogs • dating sites • potential in personal links as in Linkedin.com • and in the constructed virtual reality of 'Second Life' … • or the links contained in the web itself
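Linking books and customers through purchases can be illustrated with a plain co-purchase count. This is a generic sketch, not a description of how Amazon or any other site actually works; the function names and the tie-breaking rule are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

def co_purchase_counts(purchases):
    """Count how many customers bought each pair of items.

    purchases: iterable of (customer, item) pairs.
    Returns a Counter keyed by alphabetically ordered item pairs.
    """
    baskets = defaultdict(set)
    for customer, item in purchases:
        baskets[customer].add(item)
    pairs = Counter()
    for items in baskets.values():
        for a, b in combinations(sorted(items), 2):
            pairs[(a, b)] += 1
    return pairs

def also_bought(pairs, item, top=3):
    """Items most often bought by customers who also bought `item`.
    Ties are broken alphabetically (assumption, for reproducibility)."""
    related = Counter()
    for (a, b), count in pairs.items():
        if a == item:
            related[b] += count
        elif b == item:
            related[a] += count
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _count in ranked[:top]]
```

The same pair counts define a network in which items are nodes and shared customers are weighted edges.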
Network • Live
Mixed modes and mixed methods • modes of surveys with questionnaires • postal, with interviewer, face-to-face or telephone, or web-mode • mixed-mode has the ability to reduce non-response • 'sequential mixed-mode ... do not pose any problems' (de Leeuw) • but different modes often produce different results (Dillman) • the 'unimode design' • later a mode-specific design taking full advantage of the mode used • 'mixed methods' is more the combination of qualitative and quantitative methods - and S-R and non-reactive data
Conclusion • more data is out there • as traces of actual behavior • get it!
? • Got it? • Thanks • Karsten Boye Rasmussen • SDU
Abstract • The Internet has presented a welcome medium for traditional research, as found in Internet surveys. The price of conducting surveys has gone down, and the resulting higher frequency of surveys has also brought the focus to some of the central drawbacks of conducting surveys. We are continuously battling challenges to the validity of the results obtained through the survey design because of the bias present in low response rates. This presentation will exemplify how the electronic traces of human behavior supply a new area of valid non-reactive data that adds complete, reliable, and easily accessible resources for analysis. Both e-mails and blogs can be the basis for content analysis as well as for structural or network analysis. The electronic traces of behavior exemplified by click-streams of web behavior can be used stand-alone or to enhance web surveys through paradata. Lastly, the Internet itself presents an area of research.