Slide 1:Practice and Theory in Digital Libraries: The Case of Open Video Gary Marchionini, PhD
University of North Carolina at Chapel Hill
www.ils.unc.edu/~march
march@ils.unc.edu
May 30, 2005
Slide 2:Outline Digital Libraries as phenomena
Multimedia and video challenge our text biases
Open Video concepts and system
User studies
Conclusion
Slide 3:Pragmatics Useful theory and practice are a Moebius strip
DL practice is informed by multiple theories related to:
Information structure
Human behavior
System design
Social-political-economic constraints and organizational behavior
History and epistemology
“We want principles, not only developed—the work of the closet—but applied, which is the work of life.” Horace Mann, Thoughts, 1867
Slide 4:Theories of What and Why Digital extensions of physical libraries
Augmentations of intellect
Collaborative spaces: sharium
Cultural institutions
World Brain
Economic models
Complex information systems
Slide 5:Theories of How Reuse and open source information
Levels of abstraction
Information retrieval
Information interaction
Iterative design and evaluation
Resource management
Slide 6:Digital Library Design Space 1999: What Has Changed in 2005? Community includes policies, collaboration and cooperation
In 2005 there is more balance because DLs have been addressed by working libraries rather than only researchers.
Technology has not changed much: content management systems becoming commercial; mobility the current tech direction
Lots of attention to content---digitizing everything; metadata developments
Much more attention to community, contributor run DLs, bottom up pressures, use of DLs by K-12 and global users; interest in cross cultural issues
Services---we don't know how to improve them
Slide 7:Provocation: Text no longer rules: The Net generation depends much less on reading; they are entering universities as students and, soon, as professors (Oblinger & Oblinger, 2005 Educause book). In the US:
Children age 6 or younger: average of 2 hrs/day using screen media, 1.6 hrs/day playing outside, 39 min. reading
13-17 yr olds: average 3.1 hrs/day watching TV and 3.5 hrs/day with digital media. They multitask
More than 2 million US children (ages 6–17) have their own Web site. Girls are more likely to have a Web site than boys (12.2 percent versus 8.6 percent).
Ability to use nontext expression—audio, video, graphics—appears stronger in each successive cohort.
Multimedia and multitasking are the trend of the 21st century
Information specialists MUST get over our text bias
Slide 8:Open Video DL Case Open
Public good
Reusable
Files not streams
Chunking
Agile views user interface
Alternative representations (views)
Agile control mechanisms
Slide 9:Open Video Vision/Contributions An open repository of video files that can be re-used in a variety of ways by the education and research communities
Encourages contributions
A testbed for interactive interfaces
An easy to use DL based upon the agile views interface design framework
Multiple, cascading, easy-to-control views (previews, overviews, reviews, shared, peripheral)
Views based upon empirically validated surrogates
An environment for building theory of human information interaction
A set of methods and metrics that reveal how people understand digital video through surrogates
Slide 10:Background & Status Begun 1995 with colleagues at UMD & BCPS
Funding: NSF, NASA, NSF/LoC
Collaborators/Contributors: I2-DSI, ibiblio, CMU, UMD, NIST, Prelinger and Internet Archives, NASA, ACM
~2600 video segments
~2000 different titles
~15000 unique visitors per month
MPEG-1, MPEG-2, MPEG-4, QT
OAI provider (see the harvesting sketch after this list)
Ongoing user studies
New Preservation initiative
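Because Open Video is an OAI provider, its catalog metadata can be harvested with standard OAI-PMH requests. A minimal sketch in Python, assuming Dublin Core metadata; the endpoint URL is a hypothetical placeholder, not the project's documented address:

```python
# Minimal OAI-PMH harvesting sketch: fetch one page of ListRecords
# and print Dublin Core titles. The base URL below is a placeholder.
import urllib.request
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

def list_titles(base_url, prefix="oai_dc"):
    """Yield <dc:title> values from a single ListRecords response."""
    url = f"{base_url}?verb=ListRecords&metadataPrefix={prefix}"
    with urllib.request.urlopen(url) as resp:
        root = ET.parse(resp).getroot()
    for title in root.iter(DC + "title"):
        yield title.text

# Hypothetical endpoint; substitute the repository's real base URL.
# for t in list_titles("https://www.open-video.org/oai"):
#     print(t)
```

A full harvester would also follow resumptionToken elements to page through the complete record set.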
Slide 11:Agile Views Interface Research Provide a variety of access representations (e.g., indexes) and control mechanisms
Usual search and browse capabilities
Leverage both visual and linguistic cues
Create and test surrogates for overview, preview, shared, and history views
Experts will learn/tolerate bad interfaces if the content is good
Slide 12:User Study Framework
Slide 13:The Surrogates Storyboard with text keywords (20-36 frames per board @ 500 ms)
Storyboard with audio keywords
Slide show with text keywords (250 ms, repeated once)
Slide show with audio keywords
Fast forward (~4X)
Fast forwards at 32X, 64X, 128X, 256X (a frame-sampling sketch follows this list)
Poster frames
Real time clips
Text titles
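The fast-forward surrogates listed above are, at bottom, subsampled playback: keep one frame in N and play the result at the normal frame rate. A minimal sketch using OpenCV, an implementation choice for illustration only; the slides do not say how the project's surrogates were actually produced:

```python
# Build an NX fast-forward surrogate by keeping every Nth frame.
# Requires OpenCV (pip install opencv-python); paths are illustrative.
import cv2

def fast_forward(src, dst, speedup=4):
    """Write a surrogate retaining 1 of every `speedup` frames."""
    cap = cv2.VideoCapture(src)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(dst, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    i = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % speedup == 0:   # playback appears `speedup` times faster
            out.write(frame)
        i += 1
    cap.release()
    out.release()

# fast_forward("segment.mpg", "segment_ff32x.mp4", speedup=32)
```

Storyboards and slide shows follow the same idea with far larger N and salient-frame selection instead of uniform sampling.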
Slide 14:Surrogate Examples
Slide 15:Metrics
Slide 16:User Studies Study 1: Qualitative Comparison of Surrogates (ECDL 02)
Study 2: Fast Forwards (JCDL 03)
Study 3: Narrativity (CHI 02; ASIST 03 paper)
Study 4: Shared views and History Views (Geisler dissertation)
Study 5: Poster frames and text (eye tracking, CIVR 03)
Study 6: TREC evaluations (03 and 04)
Study 7: Cognitive load and ISEE (Mu dissertation)
Study 8: Relevance judgments for video (Yang dissertation)
Study 9: Surrogate integration study (in analysis)
Others: several specific master's papers (Hughes, Gruss)
Slide 17:Study 1: Compare Surrogates What are the strengths and weaknesses of different surrogates from the users’ perspective?
Are any of the surrogates better than the others in supporting user performance?
Slide 18:The Surrogates Storyboard with text keywords (20-36 frames per board @ 500 ms)
Storyboard with audio keywords
Slide show with text keywords (250 ms, repeated once)
Slide show with audio keywords
Fast forward (~ 4X)
Slide 19:Method 7 video segments (2-10 min), 5 surrogates created for each
10 subjects with high video and computer experience
Three phases (all multi-camera videotaped)
View full video then use 3 surrogates, repeat
Participant observation and debriefing
Do NOT view full video, use 3 surrogates, repeat
Participant observation and debriefing
Complete 3 assigned tasks with surrogates of choice
Think aloud and debriefing
http://www.open-video.org/experiments/chi-2002/methods/study1.mov
Slide 20:Tasks Gist determination—free text
Gist determination—multiple choice
Object recognition—textual
Object recognition—graphical
Action recognition (2-3 second clips)
Visual gist (predict which frames belong)
http://www.open-video.org/experiments/chi-2002/surrogates/index.html
Slide 21:Preferences In debriefing after each phase, subjects asked about preferences.
Some preferences changed over the phases
2 subjects preferred fast forward
4 subjects said fast forward, if audio keywords were added
1 preferred the storyboard with audio keywords
2 preferred the slide show with audio keywords
→ Drop the slide show with text keywords; develop fast forward
Slide 22:Performance No SRD (statistically reliable difference) on gist (both free text and multiple choice)
SRD on action recognition favoring fast forward
'Near' SRD on text object recognition favoring storyboard with audio keywords
Compaction rates of 8:1 to 29:1 were suitable for the tasks (see the worked example after this list)
Psychometric and face validity support for the tasks (means and variances; relevant to real tasks)
SRD in gist and visual gist for one video
→ Homogeneity of frames diminishes surrogate value
→ Keywords help when visual variability decreases
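Compaction rate here is simply full-video duration divided by surrogate viewing time. A worked example with hypothetical numbers chosen to match the surrogate parameters above:

```python
# Compaction rate = full-video duration / surrogate viewing time.
# All numbers are illustrative, not figures from the studies.
def compaction_rate(video_s, surrogate_s):
    return video_s / surrogate_s

# A 6-minute segment as a 25-frame storyboard at 500 ms per frame:
print(compaction_rate(360, 25 * 0.5))   # 28.8 -> roughly 29:1
# A 2-minute segment as a 30-frame storyboard at 500 ms per frame:
print(compaction_rate(120, 30 * 0.5))   # 8.0 -> 8:1
```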
Slide 23:Qualitative Results Subjects suggested different surrogates for different tasks (e.g., fast forward for judging whether content is kid-safe, storyboard for identifying images, fast forward for judging video styles)
Three senses of gist
Topic (T)
Narrativity (N)
T+N+visual style
Individual preferences and experiences influence surrogate effectiveness
Slide 24:Study 2: Fast Forward How fast can we make fast forwards?
4 ff conditions (32X, 64X, 128X, 256X)
Four video segments for each condition
45 subjects (1/2 UG, 1/2 grad, 2/3 female)
6 tasks (full text gist, multiple choice gist, word object recognition, graphical object recognition, action recognition, visual gist)
Counterbalance speed and videos (a Latin-square sketch follows this list)
Web-driven experimental sessions; 3-camera videotaping; single subject at a time in the usability laboratory
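Counterbalancing of this kind is commonly implemented as a Latin square, so each speed appears at each presentation position equally often across participant groups. A minimal cyclic-square sketch; the assignment scheme is an illustration, not the study's actual design:

```python
# Cyclic 4x4 Latin square: each speed occurs once per row (group)
# and once per column (presentation position). Illustrative only.
SPEEDS = [32, 64, 128, 256]

def latin_square(items):
    """Row i is `items` rotated left by i positions."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

for group, order in enumerate(latin_square(SPEEDS), start=1):
    print(f"group {group}: speeds in presentation order {order}")
```

A cyclic square controls serial-position effects; a balanced (Williams) square would additionally control first-order carry-over between speeds.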
Slide 25:Example Image Recognition Stimulus
Slide 26:Results SRD on 4 of 6 tasks as speed increases; however, performance remained reasonable even at the highest rate
Video content/genre interacts with performance
Preference does not parallel performance (people can perform well under extreme conditions but do not like or enjoy them)
No user characteristic differences (age, sex)
→ Give users control but select appropriate defaults
Caveat: controlled, independent focus on FF, likely a lower bound on performance
Slide 27:Speed Effects on Performance
Slide 28:Narrativity Study Walk-up kiosk at CHI; 20 people used it
20 one-minute clips (half b&w, no audio) selected on 2 criteria: contain characters, have cause/effect relations between scenes (5 in each category)
SRD on characters, cause/effect, and their interaction
Slide 29:Shared Views and History Views Studies Evaluate AV Design Framework by instantiating and evaluating a design
Shared (based on recommendations) and History Views (based on logs)
Phase 1: compared OV to the Agile Views interface (28 participants). OV better on accuracy; no SRD on time, but a learning effect; AV better on navigation/efficiency; AV better on satisfaction
Phase 2: qualitative analysis of shared and history views
Slide 30:Poster Frame Study Research Questions:
Given both textual and visual metadata, which surrogate will be used, and which will be preferred?
Does the placement of the surrogates affect how they are used?
Does the assigned task affect how surrogates are used?
Does personal preference play a role in how surrogates are used?
Slide 31:Study Methods / Procedures 12 undergraduate students (paid volunteers)
Pre-Study questionnaire
Demographics
Visual vs. verbal learning style (VVQ, Visualizer/Verbalizer Questionnaire)
10 search problems
Counter-balanced
Design 1 and 2
1 : text on left / visuals on right
2 : visuals on left / text on right
Eyetracking
Post-study questionnaire
Follow up questions
Slide 32:Results All participants over all tasks:
Mean time looking at text = 29.7 sec.
Mean time looking at pics = 6.8 sec.
75% of fixations over text
18% of fixations over pics
First fixations over text = 65
First fixations over pics = 54
Text requires, and gets, more user attention (the sketch below shows how such fixation figures are computed)
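Figures like these come from assigning each fixation to the area of interest (AOI) it falls in and aggregating durations. A minimal sketch with made-up AOI geometry and a simplified fixation record; real eye-tracker exports vary by vendor:

```python
# Sum fixation durations per area of interest (AOI).
# AOI boxes and the fixation format are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float         # fixation centroid, screen pixels
    y: float
    duration: float  # seconds

AOIS = {  # (x0, y0, x1, y1) for a text-left / pictures-right layout
    "text": (0, 0, 512, 768),
    "pics": (512, 0, 1024, 768),
}

def dwell_by_aoi(fixations):
    """Return total fixation duration (seconds) per AOI."""
    totals = {name: 0.0 for name in AOIS}
    for f in fixations:
        for name, (x0, y0, x1, y1) in AOIS.items():
            if x0 <= f.x < x1 and y0 <= f.y < y1:
                totals[name] += f.duration
                break
    return totals

print(dwell_by_aoi([Fixation(100, 300, 0.25), Fixation(700, 300, 0.18)]))
# {'text': 0.25, 'pics': 0.18}
```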
Slide 33:Results cont’d Design 1 vs. Design 2
When text was placed on the left, mean time per fixation was slightly higher
VVQ
Balanced group spent more time looking at text
Tasks
Varied by task:
Time spent looking at text
Time spent per fixation over text
Frequency of fixations over text
Slide 34:Screen Shots
Slide 35:Screen Shots
Slide 36:Screen Shots
Slide 37:Tasks Please find a video that discusses the destruction earthquakes can do to buildings. These search results are from a search on the word “Earthquake”.
Please find a video that discusses nurses and their contributions to the United States Army. These search results are from a search on the word “Work”.
Please choose a video from the following list that you think would be entertaining for you and your friends to watch.
Slide 38:Discussion In this restricted situation (i.e., a pre-formulated results page), participants used text as the main anchor point
Because text is a better surrogate?
Because text contains more information?
Because text is more familiar to people?
Because tasks directed users to text?
Slide 39:Discussion cont’d Layout seemed to have little effect on how surrogates were used.
A difference of 0.03 seconds
Participants didn’t report a significant preference for layout
Some liked design 1 and some liked design 2
VVQ
Hypothesis that visual learners would use visual surrogates and verbal learners would use verbal surrogates was not supported
Layout: interesting for designers; placement won't influence usage, so placement can be based on other design choices
Text was used regardless of where it was placed
VVQ: small sample; open questions about the VVQ's validity
The most likely explanation for this result is that the balanced learners had a stronger preference for text than the visual group, and so spent more time with it
Slide 40:Discussion cont’d Tasks
Some tasks took more time to complete
Regardless of:
Counterbalancing order
Participant
Layout design
Slide 41:Text or Pictures? Text was reported as:
Being the search anchor
Containing significant topical information
Taking longer to read than pictures
Visuals were reported as:
Being liked overall
Being used to quickly narrow down choices
Taking less time to decode than text
All participants said the results page would be weaker without them
Often lacking in reference points
Slide 42:Conclusion Visual metadata was used to make (or perhaps only confirm) relevance judgments
Combination of visual & verbal is stronger than either alone
Generalize with caution:
Small number of study participants
Specific set of search results pages
Ten specific search tasks.
Slide 43:The Integration Study Compare old OV to the redesign? Compare to the Internet Archive?
How do multiple surrogates and agile control mechanisms affect understanding of video?
Accuracy? Time? Satisfaction? Cognitive load? Navigational overhead?
Data analysis underway
Slide 44:Relevance Study (Yang) 3 task groups (illustration [10 profs], collection building [8 video librarians], video production [8 producers/editors])
In-depth interviews
Text, audiovisual, and implicit categories covering 39 different criteria
Topicality most often mentioned, but far less often than in studies of text retrieval
Production group less varied; more audiovisual criteria
Slide 45:Theory-Practice Lessons from OV User-centered design and user testing pay off; i.e., research informs practice
Production system operation raises new kinds of research questions
Sustainability models
Curatorial models
Preservation challenges
Upgrade paths for universal access
Slide 46:DL Research Directions Incorporating people into DLs (patrons, librarians)
Leveraging contributions and implications for curatorship
Preservation strategies; how much context?
Hybrid physical-digital library operations
Slide 47:Observations A Moebius strip has no end: the interplay between theory and practice goes on
Need for collaboration between working libraries and researchers
Slide 48:Selected Open Video Readings Yang, M. & Marchionini, G. (2005). “Deciphering visual gist and its implications for video retrieval and interface design.” Conference on Human Factors in Computing Systems (CHI). Portland, OR. Apr. 2-7, 2005.
Yang, M. & Marchionini, G. (2004). “Exploring Users' Video Relevance Criteria -- A Pilot Study.” Proceedings of the Annual Meeting of the American Society for Information Science and Technology, pp. 229-238. Nov. 12-17, 2004. Providence, RI.
Yang, M., Wildemuth, B., & Marchionini, G. (2004). “The relative effectiveness of concept-based versus content-based video retrieval.” Proceedings of the ACM Multimedia conference, pp. 368-371.
Mu, X., & Marchionini, G. (2003). “Enriched video semantic metadata: authorization, integration, and presentation.” Proceedings of the Annual Meeting of the American Society for Information Science and Technology, 40, 316-322.
Wilkens, T., Hughes, A., Wildemuth, B. M., & Marchionini, G. (2003). “The role of narrative in understanding digital video: an exploratory analysis.” Proceedings of the Annual Meeting of the American Society for Information Science and Technology, 40, 323-329.
Hughes, A., Wilkens, T., Wildemuth, B., Marchionini, G. (2003). “Text or Pictures? An Eyetracking Study of How People View Digital Video Surrogates.” Proceedings of CIVR 2003, pp. 271-280.
Wildemuth, B. M., Marchionini, G., Yang, M., Geisler, G., Wilkens, T., Hughes, A., and Gruss, R. (2003). “How Fast Is Too Fast? Evaluating Fast Forward Surrogates for Digital Video.” Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2003), pp. 221-230. (Vannevar Bush Award Winner for Best Paper at JCDL 2003)
Mu, X., Marchionini, G., & Pattee, A. (2003). “The Interactive Shared Educational Environment: User interface, system architecture and field study.” Proceedings of the Annual Meeting of the American Society for Information Science and Technology, 40, 291-300.
Mu, X., Marchionini, G. (2003) “Statistical Visual Features Indexes in Video Retrieval.” Proceedings of SIGIR 2003, pp. 395-396.
Marchionini, Gary (2003). “Video and Learning Redux: New Capabilities for Practical Use.” Educational Technology.
Marchionini, Gary and Geisler, Gary. (2002). “The Open Video Digital Library.” D-Lib Magazine, Vol. 8, Number 12, December.
Barbara M. Wildemuth, Gary Marchionini, Todd Wilkens, Meng Yang, Gary Geisler, Beth Fowler, Anthony Hughes, and Xiangming Mu (2002). “Alternative Surrogates for Video Objects in a Digital Library: Users' Perspectives on Their Relative Usability.” Proceedings of the 6th European Conference on Digital Libraries, September 16 - 18, 2002, Rome, Italy.
Geisler, G., Marchionini, G., Wildemuth, B. M., Hughes, A., Yang, M., Wilkens, T., and Spinks, R. (2002). “Video Browsing Interfaces for the Open Video Project.” Proceedings of CHI 2002, Extended Abstracts.
Nelson, Michael L., Marchionini, Gary, Geisler, Gary, and Yang, Meng (2001). "A Bucket Architecture for the Open Video Project [short paper]." JCDL ’01, ACM - IEEE Joint Conference on Digital Libraries (June 24-28, 2001, Roanoke, Virginia).
Geisler, Gary, and Gary Marchionini (2000). "The Open Video Project: A Research-Oriented Digital Video Repository [short paper]." In Digital Libraries '00: The Fifth ACM Conference on Digital Libraries (June 2-7 2000, San Antonio, TX). New York: Association for Computing Machinery, 258-259.
Slaughter, L., Marchionini, G. and Geisler, G. (2000). "Open Video: A Framework for a Test Collection." Journal of Network and Computer Applications, Vol. 23(3). San Diego: Academic Press.