An Experiment To Characterize Videos On The Web. Soam Acharya, Brian Smith. Cornell University. MMCN 1998. Overview: designed and implemented an experiment to search and analyze videos on the web, covering 22,500 HTML documents, 57,000 movies, and 100 Gbytes of data.
An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998
Overview • Designed and implemented an experiment to search and analyze videos on the web • 22500 HTML documents • 57000 movies • 100 Gbytes of data
Why? • Codec Designers • Network Engineers • Other Multimedia Researchers • MM file systems • Webmasters
Questions We Asked
• How many movies are out there? Not all that many. We found 57,000.
• What are their basic properties? 90% last 45 seconds or less. Their median size is 1.1 Mbytes.
• What compression formats are popular? QuickTime accounts for about 53%, followed by MPEG (30%) and AVI.
• How well do the formats compare? MPEG compresses best. QuickTime and AVI are similar.
• Are standard modem rates enough? No. Rates of 28.8 - 128 Kilobits/sec (Kbps) are useless for real-time download and display of movies.
Roadmap • Data Collection Methodology • Analysis • Results • Conclusion • Future Work • Open Questions
Data Collection Methodology • Hunting Phase: get links to movies • Gathering Phase: download movies and gather raw statistics • Sifting Phase: eliminate outliers
Hunting Phase (early April 1997) • Queried AltaVista for documents dated January 1995 - March 1997 • Looked for MPEG, QuickTime, and AVI movies • No streaming video formats
Gathering Phase (mid April 1997 - May 1997) • LDG: movie link distributor/gatherer • LP: link processor (LP0, LP1, LP2) • [Figure: the LDG holds the list of links (http://www.eg.com/movie.html, http://www.cnn.com/pepe.html, …..) and works with the link processors against servers such as www.eg.com and www.vid.com. Labelled steps: 1. http://www.eg.com/movie.html, 2. movie.html, 3. my.mov, 4. summary statistics]
Sifting Phase • Processed 100 Gbytes of data and 57,000 titles • Used mpegstat and a modified xanim • Kept only titles meeting the criteria below; braces give the number of titles each criterion eliminated • 4 < frames/sec < 40 {5000 titles} • duration > 0.5 seconds {1000 titles} • 0.6 < aspect ratio < 1.667 {1000 titles} • bitrate < 10 Mbps {1000 titles}, where bitrate = (movie size)/(movie duration) • duplicate URL detection {1500 titles}
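The sifting criteria can be sketched as a small filter. This is only an illustration under stated assumptions, not the authors' tooling: the `MovieStats` record, the function names, the strict inequalities, and the reading of 10 Mbps as 10,000,000 bits/sec are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class MovieStats:
    """Hypothetical per-movie record; field names are assumptions."""
    size_bytes: int
    duration_s: float
    frames: int
    width: int
    height: int


def bitrate_bps(m: MovieStats) -> float:
    # bitrate = (movie size) / (movie duration), expressed in bits per second
    return m.size_bytes * 8 / m.duration_s


def passes_sift(m: MovieStats) -> bool:
    """Apply the duration, frame rate, aspect ratio, and bitrate criteria."""
    if m.duration_s <= 0.5 or m.height <= 0:
        return False
    fps = m.frames / m.duration_s
    if not (4 < fps < 40):
        return False
    aspect = m.width / m.height
    if not (0.6 < aspect < 1.667):
        return False
    # Assumption: 10 Mbps is taken as 10,000,000 bits per second.
    return bitrate_bps(m) < 10_000_000


def sift(movies: Iterable[Tuple[str, MovieStats]]) -> List[Tuple[str, MovieStats]]:
    """Drop outliers and repeated URLs, keeping the first copy of each URL."""
    seen = set()
    kept = []
    for url, stats in movies:
        if url in seen:
            continue
        seen.add(url)
        if passes_sift(stats):
            kept.append((url, stats))
    return kept
```

Exact-match URL comparison is the simplest form of duplicate detection; the slides do not say how duplicates were actually identified.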
Analysis • 47500 titles remained • 53% QuickTime, 30% MPEG, 17% AVI • Analysis can be divided into two categories: • Distributions: by date, fps, size, duration, aspect ratio, bitrate • Comparing movie formats against each other
Movie Size (in bytes) • 70% of movies are 2 Mbytes or less • Median movie size is about 1.1 Mbytes
Aspect Ratio • 74% of all files had an aspect ratio of 1.333 (e.g. 320 x 240, 160 x 120) • 89% had aspect ratios of 1.2 - 1.5
So Far ... • Distributions: • by date • fps • size • duration • aspect ratio • bitrate • Comparing movie formats
AVI/QuickTime Comparison
• 25% of AVI, 33% of QuickTime: video only

Audio Codecs
AVI         QuickTime
PCM         PCM
MS-ADPCM    TWOS

Video Codecs         AVI    QuickTime
Radius Cinepak       43%    60%
Intel Indeo R3.2     25%    2%
Microsoft Video 1    26%    0%
Apple Video-RPZA     0%     22%
How To Compare Compression?
• bits/pixel = (video size in bits) / (width * height * number of frames)

Format       Mean bits/pixel    Median bits/pixel
AVI          2.51               2.14
QuickTime    2.16               1.82
MPEG         0.72               0.51
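The bits/pixel metric is simple enough to state as a function. This is a minimal sketch; the function name and the choice to take the video size in bytes are assumptions made for the example.

```python
def bits_per_pixel(video_bytes: int, width: int, height: int, frames: int) -> float:
    """bits/pixel = (video size in bits) / (width * height * number of frames)."""
    pixels = width * height * frames
    if pixels <= 0:
        raise ValueError("width, height, and frames must all be positive")
    return video_bytes * 8 / pixels


# Illustrative input only: a 320 x 240 clip with 150 frames and 1,440,000 bytes of video data.
example = bits_per_pixel(1_440_000, 320, 240, 150)
```

Because the metric normalizes by frame size and frame count, it allows clips of different dimensions and lengths to be compared on compression alone.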
MPEG Bits/pixel Distribution

Frame Type    Mean bits/pixel    Median bits/pixel
I             1.25               1.10
P             0.76               0.54
B             0.31               0.19

• Size of I:P:B frames ~ 1 : 2 : 5
• 90% of MPEG files were video only
MPEG Frame Patterns

Frame Pattern      % Distribution    Mean bits/pixel
I                  27.1              1.17
IBBPBB             15.7              0.7
IBBPBBPBBPBBPBB    10.4              0.31
IBBPBBPBBPBB       8.1               0.5
IBBBPBBBPBBB       4.4               0.66
IPBBIBB            4.2               0.39
IIP                3.5               0.7

• 80% of MPEG files have some recurring frame pattern
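One way to look for a recurring pattern in a sequence of frame types is to find the shortest prefix that, repeated, reproduces the whole sequence. The slides do not say how patterns were detected, so the approach below, the function names, and the rule that a pattern must occur at least twice are all assumptions made for illustration.

```python
def smallest_repeating_pattern(frame_types: str) -> str:
    """Return the shortest prefix whose repetition reproduces the sequence.

    A truncated final repetition is allowed, e.g. "IBBPBBIBB" yields "IBBPBB".
    """
    n = len(frame_types)
    for period in range(1, n + 1):
        if all(frame_types[i] == frame_types[i % period] for i in range(n)):
            return frame_types[:period]
    return frame_types  # only reached for an empty sequence


def is_recurring(frame_types: str, min_repeats: int = 2) -> bool:
    """True if the shortest pattern fits into the sequence at least min_repeats times."""
    pattern = smallest_repeating_pattern(frame_types)
    return bool(pattern) and len(pattern) * min_repeats <= len(frame_types)
```

For example, `smallest_repeating_pattern("IBBPBBIBBPBBIBB")` yields `"IBBPBB"`, while a sequence such as `"IPBBP"` has no shorter period than itself and is not counted as recurring.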
Recap • Number of movies coming online - exponential, then flat • MPEG higher fps, QuickTime/AVI lower • Median size of movies: 1.1 Mbytes • 90% of movies last 45 seconds or less • 1.333 is the most common aspect ratio • 28.8 - 128 Kbps modem rates useless for real-time downloads • Radius Cinepak is widely used by QuickTime and AVI • MPEG compresses better than QuickTime and AVI • 80% of MPEGs have some sort of recurring pattern
Conclusion • Existing compression technologies not enough for transmission over standard modems • explains rise of streaming video technologies • users cope by making file sizes, duration smaller • but not by throttling the bitrate • perceptual threshold?
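A rough calculation shows why modem rates fall short even for the median-sized movie. This is a back-of-the-envelope sketch that assumes 1 Mbyte = 1,000,000 bytes and ignores protocol overhead; the function name is hypothetical. It covers transfer time for the median file size only and does not attempt to combine the size and duration statistics, which are reported separately.

```python
def transfer_seconds(size_bytes: int, link_bps: int) -> float:
    """Ideal time to move size_bytes over a link of link_bps bits per second."""
    return size_bytes * 8 / link_bps


MEDIAN_SIZE_BYTES = 1_100_000  # median movie size of about 1.1 Mbytes (decimal Mbytes assumed)

slow = transfer_seconds(MEDIAN_SIZE_BYTES, 28_800)   # about 306 seconds
fast = transfer_seconds(MEDIAN_SIZE_BYTES, 128_000)  # 68.75 seconds
```

Even at 128 Kbps the median movie takes over a minute to arrive, and at 28.8 Kbps about five minutes.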
Future Work • How do videos age? • Another study to confirm findings (Brewster Kahle, www.archive.org) • Develop tools to automate the process
Open Questions • What are video access patterns on the Web? • How to analyze streaming video files?