
Information Extraction from Multimedia Content on the Social Web


Presentation Transcript


  1. Information Extraction from Multimedia Content on the Social Web Stefan Siersdorfer L3S Research Centre, Hannover, Germany

  2. Meta Data and Visual Data on the Social Web • Meta Data: • Tags • Titles, Descriptions • Timestamps • Geo-Tags • Comments • Numerical Ratings • Users and Social Links • Visual Data: • Photos • Videos How to exploit the combined information from visual data and meta data?

  3. Example 1: Photos in Flickr

  4. Example 2: Videos in YouTube

  5. Social Web Environments as Graph Structure (figure: example graph with users, tags, videos, and groups) • Entities (Nodes): • Resources (Videos, Photos) • Users • Tags • Groups • Relationships (Edges): • User-User: Contacts, Friendship • User-Resource: Ownership, Favorite Assignment, Rating • User-Group: Membership • Resource-Resource: visual similarity, meta data similarity

  6. User Feedback on the Social Web • Numeric Ratings, Favorite Assignments • Comments • Clicks/Views • Contacts, Friendships • Community Tagging • Blog Entries • Upload of Content How can we exploit the community feedback?

  7. Outline • Part 1: Photos on the Social Web • 1.1) Photo Attractiveness • 1.2) Generating Photo Maps • 1.3) Sentiment in Photos • Part 2: Videos on the Social Web • Video Tagging

  8. Part I: Photos on the Social Web

  9. 1.1) Photo Attractiveness * * Stefan Siersdorfer, Jose San Pedro: Ranking and Classifying Attractiveness of Photos in Folksonomies. 18th International World Wide Web Conference (WWW 2009), Madrid, Spain

  10. Attractiveness of Images Which factors influence the human perception of attractiveness? (Example photos: Landscape, Portrait, Flower)

  11. Attractiveness: Visual Features • Human visual perception is mainly influenced by color distribution and coarseness • These are complex concepts that convey multiple orthogonal aspects • Hence the necessity to consider different low-level features

  12. Attractiveness: Color Features • Brightness • Contrast (Luminance, RGB) • Colorfulness • Naturalness • Saturation (Mean, Variance): intensity of the colors; saturation is 0 for grey-scale images
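The saturation statistics above can be illustrated with a short sketch. The function name and the pixel-list input format are our own, not from the slides, and a real system would compute many more of the listed features:

```python
def saturation_stats(pixels):
    """Mean and variance of HSV saturation over a list of (R, G, B) pixels
    with channel values in [0, 255]. Illustrative sketch only."""
    sats = []
    for r, g, b in pixels:
        mx, mn = max(r, g, b), min(r, g, b)
        # Saturation is 0 for grey-scale pixels (R == G == B) and for black
        sats.append(0.0 if mx == 0 else (mx - mn) / mx)
    mean = sum(sats) / len(sats)
    var = sum((s - mean) ** 2 for s in sats) / len(sats)
    return mean, var
```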

  13. Visual Features: Coarseness • Resolution • Acutance (Sharpness) • Of critical importance for the final appearance of photos [Savakis 2000]

  14. Textual Features • We consider user-generated meta data • Correlation of topics with image appeal (ground truth: favorite assignments) • Tags seem appropriate to capture this information

  15. Attractiveness of Photos • Community-based models for classifying/ranking images according to their appeal [WWW'09] • Inputs from the Flickr photo stream: • Community feedback (photo's interestingness: #views, #comments, #favorites, ...) • Content (visual features) • Metadata (textual features, e.g. "cat, fence, house") • A generator feeds these into classification & regression models of attractiveness

  16. Classification & Regression Models

  17. Experiments

  18. 1.2) Generating Photo Maps * * Work and illustrations from David Crandall, Lars Backstrom, Dan Huttenlocher, Jon Kleinberg: Mapping the World's Photos. 18th International World Wide Web Conference (WWW 2009), Madrid, Spain

  19. Outline: Photo Maps • Use geo-location, tags, and visual features of photos to • Identify popular locations and landmarks • Find out the location of photos • Estimate representative images

  20. Spatial Clustering (map labels: eiffel, louvre, paris, tatemodern, london, trafalgarsquare) • Each data point corresponds to the (longitude, latitude) of an image • Mean shift clustering is applied to obtain a hierarchical structure • The most distinctive popular tags are used as labels (# photos with the tag in the cluster / # photos with the tag in the overall set)
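The distinctiveness score from this slide (# photos with the tag in the cluster divided by # photos with the tag overall) can be sketched as follows; the function and variable names are ours:

```python
from collections import Counter

def distinctive_tags(cluster_photos, all_photos, top_k=3):
    """Rank tags by (# photos with tag in cluster) / (# photos with tag overall).
    Each photo is represented by its list of tags. Illustrative sketch."""
    in_cluster = Counter(t for tags in cluster_photos for t in set(tags))
    overall = Counter(t for tags in all_photos for t in set(tags))
    scores = {t: in_cluster[t] / overall[t] for t in in_cluster}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The mean shift step itself is available in standard ML libraries (e.g. `MeanShift` in scikit-learn); only the labeling heuristic is shown here.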

  21. Estimating Location of Photos without tags • Train SVMs on Clusters • Positive Examples: Photos in Clusters • Negative Examples: Photos outside the Cluster • Feature Representation • Tags • Visual features (SIFT) • Best Performance for Combination of Tags and SIFT features
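The cluster-membership training setup above amounts to learning a linear classifier over binary tag (and visual-term) features. In this sketch a simple perceptron stands in for the SVMs used in the paper, and all names are our own:

```python
def tag_features(tags, vocab):
    """Binary bag-of-tags feature vector for one photo."""
    return [1.0 if t in tags else 0.0 for t in vocab]

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Train a linear separator (labels +1 = inside cluster, -1 = outside)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if (1 if score > 0 else -1) != yi:  # misclassified: update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1
```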

  22. Finding Representative Images • Construct Weighted Graph: • Weight based on visual similarity of images (using SIFT features) • Use Graph Clustering (e.g. spectral clustering) to identify tightly connected components • Choose image from this connected component
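One simple way to choose an image from the tightly connected component is to take the node with the highest total similarity to the others. This stand-in skips the spectral clustering step, and all names are ours:

```python
def representative_image(images, sim):
    """Pick the image with the largest summed visual similarity (weighted
    degree) within one component. sim maps (i, j) pairs to edge weights."""
    degree = {i: 0.0 for i in images}
    for (i, j), w in sim.items():
        degree[i] += w
        degree[j] += w
    return max(degree, key=degree.get)
```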

  23. Example 1: Europe

  24. Example 2: New York

  25. 1.3) Sentiment in Photos * * Stefan Siersdorfer, Jonathon Hare, Enrico Minack, Fan Deng: Analyzing and Predicting Sentiment of Images on the Social Web. 18th ACM Multimedia Conference (MM 2010), Florence, Italy

  26. Sentiment Analysis of Images Data: more than 500,000 Flickr Photos • Image Features: • Global Color Histogram: a color is present in the image • Local Color Histogram: a color is present at a particular location • SIFT Visual Terms: b/w patterns, rotated and scaled • Image Sentiment: • SentiWordNet provides sentiment values for terms, e.g. (pos, neg, obj) = (0.875, 0.0, 0.125) for the term "good" • used for obtaining sentiment categories → training set + ground truth for experiments

  27. Which are the most discriminative visual terms? • Use the Mutual Information measure to determine these features • Probabilities (estimated through counting in the image corpus): • P(t): probability that visual term t occurs in an image • P(c): probability that an image has sentiment category c ("pos" or "neg") • P(t,c): probability that an image is in category c and has visual term t • Intuition: "Terms that have high co-occurrence with a category are more characteristic for that category."
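With plug-in estimates obtained by counting, the mutual information between a visual term and a category can be computed as below; the counting interface is our own simplification:

```python
import math

def mutual_information(n, n_t, n_c, n_tc):
    """MI between a visual term t and a sentiment category c, estimated by
    counting: n images total, n_t containing t, n_c in category c, n_tc both."""
    mi = 0.0
    for has_t, in_c, joint in [
        (True, True, n_tc),
        (True, False, n_t - n_tc),
        (False, True, n_c - n_tc),
        (False, False, n - n_t - n_c + n_tc),
    ]:
        p_joint = joint / n
        p_t = (n_t if has_t else n - n_t) / n
        p_c = (n_c if in_c else n - n_c) / n
        if p_joint > 0:  # the 0 * log 0 terms are taken as 0
            mi += p_joint * math.log2(p_joint / (p_t * p_c))
    return mi
```

Terms whose occurrence is strongly correlated with a category get high MI; terms independent of the category get MI close to 0.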

  28. Most Discriminative Features Most discriminative visual features, extracted using the Mutual Information measure [ACM MM'10]

  29. Part 2: Videos on the Social Web * * Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Content Redundancy in YouTube and its Application to Video Tagging. ACM Transactions on Information Systems (TOIS), 2011; Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Automatic Video Tagging using Content Redundancy. 32nd ACM SIGIR Conference, Boston, USA, 2009

  30. Near-duplicate Video Content • YouTube: most important video sharing environment • [SIGCOMM'07]: 85 M videos, 65 k new videos/day, 100 M downloads per day; traffic to/from YouTube = 10% / 20% of the Web total • Redundancy: 25% of the videos are near-duplicates • Can we use redundancy to obtain richer video annotations? → Automatic tagging

  31. Automatic Tagging • What is it good for? • Additional information → better user experience • Richer feature vectors for ... • Automatic data organization (classification and clustering) • Video search • Knowledge extraction (→ creating ontologies)

  32. Overlap Graph (figure: Videos 1 to 5 linked by edges representing shared near-duplicate content)

  33. Neighbor-based Tagging (1): Idea (figure: Videos 1-3 with tags A, B, C, E, F linked to Video 4) • Video 4 contains original tags A, B; tags E, F are obtained automatically from its neighbors • Criteria for automatic tagging: • Prefer tags used by many neighbors • Prefer tags from neighbors with a strong link

  34. Neighbor-based Tagging (2): Formal The relevance of tag t for video v is a sum over all neighbors, with weights corresponding to overlap and an indicator function for tag presence: rel(t, v) = Σ_{v' ∈ N(v)} w(v, v') · 1[t ∈ tags(v')]
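The neighbor-based score (a sum over neighbors, weighted by overlap, with an indicator for tag presence) can be sketched as follows; the data-structure choices are ours:

```python
def neighbor_tag_scores(neighbor_tags, overlap):
    """rel(t, v): for each tag t of a neighbor v', add the overlap weight
    w(v, v'). neighbor_tags maps neighbor id -> set of tags; overlap maps
    neighbor id -> edge weight to the video being tagged."""
    scores = {}
    for v2, tags in neighbor_tags.items():
        for t in tags:
            scores[t] = scores.get(t, 0.0) + overlap[v2]
    return scores
```

Tags used by many neighbors accumulate weight from several edges, and tags from strongly linked neighbors receive larger increments, matching the two criteria for automatic tagging.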

  35. Neighbor-based Tagging (3) • Apply additional smoothing for redundant regions: for each overlap region, the number of neighbors with tag t is aggregated over subsets of neighbors and combined with a smoothing factor

  36. TagRank • Also takes transitive relationships into account • PageRank-like weight propagation
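A PageRank-like propagation over the overlap graph can be sketched as below; this is our simplified reading of the idea, not the exact formulation from the paper:

```python
def tag_rank(initial, edges, damping=0.85, iters=20):
    """Propagate tag relevance along overlap edges, PageRank-style.
    initial: video -> {tag: weight}; edges: video -> {neighbor: overlap}."""
    rel = {v: dict(tags) for v, tags in initial.items()}
    for _ in range(iters):
        new = {}
        for v in rel:
            # restart with the video's own tags, damped
            scores = {t: (1 - damping) * w for t, w in initial[v].items()}
            for u, w_uv in edges.get(v, {}).items():
                norm = sum(edges.get(u, {}).values()) or 1.0
                for t, r in rel.get(u, {}).items():
                    # transitive: v also picks up tags of its neighbors' neighbors
                    scores[t] = scores.get(t, 0.0) + damping * (w_uv / norm) * r
            new[v] = scores
        rel = new
    return rel
```

Unlike the one-hop neighbor scheme, repeated propagation lets a tag travel across several overlap edges, which is what "transitive relationships" refers to.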

  37. Applications of the Extended Tag Representation • Use relevancies rel(t, vi) for constructing enriched feature vectors for videos: combine original tags with new tags weighted by relevance values • Automatic annotation: use thresholding to select the most relevant tags for a given video • Manual assessment of tags shows their relevance • Data organization: • Clustering and classification experiments (ground truth: YouTube categories of videos) • Improved performance through the enriched feature representation

  38. Summary • The Social Web contains visual information (photos, videos) and meta data (tags, time stamps, social links, spatial information, ...) • A large variety of users provide explicit and implicit feedback in social web environments (ratings, views, favorite assignments, comments, content of uploaded material) • Visual information & annotations can be combined to obtain enhanced feature representations • Visual information can help to establish links between resources such as videos (application: information propagation) • Feature representations in combination with community feedback can be used for machine learning (applications: classification, mapping)

  39. References • Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Content Redundancy in YouTube and its Application to Video Tagging. ACM Transactions on Information Systems (TOIS), 2011 • Stefan Siersdorfer, Jonathon Hare, Enrico Minack, Fan Deng: Analyzing and Predicting Sentiment of Images on the Social Web. 18th ACM Multimedia Conference (MM 2010), Florence, Italy • Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Automatic Video Tagging using Content Redundancy. 32nd ACM SIGIR Conference, Boston, USA, 2009 • Stefan Siersdorfer, Jose San Pedro: Ranking and Classifying Attractiveness of Photos in Folksonomies. 18th International World Wide Web Conference (WWW 2009), Madrid, Spain • David Crandall, Lars Backstrom, Dan Huttenlocher, Jon Kleinberg: Mapping the World's Photos. 18th International World Wide Web Conference (WWW 2009), Madrid, Spain
