Detecting Cartoons: a Case Study in Automatic Video-Genre Classification

Tzvetanka Ianeva, Arjen de Vries, Hein Röhrig

Presentation Transcript


  1. Detecting Cartoons: a Case Study in Automatic Video-Genre Classification • Tzvetanka Ianeva, Arjen de Vries, Hein Röhrig

  2. Outline • Goal: remove cartoons from search results in TREC-2002 video track • Our Approach: extract Image Descriptors & SVM Machine Learning • Related work • Novel Descriptors from Granulometry • SVM Learning • Experimental Results

  3. TREC-2002 video track • TREC: workshops for large-scale evaluation of information retrieval technology • CWI participation: Probabilistic Multimedia Retrieval Model • The model does not sufficiently distinguish cartoons from photographic material

  4. Example of undesirable ‘cartoon’ results • [Figure: a query and the best matches returned]

  5. Related work • M. Roach et al. Motion based classification of cartoons (2001) • B.T. Truong et al. Automatic genre identification for content-based video categorization (2000) • J.R. Smith et al. Searching for images and videos on the world wide web • N.C. Rowe et al. Automatic caption localization for photographs on www pages • V. Athitsos et al. [ASF] Distinguishing photographs and graphics on the www

  6. Cartoons • What is a cartoon? • Cartoons do not contain any photographic material • Photos: produced by a photographic camera • Finding cartoons appears easy • Typical cues: few, simple, strong colors; patches of uniform color; strong black edges; text

  7. Quiz: Cartoon or Photo?

  8. Examples not so Typical

  9. Photos like cartoons

  10. “Cartoons” like photos

  11. Artificial photos

  12. Small cues

  13. Overlapping Frames

  14. Mixed

  15. Shadow & Sparkle

  16. Image Descriptors • Input image (240x352x3) -> vector of 148 image descriptors • greater correlation • normalized • Example: avg. sat., thresh. brightness • [Figure: two input images with descriptor vectors (0.6231, 0.9266, …) and (0.2880, 0.4125, …), components indexed 1, 2, …, 148]

  17. Overview of all our image descriptors (descriptor: dimension) • average saturation: 1 • threshold brightness: 1 • color histogram: 45 • edge-direction histogram: 40 • compression ratio: 1 • multi-scale pat. spectrum: 60 • total: 148

  18. Brightness and Saturation • HSV color model • Cartoons brighter => use % pixels with Value > 0.4 • Cartoons have strong colors => use average Saturation
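The two scalar descriptors on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes an image is given as a flat list of RGB triples with components in [0, 1], and the function name `brightness_and_saturation` is hypothetical. Only the 0.4 threshold on Value comes from the slide.

```python
import colorsys


def brightness_and_saturation(pixels, value_threshold=0.4):
    """Return (average saturation, fraction of pixels with Value > threshold).

    `pixels` is an iterable of (r, g, b) floats in [0, 1] (an assumed input
    format; the slides do not specify one).
    """
    total_saturation = 0.0
    bright_pixels = 0
    count = 0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r, g, b)  # convert to the HSV color model
        total_saturation += s
        if v > value_threshold:
            bright_pixels += 1
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return total_saturation / count, bright_pixels / count


# Toy inputs: a fully saturated bright patch and a dark gray patch.
flat_red = [(1.0, 0.0, 0.0)] * 4
dull_gray = [(0.3, 0.3, 0.3)] * 4
```

On these toy inputs the saturated red patch scores high on both descriptors and the dark gray patch scores zero on both, which is the direction of the cartoon/photo contrast the slide describes.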

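  19. Saturation in cartoon and photo images • [Figure: RGB image and its S channel (HSV) for a cartoon and a photo; average saturation values shown: 0.6231 and 0.2880]

  20. Brightness in cartoon and photo images • [Figure: RGB image and its V channel (HSV) for a cartoon and a photo; threshold brightness values shown: 0.9266 and 0.4125]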

  21. Histograms • Image $I : X \times Y \to \mathbb{R}^c$ • Filter $F : I \mapsto I'$ • Bins $B_k$ partition $\mathbb{R}^c$ • $h_k = \#\{ (x,y) : I'(x,y) \in B_k \}$ • E.g. brightness metric: $I$ grayscale, $c = 1$, $B_1 = [0, 0.4]$, $B_2 = (0.4, 1]$, return $h_2$
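The histogram definition above can be sketched for the scalar case (c = 1) as follows. The function name `histogram_descriptor`, the `filt` argument standing in for the filter F, and the choice to make every bin closed on the right are assumptions for illustration; the right-closed choice is made so that the brightness example counts exactly the pixels with value above 0.4.

```python
import bisect


def histogram_descriptor(values, edges, filt=None, normalize=True):
    """Count how many (optionally filtered) values fall into each bin.

    Bins are [edges[0], edges[1]], (edges[1], edges[2]], ... so a value that
    sits exactly on an inner edge goes to the lower bin.
    """
    counts = [0] * (len(edges) - 1)
    total = 0
    for v in values:
        if filt is not None:
            v = filt(v)  # plays the role of the filter F : I -> I'
        if not edges[0] <= v <= edges[-1]:
            raise ValueError(f"value {v} lies outside the binned range")
        k = max(bisect.bisect_left(edges, v) - 1, 0)
        counts[k] += 1
        total += 1
    if normalize and total:
        return [c / total for c in counts]
    return counts


# Brightness metric from the slide: two bins split at 0.4, return h_2.
gray_values = [0.0, 0.4, 0.5, 1.0]
h = histogram_descriptor(gray_values, [0.0, 0.4, 1.0])
threshold_brightness = h[1]
```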

  22. Color Histogram • More general than brightness & saturation • Again HSV color space • Partition HSV into 3x3x5 = 45 bins • Cartoons have fewer colors => use color histogram descriptor
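A 45-bin HSV histogram along these lines might look like the sketch below. The slide gives only the 3x3x5 partition, so which channel receives 5 bins is an assumption here (3 hue bins, 3 saturation bins, 5 value bins, all uniform), as are the function name and the pixel format (RGB triples in [0, 1]).

```python
import colorsys

# Assumed assignment of the 3x3x5 partition to the H, S and V channels.
H_BINS, S_BINS, V_BINS = 3, 3, 5


def _bin_index(x, n_bins):
    """Map x in [0, 1] to a uniform bin index; x == 1.0 goes to the last bin."""
    return min(int(x * n_bins), n_bins - 1)


def color_histogram(pixels):
    """Return a normalized 45-dimensional HSV histogram of the pixels."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r, g, b)
        index = (_bin_index(h, H_BINS) * S_BINS + _bin_index(s, S_BINS)) * V_BINS
        index += _bin_index(v, V_BINS)
        hist[index] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [c / count for c in hist]


# A patch of a single color puts all its mass into one bin.
red_patch_hist = color_histogram([(1.0, 0.0, 0.0)] * 10)
```

An image with few distinct colors concentrates its mass in few bins, which is the property the slide relies on.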

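  23. Color histogram in the 45-bin HSV partition • [Figure: example image and its histogram]

  24. Color histogram in the 45-bin HSV partition • [Figure: example image and its histogram]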

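  25. Edge detection • Cartoons have strong black edges => use edge-direction histogram • Approx. total derivative of intensity: $\nabla I(x,y) = \left( \frac{\partial I}{\partial x}(x,y), \frac{\partial I}{\partial y}(x,y) \right)$ • Approx. magnitude $|\nabla I|$ and direction $\theta$ • Histogram of $(\theta, |\nabla I|)$ • 5 intervals for $|\nabla I| \in [0, \sqrt{20}]$ • 8 intervals for $\theta \in [0, 2\pi]$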
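The 5 x 8 = 40-bin histogram over gradient magnitude and direction can be sketched as below. The derivative filter used in the talk is not preserved in the transcript, so this sketch substitutes plain central differences on a grayscale image with values in [0, 1]; with that filter the largest possible magnitude is sqrt(2) rather than the sqrt(20) quoted on the slide, which is why `max_magnitude` is a parameter. All names are hypothetical.

```python
import math


def edge_histogram(gray, mag_bins=5, angle_bins=8, max_magnitude=math.sqrt(2.0)):
    """Normalized joint histogram of gradient magnitude and direction.

    `gray` is a list of rows of floats in [0, 1]. Only interior pixels are
    used, so the image must be at least 3x3.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    hist = [0.0] * (mag_bins * angle_bins)
    count = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # central difference in x
            gy = gray[y + 1][x] - gray[y - 1][x]  # central difference in y
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2.0 * math.pi)  # direction in [0, 2*pi)
            m = min(int(magnitude / max_magnitude * mag_bins), mag_bins - 1)
            a = min(int(angle / (2.0 * math.pi) * angle_bins), angle_bins - 1)
            hist[m * angle_bins + a] += 1.0
            count += 1
    return [c / count for c in hist]


# A vertical step edge: every interior gradient points in the +x direction.
step_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
step_hist = edge_histogram(step_edge)
```

For the step edge all interior pixels share one magnitude and one direction, so the whole histogram mass lands in a single bin.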

  26. Edge angles & edge magnitudes

  27. Edge histogram

  28. Compressibility • Cartoons: simpler composition • Detect complexity by measuring compression ratio • Theory: “Kolmogorov complexity” • Our application: use lossless PNG compression • Lossy JPEG not useful • [Figure: two example images with compression ratios 0.23365 and 0.13548]
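The idea of using a lossless compressor as a practical stand-in for complexity can be illustrated with the standard library. This sketch does not produce PNG files: it applies `zlib` (DEFLATE, the compression method PNG builds on) directly to raw bytes, and it assumes the ratio is defined as compressed size divided by raw size. The slides do not state the exact definition.

```python
import random
import zlib


def compression_ratio(raw_bytes, level=9):
    """Compressed size divided by raw size (assumed definition of the ratio)."""
    if not raw_bytes:
        raise ValueError("no data to compress")
    return len(zlib.compress(raw_bytes, level)) / len(raw_bytes)


# A uniform patch versus noise of the same size (both synthetic).
flat_patch = bytes([200]) * 10_000
rng = random.Random(0)
noisy_patch = bytes(rng.randrange(256) for _ in range(10_000))

flat_ratio = compression_ratio(flat_patch)
noisy_ratio = compression_ratio(noisy_patch)
```

Uniform regions compress to a small fraction of their size while noise barely compresses at all, which is why large uniform color patches should push this descriptor down.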

  29. Granulometries • Idea: measure size distribution of objects • How? openings by structuring element of growing scale • Normalized size distribution • Derivative = pattern spectrum

  30. Openings Opening = erosion then dilation with same SE

  31. Structuring Elements • Non-flat parabola better(?) than flat disk • Parabola: efficient computation, symmetry

  32. Small-scale pattern spectrum descriptors • SE: disk of radius r_i = i, i = 1, …, 20
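The granulometry pipeline from the preceding slides (openings at growing scale, normalized size distribution, derivative as pattern spectrum) can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses a flat square structuring element of half-width r instead of the disk or parabola mentioned in the talk, clips windows at the image border, and measures the size distribution as the fraction of total gray-level mass removed by each opening. All function names are hypothetical.

```python
def _window_op(img, r, op):
    """Apply min (erosion) or max (dilation) over a (2r+1)x(2r+1) window."""
    height, width = len(img), len(img[0])
    return [
        [
            op(
                img[j][i]
                for j in range(max(0, y - r), min(height, y + r + 1))
                for i in range(max(0, x - r), min(width, x + r + 1))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


def opening(img, r):
    """Opening = erosion followed by dilation with the same structuring element."""
    return _window_op(_window_op(img, r, min), r, max)


def pattern_spectrum(img, max_r):
    """Differences of the normalized size distribution for r = 1..max_r."""
    total = sum(map(sum, img))
    if total == 0:
        raise ValueError("image has no mass")
    removed = [0.0]  # at scale 0 nothing has been removed
    for r in range(1, max_r + 1):
        removed.append(1.0 - sum(map(sum, opening(img, r))) / total)
    return [removed[i] - removed[i - 1] for i in range(1, len(removed))]


# A single 3x3 bright square on a dark 9x9 background.
square = [[1 if 3 <= y <= 5 and 3 <= x <= 5 else 0 for x in range(9)] for y in range(9)]
spectrum = pattern_spectrum(square, 3)
```

The 3x3 square survives the opening at r = 1 and disappears at r = 2, so the spectrum has a single peak at the scale where the object no longer fits the structuring element.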

  33. SVM Learning • Simplest case:  linear separator • SVM finds hyperplane with largest margin • Closest points = Support Vectors

  34. SVM Learning: nonseparable • Noisy data: no separating hyperplane at all! • Solution: penalty C for points inside the margin • This gives the C-SVM

  35. SVM = quadratic programming • SVM task: [equation not preserved] • Equivalent dual problem: [equation not preserved]
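The equations on this slide did not survive in the transcript. For orientation, the standard textbook form of the soft-margin (C-SVM) problem and its dual is given below; the notation (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, slack variables $\xi_i$, multipliers $\alpha_i$) is an assumption and may differ from what the slide showed.

```latex
% Primal: maximize the margin while penalizing margin violations with weight C
\min_{w,\, b,\, \xi} \quad \frac{1}{2}\,\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\qquad \text{s.t.} \quad y_i \,(w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Dual: a quadratic program in the multipliers alpha
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j)
\qquad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0
```

The training points with $\alpha_i > 0$ are the support vectors, and the data enter the dual only through inner products $x_i \cdot x_j$.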

  36. SVM with kernels • SVM task: [equation not preserved] • Equivalent dual problem: [equation not preserved]

  37. SVM kernels • RBF kernels • Polynomial kernels
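The two kernel families named on this slide can be written down directly. The formulas on the slide are not preserved, so the parametrizations below are common textbook choices and should be read as assumptions: the RBF kernel is taken as exp(-||x - y||^2 / (2 sigma^2)), and the polynomial kernel as (x . y + coef0)^degree.

```python
import math


def rbf_kernel(x, y, sigma2):
    """RBF kernel with width sigma^2 (the 2*sigma^2 scaling is an assumption)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma2))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


# Identical points have RBF similarity 1; similarity decays with distance.
same = rbf_kernel([0.2, 0.7], [0.2, 0.7], sigma2=0.1)
far = rbf_kernel([0.0], [1.0], sigma2=0.5)
poly = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2, coef0=1.0)
```

A smaller sigma^2 makes the RBF kernel more local, which is why the width is the parameter tuned in the experiments.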

  38. SVM with kernels: decision function • SVM task: [equation not preserved] • Equivalent dual problem: [equation not preserved] • Decision function: [equation not preserved]
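The formulas for this slide are also missing from the transcript. In the standard formulation, the kernelized dual replaces each inner product with a kernel evaluation $K(x_i, x_j)$, and a new point is classified from the support vectors alone. The notation below ($\alpha_i$, $b$, $K$) is assumed rather than taken from the slide.

```latex
% Kernelized dual problem
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\qquad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0

% Decision function for a new point x
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{n} \alpha_i \, y_i \, K(x_i, x) + b \right)
```

Only terms with $\alpha_i > 0$ contribute to the sum, so evaluation cost grows with the number of support vectors, not with the size of the training set.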

  39. Experimental Data • Key frames from TREC 2002 Video Track • 13,026 photographic images • 1,620 cartoons • Manually classified • Experiments 1-3: train on (random) 3,908 photos and 486 cartoons (about 30% of each class)

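  40. Experiment 1: individual performance • Total error: $E_t = \frac{|p|}{|p|+|c|} E_p + \frac{|c|}{|p|+|c|} E_c$ • σ² settings listed on the slide: σ² = 0.1; 0.05 < σ² < 0.5; σ² = 0.07; 0.05 < σ² < 0.5; 0.05 < σ² < 0.5; σ² = 0.07 • [Results table not preserved]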
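The total error on this slide appears to be the per-class error rates weighted by class size. A small sketch under that reading follows, taking E_p and E_c as the error rates on photos and cartoons and |p|, |c| as the number of test images in each class. The numbers in the example are made up for illustration and are not results from the talk.

```python
def total_error(err_photo, err_cartoon, n_photo, n_cartoon):
    """Class-size-weighted average of the per-class error rates."""
    n = n_photo + n_cartoon
    if n == 0:
        raise ValueError("empty test set")
    return (n_photo * err_photo + n_cartoon * err_cartoon) / n


# Hypothetical numbers: a dominant class with low error hides a weak minority class.
example = total_error(err_photo=0.1, err_cartoon=0.5, n_photo=900, n_cartoon=100)
```

With photos outnumbering cartoons, the total error stays close to the photo error even when half the cartoons are misclassified, so the per-class errors are worth reporting alongside it.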

  41. Experiment 2: “convergence” of SVM learning (Pattern spectrum)

  42. Experiment 3: combined performance • σ² = 0.06

  43. Experiment 4: web-image classifier on our data Test set: random 1,000 photos and 1,000 cartoons

  44. Experiment 5: Performance on web images • Comparison with 14,039 photographic and 9,512 graphical images harvested from the WWW • Train on (random) 4,239 photographic and 2,826 graphical images • Additional features: image dimensions and file type

  45. Conclusions • Hard task: good classifier • Use dynamics/spatio-temporal relations ? • Semantic Gap? • Combine classifiers? • Granulometry not enough
