110 likes | 204 Views
CoMMA Update. By Juveria & Muhammad. Past Work-1. Started with the basic concept of mining images and text and combining them together to get naïve rules. Dataset was of 780 images all from different domains (bad for a start). Features Used: R,G,B,Y,Edge Orientations,Intensities
E N D
CoMMA Update By Juveria & Muhammad
Past Work-1 • Started with the basic concept of mining images and text and combining them together to get naïve rules. • Dataset was of 780 images all from different domains (bad for a start). • Features Used: R,G,B,Y,Edge Orientations,Intensities • Basically used a C++ implementation of Apriori Association Rule Mining to get rules from text and from images and then combine them simply by matching image id, features and annotations. • Result: Not that good. Got lots of keywords for each image and not refined at all!
Past Work-2 • Used Decentralised Apriori • Dataset: 680 images only belonging to flowers, landscapes and nature domains (Better to restrict) • Features Used: R,G,B,Y,Edge Orientations,Intensities • Get rules from the two domains separately and then use these to get combined rules by Using Decentralised Apriori made in C# (Is a fun language with lambdas). • Results: Definitely better than the previous but some noise here n there and efficiency issues.
Present Work • The Big Picture: Application • Dataset • Features • Programming • Algorithm
The Big Picture : CoMMA Allow a user to add images with/without annotations (initially with) Hard Disk Auto Annotation System Images Images FRONT END Annotations generated Feature Extractor Image Path, Annotations, Features, Rules Images Database Image retrieval using a query image Image Mining These images annotated & Similar Annotated Images Image retrieval using keyword search Java Keywords Text Mining Oracle 9i Annotated Images with related keywords Matlab PHP & HTML
Dataset • Presently we have 670 images. Target is of 3000. Images are available on the web. With your help in annotating these images, we should be ready with a bolder dataset. • http://reddwarf.cs.rit.edu/~dmrg/CoMMA/annotate.php • For the multirelational aspect we’re trying to get more than one annotation for each image. This will give us a 1:n multirelational data.
Image Features- Common Approaches in CBIR • Subsections of images Each Image is divided into 5 sections left-upper corner, right-upper corner, left-down corner, right-down corner • HSV color space: Hue Saturation Value color space excludes errors of the form Y=0 means R=1,B=1,G=1. We are taking 10 bin histograms for H,S,V. (on exploring matlab for such hist reveals that one gets 3X10X100 blocks of arrays..there has to be a sensible way of storing on the relevant portions- have to find out about that)
Multi-relational Association Rule Mining • Modifying existing Association Rule Mining Algorithms for multi relations --Partial proving already done. More thinking needed as to how to use the present data structures for the purpose. • Designing an algorithm starting from scratch– Lot of thinking to come up with something totally new: not a modified version on existing algorithm. A Formal proof will be needed to show that such an algorithm works.
ARMS • Apriori: We’ve already looked at decentralized Apriori and WARMR. Basic drawback of Apriori is that it uses a horizontal partition method, which is inefficient as compared to vertical method • FP Tree: We have to explore this algorithm. • Eclat: Uses horizontal partition method. Not good either. • COFI: We have to explore this
Conclusion • That’s all for now. • We’re exploring further and posting our discoveries on our webpage • Thank you!
References • Dristributed Multimedia Databases: Techniques & Applications by Timothy K. Shih. • Mutimedia Databases and Mining by B. Thirusingham • Multi-relations Data Mining