This study investigates the probabilistic behavior of the distribution of relevant images among the results returned by image search engines, using a two-order Markov chain. The experimental results demonstrate the effectiveness of this approach for predicting the number of relevant images.
Estimate the Number of Relevant Images Using Two-Order Markov Chain Presented by: WANG Xiaoling Supervisor: Clement LEUNG
Outline • Introduction • Objective • Methodology • Experiment Results • Conclusion and Future Work
Introduction • Large collections of images have been made available on the web. • Retrieval effectiveness has become one of the most important measures of the performance of image retrieval systems.
Measures: • Precision • Recall • Significant Challenge: the total number of relevant images is not directly observable, so recall cannot be computed directly
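As a reminder of why the unobservable total matters, here is a minimal Python sketch of the two measures. The function names and the numbers in the usage note are illustrative assumptions, not values from the study; `total_relevant` is exactly the quantity that has to be estimated.

```python
def precision(relevant_retrieved: int, retrieved: int) -> float:
    """Fraction of the retrieved images that are relevant."""
    if retrieved == 0:
        return 0.0
    return relevant_retrieved / retrieved


def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Fraction of all relevant images that were retrieved.

    total_relevant is not observable for a web search engine,
    so in practice an estimate has to be supplied here.
    """
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return relevant_retrieved / total_relevant
```

For example, with 12 relevant images among 20 retrieved and a hypothetical estimate of 48 relevant images overall, `precision(12, 20)` and `recall(12, 48)` give the two measures; only the second depends on the estimate.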
Basic Models • Regression Model • Markov Chain • Two-Order Markov Chain
Objective • To investigate the probabilistic behavior of the distribution of relevant images among the results returned by image search engines, using a two-order Markov chain
Methodology • Test Image Search Engine: • Query Design • 70% of queries provided by the authors: one-word, two-word, and three-word queries • 30% suggested terms: the suggested term with the most returned results and the suggested term with the fewest returned results
Methodology • Database Setup: • Stochastic process $\{X_1, X_2, \ldots, X_J\}$, where $X_J$ denotes the aggregate relevance of all the images on page $J$ • Equation: $X_J = \sum_i Y_{Ji}$, summed over the images on page $J$, where $Y_{Ji} = 1$ if the $i$th image on page $J$ is relevant and $Y_{Ji} = 0$ if it is not.
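The database setup above amounts to summing 0/1 relevance judgments per page. A minimal Python sketch, assuming each page is stored as a list of 0/1 labels (the function names and this storage layout are assumptions made for illustration):

```python
from typing import List


def aggregate_relevance(page_labels: List[int]) -> int:
    """Return X_J: the number of relevant images on one results page."""
    if any(y not in (0, 1) for y in page_labels):
        raise ValueError("each label Y_Ji must be 0 or 1")
    return sum(page_labels)


def relevance_sequence(pages: List[List[int]]) -> List[int]:
    """Return the observed sequence X_1, X_2, ..., one value per page."""
    return [aggregate_relevance(page) for page in pages]
```

Calling `relevance_sequence([[1, 1, 0], [0, 1, 0]])` on two toy three-image pages returns `[2, 1]`.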
Forecast Using Two-Order Markov Chain • Markov chain: stochastic process $\{X_J, J \ge 1\}$ with state space $S = \{0, 1, 2, \ldots, 20\}$ • Two-order Markov chain: the state space changes to $S^2$ • Forecast the state probability distribution of the next page, $\pi(J)$, from the initial state probability distribution $\pi(1)$ and the transition probability matrix $P$ • An Example • Model Test • Mean Absolute Error
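One way to obtain the transition probabilities is to count, over the observed page sequences, how often each pair of consecutive page values is followed by each next value. The slides do not say how $P$ was estimated, so the counting (maximum-likelihood) estimate below is an assumption, as are the function names.

```python
from collections import Counter, defaultdict


def fit_two_order(sequences):
    """Estimate P(next | two previous pages) by normalised counts."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Slide a window of three consecutive pages over each query's sequence.
        for prev2, prev1, nxt in zip(seq, seq[1:], seq[2:]):
            counts[(prev2, prev1)][nxt] += 1
    model = {}
    for pair, row in counts.items():
        total = sum(row.values())
        model[pair] = {nxt: n / total for nxt, n in row.items()}
    return model


def expected_next(model, prev2, prev1):
    """Expected relevant count on the next page, or None for an unseen pair."""
    row = model.get((prev2, prev1))
    if row is None:
        return None
    return sum(state * p for state, p in row.items())
```

With real data, pairs that never occur in the training sequences have no estimated row; the sketch returns `None` for them rather than guessing a fallback.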
Experiment Results • Forecast Results Using Two-Order Markov Chain
Measure for Forecast Accuracy • Mean Absolute Deviation (MAD):
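Assuming the usual forecasting definition, $\mathrm{MAD} = \frac{1}{n}\sum_{t=1}^{n} |A_t - F_t|$ with $A_t$ the actual and $F_t$ the forecast value, the measure can be sketched in Python as follows. The symbols $A_t$, $F_t$ and the function name are assumptions for illustration.

```python
def mean_absolute_deviation(actual, forecast):
    """Average absolute gap between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

A smaller value means the forecast number of relevant images per page is closer to the observed number.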
Comparative Results • Best Model: Two-Order Markov Chain • Worst Model: Regression Model
Conclusion • The two-order Markov chain represents the distribution of relevant images among the results pages of major web image search engines well. • The two-order Markov chain is the best of the three models we tested.
Future Work • We plan to apply a Hidden Markov Chain to this problem
Thank you! Q & A
Two-Order Markov Chain • An example (cont'd) • Suppose the stochastic process $\{X_t, t \ge 0\}$ has state space $S = \{A, B, C\}$ • For the two-order Markov chain, the state space becomes $S^2 = \{AA, AB, AC, BA, BB, BC, CA, CB, CC\}$ • The state probability distribution at period zero: $\pi(0) = (\pi_{AA}, \pi_{AB}, \pi_{AC}, \pi_{BA}, \pi_{BB}, \pi_{BC}, \pi_{CA}, \pi_{CB}, \pi_{CC})$
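An example (cont'd) • The transition probability matrix: for instance, $P_{AA,BA} = 0$, because a composite state records the previous and current values, so the state that follows $AA$ must begin with $A$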
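The structure of this example can be checked with a short Python sketch. It assumes a composite state is written as (previous value, current value), so a transition is possible only when the target state starts with the last symbol of the source state; the function names are hypothetical.

```python
from itertools import product


def composite_states(states):
    """All ordered pairs of base states, e.g. 'AA', 'AB', ..., 'CC'."""
    return ["".join(pair) for pair in product(states, repeat=2)]


def is_feasible(src, dst):
    """True if dst can follow src, i.e. dst begins with src's last symbol."""
    return src[1] == dst[0]


pairs = composite_states("ABC")
# Collect every transition that must have probability zero in P.
structural_zeros = [(s, d) for s in pairs for d in pairs if not is_feasible(s, d)]
```

Here `pairs` lists the nine composite states, `("AA", "BA")` appears in `structural_zeros`, and each row of the 9 x 9 matrix has only three entries that can be non-zero.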
An example • Therefore, the probability distribution of states for page $J$ is computed as $\pi(J) = \pi(J-1)\,P$
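The recursion is a row vector multiplied by the transition matrix, applied once per page. A minimal pure-Python sketch follows; the function names are assumptions, and the two-state matrix in the usage note is a toy chosen only to show the arithmetic, not a matrix from the study.

```python
def step(pi, P):
    """One forecast step: new_pi[j] = sum over i of pi[i] * P[i][j]."""
    n_cols = len(P[0])
    return [sum(pi[i] * P[i][j] for i in range(len(P))) for j in range(n_cols)]


def forecast(pi_start, P, steps):
    """Apply the one-step recursion `steps` times to the starting distribution."""
    pi = list(pi_start)
    for _ in range(steps):
        pi = step(pi, P)
    return pi
```

For the toy matrix `P = [[0.5, 0.5], [0.0, 1.0]]` and starting distribution `[1.0, 0.0]`, one step gives `[0.5, 0.5]` and two steps give `[0.25, 0.75]`. For the two-order chain, `pi_start` and `P` would be indexed by the composite states instead.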