Optimization of Search Results Raghuveer Nanduri ESRI Summer Intern 2014
Goal
- To optimize the relevancy of search results
- To let users visualize the data in the form of clusters

Idea:
- TF-IDF: a frequency-based numerical statistic used to determine the importance of a word to a document in a collection.
- Combining TF-IDF with clustering: only the important documents corresponding to the query are considered, and they are grouped so that
  - similarity of documents within the same cluster is maximized
  - similarity of documents across clusters is minimized
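The TF-IDF step can be illustrated with a minimal sketch. The slides do not give the exact weighting or ranking used, so the formula here (term frequency normalized by document length, times log(N / document frequency)), the naive whitespace tokenizer, and the function names `tf_idf_vectors` and `top_documents` are all assumptions made for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors

def top_documents(query, docs, k=2):
    """Rank documents by the summed TF-IDF weight of the query terms.

    Returns up to k (index, score) pairs, dropping documents that score zero.
    """
    vectors = tf_idf_vectors(docs)
    terms = tokenize(query)
    scores = [sum(vec.get(t, 0.0) for t in terms) for vec in vectors]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k] if scores[i] > 0]

# Hypothetical metadata snippets, not taken from the original data sets.
docs = [
    "river water depth survey",
    "air quality report",
    "river water quality",
    "land use map",
]
result = top_documents("river water", docs)
print(result)
```

Only the documents returned by `top_documents` would then be passed on to the clustering step, which keeps the clusters focused on material relevant to the query.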
Clustering
• “The process of grouping homogeneous objects.”
• Why do objects appear in the same cluster?
• Spatial or temporal correlation between objects leads to the formation of clusters
• K-means clustering
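As a rough illustration of K-means, the sketch below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. It is a plain-Python version of the standard (Lloyd's) algorithm under assumed settings: squared Euclidean distance, random initial centroids drawn from the data, and 2-D sample points that stand in for document vectors. None of these details come from the slides.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def k_means(points, k, iterations=100, seed=0):
    """Return (labels, centroids) after running Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids picked from the data
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment, centroids

# Hypothetical 2-D points forming two well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = k_means(points, k=2)
print(labels)
```

In a search setting the points would be TF-IDF vectors rather than 2-D coordinates, and cosine similarity is a common alternative to Euclidean distance.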
[Diagram: Data Repository (metadata records) → TF-IDF based top documents → Clustering]
[Figure panels: Modified (gptogc) | Existing (gptogc)]
[Figure: Clustering, showing Cluster 1 to Cluster 5. Labels in the figure: Rivers, water; Land; Air quality; Biological Assessments; Water Depths]
[Figure panels: Modified (Alberta data) | Existing (Alberta data)]
[Figure: Clustering, showing Cluster 1 to Cluster 3. Labels in the figure: Regional advisory council recommendation; Information articles about the parks; Different parks present in the area of Alberta]
Further Improvements
• Association and scalar clustering to perform query elaboration.
• Improving relevancy through user feedback.
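The slides do not say how user feedback would be incorporated. One classical option is Rocchio relevance feedback, sketched below purely as an illustration and not as the author's planned method. The query vector is moved toward the documents a user marked relevant and away from those marked non-relevant; the function name `rocchio`, the weights `alpha`, `beta`, `gamma`, and the sample vectors are assumptions.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated {term: weight} query vector.

    new_q = alpha * q + beta * mean(relevant) - gamma * mean(non_relevant)
    Terms whose weight ends up non-positive are dropped.
    """
    terms = set(query)
    for vec in relevant + non_relevant:
        terms.update(vec)

    def centroid_weight(vectors, term):
        # Mean weight of the term across the given document vectors.
        if not vectors:
            return 0.0
        return sum(v.get(term, 0.0) for v in vectors) / len(vectors)

    updated = {}
    for t in terms:
        weight = (
            alpha * query.get(t, 0.0)
            + beta * centroid_weight(relevant, t)
            - gamma * centroid_weight(non_relevant, t)
        )
        if weight > 0:
            updated[t] = weight
    return updated

# Hypothetical term-weight vectors for a query and two judged documents.
query = {"park": 1.0}
relevant = [{"park": 0.5, "alberta": 0.5}]
non_relevant = [{"council": 1.0}]
new_query = rocchio(query, relevant, non_relevant)
print(sorted(new_query.items()))
```

The updated query gains the term "alberta" from the relevant document, so a second retrieval pass would favor documents similar to what the user approved.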