Classification of POJ Problems
Mass Data Processing/Cloud Computing, Summer 2008
Motivation
• Mass Problems
• User Need
Solution
• Users’ Codes
Steps
• Grep function names
• Establish inverted index
• Group synonyms
• Discard the meaningless
• Classification or Clustering
Grep function names
Input:
  #include <stdio.h>
  .....
  void dijkstra(){.....}
  int main(){...}
Output:
  1000@user1 dijkstra^main
  2000@user2 dijkstra^main^init^qsort
  .....
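
The slides do not say how the function names are extracted. A minimal sketch, assuming a regular-expression heuristic over the C source is good enough: the pattern `FUNC_DEF` and the helper names `grep_function_names` and `format_record` are hypothetical, and the record layout `problem@user name^name` is copied from the sample output above.

```python
import re

# Heuristic (an assumption, not the authors' method): a type token, then an
# identifier, a parameter list, and an opening brace. It will miss K&R-style
# definitions, macros, and function pointers.
FUNC_DEF = re.compile(r"\b[A-Za-z_]\w*[\s\*]+([A-Za-z_]\w*)\s*\([^;{}()]*\)\s*\{")

# Control-flow keywords that can look like a definition, e.g. "else if (x) {".
C_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof", "else", "do"}


def grep_function_names(source: str) -> list[str]:
    """Return the defined function names in order of first appearance."""
    names = []
    for match in FUNC_DEF.finditer(source):
        name = match.group(1)
        if name not in C_KEYWORDS and name not in names:
            names.append(name)
    return names


def format_record(problem_id: str, user: str, source: str) -> str:
    """Build one output line in the 'problem@user name^name' layout."""
    return f"{problem_id}@{user} " + "^".join(grep_function_names(source))


sample = """#include <stdio.h>
void dijkstra(){ }
int main(){ dijkstra(); return 0; }
"""
print(format_record("1000", "user1", sample))  # 1000@user1 dijkstra^main
```

A real C parser would be more robust, but a regex keeps each submission independent, which suits a map-style job over many files.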
Establish inverted index
Input:
  1000@user1 dijkstra^main
  2000@user2 dijkstra^main^init^qsort
  .....
Output:
  dijkstra 1000@user1^2000@user2...
  main 1000@user1^2000@user2...
  init 2000@user2...
  qsort 2000@user2...
  ....
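
The inversion itself is mechanical: each `problem@user` record is split into its function names, and every name collects the submissions it appears in. A small in-memory sketch follows; the function name `build_inverted_index` is hypothetical, and a real run over mass data would presumably distribute this work rather than hold everything in one dictionary.

```python
from collections import defaultdict


def build_inverted_index(records):
    """Map each function name to the list of submissions that define it."""
    index = defaultdict(list)
    for record in records:
        # A record looks like "1000@user1 dijkstra^main".
        submission, _, names = record.partition(" ")
        for name in names.split("^"):
            if name and submission not in index[name]:
                index[name].append(submission)
    return dict(index)


records = [
    "1000@user1 dijkstra^main",
    "2000@user2 dijkstra^main^init^qsort",
]
for name, submissions in build_inverted_index(records).items():
    print(name, "^".join(submissions))
```

On the two sample records this prints one line per function name, with `dijkstra` and `main` each listing both submissions and `init` and `qsort` listing only `2000@user2`.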
Group synonyms
Input:
  dijkstra 1000@user1^2000@user1^3000@user1
  dijk 1000@user2^2000@user2^3000@user2
  ......
Output:
  synonym1 dijkstra^dijk
  ........
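
The slides give no rule for deciding that two names are synonyms. One plausible heuristic, stated here purely as an assumption, is to treat two names as synonyms when one is a prefix of the other and they occur on nearly the same set of problems (Jaccard similarity at or above a threshold). The names `group_synonyms`, `jaccard`, `problem_ids`, and the `0.8` threshold are all hypothetical.

```python
def problem_ids(postings: str) -> set[str]:
    """Extract the problem IDs from a postings string like '1000@user1^2000@user1'."""
    return {entry.split("@")[0] for entry in postings.split("^") if entry}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_synonyms(index: dict, threshold: float = 0.8) -> list:
    """Greedily group names that are prefix-related and share most problems."""
    groups = []
    for name, postings in index.items():
        ids = problem_ids(postings)
        for group in groups:
            rep = group[0]  # compare against the group's first member only
            related = name.startswith(rep) or rep.startswith(name)
            if related and jaccard(ids, problem_ids(index[rep])) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    # Only groups with more than one member are synonym sets.
    return [group for group in groups if len(group) > 1]


index = {
    "dijkstra": "1000@user1^2000@user1^3000@user1",
    "dijk": "1000@user2^2000@user2^3000@user2",
    "qsort": "2000@user2",
}
for i, group in enumerate(group_synonyms(index), start=1):
    print(f"synonym{i} " + "^".join(group))  # synonym1 dijkstra^dijk
```

The prefix test keeps unrelated names that happen to occur on the same problems from being merged. It would miss synonyms such as `shortest_path`, so an edit-distance or dictionary-based rule may be needed in practice.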
Discard the meaningless
Input:
  dijkstra 1000@user1^2000@user1^3000@user1
  dijk 1000@user2^2000@user2^3000@user2
  main 1000@user1^2000@user1^3000@user1^1000@user2^2000@user2^3000@user2
  ......
Output:
  synonym1 dijkstra^dijk
  ......
  meaningless1 main
  .....
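
How a name is judged meaningless is not stated on the slide. A simple sketch, assuming a stop-word style rule: a name that appears in almost every submission (such as `main`) carries no information about the problem. The function name `find_meaningless` and the `0.9` cutoff are hypothetical.

```python
def find_meaningless(index: dict, max_fraction: float = 0.9) -> list:
    """Return names that occur in at least max_fraction of all submissions."""
    all_submissions = set()
    for submissions in index.values():
        all_submissions.update(submissions)
    total = len(all_submissions)
    return [
        name
        for name, submissions in index.items()
        if total and len(set(submissions)) / total >= max_fraction
    ]


index = {
    "dijkstra": ["1000@user1", "2000@user1", "3000@user1"],
    "dijk": ["1000@user2", "2000@user2", "3000@user2"],
    "main": [
        "1000@user1", "2000@user1", "3000@user1",
        "1000@user2", "2000@user2", "3000@user2",
    ],
}
print(find_meaningless(index))  # ['main']
```

Generic helper names such as `init` or `solve` may fall below any frequency cutoff while still being uninformative, so a hand-maintained list could complement this rule.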
Classification or Clustering
Input:
  dijkstra 1000@user1^2000@user2...
  qsort 2000@user2...
  ....
Output:
  group1 1000^2000^3000...
  group2 1001^2002^3003...
  ........
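
The slide leaves the clustering method open. As one illustrative possibility only, problems can be grouped by connected components: two problems land in the same group if some remaining function name occurs in submissions to both. The sketch below uses a small union-find; the function name `cluster_problems` and the sample index are hypothetical, chosen so the output has the same shape as the slide's.

```python
from collections import defaultdict


def cluster_problems(index: dict) -> list:
    """Group problem IDs that are linked by at least one shared function name."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for submissions in index.values():
        ids = [s.split("@")[0] for s in submissions]
        for pid in ids:
            find(pid)  # register every problem, even singletons
        for other in ids[1:]:
            union(ids[0], other)

    clusters = defaultdict(list)
    for pid in sorted(parent):
        clusters[find(pid)].append(pid)
    return sorted(clusters.values())


# Hypothetical index, after synonyms are merged and meaningless names dropped.
index = {
    "dijkstra": ["1000@user1", "2000@user2", "3000@user3"],
    "qsort": ["1001@user1", "2002@user2"],
    "kmp": ["3003@user4", "2002@user1"],
}
for i, group in enumerate(cluster_problems(index), start=1):
    print(f"group{i} " + "^".join(group))
```

Single-link grouping like this is sensitive to noise: one widely used name would merge every problem into a single group, which is why discarding meaningless names has to happen first. A similarity-based method such as k-means over name vectors would be a natural alternative.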