Popular Ranking Algorithms
Prepared by Ranjan Dash
Contents
• Efficient ways of Ranking
• Algorithms for ranking
    • Sort Algorithm
    • Scan Algorithm
    • FA Algorithm
    • TA Algorithm
Efficient ways of Ranking
• Besides the choice of a proper ranking function, how efficiently that function is executed also determines performance.
• So, given a ranking function, the ranking algorithm used to evaluate it plays a key role in efficiency.
Algorithms for ranking
• Prominent algorithms for retrieving the top K results are:
    • Sort Algorithm
    • Scan Algorithm
    • FA Algorithm
    • TA Algorithm
Sort Algorithm
• The simplest way to find the top K results of a ranking function such as Score(ObjectId) = linear combination of attributes is to score every tuple, sort the results, and take the first K.
• This takes O(n log n) time.
• Very slow for large relations, where n is large.
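The sort approach can be sketched as follows. This is a minimal illustration, not part of the original slides: the relation is a hypothetical in-memory dictionary with invented values, and the weights (2 and 1) are borrowed from the restaurant example that appears later in the deck.

```python
# Hypothetical relation: RestId -> (Cuisine, Location). Values are invented.
RELATION = {1: (9, 2), 2: (3, 8), 3: (7, 7), 4: (8, 9), 5: (4, 5)}


def score(attrs):
    """Linear combination of attributes: 2 * Cuisine + Location."""
    cuisine, location = attrs
    return 2 * cuisine + location


def top_k_by_sort(relation, k):
    """Score all n tuples, sort them (O(n log n)), keep the first k."""
    ranked = sorted(
        relation,
        key=lambda obj_id: score(relation[obj_id]),
        reverse=True,
    )
    return ranked[:k]


print(top_k_by_sort(RELATION, 2))  # [4, 3]
```

Every tuple is scored and sorted even though only K of them are returned, which is where the cost for large n comes from.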
Scan Algorithm
• Keep K tuples in a buffer.
• For every tuple in the relation, scan this buffer.
• Replace the lowest-scoring tuple in the buffer if the input tuple scores higher.
• Takes O(n·K) time.
• Still slow for large n.
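A minimal sketch of the scan approach, under the same assumptions as any toy example: the relation, its values, and the function names are hypothetical, and the buffer is a plain Python list scanned linearly so that the O(n·K) behaviour is visible.

```python
# Hypothetical relation: RestId -> (Cuisine, Location). Values are invented.
RELATION = {1: (9, 2), 2: (3, 8), 3: (7, 7), 4: (8, 9), 5: (4, 5)}


def score(attrs):
    """Assumed ranking function: 2 * Cuisine + Location."""
    cuisine, location = attrs
    return 2 * cuisine + location


def top_k_by_scan(relation, k, score):
    """One pass over the relation with a buffer of at most k tuples."""
    if k <= 0:
        return []
    buffer = []  # (score, obj_id) pairs, never more than k of them
    for obj_id, attrs in relation.items():
        s = score(attrs)
        if len(buffer) < k:
            buffer.append((s, obj_id))
            continue
        # Linear scan of the buffer for its lowest-scoring entry: O(k).
        lowest = min(range(k), key=lambda i: buffer[i][0])
        if s > buffer[lowest][0]:
            buffer[lowest] = (s, obj_id)
    # Order the survivors best-first for presentation.
    return [obj_id for _, obj_id in sorted(buffer, reverse=True)]


print(top_k_by_scan(RELATION, 2, score))  # [4, 3]
```

Replacing the linear buffer scan with a min-heap (for instance Python's `heapq` module) would bring the cost down to O(n log K); that is a variation, not something the slides describe.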
FA Algorithm
• Fagin’s Algorithm, known as the FA Algorithm. Developed by Ron Fagin.
• Relies on data structures prepared offline.
• Although building these data structures has a cost, the amortized cost is very low because they are reused across queries.
• Sorted access to the attributes: supports the GetNext() operation and is sequential. One sorted table per attribute.
• Random access through the ObjectId: supports the Get(ObjId) operation.
• Preprocessing consists of preparing these two kinds of data structures, which are then used again and again during query processing.
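The two access structures might look like the sketch below. The class names, method names (`get_next`, `get`, mirroring GetNext() and Get(ObjId)), and the restaurant data are assumptions made for illustration; the slides do not specify an implementation.

```python
# Hypothetical relation: RestId -> attribute values. Values are invented.
RELATION = {
    1: {"Cuisine": 9, "Location": 2},
    2: {"Cuisine": 3, "Location": 8},
    3: {"Cuisine": 7, "Location": 7},
    4: {"Cuisine": 8, "Location": 9},
    5: {"Cuisine": 4, "Location": 5},
}


class SortedList:
    """Sorted access: one table per attribute, best value first."""

    def __init__(self, relation, attr):
        # Built once, offline, and reused by every query.
        self._entries = sorted(
            ((row[attr], obj_id) for obj_id, row in relation.items()),
            reverse=True,
        )
        self._pos = 0

    def get_next(self):
        """Return the next (obj_id, value) pair, or None when exhausted."""
        if self._pos >= len(self._entries):
            return None
        value, obj_id = self._entries[self._pos]
        self._pos += 1
        return obj_id, value


class RandomAccess:
    """Random access: look up a full tuple by its ObjectId."""

    def __init__(self, relation):
        self._rows = dict(relation)

    def get(self, obj_id):
        return self._rows[obj_id]


by_cuisine = SortedList(RELATION, "Cuisine")
print(by_cuisine.get_next())         # (1, 9)
print(RandomAccess(RELATION).get(4))  # {'Cuisine': 8, 'Location': 9}
```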
FA Algorithm
• Step 1
• Example: determining the top 1 restaurant based on the given ranking function Score(RestId) = 2 × Cuisine + Location
[Tables: original relation; relation sorted by Cuisine; relation sorted by Location]
FA Algorithm
• Step 1
• Call GetNext on both sorted tables in round-robin order.
• Stop when K objects have been seen in common in all lists (K = 1 in our example).
• RestId 4 is the winner in our example.
[Tables: relation sorted by Cuisine; relation sorted by Location]
FA Algorithm
• Step 2
• Use random access to calculate the score of every tuple visited in step 1.
• Take the top K after evaluation.
• The algorithm is applicable only if the ranking function is monotonic, that is, increasing an attribute value never decreases the overall score.
• In the worst case it performs the same as the scan algorithm.
• The worst-case memory requirement is unbounded, since every visited tuple must be remembered.
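The two steps of FA can be sketched as below. This is an illustrative sketch only: the restaurant values are invented (the original tables did not survive), the sorted lists are built inline rather than offline, and the function and variable names are hypothetical. The ranking function 2 × Cuisine + Location is monotonic because both weights are positive.

```python
# Hypothetical relation: RestId -> attribute values. Values are invented.
RELATION = {
    1: {"Cuisine": 9, "Location": 2},
    2: {"Cuisine": 3, "Location": 8},
    3: {"Cuisine": 7, "Location": 7},
    4: {"Cuisine": 8, "Location": 9},
    5: {"Cuisine": 4, "Location": 5},
}
ATTRS = ["Cuisine", "Location"]


def score(row):
    """Monotonic ranking function from the example."""
    return 2 * row["Cuisine"] + row["Location"]


def sorted_ids(relation, attr):
    """Stand-in for the offline sorted table of one attribute."""
    return sorted(relation, key=lambda obj_id: relation[obj_id][attr], reverse=True)


def fagin_top_k(relation, attrs, score, k):
    lists = {a: sorted_ids(relation, a) for a in attrs}
    seen_in = {}  # obj_id -> set of lists in which it has been seen
    depth = 0
    # Step 1: round-robin sorted access until k objects appear in all lists.
    while depth < len(relation):
        for a in attrs:
            seen_in.setdefault(lists[a][depth], set()).add(a)
        depth += 1
        seen_in_all = sum(1 for s in seen_in.values() if len(s) == len(attrs))
        if seen_in_all >= k:
            break
    # Step 2: random access to score every visited object, then take the top k.
    ranked = sorted(seen_in, key=lambda obj_id: score(relation[obj_id]), reverse=True)
    return ranked[:k], depth


print(fagin_top_k(RELATION, ATTRS, score, 1))  # ([4], 2)
```

With this made-up data, step 1 stops after two rounds of sorted access, having visited three of the five restaurants. Note that `seen_in` holds every visited object, which is the source of FA's memory cost.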
TA Algorithm
• Known as the Threshold Algorithm.
• Similar to FA, but sorted access and random access are interleaved.
• Step 1
• Do sorted access on each list; for every object seen, do the corresponding random accesses to compute its overall score, and keep the K best objects seen so far.
• Step 2
• Determine the threshold value: the score of a hypothetical tuple built from the attribute values last seen under sorted access in each list.
• If K objects have an overall score ≥ the threshold value, stop.
• Otherwise, go to the next entry position in the sorted lists and repeat step 1.
• Faster than FA.
• Requires less memory: only the current top K objects need to be buffered.
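The interleaving of sorted and random access, and the threshold test, can be sketched as follows. As with any toy example, the restaurant values are invented, the sorted lists are built inline rather than offline, and all names are hypothetical. The sketch keeps only a buffer of at most K objects, with no record of other visited objects.

```python
# Hypothetical relation: RestId -> attribute values. Values are invented.
RELATION = {
    1: {"Cuisine": 9, "Location": 2},
    2: {"Cuisine": 3, "Location": 8},
    3: {"Cuisine": 7, "Location": 7},
    4: {"Cuisine": 8, "Location": 9},
    5: {"Cuisine": 4, "Location": 5},
}
ATTRS = ["Cuisine", "Location"]


def score(row):
    """Monotonic ranking function from the example."""
    return 2 * row["Cuisine"] + row["Location"]


def threshold_top_k(relation, attrs, score, k):
    if k <= 0 or not relation:
        return [], 0
    # Stand-in for the offline sorted tables, one per attribute.
    lists = {
        a: sorted(relation, key=lambda o, a=a: relation[o][a], reverse=True)
        for a in attrs
    }
    top = []  # buffer of at most k (score, obj_id) pairs
    depth = 0
    while depth < len(relation):
        last_seen = {}  # the hypothetical tuple for this depth
        for a in attrs:
            obj_id = lists[a][depth]   # sorted access
            row = relation[obj_id]     # random access for the other attributes
            last_seen[a] = row[a]
            if any(obj_id == o for _, o in top):
                continue               # already buffered
            s = score(row)
            if len(top) < k:
                top.append((s, obj_id))
            elif s > min(top)[0]:
                top[top.index(min(top))] = (s, obj_id)
        depth += 1
        threshold = score(last_seen)
        # Stop once k buffered objects score at least the threshold.
        if len(top) == k and min(top)[0] >= threshold:
            break
    return [o for _, o in sorted(top, reverse=True)], depth


print(threshold_top_k(RELATION, ATTRS, score, 1))  # ([4], 2)
```

With this made-up data the threshold drops from 27 to 24 after the second round, at which point RestId 4 (score 25) meets it and the scan stops.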