Supporting personalized ranking over categorical attributes. Presenter: Lin, Shu-Han. Authors: Gae-won You, Seung-won Hwang, Hwanjo Yu. Information Sciences 178 (2008).
Outline • Motivation • Objective • Methodology • Experiments • Conclusion • Comments
Motivation • Personalized ranking in information retrieval is problematic over categorical attributes. • Categorical attributes do not have an inherent ordering. • How can the relevant data be ranked by categorical attributes? • For example, how can we find older females who prefer soda drinks?
Objectives • Enable uniform ranked retrieval over a combination of categorical attributes and numerical attributes. • Support ranking over a binary representation of categorical attributes • Binary encoding • Sparsity: single-valued attribute; multi-valued attribute with bounded cardinality (item set, bc=2)
Overview [Figure: system overview with three components labeled (1), (2), and (3)]
Rank formulation F = 0.5*age + 3*female + …
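A minimal sketch of how such a linear ranking function could be evaluated once a categorical attribute is binary-encoded. The helper names `binary_encode` and `score`, the `gender` field, and its two-value domain are assumptions made for illustration; only the weights 0.5 and 3 come from the slide.

```python
def binary_encode(value, domain):
    """Turn one categorical value into 0/1 indicator features, one per domain value."""
    return {v: int(v == value) for v in domain}

def score(obj, weights):
    """Evaluate F as a weighted sum over numerical and indicator features."""
    features = {"age": obj["age"]}
    # Hypothetical domain for the categorical attribute.
    features.update(binary_encode(obj["gender"], ["female", "male"]))
    return sum(w * features.get(name, 0) for name, w in weights.items())

# Weights from the slide's example: F = 0.5*age + 3*female + ...
weights = {"age": 0.5, "female": 3}
older_male = {"age": 30, "gender": "male"}
younger_female = {"age": 25, "gender": "female"}
print(score(older_male, weights), score(younger_female, weights))
```

With these weights, the indicator term lets the younger female outrank the older male, which is the kind of trade-off a personalized ranking function expresses.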
Rank processing (TA)
A simple example query: find older females who prefer soda drinks, transformed into F = age + female.
• Candidate identification
  • Sorted access on age and female
  • Find the top-k objects under each sorted access, sa(age) and sa(female); e.g., with k=1, sa(age)={o1} and sa(female)={o2}
• Candidate reduction
  • F(o1) = 30 + 0 = 30
  • F(o2) = 25 + 1 = 26
  • o1 has the highest F score
• Termination
  • F(o1) = 30 does not reach the upper-bound score F(30,1) = 31, so processing cannot stop yet
  • Another round of sorted access considers more candidates, e.g., sa(age)={o4}; sa(female)={o3}
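The rounds above can be sketched as a small threshold-algorithm loop. This is an illustrative sketch, not the paper's implementation: the values for o1 and o2 follow the slide, while the values for o3 and o4 and the function name `threshold_algorithm` are hypothetical.

```python
import heapq

def threshold_algorithm(objects, attrs, k):
    """Return (top-k list, rounds used) for F = sum of the given attributes."""
    # One list per attribute, sorted by descending value (sorted access).
    lists = {a: sorted(objects, key=lambda o: -objects[o][a]) for a in attrs}
    seen = {}
    rounds = 0
    for depth in range(len(objects)):
        rounds += 1
        last = {}
        for a in attrs:
            oid = lists[a][depth]
            last[a] = objects[oid][a]
            # Random access: fetch the full score of the newly seen object.
            seen[oid] = sum(objects[oid][b] for b in attrs)
        # Upper bound on the score of any object not yet seen.
        threshold = sum(last.values())
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, rounds
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1]), rounds

objects = {
    "o1": {"age": 30, "female": 0},  # from the slide
    "o2": {"age": 25, "female": 1},  # from the slide
    "o3": {"age": 20, "female": 1},  # hypothetical
    "o4": {"age": 28, "female": 0},  # hypothetical
}
top, rounds = threshold_algorithm(objects, ["age", "female"], k=1)
print(top, rounds)
```

Under these assumed values, the first round leaves o1 at 30 against a bound of 31, and the second round lowers the bound to 28 + 1 = 29, so o1 is returned after two rounds.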
Bitmap – binary encoding
F = v1 + v2 + v3 + v4, k = 2
1) K={}, C={1111} (Initialization)
2) OID=execute(C)
3) OID={o4}, |OID|>0, K={[o4,4]}
4) C={0111, 1011, 1101, 1110} (Expansion)
5) K.count<k, back to 2)
6) …
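The loop above can be sketched with Python integers standing in for bitmaps. This is a sketch under assumptions: the rows for o1 to o3 are hypothetical (only o4 = 1111 is implied by the slide), the names `build_bitmaps`, `execute`, and `top_k` are illustrative, and a priority queue is used so the order also holds for non-negative weights other than 1.

```python
import heapq

def build_bitmaps(rows, n_attrs):
    """bitmaps[i] has bit j set when row j has v_i = 1."""
    bitmaps = [0] * n_attrs
    for j, (_, bits) in enumerate(rows):
        for i in range(n_attrs):
            if bits[i]:
                bitmaps[i] |= 1 << j
    return bitmaps

def execute(pattern, bitmaps, n_rows):
    """Row indices whose bit vector equals the pattern, found by AND-ing bitmaps."""
    full = (1 << n_rows) - 1
    hits = full
    for i, bit in enumerate(pattern):
        hits &= bitmaps[i] if bit else (~bitmaps[i] & full)
    return [j for j in range(n_rows) if (hits >> j) & 1]

def top_k(rows, weights, k):
    n = len(weights)
    bitmaps = build_bitmaps(rows, n)
    score = lambda p: sum(w * b for w, b in zip(weights, p))
    start = (1,) * n                      # initialization: C = {11...1}
    heap = [(-score(start), start)]
    visited = {start}
    result = []                           # K in the slide
    while heap and len(result) < k:
        neg, pattern = heapq.heappop(heap)
        for j in execute(pattern, bitmaps, len(rows)):
            result.append((rows[j][0], -neg))
        # Expansion: flip each 1 to 0 to get the next-best candidates.
        for i in range(n):
            if pattern[i]:
                child = pattern[:i] + (0,) + pattern[i + 1:]
                if child not in visited:
                    visited.add(child)
                    heapq.heappush(heap, (-score(child), child))
    return result[:k]

rows = [
    ("o1", (1, 0, 1, 0)),  # hypothetical
    ("o2", (0, 1, 1, 1)),  # hypothetical
    ("o3", (0, 0, 1, 1)),  # hypothetical
    ("o4", (1, 1, 1, 1)),  # matches the slide: o4 scores 4
]
print(top_k(rows, [1, 1, 1, 1], k=2))
```

Because flipping a 1 to 0 never raises the score when weights are non-negative, candidates leave the queue in non-increasing score order, so the first k matches are the top-k.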
Bitmap – sparsity: single-valued attribute
F = w1v1 + w2v2 + … + w6v6, with ranked weights w1 ≥ w2 ≥ w3 and w4 ≥ w5 ≥ w6; for simplicity, all w = 1 and k = 2
1) K={}, C={100.100.100} (Initialization)
2) OID=execute(C)
3) OID={o4}, |OID|>0, K=OID={[o4,2]}
4) C={010.100.100, 100.010.100, 100.100.010} (Expansion)
5) K.count<k, back to 2)
6) …
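The expansion step for single-valued attributes can be sketched as follows, assuming that each dot-separated group encodes one attribute with exactly one bit set and that positions within a group are ordered by descending weight. The function name `expand` is illustrative.

```python
def expand(candidate):
    """Generate next candidates by moving one group's single 1 one position right."""
    groups = candidate.split(".")
    children = []
    for g, bits in enumerate(groups):
        pos = bits.index("1")
        if pos + 1 < len(bits):  # a lower-ranked value exists in this group
            shifted = "0" * (pos + 1) + "1" + "0" * (len(bits) - pos - 2)
            children.append(".".join(groups[:g] + [shifted] + groups[g + 1:]))
    return children

print(expand("100.100.100"))
```

Keeping exactly one set bit per group means the expansion never generates patterns that no single-valued record could match, which is how sparsity prunes the candidate space compared with flipping arbitrary bits.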
Bitmap – sparsity: multi-valued attribute with bounded cardinality
Experiments • Sparsity of indicator variables in the UCI datasets • 22% of the datasets consist only of categorical attributes. • 56% combine numerical and categorical attributes.
Conclusions • This paper studies • How to support rank formulation • Processing over data with categorical attributes • Instead of adopting existing numerical algorithms, develop a bitmap-based approach to • Binary encoding • Sparsity • Single-valued • Multi-valued with bounded cardinality
Comments • Advantage • … • Drawback • … • Application • …