Contents
• What is a proper recommendation?
• Defining Similarities for OLAP Sessions
• The SROS Recommender System
• Assessing the quality of the recommender system
• Conclusion & Perspectives
What is a proper recommendation? > Recommender Systems (E-commerce)
Here are the movies that I rated: Per qualche dollaro in più (1)
Recommendation? Reservoir Dogs (0.7), Requiem for a Dream (0.7), Django Unchained (0.7)
• Three approaches:
  • Content-based
  • Collaborative Filtering
  • Hybrid
Best approaches
Quality measures already proposed
[Figure label: Aleksander]
What is a proper recommendation? > Recommender Systems (Databases)
Recommendations in a Database Context
• Query:
  • Declarative language
  • Over a schema
• How to recommend queries or sessions?
  • In a multi-user context
  • Leveraging the schema
Collaborative Filtering approach, using the query expression to be efficient
[Figure labels: LOG, Analysis Sessions, Aleksander, Multi-user context]
What is a proper recommendation? > Query Recommendation in Data Warehouses > Modeling Multidimensional Data
One cube with hierarchies, and a set of measures: Income, PropInsr, PerWt, CostGas, CostWtr, CostElec
Query: fragment-based
• Group-by set: City, AllRaces, Year, AllOccs, Sex
• Selection set: Sex=Female, City=NewYork
• Measure set: AvgIncome, CostGas
Session: a sequence of queries (Query 1, Query 2, Query 3)
What is a proper recommendation? > Query Recommendation in Databases > SnipSuggest approach
SnipSuggest approach [Khoussainova et al., 2010]
Association rules (from the log):
• FROM Movies, WHERE director='Quentin Tarantino' -> WHERE genre='Western' (Conf: 1/5)
• FROM Movies, WHERE director='Quentin Tarantino' -> WHERE genre='Thriller' (Conf: 4/5)
• SELECT title, WHERE director='Quentin Tarantino' -> WHERE year>1995 (Conf: 4/5)
Top-k suggestions for the current query: SELECT title, genre FROM Movies WHERE director = 'Quentin Tarantino' AND
[Figure label: Aleksander]
What is a proper recommendation? > Query Recommendation in Data Warehouses > [Sapia, 2000] & [Aufaure et al., 2013] approaches
• PROMISE [Sapia, 2000] & [Aufaure et al., 2013] approaches
[Figure labels: LOG, Clusters of queries, Query, c1, Current Query, Aleksander]
What is a proper recommendation? > Query Recommendation in Data Warehouses > [Giacometti et al., 2009] & [Negre, 2009] approaches
Generic framework:
• Between the current session and the log sessions (Session 1, Session 2): Edit Distance
• Between queries (Query 1, Query 2): Hausdorff Distance
• Between positions in a cube: Distance in Hierarchy
What is a proper recommendation? > Query Recommendation in Databases & Data Warehouses > Approaches
[Comparison table of approaches. Recoverable cells: SROS, Yes, Query Expression, Sequence of queries, Similarity-based]
What is a proper recommendation? > Query Recommendation in Databases & Data Warehouses
A Desirable Recommendation:
• How to propose an informative recommendation to the user?
  Proposal: Recommend a sequence of queries
• How to find log sessions close to the current session?
  Proposal: Define a two-level similarity measure between sessions
• How to be consistent with the context of the current session?
  Proposal: Adapt the recommendation to the context of the current session
• How to provide a relevant recommendation to the user?
  Proposal: Define quality criteria assessing the recommendation
Defining Similarities for OLAP Sessions > Requirements for Similarity-based Recommendation of OLAP Sessions
Requirements for Similarity Measures between OLAP Sessions
• Intuitively:
  • The order of queries is relevant
  • Recent queries are more relevant than older queries
• How to define a similarity measure for sequences?
  • Classically: Dice, Edit Distance, soft-TFIDF, Subsequence Alignment
• How to include the comparison between the sequence elements?
  • A two-level approach based on similarities between elements
  • Classically: Cosine similarity, Hamming Distance, Hausdorff Distance
  • No proposal for a measure using the query expression!
Defining Similarities for OLAP Sessions > Similarity Measure between Queries [Aligon et al., KAIS, 2013]
Two example queries (labeled Query 1 and Query 2 on the slide) are compared fragment by fragment:
• Group-by sets: {City, AllRaces, Year, AllOccs, Sex} and {City, AllRaces, AllYear, AllOccs, Sex}
• Selection sets: {Sex=Female, City=New-York} and {Sex=Female, Region=West}
• Measure sets: {AvgIncome, CostGas} and {AvgIncome, MaxCostElec}
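To make the fragment-by-fragment comparison concrete, here is a minimal Python sketch. It is not the measure of [Aligon et al., KAIS, 2013], whose exact definition is not given on the slide: as a stand-in, it assumes an unweighted average of Jaccard overlaps on the three fragment sets. The names `Query`, `jaccard` and `query_similarity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """A fragment-based OLAP query: three sets of fragments (hypothetical model)."""
    group_by: frozenset
    selection: frozenset
    measures: frozenset


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def query_similarity(q1, q2):
    """Assumed stand-in: unweighted mean of the per-component Jaccard overlaps."""
    parts = [
        jaccard(q1.group_by, q2.group_by),
        jaccard(q1.selection, q2.selection),
        jaccard(q1.measures, q2.measures),
    ]
    return sum(parts) / len(parts)


# The two queries shown on the slide.
q_a = Query(
    frozenset({"City", "AllRaces", "Year", "AllOccs", "Sex"}),
    frozenset({"Sex=Female", "City=New-York"}),
    frozenset({"AvgIncome", "CostGas"}),
)
q_b = Query(
    frozenset({"City", "AllRaces", "AllYear", "AllOccs", "Sex"}),
    frozenset({"Sex=Female", "Region=West"}),
    frozenset({"AvgIncome", "MaxCostElec"}),
)
print(query_similarity(q_a, q_b))
```

A plain set overlap ignores how close two levels are within a hierarchy (Year versus AllYear, City versus Region); a measure based on the cube structure would score such pairs as partially similar.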
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions [Aligon et al., KAIS, 2013]
• Different classical approaches extended to the OLAP context:
  • Dice Coefficient
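As a reminder of the classical starting point, the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), over two sessions treated as sets of queries. It assumes exact query matching and hashable query identifiers; how the slides extend it to the OLAP context is not shown here, and the function name `dice` is hypothetical.

```python
def dice(session_a, session_b):
    """Classical Dice coefficient between two sessions seen as sets of queries.

    Queries are compared by exact equality (an assumption of this sketch);
    the order of queries inside a session is ignored.
    """
    a, b = set(session_a), set(session_b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


print(dice(["q1", "q2", "q3"], ["q2", "q3", "q4"]))
```

Because Dice ignores order, it cannot by itself satisfy the requirement that the order of queries is relevant.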
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions [Aligon et al., KAIS, 2013]
• Different classical approaches extended to the OLAP context:
  • soft-TF-IDF
[Figure labels: TF-IDF Scores, Log]
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions [Aligon et al., KAIS, 2013]
• Different classical approaches extended to the OLAP context:
  • Levenshtein Distance
    • Based on a cost matrix that computes, between two sessions, the minimal number of:
      • Insertions (I)
      • Deletions (D)
      • Substitutions (S)
[Figure: example alignment of two sessions with (I), (S), (D) operations and a matching]
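The cost matrix mentioned above can be sketched as the textbook dynamic program for the Levenshtein distance, applied here to sequences of queries. This is the classical version with unit costs and exact query matching; any OLAP-specific extension is outside the sketch, and the name `edit_distance` is hypothetical.

```python
def edit_distance(session_a, session_b):
    """Minimal number of insertions, deletions and substitutions turning
    session_a into session_b (classical Levenshtein, unit costs)."""
    m, n = len(session_a), len(session_b)
    # d[i][j] = distance between the first i queries of a and the first j of b.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i queries
    for j in range(n + 1):
        d[0][j] = j  # insert all j queries
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitution = 0 if session_a[i - 1] == session_b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,                 # deletion (D)
                d[i][j - 1] + 1,                 # insertion (I)
                d[i - 1][j - 1] + substitution,  # match or substitution (S)
            )
    return d[m][n]


# One substitution (q2 -> q4) and one insertion (q5).
print(edit_distance(["q1", "q2", "q3"], ["q1", "q4", "q3", "q5"]))
```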
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions [Aligon et al., KAIS, 2013]
• Different classical approaches extended to the OLAP context:
  • Subsequence Alignment
    • Finds the optimal local alignment, a trade-off between the cost of gaps and the cost of mismatches
    • No alignment -> minimal similarity
    • Perfect alignment -> maximal similarity
    • Good alignment -> good similarity
    • Very bad alignment -> very low similarity
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions
Extension of Subsequence Alignment in the OLAP context
• Time-discounting function
• Gap penalty: variable score ensuring few gaps and long alignments
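For orientation, the sketch below shows the classical local alignment recurrence (Smith-Waterman style) on two sessions, with a pluggable query similarity. It deliberately leaves out the two OLAP-specific ingredients named above, the time-discounting function and the variable gap penalty, because their definitions are not given on the slides. The `threshold` and constant `gap` parameters, and the name `local_alignment_score`, are assumptions of this sketch.

```python
def local_alignment_score(session_a, session_b, sim, threshold=0.5, gap=0.25):
    """Best local alignment score between two sessions.

    sim(q1, q2) returns a similarity in [0, 1]. Pairs scoring above
    `threshold` add to the alignment, pairs below it subtract (assumed
    scoring scheme). `gap` is a constant penalty per skipped query.
    """
    rows, cols = len(session_a) + 1, len(session_b) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            match = h[i - 1][j - 1] + sim(session_a[i - 1], session_b[j - 1]) - threshold
            # 0.0 lets an alignment restart anywhere: that is what makes it local.
            h[i][j] = max(0.0, match, h[i - 1][j] - gap, h[i][j - 1] - gap)
            best = max(best, h[i][j])
    return best


def exact(q1, q2):
    """Toy query similarity: 1.0 for identical queries, else 0.0."""
    return 1.0 if q1 == q2 else 0.0


# "b, c" is the common subsequence; the surrounding queries do not align.
print(local_alignment_score(["a", "b", "c", "d"], ["x", "b", "c", "y"], exact))
```

With no alignable pair the score stays at 0.0 (minimal similarity), and it grows with the length of the aligned region, which mirrors the qualitative behavior listed on the Subsequence Alignment slide.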
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions > Subjective & Objective Tests
Assessed with Subjective Tests
• 41 students & researchers from the eBISS 2011 summer school
• They rated the similarity (Low, Fair, Good, High) between a current query and different queries
• They rated the similarity (Low, Fair, Good, High) between a current session and different sessions
[Figure: questionnaire collecting opinions on query similarities and session similarities]
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions > Subjective & Objective Tests
Assessed with Subjective Tests
[Figure: results of the subjective tests]
Defining Similarities for OLAP Sessions > Similarity Measure between Sessions > Subjective & Objective Tests
Assessed with Objective Tests
[Figure: results of the objective tests]
SROS System
• Composed of three phases:
  • Selection: Select a set of possible futures
  • Ranking: Rank the futures from frequent similarities
  • Tailoring: Adapt the best future to the current session
SROS System > Selection
[Figure: step 1, Selection. Labels: Current Session, Aleksander, Log, Log Session, Former, Close Sessions, Future, Futures]
SROS System > Ranking
[Figure: step 2, Ranking, applied to the futures produced by step 1, Selection; the futures are annotated with scores. Labels: Current Session, Aleksander, Log, Former, Close Sessions, Futures]
SROS System > Tailoring
Adapted Recommendation
• Step 3, Tailoring, relies on association rules
• Rules of Type 1 (fragments shown: Year=2005, Year=2002)
[Figure: the current session, the log session and the recommendation drawn as sequences of queries over group-by levels (AllCities, Region, State, City), selections (Year=2002, Year=2005) and measures (AvgIncome, AvgCostGas, AvgCostWtr)]
SROS System > Tailoring
Adapted Recommendation
• Step 3, Tailoring, relies on association rules
• Rules of Type 2 (fragments shown: AvgIncome, Year=2002)
[Figure: the current session and the recommendation drawn as sequences of queries over group-by levels (AllCities, Region, State, City), the selection Year=2002 and measures (AvgIncome, AvgCostGas, AvgCostWtr)]
Assessing the quality of the recommender system > Quality Measures
• Different quality criteria:
  • Novelty
[Figure: the recommendation compared with the log]
Assessing the quality of the recommender system > Quality Measures
• Different quality criteria:
  • Novelty
  • Adaptation
[Figure: a query of the current session and a query of the recommendation, both over AllCities, Race, AllOccs, Year. Fragments shown: AllSexes, Sex, AvgCostElec, AvgIncome, Year=2002, Year=2005]
Assessing the quality of the recommender system > Quality Measures
• Different quality criteria:
  • Novelty
  • Adaptation
  • Accuracy & Coverage
[Figure: expected recommendations compared with the recommendations]
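The slide does not define how accuracy is computed. One common way to compare a set of recommended items with a set of expected items is precision, recall and their F-measure; the sketch below shows that generic computation only, as an assumption, and should not be read as the deck's definition of accuracy or coverage. The function names are hypothetical.

```python
def precision_recall(recommended, expected):
    """Set-based precision and recall of a recommendation (generic definition)."""
    rec, exp = set(recommended), set(expected)
    hits = len(rec & exp)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(exp) if exp else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


p, r = precision_recall(["q1", "q2", "q3"], ["q2", "q3", "q4", "q5"])
print(p, r, f_measure(p, r))
```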
Assessing the quality of the recommender system > Quality Measures
• An effective recommender system finds a trade-off between:
  • Novelty: the recommendation provides new, informative fragments from the log
  • Adaptation: the recommendation stays consistent with the current session
Assessing the quality of the recommender system > Synthetic & Real Logs
Behavior Generation
• Explorative template
• Goal-oriented template
Legend: IQ: Initial Query; FQ: Final Query; SQ: Surprising Query; RQ: Random Query
[Figure: the two templates drawn as paths of queries from IQ through RQ (and SQ) to FQ. Labels: Shortest OLAP Path, One random operation]
• Synthetic log:
  • including Explorative and Goal-Oriented templates, randomly chosen
  • composed of 200 sessions (a total of 2950 queries)
Assessing the quality of the recommender system > Synthetic & Real Logs
Real Logs
• Questionnaires completed by Master's students:
  • 40 students from the University of Bologna and the University of Tours
  • Devising OLAP sessions that answer questions of varying complexity
• Filtering is essential (145 sessions after filtering)
Assessing the quality of the recommender system > Effectiveness Test > Principle
Effectiveness Test Principle
• Conducted with the following logs:
  • The synthetic log
  • The logs devised by the students
• N-fold cross-validation
[Figure labels: Log, Current session, Expected Recommendation]
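The N-fold principle can be sketched as follows: the log is partitioned into N folds, and each fold in turn is held out while the others play the role of the log. How a held-out session is divided into a current session and an expected recommendation is not specified on the slide; the prefix/suffix split below is an assumption, as are the names `n_fold_splits` and `split_session`.

```python
def n_fold_splits(sessions, n_folds):
    """Yield (log, held_out) pairs, holding out each fold exactly once."""
    folds = [sessions[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        held_out = folds[k]
        log = [s for i, fold in enumerate(folds) if i != k for s in fold]
        yield log, held_out


def split_session(session, prefix_len):
    """Assumed protocol: the first prefix_len queries act as the current
    session, the remaining ones as the expected recommendation."""
    return session[:prefix_len], session[prefix_len:]


sessions = [[f"s{i}q{j}" for j in range(4)] for i in range(10)]
for log, held_out in n_fold_splits(sessions, 5):
    for session in held_out:
        current, expected = split_session(session, 2)
        # A recommender would be run here with (log, current) and its
        # output compared against `expected`.
```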
Assessing the quality of the recommender system > Effectiveness Test > Accuracy Result
[Figure: accuracy results on the synthetic log]
Assessing the quality of the recommender system > Effectiveness Test > Accuracy Result
[Figure: accuracy results on the student log]
Assessing the quality of the recommender system > Effectiveness Test > Novelty Result
[Figure: results for the novelty measure]
Assessing the quality of the recommender system > Effectiveness Test > Adaptation Result
[Figure: results for the adaptation measure]
Conclusion & Perspectives > Conclusion
• Requirements for recommendation and similarity measures
• Definitions of query and session similarities for OLAP
  • Assessed with subjective and objective tests
  • Query similarity based on the structure
  • Session similarity extending Subsequence Alignment
• Proposal of a similarity-based recommender system for OLAP sessions, based on three phases:
  • Selection
  • Ranking
  • Tailoring
• The recommender system is assessed in terms of effectiveness:
  • Quality measure proposals
  • The recommendations:
    • are well adapted to the context of the current session
    • preserve the logic of the log session to provide new information to the user
    • are very accurate, for very different contexts of log density
Conclusion & Perspectives > A tool for Session Design Assistance
• How to remedy the cold-start problem?
  • Solution: exploring the former sessions to initiate the first queries of a current session
• Problem: how to navigate between the sessions?
  • Proposal: organizing sessions in a hierarchical structure and defining browsing operators
• Problem: how to represent groups of sessions?
  • Proposal: using summarization techniques to reduce the number of queries and sessions, and also to design a representative session ([Aligon & Marcel, EDA'2012], [Aligon et al., PersDB'2012])
Conclusion & Perspectives > A benchmark of OLAP sessions
• Development of a platform to assess the quality of an analytical session over a cube:
  • Making it possible to measure the effectiveness of user-centric approaches
  • Finding a more precise definition of an OLAP session
Conclusion & Perspectives > Adaptation of the Recommender System in other contexts
• Adaptation to a Data-Mining context, where sessions can be considered as sequences of complex tasks
  • This requires adapting the session similarity
• On the Web, sequences can be analysis sequences over social networks:
  • Taking into account the relationships between users
  • Considering a similarity measure between users to define user profiles