Hidden Concept Detection in Graph-Based Ranking Algorithm for Personalized Recommendation
Nan Li, Computer Science Department, Carnegie Mellon University
Introduction • Previous work: • Represents past user behavior through a relational graph. • Fails to represent individual differences among items of the same type. • Our work: • Detects hidden concepts embedded in the original graph. • Builds a two-level type hierarchy for explicit representation of item characteristics.
Relational Retrieval • Entity-Relation Graph G = (E, T, R): • Entity set E = {e}, entity type set T = {T}, entity relation set R = {R}. • Each entity e in E has a type e.T. • Each relation R has two entity types R.T1 and R.T2. If two entities have relation R, then R(e1, e2) = 1; otherwise 0. • Relational Retrieval Task: Query q = (Eq, Tq) • Given query entities Eq = {e'}, predict the relevance of each entity e of the target type Tq.
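The entity-relation graph above can be sketched with plain dictionaries. A minimal illustration; the class and method names (`Graph`, `add_relation`) are mine, not the paper's:

```python
# Minimal sketch of the entity-relation graph G = (E, T, R).
# Class and method names are illustrative, not from the paper.
from collections import defaultdict

class Graph:
    def __init__(self):
        self.entity_type = {}          # entity e -> its type e.T
        self.edges = defaultdict(set)  # (relation, entity) -> neighbor set

    def add_entity(self, e, t):
        self.entity_type[e] = t

    def add_relation(self, rel, e1, e2):
        # R(e1, e2) = 1; also store the inverse edge for random walks
        self.edges[(rel, e1)].add(e2)
        self.edges[(rel + "_inv", e2)].add(e1)

g = Graph()
g.add_entity("p1", "paper")
g.add_entity("a1", "author")
g.add_relation("written_by", "p1", "a1")
```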
Path Ranking Algorithm • Relational Path: • P = (R1, R2, …, Rn), where R1.T1 = T0 and Ri.T2 = Ri+1.T1. • Relational Path Probability Distribution: • hP(e) is the probability that a random walker following path P from a query entity reaches entity e.
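The path random walk described above can be sketched as step-by-step probability propagation. A hedged illustration; the `edges` layout is my assumption, not the paper's data structure:

```python
# h_P(e): probability that a random walker starting at `start` and following
# the relations in `path`, choosing neighbors uniformly at random, reaches e.
def path_distribution(edges, start, path):
    dist = {start: 1.0}
    for rel in path:
        nxt = {}
        for e, p in dist.items():
            neighbors = edges.get((rel, e), [])
            for n in neighbors:
                # split this entity's probability mass uniformly
                nxt[n] = nxt.get(n, 0.0) + p / len(neighbors)
        dist = nxt
    return dist

edges = {("cites", "p1"): ["p2", "p3"],
         ("written_by", "p2"): ["a1"],
         ("written_by", "p3"): ["a1", "a2"]}
d = path_distribution(edges, "p1", ["cites", "written_by"])
# d == {'a1': 0.75, 'a2': 0.25}
```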
PRA Model • A PRA model is specified by (G, l, θ). • Each column of the feature matrix A is a path distribution hP(e). • The scoring function: a weighted combination of the path feature values.
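The scoring function itself did not survive extraction; in standard PRA it is a weighted sum of the path feature columns. A reconstruction consistent with the feature matrix A above, not copied from the slides:

```latex
\mathrm{score}(e;\, q) = \sum_{P} \theta_P \, h_P(e)
\quad\Longleftrightarrow\quad
\mathbf{s} = A\,\theta
```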
Training PRA Model • Training data: D = {(q(m), y(m))}, where ye(m) = 1 if e is relevant to query q(m). • Parameters: the path weights θ. • Objective function: maximize the (regularized) likelihood of the relevance labels.
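The objective function was likewise lost in extraction; one standard form for PRA training is a regularized log-likelihood of the relevance labels (a reconstruction, not copied from the slides):

```latex
O(\theta) = \sum_{m} \sum_{e}
\Big[\, y_e^{(m)} \ln p_e^{(m)} + \big(1 - y_e^{(m)}\big) \ln\big(1 - p_e^{(m)}\big) \Big]
- \lambda \lVert \theta \rVert^2,
\qquad
p_e^{(m)} = \sigma\big(\mathrm{score}(e;\, q^{(m)})\big)
```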
Hidden Concept Detector (HCD) • Finds hidden subtypes of relations. • Two-Layer PRA [Figure: two-level type hierarchy over paper, gene, author, title, journal, year]
Bottom-Up HCD • Bottom-up merging algorithm, for each relation type Ri: • Step 1: Split relation Ri into subrelations Rij, one per starting node. • Step 2: HAC: repeatedly merge the two subrelations Rim and Rin that maximize the gain in the objective function, until no merge yields positive gain. [Figure: paper-author subrelations being merged]
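The Step 2 loop can be sketched as a standard greedy agglomerative merge. The `gain` function below is a stand-in for the paper's (approximated) objective-function gain, and all names are illustrative:

```python
# Greedy HAC over subrelations: repeatedly merge the pair of clusters with
# the largest positive gain; stop when no merge improves the objective.
def hac_merge(subrelations, gain):
    clusters = [frozenset([s]) for s in subrelations]
    while len(clusters) > 1:
        best, best_pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                g = gain(clusters[i], clusters[j])
                if g > best:
                    best, best_pair = g, (i, j)
        if best_pair is None:       # no merge yields positive gain
            break
        i, j = best_pair
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Toy gain: positive only when the clusters share the same label prefix,
# so "a1" and "a2" merge while "b1" stays separate.
gain = lambda a, b: 1.0 if {s[0] for s in a} == {s[0] for s in b} else -1.0
clusters = hac_merge(["a1", "a2", "b1"], gain)
```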
Approximate the Gain of the Objective Function • Calculate the maximum gain of two relations, gm and gn. • Use a Taylor series to approximate the gain.
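The approximation itself is missing from the slide. A generic expansion of the objective around the current parameters, which is the usual way such merge gains are approximated, would be (my reconstruction, not the slide's equation):

```latex
O(\theta + \Delta\theta) \approx O(\theta)
+ \nabla O(\theta)^{\top} \Delta\theta
+ \tfrac{1}{2}\, \Delta\theta^{\top} \nabla^2 O(\theta)\, \Delta\theta
```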
Experimental Results • Data Sets: • Saccharomyces Genome Database, a publication data set about the yeast organism Saccharomyces cerevisiae. • Three measurements: • Mean Reciprocal Rank (MRR): inverse of the rank of the first correct answer. • Mean Average Precision (MAP): the area under the precision-recall curve. • p@K: precision at K, where K is the actual number of relevant entities.
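The three measurements can be written down directly. A hedged sketch, assuming `ranked` is a list of entity ids ordered by predicted score and `relevant` is the gold set:

```python
# Reciprocal rank: 1 / rank of the first relevant entity, 0 if none found.
def reciprocal_rank(ranked, relevant):
    for i, e in enumerate(ranked, start=1):
        if e in relevant:
            return 1.0 / i
    return 0.0

# Average precision: mean of precision values at each relevant hit.
def average_precision(ranked, relevant):
    hits, total = 0, 0.0
    for i, e in enumerate(ranked, start=1):
        if e in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0

# p@K with K = number of relevant entities, as defined on the slide.
def precision_at_k(ranked, relevant):
    k = len(relevant)
    return sum(e in relevant for e in ranked[:k]) / k if k else 0.0

rr = reciprocal_rank(["x", "a", "b"], {"a", "b"})      # first hit at rank 2
ap = average_precision(["x", "a", "b"], {"a", "b"})
pk = precision_at_k(["x", "a", "b"], {"a", "b"})
```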
Normalized Cut • Training data: • Number of clusters ↑ → recommendation quality ↑ • Test data: • NCut outperforms random
HCD • Training data: • HCD outperforms PRA in all three measurements. • Test data: • The two systems perform equally well.
Future Work • Bottom-Up vs Top Down • Improve Efficiency • Type Recovery in Non-Labeled Graph
Building an intelligent agent that simulates human-level learning using machine learning techniques
A Computational Model of Accelerated Future Learning through Feature Recognition
Nan Li, Computer Science Department, Carnegie Mellon University
Accelerated Future Learning • Accelerated future learning: learning more effectively because of prior learning. • Widely observed, but how does it arise? • Expert vs. novice: • Expert: deep functional features (e.g., the coefficient -3 in -3x). • Novice: shallow perceptual features (e.g., the digit 3 in -3x).
A Computational Model • Model Accelerated Future Learning • Use Machine Learning Techniques • Acquire Deep Feature • Integrated into a Machine-Learning Agent
Feature Recognition as PCFG Induction • Underlying structure in the problem → grammar • Feature → intermediate symbol in a grammar rule • Feature learning task → grammar induction • Error → incorrect parsing
Problem Statement • Input: a set of feature recognition records, each consisting of • an original problem (e.g., -3x) • the feature to be recognized (e.g., -3 in -3x) • Output: • a PCFG • an intermediate symbol in a grammar rule
Accelerated Future Learning through Feature Recognition • Extended a PCFG Learning Algorithm (Li et al., 2009) • Feature Learning • Stronger Prior Knowledge: • Transfer Learning Using Prior Knowledge • Better Learning Strategy: • Effective Learning Using Bracketing Constraint
A Two-Step Algorithm • Greedy Structure Hypothesizer: • Hypothesizes the schema structure. • Viterbi Training Phase: • Refines schema probabilities. • Removes redundant schemas. • Generalizes the inside-outside algorithm (Lari & Young, 1990).
Greedy Structure Hypothesizer • Structure learning: • Bottom-up. • Prefers recursive structures to non-recursive ones.
EM Phase • Step One: • Plan parse tree computation: find the most probable parse tree. • Step Two: • Selection probability update for each schema s (rules of the form ai → aj ak).
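The probability-update step can be sketched as Viterbi-style relative-frequency re-estimation over the rules used in the most probable parse trees. The rule representation here is my assumption:

```python
# Re-estimate each rule's probability as (count of rule in best parses) /
# (count of its left-hand side), the standard Viterbi training update.
from collections import Counter

def update_probs(parse_rule_uses):
    rule_counts = Counter(parse_rule_uses)            # (lhs, rhs) -> count
    lhs_counts = Counter(lhs for lhs, _ in parse_rule_uses)
    return {(lhs, rhs): c / lhs_counts[lhs]
            for (lhs, rhs), c in rule_counts.items()}

# Toy example: symbol E used three times, twice rewriting to (T,).
probs = update_probs([("E", ("E", "T")), ("E", ("T",)), ("E", ("T",))])
```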
Feature Learning • Build Most Probable Parse Trees • For all observation sequences • Select an Intermediate Symbol that • Matches the most training records as the target feature
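The selection step above amounts to a vote count over parse trees. A minimal sketch, assuming each training record lists the intermediate symbols whose parse span matches the annotated feature (`SignedNumber` is a hypothetical symbol name):

```python
# Pick the intermediate symbol that matches the annotated feature in the
# most training records.
from collections import Counter

def select_feature_symbol(records):
    votes = Counter()
    for matching_symbols in records:      # one set of symbols per record
        votes.update(matching_symbols)
    return votes.most_common(1)[0][0]

sym = select_feature_symbol([{"SignedNumber"},
                             {"SignedNumber", "Term"},
                             {"SignedNumber"}])
```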
Transfer Learning Using Prior Knowledge • GSH phase: • Build parse trees based on the previously acquired grammar, then call the original GSH. • Viterbi training: • Add rule frequencies from the previous task to the current task. [Figure: example rule probabilities 0.5, 0.33, 0.5, 0.66]
Effective Learning Using Bracketing Constraint • Force the grammar to generate a feature symbol. • Learn a subgrammar for the feature. • Learn a grammar for the whole trace. • Combine the two grammars.
Experimental Results in Algebra • Both stronger prior knowledge and a better learning strategy can yield accelerated future learning. • Stronger prior knowledge produces faster learning outcomes. • L00 generated human-like errors. Fig. 2: Curriculum one. Fig. 3: Curriculum two. Fig. 4: Curriculum three.
Learning Speed in Synthetic Domains • Both stronger prior knowledge and a better learning strategy yield faster learning. • Stronger prior knowledge produces faster learning outcomes with a small amount of training data, but not with a large amount. • Learning with subtask transfer shows a larger difference in 1) the training process and 2) low-level symbols.
Score with Increasing Domain Sizes • The base learner, L00, shows the fastest drop. • Average time spent per training record: • Less than 1 millisecond, except for L10 (266 milliseconds). • L10 needs to maintain previous knowledge and does not separate the trace into smaller traces. • Conciseness: transfer learning doubled the size of the schema.
Integrating Accelerated Future Learning in SimStudent • A machine-learning agent that acquires production rules from examples and problem-solving experience. • Integrates the acquired grammar into production rules. • Requires only weak operators (non-domain-specific knowledge). • Fewer operators needed (e.g., for x+5).
Concluding Remarks • Presented a computational model of human learning that yields accelerated future learning. • Showed that • both stronger prior knowledge and a better learning strategy improve learning efficiency; • stronger prior knowledge produced faster learning outcomes than a better learning strategy. • Some models generated human-like errors, while others did not make any mistakes.