Heat Diffusion Model and its Applications. Haixuan Yang Term Presentation Dec 2, 2005. Outline. Introduction Heat Diffusion Model Heat Diffusion Classifiers Heat Diffusion Ranking Predictive Random Graph Ranking Experiments Conclusions and Future Work.
Introduction - heat diffusion Heat diffusion is a physical phenomenon. In a medium, heat always flows from a position with high temperature to a position with low temperature. The heat kernel describes the amount of heat that one point receives from another point. The way heat diffuses varies with the underlying geometry.
Introduction - related work • Kondor & Lafferty (NIPS2002) • Construct a diffusion kernel on a graph • Handle discrete attributes • Apply to a large margin classifier • Achieve good performance in accuracy on 5 data sets from UCI • Lafferty & Lebanon (JMLR2005) • Construct a diffusion kernel on a special manifold • Handle continuous attributes • Restrict to text classification • Apply to SVM • Achieve good performance in accuracy on WebKB and Reuters • Belkin & Niyogi (Neural Computation 2003) • Reduce dimension by heat kernel and local distance • Tenenbaum et al (Science 2000) • Reduce dimension by local distance
Introduction – the ideas adopted • Similarity between heat diffusion and density. • Heat diffuses in the same way as a Gaussian density in the ideal case when the manifold is Euclidean space. • The way heat diffuses on a manifold can be understood as a generalization of the Gaussian density from Euclidean space to the manifold. • Local information is relatively accurate in a nonlinear manifold. • Learn local information from the k nearest neighbors. Figure labels: the direct distance may not be accurate; the distance along the curve may be a better measure.
Introduction – different ideas • Unknown manifold in most cases. • Unknown solution for the known manifold. • The explicit form of the approximation to the heat kernel in (Lafferty & Lebanon JMLR2005) is a rare case. • Establish the heat diffusion equation directly on a graph that is either the K nearest neighbor graph or the link graph. • The K nearest neighbor graph or the link graph is considered as an approximation to the unknown manifold. • The solution always has an explicit form. • For classification, form a classifier directly from the solution. • Apply the heat kernel to ranking Web pages.
Heat Diffusion Model - Notations • G=(V,E), a given directed graph, where • V={1,2,…,n}, • E={(i,j): if there is an edge from i to j}, • fi(t): the heat at node i at time t. • RH(i,j,t,Δt): amount of heat that at time t, i receives from its antecedent j during a period of Δt. • DH(i,t,Δt): amount of heat that at time t, i diffuses to its subsequent nodes.
Heat Diffusion Model - assumptions • RH(i,j,t,Δt) is proportional to the time period Δt. • RH(i,j,t,Δt) is proportional to the heat at node j. • RH(i,j,t,Δt) is zero if there is no link from j to i. • DH(i,t,Δt) is proportional to the time period Δt. • DH(i,t,Δt) is proportional to the heat at node i. • RH(i,j,t,Δt) is proportional to its outdegree.
Heat Diffusion Model - solution • The difference between fi(t+Δt) and fi(t) can be expressed as: • It can be written in matrix form: where we let for simplicity. • Letting Δt tend to zero, the above equation becomes: • In particular, we have
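One way to write out this derivation is sketched below. It rests on an assumed reading of the assumptions: node j sends \gamma f_j(t) \Delta t along each of its out-links, so the total heat a node loses grows with its out-degree. The coefficient \gamma, the out-degree symbol d_i and the entries of H are choices made for this sketch, not forms taken from the slide.

```latex
% Heat balance at node i over a short period \Delta t
% (d_i = out-degree of node i, \gamma = diffusion coefficient; both assumed names).
f_i(t+\Delta t) - f_i(t)
  = -\gamma\, d_i\, f_i(t)\,\Delta t
    + \gamma \sum_{j:(j,i)\in E} f_j(t)\,\Delta t

% Matrix form, with f(t) = (f_1(t), \dots, f_n(t))^{T}
f(t+\Delta t) - f(t) = \gamma H f(t)\,\Delta t,
\qquad
H_{ij} =
\begin{cases}
  -d_i, & j = i,\\
  1,    & (j,i) \in E,\\
  0,    & \text{otherwise.}
\end{cases}

% Limit \Delta t \to 0 and its solution
\frac{d f(t)}{dt} = \gamma H f(t)
\quad\Longrightarrow\quad
f(t) = e^{\gamma t H} f(0),
\qquad
f(1) = e^{\gamma H} f(0).
```

Under this reading every column of H sums to zero, so the total heat in the graph is conserved while it spreads.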
Heat Diffusion Model – weighted graph • For weighted graphs, the heat difference fi(t+Δt) and fi(t) can be expressed as • The solution is expressed as
Heat Diffusion Classifiers - Illustration NHDC: Non-propagating Heat Diffusion Classifier PHDC: Propagating Heat Diffusion Classifier The first heat diffusion The second heat diffusion
Heat Diffusion Classifiers - Illustration Heat received from A class: 0.018 Heat received from B class: 0.016 Heat received from A class: 0.002 Heat received from B class: 0.08
Heat Diffusion Classifiers - algorithm - Step 1 • [Construct neighborhood graph] • Define graph G over all data points in both the training data set and the test data set. • Add an edge from j to i if j is one of the K nearest neighbors of i. • Set edge weight w(i,j)=d(i,j) if j is one of the K nearest neighbors of i, where d(i,j) is the Euclidean distance between point i and point j.
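Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's C implementation; the function name `knn_graph`, the dictionary layout and the toy points are assumptions made for the example.

```python
import math

def knn_graph(points, k):
    """Return {i: [(j, d_ij), ...]} listing the k nearest neighbors j of each point i.

    An entry (j, d_ij) under key i stands for the edge j -> i with weight
    w(i, j) = d(i, j), the Euclidean distance between the two points.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, closest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [(j, d) for d, j in dists[:k]]
    return graph

# Hypothetical toy data: four points in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(points, k=2)
```

The brute-force search costs O(n^2) distance evaluations, which is fine for small data sets; a spatial index would be the usual replacement for larger ones.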
Heat Diffusion Classifiers - algorithm - Step 2 • [Compute the Heat Kernel] • Computing H for NHDC using • Computing for PHDC using the equation
Heat Diffusion Classifiers - algorithm - Step 3 • [Compute the Heat Distribution] For each class c, • Set f(0) • Nodes labeled with class c have unit heat at time 0; all other nodes have no heat at time 0. • Compute the heat distribution • In PHDC, use equation to compute the heat distribution. • In NHDC, use equation
Heat Diffusion Classifiers - algorithm - Step 4 • [Classify the nodes] • The previous step gives the heat distribution for each class k. Each node in the test data set is then assigned to the class from which it receives the most heat.
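Steps 2 to 4 can be illustrated for the non-propagating case with a short Python sketch. It assumes that a labeled neighbor at distance d sends heat exp(-d^2/β); this weight is an assumption, chosen because it yields the Parzen window (K=n-1) and KNN (β tending to infinity) special cases described later in the talk. The name `nhdc_classify` and the data are hypothetical. The sketch handles a single test point, so all of its neighbors are labeled, and the constant γ is dropped because it does not change which class wins.

```python
import math

def nhdc_classify(train, labels, test_point, k, beta):
    """Classify test_point by the heat it receives from each class in one diffusion step.

    Assumption: a labeled neighbor at distance d sends heat exp(-d**2 / beta).
    """
    # The k labeled points closest to the test point.
    nearest = sorted((math.dist(test_point, x), y) for x, y in zip(train, labels))[:k]
    heat = {}
    for d, y in nearest:
        heat[y] = heat.get(y, 0.0) + math.exp(-d * d / beta)
    # Assign the class from which the most heat is received.
    return max(heat, key=heat.get), heat

# Hypothetical one-dimensional data embedded in the plane.
train = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.2, 0.0), (1.4, 0.0)]
labels = ["A", "A", "B", "B", "B"]
label, heat = nhdc_classify(train, labels, (0.3, 0.0), k=5, beta=0.1)
knn_label, _ = nhdc_classify(train, labels, (0.3, 0.0), k=5, beta=1e9)
```

With a small β the two nearby A points dominate; with a very large β every neighbor contributes about one unit of heat, so the decision turns into a plain majority vote among the K neighbors.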
Heat Diffusion Classifiers - Connections with other models • The Parzen window approach (when the window function takes the normal form) is a special case of the NHDC. • It is a non-parametric method for probability density estimation: For each class k The class-conditional density for class k Using Bayes rule Assign x to a class whose value is maximal.
Heat Diffusion Classifiers - Connections with other models • The Parzen window approach (when the window function takes the normal form) is a special case of the NHDC. • In our model, let K=n-1, then the graph constructed in Step 1 will be a complete graph. The matrix H will be Using the heat equation f(t)=Hf(0) Heat that xp receives from the data points in class k
Heat Diffusion Classifiers - Connections with other models • KNN is a special case of the NHDC. • KNN • Assign each test point to the class that occurs most often among its K nearest neighbors.
Heat Diffusion Classifiers - Connections with other models • KNN is a special case of the NHDC. • In our model, let β tend to infinity, then the matrix H becomes Using the heat equation f(t)=Hf(0) The number of cases in class k among its K nearest neighbors. Heat that xp receives from the data points in class k
Heat Diffusion Classifiers - Connections with other models • PHDC can approximate NHDC. • If γ is small, then Since the identity matrix has no effect on the heat distribution, PHDC and NHDC have similar classification accuracy when γ is small.
Heat Diffusion Classifiers - Connections with other models • PHDC approximately reduces to NHDC when γ is small. • NHDC reduces to KNN when β tends to infinity. • NHDC reduces to PWA (the Parzen window approach) when K=n-1.
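The small-γ claim can be checked numerically, reading it as exp(γH) ≈ I + γH, the first-order Taylor expansion. The Python sketch below compares the two matrices on a hypothetical 3-node chain; the matrix H and the helper names are assumptions made for the example.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(m, terms=30):
    """Truncated Taylor series exp(m) = sum over k of m^k / k!."""
    result = identity(len(m))
    term = identity(len(m))
    for k in range(1, terms):
        # term becomes m^k / k! from m^(k-1) / (k-1)!.
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def max_gap(h, gamma):
    """Largest entry-wise gap between exp(gamma*H) and I + gamma*H."""
    n = len(h)
    gh = [[gamma * x for x in row] for row in h]
    exact = mat_exp(gh)
    eye = identity(n)
    return max(abs(exact[i][j] - (eye[i][j] + gh[i][j])) for i in range(n) for j in range(n))

# Hypothetical 3-node chain 0 -> 1 -> 2 (column j feeds row i).
H = [[-1.0, 0.0, 0.0],
     [1.0, -1.0, 0.0],
     [0.0, 1.0, 0.0]]
small = max_gap(H, 0.01)
large = max_gap(H, 1.0)
```

The gap shrinks like γ^2, so for small γ the two kernels assign nearly the same heat to every node and the resulting class decisions rarely differ.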
Heat Diffusion Ranking - motivation • The Web pages are considered to be drawn from an unknown manifold. • The link structure forms a directed graph, which is considered as an approximation to the unknown manifold. • The heat kernel established on the Web graph is considered as a representation of the relationship between Web pages. • When there are more paths from page j to page i, i will receive more heat from j; • When the path length from j to i is shorter, i will receive more heat from j.
Heat Diffusion Ranking - algorithm Let V be the set of Web pages. If there is a link from j to i, we say there is an edge (j,i). If the graph is a static graph: • Compute the matrix H • Compute or • The element in row i and column j is the amount of heat that i receives from j from time 0 to 1, and is used to measure the similarity from j to i. If the graph is a random graph generated by the first stage of Predictive Random Graph Ranking: • Compute the matrix R • Compute or The algorithm is called DiffusionRank.
Heat Diffusion Ranking - advantages • Its solution has two forms, both of which are closed form. • Its solution is not symmetric, which better models the relative nature of similarity. • It can be naturally employed to detect group-group relations. • It can be used to resist manipulation.
Predictive Random Graph Ranking - motivation • To improve the accuracy of DiffusionRank, we need to model the Web graph more accurately, as a random graph. • The web is dynamic • The observer is partial • Links are different • The random graph model can also improve other ranking algorithms, and hence is called the predictive random graph ranking framework.
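The static-graph branch can be sketched in Python using a discrete kernel, that is, repeated small time steps (I + (γ/N)H)^N applied to an initial heat vector. The form of H (one unit of flow per link, so a page's diagonal entry is minus its out-degree), the uniform initial heat, the function names and the toy graph are all assumptions made for this illustration.

```python
def build_h(n, edges):
    """H[i][j] = 1 for each link j -> i; H[j][j] = -(out-degree of j). Assumed form."""
    h = [[0.0] * n for _ in range(n)]
    for j, i in edges:
        h[i][j] += 1.0   # page i receives heat from page j
        h[j][j] -= 1.0   # page j loses the same amount
    return h

def diffuse(h, f0, gamma=1.0, steps=20):
    """Apply (I + (gamma/steps) H)^steps to f0, one small time step at a time."""
    f = list(f0)
    n = len(f)
    for _ in range(steps):
        f = [f[i] + (gamma / steps) * sum(h[i][j] * f[j] for j in range(n))
             for i in range(n)]
    return f

# Hypothetical 4-page web; each edge is a (source, target) link.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
heat = diffuse(build_h(4, edges), [1.0, 1.0, 1.0, 1.0])
# Pages sorted from most to least heat at time 1.
ranking = sorted(range(4), key=lambda page: -heat[page])
```

In this toy graph page 2 collects heat from three in-links and ranks first, while page 3 has no in-links and ranks last. The step size γ/N has to stay small relative to the largest out-degree for the iteration to remain stable.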
Predictive Random Graph Ranking - framework • Random Graph Generation Stage • Engages the temporal, spatial and local link information to construct a random graph. • Random Graph Ranking Stage • Takes the random graph output and then calculates the ranking result based on a candidate ranking algorithm.
Predictive Random Graph Ranking – first stage • The web is dynamic • Predict the early Web structure as a random graph: the Temporal Web Prediction Model • The observer is partial • Different Web graphs Gi = (Vi ,Ei ) are obtained by N different observers (or crawlers). • A random graph RG=(V,P) is constructed by , where n(i,j) is the number of graphs in which the link (i,j) appears. • Links are different • As an example, a random graph RG=(V,P) can be constructed by where j is the k(i, j)-th out-link from i
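The partial-observer case can be sketched in Python under the assumption that the probability of link (i,j) is the fraction n(i,j)/N of observers that saw it. That formula is an assumed reading of the construction, and the function name and the crawl data are hypothetical.

```python
from collections import Counter

def random_graph(observed_edge_sets):
    """Estimate P(i, j) = n(i, j) / N from the edge sets seen by N observers."""
    n_observers = len(observed_edge_sets)
    # n(i, j): number of observed graphs in which link (i, j) appears.
    counts = Counter(edge for edges in observed_edge_sets for edge in set(edges))
    return {edge: count / n_observers for edge, count in counts.items()}

# Hypothetical link sets reported by four crawlers.
crawls = [
    {(0, 1), (1, 2)},
    {(0, 1), (2, 0)},
    {(0, 1), (1, 2), (2, 0)},
    {(0, 1)},
]
P = random_graph(crawls)
```

Links that never appear are simply absent from `P`, which corresponds to probability zero.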
Predictive Random Graph Ranking – Temporal Web Prediction Model • From the viewpoint of a crawler, the web is dynamic, and there are many dangling nodes (pages that either have no out-link or have no known out-link) • Classify dangling nodes • Dangling nodes of class 1 (DNC1) – those that have been found but have not been visited. • Dangling nodes of class 2 (DNC2) – those that have been tried but not visited successfully. • Dangling nodes of class 3 (DNC3) – those that have been visited successfully but from which no out-link is found.
Predictive Random Graph Ranking – Temporal Web Prediction Model • Suppose that all the nodes V can be partitioned into three subsets: C, D1 and D2. • C denotes the set of all non-dangling nodes (that have been crawled successfully and have at least one out-link); • D1 denotes the set of all dangling nodes of class 3; • D2 denotes the set of all dangling nodes of class 1; • For each node v in V, the real in-degree of v is not known.
Predictive Random Graph Ranking – Temporal Web Prediction Model • We predict the real in-degree of v by the number of found links from C to v. • Assumption: the number of found links from C to v is proportional to the real number of links from V to v. • The difference between real in-degree and the predicted in-degree is distributed uniformly to the nodes in .
Predictive Random Graph Ranking – Temporal Web Prediction Model Models the missing information from unvisited nodes to nodes in V: from D2 to V. Model the known link information as Page (1998): from C to V. Model the user’s behavior as Kamvar (2003) when facing dangling nodes of class 3: from D1 to V. n : the number of nodes in V; m: the number of nodes in C; m1: the number of nodes in D1.
Predictive Random Graph Ranking – second stage On a random graph RG=(V,P) • DiffusionRank
Predictive Random Graph Ranking – second stage On a random graph RG=(V,P) • PageRank • Common Neighbor • Jaccard’s Coefficient • SimRank
Experiments – Heat Diffusion Classifiers • 2 artificial data sets (Spiral-100, Spiral-1000) and 6 data sets from UCI • Compare with the Parzen window approach (where the window function takes the normal form) and KNN. • The result is the average over ten-fold cross validation.
Experiments - Heat Diffusion Classifiers • Experimental Setup • Experimental Environments • Hardware: Nix Dual Intel Xeon 2.2GHz • OS: Linux Kernel 2.4.18-27smp (RedHat 7.3) • Developing tool: C • Data Description In Credit-g, the 13 discrete variables are ignored since we only consider the continuous variables.
Experiments - Heat Diffusion Classifiers • Parameters Setting
Experiments - Heat Diffusion Classifiers • Results
Experiments – Predictive Random Graph Ranking • Data • Synthetic Web Graph • Follow a power law • Real Web Graph • Within cuhk.edu.hk
Experiments – Predictive Random Graph Ranking • Methodology • For each algorithm A, we have two versions denoted by A and PreA. • A – the original version • PreA -- the version with the Temporal Web Prediction Model • For each data series and for each algorithm A, we obtain 22 ranking results: A1 , A2 , …, A11 PreA1 , PreA2 , …, PreA11 • Compare the early results with the final result A11 . • Value Difference • Order Difference
Experiments – Predictive Random Graph Ranking • Set Up • For PageRank and PrePageRank, • α=0.85, • g is the uniform distribution • For DiffusionRank and PreDiffusionRank • Use the discrete diffusion kernel • σ=1, N=20
Conclusions • Both NHDC and PHDC outperform KNN and the Parzen Window Approach in accuracy on these 8 datasets. • PHDC outperforms NHDC in accuracy on these 8 datasets. • DiffusionRank is another candidate ranking algorithm. • The Temporal Web Prediction Model is effective in PageRank and DiffusionRank. • The Predictive Random Graph Ranking framework extends the scope of some original ranking techniques.
Future Work • Approximate the manifold more accurately. • Apply the non-symmetric heat kernel to SVM. • Further investigate partial observers and weighted links.