Web mining and Social Networking Introduction Theoretical Backgrounds
Introduction • Background • With the explosive growth of information on the Internet, the WWW has become a powerful platform for mining useful knowledge • Problems in Web-related research • Finding relevant information • Search engines – low precision and low recall • Finding needed information • Query-based search – does not handle homographs • Learning useful knowledge • Utilizing the Web as a knowledge base • Recommendation/personalization of information • Learning user navigational patterns • Web communities and social networking • Relationships among Web objects
Introduction • Data Mining and Web Mining • Data Mining • Discovering hidden or previously unseen knowledge, in the form of patterns, in large volumes of data • Web Mining • Applying data mining methods to induce and extract useful information from Web data • Web content mining • Web structure mining • Web usage mining • Semantic Web mining
Introduction • Characteristics of Web Data • The data on the web is huge • The data is distributed • The data is unstructured • The data is dynamic • Web community and Social Networking • An aggregation of web pages, users, and data
Theoretical Backgrounds • Web Data Model • Web data can be expressed as a matrix, a directed graph, a click sequence, and so on.
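As a rough illustration of these three representations, the sketch below starts from click sequences and derives a usage matrix and a directed graph from them. The session data and the names `sessions`, `usage_matrix`, and `edges` are hypothetical and not from the slides. The graph here is built from consecutive clicks, which is one possible choice; a hyperlink graph would have the same shape.

```python
# Hypothetical click sequences: each list is the ordered pages one user visited.
sessions = [
    ["home", "news", "sports"],
    ["home", "sports"],
    ["news", "home", "news"],
]

# Fixed page order used for the matrix columns.
pages = sorted({page for session in sessions for page in session})

# Matrix view: usage_matrix[i][j] counts how often session i visited page j.
usage_matrix = [[session.count(page) for page in pages] for session in sessions]

# Directed-graph view: one edge (a, b) for every consecutive click a -> b.
edges = {(a, b) for session in sessions for a, b in zip(session, session[1:])}
```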
Theoretical Backgrounds • Similarity Functions • Correlation-based Similarity • Cosine-based Similarity
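The formulas for this slide did not survive, so the following is a minimal sketch of the standard definitions rather than the author's exact notation. It assumes two equal-length numeric vectors (for example, two rows of a usage or rating matrix) and treats correlation-based similarity as the Pearson correlation, which equals the cosine of the mean-centered vectors. Returning 0.0 for a zero vector is a convention chosen here, not something the slides specify.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between vectors x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_x * norm_y)

def correlation_similarity(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)
```

Cosine similarity ignores vector length but not offset, while the correlation form also ignores a constant shift, which is why it is often preferred when users rate on different baselines.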
Theoretical Backgrounds • Eigenvector, Principal Eigenvector
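One common way to approximate a principal eigenvector is power iteration, sketched below. This is an illustrative choice, not necessarily the method the slides had in mind. The function name `principal_eigenvector` is hypothetical. The sketch assumes a square matrix given as a list of rows, a dominant eigenvalue that is unique in magnitude, and a starting vector that is not orthogonal to the dominant eigenvector.

```python
def principal_eigenvector(matrix, iterations=100):
    """Approximate the dominant eigenvalue and eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0] * n  # arbitrary nonzero starting vector
    for _ in range(iterations):
        # Multiply by the matrix, then rescale to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            raise ValueError("iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue estimate for unit v.
    av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(av, v))
    return eigenvalue, v
```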
Theoretical Backgrounds • Singular Value Decomposition (SVD)
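The slide's formula is not preserved. For reference, the standard compact form of the decomposition can be written as follows, assuming a real matrix A of size m × n with rank r; the symbols are the conventional ones and not necessarily those used in the original slide.

```latex
A = U \Sigma V^{\top}, \qquad
U \in \mathbb{R}^{m \times r},\;
V \in \mathbb{R}^{n \times r},\;
U^{\top} U = V^{\top} V = I_r,
\qquad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r),\;
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0
```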
Theoretical Backgrounds • Latent Semantic Analysis (LSA)
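As a hedged reminder of the usual formulation, LSA keeps only the k largest singular values of a term-document matrix A. Here A = U Σ V^T denotes the singular value decomposition of A, and k is a chosen number of latent dimensions; the notation is conventional and assumed, since the slide's own formula is missing.

```latex
A_k = U_k \Sigma_k V_k^{\top}
    = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top},
\qquad
A_k = \arg\min_{\operatorname{rank}(B) \le k} \lVert A - B \rVert_F
```

Documents and terms are then compared in the k-dimensional space spanned by the retained singular vectors rather than in the original term space.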
Theoretical Backgrounds • Tensor Expression and Decomposition
Theoretical Backgrounds • Performance measures • Precision • Recall • F-measure
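The three measures can be sketched for a single query as below. The function name `precision_recall_f1` is hypothetical, the inputs are assumed to be collections of item identifiers, and the F-measure shown is the balanced F1 (the harmonic mean of precision and recall). Returning 0.0 when a denominator is empty is an assumed convention.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, retrieving four items of which two are among three relevant ones gives precision 1/2, recall 2/3, and F1 4/7.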
Theoretical Backgrounds • Mean Average Precision (MAP) • Discounted Cumulative Gain (DCG) • Used when relevance is judged on a graded scale
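A minimal sketch of both ranking measures follows. The function names are hypothetical. Average precision assumes binary relevance and divides by the total number of relevant items. DCG is shown with the log2(rank + 1) discount, which is one common variant; other formulations discount differently, and the slides do not say which one was used.

```python
import math

def average_precision(ranked, relevant):
    """Mean of precision@k over the ranks k where a relevant item appears."""
    relevant = set(relevant)
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs is a list of (ranked_list, relevant_items) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

def dcg(gains):
    """Discounted cumulative gain for graded relevance scores in rank order."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))
```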
Theoretical Backgrounds • Web Recommendation Evaluation Metrics • Mean Absolute Error (MAE) • Hit Ratio • Weighted Average Visit Percentage
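Two of these metrics can be sketched briefly; Weighted Average Visit Percentage is left out because its definition is not given here. The function names and the parameter `n` are hypothetical. Hit ratio is taken in the usual top-N sense, an assumption: the fraction of test cases whose held-out item appears among the first n recommendations.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def hit_ratio(recommendations, held_out, n=10):
    """Fraction of test cases whose held-out item is in the top-n list."""
    hits = sum(
        1
        for recs, target in zip(recommendations, held_out)
        if target in recs[:n]
    )
    return hits / len(held_out)
```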
Theoretical Backgrounds • Basic Metrics of Social Network • Size – number of vertices in the network • Centrality – betweenness, closeness, degree • Density – existing edges / total possible edges in the network • Degree (of network) – number of edges in the network • Betweenness and closeness • Clique – a subset of a network in which every pair of nodes is directly connected
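The sketch below computes density, degree centrality, and closeness centrality for a small graph. It assumes an undirected simple graph given as a node list and a list of edge pairs, so the number of possible edges is n(n-1)/2. The function names are hypothetical, the centralities use the common normalized forms, and closeness is computed only over nodes reachable from the source. Betweenness is omitted to keep the example short.

```python
from collections import deque

def density(nodes, edges):
    """Existing edges divided by the n(n-1)/2 possible undirected edges."""
    n = len(nodes)
    possible = n * (n - 1) / 2
    return len(edges) / possible if possible else 0.0

def degree_centrality(nodes, edges):
    """Each node's degree divided by n - 1, the maximum possible degree."""
    degree = {v: 0 for v in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(nodes)
    return {v: (d / (n - 1) if n > 1 else 0.0) for v, d in degree.items()}

def closeness_centrality(nodes, edges, source):
    """Reachable node count divided by the sum of shortest-path distances."""
    neighbors = {v: set() for v in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives unweighted shortest paths
        u = queue.popleft()
        for w in neighbors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0
```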
Theoretical Backgrounds • Social Network over the Web • Each web page = social entity, each hyperlink = relationship • Centrality – closeness, degree, betweenness • Prestige – a prestigious actor is one that receives many inlinks
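Taking the inlink description literally, the simplest prestige score is a page's in-degree. The sketch below is one possible reading, not necessarily the measure the slides go on to use. The name `indegree_prestige` is hypothetical, links are assumed to be (source, target) pairs, duplicate links and self-links are ignored, and the count is normalized by n - 1 so scores fall between 0 and 1.

```python
def indegree_prestige(links):
    """Normalized in-degree of every page in a hyperlink graph."""
    pages = {page for link in links for page in link}
    inlinks = {page: 0 for page in pages}
    for source, target in set(links):  # set() drops duplicate links
        if source != target:  # a page linking to itself earns no prestige
            inlinks[target] += 1
    n = len(pages)
    return {p: (c / (n - 1) if n > 1 else 0.0) for p, c in inlinks.items()}
```

Unlike degree centrality on an undirected graph, this score uses link direction: a page that links out heavily but receives no links scores 0.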