Cliques in unknown graphs
Roni Stern, Meir Kalech and Ariel Felner


[Figure: Example: Cliques in the Web. Nodes are marked as explored, known or unknown; links are labelled href, and exploration is labelled HTTP and HTML. A 3-clique is shown, with the question: is there a 4-clique?]

Dept. of Information Systems Engineering, Ben Gurion University, Israel

What are unknown graphs?
Classic graph algorithms assume that the entire structure of the graph is given as input in a data structure (e.g. an adjacency list or adjacency matrix). We refer to such problems as problems on known graphs. While many real-world domains can be modeled as graphs, the agents operating in them seldom have complete knowledge of the graph. The agents must then explore the unknown graph while performing their tasks. Exploration may require costly operations such as activating sensors or sending packets over the network.

Example: Cliques in the Web
Each web page can be viewed as a node in the web graph, where edges represent hypertext links (href). Since it is rapidly changing and holds over 20 billion web pages, the web is unknown (except perhaps to Google). Where to crawl next?

Clique in unknown graphs
A set of nodes where each pair of nodes is connected by an edge is called a clique. A k-clique is a clique with k nodes. Finding cliques in a graph is a basic problem in Computer Science and has many applications. Our goal is to find a k-clique in an unknown graph while minimizing the number of explored nodes.

Which node to explore?
[Figure: example graph. Gray circles are the explored nodes.]

Highest known degree
A common heuristic for finding cliques is to search first the node with the highest degree. Since the real degree of a vertex that has not been explored yet is not known, we consider its known degree: the number of adjacent expanded vertices. The vertex with the highest known degree is chosen to be explored.

Largest potential clique
A set of vertices is a potential k-clique if they may be part of a k-clique in the unknown graph. A lower bound on the number of exploration steps needed to find a k-clique is k minus the size of the largest potential k-clique. This bound is tight. We call Clique* the heuristic that first explores the nodes connected to the largest potential k-clique.
[Figure: the choice of known degree and the choice of Clique* on the example graph.]

Using domain-specific knowledge: probabilities
What if the agent knows the probability that an edge exists? In many domains the graph is unknown, but there is knowledge about the probability of the edges, for example an accuracy model of a sensor. Standard MDP solvers are not feasible due to the enormous number of states. We answer this challenge with RClique*, a Monte-Carlo based sampling heuristic that combines Clique* and probabilistic knowledge. A limited number of future exploration steps are simulated using the domain model, and Clique* is used to estimate the future cost. The average cost over all samples is the heuristic used.

Experimental results
• Random: explore randomly until a k-clique is found
• Lower bound: offline shortest path to a k-clique
• Clique* is better than known degree for sparse graphs
• All algorithms are superior to Random
• RClique* is superior (50% noise, 250 samples, max depth 3)

Future work
• Generalize to subgraph isomorphism
• Multiagent search
• Advanced exploration cost models:
• - Heterogeneous cost per node
• - Physical exploration cost (exploration as a function of current location)
• Large-scale online web experiment
• Incorporate into a web data mining application
• Theoretical analysis of RClique* (how to sample)
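The known degree heuristic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that exploring a node reveals all of its neighbours, that an edge is known once either endpoint has been explored, and that node names are sortable so ties can be broken deterministically. The names `find_k_clique` and `hidden` are hypothetical, and `hidden` stands in for the costly exploration operation (an HTTP request, a sensor reading).

```python
from itertools import combinations


def find_k_clique(hidden, start, k):
    """Explore an unknown graph with the known degree heuristic.

    hidden: dict node -> set of neighbours (symmetric); a stand-in oracle
            for the real exploration operation (assumption of this sketch).
    Returns (clique or None, list of explored nodes in order).
    """
    known = {start: set()}   # adjacency learned so far
    explored = []            # nodes we have paid to explore
    while True:
        # Unexplored nodes we already know about; sorted for stable ties.
        frontier = sorted(v for v in known if v not in explored)
        if not frontier:
            return None, explored
        # Known edges of an unexplored node all lead to explored nodes,
        # so its known degree is simply len(known[v]).
        node = max(frontier, key=lambda v: len(known[v]))
        explored.append(node)
        for nb in hidden[node]:              # exploration reveals neighbours
            known.setdefault(nb, set()).add(node)
            known[node].add(nb)
        # Every newly known edge touches `node`, so a new clique contains it.
        for rest in combinations(sorted(known[node]), k - 1):
            if all(b in known[a] for a, b in combinations(rest, 2)):
                return {node, *rest}, explored


# Small illustrative graph (made up for this sketch).
hidden_graph = {
    "S": {"A", "B"},
    "A": {"S", "B", "C"},
    "B": {"S", "A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
clique, explored_order = find_k_clique(hidden_graph, "S", 3)
print(clique, explored_order)
```

On this small graph the search explores S and then A, at which point the edges S-A, S-B and A-B are all known and the 3-clique is reported without exploring B. Swapping the `key` function is where a different heuristic, such as one based on potential k-cliques, would plug in.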


What are unknown graphs?
• Known graphs:
• - The graph is given as input
• - Explicit representation or computational operators
• Unknown graphs:
• - Must explore the graph to learn of nodes/edges
• - Exploration has a cost (fuel, network I/O, etc.)

Clique in unknown graphs
• Goal: Find a k-clique, minimize the number of explored nodes
• Which node to explore next?
[Figure: a 3-clique in a graph whose nodes are marked as explored, known or unknown.]

Example: Cliques in the Web
• The web is unknown (except perhaps to Google)
• Goal: Find web pages referencing each other
• Motivation: Clique of papers in Google Scholar
• Nodes & Edges: Web pages and hyperlinks (href)
• Exploration: HTTP request, HTML parsing
• Challenge: Where do we crawl next?

Highest known degree
• Degree of a node is known only after it is explored
• Known degree is the number of adjacent expanded vertices
• Heuristic: Explore the node with highest known degree
[Figure: Example: 4-clique search, with explored and unexplored nodes. Is there a 4-clique?]

Clique*: Largest potential clique
• Potential k-clique: Set of nodes that may be a part of a k-clique
• Tight lower bound: Expanding the largest potential k-clique
• Heuristic: Size of the largest connected potential k-clique
[Figure: the nodes chosen by Known degree and by Clique*.]

RClique*: Using probabilistic knowledge
• MDP solvers are not applicable: exponential state space
• Heuristic: Monte-Carlo based sampling heuristic + Clique* to estimate future cost of every sample
[Figure: probability of an edge between openlist nodes, under a sensor accuracy model and a graph generator model.]

Experimental results
• Random: Explore randomly until a k-clique is found
• Lower bound: Offline shortest path to a k-clique
• Clique* is better than known degree for sparse graphs
• All algorithms are superior to RANDOM
• RClique* is superior (50% noise, 250 samples, max depth 3)

Future work
• Generalize to subgraph isomorphism
• Multiagent search
• Advanced exploration cost models:
• - Heterogeneous cost per node
• - Physical exploration cost
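The sampling ingredient of a Monte-Carlo heuristic can be sketched in Python as follows. This is not RClique* itself: it only shows how an edge-probability model can be sampled and averaged. It assumes, for illustration, that unknown edges exist independently with given probabilities, and the names `sample_completion`, `estimate_clique_probability`, `known_edges` and `edge_prob` are hypothetical. The default of 250 samples only echoes the setting reported in the results.

```python
import random
from itertools import combinations


def sample_completion(nodes, known_edges, edge_prob, rng):
    """Sample one possible world; return True if `nodes` forms a clique in it.

    known_edges: set of frozenset pairs already known to exist.
    edge_prob:   dict frozenset pair -> probability that the unknown edge
                 exists (independence is an assumption of this sketch).
    """
    for a, b in combinations(sorted(nodes), 2):
        pair = frozenset((a, b))
        if pair in known_edges:
            continue                      # edge already observed
        if rng.random() >= edge_prob.get(pair, 0.0):
            return False                  # sampled world lacks this edge
    return True


def estimate_clique_probability(nodes, known_edges, edge_prob,
                                samples=250, seed=0):
    """Average over sampled worlds: fraction in which `nodes` is a clique."""
    rng = random.Random(seed)             # seeded for reproducibility
    hits = sum(sample_completion(nodes, known_edges, edge_prob, rng)
               for _ in range(samples))
    return hits / samples


# Illustrative values only: one known edge, two uncertain ones.
known = {frozenset(("A", "B"))}
probs = {frozenset(("A", "C")): 0.8, frozenset(("B", "C")): 0.5}
estimate = estimate_clique_probability({"A", "B", "C"}, known, probs,
                                       samples=5000)
print(estimate)  # should be close to the exact value 0.8 * 0.5 = 0.4
```

In this toy case the exact answer is just the product of the unknown edge probabilities, so sampling is unnecessary. Sampling becomes useful when the quantity being averaged is the estimated cost of a simulated sequence of exploration steps, which has no simple closed form.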

[Figure: example graph with nodes A to L and S. Gray circles mark the explored nodes; the nodes chosen by Known degree and by Clique* are highlighted.]
