Distributed Control Algorithms for Artificial Intelligence by Avi Nissimov, DAI seminar @ HUJI, 2003
Control methods • Goal: deciding which task should be executed, and when it should be executed • Control in centralized algorithms • Loops, branches • Control in distributed algorithms • Control messages • Control for distributed AI • Search coordination
Centralized versus Distributed computation models • “Default” centralized computation model: Turing machine • Open issues in distributed models: • Synchronization • Predefined structure of the network • Processors' knowledge of the network graph structure • Processor identification • Processor roles
Notes about proposed computational model • Asynchronous • (and therefore non-deterministic) • Unstructured (connected) network graph • No global knowledge – neighbors only • Each processor has unique id • No server-client roles, but there is a computation initiator
Complexity measures • Communication • Number of exchanged messages • Time • Measured in units of the slowest message (no weights on network graph edges); local processing is ignored • Storage • Total number of bits/words required
Control issues • Graph exploration • Communication over the graph • Termination detection • Detecting the state in which no node is active and no message is in transit
Graph exploration: Tasks • Routing messages from node to node • Broadcasting • Connectivity determination • Communication capacity usage
Echo algorithm • Goal: spanning tree building • Intuition: got a message – pass it on • On the first reception of the message, send it to all other neighbors; ignore subsequent receptions • Termination detection – once all neighbors have responded, send an [echo] message to the father
Echo alg.: implementation
    receive [echo] from w;
    father := w; received := 1;
    for all (v in Neighbors - {w})
        send [echo] to v;
    while (received < Neighbors.size) do
        receive [echo]; received++;
    send [echo] to father
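To make the message flow concrete, here is a minimal Python simulation of the echo algorithm (not part of the original slides). The dictionary-of-sets graph representation, the FIFO delivery order, and the sample graph are illustrative assumptions; the real model is asynchronous, so other delivery orders yield other spanning trees.

    from collections import deque

    def echo(graph, root):
        """Simulate the echo algorithm on an undirected graph {node: set of
        neighbors}; returns each node's father, i.e. a spanning tree."""
        father = {}
        received = {v: 0 for v in graph}
        # FIFO delivery is an assumption; the real model is fully asynchronous.
        queue = deque((root, v) for v in graph[root])  # initiator floods neighbors
        while queue:
            sender, node = queue.popleft()             # deliver one [echo] message
            received[node] += 1
            if node != root and node not in father:    # first reception: adopt father
                father[node] = sender
                for v in graph[node]:
                    if v != sender:
                        queue.append((node, v))
            if node != root and received[node] == len(graph[node]):
                queue.append((node, father[node]))     # all neighbors answered: echo up
        return father

    # A 4-cycle: every edge carries exactly two messages.
    g = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
    print(echo(g, 1))   # {2: 1, 3: 1, 4: 2} – one possible spanning tree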
Echo algorithm - properties • Very useful in practice, since no exploration can be faster • Reasonable assumption – “fast” edges tend to stay fast • The theoretical model allows worst-case executions, since every spanning tree can be an outcome of the algorithm
DFS spanning tree algorithm: Centralized version
    DFS(u, father)
        if (visited[u]) then return;
        visited[u] := true;
        father[u] := father;
        for all (neigh in neighbors[u])
            DFS(neigh, u);
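The centralized routine translates almost verbatim into Python; the adjacency-list representation and the sample graph are assumptions for illustration.

    def dfs_tree(graph, root):
        """Centralized DFS spanning tree; `graph` maps a node to its neighbors."""
        father, visited = {}, set()

        def dfs(u, f):
            if u in visited:
                return
            visited.add(u)
            father[u] = f
            for neigh in graph[u]:
                dfs(neigh, u)

        dfs(root, root)
        return father

    g = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
    print(dfs_tree(g, 1))   # {1: 1, 2: 1, 4: 2, 3: 4}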
DFS spanning tree algorithm: Distributed version
    On reception of [dfs] from v
        if (visited) then
            send [return] to v; status[v] := returned; return;
        visited := true;
        status[v] := father;
        sendToNext();
DFS spanning tree algorithm: Distributed version (cont.)
    On reception of [return] from v
        status[v] := returned;
        sendToNext();

    sendToNext()
        if there is w s.t. status[w] = unused then
            send [dfs] to w;
        else
            send [return] to father
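Since the algorithm is sequential, exactly one message is in flight at any time, so a simple loop suffices to simulate it. The Node class and the scheduling below are illustrative assumptions, not the original code.

    UNUSED, FATHER, RETURNED = "unused", "father", "returned"

    class Node:
        def __init__(self, uid, neighbors):
            self.id, self.visited = uid, False
            self.status = {v: UNUSED for v in neighbors}
            self.father = None

    def run_dfs(graph, root):
        """Simulate distributed DFS; returns the list of tree edges."""
        nodes = {u: Node(u, graph[u]) for u in graph}
        tree = []                        # collected tree edges (father, child)
        inflight = ("dfs", root, root)   # (kind, sender, receiver); root wakes itself

        def send_to_next(u):
            node = nodes[u]
            for w, st in node.status.items():
                if st == UNUSED:
                    return ("dfs", u, w)          # visit an unused neighbor
            if u == root:
                return None                       # initiator exhausted: done
            return ("return", u, node.father)     # exhausted: return to father

        while inflight:
            kind, v, u = inflight
            node = nodes[u]
            if kind == "dfs" and not node.visited:   # first visit: v is the father
                node.visited = True
                node.father = v
                if v != u:
                    node.status[v] = FATHER
                    tree.append((v, u))
                inflight = send_to_next(u)
            elif kind == "dfs":                      # already visited: bounce back
                node.status[v] = RETURNED
                inflight = ("return", u, v)
            else:                                    # a [return] received from v
                node.status[v] = RETURNED
                inflight = send_to_next(u)
        return tree

    g = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
    print(run_dfs(g, 1))   # [(1, 2), (2, 3)]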
Discussion, complexity analysis • Sequential in nature • There are 2 messages on each edge, therefore • Communication complexity is 2m • All the messages are sent in sequence, therefore • Time complexity is 2m as well • Explicitly fails to utilize parallel execution
Awerbuch's linear time algorithm for DFS tree • Main idea: why send [dfs] to a node that has already been visited? • Each node, when first visited, sends a [visited] message in parallel to all its neighbors • Neighbors thus update their knowledge of the node's status before they themselves are visited, in O(1) time per node (in parallel)
Awerbuch's algorithm: complexity analysis • Let (u,v) be an edge, and suppose u is visited before v. Then u sends a [visited] message on (u,v), and v sends back an [ok] message to u. • If (u,v) is also a tree edge, [dfs] and [return] messages are sent as well. • Communication complexity: 2m+2(n-1) • Time complexity: 2n+2(n-1)=4n-2
Relaxation algorithm - idea • DFS-tree property: if (u,v) is an edge of the original graph, then v lies on the path (root,…,u) or u lies on the path (root,…,v) • The union of the lexically minimal simple paths (lmsp) satisfies this property • Therefore, all we need is to find the lmsp of each node in the graph
Relaxation algorithm – Implementation
    On arrival of [path, <path>]
        if (currentPath > (<path>, u)) then
            currentPath := (<path>, u);
            send [path, currentPath] to all neighbors   // (in parallel, of course)
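A sequential Python sketch of the relaxation idea (an assumption for illustration): synchronous sweeps over the edges stand in for the asynchronous [path] messages, and a path is a tuple of node ids compared lexicographically.

    def lmsp(graph, root):
        """Relaxation until fixed point: lexically minimal simple paths."""
        path = {root: (root,)}
        changed = True
        while changed:                   # one sweep ~ one asynchronous round
            changed = False
            for u in list(path):
                for v in graph[u]:
                    if v in path[u]:
                        continue         # keep the candidate path simple
                    cand = path[u] + (v,)
                    if v not in path or cand < path[v]:
                        path[v] = cand   # relax: lexically smaller path found
                        changed = True
        return path

    g = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
    print(lmsp(g, 1))   # {1: (1,), 2: (1, 2), 3: (1, 3), 4: (1, 2, 4)}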
Relaxation algorithm – analysis and conclusions • Advantages – low complexity: • After k steps, every node whose lmsp has length k is set up; therefore time complexity is n • Disadvantages: • Unlimited message length • Termination detection required (see further)
Other variations and notes • Minimal spanning tree • Requires edge weights; works much like Kruskal's MST algorithm • BFS • Very hard, since there is no synchronization; behaves much like iterative-deepening DFS • Linear-message solution • Like the centralized version: sends all the information to the next node; unlimited message length
Connectivity Certificates • Idea: let G be the network graph. Drop some edges from G while preserving, for each pair {u,v}, k paths when G contains them – and all the paths when G itself contains fewer than k • Applications: • Network capacity utilization • Ensuring transport reliability
Connectivity certificate: Goals • The main idea of certificates is to use as few edges as possible; the whole graph is always a trivial certificate • Finding a minimal certificate is an NP-hard problem • A sparse certificate is one that contains no more than k·n edges
Sparse connectivity certificate: Solution • Let E(i) be a spanning forest of G \ (E(1) ∪ … ∪ E(i-1)); then E(1) ∪ … ∪ E(k) is a sparse connectivity certificate • Algorithm idea – compute all the forests simultaneously: if an edge closes a cycle in a tree of the i-th forest, add it to the (i+1)-th forest (the rank of the edge is then i+1)
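A sequential Python sketch of this forest decomposition (the distributed version follows on the next slides). The union-find helper and the K4 example are implementation assumptions.

    class DSU:
        """Union-find: tests whether an edge closes a cycle in a forest."""
        def __init__(self):
            self.parent = {}
        def find(self, x):
            self.parent.setdefault(x, x)
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]   # path halving
                x = self.parent[x]
            return x
        def union(self, a, b):
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                return False            # (a, b) would close a cycle
            self.parent[ra] = rb
            return True

    def edge_ranks(edges):
        """Give each edge the index of the first forest E(i) that accepts it;
        the union of E(1)..E(k) is then a sparse k-connectivity certificate."""
        forests, rank = [], {}
        for e in edges:
            i = 0
            while True:
                if i == len(forests):
                    forests.append(DSU())
                if forests[i].union(*e):   # e is a forest edge of E(i+1)
                    rank[e] = i + 1
                    break
                i += 1                     # e closes a cycle: try next forest
        return rank

    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4), (2, 4)]   # K4
    print(edge_ranks(edges))
    # {(1, 2): 1, (2, 3): 1, (1, 3): 2, (3, 4): 1, (1, 4): 2, (2, 4): 2}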
Distributed certificate algorithm
    Search(father)
        if (not visited) then
            for all neighbors v s.t. rank[v] = 0
                send [give_rank] to v;
                receive [ranked, <i>] from v;
                rank[v] := i;
            visited := true;
Distributed certificate algorithm (cont.)
    Search(father) (cont.)
        for all w s.t. needs_search[w] and rank[w] >= rank[father], in decreasing order
            needs_search[w] := false;
            send [search] to w;
            receive [return];
        send [return] to father
Distributed certificate algorithm (cont.)
    On receipt of [give_rank] from v
        rank[v] := min i s.t. i > rank[w] for all w;
        send [ranked, <rank[v]>] to v;

    On receipt of [search] from father
        Search(father);
Complexity analysis and discussion • There is no reference to k in the algorithm; it computes sparse certificates for all k's at once • There are at most 4 messages on each edge – therefore time and communication complexity are at most 4m=O(m) • By ranking the nodes in parallel, we can achieve 2n+2m complexity
Termination detection: definition • Problem: detect the state in which all nodes are passive, waiting for messages • Similar to the garbage collection problem – determine which nodes can no longer receive messages (until “reallocated” – reactivated) • Two approaches: tracing vs. probe
Processor states of execution: global picture • Send • pre-condition: {state=active}; • action: send[message]; • Receive • pre-condition: {message queue is not empty}; • action: state:=active; • Finish activity • pre-condition: {state=active}; • action: state:=passive;
Tracing • Similar to the “reference counting” garbage collection algorithm • On sending a message, a node increases its children counter • On receiving a [finished_work] message, it decreases the children counter • When a node has finished its work and its children counter equals zero, it sends a [finished_work] message to its father
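Below is a Python sketch of the tracing idea (essentially the Dijkstra–Scholten scheme); the Net/Node classes, the FIFO delivery, and the hard-coded work lists are illustrative assumptions.

    from collections import deque

    class Net:
        """FIFO message pump; synchronous delivery is a simplifying assumption."""
        def __init__(self):
            self.nodes, self.q = {}, deque()
        def send(self, kind, dest, sender=None):
            self.q.append((kind, dest, sender))
        def run(self):
            while self.q:
                kind, dest, sender = self.q.popleft()
                self.nodes[dest].handle(kind, sender)

    class Node:
        """Tracing a la reference counting: every basic message adds a child;
        a passive node with zero children reports [finished_work] upward."""
        def __init__(self, uid, net, work=()):
            self.id, self.net, self.work = uid, net, list(work)
            self.father, self.children, self.active = None, 0, False
            net.nodes[uid] = self

        def handle(self, kind, sender):
            if kind == "finished_work":
                self.children -= 1                 # a child has fully finished
            elif self.father is None:              # first activation: adopt father
                self.father, self.active = sender, True
                for dest in self.work:             # the "work": send basic messages
                    self.children += 1
                    self.net.send("basic", dest, self.id)
                self.active = False                # finish activity
            else:                                  # already traced: report at once
                self.net.send("finished_work", sender)
            if not self.active and self.children == 0 and self.father is not None:
                if self.father == self.id:         # the initiator detects the end
                    print("termination detected by initiator", self.id)
                else:
                    self.net.send("finished_work", self.father)
                self.father = None                 # detach from the tree

    net = Net()
    Node(1, net, work=[2, 3]); Node(2, net, work=[3]); Node(3, net)
    net.send("basic", 1, sender=1)   # the initiator activates itself
    net.run()                        # -> termination detected by initiator 1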
Analysis and discussion • Main disadvantage: doubles (!!) the communication complexity • Advantages: simplicity, immediate termination detection (because the message is initiated by the terminating node) • Variations may send the [finished_work] message only for selected messages; so-called “weak references”
Probe algorithms • Main idea: once per some period, “collect garbage” – compute the number of sent minus the number of received messages per processor • If the sum of these numbers is 0, then no message is in transit on the network • In parallel, find out whether there is an active processor
Probe algorithms – details • We introduce a new role – the controller – and assume it is in fact connected to each node • Once per period (delta), the controller sends a [request] message to all the nodes • Each processor sends back [deficit = <sent_count - received_count>]
Think it works? Not yet… • Suppose U sends a message to V and becomes passive; then U receives the [request] message and (immediately) replies [deficit=1] • Next, processor W receives the [request] message; it replies [deficit=0], since it has received no message yet • Meanwhile V activates W by sending it a message, receives W's reply, and stops; then V receives [request] and replies [deficit=-1] • The deficits sum to 0, but W is still active…
How to work it out? • As we saw, a message can pass “behind the back” of the controller, since the model is asynchronous • Yet if we add an extra boolean variable on each processor, such as “was active since the last request”, we can deal with this problem • But this means we detect termination only within 2·delta time after it actually occurs
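A small Python sketch of the resulting decision rule at the controller (the function name and the reply format are assumptions):

    def check_termination(replies):
        """One probe round. Each reply is the pair (sent_count - received_count,
        was_active_since_last_request). The deficits must sum to 0 (no message
        in transit) AND no node may have been active since the previous request;
        this is why detection may lag termination by up to 2*delta."""
        total_deficit = sum(d for d, _ in replies)
        return total_deficit == 0 and not any(flag for _, flag in replies)

    # The U/V/W race above: deficits sum to 0, but the activity flags of U and V
    # prevent a false detection.
    print(check_termination([(1, True), (-1, True), (0, False)]))    # False
    # A later quiet round: all deficits 0, nobody active since the last request.
    print(check_termination([(0, False), (0, False), (0, False)]))   # True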
Variations, discussion, analysis • If a node is more than one edge away from the controller, use the “echo” algorithm with the controller as initiator, computing the sum inline • Detection is not immediate, and is initiated by the controller • A small delta causes a communication bottleneck, while a large delta causes a long period before detection
CSP and Arc Consistency • Formal definition: find x(i) ∈ D(i) such that every constraint Cij(x(i), x(j)) is satisfied • The problem is NP-complete in general • The arc-consistency problem is removing all redundant values: if Cij(v,w)=false for all w ∈ D(j), then remove v from D(i) • Of course, arc consistency is just the primary step of a CSP solution
Sequential AC4 algorithm
    for all Cij, v in Di, w in Dj
        if Cij(v,w) then
            count[i,v,j]++;
            Supp[j,w].insert(<i,v>);
    for all Cij, v in Di
        checkRedundant(i,v,j);
    while not Q.empty
        <j,w> := Q.dequeue();
        for all <i,v> in Supp[j,w]
            count[i,v,j]--;
            checkRedundant(i,v,j);
Sequential AC4 algorithm: redundancy check
    checkRedundant(i,v,j)
        if (count[i,v,j] = 0) then
            Q.enqueue(<i,v>);
            Di.remove(v);
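For concreteness, a compact Python rendering of sequential AC4 under the slides' naming (count, Supp, Q). The constraint representation and the x1 < x2 < x3 example are assumptions.

    from collections import deque

    def ac4(domains, constraints):
        """Sequential AC4. `domains`: variable -> set of values; `constraints`:
        ordered pair (i, j) -> predicate Cij(v, w). Prunes domains in place."""
        count, supp, q = {}, {}, deque()

        # Initialization: count supports and build the support lists.
        for (i, j), c in constraints.items():
            for v in list(domains[i]):
                count[i, v, j] = 0
                for w in domains[j]:
                    if c(v, w):
                        count[i, v, j] += 1
                        supp.setdefault((j, w), []).append((i, v))
                if count[i, v, j] == 0 and v in domains[i]:   # no support in Dj
                    domains[i].discard(v)
                    q.append((i, v))

        # Propagation: a removed (j, w) may leave some (i, v) unsupported.
        while q:
            j, w = q.popleft()
            for i, v in supp.get((j, w), []):
                if v in domains[i]:
                    count[i, v, j] -= 1
                    if count[i, v, j] == 0:
                        domains[i].discard(v)
                        q.append((i, v))
        return domains

    # x1 < x2 and x2 < x3 over {1, 2, 3}: arc consistency forces 1 < 2 < 3.
    doms = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {1, 2, 3}}
    cons = {(1, 2): lambda v, w: v < w, (2, 1): lambda v, w: w < v,
            (2, 3): lambda v, w: v < w, (3, 2): lambda v, w: w < v}
    print(ac4(doms, cons))   # {1: {1}, 2: {2}, 3: {3}}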
Distributed Arc consistency • Assume that each variable x(i) is assigned to a separate processor, and that mutually dependent variables are assigned to neighboring processors • The main idea of the algorithm: D(i) and the counters count[v,j] reside on processor i (there is no Supp list – supports are recomputed from Cij); when w is removed from D(j), processor j sends a message so that its neighbors can update their counters
Distributed AC4: Initialization
    Initialization:
        for all Cij, v in Di
            for all w in Dj_initial
                if Cij(v,w) then count[v,j]++;
            if count[v,j] = 0 then Redundant(v);

    Redundant(v):
        if v in Di then
            Di.remove(v);
            SendQueue.enqueue(v);
Distributed AC4: messaging
    On not SendQueue.empty
        v := SendQueue.dequeue();
        for all Cji
            send [remove v] to j;

    On reception of [remove w] from j
        for all v in Di such that Cij(v,w)
            count[v,j]--;
            if count[v,j] = 0 then Redundant(v);
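A Python sketch of the distributed variant (an assumption for illustration): one shared deque simulates the network, and the counters of processor i are namespaced as count[i, v, j].

    from collections import deque

    def distributed_ac4(domains, constraints):
        """Processor i keeps D(i) and its support counters; removing w from D(j)
        makes processor j send [remove w] to every processor holding a Cji."""
        initial = {i: set(d) for i, d in domains.items()}   # Dj_initial
        count, net = {}, deque()    # net: in-flight (dest i, sender j, value w)

        def redundant(i, v):
            if v in domains[i]:
                domains[i].discard(v)
                for (a, b) in constraints:   # send [remove v] to all j with Cji
                    if b == i:
                        net.append((a, i, v))

        # Initialization: processor i counts supports against Dj_initial.
        for (i, j), c in constraints.items():
            for v in list(initial[i]):
                count[i, v, j] = sum(1 for w in initial[j] if c(v, w))
                if count[i, v, j] == 0:
                    redundant(i, v)

        # Message pump: on [remove w] from j, processor i updates its counters.
        while net:
            i, j, w = net.popleft()
            c = constraints[(i, j)]
            for v in list(domains[i]):
                if c(v, w):
                    count[i, v, j] -= 1
                    if count[i, v, j] == 0:
                        redundant(i, v)
        return domains

    doms = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {1, 2, 3}}
    cons = {(1, 2): lambda v, w: v < w, (2, 1): lambda v, w: w < v,
            (2, 3): lambda v, w: v < w, (3, 2): lambda v, w: w < v}
    print(distributed_ac4(doms, cons))   # {1: {1}, 2: {2}, 3: {3}}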
Distributed AC4: complexity • Assume A=max{|Di|}, m=|{Cij}| • Sequential execution: both loops pass over all Cij, v in Di and w in Dj => O(mA^2) • Distributed execution: • Communication complexity: each edge can carry at most A messages => O(mA) • Time complexity: each node sends its at most A messages in parallel => O(nA) • Local computation: O(mA^2), because of the initialization
Dist. AC4 – Final details • Termination detection is not obvious, and requires explicit implementation • Usually the probe algorithm is preferred, because of the large number of messages • AC4 ends in one of three possible states: • Contradiction • Solution • Arc-consistent subset
Task assignment for AC4 • Our assumption was that each variable is assigned to a different processor • A special case is a multiprocessor computer, where all the resources are at hand • In fact, minimizing the communication cost when the assignment has to be done by the computer is an NP-hard problem => heuristic approximation algorithms
From AC4 to CSP • There are many heuristics, taught mainly in introductory AI courses (such as most-restricted variable and most-restricting value), that tell which variable should be instantiated next once arc consistency is reached • Termination with a contradiction is what triggers backtracking
Loop cut-set example • Definition: a pit of loop L is a vertex of the directed graph such that both of L's edges at it are incoming • Goal: break all loops in the directed graph • Formulation: let G=<V,E> be a graph; find C, a subset of V, such that every loop in G contains at least one non-pit vertex of C • Applications: belief network algorithms
Sequential solution • It can be shown that finding a minimal cut-set is an NP-hard problem, therefore approximations are used instead • The best-known approximation – by Suermondt and Cooper – is shown on the next slide • Main idea: at each step, drop all leaves; then, among the vertices with at most 1 incoming edge, pick the one common to the maximal number of cycles
Suermondt and Cooper algorithm
    C := empty;
    while not V.empty do
        remove all v such that deg(v) <= 1;
        K := {v in V : indeg(v) <= 1};
        v := argmax{deg(v) : v in K};
        C.insert(v);
        V.remove(v);
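A Python sketch of the heuristic (the digraph representation is an assumption); the fallback for an empty K anticipates the subtlety discussed on the next slide.

    def loop_cutset(vertices, edges):
        """Suermondt-Cooper heuristic; `edges` is a set of directed (u, v) pairs."""
        V, E = set(vertices), set(edges)
        def deg(v): return sum(v in e for e in E)        # total degree
        def indeg(v): return sum(w == v for _, w in E)   # incoming degree

        C = set()
        while V:
            leaves = {v for v in V if deg(v) <= 1}       # drop all leaves
            V -= leaves
            E = {(u, w) for (u, w) in E if u in V and w in V}
            if not V:
                break
            K = {v for v in V if indeg(v) <= 1}
            if not K:             # not covered by the slide: see the next slide
                K = V
            v = max(K, key=deg)   # heuristically covers the most cycles
            C.add(v)
            V.remove(v)
            E = {(u, w) for (u, w) in E if v not in (u, w)}
        return C

    # Two directed triangles sharing vertex 3.
    V = {1, 2, 3, 4, 5}
    E = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)}
    print(loop_cutset(V, E))   # e.g. {1, 3} (ties are broken arbitrarily)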
Edge case • There is still one subtlety (not described in Tel's article) – what to do when K is empty while V is not (for example, when G is an Euler path on the octahedron)