Verify the termination of a P2P System using SPIN. Di Zou.
What is SPIN & Promela
• SPIN = Simple Promela Interpreter
• SPIN is a simulator for Promela programs
• SPIN is a verifier for the properties of Promela programs
• Promela = Process Meta Language
• A specification language, not a programming language
• Used to describe a system's behavior
What can SPIN do
• SPIN is meant for software verification only
• SPIN works on the fly: it does not need any pre-computation before the check
• SPIN can check systems in which the number of processes changes
• SPIN supports 3 different kinds of communication:
  • Rendezvous: no buffer
  • Buffered messages
  • Shared memory
• SPIN can run simulations in random, interactive, and guided mode
How SPIN works
• Build the model in Promela
• Generate the C code of the verifier
• Execute the on-the-fly verifier
The model I check
• P2P systems are widely used nowadays
• It is hard to determine when a P2P system has terminated
• P2P means there is no global knowledge
Structures
• Worker
  • The worker keeps asking the JobHolder whether there is anything to do
  • The worker does the job it gets from the holder; while doing so it may find new jobs and hand them to a holder
• JobHolder
  • The holder holds some jobs
  • The holder terminates iff none of the holders and none of the workers have anything left to do
[Figure: ring diagram; ID labels shown in the figure: 121, 145, 113, 171, 83, 75, 255, 71, 0, 9, 49, 35, 17, 29]
• Every job is assigned to a JobHolder
• The JobHolder ID is the closest lower number than the job ID
• When a worker finds a new job, it reports the job to the JobHolder that should take care of it
• The JobHolder gives a job to a worker if there is a job left
• State-proximity strategy
  • Every state is hashed to a State ID
  • Every peer identifier (IP address) is hashed to a peer ID in the same space as the State IDs
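The assignment rule above can be sketched in a few lines of Python. This is only an illustration, not the model from the talk: the function name `holder_for` and the holder IDs in the example are made up, and two details the slide does not specify are assumptions here: a job whose ID equals a holder ID goes to that holder, and a job ID below every holder ID wraps around the ring to the highest holder.

```python
from bisect import bisect_right


def holder_for(job_id, holder_ids):
    """Return the JobHolder ID responsible for job_id.

    Rule from the slide: the holder is the closest lower ID.
    Assumptions (not on the slide): an equal ID counts as "lower",
    and IDs below every holder wrap around to the highest holder.
    """
    if not holder_ids:
        raise ValueError("at least one JobHolder is required")
    ids = sorted(holder_ids)
    # Number of holder IDs that are <= job_id.
    i = bisect_right(ids, job_id)
    if i == 0:
        # No holder ID is lower: wrap around the ring (assumption).
        return ids[-1]
    return ids[i - 1]


# Hypothetical holder IDs, chosen only for this example.
holders = [0, 49, 113, 171]
print(holder_for(121, holders))
print(holder_for(35, holders))
```

Keeping the holder IDs sorted makes each lookup a binary search, which is one simple way to realize "closest lower ID" when job IDs and peer IDs share one ID space.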
Current termination strategy
• Assumptions
  • The network delay is constant and the same between all machines (workers, JobHolders)
  • Every machine can talk to any other machine directly
• By default, messages travel along the ring
• Only some special messages can be sent directly
General idea
• A JobHolder sends a message called "termCheck" to announce that it wants to terminate
• A JobHolder asks the others to terminate iff its termCheck comes back to itself
• If it has received a negativeTerm message, it will not terminate, even if the termCheck comes back
Definition of termination
• Good termination
  • Terminates when all the JobHolders have nothing to do
  • Needs only a few message exchanges
• Bad termination
  • Terminates while there are still jobs
  • Terminates only part of the machines, so global termination is not reached
  • Termination leads to a deadlock
Situation I
• The receiving node is the terminator (its terminator variable holds its own ID) and it is also the source of the check term. This means the check term it sent earlier went around the ring without any interruption (no negative term) and has come back to it. In this case termination is reached, and the node sends a terminating message to all other nodes to force them into the termination state.
Situation II
• The node is the source of the check term, but its terminator variable is null (this means it received a negative term). Since its message was not stopped, it sets itself as terminator again and sends a new check term to the next node in the ring.
Situation III
• The node receives a check term sent by another node and finds that it is not a terminator (either it was a terminator but received a negative term while its own check term was going around the ring, or it has never been a terminator). In this case it sets its terminator variable to the source of the new check term and forwards the check term to the next node.
Situation IV
• The node receives a check term from a different node while it is itself an initiator (it is the source of its own check term). In this case a priority rule applies: the node with the higher ID stays initiator. The node compares its own ID with the ID of the source of the received check term. If the source's ID is lower, it sends a negative term to the source. Otherwise (its own ID is lower), it sets its terminator variable to the new initiator (the source of the received check term) and forwards the check term to the next node in the ring.
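The four situations can be read as a single message handler. The sketch below is one possible reading of the slides, not the Promela model that was actually checked: the class `Node`, the message names, the `BROADCAST` marker, and the `(kind, destination, source)` action tuples are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

BROADCAST = -1  # hypothetical marker for "send to all other nodes"
Action = Tuple[str, int, int]  # (message kind, destination, source)


@dataclass
class Node:
    node_id: int
    next_id: int                      # successor on the ring
    terminator: Optional[int] = None  # None after a negative term

    def start_check(self) -> List[Action]:
        """Announce the wish to terminate by sending a check term."""
        self.terminator = self.node_id
        return [("checkTerm", self.next_id, self.node_id)]

    def on_negative_term(self) -> List[Action]:
        """A negative term clears the terminator variable."""
        self.terminator = None
        return []

    def on_check_term(self, source: int) -> List[Action]:
        me = self.node_id
        if source == me:
            if self.terminator == me:
                # Situation I: own check term came back uninterrupted.
                return [("terminate", BROADCAST, me)]
            # Situation II: a negative term arrived in the meantime, so
            # try again. The slides only describe terminator == None;
            # treating any other value the same way is an assumption.
            self.terminator = me
            return [("checkTerm", self.next_id, me)]
        if self.terminator == me:
            # Situation IV: two initiators, the higher ID wins.
            if source < me:
                return [("negativeTerm", source, me)]
            self.terminator = source
            return [("checkTerm", self.next_id, source)]
        # Situation III: not a terminator, adopt the source and forward.
        self.terminator = source
        return [("checkTerm", self.next_id, source)]
```

Returning the outgoing messages as a list instead of sending them keeps the handler free of any network code, so each situation can be exercised in isolation.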
Potential problem
• The termination strategy above assumes that the negative message arrives earlier than the check term. But if the negative message is sent by your neighbor, there is no guarantee that it arrives first.
• Reason:
  • The network delay for the check term and for the negative message is the same: 1
  • In a real environment the network delay is neither constant nor the same among all the nodes
Result
• In SPIN there is a statement called "assert", which is always executable. I have one assert in the termination part: whenever the model reaches that part, the number of jobs left should be 0.
• The checker reported an assertion violation. That means the assertion does not hold: the model can reach the termination part while jobs are still left. In other words, the termination strategy is wrong.
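The property checked by that single assertion can be written down as a small Python sketch. This is not the Promela assertion from the model; the function name and the dictionary of per-holder job counts are hypothetical and only mirror the condition "jobs left = 0 when termination is reached".

```python
def reached_termination(jobs_left_per_holder):
    """Mimic the assertion placed in the termination part of the model.

    jobs_left_per_holder maps a (hypothetical) JobHolder ID to the
    number of jobs that holder still has when termination is reached.
    """
    total = sum(jobs_left_per_holder.values())
    # Good termination requires that no job is left anywhere.
    assert total == 0, f"terminated with {total} job(s) left"
    return True


# A run that ends in good termination passes the check.
print(reached_termination({0: 0, 49: 0, 113: 0}))
```

A run in which some holder still has a job at termination raises `AssertionError`, which corresponds to the violation the checker reported.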