Security and Trust in P2P systems
What is trust • When thinking about security in a system, various entities need to “trust” others to varying degrees • So… what is trust? • Trust is a bet about the future contingent actions of others
Trust and Security • Direct validation • I need to know whether I can “trust” another entity within this system • Authentication • Indirect validation • Should I trust “Alice” because my friend, Bob, trusts her? • Recommendation • Reputation
Trust and Security • The “perfect” P2P system • A system with perfectly flat hierarchy, and with each entity allowing other entities to use local resources • How can we provide security without a centralized entity?
Malicious node • A malicious node might give erroneous responses to a request • Application level • Returning false data • Network level • Returning false routes • Malicious nodes may act in concert to attack the remaining nodes
Issues • Identification • Routing table risk • Victim Data • Victim Peer • Content verification • Punishment
Identification • Identity • Undesirable to know the identity of other entities • Privacy (http://www1.businessweekly.com.tw/web/webarticle_45792_p1.html) • Anonymity • However, • If you wish to trust entity A, you need to be able to identify it
Identification • Public key infrastructures (PKI) • Must be run by somebody! • For a PKI to work in this sort of situation, you need to have a trusted third party • Recommendation systems • Chains of trust • Transitive trust
Identification • When trust must be transitive, it creates brittleness • In most P2P systems, transitive trust is a key component • How to measure “reputation” • Roles • Time related
Secure Routing in p2p systems • The secure routing primitive ensures that when a non-faulty node sends a message to a key k, the message reaches all non-faulty members in the set of replica roots Rk with very high probability • Secure routing guarantees that replicas are initially placed on legitimate replica roots, and that a lookup message reaches a replica if one exists
Three problems • Securely assigning nodeIds to nodes • Ensure attackers cannot choose the value of nodeIds • Securely maintaining the routing tables • Ensure that the fraction of faulty nodes that appear in the routing tables of correct nodes does not exceed the fraction of faulty nodes in the entire overlay • Securely forwarding messages • Ensure that at least one copy of a message sent to a key reaches each correct replica root for the key with high probability
Secure nodeId assignment • A node might choose its identifier maliciously • To censor a specific document, allocate itself a collection of nodeIds closer to that document’s key than any existing node in the system (Victim Item) • Choose nodeIds to maximize its chances of appearing in a victim node’s routing tables (Victim Peer)
Secure nodeId assignment • Centralized authority • The server is only consulted when new nodes join and is otherwise uninvolved in the actions of the p2p system • Sybil attacks • Coalition nodes might try to get a large number of nodeIds • Even if those nodeIds are random, a large enough collection of them would still give the attackers disproportionate control over the network • Moderate the rate at which nodeIds are given out • Charging money? • Requiring a small puzzle to be solved?
Robust routing primitives • If an attacker controls a fraction f of the nodes in the p2p network, we would expect each entry in every routing table to point to a malicious node with probability f • If a desired route takes h hops • The probability that the route is free of malicious nodes is (1-f)^h • How about Chord with 2^m nodes?
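The estimate on this slide can be checked numerically. The following is a minimal sketch, assuming each hop independently lands on a malicious node with probability f. The function name `clean_route_probability` is hypothetical, and the Chord figure of roughly m/2 hops on average for a ring of 2^m nodes is an assumption brought in to address the slide's question, not something the slides state.

```python
def clean_route_probability(f: float, h: int) -> float:
    """Probability that all h hops avoid malicious nodes (hops assumed independent)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be in [0, 1]")
    if h < 0:
        raise ValueError("h must be non-negative")
    return (1.0 - f) ** h


# Assumed Chord-like overlay with 2**m nodes and about m/2 hops per lookup.
m = 20
f = 0.1
average_hops = m // 2
print(f"f={f}, hops={average_hops}: "
      f"P(clean route) = {clean_route_probability(f, average_hops):.3f}")
```

Under these assumptions, even a 10% attacker leaves only about a one-in-three chance that a ten-hop route is entirely clean, which motivates the redundant routing discussed next.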
Robust routing primitives • Attempt multiple, redundant routes from the source to the destination • Costly • How to determine “Not found”
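To make the cost of redundancy concrete, here is a rough sketch of how many redundant routes would be needed for at least one to be free of malicious nodes. It assumes the routes fail independently (for example, node-disjoint paths), which real overlays may not guarantee; the function names are illustrative only.

```python
import math


def any_route_clean(f: float, h: int, routes: int) -> float:
    """Probability that at least one of `routes` independent h-hop routes is clean."""
    p_clean = (1.0 - f) ** h
    return 1.0 - (1.0 - p_clean) ** routes


def routes_needed(f: float, h: int, target: float) -> int:
    """Smallest number of independent routes giving success probability >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    p_clean = (1.0 - f) ** h
    if p_clean <= 0.0:
        raise ValueError("no route can succeed when every hop is malicious")
    if p_clean >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_clean))


print(routes_needed(f=0.1, h=10, target=0.99))
```

The number of routes grows quickly as f or h increases, which is the "costly" part of the slide.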
Content verification • An adversary may spoof the results • Verification can be done if we have verification codes • Borrow the idea behind Google’s PageRank technology • Pages that are linked from “popular” pages are themselves more popular • How to add such a notion of popularity into a p2p system
Punishment • Remove malicious nodes when they are detected • How to detect malicious nodes? • Can we have a global view, and who can punish the misbehaving nodes?
Sybil Attack • Attacker creates multiple identities to control a large portion of the network • Named after “Sybil” (1973) by Flora Rheta Schreiber
Identity Validation • John R. Douceur, The Sybil Attack, in Proceedings of 1st International Workshop on Peer-to-Peer Systems (IPTPS), 2002 • How does an entity know that two identities come from different entities? • Four Lemmas “prove” that Sybil attacks are always possible without centralized authority • Direct validation (lemmas 1 & 2) • Indirect validation (lemmas 3 & 4)
Lemma 1 • Because entities are heterogeneous in terms of capabilities, a malicious entity can create several “minimal” identities • Lower-bound on number of identities
Lemma 2 • Each correct entity must simultaneously validate all the identities it is presented, otherwise, a faulty entity can counterfeit an unbounded number of identities • Simultaneous identity verification not practical
Lemma 3 • If a certain number of identities must vouch for a new identity for it to be accepted, then a set of compromised identities can create any number of new fake identities • A sufficiently large set of faulty entities can counterfeit an unbounded number of identities
Lemma 4 • All entities in the system must perform their identity validations concurrently; otherwise, a faulty entity can counterfeit a constant number of multiple identities. • Again, simultaneous validation is difficult in real-world networks.
Overview Conclusion • Networks require centralized authority to validate network identities • Without one, Sybil attacks are always a possibility
Mission • If it is hard to avoid, can we limit it? • Idea • Moderate the rate at which nodeIds are given out • Charging money? • Requiring a small puzzle to be solved?
Admission control system (ACS) • Property • Security • Provide resiliency against • Efficiency • Should be simple and does not require a lot of overhead on participating nodes • Fairness • Nodes should do an equal amount of work to join the network • Response to attack • Make the attack more difficult while not affecting other legitimate nodes • Scalability
Decentralized, multi-puzzle scheme • It is important that the upper layer nodes are both static and trustworthy • A joining node A must gain admission from a sequence of nodes, starting with leaf node B and ending with root X • At each stage, A is required to solve a puzzle presented by the node at that stage
Join protocol • Get token • When A wishes to join the network, it must first discover a leaf node B • A gains admission from B by solving B’s puzzle • After solving the puzzle, A is given a token, which it uses to prove to B’s parent that it was admitted by B • At each stage, A is given a token to be used as proof of the previous puzzle solution • When A reaches the root, a final token is issued by X • A’s signature
Connect to the network • A must prove to its prospective neighbors that it has been admitted by the root node X • Signature verification is costly • Each neighboring node requires A to solve one more puzzle; these challenges protect neighbors from a DoS attack
Node Upgrade • A must prove its stability before inclusion in the ACS • Initially, A joins the ACS as a leaf node and is evaluated by its parent node • To maintain a balanced tree • A node only upgrades nodes when its number of children has reached the degree of the tree • When the tree is sufficiently deep to support the join load and achieve the proper security guarantees, no more nodes are added to the ACS
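The token chain described in the join protocol can be sketched as follows. This is an illustration under stated assumptions, not the scheme's actual design: the puzzle is a hash-preimage search with a configurable number of leading zero bits, an HMAC under a per-node secret stands in for a real digital signature, and `AcsNode`, `solve_puzzle`, `check_puzzle`, and `join` are hypothetical names.

```python
import hashlib
import hmac
import os


def check_puzzle(challenge: bytes, nonce: int, bits: int) -> bool:
    """True if SHA-256(challenge || nonce) starts with `bits` zero bits."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def solve_puzzle(challenge: bytes, bits: int) -> int:
    """Brute-force search for a nonce that satisfies check_puzzle."""
    nonce = 0
    while not check_puzzle(challenge, nonce, bits):
        nonce += 1
    return nonce


class AcsNode:
    """Hypothetical ACS member. The HMAC is a stand-in for a signature."""

    def __init__(self, name, parent=None, bits=8):
        self.name = name
        self.parent = parent
        self.bits = bits              # puzzle difficulty (assumed parameter)
        self.secret = os.urandom(16)
        self.pending = {}             # joiner id -> outstanding challenge

    def issue_challenge(self, joiner_id: str) -> bytes:
        challenge = os.urandom(16)
        self.pending[joiner_id] = challenge
        return challenge

    def issue_token(self, joiner_id: str, nonce: int) -> bytes:
        challenge = self.pending.pop(joiner_id)
        if not check_puzzle(challenge, nonce, self.bits):
            raise ValueError("puzzle not solved")
        return hmac.new(self.secret, joiner_id.encode(), hashlib.sha256).digest()

    def verify_token(self, joiner_id: str, token: bytes) -> bool:
        expected = hmac.new(self.secret, joiner_id.encode(), hashlib.sha256).digest()
        return hmac.compare_digest(expected, token)


def join(joiner_id: str, leaf: AcsNode) -> bytes:
    """Walk from the leaf to the root, solving one puzzle per stage."""
    node, previous, token = leaf, None, None
    while node is not None:
        # Each stage first checks the token from the stage below. Calling
        # previous.verify_token simulates checking the child's signature.
        if previous is not None and not previous.verify_token(joiner_id, token):
            raise PermissionError("missing proof of admission from previous stage")
        challenge = node.issue_challenge(joiner_id)
        token = node.issue_token(joiner_id, solve_puzzle(challenge, node.bits))
        previous, node = node, node.parent
    return token  # final token, issued by the root


root = AcsNode("X")
leaf = AcsNode("B", parent=AcsNode("P", parent=root))
final_token = join("A", leaf)
print(root.verify_token("A", final_token))
```

In a real deployment each token would carry a public-key signature so that a parent can verify it without contacting the child; the in-process call here only keeps the sketch self-contained.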
Node departure • Not a member of ACS • A member of ACS • Leave gracefully • The oldest child is chosen to replace the departing node • Due to a failure • Children must rejoin the network by • Contacting their grandparent • Or finding another node in the ACS
Security • The ACS is designed to limit Sybil attacks, not to prevent them! • Attacker is a member of ACS • Easily detected by the parent of the attacker by observing the rate of the token requests • Attacker is not a member of ACS • Control a significant fraction of nodes • Attack is limited by ensuring only a small number of tokens are released during a period of time
How about patient attackers? • If an attacker is patient enough, it can achieve the required number of IDs to launch a massive attack • Cut-off window • Define a token expiration time, W • How to determine the value of W • Limit the number of good users that must execute the rejoin process to a small percentage
Startup • The basic protocol provides minimal protection of the network during the startup process, when it has a small number of nodes • An attacker can obtain a large percentage of nodes in a shorter time • For example, if the network has 36 nodes, an attacker needs to obtain 4 nodes to be in control of 10% of all the nodes. • If we assume that it takes 5 minutes to get an ID, the 10% target can be achieved in about 20 minutes.
Startup (method1) • Make the puzzles at the starting phase very difficult, and then decrease the difficulty linearly as nodes join. • For example, if the initial puzzle takes an average of two hours to be solved, then after one node joins the puzzle difficulty is reduced to 1 hour and 50 minutes. • Network initialization time will be high!
Startup (method 2) • Define a startup window that impacts the joining process for a finite time. • Puzzle difficulty in this scheme decays over time • As opposed to the above scheme, which reduces the puzzle difficulty as the number of nodes grows. • For example • nodes joining the network at its inception are given puzzles that take two hours to solve. • nodes that join five minutes after inception are given puzzles that take 1 hour and 55 minutes to solve. • This continues until we reach the puzzle difficulty targeted for the normal join process.
Startup (method 2) • The number of node IDs an attacker may obtain during this startup window depends on • the arrival rate of the nodes • how much more powerful the attacker is compared to the average user • Network initialization time is much shorter than with the above scheme
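The two startup schedules can be compared with a small sketch. The step sizes are read off the slides' examples (10 minutes less per joined node for method 1, 5 minutes less after 5 minutes for method 2). The linear shape of the time decay, the 5-minute steady-state target, and the function names are assumptions made for illustration.

```python
def difficulty_by_nodes(nodes_joined: int,
                        initial_min: float = 120.0,
                        step_min: float = 10.0,
                        target_min: float = 5.0) -> float:
    """Method 1: puzzle time (minutes) drops linearly with each node that joins."""
    return max(target_min, initial_min - step_min * nodes_joined)


def difficulty_by_time(elapsed_min: float,
                       initial_min: float = 120.0,
                       decay_per_min: float = 1.0,
                       target_min: float = 5.0) -> float:
    """Method 2: puzzle time (minutes) decays with time since network inception."""
    return max(target_min, initial_min - decay_per_min * elapsed_min)


for joined in (0, 1, 5, 12):
    print(f"method 1, {joined:2d} nodes joined: {difficulty_by_nodes(joined):.0f} min")
for elapsed in (0, 5, 60, 180):
    print(f"method 2, {elapsed:3d} min elapsed: {difficulty_by_time(elapsed):.0f} min")
```

Both schedules bottom out at the normal join difficulty; the difference is that method 2 reaches it after a fixed time regardless of how many nodes have joined.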
Analysis • Models • Legitimate nodes arrive according to a Poisson process with an arrival rate of λg • Lifetime is exponentially distributed with a mean of µg • Assume an attacker is equal in computational power to the average user • l: Joining difficulty (measured in maximum time)
Analysis • Puzzles and fairness • The distribution of the time to solve the puzzle is uniform • Single puzzle of average time l / 2 • n puzzles of difficulty l/n • Example • 5 mins to solve with a maximum standard deviation of 30 seconds • 9 puzzles and each takes max 33.3 seconds.
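The fairness example can be reproduced with a short calculation. This sketch reads the "5 mins" as the maximum total solving time l and assumes the n puzzle times are independent and uniform on [0, l/n], so the total has mean l/2 and standard deviation l / sqrt(12 n). The function names are illustrative.

```python
import math


def total_time_stats(l: float, n: int) -> tuple[float, float]:
    """Mean and standard deviation of the total time for n uniform puzzles.

    Each puzzle time is assumed uniform on [0, l/n] and independent of the others.
    """
    per_puzzle_max = l / n
    mean = n * per_puzzle_max / 2.0
    std = math.sqrt(n * per_puzzle_max ** 2 / 12.0)
    return mean, std


def puzzles_needed(l: float, max_std: float) -> int:
    """Smallest n whose total solving time has standard deviation <= max_std."""
    return math.ceil(l ** 2 / (12.0 * max_std ** 2))


l_seconds = 300.0                      # 5 minutes maximum
n = puzzles_needed(l_seconds, max_std=30.0)
mean, std = total_time_stats(l_seconds, n)
print(n, round(l_seconds / n, 1), round(mean, 1), round(std, 1))
```

Splitting one puzzle into several smaller ones keeps the expected work the same while narrowing the spread, so joining nodes do more nearly equal amounts of work.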
Analysis • Steady state • The number of nodes in the network, N • N = λg * µg • To control a fraction f of the nodes, an attacker must obtain (f/(1-f)) * N IDs • Assume there are n attackers • Arrival rate of attacker nodes will be λa = n / l • The time to launch a successful attack • T = (f/(1-f)) * N / λa
Analysis • Example • If λg = 1 node/sec, and µg = 2.3 hours, the steady state number of nodes is 8280 • For the attacker to control 10% of the total nodes in the network it is required to obtain 920 IDs • If the joining process takes on average 5 minutes, a successful attack would take 76 hours which is more than 3 days.
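The numbers in this example follow directly from the steady-state relations. Below is a minimal sketch that recomputes them; the function names are hypothetical, and the attack time assumes a single attacker who obtains IDs one after another at the stated 5 minutes each.

```python
def steady_state_nodes(arrival_rate: float, mean_lifetime: float) -> float:
    """N = arrival rate * mean lifetime (both in consistent time units)."""
    return arrival_rate * mean_lifetime


def ids_needed(f: float, legit_nodes: float) -> float:
    """IDs an attacker needs so that its share of all nodes is f."""
    return f / (1.0 - f) * legit_nodes


def attack_time(f: float, legit_nodes: float, join_time: float,
                n_attackers: int = 1) -> float:
    """Time to collect the needed IDs at one ID per join_time per attacker."""
    return ids_needed(f, legit_nodes) * join_time / n_attackers


N = steady_state_nodes(arrival_rate=1.0, mean_lifetime=2.3 * 3600)  # per second, seconds
needed = ids_needed(0.10, N)
hours = attack_time(0.10, N, join_time=5 * 60) / 3600
print(round(N), round(needed), round(hours, 1))
```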
Analysis • Cut-off windows (legitimate nodes) • P : the percentage of legitimate nodes that will be required to reacquire fresh tokens • With exponentially distributed lifetimes of mean µg, P = e^(-W/µg)
Analysis • Example • If µg = 2.3 hours and W = 4 hours, • The percentage of Legitimate nodes that will be cut off the network and asked to rejoin is 17.5%.
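The percentage in this example is consistent with the exponential lifetime model: the share of nodes still alive when their token expires is exp(-W/µg). The sketch below recomputes it and inverts the relation to pick W for a target rejoin fraction. That reading of P and both function names are assumptions for illustration.

```python
import math


def rejoin_fraction(window: float, mean_lifetime: float) -> float:
    """Fraction of legitimate nodes that outlive the token window W."""
    return math.exp(-window / mean_lifetime)


def window_for_fraction(target_fraction: float, mean_lifetime: float) -> float:
    """Token window W that makes exactly target_fraction of nodes rejoin."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return -mean_lifetime * math.log(target_fraction)


print(f"{rejoin_fraction(window=4.0, mean_lifetime=2.3):.3f}")
print(f"{window_for_fraction(0.05, mean_lifetime=2.3):.2f} hours for 5% rejoin")
```

A longer window inconveniences fewer legitimate nodes but gives a patient attacker more time to accumulate IDs, which is the trade-off behind choosing W.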
Analysis • Cut-off window (attackers) • The combined number of nodes that n attackers can accumulate is n*W / l • Example • If the average join time is 5 minutes and W = 4 hours • The maximum number of nodes an attacker can accumulate is 48 nodes
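The bound on attacker IDs can be turned into a bound on the attacker's share of the network. This is a small sketch with hypothetical function names; it reuses the 8280-node steady state from the earlier example as an assumed network size.

```python
def max_attacker_ids(window: float, join_time: float, n_attackers: int = 1) -> float:
    """Upper bound n * W / l on IDs that stay valid within one token window."""
    return n_attackers * window / join_time


def attacker_share(attacker_ids: float, legit_nodes: float) -> float:
    """Fraction of all nodes controlled by the attacker."""
    return attacker_ids / (attacker_ids + legit_nodes)


ids = max_attacker_ids(window=4 * 60, join_time=5)   # both in minutes
print(ids, f"{attacker_share(ids, legit_nodes=8280):.4f}")
```

Under these figures a single attacker is held well below 1% of the network, compared with the 10% it could eventually reach without token expiration.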
Conclusions and Discussions • What we learn • Topologies • Centralized p2p system • Search cost is bounded • Single point of failure • Decentralized p2p system • Unstructured p2p system • Flexible • Unbounded search • Structured p2p system • Scalability, bounded search • Only supports keyword queries • Super peer architecture
Conclusions and Discussions • Search • Constraint of hash • Dimension reduction and Document retrieval • Absolute angle • Rolling index • Locality preserving hashing • iDistance • Application • BT • For efficient downloading • Tit for tat • Skype • Super peer architecture • Security • ACS
Conclusions and Discussions • A better topology? • Robustness • Scalability • Flexibility • Bounded search • Fairness • Etc.
Conclusions and Discussions • Support general query? • The constraint of hash • Similarity search • Range query • Content-based retrieval • Trust without a third party? • nodeId assignment • Routing table management • Content management • How to decide the score?