Freenet: A Distributed Anonymous Information Storage and Retrieval System I. Clarke, O. Sandberg, B. Wiley, W. Hong ECE 6102 Presented By: Kaushik Chowdhury and Justin Fiore
Presentation Outline • Introduction of Freenet • Protocol Overview • Protocol Details • Security Features • Performance Evaluation • Conclusions
P2P Architecture & Features • Decentralized model • e.g. Freenet, Gnutella, Chord • no global index – local knowledge only • contact mediated by chain of intermediaries • Centralized model • e.g. Napster • global index held by central authority • direct contact between requestors and providers
Freenet Timeline • Final-year project by Ian Clarke, Edinburgh University, Scotland, June 1999 • Sourceforge Project, most active • v0.7 (alpha release, April 2006) • Incorporates a new approach to anonymous peer-to-peer, adopting a "scalable darknet" architecture. Source: http://freenetproject.org/news.html
Introduction • Design goals • Producer and consumer anonymity • Deniability for storers of information • Resistance to hostile third parties • Efficient dynamic storage and routing • Decentralization of network functions
Security Issues • How to provide anonymity? • Consumers may use browser proxy services • However, producers may keep session logs • Contacting a particular server reveals the information needed • Producers may ensure anonymity by using encrypted URL services • No protection against the operator of the service
Architecture • Peer-to-peer network of nodes that query one another • Each node has its own local data store and dynamic routing table • Enables users to share unused disk space, which increases the storage capacity of the network
Key Management • A way to locate a document anywhere • Keys are used to form a URI • Two similar keys don’t mean the subjects of the files are similar! • Keyword-signed Key (KSK) • Based on a short descriptive string, usually a set of keywords that can describe the document • Potential problem – global namespace [Diagram: descriptive string (e.g. gatech/distributed_systems) → public/private key pair; hash function applied to the public key → file key]
Key Management … contd • Signed-subspace Key (SSK) • Add sender information to avoid namespace conflict • Private key to sign / public key to verify [Diagram: hash of the public key (Hash 1) XOR hash of the descriptive string, e.g. gatech/distributed_systems (Hash 2); hash function applied to the result → file key]
Key Management … contd • Content-hash Key (CHK) • Derived by directly hashing the contents of the corresponding file • Used in conjunction with the signed-subspace keys
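A content-hash key can be illustrated in a few lines. The sketch below assumes SHA-1 as the hash; `content_hash_key` and `verify` are hypothetical names chosen for the example.

```python
import hashlib

def content_hash_key(data: bytes) -> str:
    """Derive the key directly from the file contents."""
    return hashlib.sha1(data).hexdigest()

def verify(data: bytes, key: str) -> bool:
    """A node or reader can recompute the hash to detect tampering."""
    return content_hash_key(data) == key

original = b"lecture notes v1"
key = content_hash_key(original)
```

Any change to the file produces a different key, so a modified copy cannot be passed off under the original key without a hash collision.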
Basic Model • Nodes know only their immediate upstream and downstream neighbors • Queries are given a unique identifier and hops-to-live count • Queries are forwarded to a node based on previous information
If a node has seen a message before, the message is forwarded to another node instead • The process continues until the file is obtained or the hops-to-live count is exhausted • Success or failure is passed back up the chain
[Figure: a typical request sequence starting at node a and passing through nodes a–f, with messages numbered 1–12]
Retrieving Data • User hashes a short descriptive string to obtain the file key • She then sends a “Request” message to her own node • If the file is present, the node returns it with a message saying it was the source • If not, the node looks up the nearest key in its routing table and forwards the request to the corresponding node
If the request is ultimately successful, each node passes the data back to its upstream requestor • It also caches a local copy of the file • Future requests will be serviced faster • Requests for similar keys will also be forwarded to the same node • For security, any node along the path can claim to be the source of the file
If a node cannot forward to its preferred downstream node, it tries the node for its second-nearest key • If that fails, it tries the third-nearest key, and so on • If none of them succeed, it sends a failure message to its upstream node, which follows the same procedure
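The retrieval procedure above (nearest-key routing, backtracking on failure, caching on the return path) can be sketched with a toy in-memory network. This is a simplified illustration under several assumptions: keys are integers, "nearest" means smallest absolute difference, and the `Node` class with its `request` method is hypothetical rather than taken from Freenet.

```python
from __future__ import annotations

class Node:
    def __init__(self, name: str):
        self.name = name
        self.store: dict[int, str] = {}    # key -> cached data
        self.routes: dict[int, Node] = {}  # known key -> node believed to hold it
        self.seen: set[int] = set()        # request ids already handled

    def request(self, req_id: int, key: int, htl: int) -> str | None:
        if req_id in self.seen or htl <= 0:
            return None  # loop detected, or hops-to-live exhausted
        self.seen.add(req_id)
        if key in self.store:
            return self.store[key]
        # Try routing entries from nearest key to farthest, backtracking on failure.
        for known in sorted(self.routes, key=lambda k: abs(k - key)):
            target = self.routes[known]
            data = target.request(req_id, key, htl - 1)
            if data is not None:
                self.store[key] = data     # cache on the return path
                self.routes[key] = target  # remember where this key was found
                return data
        return None  # report failure upstream

a, b, d = Node("a"), Node("b"), Node("d")
d.store[50] = "file"
a.routes = {49: b, 60: d}  # b looks closer to key 50 but is a dead end
result = a.request(req_id=1, key=50, htl=5)
```

Here node a first tries b, gets a failure, falls back to d, and afterwards holds both a cached copy and a routing entry for key 50.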
Storing (Inserting) Data • Similar to requesting data • User picks a text string (title), hashes it to a file key, and sends the key to her node • If there is a collision, the user is informed • If there is no collision, the node forwards the insert to the node with the closest key in its routing table
This goes on until the hops-to-live limit is reached • If a collision occurs anywhere, that node sends back the existing file along with a notice, and the insert is treated as a request • If not, the file is sent and copied at each node along the path
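The insert procedure can be sketched along the same lines. The example below is a simplification under stated assumptions: each node has a single fixed `next_hop` standing in for a routing-table lookup, and the `Node` class and `insert` function are hypothetical.

```python
class Node:
    def __init__(self, name, next_hop=None):
        self.name = name
        self.store = {}
        self.next_hop = next_hop  # stand-in for a routing-table lookup

def insert(node, key, data, htl):
    """Return None on success, or the existing data if the key collides."""
    path = []
    while node is not None and htl > 0:
        if key in node.store:
            existing = node.store[key]
            for n in path:
                n.store[key] = existing  # treated as a request: cache upstream
            return existing
        path.append(node)
        node = node.next_hop
        htl -= 1
    for n in path:
        n.store[key] = data  # no collision: copy the file at each node
    return None

c = Node("c")
b = Node("b", c)
a = Node("a", b)
first = insert(a, 7, "real", htl=3)

x = Node("x", a)
second = insert(x, 7, "fake", htl=3)  # collides at node a
```

The colliding insert does not overwrite anything; it only spreads the existing file one node further.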
Managing Data • Node storage uses an LRU cache • When a new file arrives, by insert or request, and the store is full, the least recently used file is removed • Thus, a file that is requested often will remain on some node • A file that is never requested will fade away
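An LRU data store of this kind can be sketched with `collections.OrderedDict`. The class name `DataStore` and its methods are hypothetical; a real node would also track sizes and routing entries.

```python
from collections import OrderedDict

class DataStore:
    def __init__(self, capacity):
        self.capacity = capacity
        self.files = OrderedDict()  # oldest use first, most recent use last

    def get(self, key):
        if key not in self.files:
            return None
        self.files.move_to_end(key)  # a request refreshes the file
        return self.files[key]

    def put(self, key, data):
        self.files[key] = data
        self.files.move_to_end(key)
        if len(self.files) > self.capacity:
            self.files.popitem(last=False)  # drop the least recently used file

store = DataStore(capacity=2)
store.put("a", b"1")
store.put("b", b"2")
store.get("a")        # "a" is now more recently used than "b"
store.put("c", b"3")  # store is full, so "b" is evicted
```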
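Node Joins • Need to assign the new node a key that is not solely influenced by any given malicious node. [Diagram: the new node picks Seed = random() and sends its address and Hash 1 (the hash of its seed) to a randomly chosen node; that node picks its own Seed = random(), XORs it with Hash 1, and hashes the result to produce Hash 2]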
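The commitment chain in the join diagram can be sketched as below. This assumes SHA-1 and 20-byte seeds so that seeds and digests can be XORed directly, and it assumes the final node key is the XOR of all revealed seeds; the function names are hypothetical.

```python
import hashlib
import secrets

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def commit_chain(seeds):
    """Each node hashes (its seed XOR the commitment it received)."""
    commitment = sha1(seeds[0])  # Hash 1, sent by the new node
    chain = [commitment]
    for seed in seeds[1:]:
        commitment = sha1(xor(seed, commitment))
        chain.append(commitment)
    return chain

def node_key(seeds):
    """Assumed rule: the key is the XOR of every participant's seed."""
    key = bytes(20)
    for seed in seeds:
        key = xor(key, seed)
    return key

seeds = [secrets.token_bytes(20) for _ in range(4)]
chain = commit_chain(seeds)  # published before any seed is revealed
key = node_key(seeds)        # computed once all seeds are revealed
```

Since every participant commits before seeing the other seeds, no single node can steer the resulting key, and recomputing the chain from the revealed seeds exposes anyone who changed their seed afterwards.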
Protocol Details • 3 Basic Operations • Handshake • Request Data • Insert Data
Transport Methods • Transport • Flexibility via use of TCP, UDP, or other technologies, such as packet radio • Node addresses consist of a transport method and transport-specific identifier • tcp/192.168.1.5:19114
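A node address of this form can be split into its two parts as sketched below. The `NodeAddress` type and `parse_address` function are hypothetical names for illustration; only the `tcp/192.168.1.5:19114` example comes from the slide.

```python
from typing import NamedTuple

class NodeAddress(NamedTuple):
    transport: str   # e.g. "tcp"
    identifier: str  # transport-specific, e.g. "192.168.1.5:19114"

def parse_address(text: str) -> NodeAddress:
    transport, sep, identifier = text.partition("/")
    if not sep or not transport or not identifier:
        raise ValueError(f"not a node address: {text!r}")
    return NodeAddress(transport, identifier)

addr = parse_address("tcp/192.168.1.5:19114")
# For the tcp transport, the identifier is assumed to be host:port.
host, _, port = addr.identifier.rpartition(":")
```

Keeping the identifier opaque until the transport is known is what lets other transports define their own identifier formats.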
Protocol Handshake • Transaction begins with Request.Handshake • Includes return address of sending node • The sending node may or may not be the original node • Remote node replies with Reply.Handshake • Specifies protocol version that it understands • Handshakes are remembered for a few hours
Request Data • Terminates when: • Key is found (Send.Data message) • Key is not found (Reply.NotFound message) • A Reply.Restart message is sent when a remote node has waited for network timeouts while contacting other nodes. This message tells predecessor nodes to extend their timers. • Request.Continue is sent when a dead end is reached and routing must backtrack. • When the key is found, that node sends back the data along with the supplierAddress, which may be faked.
Insert Data • Terminates when: • Hops-to-live is 0 and key is not found: remote node sends Reply.Insert • Hops-to-live is 0 and key is not found, but current node has a routing table entry for key: remote node sends Reply.NotFound • Key is found: remote node sends Send.Data including the data for the key • Request.Continue is sent when no more nodes can be contacted and hops-to-live is nonzero
Security Goals • Anonymity • Anonymity of requestors of files • Anonymity of inserters of files • Plausible deniability • Make it impossible to prove whether a given node requested, inserted, or stored a given file. • Integrity • Prevent malicious removal of a file • Prevent malicious modification • Reliability • Resist denial-of-service attacks
Freenet Anonymity Details • Since routing depends on the keys, key anonymity cannot be achieved with basic Freenet • Sender anonymity against a collaboration of nodes is preserved because any node that sends a message could be the originator or just forwarding the message
Freenet Anonymity Details (2) • Sender anonymity against a local eavesdropper cannot be achieved because the local eavesdropper could perform traffic analysis on incoming and outgoing messages • Also, the local eavesdropper could act as the first node in the query, so then the encrypted request would still be known to the eavesdropper
Freenet Anonymity Details (3) • The depth and hops-to-live values could be used to locate the originator (or at least a set of possible originators) • This is obscured by randomly selecting the initial depth and hops-to-live values • Depth is incremented and hops-to-live is decremented probabilistically to further obscure the originator
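The obscuring of depth and hops-to-live can be sketched as follows. All numeric parameters here (`base_htl`, `jitter`, `p_change`) are illustrative assumptions, not values from Freenet, and both function names are hypothetical.

```python
import random

def initial_fields(rng, base_htl=10, jitter=3):
    """The originator starts from randomized values rather than fixed ones."""
    depth = rng.randint(1, jitter)
    htl = base_htl + rng.randint(0, jitter)
    return depth, htl

def next_hop_fields(depth, htl, rng, p_change=0.75):
    """Each hop changes the counters only with some probability."""
    if rng.random() < p_change:
        depth += 1
    if rng.random() < p_change:
        htl -= 1
    return depth, htl

rng = random.Random()
depth, htl = initial_fields(rng)
depth, htl = next_hop_fields(depth, htl, rng)
```

An observer who sees a message with a given depth cannot tell how many hops it has actually travelled, because neither the starting value nor the per-hop change is fixed.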
Pre-Routing • Basic Freenet messages are encrypted using a series of public keys, which determine the path of the pre-routing • The message is forwarded along that route and partially decrypted at each node • When the message reaches the pre-routing endpoint, it is injected into Freenet normally • The intermediate pre-routing nodes can neither read nor alter the request, nor identify the originating node
Data Source Anonymity • The data source field is probabilistically altered during transmission through the network. • It is not possible to tell whether a node provided the file or was just forwarding it • This provides plausible deniability because it is not provable whether or not a node had a file before an investigative node queried it for the specific file. • This is because any request will cache the result along the return path.
Data Source Anonymity (2) • In a normal situation, it would be possible to identify the existence of a file on a node by sending a request to that node with HTL = 1 • Freenet solves this by probabilistically forwarding any message with HTL = 1 to the next node.
Prevention of Modification • Content-hash keys and Signed-subspace keys • A (possibly signed) hash of the data accompanies the data • Modification would require: • Finding a hash collision for content-hash keys • Successfully forging a digital signature for signed-subspace keys • Keyword-signed keys • Keys are a hash of the original descriptive string. • Vulnerable to dictionary attack • Colliding keys can be made by anyone knowing the original descriptive string.
Denial-of-Service Attack Prevention • Junk File Attack • Attacker attempts to flood network with a large number of junk files • Data store is divided into two parts: new files and established files • Inserts displace new files, not established files • Junk File Attack would only paralyze inserts temporarily, not displace files that are desired
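The two-part data store can be sketched as below. The promotion rule (a file becomes established after a certain number of requests) and the `promote_after` threshold are assumptions for this sketch, the established section is left unbounded for brevity, and `TwoPartStore` is a hypothetical name.

```python
from collections import OrderedDict

class TwoPartStore:
    def __init__(self, new_capacity, promote_after=2):
        self.new = OrderedDict()  # key -> [data, request_count], oldest first
        self.established = {}     # key -> data, never displaced by inserts
        self.new_capacity = new_capacity
        self.promote_after = promote_after

    def insert(self, key, data):
        if key in self.established or key in self.new:
            return
        self.new[key] = [data, 0]
        if len(self.new) > self.new_capacity:
            self.new.popitem(last=False)  # only new files are displaced

    def request(self, key):
        if key in self.established:
            return self.established[key]
        if key not in self.new:
            return None
        entry = self.new[key]
        entry[1] += 1
        if entry[1] >= self.promote_after:
            self.established[key] = entry[0]  # requested often enough
            del self.new[key]
        return entry[0]

store = TwoPartStore(new_capacity=2)
store.insert("wanted", "data")
store.request("wanted")
store.request("wanted")  # second request promotes the file
for i in range(100):
    store.insert(f"junk{i}", "x")  # flood of junk inserts
```

The flood churns only the new-file section, so the file that readers actually requested survives it.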
Denial-of-Service Attack Prevention (2) • Alternate Versions Attack • Attacker inserts alternate versions of a file under the same key as the file they want to displace • Does not work against content-hash keys or signed-subspace keys without a hash collision or digital signature forgery • If done with a keyword-signed key, both versions of the file would coexist in the network. • Solved by the insert protocol • Every unsuccessful attempt to insert the alternate file further distributes the real file’s data across the network, because Send.Data is returned in response to the insert request.
Performance Evaluation • Number of nodes = 1000 • Datastore size = 50 items • Routing table size = 250 addresses • Ring lattice topology
Network convergence • X-axis: time • Y-axis: pathlength (hops) • 1000 nodes, 50-item datastore, 250-entry routing table • The routing tables were initialized to a ring-lattice topology • Pathlength: the number of hops actually taken before finding the data.
Scalability • X-axis: number of nodes • Y-axis: pathlength (hops) • Shows the relation between network size and average pathlength. • Starts with 20 nodes; nodes are added at regular intervals.
Fault Tolerance • X-axis: % node failure • Y-axis: pathlength in hops • The median pathlength remains below 20 even when up to 30% of nodes fail.
Small World Model • X-axis: number of links • Y-axis: proportion of nodes • Most nodes have only a few connections, while a small number of nodes have a large set of connections. • Follows a power law.
Conclusions • Freenet provides a fairly anonymous file storage and retrieval medium. • It uses an adaptive routing algorithm to fulfill queries, not a broadcast like Gnutella and others. • Freenet was seen to be highly scalable in simulation results.