“Blessed are the poor in spirit: for theirs is the kingdom of heaven.” <Matthew 5:3>
Intelligent File Sharing Framework A THESIS IN Computer Science Changgyu Oh 5/2/2002
Contents • Motivations • Network Topologies • Problem Domains • Research Goal • Related Works • Intelligent File Sharing Framework • Framework Figure • Query Service Using Reasoning • IS-A/Contained-In Hierarchies • File Association Rules • The Benefits of IFS Search • Grouping Service • IFS P2P vs. P2P Network • Benefits of Dynamic Group Partition • Dynamic Group Partition • IP Clue Mechanism • File Transaction in IFS • Query Service Types • IFS System Architecture • Client View • Server View • IFS Prototype Implementation • IFS Query Interface • Experimental Results • Comparative Analysis • Contributions • Conclusion • Future Work • References
Motivations • Why P2P? • Limitations of client/server • Increasing interest in sharing and collaborative computing • Improving P2P technologies • Why P2P file sharing? • File reusability • Sharing of available resources • Significance of this research • Increased network scalability • Anonymity • Flexible and powerful queries
Problem Domains (1) • Limitations of P2P Network • Scalability • Utilization of Network resources • P2P Network Topology • Broadcast • Logical Mesh network
Problem Domains (2) • Limited anonymity of the resource source • The queryHit message carries the resource source's IP address • Privacy and security • How can the source node send it to the destination without revealing its IP address publicly?
Problem Domains (3) • Limitations of keyword-based queries • Primitive and limited • Search for only one file at a time • Not flexible • Do not satisfy users' requests
Related Works I: Anonymous Publication Service • The Publius system [Waldman M., 2000] • Provides document anonymity: the key is split among the n servers, and without sufficient shares of the key a server is unable to decrypt the document that it stores. • Anonymity is based on a static, system-wide list of available servers. • Does not support adding new servers • The Eternity system [Anderson R. J., 1996] • Provides publisher anonymity by using one-way anonymous re-mailers • Server anonymity is not provided • Reader anonymity is not provided by open public proxies • Query and Advertising System [Heimbigner D., 2000] • An arbitrary name is placed at the first-level server for each client. • The first-level server has the actual IP addresses of clients • Freenet [Clarke I., 2000] • Provides document anonymity • Server anonymity is not provided.
Related Works II: Meta Search Methods • Efficient and Effective Metasearch [Yu C., 1999] • Representatives for each database, optimizing the relationship hierarchy • Efficient Transitive Closure Reasoning [Lee Y., 2001] • Inheritance and classification via transitive closure reasoning • Class/Part/Containment hierarchy • Browsing Large Digital Library Collections [Geffner S., 1999] • Classification hierarchies to increase data browsing capabilities in digital libraries.
Related Works III: File Sharing Systems Using Caching • The Distributed File System [Burns R. C., 2000] • Detecting network failures ensures that caches are consistent. • Network File System [Palmer J., 1996] • Clients poll the server to find out when the file was last modified • This determines whether the cached version is valid. • Hint-Based Cooperative Caching file system [Sarkar P., 2000] • Helps clients make decisions based on the computer's local state • Reduces overhead and access latency
Intelligent File Sharing Framework • Major Building Blocks: • Query Service using Reasoning • IP-Clue Mechanism: • Encoding/Decoding • Dynamic Grouping and Caching Service
Query Service Using Reasoning • Goal: • Fast search using the File Relation Hierarchy Set • More flexible query and directory services • Approach: • Relationships: • IS-A • Contained-In • Run-With • File Relation Hierarchy Set <Ν,Ŗ,Ω,Њ> • Set of number pairs (Ν), • Relation type (Ŗ), • Constraint rule (Ω), • Hierarchy identifier (Њ). • File Association Rules • Generalized Association Rule • Aggregated Association Rule • Constraint-based Association Rule
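The slide does not say how the number pairs (Ν) are computed. A minimal sketch, assuming they play the role of interval labels: each node gets a (start, end) pair from a depth-first walk, and an IS-A test becomes an interval-containment check instead of a graph traversal. The function names and the sample hierarchy are hypothetical and are not taken from the IFS prototype.

```python
from typing import Dict, List, Tuple

Label = Tuple[int, int]


def label_hierarchy(children: Dict[str, List[str]], root: str) -> Dict[str, Label]:
    """Assign each node a (start, end) number pair by depth-first traversal.

    Assumption: this interval labeling is one plausible reading of the
    "set of number pairs"; the slides do not confirm the actual scheme.
    """
    labels: Dict[str, Label] = {}
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        counter += 1
        start = counter
        for child in children.get(node, []):
            visit(child)
        # 'counter' now holds the largest number handed out inside this subtree.
        labels[node] = (start, counter)

    visit(root)
    return labels


def is_a(labels: Dict[str, Label], descendant: str, ancestor: str) -> bool:
    """True if the descendant's pair lies inside the ancestor's pair."""
    d_start, d_end = labels[descendant]
    a_start, a_end = labels[ancestor]
    return a_start <= d_start and d_end <= a_end


# Hypothetical IS-A hierarchy (parent -> children).
CHILDREN = {
    "application": ["multimedia application", "word processor"],
    "multimedia application": ["video player", "audio player"],
}
LABELS = label_hierarchy(CHILDREN, "application")
```

With labels computed once, `is_a(LABELS, "video player", "application")` needs only two integer comparisons, which is the kind of fast lookup the goal on this slide asks for.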
File Association Rules • Generalized Association Rule • Subtype relationship between files • E.g., if Windows multimedia application X is a multimedia application Y, and a multimedia file Z runs with multimedia application Y, then X runs Z. • Aggregated Association Rule • A directory contains multiple sub-directories or files • E.g., “Find the files on CS101 homework” • Constraint-based Association Rule • File association based on constraints such as file size, network capacity, etc. • E.g., “Find a file whose size is less than 1 MB and that can be opened with MS Word.”
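The generalized and constraint-based rules above can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: the fact tables `IS_A` and `RUN_WITH`, the `FileEntry` type, and both function names are hypothetical, and 1 MB is taken to mean 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class FileEntry:
    name: str
    size_bytes: int
    extension: str


# Hypothetical facts: child -> parent (IS-A) and application -> extensions (Run-With).
IS_A: Dict[str, str] = {"Windows Media Player": "multimedia application"}
RUN_WITH: Dict[str, Set[str]] = {
    "multimedia application": {".mp3", ".avi"},
    "MS Word": {".doc"},
}


def runnable_extensions(app: str) -> Set[str]:
    """Generalized rule: an application inherits Run-With facts from its IS-A ancestors."""
    result: Set[str] = set()
    seen: Set[str] = set()
    current: Optional[str] = app
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles in IS_A
        result |= RUN_WITH.get(current, set())
        current = IS_A.get(current)
    return result


def constrained_search(files: List[FileEntry], app: str, max_size_bytes: int) -> List[FileEntry]:
    """Constraint-based rule: files below a size limit that the application can open."""
    extensions = runnable_extensions(app)
    return [f for f in files if f.size_bytes < max_size_bytes and f.extension in extensions]


FILES = [
    FileEntry("report.doc", 200_000, ".doc"),
    FileEntry("thesis.doc", 3_000_000, ".doc"),
    FileEntry("song.mp3", 500_000, ".mp3"),
]
ONE_MB = 1024 * 1024  # assumption: the slide's "1 MB" read as binary megabytes
small_word_files = constrained_search(FILES, "MS Word", ONE_MB)
```

Here `small_word_files` contains only `report.doc`, and `runnable_extensions("Windows Media Player")` yields the extensions inherited from "multimedia application", mirroring the X, Y, Z example on the slide.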
Grouping Service • Goal: Increase scalability • Control the maximum hop count • Control the number of message replicas generated by peer hosts • Control the number of peer hosts used for message forwarding in each peer host's routing table. • Approach: • Group partition • Brother relationship • Caching
Benefits of Dynamic Group Partition • Broadcast within the same group • Search is robust against node failure • Ensures a shortest path • Increases network scalability by grouping peers • Server-less and decentralized • Dynamic partition • Reduces network traffic • Requires only one hop per group
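The three controls listed under the goal can be illustrated with a generic hop-limited flood. This sketch is not the IFS group partition algorithm, which the slides do not spell out; it only shows how a hop limit, duplicate suppression, and a fan-out cap bound the number of messages. The function name, parameters, and sample topology are hypothetical.

```python
from collections import deque
from typing import Dict, List


def flood(neighbors: Dict[str, List[str]], origin: str,
          max_hops: int, max_fanout: int) -> Dict[str, int]:
    """Return {peer: hop count at which the query first reached it}.

    max_hops   caps how far a query travels (maximum hop control).
    max_fanout caps how many routing-table entries each peer forwards to.
    A peer that has already seen the query is skipped, which limits replicas.
    """
    reached: Dict[str, int] = {origin: 0}
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        hops = reached[peer]
        if hops == max_hops:
            continue  # hop budget exhausted, do not forward further
        for nxt in neighbors.get(peer, [])[:max_fanout]:
            if nxt not in reached:
                reached[nxt] = hops + 1
                queue.append(nxt)
    return reached


# Hypothetical routing tables.
NEIGHBORS = {
    "A": ["B", "C", "D"],
    "B": ["E"],
    "C": ["F"],
    "D": ["G"],
    "E": ["H"],
}
REACHED = flood(NEIGHBORS, "A", max_hops=2, max_fanout=2)
```

With a fan-out of 2, peer A forwards only to B and C, and with a hop limit of 2 the query stops at E and F, so D, G, and H never receive it.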
IP Clue Mechanism • Goal: Protect the identity of the resource publisher in P2P file sharing • Approach • IP encoding/decoding • Encode the IP address at the source peer • Decode the encoded IP address at the destination peer • Formula: • Assume that the IP address of A is represented as [W.X.Y.Z] (e.g., [255.122.25.5]) • (1) W + the size of the query • (2) X + the first character of the query • (3) Y + the file extension size • (4) Z + the last character of the query message • Only the destination peers can recognize the IP Clue!
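The formula above can be worked through in a short sketch. Several details are assumptions, since the slide leaves them open: "size of the query" is read as its character count, a "character" as its code point, "file extension size" as the length of the extension without the dot, and the four sums are left unwrapped (they may exceed 255). The function names are hypothetical.

```python
from typing import List


def encode_ip(ip: str, query: str, extension: str) -> List[int]:
    """Source peer: add query-derived offsets to each octet of W.X.Y.Z."""
    w, x, y, z = (int(part) for part in ip.split("."))
    return [
        w + len(query),      # (1) W + the size of the query
        x + ord(query[0]),   # (2) X + the first character of the query
        y + len(extension),  # (3) Y + the file extension size
        z + ord(query[-1]),  # (4) Z + the last character of the query
    ]


def decode_ip(clue: List[int], query: str, extension: str) -> str:
    """Destination peer: subtract the same offsets, which it can derive from its own query."""
    offsets = [len(query), ord(query[0]), len(extension), ord(query[-1])]
    return ".".join(str(value - offset) for value, offset in zip(clue, offsets))


# The example address from the slide, with a hypothetical query and extension.
CLUE = encode_ip("255.122.25.5", "thesis", "pdf")
RECOVERED = decode_ip(CLUE, "thesis", "pdf")
```

For this input the clue is `[261, 238, 28, 120]`, and decoding with the same query and extension returns `255.122.25.5`. A peer that does not know the query text cannot compute the offsets directly, which is the sense in which the clue is meant for the destination only.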
IFS System Architecture • Component-based Architecture • Servant Component • Highest level of component • Server + Client Components • Manager Components: • Control work flow • Assign tasks to worker components • Worker Components: • Perform actual tasks • Service (Entity) Components: • Task description
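The component roles on this slide can be sketched as a small class layout. This is a hedged illustration of the manager/worker/service split only; the class and method names are hypothetical and are not taken from the Phex-based prototype.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Service:
    """Service (entity) component: describes a task, does not perform it."""
    kind: str
    payload: str


class Worker:
    """Worker component: performs the actual task."""

    def __init__(self, name: str, handler: Callable[[str], str]) -> None:
        self.name = name
        self.handler = handler

    def perform(self, service: Service) -> str:
        return self.handler(service.payload)


class Manager:
    """Manager component: controls work flow and assigns tasks to workers."""

    def __init__(self) -> None:
        self.workers: Dict[str, Worker] = {}

    def register(self, kind: str, worker: Worker) -> None:
        self.workers[kind] = worker

    def dispatch(self, service: Service) -> str:
        worker = self.workers.get(service.kind)
        if worker is None:
            raise LookupError(f"no worker registered for task kind {service.kind!r}")
        return worker.perform(service)


class Servant:
    """Highest-level component: bundles the server side and the client side."""

    def __init__(self) -> None:
        self.client = Manager()
        self.server = Manager()


servant = Servant()
servant.client.register("query", Worker("query-worker", lambda text: f"searching for {text}"))
result = servant.client.dispatch(Service(kind="query", payload="cs101"))
```

Keeping the task description (`Service`) separate from the code that executes it (`Worker`) lets a manager route new task kinds without changing existing workers.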
IFS Prototype Implementation • The IFS prototype is built on top of the Gnutella Phex system • Development environment • At least 25 MB of free memory • Java Virtual Machine • Pentium III 500 MHz CPU • Event-driven methods • Each task is performed in response to events • Component-based programming • Manager components • Worker components • Service components
Contributions • Proposed a conceptual framework for decentralized P2P file sharing. • Dynamic group partition and caching • Query using fast reasoning • IP-clue mechanism (encoding/decoding) • Designed a component-based architecture • Implemented to extend an existing file sharing system (Gnutella Phex)
Conclusion • The IFS system • Supports decentralized P2P file sharing. • Increases network scalability. • Provides flexible file searching and querying. • Protects the anonymity of resource sources.
Future Work • Further research on the latency introduced by grouping • File registration strategy for heterogeneous environments • Discover advanced mechanisms for reasoning about file relationships and file association rules • Research on grouping policies • Grouping by peer host's network capacity • Grouping by interests • Grouping by context • Grouping by location
References: • C. T. Yu, W. Meng, K.-L. Liu, W. Wu, and N. Rishe. Efficient and effective metasearch for a large number of text databases. In CIKM, pages 217-224, 1999. • Y. Lee and J. Geller. Efficient Transitive Closure Reasoning in a Combined Class/Part/Containment Hierarchy. Journal of Knowledge and Information System, 2002. • S. Geffner, D. Agrawal, A. Abbadi, and T. Smith. Browsing Large Digital Library Collections Using Classification Hierarchies. In CIKM, pages 195-201, 1999.
References (continued): • M. Waldman, A. Rubin, and L. F. Cranor. Publius: A robust, tamper-evident, censorship-resistant web publishing system. In Proc. 9th USENIX Security Symposium, pages 59-72, August 2000. • R. J. Anderson. The Eternity Service. In Proceedings of the 1st International Conference on the Theory and Applications of Cryptology (PRAGOCRYPT '96), Prague, Czech Republic, 1996. • J. Palmer, R. Strong, and E. Upfal. Nonblocking membership protocols with asymmetric safety. Technical Report RJ10096 (91912), IBM Research Division, December 1997.
References (continued): • I. Clarke, O. Sandberg, B. Wiley, and T. Hong. Freenet: A distributed anonymous information storage and retrieval system. In Proceedings of the Workshop on Design Issues in Anonymity and Unobservability, pages 46-66, July 2000. • D. Heimbigner. Adapting Publish/Subscribe Middleware to Achieve Gnutella-like Functionality. Technical Report CU-CS-909-00, Department of Computer Science, University of Colorado, September 2000. • P. Sarkar and J. H. Hartman. Hint-based cooperative caching. ACM Transactions on Computer Systems (TOCS), 18(4), November 2000.