A Distributed Search Service for P2P File Sharing in Mobile Applications. 4 September 2003. Authors: Christoph Lindemann and Oliver P. Waldhorst, University of Dortmund, Dept. of Computer Science
Itinerary • Background Studies • Introducing Passive Distributed Indexing (PDI) • Algorithm Details • Performance Results • Conclusion and Future Work
Background Studies • A Mobile Ad-Hoc Network: short-range wireless, e.g. Bluetooth; medium-range wireless, e.g. IEEE 802.11 • Such an ad-hoc network can be used for data sharing between mobile devices, e.g. documents, MP3s and video clips • How can searching of P2P data be enabled on top of this architecture?
Background Studies Related Works
Proposed Solution • Objective: “to provide a general-purpose file search service which can be used by several kinds of mobile applications running on top” • Passive Distributed Indexing (PDI) - Each device stores its local documents as a Repository - Documents are uniquely identified by their local path and a unique device ID, a.k.a. the Document Identifier - A local Index Cache is maintained on each device, which forms the core component of this architecture - Searching is performed by keyword searches
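The per-device state described on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `DocumentId`, `IndexCache` and `Device` are hypothetical, and the least-recently-used eviction is an assumption, since the slides only say that the index cache has a limited size.

```python
from collections import OrderedDict
from typing import NamedTuple, Set


class DocumentId(NamedTuple):
    """Document identifier: unique device ID plus the document's local path."""
    device_id: str
    local_path: str


class IndexCache:
    """Bounded cache of (keyword, document identifier) pairs.

    Assumption: least-recently-used eviction. The slides do not name a
    replacement policy.
    """

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries
        self._entries = OrderedDict()  # (keyword, DocumentId) -> None

    def add(self, keyword: str, doc: DocumentId) -> None:
        key = (keyword, doc)
        self._entries[key] = None
        self._entries.move_to_end(key)           # mark as most recently used
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)    # evict least recently used

    def lookup(self, keyword: str) -> Set[DocumentId]:
        return {doc for (kw, doc) in self._entries if kw == keyword}


class Device:
    """A mobile device with a local repository and a local index cache."""

    def __init__(self, device_id: str, cache_size: int) -> None:
        self.device_id = device_id
        self.repository = {}                     # DocumentId -> set of keywords
        self.index_cache = IndexCache(cache_size)

    def add_document(self, local_path: str, keywords: Set[str]) -> DocumentId:
        doc = DocumentId(self.device_id, local_path)
        self.repository[doc] = set(keywords)
        return doc

    def local_search(self, keyword: str) -> Set[DocumentId]:
        """Keyword search over the device's own repository."""
        return {doc for doc, kws in self.repository.items() if keyword in kws}
```

Keeping the repository and the index cache separate mirrors the slide: the repository holds what the device owns, while the cache holds what it has learned about documents on other devices.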
Passive Distributed Indexing Operation Scenario Node 3 Node 1 q: d2, d3 … Node 2 q: d1, d2
Passive Distributed Indexing Operation Scenario Node 3 Node 1 q: d2, d3 [QUE] q ? … Node 2 q: d1, d2
Passive Distributed Indexing Operation Scenario Node 3 Node 1 q: d1, d2, d3 [REP] q : d1, d2 q: d1, d2 [REP] q : d2, d3 Node 2 [REP] q : d1, d2 q: d1, d2, d3
Passive Distributed Indexing Operation Scenario Node 3 Node 1 q: d1, d2, d3 [REP] q : d3 q: d1, d2, d3 Node 2 q: d1, d2, d3
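The first query round of the scenario can be replayed with a small Python sketch. It rests on several assumptions: the flattened diagram is read as Node 1 caching `q: d2, d3`, Node 2 caching `q: d1, d2`, and Node 3 issuing the query; all three nodes are within one hop of each other; and local repositories are left out so that only the index caches are modelled. The `Node` class and `broadcast_query` function are hypothetical names.

```python
class Node:
    """A device reduced to its index cache (keyword -> set of document ids)."""

    def __init__(self, name, cache=None):
        self.name = name
        self.cache = {k: set(v) for k, v in (cache or {}).items()}

    def answer(self, keyword):
        """Build the [REP] payload for a [QUE] message from the local cache."""
        return set(self.cache.get(keyword, ()))

    def overhear(self, keyword, docs):
        """Passively cache the contents of a response heard on the air."""
        self.cache.setdefault(keyword, set()).update(docs)


def broadcast_query(sender, nodes, keyword):
    """Single-hop broadcast of [QUE] keyword.

    Every other node replies from its cache, and every node in range
    (the sender and the other responders alike) caches each reply.
    """
    # Collect all replies first so that each answer reflects the cache
    # contents from before the query was sent.
    replies = {n.name: n.answer(keyword) for n in nodes if n is not sender}
    for docs in replies.values():
        if docs:
            for n in nodes:
                n.overhear(keyword, docs)
    return set().union(*replies.values())


node1 = Node("Node 1", {"q": {"d2", "d3"}})
node2 = Node("Node 2", {"q": {"d1", "d2"}})
node3 = Node("Node 3")

found = broadcast_query(node3, [node1, node2, node3], "q")
print(sorted(found))  # ['d1', 'd2', 'd3']
```

After the round, all three caches hold `d1, d2, d3` for `q`, which matches the state shown on the slide with the two `[REP]` messages.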
Performance Analysis Independent Parameters • System Param.: No. of Devices, Transmission Range, Mobility Model • Application Param.: No. of Documents, No. of Keywords of Interest, Distribution of Keywords, Inter-request Time of Queries • Protocol Param.: Index Cache Size, Max. TTL, (Document Timeout)
Performance Analysis Values for Simulation
Performance Analysis Performance Measure • Query Hit Rate = Nrep / Nall • Example: Nall = 5, Nrep = 3, so Query Hit Rate = 3/5 = 60% • (Other performance measures, e.g. system response time, are left for future work.)
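The hit-rate measure is a plain ratio, as the short Python sketch below shows. It reads Nall as the number of documents in the system that match a query and Nrep as the number of those reported back to the querying device; the slide's example suggests this reading but does not state it, so treat it as an assumption. The function name `query_hit_rate` is hypothetical.

```python
def query_hit_rate(n_rep: int, n_all: int) -> float:
    """Return Nrep / Nall for a single query.

    Assumed reading: n_all = matching documents in the system,
    n_rep = matching documents actually reported to the requester.
    """
    if n_all <= 0:
        raise ValueError("n_all must be positive")
    if not 0 <= n_rep <= n_all:
        raise ValueError("n_rep must lie between 0 and n_all")
    return n_rep / n_all


print(query_hit_rate(3, 5))  # 0.6
```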
Analysis of Results Sensitivity to System Parameters: No. of Devices & Index Cache Size • An increase in the No. of devices leads to an increase in PDI performance (2) • Local Index Cache has very little impact • Limited impact of an increase in the No. of devices on PDI performance (1) • (1, 2) A small index cache cannot accommodate entries for all matching documents • Conclusion: The index cache can be small when the No. of devices is small, whereas a sufficiently large index cache can boost performance when the No. of devices is large
Analysis of Results Sensitivity to System Parameters: No. of Devices & Forwarding TTL • The advantage vanishes when the No. of devices grows further (2) • Forwarding improves performance by 20% (1) • Message forwarding has very little impact • (1) With a medium No. of devices, forwarding has a higher probability of reaching more devices • (2) A high No. of devices fills the local index cache with entries from nearby devices, which adequately replaces message forwarding • Conclusion: Forwarding is useful in medium-density systems, but should be disabled in high-density systems to avoid unnecessary network traffic
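To make the effect of the forwarding TTL concrete, the Python sketch below counts which devices a message reaches within a given hop limit on a static connectivity graph. It is an illustration only: the graph, the function name `devices_reached`, and the convention that `max_hops=1` means "direct neighbours only, no forwarding" are assumptions, not details taken from the slides.

```python
from collections import deque


def devices_reached(neighbours, source, max_hops):
    """Return the devices that hear a message sent by `source`.

    `neighbours` maps each device to the devices within its transmission
    range. Assumed TTL convention: max_hops=1 reaches direct neighbours
    only, max_hops=2 lets each of them forward the message once, etc.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue                      # TTL exhausted, do not forward
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    del hops[source]
    return set(hops)


# A chain of four devices: A - B - C - D
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(sorted(devices_reached(chain, "A", 1)))  # ['B']
print(sorted(devices_reached(chain, "A", 2)))  # ['B', 'C']
```

In a sparse graph like this chain each extra hop adds devices, whereas in a dense graph most devices are already direct neighbours, which is consistent with the slide's observation that forwarding loses its advantage at high density.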
Analysis of Results Sensitivity to System Parameters: Transmission Range & Index Cache Size • Index cache size significantly improves performance • Local Index Cache has very little impact (1) • (1) Only a small No. of devices is reached with a very low transmission range, so an increase in cache size has no impact • Conclusion: The index cache can be small for short-range devices such as Bluetooth, whereas the No. of devices should be high to compensate for the low hit rate
Analysis of Results Sensitivity to System Parameters: Transmission Range & Forwarding TTL • PDI with message forwarding disabled achieves the best performance for long-range devices (1) • (1) Responses for uncommon entries are still forwarded over great distances, which fills index caches with junk entries • Conclusion: When the transmission range is high, message forwarding should be disabled
Analysis of Results Sensitivity to App Parameters: Zipf • A Zipf-like distribution is used to model the probability distribution of search keywords • For keyword $k_j$, $\Pr(k = k_j) \propto j^{-\alpha}$, for $0 \le \alpha \le 1$ • Therefore, the higher $\alpha$ is, the more localized the query stream
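A Zipf-like keyword distribution of this form can be generated as in the Python sketch below, which normalizes the weights $j^{-\alpha}$ so that they sum to 1 and then draws keyword ranks from them. The function names, the seed, and the example sizes are illustrative assumptions; the slides do not say how the simulator draws its keywords.

```python
import random


def zipf_probabilities(num_keywords, alpha):
    """Pr(k = k_j) proportional to j ** -alpha, normalized over j = 1..num_keywords."""
    weights = [j ** -alpha for j in range(1, num_keywords + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_keyword_ranks(num_keywords, alpha, count, seed=0):
    """Draw `count` keyword ranks (1 = most popular) from the distribution."""
    rng = random.Random(seed)
    probs = zipf_probabilities(num_keywords, alpha)
    return rng.choices(range(1, num_keywords + 1), weights=probs, k=count)


uniform = zipf_probabilities(4, 0)      # alpha = 0: every keyword equally likely
skewed = zipf_probabilities(4, 1)       # alpha = 1: rank 1 is most likely
print(uniform)                          # [0.25, 0.25, 0.25, 0.25]
print(skewed[0] > skewed[1] > skewed[2] > skewed[3])  # True
```

With $\alpha = 0$ the query stream has no locality at all, and as $\alpha$ approaches 1 a growing share of queries targets the few most popular keywords, which is the sense in which the slide calls the stream more localized.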
Analysis of Results Sensitivity to App Parameters: Zipf & Index Cache Size • With a large index cache, PDI can achieve a hit rate of > 70% regardless of the degree of locality • With a small index cache, PDI is extremely sensitive to locality in the request stream • Conclusion: For applications offering no significant locality in the request stream, the index cache size must be chosen sufficiently large
Analysis of Results Sensitivity to App Parameters: Zipf & Forwarding TTL • For even higher locality, 2-hop forwarding outperforms the others • PDI gains performance improvements from packet forwarding for higher locality; 2-hop forwarding performs similarly to forwarding over more hops • Conclusion: 2-hop message forwarding should be enabled in applications offering a high degree of locality in the request stream
Analysis of Results Sensitivity to App Parameters: No. of Documents & Index Cache Size • Performance decreases linearly with the No. of documents per device • Performance increases with index cache size only in a log-like fashion (1) • (1) It has been shown elsewhere that this behavior is explained if a Zipf-like request distribution is assumed • Conclusion: More sophisticated forwarding strategies, rather than a larger index cache, may have to be employed to improve performance
Analysis of Results Sensitivity to App Parameters: No. of Documents & Forwarding TTL • Performance is improved by 10% if each device holds a small No. of documents, with near-maximal performance reached by 2-hop forwarding • For a large No. of documents per device, there is no significant difference between forwarding strategies • Conclusion: 2-hop forwarding can improve performance when the No. of documents per device is small, but no forwarding strategy gains performance when the No. of documents per device is large
Analysis of Results Transient Behaviors • PDI Hit Rate increases steadily after simulation start • Real Hit Rate is constant over time • Real Hit Rate: rate of hits reported from devices that actually hold a matching document • Conclusion: The system attains its maximal performance automatically, and no initial warm-up mechanism is required
Conclusion and Future Work PDI is … • A General-purpose Distributed Document Search Service • Utilizes Local Caching of Query Results to Avoid Flooding the Network • Tunable (Cache Size, TTL, Document Timeout) to Support Different Environments & Applications • Provides an Initial Filling of Index Caches in a Very Short Time; No Warm-up Mechanism is Needed
Conclusion and Future Work Contributions of Simulation Results • High Density, Low Query Locality: Requires Sufficiently Large Index Cache Size • Medium Density, Medium-range: 2-hop Packet Forwarding • Forwarding should be Disabled If Either the No. of Devices or Transmission Range is High • Large No. of Documents: Requires Sufficiently Large Index Cache Size
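These rules of thumb can be restated as a small Python lookup. The sketch only re-encodes the slide: the categorical levels `"low"`, `"medium"` and `"high"` are not quantified anywhere in the slides, the function name `recommend_settings` is hypothetical, and cases the slide does not cover return `None` rather than a guessed value.

```python
def recommend_settings(density, tx_range, locality, many_documents):
    """Map coarse scenario labels to the slide's recommendations.

    density, tx_range, locality: "low", "medium" or "high" (assumed labels).
    many_documents: True if each device holds a large No. of documents.
    """
    # A large index cache is called for by high density combined with
    # low query locality, or by a large No. of documents.
    large_index_cache = (density == "high" and locality == "low") or many_documents

    if density == "high" or tx_range == "high":
        forwarding = "disabled"
    elif density == "medium" and tx_range == "medium":
        forwarding = "2-hop"
    else:
        forwarding = None  # the slide gives no recommendation for this case

    return {"large_index_cache": large_index_cache, "forwarding": forwarding}


print(recommend_settings("medium", "medium", "high", False))
# {'large_index_cache': False, 'forwarding': '2-hop'}
```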
Conclusion and Future Work Future Work includes … • Investigation of the Impact of Document Modifications on the Performance of PDI, and the Design of an Appropriate Workaround Mechanism • Evaluation of the Performance of PDI considering Sophisticated Workload Models that Contain Location-Dependent Queries • Development of a Prototype Implementation of PDI and Field Tests
Conclusion and Future Work Comments … • PDI is a very simple solution for porting P2P File Sharing to Ad-Hoc Mobile Networks • The paper contains comprehensive simulation results and analysis of the PDI mechanism • However, the authors did not suggest further modifications to the PDI mechanism based on the analyzed results • There is also no analytical comparison with any other similar implementation • PDI is yet to be challenged for improvement