This paper explores the tradeoffs between data accessibility and query delay in ad hoc networks, proposing data replication schemes that aim to strike a balance. The authors present heuristics to improve performance and address the challenges of link failures and limited resources.
Balancing the Tradeoffs between Data Accessibility and Query Delay in Ad Hoc Networks
Lianzhong Yin and Guohong Cao
Software Engineering: 강동희, 이동섭, 유수연, 전창오
Abstract
■ Mobile ad hoc networks
- Nodes move freely
- Link/node failures are common
- Failures degrade the performance of data access
■ Reducing the query delay
■ Improving the data accessibility
■ Balancing the tradeoffs between data accessibility and query delay
Introduction
■ Mobile Internet
- Portable computers and wireless networks are becoming widely available
■ Ad hoc network
- Mobile users may want to communicate with each other in situations such as:
- Emergency rescue workers after an earthquake
- A group of soldiers
In ad hoc networks
■ Disconnections may occur frequently
- Low data accessibility
■ Data replication
- Improves data accessibility
- Reduces the query delay
In ad hoc networks
■ Limited resources
- Mobile nodes need to cooperate with each other
- There is a tradeoff between query delay and data accessibility
■ Proposed data replication schemes
- Balance the tradeoffs between data accessibility and query delay
Related works
■ Data replication in the Web environment
- Links and nodes are stable on the Web
■ Data replication in distributed database systems
- Nodes are more reliable and less likely to fail than those in ad hoc networks
■ Data replication in wireless networks
- Not multi-hop ad hoc networks
Related works
■ Hara's data replication schemes (related to two previous papers)
- Link failure and query delay were not considered
■ Caching used to improve data accessibility and query delay
- Caching schemes are passive approaches (ours are proactive)
Contribution
■ Greedy schemes (vs. SAF?)
- Local data (cf. Greedy-S)
■ OTOO (One-To-One Optimization) scheme (vs. DAFN?)
- Cooperates with at most one neighbor
■ RN (Reliable Neighbor) scheme (vs. DCG?)
- Increasing degree of cooperation
Preliminaries
■ System model
- m: the total number of mobile nodes (N1, N2, ..., Nm)
- Ni: mobile node i
- n: the total number of data items in the database
- di: data item i
- si: the size of di
- C: the memory size of each mobile node for hosting data replicas
- fij: the link failure probability between nodes Ni and Nj (fij = fji, assuming symmetric link conditions)
- aij: the access frequency of node Ni to dj
■ Each mobile node can host only C, with C < n (limited memory size)
■ Data accessibility = (the number of successful data accesses) / (the total number of data accesses)
Preliminaries
■ Problem analysis
- The data replication problem we study is extremely hard in terms of computational complexity.
- Even for a simplified version, the problem is NP-hard to approximate.
- We present heuristics that provide satisfactory performance with very small computation overhead.
■ NP-hard
- In computational complexity theory, a class of problems that are, informally, "at least as hard as the hardest problems in NP" ......
■ Heuristic
- Refers to experience-based techniques for problem solving, learning, and discovery ......
The Proposed Data Replication Schemes
■ An example
- Only two nodes, N1 and N2
- Same-size data items d1, d2, d3, d4
- Each node only has enough space to host two data items
- Allocation according to the DAFN scheme [Figure: Step 1, Step 2]
■ DAFN is good: duplicated data are removed, so memory is used effectively
The Proposed Data Replication Schemes
- However, DAFN does not consider the link failure probability.
- When the link failure probability is high, data accessibility decreases.
- We consider the link stability between mobile nodes and the query delay.
- Because of the complexity of the problem, we next present the heuristics used in our solution.
[Figure: DAFN vs. our allocation; label 0.25]
The Proposed Data Replication Schemes
■ Mobile nodes have limited memory space. Therefore, it is important for mobile nodes to contribute part of their memory to hold data for other nodes. This is a form of cooperation between mobile nodes.
■ Bad cooperation may actually reduce performance, as shown in the example above.
■ If links to other nodes are stable: more cooperation
■ If links to other nodes are not very stable: host more of the node's own data of interest locally
The Proposed Data Replication Schemes • ■ Greedy • ■ Zipf Law
The Proposed Data Replication Schemes
■ Greedy Schemes - Overview
- No cooperation with neighboring nodes
- Naïve Greedy: allocate the most frequently accessed data items until memory is full, without considering differences in data size
- Greedy-S: each data item dk has its own size sk; allocate in descending order of AFi(k) until memory is full
- AFi(k) = aik / sk
- AFi(k): access frequency of Ni to data item dk, normalized by size
- aik: access frequency of Ni to data item dk
- sk: size of data item dk
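The Greedy-S rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `greedy_s` and the dictionary-based inputs are hypothetical, and the sketch assumes that an item too large for the remaining space is skipped so that smaller items further down the ranking can still be hosted.

```python
from typing import Dict, List


def greedy_s(access_freq: Dict[int, float],
             size: Dict[int, float],
             capacity: float) -> List[int]:
    """Choose the data items one node hosts, ranked by AF(k) = a_k / s_k.

    access_freq[k] is the node's access frequency to item k (a_ik),
    size[k] is the item size (s_k), capacity is the memory size C.
    """
    # Rank items by size-normalized access frequency, highest first.
    ranked = sorted(access_freq,
                    key=lambda k: access_freq[k] / size[k],
                    reverse=True)
    hosted: List[int] = []
    used = 0.0
    for k in ranked:
        # Assumption: skip an item that does not fit and keep scanning.
        if used + size[k] <= capacity:
            hosted.append(k)
            used += size[k]
    return hosted


# Item 1 is the most popular but four times larger than the others,
# so with C = 2 the node hosts items 2 and 3 instead.
freq = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
sizes = {1: 4.0, 2: 1.0, 3: 1.0, 4: 1.0}
print(greedy_s(freq, sizes, capacity=2.0))
```

Setting every size to 1 turns the same function into the naïve Greedy scheme, since the ranking then depends on access frequency alone.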
The Proposed Data Replication Schemes
■ Greedy Schemes - Performance Analysis (1)
<Assumptions and definitions>
- For simplicity, all data items are assumed to have the same size in the analysis (sk = 1).
- Because of the computational complexity, we give an upper bound on the data accessibility by using a super-optimal algorithm (possibly better than optimal, and not feasible).
- Ni may have multiple one-hop neighbors.
- fNi: the probability that all links between Ni and its neighbors fail
- Ni hosts its C most frequently accessed data items.
- Sc: the set of data items that Ni hosts in its local memory (its most frequently accessed items)
The Proposed Data Replication Schemes
■ Greedy Schemes - Performance Analysis (2)
- Because accessing local data always succeeds, the data accessibility is at least the sum of the access frequencies of the local data items.
- Data accessibility of the greedy scheme
- Super-optimal solution for Ni: allocate the other data so that they are all accessible from Ni's neighbors (impossible in practice)
- Therefore,
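The slide's own equations did not survive in this transcript. The following is one reading of the argument stated in words above, not necessarily the authors' exact formulas. It assumes unit sizes (sk = 1) and access frequencies normalized so that they sum to 1 for each node; the symbols A_i^greedy and A_i^super are introduced here only for illustration.

```latex
% Assumptions: s_k = 1 and \sum_{k=1}^{n} a_{ik} = 1.
% A_i^{greedy}, A_i^{super}: data accessibility of N_i under the greedy
% and super-optimal allocations (hypothetical notation).
\begin{align*}
A_i^{\text{greedy}} &\ge \sum_{d_k \in S_c} a_{ik}, \\
A_i^{\text{super}} &= \sum_{d_k \in S_c} a_{ik}
    + (1 - f_{N_i}) \sum_{d_k \notin S_c} a_{ik}, \\
\frac{A_i^{\text{greedy}}}{A_i^{\text{super}}} &\ge
    \frac{\sum_{d_k \in S_c} a_{ik}}
         {\sum_{d_k \in S_c} a_{ik}
          + (1 - f_{N_i}) \sum_{d_k \notin S_c} a_{ik}}.
\end{align*}
```

The first line restates that local accesses always succeed. The second says that, in the super-optimal case, a non-local item is reachable whenever at least one link to a neighbor is up, which happens with probability 1 - fNi. The third line divides the two.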
The Proposed Data Replication Schemes
■ Greedy Schemes - Numeric Result
1. The greedy schemes perform relatively well even compared with the super-optimal scheme, which is not feasible.
2. A larger Zipf parameter θ means data accesses focus more on hot data (a more skewed access pattern); the greedy scheme then performs better because more hot data are served by local copies.
3. Drawback: cooperation between neighboring nodes is not considered, which limits performance.
The Proposed Data Replication Schemes
■ OTOO (One-To-One Optimization) Scheme
- Each node cooperates with at most one neighbor
- CAF1ij(k) = (aik + ajk * (1 - fij)) / sk
- CAF1ij(k): Combined Access Frequency value of Ni and Nj to data item dk at Ni (Ni and Nj are neighboring nodes)
- Allocate in descending order of CAF1 value until memory is full.
- The CAF1 value reflects three considerations:
1) the term ajk * (1 - fij) considers the access frequency from a neighboring node (data accessibility ↑)
2) the divisor sk considers the data size
3) the unweighted term aik gives the access frequency from the node itself a high priority (data accessibility ↑, query delay ↓)
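A small Python sketch of the CAF1 ranking follows. The formula is the one on the slide; the function names `caf1` and `allocate_by_score`, the dictionary inputs, and the choice to skip items that do not fit are assumptions made for illustration.

```python
from typing import Dict, List


def caf1(a_i: Dict[int, float], a_j: Dict[int, float],
         f_ij: float, size: Dict[int, float]) -> Dict[int, float]:
    """CAF1_ij(k) = (a_ik + a_jk * (1 - f_ij)) / s_k for every item k."""
    return {k: (a_i[k] + a_j[k] * (1.0 - f_ij)) / size[k] for k in size}


def allocate_by_score(score: Dict[int, float], size: Dict[int, float],
                      capacity: float) -> List[int]:
    """Host items in descending score order until memory is full."""
    hosted: List[int] = []
    used = 0.0
    for k in sorted(score, key=lambda item: score[item], reverse=True):
        # Assumption: an item that does not fit is skipped.
        if used + size[k] <= capacity:
            hosted.append(k)
            used += size[k]
    return hosted


a_i = {1: 0.5, 2: 0.3, 3: 0.2}   # access frequencies of N_i
a_j = {1: 0.2, 2: 0.2, 3: 0.6}   # access frequencies of neighbor N_j
sizes = {1: 1.0, 2: 1.0, 3: 1.0}

# Fairly stable link: N_j's interest in item 3 lifts it above item 2.
stable = allocate_by_score(caf1(a_i, a_j, 0.5, sizes), sizes, 2.0)
# Link that always fails: the ranking falls back to N_i's own interests.
broken = allocate_by_score(caf1(a_i, a_j, 1.0, sizes), sizes, 2.0)
print(stable, broken)
```

The second call shows why the neighbor term is weighted by (1 - fij): as the link becomes unreliable, CAF1 degrades smoothly toward the purely local Greedy-S ranking.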
The Proposed Data Replication Schemes
■ OTOO (One-To-One Optimization) Scheme
[Figure: example network with nodes M1 to M7]
The OTOO scheme works as follows:
0. All nodes are initially marked "white" (no allocation process yet).
1. Broadcasting: each node broadcasts its node id and its access frequency for each data item.
2. Invitation, calculation, and allocation: a node sends an invitation to its most stable neighboring node (the neighbor with the lowest fij), the CAF1 values are calculated, and the data are allocated. Both nodes are then marked "black" and no longer participate in the allocation.
3. When two or more nodes run the process at the same time (M2, M3, and M5): a node that receives more than one invitation accepts the invitation from the node with the lowest id (M2, M3 -> M4).
4. A node with no more white neighbors allocates its own most interested data items (M3).
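The pairing logic in the steps above can be approximated with a sequential sketch. The real protocol is distributed and message-based; this simplified version instead visits nodes in ascending id order, which is an assumption that stands in for the lowest-id tie-break. The function name `otoo_pairs` and the link dictionary are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def otoo_pairs(node_ids: Iterable[int],
               link_fail: Dict[Tuple[int, int], float]
               ) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Pair each node with at most one neighbor.

    link_fail[(i, j)] is f_ij for a pair of one-hop neighbors; links
    are treated as symmetric. Returns (pairs, alone): pairs cooperate
    using CAF1, nodes in alone fall back to their own interests.
    """
    ids = sorted(node_ids)
    neighbors: Dict[int, Dict[int, float]] = {i: {} for i in ids}
    for (i, j), f in link_fail.items():
        neighbors[i][j] = f
        neighbors[j][i] = f

    white = set(ids)              # nodes that have not allocated yet
    pairs: List[Tuple[int, int]] = []
    alone: List[int] = []
    for i in ids:                 # assumption: lowest id acts first
        if i not in white:
            continue
        white.discard(i)
        # Candidate partners: white neighbors, most stable link first.
        candidates = [(f, j) for j, f in neighbors[i].items() if j in white]
        if candidates:
            _, j = min(candidates)
            white.discard(j)      # both nodes become "black"
            pairs.append((i, j))
        else:
            alone.append(i)       # no white neighbor left
    return pairs, alone


links = {(1, 2): 0.5, (1, 3): 0.1}
print(otoo_pairs([1, 2, 3], links))
```

In this example node 1 invites node 3 because that link has the lower failure probability, and node 2 is left without a white neighbor, so it allocates its own most interesting items.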
The Proposed Data Replication Schemes
■ RN (Reliable Neighbor) Scheme
- Increasing the degree of cooperation: contribute more memory to replicate data for reliable neighbors.
- Reliable neighbors: for Ni, if 1 - fij > Tr, then Nj is a reliable neighbor. Let nb(i) be the set of Ni's reliable neighbors.
- Total contributed memory size of Ni, Cc(i): if links are stable, Cc(i) is larger (as 1 - fij ↑); if they are not stable, Cc(i) is smaller.
- α is a system tuning factor: α ↓ means Cc(i) ↑, i.e., more cooperation with neighbors (RN2 > RN8 > RN16)
- [C - Cc(i)] Ni first allocates its most interested data, up to C - Cc(i) of memory space.
- [Cc(i)] Ni then allocates the rest of the data in descending order of the CAF2 value of Ni to dk.
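The two-phase RN allocation can be sketched as follows. The transcript does not preserve the formulas for Cc(i) or CAF2, so this sketch does not guess them: the contributed memory `contributed` and the cooperative ranking `coop_score` are passed in as plain inputs, and all function names are hypothetical. Ranking the node's own interest by aik / sk in phase 1 is also an assumption.

```python
from typing import Dict, List


def reliable_neighbors(link_fail: Dict[int, float], t_r: float) -> List[int]:
    """nb(i): neighbors N_j of N_i with 1 - f_ij > Tr."""
    return sorted(j for j, f in link_fail.items() if 1.0 - f > t_r)


def _fill(ranked: List[int], size: Dict[int, float],
          budget: float, hosted: List[int]) -> None:
    """Append items from ranked to hosted while total size stays <= budget."""
    used = sum(size[k] for k in hosted)
    for k in ranked:
        if k not in hosted and used + size[k] <= budget:
            hosted.append(k)
            used += size[k]


def rn_allocate(own_freq: Dict[int, float], coop_score: Dict[int, float],
                size: Dict[int, float], capacity: float,
                contributed: float) -> List[int]:
    """Two-phase allocation for one node.

    Phase 1 fills C - Cc(i) with the node's own most interesting items.
    Phase 2 fills the rest by the cooperative score (a stand-in for
    CAF2, whose definition is not in this transcript).
    """
    hosted: List[int] = []
    own_rank = sorted(size, key=lambda k: own_freq[k] / size[k], reverse=True)
    _fill(own_rank, size, capacity - contributed, hosted)
    coop_rank = sorted(size, key=lambda k: coop_score[k], reverse=True)
    _fill(coop_rank, size, capacity, hosted)
    return hosted


print(reliable_neighbors({2: 0.1, 3: 0.5, 4: 0.3}, t_r=0.6))
own = {1: 0.5, 2: 0.3, 3: 0.1, 4: 0.1}
coop = {1: 0.9, 2: 0.2, 3: 0.8, 4: 0.4}
unit = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
print(rn_allocate(own, coop, unit, capacity=3.0, contributed=1.0))
```

With `contributed=0.0` the node behaves like a greedy node, and raising it shifts memory toward the neighbors' interests, which mirrors the role the slide gives to α.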
Simulation experiments
■ Simulation model
- m nodes are placed randomly in a 1500 m × 1500 m area.
- The radio range is set to D.
- Nodes can communicate with each other.
- Links may fail.
- The number of data items n is set equal to the number of nodes m.
- The original host of data item di is Ni.
- δ values range from 0.6 to 1.4.
- Each node has a memory size of C.
■ Access patterns
- Different access pattern
1) All nodes follow the Zipf-like access pattern.
2) Different nodes have different hot data.
3) An offset value is selected randomly for each node Ni: offset i is between 1 and n-1.
- Same access pattern
1) All nodes have the same access pattern.
2) All nodes have the same access probability to the same data item.
■ Performance metrics
- Data accessibility
- Query delay
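The node placement in the simulation model can be sketched in Python. This is an illustration under stated assumptions, not the authors' simulator: the function names are hypothetical, two nodes are assumed to be one-hop neighbors exactly when their distance is at most D, and link failure probabilities are not modeled here.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def place_nodes(m: int, side: float = 1500.0,
                seed: Optional[int] = None) -> List[Point]:
    """Place m nodes uniformly at random in a side x side square (meters)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side))
            for _ in range(m)]


def one_hop_links(positions: List[Point],
                  radio_range: float) -> List[Tuple[int, int]]:
    """Index pairs (i, j), i < j, of nodes within radio range D.

    Assumption: a link exists exactly when the Euclidean distance
    between the two nodes is at most radio_range.
    """
    links = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= radio_range:
                links.append((i, j))
    return links


nodes = place_nodes(40, seed=1)
print(len(one_hop_links(nodes, radio_range=250.0)))
```

Sweeping `radio_range` in such a sketch shows the connectivity effect discussed in the radio range experiment: a larger D yields more one-hop links.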
Simulation experiments
■ Fine-tuning the RN scheme - same access pattern
- Threshold value Tr (4.3.3 The Reliable Neighbor (RN) Scheme)
- RN2 > RN8 > RN16
- Tr has the largest effect on the performance of RN2, because RN2 contributes the largest portion of its memory to neighbors.
- Tr = 0.6 achieves a balance between data accessibility and query delay.
[Figures: curves for RN2, RN8, RN16]
Simulation experiments
■ Effects of the Zipf parameter (θ) - different access pattern
- As θ increases, more accesses focus on hot data items, and the data accessibility is expected to increase.
- The proposed schemes outperform the DAFN scheme in terms of data accessibility in almost all cases.
- Proposed schemes: 1) consider the link failure probability when replicating data; 2) avoid replicating data items that are not frequently accessed, by using the CAF value.
- DAFN scheme: 1) does not consider the link failure probability; 2) sometimes replicates data items with low access frequency instead of frequently accessed data items.
Simulation experiments
■ Effects of the Zipf parameter (θ) - different access pattern (continued)
- The DAFN scheme tries to avoid duplicated items among neighboring nodes, so even if a data item is popular at two neighboring nodes, it is allocated at only one of them.
- RN2 > RN8 = RN16 > OTOO (OTOO is best)
- When nodes have different interests, it is better for them to host the data they are interested in.
- Cooperation does not have advantages.
Simulation experiments
■ Effects of the Zipf parameter (θ) - same access pattern
- Greedy-S performs better than Greedy: it gives higher priority to data items with smaller size, so more important data can be replicated.
- Data accessibility: RN2 > RN8 > RN16 > OTOO (RN2 performs the best)
- Query delay: RN2 > RN8 > RN16 > OTOO (OTOO performs the best)
- A higher degree of cooperation improves the data accessibility, but it also increases the query delay.
[Figure labels: RN2, RN8, RN16, OTOO, Greedy-S, Greedy, DAFN]
Simulation experiments
■ Effects of radio range (D) - same access pattern
- When the radio range increases, the network is better connected and the accessibility is expected to increase.
- Data accessibility: 1) Data accessibility increases as the radio range increases. 2) When the radio range is very large, the different schemes have similar data accessibility.
- Query delay: 1) Query delay increases as the radio range increases. 2) Because the network is better connected, some data that were previously unavailable can now be found at faraway nodes.
- Total traffic: 1) The Greedy and Greedy-S schemes generate the lowest replication traffic (near zero; they do not cooperate). 2) DAFN tries to remove duplicated data items in neighboring nodes and generates the highest traffic. 3) RN2 > RN8 > RN16 (RN2 contributes a large amount of memory space to neighboring nodes).
Simulation experiments
■ Effects of the error factor of link failure estimation (δ)
- DAFN, Greedy, and Greedy-S are not affected by δ, as they do not depend on the estimation of link failure probability.
- For RN2, RN8, RN16, and OTOO, the effect is not very significant even when the error is very large: the proposed schemes are robust and not sensitive to estimation errors.
Conclusion
■ Three proposed methods
- Greedy schemes (cf. Greedy-S): local data
- OTOO (One-To-One Optimization) scheme: cooperates with at most one neighboring node
- RN (Reliable Neighbor) scheme: cooperates with more neighboring nodes and contributes more memory for their data
■ Link failure is considered, and the schemes try to balance data accessibility and query delay
■ Our proposed schemes can provide high data accessibility and achieve a balance between data accessibility and query delay
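Appendix #01. Zipf-like Distribution
- $P_{a_k} = \dfrac{1/k^{\theta}}{\sum_{i=1}^{n} 1/i^{\theta}}$: access probability of the kth data item (1 <= k <= n) in the Zipf-like distribution
- When n = 100: θ = 1 gives y ≈ 0.2/x; θ = 1/2 gives y ≈ 0.05/√x; θ = 0 gives y = 0.01
- A larger θ means more accesses focus on the hot data, i.e., the data access pattern is more skewed
[Figure: access probability vs. k for θ = 1, 0 < θ < 1, and θ = 0; the hot data are at small k]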
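The n = 100 values quoted above can be checked numerically. The sketch below assumes the usual Zipf-like form, in which the kth item's access probability is proportional to 1/k^θ and normalized over all n items; the function name `zipf_like` is hypothetical.

```python
from typing import List


def zipf_like(n: int, theta: float) -> List[float]:
    """Access probabilities for items k = 1..n, proportional to 1 / k**theta."""
    weights = [1.0 / (k ** theta) for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


for theta in (1.0, 0.5, 0.0):
    probs = zipf_like(100, theta)
    # Probability of the hottest item (k = 1) and the share of the top 10.
    print(theta, round(probs[0], 4), round(sum(probs[:10]), 4))
```

The probability of the hottest item comes out near 0.19 for θ = 1, near 0.054 for θ = 1/2, and exactly 0.01 for θ = 0, which matches the rounded coefficients 0.2, 0.05, and 0.01 on the slide.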