
DCIM: Distributed Cache Invalidation Method Authors: Kassem Fawaz and Hassan Artail


Presentation Transcript


  1. DCIM: Distributed Cache Invalidation Method. Authors: Kassem Fawaz and Hassan Artail. Presented by Alex Buchholz, CS 6204, April 25th, 2013

  2. Presentation Overview • Paper overview • Main concepts/definitions • Related Works • DCIM Architecture and Operations • Analysis • Experimental Results

  3. DCIM Overview Distributed Cache Invalidation Method • Client-based cache consistency scheme • Pull based • Implements: • TTL • Piggybacking • Prefetching • Provides "near strong consistency" • "this is the first complete client side approach employing adaptive TTL and achieving superior availability, delay, and traffic performance"

  4. Cache Consistency • Strong: cached items are always identical to those at the server • Weak: client queries might return stale data items • Delta: data may be stale for at most delta units of time • Probabilistic: data is consistent with probability p • Probabilistic-delta: a data item is at most delta units of time stale with probability not less than p

  5. Three Main Cache Mechanisms • Push: the server informs clients of changes • Pull: the client asks the server whether data has changed • Hybrid: a combination of both

  6. Time to Live (TTL) • Scheme: TTL stored along with value • Data considered good until TTL has passed • Popular because: • Simplicity • Good performance • Flexibility to assign TTL values • Completely client based
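
To make the TTL scheme concrete, here is a minimal Python sketch (not from the paper; class and field names are assumptions) of a cache entry that is considered valid until its TTL expires:

```python
import time

class CacheEntry:
    """One cached value plus its TTL (illustrative names, not from the paper)."""
    def __init__(self, value, ttl_seconds):
        self.value = value
        self.ttl = ttl_seconds
        self.cached_at = time.time()  # when the item was stored

    def is_valid(self):
        # The item is considered good until its TTL has passed.
        return (time.time() - self.cached_at) < self.ttl
```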

  7. Piggybacking Two Ways: • Cache piggybacks a list of invalidated documents when communicating with the server • Server piggybacks a list of updated documents when it communicates with the cache.

  8. Prefetching • The client pulls data before it is needed • based on the items' request rates • Helps increase availability and reduce waiting time

  9. Pull Based Approaches • Client Polling • Time to Live

  10. Client Polling • Cache validation initiated on client’s schedule • To achieve strong consistency: • Validate each item before being served *** Low bandwidth, high latency ***

  11. Two TTL Approaches • Fixed: a single preset TTL value for all data items • Adaptive: the TTL adapts to observed update patterns, meeting higher consistency requirements with lower traffic

  12. Adaptive TTL Approaches Five approaches are discussed in the paper (the first two are sketched below): • TTL = factor * time difference between the query and the last update • TTL = factor * last update interval • TTL = time difference between the query time and the kth most recent update, divided by a factor; the server replies to the cache with the k most recent update times • Consider the complete update history and predict future updates • TTL derived from a probability describing the staleness of the cache
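
As a hedged illustration of the first two adaptive TTL variants above, this small Python sketch computes a TTL from the time since the last update and from the last interupdate interval; the factor values are placeholders, not values from the paper:

```python
def ttl_from_time_since_update(query_time, last_update_time, factor=0.2):
    # Variant 1: TTL = factor * (time difference between the query and the last update)
    return factor * (query_time - last_update_time)

def ttl_from_last_update_interval(last_update_interval, factor=0.5):
    # Variant 2: TTL = factor * (last observed interupdate interval)
    return factor * last_update_interval
```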

  13. COACS • DCIM builds on top of COACS • A cooperative caching architecture • Two types of nodes • Caching nodes (CNs): cache previously requested items • Query directories (QDs): index queries along with the addresses of the caching nodes that hold the answers • Either can act as a requesting node (RN): a node that requests data items

  14. COACS Operation • When an RN wants data, it scans the QDs sequentially, from nearest to farthest, for the same query (there are far fewer QDs than CNs) • When the query is found at a QD, that QD forwards the request to the CN that holds the data • If all QDs are traversed and the query is not found (cache miss), the last QD requests the item from the server • The RN then caches the item and becomes a CN for it (see the sketch below)
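
The request flow above can be summarized with the following sketch; the helper names (lookup, serve, fetch, cache) are assumptions, and the actual protocol uses message exchanges rather than direct calls:

```python
def resolve_query(query, qds_nearest_first, server, requesting_node):
    """Illustrative COACS lookup: scan QDs from nearest to farthest."""
    for qd in qds_nearest_first:
        cn = qd.lookup(query)            # does this QD index the query?
        if cn is not None:
            return cn.serve(query)       # forward to the CN holding the item
    # Cache miss: the last QD asks the server; the RN then caches the
    # item and becomes a CN for it.
    data = server.fetch(query)
    requesting_node.cache(query, data)
    return data
```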

  15. Design Methodology • Pull-based: CNs monitor the TTLs of their cached items • Scalable: any node can become a CN if the item it requested has not been cached • Adaptive TTL values: each CN estimates the interupdate interval and uses it to set the TTL GOAL: improve the efficiency of cache updating in MANETs without requiring mobile devices to maintain cache state information

  16. Piggybacking in DCIM • The CN polls the server regularly to learn the update times of the items it has cached • It piggybacks requests to refresh cached items each time it contacts the server • To avoid unnecessary packets, a two-phase approach is used: • After each polling interval (Tpoll), CNs revalidate items with expired TTLs that have high update rates • After a configurable number of polling intervals (Npoll), the CN revalidates all cached items if at least one item has passed its TTL: Tpigg = Npoll x Tpoll • This allows for delta consistency when at least one item has an expired TTL • The CN ensures that data items are at most one piggybacking interval stale

  17. DCIM Basic System Overview Two CNs send cache validation requests to the server (dotted arrows) via gateway nodes and through the access point. The server replies with lists of valid and changed data items (short-dashed arrows) to the CNs, which in turn asynchronously update the corresponding QDs about the items they cache (long-dashed arrows).

  18. Basic Data Request

  19. TTL in DCIM • CNs gain insight into the update patterns of their data items: the server stores the last update time of each item, and the CN uses this information to predict the next update time • HOWEVER, CNs are resource constrained (power, processing, storage) • DCIM therefore uses a running average to estimate the interupdate interval at the server • The CN only needs to store the estimated interval and the last update time • Advantages: simplicity and ease of computation, a minimum amount of stored data, and diminishing weights assigned to older data. The estimate (sketched below) is: IUI(t) = (1 - α) * IUI(t-1) + α * LUI, where IUI(t) is the estimated interupdate interval at time t, LUI is the last observed interupdate interval, and α is a value between 0.1 and 0.2 (0.125 in this paper)
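
The running-average estimate follows directly from the formula above; this small Python sketch (the function name is an assumption) shows the update step:

```python
ALPHA = 0.125  # weight between 0.1 and 0.2; 0.125 is the value used in the paper

def update_interupdate_estimate(iui_prev, last_update_interval, alpha=ALPHA):
    """IUI(t) = (1 - alpha) * IUI(t-1) + alpha * LUI
    Older intervals receive diminishing weights; only the previous estimate
    and the last update time need to be stored by the CN."""
    return (1 - alpha) * iui_prev + alpha * last_update_interval

# Example: previous estimate 400 s, newly observed interval 600 s -> new estimate 425 s
new_iui = update_interupdate_estimate(400.0, 600.0)
```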

  20. Server Operations • The server only reacts to received CURP messages, so it does not need to be aware of MANET dynamics • Therefore, DCIM could be deployed in an Internet environment using HTTP header fields • Message types: CURP (Cache Update Request), SVRP (Server Validation Reply), SUDP (Server Update Data) • [Figure: decision flow at the server; a sketch of the idea follows below]
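
The figure's decision flow is not reproduced here, but the basic server reaction to a CURP can be sketched as follows; the data structures and field names are assumptions:

```python
def handle_curp(curp_items, server_store):
    """For each queried item: if it changed since the CN's copy, return the fresh
    data in the SUDP; otherwise confirm its validity in the SVRP."""
    svrp_valid_ids = []   # items the CN may keep serving
    sudp_updates = []     # items whose new data must be sent back
    for item_id, cn_last_update_time in curp_items:
        record = server_store[item_id]
        if record.last_update_time > cn_last_update_time:
            sudp_updates.append((item_id, record.data, record.last_update_time))
        else:
            svrp_valid_ids.append(item_id)
    return svrp_valid_ids, sudp_updates
```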

  21. QD Operations (COACS) • QDs are elected based on resource capability • Their number is bounded by two limits: • Lower bound: enough QDs that adding another would not reduce the workload enough to be worth it • Upper bound: corresponds to a delay threshold • The number can change dynamically (between 7 and 100 in the simulations) • QDs typically experience about 1.5x the load of a CN

  22. CN Processing The process that runs on the CN includes two threads: • A monitoring thread • A processing thread [Figures: elements of the General Cache Information Table and of the Query-Specific Cache Information Table]

  23. CN Monitoring Thread • Checks for expired data items • Issues validation requests • Issues requests for data items It performs these tasks in two functions: an inner loop function and an outer loop function

  24. Inner Loop Function and Outer Loop Function [flowchart figures]

  25. Outer Loop Function [flowchart: checks whether Npoll polling intervals have passed, then invokes the inner loop function; see the sketch below]
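
Since the two flowcharts are not reproduced here, the following sketch gives one possible reading of the monitoring thread's loops; the structure and helper names (send_curp, ttl_expired, high_update_rate) are assumptions:

```python
def inner_loop(cache, send_curp):
    """Runs every Tpoll: piggyback validation requests for expired,
    frequently updated items into a single CURP."""
    to_validate = [i for i in cache if i.ttl_expired() and i.high_update_rate()]
    if to_validate:
        send_curp(to_validate)

def outer_loop(cache, state, n_poll, send_curp):
    """Also runs every Tpoll: after Npoll intervals (Tpigg = Npoll * Tpoll),
    revalidate everything if at least one item's TTL has expired."""
    state.polls += 1
    if state.polls >= n_poll:
        state.polls = 0
        if any(i.ttl_expired() for i in cache):
            send_curp(list(cache))   # full revalidation, piggybacked
            return
    inner_loop(cache, send_curp)
```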

  26. CN Processing Thread • Processes data requests (DRPs) from RNs • Processes SUDP and SVRP messages in response to CURP messages • Also computes TTL values

  27. Processing Data Requests • The CN checks the item requested in the DRP • If it is INVALID, the CN issues an update request directly to the server, changes the item's state to TEMP_INVALID, and places the query in a waiting list • Otherwise, the query is answered by sending the requested item to the RN via a DREP message (see the sketch below)
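
A compact sketch of this handling; the state names follow the slide, while helper names such as send_curp and send_drep are assumptions:

```python
def process_drp(cache, item_id, rn_addr, send_curp, send_drep, waiting_list):
    """Handle a data request (DRP) arriving from an RN."""
    entry = cache[item_id]
    if entry.state == "INVALID":
        entry.state = "TEMP_INVALID"          # awaiting the server's reply
        waiting_list.append((item_id, rn_addr))
        send_curp([item_id])                  # ask the server directly for an update
    else:
        send_drep(rn_addr, entry.value)       # answer from the cache via a DREP
```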

  28. Processing SVRP and SUDP • SUDP: if received, it must be for an item that has changed at the server • The CN calculates the TTL; if the SUDP references items that have requests in the waiting list, those items are sent to the corresponding requesting nodes • SVRP: sent from the server in response to a CURP packet • It may omit items that were specified in the CURP because the actual updated data items were sent to the CNs as part of the SUDP message (SVRP: Server Validation Reply, SUDP: Server Update Data; a sketch follows below)
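
The corresponding reply handling, again as an assumption-laden sketch (compute_ttl, send_drep, and mark_valid are hypothetical helpers):

```python
def on_sudp(cache, updates, waiting_list, send_drep, compute_ttl):
    """SUDP: items that changed at the server; store them, recompute TTLs,
    and answer any requests waiting on these items."""
    for item_id, data, server_update_time in updates:
        entry = cache[item_id]
        entry.value = data
        entry.ttl = compute_ttl(entry, server_update_time)
        still_waiting = []
        for wid, rn_addr in waiting_list:
            if wid == item_id:
                send_drep(rn_addr, data)       # serve the waiting requester
            else:
                still_waiting.append((wid, rn_addr))
        waiting_list[:] = still_waiting

def on_svrp(cache, valid_ids):
    """SVRP: items from the CURP that are still valid (changed ones arrive via SUDP)."""
    for item_id in valid_ids:
        cache[item_id].mark_valid()
```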

  29. TTL Calculation • The exact TTL calculation depends on whether the item had expired at the server or not • [Figure: average TTL versus the inverse of the update rate (s)] • The figure shows that at very low update rates (less than 1 update per 1,000 s), the estimated TTL does not adapt well • In practice, however, time extends beyond the 2,000 s simulated here, meaning more item updates would occur at the server over a longer interval

  30. Analysis Overview • DCIM analyzed to assess: • Bandwidth gain • Query response time gain • Compared to two consistency schemes: • Poll-every-time (PET) • Push-based stateful (PBS)

  31. Analysis Defined • Bandwidth gain: the difference in traffic between DCIM and PET/PBS • Query response time gain: the difference between the times it takes to answer queries • Data requests and updates are modeled as homogeneous Poisson processes (sketched below): • λR = request rate, λU = item update rate
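
For intuition about the homogeneous Poisson model, the sketch below generates request (or update) times with exponentially distributed inter-arrival gaps of mean 1/λ; the rate and horizon shown are just examples:

```python
import random

def poisson_arrival_times(rate, horizon):
    """Homogeneous Poisson process: inter-arrival times are exponential
    with mean 1 / rate."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)

# Example: requests at lambda_R = 1/500 per second over 10,000 s
request_times = poisson_arrival_times(1 / 500, 10_000)
```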

  32. Response Time Gain • Equations (2) and (3) in the paper derive the response time gain of DCIM over PET and PBS • Parameter values: HD = 5, HR = 5.21, HC = 5.21, Tin = 5 ms, Tout = 70 ms, λU = 1/500

  33. Bandwidth Gain • D is the disconnection interval, N is the number of cached items, E[Ndisc] is the CN disconnection rate • Parameter values: λR = λU = 1/500, N = 4,000, M = 20, SR = 0.5 KB, SD = 10 KB, SU = 0.15 KB, q = 20, E[Ndisc] = 15, and D = 20 s

  34. Analysis Results • The response time gain mainly depends on the update rate • In DCIM the majority of requests are answered from within the MANET, which is why the response time difference is less than 10 ms • The traffic resulting from large piggybacking intervals is lower than that of small piggybacking intervals • The traffic demands of DCIM decrease exponentially for small polling intervals • DCIM is less costly than PBS • *** The polling interval is the main parameter that affects performance *** The optimal polling interval is the value that results in the lowest traffic consumption, which is equivalent to the highest bandwidth gain since poll-every-time does not depend on the polling interval

  35. Experimental Results Overview • Implemented using ns2 • Two additional schemes were implemented for comparison: • Poll every time: validate each item each time it is requested • Fixed TTL: every item has the same TTL value • Three versions of DCIM were also implemented: • one with piggybacking disabled • one with "prefetch" always set • one that only implements the update rate adaptation mechanism (items are validated when they expire) • Setup: 400 x 400 m² area, 100 nodes, 2 Mbps node bitrate, 2 m/s node mobility, etc.; the rest is shown on the next slide

  36. Experimental Results Overview [table of simulation parameters]

  37. Experimental Results Overview • The reported results come from experiments that vary: • request rate • update rate • item popularity • maximum velocity • polling interval • data item size • The reported metrics are: • consistency ratio • query delay • cached-data query delay • uplink traffic • downlink traffic • average overhead traffic

  38. Varying Request Rate • The inter-request time is varied between 5 s and 120 s • Piggybacking -> an appropriate TTL value -> a high consistency ratio • Prefetching -> a high hit ratio -> lower delays • The query delay gets smaller after an item is cached; it increases by a small margin due to less prefetching • DCIM consumes more traffic on the server side because of prefetching • As for node traffic, by piggybacking a large number of items DCIM consumes more traffic than the other approaches • However, as the request rate decreases, prefetching happens less often, leading to lower traffic, as shown in the graph; this is how DCIM adapts prefetching to the request rate of items

  39. Varying Update Rate • A TTL below 100 s is less than the interupdate intervals in all of the simulated scenarios and therefore provides the best consistency level • Fixed-TTL approaches have higher hit rates than poll-every-time, but lower than DCIM • The delay after caching for DCIM is below that of poll-every-time and Fixed-100, but it may exceed that of Fixed-500, which keeps elements longer • Traffic is not high at the server and is very low in the MANET • less than 10 kbps, while the available bandwidth is 2 Mbps • Piggybacking of requests causes a traffic increase • which happens more frequently as update rates increase • Without this traffic, however, the CNs would not be able to infer the update rate and calculate reliable TTL estimates

  40. Varying Zipf (Popularity) • Effectively varies the popularity of the data items; analogous to varying the items' request rates • As the Zipf parameter increases, the diversity of requested items decreases: a smaller subset of the items is requested more often • The TTL is maintained for all items regardless of their request rates -> constant consistency at 98 percent • A higher Zipf parameter -> a smaller set of requests -> a higher probability of hits • Through prefetching -> a roughly constant hit rate

  41. Varying Node Velocity • Varied between 0 m/s and 20 m/s • No notable effect on performance • Only a mild increase in the delay

  42. Varying Polling Interval • The polling interval is increased from 1 to 50 s • There is a mild decrease in the consistency ratio and hit ratio, and an increase in the delay • The piggybacking interval increases due to the decrease in the hit rate • Increasing the polling interval makes the validation requests from the inner loop function farther apart in time • With a large piggybacking interval, the CN predicts that more items will be requested before the end of the interval • this leads to more prefetching and consequently more traffic

  43. Varying Data Item Size • Varied between 5 KB and 1,500 KB • Even at large data sizes, DCIM's traffic demand remains below the network bandwidth • Traffic per node increases linearly with the data item size

  44. Effectiveness of DCIM • Better consistency than its variants • Piggybacking adds value by providing a better estimate of the TTL, and consequently higher data consistency • Piggybacking also induces more overhead traffic (validation requests and prefetching) • Prefetching highly requested items saves traffic and provides acceptable query delay • In short: piggybacking increases data consistency, while prefetching adapted to the request rate controls overhead traffic consumption

  45. Comparison between DCIM and Other Approaches

  46. Conclusions • DCIM is a client-based cache consistency scheme for MANETs • It relies on estimating the interupdate intervals of data items to set their expiry times • It uses piggybacking and prefetching to • increase the accuracy of the estimation • reduce both traffic and query delays • DCIM was compared to fixed TTL and client polling, and to two server-based approaches, SSUM and UIR • DCIM provides better overall performance than the other client-based schemes and comparable performance to SSUM
