RFID Data Aggregation
Dritan Bleco, Yannis Kotidis
Department of Informatics, Athens University of Economics and Business
Outline • Introduction • Temporal Aggregation • Basic Temporal Aggregation - BTA • Lossy Temporal Aggregation - LTA • Spatial Aggregation • Evaluation • Conclusions
Radio Frequency Identification (RFID) • Uses radio-frequency waves to transfer data between a reader and a movable item in order to identify, categorize, track, ... • Does not require line of sight or physical contact between the reader/scanner and the tagged item
Variations • Active/passive tags • Memory size: 16 bits to KBs • Memory type: read-only, WORM, read/write • Frequency: 125 kHz to 5.8 GHz • Physical dimensions: thumbnail to brick sizes • Price: a few cents to a hundred euros
Existing Applications • Animal/livestock tracking • Postal services (routing and sorting) • Libraries • Toll collection • Warehousing • Supply chain management • …
RFID System Architecture (top to bottom)
• Level-3: IT applications
• Middleware (remote)
• Level-2: aggregated RFID data stream
• Edgeware (on-site): filtering, cleaning, aggregation
• Level-1: raw RFID data stream
• RFID readers
• Level-0: tags
• Record attributes: EPC code (from the tag), time (from the reader), location (from the reader)
Simple RFID data stream model • Assume streaming records with schema • EPC code: EPC_i • (Discrete) time: t_i • Location: loc_i
Basic Temporal Aggregation (BTA) • Collate consecutive reports of the same tag EPC_i by the reader at location loc_i • Raw stream: (EPC_i, loc_i, t_start), (EPC_i, loc_i, t_start+1), (EPC_i, loc_i, t_start+2), …, (EPC_i, loc_i, t_end) • Aggregated stream: (EPC_i, loc_i, t_start, t_end)
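The collation step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the tuple layouts, the function name `basic_temporal_aggregation`, and the choice to track runs per (EPC, location) pair are assumptions made for the example.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[str, str, int]          # (epc, loc, t): one raw report (assumed layout)
BTARecord = Tuple[str, str, int, int]   # (epc, loc, t_start, t_end)


def basic_temporal_aggregation(readings: Iterable[Reading]) -> List[BTARecord]:
    """Collapse runs of consecutive epochs per (epc, loc) into single intervals."""
    open_runs = {}   # (epc, loc) -> [t_start, t_end] of the run still being extended
    closed = []
    for epc, loc, t in sorted(readings, key=lambda r: r[2]):
        key = (epc, loc)
        run = open_runs.get(key)
        if run is not None and t == run[1]:
            continue                      # duplicate report for the same epoch
        if run is not None and t == run[1] + 1:
            run[1] = t                    # consecutive epoch: extend the run
        else:
            if run is not None:           # gap: close the old run
                closed.append((epc, loc, run[0], run[1]))
            open_runs[key] = [t, t]       # start a new run
    for (epc, loc), (t_start, t_end) in open_runs.items():
        closed.append((epc, loc, t_start, t_end))
    # Order by tag, then by start time, for readability.
    return sorted(closed, key=lambda r: (r[0], r[2]))


raw = [("E1", "A", 1), ("E1", "A", 2), ("E1", "A", 3),
       ("E1", "A", 5), ("E2", "A", 2), ("E1", "B", 6)]
bta = basic_temporal_aggregation(raw)
```

In this example the dropped reading at epoch 4 splits tag E1 at location A into two BTA records, which is exactly the weakness that motivates a lossy scheme.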
Problems with BTA • RFID readers often drop observations, e.g. due to collisions; up to 30% loss is not uncommon [Jeffery2006] • Objects are often moved within the facility, producing multiple BTA records • Reduction depends on data characteristics • Need an application-controllable reduction framework • OLAP analysis does not require precise knowledge!
Lossy Temporal Aggregation (LTA) • LTA record format: (EPC, loc, t_start, t_end, p) • Tag may be only partially present during the interval • Value p denotes the fraction of epochs in the interval during which the tag was observed • BTA: p=1 (implied) • LTA: 0<p≤1 • Allows us to control the size of the aggregated stream or the level of accuracy
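To make the record format concrete, the sketch below merges several BTA intervals of one tag at one location into a single LTA record and derives p as the observed fraction of the covering interval. The function name `merge_to_lta` and the tuple layouts are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple

BTARecord = Tuple[str, str, int, int]          # (epc, loc, t_start, t_end)
LTARecord = Tuple[str, str, int, int, float]   # (epc, loc, t_start, t_end, p)


def merge_to_lta(bta_records: List[BTARecord]) -> LTARecord:
    """Merge time-ordered, non-overlapping BTA intervals into one LTA record."""
    if not bta_records:
        raise ValueError("need at least one BTA record")
    epc, loc = bta_records[0][0], bta_records[0][1]
    if any(r[0] != epc or r[1] != loc for r in bta_records):
        raise ValueError("all records must share the same tag and location")
    t_start = bta_records[0][2]
    t_end = bta_records[-1][3]
    # Epochs in which the tag was actually reported.
    observed = sum(end - start + 1 for _, _, start, end in bta_records)
    p = observed / (t_end - t_start + 1)
    return (epc, loc, t_start, t_end, p)


lta = merge_to_lta([("E1", "A", 1, 3), ("E1", "A", 5, 5)])
```

Here the tag is seen in 4 of the 5 epochs between 1 and 5, so the merged record carries p = 0.8; a single BTA record always yields p = 1.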
Types of Error in LTA • X = number of epochs in [t_start, t_end] when the tag was reported • Y = number of epochs in [t_start, t_end] when the tag was not reported • p = X / (X+Y) • Error cases relative to the selected LTA interval [t_start, t_end]: tag spotted but reported with probability p instead of 1; tag spotted but not reported; tag not spotted but nevertheless reported with probability p
Problem Formulation • Compute the best B-tuple LTA representation, i.e. the one that minimizes the cumulative error (including both false-negative and false-positive error types) • Cumulative Error = 2*X*Y/(X+Y)^2 • Other error metrics? • Dual problem also interesting
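A small numeric check of the error formula may help. The sketch below reads the denominator as (X+Y) squared, which is an interpretation of the flattened slide text; under that reading the error of one LTA record equals 2p(1-p) with p = X/(X+Y). The function name `lta_error` is hypothetical.

```python
def lta_error(x: int, y: int) -> float:
    """Error of one LTA record with x reported and y unreported epochs.

    Assumes the slide's formula is 2*X*Y/(X+Y)^2.
    """
    if x <= 0 or y < 0:
        raise ValueError("need x > 0 and y >= 0")
    total = x + y
    return 2 * x * y / (total * total)


# Tag reported in 4 of 5 epochs: p = 0.8.
x, y = 4, 1
p = x / (x + y)
error = lta_error(x, y)
```

A lossless (BTA) record has Y = 0 and therefore zero error, and under this reading the value never exceeds 0.5, which is reached at p = 0.5.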
Helpful Observations • (1) Selected end-points t_start, t_end must each contain an appearance of the tag • (2) Should not break a run of consecutive observations • Figure: one candidate interval marked as a bad choice due to (2), another as a bad choice due to (1) • Thus, we can first apply BTA and afterwards LTA
Linear Algorithm • Goal: generate B LTA records • Input: n BTA records • Example: reduce stream from 8 to B=4 records • Figure legend: BTA interval, LTA interval
Greedy Algorithm • Iteratively merge intervals • Select best candidate at each step • Stop when left with exactly B intervals
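One possible reading of the greedy scheme is sketched below. The slide does not say what a "candidate" is or how "best" is scored, so this sketch assumes that candidates are pairs of adjacent intervals and that the best one is the merge with the smallest increase in total error. The names `greedy_lta` and `slide_error`, and the error formula 2*X*Y/(X+Y)^2 used as the default, are assumptions for illustration.

```python
from typing import Callable, List, Tuple

Interval = Tuple[int, int]            # (t_start, t_end) of one BTA record
LTARecord = Tuple[int, int, float]    # (t_start, t_end, p)


def slide_error(x: int, span: int) -> float:
    """Assumed error of a record with x observed epochs out of span epochs."""
    y = span - x
    return 2 * x * y / (span * span)


def greedy_lta(bta: List[Interval], budget: int,
               err: Callable[[int, int], float] = slide_error) -> List[LTARecord]:
    """Merge adjacent intervals until at most `budget` records remain."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    # Each segment tracks (t_start, t_end, observed_epochs).
    segs = [(s, e, e - s + 1) for s, e in sorted(bta)]

    def cost(seg):
        s, e, x = seg
        return err(x, e - s + 1)

    while len(segs) > budget:
        best_i, best_delta = 0, None
        for i in range(len(segs) - 1):
            a, b = segs[i], segs[i + 1]
            merged = (a[0], b[1], a[2] + b[2])
            delta = cost(merged) - cost(a) - cost(b)   # error added by this merge
            if best_delta is None or delta < best_delta:
                best_i, best_delta = i, delta
        a, b = segs[best_i], segs[best_i + 1]
        segs[best_i:best_i + 2] = [(a[0], b[1], a[2] + b[2])]
    return [(s, e, x / (e - s + 1)) for s, e, x in segs]


bta = [(1, 3), (5, 5), (20, 22), (24, 30)]
three = greedy_lta(bta, budget=3)
one = greedy_lta(bta, budget=1)
```

This version rescans all adjacent pairs on every step, which keeps the sketch short; a priority queue over candidate merges would be the usual way to speed it up.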
Optimal LTA • Dynamic Programming formulation • E(i,k): error of the best k-record LTA representation for the first i BTA intervals • err(a,b): error of a single LTA record encoding BTA intervals a, a+1, …, b • Recurrence: E(i,k) = min_{j<i} [ E(j,k-1) + err(j+1,i) ] • Figure: BTA intervals 1, 2, …, j are covered by k-1 LTA intervals; BTA intervals j+1, …, i are covered by 1 LTA interval
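The recurrence E(i,k) = min over j<i of E(j,k-1) + err(j+1,i) translates directly into a table-filling routine. The sketch below is an illustration under stated assumptions: err is taken to be 2*X*Y/(X+Y)^2 over the span of the merged record, and the function name `optimal_lta` and the return shape are hypothetical.

```python
import math
from typing import List, Tuple

Interval = Tuple[int, int]            # (t_start, t_end) of one BTA record
LTARecord = Tuple[int, int, float]    # (t_start, t_end, p)


def optimal_lta(bta: List[Interval], budget: int) -> Tuple[float, List[LTARecord]]:
    """Return (minimum total error, records) using exactly `budget` LTA records."""
    n = len(bta)
    if not 1 <= budget <= n:
        raise ValueError("budget must be between 1 and the number of BTA records")
    # prefix[i] = observed epochs in the first i BTA intervals.
    prefix = [0]
    for s, e in bta:
        prefix.append(prefix[-1] + e - s + 1)

    def span(a: int, b: int) -> int:          # a, b are 1-based, inclusive
        return bta[b - 1][1] - bta[a - 1][0] + 1

    def err(a: int, b: int) -> float:         # assumed error metric
        x = prefix[b] - prefix[a - 1]
        y = span(a, b) - x
        return 2 * x * y / (span(a, b) ** 2)

    E = [[math.inf] * (budget + 1) for _ in range(n + 1)]
    back = [[0] * (budget + 1) for _ in range(n + 1)]
    E[0][0] = 0.0
    for k in range(1, budget + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):
                cand = E[j][k - 1] + err(j + 1, i)
                if cand < E[i][k]:
                    E[i][k], back[i][k] = cand, j

    # Walk the back-pointers to recover the chosen grouping.
    groups, i, k = [], n, budget
    while k > 0:
        j = back[i][k]
        groups.append((j + 1, i))
        i, k = j, k - 1
    groups.reverse()
    records = [(bta[a - 1][0], bta[b - 1][1],
                (prefix[b] - prefix[a - 1]) / span(a, b)) for a, b in groups]
    return E[n][budget], records


bta = [(1, 3), (5, 5), (20, 22), (24, 30)]
best_error, best_records = optimal_lta(bta, budget=3)
```

The triple loop costs O(n^2 * B) evaluations of err, each O(1) thanks to the prefix sums, which is consistent with the dynamic program being much slower than a greedy pass on long streams.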
Spatial Aggregation • Tags often move in batches • Common in supply-chain/distribution networks • Idea: create surrogate EPC codes to replace multiple tags packaged together • Proposed in [Gonzalez et al., ICDE 2006] • Note: • We do not know in advance how items are grouped • Surrogate codes do not imply physical grouping
Example
LTA stream:
  I1  L1  T1   T5   .78
  I2  L1  T1   T5   .69
  I3  L1  T2   T5   .90
  I1  L2  T12  T22  .67
  I2  L2  T12  T22  .62
  I4  L2  T12  T22  .66
• I1 and I2 are observed at the same interval/location (L1, T1 to T5)
Surrogate group codes:
  G1: I1 I2
Example
LTA stream:
  G1  L1  T1   T5   .69   (new record replaces both entries)
  I3  L1  T2   T5   .90
  I1  L2  T12  T22  .67
  I2  L2  T12  T22  .62
  I4  L2  T12  T22  .66
• More tags spotted together (L2, T12 to T22)
Surrogate group codes:
  G1: I1 I2
  G2: G1 I4
Resulting Tables
LTA stream:
  I1  L1  T1   T5   .78
  I2  L1  T1   T5   .69
  I3  L1  T2   T5   .90
  I1  L2  T12  T22  .67
  I2  L2  T12  T22  .62
  I4  L2  T12  T22  .66
Reduced stream:
  G1  L1  T1   T5   .69
  I3  L1  T2   T5   .90
  G2  L2  T12  T22  .62
Surrogate group codes:
  G1: I1 I2
  G2: G1 I4
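The example can be reproduced with a short sketch. Several points here are assumptions rather than statements from the slides: tags are grouped only when location and interval match exactly, a surrogate record takes the minimum p of its members (inferred from the .69 and .62 values in the example), and an existing group is reused as a member when all of its tags appear together again. The function name `spatial_aggregate` is hypothetical.

```python
from typing import Dict, List, Tuple

LTARecord = Tuple[str, str, int, int, float]   # (code, loc, t_start, t_end, p)


def spatial_aggregate(lta: List[LTARecord]) -> Tuple[List[LTARecord], Dict[str, List[str]]]:
    """Replace tags seen at the same location/interval with surrogate group codes."""
    groups: Dict[str, List[str]] = {}   # surrogate -> direct members (tags or surrogates)
    expanded = {}                       # surrogate -> frozenset of underlying tag codes
    buckets = {}                        # (loc, t_start, t_end) -> [(code, p)], in stream order
    for code, loc, ts, te, p in lta:
        buckets.setdefault((loc, ts, te), []).append((code, p))

    reduced = []
    for (loc, ts, te), members in buckets.items():
        if len(members) == 1:                     # nothing to group
            code, p = members[0]
            reduced.append((code, loc, ts, te, p))
            continue
        remaining = [code for code, _ in members]
        parts = []
        # Reuse known groups fully contained in this bucket, largest first.
        for g in sorted(expanded, key=lambda name: -len(expanded[name])):
            if expanded[g] <= set(remaining):
                parts.append(g)
                remaining = [c for c in remaining if c not in expanded[g]]
        parts.extend(remaining)
        if len(parts) == 1:                       # exactly an existing group
            surrogate = parts[0]
        else:
            surrogate = f"G{len(groups) + 1}"
            groups[surrogate] = parts
            expanded[surrogate] = frozenset(code for code, _ in members)
        p_min = min(p for _, p in members)        # assumed combination rule
        reduced.append((surrogate, loc, ts, te, p_min))
    return reduced, groups


stream = [("I1", "L1", 1, 5, .78), ("I2", "L1", 1, 5, .69), ("I3", "L1", 2, 5, .90),
          ("I1", "L2", 12, 22, .67), ("I2", "L2", 12, 22, .62), ("I4", "L2", 12, 22, .66)]
reduced, groups = spatial_aggregate(stream)
```

With these assumptions the six-record stream shrinks to three records plus two surrogate definitions, matching the tables above. I3 stays ungrouped because its interval starts at T2 rather than T1.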
Experiments • Used RFID traces from the 2008 HOPE conference in New York • Sampled data at 30-second intervals • 1.9 million records • Reduced to 423K records via BTA
Accuracy (LTA) • Picked tag with most intervals (569) • Vary number of requested LTA-tuples (B)
Notes on Spatial Aggregation • Input: 434K BTA records • Output: 77K surrogate group ids • 39% space reduction (accounting for the surrogates) • Running time: 3.3 s (1.83 GHz Core Duo with 1 GB RAM)
Different Choices • Figure: two 3:1 reductions compared, one lossless and one lossy
Conclusions • Explored different aggregation schemes that exploit temporal and spatial correlations • Schemes reduce the size of the RFID stream in a user-controllable manner • All algorithms are fairly fast • Greedy is orders of magnitude faster than OptimalDP with practically identical performance • More schemes possible, e.g. spatial aggregation with fuzzy groups • Other error metrics, dual problem
Thank you, Questions?