210 likes | 306 Views
ENTROPY-BASED CONCEPT SHIFT DETECTION PETER VORBURGER, ABRAHAM BERNSTEIN IEEE ICDM 2006. 1. Speaker: Li HueiJyun Advisor: Koh JiaLing Date:2007/11/6. OUTLINE. Introduction Entropy and Concept Shift Adaption Calculating Entropy on Data Streams
E N D
ENTROPY-BASED CONCEPT SHIFT DETECTIONPETER VORBURGER, ABRAHAM BERNSTEINIEEE ICDM 2006 1 Speaker: Li HueiJyun Advisor: Koh JiaLing Date:2007/11/6
OUTLINE • Introduction • Entropy and Concept Shift Adaption • Calculating Entropy on Data Streams • Algorithm Control Strategy using Entropy Measure • Experimental setup • Experimental Results • Discussion of the Experiments • Application to a Real-World Problem: Context Switches in Sensor Data • Limitations, Future Work, and Conclusion 2
INTRODUCTION • Problem: • In many applications data is gathered over time, which raises the problem that the concepts to be learned may drift (i.e., change) over time. • The increasing amount of data (e.g., multimedia content, data warehouse ) and limitation of computing power due to miniaturization (e.g., wearable computing) call for faster and more resource friendly algorithms. • Motivation: the analysis of sensor data on wearable devices. 3
INTRODUCTION • Context-awareness: A Scenario-based Approach for Direct Interruptablity Prediction on Wearable Devices • Classifiers predict peoples’ anticipated behavior based on sensory input • Contexts (or contextual situations) switch rather than gradually change • Contextual information could be reused, even for new, not yet encountered situations • An ongoing monitoring of the sensor stream is needed 4
INTRODUCTION • Problem: • online pattern matching mechanism comparing the sensor stream to the entire library of already known contexts is computational complex and not yet suitable for today’s wearable devices. • Solution: • indicate possible candidates (or hot spots) for context changes limiting the computationally intensive context (re-)determination on those candidates. 5
ENTROPY AND CONCEPT SHIFT ADAPTION • Assumptions: • As long as the distribution of older instances (features and target values) is similar to the distribution of new instances no concept drift occurred • A distribution difference between older and more recent instances indicates a change in the target concept • Measure the distribution inequality: • If two distributions are equal, the entropy measure results in a value of 1 • If they are absolutely different the measure will result in a value of 0 6
CALCULATING ENTROPY ON DATA STREAMS • Sliding window technique: compares two windows, one presenting older and the other representing more recent instances in the stream • Compare the two windows by counting and comparing all instances with respect to their class and stream membership • Discretize the range of instance values to a fixed number of bins to take the approximate value distribution into account 7
CALCULATING ENTROPY ON DATA STREAMS • A data stream: a sequence consisting of sequentially ordered tuples in time ti • i (1, 2, 3, …) • := ( , li) • where is the vector of all feature stream instances sni at time ti • The domain of the label stream l is discrete and contains all class values c C 8
CALCULATING ENTROPY ON DATA STREAMS Hi: the resulting entropy at time ti and is defined as the mean of all data stream entropies His at time ti where S: the number of feature-streams His is calculated from the entropies Hiscb
CALCULATING ENTROPY ON DATA STREAMS Hiscb: represent the entropy of each class (c C) and bin (b B) given the stream s at time ti Bins: discrete aggregation of the values of each feature stream s : the probability that an instance occurs in the old window at time ti, belong to class c, with feature domain of stream s in bin b wiscb: depend on i, s, c, b
ALGORITHM CONTROL STRATEGY USING ENTROPY MEASURE • Instance selection style algorithm
EXPERIMENTAL SETUP • Real concept drifts: changes in the actual target concepts • Virtual concept drifts: changes in the distribution • Generate synthetic data set: • H. Wang, W. Fan, P. S. Yu, and J. Han. Mining concept-drifting data streams using ensemble classifiers. In KDD ‘03: proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining • Real drift data set • Virtual drift data set • Mixed data set
EXPERIMENTAL SETUP • Performance measure: • Accuracy • The area under the ROC-curve • A representative set of benchmarks: • Perfect benchmark: assumes an oracle-given ideal window size ξ for any point in time • A selection of ensemble classifiers: the literature so far showed to have the highest accuracy and robustness against noise • P. Vorburger and A. Bernstein. Entropy-based detection of real and virtual concept shifts. Working Paper – University of Zurich, Department of Informatics, 2006
EXPERIMENTAL RESULTS • The prediction quality against increasing noise levels
EXPERIMENTAL RESULTS • Computational complexity • Compare ensemble classifiers and the entropy measure based algorithm • Measure the elapsed time: • three committee classifiers: 2031.6±15s • Entropy based algorithm: 148.6s • Entropy calculation without Naïve Bayes model building: 1.1±0.1s
DISCUSSION OF THE EXPERIMENT • Entropy measure outperforms the ensemble benchmark algorithm on real concept shifts • Exhibit a greater predictive power while requiring less computational resources • The entropy measure based algorithm showed the nearly the same robustness towards noise as the perfect benchmark and the committee classifiers
APPLICATION TO A REAL-WORLD PROBLEM: CONTEXT SWITCHES IN SENSOR DATA • Data set: • Audio: decomposed into 10 features • accelerometer data recorded over a time of 15381s: merged in one single feature • The wearable data acquisition set up: a microphone and three three-dimentional accelerometers attached on the subject’s shoulder, wrist, and leg
APPLICATION TO A REAL-WORLD PROBLEM: CONTEXT SWITCHES IN SENSOR DATA • (A)walking (B)streetcar (C)office work • (D)lecture (E)cafeteria (F)meeting
LIMITATIONS, FUTURE WORK, AND CONCLUSION • Gradual concept drifts • Find boundary conditions • Recognize recurring concepts and exploit this information • Generalizability • The choice of the suitable parameters could be optimized
LIMITATIONS, FUTURE WORK, AND CONCLUSION • Find a measure for detecting and measuring concept shifts as an analogue for context switches • Formulation of entropy on data streams is capable to detect and measure concept shifts • Algorithm with an entropy based instance selection strategy outperformed ensemble based algorithms on real concept shift data sets • Given algorithm robustness towards noise, its sensitivity towards concept shifts, its computational efficiency, and predictive power on real concept shift data sets