Limitations with Activity Recognition Methodology and Data Sets
Gary Weiss, Fordham University
Jeffrey Lockhart, Cambridge University
Work supported by National Science Foundation Grant No. 1116124.
Genesis of this work
• Our WISDM (Wireless Sensor Data Mining) Lab has been working on activity recognition (AR) for several years
• We have focused on building and deploying a real-world system called Actitracker
• Recent work has focused on implementing, analyzing, and using different types of models
• When comparing our AR work with other work, we identified several key methodological issues, which also affect the resulting data sets
HASCA 2014
Overview
• Identify some methodological issues and their impact on data sets
• Make people aware of these issues
• Propose mechanisms for addressing these issues
• Largest focus is on model type, but many other factors are considered
• Ultimate goal is to generate more diverse data sets and to label their underlying assumptions precisely
Factors Impacting Activity Recognition
• Model Type: Personal, Impersonal, Hybrid
• Collection Method: Fully Natural, Semi-Natural, Laboratory
• Data
  • Number of Subjects
  • Population (college, elderly, etc.)
  • Traits (height, weight, income, education, …)
  • Activities (running, jogging, standing, …)
  • Duration (1 hour of data …)
Factors Impacting Activity Recognition
• Sensors
  • Type: accelerometer, gyroscope, barometer
  • Sampling rate: 20 Hz, 50 Hz, …
  • Number of sensors
  • Location of sensors (pocket, belt, wrist, …)
  • Orientation (facing up, down, in, out)
• Features
  • Raw features
  • Transformed features, window size
• Results
  • Accuracy
  • Consistency
Analysis of AR Research
• We examined 34 published AR papers
  • Many were smartphone-based
  • Several papers cover multiple data sets, so 38 data sets were analyzed
  • Several papers used multiple model types, so 47 distinct models were analyzed
• Detailed analysis is published in Lockhart’s MS thesis, “Benefits of Personalized Data Mining Approaches to Human Activity Recognition with Smartphone Sensor Data”
  • A table describes each of the factors listed on the prior 2 slides for each data set
• Summary information is described in this presentation
Background on Model Type
• Personal Models
  • Model based on labeled data from the intended user
  • Requires new users to provide training data
  • Our AR results show high accuracy (~98%)
• Impersonal/Universal Models
  • Model based on a panel of representative users
  • No training phase required; works “out of the box”
  • Our AR results show modest performance (~76%)
• Hybrid Models
  • Model based on a panel of users that includes the intended user
  • Requires a training phase for the user
  • Our AR results are much closer to personal models (~95%), even though the panel includes dozens of users
Issues with Hybrid Models
• Our results show that personal models perform very well with only small amounts of data per activity
• Little practical need for hybrid models, since they also require training data from the user
• Why are hybrid models often used in research papers?
  • Simple experimental setup: use cross validation on a single data set, with no need to carefully partition the data
  • With n users, personal and impersonal models require n separate partitions
• Hybrid models are often assumed to approximate impersonal models and are treated as such
  • In actuality they are much closer to personal models
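The difference between the three evaluation setups can be made concrete with a small sketch. This is an illustration under stated assumptions, not the partitioning code used in the study: the record layout (a dict with a `"user"` key), the function names, the 50/50 personal split, and the default of 10 folds are all hypothetical.

```python
from collections import defaultdict


def personal_splits(records, train_fraction=0.5):
    """One partition per user: train and test data both come from that user."""
    by_user = defaultdict(list)
    for rec in records:
        by_user[rec["user"]].append(rec)
    for user, recs in by_user.items():
        cut = int(len(recs) * train_fraction)  # assumed split point
        yield user, recs[:cut], recs[cut:]


def impersonal_splits(records):
    """One partition per user: the test user contributes no training data."""
    users = sorted({rec["user"] for rec in records})
    for user in users:
        train = [r for r in records if r["user"] != user]
        test = [r for r in records if r["user"] == user]
        yield user, train, test


def hybrid_splits(records, k=10):
    """Plain k-fold over the pooled records, ignoring who produced them.

    The same users normally appear in both train and test, which is why
    results from this setup do not show how a model does on new users.
    """
    for fold in range(k):
        test = [r for i, r in enumerate(records) if i % k == fold]
        train = [r for i, r in enumerate(records) if i % k != fold]
        yield fold, train, test
```

With n users, the first two functions each yield n partitions, while the hybrid setup needs only ordinary cross validation over one pooled data set.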
Issue 1: Model Type
• Hybrid models are the most popular, and authors often claim that results generalize to new users (not true)
  • In 10 of 19 cases there were 10 or fewer users, so the models are even closer to personal models (we had 59)
• Couldn’t determine model type in 6 cases; a serious methodological issue
• We claim methodological issues in 53% of cases (40% hybrid + 13% undetermined)
[Figure: Analysis of 47 models from 38 data sets]
Issue 2: # Subjects & Diversity
• Number of subjects is often small
  • 11 studies had fewer than 5; 12 had fewer than 10
  • HASC 2010 & 2012 have more users but little data per user
• Impacts ability to evaluate performance
  • Our results show impersonal models are very inconsistent across users
  • 4 studies evaluated universal models with fewer than 8 users; only 2 had at least 30
• Populations should also be diverse, but many studies focus on college students; personal info should also be provided (height, weight, etc.)
[Figure: Distribution of impersonal model performance across 59 users]
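Consistency across users can only be judged if accuracy is reported per user, not as a single pooled number. A minimal sketch of that bookkeeping follows; the `(user, true_label, predicted_label)` tuple layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def per_user_accuracy(predictions):
    """predictions: iterable of (user, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for user, truth, guess in predictions:
        total[user] += 1
        correct[user] += int(truth == guess)
    return {user: correct[user] / total[user] for user in total}


def spread(accuracies):
    """Summarize how much accuracy varies from one user to the next."""
    values = sorted(accuracies.values())
    return {
        "min": values[0],
        "max": values[-1],
        "mean": sum(values) / len(values),
    }
```

With only a handful of test users, the minimum and maximum say little about the wider population, which is the concern raised on this slide.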
Issue 3: Collection Methodology
• Many possible distinctions, but 3 main categories:
  • Fully natural: normal daily activities
  • Semi-natural: operate in a normal environment but may be directed (e.g., asked to walk for 5 minutes)
  • Laboratory: structured tasks in a controlled environment
• Type of collection environment should be documented, since it affects results and the ability to replicate them
• We have released an AR data set that is semi-natural and our Actitracker data set that is fully natural (except for the self-training phase)
Issue 4: Sensors
• Type of sensor and number of sensors
  • Usually provided: not an issue
• Location
  • Precise location and orientation are often not specified
  • Our results indicate these factors are important
  • For a smartphone, which pants pocket?
  • How is it oriented? Mine is almost always down and in (i.e., screen facing thigh).
Issue 5: Features & Feature Generation
• Usually little choice in how to represent raw features except for sampling rate
• Raw sensor data is transformed into multivariate records using a sliding window and summary features
  • Half of studies don’t report window size
• Vast majority of smartphone AR research uses only basic statistics
  • These yield good results that appear to be competitive with more complex features (e.g., based on FFT info)
[Figure: Distribution of window sizes for the 52% of studies that report this info]
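The sliding-window transformation described on this slide can be sketched as follows. This is not the WISDM transformation script: the 20 Hz rate, the 10-second non-overlapping window, and the choice of mean and standard deviation per axis are assumptions picked to show why window size and sampling rate need to be reported.

```python
from statistics import mean, pstdev


def window_features(samples, rate_hz=20, window_s=10):
    """Turn raw (x, y, z) accelerometer samples into one record per window.

    rate_hz and window_s are assumed values; a study should report both.
    Windows do not overlap, and a trailing partial window is dropped.
    """
    size = rate_hz * window_s
    rows = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        row = {}
        for axis, name in enumerate("xyz"):
            values = [s[axis] for s in window]
            row[f"{name}_mean"] = mean(values)
            row[f"{name}_std"] = pstdev(values)  # population std. deviation
        rows.append(row)
    return rows
```

Changing either parameter changes how many records come out of the same raw data and what each summary statistic means, so results are hard to replicate when the window size goes unreported.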
Features & Feature Generation
• Important that all AR data sets release:
  • Raw data
  • Transformed data or a script to generate the transformed data
• Descriptions of higher-level features are often not specified well enough
• Our data sets include raw and transformed data, and we recently released the transformation scripts as well
• Interestingly, researchers found inconsistencies between our raw and transformed data and helped us identify several bugs
WISDM Activity Recognition data sets
• Two main data sets
• Activity Prediction
  • 36 users with semi-natural data collection
  • All data is labeled with activity
• Actitracker Data
  • Data from our publicly available Actitracker app
  • Data set will be updated periodically
  • Fully natural data collection, with semi-natural data collection for self-training data
  • Self-training data is labeled; remaining data is not labeled
• Available from: http://www.cis.fordham.edu/wisdm/dataset.php
Conclusions
• All activity recognition research should clearly describe relevant factors and experimental methodology
  • We propose a list of factors/issues to include
  • Many existing studies do not provide important information
• We highlight the role of model type
  • Many studies do not specify model type or use hybrid models
  • Hybrid models are inappropriate in most cases, and many studies assume they approximate impersonal/universal models, which is contradicted by our research
Acknowledgements
• Material based on Jeff Lockhart’s MS Thesis
• Activity Recognition research was supported by all WISDM Lab members
• Funding provided by NSF Grant 1116124
More Information
• Information available from wisdmproject.com
  • Papers available under “About: Publications” tab
  • Includes Jeff’s MS Thesis
  • Jeff Lockhart, Gary Weiss (2014). The Benefits of Personalized Smartphone-Based Activity Recognition Models. In Proc. SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, Philadelphia, PA, 614-622.
• Info about our app available from actitracker.com
  • App available for download from Google Play
• Feel free to download our data sets and ask us about our data