Barnan Das

Addressing Machine Learning Challenges to Perform Automated Prompting
Barnan Das
PhD Preliminary Exam
November 8, 2012

Images: self-portraits by William Utermohlen, an American artist living in London, after he was diagnosed with Alzheimer’s disease in 1995. Utermohlen died from the consequences of Alzheimer’s disease in March 2007.

• 36 million: worldwide dementia population
• Chart: actual and expected number of Americans aged 65 and older with Alzheimer’s
• $200 billion: payment for care in 2012
• 15 million: unpaid caregivers
Source: World Health Organization and Alzheimer’s Association.

Automated Prompting
Help with Activities of Daily Living (ADLs)

Existing Work
• Rule-based (temporal or contextual)
• Activity initiation
• RFID and video-input based prompts for activity steps

Our Contribution
• Learning-based
• Sub-activity level prompts
• No audio/video input

System Architecture
Published at ICOST 2011 and in the Journal of Personal and Ubiquitous Computing, 2012.

Outline of Work

Off-line Classification of Activity Steps
Classes: prompt, no-prompt

Data Collection

Class Distribution
Total number of data points: 3980

Imbalanced Class Distribution

Existing Work
• Preprocessing
• Sampling
• Over-sampling minority class
• Under-sampling majority class
• Over-sampling minority class
• Spatial location of samples in Euclidean feature space
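The two sampling strategies listed above can be illustrated with a minimal sketch. It assumes a binary problem and plain random sampling; the function names `random_oversample` and `random_undersample` are hypothetical and are not taken from the presentation.

```python
import random


def random_oversample(samples, labels, minority, seed=0):
    """Duplicate randomly chosen minority samples until both classes are equal in size."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    n_majority = len(labels) - len(minority_idx)
    # Number of duplicates needed; an empty range if the data is already balanced.
    extra = [rng.choice(minority_idx) for _ in range(n_majority - len(minority_idx))]
    keep = list(range(len(labels))) + extra
    return [samples[i] for i in keep], [labels[i] for i in keep]


def random_undersample(samples, labels, minority, seed=0):
    """Keep all minority samples and an equally sized random subset of the majority."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    n_keep = min(len(minority_idx), len(majority_idx))
    keep = sorted(minority_idx + rng.sample(majority_idx, n_keep))
    return [samples[i] for i in keep], [labels[i] for i in keep]
```

Over-sampling keeps every original data point but repeats minority samples, while under-sampling discards majority samples, so the two trade duplication against information loss.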

Proposed Approach
• Preprocessing technique
• Oversampling minority class
• Based on Gibbs sampling
Figure labels: Attribute Value / Markov Chain Node
Submitted to the Journal of Machine Learning Research, 2012.

Proposed Approach
Figure labels: Markov Chains, Minority Class Samples, Majority Class Samples

(wrapper-based) RApidly COnverging Gibbs sampler: RACOG & wRACOG
• Differ in sample selection from Markov chains
• RACOG:
  • Based on burn-in and lag
  • Stopping criterion: predefined number of iterations
  • Effectiveness of new samples is not judged
• wRACOG:
  • Iterative training on dataset, addition of misclassified data points
  • Stopping criterion: no further improvement of performance measure (TP rate)
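The burn-in and lag idea can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes discrete attributes encoded as integers, models attribute dependence as a simple chain (each attribute depends only on the previous one), and runs a single Markov chain. All function names are hypothetical.

```python
import random
from collections import Counter


def fit_chain(minority, n_values, alpha=1.0):
    """Estimate P(x_0) and P(x_i | x_{i-1}) from minority samples with Laplace smoothing."""
    total = len(minority)
    first = Counter(row[0] for row in minority)
    p_first = [(first[v] + alpha) / (total + alpha * n_values[0])
               for v in range(n_values[0])]
    trans = []  # trans[i - 1][a][b] = P(x_i = b | x_{i-1} = a)
    for i in range(1, len(n_values)):
        pair = Counter((row[i - 1], row[i]) for row in minority)
        prev = Counter(row[i - 1] for row in minority)
        trans.append([[(pair[(a, b)] + alpha) / (prev[a] + alpha * n_values[i])
                       for b in range(n_values[i])]
                      for a in range(n_values[i - 1])])
    return p_first, trans


def conditional(x, i, p_first, trans, n_values):
    """Unnormalised P(x_i = v | all other attributes) under the assumed chain model."""
    weights = []
    for v in range(n_values[i]):
        w = p_first[v] if i == 0 else trans[i - 1][x[i - 1]][v]
        if i + 1 < len(x):
            w *= trans[i][v][x[i + 1]]
        weights.append(w)
    return weights


def gibbs_oversample(minority, n_values, n_new, burn_in=100, lag=20, seed=0):
    """Generate n_new synthetic minority samples, keeping every lag-th state after burn-in."""
    rng = random.Random(seed)
    p_first, trans = fit_chain(minority, n_values)
    x = list(rng.choice(minority))  # start the chain at a real minority sample
    new_samples, step = [], 0
    while len(new_samples) < n_new:
        for i in range(len(x)):  # one sweep resamples each attribute in turn
            w = conditional(x, i, p_first, trans, n_values)
            x[i] = rng.choices(range(n_values[i]), weights=w)[0]
        step += 1
        if step > burn_in and (step - burn_in) % lag == 0:
            new_samples.append(tuple(x))
    return new_samples
```

In this sketch the number of iterations is fixed in advance by `burn_in`, `lag`, and `n_new`, and the usefulness of each generated sample is never checked, which corresponds to the RACOG-style selection described above. A wrapper-style variant would instead retrain a classifier and keep only the generated samples it misclassifies.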

Experimental Setup
Implemented Gibbs sampling, SMOTEBoost, RUSBoost

Results (RACOG & wRACOG)
Plots: Geometric Mean (TP Rate, TN Rate); TP Rate
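For reference, the two measures named on this slide can be computed as in the sketch below. It assumes that "prompt" is the positive class and that the geometric mean is the square root of the product of TP rate and TN rate; the function names are hypothetical.

```python
import math


def tp_tn_rates(y_true, y_pred, positive="prompt"):
    """Return (TP rate, TN rate) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tp_rate = tp / (tp + fn) if tp + fn else 0.0
    tn_rate = tn / (tn + fp) if tn + fp else 0.0
    return tp_rate, tn_rate


def g_mean(y_true, y_pred, positive="prompt"):
    """Geometric mean of TP rate and TN rate."""
    tp_rate, tn_rate = tp_tn_rates(y_true, y_pred, positive)
    return math.sqrt(tp_rate * tn_rate)
```

Unlike plain accuracy, the geometric mean drops to zero when either class is never predicted correctly, which is why it is a common choice for imbalanced data.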

Results (RACOG & wRACOG)
Plot: ROC Curve

Outline of Work

Overlapping Classes

Overlapping Classes in Prompting Data
Figure: 3D PCA plot of prompting data

Existing Work
• Discard data of the overlapping region
• Treat overlapping region as a separate class

Tomek Links
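A Tomek link is usually defined as a pair of samples from different classes that are each other's nearest neighbor. The sketch below finds such pairs by brute force under Euclidean distance; the function names are hypothetical, and the example is an illustration rather than the procedure used in the presentation.

```python
import math


def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: math.dist(points[i], points[j]))


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and labels[i] != labels[j]]
```

Pairs returned by `tomek_links` lie on the class boundary or are noise, so removing the majority-class member of each pair is a common way to reduce class overlap.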

Cluster-Based Under-Sampling (ClusBUS)
• Form clusters
• Under-sampling interesting clusters
Published in the IOS Press book on Agent-Based Approaches to Ambient Intelligence, 2012.
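The two steps can be sketched as follows. The slide does not say how clusters are formed or what makes a cluster "interesting", so this sketch makes two assumptions: cluster assignments are supplied by any clustering algorithm, and a cluster counts as interesting when its fraction of minority samples reaches a threshold. The function name and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def cluster_based_undersample(labels, cluster_ids, minority="prompt", threshold=0.5):
    """Return the indices kept after removing majority samples from interesting clusters.

    Assumption: a cluster is 'interesting' if at least `threshold` of its members
    belong to the minority class.
    """
    members = defaultdict(list)
    for idx, cluster in enumerate(cluster_ids):
        members[cluster].append(idx)

    keep = []
    for idxs in members.values():
        n_minority = sum(1 for i in idxs if labels[i] == minority)
        interesting = n_minority / len(idxs) >= threshold
        for i in idxs:
            # Minority samples are always kept; majority samples survive only
            # in clusters that are not interesting.
            if labels[i] == minority or not interesting:
                keep.append(i)
    return sorted(keep)
```

Under these assumptions, majority samples are removed only where they mix with a concentration of minority samples, which targets the overlapping region rather than the whole majority class.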

Experimental Setup

Results (ClusBUS)

Outline of Work

Unsupervised Learning of Prompt Situations on Streaming Sensor Data
Figure: stream of sensor events labeled s1, s2, s3, s4

Motivation
• Several hundred man-hours to label activity steps
• High probability of inaccuracy
• Needs activity-step recognition model

Knowledge Flow

Data Collection

Modeling Activity Errors
• Abnormal Occurrence
• Delayed Occurrence

Modeling Delayed Occurrence
• Elapsed Time
• Sensor Frequency

Predicting Errors
At every sensor event, evaluate:
• Likelihood of occurrence of sensor s_i for participant p_j
• Probability of the elapsed time for the current n-th occurrence of sensor s_i
• Probability of all sensor frequencies for the current n-th occurrence of sensor s_i
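As an illustration of the elapsed-time check, the sketch below fits a normal distribution to elapsed times observed in error-free sequences and reports how unlikely a new elapsed time is. The normal distribution is only a placeholder assumption, since the presentation does not name a distribution, and the function names and the `threshold` parameter are hypothetical.

```python
from statistics import NormalDist


def delay_tail_probability(normal_elapsed, observed):
    """P(T >= observed) under a normal distribution fitted to error-free elapsed times.

    Assumption: elapsed times are roughly normal. The fit needs at least two
    distinct values, otherwise the standard deviation is zero.
    """
    dist = NormalDist.from_samples(normal_elapsed)
    return 1.0 - dist.cdf(observed)


def is_delayed(normal_elapsed, observed, threshold=0.01):
    """Flag a delayed occurrence when the tail probability falls below the threshold."""
    return delay_tail_probability(normal_elapsed, observed) < threshold
```

The same pattern would apply to sensor frequency: fit a distribution to counts from error-free sequences, then score each new count against it.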

Preliminary Experiments
• Elapsed Time: no observable trend
• Sensor Frequency: no observable trend

Current Obstacles
• Noisy data
• Unwanted sensor events, specifically from object sensors
• Erroneous activity sequences not suitable for model evaluation

Proposed Plan
• Identifying suitable distributions for modeling sensor frequency and elapsed time
• Finding additional statistical measures that can model the errors better
• Building a generalized prompt model for all six ADLs (if at all possible?)
• Need data to evaluate the proposed model
  • Synthetically generate erroneous sequences from normal sequences (?)
  • Collect more data if necessary

Publications
