CS 2750 Project Report

CS 2750 Project Report. Jason D. Bakos. Project Goals. Data: sensor readings from 11 different people walking in a controlled environment. An accelerometer records floor vibration data from footfalls. A microphone records sounds from footfalls. This data is recorded 10 times for each person.


  1. CS 2750 Project Report Jason D. Bakos

  2. Project Goals • Data • Sensor readings from 11 different people walking in a controlled environment • An accelerometer records floor vibration data from footfalls • A microphone records sounds from footfalls • This data is recorded 10 times for each person

  3. Project Goals • Use this data to perform multiple classification • Human gait analysis • Eventually want to determine if a person is in duress • Most important aspect: learn the nature of the data to determine how best to classify it

  4. Data Preprocessing • Data size • Data is collected at 15 kHz for approximately 10 seconds • About 150,000 samples per recording • Must get data out of the time domain • Must capture a “walk” as a single data point • Time series => cross-sectional

  5. Data Preprocessing • Extract the largest-intensity step from the data • Closest to sensors • Transform data to frequency domain • Fourier transform • Used MATLAB FFT – output is a real array • Integrated over time • Group the resulting spectrum into bins • These are now the features
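
The transform-and-bin step can be sketched as follows. This is a minimal NumPy illustration, not the author's MATLAB code: it assumes the "real array" is the FFT magnitude, that bins are equal-width frequency bands, and that each bin is summed. The 50-bin count is taken from the later classification slide; the function name and the synthetic footstep are hypothetical.

```python
import numpy as np

def spectral_features(step, n_bins=50):
    """Magnitude spectrum of one footstep window, summed into n_bins bands."""
    spectrum = np.abs(np.fft.rfft(step))       # real-valued magnitudes (assumption)
    bands = np.array_split(spectrum, n_bins)   # near-equal-width frequency bands
    return np.array([band.sum() for band in bands])

# Synthetic stand-in for a 2000-sample footstep sampled at 15 kHz
fs = 15_000
t = np.arange(2000) / fs
step = np.sin(2 * np.pi * 300 * t) * np.exp(-40 * t)

features = spectral_features(step)
print(features.shape)   # (50,)
```

Each recording then becomes one row of 50 features, which is the "time series => cross-sectional" conversion described on the previous slide.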

  6. Data Preprocessing • Extracting footstep • Method 1 • Find max value in time domain • Center a fixed window around it • Window sizes 2000, 4000, 6000 • Method 2 • Actively find footstep • Create new vector by recording a sliding-window mean of the absolute signal • Extract largest hill (using gradient descent and threshold) • Index from mean array back into data array • Mean-window sizes 1000, 2000, 3000
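
Both extraction methods can be sketched in a few lines. This is an illustration under stated assumptions, not the original implementation: the slide does not say how the hill boundaries were found, so the second function simply walks outward from the envelope peak until the envelope drops below a fraction of the peak. The function names and `threshold_ratio` are hypothetical.

```python
import numpy as np

def extract_fixed(signal, width):
    """Method 1: fixed window centered on the largest absolute sample."""
    center = int(np.argmax(np.abs(signal)))
    start = max(0, center - width // 2)
    return signal[start:start + width]

def extract_active(signal, mean_window, threshold_ratio=0.2):
    """Method 2: find the largest hill in a sliding mean of |signal|."""
    kernel = np.ones(mean_window) / mean_window
    envelope = np.convolve(np.abs(signal), kernel, mode="same")
    peak = int(np.argmax(envelope))
    cutoff = threshold_ratio * envelope[peak]   # assumed stopping rule
    left = peak
    while left > 0 and envelope[left - 1] > cutoff:
        left -= 1                               # walk downhill to the left
    right = peak
    while right < len(envelope) - 1 and envelope[right + 1] > cutoff:
        right += 1                              # walk downhill to the right
    return signal[left:right + 1]               # index back into the raw data

# Synthetic recording: low-level noise with one 2000-sample burst
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(20_000)
signal[8_000:10_000] += np.sin(np.linspace(0, 200, 2_000))

fixed = extract_fixed(signal, 2000)
active = extract_active(signal, 1000)
print(len(fixed), len(active))
```

Method 1 always returns a window of the requested width (unless clipped at the ends of the recording), while Method 2 returns a window whose length adapts to the footstep.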

  7. Data Preprocessing Mean window of 1000

  8. Data Preprocessing Mean window of 2000

  9. Data Preprocessing Mean window of 3000

  10. Analysis of Preprocessed Data • Cluster analysis • Unsupervised learning • 3 steps • Distance calculation • Linkage analysis • Clustering

  11. Analysis of Preprocessed Data • Distance Calculation • 4 distance measures • Euclidean • Standard distance • Standardized Euclidean • Scales each feature by its standard deviation, so high-variance features do not dominate the distance • City block • Similar to Euclidean, used for comparison • Minkowski • Another way to measure distance, used for comparison • Result is an array holding the distance from each point to every other point
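
For reference, the four distance measures can be written out directly. This is a small NumPy sketch rather than the toolbox routine used in the project; the function name is hypothetical, and the Minkowski exponent `p=3` is an assumption, since the slides do not state which exponent was used.

```python
import numpy as np

def pairwise(X, metric, p=3):
    """Return the full n-by-n distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]       # shape (n, n, d)
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "seuclidean":
        var = X.var(axis=0, ddof=1)            # per-feature sample variance
        return np.sqrt((diff ** 2 / var).sum(axis=-1))
    if metric == "cityblock":
        return np.abs(diff).sum(axis=-1)
    if metric == "minkowski":
        return (np.abs(diff) ** p).sum(axis=-1) ** (1.0 / p)
    raise ValueError(f"unknown metric: {metric}")

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
print(pairwise(X, "euclidean")[0, 1])   # 5.0
print(pairwise(X, "cityblock")[0, 1])   # 7.0
```

City block and Euclidean are the Minkowski distance with exponents 1 and 2, which is why they serve as natural points of comparison.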

  12. Analysis of Preprocessed Data • Linkage Analysis • Hierarchically link datapoints • Methods • Shortest distance • Average distance • Uses center points of clusters • Centroid distance • Draws “sphere” around center point, uses furthest point as radius – use distance from edges of sphere • Incremental sum-of-squares • Similar to centroid, used for comparison • Result is matrix

  13. Analysis of Preprocessed Data • Clustering • Force datapoints into a fixed number of clusters • Result is cluster vector and dendrogram
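
The linkage and clustering steps map closely onto SciPy's hierarchical clustering routines. The sketch below assumes SciPy is available and uses synthetic data shaped like the project's (11 people, 10 walks each, 50 features); it is not the project data. The method names `single`, `average`, `centroid`, and `ward` are taken here as the counterparts of shortest distance, average distance, centroid distance, and incremental sum-of-squares.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in: 11 well-separated "people", 10 samples each, 50 features
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(11, 50))
X = np.repeat(centers, 10, axis=0) + rng.normal(size=(110, 50))

results = {}
for method in ("single", "average", "centroid", "ward"):
    Z = linkage(X, method=method, metric="euclidean")   # (n - 1) x 4 linkage matrix
    labels = fcluster(Z, t=11, criterion="maxclust")    # force at most 11 clusters
    results[method] = labels
    print(method, Z.shape, len(set(labels)))
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` draws the corresponding tree.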

  14. Analysis of Preprocessed Data • How to judge how well the clustering worked? • My answer • Since there are exactly 10 samples from each of 11 people, define “uniformity” as a metric
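
The transcript does not preserve how "uniformity" was computed. As a stand-in only, the sketch below uses cluster purity, a common measure of how well clusters line up with known labels: the fraction of samples that belong to the majority person of their cluster. The function name and the toy labels are hypothetical, and this should not be read as the author's definition.

```python
from collections import Counter

def purity(cluster_labels, person_labels):
    """Fraction of samples that match the majority person of their cluster."""
    members = {}
    for cluster, person in zip(cluster_labels, person_labels):
        members.setdefault(cluster, []).append(person)
    majority = sum(Counter(people).most_common(1)[0][1]
                   for people in members.values())
    return majority / len(person_labels)

people = [0, 0, 1, 1]
print(purity([1, 1, 2, 2], people))   # 1.0
print(purity([1, 2, 1, 2], people))   # 0.5
```

With 10 samples per person and 11 clusters, a score of 1.0 would mean every cluster contains walks from exactly one person.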

  15. Analysis of Preprocessed Data

  16. Analysis of Preprocessed Data • Checked all 12 charts • fix2000, fix4000, fix6000, win1000, win2000, win3000 for vibration and audio • Euclid/Sum-of-squares is best for vibration and audio • win3000 is best for vibration • fix2000 is best for audio

  17. Analysis of Preprocessed Data

  18. Indirect Learning • Used parametric Naïve Bayes model to do multi-way classification • 11 classes • Used 50-bin data • Assumed data was multivariate Gaussian • Chose class based on maximum posterior of C • Used multiple train/test splits to train 3 models with bagging (voting)
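
A Gaussian Naïve Bayes classifier with maximum-posterior prediction and three-model voting can be sketched as below. Several details are assumptions: each model is trained on a bootstrap resample (the slides only say "multiple train/test splits"), a small constant is added to the variances for numerical safety, and the class and function names are hypothetical. The demo data is synthetic.

```python
import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mean = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) for c in self.classes]) + 1e-6
        self.log_prior = np.log(np.array([(y == c).mean() for c in self.classes]))
        return self

    def predict(self, X):
        # Sum per-feature Gaussian log-likelihoods (the independence assumption)
        diff = X[:, None, :] - self.mean[None, :, :]            # (n, classes, d)
        log_lik = -0.5 * (np.log(2 * np.pi * self.var)[None]
                          + diff ** 2 / self.var[None]).sum(axis=-1)
        # Maximum a posteriori class
        return self.classes[np.argmax(log_lik + self.log_prior, axis=1)]

def bagged_predict(X_train, y_train, X_test, n_models=3, seed=0):
    """Train n_models on bootstrap resamples and take a majority vote."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_models):
        idx = rng.choice(len(X_train), size=len(X_train), replace=True)
        model = GaussianNaiveBayes().fit(X_train[idx], y_train[idx])
        votes.append(model.predict(X_test))
    votes = np.stack(votes)                                     # (n_models, n_test)
    return np.array([np.bincount(column).argmax() for column in votes.T])

# Two well-separated synthetic classes, split into interleaved train/test halves
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(6, 1, (40, 5))])
y = np.repeat([0, 1], 40)
pred = bagged_predict(X[::2], y[::2], X[1::2])
accuracy = (pred == y[1::2]).mean()
print(accuracy)
```

The model works well on this toy data because it really is Gaussian; the point of the next slides is that the footstep features apparently are not.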

  19. Indirect Learning

  20. Indirect Learning • Bad results • Worse than random predictor • Conclusion • Data is not Gaussian

  21. Direct Learning • Trained neural network with same data • Used softmax network to perform multiway classification • 1000 epochs, log sigmoid, gradient descent • Tried different parameters for neural network
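
A softmax network of the kind described can be sketched in plain NumPy. This is a reduced illustration with a single log-sigmoid hidden layer trained by batch gradient descent on cross-entropy; the project tried several depths and widths, and its toolbox, learning rate, and initialization are not stated, so those values here are assumptions. The data is synthetic.

```python
import numpy as np

def logsig(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, n_classes, hidden=50, epochs=1000, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=(hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                # one-hot targets
    for _ in range(epochs):
        H = logsig(X @ W1 + b1)             # hidden activations
        P = softmax(H @ W2 + b2)            # class probabilities
        dZ2 = (P - Y) / len(X)              # gradient of mean cross-entropy w.r.t. logits
        dZ1 = (dZ2 @ W2.T) * H * (1.0 - H)  # backpropagate through log-sigmoid
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dZ1)
        b1 -= lr * dZ1.sum(axis=0)
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return softmax(logsig(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

# Three synthetic, well-separated classes in two dimensions
rng = np.random.default_rng(2)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + 0.5 * rng.standard_normal((90, 2))

params = train(X, y, n_classes=3, hidden=10)
train_accuracy = (predict(params, X) == y).mean()
print(train_accuracy)
```

Training accuracy alone says little; as the later slides note, the real test is performance on held-out walks.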

  22. Direct Learning Vibration Audio

  23. Direct Learning • No improvement beyond 50 neurons per level (vibration and audio) • 4 levels is best (including output level) • Results terrible for test sets

  24. Conclusion • Need • Better feature extraction • Better classifiers • Or… maybe different sensors are needed • Video
