
Engineering Data Analysis & Modeling Practical Solutions to Practical Problems

Engineering Data Analysis & Modeling Practical Solutions to Practical Problems. Dr. James McNames Biomedical Signal Processing Laboratory Electrical & Computer Engineering Portland State University. Course Overview. Key question: How to extract useful information from data? Some theory


Presentation Transcript


  1. Engineering Data Analysis & Modeling Practical Solutions to Practical Problems Dr. James McNames Biomedical Signal Processing Laboratory Electrical & Computer Engineering Portland State University

  2. Course Overview • Key question: How to extract useful information from data? • Some theory • Mostly methods & applications • Problem oriented, not technology focused • Project course

  3. Talk Overview • Problem definitions • Applications • Project ideas • Course specifics

  4. Problem Definitions • Preprocessing (briefly) • Variable selection • Dimension reduction • Decision theory (hypothesis testing) • Density estimation • Nonlinear optimization • Pattern recognition/Classification (very briefly) • Nonlinear modeling (univariate & multivariate)

  5. Variable Selection • Many algorithms fail if too many inputs • Often fewer inputs are sufficient due to • Redundant inputs • Irrelevant inputs • Goal: Find a subset of inputs that maximizes model accuracy • Is Greenspan’s BP relevant?
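The ideas of irrelevant and redundant inputs on slide 5 can be illustrated with a simple correlation filter. This is a minimal sketch, not necessarily a method from the course: it drops inputs whose correlation with the output is weak (irrelevant) and inputs that nearly duplicate one already kept (redundant). The function names, the two thresholds, and the toy data are all hypothetical. A filter like this only approximates the stated goal, since it never fits a model or measures accuracy directly.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence carries no linear information
    return sxy / math.sqrt(sxx * syy)

def select_inputs(columns, y, min_relevance=0.3, max_redundancy=0.95):
    """Keep relevant inputs, skipping near-duplicates of ones already kept.

    columns: dict mapping input name -> list of values (hypothetical layout).
    The thresholds are illustrative assumptions, not recommended values.
    """
    ranked = sorted(columns, key=lambda name: -abs(pearson(columns[name], y)))
    selected = []
    for name in ranked:
        if abs(pearson(columns[name], y)) < min_relevance:
            continue  # irrelevant: weak linear relation to the output
        if any(abs(pearson(columns[name], columns[s])) > max_redundancy
               for s in selected):
            continue  # redundant: nearly the same as a kept input
        selected.append(name)
    return selected

# Toy data: x2 is x1 plus a small wiggle, x3 is unrelated to the output.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [v + (0.1 if i % 2 == 0 else -0.1) for i, v in enumerate(x1)]
x3 = [1, -1, -1, 1, 1, -1, -1, 1]
y = [3 * v for v in x1]

selected = select_inputs({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)  # ['x1']
```

Correlation only captures linear relationships, so an input with a strong nonlinear effect on the output could be discarded by this sketch.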

  6. Dimension Reduction • Redundant inputs can also be combined into a smaller composite set • Improves accuracy • Reduces computation • If done well, minimal information is lost • Used for signal compression • Principal component analysis is most common
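As a rough illustration of the principal component analysis mentioned on slide 6, the sketch below combines two redundant inputs into one composite input. It is a minimal example under stated assumptions: the function name and toy data are hypothetical, and the leading direction is found with plain power iteration on the sample covariance matrix, which is one of several ways to compute it.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (means, direction, scores) for the leading principal component.

    rows: list of equal-length input vectors.
    Power iteration is used for simplicity; it assumes the start vector
    is not orthogonal to the leading direction.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered inputs.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # all inputs are constant; no direction to find
        v = [x / norm for x in w]
    # Project each centered point onto the direction: one composite input.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return means, v, scores

# Toy data: the second input is roughly twice the first (redundant).
rows = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
means, direction, scores = first_principal_component(rows)
print(direction)  # unit vector pointing roughly along (1, 2)
print(scores)     # one composite value per original two-input point
```

Each two-input point is replaced by a single score, so little information is lost here because the points lie close to a line.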

  7. Dimension Reduction Example 1

  8. Dimension Reduction Example 2

  9. Nonlinear Optimization • Find the vector a such that E(a) is minimized • Many algorithms have parameters that must be “fit” to the data • Usually “fit” by minimizing error measure • Sometimes subject to a constraint G(a) = 0 • Unconstrained optimization more common • Very widely used • Many engineering applications
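The unconstrained problem on slide 9, finding the vector a that minimizes E(a), can be sketched with gradient descent. This is an illustrative example only: the exponential model, the toy data, the function names, and the step-halving rule are assumptions made for the sketch, and the gradient is estimated numerically rather than derived.

```python
import math

def numerical_gradient(E, a, h=1e-6):
    """Central-difference estimate of the gradient of E at a."""
    grad = []
    for i in range(len(a)):
        up, down = list(a), list(a)
        up[i] += h
        down[i] -= h
        grad.append((E(up) - E(down)) / (2 * h))
    return grad

def gradient_descent(E, a_start, iterations=2000, first_step=0.1):
    """Minimize E by stepping downhill, halving the step until E decreases."""
    a = list(a_start)
    for _ in range(iterations):
        grad = numerical_gradient(E, a)
        current = E(a)
        step = first_step
        improved = False
        while step > 1e-12:
            trial = [ai - step * gi for ai, gi in zip(a, grad)]
            if E(trial) < current:
                a, improved = trial, True
                break
            step /= 2
        if not improved:
            break  # no downhill step found; treat as converged
    return a

# Toy data generated from y = 2 * exp(-0.5 * x), without noise.
xs = [0.5 * k for k in range(9)]
ys = [2.0 * math.exp(-0.5 * x) for x in xs]

def sum_squared_error(a):
    """Error measure E(a) for the model a[0] * exp(a[1] * x)."""
    return sum((a[0] * math.exp(a[1] * x) - y) ** 2 for x, y in zip(xs, ys))

a_fit = gradient_descent(sum_squared_error, [1.0, 0.0])
print(a_fit, sum_squared_error(a_fit))
```

Gradient descent finds a local minimum near the starting point, so a different start or error surface may lead to a different answer.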

  10. Pattern Recognition • Closely related to nonlinear modeling • Goal is to identify most likely category given an input vector • Equivalent to drawing decision boundaries • Following example • Crab data • Four categories • Two composite inputs
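The link between classification and decision boundaries on slide 10 can be sketched with a nearest-centroid classifier. This is a minimal illustration, not the method used for the crab data: the category names and points below are made up, and the nearest centroid matches the most likely category only under extra assumptions (for example, equally common categories with similar, roughly spherical spread).

```python
def fit_centroids(samples):
    """Average the input vectors of each category.

    samples: list of (input_vector, category) pairs (hypothetical layout).
    """
    sums, counts = {}, {}
    for x, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def classify(centroids, x):
    """Return the category whose centroid is closest to x."""
    def squared_distance(label):
        return sum((c - v) ** 2 for c, v in zip(centroids[label], x))
    return min(centroids, key=squared_distance)

# Made-up training data: four categories, two composite inputs.
samples = [
    ((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
    ((5, 0), "B"), ((6, 0), "B"), ((5, 1), "B"),
    ((0, 5), "C"), ((1, 5), "C"), ((0, 6), "C"),
    ((5, 5), "D"), ((6, 5), "D"), ((5, 6), "D"),
]
centroids = fit_centroids(samples)
print(classify(centroids, (0.5, 0.2)))
print(classify(centroids, (4.8, 4.9)))
```

The decision boundaries implied by this rule are the straight lines halfway between pairs of centroids; other classifiers draw curved boundaries.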

  11. Crabs Data Set

  12. Biomedical Application • Goal: identify brain cell types from microrecordings • Current research project • 5 categories of cell types • Created metrics to characterize signals • Following scatterplot shows 2 of these metrics

  13. Neurosurgery Example

  14. Nonlinear Modeling • Given many examples of observed variables, create a model that can predict the output • No other assumed knowledge • Observed variables • Quantitative • Measurable

  15. Nonlinear Modeling • Observed variables may not be causal • Not all causal effects are observed • Model will not be perfect • How do you measure how good the model is?
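One common answer to the closing question on slide 15 is to test a model on data it was not fit to. The sketch below uses leave-one-out cross-validation with mean squared error to compare a straight-line model against a constant predictor. It is an assumption-laden illustration: the slides do not say which accuracy measure the course uses, and the function names and toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares straight line; returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_mean(xs, ys):
    """Baseline model that always predicts the average output."""
    mean = sum(ys) / len(ys)
    return lambda x: mean

def leave_one_out_mse(fit, xs, ys):
    """Fit on all points but one, score on the held-out point, and average."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        errors.append((model(xs[i]) - ys[i]) ** 2)
    return sum(errors) / len(errors)

# Toy data: roughly y = 2x + 1 with small made-up deviations.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.2, 2.9, 5.1, 6.8, 9.3, 10.9, 13.2, 14.8]

line_mse = leave_one_out_mse(fit_line, xs, ys)
mean_mse = leave_one_out_mse(fit_mean, xs, ys)
print(line_mse, mean_mse)
```

Scoring on held-out points guards against a model that memorizes the examples but predicts new points poorly.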

  16. Smoothing • For single-input single-output (SISO) systems, can plot the data • Problem is to estimate a curve that most accurately predicts future points • Could draw a smooth curve by hand • More difficult to implement automatically • More than one curve may be reasonable
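A simple way to estimate a smooth curve automatically is a kernel-weighted average of nearby points. The sketch below is a hypothetical example, not necessarily a smoother covered in the course: the Gaussian kernel, the function name, the bandwidth values, and the toy data are all assumptions. Changing the bandwidth gives a different but still plausible curve, which is the point of the last bullet on slide 16.

```python
import math

def kernel_smooth(xs, ys, x0, bandwidth):
    """Estimate the curve at x0 as a Gaussian-weighted average of the data.

    Points near x0 get large weights; the bandwidth sets how quickly the
    weights fall off. If x0 is very far from every point relative to the
    bandwidth, the weights can underflow to zero and the division fails.
    """
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

# Toy single-input single-output data with made-up scatter.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0.1, 0.9, 1.7, 3.2, 3.9, 5.3, 5.8, 7.1, 8.2, 8.8]

grid = [0.5 * k for k in range(19)]
wiggly_curve = [kernel_smooth(xs, ys, x0, bandwidth=0.5) for x0 in grid]
smooth_curve = [kernel_smooth(xs, ys, x0, bandwidth=2.0) for x0 in grid]
print(wiggly_curve[:3])
print(smooth_curve[:3])
```

The narrow bandwidth follows individual points closely, while the wide bandwidth averages over more of them; choosing between the two requires some measure of prediction accuracy.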

  17. Smoothing Example

  18. Multiple “Reasonable” Solutions

  19. Nonlinear Modeling • Many methods do not work well • Usually is much more difficult • Noise • Multiple inputs • Time-varying system • Small data sets • Still an active area of research • Will discuss “tried and true” solutions

  20. Overview of Course • Introduction & review • Linear models • Univariate smoothing • Optimization algorithms • Nonlinear modeling • Pattern recognition & classification

  21. Application Areas • Engineering • Controls (system identification) • Signal processing (estimation & prediction) • Communications (channel equalization) • Statistics • Mathematics • Computer science • Systems science

  22. Application Examples • Time series prediction • Aircraft carrier landing systems • Spatial Wafer Patterns • Fault Detection • Machinery health monitoring • Automated, objective credit rating • Fraud detection

  23. Time Series Prediction

  24. Spatial Wafer Patterns

  25. Wafer Components

  26. Estimation (Regression) Results

  27. Fault Detection in Semiconductor Manufacturing

  28. Aircraft Carrier Landing System • Can be very hard • Limited visibility • Rough seas • Night • Predict location at touch down • Flight deck • Aircraft • Is rocking of flight deck predictable?

  29. Machinery Health Monitoring • Cost of machinery failure can be very high • Recent growth in real-time monitoring • Health and Usage Monitoring Systems (HUMS) • Condition Based Maintenance (CBM) • Reduce costs • Increase safety

  30. Fraud Detection • Credit card fraud cost $864 million in 1992 • How quickly can fraud be detected? • The companies have amassed large databases • What are the patterns of fraud? • Active area of research

  31. Past Projects • Many past projects • See reports & slides on the web • Many time series applications • Need not be time series related • Many have resulted in conference and journal publications • Expect improved quality this term

  32. Project Ideas • It is up to you to identify a project • Preferred • Data readily available (no new instrumentation or study design) • Independent samples (not time series data) • Engineering related • High likelihood of success (no financial forecasting)

  33. Course Logistics • Project oriented • Project reports • Must meet IEEE journal requirements • May be encouraged to publish • Brief oral slide presentation at end of term • Most projects are applied • May also create new methods or compare existing methods

  34. Prerequisites • Helpful • Random processes (ECE 565) • Signal processing (ECE 566) • Proficiency with MATLAB or similar • Required • Calculus • Probability & statistics (STAT 451) • Linear algebra (MTH 343) • Proficiency at programming
