Explore the intersection of math, science, and technology in understanding intelligence through learning models. Discover how these models impact fields like pharmaceuticals, marketing, cancer diagnosis, and superconductivity applications.
Math Models for Learning and Discovery Kristin P. Bennett Mathematical Sciences Department Rensselaer Polytechnic Institute
The Learning Problem The problem of understanding intelligence is said to be the greatest problem in science today and “the” problem for this century – as deciphering the genetic code was for the second half of the last one…the problem of learning represents a gateway to understanding intelligence in man and machines. -- Tomaso Poggio and Steve Smale, 2003
What do these problems have in common? • Design and Discovery of Pharmaceuticals • Target Marketing in Business • Diagnosis of Breast Cancer • Discovery of Novel Superconductors • Detection of Anthrax using THz spectroscopy • Modeling and predicting global trade • RNA Transcription
DRUG TRIVIA • In the USA, $25B/yr is spent on pharmaceutical R&D (33% on clinical trials) • Worth their weight in gold • 10-15 years from conception to market for a drug • Development cost $0.5B/drug • First-year sales > $1B/drug • 1 drug approved per 5000 compounds tested • 1 out of 100 drugs succeeds to market • 19 Alzheimer’s drugs in development • 20,000,000 Americans with Alzheimer’s by 2050 DDASSL RENSSELAER
Drugs: Worth Their Weight in GOLD
TOWARDS TREATING THE HIV EPIDEMIC HIV Reverse-Transcriptase Inhibition modeling: • Have a few molecules that have been tested. • Can we predict whether a new molecule will inhibit HIV?
What do we know? • The bioactivities of a small set of molecules • Many possible descriptors for each molecule: molecular weight, electrostatic potential, ionization potential • Can we predict a molecule's bioactivity?
Database Marketing • Bank has a $1.7 billion portfolio of home mortgages. • When a customer refinances, the bank may lose that customer. • Question: will a customer refinance? • If so, offer that customer a good deal on refinancing.
What do we know? • For many customers, we know whether they refinanced or not. • We know attributes of each customer: • Income • Age • Residential Area • Payment History • Can we predict the behavior of future customers?
Breast Cancer Diagnosis Fine needle aspirate of breast tumor. Is tumor benign or malignant?
What do we know? • For patients in initial study, we know whether tumor was benign or malignant. • Have a digital image of tumor aspirate. • Know characteristics doctors look at: • Uniformity of cell shape • Uniformity of cell size • Cell Mitosis
Superconductivity • Superconductivity is the ability of a material to conduct current with no resistance and extremely low loss. • A few high-temperature superconductors have been found. • What other compounds are superconductors?
Applications of Superconductivity: Magnetic Resonance Imaging
Applications of Superconductivity • Maglev Trains
Applications of Superconductivity • Very small and efficient motors • Better power transmission cables • Better cellular phone service Find a cheap high-temperature superconductor and you will get the NOBEL PRIZE.
What do we know? • Many compounds have been tested to see if they are superconductors. • Many descriptors exist for these compounds based on molecular properties.
What do all these problems have in common? Each problem • Can be posed as a “yes” or “no” question. • Has examples known to be of the “yes” type or the “no” type. • Has a set of descriptors associated with each example. Learn a classification function!
Data Mining • Each problem has data. • Our job is to “mine” information from this data. • Information depends on the question asked. • In this case we must produce a predictive yes/no model (a.k.a. a classification model) based on the data.
Mathematical Model • Have data • Construct predictive function f(x) → y • Solve mathematical model to find f • Want f to generalize well on future data
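The steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide names no particular model, so a least-squares line fit stands in for "the mathematical model", and the training and held-out numbers are made up. The held-out points play the role of "future data".

```python
# Illustrative sketch: the least-squares model and all numbers are assumptions,
# not taken from the slides.

def fit_line(xs, ys):
    """Find a, b minimizing sum((a*x + b - y)^2) using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b

def mean_squared_error(f, xs, ys):
    """Average squared gap between predictions f(x) and observed y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# "Have data": a training set, plus held-out points standing in for future data.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.2, 6.8]
future_x, future_y = [4.0, 5.0], [9.0, 11.1]

# "Solve mathematical model to find f".
a, b = fit_line(train_x, train_y)

def f(x):
    """The learned predictive function."""
    return a * x + b

# "Want f to generalize well": compare error on seen and unseen points.
train_error = mean_squared_error(f, train_x, train_y)
future_error = mean_squared_error(f, future_x, future_y)
print(f"f(x) = {a:.2f}*x + {b:.2f}")
print(f"training error {train_error:.4f}, held-out error {future_error:.4f}")
```

A small training error alone says little; the held-out error is the quantity that estimates how f behaves on data it has not seen.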
Types of Learning Problems • Classification • Regression • Clustering • Ranking
Data Mining • Classification = yes/no models • Start with examples of yes and no. • Associate a set of descriptors with each example. Descriptors must be appropriate for the question you are asking. • Construct a model to split the two sets • Use the model to predict new examples.
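As a minimal sketch of these five steps, the Python below uses a nearest-centroid rule to split the two sets. The rule itself, the function names, and the tiny data set (two made-up descriptors on a 1-10 scale, loosely modeled on the breast cancer example) are all assumptions chosen for illustration; the slides do not prescribe a specific classifier at this point.

```python
from statistics import mean

def centroid(points):
    """Component-wise mean of a list of equal-length descriptor vectors."""
    return [mean(column) for column in zip(*points)]

def train(examples):
    """Build the model: one centroid for the 'yes' set, one for the 'no' set."""
    yes_points = [x for x, label in examples if label == "yes"]
    no_points = [x for x, label in examples if label == "no"]
    return centroid(yes_points), centroid(no_points)

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def predict(model, x):
    """Answer 'yes' if x is at least as close to the 'yes' centroid as to the 'no' one."""
    yes_centroid, no_centroid = model
    if squared_distance(x, yes_centroid) <= squared_distance(x, no_centroid):
        return "yes"
    return "no"

# Steps 1-2: examples of yes and no, each with descriptors
# (hypothetical values: [uniformity of cell shape, uniformity of cell size]).
training = [
    ([1, 1], "no"), ([2, 1], "no"), ([1, 2], "no"),
    ([8, 7], "yes"), ([9, 8], "yes"), ([7, 9], "yes"),
]

# Step 3: construct a model that splits the two sets.
model = train(training)

# Step 4: use the model to predict new examples.
print(predict(model, [2, 2]))
print(predict(model, [8, 8]))
```

The quality of the split depends on the descriptors: if the chosen descriptors do not differ between the yes and no examples, no classifier built on them will separate the two sets.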
Learning Model • What kind of learning task is it? • What sort of f should we use? • Kernel function • What loss function to use? • What regularization function? • How can we solve this learning model? • How well will the model predict new points?
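One common way to see how these choices fit together is the regularized kernel model below. It is offered as an assumed template, not necessarily the formulation used in the course, and all notation is introduced here: training pairs $(x_i, y_i)$ for $i = 1, \dots, n$, a loss function $L$, a kernel function $K$, a regularization function $\Omega$ with weight $\lambda > 0$, coefficients $\alpha_j$, and an offset $b$.

```latex
\min_{\alpha,\, b} \;\; \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr) \;+\; \lambda\, \Omega(f),
\qquad
f(x) = \sum_{j=1}^{n} \alpha_j K(x_j, x) + b,
\qquad
\text{e.g. } \Omega(f) = \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j K(x_i, x_j)
```

Each question on the slide corresponds to one slot in this template: the learning task constrains the range of $y$, the kernel $K$ determines what sort of $f$ is available, $L$ measures error on the examples, and $\Omega$ penalizes overly complex functions. For instance, choosing the hinge loss $L(y, f) = \max(0,\, 1 - y f)$ leads to a support vector machine, while the squared loss leads to kernel ridge regression. How well the fitted $f$ predicts new points is then estimated on data that was not used for fitting.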
Class information • See course web page http://www.rpi.edu/~bennek/class/mmld/index.htm