630 likes | 645 Views
Slides at www.cs.ucr.edu/~eamonn/public/cs170guest.ppt Eamonn Keogh eamonn@cs.ucr.edu. What do. Bill Gates Sir Richard Branson Larry Page Mohammed Saleh Bin Laden. …have in common?. They each wrote to me in the last 3 months about my paper on Bayesian classification of insects….
E N D
Slides at www.cs.ucr.edu/~eamonn/public/cs170guest.ppt Eamonn Keogh eamonn@cs.ucr.edu
What do... • Bill Gates • Sir Richard Branson • Larry Page • Mohammed Saleh Bin Laden …have in common?
They each wrote to me in the last 3 months about my paper on Bayesian classification of insects… • Let us • Review Bayesian classification • Discuss the paper
Naïve Bayes Classifier Thomas Bayes 1702 - 1761
The Classification Problem (informal definition) Katydids Given a collection of annotated data. In this case 5 instances Katydids of and five of Grasshoppers, decide what type of insect the unlabeled example is. Grasshoppers Katydid or Grasshopper?
For any domain of interest, we can measure features Color {Green, Brown, Gray, Other} Has Wings? Abdomen Length Thorax Length Antennae Length Mandible Size Spiracle Diameter Leg Length
10 9 8 7 6 5 4 3 2 1 1 2 3 4 5 6 7 8 9 10 Grasshoppers Katydids Antenna Length Abdomen Length Let’s get lots more data…
10 9 8 7 6 5 4 3 2 1 1 2 3 4 5 6 7 8 9 10 Katydids Grasshoppers With a lot of data, we can build a histogram. Let us just build one for “Antenna Length” for now… Antenna Length
We can leave the histograms as they are, or we can summarize them with two normal distributions. Let us us two normal distributions for ease of visualization in the following slides…
We want to classify an insect we have found. Its antennae are 3 units long. How can we classify it? • We can just ask ourselves, give the distributions of antennae lengths we have seen, is it more probable that our insect is a Grasshopperor a Katydid. • There is a formal way to discuss the most probable classification… p(cj | d) = probability of class cj, given that we have observed d 3 Antennae length is 3
p(cj | d) = probability of class cj, given that we have observed d P(Grasshopper | 3 ) = 10 / (10 +2) = 0.833 P(Katydid | 3 ) = 2 / (10 + 2) = 0.166 10 2 3 Antennae length is 3
p(cj | d) = probability of class cj, given that we have observed d P(Grasshopper | 7 ) = 3 / (3 +9) = 0.250 P(Katydid | 7 ) = 9 / (3 + 9) = 0.750 9 3 7 Antennae length is 7
p(cj | d) = probability of class cj, given that we have observed d P(Grasshopper | 5 ) = 6 / (6 +6) = 0.500 P(Katydid | 5 ) = 6 / (6 + 6) = 0.500 6 6 5 Antennae length is 5
Bayes Classifiers • That was a visual intuition for a simple case of the Bayes classifier, also called: • Idiot Bayes • Naïve Bayes • Simple Bayes • We are about to see some of the mathematical formalisms, and more examples, but keep in mind the basic idea. • Find out the probability of the previously unseen instance belonging to each class, then simply pick the most probable class.
Bayes Classifiers • Bayesian classifiers use Bayes theorem, which says p(cj | d ) = p(d | cj ) p(cj) p(d) • p(cj | d) = probability of instance d being in class cj, This is what we are trying to compute • p(d | cj) = probability of generating instance d given class cj, We can imagine that being in class cj, causes you to have feature d with some probability • p(cj) = probability of occurrence of class cj, This is just how frequent the class cj, is in our database • p(d) = probability of instance d occurring This can actually be ignored, since it is the same for all classes
(Note: “Drew can be a male or female name”) Drew Barrymore Assume that we have two classes c1 = male, and c2 = female. We have a person whose sex we do not know, say “drew” or d. Classifying drew as male or female is equivalent to asking is it more probable that drew is male or female, i.e which is greater p(male| drew) or p(female| drew) Drew Carey What is the probability of being called “drew” given that you are a male? What is the probability of being a male? p(male| drew) = p(drew | male) p(male) p(drew) What is the probability of being named “drew”? (actually irrelevant, since it is that same for all classes)
This is Officer Drew (who arrested me in 1997). Is Officer Drew a Male or Female? Luckily, we have a small database with names and sex. We can use it to apply Bayes rule… Officer Drew p(cj | d) = p(d | cj ) p(cj) p(d)
p(cj | d) = p(d | cj ) p(cj) p(d) Officer Drew p(male| drew) = 1/3 * 3/8 = 0.125 3/83/8 Officer Drew is more likely to be a Female. p(female| drew) = 2/5 * 5/8 = 0.250 3/83/8
Officer Drew IS a female! Officer Drew p(male| drew) = 1/3 * 3/8 = 0.125 3/8 3/8 p(female| drew) = 2/5 * 5/8 = 0.250 3/8 3/8
So far we have only considered Bayes Classification when we have one attribute (the “antennae length”, or the “name”). But we may have many features. How do we use all the features? p(cj | d) = p(d | cj ) p(cj) p(d)
To simplify the task, naïve Bayesian classifiers assume attributes have independent distributions, and thereby estimate p(d|cj) = p(d1|cj) * p(d2|cj) * ….* p(dn|cj) The probability of class cj generating instance d, equals…. The probability of class cj generating the observed value for feature 1, multiplied by.. The probability of class cj generating the observed value for feature 2, multiplied by..
To simplify the task, naïve Bayesian classifiers assume attributes have independent distributions, and thereby estimate p(d|cj) = p(d1|cj) * p(d2|cj) * ….* p(dn|cj) • p(officer drew|cj) = p(over_170cm = yes|cj) * p(eye =blue|cj) * …. Officer Drew is blue-eyed, over 170cm tall, and has long hair • p(officer drew| Female) = 2/5 * 3/5 * …. • p(officer drew| Male) = 2/3 * 2/3 * ….
cj The Naive Bayes classifiers is often represented as this type of graph… Note the direction of the arrows, which state that each class causes certain features, with a certain probability p(d1|cj) p(d2|cj) p(dn|cj) …
cj Naïve Bayes is fast and space efficient We can look up all the probabilities with a single scan of the database and store them in a (small) table… p(d1|cj) p(d2|cj) p(dn|cj) …
cj An obvious point. I have used a simple two class problem, and two possible values for each example, for my previous examples. However we can have an arbitrary number of classes, or feature values p(d1|cj) p(d2|cj) p(dn|cj) …
The Naïve Bayesian Classifier has a piecewise quadratic decision boundary Grasshoppers Katydids Ants Adapted from slide by Ricardo Gutierrez-Osuna
What is the Deadliest Animal? 300+ per year 50,000+ per year 500+ per year 10+ per year 1,000,000+ per year 200+ per year
What is the Deadliest Animal? 300+ per year 50,000+ per year 500+ per year 10+ per year 1,000,000+ per year 200+ per year
One penny weights about the same as one thousand mosquitoes How can something so small be so deadly?
What is Malaria? • Malaria is a disease that involves high fevers, shaking chills, joint pain, flu-like symptoms. In some cases it can produce coma and death. • There are more than 225 million cases of malaria each year, killing around 1-million people.
Where does Malaria come from? Malaria has been known since ancient times. Many believed it came from “bad air” (Italian: mala aria, “bad air”) 500 years ago, a handful of people believed that insects might be involved in human diseases. Hortus Sanitatis (The Garden of Health) 1497
It was Sir Ronald Ross, an British army surgeon working in India, who proved in 1897 that malaria is transmitted by mosquitoes. Sir Ronald Ross received the 1902 Nobel Prize for Physiology or Medicine for his work (This was somewhat controversial, as many others made similar discoveries around the same time )
Malaria Parasites Malaria Transmission Cycle 1st Vector Next Human host Initial Human host Liver infection 2nd Vector Blood infection
We get malaria from mosquitoes We get malaria from humans
The Mosquito • There are 3,528 kinds of mosquitoes • Only a handful of species take human blood • Only the females take human blood • There are 100 trillion mosquitoes alive today • Mosquitoes have been around for at least 100 million years • We know this from fossil records/DNA studies • Mosquitoes have spread malaria for at least 35 million years • We know this from insects found in amber
Given that we have known for over one hundred years how Malaria is spread, where is the magic pill or immunization? For a variety of reasons, a cure or immunization continues to alluded mankind. However there are some interventions that can help In the 20th century, smallpox killed 400 million people worldwide, it is now eradicated. Polio is almost eradicated.
Interventions to Mitigate Malaria • The use of insecticidal treated mosquito nets • Spraying of insecticides (including controversial chemicals such as DDT) • Introduction of fish/turtles/crustaceans to eat mosquito larva • The introduction of dragonflies which eat adult mosquitoes. • Habitat reduction by draining ponds and pools • Use of chemical films to reduce the surface tension of water (drowning the pupa). • .. and hundreds more proven or tentative ideas
Interventions Cost Money! • Even cheap solutions have hidden costs • Insecticidal treated mosquito nets are cheap to make, but… • To make mosquito nets work, you need educators, incentive programs, maintenance etc “...aid agencies and non-governmental organizations are quietly grappling with a problem: Data suggest that nearly half of Africans who have access to the nets refuse to sleep under them” (LA Times May-2-2010).
Planning interventions requires knowledge • We need to know where/when the problem is the greatest. Where are the insects? Which insects are they? When did they arrive? • The classic solution? Use sticky traps • Inaccurate • Costly • Long time lag
My Research • I believe that we can count and classify insects with sensors. The classification problem • Must be cheap • Must be low powered • Must be accurate
UCR Wingbeat Sensor One second of audio from our sensor. The Common Eastern Bumble Bee (Bombus impatiens) takes about one tenth of a second to pass the laser. 0.2 0.1 0 Background noise Bee begins to cross laser Bee has past though the laser -0.1 -0.2 4 x 10 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5
One second of audio from the laser sensor. Only Bombus impatiens (Common Eastern Bumble Bee) is in the insectary. 0.2 0.1 0 Background noise -0.1 Bee begins to cross laser 4 -0.2 x 10 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 -3 x 10 Single-Sided Amplitude Spectrum of Y(t) 4 Peak at 197Hz 3 60Hz interference |Y(f)| 2 Harmonics 1 0 0 100 200 300 400 500 600 700 800 900 1000 Frequency (Hz)
-3 x 10 4 3 |Y(f)| 2 1 0 0 100 200 300 400 500 600 700 800 900 1000 Frequency (Hz) 0 100 200 300 400 500 600 700 800 900 1000 Frequency (Hz) 0 100 200 300 400 500 600 700 800 900 1000 Frequency (Hz)
0 200 300 400 500 600 700 100 Wing Beat Frequency Hz
0 200 300 400 500 600 700 100 Wing Beat Frequency Hz
400 500 600 700 Aedes aegyptii : Female mean =567, Std = 43 Anopheles stephensi: Female mean =475, Std = 30 517 If I see an insect with a wingbeat frequency of 500, what is it?