The K Nearest Neighbor Algorithm (kNN) Erik Zeitler Uppsala Database Laboratory
Examination
• The examination is split into two parts
  • Solve the assignment
  • Oral examination
• During the oral examination
  • The instructor validates your program using a script
    • If the program does not work, the examination ends immediately (a “fail” grade is given)
    • You may redo the examination later
  • The instructor will ask questions
    • on your implementation
    • on the method itself
• All group members must take part in the solution.
• Group members can get different grades on the same assignment.
Grades
Examination
• Why do we have the oral part? Are we out to get you?
  • The assignments cover a good part of the course; understanding them will help you.
  • If you have problems solving the assignment, please ask during office hours.
  • The only way asking will affect your grade is that you might learn more.
• Solving an assignment and understanding your own solution are different things!
What you need to do
• Sign up for the oral exam
  • Groups of 2–3 students
  • Forms are on my office door, P1320
• Implement a solution
  • Deadline: submit by e-mail 24 h before your oral exam
    • 1, 2: erik.zeitler@it.uu.se
    • 3, 4: gyozo.gidofalvi@it.uu.se
• Answer the questions on the form
  • Bring one form per student
• Prepare for the oral exam:
  • Study the theory behind
K Nearest Neighbor
• Basic idea:
  • If it walks like a duck and it quacks like a duck, then it must be a duck
• So how do we know how a duck walks and talks?
  • Either we ask the other ducks or, if they are unavailable, we look at who else is walking and talking this way.
Duck walking and talking
• Assume that a duck
  • has an average step length of 5…15 cm
  • quacks at a frequency of 600…700 Hz
• On the other hand, consider a cow:
  • its step length is 30…60 cm
  • it moos at 100…200 Hz
Cows and Ducks in a Plot
Enter the Chicken
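The duck-and-cow example can be sketched in a few lines of Python. This is only an illustration, not the assignment solution: the training points are hypothetical values picked inside the ranges given above, the newcomer's measurements are made up, and the name `knn_classify` is an assumption of this sketch. It uses plain Euclidean distance on the raw measurements.

```python
import math
from collections import Counter

# Hypothetical training points (step length in cm, call frequency in Hz),
# chosen inside the ranges from the slides; they are not measured data.
TRAINING = [
    ((6.0, 610.0), "duck"),
    ((10.0, 650.0), "duck"),
    ((14.0, 690.0), "duck"),
    ((35.0, 110.0), "cow"),
    ((45.0, 150.0), "cow"),
    ((55.0, 190.0), "cow"),
]

def knn_classify(query, training, k):
    """Return the majority label among the k training points closest to query."""
    by_distance = sorted(training, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical newcomer: short steps and a fairly high-pitched call.
newcomer = (8.0, 500.0)
print(knn_classify(newcomer, TRAINING, k=3))  # prints "duck"
```

Note that kNN can only answer with a label it has already seen: a newcomer that is neither duck nor cow is still assigned to whichever known class is closest.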
Classifying you using kNN
• Each of you belongs to a group:
  • [F|STS|Int Masters|Exchange Students|Other]
• Let’s classify each one using 1-NN and 3-NN
  • How do we select our distance measure?
  • How do we decide which of 1-NN and 3-NN is better?
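One possible way to compare 1-NN and 3-NN is leave-one-out testing: classify every known point using all the other known points and count how often the answer is right. The sketch below assumes Euclidean distance and uses made-up two-feature data with classes "A" and "B"; the function names are hypothetical, and other evaluation schemes are equally valid.

```python
import math
from collections import Counter

def knn_classify(query, training, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Classify each point using all other points; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # every point except the one under test
        if knn_classify(features, rest, k) == label:
            correct += 1
    return correct / len(data)

# Hypothetical data: two tight clusters plus one stray "A" inside the "B" cluster.
DATA = [
    ((1.0, 1.0), "A"), ((1.5, 1.0), "A"), ((1.0, 1.5), "A"),
    ((4.0, 4.0), "B"), ((4.5, 4.0), "B"), ((4.0, 4.5), "B"),
    ((4.2, 4.2), "A"),
]

for k in (1, 3):
    print(k, leave_one_out_accuracy(DATA, k))
```

On this toy data 1-NN is misled by the single stray point, because every nearby "B" point has it as its nearest neighbor, while 3-NN outvotes it. With real data the outcome can go either way, which is why it has to be measured.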
Things to Consider for the Assignment
• Preprocessing
  • What are the ranges of the different measurements?
  • Is one characteristic more important than another?
    • If so, how can we reflect this?
    • If not, do we need to do something else?
  • You can assume: no missing points, no noise
• Selecting training and testing data and choosing K
  • Is the data sorted in any way? If so, is this good or bad?
  • Are there different ways of subdividing the known data?
  • How do we know if the value of K is good or bad?
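To make the preprocessing questions concrete, here is a minimal sketch of one common option, min-max scaling with optional per-feature weights. It is an assumption of this example, not a prescribed answer: the function names `min_max_ranges` and `scale`, the `weights` parameter, and the sample rows (step length in cm, frequency in Hz) are all hypothetical.

```python
def min_max_ranges(rows):
    """Return (min, max) per feature column, computed from the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]

def scale(row, ranges, weights=None):
    """Map each feature to [0, 1] using the training ranges, then apply weights."""
    weights = weights or [1.0] * len(row)
    scaled = []
    for value, (lo, hi), w in zip(row, ranges, weights):
        span = hi - lo
        # A constant column carries no information; map it to 0 to avoid dividing by zero.
        scaled.append(w * ((value - lo) / span if span else 0.0))
    return scaled

# Hypothetical rows: (step length in cm, call frequency in Hz).
TRAIN_ROWS = [(6.0, 610.0), (14.0, 690.0), (35.0, 110.0), (55.0, 190.0)]
ranges = min_max_ranges(TRAIN_ROWS)

print(scale((10.0, 650.0), ranges))                      # equal importance
print(scale((10.0, 650.0), ranges, weights=[2.0, 1.0]))  # step length counts double
```

Without scaling, the frequency axis (hundreds of Hz) would dominate a Euclidean distance over the step-length axis (tens of cm). Whether that is desirable depends on which characteristic should matter more.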
Things to Consider for the Assignment
• Classifying unknown data
  • Do we need to preprocess the unknown data?
  • Which data set should we use to classify the unknown data?
• Complexity
  • What is the offline part of kNN and what is the online part?
  • What is the complexity for the offline and online parts of kNN?
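As a starting point for the complexity questions, the sketch below separates the two phases in a brute-force implementation. It assumes the simplest possible design (store the points, scan them all per query); the class name `BruteForceKNN` is hypothetical, and index structures such as k-d trees would shift work from the online part to the offline part.

```python
import heapq
import math
from collections import Counter

class BruteForceKNN:
    def __init__(self, training):
        # Offline part: just keep the labeled points; no model is fitted.
        self.training = list(training)

    def classify(self, query, k):
        # Online part: one distance per stored point, i.e. O(n * d) work for
        # n points with d features; selecting the k smallest is O(n log k).
        nearest = heapq.nsmallest(
            k, self.training, key=lambda item: math.dist(query, item[0])
        )
        return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical usage with made-up points.
model = BruteForceKNN([((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((5.0, 5.0), "b")])
print(model.classify((0.2, 0.1), k=1))
```

In this design nearly all the cost is paid at query time, so classifying m unknown points costs roughly m times the cost of a single query.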