Artificial Intelligence Project 1: Neural Networks
Biointelligence Lab, School of Computer Sci. & Eng., Seoul National University
Outline
• Classification Problems
• Task 1
  • Estimate several statistics on the Diabetes data set
• Task 2
  • Given an unknown data set, achieve the best performance you can
  • The test data is hidden.
(C) 2000-2002 SNU CSE BioIntelligence Lab
Network Structure (1)
[Figure: network with two output units, labeled positive and negative]
If f_pos(x) > f_neg(x), then x is positive.
Network Structure (2)
[Figure: network with a single output unit]
If f(x) > thres, then x is positive.
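The two decision rules can be sketched as below. This is a minimal illustration that assumes the network outputs have already been computed; the function names and the default threshold of 0.5 are assumptions for the example, not part of the assignment.

```python
def predict_two_outputs(f_pos: float, f_neg: float) -> str:
    """Structure (1): compare the positive and negative output units."""
    return "positive" if f_pos > f_neg else "negative"


def predict_single_output(f_x: float, thres: float = 0.5) -> str:
    """Structure (2): compare the single output against a threshold.

    The default of 0.5 is only a placeholder; the report asks you to
    state the threshold you actually used.
    """
    return "positive" if f_x > thres else "negative"
```

With structure (2), changing `thres` trades false positives against false negatives, which is one reason to report the value used.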
Pima Indian Diabetes
• Data (768)
• 8 Attributes
  • Number of times pregnant
  • Plasma glucose concentration in an oral glucose tolerance test
  • Diastolic blood pressure (mm Hg)
  • Triceps skin fold thickness (mm)
  • 2-hour serum insulin (μU/mL)
  • Body mass index (kg/m²)
  • Diabetes pedigree function
  • Age (years)
• Positive: 268, negative: 500
Report (1/4)
• Number of Epochs
Report (2/4)
• Number of Hidden Units
• At least 10 runs for each setting
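One way to organize the repeated runs is to collect the score of each run and report the mean and standard deviation per setting. The sketch below assumes a hypothetical `run_once(hidden_units, seed)` function that trains and evaluates one network; `fake_run` is only a stand-in that returns made-up numbers so the example executes.

```python
import random
import statistics
from typing import Callable, Dict, List, Tuple


def summarize_runs(
    run_once: Callable[[int, int], float],
    hidden_unit_settings: List[int],
    n_runs: int = 10,
) -> Dict[int, Tuple[float, float]]:
    """Return {hidden_units: (mean score, sample std)} over n_runs seeds."""
    summary = {}
    for hidden_units in hidden_unit_settings:
        # A different seed per run gives different initial weights.
        scores = [run_once(hidden_units, seed) for seed in range(n_runs)]
        summary[hidden_units] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


def fake_run(hidden_units: int, seed: int) -> float:
    """Placeholder for real training; returns a made-up accuracy."""
    rng = random.Random(hidden_units * 1000 + seed)
    return 0.70 + 0.05 * rng.random()


if __name__ == "__main__":
    for units, (mean, std) in summarize_runs(fake_run, [2, 4, 8]).items():
        print(f"hidden units={units}: mean={mean:.3f}, std={std:.3f}")
```

Replacing `fake_run` with your own training routine keeps the reporting code unchanged.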
Report (3/4)
Report (4/4)
• The normalization method you applied
• Other parameter settings
  • Learning rates
  • The threshold value above which you predict an example as positive
    • If f(x) > thres, the example is predicted positive; otherwise negative.
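The slides do not prescribe a normalization method. As an illustration only, the sketch below shows two common choices, min-max scaling and z-score standardization, applied per attribute (column); the function names are assumptions for the example.

```python
import statistics
from typing import List

Rows = List[List[float]]


def min_max_normalize(rows: Rows) -> Rows:
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def z_score_normalize(rows: Rows) -> Rows:
    """Shift each column to mean 0 and scale to (population) std 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0
         for v, m, s in zip(row, means, stds)]
        for row in rows
    ]
```

In practice the statistics (min, max, mean, std) are usually computed on the training data only and then reused to transform the test data.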
Challenge (1)
• Unknown Data
  • Data for you: 3282 examples
    • 16-dimensional input vectors, each labeled with one of 5 classes
    • The 5 classes are A, B, C, D, E
  • Test data
    • 582 examples
    • Labels are HIDDEN!
Challenge (2)
• Data
  • Train.txt: 3282 x 17 (3282 examples, 16-dim input plus the label in the last column)
  • Test.txt: 582 x 16 (582 examples, 16-dim input, labels are hidden)
• Verify your NN at
  • http://knight.snu.ac.kr/aiproj1/ai_nn.asp
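A file laid out this way could be read as sketched below. The slides do not state the delimiter or how labels are written, so this example assumes whitespace-separated fields with the label kept as a string in the last column; `parse_labeled_rows` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def parse_labeled_rows(
    lines: Iterable[str], n_inputs: int = 16
) -> Tuple[List[List[float]], List[str]]:
    """Split each row into n_inputs float features plus a trailing label.

    Assumes whitespace-separated fields; blank lines are skipped.
    """
    inputs: List[List[float]] = []
    labels: List[str] = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != n_inputs + 1:
            raise ValueError(
                f"line {line_no}: expected {n_inputs + 1} fields, got {len(fields)}"
            )
        inputs.append([float(v) for v in fields[:n_inputs]])
        labels.append(fields[n_inputs])
    return inputs, labels
```

Usage, assuming Train.txt is in the working directory: `with open("Train.txt") as f: X, y = parse_labeled_rows(f)`. Checking that `len(X)` is 3282 is a quick test of the format assumption.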
[Figure: the five classes A, B, C, D, E]
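What to Submit
• State the submitter name under which you achieved your best performance
• Neural network structure
• A description of what you tried in order to reach your best performance
• Your best performance (score): performance has little correlation with your grade.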
References
• Source Codes
  • Free software
    • NN libraries (C, C++, JAVA, …)
    • MATLAB Tool box
    • Weka
• Web sites
  • http://www.cs.waikato.ac.nz/~ml/weka/
Pay Attention!
• Due: October 14, 2003, by 11:59 PM
• Submission
  • Results obtained from your experiments
    • Compress the data
    • Via e-mail
  • Report: hardcopy!!
    • Software used and running environment
    • Results of many experiments with various parameter settings
    • Analysis and explanation of the results in your own way
Optional Experiments
• Various learning rates
• Number of hidden layers
• Different k values
• Output encoding
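As an illustration of one possible output encoding for the five-class task, the sketch below uses one-hot targets (one output unit per class) and decodes a prediction by taking the largest output. The slides do not specify an encoding; this is just one common option to compare against others, and the names are assumptions for the example.

```python
from typing import List

CLASSES = ["A", "B", "C", "D", "E"]


def one_hot(label: str) -> List[float]:
    """Encode a class label as a target vector with a single 1.0."""
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label!r}")
    return [1.0 if c == label else 0.0 for c in CLASSES]


def decode(outputs: List[float]) -> str:
    """Map network outputs back to the class with the largest activation."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return CLASSES[best]
```

For example, `one_hot("C")` gives the target `[0.0, 0.0, 1.0, 0.0, 0.0]`, and `decode` applied to that vector returns `"C"`.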