KDD CUP 2007
GROUP: 16
Student Number: M9615002, Name: Po-Jui Sue
Student Number: M9615083, Name: Shun-Jhong Niou
SYSTEM AND METHOD
• We use Microsoft Windows XP Service Pack 2, an AMD Athlon(tm) 64 Processor 3500+ and 1 GB of RAM as our test platform.
• First we delete the date attribute, because we think the date is less important than the movie id and the customer id.
• The training dataset is too large to use in full, so we take 20 datasets for each file. We use a feed-forward back-propagation network with the traingdm training function.
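The traingdm function named above performs gradient descent with momentum. As a rough illustration of that idea only, the Python sketch below applies a common form of the momentum update to a one-variable quadratic. It is not the toolbox implementation; the objective, the learning rate `lr` and the momentum constant `mc` are assumed values chosen for the example.

```python
def gd_momentum(grad, w0, lr=0.1, mc=0.9, steps=500):
    """Minimize a 1-D function, given its gradient, with a momentum update."""
    w = w0
    velocity = 0.0
    for _ in range(steps):
        # Blend the previous step (scaled by mc) with the new gradient step.
        velocity = mc * velocity - lr * grad(w)
        w += velocity
    return w


# Hypothetical objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_final = gd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

The momentum term lets each step reuse part of the previous step, which smooths the trajectory compared with plain gradient descent.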
The target format that we use is a probability matrix.
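One plausible reading of a probability-matrix target is a one-hot encoding, with one row per rating class and one column per sample. The Python sketch below illustrates that reading only. The class order (1, 2, 3, 4, 5, 0) follows the order in which the results are reported, and the function name and sample ratings are hypothetical.

```python
RATING_CLASSES = [1, 2, 3, 4, 5, 0]  # assumed class order


def to_probability_matrix(ratings):
    """Return one row per rating class; each column encodes one sample."""
    return [
        [1.0 if rating == cls else 0.0 for rating in ratings]
        for cls in RATING_CLASSES
    ]


# Hypothetical sample ratings, one per column of the target matrix.
targets = to_probability_matrix([4, 0, 5])
```

Each column contains a single 1.0 in the row of the true rating, so a trained network's outputs can be read as per-class scores.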
RESULT
• For the experimental data, the predicted rating is 4. The outputs are 0.0065 for rating 1, 0.5534 for rating 2, 0.0132 for rating 3, 0.6126 for rating 4, 0.1717 for rating 5 and 0.0198 for rating 0. However, most of the correct answers are rating 0.
• To address this problem, we try adding some answer files to the training data set.
• The output is better: 0.1136 for rating 1, 0.0117 for rating 2, 0.8042 for rating 3, 0.084 for rating 4, 0.993 for rating 5 and 0.7571 for rating 0.
• We try using only the answer files to train and simulate. The outputs are 0.0027 for rating 1, 0.0075 for rating 2, 0.0229 for rating 3, 0.0268 for rating 4, 0.018 for rating 5 and 0.922 for rating 0.
• We also try changing the training function to get a better result. The outputs for each training function are:

Training function   Rating 1   Rating 2   Rating 3   Rating 4   Rating 5   Rating 0
trainbfg            0.002      0.0173     0.0691     0.1436     0.045      0
traincgb            0.0058     0.1456     0.021      0.1667     0.0003     0.0156
traincgf            0.097      0.1461     0.2737     0.2351     0.1535     0
traincgp            0.0096     0.0057     0.0068     0.0227     0.2307     0.0009
traingd             0.1136     0.0117     0.7995     0.0843     0.993      0.7491
traingda            0.0173     0.035      0.3254     0.763      0.0011     0.1676
traingdx            0.0014     0.1462     0.0794     0.0001     0.999      0.1254
trainoss            0.0969     0.1466     0.2736     0.2351     0.1536     0.0943
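The predicted rating reported under RESULT is consistent with picking the class whose network output is largest. A minimal Python sketch of that decoding step follows, using the first set of outputs and the outputs from training on the answer files only. The function and variable names are hypothetical.

```python
def predict_rating(outputs):
    """Return the rating whose output value is largest."""
    return max(outputs, key=outputs.get)


# First experiment reported under RESULT.
first_run = {1: 0.0065, 2: 0.5534, 3: 0.0132, 4: 0.6126, 5: 0.1717, 0: 0.0198}
# Training and simulating with the answer files only.
answers_only = {1: 0.0027, 2: 0.0075, 3: 0.0229, 4: 0.0268, 5: 0.018, 0: 0.922}

first_prediction = predict_rating(first_run)       # rating 4
answers_prediction = predict_rating(answers_only)  # rating 0
first_total = sum(first_run.values())              # about 1.3772
```

Because the six outputs of the first run sum to about 1.38 rather than 1, they behave as independent per-class scores rather than a normalized distribution.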
ANALYSIS
• The trainscg training function is the best: it predicts different probabilities for different datasets and gives better answers than the other training functions.
• Then we try changing the transfer function and the number of hidden nodes.
Transfer function   logsig   purelin   tansig
Rating 1            0.0866   1         0.1136
Rating 2            0.0004   1         0.0117
Rating 3            0.9967   1         0.8042
Rating 4            0.01     0         0.084
Rating 5            0.9803   0         0.993
Rating 0            0.9965   0         0.7571

Using tansig in layer 1 and logsig in layer 2 gives the best output in this case.
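For reference, the transfer functions named above have simple closed forms: logsig is the logistic sigmoid, tansig is mathematically equivalent to the hyperbolic tangent, and purelin is the identity. The Python sketch below is an illustrative reimplementation, not the toolbox code.

```python
import math


def logsig(n):
    """Logistic sigmoid: output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def tansig(n):
    """Hyperbolic tangent sigmoid: output lies in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def purelin(n):
    """Linear transfer function: output equals input."""
    return n
```

A logsig output layer keeps every output between 0 and 1, which fits a probability-style target, while purelin leaves the output unbounded.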
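ANALYSIS
Each row is one network output, in the order shown on the slide (presumably ratings 1, 2, 3, 4, 5 and 0, as in RESULT).

15-node network   5-node network   1-node network
0.4112            0                0
0.0853            0.0044           1
0.009             0.0885           0
0.1748            0.7549           0
0.577             0.6614           0.0012
0.0001            0.9623           0.4846

These results show that more nodes do not ensure a better solution, and fewer nodes do not necessarily give a poor one; there is a best number of nodes for this case.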
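To give a feel for how the hidden-layer size changes the model, the Python sketch below counts the weights and biases of a two-layer feed-forward network. It assumes 2 inputs (movie id and customer id) and 6 outputs (ratings 0 to 5); these sizes are inferred from the slides rather than stated in them, and the function name is hypothetical.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Count the weights and biases of a network with one hidden layer."""
    hidden = n_inputs * n_hidden + n_hidden    # layer 1 weights + biases
    output = n_hidden * n_outputs + n_outputs  # layer 2 weights + biases
    return hidden + output


# Assumed sizes: 2 inputs, 6 outputs, and the three hidden-layer sizes tried.
sizes = {h: count_parameters(2, h, 6) for h in (15, 5, 1)}
```

Under these assumptions the 15-node network has 141 trainable parameters, the 5-node network 51 and the 1-node network 15, so the three configurations differ widely in capacity.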
• We think the range of the datasets is too wide, and we use only some of the training datasets for training.
• This can cause prediction errors and makes the data difficult to learn.