
Sequential Tree-k-means Classification Algorithm

A step-by-step guide to the sequential tree-k-means classification algorithm using mean calculations and class sorting.

jsanon

Presentation Transcript


  1. pTree-k-means-classification-sequential (pkmc-s)

     Notation. P_{j,i} is the pTree (bit slice) for bit i of attribute A_j, and P' is the complement of P. For a constant c = b_m ... b_{k+1} b_k ... b_0, the mask of the rows with A_j > c is

       $P_{A_j > c} = P_{j,m} \; o_m \; \cdots \; o_{k+1} \; P_{j,k}$

     where o_i = AND iff b_i = 1 (otherwise OR), k is the rightmost bit position of c with bit value "0", and the operations are right binding.

     Attributes: sLN = sepal LeNgth, sWD = sepal WiDth, pLN = petal LeNgth, pWD = petal WiDth. Classes: se = setosa, ve = versicolor, vi = virginica.

     Initial means (decimal, then binary with 7, 6, 7 and 5 bits):

       class  sLN sWD pLN pWD   sLN      sWD     pLN      pWD
       se     51  35  14   2    0110011  100011  0001110  00010
       ve     70  32  47  14    1000110  100000  0101111  01110
       vi     63  33  60  25    0111111  100001  0111100  11001

     Means sorted ascending per attribute, with the mean_gaps between consecutive means:

       sLN:  se 51 | gap 12 | vi 63 | gap  7 | ve 70
       sWD:  ve 32 | gap  1 | vi 33 | gap  2 | se 35
       pLN:  se 14 | gap 33 | ve 47 | gap 13 | vi 60
       pWD:  se  2 | gap 12 | ve 14 | gap 11 | vi 25

     Algorithm. Initially, let PREMAINING be pure1. From the TrainingSet:

     1. For each attribute, calculate the mean for each class and sort ascending on mean. Calculate all mean_gaps = differences_of_consecutive_means. Create MeanTable(attribute, class, mean, gapL, gapH, gapRELATIVE), sorted descending on gapRELATIVE = (gapL + gapH)/(2*mean). gapL is the gap on the low side of the mean, gapH the gap on the high side.
     2. Choose and remove the MT record with maximum gapRELATIVE. Use the formula above with cL = mean - gapL/2 and cH = mean + gapH/2 to produce PL = P_{A>cL} and PH = P'_{A>cH}. The class mask is PCLASS = PL & PH & PREMAINING, and we update PREMAINING = PREMAINING & P'CLASS.
     3. Repeat 2 until all classes have a pTree mask (or until PREMAINING is pure0, but that is a count operation).
     4. Repeat 1, 2, 3 until the means stop changing (much).

     The next two slides contain a (partial) walk through of this algorithm for a subset of the IRIS dataset. The clusters are color coded throughout with R, G, B for setosa, versicolor, virginica, and the features (sepal Length, sepal Width, petal Length, petal Width) are color coded as well.
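The greater-than formula on this slide can be tried out with a minimal Python sketch. It assumes plain lists of 0/1 stand in for pTrees (a real pTree is a compressed structure), and the names `bit_slices` and `greater_than` are hypothetical helpers, not part of the presentation. The petal-width values are those of the 30-sample IRIS subset used in the walkthrough.

```python
# Petal widths (pWD) of the 30 walkthrough samples: 10 se, 10 ve, 10 vi.
PWD = [2, 2, 2, 2, 4, 3, 2, 2, 1, 2,
       15, 15, 13, 15, 13, 16, 10, 13, 14, 10,
       19, 21, 18, 22, 21, 17, 18, 18, 25, 20]

def bit_slices(values, width):
    """Vertical bit slices: slices[i][r] is bit i of values[r]."""
    return [[(v >> i) & 1 for v in values] for i in range(width)]

def greater_than(slices, c):
    """Mask of rows whose value exceeds c, using only AND/OR of bit slices."""
    width = len(slices)
    n = len(slices[0])
    if c < 0:
        return [1] * n              # every value exceeds a negative c (pure1)
    if c >= (1 << width) - 1:
        return [0] * n              # nothing exceeds the all-ones constant (pure0)
    # k = rightmost bit position of c that holds a 0
    k = next(i for i in range(width) if not (c >> i) & 1)
    mask = slices[k]
    # Right binding: fold outward from bit k up to the top bit m.
    for i in range(k + 1, width):
        if (c >> i) & 1:            # b_i = 1 -> AND
            mask = [a & b for a, b in zip(slices[i], mask)]
        else:                       # b_i = 0 -> OR
            mask = [a | b for a, b in zip(slices[i], mask)]
    return mask

# c = 8 = 01000 yields P4 | (P3 & (P2 | (P1 | P0)))
above_8 = greater_than(bit_slices(PWD, 5), 8)
```

For c = 8 the mask is 0 for the ten setosa rows and 1 for the other twenty, so its complement isolates setosa.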

  2. pkmc-s walkthrough, first class (setosa)

     Recap: PREM = pure1. (1) Per attribute and class, calculate means and gaps; build MT(attr, class, mean, gapL, gapH, gapREL) sorted descending on gapREL = (gapL + gapH)/(2*mean). (2) Take the MT record with maximum gapREL; cL = mean - gapL/2, cH = mean + gapH/2; PCLASS = P_{A>cL} & P'_{A>cH} & PREM; PREM = PREM & P'CLASS. (3) Repeat 2 until all classes have a pTree. (4) Repeat 1, 2, 3 until convergence.

     Training subset (10 samples per class). Each value is also shown on the slide in binary, using 7, 6, 7 and 5 bits for sLN, sWD, pLN and pWD; for example 49 30 14 2 is 0110001 011110 0001110 00010.

        #  class  sLN sWD pLN pWD
        1  se     49  30  14   2
        2  se     47  32  13   2
        3  se     46  31  15   2
        4  se     54  36  14   2
        5  se     54  39  17   4
        6  se     46  34  14   3
        7  se     50  34  15   2
        8  se     44  29  14   2
        9  se     49  31  15   1
       10  se     54  37  15   2
        1  ve     64  32  45  15
        2  ve     69  31  49  15
        3  ve     55  23  40  13
        4  ve     65  28  46  15
        5  ve     57  28  45  13
        6  ve     63  33  47  16
        7  ve     49  24  33  10
        8  ve     66  29  46  13
        9  ve     52  27  39  14
       10  ve     50  20  35  10
        1  vi     58  27  51  19
        2  vi     71  30  59  21
        3  vi     63  29  56  18
        4  vi     65  30  58  22
        5  vi     76  30  66  21
        6  vi     49  25  45  17
        7  vi     73  29  63  18
        8  vi     67  25  58  18
        9  vi     72  36  61  25
       10  vi     65  32  51  20

     Step 1. Means and mean_gaps:

       sLN:  se 51 | gap 12 | vi 63 | gap  7 | ve 70
       sWD:  ve 32 | gap  1 | vi 33 | gap  2 | se 35
       pLN:  se 14 | gap 33 | ve 47 | gap 13 | vi 60
       pWD:  se  2 | gap 12 | ve 14 | gap 11 | vi 25

     MeanTable, sorted descending on gapREL. Where a class has no neighbor on one side, the missing gap is filled in with the gap on the other side.

       class attr  mean  gapL gapH  gapREL
       se    pWD     2    12   12   6      (12+12)/(2*2)
       se    pLN    14    33   33   2.36   (33+33)/(2*14)
       ve    pLN    47    33   13   0.94   as printed on the slide; (33+13)/(2*47) evaluates to 0.49, which would place this record after ve pWD
       ve    pWD    14    12   11   0.82   (12+11)/(2*14)
       vi    pWD    25    11   11   0.44   (11+11)/(2*25)
       se    sLN    51    12   12   0.24   (12+12)/(2*51)
       vi    pLN    60    13   13   0.22   (13+13)/(2*60)
       vi    sLN    63    12    7   0.15   (12+7)/(2*63)
       ve    sLN    70     7    7   0.10   (7+7)/(2*70)
       se    sWD    35     2    2   0.06   (2+2)/(2*35)
       vi    sWD    33     1    2   0.05   (1+2)/(2*33)
       ve    sWD    32     1    1   0.03   (1+1)/(2*32)

     Step 2. The MT record with maximum gapREL is se pWD 2 12 12 6, so we are separating out the setosa class first:

       cL = mean - gapL/2 = 2 - 12/2 = -4, so P_{A>cL} = pure1
       cH = mean + gapH/2 = 2 + 12/2 = 8 = 01000
       $P_{A>c_H} = P_{4,4} \mid (P_{4,3} \,\&\, (P_{4,2} \mid (P_{4,1} \mid P_{4,0})))$

       Psetosa = P_{A>cL} & P'_{A>cH} & PREM = pure1 & P'_{A>cH} & pure1 = P'_{A>cH}
       PREM = PREM & P'CLASS = pure1 & P'setosa = P'setosa

     Resulting masks over the 30 rows (se, ve, vi):

       Psetosa = 1111111111 0000000000 0000000000
       PREM    = 0000000000 1111111111 1111111111
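As a rough illustration of step 1, the sketch below builds the MeanTable from the initial means on the slide. The function names `mean_table` and `cut_points` are hypothetical, the rule for filling a missing gap with the gap on the other side follows the "fill in" convention used in the slide's table, and gapREL is taken as (gapL + gapH)/(2*mean), the form used in the slide's worked numbers.

```python
# Initial class means per attribute, as listed on the slide.
MEANS = {
    "se": {"sLN": 51, "sWD": 35, "pLN": 14, "pWD": 2},
    "ve": {"sLN": 70, "sWD": 32, "pLN": 47, "pWD": 14},
    "vi": {"sLN": 63, "sWD": 33, "pLN": 60, "pWD": 25},
}

def mean_table(means):
    """Rows (attr, class, mean, gapL, gapH, gapREL), sorted descending on gapREL."""
    rows = []
    attrs = next(iter(means.values())).keys()
    for attr in attrs:
        # Sort the classes ascending on their mean for this attribute.
        ordered = sorted((m[attr], cls) for cls, m in means.items())
        gaps = [hi[0] - lo[0] for lo, hi in zip(ordered, ordered[1:])]
        for pos, (mean, cls) in enumerate(ordered):
            # A class at either end has one gap only; reuse it for the missing side.
            gap_lo = gaps[pos - 1] if pos > 0 else gaps[pos]
            gap_hi = gaps[pos] if pos < len(gaps) else gaps[pos - 1]
            gap_rel = (gap_lo + gap_hi) / (2 * mean)
            rows.append((attr, cls, mean, gap_lo, gap_hi, gap_rel))
    return sorted(rows, key=lambda r: r[5], reverse=True)

def cut_points(row):
    """cL = mean - gapL/2 and cH = mean + gapH/2 for one MeanTable row."""
    _, _, mean, gap_lo, gap_hi, _ = row
    return mean - gap_lo / 2, mean + gap_hi / 2

table = mean_table(MEANS)
c_low, c_high = cut_points(table[0])   # cut points for the top-ranked record
```

With these means the top record is se pWD with gapREL = 6 and cut points -4 and 8, matching the slide. Evaluated this way, ve pLN comes out near 0.49 rather than the 0.94 printed in the slide's table, so it ranks just below ve pWD.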

  3. pkmc-s walkthrough, second class (versicolor)

     The training subset, the means and the sorted MeanTable are the same as on slide 2. PREM now excludes the ten setosa rows.

     Step 2 again. The next MT record in the sorted table is se pLN (2.36), but setosa already has its mask, so the record used is

       ve pLN 47 33 13 0.94

     The high-side cut point is cH = mean + gapH/2 = 47 + 13/2 = 53.5. The slide works with 52 = 0110100, for which the formula of slide 1 gives

       $P_{A>52} = P_{3,6} \mid (P_{3,5} \,\&\, (P_{3,4} \,\&\, (P_{3,3} \mid (P_{3,2} \,\&\, (P_{3,1} \mid P_{3,0})))))$

     The intermediate bit columns shown on the slide are omitted here. The slide reports the outcome of this step as: ONLY 2 MISTAKES.

  4. pTree-k-means-classification-divisive (pkmc-d)

     As before, $P_{A_j > c} = P_{j,m} \; o_m \; \cdots \; o_{k+1} \; P_{j,k}$, where o_i = AND iff b_i = 1 (otherwise OR), k is the rightmost bit position of c = b_m ... b_{k+1} ... b_0 with bit value "0", and the operations are right binding.

     Initial means (sLN = sepal LeNgth, sWD = sepal WiDth, pLN = petal LeNgth, pWD = petal WiDth):

       class  sLN sWD pLN pWD
       se     51  35  14   2
       ve     70  32  47  14
       vi     63  33  60  25

     The Current Cluster, CC = {Class1, ..., Classm} (all classes), is represented by the pTree mask PCC (pure1 initially). From the TrainingSet:

     1. For each attribute, calculate the mean for each class in CC and sort ascending on mean. Calculate all mean_gaps = difference_of_consecutive_means. Create MeanTable(attribute, class, mean, gap), sorted descending on gap.
     2. Choose and remove the MT record with maximum gap. Use P_{A>c} (c = mean + gap/2) to separate the current cluster into two clusters. The cluster masks are PNEWCLUSTER1 = P_{A>c} & PCC and PNEWCLUSTER2 = P'_{A>c} & PCC. The new clusters are then NEWCLUSTER1 = {all classes in CC above the maximum gap} and NEWCLUSTER2 = {all other classes in CC}, that is, the class whose mean had the maximum gap on its high side together with all classes below it.
     3. Repeat 2 with CC = NEWCLUSTERi (i = 1, 2) until all clusters are singleton sets of classes.
     4. Repeat 1, 2, 3 until the means stop changing (much).

     On the next two slides you will find a (partial) walk through of this algorithm for a subset of the IRIS dataset. The clusters are color coded throughout with R, G, B for setosa, versicolor, virginica. I also color code the features (sepal Length, sepal Width, petal Length, petal Width). Then I take 10 samples from each class for the example.
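The divisive steps 1 to 3 can be sketched in Python for a single epoch (step 4, re-estimating the means, is left out). This is only an illustration under stated assumptions: ordinary comparisons on row values replace the pTree mask operations, cut points are rounded up as in the walkthrough, and `best_gap` and `divide` are hypothetical names. The samples and initial means are the ones shown in the presentation.

```python
import math

ATTRS = ("sLN", "sWD", "pLN", "pWD")
SAMPLES = {
    "se": [(49, 30, 14, 2), (47, 32, 13, 2), (46, 31, 15, 2), (54, 36, 14, 2), (54, 39, 17, 4),
           (46, 34, 14, 3), (50, 34, 15, 2), (44, 29, 14, 2), (49, 31, 15, 1), (54, 37, 15, 2)],
    "ve": [(64, 32, 45, 15), (69, 31, 49, 15), (55, 23, 40, 13), (65, 28, 46, 15), (57, 28, 45, 13),
           (63, 33, 47, 16), (49, 24, 33, 10), (66, 29, 46, 13), (52, 27, 39, 14), (50, 20, 35, 10)],
    "vi": [(58, 27, 51, 19), (71, 30, 59, 21), (63, 29, 56, 18), (65, 30, 58, 22), (76, 30, 66, 21),
           (49, 25, 45, 17), (73, 29, 63, 18), (67, 25, 58, 18), (72, 36, 61, 25), (65, 32, 51, 20)],
}
INITIAL_MEANS = {"se": (51, 35, 14, 2), "ve": (70, 32, 47, 14), "vi": (63, 33, 60, 25)}

DATA = [row for cls in SAMPLES for row in SAMPLES[cls]]
TRUTH = [cls for cls in SAMPLES for _ in SAMPLES[cls]]

def best_gap(classes, means):
    """Largest gap between consecutive class means: (gap, attr index, cut, classes above)."""
    best = None
    for a in range(len(ATTRS)):
        ordered = sorted(classes, key=lambda cls: means[cls][a])
        for pos in range(len(ordered) - 1):
            lo, hi = means[ordered[pos]][a], means[ordered[pos + 1]][a]
            if best is None or hi - lo > best[0]:
                cut = math.ceil(lo + (hi - lo) / 2)   # c = mean + gap/2, rounded up
                best = (hi - lo, a, cut, set(ordered[pos + 1:]))
    return best

def divide(rows, classes, means, labels):
    """Split the row indices until every cluster corresponds to a single class."""
    if len(classes) == 1:
        (only,) = classes
        for r in rows:
            labels[r] = only
        return
    _, a, cut, above = best_gap(classes, means)
    high = [r for r in rows if DATA[r][a] > cut]     # plays the role of P_{A>c} & PCC
    low = [r for r in rows if DATA[r][a] <= cut]     # plays the role of P'_{A>c} & PCC
    divide(high, above, means, labels)
    divide(low, set(classes) - above, means, labels)

labels = {}
divide(list(range(len(DATA))), set(SAMPLES), INITIAL_MEANS, labels)
accuracy = sum(labels[r] == TRUTH[r] for r in labels) / len(DATA)
```

Under these assumptions the first split uses pLN with c = 31 and the second uses pLN with c = 54. The gaps are recomputed within each cluster here rather than removed from one global MeanTable, which gives the same choices on this data.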

  5. pkmc-d walkthrough, first split

     Recap: CC = all 3 classes, with mask PCC (pure1). (1) MT(attr, class, mean, gap) sorted descending on gap. (2) P_{A>c} with c = mean + gap/2 separates CC into two: PNEWCLUSTER1 = P_{A>c} & PCC and PNEWCLUSTER2 = P'_{A>c} & PCC. (3) Repeat 2 with CC = NEWCLUSTERi (i = 1, 2) until all are singletons. (4) Repeat 1, 2, 3 until the means stop changing.

     The training subset is the same 30 samples (10 se, 10 ve, 10 vi) used in the pkmc-s walkthrough on slide 2.

     Step 1. Means and gaps:

       sLN:  se 51 | gap 12 | vi 63 | gap  7 | ve 70
       sWD:  ve 32 | gap  1 | vi 33 | gap  2 | se 35
       pLN:  se 14 | gap 33 | ve 47 | gap 13 | vi 60
       pWD:  se  2 | gap 12 | ve 14 | gap 11 | vi 25

     MeanTable, sorted descending on gap. I'm not using relative gaps this time. Note also that there is only 1 entry for each gap, not 2 for each mean.

       class attr  mean  gap
       se    pLN    14    33
       ve    pLN    47    13
       se    pWD     2    12
       se    sLN    51    12
       ve    pWD    14    11
       vi    sLN    63     7
       vi    sWD    33     2
       ve    sWD    32     1

     Step 2. The MT record with maximum gap is se pLN 14 33. It separates {ve, vi} from {setosa} using petal Length:

       c = mean + gap/2 = 14 + 33/2 = 30.5, rounded up to 31 = 0011111
       $P_{A>31} = P_{3,6} \mid P_{3,5}$

       PNEW1 = P_{A>31} & PCC
       PNEW2 = P'_{A>31} & PCC

     PNEW2 is done (the cluster is the singleton set {setosa}). PNEW1 needs to be partitioned further (the cluster is {versicolor, virginica}).

  6. pkmc-d walkthrough, second split

     CC = {versicolor, virginica}; only the 20 ve and vi rows of the training subset remain. The MT records that remain, sorted descending on gap, are the same as on slide 5 with se pLN removed.

     Step 2. The MT record with maximum gap is ve pLN 47 13. It separates {ve} from {vi} using petal Length:

       c = mean + gap/2 = 47 + 13/2 = 53.5, rounded up to 54 = 0110110
       $P_{A>54} = P_{3,6} \mid (P_{3,5} \,\&\, (P_{3,4} \,\&\, (P_{3,3} \mid (P_{3,2} \,\&\, (P_{3,1} \,\&\, P_{3,0})))))$

     This is the virginica mask. Over the 20 rows (ve, then vi) it is

       P_{A>54} = 0000000000 0111101110

     There are no mistakes on versicolor and 3 mistakes on virginica (#'s 1, 6, 10). With one epoch, overall accuracy is 90%.

     Ways to improve accuracy (at a slight cost in speed) include:
     1. Use more than one attribute cutpoint each time.
     2. Use standard deviation calculations to optimize cutpoints.
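The figures quoted for this split can be checked with a few lines of Python. The sketch only re-evaluates the cut pLN > 54 on the petal lengths listed in the training subset; the variable names are illustrative, and it assumes the ten setosa rows were already separated correctly by the first split.

```python
import math

# Petal lengths (pLN) of the ten versicolor and ten virginica samples.
VE_PLN = [45, 49, 40, 46, 45, 47, 33, 46, 39, 35]
VI_PLN = [51, 59, 56, 58, 66, 45, 63, 58, 61, 51]

cut = math.ceil(47 + 13 / 2)          # c = mean + gap/2 = 53.5, rounded up

# Virginica mask over the 20 remaining rows: 1 where pLN > c.
vi_mask = [int(v > cut) for v in VE_PLN + VI_PLN]

# Sample numbers (1-based within each class) that land in the wrong cluster.
ve_errors = [i + 1 for i, v in enumerate(VE_PLN) if v > cut]
vi_errors = [i + 1 for i, v in enumerate(VI_PLN) if v <= cut]

# Setosa contributes 10 correct rows, assuming the first split made no mistakes.
accuracy = (30 - len(ve_errors) - len(vi_errors)) / 30
```

The three virginica samples with petal length 51, 45 and 51 fall at or below the cut, which accounts for the 3 mistakes and the 27/30 = 90% accuracy.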

  7. pkmc-d: trying a second attribute, ve pWD

     To improve accuracy (at a slight cost in speed), use more than one attribute cutpoint. Here petal Width is used as a 2nd attribute for separating ve and vi, on the same 20 ve and vi rows and with the same MT as on slide 6.

     MT record: ve pWD 14 11

       c = mean + gap/2 = 14 + 11/2 = 19.5, rounded up to 20 = 10100
       $P_{A>20} = P_{4,4} \,\&\, (P_{4,3} \mid (P_{4,2} \,\&\, (P_{4,1} \mid P_{4,0})))$

     The intermediate bit columns shown on the slide are omitted here.

     No improvement by including ve pWD. Next we'll try vi sLN.

  8. pkmc-d: trying a second attribute, vi sLN

     Same 20 ve and vi rows and the same MT as on slide 6. Sepal Length is now tried as the 2nd attribute for separating ve and vi.

     MT record: vi sLN 63 7

       c = mean + gap/2 = 63 + 7/2 = 66.5, rounded up to 67 = 1000011
       $P_{A>67} = P_{1,6} \,\&\, (P_{1,5} \mid (P_{1,4} \mid (P_{1,3} \mid P_{1,2})))$

     The intermediate bit columns shown on the slide are omitted here.

     No improvement by including vi sLN.

  9. pkmc-d: trying a second attribute, vi sWD

     Same 20 ve and vi rows and the same MT as on slide 6. Sepal Width is now tried as the 2nd attribute for separating ve and vi.

     MT record: vi sWD 33 2

       c = mean + gap/2 = 33 + 2/2 = 34 = 100010
       $P_{A>34} = P_{2,5} \,\&\, (P_{2,4} \mid (P_{2,3} \mid (P_{2,2} \mid (P_{2,1} \,\&\, P_{2,0}))))$

     The intermediate bit columns shown on the slide are omitted here.

     Improvement by including vi sWD: it captures sample 10 as virginica, while also missing on sample 10 in versicolor.

  10. The above methods are all pkmc methods involving the Lp distance in one dimension (the most relevant dimension, based on mean gaps or some other criterion). I say Lp because all of these distances are identical in one dimension: each is just the absolute value of the difference of the two values.

      To improve accuracy we could:
      - use std-based gap measurements and pick the maximum number of gap stds each time (using Mohammad's formula for variance), rather than the gap distance;
      - maximize the relative gap = gap/mean measure, or #gap_stds/mean;
      - use the L∞ distance on all relevant dimensions instead of just one dimension;
      - use the Lp distance on all relevant dimensions (L1 and L2, using Mohammad's formulas).

  11. Here I (Mohammad) am giving another algorithm to calculate the sum of squared values using p-trees. If we only need the sum of the squares of all the data, and do not need the p-trees of the individual squared values, then this algorithm is really easy (I believe).

      Suppose we have 4 values represented by 3 p-trees:

        A   a2 a1 a0
        5    1  0  1
        6    1  1  0
        3    0  1  1
        4    1  0  0

      We need to calculate the sum of the squares:

        A^2  s5 s4 s3 s2 s1 s0
        25    0  1  1  0  0  1
        36    1  0  0  1  0  0
         9    0  0  1  0  0  1
        16    0  1  0  0  0  0
        Sum = 86

      Writing a_{v,i} for bit i of the value v:

      $$5 \cdot 5 = (a_{5,2} 2^2 + a_{5,1} 2^1 + a_{5,0} 2^0)\,(a_{5,2} 2^2 + a_{5,1} 2^1 + a_{5,0} 2^0)$$

      $$= (a_{5,2} a_{5,2} 2^{2+2} + a_{5,2} a_{5,1} 2^{2+1} + a_{5,2} a_{5,0} 2^{2+0}) + (a_{5,1} a_{5,2} 2^{1+2} + a_{5,1} a_{5,1} 2^{1+1} + a_{5,1} a_{5,0} 2^{1+0}) + (a_{5,0} a_{5,2} 2^{0+2} + a_{5,0} a_{5,1} 2^{0+1} + a_{5,0} a_{5,0} 2^{0+0})$$

      $$= (a_{5,2} a_{5,2} 2^{2+2} + a_{5,1} a_{5,1} 2^{1+1} + a_{5,0} a_{5,0} 2^{0+0}) + 2\,(a_{5,2} a_{5,1} 2^{2+1} + a_{5,2} a_{5,0} 2^{2+0} + a_{5,1} a_{5,0} 2^{1+0})$$

      The same expansion holds for 6, 3 and 4. Adding the four expansions and grouping equal powers of two gives one column per bit pair, e.g., the first column:

      $$(a_{5,2} a_{5,2} + a_{6,2} a_{6,2} + a_{3,2} a_{3,2} + a_{4,2} a_{4,2}) \cdot 2^{2+2}$$

      so that, over all values v,

      $$\sum_v v^2 = \sum_i 2^{i+i} \sum_v a_{v,i} a_{v,i} \; + \; 2 \sum_{i>j} 2^{i+j} \sum_v a_{v,i} a_{v,j}$$

      All of these products are binary (1*1 = 1, 1*0 = 0*1 = 0*0 = 0) and are therefore accomplished by ANDing.
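The counting idea can be sketched in a few lines of Python. As an assumption for illustration, each p-tree is modelled as a plain list of bits, the count of 1s in an ANDed pair of slices is taken with `sum`, and `sum_of_squares` is a hypothetical name rather than Mohammad's own code.

```python
from itertools import combinations

def sum_of_squares(values, width):
    """Sum of v*v over all values, using only counts of ANDed bit slices."""
    # slices[i][r] is bit i of values[r] (the vertical bit slice a_i).
    slices = [[(v >> i) & 1 for v in values] for i in range(width)]
    total = 0
    # Diagonal terms: a_i AND a_i = a_i, weighted by 2^(i+i).
    for i in range(width):
        total += sum(slices[i]) << (2 * i)
    # Cross terms: each pair (i, j) occurs twice in the square, hence 2^(i+j+1).
    for i, j in combinations(range(width), 2):
        both = sum(a & b for a, b in zip(slices[i], slices[j]))
        total += both << (i + j + 1)
    return total

example = sum_of_squares([5, 6, 3, 4], 3)
```

For the four values on the slide, the slice counts are 3, 2, 2 for bits 2, 1, 0 and 1 for each of the three bit pairs, giving 48 + 8 + 2 + 16 + 8 + 4 = 86.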
