CLARA
Data (15 objects, x y):

  10 4    2 9    6 9    5 4    9 9
   8 1    5 4    0 8    8 0    4 1
   6 2    8 2    9 6    7 3    2 2

Algorithm CLARA
1. For i = 1 to 5, repeat the following steps:

k = 2
mincost = 9999
bestset = {}
i = 1
Algorithm CLARA
2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

Sample data: 6 objects (the data set here has only 15 objects, fewer than 40 + 2k = 44, so the example draws a smaller sample)

  10 4    5 4    8 1    4 1    6 2    7 3

K                     cost
{ (10,4), (5,4) }     11.24
{ (10,4), (8,1) }     12.715
{ (10,4), (4,1) }     12.166
{ (10,4), (6,2) }     8.1224   <- min cost
{ (10,4), (7,3) }     9.4919
{ (8,1), (5,4) }      11.24
{ (4,1), (5,4) }      13.472
{ (6,2), (5,4) }      10.358
{ (7,3), (5,4) }      9.9748
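The cost table above can be reproduced with a short sketch. It assumes that cost is the sum of Euclidean distances from each sample object to its nearest medoid, and that the table lists the starting pair plus every single-medoid swap (one pass, whereas full PAM would repeat until no swap improves the cost). The function names `total_cost` and `one_swap_pass` are illustrative and do not come from the slides.

```python
from math import dist  # Euclidean distance, Python 3.8+

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def one_swap_pass(sample, medoids):
    """Compare the starting medoids with every single swap; return the cheapest set."""
    candidates = [tuple(medoids)]
    for i in range(len(medoids)):
        for p in sample:
            if p not in medoids:
                swapped = list(medoids)
                swapped[i] = p          # replace medoid i with non-medoid p
                candidates.append(tuple(swapped))
    return min(candidates, key=lambda c: total_cost(sample, c))

sample = [(10, 4), (5, 4), (8, 1), (4, 1), (6, 2), (7, 3)]
best = one_swap_pass(sample, [(10, 4), (5, 4)])
print(best, round(total_cost(sample, best), 4))  # ((10, 4), (6, 2)) 8.1224
```

The starting pair is passed in explicitly because the slides do not say how it is chosen.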
Algorithm CLARA
3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.
4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step (2) as the best set of medoids obtained so far.

(The cost below is the total of the distances in the cost column; since the data set size is fixed, comparing totals ranks medoid sets the same way as comparing averages.)

medoids = { (10,4), (6,2) }

Data     cost      k#
10 4     0         1
 2 9     8.0623    2
 6 9     6.4031    1
 5 4     2.2361    2
 9 9     5.099     1
 8 1     2.2361    2
 5 4     2.2361    2
 0 8     8.4853    2
 8 0     2.8284    2
 4 1     2.2361    2
 6 2     0         2
 8 2     2         2
 9 6     2.2361    1
 7 3     1.4142    2
 2 2     4         2

cost(medoids) = 49.473
cost(medoids) < mincost
mincost = cost(medoids) = 49.473
bestset = medoids = { (10,4), (6,2) }
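Steps 3 and 4 for this iteration can be sketched as follows, assuming Euclidean distance and clusters numbered from 1 in the order the medoids are listed. The helper name `assign` is hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

data = [(10, 4), (2, 9), (6, 9), (5, 4), (9, 9), (8, 1), (5, 4), (0, 8),
        (8, 0), (4, 1), (6, 2), (8, 2), (9, 6), (7, 3), (2, 2)]

def assign(points, medoids):
    """Return one (distance, cluster number) pair per point."""
    rows = []
    for p in points:
        # Nearest medoid wins; a tie goes to the medoid listed first.
        d, k = min((dist(p, m), i + 1) for i, m in enumerate(medoids))
        rows.append((d, k))
    return rows

rows = assign(data, [(10, 4), (6, 2)])
total = sum(d for d, _ in rows)
print(round(total, 3))  # 49.473
```

Each entry of `rows` corresponds to one line of the Data / cost / k# table.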
Algorithm CLARA
5. Return to Step (1) to start the next iteration.

mincost = 49.473
bestset = { (10,4), (6,2) }

Step 1. i = i + 1, i = 2

2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

Step 2. Sample data: 6 objects

  2 9    5 4    8 1    0 8    6 2    9 6

K                    cost
{ (2,9), (9,6) }     16.807
{ (2,9), (5,4) }     13.187
{ (2,9), (8,1) }     13.814
{ (2,9), (0,8) }     31.509
{ (2,9), (6,2) }     11.708   <- min cost
{ (5,4), (9,6) }     18.713
{ (8,1), (9,6) }     23.314
{ (0,8), (9,6) }     16.807
{ (6,2), (9,6) }     20.573
Algorithm CLARA
3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.
4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step (2) as the best set of medoids obtained so far.

medoids = { (2,9), (6,2) }

Data     cost      k#
10 4     4.4721    2
 2 9     0         1
 6 9     4         1
 5 4     2.2361    2
 9 9     7         1
 8 1     2.2361    2
 5 4     2.2361    2
 0 8     2.2361    1
 8 0     2.8284    2
 4 1     2.2361    2
 6 2     0         2
 8 2     2         2
 9 6     5         2
 7 3     1.4142    2
 2 2     4         2

cost(medoids) = 41.895
cost(medoids) < mincost
mincost = cost(medoids) = 41.895
bestset = medoids = { (2,9), (6,2) }
Algorithm CLARA
5. Return to Step (1) to start the next iteration.

mincost = 41.895
bestset = { (2,9), (6,2) }

Step 1. i = i + 1, i = 3

2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

Step 2. Sample data: 6 objects

  10 4    2 9    9 9    5 4    6 2    7 3

K                     cost
{ (5,4), (7,3) }      16.732
{ (5,4), (10,4) }     15.402
{ (5,4), (2,9) }      15.875
{ (5,4), (9,9) }      15.303
{ (5,4), (6,2) }      18.12
{ (10,4), (7,3) }     16.56
{ (2,9), (7,3) }      13.137   <- min cost
{ (9,9), (7,3) }      13.813
{ (6,2), (7,3) }      19.533
Algorithm CLARA
3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.
4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step (2) as the best set of medoids obtained so far.

medoids = { (2,9), (7,3) }

Data     cost      k#
10 4     3.1623    2
 2 9     0         1
 6 9     4         1
 5 4     2.2361    2
 9 9     6.3246    2
 8 1     2.2361    2
 5 4     2.2361    2
 0 8     2.2361    1
 8 0     3.1623    2
 4 1     3.6056    2
 6 2     1.4142    2
 8 2     1.4142    2
 9 6     3.6056    2
 7 3     0         2
 2 2     5.099     2

cost(medoids) = 40.732
cost(medoids) < mincost
mincost = cost(medoids) = 40.732
bestset = medoids = { (2,9), (7,3) }
Algorithm CLARA
5. Return to Step (1) to start the next iteration.

mincost = 40.732
bestset = { (2,9), (7,3) }

Step 1. i = i + 1, i = 4

2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

Step 2. Sample data: 6 objects

  10 4    5 4    9 9    6 2    8 2    7 3

K                     cost
{ (9,9), (8,2) }      9.8482
{ (9,9), (10,4) }     15.463
{ (9,9), (5,4) }      13.078
{ (9,9), (6,2) }      10.122
{ (9,9), (7,3) }      8.2268   <- min cost
{ (10,4), (8,2) }     12.119
{ (5,4), (8,2) }      12.646
{ (6,2), (8,2) }      13.55
{ (7,3), (8,2) }      12.803
Algorithm CLARA
3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.
4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step (2) as the best set of medoids obtained so far.

medoids = { (9,9), (7,3) }

Data     cost      k#
10 4     3.1623    2
 2 9     7         1
 6 9     3         1
 5 4     2.2361    2
 9 9     0         1
 8 1     2.2361    2
 5 4     2.2361    2
 0 8     8.6023    2
 8 0     3.1623    2
 4 1     3.6056    2
 6 2     1.4142    2
 8 2     1.4142    2
 9 6     3         1
 7 3     0         2
 2 2     5.099     2

cost(medoids) = 46.168 (the sum of the cost column above)
cost(medoids) < mincost ? false
mincost and bestset are unchanged.
Algorithm CLARA
5. Return to Step (1) to start the next iteration.

mincost = 40.732
bestset = { (2,9), (7,3) }

Step 1. i = i + 1, i = 5

2. Draw a sample of 40 + 2k objects randomly from the entire data set, and call Algorithm PAM to find k medoids of the sample.

Step 2. Sample data: 6 objects

  2 9    6 9    9 9    6 2    8 2    9 6

K                    cost
{ (6,9), (6,2) }     13.243
{ (6,9), (2,9) }     21.523
{ (6,9), (9,9) }     21.071
{ (6,9), (8,2) }     13.123   <- min cost
{ (6,9), (9,6) }     16.123
{ (2,9), (6,2) }     18
{ (9,9), (6,2) }     15
{ (8,2), (6,2) }     26.256
{ (9,6), (6,2) }     16.858
Algorithm CLARA
3. For each object Oj in the entire data set, determine which of the k medoids is the most similar to Oj.
4. Calculate the average dissimilarity of the clustering obtained in the previous step. If this value is less than the current minimum, use this value as the current minimum, and retain the k medoids found in Step (2) as the best set of medoids obtained so far.

medoids = { (6,9), (8,2) }

Data     cost      k#
10 4     2.8284    2
 2 9     4         1
 6 9     0         1
 5 4     3.6056    2
 9 9     3         1
 8 1     1         2
 5 4     3.6056    2
 0 8     6.0828    1
 8 0     2         2
 4 1     4.1231    2
 6 2     2         2
 8 2     0         2
 9 6     4.1231    2
 7 3     1.4142    2
 2 2     6         2

cost(medoids) = 43.783
cost(medoids) < mincost ? false
mincost and bestset are unchanged.
Algorithm CLARA
5. Return to Step (1) to start the next iteration.

mincost = 40.732
bestset = { (2,9), (7,3) }

Stop.
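The whole walk-through can be replayed with a minimal sketch. To keep the run reproducible it takes the five samples and starting medoid pairs exactly as they appear in the slides instead of drawing them at random, and it uses a single swap pass in place of full PAM. All function names and the `draws` structure are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def one_swap_pass(sample, medoids):
    """Compare the starting medoids with every single swap; return the cheapest set."""
    candidates = [tuple(medoids)]
    for i in range(len(medoids)):
        for p in sample:
            if p not in medoids:
                swapped = list(medoids)
                swapped[i] = p
                candidates.append(tuple(swapped))
    return min(candidates, key=lambda c: total_cost(sample, c))

def clara(data, draws):
    """draws holds one (sample, starting medoids) pair per iteration."""
    mincost, bestset = float("inf"), None
    for sample, start in draws:
        medoids = one_swap_pass(sample, start)   # Step 2, on the sample only
        cost = total_cost(data, medoids)         # Steps 3-4, on the entire data set
        if cost < mincost:                       # keep only a strict improvement
            mincost, bestset = cost, medoids
    return bestset, mincost

data = [(10, 4), (2, 9), (6, 9), (5, 4), (9, 9), (8, 1), (5, 4), (0, 8),
        (8, 0), (4, 1), (6, 2), (8, 2), (9, 6), (7, 3), (2, 2)]

# Samples and starting pairs copied from the five iterations in the slides.
draws = [
    ([(10, 4), (5, 4), (8, 1), (4, 1), (6, 2), (7, 3)], [(10, 4), (5, 4)]),
    ([(2, 9), (5, 4), (8, 1), (0, 8), (6, 2), (9, 6)], [(2, 9), (9, 6)]),
    ([(10, 4), (2, 9), (9, 9), (5, 4), (6, 2), (7, 3)], [(5, 4), (7, 3)]),
    ([(10, 4), (5, 4), (9, 9), (6, 2), (8, 2), (7, 3)], [(9, 9), (8, 2)]),
    ([(2, 9), (6, 9), (9, 9), (6, 2), (8, 2), (9, 6)], [(6, 9), (6, 2)]),
]

bestset, mincost = clara(data, draws)
print(bestset, round(mincost, 3))  # ((2, 9), (7, 3)) 40.732
```

In practice each sample would be drawn with something like `random.sample(data, n)`. Because the medoids are searched for only within the sample, the quality of the result depends on how representative the samples are.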