Data Representation Lars Asker
Predictive modeling
Earlier experiences have to be grouped as cases in the same format: the modeling techniques typically require that each case be transformed into a row in a table.
Predictive modeling
Name                Solubility  No. C atoms  Fraction of rotatable bonds  Topol. diam.  Geom. diam.  LogP  No. heavy bonds  …
methylpentane       good        6            0.40                         4             3.46         2.44  5                …
methylcyclohexene   good        7            0                            4             3.00         2.51  7                …
nonene              med.        9            0.75                         8             6.93         3.53  8                …
hexadiene           good        6            0.60                         5             4.36         2.14  5                …
butadiene           good        4            0.33                         3             2.65         1.36  3                …
naphthalene         good        10           0                            5             3.61         2.84  11               …
acenaphthylene      good        12           0                            5             3.58         3.32  14               …
pyrene              poor        16           0                            7             5.00         4.58  19               …
dimethylanthracene  poor        16           0                            7             5.29         4.61  18               …
hexahydropyrene     med.        16           0                            7             5.00         3.82  19               …
triphenylene        poor        18           0                            7             5.00         5.15  21               …
benzo(e)pyrene      poor        20           0                            7             5.29         5.64  24               …
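A minimal sketch of the "one case per row" idea, using three rows copied from the table above. The class name `Compound`, its field names, and the helper `to_features_and_label` are hypothetical, and treating solubility as the value to predict is an assumption; the slides do not name the target explicitly.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class Compound:
    """One case = one table row; every case has the same fields (hypothetical names)."""
    name: str
    solubility: str            # assumed to be the class label to predict
    n_carbon_atoms: int
    frac_rotatable_bonds: float
    topological_diameter: int
    geometric_diameter: float
    log_p: float
    n_heavy_bonds: int


# Three rows taken from the table on the slide.
cases = [
    Compound("methylpentane", "good", 6, 0.40, 4, 3.46, 2.44, 5),
    Compound("nonene", "med.", 9, 0.75, 8, 6.93, 3.53, 8),
    Compound("pyrene", "poor", 16, 0.0, 7, 5.00, 4.58, 19),
]


def to_features_and_label(case):
    """Split a case into its numeric descriptors and its label.

    The name only identifies the row, so it is dropped from the features.
    """
    _name, label, *features = astuple(case)
    return [float(v) for v in features], label


# X holds one feature vector per case, y the matching labels.
X, y = zip(*(to_features_and_label(c) for c in cases))
```

Because every case has the same fields in the same order, `X` can be handed to any technique that expects a rectangular table.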
Concept learning from examples ... and counterexamples?
Cards made of plastic ... or paper
Representation
Value: 1-13 (odd, even, prime, ...)
Colour: Hearts, Clubs, Diamonds, Spades (red, black)
Height, width, weight, material, age, price, designer, owner, ...
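As an illustration of how the choice of representation shapes what can be learned, the sketch below stores a card by its value and its suit (the Hearts, Clubs, Diamonds, Spades listed above) and derives the properties the slide mentions: odd, even, prime, red, black. The names `Card`, `is_prime`, and `derived_features` are hypothetical.

```python
from dataclasses import dataclass

RED_SUITS = {"Hearts", "Diamonds"}
BLACK_SUITS = {"Clubs", "Spades"}


@dataclass(frozen=True)
class Card:
    value: int  # 1-13
    suit: str   # "Hearts", "Clubs", "Diamonds" or "Spades"


def is_prime(n):
    """Trial division; sufficient for card values 1-13."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def derived_features(card):
    """Compute the higher-level attributes named on the slide from the raw ones."""
    return {
        "odd": card.value % 2 == 1,
        "even": card.value % 2 == 0,
        "prime": is_prime(card.value),
        "red": card.suit in RED_SUITS,
        "black": card.suit in BLACK_SUITS,
    }
```

A concept such as "red cards with a prime value" is easy to express in terms of the derived features, but awkward to express in terms of raw value and suit alone.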
51, 35, 14, 02, 49, 30, 14, 02, 47, 32, 13, 02, 46, 31, 15, 02, 50, 36, 14, 02, 54, 39, 17, 04, 46, 34, 14, 03, 50, 34, 15, 02, 44, 29, 14, 02, 49, 31, 15, 01, 54, 37, 15, 02, 48, 34, 16, 02, 48, 30, 14, 01, 43, 30, 11, 01, 58, 40, 12, 02, 57, 44, 15, 04, 54, 39, 13, 04, 51, 35, 14, 03, 57, 38, 17, 03, 51, 38, 15, 03, 54, 34, 17, 02, 51, 37, 15, 04, 46, 36, 10, 02, 51, 33, 17, 05, 48, 34, 19, 02, 50, 30, 16, 02, 50, 34, 16, 04, 52, 35, 15, 02, 52, 34, 14, 02, 47, 32, 16, 02, 48, 31, 16, 02, 54, 34, 15, 04, 52, 41, 15, 01, 55, 42, 14, 02, 49, 31, 15, 01, 50, 32, 12, 02, 55, 35, 13, 02, 49, 31, 15, 01, 44, 30, 13, 02, 51, 34, 15, 02, 50, 35, 13, 03, 45, 23, 13, 03, 44, 32, 13, 02, 50, 35, 16, 06, 51, 38, 19, 04, 48, 30, 14, 03, 51, 38, 16, 02, 46, 32, 14, 02, 53, 37, 15, 02, 50, 33, 14, 02, 70, 32, 47, 14, 64, 32, 45, 15, 69, 31, 49, 15, 55, 23, 40, 13, 65, 28, 46, 15, 57, 28, 45, 13, 63, 33, 47, 16, 49, 24, 33, 10, 66, 29, 46, 13, 52, 27, 39, 14, 50, 20, 35, 10, 59, 30, 42, 15, 60, 22, 40, 10, 61, 29, 47, 14, 56, 29, 36, 13, 67, 31, 44, 14, 56, 30, 45, 15, 58, 27, 41, 10, 62, 22, 45, 15, 56, 25, 39, 11, 59, 32, 48, 18, 61, 28, 40, 13, 63, 25, 49, 15, 61, 28, 47, 12, 64, 29, 43, 13, 66, 30, 44, 14, 68, 28, 48, 14, 67, 30, 50, 17, 60, 29, 45, 15, 57, 26, 35, 10, 55, 24, 38, 11, 55, 24, 37, 10, 58, 27, 39, 12, 60, 27, 51, 16, 54, 30, 45, 15, 60, 34, 45, 16, 67, 31, 47, 15, 63, 23, 44, 13, 56, 30, 41, 13, 55, 25, 40, 13, 55, 26, 44, 12, 61, 30, 46, 14, 58, 26, 40, 12, 50, 23, 33, 10, 56, 27, 42, 13, 57, 30, 42, 12, 57, 29, 42, 13, 62, 29, 43, 13, 51, 25, 30, 11, 57, 28, 41, 13, 63, 33, 60, 25, 58, 27, 51, 19, 71, 30, 59, 21, 63, 29, 56, 18, 65, 30, 58, 22, 76, 30, 66, 21, 49, 25, 45, 17, 73, 29, 63, 18, 67, 25, 58, 18, 72, 36, 61, 25, 65, 32, 51, 20, 64, 27, 53, 19, 68, 30, 55, 21, 57, 25, 50, 20, 58, 28, 51, 24, 64, 32, 53, 23, 65, 30, 55, 18, 77, 38, 67, 22, 77, 26, 69, 23, 60, 22, 50, 15, 69, 32, 57, 23, 56, 28, 49, 20, 77, 28, 67, 20, 63, 27, 49, 18, 67, 33, 57, 21, 
72, 32, 60, 18, 62, 28, 48, 18, 61, 30, 49, 18, 64, 28, 56, 21, 72, 30, 58, 16, 74, 28, 61, 19, 79, 38, 64, 20, 64, 28, 56, 22, 63, 28, 51, 15, 61, 26, 56, 14, 77, 30, 61, 23, 63, 34, 56, 24, 64, 31, 55, 18, 60, 30, 48, 18, 69, 31, 54, 21, 67, 31, 56, 24, 69, 31, 51, 23, 58, 27, 51, 19, 68, 32, 59, 23, 67, 33, 57, 25, 67, 30, 52, 23, 63, 25, 50, 19, 65, 30, 52, 20, 62, 34, 54, 23, 59, 30, 51, 18
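The stream of numbers above is hard to read as it stands. A minimal sketch of turning such a flat sequence into a table of cases follows; it assumes that every four consecutive values describe one case, which the slides do not state. The function name `to_rows` is hypothetical, and the input is the first twelve values of the listing.

```python
def to_rows(text, width=4):
    """Parse a comma-separated run of integers into rows of `width` values.

    Assumption: each case consists of `width` consecutive values.
    Empty tokens (for example after a trailing comma) are ignored.
    """
    values = [int(tok) for tok in text.split(",") if tok.strip()]
    if len(values) % width != 0:
        raise ValueError("number of values is not a multiple of the row width")
    return [values[i:i + width] for i in range(0, len(values), width)]


# First twelve values of the listing, including a trailing comma as in the source.
raw = "51, 35, 14, 02, 49, 30, 14, 02, 47, 32, 13, 02, "
rows = to_rows(raw)
for row in rows:
    print(row)
```

Once the values are grouped this way, each row is a case in the same format as every other row.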
Figure 1: Principal component analysis of a two-dimensional data cloud. The line shown is the direction of the first principal component, which gives an optimal (in the mean-square sense) linear reduction from 2 dimensions to 1. (Labels in the figure: Height, Weight.)
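The reduction described in the caption can be sketched for the two-dimensional case without any libraries. For a 2x2 covariance matrix the direction of the first principal component has a closed form: its angle is half of atan2(2·s_xy, s_xx − s_yy). The function names and the height/weight numbers below are hypothetical and serve only as illustration.

```python
import math


def first_principal_component(points):
    """Return (mean, unit direction) of the first principal component of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Angle of the eigenvector that belongs to the largest eigenvalue.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))


def project(points, mean, direction):
    """Reduce each 2-D point to one coordinate along the principal direction."""
    return [
        (x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
        for x, y in points
    ]


# Made-up (height in cm, weight in kg) pairs, for illustration only.
people = [(160.0, 55.0), (165.0, 60.0), (170.0, 66.0), (175.0, 70.0), (180.0, 77.0)]
mean, direction = first_principal_component(people)
scores = project(people, mean, direction)  # one number per person instead of two
```

Each entry of `scores` is the position of one point along the line in the figure; among all lines, this one loses the least squared distance when the points are replaced by their projections.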