Evolving Fuzzy Rules with Genetic Programming and Clustering • Soft Computing for Control
G-REX (Previous work) • The transformation of a highly accurate opaque model into a comprehensible model. • Genetic programming • Black box • Arbitrary representation and fitness function • Balances accuracy and comprehensibility • Example rule tree (figure): IF Age > 25, IF Salary > 5000, with leaves Reject, Accept, Reject
Background • Evolving Fuzzy Decision Trees With Genetic Programming and Clustering • J. Eggermont (2001) • Automatic fuzzification using K-Means • Genetic Programming • Fuzzy Representation
Membership functions • Three types of membership function • Distances do not need to be equal • Based on medoids/centroids
K-means • Most frequently used clustering method • Fast, easy to implement, and deterministic for a given initialization. • J. B. MacQueen (1967) • K stands for the number of clusters • Each cluster is represented by one membership function • A cluster is represented by a centroid. • The mean value of the members • An instance belongs to the closest centroid
1 • Euclidean distance
2 • The new centroid is the mean of its members
3 • Recalculate members • Repeat until no change
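The three steps above can be sketched for a single numeric variable as follows. This is only a minimal illustration, not the implementation used in the presented work: the function name `kmeans_1d`, the `max_iter` cap, and the choice to leave an empty cluster's centroid unchanged are assumptions made for the example.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """K-means on one numeric variable, starting from the given centroids."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        # Step 1: each instance joins the closest centroid
        # (Euclidean distance in one dimension is |v - c|).
        new_assignment = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        # Step 3: stop once recalculating the members changes nothing.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 2: the new centroid is the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, a in zip(values, assignment) if a == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = sum(members) / len(members)
    return centroids, assignment


centroids, assignment = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
print(centroids, assignment)
```

The starting centroids are passed in explicitly, which makes the dependence of the result on the initialization easy to see.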
Kaufman's Initialization • K-Means is sensitive to the initialization method • Peña J. M., Lozano J. A. and Larrañaga P. (1999) • An Empirical Investigation of Four Initialization Methods for the K-Means Algorithm • Step 1. Choose the instance closest to the mean value • Step 2-3. Choose an instance far away from the other medoids with many instances close by.
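A rough one-variable sketch of this seeding idea is given below. It is written from the slide's description plus the commonly cited form of the Kaufman gain, where a candidate scores higher the more instances lie closer to it than to any existing medoid. The function name `kaufman_init` and the tie-breaking rule are assumptions for illustration, so the cited paper remains the reference for the exact procedure.

```python
def kaufman_init(values, k):
    """Pick k seed values from `values` (assumes k <= len(values))."""
    mean = sum(values) / len(values)
    # Step 1: the first medoid is the instance closest to the mean value.
    seeds = [min(range(len(values)), key=lambda i: abs(values[i] - mean))]
    while len(seeds) < k:
        best_i, best_gain = None, -1.0
        for i in range(len(values)):
            if i in seeds:
                continue
            gain = 0.0
            for j in range(len(values)):
                if j in seeds or j == i:
                    continue
                # Distance from instance j to its nearest existing medoid.
                d_nearest = min(abs(values[j] - values[s]) for s in seeds)
                # j contributes only if candidate i is closer than that medoid.
                gain += max(d_nearest - abs(values[j] - values[i]), 0.0)
            # Assumption: on ties the earliest candidate is kept.
            if gain > best_gain:
                best_i, best_gain = i, gain
        # Steps 2-3: add the candidate that is far from the existing
        # medoids and has many instances close by.
        seeds.append(best_i)
    return [values[i] for i in seeds]


print(kaufman_init([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], 2))
```

Because no random draw is involved, the same data always yields the same seeds.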
Membership functions • Three types of membership function • Distances do not need to be equal • Based on medoids
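The slide does not say which three types are used. One plausible reading, assumed here purely for illustration, is a left shoulder for the lowest cluster, triangles for the middle clusters, and a right shoulder for the highest one, each peaking at its own medoid and reaching zero at the neighbouring medoid. The function name `memberships` is hypothetical.

```python
def memberships(x, centroids):
    """Return one membership degree per cluster.

    `centroids` must be sorted ascending and distinct; the gaps between
    them do not need to be equal.
    """
    k = len(centroids)
    degrees = [0.0] * k
    if x <= centroids[0]:
        degrees[0] = 1.0          # left shoulder
    elif x >= centroids[-1]:
        degrees[-1] = 1.0         # right shoulder
    else:
        for j in range(k - 1):
            left, right = centroids[j], centroids[j + 1]
            if left <= x <= right:
                # Linear interpolation between two neighbouring medoids.
                w = (x - left) / (right - left)
                degrees[j] = 1.0 - w
                degrees[j + 1] = w
                break
    return degrees


print(memberships(4.0, [2.0, 6.0, 11.0]))
```

Under this assumed shape the degrees for any value sum to 1, and a value sitting exactly on a medoid belongs fully to that cluster.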
GP Representation • All variables with fewer than k unique values are treated as crisp sets.
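The crisp-set rule can be expressed as a simple check per variable. The helper name `is_crisp` is made up for this sketch, and missing values are assumed to have been handled beforehand.

```python
def is_crisp(column, k):
    """True if the variable has fewer than k unique values."""
    return len(set(column)) < k


print(is_crisp([0, 1, 0, 1], 3))          # binary variable, k = 3
print(is_crisp([1.0, 2.5, 3.1, 4.7], 3))  # continuous variable, k = 3
```

A variable that fails this check would be clustered into k fuzzy sets instead.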
Fitness function • Not precise enough • Reward is equal to the membership value for the correctly predicted instance • 1 - the MSE of each membership function
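The membership-based reward could be computed roughly as below. The triple layout `(predicted_class, true_class, membership)` and the name `fuzzy_reward` are assumptions for this sketch. The slide does not specify how the MSE term is combined with the reward, so that term is left out.

```python
def fuzzy_reward(predictions):
    """Sum the membership values of the correctly predicted instances.

    `predictions` is an iterable of (predicted_class, true_class, membership)
    triples, where membership is the degree to which the rule fired.
    """
    return sum(m for predicted, true, m in predictions if predicted == true)


example = [("accept", "accept", 0.75),
           ("accept", "reject", 0.5),
           ("reject", "reject", 0.25)]
print(fuzzy_reward(example))
```

Unlike a plain count of correct predictions, this reward distinguishes a rule that fires strongly on a correct instance from one that barely fires.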
Experiments • 5 classification datasets • Only continuous variables • IRIS, WINE • Categorical and continuous • COLIC, CLEVELAND, PIMA • 10-fold cross validation • Stratification • Fuzzy GP vs standard GP (IF rules) • Evaluated on • Accuracy (ACC) • Area under ROC-curve (AUC) • Brier Score (BRI)
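Two of the listed measures are easy to state in code. The sketch below uses the standard definitions of accuracy and of the multi-class Brier score (the mean squared difference between predicted class probabilities and the one-hot true class). The function names and argument layout are assumptions, and the experiments themselves may have used a different variant.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def brier_score(y_true, probabilities, classes):
    """Mean over instances of the squared error summed over classes.

    `probabilities[n][i]` is the predicted probability that instance n
    belongs to `classes[i]`. Lower is better; 0 is a perfect score.
    """
    total = 0.0
    for true, probs in zip(y_true, probabilities):
        total += sum(
            (probs[i] - (1.0 if c == true else 0.0)) ** 2
            for i, c in enumerate(classes)
        )
    return total / len(y_true)


print(accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"]))
print(brier_score(["a", "b"], [[1.0, 0.0], [0.5, 0.5]], ["a", "b"]))
```

Accuracy only looks at the predicted label, while the Brier score also penalizes a correct prediction made with low confidence.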
Discussion • Current membership function removes information from the variable • A way to handle outliers • Some extremely simple IF rules are better for some datasets. • Categorical variables • Should not be used as the only method • Easy-to-remember rules, but how accurate will they be as decision support? • Gives a comprehensible explanation that could add trust and thereby improve predictions.
Future work • Alternative membership function • Fuzzy regression?