Constructing Fuzzy Signature Based on Medical Data Student: Bai Qifeng Client: Prof. Tom Gedeon
Proposal Explore an approach to automatically constructing fuzzy signatures from a medical database. It addresses three questions: • How to identify the group of suspected SARS patients? • How to explore the relationships among symptoms? • How to construct a fuzzy signature based on the above analysis?
Fuzzy Logic Theory • Fuzzy logic uses linguistic rules that reflect the uncertainty or vagueness of concepts in natural language. • If 50m/h is the boundary between “slow” and “fast”, conventional bivalent sets regard 50.1m/h as fast. What if the current speed is 49.9m/h? In the real world, the shift should be smooth.
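Fuzzy Set • A fuzzy set is a set whose elements have degrees of membership. • Now, assume there are three temperatures: 37.8, 38.4 and 39.8. We can get the fuzzy sets Slight, Moderate, Severe and Extreme. [Figure: membership functions of the four fuzzy sets; vertical axis: degree of membership from 0 to 1; horizontal axis: temperature with ticks at 37.3, 37.9, 38.6, 39.1 and 40]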
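The idea of a degree of membership can be sketched with a triangular membership function. This is only an illustration: the triangular shape, the breakpoints in `FEVER_SETS` and the helper names are assumptions, not values read from the slide.

```python
def triangular(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical breakpoints in degrees Celsius (assumed, not from the slide).
# A real model would likely use a shoulder for "extreme" that stays at 1.
FEVER_SETS = {
    "slight": (37.0, 37.5, 38.0),
    "moderate": (37.5, 38.0, 38.5),
    "severe": (38.0, 38.5, 39.0),
    "extreme": (38.5, 39.0, 39.5),
}


def fuzzify(temperature):
    """Return the membership degree of a temperature in every fever set."""
    return {name: triangular(temperature, *abc) for name, abc in FEVER_SETS.items()}


if __name__ == "__main__":
    # A temperature between two peaks belongs partly to both neighbouring sets.
    print(fuzzify(37.75))
```

With overlapping sets like these, one temperature usually has non-zero membership in two neighbouring terms, which is what gives the smooth shift between them.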
Why use fuzzy sets • Assume: IF Fever = Slight THEN dose = Low. IF Fever = Moderate THEN dose = Ave. • If the fuzzy value of fever is slight = 0.29 and moderate = 0.71, the value of dose will share properties of both the Low and Ave ranges. [Figure: input (IN) and output (OUT) membership functions]
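One simple way to see how the dose "shares properties" of both ranges is a weighted average of representative dose values, weighted by how strongly each rule fires. This is a sketch under assumptions: the dose centres in `DOSE_CENTRES` are made up for illustration, and the weighted average is a simplification, not necessarily the inference method used in the talk.

```python
# Rules from the slide: fever term -> dose term.
RULES = {"slight": "low", "moderate": "ave"}

# Hypothetical representative dose values (assumed, units arbitrary).
DOSE_CENTRES = {"low": 10.0, "ave": 20.0}


def blended_dose(fever_memberships):
    """Weighted average of dose centres, weighted by rule firing strength."""
    total_weight = 0.0
    weighted_sum = 0.0
    for fever_term, strength in fever_memberships.items():
        dose_term = RULES[fever_term]
        weighted_sum += strength * DOSE_CENTRES[dose_term]
        total_weight += strength
    if total_weight == 0.0:
        raise ValueError("no rule fired for this input")
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Membership values taken from the slide.
    print(blended_dose({"slight": 0.29, "moderate": 0.71}))
```

The result lies between the two dose centres and closer to the "ave" centre, because the moderate rule fires more strongly.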
Problem Definition • A major issue in fuzzy applications is how to create the fuzzy rules. • The number of rules increases exponentially with the number of inputs: with T terms per input and K inputs, a full rule base needs T^K rules. • At least one rule must be activated for every input. e.g. 5 terms, 2 inputs => 25 rules; 5 terms, 5 inputs => 3,125 rules
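The two rule counts on this slide follow from raising the number of terms to the power of the number of inputs. A minimal check, where the function name is only illustrative:

```python
def full_rule_count(terms, inputs):
    """Number of rules in a dense rule base: one rule per combination of terms."""
    return terms ** inputs


if __name__ == "__main__":
    for inputs in (2, 5):
        print(inputs, "inputs:", full_rule_count(5, inputs), "rules")
```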
Sketch of Solution • Three possible solutions, where T is the number of terms per input and K is the number of inputs: • Decrease T: Sparse Fuzzy System • Decrease K: Hierarchical Fuzzy System • Decrease both simultaneously: Sparse Hierarchical Fuzzy Rule Bases
Hierarchical Fuzzy Systems • Hierarchical fuzzy systems reduce the dimension k of the sub-rule bases by using meta-levels
Fuzzy Signatures • Fuzzy signatures structure data into vectors of fuzzy values, each of which can be a further vector. • Each signature corresponds to a nested vector structure or, equivalently, to a tree graph.
Fuzzy Signatures • The relationship between higher and lower levels is governed by fuzzy aggregations. • Fuzzy aggregations include union, average, intersection etc. Examples: • Union: A ∪ B = max [A, B] = A or B • Intersection: A ∩ B = min [A, B] = A and B
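The nested-vector view and the aggregations can be sketched together: a signature is a tree whose leaves are membership values, and each inner node names the aggregation that collapses its children. The tuple encoding and the names `AGGREGATIONS` and `aggregate` are assumptions made for this sketch.

```python
# Aggregations named on the slide: union (max), intersection (min), average.
AGGREGATIONS = {
    "max": max,
    "min": min,
    "mean": lambda values: sum(values) / len(values),
}


def aggregate(node):
    """Collapse a fuzzy signature tree into a single membership value.

    A leaf is a number in [0, 1]; an inner node is a pair
    (aggregation name, list of child nodes).
    """
    if isinstance(node, (int, float)):
        return float(node)
    name, children = node
    values = [aggregate(child) for child in children]
    return AGGREGATIONS[name](values)


if __name__ == "__main__":
    # A two-level signature: the middle component is itself a vector.
    signature = ("max", [0.3, ("min", [0.8, 0.6]), 0.5])
    print(aggregate(signature))
```

Each sub-vector is reduced first, so the structure of the tree decides which values are combined with which aggregation.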
Clustering • The aim of cluster analysis is to classify objects based on the similarities among them. • A cluster is a group of objects that are more similar to one another than to members of other clusters. • Clustering is unsupervised classification: there are no predefined classes
Clustering: Similarity • How to evaluate the similarity of data? • Cluster analysis adopts the distance between two points as the criterion of similarity. • Distance-type measures include the Euclidean distance and the city block distance.
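The two distance measures named above can be written in a few lines. The function names are illustrative.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """City block (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    p, q = (0.0, 0.0), (3.0, 4.0)
    print(euclidean(p, q), city_block(p, q))
```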
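Clustering: Fuzzy C-Means • Bezdek defines the objective function as $J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, \| x_j - v_i \|^2$. • $\| x_j - v_i \|^2$ represents the deviation of the data point $x_j$ from the cluster centre $v_i$. • The number $m$ governs the influence of the membership grades. • $u_{ij}$ represents the degree of membership of the data point $x_j$ in the cluster with centre $v_i$.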
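A minimal fuzzy c-means loop, assuming squared Euclidean distance, random initial memberships and a fixed number of iterations instead of a convergence test. It alternates the standard centre update and membership update; the function names and defaults are choices made for this sketch, not details from the talk.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Return (centres, u) where u[i][j] is the membership of point j in cluster i."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each point's degrees sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for j in range(n):
        total = sum(u[i][j] for i in range(c))
        for i in range(c):
            u[i][j] /= total
    centres = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ij^m.
        centres = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            centres.append(tuple(
                sum(w[j] * points[j][d] for j in range(n)) / total
                for d in range(dim)
            ))
        # Membership update: closer centres receive higher membership.
        for j in range(n):
            d2 = [max(sq_dist(points[j], centres[i]), 1e-12) for i in range(c)]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            for i in range(c):
                u[i][j] = inv[i] / total
    return centres, u


def objective(points, centres, u, m=2.0):
    """Value of the fuzzy c-means objective for a given partition."""
    return sum(
        u[i][j] ** m * sq_dist(points[j], centres[i])
        for i in range(len(centres))
        for j in range(len(points))
    )


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centres, u = fuzzy_c_means(data, c=2)
    print(centres, objective(data, centres, u))
```

A larger `m` makes the memberships softer; as `m` approaches 1 the partition approaches a crisp one.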
Clustering: Cluster Validity Index • Xie and Beni index • The numerator measures the compactness of the data within each cluster, and the denominator measures the separation between different clusters. • A smaller numerator indicates that the clusters are more compact, and a larger denominator indicates that the clusters are well separated, so a smaller index value indicates a better partition.
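The index can be sketched as follows, assuming the commonly cited form: the membership-weighted sum of squared distances to the centres, divided by the number of points times the smallest squared distance between two centres. The exact variant used in the talk is not shown in the source, so treat the exponent `m` and the function name as assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def xie_beni(points, centres, u, m=2.0):
    """Xie-Beni validity index; u[i][j] is the membership of point j in cluster i."""
    c, n = len(centres), len(points)
    # Numerator: compactness of the data around the cluster centres.
    compactness = sum(
        u[i][j] ** m * sq_dist(points[j], centres[i])
        for i in range(c)
        for j in range(n)
    )
    # Denominator: separation, taken from the closest pair of centres.
    separation = min(
        sq_dist(centres[i], centres[k])
        for i in range(c)
        for k in range(i + 1, c)
    )
    return compactness / (n * separation)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
    ctrs = [(1.0, 0.0), (11.0, 0.0)]
    crisp = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    print(xie_beni(pts, ctrs, crisp))
```

In practice the index is computed for several candidate numbers of clusters and the partition with the smallest value is preferred.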
Factor Analysis • Factor analysis is performed by examining the pattern of correlations between the observed measures. • The model is $X = \Lambda F + e$, where $X$ is a vector of $p$ observed variables, $F$ is a vector of $r < p$ latent variables called factors, $\Lambda$ is a $(p \times r)$ matrix of coefficients (loadings), and $e$ is a vector of random errors.
Factor Analysis: Principal component analysis • Principal component analysis aims to reduce the number of variables; the new variables (the principal components) can explain most of the variation in the cases.
Factor Analysis: Principal component analysis • $z = U^{T} x$, where $x$ is the $p$-dimensional vector of variables and $U$ is an orthogonal matrix. • The eigenvalues $\lambda_i$ and the vector $z = (z_1, \dots, z_p)$ correspond to the variances and the vector of the principal components respectively. • The value $\lambda_i / \sum_{j=1}^{p} \lambda_j$ represents the contribution ratio, which indicates what percentage of the total variation of the variables the $i$-th principal component represents. • Usually, an accumulative contribution ratio of 70 - 80 percent can effectively represent the major variations in the original data.
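The accumulative contribution ratio can be computed directly from the variances (eigenvalues) of the principal components. The sketch below assumes the eigenvalues are already available from a PCA routine; the eigenvalues in the demo and the 0.7 threshold are illustrative, not results from the experiment.

```python
def contribution_ratios(eigenvalues):
    """Share of the total variance explained by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def components_needed(eigenvalues, threshold=0.7):
    """Smallest number of leading components whose accumulated ratio reaches threshold.

    Returns (number of components, accumulated ratio).
    """
    total = sum(eigenvalues)
    running = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count, running / total
    return len(ordered), 1.0


if __name__ == "__main__":
    # Hypothetical eigenvalues, largest first.
    values = [5.0, 3.0, 1.5, 0.5]
    print(contribution_ratios(values))
    print(components_needed(values, threshold=0.7))
```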
Factor Analysis: PCA vs FA • The direction is reversed: in FA the measured responses are based on the underlying factors, while in PCA the principal components are based on the measured responses
Factor Analysis: Factor Rotation • To identify variables with similar factor loadings, we can rotate the factor coordinates in any direction without changing the relative locations of the points to each other.
Experiment: Scatter of Raw Data • The centres of gravity of the components are shifted by noise and outliers.
Experiment: Scatter After Clustering • After clustering, the collected data represent the pattern of the disease more accurately.
Experiment: KMO and Bartlett’s Test • The KMO test indicates whether the data are likely to contain underlying factors. • If KMO < .50, factor analysis is not useful. • Bartlett's test indicates whether the variables are unrelated. • A significance level < .05 indicates significant relationships among the variables.
Experiment: PCA Model Accumulative contribution ratio = 63%
Experiment: PCA Model • The model indicates that the variables can be divided into 3 factors
Experiment: Constructed fuzzy signature • Hierarchical clustering or K-means can be used to cluster each factor • The weighted aggregation method gave higher performance for this fuzzy signature • 3 weights & 3 aggregations
Experiment: Possible rule bases Aggregations: • Min (fever, cough, chest) • Min (dyspnea, lymphopenia) • Max (Min (kinase, malaise), Min (aspartate, dehydrogenase)) Rules: • If a patient has fever, cough and chest. • If a patient has dyspnea and lymphopenia. • If a patient has kinase and malaise, or has aspartate and dehydrogenase.
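The three aggregations can be evaluated directly with `min` and `max`. The patient below is hypothetical: the membership values are invented for illustration and are not taken from the medical data, and the function and key names are assumptions.

```python
def evaluate_factors(patient):
    """Apply the three aggregations from the slide to one patient's symptom degrees."""
    factor_1 = min(patient["fever"], patient["cough"], patient["chest"])
    factor_2 = min(patient["dyspnea"], patient["lymphopenia"])
    factor_3 = max(
        min(patient["kinase"], patient["malaise"]),
        min(patient["aspartate"], patient["dehydrogenase"]),
    )
    return factor_1, factor_2, factor_3


if __name__ == "__main__":
    # Hypothetical degrees in [0, 1] for each symptom or test result.
    patient = {
        "fever": 0.9, "cough": 0.7, "chest": 0.8,
        "dyspnea": 0.4, "lymphopenia": 0.6,
        "kinase": 0.5, "malaise": 0.3,
        "aspartate": 0.2, "dehydrogenase": 0.9,
    }
    print(evaluate_factors(patient))
```

Because `min` models "and" and `max` models "or", each aggregation has the same structure as the rule listed for it.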
Experiment: Possible rule bases Further assumption: • If a patient has fever, cough and chest, he/she has a 64% possibility of having SARS • If he/she has kinase and malaise, or has aspartate and dehydrogenase simultaneously, the possibility increases to 93% • If he/she has dyspnea and lymphopenia, he/she can be diagnosed as a SARS patient
Conclusion Advantages: • Fuzzy signatures are capable of improving the applicability of fuzzy systems. • Fuzzy signatures have the ability to cope with complex structured data and interdependent feature problems. • With weighted aggregation, fuzzy signatures can assist experts in making decisions by removing redundant information
Further Work • Further research can be focused on evaluating underlying relationships between the structures of fuzzy signatures, aggregation functions and weights of each vector.
Thank you ---- Bai Qifeng
Appendix • Demo of Fuzzy Control • Sparse Fuzzy System • Automatic Constructing Fuzzy Signature • Fuzzy c-Means
Fuzzy Control Fuzzy control is the most important current application of fuzzy theory. Fuzzy control usually involves three steps: • Fuzzification • Rule evaluation • Defuzzification
Demo of Fuzzy Control • The most common defuzzification method is the centre of gravity
Demo of Fuzzy Control • The demo uses a procedure originated by Ebrahim Mamdani. The application is to balance a pole on a mobile platform that can move in only two directions, to the left or to the right. The angle between the platform and the pendulum and the angular velocity of this angle are chosen as the inputs of the system. The output corresponds to the speed of the platform.
Fuzzification • First of all, the different levels of input and output are defined by specifying the membership functions for the fuzzy sets. • For simplicity, it is assumed that all membership functions are spread equally, which is why no actual scale is included in the graphs
Fuzzification [Figures: membership functions for the input angle, the input angular velocity and the output speed]
Rule Evaluation • The next step is to define the fuzzy rules. The fuzzy rules are a series of if-then statements. For example: If angle is zero and angular velocity is zero then speed is also zero. If angle is zero and angular velocity is low then the speed shall be low.
Rule Evaluation • The full set of rules is listed in the table
Rule Evaluation • Suppose an example input has membership degrees of • 0.75 and 0.25 for zero and positive low angle • 0.4 and 0.6 for zero and negative low angular velocity.
Rule Evaluation • Consider the rule "if angle is zero and angular velocity is zero, the speed is zero".
Rule Evaluation • Consider the rule "if angle is zero and angular velocity is negative low, the speed is negative low".
Rule Evaluation • Consider the rule "if angle is positive low and angular velocity is zero, the speed is positive low".
Rule Evaluation • The results of the individual rules overlap and are combined into a single output fuzzy set, shown in the following figure
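The three rules considered above can be evaluated with the example membership degrees, taking `min` for "and" within a rule and `max` to combine rules that share an output term. Only the three rules quoted on the slides are included; the full rule table is not reproduced in the source, and the variable names are illustrative.

```python
# Membership degrees of the example input (from the slides).
angle = {"zero": 0.75, "positive low": 0.25}
velocity = {"zero": 0.4, "negative low": 0.6}

# (angle term, angular velocity term, speed term) for the three quoted rules.
rules = [
    ("zero", "zero", "zero"),
    ("zero", "negative low", "negative low"),
    ("positive low", "zero", "positive low"),
]


def evaluate_rules(angle_degrees, velocity_degrees, rule_list):
    """Return the firing strength of each output term."""
    strengths = {}
    for angle_term, velocity_term, speed_term in rule_list:
        # "and" is modelled by the minimum of the two input degrees.
        fired = min(angle_degrees[angle_term], velocity_degrees[velocity_term])
        # Rules with the same output term are combined with the maximum.
        strengths[speed_term] = max(strengths.get(speed_term, 0.0), fired)
    return strengths


if __name__ == "__main__":
    print(evaluate_rules(angle, velocity, rules))
```

Each output membership function is then clipped at its firing strength, and the clipped shapes together form the overlapping result.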
Defuzzification • Defuzzification is used to choose an appropriate representative value as the final output. • The most common method is the centre of gravity
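A discrete centre of gravity can be sketched as a membership-weighted average over sampled output values. The sample points in the demo are hypothetical: the speed values of -1, 0 and 1 stand in for the negative low, zero and positive low terms, since the slides give no actual scale.

```python
def centre_of_gravity(xs, memberships):
    """Discrete centroid: sum(x * mu(x)) / sum(mu(x)) over the sampled points."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("output fuzzy set is empty")
    return sum(x * mu for x, mu in zip(xs, memberships)) / total


if __name__ == "__main__":
    # Hypothetical speed values with illustrative membership degrees.
    speeds = [-1.0, 0.0, 1.0]
    degrees = [0.6, 0.4, 0.25]
    print(centre_of_gravity(speeds, degrees))
```

A finer sampling of the clipped output shapes gives a closer approximation to the continuous centre of gravity.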
Sparse Fuzzy Systems • Sparse fuzzy systems can be used in situations where full knowledge of the problem domain is not available. Problem domain experts often work with only the important fuzzy rules. • Self-learning algorithms that tune the parameters of a fuzzy system to improve accuracy can also lead to sparse fuzzy systems. In most cases, parameter tuning involves reshaping the fuzzy sets in the rule antecedents. It can happen that shrinking the fuzzy sets leads to gaps between neighbouring fuzzy sets. • A sparse fuzzy system benefits from the reduced number of rules. (Chong 2004)