Soft Computing, Machine Intelligence and Data Mining Sankar K. Pal Machine Intelligence Unit Indian Statistical Institute, Calcutta http://www.isical.ac.in/~sankar
MIU Activities (Formed in March 1993) • Pattern Recognition and Image Processing • Color Image Processing • Data Mining • Data Condensation, Feature Selection • Support Vector Machine • Case Generation • Soft Computing • Fuzzy Logic, Neural Networks, Genetic Algorithms, Rough Sets • Hybridization • Case Based Reasoning • Fractals/Wavelets • Image Compression • Digital Watermarking • Wavelet + ANN • Bioinformatics
Externally Funded Projects • INTEL • CSIR • Silicogene • Center for Excellence in Soft Computing Research • Foreign Collaborations (Japan, France, Poland, Hong Kong, Australia) • Editorial Activities • Journals, Special Issues • Books • Achievements/Recognitions Faculty: 10 Research Scholars/Associates: 8
Contents • What is Soft Computing ? - Computational Theory of Perception • Pattern Recognition and Machine Intelligence • Relevance of Soft Computing Tools • Different Integrations
Emergence of Data Mining • Need • KDD Process • Relevance of Soft Computing Tools • Rule Generation/Evaluation • Modular Evolutionary Rough Fuzzy MLP • Modular Network • Rough Sets, Granules & Rule Generation • Variable Mutation Operations • Knowledge Flow • Example and Merits
Rough-fuzzy Case Generation • Granular Computing • Fuzzy Granulation • Mapping Dependency Rules to Cases • Case Retrieval • Examples and Merits • Conclusions
SOFT COMPUTING (L. A. Zadeh) • Aim: • To exploit the tolerance for imprecision, uncertainty, approximate reasoning and partial truth to achieve tractability, robustness, low solution cost, and close resemblance with humanlike decision making • To find an approximate solution to an imprecisely/precisely formulated problem.
Parking a Car: Generally, a car can be parked rather easily because the final position of the car is not specified exactly. If it were specified to within, say, a fraction of a millimeter and a few seconds of arc, it would take hours or days of maneuvering and precise measurements of distance and angular position to solve the problem. High precision carries a high cost.
The challenge is to exploit the tolerance for imprecision by devising methods of computation which lead to an acceptable solution at low cost. This, in essence, is the guiding principle of soft computing.
Soft Computing is a collection of methodologies (working synergistically, not competitively) which, in one form or another, reflect its guiding principle: exploit the tolerance for imprecision, uncertainty, approximate reasoning and partial truth to achieve tractability, robustness, and close resemblance with humanlike decision making. • Foundation for the conception and design of high-MIQ (Machine IQ) systems.
Provides flexible information processing capability for representation and evaluation of various real-life ambiguous and uncertain situations (Real World Computing). • It may be argued that it is soft computing, rather than hard computing, that should be viewed as the foundation for Artificial Intelligence.
• At this juncture, the principal constituents of soft computing are Fuzzy Logic (FL), Neurocomputing (NC), Genetic Algorithms (GA) and Rough Sets (RS). • Within Soft Computing, FL, NC, GA and RS are complementary rather than competitive.
• Role of FL: the algorithms for dealing with imprecision and uncertainty • NC: the machinery for learning and curve fitting • GA: the algorithms for search and optimization • RS: handling uncertainty arising from the granularity in the domain of discourse
Referring back to the example "Parking a Car": do we use any measurement and computation while performing the task? In Soft Computing we use the Computational Theory of Perceptions (CTP).
Computational Theory of Perceptions (CTP) (AI Magazine, 22(1), 73-84, 2001) • Provides the capability to compute and reason with perception-based information. Examples: parking a car, driving in a city, cooking a meal, summarizing a story • Humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements and computations
• They use perceptions of time, direction, speed, shape, possibility, likelihood, truth, and other attributes of physical and mental objects • Reflecting the finite ability of the sensory organs (and, finally, the brain) to resolve details, perceptions are inherently imprecise
• Perceptions are F-granular (both fuzzy and granular) • Boundaries of perceived classes are unsharp • Values of attributes are granulated (a granule: a clump of indistinguishable points/objects) Example: granules in age: very young, young, not so old, … Granules in direction: slightly left, sharp right, …
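The granulated, unsharp attribute values above can be sketched with overlapping fuzzy membership functions. The granule names follow the slide; the triangular functions and their breakpoints below are illustrative assumptions, not taken from the talk.

```python
# Illustrative fuzzy granulation of the attribute "age".
# Granule names are from the slide; the triangular membership
# functions and their parameters are hypothetical choices.

def tri(x, a, b, c):
    """Triangular membership function rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Overlapping granules: the boundaries between them are unsharp.
granules = {
    "very young": lambda x: tri(x, -1, 5, 20),
    "young":      lambda x: tri(x, 10, 25, 40),
    "not so old": lambda x: tri(x, 30, 45, 60),
}

def memberships(age):
    """Degree to which an age belongs to each granule."""
    return {name: round(mu(age), 2) for name, mu in granules.items()}
```

An age of 35 then belongs to both "young" and "not so old" with degree 0.33: the point sits in the unsharp boundary between two granules, rather than in exactly one crisp class.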
Machine Intelligence: a core concept for grouping various advanced technologies with Pattern Recognition and Learning • Data Driven Systems: neural network systems, evolutionary computing, fuzzy logic, rough sets • Hybrid Systems: neuro-fuzzy, genetic-neural, fuzzy-genetic, fuzzy-neuro-genetic • Knowledge-based Systems: probabilistic reasoning, approximate reasoning, case based reasoning • Non-linear Dynamics: chaos theory, rescaled range analysis (wavelet), fractal analysis • Pattern recognition and learning
Relevance of FL, ANN, GAs Individually to PR Problems is Established
In the late eighties scientists thought: why not integrations? • Fuzzy Logic + ANN • ANN + GA • Fuzzy Logic + ANN + GA • Fuzzy Logic + ANN + GA + Rough Set • Neuro-fuzzy hybridization is the most visible integration realized so far.
Why Fusion? • Fuzzy set theoretic models try to mimic human reasoning and the capability of handling uncertainty (SW) • Neural network models attempt to emulate the architecture and information representation scheme of the human brain (HW) • NEURO-FUZZY computing (for a more intelligent system)
• Fuzzy system with an ANN used for learning and adaptation → Neuro-Fuzzy System (NFS) • ANN with fuzzy sets used to augment its application domain → Fuzzy Neural Network (FNN)
MERITS • GENERIC • APPLICATION SPECIFIC
Rough Fuzzy Hybridization: A New Trend in Decision Making, S. K. Pal and A. Skowron (eds), Springer-Verlag, Singapore, 1999
IEEE TNN, 9, 1203-1216, 1998 [Figure: each input feature Fj is fuzzified into three overlapping linguistic sets, low (L), medium (M) and high (H), giving memberships FjL, FjM and FjH; the network links are encoded as bit strings (e.g., X X | 0 0 0 | …) and tuned by a GA.] Integration of ANN, FL, GAs and Rough Sets: incorporate domain knowledge using Rough Sets.
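The low/medium/high fuzzification in the figure can be sketched as follows: each normalized input feature is replaced by its three memberships, so an n-dimensional pattern becomes a 3n-dimensional input to the MLP. The triangular membership functions over [0, 1] are a simplifying assumption here; the fuzzy MLP literature typically uses pi-type functions.

```python
# Sketch of the fuzzy MLP input layer: each feature value Fj is mapped
# to its memberships (FjL, FjM, FjH) in three overlapping linguistic
# sets. Triangular functions are an assumption made for brevity.

def tri(x, a, b, c):
    """Triangular membership function rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(feature_vector):
    """Map each normalized feature in [0, 1] to (low, medium, high)."""
    out = []
    for x in feature_vector:
        out += [tri(x, -0.5, 0.0, 0.5),   # low: peaks at 0
                tri(x, 0.0, 0.5, 1.0),    # medium: peaks at 0.5
                tri(x, 0.5, 1.0, 1.5)]    # high: peaks at 1
    return out
```

A feature value of 0.25, for instance, fires "low" and "medium" equally and "high" not at all, so the network sees graded class membership rather than a crisp value.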
Before we describe the • Modular Evolutionary Rough-fuzzy MLP • Rough-fuzzy Case Generation system, we explain Data Mining and the significance of Pattern Recognition, Image Processing and Machine Intelligence.
Why Data Mining ? • The digital revolution has made digitized information easy to capture and fairly inexpensive to store. • With the development of computer hardware and software and the rapid computerization of business, huge amounts of data have been collected and stored in centralized or distributed databases. • The data is heterogeneous (a mixture of text, symbolic, numeric, texture and image data), huge (both in dimension and size) and scattered. • The volume of such stored data is growing at a phenomenal rate.
As a result, traditional ad hoc mixtures of statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data.
Data Mining: Pattern Recognition and Machine Learning principles applied to a very large (both in size and dimension) heterogeneous database. Data Mining + Knowledge Interpretation = Knowledge Discovery: the process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.
Pattern Recognition, World Scientific, 2001 — Knowledge Discovery in Databases (KDD): Huge heterogeneous raw data → Preprocessing (• Data cleaning • Data condensation • Dimensionality reduction • Data wrapping/description) → Machine Learning on the preprocessed data (• Classification • Clustering • Rule generation) → Mathematical model of data (patterns); these steps constitute Data Mining (DM) → Knowledge interpretation (• Knowledge extraction • Knowledge evaluation) → Useful knowledge
Why Growth of Interest ? • Falling cost of large storage devices and increasing ease of collecting data over networks. • Availability of Robust/Efficient machine learning algorithms to process data. • Falling cost of computational power enabling use of computationally intensive methods for data analysis.
Example: Medical Data • Numeric and textual information may be interspersed • Different symbols can be used with the same meaning • Redundancy often exists • Erroneous/misspelled medical terms are common • The data is often sparsely distributed
A robust preprocessing system is required to extract any kind of knowledge from even medium-sized medical data sets • The data must not only be cleaned of errors and redundancy, but also organized in a fashion that makes sense for the problem
So, We NEED • Efficient • Robust • Flexible Machine Learning Algorithms NEED for Soft Computing Paradigm
Without “Soft Computing” Machine Intelligence Research Remains Incomplete.
Modular Neural Networks Task: split a learning task into several subtasks, train a subnetwork for each subtask, and integrate the subnetworks to generate the final solution. Strategy: 'Divide and Conquer'
The approach involves • Effective decomposition of the problem such that the subproblems can be solved with compact networks. • Effective combination and training of the subnetworks such that there is a gain in total training time, network size and accuracy of the solution.
Advantages • Accelerated training • The final solution network has more structured components • Representation of individual clusters (irrespective of size/importance) is better preserved in the final solution network • The catastrophic interference problem of neural network learning (in the case of overlapped regions) is reduced
[Figure: a 3-class problem is decomposed into three 2-class problems, solved by the Class 1, Class 2 and Class 3 subnetworks. In Phase I the subnetwork modules are integrated, preserving the values of the existing links and marking the inter-module links to be grown. In the final training phase the inter-module links are grown, yielding the final network.]
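The divide-and-conquer strategy can be given a minimal sketch with toy one-vs-rest perceptrons standing in for the subnetworks. The actual model in this talk is a fuzzy MLP with a further integration and refinement phase; the perceptron, the data, and all parameter values below are assumptions made purely for illustration.

```python
# Sketch: split a k-class problem into k two-class (one-vs-rest)
# subtasks, train one subnetwork per subtask, then integrate by taking
# the class whose subnetwork responds most strongly.

def train_perceptron(data, labels, epochs=50, lr=0.1):
    """Toy two-class 'subnetwork': a single perceptron, labels in {0, 1}."""
    w, b = [0.0] * len(data[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def train_modular(data, classes):
    """One subnetwork per class, each trained on its own 2-class subtask."""
    return {c: train_perceptron(data, [1 if y == c else 0 for y in classes])
            for c in set(classes)}

def predict(subnets, x):
    """Integration: pick the class with the strongest subnetwork response."""
    def score(c):
        w, b = subnets[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(subnets, key=score)
```

In the modular rough-fuzzy MLP the integration step is richer than this arg-max: the trained modules are concatenated into one network whose inter-module links are then grown and refined.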
Modular Rough Fuzzy MLP: a modular network designed using four different soft computing tools. Basic network model: Fuzzy MLP. Rough set theory is used to generate crude decision rules representing each of the classes from the discernibility matrix. (There may be multiple rules for each class, hence multiple subnetworks per class.)
The knowledge-based subnetworks are concatenated to form a population of initial solution networks. The final solution network is evolved using a GA with a variable mutation operator: the bits corresponding to the intra-module links (already evolved) have a low mutation probability, while the inter-module links have a high mutation probability.
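The variable mutation operator above can be sketched as a bit-flip mutation whose probability depends on whether a bit encodes an intra-module or inter-module link. The particular probability values are hypothetical.

```python
import random

# Sketch of the variable mutation operator: bits for already-evolved
# intra-module links mutate rarely, bits for newly grown inter-module
# links mutate often. The p_low/p_high values are illustrative.

def variable_mutation(chromosome, is_intra, p_low=0.01, p_high=0.3,
                      rng=random):
    """Flip each bit with a position-dependent mutation probability."""
    return [1 - bit if rng.random() < (p_low if intra else p_high) else bit
            for bit, intra in zip(chromosome, is_intra)]
```

This way the GA explores mostly in the inter-module region of the search space, preserving the knowledge already encoded inside each module.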
Rough Sets [Figure: a set X in the feature space ΩB, with its lower approximation B̲X, its upper approximation B̄X, and the granules [x]B.] [x]B = the set of all points belonging to the same granule as the point x in the feature space ΩB, i.e., the set of all points which are indiscernible from x in terms of the feature subset B.
Approximations of the set X w.r.t. the feature subset B • B-lower: B̲X = granules definitely belonging to X • B-upper: B̄X = granules definitely and possibly belonging to X • If B̲X = B̄X, X is B-exact (B-definable); otherwise it is roughly definable
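The two approximations can be sketched directly: group the objects into granules by their values on the feature subset B, then collect the granules that lie wholly (lower) or partly (upper) inside X. The example data in the usage below reuses the F1/F2 values of the decision table shown later under Rough Set Rule Generation.

```python
from collections import defaultdict

# Sketch of B-lower and B-upper approximations. Objects fall into the
# same granule (equivalence class) when they agree on every feature in
# B; a granule enters the lower approximation only if it lies entirely
# inside X, and the upper approximation if it intersects X at all.

def approximations(objects, B, X):
    """objects: {name: {feature: value}}; B: feature list; X: set of names."""
    granules = defaultdict(set)
    for name, feats in objects.items():
        granules[tuple(feats[f] for f in B)].add(name)
    lower, upper = set(), set()
    for g in granules.values():
        if g <= X:
            lower |= g      # granules definitely belonging to X
        if g & X:
            upper |= g      # granules definitely and possibly belonging to X
    return lower, upper
```

With B = {F1, F2} and X = Class 1 = {x1, x2, x3}, objects x3 and x5 share a granule, so the lower approximation is {x1, x2} and the upper is {x1, x2, x3, x5}: X is only roughly definable with these two features.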
Rough Sets • Uncertainty handling (using lower & upper approximations) • Granular computing (using information granules)
Granular Computing: Computation is performed using information granules and not the data points (objects) • Information compression • Computational gain
Information Granules and Rough Set Theoretic Rules [Figure: the feature space F1 × F2 is partitioned into low/medium/high granules along each axis; a rule covers the class region with a few such granules.] • A rule provides a crude description of the class using granules
Rough Set Rule Generation
Decision Table:

Object  F1  F2  F3  F4  F5  Decision
x1      1   0   1   0   1   Class 1
x2      0   0   0   0   1   Class 1
x3      1   1   1   1   1   Class 1
x4      0   1   0   1   0   Class 2
x5      1   1   1   0   0   Class 2

Discernibility Matrix (c) for Class 1:
Discernibility function: the discernibility function considering the object x1 belonging to Class 1 = (discernibility of x1 w.r.t. x2) AND (discernibility of x1 w.r.t. x3). Similarly, the discernibility functions considering the other objects are obtained. Dependency Rules (AND-OR form):
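The discernibility computation above can be sketched for the decision table: the matrix entry for a pair of objects is the set of features on which they take different values, and ANDing x1's entries against the other Class 1 objects gives the clauses of x1's discernibility function. The helper name below is illustrative.

```python
# Sketch: discernibility-matrix entries for the decision table above.
# The entry for a pair of objects is the set of features on which the
# two objects differ.

def discernibility_matrix(table, features):
    """table: {object: tuple of feature values} -> {(oi, oj): feature set}."""
    names = list(table)
    return {(oi, oj): {f for k, f in enumerate(features)
                       if table[oi][k] != table[oj][k]}
            for i, oi in enumerate(names) for oj in names[i + 1:]}

# Class 1 objects of the decision table (columns F1..F5).
class1 = {"x1": (1, 0, 1, 0, 1), "x2": (0, 0, 0, 0, 1), "x3": (1, 1, 1, 1, 1)}
matrix = discernibility_matrix(class1, ["F1", "F2", "F3", "F4", "F5"])
```

Here x1 and x2 differ on {F1, F3}, and x1 and x3 on {F2, F4}, so the discernibility function for x1 is (F1 OR F3) AND (F2 OR F4): an AND-OR expression of exactly the kind the dependency rules take.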
Knowledge Flow in Modular Rough Fuzzy MLP (IEEE Trans. Knowledge Data Engg., 15(1), 14-25, 2003) [Figure: rough set rules for the classes in the feature space F1 × F2, C1 (rule R1) and C2 (rules R2, R3), are mapped to subnetworks: R1 → Subnet 1 (SN1), R2 → Subnet 2 (SN2), R3 → Subnet 3 (SN3). Each subnetwork is partially trained with an ordinary GA, yielding the partially refined subnetworks SN1, SN2 and SN3.]
[Figure: the partially refined subnetworks are concatenated, with low mutation probability on the intra-module links and high mutation probability on the inter-module links. The population of concatenated networks is evolved with a GA having the variable mutation operator, yielding the final solution network for the classes C1 and C2 in the feature space.]