Using Natural Cluster Information to Build Fuzzy Indexing Structure. Department of Computer Science and Engineering The Chinese University of Hong Kong. H.Y. Yue, I. King and K.S. Leung.
We propose a novel fuzzy clustering algorithm, Sequential Fuzzy Competitive Clustering (SFCC), to obtain natural cluster information from the data. We then use this information to build an efficient indexing structure, the SFCC-binary tree (SFCC-b-tree). Our experimental results show that SFCC-b-tree performs better than the VP-tree in most cases.

The Pruning Algorithm
• Each node in SFCC-b-tree contains a Minimum Boundary Rectangle (MBR), which is the smallest rectangle containing all the data objects for the node.
• Two n-dimensional hyper-cubes P and Q overlap if and only if P and Q overlap in all n dimensions.

Key Problems
• Multimedia data are often high-volume and high-dimensional.
• Most indexing methods for content-based retrieval divide the objects in the same natural cluster into several different partitions.
• Performance degrades when the queries lie near the partition boundaries.

[Figure: The Searching Algorithm in SFCC-b-tree]

Contributions
• Propose Sequential Fuzzy Competitive Clustering (SFCC).
• Build an efficient indexing structure, the SFCC Binary Tree (SFCC-b-tree), using SFCC.
• Demonstrate how to use the MBR of each node to perform nearest-neighbor search efficiently.

The Advantages of SFCC-b-tree
• Groups the data in the same natural cluster in the same node.
• Fast building time.
• Fast retrieval time.

An Illustration of SFCC-b-tree
Figure 1: Procedure of the SFCC-b-tree
(1) Find the natural cluster information for the data set.
(2) Split the data set into two child nodes according to the natural cluster information.
(3) Repeat from step 1 on each child node until the child node is small enough to fit into a leaf node.
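
The MBR overlap test and the three-step build procedure can be sketched in a few lines of Python. This is a minimal illustration and not the authors' C++ implementation. The poster does not describe the SFCC update rules, so the hypothetical helper `split_two_clusters` substitutes a plain 2-means split purely as a stand-in. The names `MBR`, `Node`, `build`, `range_search` and the parameter `leaf_size` are likewise assumptions made for this example. The query shown is a simple window query that demonstrates MBR pruning; the poster's nearest-neighbor search is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class MBR:
    low: Point   # per-dimension minimum
    high: Point  # per-dimension maximum


def mbr_of(points: Sequence[Point]) -> MBR:
    """Smallest axis-aligned rectangle containing all points."""
    dims = range(len(points[0]))
    return MBR(tuple(min(p[d] for p in points) for d in dims),
               tuple(max(p[d] for p in points) for d in dims))


def overlaps(p: MBR, q: MBR) -> bool:
    """P and Q overlap iff their intervals overlap in every dimension."""
    return all(pl <= qh and ql <= ph
               for pl, ph, ql, qh in zip(p.low, p.high, q.low, q.high))


@dataclass
class Node:
    mbr: MBR
    points: Optional[List[Point]] = None  # set only on leaf nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def split_two_clusters(points: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Stand-in for SFCC (NOT the authors' algorithm): a few rounds of 2-means."""
    c0, c1 = min(points), max(points)  # deterministic seeds
    for _ in range(10):
        a, b = [], []
        for p in points:
            d0 = sum((x - y) ** 2 for x, y in zip(p, c0))
            d1 = sum((x - y) ** 2 for x, y in zip(p, c1))
            (a if d0 <= d1 else b).append(p)
        if not a or not b:
            break
        c0 = tuple(sum(xs) / len(a) for xs in zip(*a))
        c1 = tuple(sum(xs) / len(b) for xs in zip(*b))
    if not a or not b:  # degenerate case: fall back to halving the list
        mid = len(points) // 2
        a, b = points[:mid], points[mid:]
    return a, b


def build(points: List[Point], leaf_size: int = 4) -> Node:
    """Steps (1)-(3): cluster, split into two children, recurse until leaf-sized."""
    node = Node(mbr=mbr_of(points))
    if len(points) <= leaf_size:
        node.points = list(points)
        return node
    a, b = split_two_clusters(points)
    node.left, node.right = build(a, leaf_size), build(b, leaf_size)
    return node


def range_search(node: Node, query: MBR) -> List[Point]:
    """Window query: skip any subtree whose MBR does not overlap the query."""
    if not overlaps(node.mbr, query):
        return []  # pruned
    if node.points is not None:
        return [p for p in node.points
                if all(lo <= x <= hi
                       for x, lo, hi in zip(p, query.low, query.high))]
    return range_search(node.left, query) + range_search(node.right, query)


data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
tree = build(data, leaf_size=3)
hits = range_search(tree, MBR((4.5, 4.5), (5.5, 5.5)))
```

In this toy data set the two well-separated groups end up in different child nodes, so the window query around (5, 5) never visits the subtree holding the points near the origin.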

Experiments
• Compare the building and searching times of SFCC-b-tree with those of VP-tree.
• The experiments were run on an Ultra Sparc5, with the implementation in C++.
• Building time (in seconds) at different dimensionalities.
• Searching time (in seconds) at different dimensionalities.

Selected Publications
H.Y. Yue, I. King, and K.S. Leung, Fuzzy clustering method for content-based indexing, in 2001 WSES FSFS International Conference, Fuzzy Sets and Fuzzy Systems, volume 1, pages 5411-5419, 2001.
H.Y. Yue, Fuzzy Clustering for Content-Based Indexing in Multimedia Databases, M.Phil. thesis, Department of Computer Science and Engineering, CUHK, 2001.