
Serkan Kiranyaz and Moncef Gabbouj

Hierarchical Cellular Tree: An Efficient Indexing Scheme for Content-Based Retrieval on Multimedia Databases


Presentation Transcript


  1. Hierarchical Cellular Tree: An Efficient Indexing Scheme for Content-Based Retrieval on Multimedia Databases • Serkan Kiranyaz and Moncef Gabbouj

  2. Objective • To present the technique of using a Hierarchical Cellular Tree (HCT) as an indexing scheme for content-based retrieval on multimedia databases.

  3. Why is this technique important? • Improvements in hardware and network technology • Daily use of the Internet • The technique reduces costly I/O operations

  4. HCT Overview • Is a MAM (Metric Access Method) technique • Based on the M-tree • Is a dynamic, cell-based, hierarchically structured indexing method • Items are partitioned by distance and stored in cells according to their similarity proximity • Self-organized tree implemented via genetic programming principles

  5. Indexing Technique Categories • SAM (spatial access method) • (Dis-)similarity is measured only through Euclidean distance • Not suited for deep spanning trees • MAM (metric access method) • Support a black-box approach to (dis-)similarity distance • Allow for deep trees • Do not support dynamic changes*
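
The "black box" approach on this slide can be illustrated with a minimal sketch: the search code never looks inside the items and only calls a distance function supplied by the caller. The names `nearest` and `hamming` are illustrative assumptions, not part of HCT or of any MAM library.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def nearest(query: T, items: Sequence[T], dist: Callable[[T, T], float]) -> T:
    """Return the item closest to the query under any caller-supplied distance."""
    return min(items, key=lambda item: dist(query, item))


def hamming(a: str, b: str) -> int:
    """Count differing positions between two equal-length strings (a metric)."""
    return sum(x != y for x, y in zip(a, b))


# The same search routine works for strings, vectors, or any other item type,
# as long as a distance function is available for it.
words = ["karolin", "kathrin", "kerstin"]
closest = nearest("karolim", words, hamming)
```

A SAM, by contrast, would need coordinates for every item so that Euclidean distance can be computed.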

  6. *M-tree Similarities • Is a dynamic MAM • Has a hierarchical structure based on the mitosis of a cell • Tree grows one level upwards whenever a split occurs at the top level • Each cell is represented by a nucleus (except the topmost cell)

  7. M-tree Problems • M-tree achieves a balanced tree with low I/O cost on large datasets • Problem: Multimedia databases are seldom balanced • HCT: Cells are unbalanced and can vary in size • The size of the database entries/cells (capacity M) must be known before building • Problem: All M-tree structures can hit upper limits (size is not dynamic) • HCT: Removes the limit on cell size as long as cells keep a definite "compactness" measure

  8. M-tree Problems • M-tree compactness is measured only by the distance from the nucleus to the furthest object (covering radius) • Problem: Determining compactness this way does not allow for dynamic sizing of cells • HCT: Uses all cell items and their minimum distances to the cell (instead of a single nucleus item alone); compactness is constantly updated

  9. Related Work in Multimedia Databases (SAM trees) • KD-trees • Hierarchical tree structure • Use space-partitioning methods that divide the feature space with predefined hyperplanes • R-trees • Feature space is divided according to the distribution of database items • Region overlapping may occur

  10. Related Work in Multimedia Databases (SAM trees) • R*-tree • Improves the node splitting of the R-tree by taking overlapping areas into consideration • TV-tree • Uses telescope vectors • The authors refer to them only as "so called telescope vectors" • A Google search did not turn up anything meaningful on telescope vectors

  11. Related Work in Multimedia Databases (SAM trees) • X-tree • Avoids overlapping of region bounding boxes by using a new organization of the directory • Boxes can still intersect at higher levels in the tree • The paper does not go into detail on what a bounding box is (assumption: bounding box = cell) • SS-tree • Uses minimum bounding spheres instead of boxes • Fewer intersections at higher levels

  12. Related Work in Multimedia Databases (MAM trees) • vp-tree (vantage point) • Organizes feature vectors (data points) into two groups according to their similarity distances with respect to a single point (the vantage point) • mvp-tree (multiple vantage point) • Assigns multiple vantage points instead of one
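
As a rough illustration of the vp-tree idea, the sketch below splits a set of points into two groups around one vantage point. Using the median distance as the dividing line is the usual textbook formulation and is an assumption here, since the slide does not say how the two groups are separated; `vp_split` and `euclidean` are illustrative names.

```python
from statistics import median
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Plain Euclidean distance; any other metric could be passed instead."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def vp_split(
    points: Sequence[Point],
    vantage: Point,
    dist: Callable[[Point, Point], float],
) -> Tuple[float, List[Point], List[Point]]:
    """Split points into those near to and far from the vantage point."""
    others = [p for p in points if p != vantage]
    mu = median(dist(vantage, p) for p in others)  # assumed split radius
    inside = [p for p in others if dist(vantage, p) <= mu]
    outside = [p for p in others if dist(vantage, p) > mu]
    return mu, inside, outside


pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (0.0, 4.0)]
mu, inside, outside = vp_split(pts, (0.0, 0.0), euclidean)
```

An mvp-tree repeats this kind of split with several vantage points per node, which produces more than two groups.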

  13. HCT Structure - Cell Structure • Basic container in which similar database items are stored • Ground-level cells contain all of the database items • Cells carry an MST (Minimum Spanning Tree) • Holds the minimum (dis-)similarity distance of each item to the other items within the cell • Used to determine when mitosis should occur • Splits occur at the longest branch • This is actually very similar to the mvp-tree, except every cell is treated as a vantage point • Gives a better idea of the similarity proximity of an item
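
A minimal sketch of the two MST ideas on this slide: building a minimum spanning tree over the items of one cell, and splitting the cell at its longest branch. Prim's algorithm is used only because it is simple; the slides do not say which MST algorithm HCT uses, and the function names are illustrative assumptions. Items are plain numbers here to keep the example short.

```python
from typing import Callable, List, Sequence, Tuple

Edge = Tuple[int, int, float]  # (item index, item index, distance)


def build_mst(items: Sequence[float], dist: Callable[[float, float], float]) -> List[Edge]:
    """Prim's algorithm over the complete distance graph of one cell's items."""
    if len(items) < 2:
        return []
    # best[j] = (distance, i): cheapest known link from tree item i to outside item j
    best = {j: (dist(items[0], items[j]), 0) for j in range(1, len(items))}
    edges: List[Edge] = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        for k in best:  # item j just joined the tree and may offer cheaper links
            d_new = dist(items[j], items[k])
            if d_new < best[k][0]:
                best[k] = (d_new, j)
    return edges


def split_at_longest_branch(n_items: int, edges: List[Edge]) -> Tuple[List[int], List[int]]:
    """Remove the longest MST branch and return the two resulting item groups."""
    longest = max(edges, key=lambda e: e[2])
    neighbours = {i: [] for i in range(n_items)}
    for edge in edges:
        if edge is not longest:
            a, b, _ = edge
            neighbours[a].append(b)
            neighbours[b].append(a)
    # Collect everything still reachable from one end of the removed branch.
    seen, stack = {longest[0]}, [longest[0]]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen), sorted(set(range(n_items)) - seen)


values = [0.0, 1.0, 2.0, 10.0, 11.0]
mst = build_mst(values, lambda a, b: abs(a - b))
left, right = split_at_longest_branch(len(values), mst)
```

In this example the longest branch joins the items 2.0 and 10.0, so breaking it separates the three small values from the two large ones.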

  14. HCT Structure - Cell Structure • Cells cannot undergo mitosis before reaching a specific level of maturity • This works like real cells • The reason for it is not like real cells • Nucleus • Represents its owner cell on the higher level • Nucleus is found through the MST • It is the item with the maximum number of branches • Nucleus is updated with every operation performed • M-tree does not do this
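
The nucleus rule above (the item with the most MST branches) can be sketched as a degree count over the MST edges. The edge format `(i, j, distance)` and the tie-breaking rule are assumptions made for this example; the slides do not say how ties are resolved.

```python
from collections import Counter
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (item index, item index, distance)


def pick_nucleus(edges: List[Edge]) -> int:
    """Return the index of the item that has the most MST branches attached."""
    if not edges:
        return 0  # assumption: in a single-item cell that item is the nucleus
    degree = Counter()
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    # Assumption: ties go to the lowest item index.
    return min(degree, key=lambda i: (-degree[i], i))


cell_mst = [(0, 1, 1.0), (1, 2, 1.5), (1, 3, 2.0), (3, 4, 0.5)]
nucleus = pick_nucleus(cell_mst)
```

Here item 1 has three branches, more than any other item, so it is chosen. Because the MST changes with every insertion or removal, the count would be repeated after each operation.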

  15. HCT Structure - Cell Structure • Cell Compactness • How tightly focused the clustering is for items within the cell • High variations are eliminated by using more than a single item (vantage point)

  16. HCT Structure - Cell Structure • Cell Mitosis • Two conditions for mitosis • Maturity (N_c > N_m) • N_c = number of items in the cell • N_m = minimum maturity limit • Cell Compactness (CF_c > CThr_L) • CF_c = compactness feature of the cell • CThr_L = compactness threshold of the current level • Cell mitosis has no cost, as the cell is simply split by breaking the longest branch
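
The two mitosis conditions can be written as a single predicate. This sketch takes the inequalities on the slide at face value, which suggests that a larger compactness feature means a looser cell; how the compactness feature and the level threshold are computed is not given in the slides, so both are plain inputs here and all names are illustrative.

```python
def should_split(
    n_items: int,
    maturity_size: int,
    compactness_feature: float,
    level_threshold: float,
) -> bool:
    """Mitosis is due only when the cell is mature AND exceeds the threshold."""
    is_mature = n_items > maturity_size
    exceeds_threshold = compactness_feature > level_threshold
    return is_mature and exceeds_threshold


# A mature cell above the threshold splits; an immature or compact one does not.
decisions = [
    should_split(8, 6, 2.4, 1.9),
    should_split(4, 6, 2.4, 1.9),
    should_split(8, 6, 1.2, 1.9),
]
```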

  17. HCT Structure - Cell Structure

  18. HCT Structure - Level Structure • Top level is always a single cell • If mitosis occurs on the top level, a new top level is created to preserve the single-cell top level • Each level attempts to dynamically maximize the compactness of its cells

  19. HCT Structure - HCT Operations • Three operations • Cell mitosis • Item insertion • Item removal • As stated before, all three operations cause a recalculation of compactness

  20. HCT Structure - HCT Operations • Insert • First performs the Pre-Emptive cell search • Recursively descends the HCT from the top to the target level • Once the target is located, inserts the item into the target cell • Performs a post-processing check • Checks for mitosis • Recalculates compactness for single or multiple cells • If mitosis was performed • Removes the old nucleus item from the higher level • Consecutively calls Insert for the new nucleus
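
The descent-and-insert part of the operation can be sketched as follows. This is a simplification under stated assumptions: a greedy "closest nucleus" descent stands in for the Pre-Emptive cell search, whose details the slides do not give, and the post-processing check (mitosis, compactness update, nucleus re-insertion) is left out. The `Cell` class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Cell:
    nucleus: float
    items: List[float] = field(default_factory=list)       # ground-level cells only
    children: List["Cell"] = field(default_factory=list)   # upper-level cells only


def find_target_cell(top: Cell, item: float, dist: Callable[[float, float], float]) -> Cell:
    """Descend from the top cell, following the closest nucleus at each level."""
    cell = top
    while cell.children:
        cell = min(cell.children, key=lambda child: dist(item, child.nucleus))
    return cell


def insert(top: Cell, item: float, dist: Callable[[float, float], float]) -> Cell:
    """Place the item in its target ground-level cell and return that cell."""
    target = find_target_cell(top, item, dist)
    target.items.append(item)
    # A full implementation would now run the post-processing check.
    return target


small = Cell(nucleus=1.0, items=[0.5, 1.0, 1.5])
large = Cell(nucleus=10.0, items=[9.0, 10.0, 11.0])
top = Cell(nucleus=1.0, children=[small, large])
target = insert(top, 8.5, lambda a, b: abs(a - b))
```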

  21. HCT Structure - HCT Indexing • HCT can index using any set of available features • Must have a fusion mechanism • Must have a similarity measure • Consists of two operations • Incremental construction • Optional periodic fitness check

  22. HCT Structure - HCT Indexing • HCT Incremental Construction • Takes a database D and appends all new items contained in an array • If an HCT does not already exist for database D • All current items of D are inserted into the array • A new HCT body is constructed from D • Else, if an HCT does exist for database D • The HCT body is first loaded • The HCT body is updated with the contents of the array
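
The control flow of incremental construction can be sketched as below. Only the branching is modelled: the HCT body is passed in as an opaque object (or `None` when no HCT exists yet), and a caller-supplied `insert` function stands in for the real Insert operation. All names are illustrative assumptions.

```python
from typing import Callable, List, Optional, Sequence


def incremental_construction(
    existing_body: Optional[list],
    database_items: Sequence[str],
    new_items: Sequence[str],
    insert: Callable[[list, str], None],
) -> list:
    """Build a new HCT body, or update a loaded one, from the pending items."""
    pending: List[str] = list(new_items)
    if existing_body is None:
        # No HCT yet: everything already in the database must be indexed too.
        pending = list(database_items) + pending
        body: list = []
    else:
        # HCT already exists: only the new items need to be inserted.
        body = existing_body
    for item in pending:
        insert(body, item)
    return body


# A plain list with list.append stands in for the HCT body and its Insert.
fresh = incremental_construction(None, ["a", "b"], ["c"], list.append)
updated = incremental_construction(["a", "b"], ["a", "b"], ["c"], list.append)
```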

  23. HCT Structure - HCT Indexing • HCT Fitness Check • Aims to minimize corruption that can happen during construction of the HCT body • Corruption happens because the order in which items are inserted is not controlled • Outliers Check • Reduces the "crowd effect" by removing redundant minority cells • Minority cells: cells containing only one or a few items • All minority cells are reintroduced into the system to see if they fit into another cell
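
A rough sketch of the outliers check: items from minority cells are taken out and reintroduced, joining an existing cell when they fit and otherwise remaining on their own. The fit test used here (nearest member within `max_gap`) and the parameter names are assumptions made for illustration; the slides do not state the actual criterion.

```python
from typing import Callable, List


def outliers_check(
    cells: List[List[float]],
    dist: Callable[[float, float], float],
    minority_size: int = 1,
    max_gap: float = 0.5,
) -> List[List[float]]:
    """Dissolve minority cells and try to re-home their items in other cells."""
    kept = [list(c) for c in cells if len(c) > minority_size]
    orphans = [item for c in cells if len(c) <= minority_size for item in c]

    def gap(item: float, cell: List[float]) -> float:
        # Distance from the item to its nearest member of the cell.
        return min(dist(item, member) for member in cell)

    for item in orphans:
        best = min(kept, key=lambda c: gap(item, c), default=None)
        if best is not None and gap(item, best) <= max_gap:
            best.append(item)    # the item fits into an existing cell
        else:
            kept.append([item])  # a genuine outlier keeps its own cell
    return kept


cells = [[1.0, 1.2, 0.9], [5.0, 5.3, 5.1], [1.1], [9.0]]
result = outliers_check(cells, lambda a, b: abs(a - b))
```

In this example the single item 1.1 is absorbed by the first cell, while 9.0 is too far from every cell and stays alone.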

  24. HCT Structure - HCT Indexing • Cell Merging • If a merged cell is later deemed not to meet the cell compactness requirements, it can be split again.

  25. HCT - Examples

  26. HCT - Examples

  27. QA
