Traditional Database Indexing Techniques for Video Database Indexing
Jianping Fan
Department of Computer Science
University of North Carolina at Charlotte
Charlotte, NC 28223
jfan@uncc.edu
http://www.cs.uncc.edu/~jfan
1. Why do we need indexing?
Library: 2,000,000 books.
Query: find the book titled “Multimedia Systems, Standards, and Networks” without an index!
Too hard! Up to 2,000,000 titles to check!
How can we do this more efficiently?
2. How does a library work?
a. Classify the books into several subjects.
[Figure: subject hierarchy. Books in Library; Natural Sciences, Social Sciences; Computer Science, Electrical Engineering, Dancing; Computer Languages, Research, Database, Multimedia.]
I get it! Too easy! 11!
2. How does a library work?
b. How do they get this good partition and management? Taxonomy and library science!
How can we do the same for data and images?
[Figure labels: Natural Sciences, Social Sciences]
3. Key Problems for Building Indexing? What can you find from this map?
3. Key Problems for Building Indexing?
A database is a set of tables, and a map is similar to a table!
a. Partition: partition the large-scale data set hierarchically into meaningful and manageable small regions!
b. Representation: represent these regions efficiently so that they can be accessed very fast!
4. How to build indexing structure for data? a. Space partition approach: Partition the space into regions according to some measure
4. How to build indexing structure for data? a. Space partition approach: Space-partition trees are attractive for GIS systems.
4. How to build indexing structure for data? Space partitioning may not work in some cases!
4. How to build indexing structure for data? Partition data based on data distributions!
4. How to build indexing structure for data? b. Data Partition via Clustering: use clustering to partition the data set!
4. How to build indexing structure for data? b. Data Partition via Clustering. K-means data clustering: (1) Select K centers to start (the dark points in the figure).
4. How to build indexing structure for data? b. Data Partition via Clustering. K-means data clustering: (2) Assign each point to its most similar (nearest) center.
4. How to build indexing structure for data? b. Data Partition via Clustering. K-means data clustering: (3) Update the corresponding cluster centers.
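The three K-means steps can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own implementation: the function name `kmeans`, the `init` and `iterations` parameters, 2-D points, and squared Euclidean distance as the similarity measure are all assumptions made for the example.

```python
import random

def kmeans(points, k, init=None, iterations=20, seed=0):
    """Plain k-means on 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    # (1) Select K centers to start (given explicitly, or sampled from the data).
    centers = list(init) if init is not None else rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # (2) Put every point into the cluster of its nearest center.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2,
            )
        # (3) Update each center to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centers, labels
```

Each resulting cluster is one data partition; a fixed number of iterations is used here only to keep the sketch short.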
c. Representation of Data Partition Results:
(1) Rectangular box: (ID, x1, y1, x2, y2)
(2) Sphere (SR-tree): (ID, xc, yc, R)
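As a rough sketch, both representations can be computed directly from the points of one partition. The function names are hypothetical, and the sphere here is simply centered on the centroid, which is an assumption made for brevity and is generally not the minimum enclosing sphere.

```python
import math

def bounding_box(region_id, points):
    """Rectangular box (ID, x1, y1, x2, y2): the minimum axis-aligned rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (region_id, min(xs), min(ys), max(xs), max(ys))

def bounding_sphere(region_id, points):
    """Sphere (ID, xc, yc, R) centered on the centroid (assumed, not minimal)."""
    xc = sum(p[0] for p in points) / len(points)
    yc = sum(p[1] for p in points) / len(points)
    radius = max(math.hypot(p[0] - xc, p[1] - yc) for p in points)
    return (region_id, xc, yc, radius)
```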
4. How to build indexing structure for data? R-tree: Minimum Rectangular Box. [Figure: a data set enclosed in nested minimum bounding rectangles labeled A to N, shown next to the corresponding indexing tree with one search path highlighted.]
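4. How to build indexing structure for data? R-tree: Minimum Rectangular Box. First partition: the data set is split into regions A, B, C, and D. [Figure: the four regions and the root node with entries A, B, C, D.]
4. How to build indexing structure for data? Data partition approach. Second partition of A: region A is split into sub-regions a, b, c, d, e, f, g. [Figure: sub-regions of A and the tree node below A.]
4. How to build indexing structure for data? Data partition approach. Second partition of B: region B is split into sub-regions h, i, j, k. [Figure: sub-regions of B and the tree node below B.]
4. How to build indexing structure for data? Data partition approach. Second partition of C: region C is split into sub-regions l, m. [Figure: sub-regions of C and the tree node below C.]
4. How to build indexing structure for data? Data partition approach. Final indexing structure: root with entries A, B, C, D; below A: a, b, c, d, e, f, g; below B: h, i, j, k; below C: l, m.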
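A search over such a structure only descends into regions whose bounding box can contain the query. The sketch below is a simplified, static illustration of that idea with hypothetical helper names (`make_leaf`, `make_node`, `search`); it is not a full R-tree, since it omits insertion, node splitting, and page capacity.

```python
def make_leaf(name, points):
    """Leaf region: stores points and their minimum bounding rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    box = (min(xs), min(ys), max(xs), max(ys))
    return {"name": name, "box": box, "points": points, "children": []}

def make_node(name, children):
    """Internal region: its box is the minimum rectangle around its children."""
    boxes = [c["box"] for c in children]
    box = (min(b[0] for b in boxes), min(b[1] for b in boxes),
           max(b[2] for b in boxes), max(b[3] for b in boxes))
    return {"name": name, "box": box, "points": [], "children": children}

def contains(box, p):
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]

def search(node, p, visited):
    """Return True if point p is stored below node; record the visited nodes."""
    visited.append(node["name"])
    if not node["children"]:
        return p in node["points"]
    # Descend only into children whose bounding box contains the query point.
    return any(search(c, p, visited)
               for c in node["children"] if contains(c["box"], p))
```

With overlapping boxes, more than one child may have to be visited, which is why overlap hurts search performance.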
4. How to build indexing structure for data? • R-tree family. [Figure: root node with entries A, B, C; lower-level rectangles D, E, F, G, H.]
4. How to build indexing structure for data? • R-tree family. a. Overlap between A and C! [Figure: root node with entries A, B, C; lower-level rectangles D, E, F, G, H; the rectangles of A and C overlap.]
4. How to build indexing structure for data? X-tree: Minimum Rectangular Box with Fat Node. [Figure: root, normal directory nodes, super-nodes, and data nodes.]
4. How to build indexing structure for data? SR-tree: Minimum Sphere
4. How to build indexing structure for data? • A grid file can be treated as an extended Q-tree with multiple partitions on each attribute! [Figure: grid over the attributes age and salary.]
4. How to build indexing structure for data? • A grid file can be treated as an extended Q-tree with multiple partitions on each attribute! [Figure: grid cells mapped to buckets.]
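A minimal in-memory sketch of the grid-file idea is given below. The class name `GridFile`, the cut points, and the dictionary standing in for disk buckets are all assumptions for illustration; a real grid file keeps the scales and directory separately and splits buckets when they overflow.

```python
from bisect import bisect_right

class GridFile:
    """Toy grid file over (age, salary); each grid cell maps to one bucket."""

    def __init__(self, age_cuts, salary_cuts):
        self.age_cuts = age_cuts          # partition points on the age axis
        self.salary_cuts = salary_cuts    # partition points on the salary axis
        self.buckets = {}                 # (i, j) cell -> list of records

    def cell(self, age, salary):
        return (bisect_right(self.age_cuts, age),
                bisect_right(self.salary_cuts, salary))

    def insert(self, age, salary, record):
        self.buckets.setdefault(self.cell(age, salary), []).append(
            (age, salary, record))

    def equality_query(self, age, salary):
        # Only one bucket has to be read.
        bucket = self.buckets.get(self.cell(age, salary), [])
        return [r for a, s, r in bucket if a == age and s == salary]

    def range_query(self, age_lo, age_hi, sal_lo, sal_hi):
        # Read every bucket whose cell intersects the query rectangle.
        i_lo, j_lo = self.cell(age_lo, sal_lo)
        i_hi, j_hi = self.cell(age_hi, sal_hi)
        out = []
        for i in range(i_lo, i_hi + 1):
            for j in range(j_lo, j_hi + 1):
                for a, s, r in self.buckets.get((i, j), []):
                    if age_lo <= a <= age_hi and sal_lo <= s <= sal_hi:
                        out.append(r)
        return out
```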
4. How to build indexing structure for data? [Figure: primary buckets with an overflow bucket.]
4. How to build indexing structure for data?
Number of buckets: N; overflow buckets: M; number of data entries per leaf node: K
• Equality query: 1 + M
• Range query: N + N*M
• Insert: 1 + M + 1
• Delete: 1 + M + 1
4. How to build indexing structure for data? • Data distribution information can be used to improve the performance of the grid file. Dynamic Grid File. [Figure: grid over the attributes age and salary.]
4. How to build indexing structure for data? [Figure: grid over age and salary with cells mapped to a bucket.]
4. How to build indexing structure for data? [Figure: hash directory with global and local depths, before and after a bucket split. Before: global depth 2, directory entries 00, 01, 10, 11, local depth 2 for every bucket. After: global depth 3, directory entries 000 to 111; Bucket A (32*, 16*) and Bucket A2 (4*, 12*, 20*), the "split image" of Bucket A, both have local depth 3; Bucket B (1*, 5*, 21*, 13*), Bucket C (10*), and Bucket D (15*, 7*, 19*) keep local depth 2.]
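The figure shows the directory-doubling scheme usually called extendible hashing. The sketch below is one hedged way to reproduce it: the class names, the bucket capacity of 4, the starting global depth of 1, and the use of the key's low-order bits as the hash are assumptions, and the insertion order is chosen only so that the final state matches the bucket contents in the figure.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []

class ExtendibleHash:
    """Toy extendible hash on integer keys, hashing by low-order bits."""

    def __init__(self, bucket_size=4):
        self.bucket_size = bucket_size
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _slot(self, key):
        return key & ((1 << self.global_depth) - 1)  # last global_depth bits

    def lookup(self, key):
        return key in self.directory[self._slot(key)].keys

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.bucket_size:
            bucket.keys.append(key)
            return
        if bucket.local_depth == self.global_depth:
            # Directory doubling: slot i and slot i + 2^depth share a bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        # Split the full bucket on one more bit and create its split image.
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        old_keys, bucket.keys = bucket.keys, []
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = image
        for k in old_keys:
            (image if k & high_bit else bucket).keys.append(k)
        # Retry; splits again if every key fell on one side
        # (more than bucket_size identical keys would never terminate).
        self.insert(key)
```

Inserting 4, 12, 32, 16, 1, 5, 21, 13, 10, 15, 7, 19 and then 20 doubles the directory to global depth 3 and separates 32, 16 from 4, 12, 20, while buckets B, C, and D stay at local depth 2 and are each shared by two directory entries.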
4. How to build indexing structure for data?
Number of buckets: N; overflow buckets: M; number of data entries per leaf node: K
• Equality query: 1 + M
• Range query: N + N*M
• Insert: 1 + M + 1
• Delete: 1 + M + 1
4. How to build indexing structure for data? • A database indexing structure is built for decision making and tries to make the decision as fast as possible! [Figure: decision tree over fruit attributes. Tests: Color = Green?, Color = Yellow?, Size = Big?, Size = Medium?, Size = Small?, Shape = Round?, Taste = Sweet?; leaves: watermelon, apple, grape, grapefruit, lemon, banana, cherry.]
4. How to build indexing structure for data?
• How do we obtain decisions for a database?
a. Obtain a labeled training data set from the database.
b. Calculate the entropy impurity: [equation not preserved]
c. The classifier is built by: [equation not preserved]
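The slide's formulas did not survive, so the sketch below assumes the standard definition of entropy impurity, i(N) = -Σ_j P(ω_j) log2 P(ω_j), and the usual rule of choosing the test with the largest impurity drop. The function names and the single numeric attribute with a threshold test are illustrative assumptions, not the author's exact procedure.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy impurity of a set of class labels (standard definition, assumed)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (impurity drop, threshold) of the best test 'value <= threshold'."""
    base = entropy(labels)
    best = (0.0, None)
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        # Weighted impurity of the two children after the split.
        after = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if base - after > best[0]:
            best = (base - after, t)
    return best
```

Applying `best_threshold` recursively to each resulting subset, and over every attribute, would grow a decision tree one node at a time.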
4. How to build indexing structure for data? KD-tree • By treating a query as a decision-making procedure, we can use decisions to build more effective database indexing! [Figure: the database root node tests Salary > $75000?; each yes/no branch then tests Age > 60?; the leaves point into the data table.]
4. How to build indexing structure for data?
• At each internal node, only one attribute is used!
• It is not balanced! Searches that end at different nodes may have different I/O costs!
• It can support multi-attribute database indexing, like the R-tree!
• It integrates decision making and database query!
4. How to build indexing structure for data?
Tree levels: N; leaf nodes: M; number of data entries per leaf node: K
The internal nodes of the kd-tree at the same level are stored on the same page.
• Equality query: N + M
• Range query: N + M
• Insert: N + M + 1
• Delete: N + M + 1
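A compact sketch of a kd-tree over (salary, age) records follows. The median split, the leaf capacity of 2, and the dictionary node layout are assumptions chosen for brevity; the sketch ignores the page layout that the I/O costs above refer to.

```python
def build(points, depth=0):
    """Build a kd-tree over (salary, age) tuples; one attribute per level."""
    if len(points) <= 2:                      # assumed leaf capacity
        return {"leaf": points}
    axis = depth % 2                          # 0 = salary, 1 = age
    points = sorted(points, key=lambda p: p[axis])
    split = points[len(points) // 2][axis]    # median value as the test
    left = [p for p in points if p[axis] < split]
    right = [p for p in points if p[axis] >= split]
    if not left:                              # cannot separate on this attribute
        return {"leaf": points}
    return {"axis": axis, "split": split,
            "left": build(left, depth + 1), "right": build(right, depth + 1)}

def range_query(node, lo, hi):
    """Return all points p with lo[i] <= p[i] <= hi[i] on both attributes."""
    if "leaf" in node:
        return [p for p in node["leaf"]
                if all(lo[i] <= p[i] <= hi[i] for i in range(2))]
    out = []
    axis, split = node["axis"], node["split"]
    if lo[axis] < split:                      # query reaches the "no" side
        out += range_query(node["left"], lo, hi)
    if hi[axis] >= split:                     # query reaches the "yes" side
        out += range_query(node["right"], lo, hi)
    return out
```

A query such as "salary above 75000 and age above 60" then follows the same yes/no tests as the tree in the figure and skips every subtree that cannot contain a match.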
5. Storage Management for High-Dimensional Indexing Structures. We want to put similar data in the same page or in neighboring pages! [Figure: CLUSTERED vs. UNCLUSTERED index. In both, index entries (index file) give direct search for data entries, which point to data records (data file).]
5. Storage Management for High-Dimensional Indexing Structures. It is very hard to sort multi-dimensional data! Hilbert curve: maps multi-dimensional data onto one dimension. [Figure: Hilbert curve over cells labeled 00, 01, 10, 11.]
5. Storage Management for High-Dimensional Indexing Structures. [Figure: 4 x 4 grid with cells numbered 0 to 15 in Hilbert-curve order.] From multi-dimensional indexing to one-dimensional storage on disk!
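The mapping from a 2-D grid cell to its position along the Hilbert curve can be sketched with the well-known bit-manipulation algorithm. The function name `xy2d` and the curve orientation (starting at cell (0, 0)) are assumptions, so the cell numbers may be a rotation or reflection of the ones in the figure.

```python
def xy2d(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n x n grid.

    n must be a power of two; the result lies in 0 .. n*n - 1.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d
```

Sorting records by `xy2d` of their grid cell gives a one-dimensional storage order in which consecutive positions are always neighboring cells, so nearby data tends to land on the same or neighboring pages.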
6. Video Database Indexing
Can these techniques be used for video database indexing?
a. Curse of dimensionality: overlap in high-dimensional space
b. Semantic gap: visual features ≠ semantic concepts
What should we do?
Schema Determination
a. Shot-based approach: a video sequence is divided into Shot 1, …, Shot i, …, Shot n.
Visual representation of each shot:
• Color: HSV color histogram, dominant color, …
• Texture: edge histogram, wavelet coefficients, Tamura features, …
• Motion: directional motion histogram, camera motion, …
• Other features
Schema Determination
b. Object-based approach: a video sequence is divided into Key Object 1, …, Key Object i, …, Key Object n.
Visual representation of each key object:
• Color: HSV color histogram, dominant color, …
• Texture: edge histogram, Tamura features, …
• Shape: rectangular box, moments, …
• Motion: trajectory, motion histogram, …
• Other features
6. Video Database Indexing. Curse of dimensions. [Figure: regions A, B, and C with overlap.]
6. Video Database Indexing
Objective: we should try to bridge the semantic gap in the video content partition procedure.
a. Concept Hierarchy
[Figure: concept hierarchy. 2000 Olympic Games; field, basketball, softball, soccer, volleyball; Team Slovakia, Team USA, Team Norway, Team USA; Game Actions, Players, News (shown twice).]