Random projection trees and low dimensional manifolds
Yoav Freund, Sanjoy Dasgupta (University of California, San Diego, 2008)
2013.01.07 (Mon), Jeonbuk National Univ. DBLAB, 김태훈
Contents
• Introduction
• Detailed overview
• An RPTree-Max adapts to Assouad dimension
• An RPTree-Mean adapts to local covariance dimension
Introduction
• A k-d tree is a spatial data structure that partitions R^D into hyperrectangular cells.
• It is built in a recursive manner, splitting along one coordinate direction at a time.
Introduction
• The succession of splits corresponds to a binary tree whose leaves contain the individual cells in R^D.
[Figure: a k-d tree partition of the plane. The dots are points in a database; the cross is a query point q.]
Introduction
• A k-d tree requires D levels in order to halve the cell diameter, since each level splits along only one of the D coordinate directions.
• If the data lie in R^1000, it could take 1000 levels of the tree to bring the diameter of the cells down to half that of the entire data set. This would require 2^1000 data points!
Introduction • Thus k-d trees are susceptible to the same curse of dimensionality. • 그래서 k-d tree는 차원의 저주를 받을 정도로 민감. • However, a recent positive development in machine learning has been realization that a lot of data which superficially lie in a very high-dimensional space , actually have low intrinsic dimension. • 하지만 최근 machine learning에서 깨닫게 되었는데 많은 데이터들이 주어졌을 때 실제로는 매우 높은 는 낮은 고유한 차원을 가짐. • d << D • d(nonparameter실제 주어지는 데이터)보다 D차원에 더 민감함
Introduction
• In this paper, we are interested in techniques that automatically adapt to intrinsic low-dimensional structure without having to explicitly learn this structure.
Detailed overview
• Both k-d trees and RP trees are built by recursive binary splits.
• The core tree-building algorithm is called MakeTree, and takes as input a data set S ⊂ R^D.
MakeTree algorithm
procedure MakeTree(S)
    if |S| < MinSize return (Leaf)
    Rule ← ChooseRule(S)
    LeftTree ← MakeTree({x ∈ S : Rule(x) = true})
    RightTree ← MakeTree({x ∈ S : Rule(x) = false})
    return ([Rule, LeftTree, RightTree])
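A minimal Python sketch of this recursion may help; it mirrors the pseudocode directly, and the names make_tree, choose_rule, MIN_SIZE, and the (rule, left, right) node tuple are my own assumptions, not the paper's code:

import numpy as np

MIN_SIZE = 10  # assumed stand-in for MinSize in the pseudocode

def make_tree(S, choose_rule):
    # S: an n x D array of points; choose_rule(S) returns a boolean predicate
    if len(S) < MIN_SIZE:
        return ("Leaf", S)
    rule = choose_rule(S)
    mask = np.array([rule(x) for x in S])
    left = make_tree(S[mask], choose_rule)    # {x ∈ S : Rule(x) = true}
    right = make_tree(S[~mask], choose_rule)  # {x ∈ S : Rule(x) = false}
    return (rule, left, right)

The tree variant (k-d, RPTree-Max, RPTree-Mean) is determined entirely by the choose_rule passed in, exactly as in the slides that follow.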
K-d tree version
procedure MakeTree(S)
    if |S| < MinSize return (Leaf)
    Rule ← ChooseRule(S)
    LeftTree ← MakeTree({x ∈ S : Rule(x) = true})
    RightTree ← MakeTree({x ∈ S : Rule(x) = false})
    return ([Rule, LeftTree, RightTree])

procedure ChooseRule(S)
    comment: k-d tree version
    choose a coordinate direction i
    Rule(x) := x_i ≤ median({z_i : z ∈ S})
    return (Rule)
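A sketch of this split rule in the same Python style; the pseudocode leaves the coordinate choice open, so picking the coordinate with the largest spread is an assumption (one common heuristic):

import numpy as np

def choose_rule_kd(S):
    # k-d tree version: median split along one coordinate direction
    i = int(np.argmax(S.max(axis=0) - S.min(axis=0)))  # widest coordinate (heuristic)
    t = np.median(S[:, i])
    return lambda x: x[i] <= t

For example, make_tree(data, choose_rule_kd) builds the k-d tree variant.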
RP-tree version
[Figure: a split along a random projection direction, compared with the PCA direction]
• Choose a random direction and split the cell at the median of the data projected onto that direction.
RP-tree Max version
procedure MakeTree(S)
    if |S| < MinSize return (Leaf)
    Rule ← ChooseRule(S)
    LeftTree ← MakeTree({x ∈ S : Rule(x) = true})
    RightTree ← MakeTree({x ∈ S : Rule(x) = false})
    return ([Rule, LeftTree, RightTree])

procedure ChooseRule(S)
    comment: RPTree-Max version
    choose a random unit direction v ∈ R^D
    pick any x ∈ S; let y ∈ S be the farthest point from it
    choose δ uniformly at random in [−1, 1] · 6‖x − y‖/√D
    Rule(x) := x · v ≤ (median({z · v : z ∈ S}) + δ)
    return (Rule)
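A hedged Python sketch of the RPTree-Max rule above, following the jitter range [−1, 1] · 6‖x − y‖/√D from the pseudocode:

import numpy as np

def choose_rule_rp_max(S):
    # RPTree-Max: random unit direction, median split with random jitter delta
    n, D = S.shape
    v = np.random.randn(D)
    v = v / np.linalg.norm(v)                         # random unit direction
    x = S[0]                                          # pick any x in S
    y = S[np.argmax(np.linalg.norm(S - x, axis=1))]   # farthest point from x
    delta = np.random.uniform(-1, 1) * 6 * np.linalg.norm(x - y) / np.sqrt(D)
    t = np.median(S @ v) + delta
    return lambda z: z @ v <= t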
RP-tree Mean version
procedure MakeTree(S)
    if |S| < MinSize return (Leaf)
    Rule ← ChooseRule(S)
    LeftTree ← MakeTree({x ∈ S : Rule(x) = true})
    RightTree ← MakeTree({x ∈ S : Rule(x) = false})
    return ([Rule, LeftTree, RightTree])
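The slide repeats only the shared MakeTree recursion; in the paper, the RPTree-Mean ChooseRule splits along a random direction at the median when the cell's squared diameter is at most a constant c times its average squared interpoint distance, and otherwise splits by distance from the mean. A minimal Python sketch under that reading; the value of C and the brute-force O(n²) distance computation are my simplifications:

import numpy as np

C = 1.0  # stand-in for the constant c in the paper's rule

def choose_rule_rp_mean(S):
    # RPTree-Mean: projection split for "round" cells, distance-from-mean split otherwise
    n, D = S.shape
    d2 = ((S[:, None, :] - S[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    if d2.max() <= C * d2.mean():      # diameter^2 vs. c * average interpoint distance^2
        v = np.random.randn(D)
        v = v / np.linalg.norm(v)
        t = np.median(S @ v)
        return lambda z: z @ v <= t
    mean = S.mean(axis=0)
    t = np.median(np.linalg.norm(S - mean, axis=1))
    return lambda z: np.linalg.norm(z - mean) <= t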