Spatio-Temporal Sequence Learning of Visual Place Cells for Robotic Navigation

IJCNN, WCCI, Barcelona, Spain, 2010
Nguyen Vu Anh, Alex Leng-Phuan Tay, Wooi-Boon Goh (School of Computer Engineering, Nanyang Technological University, Singapore)
Janusz A. Starzyk (School of Electrical Engineering, Ohio University, Athens, USA)
Presented by Nguyen Vu Anh, 20th July, 2010

Outline • Introduction • HMAX Feature Building and Extraction • Spatio-Temporal Learning and Recognition • Empirical Results • Conclusion and future directions

Introduction • Robotic navigation: Localization and Mapping • Topological map & Place cells • Scope: Topological Visual Localization • Challenges: • High dimension and uncertainty of visual features • Perceptual aliasing • Complex probabilistic frameworks, e.g., HMM • Approach: • Structural organization of human memory architecture • Short-Term Memory (STM) and Long-Term Memory (LTM) interaction

Introduction • System Architecture [Figure: block diagram with stages Feature Building and Extraction, Symbol Quantization, Sequence Storage, Classifier]

Introduction • Existing Works: • Autonomous navigation (SLAM): Mapping, Localization and Path Planning • Topological vs. metric representation • Humans employ mainly a topological representation of the environment [O’Keefe (1976), Redish (1999), Eichenbaum (1999), etc.] • Visual Place-cell model: [Torralba (2001); Renninger & Malik (2004); Siagian & Itti (2007)] • Hierarchical feature building and extraction (HMAX Model) [Serre et al. (2007)] • Spatio-Temporal sequence learning: [Wang & Arbib (1990), (1993); Wang & Yuwono (1995)] • Our previous works: [Starzyk & He (2007); Starzyk & He (2009); Tay et al. (2007); Nguyen & Tay (2009)]

HMAX Feature Building and Extraction • Interleaving simple (S) and complex (C) layers with increasing spatial invariance (Retina – LGN – V1 – V2, V4) • 2 Stages: • Feature Construction • Feature Extraction • Feature Significance:

HMAX Feature Building and Extraction [Figure: HMAX pipeline with labels Prototypes, Dot-Product Matching, Spatial Invariance Processing] Ref: Riesenhuber & Poggio (1999), Serre et al. (2007)
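The two operations named on this slide can be sketched as follows. This is a minimal illustration, not the HMAX implementation used in the work: the function names, the flat-list patch format, and the non-overlapping pooling grid are all assumptions made for the example.

```python
def s_layer_response(patch, prototype):
    """Dot-product match between a flattened image patch and a stored prototype."""
    return sum(p * q for p, q in zip(patch, prototype))


def c_layer_pool(responses, pool):
    """Max over non-overlapping pool x pool blocks of a 2-D response map.

    Taking the maximum inside each block makes the output tolerant to small
    shifts of the feature within that block (spatial invariance).
    """
    rows, cols = len(responses), len(responses[0])
    return [
        [
            max(
                responses[r][c]
                for r in range(i, min(i + pool, rows))
                for c in range(j, min(j + pool, cols))
            )
            for j in range(0, cols, pool)
        ]
        for i in range(0, rows, pool)
    ]
```

For example, pooling a 4x4 response map with `pool=2` yields a 2x2 map, one maximum per block.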

Spatio-Temporal Learning Architecture • STM Structure: • Quantization of input using KFLANN with vigilance ρ See: Tay, Zurada, Wong and Xu, TNN, 2007
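A simplified sketch of vigilance-based quantization, for intuition only. It is not KFLANN itself: the match rule (fraction of features within a tolerance `delta` must reach ρ) is an assumption for this example, `delta` and the function names are hypothetical, and the iterative centroid reshuffling of KFLANN is omitted.

```python
def matches(x, centroid, delta, rho):
    """True if the fraction of features within tolerance delta is at least rho."""
    within = sum(1 for a, b in zip(x, centroid) if abs(a - b) <= delta)
    return within / len(x) >= rho


def quantize(vectors, delta, rho):
    """Map each input vector to a symbol (cluster index), creating clusters on demand.

    Assumed single-pass scheme: an input joins the nearest centroid that passes
    the vigilance test; otherwise it becomes a new centroid.
    """
    centroids, symbols = [], []
    for x in vectors:
        best, best_dist = None, float("inf")
        for idx, c in enumerate(centroids):
            if matches(x, c, delta, rho):
                dist = sum((a - b) ** 2 for a, b in zip(x, c))
                if dist < best_dist:
                    best, best_dist = idx, dist
        if best is None:
            centroids.append(list(x))
            best = len(centroids) - 1
        symbols.append(best)
    return symbols, centroids
```

The resulting symbol sequence is what the later sequence-storage stage would consume in place of raw feature vectors.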

Spatio-Temporal Learning Architecture • STM Structure: See: Tay, Zurada, Wong and Xu, TNN, 2007

Spatio-Temporal Learning Architecture • LTM Cell Structure: • Each LTM is learnt by one-shot mechanism. • Each long training sequence is segmented into N overlapping subsequences of the same length M. • Each subsequence is dedicated permanently to an LTM cell.
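The segmentation step can be sketched as a sliding window. The function name and the `step` parameter are assumptions for this example; the slide only states that subsequences have equal length M and overlap.

```python
def segment(sequence, length, step):
    """Split a sequence into overlapping windows of equal length.

    Consecutive windows start `step` elements apart, so they overlap whenever
    step < length. Trailing elements that do not fill a whole window are dropped.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        sequence[i:i + length]
        for i in range(0, len(sequence) - length + 1, step)
    ]
```

Each returned window would then be stored permanently in its own LTM cell. With `length=4` and `step=2`, a 10-element sequence gives 4 windows that overlap by half.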

Spatio-Temporal Learning Architecture • LTM Cell Structure: [Figure: LTM cell with labels Dual Neurons, STM, Primary Neurons, Primary Excitation]

Spatio-Temporal Learning Architecture • Storage • One-shot learning • Recognition: input feature vector → primary excitation computation → dual neuron (DN) update (evidence accumulation) → output matching score from the last DN
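The recognition flow can be illustrated with a small sketch. The slide does not give the dual-neuron update rule, so this example substitutes a longest-common-subsequence style accumulation and a binary primary excitation; both are assumptions, as is the function name.

```python
def ltm_match_score(stored, observed):
    """Score how well an observed symbol sequence matches one stored LTM sequence.

    dn[i] plays the role of the i-th dual neuron: it holds the evidence
    accumulated for the stored prefix ending at position i. The rule used here
    is an LCS-style stand-in, not the update rule from the paper.
    """
    m = len(stored)
    if m == 0:
        raise ValueError("stored sequence must not be empty")
    dn = [0.0] * m
    for symbol in observed:
        prev = dn[:]  # dual-neuron state before this input
        for i, s in enumerate(stored):
            pe = 1.0 if s == symbol else 0.0  # assumed binary primary excitation
            prev_left = prev[i - 1] if i > 0 else 0.0
            now_left = dn[i - 1] if i > 0 else 0.0
            dn[i] = max(prev[i], now_left, prev_left + pe)
    return dn[-1] / m  # normalised score read from the last dual neuron
```

Under this stand-in rule a perfect replay scores 1.0, and a replay with one of four elements missing scores 0.75, which is the kind of graceful degradation the accumulation step is meant to provide.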

Empirical Results • ICLEF Competition 2010 Dataset • 9 classes of places • 2 sets of images with the same trajectory (Set S and Set C), ~4000 images per set • [Figure: sample images labelled C, K, L, O]

Empirical Results • Task • 1 sequence (Set S) as the training set and 1 sequence (Set R) as the testing set • Features: • 10% of the training sequence • Training • ρ = 0.7 • Segmentation into consecutive subsequences of equal length (100) with more than 50% overlap • Each subsequence is stored as an LTM cell • The label of each LTM cell is the majority label of its individual components • Testing • The assigned label is the label of the maximally activated LTM cell • If the activation of the maximally activated LTM cell is below θ, the system refuses to assign a label
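The labelling and rejection rules above can be sketched as follows. Function names and the use of `None` to signal a refused label are assumptions for this example.

```python
from collections import Counter


def majority_label(labels):
    """Label of an LTM cell: the most frequent label among its stored elements."""
    return Counter(labels).most_common(1)[0][0]


def classify(activations, cell_labels, theta):
    """Return the label of the most activated LTM cell, or None to refuse.

    activations[i] is the matching score of LTM cell i and cell_labels[i] its
    label; the system refuses when the best score is below the threshold theta.
    """
    best = max(range(len(activations)), key=activations.__getitem__)
    if activations[best] < theta:
        return None
    return cell_labels[best]
```

With θ = 0.4, activations of 0.2, 0.9 and 0.5 select the second cell, while a best activation of 0.3 leads to a refusal.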

Empirical Results Table: LTM listing with training set S

Empirical Results • Accuracy without threshold • Accuracy with threshold θ = 0.4 • Robust testing: missing elements

Empirical Results Figure: LTM cells’ activation during recall stage

Empirical Results • Intersection case:

Conclusion • A hierarchical spatio-temporal learning architecture • HMAX hierarchical feature construction and extraction • STM clustering by KFLANN • Sequence storage and retrieval by LTM cells. • Application in appearance-based topological localization

Future Directions • Automatic tolerance estimation • E.g. Signal-to-noise ratio figure of features [Liu&Starzyk 2008] • Hierarchical episodic memory which characterizes the interaction between STM and LTM • Other embodied intelligence components • Goal creation system [Starzyk 2008] • Application in other domains: • Human Action Recognition

Thank you! 
