Decoding 3D Object Representation in the Brain: An Evolutionary Stimulus Strategy

Presented by Hanan Shteingart, Physiology A – 2009, ICNC, May 2009 [Image: Macaca mulatta]

Something to Wake You Up • Go to icnc.wordpress.com to make your own dragon

Main Question • How does the brain (IT) code 3D objects? • IT: macaque monkey inferotemporal cortex

Abstract • Complex shape codes: studied mostly in 2D. • 3D requires: • higher-dimensional coding (more information) • computation (harder to decipher) • The good news: evidence for an explicit neural code. • The bad news: a very complicated paper! (fancy object creation, evolutionary stimulus strategy, linear/nonlinear models, and crazy statistical tests)

Introduction • Retinal image: ~isomorphic (10^6 pixels), unwieldy and unstable → not useful for object perception • Ventral pathway (V2 → V4 → IT): object boundary fragments (features) are integrated into an explicit signal of component-level shape (or holistic shape, via learning) • Evidence so far: 2D studies

The Hypothesis • H0: IT neurons encode three-dimensional spatial configurations of surface fragments. • H1: Complex shape perception is based primarily on two-dimensional image processing. • Previous studies: showed differential responses across a small number of 3D shapes, or tuning along a single depth-related dimension

The 3D Decoding Problem • Which 3D shape factors are associated with neural responses? • 3D → complex, multi-dimensional tuning properties → large stimulus set needed • wide range of 3D elements • combined in many different ways • A conventional random or systematic (grid-based) approach can never produce sufficiently dense combinatorial sampling. • This has not been attempted before.

Decoding Solution • Evolutionary stimulus strategy • Two advantages: • Focused: sampling concentrates on the shapes that drive the neuron • Variant: morphing produces many variations of those shapes • This evolutionary stimulus strategy made it possible for the first time to test the three-dimensional configural coding hypothesis at the neural level.

Evolutionary Algorithms • Initial generation of 50 random 3D shapes • Averaged response to a shape = its probability of reproducing • Descendants are morphed, either locally or globally
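
The loop on this slide can be sketched in a few lines of Python. This is only a toy illustration, not the paper's procedure: the shape encoding (a vector of numbers), the `toy_neuron` response function, the morph step size, and response-proportional parent selection are all assumptions made here. The figure of 10 de novo stimuli per generation is taken from the methods quote on the "Question to Eli" slide.

```python
import random

def random_shape(n_params=8, rng=random):
    """Stand-in 'shape': a vector of parameters in [0, 1] (hypothetical encoding)."""
    return [rng.random() for _ in range(n_params)]

def morph(shape, rng=random, local_prob=0.5, step=0.1):
    """Return a descendant: perturb one parameter (local) or all of them (global)."""
    child = list(shape)
    if rng.random() < local_prob:
        indices = [rng.randrange(len(child))]   # local morph
    else:
        indices = range(len(child))             # global morph
    for i in indices:
        child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, step)))
    return child

def next_generation(shapes, responses, size=50, n_new=10, rng=random):
    """Parents reproduce with probability proportional to their averaged response."""
    weights = [max(r, 0.0) + 1e-9 for r in responses]
    parents = rng.choices(shapes, weights=weights, k=size - n_new)
    fresh = [random_shape(len(shapes[0]), rng) for _ in range(n_new)]
    return fresh + [morph(p, rng) for p in parents]

def toy_neuron(shape):
    """Hypothetical response: largest when every parameter is near 0.7."""
    return sum(1.0 - abs(p - 0.7) for p in shape) / len(shape)

def run(generations=10, seed=0):
    """Evolve stimuli and return the mean response of each generation."""
    rng = random.Random(seed)
    shapes = [random_shape(rng=rng) for _ in range(50)]
    mean_responses = []
    for _ in range(generations):
        responses = [toy_neuron(s) for s in shapes]
        mean_responses.append(sum(responses) / len(responses))
        shapes = next_generation(shapes, responses, rng=rng)
    return mean_responses
```

Calling `run()` returns one mean response per generation. The point of the sketch is that sampling drifts toward the region of shape space the neuron prefers, while the de novo stimuli keep some exploration going.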

Evolutionary Stimulus Strategy

Response Model • Feature space: each surface fragment of a stimulus is described in 7 dimensions: • X, Y, Z position • XY and YZ orientation angles • max/min curvature • Model: up to two Gaussian functions in this space; each contributes its amplitude at the stimulus point closest to the Gaussian peak, plus nonlinear interactions between the two.
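
A minimal Python sketch of this kind of model, under stated assumptions: the Gaussians are unnormalized with independent widths per dimension, the angular dimensions are treated as ordinary numbers (no wrap-around), and the weights `w1`, `w2`, `w12` and `baseline` are hypothetical names for the linear and interaction terms. The paper's fitted model may differ in all of these details.

```python
import math

def gaussian(fragment, peak, widths):
    """Unnormalized 7D Gaussian with per-dimension widths (value 1.0 at the peak)."""
    d2 = sum(((f - p) / w) ** 2 for f, p, w in zip(fragment, peak, widths))
    return math.exp(-0.5 * d2)

def component(fragments, peak, widths):
    """Gaussian amplitude at the stimulus fragment closest to the Gaussian peak."""
    return max(gaussian(f, peak, widths) for f in fragments)

def predict(fragments, g1, g2, w1=1.0, w2=1.0, w12=1.0, baseline=0.0):
    """Two-Gaussian response: linear terms plus a multiplicative interaction."""
    a = component(fragments, *g1)
    b = component(fragments, *g2)
    return baseline + w1 * a + w2 * b + w12 * a * b

# Each fragment: (x, y, z, xy_angle, yz_angle, max_curvature, min_curvature)
g1 = ((0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5), (0.2,) * 7)   # (peak, widths)
g2 = ((5.0, 5.0, 5.0, 1.0, 1.0, -1.0, -0.5), (0.2,) * 7)
stimulus = [g1[0], g2[0]]          # one fragment sitting on each peak
response = predict(stimulus, g1, g2)
```

Because the interaction term multiplies the two components, a stimulus that contains fragments near both peaks responds more strongly than the sum of two stimuli that each contain only one.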

Results Overview • Strong cross-prediction of responses • Model order = 2 • Most fragments lie outside the image (xy) plane (3D) • 3D shape tuning is independent of lighting, position, size and depth. • Wide range of tuned configurations. • 3D representation in IT is not holistic • Convexity bias

Tuning Functions • Consistent across: • Lighting • Depth, position, size • Not consistent across: • Orientation [Figure panels: light angle, depth, position, rotation, size, xy plane]

3D Verification • Three-dimensional shape tuning was largely independent of lighting direction, stimulus position, stimulus size and stimulus depth. • Separability is represented here by the fraction of response variance (r^2) explainable by a matrix product between separate tuning functions for shape and for shading, depth, position or size.
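
One way to make the separability measure concrete: put the responses in a matrix (rows = shapes, columns = e.g. lighting directions) and ask how much of it a single outer product of a shape tuning function and a lighting tuning function can capture. The sketch below uses the share of the summed squared responses taken by the leading singular component, found by power iteration. This is an assumed stand-in; the exact r^2 computation in the paper may differ (for example, a squared correlation between predicted and observed responses).

```python
def separability(matrix, iterations=200):
    """Share of summed squared responses captured by the best single outer product."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    total = sum(x * x for row in matrix for x in row)
    if total == 0.0:
        return 0.0
    # Start from the largest row so the start vector lies in the row space.
    v = list(max(matrix, key=lambda row: sum(x * x for x in row)))
    for _ in range(iterations):
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]            # u = M v
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # v = M^T u
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    return sum(x * x for x in u) / total   # sigma_1^2 / sum of all sigma_i^2

shape_tuning = [1.0, 2.0, 3.0]
light_tuning = [1.0, 0.5, 2.0]
separable = [[s * l for l in light_tuning] for s in shape_tuning]
not_separable = [[1.0, 0.0], [0.0, 1.0]]
```

For `separable` the value is 1 up to rounding, since the matrix is exactly a product of the two tuning functions. For `not_separable` it is 0.5, since shape preference flips completely between the two conditions.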

Surface-Fragment Configurations • Tuning models spanned a wide range of surface-fragment configurations • 3D shape representation in IT is not generally holistic (the model covers only part of the whole shape)

Convexity Bias • Tuning was markedly biased in the curvature domain toward high values, especially on the convex end of the scale. [Figure: stimuli vs. model; spatial and curvature domains]

Discussion • Convexity bias may reflect functional importance (emphasizes pointy parts) • A substantial fraction of IT neurons followed H0: they were simultaneously tuned for 3D shape • Neurons were tuned for multiple regions in this domain • Consistent with classic theories of configural shape representation (‘geons’), with two differences: • No rotation of the reference frame • A multi-part configuration is coded by a single neuron

The Need for 3D Encoding? • Why would the brain explicitly represent complex 3D object shape, considering the computational expense of inferring 3D structure from the two-dimensional retinal image and the higher neural tuning dimensionality required? • Speculation: Representation of 3D object structure supports other aspects of object vision beyond identification (e.g. usage).

Configural coding of three-dimensional object structure • Henry Moore’s “Sheep Piece” (1971–1972)

Question to Eli • “The average neural response to each stimulus determined the probability with which it produced morphed descendants in subsequent stimulus generations” (p. 4), but the methods say that a typical second generation would contain ten stimuli generated de novo, four descendants of stimuli in the highest response bin, four descendants from the second-highest bin, etc. [equal probabilities] • Fig. 3b: response consistency was measured by the separability of tuning for shape. Why is this consistency? It looks more like independence of variables.
