The perception of Shading and Reflectance E.H. Adelson, A.P. Pentland

The perception of Shading and Reflectance E.H. Adelson, A.P. Pentland Presenter: Stefan Zickler

The “Intrinsic Image” • the underlying physical properties of a scene. • Looking at a 2D image, what does its 3-dimensional source model look like?

What makes an image? • A combination of three factors: • Lighting • Shading • Reflectance

Lighting • Variables: • Number of light sources • Intensity • Position • Distribution (Spot-light or Global)

Reflectance • How a surface’s material changes the light: • Color • Absorbance • Transparency • Etc…

Shading • The variation in the light a surface receives as the angle between its surface normal and the incident light changes.

A simple formulation of an image in terms of reflectance and shading • I(x,y) = r(x,y) s(x,y) • r(x,y) is the reflectance image • s(x,y) is the shading image / luminance image • where s(x,y) = λ N(x,y)·L • N(x,y) is the surface normal • L is the illumination direction • λ is the “luminous flux”, meaning intensity of light.
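The formulation above can be sketched in a few lines of Python. This is only an illustration: the function names, the tiny 1x2 "image", and the clamping of negative shading to zero are assumptions, not part of the slides.

```python
# Minimal sketch of I(x,y) = r(x,y) * s(x,y) with s(x,y) = lam * (N(x,y) . L).
# All names are hypothetical; clamping back-facing shading to 0 is an assumption.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shading(normal, light, lam=1.0):
    """Shading s = lam * (N . L), clamped so back-facing surfaces stay dark."""
    return max(0.0, lam * dot(normal, light))

def render(reflectance, normals, light, lam=1.0):
    """Per-pixel product of the reflectance image and the shading image."""
    return [
        [r * shading(n, light, lam) for r, n in zip(r_row, n_row)]
        for r_row, n_row in zip(reflectance, normals)
    ]

# A 1x2 "image": two facets with the same paint but different orientations.
light = (0.0, 0.0, 1.0)                          # unit vector toward the light
normals = [[(0.0, 0.0, 1.0), (0.6, 0.0, 0.8)]]   # unit surface normals
reflectance = [[0.5, 0.5]]
image = render(reflectance, normals, light)      # approximately [[0.5, 0.4]]
```

The tilted facet comes out darker even though both facets carry the same paint, which is exactly the shading term at work.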

The bad news • Any 2D image can be described by infinitely many 3D models of shading and reflectance (the simplest being a flat 2D screen, colored with the image).
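A small numerical illustration of this ambiguity, using I = r λ N·L with made-up values: a bent surface with uniform paint and a flat surface with two paints produce the same two pixels. The helper name and all numbers are assumptions chosen for the example.

```python
# Two different (reflectance, shape) explanations of the same two-pixel image.
# Hypothetical helper; values are chosen only to make the point.

def pixel(r, normal, light, lam=1.0):
    """Luminance of one pixel: r * lam * (N . L)."""
    return r * lam * sum(n * l for n, l in zip(normal, light))

light = (0.0, 0.0, 1.0)
facing = (0.0, 0.0, 1.0)     # facet facing the light
tilted = (0.6, 0.0, 0.8)     # facet bent away from the light

# Explanation A: one paint (r = 0.5), bent sheet metal.
bent = [pixel(0.5, facing, light), pixel(0.5, tilted, light)]

# Explanation B: flat screen, two different paints.
flat = [pixel(0.5, facing, light), pixel(0.4, facing, light)]
```

Both lists hold the same luminances (about 0.5 and 0.4), so the image alone cannot tell the two scenes apart.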

The good news • Humans are easily able to reason about which intrinsic 3D model is likely to be the correct one. • Therefore, a computer should be able to do the same…

How do we find the best intrinsic image? • A perception should correspond to the simplest or likeliest explanation. • One way to define simplicity is by introducing a cost-function.

The “workshop” metaphor • A generative model for shading, reflectance, and lighting. • We have three workers: • Painter • Sheet Metal Worker • Lighting Designer

The painter • Can paint polygons with certain colors. • Works on the reflectance component of our image.

The metal-worker • Can cut out new pieces of metal • Can bend pieces of metal • This is the shading component of our image.

The Lighting Designer • Can position lights to illuminate a scene. • Can choose between flood lights and spot lights.

What does this give us? • A fairly complete generative model to create any arbitrary 3D scene • How do we enforce simple solutions? • Through a cost-function.

The pricelist • Painter Fees: • Paint rectangular patch: $5 each • Paint general polygon: $5 per side • Sheet Metal Worker Fees: • Right angle cuts $2 each • Odd angle cuts $5 each • Right angle bends $2 each • Odd angle bends $5 each • Lighting Designer Fees: • Flood light $5 each • Custom spot light $30 each

Each worker can create an entire image with a minimum of help from the other workers. • Painter’s solution: • Paint 9 polygons: $180 • Set up 1 flood light $5 • Cut 1 rectangle $8 • Total $193 • Sheet metal worker's solution: • Cut 24 odd angles $120 • Bend 6 odd angles $30 • Set up 1 flood light $5 • Total $155 • Lighting Designer's solution: • Cut 1 Rectangle $8 • Set up 9 spot lights $270 • Total $278
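The three single-worker totals can be re-derived from the pricelist. The sketch below rests on two assumptions that are inferred from the arithmetic rather than stated on the slides: the nine painted polygons are four-sided and billed at $5 per side (which gives the $180), and cutting one rectangle means four right-angle cuts (which gives the $8). The dictionary keys are hypothetical names.

```python
# Pricelist from the slides, in dollars. Key names are made up for this sketch.
PRICES = {
    "paint_rectangle": 5,
    "paint_polygon_side": 5,   # assumption: general polygons are billed per side
    "right_angle_cut": 2,
    "odd_angle_cut": 5,
    "right_angle_bend": 2,
    "odd_angle_bend": 5,
    "flood_light": 5,
    "spot_light": 30,
}

def total_cost(operations):
    """Sum price * count over a {operation: count} description of a solution."""
    return sum(PRICES[name] * count for name, count in operations.items())

# Assumption: 9 four-sided polygons; 1 rectangle = 4 right-angle cuts.
painter = {"paint_polygon_side": 9 * 4, "flood_light": 1, "right_angle_cut": 4}
metal_worker = {"odd_angle_cut": 24, "odd_angle_bend": 6, "flood_light": 1}
lighting_designer = {"right_angle_cut": 4, "spot_light": 9}

print(total_cost(painter))            # 193
print(total_cost(metal_worker))       # 155
print(total_cost(lighting_designer))  # 278
```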

We need a supervisor • His role: • Coordinate the three workers to find a cooperative solution with the minimum overall cost. • In more scientific terms: • To perform a search through the entire solution space and find the point of minimum overall cost.

The supervisor’s solution • Cut 1 rectangle $8 • Paint 3 rectangles $5 • Bend 2 right angles $4 • Supervisor's fee $30 • Total $47 • Compare to: • Painter’s solution: $193 • Metal Worker’s solution: $155 • Lighting Designer’s solution: $278

Tweaking the price-list: Discouraging naïve solutions • Make naïve solutions expensive. • We don’t want our algorithm to simply create a painted 2D screen. • On the other hand, we don’t want to make things like paint so expensive that they never get used. • Cooperative solutions should be cheaper than any single worker's solution.

Is there an optimal pricelist? • Price-list values can be determined experimentally and tweaked in a way that they deliver the most likely solution for most images. • However, there is no universal price list that correctly describes all possible images.

The main problem with this workshop theory • The search space for cooperative solutions of our workers is enormous, as there are infinitely many ways of combining their skills • Even for small scenes, there exists no efficient search algorithm to solve this problem in a simultaneous fashion.

Their solution • Instead of a simultaneous cooperative model, we use a simplified, multi-stage generative model. • Where have we seen this before?

Stage 1: The Shape Specialist • Assumptions: • The image was made by orthographic projection. • We are given the observed x,y coordinates of all edges and vertices in the image. • Operations: • We can move vertices along the z axis

Shape Specialist Contd. • Simple solutions are enforced by assigning higher costs to non-right angles. • Compactness (shorter edges) and planarity (less angle-variance) are rewarded. • This cost-metric works for most figures, but not all of them.
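As a rough illustration of the right-angle preference only, the sketch below takes a Y-junction from a line drawing, keeps its x,y coordinates fixed, and searches over a single depth value for the cheapest 3D interpretation. The cost (sum of squared cosines between edges), the shared depth for all three endpoints, and the grid search are assumptions made for this example; the compactness and planarity terms are left out, and this is not the cost metric used by the authors.

```python
import math

def angle_cost(edges):
    """Sum of squared cosines over all edge pairs; 0 when all edges are perpendicular."""
    cost = 0.0
    for a in range(len(edges)):
        for b in range(a + 1, len(edges)):
            u, v = edges[a], edges[b]
            dot = sum(p * q for p, q in zip(u, v))
            norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
            cost += (dot / norm) ** 2
    return cost

def lift(edges_2d, z):
    """Keep observed x,y fixed (orthographic projection); give each far endpoint depth z."""
    return [(x, y, z) for x, y in edges_2d]

# Y-junction: three unit-length image edges 120 degrees apart (a drawn cube corner).
junction = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in (90, 210, 330)]

candidates = [k / 1000 for k in range(0, 2001)]      # depths 0.000 .. 2.000
best_z = min(candidates, key=lambda z: angle_cost(lift(junction, z)))

flat_cost = angle_cost(lift(junction, 0.0))          # about 0.75 for the flat reading
best_cost = angle_cost(lift(junction, best_z))       # close to 0
```

Under these assumptions the cheapest depth is about 0.707, where the three edges become mutually perpendicular, so the junction is read as the corner of a box rather than as a flat pattern.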

Stage 2: Lighting Specialist • Given the shape from the previous specialist, find the lighting direction that best explains the observed luminance variation in terms of shading. • This can be estimated linearly by solving for the light direction L of two connected surfaces: • I1 = r1 λ N1·L • I2 = r2 λ N2·L • where r(x,y) is an estimated average, and λ = 1
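A minimal sketch of such a linear estimate, written as least squares over the equations I_k = r λ N_k·L. It assumes three or more faces whose normals are linearly independent, so that all three components of L are determined; the slide speaks of two connected surfaces, so this is a variation for illustration and not the authors' procedure. All function names and numbers are hypothetical.

```python
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, v):
    """Solve the 3x3 linear system m x = v by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the given normals do not determine the light direction")
    solution = []
    for k in range(3):
        mk = [[v[row] if col == k else m[row][col] for col in range(3)] for row in range(3)]
        solution.append(det3(mk) / d)
    return solution

def estimate_light(normals, intensities, r_avg, lam=1.0):
    """Least-squares L for I_k = r_avg * lam * (N_k . L), via the normal equations."""
    targets = [i / (r_avg * lam) for i in intensities]
    ata = [[sum(n[a] * n[b] for n in normals) for b in range(3)] for a in range(3)]
    atb = [sum(n[a] * t for n, t in zip(normals, targets)) for a in range(3)]
    return solve3(ata, atb)

# Synthetic check: shade three faces with a known light, then recover it.
true_light = (0.36, 0.48, 0.8)
normals = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
r_avg = 0.5                                   # assumed average reflectance
intensities = [r_avg * sum(a * b for a, b in zip(n, true_light)) for n in normals]
estimated = estimate_light(normals, intensities, r_avg)
```

Because the synthetic faces really do share one reflectance, the estimate matches `true_light` up to rounding; with real data, the mismatch left over is what the next specialist has to explain.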

Stage 3: Reflectance specialist • Given the shape and lighting from the previous two specialists, explain any left-over differences by painting the surfaces.
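Under the same model, I = r λ N·L, the left-over differences can be read off by dividing each face's observed luminance by its predicted shading. The sketch below assumes one luminance value per face and a hypothetical function name; it is an illustration of the idea, not the authors' implementation.

```python
def explain_with_paint(intensities, normals, light, lam=1.0):
    """Per-face reflectance r_k = I_k / (lam * N_k . L), given shape and lighting."""
    reflectances = []
    for intensity, normal in zip(intensities, normals):
        shade = lam * sum(a * b for a, b in zip(normal, light))
        if shade <= 0:
            raise ValueError("face receives no light; its paint cannot be recovered")
        reflectances.append(intensity / shade)
    return reflectances

light = (0.0, 0.0, 1.0)
normals = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8)]
observed = [0.5, 0.2]          # the tilted face is darker than shading alone predicts
paint = explain_with_paint(observed, normals, light)   # roughly [0.5, 0.25]
```

Shading accounts for the tilted face dropping to 0.4 at most; the remaining darkening is attributed to darker paint on that face.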

An example:

The problem with this approach • Real world scenes don’t look like this:

The problem with this approach • Instead, they look more like this:

Some Other Shortcomings • Tuning the cost-factors is done manually. There will never be a single set of parameters that will correctly describe all scenes. • A psychologist’s approach to computer science: not much information on how far this approach can scale up to more complex scenes, not much work on coming up with a better search algorithm or parameter learning. • How well this approach works on random, real-world scenes is questionable.
