1 / 56

Tutorial Notes in Vision Science Masking and Suppression TSM1

This tutorial provides an introduction to contrast masking and suppression in vision science, including relevant references and exercises.

Download Presentation

Tutorial Notes in Vision Science Masking and Suppression TSM1

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Tutorial Notes in Vision ScienceMasking and SuppressionTSM1 Dr Tim S. Meese t.s.meese@aston.ac.uk Last updated: March 27th 2009

  2. Key References The references in this list either (1) make an important contribution to the points developed in this lecture or (2) provide valuable reviews or expositions; or both. All are recommended reading. Highlights indicate essential reading. You will be directed to these at appropriate junctures during the PPT. You will also be given a few exercises to perform highlighted in red. The PPT might also include other references not included in this list which are less central to understanding the lecture. • ‘Introduction to spatial vision I’ and ‘Introduction to spatial vision II’ by Meese at: http://www1.aston.ac.uk/lhs/staff/az-index/meesets/#Links • Albrecht & Geisler (1991) • Bonds (1989) • Foley (1994) • Freeman, Durand, Kiper & Carandini (2002) • Geisler & Albrecht (1992) • Heeger (1992) • Holmes & Meese (2004) • Legge & Foley (1980) • Petrov, Carandini & McKee (2005) • Meese & Hess (2004) • Meese, Hess & Williams (2005) • Meese & Holmes (2007) • Meese & Baker (2009, in press) • Morrone et al (1982) • Phillips & Wilson (1984) • Porciatti et al (2000) • Snowden & Hammett (1998) • Watson & Solomon (1997) • Wilson, McFarlane & Phillips (1983) • Xing and Heeger (2000)

  3. Prologue • The term ‘masking’ is used very liberally in the literature. • We will not cover all its uses, or even catalogue them all. • In the main, we shall restrict ourselves to the context of contrast masking (see next slide). • though in some contexts this is also referred to as spatial frequency masking and orientation masking. • Within this field, the major impetus has come from psychophysics and single-cell recordings. More recently, functional imaging has contributed some valuable data but added little, if anything, to quantitative models of the process. • In general, there have been two main aims amongst vision scientists working on contrast masking. • To develop our understanding of the masking process. • To use the masking process to probe into the workings of the visual system.

  4. Contrast Masking: Some basics • When one stimulus makes another one more difficult to see, masking is said to occur. • Vision scientists often use gratings or Gabor patches as stimuli because it turns out that any retinal image can be thought of as a large collection of these basic image components. Furthermore, the receptive fields of cortical simple cells in V1 are very similar to Gabor patches, making them ideal stimuli for these cells. • In the real world, stimuli (e.g. lines, bars, edges, gratings, Gabor patches) are rarely seen in isolation, so we might expect masking to occur for everyday stimuli, not just the gratings and Gabor patches used in the psychophysics lab. • We need to understand masking because this will: • contribute to our understanding of signal degradation within the observer • contribute to our understanding of the organization of the early visual pathway • lead to theories about what computations are being performed by early vision. (ie. It makes sense to understand what vision does, before we ask why it does it)

  5. Technical note • As many of you will know, contrast (C) is often expressed as a proportion (C= 0 to 1). In the case of Michelson contrast this is usually derived as follows: • Often this is converted to a percentage simply by multiplying by 100. • Another convention is to express contrast in dB. This is a very convenient way of handling data where multiples (equal log steps) are more important than differences (equal linear steps). In vision, like many things in life (e.g. pay rises, the stock market, inflation, computer memory) a log scale is often more appropriate (less misleading) than a linear scale. Contrast in dB is simply 10 times the log10 of contrast squared, or : • If C is expressed as a proportion, this gives a range of -40dB to 0dB for C=0.01 to C=1. This is sometimes denoted: (dB re 1). • If C is expressed in %, this gives a range of 0dB to 40 dB for C=1% to C=100%. This is sometimes denoted: (dB re 1%). Note that in both cases, the number after ‘re’ tells you the contrast level associated with 0dB. (But sometimes, the ‘re’ is omitted!) • Sometimes contrast ratios are expressed in dB. This can be done by calculating the contrast ratio and converting to dB or, more practically, by converting both contrast measures to dB and subtracting one form the other. • In general, a difference of 6dB is about a factor of 2. So, for example, a contrast (or ratio) of say –4dB is half that of 2dB. • I have tried to avoid using dB in this lecture but you will find that it is used in the literature (and in TSM2 and TSM3), so you should familiarize yourself with it here.

  6. PART 1 • Contrast discrimination and a multi-channel model of early spatial vision

  7. Pedestal masking(Legge & Foley, 1980) • In two-interval forced-choice (2IFC), the mask stimulus is presented in both of two test intervals (usually one after the other with a gap in between), and the target is presented in just one, chosen at random. The observer must decide which interval contained the target. • Legge & Foley (1980) used vertical sine-wave gratings as both masks and targets. • When the target and mask are the same (or very similar), as here, then the mask is sometimes called a pedestal. • Because of the pedestal the observer’s judgment is one of deciding which interval contained the higher overall contrast. Therefore, this task is sometimes called contrast discrimination • In the real world, we are making fine contrast judgements such as these when inspecting fine detail in images such as x-rays. • A schematic illustration of this arrangement is shown in the next slide.

  8. High pedestal contrast Low pedestal contrast The contrast increment (target contrast) is adjusted by a suitable psychophysical technique (e.g. a staircase procedure) to find the contrast discrimination threshold DC Contrast (%) Contrast (%) DC Mask + target Mask Mask Mask + target Time Time Typical arrangement for individual trials in a 2IFC contrast discrimination (pedestal masking) experiment Many trials (~100) are needed to measure DC at detection threshold for each pedestal contrast tested. In this example, the detection threshold is higher when the pedestal contrast is higher.

  9. Legge and Foley’s results • Legge & Foley (1980) plotted contrast discrimination thresholds as a function of mask contrast. • Their results are shown on the next slide. They have been replicated by other vision scientists many times since.

  10. Data replotted from Legge & Foley (1980)A Dipper function 4 2 masking 1 Target contrast, DC (%) 1/2 Detection threshold (mask contrast = 0%) 1/4 1/8 facilitation 1/4 1 4 16 32 Pedestal contrast (%) • At low pedestal contrasts, the mask actually makes it easier to detect the target. We say that facilitation occurs. • At higher contrasts, the mask interferes with detection and we say that masking occurs. • The region of facilitation is sometimes called the dipper, and the region of masking is sometimes called the dipper handle. • We will consider the model in the next few slides (the fit is actually taken from Meese, Hess & Williams, 2005).

  11. k (Eq 1) k (Eq 2) Large contrast increment, DC Small contrast Increment, DC The nonlinear contrast transducer: facilitation Square law We begin by considering only the low pedestal contrasts. Assume that the observer compares their internal contrast response across the two 2IFC intervals: respdiff = abs(respinterval1 - respinterval2). Further assume that to detect the target (contrast increment), this response difference (respdiff) must be greater than or equal to a constant k (We will consider the meaning of k later). If the observer’s contrast-response (y-axis) is given by a square law, then we can see that as pedestal contrast (x-axis) increases, the size of the contrast increment (DC) that is needed for the increment (target) to be detected will decrease. Thus, an accelerating contrast transducer (e.g. a square law) will produce facilitation. In fact, Legge & Foley found a slightly better fit to their results using Eq 2; a transducer that accelerates slightly more rapidly than a square law.

  12. (Eq 3) (Eq 4) Large contrast Increment, DC Small contrast increment, DC The nonlinear contrast transducer: masking Square root law k k Now consider the higher pedestal contrasts. Let’s suppose that the observer’s contrast-response is given by a square-root law in this range. Now we can see that as pedestal contrast (x-axis) increases, the size of the contrast increment (DC) that is needed for the increment to be detected will also increase. Thus, a compressing contrast transducer (eg. a square-root law) will produce masking. In fact, Legge & Foley found a slightly better fit to their results using Eq 4; a transducer that compresses slightly more strongly than a square root law.

  13. More generally: (Eq 6) The nonlinear contrast transducer: putting it all together (Eq 5) • Eq 5 combines equations 2 and 4 so that they cover the entire contrast range. When C << Z, then Z dominates the denominator and Eq 5 is approximated by C2.4. When C >> Z, then C2 dominates the denominator and Eq 5 is approximated by C2.4/C2 = C0.4. • Eq 6 is a more general version of Eq 5, where p > q. Note that raising Z to the power q is arbitrary (since Z is a constant), but Eq 6 is sometimes expressed this way for convenience. • Eq 5 and Eq 6 are sometimes called static nonlinear contrast transducers. • Exercise: You you should try plotting Eq 6 for different values of p and q with Z=10% (for example). Compare your results on double log and double linear axes. • You should now read Legge & Foley (1980). A review of the central parts of this work can be found in Meese, Hess & Williams (2005).

  14. The nonlinear contrast transducer: from single-cell physiology to systems psychophysics (Eq 7) • Eq 6 provides a good account of contrast discrimination, but curiously it is not a particularly good account of individual cortical cells; typical cortical cells saturate at high contrasts. (This is to say that once stimulus contrast reaches a critical point, further increases do not produce any further change in the activity of the cell). The form of the response-function for a typical cell is described by Eq 7, where p ~ 2. Note that when C >> Z, resp ~ 1 for all C. • How might the component level analysis (single-cell physiology) of Eq 7, be reconciled with the system level analysis of Eq 6 (psychophysics)? • It turns out that a set of cells (Eq 7) with a range of values of Z behave very much like Eq 6. (see Watson & Solomon, 1997) • There is evidence from single-cell physiology and psychophysics (Yu, Klein and Levi, 2004) that this is the way that the early visual system is arranged. • Exercise: What happens to the contrast-response function of Eq 7 on double-log co-ordinates as Z is increased?

  15. (Eq 9) Using Eq 7 to fit dipper functions (Eq 8) • Eq 7 and Eq 8 are often fit to dipper functions using computational methods (e.g. the simplex algorithm) to adjust the free parameters (p, q, Z, & k) in the model to provide the best fit to the data. • As we have described already, p and q control the acceleration of facilitation and the severity of masking. Conveniently, this model provides a straightforward indication of what the log-log slope of the dipper handle will be, which turns out to be given by Eq 9. Usually, the empirical log-log slope of the dipper handle is around 0.6, consistent with p-q = 2.4 - 2.0 = 0.4. • Z, k and p determine the overall sensitivity (y-intercept), and the lateral placement of the dipper region. • A fit of this model (Eq 7 & 8) is shown by the solid black curve in the earlier plot of Legge and Foley’s data.

  16. G(s) + The static contrast transducer and noise • The dominant view in the early 1980s was that each visual filter-element (RF) (see lecture MAG2) is followed by a static contrast-response nonlinearity. • An important feature of this model (that we have overlooked until now) is that independent zero-mean, unit-variance, Guassian noise (G(s)) is added to the output of each transducer. • Because the visual channels are noisy, their responses are perturbed from moment-to-moment (2IFC trial-to-trial and interval-to-interval). • In 2IFC detection and discrimination experiments, observers are assumed to select the stimulus interval that generated the greatest response. • Because the responses are noisy, the observer wont necessarily make the same decision for the same stimulus every time (hence the psychometric function). • Very small changes in contrast response cannot be detected because they are swamped by the noise. • It is important to include neural noise in our models, without it there could be no masking! Fortunately, we do not have to change our equations because it turns out that the standard deviation of the noise is proportional to the parameter k that appeared earlier. Spatial filter element Nonlinear transducer Decision STIMULUS

  17. Alternative accounts of the dipper function • We have explained the dipper function in terms of a nonlinear contrast transducer. • However, other possibilities have also been considered. • The details are beyond the scope of this lecture, but you should be aware that: • Stimulus uncertainty has been used to explain the region of facilitation (Pelli, 1985). • Multiplicative (signal-dependent) noise has been used to explain the dipper handle. • For the contrast detection of sine-wave gratings, current evidence favours the accelerating contrast transducer explanation of facilitation (Lu & Dosher, 2008; Meese & Summers, 2009), though uncertainty might contribute to some of this effect (Meese & Summers, 2009). • For the dipper handle, it remains unclear how the balance lies between response-compression and multiplicative noise (e.g. Georgeson & Meese, 2006).

  18. G(s) G(s) G(s) G(s) G(s) Spatial filter element Nonlinear transducer + + + + + The static contrast transducer and masking • On this model, any stimulus that is sufficiently potent to push a spatial channel into the compressive part of its contrast-response characteristic will also produce masking for any other stimulus that also passes through that channel. • Wilson and colleagues (and other groups) used this model to estimate the number of spatial channels and their spatial frequency and orientation bandwidths (Wilson et al, 1983; Phillips & Wilson, 1984). • Models involving multiple mechanisms tuned to a range of spatial frequencies and orientations are now the bedrock of most contemporary models of early vision. Multiple transducers Multiple mechanisms Decision based on all outputs Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer STIMULUS Multiple noise sources Multiple spatial channels

  19. G(s) G(s) G(s) G(s) G(s) Spatial filter element Nonlinear transducer + + + + + The static contrast transducer and masking • How is the decision derived from the multiple spatial channels? • Clearly, the observer must adopt some form of pooling rule across the channels. • We will not consider the pooling rule in detail here (though see TSM2 & TSM3), but a widely used rule has been that of probability summation. • This is say that the outputs of the multiple spatial channels are not wired together, but that decisions are based upon the one that produces the biggest response difference across the two stimulus intervals (Quick, 1974; Pelli, 1985; Tyler & Chen, 2000). • You should now read Phillips and Wilson (1984). Multiple transducers Multiple mechanisms Decision based on all outputs Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer Spatial filter element Nonlinear transducer STIMULUS Multiple noise sources Multiple spatial channels

  20. Wilson’s multi-channel model • In the 1980s, the multi-channel model derived by Wilson and colleagues enjoyed a great deal of success with a variety of different experiments, tasks and stimuli. • These include: • Spatial frequency discrimination (Wilson & Gelb, 1984). • Hyperacuity (Wilson, 1986). • Curvature discrimination (Wilson 1885; Wilson & Richards, 1989). • Eccentricity effects and contrast matching (Swanson & Wilson, 1985).

  21. PART 2 • Saturation and contrast gain control

  22. Technical note • We use the abbreviation XOS to stand for cross-orientation suppression. • We use the abbreviation XOM to stand for cross-orientation masking. • By ‘cross-orientation’ we mean a mask orientation that is very different from the target orientation, often the two are at right-angles to each other. • We use the term ‘masking’ to refer to phenomena. For example, if a mask raises detection threshold then we say masking has occurred. • We use the term ‘suppression’ to refer to a process. Thus, the process of suppression might be said to underlie the phenomenon of masking.

  23. Response saturation (single cells) • Most cortical cells saturate (Eq 7): Stylized diagram As contrast increases beyond this point there is no further increase in response Vertical receptive field (RF) Vertical grating Response (pps) Contrast (%)

  24. Vertical receptive field (RF) Several grating orientations, from left to right: 0°, 10°, 20°, 30°, 40°… Different curves show the responses of a single cortical cell to oriented gratings that are increasingly different from the cell’s RF Stylized diagram Response (pps) Contrast (%) Response saturation (single cells) • As the orientation of a grating is rotated away from that preferred by the cell, the cell becomes less responsive to the grating. • Thus, we might expect the following set of contrast-response functions for a range of stimulus orientations.

  25. Different curves show the responses of different cortical cells with RFs that are progressively rotated away from the vertical grating Vertical grating Several RF orientations, from left to right: 0°, 10°, 20°, 30°, 40°… Stylized diagram Response (pps) Contrast (%) Response saturation (many cells) • Or another way to think of it is in terms of the responses of cells with differently oriented RFs to a single (say, vertical) grating.

  26. Population codes (many cells) • It is widely accepted that vision encodes basic image properties using population codes. These involve a distribution of activity across a population of cells such as a hypercolumn. (For some background on this see ‘Introduction to spatial vision I’ in the supporting materials folder for this module. • But we might expect that these codes would be seriously compromised by response saturation. • For stimulus contrasts that drive cells beyond the saturation point, we might expect the population codes to look like that shown on the next slide.

  27. The population code is compromised • This figure shows the response distribution across orientation derived from the earlier slide for a high contrast grating. The response distribution has a flat top; there is no single cell that responds most strongly to the vertical stimulus. • This would be a poor basis for coding orientation. Note that the distribution is not peaked, but flat! Stylized diagram High contrast vertical grating Several RF orientations, from left to right: -90°, -60°, -30°, 0°, 30°… Response (pps) RF orientation (deg)

  28. Different curves show the responses of different cortical cells with RFs that are progressively rotated from the vertical grating Vertical grating Several RF orientations: 0°, 10°, 20°, 30°, 40°… Stylized diagram Response (pps) Contrast (%) Contrast normalization of cellular activity • But the flat response distributions are not in fact what we find in V1. • Cortical cells don’t saturate at a fixed response level. • Instead, they tend to saturate at a fixed contrast, as shown below. • This means that population codes are preserved across the full contrast range.

  29. Contrast normalization (single cells) • This means that the response level at which a cell saturates depends on the stimulus: the more closely matched a cell is for a particular stimulus feature (e.g. orientation) then the higher will be the response level at which it saturates. How can this remarkable feat be achieved? • Heeger (1992) pointed out that this can be done by dividing the response of each cell by the activity of all of the other cells in the general vicinity (e.g. the hypercolumn). This normalizes the response of each cell to that of the entire pool, controlling the contrast response of each cell (contrast gain control). The pool of cells that contribute to the normalization process (divisive inhibition) is sometimes called the gain pool. Contrast responses of several different cells. Note that they all saturate, but the response functions do not converge. This means that the population code will be preserved at high contrasts. Vertical grating Several RF orientations: 0°, 10°, 20°, 30°, 40°… Stylized diagram Response (pps) Contrast (%)

  30. Heeger’s cortical model The cortical circuitry envisaged by Heeger is shown here. Note that the arrangement involves pooling the outputs of a population of complex cells (the gain pool) and feeding this signal back to the outputs of simple cells, where it acts divisively. Feedback circuits like this can be tricky to implement in computational models. Instead, most workers develop closely related equations that operate in a feed-forward manner. These are much easier to work with. Multiple phase-selective simple cells (spatial filter-elements) Inhibition Gain pool Complex cell • You should now read Heeger (1992) or, alternatively, Albrecht & Geisler (1991) who present a similar model and arguments. Geisler & Albrecht (1992) is short and accessible! resp Adapted from Heeger (1992), Visual Neuroscience

  31. Contrast gain control and cross-orientation suppression • Heeger’s model of contrast gain control is successful in several respects: • It predicts that cortical cells will saturate. • This is a normal part of the operation of visual cortex, but there is some evidence that for some people with photosensitive epilepsy this does not happen (Porciatti et al, 2000). • It preserves population codes across the full range of stimulus contrasts. • It predicts cross-orientation suppression. • This is the finding that when a cell responds strongly to its preferred grating, it is suppressed when an orthogonal grating is superimposed (Morrone et al, 1982; Bonds, 1989).

  32. Cross-orientation suppression and psychophysics • If cross-orientation suppression happens at the single-cell level, shouldn’t we expect to see psychophysical evidence for this? • In an elegant contrast masking study, Foley (1994) addressed this with two experiments.

  33. Foley (1994): Experiment 1, single mask As we have seen before, facilitation occurs when the mask and target are in the same channel (the mask drives the response up the accelerating part of the transducer). We see here that there is no facilitation when the mask and target have very different orientations, yet masking still occurs. Therefore, the masking cannot be ‘within-channel’. It must be cross-channel: ie. cross-orientation suppression. Stylized results • The target stimulus was a vertical Gabor patch. Masks were gratings at several orientations. Target orientation = 0° Masking occurs for all mask orientations Mask orientation 0° 45° Facilitation occurs only when the mask and target have the same (or similar) orientations and spatial frequencies Log target contrast 90° Log mask contrast

  34. Dipper functions are preserved in the presence of a mask, but to a first approximation they are shifted up and to the right (on double log coordinates) Foley (1994): Experiment 2, dual masking Stylized results As we know, facilitation occurs when the pedestal drives the target channel up the accelerating part of the transducer. The cross-oriented masks cause masking (see how the left hand limb of the masking functions is raised), but facilitation by the pedestal remains. Therefore, the cross-oriented mask cannot be driving the target channel through the accelerating transducer, so it must be producing masking by some other means: cross-channel suppression. • In this experiment there were two masks, one a pedestal, the other a cross-oriented mask fixed at a fairly high contrast. The target stimulus was a vertical Gabor patch. A Gabor patch was also used as the pedestal. Masks were gratings at several orientations. Target and pedestal orientation = 0° Mask orientation 45° Log target contrast 90° No mask Log pedestal contrast

  35. Foley’s (1994) model of masking (Eq 10) • Inspired by the work of Heeger, the essence of Foley’s model is given by equation 10, where Ct is the target contrast Cm,i is the contrast of the i-th mask component that is not a pedestal, and wi is the weight of that component in the contrast gain pool. This model provided a good fit to the results from both of Foley’s experiments (see also Holmes & Meese, 2004). • This is the same as the Legge and Foley contrast transducer (Eq 6) but with an additional term on the denominator that contributes only suppression (the contrast gain pool), no excitation. This adds one free parameter (wi) to the Legge & Foley model per additional mask component. • Note that for pedestal masking (with no additional masks) this equation is identical to the Legge & Foley equation, but its interpretation is different. No longer do we have a static S-shaped transducer on the output of each detection mechanism, but a dynamic gain control mechanism that includes divisive suppression from itself (self-suppression) and other mechanisms in the gain pool. • Eq 10 has become the standard contrast gain control equation. This model has been set in a filter-based image processing model by Watson and Solomon (1989). • You should now read Foley (1994).

  36. Model behaviours Preferred orientation of ith mechanism Preferred orientation of ith mechanism Stimulus orientation = vertical Stimulus orientation = vertical Foley (1994) • Here we take the opportunity to illustrate the behaviours of two different psychophysical models for a vertical sine-wave grating. • These models were designed to describe the entire behaving organism, but we can consider what happens when the model equations are applied to individual mechanisms (e.g. Watson & Solomon, 1997). • Each panel shows the contrast responses of i = 1 to 5 different mechanisms with their preferred orientations shown in the legend. For simplicity, we assume that the sensitivity (Ai) of each mechanism is: 1, 0.8, 0.6, 0.4, 0.2. (i.e. we assume a linear drop in sensitivity with orientation difference between the stimulus and the mechanism). The saturation constant Z = 10. • In the Foley (1994) model we make a further assumption that the weight of suppression in the gain pool is unity for each contributing mechanism. This is probably not strictly true (see Foley, 1994), but serves to keep things simple and to illustrate our point. The model equations are used in conjunction with Eq 8 and are shown above each figure (derived from Eqs 5 & 10). Note that the factor of 2.18 in the denominator derives simply from the sum of the squares of the sensitivities of the five different mechanisms contributing to the gain pool (12 + 0.82 + 0.62 + 0.42 + 0.22). To simplify comparisons across the two models, we have also used this same factor (2.18) on the denominator of the Legge & Foley model (this does not affect its basic form). • Unlike the single-cell example, the Legge & Foley functions do not converge (because they do not saturate). Nevertheless, the Foley functions splay out much more widely, indicating that this model produces a more refined orientation code. Legge & Foley (1980)

  37. Part 3 • Cross-orientation suppression and surround suppression

  38. But what about the ‘within-channel’ masking story? • As we have seen, psychophysics and single cell physiology both indicate a process of cross-orientation suppression. • But the masking models of the 1980s (e.g. Wilson and colleagues) were all based on the idea that masking was a ‘within-channel’ process. Masking wasn’t supposed to happen if the mask and target were in completely different channels. • How did two different stories for masking manage to evolve?

  39. Cross-orientation masking depends on spatiotemporal frequency • Meese & Holmes (2007) measured cross-orientation masking (XOM) functions for a wide range of spatial and temporal frequencies, where the mask and target always had the same spatiotemporal frequency. • They found that the weight of suppression (XOS) increased with an increase in temporal frequency (TF) and a decrease in spatial frequency (SF). • The early work that had not found XOM had tended to use high SF and/or low TF, whereas later work that found XOM had used low SF and/or high TF. • Therefore, two different stories emerged because the two different types of stimuli revealed different properties of the visual system. • The ratio: TF/SF, gives the scalar quantity speed. • Meese & Holmes found that the weight of XOS was a power function of speed (i.e. speed, raised to a power). • You should now read Meese & Holmes (2007).

  40. Mfilter’’ Pfilter’’ SF (c/deg) Frequency tuning of a pair of filters How can the brain compute speed? The arguments here are derived from Harris (1986). They are also presented in Meese & Baker (2009) Pfilter’ Mfilter’ • The ratio of a pair of temporal frequency (TF) filters such as those on the left will be exactly proportional to TF. That is: • Similarly, a pair of spatial frequency (SF) filters such as those on the right will be exactly proportional to SF. That is: • Now suppose that the TF and SF characteristics are the spatiotemporal frequency responses of a single pair of filters such that Mfilter = Mfilter’ x Mfilter’’ and Pfilter = Pfilter’ x Mfilter’’. Then, because speed = TF/SF it is easy to show that: Exercise: demonstrate to yourself that this is so. Response TF (c/deg)

  41. Magnocellular and Parvocellular cells • The magnocellular and parvocellular streams of processing that begin in the retina and remain distinct in the LGN have spatiotemporal tuning functions similar to those shown in the previous slide. • Thus, in principle, the ratio of these mechanisms could be used to modulate the weight of XOS, consistent with the results of Meese & Holmes (2007). • By the way, note that the proposal here refers to the scalar-quantity speed. The vector-quantity velocity (direction and speed) is probably computed further up the processing pathway, in area MT. That evidence does not form part of the lecture here.

  42. So where does this leave the within-channel analysis of the 1980s? • The general consensus remains that vision is a multi-channel system (this conclusion does not rely solely on evidence from masking. See MAG2). • However, estimates of mechanism bandwidth derived from masking experiments must be treated with caution if the analysis does not include a component of XOS.

  43. Where does all this suppression take place in the nervous system? • The original model of Heeger (1992) proposed that suppression is a cortical phenomenon. • But recently, single-cell evidence suggests that the process might arise earlier, in the LGN or retina. • Details and species differences (cat, monkey, human) still remain unclear, though it seems likely that XOS arises to some extent in the subcortex (retina and/or LGN). • We shall now consider the evidence for this.

  44. Excitatory centre (+) Inhibitory surround (-) Suppressive field (÷) Evidence for an early locus of XOS • Freeman et al (2002) made several important observations of the visual system of the cat which point to a subcortical locus for XOS. • First, it is known that the LGN responds to higher temporal frequencies than the cortex (i.e. it prevents very high TFs from reaching the cortex through the primary visual pathway). Nevertheless, Freeman et al found that TFs higher than those expected to drive cortical cells provided strong XOS in cortical cells. • Second, XOS was found to be immune from adaptation to the mask. As cortical cells are known to be desensitized by adaptation but most LGN cells are not, this points to a subcortical locus. • Bonin et al (2005) recorded from LGN neurons in cat to investigate the putative suppressive process. • In addition to the well-known concentric centre-surround construction of LGN cells they identified a suppressive field that responded to a wider range of spatial and temporal frequencies than those that would drive the cell. • You should now read Freeman et al (2002).

  45. One or two sites for XOS? • The evidence for this is not definitive, but a circumstantial case can be made for two sites (though we will revisit this more closely in TSM2) • Single-cell physiology indicates a subcortical locus (Freeman et al, 2002). • Subcortical cells are not orientation tuned • XOM involves suppression from orthogonal masks (Meese & Holmes, 2007). • There is an orientation tuned component to masking (Phillips and Wilson, 1984) and orientation tuning arises in the cortex. • Perhaps there is an isotropic component that accounts for XOS and is subcortical, followed by an orientation dependent cortical effect.

  46. Surround suppression • In primary visual cortex it is well known that many cells have an influential region that surrounds the classical receptive field (CRF) (e.g. DeAngelis et al, 1994). Details vary (and are still being uncovered), but in the main this region provides suppressive modulation for stimuli that excite the CRF and is tuned to the same orientation as the CRF. It does not influence the behaviour of the cell in the absence of stimulation within the CRF. This process is now referred to as surround-suppression and is probably the process that led Hubel and Wiesel to classify some of their complex cells as hypercomplex cells. Note though, surround suppression is found for both simple and complex cells. • As surround suppression is orientation tuned it presumably originates in the cortex • Psychophysical masking effects have also been found from the surround, though they are much more potent in the periphery (Xing & Heeger, 2000; Snowden & Hammett, 1998).

  47. Surround suppression and XOS • Petrov et al (2005) performed careful psychophysical experiments to investigate the relation between (superimposed) XOS and (parallel) surround suppression in the periphery. (See next slide for example stimuli from Challinor et al (2007) who performed similar experiments.) There were two key results: • When a cross-oriented surround was superimposed on a parallel surround, the level of masking diminished, suggesting that XOS precedes surround suppression. • When a parallel surround was added to a cross-oriented mask, the level of masking increased, suggesting that the two processes followed in sequence. • Although this result does not dictate the physiological loci, it is consistent with an early subcortical stage for XOS and a later, cortical stage for surround suppression. • They also found that surround suppression saturated with contrast. This means that the level of masking increased up to a certain mask contrast, but then didn’t become any more severe for further increases in mask contrast (This is relevant to TSM3) • You should now read Petrov et al (2005). • Additional note: The story of surround suppression is far from complete, and other work (both psychophysics and single-cell physiology) suggests that it might involve two distinct processes. We will not consider this work here.

  48. Red indicates strength of masking in fovea Green indicates strength of masking in periphery Strong Strong Very weak Very Weak Strong Medium Typical stimuli and summary of contrast detection results These stimuli were used by Challinor, Meese & Summers (2007) in an experiment closely related to Petrov et al (2005) Target + superimposed cross-oriented mask Target Target + surrounding cross-oriented mask Target + surrounding parallel mask

  49. Surround suppression and XOS • Summary of results • In the fovea, there is little evidence of surround suppression at detection threshold. The superimposed suppressive field is broadly tuned to SF, TF and orientation. It is either the same size as the CRF (Bonin et al, 2005), or possibly a little larger (Challinor et al, 2007). • In the periphery, the arrangement in the fovea is augmented by a further process of surround suppression. Whether this is the same process as that which produces self-suppression (compression and saturation), and extends across the CRF (as depicted) is not yet clear.

  50. The purpose of it all: XOS • We have seen Heeger’s proposal already, that XOS is part of a gain control system to protect population codes. • Another suggestion is that XOS might be involved in achieving an efficient coding system by removing redundancy (Schwartz & Simoncelli, 2001). • A further possibility is that it contributes to the refinement of orientation tuning in the cortex (eg. Ringach et al, 2001). • But why should any of these goals involve the distinct speed dependency found by Meese & Holmes (2007)?

More Related