Attention and Selection: Part 1. This seminar is about how cognition (especially visual perception) connects with the world.
This seminar is about how cognition (especially visual perception) connects with the world • The central concept will be the notion of “picking out” or selecting, and the usual mechanism that is appealed to in explaining this selection is attention (sometimes called focal attention or selective attention). • Why do we need to select? This is a nontrivial question and we will consider several different answers: • We need to select because we can’t process all the information available. This is the resource-limitation reason. • We need to select because of the way relevant information in the world is packaged. This gives rise to the Binding Problem. • We need to select because certain patterns cannot be computed without first marking certain special elements of a scene. • We need to select because selection is the first line of contact between the mind and the world, and it precedes all conceptualizing and encoding.
Attention in Psychology: Historical Background • Attention was one of the first concepts to appear in Psychology texts (ca 1730) – e.g., Ebbinghaus, Titchener, … • Early discussions (Hatfield, 1998) focused on properties such as • Narrowing of range of sensitivity (Aristotle, 4th century BC) • Active Directing (Lucretius, 1st century AD) • Involuntary shifts (Augustine of Hippo, 400 AD) • Clarity (Buridan, 14th century) • Fixation over time (Descartes, 17th century) • Laws of Attention (Titchener, 1908) • Independence of clarity and other attributes (e.g., loudness) • Law of two levels of clarity (focus vs non-focus) • Law of accommodation (cuing) and law of Inertia (disengagement) • Law of prior entry (attended stimuli have temporal priority) • All the above phenomena (William James, 1890)
The functions of focal attention • A central notion in the present analysis is the notion of “picking out” or selecting. The usual mechanism that is appealed to in explaining perceptual selection is attention (sometimes called focal attention or selective attention). • Why must we select anyway? This is a rarely asked question to which there are several answers: • We need to select because we can’t process all the information available. This is the resource-limitation reason. <But in what ways is it limited? Along what dimensions?> • We need to select because certain patterns cannot be computed without first marking certain special elements of a scene. • We need to select because of the way relevant information in the world is packaged (Strawson’s Collecting Principles). It is a response to the Binding Problem. • We need to select because selection is a consequence of the first line of causal contact between mind and world: it precedes all conceptualizing and predicating.
Attention and Selection I will first concentrate on the Selection or Filtering aspects of attention. I will ask: • Why do we need to select anyway? • Because our processing capacity is limited? • The Big Question: In what way is it limited? (Miller, 1956) • We will return to this core question after some preliminaries on the early study of attention as selection and the filter theory. • On what basis do we select? Some alternatives: • We select according to what is important to us (e.g., affordances) • We select what can be described physically (i.e., “channels”) • We select based on what can be encoded without accessing LTM • We “pick out” things to which we subsequently attach concepts: i.e., we pick out objects (or regions?) • What happens to what we have not selected? A largely unsolved mystery (though in some cases there are plausible answers).
Big Question #1: Why do we need to select information? Because capacity is limited. Along which dimensions is human information processing capacity limited? • Channel capacity: Shannon-Hartley Theorem • Capacity measured in some sort of “chunks” (Miller) • Capacity measured in terms of the number of arguments that can be simultaneously bound to cognitive routines (Newell) • To what things in the world can the arguments of visual predicates be bound?
Amount of information in terms of the information-theoretic measure (entropy) • The amount of information in a signal depends on how much one’s estimate of the probability of events is changed by the signal: $H = -\sum_i p_i \log_2 p_i$, with information measured in bits. • “One if by land, two if by sea” contains one bit of information if the two possibilities were equally likely, less if they were not (e.g., if one was twice as likely as the other, the information in the message would be $-(\tfrac{1}{3}\log_2\tfrac{1}{3} + \tfrac{2}{3}\log_2\tfrac{2}{3}) \approx 0.92$ bits). • The amount of information transmitted depends on the potential amount of information in the message and the amount of correlation between message sent and message received. So information transmitted is a type of I-O correlation measure. • The information measure is an “ideal receiver” or competence measure. It is the maximum information that could be transmitted, given the statistical properties of messages, assuming that the sender and receiver know the code.
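The entropy measure on this slide can be checked numerically. The following is a minimal sketch, assuming a discrete set of mutually exclusive messages whose probabilities sum to 1; the function name `entropy_bits` is introduced here purely for illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum(p_i * log2(p_i)), in bits.

    Zero-probability outcomes contribute nothing to the sum.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two equally likely messages ("by land" vs "by sea")
equal = entropy_bits([0.5, 0.5])

# One message twice as likely as the other
skewed = entropy_bits([1 / 3, 2 / 3])

print(f"equally likely: {equal:.2f} bits")
print(f"2:1 odds:       {skewed:.2f} bits")
```

The skewed case comes out below one bit because the more probable message is less surprising when it arrives.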
Information transmitted in a typical absolute judgment experiment • Information transmitted in an experiment in which subjects were presented with tones drawn from a known practiced set (of a given size, which determines the value of input information) and had to name the tones from a learned name set. • The information transmitted was always around 2.5 bits, equivalent to about 6 equiprobable alternatives ($2^{2.5} \approx 5.7$)!
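Information transmitted can be estimated from a stimulus-by-response table of counts as the mutual information between what was presented and what was reported. The sketch below is illustrative only: the two count matrices are made-up data, not results from any experiment, and `transmitted_bits` is a hypothetical helper name.

```python
import math

def transmitted_bits(counts):
    """Mutual information I(S;R) in bits from a stimulus-by-response count matrix.

    counts[i][j] = number of trials on which stimulus i received response j.
    """
    total = sum(sum(row) for row in counts)
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(len(counts[0]))]
    info = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n:  # empty cells contribute nothing
                p = n / total
                info += p * math.log2(p / (row_p[i] * col_p[j]))
    return info

# Hypothetical data: four tones, each always named correctly
perfect = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
# Hypothetical data: responses unrelated to the stimulus
unrelated = [[5, 5], [5, 5]]

print(transmitted_bits(perfect))    # all input information gets through
print(transmitted_bits(unrelated))  # nothing gets through

# Converting bits back to an equivalent number of equiprobable alternatives
print(2 ** 2.5)
```

With perfect identification the transmitted information equals the input information; as confusions increase it falls toward zero, which is why it behaves like an input-output correlation measure.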
Short term memory capacity is independent of the amount of information per item! This table shows the STM capacity as a function of the type of item. The number of items recalled remains roughly constant (at 7±2) while the amount of information recalled increases rapidly as the information carried by each item increases.
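The pattern described here can be sketched with simple arithmetic. The example below assumes a span of exactly 7 items and equiprobable items drawn from each vocabulary; both are simplifying assumptions for illustration, and the vocabularies listed are examples rather than the contents of the slide's table.

```python
import math

ASSUMED_SPAN = 7  # items recalled; illustrative stand-in for "7 plus or minus 2"

def bits_recalled(span, vocabulary_size):
    """Information recalled if each item is drawn equiprobably from the vocabulary."""
    return span * math.log2(vocabulary_size)

# Example vocabularies (assumed for illustration)
vocabularies = {"binary digits": 2, "decimal digits": 10, "letters": 26}

for name, size in vocabularies.items():
    print(f"{name:15s} {ASSUMED_SPAN} items  {bits_recalled(ASSUMED_SPAN, size):5.1f} bits")
```

The item count stays fixed while the number of bits grows with the logarithm of the vocabulary size.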
Why can we retain different amounts of information just by using a different encoding vocabulary? • Answer: The architecture of the cognitive system has the property that it can deal with a fixed maximum number of items, regardless of what the items are. • This property can be exploited to get around the bottleneck of the short-term memory. We do this by recoding the input into a smaller number of discrete units, called chunks. • There is also evidence that it takes additional time to encode and decode chunks, so the recoding technique is a case of time-capacity tradeoff or what is known in CS as a compute-vs-store tradeoff. • Allen Newell’s novel model to account for the time taken in the Sternberg memory scan experiment attributes the observed RT to encoding or chunking.
Example of the use of chunking • To recall a string of binary digits, e.g., 00101110101110110101001 • People can recall a string of about 8 binary digits. If they learn a 2:1 encoding rule (00→0, 01→1, 10→2, 11→3) they can recall about 8 such chunks, or 16 binary digits. If they learn a 3:1 chunking rule (the octal number system) they can recall a 24-digit binary string, etc.
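The recoding rule can be sketched as follows. This is an illustration, not a model of how people do it: the function name `recode` is hypothetical, and padding the final group with trailing zeros is an assumption made here so that a 23-digit string divides evenly.

```python
def recode(bits, chunk_size):
    """Split a binary string into chunk_size-digit groups and name each group by its value.

    Assumption: a short final group is padded on the right with zeros.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("expected a string of 0s and 1s")
    padded_len = -(-len(bits) // chunk_size) * chunk_size  # round up to a full group
    padded = bits.ljust(padded_len, "0")
    return [int(padded[i:i + chunk_size], 2) for i in range(0, padded_len, chunk_size)]

example = "00101110101110110101001"  # the 23-digit string from the slide

pairs = recode(example, 2)   # 2:1 rule: 00->0, 01->1, 10->2, 11->3
octal = recode(example, 3)   # 3:1 rule: octal digits

print(len(example), "binary digits")
print(len(pairs), "chunks under the 2:1 rule:", pairs)
print(len(octal), "chunks under the 3:1 rule:", octal)
```

Under the 3:1 rule the whole string fits in about 8 chunks, which is within the span that the uncoded binary string exceeds.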
Early studies: Colin Cherry’s “Cocktail Party Problem” • What determines how well you can select one conversation among several? Why are we so good at it? • The more controlled version of this study used dichotic presentations: one “channel” per ear. • Cherry found that when attention is fully occupied in selecting information from one ear (through use of the “shadowing” task), almost nothing is noticed in the “rejected” ear (only whether or not it was speech). • More careful observations show this was not quite true: • Change in spectral properties (pitch) is noticed • You are likely to notice your name spoken • Even meaning is extracted, as shown by involuntary ear switching and the disambiguating effect of rejected channel content
Visual analogues illustrating the two-channel selection problem In these examples you are to read only the text in shadows and ignore the rest. Read as quickly as you can and when you are finished, close your eyes or look away from the text.
Visual analogue #1 illustrating the two-channel selection problem In performing an experiment like this one on man attention car it house is boy critically hat important she that candy the old material horse that tree is pen being phone read cow by book the hot subject tape for pin the stand relevant view task sky be read cohesive man and car grammatically house complete boy but hat without shoe either candy being horse so tree easy pen that phone full cow attention book is hot not tape required pin in stand order view to sky read red it nor too difficult.
Visual analogue #2 illustrating the two-channel selection problem It is important that the subject man be car pushed slightly boy beyond that his normal limits horse of tree competence open for be only in phone this cow way book can hot one tape be pin certain stand that snaps he with is his paying teeth attention in to the the empty relevant air task and rather minimal than to the attention candy to horse the tree second or peripheral task.
Broadbent’s Filter Theory [Figure: block diagram of the filter model; box labels: Senses, Very Short Term Store, Filter, Limited Capacity Channel, Store of conditional probabilities of past events (in LTM), Motor planner, Effectors, Rehearsal loop] Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.
Stroop Effect Baseline: Name the colors of the ink
Stroop Effect in English: Name the colors of the ink RED GREEN BLUE PINK BROWN ORANGE GREEN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW PINK ORANGE GREEN BLUE BROWN PINK RED YELLOW GREEN YELLOW RED BROWN PINK RED YELLOW GREEN YELLOW RED PINK ORANGE GREEN BLUE BROWN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE GREEN BROWN YELLOW GREEN YELLOW RED PINK ORANGE GREEN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW YELLOW GREEN YELLOW RED BROWN PINK RED YELLOW GREEN PINK RED YELLOW
Stroop Effect in Portuguese: Name the colors of the ink VERMELHO VERDE AZUL MARROM ROSA ALARANJADO VERDE ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL AMARELO ROSA ALARANJADO VERDE AZUL MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO ROSA ALARANJADO VERDE AZUL MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM VERMELHO AZUL MARROM VERDE AMARELO VERDE AMARELO VERMELHO ROSA ALARANJADO VERDE VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL
Degree of Interference of the attended message, as well as its interpretation, shows that the rejected message was understood • Moral: Although the rejected channel appears to be rejected, it is being processed enough to understand the words! • The semantic interpretation of the attended message depends on the meaning content of the rejected message. Subjects were asked to paraphrase the attended message in: • Channel 1 (attended): “I think I will go down to the bank but I will be back for dinner” • Channel 2 (rejected): “The election results will depend on the value of the dollar against the Euro and on the state of the domestic economy” • OR Channel 2 (rejected): “The rain has resulted in erosion by the overflowing river” (Lackner, J. R., & Garrett, M. F. (1972). Resolving ambiguity: Effects of biasing context in the unattended ear. Cognition, 1, 359-372.)
From here on I will focus on the special case of visual attention • Visual working memory and visual selection • What is the nature of the input, storage and information processing limits in vision?
Studies of the capacity of Visual Working Memory (Luck & Vogel, 1997) • People appear to be able to retain about 4 properties of an object (4 colors, 4 shapes, 4 orientations, etc.) over a short time. • People can also retain the identity of 4 objects for a short time. • Luck and Vogel found that as long as there are not more than 4 properties per object, people can retain large numbers of properties when the properties are on different objects (a phenomenon that is reminiscent of Miller’s “chunking hypothesis” except the chunks are visual objects).
What does visual attention select? (What is the basis for selection?) • If visual attention is selection, what does it select? • An obvious answer is places. We can select places by moving our eyes so our gaze lands on different places. • When places are selected, are they selected automatically? • Must we always move our eyes to change what we attend to? • Studies of Covert Attention-Movement: Posner (1980). • How does attention switch from one place to another? • Is it always the case that we attend to places? Can we attend to any other property? Can we select on the basis of color, depth, spatial frequency, affordances, or the property a painting has of having been painted by Da Vinci (a property to which Bernard Berenson was able to attend extremely well)? Cf. Gibson.
What else can visual attention select? • Regions? Can we control the size and shape of the region that is selected, or is selection always punctate and data-driven? • Zoom Lens model of spatial attention (Eriksen & St James, 1986). • Controlling where attention moves: • Is this automatic or voluntary? • How do we know where to direct our attention? How do we specify a location prior to attending to it? • We need a way to specify where or what prior to attending to it! • Keep this conundrum in mind – we will return to it later! • How narrowly can we focus our attention? Can we make it pick out one out of several objects? • Are there special conditions under which we are able to pick out individual things? We will return to “attentional resolution” or the minimum spacing for selecting individual things.
Covert movement of attention Example of an experiment using a cue-validity paradigm for showing that the locus of attention moves without eye movements and for estimating its speed. Posner, M. I. (1980). Orienting of Attention. Quarterly Journal of Experimental Psychology, 32, 3-25.
Exogenous vs endogenous control of attention • In the Posner paradigm illustrated in the last slide, attention was automatically seized by the onset of a luminance change (exogenous attention allocation). Other experiments show that this can also be done under voluntary (endogenous) control, e.g., by providing a cue for which direction to move attention. • Posner, Tsal and others showed that when attention goes from A to B, intermediate locations are maximally sensitive to detecting a signal at intermediate times. • Although this suggests a continuously moving “spotlight” of attention, there are other models that claim that this results from attentional activation that fades at the starting place and grows at the target place, creating an overlap in intermediate locations (Sperling). • Both exogenous and endogenous control produce movement of attention, but they differ in some of their effects. • Endogenously moved attention does not lead to Inhibition of Return • Endogenously controlled movement does not appear to affect detection sensitivity, but it does affect discrimination • Endogenously controlled effects are stronger and appear earlier
Extension of Posner’s demonstration of attention switch Does the improved detection in intermediate locations entail that the “spotlight of attention” moves continuously through empty space?
Sperling & Weichselgartner (1995) “Episodic” or Quantal Theory of Attention switching Assumes a quantal “shift” in attention in which the spotlight pointed at location -2 is extinguished and, simultaneously, the spotlight at location +2 is turned on. Because extinction and onset take a measurable amount of time, there is a brief period when the spotlights partially illuminate both locations simultaneously.
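The temporal overlap described on this slide can be sketched with a toy calculation. Everything specific below is an assumption made for illustration: the linear ramp, the 100 ms transition time, and the function name `episodic_weights` are not taken from Sperling and Weichselgartner, whose account requires only that extinction and onset take a finite amount of time.

```python
def episodic_weights(t_ms, switch_time_ms=0.0, transition_ms=100.0):
    """Return (old_location_weight, new_location_weight) at time t_ms.

    Assumption: the old spotlight fades and the new one grows along a
    linear ramp lasting transition_ms, starting at switch_time_ms.
    Neither spotlight ever occupies an intermediate location.
    """
    progress = (t_ms - switch_time_ms) / transition_ms
    progress = min(1.0, max(0.0, progress))  # clamp to the transition window
    return 1.0 - progress, progress

for t in (-50, 0, 25, 50, 75, 100, 150):
    old, new = episodic_weights(t)
    both = "both locations partly attended" if 0 < new < 1 else ""
    print(f"t={t:4d} ms  location -2: {old:.2f}  location +2: {new:.2f}  {both}")
```

In this sketch attention is never located between the two positions; only the relative strength at the two endpoints changes, which is the contrast with a continuously moving spotlight.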
Review of the basis for selection • If attention serves as a gatekeeper between the world and visual cognition, then we must ask: On what (properties, things) does it base its selection? • We have already seen that attention appears to care about certain kinds of bundles of information that Miller called “chunks”. But what do chunks correspond to in vision? • A visual “chunk” is given by the way the world is “parsed” into things and non-things. What counts as a “thing” is an empirical question still being investigated, but whatever it comes out to be exactly, it seems to be a precursor of real objects
The object-based view of attention selection • When we discuss some of the reasons for attention and the mechanisms involved I will propose that there are good reasons for supposing that attention attaches itself to objects rather than locations
Selecting embedded shapes • It seems we can attend to entire (or at least large parts of) random shapes embedded in other random shapes and recall the attended ones to some degree • But we fail to recall the shapes we did not attend to.
We can select a shape even when it is intertwined among other similar shapes Are there items on the left and on the right that have the same shape? On a surprise test at the end, subjects were not able to recall shapes that had been present but had not been attended in the task (Rock & Gutman, 1981)
Inattentional Blindness • The background task is to report which of two arms of the + is longer. There is one critical trial per subject, after about 3 or 4 background trials. Another “critical” trial is presented as a divided-attention control. • 25% of subjects failed to see the square when it was presented in the parafovea (2° from fixation). • But 65% failed to see it when it was at fixation! • When the background task cross was made 10% as large, Inattentional Blindness increased from 25% to 66%. • Inattentional Blindness may be due to the concentration of attention on the primary task, or to the inhibition of non-attended regions or objects. Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Does inhibition play a role? Noticing odd stimuli when their location is pre-marked
Evidence of negative attention or inhibition? • The increase in inattentional blindness when there are markers may be due to the inhibition of markers (since they are irrelevant to the primary task). • Often attending to one set of things results in the active inhibition of things not attended (since they may be potentially disruptive).
Other examples of attentionally induced inhibition • Negative Priming (Treisman & DeSchepper, 1996). • Is there a figure on the right that is the same as the figure on the left? • When the figure on the left is one that had appeared as an ignored figure on the right, RT is long and accuracy poor. • This “negative priming” effect persisted over 200 intervening trials and lasted for a month!
Inhibition of return • If we vary the time between the cue and target in a modified Posner paradigm, we find that when the Cue-Target-Onset-Asynchrony (CTOA) gets to around 300-900 ms, reaction time to the target begins to increase. This is called Inhibition-of-return (Klein, 2000). • To get this effect we actually have to attract attention to the target location and then attract it back to the origin. IOR is one of many examples of an inhibition effect being produced by attention. Slowed detection due to Inhibition of Return
Inhibition of return aids “foraging” and search The observer fixated a small black disk at the center of the empty screen where the image to be searched (a picture from the Where’s Waldo series) was presented. After several saccades (illustrated in the top figure) as the observer searched for Waldo, the fixation stimulus reappeared at a specified location (black circles on the left) while the search stimulus remained (1b) or was removed (1c). The task was to foveate the target disk as rapidly as possible. Arrows illustrate a saccade to the target in the near (0°) condition. In one experiment this target was presented at the most recently fixated location or other locations around an equi-eccentric circle. Shown in Figure 1(d) are the data from the experiment in which the penultimate fixation (labeled 0°) was used to generate the target location (two back). Saccadic reaction time when the target was located by the first post-saccade (as in b and c) increased with increases in the target’s proximity to a previously fixated place, but only when the scene was maintained.
Exploring the limits of attention and the units over which selection operates • It appears that the human information-processing bottleneck cannot be expressed perspicuously in terms of information-theoretic measures, nor can it be specified in physical parameters (e.g., in terms of locations or spatio-temporal regions), although such measures often do capture important aspects of attention (e.g., visual attention often moves continuously through space). • But there are other possible ways one might consider expressing the limits of attention. • Over the past 25 years evidence has been accumulating that the human attention system is, at least in part, tuned to individual objects in the world. This would certainly make sense from an evolutionary perspective. But what does this mean?
Summary of what we have so far • We saw that visual representations must be conceptual for empirical and logical reasons • The empirical reasons derive in part from the nature of generalizations and errors of recall • The logical reason is that vision must interact with thoughts and lead to new beliefs and plans of action • We saw that a large part of vision is cognitively impenetrable and encapsulated and that cognition can only be brought to bear prior to or after its automatic operation: as attention or interpretation. • We saw that there are good design reasons for vision to be selective and we considered several bases for selection. But selection has turned out to be a more difficult question than it appeared: it consists in more than just filtering information down to a more manageable amount; it is also required for other reasons. These other reasons make it plausible that selection should operate over objects rather than bits of information in the Shannon sense.
The increasingly important role played by objects in studies of visual attention • Miller’s ‘Magic Number 7’ has continued to haunt us even beyond studies of short-term memory (STM). • There is a limitation in visual information processing that is beyond the limitation of acuity and of channel capacity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time. • The capacity to individuate is different from memory capacity and discrimination capacity. • This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision which we will explore later. A chunk is a relatively ill-defined notion in general whereas the units of visual attention are better thought of as objects