2010 Public Lecture – Allison Sekuler

Allison Sekuler

McMaster University

Allison Sekuler is Canada Research Chair in Cognitive Neuroscience, Professor of Psychology, Neuroscience & Behaviour, and Associate Vice-President and Dean of Graduate Studies at McMaster University, in Hamilton, Ontario. She received her B.A. with a joint degree in Mathematics and Psychology from Pomona College, and her Ph.D. in Psychology from the University of California, Berkeley. An outstanding teacher and internationally-recognized researcher, Dr. Sekuler has been recognized as an Alexander von Humboldt fellow and an Ontario Distinguished Researcher, and she was named one of Canada’s “Leaders of Tomorrow” in 2004. Her primary areas of research are vision science and cognitive neuroscience. Prof. Sekuler has served on numerous national and international boards in support of science, and is a former Treasurer and Member of the Board of Directors for the Vision Sciences Society. She is a passionate advocate for science outreach, frequently appearing in the media to discuss scientific issues, and currently representing the scientific community on the national Steering Committee for the Science Media Centre of Canada.

Vision and the Amazing, Changing, Aging Brain

Saturday, May 8, 2010, 10:00 – 11:30 am, Renaissance Academy of Florida Gulf Coast University

The “greying population” is the fastest growing group in North America. We know relatively little, however, about how aging affects critical functions such as vision and neural processing. For a long time, it was assumed that once we passed a certain age, the brain was essentially fixed, and could only deteriorate. But recent research shows that although aging leads to declines in some abilities, other abilities are spared and may even improve. This lecture will discuss the trade-offs in visual and neural processing that occur with age, and provide evidence that we really can teach older brains new tricks.

About the VSS Public Lecture

The annual public lecture represents the mission and commitment of the Vision Sciences Society to promote progress in understanding vision, and its relation to cognition, action and the brain. Education is basic to our science, and as scientists we are obliged to communicate the results of our work, not only to our professional colleagues but to the broader public. This lecture is part of our effort to give back to the community that supports us.

Jointly sponsored by VSS and the Renaissance Academy of Florida Gulf Coast University.

2010 Student Travel Awards

Jorge Almeida
Harvard University
Advisor: Ken Nakayama
Paola Binda
San Raffaele University
Advisors: M. Concetta Morrone and David C. Burr
Timothy Brady
Massachusetts Institute of Technology
Advisor: Aude Oliva
Thaddeus B. Czuba
University of Texas at Austin
Advisors: Lawrence K. Cormack and Alexander C. Huk
Maartje Cathelijne de Jong
Utrecht University
Advisors: Casper Erkelens and Raymond van Ee
Jeremy Freeman
New York University
Advisors: Eero P. Simoncelli and David J. Heeger
Koen Haak
University of Groningen
Advisor: Frans W. Cornelissen
Donatas Jonikaitis
Ludwig Maximilians University
Advisor: Heiner Deubel
Urs Kleinholdermann
University of Giessen
Advisor: Volker H. Franz
MiYoung Kwon
University of Minnesota
Advisor: Gordon E. Legge
Maro Machizawa
University College London
Advisor: Jon Driver
J. Patrick Mayo
University of Pittsburgh
Advisor: Marc Sommer
Rachel Millin
University of Southern California
Advisor: Bosco Tjan
Wei Song Ong
Advisor: James Bisley
Laura Pérez Zapata
University of Barcelona
Advisor: Hans Supèr
Yoni Pertzov
University of Jerusalem
Advisors: Ehud Zohary and Galia Avidan
Miranda Scolari
University of California, San Diego
Advisor: John Serences
Philip Tseng
University of California, Santa Cruz
Advisor: Bruce Bridgeman
Michael Vesia
York University
Advisors: Doug Crawford and Lauren Sergio
Greg West
University of Toronto
Advisor: Jay Pratt

2010 Young Investigator – George Alvarez

George Alvarez

Harvard University

The winner of the 2010 VSS Young Investigator Award is George Alvarez, Assistant Professor of Psychology at Harvard University. Alvarez has made exceptionally influential contributions to a number of research areas in vision and visual cognition. His work has uncovered principles that shape the efficient representation of information about objects and scenes in high level vision. He has also studied the way that high-level visual representations interact with attention and memory, revealing the functional organization and limitations of these processes. His work particularly illuminates the interfaces of vision, memory, and attention, systems that have classically been studied as separate entities. His creative experiments elegantly represent the diversity and vitality of the emerging field of visual cognition.

The Young Investigator Award will be presented before the VSS Keynote Address on Saturday, May 8th, at 7:45 pm, in the Royal Palm Ballroom at the Naples Grande Hotel.


2010 Keynote – Carla Shatz

Carla Shatz

Professor of Biology and Neurobiology; Director, Bio-X, Stanford University

Audio and slides from the 2010 Keynote Address are available on the Cambridge Research Systems website.

Releasing the Brake on Ocular Dominance Plasticity

Saturday, May 8, 2010, 7:45 pm, Royal Palm Ballroom 4-5

Connections in the adult visual system are highly precise, but they do not start out that way. Precision emerges during critical periods of development as synaptic connections remodel, a process requiring neural activity and involving regression of some synapses and strengthening and stabilization of others. Activity also regulates neuronal genes; in an unbiased PCR-based differential screen, we discovered unexpectedly that MHC Class I genes are expressed in neurons and are regulated by spontaneous activity and visual experience (Corriveau et al., 1998; Goddard et al., 2007). To assess requirements for MHCI in the CNS, mice lacking expression of specific MHCI genes were examined. Synapse regression in the developing visual system did not occur, synaptic strengthening was greater than normal in adult hippocampus, and ocular dominance (OD) plasticity in visual cortex was enhanced (Huh et al., 2000; Datwani et al., 2009). We searched for receptors that could interact with neuronal MHCI and carry out these activity-dependent processes. mRNA for PirB, an innate immune receptor, was found highly expressed in neurons in many regions of mouse CNS. We generated mutant mice lacking PirB function and discovered that OD plasticity is also enhanced (Syken et al., 2006), as is hippocampal LTP. Thus, MHCI ligands signaling via PirB receptor may function to “brake” activity-dependent synaptic plasticity. Together, results imply that these molecules, thought previously to function only in the immune system, may also act at neuronal synapses to limit how much, or perhaps how quickly, synapse strength changes in response to new experience. These molecules may be crucial for controlling circuit excitability and stability in developing as well as adult brain, and changes in their function may contribute to developmental disorders such as autism, dyslexia and even schizophrenia.

Supported by NIH Grants EY02858, MH071666, the Mathers Charitable Foundation and the Dana Foundation


Carla Shatz is professor of biology and neurobiology and director of Bio-X at Stanford University. Dr. Shatz’s research focuses on the development of the mammalian visual system, with an overall goal of better understanding critical periods of brain wiring and developmental disorders such as autism, dyslexia and schizophrenia, and of understanding how the nervous and immune systems interact. Dr. Shatz graduated from Radcliffe College in 1969 with a B.A. in Chemistry. She was honored with a Marshall Scholarship to study at University College London, where she received an M.Phil. in Physiology in 1971. In 1976, she received a Ph.D. in Neurobiology from Harvard Medical School, where she studied with Nobel Laureates David Hubel and Torsten Wiesel. During this period, she was appointed as a Harvard Junior Fellow. From 1976 to 1978 she obtained postdoctoral training with Dr. Pasko Rakic in the Department of Neuroscience, Harvard Medical School. In 1978, Dr. Shatz moved to Stanford University, where she attained the rank of Professor of Neurobiology in 1989. In 1992, she moved her laboratory to the University of California, Berkeley, where she was Professor of Neurobiology and an Investigator of the Howard Hughes Medical Institute. In 2000, she assumed the Chair of the Department of Neurobiology at Harvard Medical School as the Nathan Marsh Pusey Professor of Neurobiology. Dr. Shatz received the Society for Neuroscience Young Investigator Award in 1985, the Silvio Conte Award from the National Foundation for Brain Research in 1993, the Charles A. Dana Award for Pioneering Achievement in Health and Education in 1995, the Alcon Award for Outstanding Contributions to Vision Research in 1997, the Bernard Sachs Award from the Child Neurology Society in 1999, the Weizmann Institute Women and Science Award in 2000 and the Gill Prize in Neuroscience in 2006. In 1992, she was elected to the American Academy of Arts and Sciences, in 1995 to the National Academy of Sciences, in 1997 to the American Philosophical Society, and in 1999 to the Institute of Medicine. In 2009 she received the Salpeter Lifetime Achievement Award from the Society for Neuroscience.

New Methods for Delineating the Brain and Cognitive Mechanisms of Attention

Friday, May 7, 1:00 – 3:00 pm
Royal Ballroom 4-5

Organizer: George Sperling, University of California, Irvine

Presenters: Edgar DeYoe (Medical College of Wisconsin), Jack L. Gallant (University of California, Berkeley), Albert J. Ahumada (NASA Ames Research Center, Moffett Field CA 94035), Wilson S. Geisler (The University of Texas at Austin), Barbara Anne Dosher (University of California, Irvine), George Sperling (University of California, Irvine)

Symposium Description

This symposium brings together the world’s leading specialists in six different subareas of visual attention. These distinguished scientists will expose the audience to an enormous range of methods, phenomena, and theories. It’s not a workshop; listeners won’t learn how to use the methods described, but they will become aware of the existence of diverse methods and what can be learned from them. The participants will aim their talks to target VSS attendees who are not necessarily familiar with the phenomena and theories of visual attention but who can be assumed to have some rudimentary understanding of visual information processing. The talks should be of interest to and understandable by all VSS attendees who have an interest in visual information processing: students, postdocs, academic faculty, research scientists, clinicians, and the symposium participants themselves. Attendees will see examples of the remarkable insights achieved by carefully controlled experiments combined with computational modeling.

DeYoe reviews his extraordinary fMRI methods for localizing spatial visual attention in the visual cortex of alert human subjects to measure their “attention maps”. He shows in exquisite detail how top-down attention to local areas in visual space changes the BOLD response (an indicator of neural activity) in corresponding local areas of V1 and in adjacent spatiotopic visual processing areas. This work is of fundamental significance in defining the topography of attention and it has important clinical applications.

Gallant is the premier exploiter of natural images in the study of visual cortical processing. His work uses computational models to define the neural processes of attention in V4 and throughout the attention hierarchy. Gallant’s methods complement DeYoe’s in that they reveal functions and purposes of attentional processing that often are overlooked with the simple stimuli traditionally used.

Ahumada, who introduced the reverse correlation paradigm in vision science, here presents a model for the eye movements in perhaps the simplest search task (which happens also to have practical importance): the search for a small target near the horizon between ocean and sky. This is an introduction to the talk by Geisler.

Geisler continues the theme of attention as optimizing performance in complex tasks in studies of visual search. He presents a computational model for how attention and stimulus factors jointly control eye movements and search success in arbitrarily complex and difficult search tasks. Eye movements in visual search approach those of an ideal observer in making optimal choices given the available information, and observers adapt (learn) rapidly when the nature of the information changes.

Dosher has developed analytic descriptions of attentional processes that enable dissection of attention into three components: filter sharpening, stimulus enhancement, and altered gain control. She applies these analyses to show how subjects learn to adjust the components of attention to easy and to difficult tasks.

Sperling reviews the methods used to quantitatively describe spatial and temporal attention windows, and to measure the amplification of attended features. He shows that different forms of attention act independently.


I Know Where You Are Secretly Attending! The topography of human visual attention revealed with fMRI

Edgar DeYoe, Medical College of Wisconsin; Ritobrato Datta, Medical College of Wisconsin

Previous studies have described the topography of attention-related activation in retinotopic visual cortex for an attended target at one or a few locations within the subject’s field of view. However, a complete description for all locations in the visual field is lacking. In this human fMRI study, we describe the complete topography of attention-related cortical activation throughout the central 28° of visual field and compare it with previous models. We cataloged separate fMRI-based maps of attentional topography in medial occipital visual cortex when subjects covertly attended to each target location in an array of 3 concentric rings of 6 targets each. Attentional activation was universally highest at the attended target but spread to other segments in a manner depending on eccentricity and/or target size. We propose an “Attentional Landscape” model that is more complex than a ‘spotlight’ or simple ‘gradient’ model but includes aspects of both. Finally, we asked subjects to secretly attend to one of the 18 targets without informing the investigator. We then show that it is possible to determine the target of attentional scrutiny from the pattern of brain activation alone with 100% accuracy. Together, these results provide a comprehensive, quantitative and behaviorally relevant account of the macroscopic cortical topography of visuospatial attention. We also show the pattern of attentional enhancement as it would appear distributed within the observer’s field of view, thereby permitting direct observation of a neurophysiological correlate of a purely mental phenomenon, the “window of attention.”

Attentional modulation in intermediate visual areas during natural vision

Jack L. Gallant, University of California, Berkeley

Area V4 has been the focus of much research on neural mechanisms of attention. However, most of this work has focused on reduced paradigms involving simple stimuli such as bars and gratings, and simple behaviors such as fixation. The picture that has emerged from such studies suggests that the main effect of attention is to change response rate, response gain or contrast gain. In this talk I will review the current evidence regarding how neurons are modulated by attention under more natural viewing conditions involving complex stimuli and behaviors. The view that emerges from these studies suggests that attention operates through a variety of mechanisms that modify the way information is represented throughout the visual hierarchy. These mechanisms act in concert to optimize task performance under the demanding conditions prevailing during natural vision.

A model for search and detection of small targets

Albert J. Ahumada, NASA Ames Research Center, Moffett Field CA 94035

Computational models predicting the distribution of the time to detection of small targets on a display are being developed to improve workstation designs. Search models usually contain bottom-up processes, like a saliency map, and top-down processes, like a priori distributions over the possible locations to be searched. A case that needs neither of these features is the search for a very small target near the horizon when the sky and the ocean are clear. Our models for this situation have incorporated a saccade-distance penalty and inhibition-of-return with a temporal decay. For very small but high contrast targets, a simple detection model, in which the target is detected if it is foveated, is sufficient. For low contrast signals, a standard observer detection model with masking by the horizon edge is required. Accurate models of the search and detection process without significant expectations or stimulus attractors should make it easier to estimate the way in which the expectations and attractors are combined when they are included.

Ideal Observer Analysis of Overt Attention

Wilson S. Geisler, The University of Texas at Austin

In most natural tasks humans use information detected in the periphery, together with context and other task-dependent constraints, to select their fixation locations (i.e., the locations where they apply the specialized processing associated with the fovea). A useful strategy for investigating the overt-attention mechanisms that drive fixation selection is to begin by deriving appropriate normative (ideal observer) models. Such ideal observer models can provide a deep understanding of the computational requirements of the task, a benchmark against which to compare human performance, and a rigorous basis for proposing and testing plausible hypotheses for the biological mechanisms. In recent years, we have been investigating the mechanisms of overt attention for tasks in which the observer is searching for a known target randomly located in a complex background texture (nominally a background of filtered noise having the average power spectrum of natural images). This talk will summarize some of our earlier and more recent findings (for our specific search tasks): (1) practiced humans approach ideal search speed and accuracy, ruling out many sub-ideal models; (2) human eye movement statistics are qualitatively similar to those of the ideal searcher; (3) humans select fixation locations that make near optimal use of context (the prior over possible target locations); (4) humans show relatively rapid adaptation of their fixation strategies to simulated changes in their visual fields (e.g., central scotomas); (5) there are biologically plausible heuristics that approach ideal performance.

Attention in High Precision Tasks and Perceptual Learning

Barbara Anne Dosher, University of California, Irvine; Zhong-Lin Lu, University of Southern California

At any moment, the world presents far more information than the brain can process. Visual attention allows the effective selection of information relevant for high priority processing, and is often more easily focused on one object than two. Both spatial selection and object attention have important consequences for the accuracy of task performance. Such effects are historically assessed primarily for relatively “easy” lower-precision tasks, yet the role of attention can depend critically on the demand for fine, high precision judgments. High precision task performance generally depends more upon attention and attention affects performance across all contrasts with or without noisy stimuli. Low precision tasks with similar processing loads generally show effects of attention only at intermediate contrasts and may be restricted to noisy display conditions. Perceptual learning can reduce the costs of inattention. The different roles of attention and task precision are accounted for within the context of an elaborated perceptual template model of the observer showing distinct functions of attention, and providing an integrated account of performance as a function of attention, task precision, external noise and stimulus contrast. Taken together, these provide a taxonomy of the functions and mechanisms of visual attention.

Modeling the Temporal, Spatial, and Featural Processes of Visual Attention

George Sperling, University of California, Irvine

A whirlwind review of the methods used to quantitatively define the temporal, spatial, and featural properties of attention, and some of their interactions. The temporal window of attention is measured by moving attention from one location to another in which a rapid sequence of different items (e.g., letters or numbers) is being presented. The probability of items from that sequence entering short-term memory defines the time course of attention: typically 100 msec to window opening, maximum at 300-400 msec, and 800 msec to closing. Spatial attention is defined like acuity, by the ability to alternately attend and ignore strips of increasingly finer grids. The spatial frequency characteristic so measured then predicts achievable attention distributions to arbitrarily defined regions. Featural attention is defined by the increased salience of items that contain to-be-attended features. This can be measured in various ways; quickest is an ambiguous motion task which shows that attended features have 30% greater salience than neutral features. Spatio-temporal interaction is measured when attention moves as quickly as possible to a designated area. Attention moves in parallel to all the to-be-attended areas, i.e., temporal-spatial independence. Independence of attentional modes is widely observed; it allows the most efficient neural processing.

Integrative mechanisms for 3D vision: combining psychophysics, computation and neuroscience

Friday, May 7, 1:00 – 3:00 pm
Royal Ballroom 1-3

Organizer: Andrew Glennerster, University of Reading

Presenters: Roland W. Fleming (Max Planck Institute for Biological Cybernetics), James T Todd (Department of Psychology, Ohio State University), Andrew Glennerster (University of Reading), Andrew E Welchman (University of Birmingham), Guy A Orban (K.U. Leuven), Peter Janssen (K.U. Leuven)

Symposium Description

Estimating the three-dimensional (3D) structure of the world around us is a central component of our everyday behavior, supporting our decisions, actions and interactions. The problem faced by the brain is classically described in terms of the difficulty of inferring a 3D world from (“ambiguous”) 2D retinal images. The computational challenge of inferring 3D depth from retinal samples requires sophisticated neural machinery that learns to exploit multiple sources of visual information that are diagnostic of depth structure. This sophistication at the input level is demonstrated by our flexibility in perceiving shape under radically different viewing situations. For instance, we can gain a vivid impression of depth from a sparse collection of seemingly random dots, as well as from flat paintings. Adding to the complexity, humans exploit depth signals for a range of different behaviors, meaning that the input complexity is compounded by multiple functional outputs. Together, this poses a significant challenge when seeking to investigate empirically the sequence of computations that enable 3D vision.

This symposium brings together speakers from different perspectives to outline progress in understanding 3D vision. Fleming will start, addressing the question of “What is the information?”, using computational analysis of 3D shape to highlight basic principles that produce depth signatures from a range of cues. Todd and Glennerster will both consider the question of “How is this information represented?”, discussing different types of representational schemes and data structures. Welchman, Orban and Janssen will focus on the question of “How is it implemented in cortex?”. Welchman will discuss human fMRI studies that integrate psychophysics with concurrent measures of brain activity. Orban will review fMRI evidence for spatial correspondence in the processing of different depth cues in the human and monkey brain. Janssen will summarize results from single cell electrophysiology, highlighting the similarities and differences between the processing of 3D shape at the extreme ends of the dorsal and ventral pathways. Finally, Glennerster, Orban and Janssen will all address the question of how depth processing is affected by task.

The symposium should attract a wide range of VSS participants, as the topic is a core area of vision science and is enjoying a wave of public enthusiasm with the revival of stereoscopic entertainment formats. Further, the goal of the session in linking computational approaches to behavior to neural implementation is one that is scientifically attractive.


From local image measurements to 3D shape

Roland W. Fleming, Max Planck Institute for Biological Cybernetics

There is an explanatory gap between the simple local image measurements of early vision, and the complex perceptual inferences involved in estimating object properties such as surface reflectance and 3D shape.  The main purpose of my presentation will be to discuss how populations of filters tuned to different orientations and spatial frequencies can be ‘put to good use’ in the estimation of 3D shape.  I’ll show how shading, highlights and texture patterns on 3D surfaces lead to highly distinctive signatures in the local image statistics, which the visual system could use in 3D shape estimation.  I will discuss how the spatial organization of these measurements provides additional information, and argue that a common front end can explain both similarities and differences between various monocular cues.  I’ll also present a number of 3D shape illusions and show how these can be predicted by image statistics, suggesting that human vision does indeed make use of these measurements.

The perceptual representation of 3D shape

James T Todd, Department of Psychology, Ohio State University

One of the fundamental issues in the study of 3D surface perception is to identify the specific aspects of an object’s structure that form the primitive components of an observer’s perceptual knowledge. After all, in order to understand shape perception, it is first necessary to define what “shape” is. In this presentation, I will assess several types of data structures that have been proposed for representing 3D surfaces. One of the most common data structures employed for this purpose involves a map of the geometric properties in each local neighborhood, such as depth, orientation or curvature. Numerous experiments have been performed in which observers have been required to make judgments of local surface properties, but the results reveal that these judgments are most often systematically distorted relative to the ground truth and surprisingly imprecise, thus suggesting that local property maps may not be the foundation of our perceptual knowledge about 3D shape. An alternative type of data structure for representing 3D shape involves a graph of the configural relationships among qualitatively distinct surface features, such as edges and vertices. The psychological validity of this type of representation has been supported by numerous psychophysical experiments, and by electrophysiological studies of macaque IT. A third type of data structure will also be considered in which surfaces are represented as a tiling of qualitatively distinct regions based on their patterns of curvature, and there is some neurophysiological evidence to suggest that this type of representation occurs in several areas of the primate cortex.

View-based representations and their relevance to human 3D vision

Andrew Glennerster, School of Psychology and CLS, University of Reading

In computer vision, applications that previously involved the generation of 3D models can now be achieved using view-based representations. In the movie industry this makes sense, since both the inputs and outputs of the algorithms are images, but the same could also be argued of human 3D vision. We explore the implications of view-based models in our experiments.

In an immersive virtual environment, observers fail to notice the expansion of a room around them and consequently make gross errors when comparing the size of objects. This result is difficult to explain if the visual system continuously generates a 3-D model of the scene using known baseline information from interocular separation or proprioception. If, on the other hand, observers use a view-based representation to guide their actions, they may have an expectation of the images they will receive but be insensitive to the rate at which images arrive as they walk.

In the same context, I will discuss psychophysical evidence on sensitivity to depth relief with respect to surfaces. The data are compatible with a hierarchical encoding of position and disparity similar to the affine model of Koenderink and van Doorn (1991).  Finally, I will discuss two experiments that show how changing the observer’s task changes their performance in a way that is incompatible with the visual system storing a 3D model of the shape or location of objects. Such task-dependency indicates that the visual system maintains information in a more ‘raw’ form than a 3D model.

The functional roles of visual cortex in representing 3D shape

Andrew E Welchman, School of Psychology, University of Birmingham

Estimating the depth structure of the environment is a principal function of the visual system, enabling many key computations, such as segmentation, object recognition, material perception and the guidance of movements. The brain exploits a range of cues to estimate depth, combining information from shading, shadows, linear perspective, motion and binocular disparity. Despite the importance of this process, we still know relatively little about the functional roles of different cortical areas in processing depth signals in the human brain. Here I will review recent human fMRI work that combines established psychophysical methods, high resolution imaging and advanced analysis methods to address this question. In particular, I will describe fMRI paradigms that integrate psychophysical tasks in order to look for a correspondence between changes in behavioural performance and fMRI activity. Further, I will review information-based fMRI analysis methods that seek to investigate different types of depth representation in parts of visual cortex. This work suggests a key role for a confined ensemble of dorsal visual areas in processing information relevant to judgments of 3D shape.

Extracting depth structure from multiple cues

Guy A Orban, K.U. Leuven

Multiple cues provide information about the depth structure of objects: disparity, motion, shading and texture. Functional imaging studies in humans have been performed to localize the regions involved in extracting depth structure from these four cues. In all these studies extensive controls were used to obtain activation sites specific for depth structure. Depth structure from motion, stereo and texture activates regions in both parietal and ventral cortex, but shading only activates a ventral region. For stereo and motion the balance between dorsal and ventral activation depends on the type of stimulus: boundaries versus surfaces. In monkeys, results are similar to those obtained in humans except that motion is a weaker cue in monkey parietal cortex. At the single cell level, neurons are selective for gradients of speed, disparity and texture. Neurons selective for first and second order gradients of disparity will be discussed by P Janssen. I will concentrate on neurons selective for speed gradients and review recent data indicating that a majority of FST neurons is selective for second order speed gradients.

Neurons selective to disparity defined shape in the temporal and parietal cortex

Peter Janssen, K.U. Leuven; Bram-Ernst Verhoef, K.U. Leuven

A large proportion of the neurons in the rostral lower bank of the Superior Temporal Sulcus, which is part of IT, respond selectively to disparity-defined 3D shape (Janssen et al., 1999; Janssen et al., 2000). These IT neurons preserve their selectivity for different positions-in-depth, which proves that they respond to the spatial variation of disparity along the vertical axis of the shape (higher-order disparity selectivity). We have studied the responses of neurons in parietal area AIP, the end stage of the dorsal visual stream and crucial for object grasping, to the same disparity-defined 3D shapes (Srivastava et al., 2009). In this presentation I will review the differences between IT and AIP in the neural representation of 3D shape. More recent studies have investigated the role of AIP and IT in the perceptual discrimination of 3D shape using simultaneous recordings of spikes and local field potentials in the two areas, psychophysics and reversible inactivations. AIP and IT show strong synchronized activity during 3D-shape discrimination, but only IT activity correlates with perceptual choice. Reversible inactivation of AIP produces a deficit in grasping but does not affect the perceptual discrimination of 3D shape. Hence the end stages of both the dorsal and the ventral visual stream process disparity-defined 3D shape in clearly distinct ways. In line with the proposed behavioral role of the two processing streams, the 3D-shape representation in AIP is action-oriented but not crucial for 3D-shape perception.


Representation in the Visual System by Summary Statistics

Friday, May 7, 3:30 – 5:30 pm
Royal Ballroom 1-3

Organizers: Ruth Rosenholtz, MIT Department of Brain & Cognitive Sciences

Presenters: Ruth Rosenholtz (MIT Department of Brain & Cognitive Sciences), Josh Solomon (City University London), George Alvarez (Harvard University, Department of Psychology), Jeremy Freeman (Center for Neural Science, New York University), Aude Oliva (Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology), Ben Balas (MIT, Department of Brain and Cognitive Sciences)

Symposium Description

What is the representation in early vision?  Considerable research has demonstrated that the representation is not equally faithful throughout the visual field; representation appears to be coarser in peripheral and unattended vision, perhaps as a strategy for dealing with an information bottleneck in visual processing.  In the last few years, a convergence of evidence has suggested that in peripheral and unattended regions, the information available consists of summary statistics.  “Summary statistics” is a general term used to represent a class of measurements made by pooling over visual features of various levels of complexity, e.g. 1st order statistics such as mean orientation; joint statistics of responses of V1-like oriented feature detectors; or ensemble statistics that represent spatial layout information.  Depending upon the complexity of the computed statistics, many attributes of a pattern may be perceived, yet precise location and configuration information is lost in favor of the statistical summary.

This proposed representation for early vision is related to suggestions that the brain can compute summary statistics when such statistics are useful for a given task, e.g. texture segmentation, or explicit judgments of mean size of a number of items.  However, summary statistic models of early visual representation additionally suggest that under certain circumstances summary statistics are what the visual system is “stuck with,” even if more information would be useful for a given task.

This symposium will cover a range of related topics and methodologies.  Talks by Rosenholtz, Solomon, and Alvarez will examine evidence for a statistical representation in vision, and explore the capabilities of the system, using both behavioral experiments and computational modeling.    Freeman will discuss where summary statistics might be computed in the brain, based upon a combination of physiological findings, fMRI, and behavioral experiments.   Finally, we note that a summary statistic representation captures a great deal of important information, yet is ultimately lossy.  Such a representation in peripheral and/or unattended vision has profound implications for visual perception in general, from peripheral recognition through visual awareness and visual cognition.  Rosenholtz, Oliva, and Balas will discuss implications for a diverse set of tasks, including peripheral recognition, visual search, visual illusions, scene perception, and visual cognition.  The power of this new way of thinking about vision becomes apparent precisely from implications for a wide variety of visual tasks, and from evidence from diverse methodologies.


The Visual System as Statistician: Statistical Representation in Early Vision

Ruth Rosenholtz, MIT Department of Brain & Cognitive Sciences; B. J. Balas, Dept. of Brain & Cognitive Sciences, MIT; Alvin Raj, Computer Science and AI Lab, MIT; Lisa Nakano, Stanford; Livia Ilie, MIT

We are unable to process all of our visual input with equal fidelity.  At any given moment, our visual systems seem to represent the item we are looking at fairly faithfully.  However, evidence suggests that our visual systems encode the rest of the visual input more coarsely.  What is this coarse representation?  Recent evidence suggests that this coarse encoding consists of a representation in terms of summary statistics.  For a complex set of statistics, such a representation can provide a rich and detailed percept of many aspects of a visual scene.  However, such a representation is also lossy; we would expect the inherent ambiguities and confusions to have profound implications for vision.  For example, a complex pattern, viewed peripherally, might be poorly represented by its summary statistics, leading to the degraded recognition experienced under conditions of visual crowding.  Difficult visual search might occur when summary statistics could not adequately discriminate between a target-present and distractor-only patch of the stimuli.  Certain illusory percepts might arise from valid interpretations of the available – lossy – information.  It is precisely visual tasks upon which a statistical representation has significant impact that provide the evidence for such a representation in early vision.  I will summarize recent evidence that early vision computes summary statistics based upon such tasks.

Efficiencies for estimating mean orientation, mean size, orientation variance and size variance

Josh Solomon, City University London; Michael J. Morgan, City University London; Charles Chubb, University of California, Irvine

The merest glance is usually sufficient for an observer to get the gist of a scene. That is because the visual system statistically summarizes its input.  We are currently exploring the precision and efficiency with which orientation and size statistics can be calculated. Previous work has established that orientation discrimination is limited by an intrinsic source of orientation-dependent noise, which is approximately Gaussian. New results indicate that size discrimination is also limited by approximately Gaussian noise, which is added to logarithmically transduced circle diameters. More preliminary results include: 1a) JAS can discriminate between two successively displayed, differently oriented Gabors, at 7 deg eccentricity, without interference from 7 iso-eccentric, randomly oriented distractors. 1b) He and another observer can discriminate between two successively displayed, differently sized circles, at 7 deg eccentricity, without much interference from 7 iso-eccentric distractors. 2a) JAS effectively uses just two of the eight uncrowded Gabors when computing their mean orientation. 2b) He and another observer use at most four of the eight uncrowded circles when computing their mean size. 3a) Mean-orientation discriminations suggest a lot more Gaussian noise than orientation-variance discriminations. This surprising result suggests that cyclic quantities like orientation may be harder to remember than non-cyclic quantities like variance. 3b) Consistent with this hypothesis is the greater similarity between noise estimates from discriminations of mean size and size variance.

The Representation of Ensemble Statistics Outside the Focus of Attention

George Alvarez, Harvard University, Department of Psychology

We can only attend to a few objects at once, and yet our perceptual experience is rich and detailed. What type of representation could enable this subjective experience? I have explored the possibility that perception consists of (1) detailed and accurate representations of currently attended objects, plus (2) a statistical summary of information outside the focus of attention. This point of view makes a distinction between individual features and statistical summary features. For example, a single object’s location is an individual feature. In contrast, the center of mass of several objects (the centroid) is a statistical summary feature, because it collapses across individual details and represents the group overall. Summary statistics are more accurate than individual features because random, independent noise in the individual features cancels out when averaged together. I will present evidence that the visual system can compute statistical summary features outside the focus of attention even when local features cannot be accurately reported. This finding holds for simple summary statistics including the centroid of a set of uniform objects, and for texture patterns that resemble natural image statistics. Thus, it appears that information outside the focus of attention can be represented at an abstract level that lacks local detail, but nevertheless carries a precise statistical summary of the scene. The term ‘ensemble features’ refers to a broad class of statistical summary features, which we propose collectively comprise the representation of information outside the focus of attention (i.e., under conditions of reduced attention).
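The claim that averaging cancels independent noise can be illustrated with a small simulation. The sketch below is not the author's method; it is a minimal illustration under stated assumptions: one-dimensional positions, eight items, zero-mean Gaussian noise of equal standard deviation on every item, and hypothetical names (`simulate_errors`, `noise_sd`). Under these assumptions the error of the centroid estimate should be close to the single-item error divided by the square root of the number of items.

```python
import random
import statistics


def simulate_errors(n_items=8, noise_sd=1.0, n_trials=20000, seed=0):
    """Return (single_item_rmse, centroid_rmse) for noisy position estimates.

    Assumption (illustrative only): each item's position is corrupted by
    independent, zero-mean Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    single_sq_errors = []
    centroid_sq_errors = []
    for _ in range(n_trials):
        # True 1-D positions of the items on this trial.
        true_positions = [rng.uniform(-10.0, 10.0) for _ in range(n_items)]
        # Noisy internal estimate of each individual position.
        noisy_positions = [p + rng.gauss(0.0, noise_sd) for p in true_positions]
        # Error when reporting one individual item.
        single_sq_errors.append((noisy_positions[0] - true_positions[0]) ** 2)
        # Error when reporting the centroid (mean position) of all items.
        centroid_error = statistics.fmean(noisy_positions) - statistics.fmean(true_positions)
        centroid_sq_errors.append(centroid_error ** 2)
    single_rmse = statistics.fmean(single_sq_errors) ** 0.5
    centroid_rmse = statistics.fmean(centroid_sq_errors) ** 0.5
    return single_rmse, centroid_rmse


if __name__ == "__main__":
    single_rmse, centroid_rmse = simulate_errors()
    print(f"single-item RMSE: {single_rmse:.3f}")
    print(f"centroid RMSE:    {centroid_rmse:.3f}")
```

The advantage of the summary statistic in this sketch depends entirely on the noise being independent across items; correlated noise would not average out in the same way.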

Linking statistical texture models to population coding in the ventral stream

Jeremy Freeman, Center for Neural Science, New York University; Luke E. Hallum, Center for Neural Science & Dept. of Psychology, NYU; Michael S. Landy, Center for Neural Science & Dept. of Psychology, NYU; David J. Heeger, Center for Neural Science & Dept. of Psychology, NYU; Eero P. Simoncelli, Center for Neural Science, Howard Hughes Medical Institute, & the Courant Institute of Mathematical Sciences, NYU

How does the ventral visual pathway encode natural images? Directly characterizing neuronal selectivity has proven difficult: it is hard to find stimuli that drive an individual cell in the extrastriate ventral stream, and even having done so, it is hard to find a low-dimensional parameter space governing its selectivity. An alternative approach is to examine the selectivity of neural populations for images that differ statistically (e.g. in Rust & DiCarlo, 2008). We develop a model of extrastriate populations that compute correlations among the outputs of V1-like simple and complex cells at nearby orientations, frequencies, and positions (Portilla & Simoncelli, 2001). These correlations represent the complex structure of visual textures: images synthesized to match the correlations of an original texture image appear texturally similar. We use such synthetic textures as experimental stimuli. Using fMRI and classification analysis, we show that population responses in extrastriate areas are more variable across different textures than across multiple samples of the same texture, suggesting that neural representations in ventral areas reflect the image statistics that distinguish natural textures. We also use psychophysics to explore how the representation of these image statistics varies over the visual field. In extrastriate areas, receptive field sizes grow with eccentricity. Consistent with recent work by Balas et al. (2009), we model this by computing correlational statistics averaged over regions corresponding to extrastriate receptive fields. This model synthesizes metameric images that are physically different but appear identical because they are matched for local statistics. Together, these results show how physiological and psychophysical measurements can be used to link image statistics to population representations in the ventral stream.

High level visual ensemble statistics: Encoding the layout of visual space

Aude Oliva, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology

Visual scene understanding is central to our interactions with the world. Recognizing the current environment facilitates our ability to act strategically, for example in selecting a route for walking, anticipating where objects are likely to appear, and knowing what behaviors are appropriate in a particular context. In this talk, I will discuss a role for statistical, ensemble representations in scene and space representation. Ensemble features correspond to a higher-level description of the input that summarizes local measurements. With this ensemble representation, the distribution of local features can be inferred and used to reconstruct multiple candidate visual scenes that share similar ensemble statistics. Pooling over local measurements of visual features in natural images is one mechanism for generating a holistic representation of the spatial layout of natural scenes. A model based on such summary representation is able to estimate scene layout properties as humans do.  Potentially, the richness of content and spatial volume in a scene can be at least partially captured using the compressed yet informative description of statistical ensemble representations.

Beyond texture processing: further implications of statistical representations

Ben Balas, MIT, Department of Brain and Cognitive Sciences; Ruth Rosenholtz, MIT; Alvin Raj, MIT

The proposal that peripherally-viewed stimuli are represented by summary statistics of visual structure has implications for a wide range of tasks.  Already, my collaborators and I have demonstrated that texture processing, crowding, and visual search appear to be well-described by such representations, and we suggest that it may be fruitful to significantly extend the scope of our investigations into the affordances and limitations of a “statistical” vocabulary. Specifically, we submit that many tasks that have been heretofore described broadly as “visual cognition” tasks may also be more easily understood within this conceptual framework. How do we determine whether an object lies within a closed contour or not? How do we judge if an unobstructed path can be traversed between two points within a maze? What makes it difficult to determine the impossibility of “impossible” objects under some conditions? These specific tasks appear to be quite distinct, yet we suggest that what they share is a common dependence on the visual periphery that constrains task performance by the imposition of a summary-statistic representation of the input. Here, we shall re-cast these classic problems of visual perception within the context of a statistical representation of the stimulus and discuss how our approach offers fresh insight into the processes that support performance in these tasks and others.


Dissociations between top-down attention and visual awareness

Friday, May 7, 3:30 – 5:30 pm
Royal Ballroom 6-8

Organizers: Jeroen van Boxtel, California Institute of Technology and Nao Tsuchiya, California Institute of Technology, USA and Tamagawa University, Japan

Presenters: Nao Tsuchiya (California Institute of Technology, USA, Tamagawa University, Japan), Jeroen J.A. van Boxtel (California Institute of Technology, USA), Takeo Watanabe (Boston University), Joel Voss (Beckman Institute, University of Illinois Urbana-Champaign, USA), Alex Maier (National Institute of Mental Health, NIH)

Symposium Description

Historically, the pervading assumption among sensory psychologists has been that attention and awareness are intimately linked, if not identical, processes. However, a number of recent authors have argued that these are two distinct processes, with different functions and underlying neuronal mechanisms. If this position were correct, we should be able to dissociate the effects of attention and awareness with some experimental manipulation.  Furthermore, we might expect extreme cases of dissociation, such as when attention and awareness have opposing effects on some task performance and its underlying neuronal activity. In the last decade, a number of findings have been taken as support for the notion that attention and awareness are distinct cognitive processes.  In our symposium, we will review some of these results and introduce psychophysical methods to manipulate top-down attention and awareness independently.  Throughout the symposium, we showcase the successful application of these methods to human psychophysics, fMRI and EEG as well as monkey electrophysiology.

First, Nao Tsuchiya will set the stage for the symposium by offering a brief review of recent psychophysical studies that support the idea of awareness without attention as well as attention without awareness.  After discussing some of the methodological limitations of these approaches, Jeroen van Boxtel will show direct evidence that attention and awareness can result in opposite effects for the formation of afterimages. Takeo Watanabe’s behavioral paradigm will demonstrate that subthreshold motion can be more distracting than suprathreshold motion.  He will go on to show the neuronal substrate of this counter-intuitive finding with fMRI.  Joel Voss will describe how perceptual recognition memory can occur without awareness following manipulations of attention, and how these effects result from changes in the fluency of neural processing in visual cortex measured by EEG.  Finally, Alexander Maier will link these results in the human studies to neuronal recordings in monkeys, where the attentional state and the visibility of a stimulus are manipulated independently in order to study the neuronal basis of each.

A major theme of our symposium is that emerging evidence supports the notion that attention and awareness are two distinctive neuronal processes.  Throughout the symposium, we will discuss how dissociative paradigms can lead to new progress in the quest for the neuronal processes underlying attention and awareness.  We emphasize that it is important to separate out the effects of attention from the effects of awareness.  Our symposium would benefit most vision scientists interested in visual attention or visual awareness, because the methodologies we discuss would inform them of paradigms that can dissociate attention from awareness. Given the novelty of these findings, our symposium will cover a terrain that remains largely untouched by the main program.


The relationship between top-down attention and conscious awareness

Nao Tsuchiya, California Institute of Technology, USA, Tamagawa University, Japan

Although the claim that attention and awareness are different has been made before, it has been difficult to show clear dissociations due to their tight coupling in normal situations; top-down attention and visibility of the stimulus both improve performance in most visual tasks. As proposed in this workshop, however, a putative difference in their functional and computational roles implies that attention and awareness may affect visual processing in different ways.  After a brief discussion of the functional and computational roles of attention and awareness, we will introduce psychophysical methods that independently manipulate visual awareness and spatial, focal top-down attention, and review the recent studies showing consciousness without attention and attention without consciousness.

Opposing effects of attention and awareness on afterimages

Jeroen J.A. van Boxtel, California Institute of Technology, USA

The brain’s ability to handle sensory information is influenced by both selective attention and awareness. There is still no consensus on the exact relationship between these two processes and whether or not they are distinct. So far, no experiment has simultaneously manipulated both, which severely hampers discussions on this issue. We here describe a full factorial study of the influences of attention and awareness (as assayed by visibility) on afterimages. We investigated the duration of afterimages for all four combinations of high versus low attention and visible versus invisible grating. We demonstrate that selective attention and visual awareness have opposite effects: paying attention to the grating decreases the duration of its afterimage, while consciously seeing the grating increases afterimage duration. We moreover control for various possible confounds, including stimulus and task changes. These data provide clear evidence for distinctive influences of selective attention and awareness on visual perception.

Role of subthreshold stimuli in task-performance and its underlying mechanism

Takeo Watanabe, Boston University

Considerable evidence indicates that a stimulus which is subthreshold, and thus consciously invisible, influences brain activity and behavioral performance. However, it is not clear how subthreshold stimuli are processed in the brain. We found that a task-irrelevant subthreshold coherent motion leads to stronger disturbance in task performance than suprathreshold motion. With the subthreshold motion, fMRI activity in the visual cortex was higher, but activity in the dorsolateral prefrontal cortex (DLPFC) was lower, than with suprathreshold motion. The results of the present study demonstrate two important points. First, a weak task-irrelevant stimulus feature which is below but near the perceptual threshold more strongly activates the visual area (MT+) that is highly related to the stimulus feature, and more greatly disrupts task performance. This contradicts the general view that irrelevant signals that are stronger in stimulus properties more greatly influence the brain and performance and that the influence of a subthreshold stimulus is smaller than that of suprathreshold stimuli. Second, the results may reveal important bidirectional interactions between a cognitive controlling system and the visual system. LPFC, which has been suggested to provide inhibitory control on task-irrelevant signals, may have a higher detection threshold for incoming signals than the visual cortex. Task-irrelevant signals around the threshold level may be sufficiently strong to be processed in the visual system but not strong enough for LPFC to “notice” and, therefore, to provide effective inhibitory control on the signals. In this case, such signals may remain uninhibited, take more resources for a task-irrelevant distractor, leave fewer resources for a given task, and disrupt task performance more than a suprathreshold signal. On the other hand, suprathreshold coherent motion may be “noticed”, receive successful inhibitory control from LPFC, and leave more resources for a task. This mechanism may underlie the present paradoxical finding that subthreshold task-irrelevant stimuli activate the visual area strongly and disrupt task performance more, and could also be one of the reasons why subthreshold stimuli tend to lead to relatively robust effects.

Implicit recognition: Implications for the study of attention and awareness

Joel Voss, Beckman Institute, University of Illinois Urbana-Champaign, USA

Recognition memory is generally accompanied by awareness, such as when a person recollects details about a prior event or feels that a previously encountered face is familiar. Moreover, recognition is usually benefited by attention. I will describe a set of experiments that yielded unprecedented dissociations between recognition, attention, and awareness. These effects were produced by carefully selecting experimental parameters to minimize contributions from memory encoding and retrieval processes that normally produce awareness, such as semantic elaboration. Fractal images were viewed repeatedly, and repeat images could be discriminated from novel images that were perceptually similar. Discrimination responses were highly accurate even when subjects reported no awareness of having seen the images previously, a phenomenon we describe as implicit recognition. Importantly, implicit recognition was dissociated from recognition accompanied by awareness based on differences in the relationship between confidence and accuracy for each memory type. Diversions of attention at encoding greatly increased the prevalence of implicit recognition. Electrophysiological responses obtained during memory testing showed that implicit recognition was based on similar neural processes as implicit priming. Both implicit recognition and implicit priming for fractal images included repetition-induced reductions in the magnitude of neural activity in visual cortex, an indication of visual processing fluency.  These findings collectively indicate that attention during encoding biases the involvement of different memory systems. High attention recruits medial temporal structures that promote memory with awareness whereas low attention yields cortical memory representations that are independent from medial temporal contributions, such that implicit recognition can result.

Selective attention and perceptual suppression independently modulate contrast change detection

Alex Maier, National Institute of Mental Health, NIH

Visual awareness bears a complex relationship to selective attention, with some evidence suggesting they can be operationally dissociated (Koch & Tsuchiya 2007). As a first step in the neurophysiological investigation of this dissociation, we developed a novel paradigm that allows for the independent manipulation of visual attention and stimulus awareness in nonhuman primates using a cued perceptual suppression paradigm. We trained two macaque monkeys to detect a slight decrement in stimulus contrast occurring at random time intervals. This change was applied to one of eight isoeccentric sinusoidal grating stimuli with equal probability. In 80% of trials a preceding cue at the fixation spot indicated the correct position of the contrast change. Previous studies in humans demonstrated that such cuing leads to increased selective attention under similar conditions (Posner et al. 1984). In parallel with behavioral cuing, we used binocular rivalry flash suppression (Wolfe 1984) to render the attended stimuli invisible on half the trials. The combined paradigm allows for independent assessment of the effects of spatial attention and perceptual suppression on the detection threshold of the contrast decrement, as well as on neural responses. Our behavioral results suggest that the visibility of the decrement is affected independently by attention and perceptual state. We will present preliminary electrophysiological data from early visual cortex that suggest independent contributions of these two factors to the modulation of neural responses to a visual stimulus.

