University of Minnesota, Department of Psychology
Advisors: Paul Schrater, Daniel Kersten
University of California, Davis
Advisor: David Whitney
Humboldt-University Berlin, Germany
Advisor: Werner Sommer
Dartmouth College, Psychological and Brain Sciences
Advisor: Peter Tse
Brandeis University, Department of Psychology
Advisor: Robert Sekuler
|Sung Jun Joo
Yonsei University, Graduate Program in Cognitive Science
Advisor: Sang Chul Chong
Vanderbilt University, Department of Psychology
Advisor: Randolph Blake
Advisor: Ladan Shams
MIT, McGovern Institute for Brain Research
Advisors: Nancy Kanwisher, Daniel D. Dilks
University of Minnesota, Department of Psychology, Eye & ENT hospital of Fudan University, Department of Ophthalmology
Advisor: Sheng He
University of California San Diego, Salk Institute and Graduate Program in Neurosciences
Advisor: Rich Krauzlis
Hebrew University of Jerusalem, Department of Neurobiology, Life Sciences Institute
Advisor: Ehud Zohary
Wake Forest Medical Center
Advisor: Christos Constantinidis
University of Southern California, Department of Psychology
Advisor: Bosco S. Tjan
University of Birmingham, School of Psychology
Advisor: Zoe Kourtzi
University of Rochester, Center for Visual Science, Department of Brain and Cognitive Sciences
Advisors: Dana Ballard, Mary Hayhoe
University of Giessen, Germany
Advisor: Julia Trommershäuser
|Yetta Kwailing Wong
Advisor: Isabel Gauthier
Duke University, Department of Neurobiology
Advisor: David Fitzpatrick
University of Washington, Cognition and Perception, Psychology Dept.
Advisor: John Palmer
Dr. David Whitney
Department of Psychology and Center for Mind & Brain, University of California, Davis
Dr. David Whitney has been chosen as this year's recipient of the VSS Young Investigator Award in recognition of the extraordinary breadth and quality of his research. Using behavioral and fMRI measures in human subjects, Dr. Whitney has made significant contributions to the study of motion perception, perceived object location, crowding and the visual control of hand movements. His research is representative of the diversity and creativity associated with the best work presented at VSS.
The Young Investigator Award will be presented at the Keynote Address on Saturday, May 10, at 7:00 pm.
Edward Callaway, Ph.D., Systems Neurobiology Laboratories, Salk Institute
Unraveling fine-scale and cell-type specificity of visual cortical circuits
Audio and slides from the 2008 Keynote Address are available on the Cambridge Research Systems website.
Larry Abbott, Co-Director, Center for Theoretical Neuroscience, Columbia University School of Medicine
Enhancement of visual processing by spontaneous neural activity
Audio and slides from the 2007 Keynote Address are available on the Cambridge Research Systems website.
David R. Williams, Ph.D., William G. Allyn Professor of Medical Optics; Director, Center for Visual Science, University of Rochester
The Limits of Human Vision
Irene Pepperberg, Department of Psychology, Brandeis University
Action for perception: functional significance of eye movements for vision
Friday, May 9, 2008, 3:30 – 5:30 pm Orchid 1
Organizers: Anna Montagnini (Institut de Neurosciences Cognitives de la Méditerranée) and Miriam Spering (Justus-Liebig University Giessen, Germany)
Presenters: Maria Concetta Morrone (Facoltà di Psicologia, Università Vita-Salute S. Raffaele, Milano, Italy), Tirin Moore (Stanford University School of Medicine, USA), Michele Rucci (Boston University), Miriam Spering (Justus-Liebig University Giessen, Germany; New York University), Ziad Hafed (Systems Neurobiology Laboratory, Salk Institute), Wilson S. Geisler (University of Texas, Austin)
When we view the world around us, our eyes are constantly in motion.
Different types of eye movements are used to bring the image of an object of interest onto the fovea, to keep it stable on this high-resolution area of the retina, or to avoid visual fading. Moment by moment, eye movements change the retinal input to the visual system of primates, thereby determining what we see. This critical role of eye movements is now widely acknowledged, and closely related to a research program termed "Active Vision" (Findlay & Gilchrist, 2003).
While eye movements improve vision, they might also come at a cost.
Voluntary eye movements can impair perception of objects, space and time, and affect attentional processing. When using eye movements as a sensitive tool to infer visual and cognitive processing, these constraints have to be taken into account.
The proposed symposium responds to an increasing interest within the vision sciences in using eye movements. The aims of the symposium are (i) to review and discuss findings related to perceptual consequences of eye movements, (ii) to introduce new methodological approaches that take into account these consequences, and (iii) to encourage vision scientists to focus on the dynamic interplay between vision and oculomotor behavior.
The symposium spans a wide area of research on visuomotor interaction, and brings to the table junior and senior researchers from different disciplines, studying different types of eye movements and perceptual behaviors. All speakers are at the forefront of research in vision and brain sciences and have made significant contributions to the understanding of the questions at hand, using a variety of methodological approaches.
Concetta Morrone (Università Vita-Salute, Italy) reviews findings on the perisaccadic compression of space and time, and provides a Bayesian model for these perceptual phenomena. Tirin Moore (Stanford University, USA) discusses the neural mechanisms of perisaccadic changes in visual and attentional processing. Michele Rucci (Boston University, USA) argues for an increase in spatial sensitivity due to involuntary miniature eye movements during fixation, which are optimized for the statistics of natural scenes.
Miriam Spering (University of Giessen, Germany) focuses on the relationship between smooth pursuit eye movements and the ability to perceive and predict visual motion. Ziad Hafed (Salk Institute, USA) discusses the effect of eye movements on object perception, pointing out an intriguing role of oculomotor control for visual optimization. Wilson Geisler (University of Texas, USA) uses ideal-observer analysis to model the selection of fixation locations across a visual scene, demonstrating the high degree of efficiency in human visuomotor strategy.
The topic of this symposium is at the same time of general interest and of specific importance. It should attract at least three groups of VSS attendees: those interested in low-level visual perception, those interested in motor behavior, and those using eye movements as a tool. We expect to attract both students, seeking an introduction to the topic, and faculty, looking for up-to-date insights. It will be beneficial for VSS to include a symposium devoted to the dynamic and interactive link between visual perception and oculomotor behavior.
Perception of space and time during saccades: a Bayesian explanation for perisaccadic distortions
Maria Concetta Morrone, Paola Binda and David Burr
During a critical period around the time of saccades, briefly presented stimuli are grossly mislocalized in space and time, and both relative distances and durations appear strongly compressed. We investigated whether the Bayesian hypothesis of optimal sensory fusion could account for some of the mislocalizations, taking advantage of the fact that auditory stimuli are unaffected by saccades. For spatial localization, vision usually dominates over audition during fixation (the "ventriloquist effect"); but during perisaccadic presentations, auditory localization becomes relatively more important, so the mislocalized visual stimulus is seen closer to its veridical position. Both the perceived position of the bimodal stimuli and the time-course of spatial localization were well predicted by assuming optimal Bayesian-like combination of visual and auditory signals. For time localization, acoustic signals always dominate. However, this dominance does not affect the dynamics of saccadic mislocalization, suggesting that audio-visual capture occurs after saccadic remapping. Our model simulates the time-course data, assuming that position in external space is given by the sum of retinal position and a noisy eye-position signal, obtained by integrating the output of two neural populations, one centered at the current point of gaze, the other centered at the future point of gaze. Only later is the output signal fused with the auditory signal, demonstrating that some saccadic distortions take place very early in visual analysis.
This model not only accounts for the bizarre perceptual phenomena caused by saccades, but provides a novel vision-based account of peri-saccadic remapping of space.
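The optimal-fusion rule at the heart of this account is simple enough to sketch. Below is a minimal Python illustration of reliability-weighted (inverse-variance) combination of a visual and an auditory position estimate; all variance values are illustrative assumptions, not the paper's fitted parameters:

```python
def fuse(mu_v, var_v, mu_a, var_a):
    """Optimal (inverse-variance weighted) fusion of two Gaussian cues."""
    w_v = (1.0 / var_v) / (1.0 / var_v + 1.0 / var_a)
    mu = w_v * mu_v + (1.0 - w_v) * mu_a
    var = 1.0 / (1.0 / var_v + 1.0 / var_a)  # fused estimate is more reliable
    return mu, var

# During fixation vision is precise, so the fused estimate sits near the
# visual flash (the ventriloquist effect).
mu_fix, _ = fuse(mu_v=0.0, var_v=1.0, mu_a=5.0, var_a=16.0)

# Perisaccadically, visual localization becomes noisy; the same rule now
# weights audition more, pulling the estimate toward the auditory location.
mu_sacc, _ = fuse(mu_v=0.0, var_v=16.0, mu_a=5.0, var_a=16.0)
```

The single change in visual variance reproduces the qualitative shift described in the abstract: the perisaccadic estimate moves toward the (veridical) auditory position.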
Neural mechanisms and correlates of perisaccadic changes in visual perception
The changes in visual perception that accompany saccadic eye movements, including shifts of attention and saccadic suppression, are well documented in psychophysical studies. However, the neural basis of these changes is poorly understood. Recent evidence suggests that interactions of oculomotor mechanisms with visual cortical representations may provide a basis for modulations of visual signals and visual perception described during saccades. I will discuss some recent neurophysiological experiments that address the impact of oculomotor mechanisms, and of saccade preparation, on the filtering of visual signals within cortex. Results from these experiments relate directly to the observed enhancement and suppression of visual perception during saccades.
Fixational eye movements, natural image statistics, and fine spatial vision
During visual fixation, small eye movements continually displace the stimulus on the retina. It is known that visual percepts tend to fade when retinal image motion is eliminated in the laboratory. However, it has long been debated whether, during natural viewing, fixational eye movements have other functions besides preventing the visual scene from fading. In this talk, I will summarize a theory for the existence of fixational eye movements, which links the physiological instability of visual fixation to the statistics of natural scenes. According to this theory, fixational eye movements contribute to the neural encoding of natural scenes by attenuating input redundancy and emphasizing the elements of the stimulus that cannot be predicted from the statistical properties of natural images. To test some of the predictions of this theory, we developed a new method of retinal image stabilization, which enables selective elimination of the motion of the retinal image during natural intersaccadic fixation. We show that fixational eye movements facilitate the discrimination of high spatial frequency patterns masked by low spatial frequency noise, as predicted by our theory.
These results suggest a contribution of fixational eye movements to the processing of spatial detail, a proposal originally advanced by Hering in 1899.
Motion perception and prediction during smooth pursuit eye movements
Miriam Spering, Alexander C. Schütz and Karl R. Gegenfurtner
Smooth pursuit eye movements are slow, voluntary movements of the eyes that serve to hold the retinal image of a moving object close to the fovea. Most research on the interaction of visual perception and oculomotor action has focused on the question of what visual input drives the eye best, and what this tells us about visual processing for eye movement control. Here we take a different route and discuss findings on perceptual consequences of pursuit eye movements. Our recent research has particularly focused on the interaction between pursuit eye movements and motion sensitivity in different tasks and visual contexts. (i) We report findings from a situation that particularly requires the dissociation between retinal image motion due to eye movements and retinal object motion. A moving object has to be tracked across a dynamically changing moving visual context, and object motion has to be estimated. (ii) The ability to predict the trajectory of a briefly presented moving object is compared during pursuit and fixation for different target presentation durations. (iii) We compare the sensitivity to motion perturbations in the peripheral visual context during pursuit and fixation. Results imply that pursuit consequences are optimally adapted to contextual requirements.
Looking at visual objects
Much of our understanding about the brain mechanisms for controlling how and where we look derives from minimalist behavioral tasks relying on simple spots of light as the potential targets. However, visual targets in natural settings are rarely individual, point-like sources of light. Instead, they are typically larger visual objects that may or may not contain explicit features to look at. In this presentation, I will argue that the use of more complex, and arguably more “natural”, visual stimuli than are commonly used in oculomotor research is important for learning the extent to which eye movements can serve visual perception. I will provide an example of this by describing a behavioral phenomenon in which the visual system consistently fails in interpreting a retinal stimulus as containing coherent objects when this stimulus is not accompanied by an ongoing eye movement. I will then shed light on an important node in the brain circuitry involved in the process of looking at visual objects. Specifically, I will show that the superior colliculus (SC), best known for its motor control of saccades, provides a neural “pointer” for the location of a visual object, independent of the object’s individual features and distinct from the motor commands associated with this brain structure. Such a pointer allows the oculomotor system to precisely direct gaze, even in the face of large extended objects.
More importantly, because the SC also provides ascending signals to sensory areas, such a pointer may also be involved in modulating object-based attention and perception.
Mechanisms of fixation selection evaluated using ideal observer analysis
Wilson S. Geisler
The primate visual system combines a wide field of view with a high resolution fovea and uses saccadic eye movements to direct the fovea at potentially relevant locations in visual scenes. This is a sensible design for a visual system with limited neural resources. However, to be effective this design requires sophisticated task-dependent mechanisms for selecting fixation locations. I will argue that in studying the brain mechanisms that control saccadic eye movements in specific tasks, it can be very useful to consider how fixations would be selected by an ideal observer. Such an ideal-observer analysis provides: (i) insight into the information processing demands of the task, (ii) a benchmark against which to evaluate the actual eye movements of the organism, (iii) a starting point for formulating hypotheses about the underlying brain mechanisms, and (iv) a benchmark against which to evaluate the efficiency of hypothesized brain mechanisms. In making the case, I will describe recent examples from our lab concerning naturalistic visual-search tasks and scene-encoding tasks.
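The core of an ideal-observer analysis of fixation selection can be sketched compactly. The Python toy below picks the fixation that maximizes expected detectability under a prior over target locations; the visibility fall-off, the grid, and all parameter values are illustrative assumptions, not Geisler's fitted model:

```python
import numpy as np

# Candidate target locations along a line, with a prior belief over them.
locs = np.linspace(-10.0, 10.0, 41)
prior = np.ones_like(locs) / locs.size

def visibility(ecc, d0=4.0, e_half=3.0):
    # Assumed fall-off of detectability (d') with retinal eccentricity.
    return d0 / (1.0 + ecc / e_half)

def expected_gain(fix):
    # Expected detectability from fixating `fix`: each location's visibility,
    # weighted by the current belief that the target is there.
    return float(np.sum(prior * visibility(np.abs(locs - fix))))

# The ideal fixation maximizes expected gain; with a uniform prior it lands
# at the center of the candidate region.
best_fix = locs[int(np.argmax([expected_gain(f) for f in locs]))]
```

In a real search task the prior is updated after each fixation from the evidence gathered so far, and the same maximization is repeated, yielding a full fixation sequence.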
Bayesian models applied to perceptual behavior
Friday, May 9, 2008, 3:30 – 5:30 pm Royal Palm 4
Organizer: Peter Battaglia (University of Minnesota)
Presenters: Alan Yuille (University of California Los Angeles), David Knill (University of Rochester), Paul Schrater (University of Minnesota), Tom Griffiths (University of California, Berkeley), Konrad Koerding (Northwestern University), Peter Battaglia (University of Minnesota)
This symposium will provide information and methodological tools for researchers who are interested in modeling perception as probabilistic inference, but are unfamiliar with the practice of such techniques. In the last 20 years, scientists characterizing perception as Bayesian inference have produced a number of robust models that explain observed perceptual behaviors and predict new, unobserved behaviors. Such successes are due to the formal, universal language of Bayesian models and the powerful hypothesis-evaluation tools they allow. Yet many researchers who attempt to build and test Bayesian models feel overwhelmed by the potentially steep learning curve and abandon their attempts after stumbling over unintuitive obstacles. It is important that those scientists who recognize the explanatory power of Bayesian methods and wish to implement the framework in their own research have the tools, and know-how to use them, at their disposal. This symposium will provide a gentle introduction to the most important elements of Bayesian models of perception, while avoiding the nuances and subtleties that are not critical. The symposium will be geared toward senior faculty and students alike, and will require no technical prerequisites to understand the major concepts, and only knowledge of basic probability theory and experimental statistics to apply the methods. Those comfortable with Bayesian modeling may find the symposium interesting, but the target audience will be the uninitiated.
The formalism of Bayesian models allows a principled description of the processes that allow organisms to recover scene properties from sensory measurements, thereby enabling a clear statement of experimental hypotheses and their connections with related theories. Many people believe Bayesian modeling is primarily for fitting unpleasant data using a prior: this is a misconception that will be dealt with! In previous attempts to correct such notions, most instruction about probabilistic models of perception falls into one of two categories: qualitative, abstract description, or quantitative, technical application. This symposium constitutes a hybrid of these categories by phrasing qualitative descriptions in quantitative formalism. Intuitive and familiar examples will be used so the connection between abstract and practical issues remains clear.
The goals of this symposium are two-fold: to present the most current and important ideas involving probabilistic perceptual models, and provide hands-on experience working with them. To accomplish these goals, our speakers will address topics such as the history and motivation for probabilistic models of perception, the relation between sensory uncertainty and probability-theoretic representations of variability, the brain's assumptions about how the world causes sensory measurements, how to investigate the brain's internal knowledge of probability, framing psychophysical tasks as perceptually-guided decisions, and hands-on modeling tutorials presented as Matlab scripts that will be made available for download beforehand so those with laptops can follow along. Each talk will link the conceptual material to the scientific interests of the audience by presenting primary research and suggesting perceptual problems that are ripe for the application of Bayesian methods.
Modeling Vision as Bayesian Inference: Is it Worth the Effort?
The idea of perception as statistical inference grew out of work in the 1950s in the context of a general theory of auditory and visual signal detectability. Signal detection theory from the start used concepts and tools from Bayesian Statistical Decision theory that are with us today: 1) a generative model that specifies the probability of sensory data conditioned on signal states; 2) prior probabilities of those states; 3) the utility of decisions or actions as they depend on those states. By the 1990s, statistical inference models were being extended to an increasingly wider set of problems, including object and motion perception, perceptual organization, attention, reading, learning, and motor control. These applications have relied in part on the development of new concepts and computational methods to analyze and model more realistic visual tasks. I will provide an overview of current work, describing some of the success stories. I will try to identify future challenges for testing and modeling theories of visual behavior: research that will require learning and computing probabilities on more complex, structured representations.
Bayesian modeling in the context of robust cue integration
Building Bayesian models of visual perception is becoming increasingly popular in our field. Those of us who make a living constructing and testing Bayesian models are often asked the question, “What good are models that can be fit to almost any behavioral data?” I will address this question in two ways: first by acknowledging the ways in which Bayesian modeling can be misused, and second by outlining how Bayesian modeling, when properly applied, can enhance our understanding of perceptual processing. I will use robust cue integration as an example to illustrate some ways in which Bayesian modeling helps organize our understanding of the factors that determine perceptual performance, makes predictions about performance, and generates new and interesting questions about perceptual processes. Robust cue integration characterizes the problem of how the brain integrates information from different sensory cues that have unnaturally large conflicts. To build a Bayesian model of cue integration, one must explicitly model the world processes that give rise to such conflicting cues. When combined with models of internal sensory noise, such models predict behaviors that are consistent with human performance. While we can “retro-fit” the models to the data, the real test of our models is whether they agree with what we know about sensory processing and the structure of the environment (though mismatches may invite questions ripe for future research). At their best, such models help explain how perceptual behavior relates to the computational structure of the problems observers face and the constraints imposed by sensory mechanisms.
Bayesian models for sequential decisions
Performing common perceptually-guided actions, like saccades and reaches, requires our brains to overcome uncertainty about the objects and geometry relevant to our actions (world state), potential consequences of our actions, and individual rewards attached to these consequences. A principled approach to such problems is termed “stochastic-optimal control”, and uses Bayesian inference to simultaneously update beliefs about the world state, action consequences, and individual rewards. Rational agents seek rewards, and since rewards depend on the consequences of actions, and those consequences depend on the world state, updating beliefs about all three is necessary to acquire the most reward possible.
Consider the example of reaching to grasp your computer mouse while viewing your monitor. Some strategies and outcomes for guiding your reach include: 1.) keeping your eyes fixed, moving quickly, and probably missing the mouse, 2.) keeping your eyes fixed, moving slowly, and wasting time reaching, 3.) turning your head, staring at the mouse, wasting time moving your head, or 4.) quickly saccading toward the mouse, giving you enough positional information to make a fast reach without wasting much time. This example highlights the kind of balance perceptually-guided actions strike thousands of times a day: scheduling information-gathering and action-execution when there are costs (e.g. time, missing the target) attached. Using the language of stochastic-optimal control, tradeoffs like these can be formally characterized and explain otherwise opaque behavioral decisions. My presentation will introduce stochastic-optimal control theory, and show how applying the basic principles offers a powerful framework for describing and evaluating perceptually-guided action.
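The trade-off in the reaching example can be made concrete with a toy expected-utility calculation. All hit probabilities, time costs, and utilities below are invented for illustration; full stochastic-optimal control additionally models belief updating over time, which this sketch omits:

```python
# Four strategies for reaching toward the mouse (from the example above),
# each with an assumed probability of success and an assumed time cost.
strategies = {
    "fixate, fast reach":    {"p_hit": 0.30, "time_s": 0.4},
    "fixate, slow reach":    {"p_hit": 0.90, "time_s": 2.0},
    "turn head, then reach": {"p_hit": 0.95, "time_s": 1.6},
    "saccade, then reach":   {"p_hit": 0.90, "time_s": 0.8},
}

REWARD = 1.0     # utility of grasping the mouse
TIME_COST = 0.3  # utility lost per second spent

def expected_utility(s):
    # Expected reward minus the cost of the time the strategy consumes.
    return s["p_hit"] * REWARD - TIME_COST * s["time_s"]

best_strategy = max(strategies, key=lambda k: expected_utility(strategies[k]))
```

Under these assumed numbers the quick saccade followed by a fast reach wins, matching the intuition in the abstract: it buys most of the accuracy of foveating the target at a fraction of the time cost.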
Exploring subjective probability distributions using Bayesian statistics
Bayesian models of cognition and perception express the expectations of learners and observers in terms of subjective probability distributions – priors and likelihoods. This raises an interesting psychological question: if human inferences adhere to the principles of Bayesian statistics, how can we identify the subjective probability distributions that guide these inferences? I will discuss two methods for exploring subjective probability distributions. The first method is based on evaluating human judgments against distributions provided by the world. The second substitutes people for elements in randomized algorithms that are commonly used to generate samples from probability distributions in Bayesian statistics. I will show how these methods can be used to gather information about the priors and likelihoods that seem to characterize human judgments.
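The second method can be sketched in code. If a simulated "judge" chooses between the current item and a proposal with probability proportional to subjective density, this is a Barker-style acceptance rule, so the resulting chain samples from that subjective distribution. The Gaussian "subjective" density below is a stand-in assumption for whatever distribution a real participant carries in their head:

```python
import math
import random

random.seed(1)

def subjective_density(x):
    # Stand-in for the hidden subjective distribution (here a Gaussian
    # centered at 2.0 with unit standard deviation).
    return math.exp(-0.5 * (x - 2.0) ** 2)

def judge(current, proposal):
    # Luce-choice "participant": pick the proposal with probability
    # proportional to its subjective density.  With a symmetric proposal
    # this Barker rule satisfies detailed balance, so the chain's
    # stationary distribution is the subjective one.
    p = subjective_density(proposal) / (
        subjective_density(proposal) + subjective_density(current))
    return proposal if random.random() < p else current

x, samples = 0.0, []
for _ in range(20000):
    x = judge(x, x + random.gauss(0.0, 1.0))  # symmetric random-walk proposal
    samples.append(x)

mean_estimate = sum(samples) / len(samples)  # approaches the subjective mean
```

In the real experiments, of course, the choices are made by human participants rather than by `subjective_density`, and the recovered samples reveal their priors.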
Causal inference in multisensory perception
Perceptual events derive their significance to an animal from their meaning about the world, that is from the information they carry about their causes. The brain should thus be able to efficiently infer the causes underlying our sensory events. Here we use multisensory cue combination to study causal inference in perception. We formulate an ideal-observer model that infers whether two sensory cues originate from the same location and that also estimates their location(s). This model accurately predicts the nonlinear integration of cues by human subjects in two auditory-visual localization tasks. The results show that indeed humans can efficiently infer the causal structure as well as the location of causes. By combining insights from the study of causal inference with the ideal-observer approach to sensory cue combination, we show that the capacity to infer causal structure is not limited to conscious, high-level cognition; it is also performed continually and effortlessly in perception.
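The ideal-observer computation for inferring a common cause can be written out directly. The sketch below follows the standard generative model (one shared source versus two independent sources, each location drawn from a zero-mean Gaussian prior); all variances and the prior probability of a common cause are illustrative assumptions:

```python
import math

def likelihood_common(x_v, x_a, var_v, var_a, var_p):
    # p(x_v, x_a | C = 1): both cues come from one source whose location is
    # drawn from a zero-mean Gaussian prior with variance var_p; the source
    # location has been integrated out analytically.
    d = var_v * var_a + var_v * var_p + var_a * var_p
    num = (x_v - x_a) ** 2 * var_p + x_v ** 2 * var_a + x_a ** 2 * var_v
    return math.exp(-0.5 * num / d) / (2.0 * math.pi * math.sqrt(d))

def likelihood_separate(x_v, x_a, var_v, var_a, var_p):
    # p(x_v, x_a | C = 2): two independent sources, each with the same prior.
    d = (var_v + var_p) * (var_a + var_p)
    num = x_v ** 2 / (var_v + var_p) + x_a ** 2 / (var_a + var_p)
    return math.exp(-0.5 * num) / (2.0 * math.pi * math.sqrt(d))

def prob_common(x_v, x_a, var_v=1.0, var_a=4.0, var_p=100.0, p_c=0.5):
    # Posterior probability that the two cues share a cause (Bayes' rule).
    l1 = likelihood_common(x_v, x_a, var_v, var_a, var_p)
    l2 = likelihood_separate(x_v, x_a, var_v, var_a, var_p)
    return l1 * p_c / (l1 * p_c + l2 * (1.0 - p_c))
```

Small audio-visual discrepancies yield a high posterior probability of a common cause (and hence strong integration), while large discrepancies drive it toward zero, producing the nonlinear integration described in the abstract.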
How to: Applying a Bayesian model to a perceptual question
Bayesian models provide a powerful language for describing and evaluating hypotheses about perceptual behaviors. When implemented properly they allow strong conclusions about the brain's perceptual solutions in determining what caused incoming sensory information. Unfortunately, constructing a Bayesian model may seem challenging and perhaps "not worth the trouble" to those who are not intimately familiar with the practice. Even with a clear Bayesian model, it is not always obvious how experimental data should be used to evaluate the model's parameters. This presentation will demystify the process by walking through the modeling and analysis using a simple, relevant example of a perceptual behavior.
First I will introduce a familiar perceptual problem and describe the choices involved in formalizing it as a Bayesian model. Next, I will explain how standard experimental data can be exploited to reveal model parameter values and how the results of multiple experiments may be unified to fully evaluate the model. The presentation will be structured as a tutorial that will use Matlab scripts to simulate the generation of sensory data, the brain's hypothetical inference procedure, and the quantitative analysis of this hypothesis. The scripts will be made available beforehand so the audience has the option of downloading and following along to enhance the hands-on theme. My goal is that interested audience members will be able to explore the scripts at a later time to familiarize themselves more thoroughly with a tractable modeling and analysis process.
Friday, May 9, 2008, 1:00 – 3:00 pm Royal Palm 5
Organizer: Denis G. Pelli (New York University)
Presenters: Patrick Cavanagh (Harvard University and LPP, Université Paris Descartes), Brad C. Motter (Veterans Affairs Medical Center and SUNY Upstate Medical University), Yury Petrov (Northeastern University), Joshua A. Solomon (City University, London), Katharine A. Tillman (New York University)
Crowding is a breakdown of object recognition. It happens when the visual system inappropriately integrates features over too large an area, coming up with an indecipherable jumble instead of an object. An explosion of new experiments exploits crowding to study object recognition by breaking it. The five speakers will review past work, providing a tutorial introduction to crowding, and will describe the latest experiments seeking to define the limits of crowding and object recognition. The general question, which encompasses "integration", "binding", "segmentation", "grouping", "contour integration", and "selective attention", is a burning issue for most members of VSS.
Crowding: When grouping goes wrong
Early visual processes work busily to construct accurate representations of edges, colors and other features that appear within their receptive fields, dutifully posting their details across the retinotopic landscape of early cortices. Then the fat hand of attention makes a grab at a target and comes up with an indecipherable stew of everything in the region. Well, that's one model of crowding. There are others. Whatever the model of crowding, it is clear that the phenomenon provides a rare window onto the mid-level process of feature integration. I will present results on nonretinotopic crowding and anticrowding that broaden the range of phenomena we include in the category of crowding.
Correlations between visual search and crowding
Brad C. Motter
Visual search through simple stimulus arrays can be described as a linear function of the angular separation between the target and surrounding items after scaling for cortical magnification. Maximum reading speeds as a function of eccentricity also appear to be bound by a cortical magnification factor. If crowding can explain these visual behaviors, what is the role of focal attention in these findings?
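The cortical-magnification scaling invoked above can be illustrated with a standard inverse-linear estimate for human V1. The constants below (roughly 17.3 mm of cortex per degree at an eccentricity offset of 0.75 deg) are one published approximation; treat both the functional form and the values as assumptions:

```python
def cortical_magnification(ecc_deg):
    # Inverse-linear estimate of linear cortical magnification in human V1,
    # in mm of cortex per degree of visual angle (constants are one common
    # published approximation, not a universal value).
    return 17.3 / (ecc_deg + 0.75)

def cortical_separation_mm(sep_deg, ecc_deg):
    # Approximate cortical distance corresponding to an angular separation
    # between a target and a flanking item at a given eccentricity.
    return sep_deg * cortical_magnification(ecc_deg)
```

The point of the scaling is that a fixed angular separation maps onto far less cortex in the periphery than near the fovea, which is why search time and reading speed track cortically scaled, rather than angular, distances.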
Locus of spatial attention determines inward-outward anisotropy in crowding
I show that the locus of spatial attention strongly affects crowding, inducing inward-outward anisotropy in some conditions, removing or reversing it in others. It appears that under normal viewing conditions attention is mislocalized outward of the target, which may explain stronger crowding by an outward mask.
Context-induced acuity loss for tilt: If it is not crowding, what is it?
Joshua A. Solomon and Michael J. Morgan
When other objects are nearby, it becomes more difficult to determine whether a particular object is tilted, for example, clockwise or anti-clockwise of vertical. "Crowding" is similar: when other letters are nearby, it becomes more difficult to determine the identity of a particular letter or whether it is, for example, upside down or mirror-reversed. There is one major difference between these two phenomena. The former occurs with big objects in the centre of the visual field; the latter does not. We call the former phenomenon "squishing." Two mechanisms have been proposed to explain it: lateral inhibition and stochastic re-calibration. Simple models based on lateral inhibition cannot explain why nearby objects impair tilt acuity but not contrast discrimination, but a new comparison of acuities measured with the Method of Single Stimuli and 2-Alternative Forced-Choice does not support models based on stochastic re-calibration. Lateral inhibition deserves re-consideration. Network simulations suggest that many neurones capable of contrast discrimination have little to contribute towards tilt identification and vice versa.
The uncrowded window for object recognition
Katharine A. Tillman and Denis G. Pelli
It has been known throughout history that we cannot see things that are too small. However, it is now emerging that vision is usually not limited by object size, but by spacing. The visual system recognizes an object by detecting and then combining its features. When objects are too close together, the visual system combines features from them all, producing a jumbled percept. This phenomenon is called crowding. Critical spacing is the smallest distance between objects that avoids crowding. We review the explosion of studies of crowding (in grating discrimination, letter and face recognition, visual search, and reading) to reveal a universal law, the Bouma law: critical spacing is proportional to distance from fixation, depending only on where (not what) the object is. Observers can identify objects only in the uncrowded window within which object spacing exceeds critical spacing. The uncrowded window limits reading rate and explains why we can recognize a face only if we look directly at it. Visual demonstrations allow the audience to verify key experimental results.
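The Bouma law is simple enough to state as code. The proportionality constant of about 0.5 used below is the classic value; the exact number varies across observers and tasks:

```python
def critical_spacing(eccentricity_deg, bouma=0.5):
    # Bouma law: critical spacing grows in proportion to the target's
    # distance from fixation (the ~0.5 constant is the classic estimate).
    return bouma * eccentricity_deg

def is_crowded(spacing_deg, eccentricity_deg):
    # An object is crowded when its neighbors fall inside critical spacing.
    return spacing_deg < critical_spacing(eccentricity_deg)

# Letters 1 deg apart are readable at 1 deg eccentricity but crowded at
# 4 deg, where critical spacing has grown to ~2 deg.
```

This single proportionality is what defines the "uncrowded window": for any fixed inter-object spacing, there is an eccentricity beyond which recognition fails.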
Perceptual expectations and the neural processing of complex images
Friday, May 9, 2008, 1:00 – 3:00 pm Royal Palm 6-8
Organizer: Bharathi Jagadeesh (University of Washington)
Presenters: Moshe Bar (Harvard Medical School), Bharathi Jagadeesh (University of Washington), Nicholas Furl (University College London), Valentina Daelli (SISSA), Robert Shapley (New York University)
The processing of complex images occurs within the context of prior expectations and of current knowledge about the world. A clue about an image, “think of an elephant”, for example, can cause an otherwise nonsensical image to transform into a meaningful percept. The informative clue presumably activates the neural substrate of an expectation about the scene that allows the visual stimulus representation to be more readily interpreted. In this symposium we aim to discuss the neural mechanisms that underlie the use of clues and context to assist in the interpretation of ambiguous stimuli. The work of five laboratories, using imaging, single-unit recording, MEG, psychophysics, and network models of visual processing, all shows evidence of the impact of prior knowledge on the processing of visual stimuli.
In the work of Bar, we see evidence that a short-latency neural response may be induced in higher-level cortical areas by complex signals traveling through a fast visual pathway. This pathway may provide the neural mechanism that modifies the processing of visual stimuli as they stream through the brain. In the work of Jagadeesh, we see a potential effect of that modified processing: neural selectivity in inferotemporal cortex is sufficient to explain performance in a classification task with difficult-to-classify complex images, but only when the images are evaluated in a particular framed context: Is the image A or B (where A and B are photographs, for example of a horse and a giraffe)? In the work of Furl, human subjects were asked to classify individual exemplars of faces along a particular dimension (emotion), and had prior experience with the images in the form of an adapting stimulus. In this context, classification is shifted away from the adapting stimulus. Simultaneously recorded MEG activity shows evidence of a reentrant signal, induced by the prior experience of the prime, that could explain the shift in classification. In the work of Treves, we see examples of networks that reproduce the observed late convergence of neural activity onto the response to an image stored in memory, and that can simulate mechanisms possibly underlying predictive behavior. Finally, in the work of Shapley, we see that simple cells in layer 2/3 of V1 (a major input layer for intra-cortical connections) paradoxically show dynamic nonlinearities.
The presence of a dynamic nonlinearity in the responses of V1 simple cells indicates that first-order analyses often capture only a fraction of neuronal behavior, a consideration with wide-ranging implications for the analysis of visual responses in more advanced cortical areas. Signals provided by expectation might influence processing throughout the visual system to bias the perception and neural processing of the visual stimulus in the context of that expectation.
The work to be described is of significant scientific merit and reflects recent developments in the field. It is original, forcing re-examination of the traditional view of vision as a method of extracting information from the visual scene in the absence of contextual knowledge, a topic of broad interest to those studying visual perception.
The proactive brain: using analogies and associations to generate predictions
It is proposed that, rather than passively ‘waiting’ to be activated by sensations, the human brain is continuously busy generating predictions that approximate the relevant future. Building on previous work, this proposal posits that rudimentary information is extracted rapidly from the input to derive analogies linking that input with representations in memory.
The linked stored representations then activate the associations that are relevant in the specific context, which provides focused predictions. These predictions facilitate perception and cognition by pre-sensitizing relevant representations. Predictions regarding complex information, such as those required in social interactions, integrate multiple analogies. This cognitive neuroscience framework can help explain a variety of phenomena, ranging from recognition to first impressions, and from the brain’s ‘default mode’ to a host of mental disorders.
Neural selectivity in inferotemporal cortex during active classification of photographic images
Images in the real world are not classified or categorized in the absence of expectations about what we are likely to see. For example, giraffes are quite unlikely to appear in one’s environment except in Africa. Thus, when an image is viewed, it is viewed within the context of possibilities about what is likely to appear. Classification occurs within limited expectations about what has been asked about the images. We have trained monkeys to answer questions about ambiguous images in a constrained context: is the image A or B, where A and B are pictures from the visual world, like a giraffe or a horse. We recorded responses in inferotemporal cortex both while the task is performed and while the same images are merely viewed. When we record neural responses to these images while the monkey is required to ask (and answer) this simple question, neural selectivity in IT is sufficient to explain behavior. When the monkey views the same stimuli in the absence of this framing context, the neural responses are insufficiently selective to explain the separately collected behavior. These data suggest that when the monkey is asked a very specific and limited question about a complex image, IT cortex is selective in exactly the right way to perform the task well. We propose this match between the needs of the task and the responses in IT results from predictions, generated in other brain areas, which enhance the relevant IT representations.
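Whether neural selectivity is “sufficient to explain behavior” is commonly assessed with an ideal-observer (ROC) analysis. The sketch below illustrates that logic with simulated firing rates; all numbers are invented, and this is not claimed to be the lab’s exact analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

def roc_area(resp_a, resp_b):
    """Probability that an ideal observer labels a trial correctly (A vs. B)
    from a single neural response: the Mann-Whitney U statistic."""
    a = np.asarray(resp_a, float)[:, None]
    b = np.asarray(resp_b, float)[None, :]
    return (a > b).mean() + 0.5 * (a == b).mean()

# Invented firing rates (spikes/s): the framed task context sharpens
# selectivity relative to passive viewing of the same images.
task_a, task_b = rng.normal(20, 5, 1000), rng.normal(12, 5, 1000)
view_a, view_b = rng.normal(16, 5, 1000), rng.normal(14, 5, 1000)
```

With these invented numbers, single-trial responses in the task context support classification well above 80% correct, while the passive-viewing responses support performance only slightly above chance, mirroring the sufficiency argument in the abstract.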
Experience-based coding in categorical face perception
One fundamental question in vision science concerns how neural activity produces everyday perceptions. We explore the relationship between neural codes capturing deviations from experience and the perception of visual categories. An intriguing paradigm for studying the role of short-term experience in categorical perception is face adaptation aftereffects – where perception of ambiguous faces morphed between two category prototypes (e.g., two facial identities or expressions) depends on which category was experienced during a recent adaptation period. One might view this phenomenon as a perceptual bias towards novel categories – i.e., those mismatching recent experience. Using fMRI, we present evidence consistent with this viewpoint, where perception of nonadapted categories is associated with activity in the medial temporal lobe, a region known to subserve novelty processing. This raises a possibility, consistent with models of face perception, that face categories are coded with reference to a representation of experience, such as a norm or top-down prediction. We investigated this idea using MEG by manipulating the deviation in emotional expression between the adapted and morph stimuli. We found signals coding for these deviations arising in the right superior temporal sulcus – a region known to contribute to observation of actions and, notably, face expressions. Moreover, adaptation in the right superior temporal sulcus was also predictive of the magnitude of behavioral aftereffects. The relatively late onset of these effects is suggestive of a role for backwards connections or top-down signaling. Overall, these data are consistent with the idea that face perception depends on a neural representation of deviations from short-term experience.
Categorical perception may reveal cortical adaptive dynamics
Valentina Daelli, Athena Akrami, Nicola J van Rijsbergen and Alessandro Treves, SISSA
The perception of faces and of the social signals they display is an ecologically important process, which may shed light on generic mechanisms of cortically mediated plasticity. The possibility that facial expressions may be processed also along a sub-cortical pathway, leading to the amygdala, offers the potential to single out uniquely cortical contributions to adaptive perception. With this aim, we have studied adaptation aftereffects psychophysically, using faces morphed between two expressions. These aftereffects are perceptual changes induced by adaptation to a priming stimulus, which biases subjects to see the non-primed expression in the morphs. We find aftereffects even with primes presented for very short periods, or with faces low-pass filtered to favor sub-cortical processing, but full cortical aftereffects are much larger. This suggests a process involving conscious comparisons, perhaps mediated by cortical memory attractors, superimposed on a more automatic process, perhaps expressed also subcortically. In a modeling project, a simple network model storing discrete memories can in fact explain such short-term plasticity effects in terms of neuronal firing-rate adaptation, acting against the rigidity of the boundaries between long-term memory attractors. The very same model can be used, in the long-term memory domain, to account for the convergence of neuronal responses, observed by the Jagadeesh lab in monkey inferior temporal cortex.
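The modeling idea – firing-rate adaptation pushing perception away from a recently visited attractor – can be caricatured with two mutually inhibiting rate units (all parameters are invented; the actual model stores many discrete memories):

```python
import numpy as np

def simulate(adapt_steps=200, test_steps=200, dt=0.1,
             tau_a=20.0, w_inh=1.5, gain=2.0):
    """Two rate units stand in for two expression 'attractors'. Each unit
    is driven by its input, inhibited by the other unit, and weakened by
    its own slow adaptation (fatigue) variable."""
    r = np.zeros(2)   # firing rates
    a = np.zeros(2)   # adaptation variables

    def step(I):
        nonlocal r, a
        drive = gain * (I - w_inh * r[::-1] - a)
        r = np.maximum(0.0, r + dt * (-r + drive))  # rectified rate dynamics
        a = a + dt * (r - a) / tau_a                # slow fatigue

    for _ in range(adapt_steps):       # adapting (priming) stimulus:
        step(np.array([1.0, 0.0]))     # expression A only
    for _ in range(test_steps):        # ambiguous 50/50 morph
        step(np.array([0.5, 0.5]))
    return r

r_final = simulate()
```

After adapting to expression A, the ambiguous morph is “perceived” as B: unit B wins the winner-take-all competition because unit A is still fatigued, which is the aftereffect described above.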
Contrast-sign specificity built into the primary visual cortex, V1
Williams and Shapley
We (Williams & Shapley, 2007) found that in different cell layers in the macaque primary visual cortex, V1, simple cells have qualitatively different responses to spatial patterns. In response to a stationary grating presented for 100 ms at the optimal spatial phase (position), V1 neurons produce responses that rise quickly and then decay before stimulus offset. For many simple cells in layer 4, it was possible to use this decay and the assumption of linearity to predict the amplitude of the response to the offset of a stimulus of the opposite-to-optimal spatial phase. However, the linear prediction was not accurate for neurons in layer 2/3 of V1, the main cortico-cortical output from V1. Opposite-phase responses from simple cells in layer 2/3 were always near zero. Even when a layer 2/3 neuron’s optimal-phase response was very transient, which would predict a large response to the offset of the opposite spatial phase, opposite-phase responses were small or zero. The suppression of opposite-phase responses could be an important building block in the visual perception of surfaces.
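The linear prediction for layer-4 cells can be sketched as follows: convolve the stimulus time course with a temporal kernel, negate the drive for the opposite spatial phase, and half-wave rectify. The kernel shape and time constants below are invented for illustration, not fitted to the recorded data:

```python
import numpy as np

dt = 0.001                     # 1 ms time steps
t = np.arange(0.0, 0.3, dt)

# Invented biphasic temporal kernel: fast excitation, slower inhibition.
# Its step response rises quickly, then decays before stimulus offset.
kernel = np.exp(-t / 0.01) / 0.01 - 0.8 * np.exp(-t / 0.04) / 0.04

def simple_cell(stim):
    """Linear drive (stimulus * kernel) followed by half-wave rectification."""
    drive = np.convolve(stim, kernel)[: len(stim)] * dt
    return np.maximum(drive, 0.0)

stim_opt = ((t >= 0.05) & (t < 0.15)).astype(float)  # optimal phase, 100 ms
r_opt = simple_cell(stim_opt)    # transient onset response that decays
r_opp = simple_cell(-stim_opt)   # opposite phase: silent until stimulus offset
```

By linearity, the opposite-phase drive is the negation of the optimal-phase drive: it stays below threshold during the stimulus and rebounds above threshold at offset by the amount the onset response decayed. Layer-4 cells follow this prediction; layer-2/3 cells, with their near-zero opposite-phase responses, do not.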
Simple cells like those found in layer 4 respond to both contrast polarities of a given stimulus (both brighter and darker than background, or opposite spatial phases). But unlike layer 4 neurons, layer 2/3 simple cells code unambiguously for a single contrast polarity. With such polarity sensitivity, a neuron can represent “dark-left – bright-right” instead of just an unsigned boundary.