Perceptual expectations and the neural processing of complex images
Friday, May 9, 2008, 1:00 – 3:00 pm, Royal Palm 6-8
Organizer: Bharathi Jagadeesh (University of Washington)
Presenters: Moshe Bar (Harvard Medical School), Bharathi Jagadeesh (University of Washington), Nicholas Furl (University College London), Valentina Daelli (SISSA), Robert Shapley (New York University)
The processing of complex images occurs within the context of prior expectations and of current knowledge about the world. A clue about an image, “think of an elephant”, for example, can cause an otherwise nonsensical image to transform into a meaningful percept. The informative clue presumably activates the neural substrate of an expectation about the scene that allows the visual stimulus representation to be more readily interpreted. In this symposium we aim to discuss the neural mechanisms that underlie the use of clues and context to assist in the interpretation of ambiguous stimuli. Work from five laboratories, using imaging, single-unit recording, MEG, psychophysics, and network models of visual processes, shows evidence of the impact of prior knowledge on the processing of visual stimuli.
In the work of Bar, we see evidence that a short-latency neural response may be induced in higher-level cortical areas by complex signals traveling through a fast visual pathway. This pathway may provide the neural mechanism that modifies the processing of visual stimuli as they stream through the brain. In the work of Jagadeesh, we see a potential effect of that modified processing: neural selectivity in inferotemporal cortex is sufficient to explain performance in a classification task with difficult-to-classify complex images, but only when the images are evaluated in a particular framed context: Is the image A or B (where A and B are photographs, for example a horse and a giraffe)? In the work of Furl, human subjects were asked to classify individual exemplars of faces along a particular dimension (emotion), and had prior experience with the images in the form of an adapting stimulus. In this context, classification is shifted away from the adapting stimulus. Simultaneously recorded MEG activity shows evidence of a reentrant signal, induced by the prior experience of the prime, that could explain the shift in classification. In the work of Treves, we see examples of networks that reproduce the observed late convergence of neural activity onto the response to an image stored in memory, and that can simulate mechanisms possibly underlying predictive behavior. Finally, in the work of Shapley, we see that simple cells in layer 2/3 of V1 (a major input layer for intra-cortical connections) paradoxically show dynamic nonlinearities.
The presence of a dynamic nonlinearity in the responses of V1 simple cells indicates that first-order analyses often capture only a fraction of neuronal behavior, a consideration with wide-ranging implications for the analysis of visual responses in more advanced cortical areas. Signals provided by expectation might influence processing throughout the visual system to bias the perception and neural processing of the visual stimulus in the context of that expectation.
The work to be described is of significant scientific merit and reflects recent work in the field; it is original, forcing re-examination of the traditional view of vision as a method of extracting information from the visual scene in the absence of contextual knowledge, a topic of broad interest to those studying visual perception.
The proactive brain: using analogies and associations to generate predictions
Rather than passively ‘waiting’ to be activated by sensations, it is proposed that the human brain is continuously busy generating predictions that approximate the relevant future. Building on previous work, this proposal posits that rudimentary information is extracted rapidly from the input to derive analogies linking that input with representations in memory.
The linked stored representations then activate the associations that are relevant in the specific context, which provides focused predictions. These predictions facilitate perception and cognition by pre-sensitizing relevant representations. Predictions regarding complex information, such as those required in social interactions, integrate multiple analogies. This cognitive neuroscience framework can help explain a variety of phenomena, ranging from recognition to first impressions, and from the brain’s ‘default mode’ to a host of mental disorders.
Neural selectivity in inferotemporal cortex during active classification of photographic images
Images in the real world are not classified or categorized in the absence of expectations about what we are likely to see. For example, giraffes are quite unlikely to appear in one’s environment except in Africa. Thus, when an image is viewed, it is viewed within the context of possibilities about what is likely to appear. Classification occurs within limited expectations about what has been asked about the images. We have trained monkeys to answer questions about ambiguous images in a constrained context: is the image A or B, where A and B are pictures from the visual world, like a giraffe or a horse? We recorded responses in inferotemporal (IT) cortex while the task was performed, and while the same images were merely viewed. When we record neural responses to these images while the monkey is required to ask (and answer) a simple question, neural selectivity in IT is sufficient to explain behavior. When the monkey views the same stimuli in the absence of this framing context, the neural responses are insufficiently selective to explain the separately collected behavior. These data suggest that when the monkey is asked a very specific and limited question about a complex image, IT cortex is selective in exactly the right way to perform the task well. We propose that this match between the needs of the task and the responses in IT results from predictions, generated in other brain areas, which enhance the relevant IT representations.
Experience-based coding in categorical face perception
One fundamental question in vision science concerns how neural activity produces everyday perceptions. We explore the relationship between neural codes capturing deviations from experience and the perception of visual categories. An intriguing paradigm for studying the role of short-term experience in categorical perception is face adaptation aftereffects – where perception of ambiguous faces morphed between two category prototypes (e.g., two facial identities or expressions) depends on which category was experienced during a recent adaptation period. One might view this phenomenon as a perceptual bias towards novel categories – i.e., those mismatching recent experience. Using fMRI, we present evidence consistent with this viewpoint, where perception of nonadapted categories is associated with medial temporal activity, in a region known to subserve novelty processing. This raises a possibility, consistent with models of face perception, that face categories are coded with reference to a representation of experience, such as a norm or top-down prediction. We investigated this idea using MEG by manipulating the deviation in emotional expression between the adapted and morph stimuli. We found signals coding for these deviations arising in the right superior temporal sulcus – a region known to contribute to observation of actions and, notably, face expressions. Moreover, adaptation in the right superior temporal sulcus was also predictive of the magnitude of behavioral aftereffects. The relatively late onset of these effects is suggestive of a role for backwards connections or top-down signaling. Overall, these data are consistent with the idea that face perception depends on a neural representation of the deviation from short-term experience.
Categorical perception may reveal cortical adaptive dynamics
Valentina Daelli, Athena Akrami, Nicola J van Rijsbergen and Alessandro Treves, SISSA
The perception of faces and of the social signals they display is an ecologically important process, which may shed light on generic mechanisms of cortically mediated plasticity. The possibility that facial expressions may also be processed along a sub-cortical pathway, leading to the amygdala, offers the potential to single out uniquely cortical contributions to adaptive perception. With this aim, we have studied adaptation aftereffects psychophysically, using faces morphed between two expressions. These aftereffects are perceptual changes induced by adaptation to a priming stimulus, which biases subjects to see the non-primed expression in the morphs. We find aftereffects even with primes presented for very short periods, or with faces low-pass filtered to favor sub-cortical processing, but full cortical aftereffects are much larger, suggesting a process involving conscious comparisons, perhaps mediated by cortical memory attractors, superimposed on a more automatic process, perhaps also expressed sub-cortically. In a modeling project, a simple network model storing discrete memories can in fact explain such short-term plasticity effects in terms of neuronal firing rate adaptation, acting against the rigidity of the boundaries between long-term memory attractors. The very same model can be used, in the long-term memory domain, to account for the convergence of neuronal responses observed by the Jagadeesh lab in monkey inferior temporal cortex.
Contrast-sign specificity built into the primary visual cortex, V1
Williams and Shapley
We (Williams & Shapley 2007) found that in different cell layers in the macaque primary visual cortex, V1, simple cells have qualitatively different responses to spatial patterns. In response to a stationary grating presented for 100 ms at the optimal spatial phase (position), V1 neurons produce responses that rise quickly and then decay before stimulus offset. For many simple cells in layer 4, it was possible to use this decay and the assumption of linearity to predict the amplitude of the response to the offset of a stimulus of the opposite-to-optimal spatial phase. However, the linear prediction was not accurate for neurons in layer 2/3 of V1, the main cortico-cortical output from V1. Opposite-phase responses from simple cells in layer 2/3 were always near zero. Even when a layer 2/3 neuron’s optimal-phase response was very transient, which would predict a large response to the offset of the opposite spatial phase, opposite-phase responses were small or zero. The suppression of opposite-phase responses could be an important building block in the visual perception of surfaces.
Simple cells like those found in layer 4 respond to both contrast polarities of a given stimulus (both brighter and darker than background, or opposite spatial phases). But unlike layer 4 neurons, layer 2/3 simple cells code unambiguously for a single contrast polarity. With such polarity sensitivity, a neuron can represent “dark-left – bright-right” instead of just an unsigned boundary.
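The linear prediction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis. It assumes a hypothetical linear step response (fast rise, then decay to a plateau), builds the response to a 100 ms stimulus by superposition, and half-wave rectifies the result to obtain a firing rate. The function names, the time constants, and the `sustained` plateau parameter are all assumptions chosen for illustration. Under these assumptions, a transient optimal-phase response implies a rebound at the offset of the opposite-phase stimulus, whereas a fully sustained response implies none.

```python
import math


def step_response(t_ms, sustained, tau_rise=10.0, tau_decay=40.0):
    """Hypothetical linear response to a stimulus switched on at t = 0.

    sustained = 1.0 gives no decay; smaller values give a more transient
    response. The time constants are illustrative, not fitted values.
    """
    if t_ms < 0:
        return 0.0
    rise = 1.0 - math.exp(-t_ms / tau_rise)
    decay = sustained + (1.0 - sustained) * math.exp(-t_ms / tau_decay)
    return rise * decay


def pulse_drive(t_ms, sustained, sign, duration_ms):
    """Linear drive for a stimulus of finite duration.

    By superposition, a pulse is a step on at 0 minus a step on at offset.
    sign = +1 is the optimal spatial phase, sign = -1 the opposite phase.
    """
    on = step_response(t_ms, sustained)
    off = step_response(t_ms - duration_ms, sustained)
    return sign * (on - off)


def firing_rate(drive):
    """Half-wave rectification: firing rates cannot be negative."""
    return max(0.0, drive)


def summarize(sustained, duration_ms=100, window_ms=100):
    """Peak rates during the stimulus and in a window after its offset."""
    during = range(0, duration_ms + 1)
    after = range(duration_ms + 1, duration_ms + window_ms + 1)

    def rate(t, sign):
        return firing_rate(pulse_drive(t, sustained, sign, duration_ms))

    return {
        "optimal_peak": max(rate(t, +1) for t in during),
        "optimal_at_offset": rate(duration_ms, +1),
        "opposite_during": max(rate(t, -1) for t in during),
        "opposite_offset": max(rate(t, -1) for t in after),
    }


transient_cell = summarize(sustained=0.3)
sustained_cell = summarize(sustained=1.0)
```

In this sketch, `transient_cell["opposite_offset"]` is positive and at least as large as the decay of the optimal-phase response (`optimal_peak` minus `optimal_at_offset`), which is the kind of linear prediction the abstract reports as holding for many layer 4 simple cells. A cell whose measured opposite-phase offset response stays near zero despite a transient optimal-phase response, as described for layer 2/3, would violate this prediction.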