Mid-level representations in visual processing

Organizer: Jonathan Peirce; University of Nottingham
Presenters: Jonathan Peirce, Anitha Pasupathy, Zoe Kourtzi, Gunter Loffler, Tim Andrews, Hugh Wilson

< Back to 2014 Symposia

Symposium Description

A great deal is known about the early stages of visual processing, whereby light of different wavelengths is detected and filtered in such a way as to represent something approximating “edges”. A large number of studies are also examining the “high-level” processing and representation of visual objects; the representation of faces and scenes, and the visual areas responsible for their processing. Remarkably few studies examine either the intervening “mid-level” representations or the visual areas that are involved in this level of processing. This symposium will examine what form these intermediate representations might take, and what methods we have available to study such mechanisms. The speakers have used a variety of methods to try to understand mid-level processing and the associated visual areas. Along the way, a number of questions will be considered. Do we even have intermediate representations; surely higher-order object representations could be built directly on the outputs of V1 cells since all of the information is available there? How does such a representation not fall foul of the problem of parameter explosion? What aspects of the visual scene are encoded at this level? How could we understand such representations further? Why have we not made further progress in this direction before; is the problem simply too hard to study? The symposium is designed for attendees of all levels and will involve a series of 20-minute talks (each including 5 minutes for questions) from each of the speakers. We hope to persuade people that this is an important and tangible problem that vision scientists should be working hard to solve.


Compound feature detectors in mid-level vision

Speaker: Jonathan Peirce; University of Nottingham

A huge number of studies have considered low-level visual processes (such as the detection of edges, colors and motion) and high-level visual processes (such as the processing of faces and scenes). Relatively few studies examine the nature of intermediate visual representations, or “mid-level” vision. One approach to studying mid-level visual representations might be to try to understand the mechanisms that combine the outputs of V1 neurons to create an intermediate feature detector. We have used adaptation techniques to probe the existence of detectors for combinations of sinusoids that might form plaid form detectors or curvature detectors. We have shown for both of these features that adaptation effects to the compound are greater than predicted by adaptation to the parts alone, and that this is greatest when the components form a plaid that we perceive as coherent or a curve that is continuous. Creating such representations requires simple logical AND-gates, which might be formed simply by summing the nonlinear outputs of V1 neurons. Many questions remain, however, about where in the visual cortex these representations are stored, and how the different levels of representation interact.
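The AND-gate idea above can be illustrated with a minimal sketch. It assumes a saturating (Naka-Rushton style) contrast response for each V1 unit and a fixed threshold on their sum; the function names, the semi-saturation constant and the threshold are hypothetical choices for illustration, not values taken from the talk.

```python
def v1_response(contrast, semi_saturation=0.1, exponent=2.0):
    """Saturating contrast response of one hypothetical V1 unit, bounded in [0, 1)."""
    c = contrast ** exponent
    return c / (c + semi_saturation ** exponent)


def compound_detector(contrast_a, contrast_b, threshold=1.2):
    """Sum two saturating V1 outputs, then keep only the part above threshold.

    Because each input saturates below 1.0, a single component can never
    reach the (assumed) threshold of 1.2 on its own, so the unit behaves
    like a soft AND-gate over its two inputs.
    """
    drive = v1_response(contrast_a) + v1_response(contrast_b)
    return max(0.0, drive - threshold)


# One component alone, even at full contrast, gives no output.
print(compound_detector(1.0, 0.0))  # 0.0
# Both components together at moderate contrast drive the detector.
print(compound_detector(0.5, 0.5) > 0.0)  # True
```

The design point is that the nonlinearity comes before the sum: with a purely linear sum, one strong component could stand in for two weak ones and the unit would not signal the conjunction.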

Boundary curvature as a basis of shape encoding in macaque area V4

Speaker: Anitha Pasupathy; University of Washington

The broad goal of research in my laboratory is to understand how visual form is encoded in the intermediate stages of the ventral visual pathway, how these representations arise and how they contribute to object recognition behavior. Our current focus is primate V4, an area known to be critical for form processing. Given the enormity of the shape-encoding problem, our strategy has been to test specific hypotheses with custom-designed, parametric, artificial stimuli. With guidance from the shape theory, computer vision and psychophysical literature, we identify stimulus features (for example T-junctions) that might be critical in natural vision and work these into our stimulus design so as to progress in a controlled fashion toward more naturalistic stimuli. I will present examples from our past and current experiments that successfully employ this strategy and have led to the discovery of boundary curvature as a basis for shape encoding in area V4. I will conclude with some brief thoughts on how we might move from the highly controlled stimuli we currently use to the richer and more complex stimuli of natural vision.

Adaptive shape coding in the human visual brain

Speaker: Zoe Kourtzi; University of Birmingham

In the search for neural codes, we typically measure responses to input stimuli alone without considering their context in space (i.e., scene configuration) or time (i.e., temporal history). However, accumulating evidence suggests an adaptive neural code that is dynamically shaped by experience. Here, we present work showing that experience plays a critical role in molding mid-level visual representations and shape perception. Combining behavioral and brain imaging measurements, we demonstrate that learning optimizes the binding of local elements into shapes, and the selection of behaviorally relevant features for shape categorization. First, we provide evidence that the brain flexibly exploits image regularities and learns to use discontinuities typically associated with surface boundaries for contour linking and target identification. Specifically, learning of regularities typical in natural contours (i.e., collinearity) can occur simply through frequent exposure, generalize across untrained stimulus features, and shape processing in occipitotemporal regions. In contrast, learning to integrate discontinuities (i.e., elements orthogonal to contour paths) requires task-specific training, is stimulus dependent, and enhances processing in intraparietal regions. Second, by reverse correlating behavioral and fMRI responses with noisy stimulus trials, we identify the critical image parts that determine the observers’ choice in a shape categorization task. We demonstrate that learning optimizes shape templates by tuning the representation of informative image parts in higher ventral cortex. In sum, we propose that similar learning mechanisms may mediate long-term optimization through development, tune the visual system to fundamental principles of feature binding, and shape visual category representations.
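The reverse-correlation step mentioned above can be sketched for the behavioral case. This is a generic classification-image illustration, not the authors' analysis: the simulated observer, its hidden template, the one-dimensional "image" and all names are assumptions made for the example.

```python
import random


def classification_image(template, n_trials=2000, seed=0):
    """Estimate which pixels drive a simulated observer's choices.

    On each trial the observer sees pure Gaussian noise and answers "yes"
    when the noise happens to match a hidden template. Averaging the noise
    separately for "yes" and "no" trials and subtracting the two averages
    recovers the pixels that influenced the decision.
    """
    rng = random.Random(seed)
    n_pixels = len(template)
    sums = {True: [0.0] * n_pixels, False: [0.0] * n_pixels}
    counts = {True: 0, False: 0}
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_pixels)]
        # Hypothetical decision rule: template match above zero means "yes".
        choice = sum(w * n for w, n in zip(template, noise)) > 0.0
        counts[choice] += 1
        for i, value in enumerate(noise):
            sums[choice][i] += value
    return [
        sums[True][i] / counts[True] - sums[False][i] / counts[False]
        for i in range(n_pixels)
    ]


# Pixels 2 and 3 are the only informative ones in this toy template.
image = classification_image([0, 0, 1, 1, 0, 0, 0, 0])
print([round(v, 2) for v in image])
```

In this toy setting the estimate should be clearly positive at the two informative pixels and close to zero elsewhere; the same logic can in principle be applied with a neural response in place of the behavioral choice.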

Probing intermediate stages of shape processing

Speaker: Gunter Loffler; Glasgow Caledonian University

The visual system provides a representation of what and where objects are. This entails parsing the visual scene into distinct objects. Initially, the visual system encodes information locally. While interactions between adjacent cells can explain how local fragments of an object’s contour are extracted from a scene, more global mechanisms have to be able to integrate information beyond that of neighbouring cells to allow for the representation of extended objects. This talk will examine the nature of intermediate-level computations in the transformation from discrete local sampling to the representation of complex objects. Several paradigms were invoked to study how information concerning the position and orientation of local signals is combined: a shape discrimination task requiring observers to discriminate between contours; a shape coherence task measuring the number of elements required to detect a contour; a shape illusion in which positional and orientational information is combined inappropriately. Results support the notion of mechanisms that integrate information beyond that of neighbouring cells and are optimally tuned to a range of different contour shapes. Global integration is not restricted to central vision: peripheral data show that certain aspects of this process only emerge at intermediate stages. Moreover, intermediate processing appears vulnerable to damage. Diverse clinical populations (migraineurs, pre-term children and children with Cortical Visual Impairment) show specific deficits for these tasks that cannot be accounted for by low-level processes. Taken together, evidence is converging towards the identification of an intermediate level of processing, at which sensitivity to global shape attributes emerges.

Low-level image properties of visual objects explain category-selective patterns of neural response across the ventral visual pathway

Speaker: Tim Andrews; University of York

Neuroimaging research over the past 20 years has begun to reveal a picture of how the human visual system is organized. A key organizing principle that has arisen from these studies is the distinction between low-level and high-level visual regions. Low-level regions are organized into visual field maps that are tightly linked to the image properties of the stimulus. In contrast, high-level visual areas are thought to be arranged in modules that are selective for particular object categories. It is unknown, however, whether this selectivity is truly based on object category, or whether it reflects tuning for low-level features that are common to images from a particular category. To address this issue, we compared the pattern of neural response elicited by each object category with the corresponding low-level properties of images from each object category. We found a strong positive correlation between the neural patterns and the underlying low-level image properties. Importantly, the correlation was still evident when the within-category correlations were removed from the analysis. Next, we asked whether low-level image properties could also explain variation in the pattern of response to exemplars from individual object categories (faces or scenes). Again, a positive correlation was evident between the similarity in the pattern of neural response and the low-level image properties of exemplars from individual object categories. These results suggest that the pattern of response in high-level visual areas may be better explained by the image statistics of visual stimuli than by their associated categorical or semantic properties.
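The comparison described above, between similarity in neural response patterns and similarity in low-level image properties, can be sketched as a correlation across pairs of conditions. The sketch assumes both similarity measures have already been computed for the same ordered list of pairs; the function name and inputs are hypothetical, and the original analysis may differ in its similarity measure and statistics.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def neural_vs_image_correlation(neural_similarity, image_similarity):
    """Correlate neural-pattern similarity with image-property similarity.

    Each argument holds one similarity value per pair of conditions
    (for example, per pair of object categories), in the same pair order.
    """
    return pearson_r(neural_similarity, image_similarity)
```

A strongly positive return value would correspond to the finding reported above: pairs of conditions whose images share low-level properties also evoke similar patterns of response.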

From Orientations to Objects: Configural Processing in the Ventral Stream

Speaker: Hugh Wilson; York University

I shall review psychophysical and fMRI evidence for a hierarchy of intermediate processing stages in the ventral or form vision system. A review of receptive field sizes from V1 up to TE indicates an increase in diameter by a factor of about 3.0 from area to area. This is consistent with configural combination of adjacent orientations to form curves or angles, followed by combination of curves and angles to form descriptors of object shapes. Psychophysical and fMRI evidence support this hypothesis, and neural models provide a plausible explanation of this hierarchical configural processing.
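The scaling claim above can be made concrete with a short calculation. The abstract gives only the endpoints (V1 and TE) and the factor of about 3.0; the list of intermediate areas used here (V2, V4, TEO) is an assumption for illustration.

```python
def relative_rf_diameters(areas, factor=3.0):
    """Receptive field diameter of each area relative to the first area.

    Assumes the diameter grows by the same factor at every step up the
    hierarchy, so the area at step k has relative diameter factor ** k.
    """
    return {area: factor ** step for step, area in enumerate(areas)}


# Assumed hierarchy; only V1 and TE are named in the abstract.
sizes = relative_rf_diameters(["V1", "V2", "V4", "TEO", "TE"])
print(sizes["TE"])  # 81.0
```

Under these assumptions, four steps of roughly threefold growth give TE receptive fields about 81 times the diameter of those in V1, which is the kind of pooling needed to combine orientations into curves and then curves into object shapes.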
