S6 - Integrating local motion information
Friday, May 6, 2:30 - 4:30 pm, Royal Ballroom 6-8
Organizer: Duje Tadin, University of Rochester, Center for Visual Science
Presenters: Xin Huang, Department of Physiology, University of Wisconsin; Duje Tadin, University of Rochester, Center for Visual Science; David R. Badcock, School of Psychology, The University of Western Australia; Christopher C Pack, Montreal Neurological Institute, McGill University; Shin'ya Nishida, NTT Communication Science Laboratories; Alan Johnston, Cognitive, Perceptual and Brain Sciences, University College London
Since Adelson and Movshon’s seminal 1982 paper on the phenomenal coherence of moving patterns, a large literature has accumulated on how the visual system integrates local motion estimates to represent true object motion. Although this research topic can be traced back to the early 20th century, a number of key questions remain unanswered. Specifically, we still have an incomplete understanding of how ambiguous and unambiguous motions are integrated and how local motion estimates are grouped and segmented to represent global object motions. A key problem for motion perception involves establishing the appropriate balance between integration and segmentation of local motions. Local ambiguities require motion integration, while perception of moving objects requires motion segregation. These questions form the core theme for this workshop that includes both psychophysical (Tadin, Nishida, Badcock and Johnston) and neurophysiological research (Pack and Huang).
Presentations by Huang and Tadin will show that center-surround mechanisms play an important role in adaptively adjusting the balance between integration and segmentation. Huang reached this conclusion by studying area MT and the effects of unambiguous motion presented to the receptive field surround on the neural response to an ambiguous motion in the receptive field. Tadin reports that the degree of center-surround suppression increases with stimulus visibility, promoting motion segregation at high contrast and spatial summation at low contrast. More recently, Tadin investigated the neural correlates of center-surround interactions and their role in figure-ground segregation.
Understanding how we perceive natural motion stimuli requires an understanding of how the brain solves the aperture problem. Badcock showed that spatial vision plays an important role in solving this motion processing problem. Specifically, he showed that oriented motion streaks and textural cues play a role in early motion processing. Pack approached this question by recording single-cell responses at various stages along the dorsal pathway. Results with plaid stimuli show a tendency for increased motion integration that does not necessarily correlate with the perception of the stimulus. Data from local field potentials recorded simultaneously suggest that the visual system solves the aperture problem multiple times at different hierarchical stages, rather than serially.
Finally, Nishida and Johnston will report new insights into the integration of local motion estimates over space. Nishida developed a global Gabor array stimulus, which appears to cohere when the local speeds and orientations of the Gabors are consistent with a single global translation. He found that the visual system adopts different strategies for spatial pooling over ambiguous (Gabor) and unambiguous (plaid) array elements. Johnston investigated new strategies for combining local estimates, including the harmonic vector average, and has demonstrated coherence in expanding and rotating Gabor array displays, implying that only a few local interactions may be required to solve the aperture problem in complex arrays.
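As a rough illustration of the harmonic vector average mentioned above, the sketch below inverts each local velocity vector (v -> v / |v|^2), averages the inverted vectors, and inverts the mean back. This is a minimal sketch under stated assumptions, not the presenters' implementation: the function names, the example velocity, and the choice of element orientations placed symmetrically about the global direction are all hypothetical.

```python
import math

def harmonic_vector_average(vectors):
    """Invert each 2D vector (v -> v / |v|^2), average, then invert the mean."""
    inverted = []
    for x, y in vectors:
        sq = x * x + y * y
        if sq == 0.0:
            raise ValueError("a zero-length vector has no inverse")
        inverted.append((x / sq, y / sq))
    mx = sum(p[0] for p in inverted) / len(inverted)
    my = sum(p[1] for p in inverted) / len(inverted)
    msq = mx * mx + my * my
    if msq == 0.0:
        raise ValueError("inverted vectors average to zero")
    return (mx / msq, my / msq)

def normal_velocity(global_velocity, normal_angle):
    """1D (normal) velocity vector seen through an aperture: projection of the
    global velocity onto the unit normal of the local contour."""
    vx, vy = global_velocity
    c, s = math.cos(normal_angle), math.sin(normal_angle)
    speed = vx * c + vy * s
    return (speed * c, speed * s)

# Hypothetical example: rightward global motion, with contour normals placed
# symmetrically about the motion direction (an assumption of this sketch).
true_velocity = (2.0, 0.0)
angles = [math.radians(a) for a in (-60.0, -30.0, 30.0, 60.0)]
samples = [normal_velocity(true_velocity, a) for a in angles]

hva_estimate = harmonic_vector_average(samples)
vector_mean = (sum(v[0] for v in samples) / len(samples),
               sum(v[1] for v in samples) / len(samples))
```

With this symmetric set of orientations the harmonic vector average recovers the global velocity, whereas the plain vector mean of the same samples points in the right direction but underestimates the speed. With asymmetric orientations the harmonic vector average is generally biased as well.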
The symposium will be of interest to faculty and students working on motion, who will benefit from an integrated survey of new approaches to the current central question in motion processing, and a general audience interested in linking local and global processing in perceptual organization.
Stimulus-dependent integration of motion signals via surround modulation
Xin Huang, Department of Physiology, University of Wisconsin; Thomas D. Albright, Vision Center Laboratory, Salk Institute for Biological Studies; Gene R. Stoner, Vision Center Laboratory, Salk Institute for Biological Studies
The perception of visual motion plays a pivotal role in interpreting the world around us. To interpret visual scenes, local motion features need to be selectively integrated and segmented into distinct objects. Integration helps to overcome motion ambiguity in the visual image by spatial pooling, whereas segmentation identifies differences between adjacent moving objects. In this talk we will summarize our recent findings regarding how motion integration and segmentation may be achieved via "surround modulation" in visual cortex and will discuss the remaining challenges. Neuronal responses to stimuli within the classical receptive field (CRF) of neurons in area MT (V5) can be modulated by stimuli in the CRF surround. Previous investigations have reported that the directional tuning of surround modulation in area MT is mainly antagonistic and hence consistent with segmentation. We have found that surround modulation in area MT can be either antagonistic or integrative depending upon the visual stimulus. Furthermore, we have found that the direction tuning of the surround modulation is related to the response magnitude: stimuli eliciting the largest responses yield the strongest antagonism and those eliciting the smallest responses yield the strongest integration. We speculate that input strength is, in turn, linked with the ambiguity of the motion present within the CRF: unambiguously moving features usually evoke stronger neuronal responses than do ambiguously moving features. Our modeling study suggests that changes in MT surround modulation result from shifts in the balance between directionally tuned excitation and inhibition mediated by changes in input strength.
Center-surround interactions in visual motion perception
Duje Tadin, University of Rochester, Center for Visual Science
Visual processing faces two conflicting demands: integration and segmentation (Braddick, 1993). In motion perception, spatial integration is required by noisy inputs and local velocity ambiguities. Local velocity differences, however, provide key segregation information. We demonstrated that the balance between integrating and differentiating processes is not fixed, but depends on visual conditions. At low contrast, direction discrimination improves with increasing stimulus size, a result indicating spatial summation of motion signals. At high contrast, however, direction discrimination worsens as the stimulus size increases, a result we describe as spatial suppression (Tadin et al., 2003). This adaptive integration of motion signals over space might be vision's way of dealing with the contrasting requirements of integration and segmentation, where suppressive mechanisms operate only when the sensory input is sufficiently strong to guarantee visibility. In subsequent studies, we have replicated and expanded these results using a range of methods, including TMS, temporal reverse correlation, reaction times, motion aftereffect, binocular rivalry and modeling. Based on the converging evidence, we show that these psychophysical results could be linked to suppressive center-surround receptive fields, such as those in area MT.
What are the functional roles of spatial suppression? Special population studies revealed that spatial suppression is weaker in elderly observers and in patients with schizophrenia, a result responsible for their paradoxically better-than-normal performance in some conditions. Moreover, these subjects also exhibit deficits in figure-ground segregation, suggesting a possible functional connection. In a recent study, we directly addressed this possibility and report experimental evidence for a functional link between surround suppression and motion segregation.
The role of form cues in motion processing
David R. Badcock, School of Psychology, The University of Western Australia; Edwin Dickinson, University of Western Australia; Allison McKendrick, University of Melbourne; Anna Ma-Wyatt, University of Adelaide; Simon Cropper, University of Melbourne
The visual system initially collects spatially localised estimates of motion and then needs to interpret these local estimates to generate 2D object motion and self-motion descriptions. Sinusoidal gratings have commonly been employed to study the perception of motion, and while these stimuli are useful for investigating the properties of spatial- and temporal-frequency tuned detectors, they are limited. They remove textural and shape cues that are usually present in natural images, which has led to models of motion processing that ignore those cues. However, the addition of texture and shape information can dramatically alter perceived motion direction.
Recent work has shown that orientation-tuned simple cells are stimulated by moving patterns because of their extended temporal integration. This response (sometimes called motion streaks) allows orientation-tuned detectors to contribute to motion perception by signalling the axis of motion. The orientation cue is influential even if second-order streaks, which could not have been produced by image smear, are employed. This suggests that any orientation cue may be used to determine local direction estimates, a view that is extended to argue that aperture shape itself may have an impact by providing orientation cues which are incorporated into the direction estimation process. Oriented textural cues will also be shown to distort direction estimates, even though current models suggest they should not. The conclusion is that pattern information has a critical role in early motion processing and should be incorporated more systematically into models of human direction perception.
Pattern motion selectivity in macaque visual cortex
Christopher C Pack, Montreal Neurological Institute, McGill University
The dorsal visual pathway in primates has a hierarchical organization, with neurons in V1 coding local velocities and neurons in the later stages of the extrastriate cortex encoding complex motion patterns. In order to understand the computations that occur along each stage of the hierarchy, we have recorded from single neurons in areas V1, MT, and MST of the alert macaque monkey. Results with standard plaid stimuli show that pattern motion selectivity is, not surprisingly, more common in area MST than in MT or V1. However, similar results were found with plaids that were made perceptually transparent, suggesting that neurons at more advanced stages of the hierarchy tend to integrate motion signals obligatorily, even when the composition of the stimulus is more consistent with the motion of multiple objects. Thus neurons in area MST in particular show a tendency for increased motion integration that does not necessarily correlate with the (presumptive) perception of the stimulus. Data from local field potentials recorded simultaneously show a strong bias toward component selectivity, even in brain regions in which the spiking activity is overwhelmingly pattern selective. This suggests that neurons with greater pattern selectivity are not overrepresented in the outputs of areas like V1 and MT, but rather that the visual system computes pattern motion multiple times at different hierarchical stages. Moreover, our results are consistent with the idea that LFPs can be used to estimate different anatomical contributions to processing at each visual cortical stage.
Intelligent motion integration across multiple stimulus dimensions
Shin'ya Nishida, NTT Communication Science Laboratories; Kaoru Amano, The University of Tokyo; Kazushi Maruya, NTT; Mark Edwards, Australian National University; David R. Badcock, University of Western Australia
In human visual motion processing, image motion is first detected by one-dimensional (1D), spatially local, direction-selective neural sensors. Each sensor is tuned to a given combination of position, orientation, spatial frequency and feature type (e.g., first-order and second-order). To recover the true 2-dimensional (2D) and global direction of moving objects (i.e., to solve the aperture problem), the visual system integrates motion signals across orientation, across space and possibly across the other dimensions. We investigated this multi-dimensional motion integration process, using global motion stimuli comprised of numerous randomly-oriented Gabor (1D) or Plaid (2D) elements (for the purpose of examining integration across space, orientation and spatial frequency), as well as diamond-shape Gabor quartets that underwent rigid global circular translation (for the purpose of examining integration across spatial frequency and signal type). We found that the visual system adaptively switches between two spatial integration strategies — spatial pooling of 1D motion signals and spatial pooling of 2D motion signals — depending on the ambiguity of local motion signals. MEG showed correlated neural activities in hMT+ for both 1D pooling and 2D pooling. Our data also suggest that the visual system can integrate 1D motion signals of different spatial frequencies and different feature types, but only when form conditions (e.g., contour continuity) support grouping of local motions. These findings indicate that motion integration is a complex and smart computation, and presumably this is why we can properly estimate motion flows in a wide variety of natural scenes.
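The geometric constraint behind a global Gabor array can be sketched as follows: for a rigid global translation, each 1D element can only signal the component of the global velocity along the normal to its carrier orientation. The code below is a minimal illustration under that assumption; the function name, the example velocity, and the random orientations are hypothetical and do not reproduce the stimulus parameters used in the study.

```python
import math
import random

def gabor_drift_speeds(global_velocity, normal_angles):
    """Drift speed of each 1D element consistent with one global translation.

    normal_angles are the directions (radians) perpendicular to each carrier
    grating; the consistent speed is the projection of the global velocity
    onto that normal: s = vx*cos(a) + vy*sin(a).
    """
    vx, vy = global_velocity
    return [vx * math.cos(a) + vy * math.sin(a) for a in normal_angles]

# Hypothetical array: eight elements with random carrier orientations.
rng = random.Random(0)
normal_angles = [rng.uniform(0.0, math.pi) for _ in range(8)]
global_velocity = (1.5, 0.5)
speeds = gabor_drift_speeds(global_velocity, normal_angles)
```

No element drifts faster than the global speed, and an element whose carrier is parallel to the global motion direction does not drift at all, which is why a single element is ambiguous and pooling across orientations is needed.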
Emergent global motion
Alan Johnston, Cognitive, Perceptual and Brain Sciences, University College London; Andrew Rider, Cognitive, Perceptual and Brain Sciences, University College London; Peter Scarfe, Cognitive, Perceptual and Brain Sciences, University College London
The perception of object motion requires the integration of local estimates of image motion across space. The two general computational strategies that have been offered to explain spatial integration can be classified as hierarchical or lateral interactive. The hierarchical model assumes local motion estimates at a lower point in the hierarchy are integrated by neurons with large receptive fields. These neurons could make use of the fact that, due to the aperture problem, the 2D distribution of local velocities for a rigid translation falls on a circle through the origin in velocity space. However, the challenge for this approach is how to segment and represent the motion of different objects or textures falling within the receptive field, including how to represent object boundaries. Apparent global rotations and dilations can be instantiated in randomly oriented global Gabor arrays, suggesting that the aperture problem can be resolved through local (lateral) interactions. The challenge for this lateral interactive approach is to discover local rules that will allow global organizations to emerge. These rules need to incorporate the status of ambiguous motion signals and unambiguous motion signals to explain how unambiguous 2D motion cues (e.g. at corners) influence the computed global motion field. Here we will describe a simple least squares approach to local integration, demonstrate its effectiveness in dealing with the dual problems of integration and segmentation and consider its limitations.
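To make the least squares idea concrete, the sketch below recovers a single 2D velocity from several 1D (normal) speed measurements by minimising the summed squared error between each measured speed and the projection of the candidate velocity onto that element's normal. This is a generic textbook formulation offered as an assumption, not the specific local rule described in the talk; the function name and example values are hypothetical.

```python
import math

def least_squares_velocity(normal_angles, normal_speeds):
    """Solve min_V sum_i (n_i . V - s_i)^2 for a 2D velocity V.

    n_i = (cos a_i, sin a_i) is the unit normal of element i and s_i its
    measured normal speed. The 2x2 normal equations are solved directly.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for angle, speed in zip(normal_angles, normal_speeds):
        c, s = math.cos(angle), math.sin(angle)
        a11 += c * c
        a12 += c * s
        a22 += s * s
        b1 += speed * c
        b2 += speed * s
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # All normals are parallel: the aperture problem is unresolved.
        raise ValueError("need at least two distinct orientations")
    vx = (a22 * b1 - a12 * b2) / det
    vy = (a11 * b2 - a12 * b1) / det
    return (vx, vy)

# Hypothetical noise-free measurements of one rigid translation.
true_velocity = (1.0, -0.5)
angles = [math.radians(a) for a in (10.0, 40.0, 75.0, 120.0)]
measured = [true_velocity[0] * math.cos(a) + true_velocity[1] * math.sin(a)
            for a in angles]
estimate = least_squares_velocity(angles, measured)
```

With noise-free measurements from a single rigid translation, the estimate matches the true velocity regardless of how the orientations are distributed, provided at least two differ. Pooling measurements that belong to different objects would blend their velocities, which is the segmentation problem the abstract raises.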