Active Perception: The synergy between perception and action

Time/Room: Friday, May 10, 1:00 – 3:00 pm, Royal 6-8
Organizers: Michele Rucci, Boston University & Eli Brenner, VU University
Presenters: Eli Brenner, John Wann, Heiner Deubel, Michele Rucci, Ronen Segev, Yves Frégnac

Symposium Description

Visual perception is often studied in a passive manner. The stimulus on the display is typically regarded as the input to the visual system, and the results of experiments are frequently interpreted without consideration of the observer’s motor activity. In fact, movements of the eyes, head or body are often treated as a nuisance in vision research, and care is often taken to minimize them by constraining the observer. Like many other species, however, humans are not passively exposed to the incoming flow of sensory data. Instead, they actively seek useful information by coordinating sensory processing with motor activity. Motor behavior is a key component of sensory perception, as it enables control of sensory signals in ways that simplify perceptual tasks.

The goal of this symposium is to make VSS attendees aware of recent advances in the field of active vision. Non-specialists often associate active vision with the study of how vision controls behavior. To counterbalance this view, the present workshop will instead focus on closing the loop between perception and action: we will examine both the information that emerges in an active observer and how this information is used to guide behavior. To emphasize that behavior is a fundamental component of visual perception, the symposium will address the functional consequences of a moving agent from multiple perspectives. We will cover the perceptual impact of very different types of behavior, from locomotion to microscopic eye movements. We will discuss the multimodal sources of information that emerge and need to be combined during motor activity. Furthermore, we will look at the implications of active vision at multiple levels, from general computational strategies to the specific impact of eye-movement modulations on neurons in the visual cortex. Speakers with expertise in complementary areas, whose research programs involve a variety of techniques and focus on different levels of analysis, were selected to provide a well-rounded overview of the field. We believe that this symposium will be of interest to all VSS participants, both students and faculty. It will make clear (to students in particular) that motor activity should not be regarded as an experimental nuisance, but as a critical source of information in everyday life.

The symposium will start with a general introduction to the topic and a specific example of a closed sensory-motor loop, the interception of moving objects (Eli Brenner). It will continue with the visual information that emerges during locomotion and its use in avoiding collisions (John Wann). We will then examine the dynamic strategy by which attention is redirected during grasping (Heiner Deubel), and how even microscopic “involuntary” eye movements are actually part of a closed sensory-motor loop (Michele Rucci). The last two speakers will address how the different types of visual information emerging in an active observer are encoded in the retina (Ronen Segev) and in the cortex (Yves Frégnac).

Presentations

Introduction to active vision: the complexities of continuous visual control

Speaker: Eli Brenner, Human Movement Sciences, VU University
Authors: Jeroen Smeets, Human Movement Sciences, VU University

Perception is often studied in terms of image processing: an image falls on the retina and is processed by the eye and brain in order to retrieve whatever one is interested in. Of course the eye and brain analyse the images that fall on the retina, but it is becoming ever more evident that vision is an active process. Images do not just appear on the retina; we actively move our eyes and the rest of our body, presumably to ensure that we constantly have the best possible information at our disposal for the task at hand. We do this despite the complications that moving sometimes creates for extracting the relevant information from the images. I will introduce some of the complications and benefits of such active vision on the basis of research on the role of pursuing an object with one’s eyes when trying to intercept it. People are quite flexible in terms of where they look when performing an interception task, but where they look affects their precision. This is due not only to the inhomogeneity of the retina, but also to the fact that neuromuscular delays affect the combination of information from different sensory modalities. The latter can be overcome by relying as much as possible on retinal information (such as optic flow), but under some conditions people instead rely on combinations of retinal and extra-retinal information (efferent and afferent information about their own actions).
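
To make the delay problem concrete, here is a minimal numerical sketch (our own toy example, not from the talk; the 100 ms delay and target speed are illustrative values): a hand guided by the raw delayed position lags the target, whereas extrapolating the delayed position with retinally available velocity cancels the lag.

```python
# Toy illustration: with a neuromuscular delay, the motor system is
# guided by where the target *was* ~100 ms ago. Extrapolating with
# retinally available velocity (e.g., from optic flow) compensates.

DELAY = 0.1  # assumed sensorimotor delay in seconds (illustrative)

def target_position(t, speed=0.5):
    """True position (m) of a target moving at constant speed."""
    return speed * t

def delayed_estimate(t, speed=0.5):
    """Position signal available to the motor system at time t."""
    return target_position(t - DELAY, speed)

def compensated_estimate(t, speed=0.5):
    """Delayed position extrapolated forward using retinal velocity."""
    return delayed_estimate(t, speed) + speed * DELAY

t = 1.0
print(delayed_estimate(t))      # 0.45 -> lags 5 cm behind the target
print(compensated_estimate(t))  # 0.50 -> back on target
```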

Why it’s good to look where you are going

Speaker: John Wann, Dept of Psychology, Royal Holloway University of London

The control of direction and the avoidance of collisions are fundamental to effective locomotion. A strong body of research has explored the use of optic flow and/or eye-movement signals in judging heading. This presentation will outline research on active steering that also explores the use of optic flow and eye-movement signals, but in which a key aspect of effective control is where and when you look. The talk will also briefly outline fMRI studies that highlight the neural systems supporting the control model proposed from the behavioural research. Although this model is based on principles derived from optical geometry, it converges conveniently on the heuristics used in advanced driver/motorcyclist training, and in elite cycling, for negotiating bends at speed. Research supported by the UK EPSRC, UK ESRC, and the EU FP7 Marie Curie programme.

Motor selection and visual attention in manual pointing and grasping

Speaker: Heiner Deubel, Department Psychologie, Ludwig-Maximilians-Universität München, Germany
Authors: René Gilster, Department Psychologie, Ludwig-Maximilians-Universität München, Germany; Constanze Hesse, School of Psychology, University of Aberdeen, United Kingdom

It is now well established that goal-directed movements are preceded by covert shifts of visual attention to the movement target. I will first review recent evidence in favour of this claim for manual reaching movements, demonstrating that the planning of some of these actions establishes multiple foci of attention which reflect the spatial-temporal requirements of the intended motor task. Recently our studies have focused on how finger contact points are chosen in grasp planning and how this selection is related to the spatial deployment of attention. Subjects grasped cylindrical objects with thumb and index finger. A perceptual discrimination task was used to assess the distribution of visual attention prior to the execution of the grasp. Results showed enhanced discrimination at those locations where index finger and thumb would touch the object, as compared to action-irrelevant locations. A same-different task was used to establish that attention was deployed in parallel at the grasp-relevant locations. Interestingly, while attention seemed to split between the action-relevant locations, the eyes tended to fixate the centre of the to-be-grasped object, reflecting a dissociation between overt and covert attention. A separate study demonstrated that a secondary, attention-demanding task affected the kinematics of the grasp, slowing the adjustment of hand aperture to object size. Our results highlight the important role of attention in grasp planning as well. The findings are consistent with the conjecture that the planning of complex movements enacts the formation of a flexible “attentional landscape” which tags all those locations in the visual layout that are relevant for the impending action.

The function of microsaccades in fine spatial vision

Speaker: Michele Rucci, Boston University

The visual functions of microsaccades, the microscopic saccades that humans perform while attempting to maintain fixation, have long been debated. The traditional proposal that microsaccades prevent perceptual fading has been criticized on multiple grounds. We have recently shown that, during execution of a high-acuity task, microsaccades move the gaze to nearby regions of interest according to the ongoing demands of the task (Ko et al., Nature Neurosci. 2010). That is, microsaccades are used to examine a narrow region of space in the same way larger saccades normally enable exploration of a visual scene. Given that microsaccades keep the stimulus within the fovea, what is the function of these small gaze relocations? By using new gaze-contingent display procedures, we were able to selectively stimulate retinal regions at specific eccentricities within the fovea. We show that, contrary to common assumptions, vision is not uniform within the fovea: a stimulus displacement from the center of gaze of only 10 arcmin already causes a significant reduction in performance in a high-acuity task. We also show that precisely directed microsaccades compensate for this lack of homogeneity, giving the false impression of uniform foveal vision in experiments that lack control of retinal stimulation. Finally, we show that the perceptual improvement given by microsaccades in high-acuity tasks results from accurately positioning the preferred retinal locus in space rather than from the temporal transients microsaccades generate. These results demonstrate that vision and motor behavior operate in a closed loop even during visual fixation.
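
The gaze-contingent logic can be sketched in a few lines (a hedged illustration; the function names, pixel scaling and update scheme are our assumptions, not the authors' actual display code): on every frame the stimulus is redrawn at the current gaze position plus a fixed offset, so it always lands at the same retinal eccentricity regardless of eye movements.

```python
# Toy sketch of a gaze-contingent update: the stimulus tracks the eye
# with a constant retinal offset (here, a horizontal displacement).

def arcmin_to_px(arcmin, px_per_deg):
    """Convert a retinal offset in arcminutes to screen pixels."""
    return arcmin / 60.0 * px_per_deg

def stimulus_position(gaze_xy, eccentricity_arcmin, px_per_deg=60.0):
    """Screen coordinates that keep the stimulus displaced from the
    center of gaze by a fixed retinal eccentricity."""
    dx = arcmin_to_px(eccentricity_arcmin, px_per_deg)
    return (gaze_xy[0] + dx, gaze_xy[1])

# e.g., the eye tracker reports gaze at pixel (512, 384); draw the
# target 10 arcmin to the right of the center of gaze:
print(stimulus_position((512, 384), eccentricity_arcmin=10))  # (522.0, 384)
```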

Decorrelation of retinal response to natural scenes by fixational eye movements

Speaker: Ronen Segev, Ben Gurion University of the Negev, Department of Life Sciences and Zlotowski Center for Neuroscience

Fixational eye movements are critical for vision: without them, the retina adapts quickly to a stationary image and visual perception fades away in a matter of seconds. Still, the connection between fixational eye movements and retinal encoding is not fully understood. To address this issue, it was suggested theoretically that fixational eye movements are required to reduce the spatial correlations that are typical of natural scenes. The goal of our study was to put this theoretical prediction to experimental test. Using a multielectrode array, we measured the response of the tiger salamander retina to movies simulating two types of stimuli: fixational eye movements over a natural scene, and a flash followed by a static view of a natural scene. We then calculated the cross-correlation in the response of the ganglion cells as a function of receptive-field distance. We found that when static natural images are projected, strong spatial correlations are present in the neural response due to the correlations in the natural scene. In the presence of fixational eye movements, however, the level of correlation in the neural response drops much faster as a function of distance, resulting in effective decorrelation of the channels streaming information to the brain. This observation confirms the prediction that fixational eye movements act to reduce the correlations in the retinal response, and provides a better understanding of the contribution of fixational eye movements to information processing by the retina.
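
The core analysis lends itself to a compact sketch (a toy version under our own assumptions about data layout; it is not the authors' code): given spike counts per cell and the cells' receptive-field positions, compute the mean pairwise response correlation within distance bins.

```python
import numpy as np

# Toy correlation-vs-distance analysis: counts is (cells x time bins),
# positions is (cells x 2) receptive-field centers.

def correlation_vs_distance(counts, positions, n_bins=10):
    n = counts.shape[0]
    corr = np.corrcoef(counts)                      # cell-pair correlations
    i, j = np.triu_indices(n, k=1)                  # each pair once
    dist = np.linalg.norm(positions[i] - positions[j], axis=1)
    edges = np.linspace(0.0, dist.max(), n_bins + 1)
    which = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)
    pair_corr = corr[i, j]
    return np.array([pair_corr[which == b].mean() for b in range(n_bins)])

# Fake data: 50 cells, 1000 time bins, random 2D receptive-field centers.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(50, 1000)).astype(float)
positions = rng.uniform(0.0, 1.0, size=(50, 2))
print(correlation_vs_distance(counts, positions))
```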

Searching for a fit between the “silent” surround of V1 receptive fields and eye-movements

Speaker: Yves Frégnac, UNIC-CNRS, Department of Neurosciences, Information and Complexity, Gif-sur-Yvette, France

To what extent can emerging macroscopic perceptual features (i.e., Gestalt rules) be predicted in V1 from the characteristics of neuronal integration? We use in vivo intracellular electrophysiology in the anesthetized brain, in which the impact of visuomotor exploration on retinal flow is controlled by simulating realistic but virtual classes of eye-movements (fixation, tremor, shift, saccade). By comparing synaptic echoes to different types of full-field visual statistics (sparse noise, grating, natural scene, dense noise, apparent-motion noise) in which the retinal effects of virtual eye-movements are, or are not, included, we have reconstructed the perceptual association field of visual cortical neurons, extending 10 to 20° away from the classical discharge field. Our results show that for any V1 cortical cell there exists a fit between the spatio-temporal organization of its subthreshold “silent” (nCRF) and spiking (CRF) receptive fields and the dynamic features of the retinal flow produced by specific classes of eye-movements (saccades and fixation). The functional features of the resulting association field are interpreted as facilitating the integration of feed-forward inputs yet to come, by propagating a kind of network belief in the possible presence of Gestalt-like percepts (co-alignment, common fate, filling-in). Our data support the existence of global association fields binding Form and Motion, which operate during low-level (non-attentive) perception as early as V1 and are dynamically regulated by the retinal flow produced by natural eye-movements. Current work is supported by CNRS and by grants from ANR (NatStats and V1-complex) and the European Community FET-Bio-I3 programs (IP FP6: FACETS (015879); IP FP7: BRAINSCALES (269921) and Brain-i-nets (243914)).

Contextual and top-down influences in vision

Time/Room: Friday, May 10, 1:00 – 3:00 pm, Royal 4-5
Organizer: Uri Polat, Tel-Aviv University
Presenters: Charles Gilbert, Uri Polat, Rudiger von der Heydt, Pieter Roelfsema, Dennis Levi, Dov Sagi

Symposium Description

According to classical models of spatial vision, the output of neurons in the early visual cortex is determined by the local features of the stimuli and integrated at later stages of processing (feedforward). However, experimental results obtained during the last two decades show contextual modulation: local perceptual effects are modulated by global image properties. The receptive field properties of cortical neurons are subject to learning and to top-down influences of attention, expectation and perceptual task. Even at early cortical stages of visual processing, neurons are subject to contextual influences that play a role in intermediate-level vision, contour integration and surface segmentation, enabling them to integrate information over large parts of the visual field. These influences are not fixed but are subject to experience, enabling neurons to encode learned information. The dynamic properties of contextual modulation are mediated by an interaction between reentrant signals to the cortex and intrinsic cortical connections, which changes effective connectivity within the cortical network. The evolving view of the nature of the receptive field includes contextual influences that change in the long term as a result of perceptual learning and in the short term as a result of a changing behavioral context. In this symposium we will present anatomical, physiological and psychophysical data showing contextual effects in lateral interactions, grouping, border ownership, crowding and perceptual learning.

Presentations

Contextual modulation in the visual cortex

Speaker: Charles Gilbert, The Rockefeller University, New York

Vision is an active process. The receptive field properties of cortical neurons are subject to learning and to top-down influences of attention, expectation and perceptual task. Even at early cortical stages of visual processing, neurons are subject to contextual influences that play a role in intermediate-level vision, contour integration and surface segmentation, enabling them to integrate information over large parts of the visual field. These influences are not fixed but are subject to experience, enabling neurons to encode learned information. Even in the adult visual cortex there is considerable plasticity, with cortical circuits undergoing exuberant changes in axonal arbors following manipulation of sensory experience. The integrative properties of cortical neurons, the contextual influences that confer selectivity to complex stimuli, are mediated in part by a plexus of long-range horizontal connections that enable neurons to integrate information over an area of visual cortex representing large parts of the visual field. These connections are the substrate for an association field, a set of interactions playing a role in contour integration and saliency. The association field is not fixed. Rather, neurons can select components of this field to express different functional properties. As a consequence, neurons can be thought of as adaptive processors, changing their function according to behavioral context, and their responses reflect the demands of the perceptual task being performed. The top-down signal facilitates our ability to segment the visual scene despite its complex arrangement of objects and backgrounds. It plays a role in the encoding and recall of learned information. The resulting feedforward signals carried by neurons convey different meanings according to the behavioral context. We propose that these dynamic properties are mediated by an interaction between reentrant signals to the cortex and intrinsic cortical connections, changing effective connectivity within the cortical network. The evolving view of the nature of the receptive field includes contextual influences that change in the long term as a result of perceptual learning and in the short term as a result of a changing behavioral context.

Spatial and temporal rules for contextual modulations

Speaker: Uri Polat, Tel-Aviv University, Tel-Aviv, Israel

Most contextual modulations, such as center-surround and crowding, exhibit a suppressive effect. In contrast, collinear configurations are a unique case of contextual modulation in which the effect can be either facilitative or suppressive, depending on the context. Physiological and psychophysical studies have revealed several spatial and temporal rules that determine the modulation effect: 1) spatial configuration: collinear configurations can be either facilitative or suppressive, whereas non-collinear configurations may be suppressive; 2) separation between the elements: suppression for close separations that coincide with the size of the receptive field, and facilitation outside the receptive field; 3) activity dependence: facilitation for low contrast (near threshold) and suppression for high contrast; 4) temporal properties: suppression is fast and transient, whereas facilitation is delayed and sustained; 5) attention may enhance the facilitation; 6) slow modulation: perceptual learning can increase the facilitatory effect over a time scale of several days; 7) fovea and periphery: similar rules apply when spatial scaling to the size of the receptive field is done. It is believed that the role of collinear facilitation is to enhance contour integration and object segmentation, whereas center-surround modulation is important for pop-out. Our recent studies suggest that these rules can serve as a unified model for spatial and temporal masking as well as for crowding.
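
As a reading aid, the first three rules can be condensed into a toy lookup (our own simplification, not a quantitative model; the threshold value and function names are illustrative):

```python
# Sign of the contextual modulation given configuration, flanker
# separation (in receptive-field units), and target contrast.

def modulation(collinear, separation_rf_units, contrast, threshold=0.05):
    if not collinear:
        return "suppression"        # rule 1: non-collinear configuration
    if separation_rf_units <= 1.0:
        return "suppression"        # rule 2: flankers within the RF
    if contrast <= threshold:
        return "facilitation"       # rule 3: near-threshold target
    return "suppression"            # rule 3: high-contrast target

# Collinear flankers beyond the RF, near-threshold target -> facilitation:
print(modulation(collinear=True, separation_rf_units=3.0, contrast=0.03))
```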

Border ownership and context

Speaker: Rudiger von der Heydt, The Johns Hopkins University, Baltimore, Maryland, USA

A long history of studies of perception has shown that the visual system organizes the incoming information early on, interpreting the 2D image in terms of a 3D world and producing a structure that enables object-based attention and tracking of object identity. Recordings from monkey visual cortex show that many neurons, especially in area V2, are selective for border ownership. These neurons are edge selective and have ordinary classical receptive fields, but in addition, their responses are modulated (enhanced or suppressed) depending on the location of a ‘figure’ relative to the edge in their receptive field. Each neuron has a fixed preference for location on one side or the other. This selectivity is derived from the image context far beyond the classical receptive field. This talk will review evidence indicating that border ownership selectivity reflects mechanisms of object definition. The evidence includes experiments showing (1) reversal of border ownership signals with change of perceived object structure, (2) border-ownership-specific enhancement of responses in object-based selective attention, (3) persistence of border ownership signals in accordance with continuity of object perception, and (4) remapping of border ownership signals across saccades and object movements. Some of these findings can be explained by assuming that grouping circuits detect ‘objectness’ according to simple rules and, via recurrent projections, enhance the low-level feature signals representing the object. This might be the mechanism of object-based attention. Additional circuits may provide persistence and remapping.

Visual cortical mechanisms for perceptual grouping

Speaker: Pieter Roelfsema, Netherlands Institute for Neuroscience, Amsterdam, the Netherlands

A fundamental task of vision is to group the image elements that belong to one object and to segregate them from other objects and the background. I will discuss a new conceptual framework that explains how the binding problem is solved by the visual cortex. According to this framework, two mechanisms are responsible for binding: base-grouping and incremental grouping. Base-groupings are coded by single neurons tuned to multiple features, like the combination of a color and an orientation. They are computed rapidly because they reflect the selectivity of feedforward connections that propagate information from lower to higher areas of the visual cortex. However, not all conceivable feature combinations are coded by dedicated neurons. Therefore, a second, flexible incremental grouping mechanism is required. Incremental grouping relies on horizontal connections between neurons in the same area and feedback connections that propagate information from higher to lower areas. These connections spread an enhanced response (not synchrony) to all the neurons that code image elements that belong to the same perceptual object. This response enhancement acts as a label that tags those neurons that respond to image elements to be bound in perception. The enhancement of neuronal activity during incremental grouping has a correlate in psychology because object-based attention is directed to the features labeled with the enhanced neuronal response. Our recent results demonstrate that feedforward and feedback processing rely on different receptors for glutamate and on processing in different cortical layers.
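
The incremental-grouping idea maps naturally onto label spreading over a connectivity graph. The sketch below is our own analogy, not the model's implementation: an enhanced-response "label" spreads from a seed neuron along horizontal links, tagging every neuron whose image element belongs to the same perceptual object.

```python
from collections import deque

def incremental_grouping(links, seed):
    """links: dict mapping each neuron to the set of neurons it is
    horizontally connected to; returns the set of enhanced neurons."""
    enhanced = {seed}
    frontier = deque([seed])
    while frontier:
        n = frontier.popleft()
        for m in links[n]:
            if m not in enhanced:
                enhanced.add(m)       # the response enhancement spreads
                frontier.append(m)
    return enhanced

# Neurons 0-1-2 code one contour; 3-4 code a separate object.
links = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
print(incremental_grouping(links, seed=0))  # {0, 1, 2}
```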

Crowding in context

Speaker: Dennis Levi, UC Berkeley, Berkeley, CA, USA

In peripheral vision, objects that can be readily recognized when viewed in isolation become unrecognizable in clutter. This interesting phenomenon is known as visual crowding. Crowding represents an essential bottleneck, setting limits on object perception, eye movements, visual search, reading and perhaps other functions in peripheral, amblyopic and developing vision (Whitney & Levi, 2011). It is generally defined as the deleterious influence of nearby contours on visual discrimination, but the effects of crowding go well beyond impaired discrimination. Crowding impairs the ability to recognize and respond appropriately to objects in clutter. Thus, studying crowding may lead to a better understanding of the processes involved in object recognition. Crowding also has important clinical implications for patients with macular degeneration, amblyopia and dyslexia. Crowding is strongly dependent on context. The focus of this talk will be on trying to put crowding into context with other visual phenomena.

Perceptual learning in context

Speaker: Dov Sagi, The Weizmann Institute of Science, Rehovot, Israel

Studies of perceptual learning show a large diversity of effects, with learning rate and specificity varying across stimuli and experimental conditions. Most notably, there is an initial fast phase of within-session (online) learning followed by a slower phase, taking place over days, which is highly specific to basic image features. Our results show that the latter phase is highly sensitive to contextual modulation. While thresholds for contrast discrimination of a single Gabor patch are relatively stable and unaffected by training, the addition of close flankers induces dramatic improvements in thresholds, indicating an increased gain of the contrast response function (“context-enabled learning”). Cross-orientation masking effects can be practically eliminated by practice. In texture discrimination, learning was found to interact with slowly evolving adaptive effects that reduce the benefits of learning. These deteriorative effects can be eliminated by cross-orientation interactions, which were found to counteract sensory adaptation. The experimental results are explained by plasticity within local networks of early vision, assuming excitatory-inhibitory interactions in which context modulates the balance between excitation and inhibition. We suggest that reduced inhibition increases learning efficiency, making learning faster and more generalizable. The specificity of learning appears to result from experience-dependent local contextual interactions.

2013 Symposia

The structure of visual working memory

Organizer: Wei Ji Ma, Baylor College of Medicine
Time/Room: Friday, May 10, 1:00 – 3:00 pm, Royal 1-3

Working memory is an essential component of perception, cognition, and action. The past eight years have seen a surge of activity aimed at understanding the structure of visual working memory. Is working memory performance limited by a maximum number of objects that can be remembered, or by the quality of the memories? Does context affect how we remember objects? This symposium brings together some of the leading thinkers in this field to discuss these central theoretical issues.

Contextual and top-down influences in vision

Organizer: Uri Polat, Tel-Aviv University
Time/Room: Friday, May 10, 1:00 – 3:00 pm, Royal 4-5

Vision is an active process. The properties of cortical neurons are subject to learning and to top-down influences of attention, expectation and perceptual task. Even at early cortical stages of visual processing, neurons are subject to contextual influences that play a role in vision. These influences are not fixed but are subject to experience, enabling neurons to encode learned information. In this symposium we will present anatomical, physiological and psychophysical data showing contextual effects in almost every visual task. We will show that visual perception involves both instantaneous pre-attentive and attentive processes that enhance perception.

Active Perception: The synergy between perception and action

Organizers: Michele Rucci, Boston University & Eli Brenner, VU University
Time/Room: Friday, May 10, 1:00 – 3:00 pm, Royal 6-8

Visual perception is often studied in a passive manner without consideration of motor activity. Like many other species, however, humans are not passively exposed to the incoming flow of sensory data. They actively seek useful information by coordinating sensory processing with motor activity. In fact, behavior is a key component of sensory perception, as it enables control of sensory signals in ways that simplify perceptual tasks. This workshop will focus on recent findings which have further emphasized the tight link between perception and action.

ARVO@VSS: Visual Development

Organizers: Susana Chung, University of California, Berkeley and Anthony Norcia, Stanford University
Time/Room: Friday, May 10, 3:30 – 5:30 pm, Royal 1-3

Many visual functions continue to develop and reach adult levels only in late childhood. The successful development of normal visual functions requires ‘normal’ visual experience. The speakers of this symposium will review the time courses of normal development of selected visual functions, and discuss the consequences of abnormal visual experience during development for these functions. The prospect of recovering visual functions in adults who had abnormal visual experience during development will also be discussed, along with advances in the assessment of visual functions in children whose visual development was abnormal due to damage to the visual cortex and the posterior visual pathways.

Decoding and the spatial scale of cortical organization

Organizers: Jeremy Freeman, New York University; Elisha P. Merriam, Departments of Psychology and Neural Science, New York University; and Talia Konkle, Department of Psychology, Harvard University
Time/Room: Friday, May 10, 3:30 – 5:30 pm, Royal 4-5

With functional neuroimaging data we have incredible access to a rich landscape of neural responses, but this access comes with challenging questions: Over what expanse of cortex is information meaningfully clustered — in other words, over what scales should we expect neural information to be organized? How should inferences about cortical organization take into account the complex nature of the imaging signal, which reflects neural and non-neural signals at multiple spatial scales? In this symposium, six investigators discuss representational structure at multiple spatial scales across the cortex, highlighting the inferential strengths and weaknesses of cutting-edge analyses across multiple experimental techniques.

Does appearance matter?

Organizer: Sarah R. Allred, Rutgers–The State University of New Jersey
Time/Room: Friday, May 10, 3:30 – 5:30 pm, Royal 6-8

Vision science originated with questions about how and why things look the way they do, but phenomenology is sometimes given short shrift in the field as a whole. We discuss objective methods that capture what we mean by appearance and examine the criteria for behaviors that are best thought of as mediated by reasoning about appearances. By utilizing phenomenology, we provide a parsimonious understanding of many empirical phenomena, including instructional effects in lightness perception, contextual effects on color constancy, systematic biases in egocentric distance perception and the prediction of 3D shape from orientation flows. We also discuss contemporary interactions between appearance, physiology, and neural models.
