The Role of Ensemble Statistics in the Visual Periphery

S4 – The Role of Ensemble Statistics in the Visual Periphery

Time/Room: Friday, May 19, 2017, 2:30 – 4:30 pm, Pavilion
Organizer(s): Brian Odegaard, University of California-Los Angeles
Presenters: Michael Cohen, David Whitney, Ruth Rosenholtz, Tim Brady, Brian Odegaard


The past decades have seen the growth of a tremendous amount of research into the human visual system’s capacity to encode “summary statistics” of items in the world. Studies have shown that the visual system possesses a remarkable ability to compute properties such as average size, position, motion direction, gaze direction, emotional expression, and liveliness, as well as variability in color and facial expression, documenting the phenomena across various domains and stimuli. One recent proposal in the literature has focused on the promise of ensemble statistics to provide an explanatory account of subjective experience in the visual periphery (Cohen, Dennett, & Kanwisher, Trends in Cognitive Sciences, 2016). In addition to this idea, others have suggested that summary statistics underlie performance in visual tasks in a broad manner. These hypotheses open up intriguing questions: how are ensemble statistics encoded outside the fovea, and to what extent does this capacity explain our experience of the majority of our visual field? In this proposed symposium, we aim to discuss recent empirical findings, theories, and methodological considerations in pursuit of answers to many questions in this growing area of research, including the following: (1) How does the ability to process summary statistics in the periphery compare to this ability at the center of the visual field? (2) What role (if any) does attention play in the ability to compute summary statistics in the periphery? (3) Which computational modeling frameworks provide compelling, explanatory accounts of this phenomenon? (4) Which summary statistics (e.g., mean, variance) are encoded in the periphery, and are there limitations on the precision/capacity of these estimates? By addressing questions such as those listed above, we hope that participants emerge from this symposium with a more thorough understanding of the role of ensemble statistics in the visual periphery, and how this phenomenon may account for subjective experience across the visual field. Our proposed group of speakers is shown below, and we hope that faculty, post-docs, and graduate students alike would find this symposium to be particularly informative, innovative, and impactful.
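
To make the notion of a summary statistic concrete, here is a minimal Python sketch (with made-up display values, not any presenter's actual analysis) of the kind of computation at issue: the mean and variance of item sizes, and the circular mean of item orientations.

import numpy as np

# Hypothetical display: item sizes (deg of visual angle) and orientations (deg)
sizes = np.array([1.2, 0.8, 1.5, 1.1, 0.9, 1.3])
orientations_deg = np.array([10., 35., 20., 170., 15., 25.])

# Ensemble statistics over a linear feature: mean and variance of size
mean_size = sizes.mean()
var_size = sizes.var(ddof=1)

# Orientation is circular (period 180 deg), so average on the doubled-angle circle
angles = np.deg2rad(2 * orientations_deg)
mean_orientation = np.rad2deg(np.angle(np.exp(1j * angles).mean())) / 2 % 180

print(f"mean size = {mean_size:.2f}, size variance = {var_size:.2f}")
print(f"circular mean orientation = {mean_orientation:.1f} deg")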

Ensemble statistics and the richness of perceptual experience

Speaker: Michael Cohen, MIT

While our subjective impression is of a detailed visual world, a wide variety of empirical results suggest that perception is actually rather limited. Findings from change blindness and inattentional blindness highlight how much of the visual world regularly goes unnoticed. Furthermore, direct estimates of the capacity of visual attention and working memory reveal that surprisingly few items can be processed and maintained at once. Why do we think we see so much when these empirical results suggest we see so little? One possible answer to this question resides in the representational power of visual ensembles and summary statistics. Under this view, those items that cannot be represented as individual objects or with great precision are nevertheless represented as part of a broader statistical summary. By representing much of the world as an ensemble, observers have perceptual access to different aspects of the entire field of view, not just a few select items. Thus, ensemble statistics play a critical role in our ability to account for and characterize the apparent richness of perceptual experience.

Ensemble representations as a basis for rich perceptual experiences

Speaker: David Whitney, University of California-Berkeley

Much of our rich visual experience comes in the form of ensemble representations, the perception of summary statistical information in groups of objects—such as the average size of items, the average emotional expression of faces in a crowd, or the average heading direction of point-light walkers. These ensemble percepts occur over space and time, are robust to outliers, and can occur in the visual periphery. Ensemble representations can even convey unique and emergent social information like the gaze of an audience, the animacy of a scene, or the panic in a crowd, information that is not necessarily available at the level of the individual crowd members. The visual system can make these high-level interpretations of social and emotional content with exposures as brief as 50 ms, thus revealing an extraordinarily efficient process for compressing what would otherwise be an overwhelming amount of information. Much of what is believed to count as rich social, emotional, and cognitive experience actually comes in the form of basic, compulsory, visual summary statistical processes.

Summary statistic encoding plus limits on decision complexity underlie the richness of visual perception as well as its quirky failures

Speaker: Ruth Rosenholtz, MIT

Visual perception is full of puzzles. Human observers effortlessly perform many visual tasks, and have the sense of a rich percept of the visual world. Yet when probed for details they are at a loss. How does one explain this combination of marvelous successes and puzzling failures? Numerous researchers have explained the failures in terms of severe limits on resources of attention and memory. But if so, how can one explain the successes? My lab has argued that many experimental results pointing to apparent attentional limits instead derived at least in part from losses in peripheral vision. Furthermore, we demonstrated that those losses could arise from peripheral vision encoding its inputs in terms of a rich set of local image statistics. This scheme is theoretically distinct from encoding ensemble statistics of a set of similar items. I propose that many of the remaining attention/memory limits can be unified in terms of a limit on decision complexity. This decision complexity is difficult to reason about, because the complexity of a given task depends upon the underlying encoding. A complex, general-purpose encoding likely evolved to make certain tasks easy at the expense of others. Recent advances in understanding this encoding, including in peripheral vision, may help us finally make sense of the puzzling strengths and limitations of visual perception.
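
As a toy illustration of the idea that peripheral vision encodes pooled local image statistics rather than individual items, the sketch below reduces each pooling region of an image to its mean luminance and contrast. This is a drastic simplification assumed only for illustration, not the actual statistical model used in this line of work.

import numpy as np

def pooled_statistics(image, pool_size=16):
    """Summarize an image by local first- and second-order statistics.

    Each pooling region is reduced to its mean luminance and contrast
    (standard deviation), discarding the exact arrangement of items
    within the region.
    """
    h, w = image.shape
    stats = []
    for r in range(0, h - pool_size + 1, pool_size):
        for c in range(0, w - pool_size + 1, pool_size):
            patch = image[r:r + pool_size, c:c + pool_size]
            stats.append((patch.mean(), patch.std()))
    return np.array(stats)  # shape: (n_regions, 2)

rng = np.random.default_rng(0)
fake_image = rng.random((128, 128))
summary = pooled_statistics(fake_image)
print(summary.shape)  # (64, 2): 64 pooling regions, 2 statistics each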

The role of spatial ensemble statistics in visual working memory and scene perception

Speaker: Tim Brady, University of California-San Diego

At any given moment, much of the relevant information about the visual world is in the periphery rather than the fovea. The periphery is particularly useful for providing information about scene structure and spatial layout, as well as informing us about the spatial distribution and features of the objects we are not explicitly attending and fixating. What is the nature of our representation of this information about scene structure and the spatial distribution of objects? In this talk, I’ll discuss evidence that representations of the spatial distribution of simple visual features (like orientation, spatial frequency, color), termed spatial ensemble statistics, are specifically related to our ability to quickly and accurately recognize visual scenes. I’ll also show that these spatial ensemble statistics are a critical part of the information we maintain in visual working memory – providing information about the entire set of objects, not just a select few, across eye movements, blinks, occlusions and other interruptions of the visual scene.

Summary Statistics in the Periphery: A Metacognitive Approach

Speaker: Brian Odegaard, University of California-Los Angeles

Recent evidence indicates that human observers often overestimate their capacity to make perceptual judgments in the visual periphery. How can we quantify the degree to which this overestimation occurs? We describe how applications of Signal Detection Theoretic frameworks provide one promising approach to measure both detection biases and task performance capacities for peripheral stimuli. By combining these techniques with new metacognitive measures of perceptual confidence (such as meta-d’; Maniscalco & Lau, 2012), one can obtain a clearer picture regarding (1) when subjects can simply perform perceptual tasks in the periphery, and (2) when they have true metacognitive awareness of the visual surround. In this talk, we describe results from recent experiments employing these quantitative techniques, comparing and contrasting the visual system’s capacity to encode summary statistics in both the center and periphery of the visual field.
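
For readers unfamiliar with these measures, the sketch below illustrates the general approach under stated assumptions: type-1 d' and criterion computed from invented detection counts, plus a nonparametric type-2 AUROC as a stand-in for metacognitive sensitivity. Meta-d' proper requires fitting the type-2 ROC (Maniscalco & Lau, 2012) and is not implemented here.

import numpy as np
from scipy.stats import norm

def dprime_criterion(hits, misses, fas, crs):
    """Type-1 SDT: sensitivity (d') and criterion from a yes/no detection task."""
    hr = (hits + 0.5) / (hits + misses + 1.0)   # log-linear correction
    far = (fas + 0.5) / (fas + crs + 1.0)
    d = norm.ppf(hr) - norm.ppf(far)
    c = -0.5 * (norm.ppf(hr) + norm.ppf(far))
    return d, c

def type2_auroc(conf_correct, conf_incorrect):
    """Nonparametric metacognitive sensitivity: area under the type-2 ROC,
    i.e. how well confidence discriminates correct from incorrect trials."""
    correct = np.asarray(conf_correct)
    wrong = np.asarray(conf_incorrect)
    greater = (correct[:, None] > wrong[None, :]).mean()
    ties = (correct[:, None] == wrong[None, :]).mean()
    return greater + 0.5 * ties

d, c = dprime_criterion(hits=70, misses=30, fas=20, crs=80)
auroc2 = type2_auroc([3, 4, 2, 4, 3], [2, 1, 3, 2, 1])   # made-up confidence ratings
print(f"d' = {d:.2f}, criterion = {c:.2f}, type-2 AUROC = {auroc2:.2f}")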


The Brain Correlates of Perception and Action: from Neural Activity to Behavior

S2 – The Brain Correlates of Perception and Action: from Neural Activity to Behavior

Time/Room: Friday, May 19, 2017, 12:00 – 2:00 pm, Pavilion
Organizer(s): Simona Monaco, Center for Mind/Brain Sciences, University of Trento & Annalisa Bosco, Department of Pharmacy and Biotech, University of Bologna
Presenters: J. Douglas Crawford, Patrizia Fattori, Simona Monaco, Annalisa Bosco, Jody C. Culham


Symposium Description

In recent years, neuroimaging and neurophysiology have enabled cognitive neuroscience to identify numerous brain areas that are involved in sensorimotor integration for action. This research has revealed cortical and subcortical brain structures that work in coordination to allow accurate hand and eye movements. The visual information about objects in the environment is integrated into the motor plan through a cascade of events known as visuo-motor integration. These mechanisms allow us not only to extract relevant visual information for action, but also to continuously update this information throughout action planning and execution. As our brain evolved to act towards real objects in the natural environment, studying hand and eye movements in experimental situations that resemble the real world is critical for our understanding of the action system. This aspect has been relatively neglected in the cognitive sciences, mostly because of the challenges associated with the experimental setups and technologies. This symposium provides a comprehensive view of the neural mechanisms underlying sensory-motor integration for the production of eye and hand movements in situations that are common to real life. The range of topics covered by the speakers encompasses the visual as well as the motor and cognitive neurosciences, and is therefore relevant to junior and senior scientists specialized in any of these areas. We bring together researchers from macaque neurophysiology to human neuroimaging and behavior. The combination of works that use these cutting-edge techniques offers a unique insight into the effects that are detected at the neuronal level, extended to neural populations and translated into behavior. There will be five speakers. Doug Crawford will address the neuronal mechanisms underlying perceptual-motor integration during head-unrestrained gaze shifts in the frontal eye field and superior colliculus of macaques. Patrizia Fattori will describe how the activity of neurons in the dorsomedial visual stream of macaques is modulated by gaze and hand movement direction as well as properties of real objects. Jody Culham will illustrate the neural representation for visually guided actions and real objects in the human brain revealed by functional magnetic resonance imaging (fMRI). Simona Monaco will describe the neural mechanisms in the human brain underlying the influence of intended action on sensory processing and the involvement of the early visual cortex in action planning and execution. Annalisa Bosco will detail the behavioral aspects of the influence exerted by action on perception in human participants.

Visual-motor transformations at the Neuronal Level in the Gaze System

Speaker: J. Douglas Crawford, Centre for Vision Research, York University
Additional Authors: AmirSaman Sajad, Center for Integrative & Cognitive Neuroscience, Vanderbilt University and Morteza Sadeh, Centre for Vision Research, York University

The fundamental question in perceptual-motor integration is how, and at what level, do sensory signals become motor signals? Does this occur between brain areas, within brain areas, or even within individual neurons? Various training or cognitive paradigms have been combined with neurophysiology and/or neuroimaging to address this question, but the visuomotor transformations for ordinary gaze saccades remain elusive. To address these questions, we developed a method for fitting visual and motor response fields against various spatial models without any special training, based on trial-to-trial variations in behavior (DeSouza et al., 2011). More recently we used this to track visual-motor transformations through time. We find that superior colliculus and frontal eye field visual responses encode target direction, whereas their motor responses encode final gaze position relative to initial eye orientation (Sajad et al. 2015; Sadeh et al. 2016). This occurs between neuron populations, but can also be observed within individual visuomotor cells. When a memory delay is imposed, a gradual transition of intermediate codes is observed (perhaps due to an imperfect memory loop), with a further ‘leap’ toward gaze motor coding in the final memory-motor transformation (Sajad et al. 2016). However, we found a similar spatiotemporal transition even within the brief burst of neural activity that accompanies a reactive, visually-evoked saccade. What these data suggest is that visuomotor transformations are a network phenomenon that is simultaneously observable at the level of individual neurons, and distributed across different neuronal populations and structures.

Neurons for eye and hand action in the monkey medial posterior parietal cortex

Speaker: Patrizia Fattori, University of Bologna
Additional Authors: Patrizia Fattori, Rossella Breveglieri, and Claudio Galletti, Department of Pharmacy and Biotechnology, University of Bologna

In the last decades, several components of the visual control of eye and hand movements have been disentangled by studying single neurons in the brain of awake macaque monkeys. In this presentation, particular attention will be given to the influence of the direction of gaze upon the reaching activity of neurons of the dorsomedial visual stream. We recorded from the caudal part of the medial posterior parietal cortex, finding neurons sensitive to the direction and amplitude of arm reaching actions. The reaching activity of these neurons was influenced by the direction of gaze, some neurons preferring foveal reaching, others peripheral reaching. Manipulations of eye/target positions and of hand position showed that the reaching activity could be in eye-centered, head-centered, or a mixed frame of reference according to the considered neuron. We also found neurons modulated by the visual features of real objects and neurons modulated also by grasping movements, such as wrist orientation and grip formation. So it seems that the entire neural machinery for encoding eye and hand action is hosted in the dorsomedial visual stream. This machinery takes part in the sequence of visuomotor transformations required to encode many aspects of the reach-to-grasp actions.

The role of the early visual cortex in action

Speaker: Simona Monaco, Center for Mind/Brain Sciences, University of Trento
Additional Authors: Simona Monaco, Center for Mind/Brain Sciences, University of Trento; Doug Crawford, Centre for Vision Research, York University; Luca Turella, Center for Mind/Brain Sciences, University of Trento; Jody Culham, Brain and Mind Institute, University of Western Ontario

Functional magnetic resonance imaging has recently made it possible to show that intended action modulates the sensory processing of object orientation in areas of the action network in the human brain. In particular, intended actions can be decoded in the early visual cortex using multivoxel pattern analyses before the movements are initiated, regardless of whether the target object is visible or not. In addition, the early visual cortex is re-recruited during actions in the dark towards stimuli that have been previously seen. These results suggest three main points. First, the action-driven modulation of sensory processing is shown at the neural level in a network of areas that include the early visual cortex. Second, the role of the early visual cortex goes well beyond the processing of sensory information for perception and might be the target of reentrant feedback for sensory-motor integration. Third, the early visual cortex shows action-driven modulation during both action planning and execution, suggesting a continuous exchange of information with higher-order visual-motor areas for the production of a motor output.
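
Below is a schematic of the kind of multivoxel pattern analysis described here, using synthetic data and an off-the-shelf linear classifier. The voxel patterns, labels, and effect size are invented for illustration; this is not the study's actual data or pipeline.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Hypothetical data: one pattern of early-visual-cortex voxel activity per
# planning-phase trial, labeled by the intended action (0 = grasp, 1 = reach)
rng = np.random.default_rng(1)
n_trials, n_voxels = 80, 200
labels = np.repeat([0, 1], n_trials // 2)
patterns = rng.normal(size=(n_trials, n_voxels))
patterns[labels == 1, :20] += 0.5          # inject a weak, distributed signal

# Cross-validated decoding; above-chance accuracy before movement onset
# is the kind of evidence described above
clf = LinearSVC(C=1.0, max_iter=10000)
acc = cross_val_score(clf, patterns, labels, cv=5).mean()
print(f"cross-validated decoding accuracy: {acc:.2f} (chance = 0.50)")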

The influence of action execution on object size perception

Speaker: Annalisa Bosco, Department of Pharmacy and Biotechnology, University of Bologna
Additional Authors: Annalisa Bosco, Department of Pharmacy and Biotechnology, University of Bologna; Patrizia Fattori, Department of Pharmacy and Biotechnology, University of Bologna

When performing an action, our perception is focused on the visual properties of the object that enable us to execute the action successfully. The motor system, however, is also able to influence perception, although only a few studies have reported evidence for hand-action-induced modifications of visual perception. Here, we tested for feature-specific perceptual modulation before and after a reaching and grasping action. Two groups of subjects were instructed to either grasp or reach to different-sized bars and, before and after the action, to perform a size perception task by manual and verbal report. Each group was tested in two experimental conditions: no prior knowledge of action type, where subjects did not know the upcoming type of movement, and prior knowledge of action type, where they were aware of the upcoming type of movement. In both manual and verbal perceptual size responses, we found that size perception was significantly modified after a grasping movement. Additionally, this modification was enhanced when subjects knew in advance the type of movement to be executed in the subsequent phase of the task. These data suggest that knowledge of the action type and execution of the action shape the perception of object properties.

Neuroimaging reveals the human neural representations for visually guided grasping of real objects and pictures

Speaker: Jody C. Culham, Brain and Mind Institute, University of Western Ontario
Additional Authors: Jody C. Culham, University of Western Ontario; Sara Fabbri, Radboud University Nijmegen; Jacqueline C. Snow, University of Nevada, Reno; Erez Freud, Carnegie-Mellon University

Neuroimaging, particularly functional magnetic resonance imaging (fMRI), has revealed many human brain areas that are involved in the processing of visual information for the planning and guidance of actions. One area of particular interest is the anterior intraparietal sulcus (aIPS), which is thought to play a key role in processing information about object shape for the visual control of grasping. However, much fMRI research has relied on artificial stimuli, such as two-dimensional photos, and artificial actions, such as pantomimed grasping. Recent fMRI studies from our lab have used representational similarity analysis on the patterns of fMRI activation from brain areas such as aIPS to infer neural coding in participants performing real actions upon real objects. This research has revealed the visual features of the object (particularly elongation) and the type of grasp (including the number of digits and precision required) that are coded in aIPS and other regions. Moreover, this work has suggested that these neural representations are affected by the realness of the object, particularly during grasping. Taken together, these results highlight the value of using more ecological paradigms to study sensorimotor control.
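
The following is a minimal representational similarity analysis sketch of the sort described above, with invented condition-by-voxel patterns and an invented "elongation" model: build a neural representational dissimilarity matrix (RDM), build a model RDM, and rank-correlate them. The data, region, and feature are assumptions for illustration only.

import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_conditions, n_voxels = 12, 150   # e.g. 12 object/grasp conditions

# Hypothetical condition-by-voxel activation patterns from a region such as aIPS
roi_patterns = rng.normal(size=(n_conditions, n_voxels))

# Hypothetical model RDM coding one candidate feature, e.g. object elongation
elongation = rng.uniform(size=n_conditions)
model_rdm = pdist(elongation[:, None], metric="euclidean")

# Neural RDM: pairwise correlation distances between condition patterns
neural_rdm = pdist(roi_patterns, metric="correlation")

# RSA statistic: rank correlation between model and neural dissimilarities
rho, p = spearmanr(model_rdm, neural_rdm)
print(f"model-neural RDM correlation: rho = {rho:.2f}, p = {p:.2f}")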


How can you be so sure? Behavioral, computational, and neuroscientific perspectives on metacognition in perceptual decision-making

S3 – How can you be so sure? Behavioral, computational, and neuroscientific perspectives on metacognition in perceptual decision-making

Time/Room: Friday, May 19, 2017, 2:30 – 4:30 pm, Talk Room 1
Organizer(s): Megan Peters, University of California Los Angeles
Presenters: Megan Peters, Ariel Zylberberg, Michele Basso, Wei Ji Ma, Pascal Mamassian


Metacognition, or our ability to monitor the uncertainty of our thoughts, decisions, and perceptions, is of critical importance across many domains. Here we focus on metacognition in perceptual decisions: the continuous inferences that we make about the most likely state of the world based on incoming sensory information. How does a police officer evaluate the fidelity of his perception that a perpetrator has drawn a weapon? How does a driver compute her certainty in whether a fleeting visual percept is a child or a soccer ball, impacting her decision to swerve? These kinds of questions are central to daily life, yet how such ‘confidence’ is computed in the brain remains unknown. In recent years, increasingly keen interest has been directed towards exploring such metacognitive mechanisms from computational (e.g., Rahnev et al., 2011, Nat Neuro; Peters & Lau, 2015, eLife), neuroimaging (e.g., Fleming et al., 2010, Science), brain stimulation (e.g., Fetsch et al., 2014, Neuron), and neuronal electrophysiology (e.g., Kiani & Shadlen, 2009, Science; Zylberberg et al., 2016, eLife) perspectives. Importantly, the computation of confidence is also of increasing interest to the broader range of researchers studying the computations underlying perceptual decision-making in general. Our central focus is on how confidence is computed in neuronal populations, with attention to (a) whether perceptual decisions and metacognitive judgments depend on the same or different computations, and (b) why confidence judgments sometimes fail to optimally track the accuracy of perceptual decisions. Key themes for this symposium will include neural correlates of confidence, behavioral consequences of evidence manipulation on confidence judgments, and computational characterizations of the relationship between perceptual decisions and our confidence in them. Our principal goal is to attract scientists studying or interested in confidence/uncertainty, sensory metacognition, and perceptual decision-making from both human and animal perspectives, spanning from the computational to the neurobiological level. We bring together speakers from across these disciplines, from animal electrophysiology and behavior through computational models of human uncertainty, to communicate their most recent and exciting findings. Given the recency of many of the findings discussed, our symposium will cover terrain largely untouched by the main program. We hope that the breadth of research programs represented in this symposium will encourage a diverse group of scientists to attend and actively participate in the discussion.

Transcranial magnetic stimulation to visual cortex induces suboptimal introspection

Speaker: Megan Peters, University of California Los Angeles
Additional Authors: Megan Peters, University of California Los Angeles; Jeremy Fesi, The Graduate Center of the City University of New York; Namema Amendi, The Graduate Center of the City University of New York; Jeffrey D. Knotts, University of California Los Angeles; Hakwan Lau, UCLA

In neurological cases of blindsight, patients with damage to primary visual cortex can discriminate objects but report no visual experience of them. This form of ‘unconscious perception’ provides a powerful opportunity to study perceptual awareness, but because the disorder is rare, many researchers have sought to induce the effect in neurologically intact observers. One promising approach is to apply transcranial magnetic stimulation (TMS) to visual cortex to induce blindsight (Boyer et al., 2005), but this method has been criticized for being susceptible to criterion bias confounds: perhaps TMS merely reduces internal visual signal strength, and observers are unwilling to report that they faintly saw a stimulus even if they can still discriminate it (Lloyd et al., 2013). Here we applied a rigorous response-bias free 2-interval forced-choice method for rating subjective experience in studies of unconscious perception (Peters and Lau, 2015) to address this concern. We used Bayesian ideal observer analysis to demonstrate that observers’ introspective judgments about stimulus visibility are suboptimal even when the task does not require that they maintain a response criterion — unlike in visual masking. Specifically, observers appear metacognitively blind to the noise introduced by TMS, in a way that is akin to neurological cases of blindsight. These findings are consistent with the hypothesis that metacognitive judgments require observers to develop an internal model of the statistical properties of their own signal processing architecture, and that introspective suboptimality arises when that internal model abruptly becomes invalid due to external manipulations.

The influence of evidence volatility on choice, reaction time and confidence in a perceptual decision

Speaker: Ariel Zylberberg, Columbia University
Additional Authors: Ariel Zylberberg, Columbia University; Christopher R. Fetsch, Columbia University; Michael N. Shadlen, Columbia University

Many decisions are thought to arise via the accumulation of noisy evidence to a threshold or bound. In perceptual decision-making, the bounded evidence accumulation framework explains the effect of stimulus strength, characterized by signal-to-noise ratio, on decision speed, accuracy and confidence. This framework also makes intriguing predictions about the behavioral influence of the noise itself. An increase in noise should lead to faster decisions, reduced accuracy and, paradoxically, higher confidence. To test these predictions, we introduce a novel sensory manipulation that mimics the addition of unbiased noise to motion-selective regions of visual cortex. We verified the effect of this manipulation with neuronal recordings from macaque areas MT/MST. For both humans and monkeys, increasing the noise induced faster decisions and greater confidence over a range of stimuli for which accuracy was minimally impaired. The magnitude of the effects was in agreement with predictions of a bounded evidence accumulation model.
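
Here is a simplified simulation of the bounded evidence accumulation logic behind these predictions. The drift, bound, and noise values are arbitrary, and confidence is read out as a simple decreasing function of decision time, a crude stand-in for the model's full mapping from accumulator state and elapsed time to the odds of being correct.

import numpy as np

def simulate_ddm(drift, noise, bound=1.0, dt=1e-3, max_t=4.0, n_trials=2000, seed=0):
    """Bounded drift-diffusion: returns choices (1 = correct bound), RTs,
    and a confidence proxy that decreases with elapsed decision time."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_trials)
    rt = np.full(n_trials, max_t)
    choice = np.zeros(n_trials, dtype=int)
    done = np.zeros(n_trials, dtype=bool)
    for step in range(1, int(max_t / dt) + 1):
        x[~done] += drift * dt + noise * np.sqrt(dt) * rng.standard_normal((~done).sum())
        hit = (~done) & (np.abs(x) >= bound)
        rt[hit] = step * dt
        choice[hit] = (x[hit] > 0).astype(int)
        done |= hit
        if done.all():
            break
    confidence = 1.0 / (1.0 + rt)          # faster decisions -> higher confidence
    return choice, rt, confidence

for sigma in (1.0, 1.4):                    # baseline vs. added sensory noise
    ch, rt, conf = simulate_ddm(drift=1.0, noise=sigma)
    print(f"noise={sigma:.1f}: accuracy={ch.mean():.2f}, "
          f"mean RT={rt.mean():.2f} s, mean confidence={conf.mean():.2f}")

With these made-up parameters the added noise yields faster mean RTs and higher mean confidence, the qualitative pattern described in the abstract.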

A role for the superior colliculus in decision-making and confidence

Speaker: Michele Basso, University of California Los Angeles
Additional Authors: Michele Basso, University of California Los Angeles; Piercesare Grimaldi, University of California Los Angeles; Trinity Crapse, University of California Los Angeles

Evidence implicates the superior colliculus (SC) in attention and perceptual decision-making. In a simple target-selection task, we previously showed that discriminability between target and distractor neuronal activity in the SC correlated with decision accuracy, consistent with the hypothesis that SC encodes a decision variable. Here we extend these results to determine whether SC also correlates with decision criterion and confidence. Trained monkeys performed a simple perceptual decision task in two conditions designed to induce a behavioral response bias (criterion shift): (1) the probability of the two perceptual stimuli was equal, and (2) the probability of one perceptual stimulus was higher than the other. We observed consistent changes in behavioral response bias (shifts in decision criterion) that were directly correlated with SC neuronal activity. Furthermore, electrical stimulation of SC mimicked the effect of the stimulus probability manipulations, demonstrating that SC correlates with and is causally involved in setting decision criteria. To assess confidence, monkeys were offered a ‘safe bet’ option on 50% of trials in a similar task. The ‘safe bet’ always yielded a small reward, encouraging monkeys to select the ‘safe bet’ when they were less confident rather than risk no reward for a wrong decision. Both monkeys showed metacognitive sensitivity: they chose the ‘safe bet’ more on more difficult trials. Single- and multi-neuron recordings from SC revealed two distinct neuronal populations: one that discharged more robustly on more confident trials, and one that did so on less confident trials. Together these findings show how SC encodes information about decisions and decisional confidence.

Testing the Bayesian confidence hypothesis

Speaker: Wei Ji Ma, New York University
Additional Authors: Wei Ji Ma, New York University; Will Adler, New York University; Ronald van den Berg, University of Uppsala

Asking subjects to rate their confidence is one of the oldest procedures in psychophysics. Remarkably, quantitative models of confidence ratings have been scarce. What could be called the “Bayesian confidence hypothesis” states that an observer’s confidence rating distribution is completely determined by posterior probability. This hypothesis predicts specific quantitative relationships between performance and confidence. It also predicts that stimulus combinations that produce the same posterior will also produce the same confidence distribution. We tested these predictions in three contexts: a) perceptual categorization; b) visual working memory; c) the interpretation of scientific data.
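
In a minimal form, the hypothesis can be written down for a two-category Gaussian task; the sketch below (all parameters invented) computes the posterior probability of the chosen category, which under the Bayesian confidence hypothesis fully determines the confidence rating.

import numpy as np

def bayesian_confidence(x, mu=1.0, sigma_s=1.0, sigma_n=1.0):
    """Two-category Gaussian task: C = -1 draws stimuli from N(-mu, sigma_s),
    C = +1 from N(+mu, sigma_s); the observer receives x = s + noise with
    s.d. sigma_n. Confidence is the posterior probability of the chosen
    category, and of nothing else."""
    var_total = sigma_s**2 + sigma_n**2
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * mu * x / var_total))   # P(C = +1 | x)
    choice = np.where(p_plus >= 0.5, 1, -1)
    confidence = np.maximum(p_plus, 1.0 - p_plus)              # posterior of the choice
    return choice, confidence

# Stimulus/noise combinations that yield the same posterior should yield the
# same confidence distribution -- one of the testable predictions above.
x = np.array([-1.5, -0.2, 0.4, 2.0])
print(bayesian_confidence(x))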

Integration of visual confidence over time and across stimulus dimensions

Speaker: Pascal Mamassian, Ecole Normale Supérieure
Additional Authors: Pascal Mamassian, Ecole Normale Supérieure; Vincent de Gardelle, Université Paris 1; Alan Lee, Lingnan University

Visual confidence refers to our ability to estimate our own performance in a visual decision task. Several studies have highlighted the relatively high efficiency of this meta-perceptual ability, at least for simple visual discrimination tasks. Are observers equally good when visual confidence spans more than one stimulus dimension or more than a single decision? To address these issues, we used the method of confidence forced-choice judgments, where participants are prompted to choose, between two alternatives, the stimulus for which they expect their performance to be better (Barthelmé & Mamassian, 2009, PLoS CB). In one experiment, we asked observers to make confidence choice judgments between two different tasks (an orientation-discrimination task and a spatial-frequency-discrimination task). We found that participants were equally good at making these across-dimensions confidence judgments as when choices were restricted to a single dimension, suggesting that visual confidence judgments share a common currency. In another experiment, we asked observers to make confidence-choice judgments between two ensembles of 2, 4, or 8 stimuli. We found that participants were increasingly good at making ensemble confidence judgments, suggesting that visual confidence judgments can accumulate information across several trials. Overall, these results help us better understand how visual confidence is computed and used over time and across stimulus dimensions.


Cutting across the top-down-bottom-up dichotomy in attentional capture research

Time/Room: Friday, May 19, 2017, 5:00 – 7:00 pm, Talk Room 1
Organizer(s): J. Eric T. Taylor, Brain and Mind Institute at Western University
Presenters: Nicholas Gaspelin, Matthew Hilchey, Dominique Lamy, Stefanie Becker, Andrew B. Leber


Research on attentional selection describes the various factors that determine what information is ignored and what information is processed. These factors are commonly described as either bottom-up or top-down, indicating whether stimulus properties or an observer’s goals determine the outcome of selection. Research on selection typically adheres strongly to one of these two perspectives; the field is divided. The aim of this symposium is to generate discussion and highlight new developments in the study of attentional selection that do not conform to the bifurcated approach that has characterized the field for some time (or trifurcated, with respect to recent models emphasizing the role of selection history). The research presented in this symposium does not presuppose that selection can be easily or meaningfully dichotomized. As such, the theme of the symposium is cutting across the top-down-bottom-up dichotomy in attentional selection research. To achieve this, presenters in this session either share data that cannot be easily explained within the top-down or bottom-up framework, or they propose alternative models of existing descriptions of sources of attentional control. Theoretically, the symposium will begin with presentations that attempt to resolve the dichotomy with a new role for suppression (Gaspelin & Luck) or further muddle the dichotomy with typically bottom-up patterns of behaviour in response to intransient stimuli (Hilchey, Taylor, & Pratt). The discussion then turns to demonstrations that the bottom-up, top-down, and selection history sources of control variously operate on different perceptual and attentional processes (Lamy & Zivony; Becker & Martin), complicating our categorization of sources of control. Finally, the session will conclude with an argument for more thorough descriptions of sources of control (Leber & Irons). In summary, these researchers will present cutting-edge developments using converging methodologies (chronometry, EEG, and eye-tracking measures) that further our understanding of attentional selection and advance attentional capture research beyond its current dichotomy. Given the heated history of this debate and the importance of the theoretical question, we expect that this symposium should be of interest to a wide audience of researchers at VSS, especially those interested in visual attention and cognitive control.

Mechanisms Underlying Suppression of Attentional Capture by Salient Stimuli

Speaker: Nicholas Gaspelin, Center for Mind and Brain at the University of California, Davis
Additional Authors: Nicholas Gaspelin, Center for Mind and Brain at the University of California, Davis; Carly J. Leonard, Center for Mind and Brain at the University of California, Davis; Steven J. Luck, Center for Mind and Brain at the University of California, Davis

Researchers have long debated the nature of cognitive control in vision, with the field being dominated by two theoretical camps. Stimulus-driven theories claim that visual attention is automatically captured by salient stimuli, whereas goal-driven theories argue that capture depends critically on the goals of a viewer. To resolve this debate, we have previously provided key evidence for a new hybrid model called the signal suppression hypothesis. According to this account, all salient stimuli generate an active salience signal which automatically attempts to guide visual attention. However, this signal can be actively suppressed. In the current talk, we review the converging evidence for this active suppression of salient items, using behavioral, eye-tracking and electrophysiological methods. We will also discuss the cognitive mechanisms underlying suppression effects and directions for future research.

Beyond the new-event paradigm in visual attention research: Can completely static stimuli capture attention?

Speaker: Matthew Hilchey, University of Toronto
Additional Authors: Matthew D. Hilchey, University of Toronto, J. Eric T. Taylor, Brain and Mind Institute at Western University; Jay Pratt, University of Toronto

The last several decades of attention research have focused almost exclusively on paradigms that introduce new perceptual objects or salient sensory changes to the visual environment in order to determine how attention is captured to those locations. There are a handful of exceptions, and in the spirit of those studies, we asked whether or not a completely unchanging stimulus can attract attention, using variations of classic additional singleton and cueing paradigms. In the additional singleton tasks, we presented a preview array of six uniform circles. After a short delay, one circle changed in form and luminance – the target location – and all but one location changed luminance, leaving the sixth location physically unchanged. The results indicated that attention was attracted toward the vicinity of the only unchanging stimulus, regardless of whether all circles around it increased or decreased luminance. In the cueing tasks, cueing was achieved by changing the luminance of 5 circles in the object preview array either 150 or 1000 ms before the onset of a target. Under certain conditions, we observed canonical patterns of facilitation and inhibition emerging from the location containing the physically unchanging cue stimulus. Taken together, the findings suggest that a completely unchanging stimulus, which bears no obvious resemblance to the target, can attract attention in certain situations.

Stimulus salience, current goals and selection history do not affect the same perceptual processes

Speaker: Dominique Lamy, Tel Aviv University
Additional Authors: Dominique Lamy, Tel Aviv University; Alon Zivony, Tel Aviv University

When exposed to a visual scene, our perceptual system performs several successive processes. During the preattentive stage, the attentional priority accruing to each location is computed. Then, attention is shifted towards the highest-priority location. Finally, the visual properties at that location are processed. Although most attention models posit that stimulus-driven and goal-directed processes combine to determine attentional priority, demonstrations of purely stimulus-driven capture are surprisingly rare. In addition, the consequences of stimulus-driven and goal-directed capture on perceptual processing have not been fully described. Specifically, whether attention can be disengaged from a distractor before its properties have been processed is unclear. Finally, the strict dichotomy between bottom-up and top-down attentional control has been challenged based on the claim that selection history also biases attentional weights on the priority map. Our objective was to clarify what perceptual processes stimulus salience, current goals and selection history affect. We used a feature-search spatial-cueing paradigm. We showed that (a) unlike stimulus salience and current goals, selection history does not modulate attentional priority, but only perceptual processes following attentional selection; (b) a salient distractor not matching search goals may capture attention but attention can be disengaged from this distractor’s location before its properties are fully processed; and (c) attentional capture by a distractor sharing the target feature entails that this distractor’s properties are mandatorily processed.

Which features guide visual attention, and how do they do it?

Speaker: Stefanie Becker, The University of Queensland
Additional Authors: Stefanie Becker, The University of Queensland; Aimee Martin, The University of Queensland

Previous studies purport to show that salient irrelevant items can attract attention involuntarily, against the intentions and goals of an observer. However, corresponding evidence originates predominantly from RT and eye movement studies, whereas EEG studies largely failed to support saliency capture. In the present study, we examined effects of salient colour distractors on search for a known colour target when the distractor was similar vs. dissimilar to the target. We used both eye tracking and EEG (in separate experiments), and also investigated participants’ awareness of the features of irrelevant distractors. The results showed that capture by irrelevant distractors was strongly top-down modulated, with target-similar distractors attracting attention much more strongly, and being remembered better, than salient distractors. Awareness of the distractor correlated more strongly with initial capture than with attentional dwelling on the distractor after it was selected. The salient distractor enjoyed no noticeable advantage over non-salient control distractors with regard to implicit measures, but was overall reported with higher accuracy than non-salient distractors. This raises the interesting possibility that salient items may primarily boost visual processes directly, by requiring less attention for accurate perception, not by summoning spatial attention.

Toward a profile of goal-directed attentional control

Speaker: Andrew B. Leber, The Ohio State University
Additional Authors: Andrew B. Leber, The Ohio State University; Jessica L. Irons, The Ohio State University

Recent criticism of the classic bottom-up/top-down dichotomy of attention has deservedly focused on the existence of experience-driven factors outside this dichotomy. However, as researchers seek a better framework characterizing all control sources, a thorough re-evaluation of the top-down, or goal-directed, component is imperative. Studies of this component have richly documented the ways in which goals strategically modulate attentional control, but surprisingly little is known about how individuals arrive at their chosen strategies. Consider that manipulating goal-directed control commonly relies on experimenter instruction, which lacks ecological validity and may not always be complied with. To better characterize the factors governing goal-directed control, we recently created the adaptive choice visual search paradigm. Here, observers can freely choose between two targets on each trial, while we cyclically vary the relative efficacy of searching for each target. That is, on some trials it is faster to search for a red target than a blue target, while on other trials the opposite is true. Results using this paradigm have shown that choice behavior is far from optimal, and appears largely determined by competing drives to maximize performance and minimize effort. Further, individual differences in performance are stable across sessions while also being malleable to experimental manipulations emphasizing one competing drive (e.g., reward, which motivates individuals to maximize performance). This research represents an initial step toward characterizing an individual profile of goal-directed control that extends beyond the classic understanding of “top-down” attention and promises to contribute to a more accurate framework of attentional control.


A scene is more than the sum of its objects: The mechanisms of object-object and object-scene integration

Time/Room: Friday, May 19, 2017, 12:00 – 2:00 pm, Talk Room 1
Organizer(s): Liad Mudrik, Tel Aviv University and Melissa Võ, Goethe University Frankfurt
Presenters: Michelle Greene, Monica S. Castelhano, Melissa L.H. Võ, Nurit Gronau, Liad Mudrik


Symposium Description

In the lab, vision researchers typically try to create “clean”, controlled environments and stimuli in order to tease apart the different processes that are involved in seeing. Yet in real life, visual comprehension is never a sterile process: objects appear with other objects in cluttered, rich scenes, which have certain spatial and semantic properties. In recent years, more and more studies are focusing on object-object and object-scene relations as possible guiding principles of vision. The proposed symposium aims to present current findings in this continuously developing field, while specifically focusing on two key questions that have attracted substantial scientific interest in recent years: how do scene-object and object-object relations influence object processing, and what are the necessary conditions for deciphering these relations? Greene, Castelhano and Võ will each tackle the first question in different ways, using information theoretic measures, visual search findings, eye movement, and EEG measures. The second question will be discussed with respect to attention and consciousness: Võ’s findings suggest automatic processing of object-scene relations, but do not rule out the need for attention. This view is corroborated and further stressed by Gronau’s results. With respect to consciousness, Mudrik, however, will present behavioral and neural data suggesting that consciousness may not be an immediate condition for relations processing, but rather serve as a necessary enabling factor. Taken together, these talks should lay the ground for an integrative discussion of both complementary and conflicting findings. Whether these are based on different theoretical assumptions, methodologies or experimental approaches, the core of the symposium will speak to how to best tackle the investigation of the complexity of real-world scene perception.

Presentations

Measuring the Efficiency of Contextual Knowledge

Speaker: Michelle Greene, Stanford University

The last few years have brought us both large-scale image databases and the ability to crowd-source human data collection, allowing us to measure contextual statistics in real world scenes (Greene, 2013). How much contextual information is there, and how efficiently do people use it? We created a visual analog to a guessing game suggested by Claude Shannon (1951) to measure the information scenes and objects share. In our game, 555 participants on Amazon’s Mechanical Turk (AMT) viewed scenes in which a single object was covered by an opaque bounding box. Participants were instructed to guess about the identity of the hidden object until correct. Participants were paid per trial, and each trial terminated upon correctly guessing the object, so participants were incentivized to guess as efficiently as possible. Using information theoretic measures, we found that scene context can be encoded with less than 2 bits per object, a level of redundancy that is even greater than that of English text. To assess the information from scene category, we ran a second experiment in which the image was replaced by the scene category name. Participants still outperformed the entropy of the database, suggesting that the majority of contextual knowledge is carried by the category schema. Taken together, these results suggest that not only is there a great deal of information about objects coming from scene categories, but that this information is efficiently encoded by the human mind.
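
The entropy bookkeeping behind a claim like "less than 2 bits per object" can be illustrated with a toy co-occurrence table. The sketch below is not the guessing-game estimator used in the study, just a direct conditional-entropy calculation on invented counts.

import numpy as np

# Hypothetical object-by-scene co-occurrence counts
#                 kitchen  office  bathroom
counts = np.array([[120,      2,       1],    # toaster
                   [  3,     90,       2],    # stapler
                   [  1,      4,      80],    # toothbrush
                   [ 40,     35,      30]])   # chair

joint = counts / counts.sum()                 # P(object, scene)
p_scene = joint.sum(axis=0)                   # P(scene)
p_obj_given_scene = joint / p_scene           # P(object | scene), column-wise

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

h_object = entropy(joint.sum(axis=1))         # H(object), ignoring context
h_object_given_scene = sum(p_scene[j] * entropy(p_obj_given_scene[:, j])
                           for j in range(len(p_scene)))
print(f"H(object) = {h_object:.2f} bits; "
      f"H(object | scene) = {h_object_given_scene:.2f} bits per object")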

Where in the world?: Explaining Scene Context Effects during Visual Search through Object-Scene Spatial Associations

Speaker: Monica S. Castelhano, Queen’s University

The spatial relationship between objects and scenes and its effects on visual search performance have been well established. Here, we examine how object-scene spatial associations support scene context effects on eye movement guidance and search efficiency. We reframed two classic visual search paradigms (set size and sudden onset) according to the spatial association between the target object and scene. Using the recently proposed Surface Guidance Framework, we operationalize target-relevant and target-irrelevant regions. Scenes are divided into three regions (upper, mid, lower) that correspond with possible relevant surfaces (wall, countertop, floor). Target-relevant regions are defined according to the surface on which the target is likely to appear (e.g., painting, toaster, rug). In the first experiment, we explored how spatial associations affect search by manipulating set size in either target-relevant or target-irrelevant regions. We found that only set size increases in target-relevant regions adversely affected search performance. In the second experiment, we manipulated whether a suddenly-onsetting distractor object appeared in a target-relevant or target-irrelevant region. We found that fixations to the distractor were significantly more likely and search performance was negatively affected in the target-relevant condition. The Surface Guidance Framework allows for further exploration of how object-scene spatial associations can be used to quickly narrow processing to specific areas of the scene and largely ignore information in other areas. Viewing effects of scene context through the lens of target-relevancy allows us to develop new understanding of how the spatial associations between objects and scenes can affect performance.

What drives semantic processing of objects in scenes?

Speaker: Melissa L.H. Võ, Goethe University Frankfurt

Objects hardly ever appear in isolation, but are usually embedded in a larger scene context. This context, determined e.g. by the co-occurrence of other objects or by the semantics of the scene as a whole, has a large impact on the processing of each and every object. Here I will present a series of eye tracking and EEG studies from our lab that 1) make use of the known time-course and neuronal signature of scene semantic processing to test whether seemingly meaningless textures of scenes are sufficient to modulate semantic object processing, and 2) raise the question of its automaticity. For instance, we have previously shown that semantically inconsistent objects trigger an N400 ERP response similar to the one known from language processing. Moreover, an additional but earlier N300 response signals perceptual processing difficulties that are in line with classic findings of impeded object identification from the 1980s. We have since used this neuronal signature to investigate scene context effects on object processing and recently found that a scene’s mere summary statistics, visualized as seemingly meaningless textures, elicit a very similar N400 response. Further, we have shown that observers looking for target letters superimposed on scenes fixated task-irrelevant semantically inconsistent objects embedded in the scenes to a greater degree and without explicit memory for these objects. Manipulating the number of superimposed letters reduced this effect, but not entirely. As part of this symposium, we will discuss the implications of these findings for the question as to whether object-scene integration requires attention.
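
For concreteness, here is a minimal sketch of the N400-style difference-wave measure referred to above, run on simulated single-channel epochs. The sampling rate, analysis window, and effect size are assumptions for illustration, not the lab's actual pipeline.

import numpy as np

rng = np.random.default_rng(3)
sfreq = 250                                    # Hz; epochs from -0.2 to 0.8 s
times = np.arange(-0.2, 0.8, 1 / sfreq)
n_trials = 60

def fake_epochs(n400_amp):
    """Single-channel epochs with a negative deflection around 400 ms."""
    component = n400_amp * np.exp(-((times - 0.4) ** 2) / (2 * 0.05 ** 2))
    return component + rng.normal(0, 2.0, size=(n_trials, times.size))

consistent = fake_epochs(n400_amp=-1.0)
inconsistent = fake_epochs(n400_amp=-4.0)      # larger N400 for semantic violations

# Difference wave and mean amplitude in a typical N400 window (300-500 ms)
diff_wave = inconsistent.mean(axis=0) - consistent.mean(axis=0)
window = (times >= 0.3) & (times <= 0.5)
print(f"mean N400-window difference: {diff_wave[window].mean():.2f} microvolts")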

Vision at a glance: the necessity of attention to contextual integration processes

Speaker: Nurit Gronau, The Open University of Israel

Objects that are conceptually consistent with their environment are typically grasped more rapidly and efficiently than objects that are inconsistent with it. The extent to which such contextual integration processes depend on visual attention, however, is largely disputed. The present research examined the necessity of visual attention to object-object and object-scene contextual integration processes during a brief visual glimpse. Participants performed an object classification task on associated object pairs that were either positioned in expected relative locations (e.g., a desk-lamp on a desk) or in unexpected, contextually inconsistent relative locations (e.g., a desk-lamp under a desk). When both stimuli were relevant to task requirements, latencies to spatially consistent object pairs were significantly shorter than to spatially inconsistent pairs. These contextual effects disappeared, however, when spatial attention was drawn to one of the two object stimuli while its counterpart object was positioned outside the focus of attention and was irrelevant to task-demands. Subsequent research examined object-object and object-scene associations which are based on categorical relations, rather than on specific spatial and functional relations. Here too, processing of the semantic/categorical relations necessitated allocation of spatial attention, unless an unattended object was explicitly defined as a to-be-detected target. Collectively, our research suggests that associative and integrative contextual processes underlying scene understanding rely on the availability of spatial attentional resources. However, stimuli which comply with task-requirements (e.g., a cat/dog in an animal, but not in a vehicle detection task) may benefit from efficient processing even when appearing outside the main focus of visual attention.

Object-object and object-scene integration: the role of conscious processing

Speaker: Liad Mudrik, Tel Aviv University

On a typical day, we perform numerous integration processes; we repeatedly integrate objects with the scenes in which they appear, and decipher the relations between objects, resting both on their tendency to co-occur and on their semantic associations. Such integration seems effortless, almost automatic, yet computationally speaking it is highly complicated and challenging. This apparent contradiction evokes the question of consciousness’ role in the process: is it automatic enough to obviate the need for conscious processing, or does its complexity necessitate the involvement of conscious experience? In this talk, I will present EEG, fMRI and behavioral experiments that tap into consciousness’ role in processing object-scene integration and object-object integration. The former revisits subjects’ ability to integrate the relations (congruency/incongruency) between an object and the scene in which it appears. The latter examines the processing of the relations between two objects, in an attempt to differentiate between associative relations (i.e., relations that rest on repeated co-occurrences of the two objects) vs. abstract ones (i.e., relations that are more conceptual, between two objects that do not tend to co-appear but are nevertheless related). I will claim that in both types of integration, consciousness may function as an enabling factor rather than an immediate necessary condition.


2017 Symposia

S1 – A scene is more than the sum of its objects: The mechanisms of object-object and object-scene integration

Organizer(s): Liad Mudrik, Tel Aviv University and Melissa Võ, Goethe University Frankfurt
Time/Room: Friday, May 19, 2017, 12:00 – 2:00 pm, Talk Room 1

Our visual world is much more complex than most laboratory experiments make us believe. Nevertheless, this complexity turns out not to be a drawback, but actually a feature, because complex real-world scenes have defined spatial and semantic properties which allow us to efficiently perceive and interact with our environment. In this symposium we will present recent advances in assessing how scene-object and object-object relations influence processing, while discussing the necessary conditions for deciphering such relations. By considering the complexity of real-world scenes as information that can be exploited, we can develop new approaches for examining real-world scene perception.

S2 – The Brain Correlates of Perception and Action: from Neural Activity to Behavior

Organizer(s): Simona Monaco, Center for Mind/Brain Sciences, University of Trento & Annalisa Bosco, Dept of Pharmacy and Biotech, University of Bologna
Time/Room: Friday, May 19, 2017, 12:00 – 2:00 pm, Pavilion

This symposium offers a comprehensive view of the cortical and subcortical structures involved in perceptual-motor integration for eye and hand movements in contexts that resemble real life situations. By gathering scientists from neurophysiology to neuroimaging and psychophysics we provide an understanding of how vision is used to guide action from the neuronal level to behavior. This knowledge pushes our understanding of visually-guided motor control outside the constraints of the laboratory and into contexts that we daily encounter in the real world.

S3 – How can you be so sure? Behavioral, computational, and neuroscientific perspectives on metacognition in perceptual decision-making

Organizer(s): Megan Peters, University of California Los Angeles
Time/Room: Friday, May 19, 2017, 2:30 – 4:30 pm, Talk Room 1

Evaluating our certainty in a memory, thought, or perception seems as easy as answering the question, “Are you sure?” But how our brains make these determinations remains unknown. Specifically, does the brain use the same information to answer the questions, “What do you see?” and, “Are you sure?” What brain areas are responsible for doing these calculations, and what rules are used in the process? Why are we sometimes bad at judging the quality of our memories, thoughts, or perceptions? These are the questions we will try to answer in this symposium.

S4 – The Role of Ensemble Statistics in the Visual Periphery

Organizer(s): Brian Odegaard, University of California-Los Angeles
Time/Room: Friday, May 19, 2017, 2:30 – 4:30 pm, Pavilion

The past decades have seen the growth of a tremendous amount of research into the human visual system’s capacity to encode “summary statistics” of items in the world. One recent proposal in the literature has focused on the promise of ensemble statistics to provide an explanatory account of subjective experience in the visual periphery (Cohen, Dennett, & Kanwisher, Trends in Cognitive Sciences, 2016). This symposium will address how ensemble statistics are encoded outside the fovea, and to what extent this capacity explains our experience of the majority of our visual field. More…

S5 – Cutting across the top-down-bottom-up dichotomy in attentional capture research

Organizer(s): J. Eric T. Taylor, Brain and Mind Institute at Western University
Time/Room: Friday, May 19, 2017, 5:00 – 7:00 pm, Talk Room 1

Research on attentional selection describes the various factors that determine what information is ignored and what information is processed. Broadly speaking, researchers have adopted two explanations for how this occurs, which emphasize either automatic or controlled processing, often presenting evidence that is mutually contradictory. This symposium presents new evidence from five speakers that address this controversy from non-dichotomous perspectives. More…

S6 – Virtual Reality and Vision Science

Organizer(s): Bas Rokers, University of Wisconsin – Madison & Karen B. Schloss, University of Wisconsin – Madison
Time/Room: Friday, May 19, 2017, 5:00 – 7:00 pm, Pavilion

Virtual and augmented reality (VR/AR) research can answer scientific questions that were previously difficult or impossible to address. VR/AR may also provide novel methods to assist those with visual deficits and treat visual disorders. After a brief introduction by the organizers (Bas Rokers & Karen Schloss), 5 speakers representing both academia and industry will each give a 20-minute talk, providing an overview of existing research and identifying promising new directions. The session will close with a 15-minute panel to deepen the dialog between industry and vision science. Topics include sensory integration, perception in naturalistic environments, and mixed reality. Symposium attendees may learn how to incorporate AR/VR into their research, identify current issues of interest to both academia and industry, and consider avenues of inquiry that may open with upcoming technological advances. More…

What do deep neural networks tell us about biological vision?

Time/Room: Friday, May 13, 2016, 2:30 – 4:30 pm, Talk Room 1-2
Organizer(s): Radoslaw Martin Cichy; Department of Psychology and Education, Free University Berlin, Berlin, Germany
Presenters: Kendrick Kay, Seyed-Mahdi Khaligh-Razavi, Daniel Yamins, Radoslaw Martin Cichy, Tomoyasu Horikawa, Kandan Ramakrishnan

< Back to 2016 Symposia

Symposium Description

Visual cognition in humans is mediated by complex, hierarchical, multi-stage processing of visual information, propagated rapidly as neural activity in a distributed network of cortical regions. Understanding visual cognition in cortex thus requires a predictive and quantitative model that captures the complexity of the underlying spatio-temporal dynamics and explains human behavior. Very recently, brain-inspired deep neural networks (DNNs) have taken center stage as an artificial computational model for understanding human visual cognition. A major reason for their emerging dominance is that DNNs reach near human-level performance on tasks such as object recognition (Russakovsky et al., 2014). While DNNs were initially developed by computer scientists to solve engineering problems, research comparing visual representations in DNNs and primate brains has found a striking correspondence, creating excitement in vision research (Kriegeskorte, 2015, Annual Review of Vision Science; Bruno Olshausen, VSS 2014 Keynote; Jones, 2014, Nature). The aim of this symposium is three-fold: One aim is to describe cutting-edge research efforts that use DNNs to understand human visual cognition. A second aim is to establish which results reproduce across studies and thus create common ground for further research. A third aim is to provide a venue for critical discussion of the theoretical implications of the results. To introduce and frame the debate for a wide audience, Kendrick Kay will begin with a thorough introduction to the DNN approach and formulate questions and challenges to which the individual speakers will respond in their talks. The individual talks will report on recent DNN-related biological vision research and will cover a wide range of results: brain data recorded in different species (human, monkey), with different techniques (electrophysiology, fMRI, M/EEG), for static as well as movie stimuli, using a wide range of analysis techniques (decoding and encoding models, representational similarity analysis). Major questions addressed will be: What do DNNs tell us about visual processing in the brain? What is the theoretical impact of finding a correspondence between DNNs and representations in human brains? Do these insights extend to visual cognition such as imagery? What analysis techniques and methods are available to relate DNNs to human brain function? What novel insights can be gained from comparing DNNs to human brains? What effects reproduce across studies? A final 20-minute open discussion between speakers and the audience will close the symposium, encouraging discussion of what aims the DNN approach has already reached, where it fails, what future challenges lie ahead, and how to tackle them. As DNNs address visual processing from low- to mid- to high-level vision, we believe this symposium will be of interest to a broad audience, including students, postdocs and faculty. This symposium is a grass-roots, first-author-based effort, bringing together junior researchers from around the world (US, Germany, Netherlands, and Japan).

Presentations

What are deep neural networks and what are they good for?

Speaker: Kendrick Kay; Center for Magnetic Resonance Research, University of Minnesota, Twin Cities

In this talk, I will provide a brief introduction to deep neural networks (DNNs) and discuss their usefulness with respect to modeling and understanding visual processing in the brain. To assess the potential benefits of DNN models, it is important to step back and consider generally the purpose of computational modeling and how computational models and experimental data should be integrated. Is the only goal to match experimental data? Or should we derive understanding from computational models? What kinds of information can be derived from a computational model that cannot be derived through simpler analyses? Given that DNN models can be quite complex, it is also important to consider how to interpret these models. Is it possible to identify the key feature of a DNN model that is responsible for a specific experimental effect? Is it useful to perform ‘in silico’ experiments with a DNN model? Should we strive to perform meta-modeling, that is, developing a (simple) model of a (complex DNN) model in order to help understand the latter? I will discuss these and related issues in the context of DNN models and compare DNN modeling to an alternative modeling approach that I have pursued in past research.

Mixing deep neural network features to explain brain representations

Speaker: Seyed-Mahdi Khaligh-Razavi; CSAIL, MIT, MA, USA
Authors: Linda Henriksson, Department of Neuroscience and Biomedical Engineering, Aalto University, Aalto, Finland; Kendrick Kay, Center for Magnetic Resonance Research, University of Minnesota, Twin Cities; Nikolaus Kriegeskorte, MRC-CBU, University of Cambridge, UK

Higher visual areas present a difficult explanatory challenge and can be better studied by considering the transformation of representations across the stages of the visual hierarchy from lower- to higher-level visual areas. We investigated the progress of visual information through the hierarchy of visual cortex by comparing the representational geometry of several brain regions with a wide range of object-vision models, ranging from unsupervised to supervised, and from shallow to deep models. The shallow unsupervised models tended to have higher correlations with early visual areas; and the deep supervised models were more correlated with higher visual areas. We also presented a new framework for assessing the pattern-similarity of models with brain areas, mixed representational similarity analysis (RSA), which bridges the gap between RSA and voxel-receptive-field modelling, both of which have been used separately but not in combination in previous studies (Kriegeskorte et al., 2008a; Nili et al., 2014; Khaligh-Razavi and Kriegeskorte, 2014; Kay et al., 2008, 2013). Using mixed RSA, we evaluated the performance of many models and several brain areas. We show that higher visual representations (i.e. lateral occipital region, inferior temporal cortex) were best explained by the higher layers of a deep convolutional network after appropriate mixing and weighting of its feature set. This shows that deep neural network features form the essential basis for explaining the representational geometry of higher visual areas.
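
To make the logic of this comparison concrete, the sketch below illustrates plain RSA and a simplified, reweighted variant in the spirit of mixed RSA, using synthetic data. The array shapes, the ordinary least-squares weighting step, and the variable names are illustrative assumptions, not the authors' published pipeline, which fits and evaluates feature weights with careful cross-validation.

```python
# Minimal sketch: plain RSA vs. a reweighted-feature ("mixed"-style) RSA.
# Synthetic data throughout; names and shapes are illustrative assumptions.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli, n_features, n_voxels = 100, 50, 80

model_features = rng.standard_normal((n_stimuli, n_features))   # e.g. one DNN layer
brain_patterns = rng.standard_normal((n_stimuli, n_voxels))     # e.g. fMRI patterns in a ROI

def rdm(patterns):
    """Representational dissimilarity matrix, condensed form (1 - Pearson r between rows)."""
    return pdist(patterns, metric='correlation')

# Plain RSA: rank-correlate the model RDM with the brain RDM.
plain_rsa, _ = spearmanr(rdm(model_features), rdm(brain_patterns))

# Simplified reweighting step: fit the model features to the voxel responses
# (ordinary least squares here), then compute the RDM of the fitted predictions.
# In practice the weights must be estimated on held-out stimuli; fitting and
# evaluating on the same data, as in this toy example, inflates the correlation.
weights, *_ = np.linalg.lstsq(model_features, brain_patterns, rcond=None)
predicted = model_features @ weights
mixed_rsa, _ = spearmanr(rdm(predicted), rdm(brain_patterns))

print(f"plain RSA: {plain_rsa:.3f}, weighted-feature RSA: {mixed_rsa:.3f}")
```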

Using DNNs To Compare Visual and Auditory Cortex

Speaker: Daniel Yamins; Department of Brain and Cognitive Sciences, MIT, MA, USA
Authors: Alex Kell, Department of Brain and Cognitive Sciences, MIT, MA, USA

A slew of recent studies have shown how deep neural networks (DNNs) optimized for visual tasks make effective models of neural response patterns in the ventral visual stream. Analogous results have also been discovered in auditory cortex, where optimizing DNNs for speech-recognition tasks has produced quantitatively accurate models of neural response patterns there. The existence of computational models within the same architectural class for two apparently very different sensory representations raises several intriguing questions: (1) to what extent do visual models predict auditory response patterns, and to what extent do auditory models predict visual response patterns? (2) In what ways are the visual and auditory models similar, and in what ways do they diverge? (3) What do the answers to these questions tell us about the relationships between the natural statistics of the two sensory modalities and the underlying generative processes behind them? I’ll describe several quantitative and qualitative modeling results, involving electrophysiology data from macaques and fMRI data from humans, that shed some initial light on these questions.

Deep Neural Networks explain spatio-temporal dynamics of visual scene and object processing

Speaker: Radoslaw Martin Cichy; Department of Psychology and Education, Free University Berlin, Berlin, Germany
Authors: Aditya Khosla, CSAIL, MIT, MA, USA; Dimitrios Pantazis, McGovern Institute of Brain and Cognitive Sciences, MIT, MA, USA; Antonio Torralba, CSAIL, MIT, MA, USA; Aude Oliva, CSAIL, MIT, MA, USA

Understanding visual cognition means knowing what is happening in the brain, where, and when, as we see. To address these questions in a common framework we combined deep neural networks (DNNs) with fMRI and MEG using representational similarity analysis. We will present results from two studies. The first study investigated the spatio-temporal neural dynamics during visual object recognition. Combining DNNs with fMRI, we showed that DNNs predicted a spatial hierarchy of visual representations in both the ventral and the dorsal visual stream. Combining DNNs with MEG, we showed that DNNs predicted the temporal hierarchy with which visual representations emerged. This indicates that 1) DNNs predict the hierarchy of visual brain dynamics in space and time, and 2) they provide novel evidence for object representations in parietal cortex. The second study investigated how abstract visual properties, such as scene size, emerge in the human brain over time. First, we identified an electrophysiological marker of scene size processing using MEG. Then, to explain how scene size representations might emerge in the brain, we trained a DNN on scene categorization. Representations of scene size emerged naturally in the DNN without it ever being trained to do so, and the DNN accounted for scene size representations in the human brain. This indicates 1) that DNNs are a promising model for the emergence of abstract visual property representations in the human brain, and 2) suggests that the architecture of human visual cortex reflects the constraints imposed by visual tasks.
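
As an illustration of how DNNs and MEG can be combined through representational similarity analysis, the sketch below correlates a DNN-layer RDM with the MEG RDM at each time point, yielding a time course of model-brain correspondence. The data, shapes, and names are synthetic placeholders, not the study's actual stimuli or recordings.

```python
# Minimal sketch of time-resolved RSA with MEG-like data (all synthetic).
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_stimuli, n_sensors, n_times = 30, 64, 120

# RDM of a hypothetical DNN layer (512 units per stimulus).
layer_rdm = pdist(rng.standard_normal((n_stimuli, 512)), metric='correlation')
# MEG-like data: stimulus x sensor x time.
meg = rng.standard_normal((n_stimuli, n_sensors, n_times))

# Correlate the layer RDM with the MEG RDM at each time point.
similarity = np.empty(n_times)
for t in range(n_times):
    meg_rdm = pdist(meg[:, :, t], metric='correlation')
    similarity[t], _ = spearmanr(layer_rdm, meg_rdm)

peak = int(np.argmax(similarity))
print(f"peak model-MEG correspondence at time index {peak}")
```

Repeating this for each layer of the network would yield the kind of layer-by-time correspondence profile from which a temporal hierarchy can be read off.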

Generic decoding of seen and imagined objects using features of deep neural networks

Speaker: Tomoyasu Horikawa; Computational Neuroscience Laboratories, ATR, Kyoto, Japan
Authors: Yukiyasu Kamitani; Graduate School of Informatics, Kyoto University, Kyoto, Japan

Object recognition is a key function in both human and machine vision. Recent studies suggest that a deep neural network (DNN) can be a good proxy for the hierarchically structured feed-forward visual system underlying object recognition. While brain decoding has enabled the prediction of mental contents represented in the brain, such predictions have been limited to the examples used to train the decoder. Here, we present a decoding approach for arbitrary objects seen or imagined by subjects, employing DNNs and a large image database. We assume that an object category is represented by a set of features rendered invariant through hierarchical processing, and show that visual features can be predicted from fMRI patterns and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Furthermore, visual feature vectors predicted by stimulus-trained decoders can be used to identify seen and imagined objects (extending beyond decoder training) from a set of computed features for numerous objects. Successful object identification for imagery-induced brain activity suggests that feature-level representations elicited in visual perception may also be used for top-down visual imagery. Our results demonstrate a tight link between the cortical hierarchy and the levels of DNNs, and its utility for brain-based information retrieval. Because our approach enabled us to predict arbitrary object categories seen or imagined by subjects without pre-specifying target categories, we may be able to apply our method to decode the contents of dreaming. These results contribute to a better understanding of the neural representations of the hierarchical visual system during perception and mental imagery.
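
The following sketch conveys the general idea of feature decoding and identification: a linear decoder predicts a DNN-like feature vector from fMRI patterns, and the prediction is matched against category-average feature vectors by correlation. The ridge-style decoder, the synthetic data, and all variable names are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch: decode feature vectors from fMRI-like patterns, then identify
# the category by correlating with category-average features (synthetic data).
import numpy as np

rng = np.random.default_rng(2)
n_train, n_voxels, n_feat, n_categories = 200, 300, 100, 50

category_feats = rng.standard_normal((n_categories, n_feat))   # category-average features
train_labels = rng.integers(0, n_categories, n_train)
true_feats = category_feats[train_labels]

# Simulated encoding: brain patterns are a noisy linear mixture of the features.
W_vox = rng.standard_normal((n_feat, n_voxels)) * 0.1
fmri_train = true_feats @ W_vox + 0.5 * rng.standard_normal((n_train, n_voxels))

# Ridge-style decoder (closed form): voxels -> features.
lam = 10.0
A = fmri_train.T @ fmri_train + lam * np.eye(n_voxels)
decoder = np.linalg.solve(A, fmri_train.T @ true_feats)          # shape (voxels, features)

# Simulate a test trial from a category and identify it among all candidates,
# including categories that never appeared in the training set.
test_cat = 7
fmri_test = category_feats[test_cat] @ W_vox + 0.5 * rng.standard_normal(n_voxels)
pred = fmri_test @ decoder

corrs = [np.corrcoef(pred, c)[0, 1] for c in category_feats]
print("identified category:", int(np.argmax(corrs)), "(true:", test_cat, ")")
```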

Mapping human visual representations by deep neural networks

Speaker: Kandan Ramakrishnan; Intelligent Sensory Information Systems, UvA, Netherlands
Authors: H. Steven Scholte, Department of Psychology, Brain and Cognition, UvA, Netherlands; Arnold Smeulders, Intelligent Sensory Information Systems, UvA, Netherlands; Sennay Ghebreab, Intelligent Sensory Information Systems, UvA, Netherlands

A number of recent studies have shown that deep neural networks (DNNs) map onto the human visual hierarchy. However, based on a large number of subjects and accounting for the correlations between DNN layers, we show that there is no one-to-one mapping of DNN layers to the human visual system. This suggests that the depth of a DNN, which is also critical to its impressive performance in object recognition, has to be investigated for its role in explaining brain responses. On the basis of EEG data collected for a large set of natural images, we analyzed different DNN architectures – a 7-layer, a 16-layer and a 22-layer network – using a Weibull distribution to summarize the representations at each layer. We find that the DNN architectures reveal the temporal dynamics of object recognition, with early layers driving responses earlier in time and higher layers driving responses later in time. Surprisingly, layers from the different architectures explain brain responses to a similar degree. However, by combining the representations of the DNN layers we explain more brain activity in higher brain areas. This suggests that the higher areas in the brain are composed of multiple non-linearities that are not captured by individual DNN layers. Overall, while DNNs form a highly promising model of the human visual hierarchy, the representations in the human brain go beyond a simple one-to-one mapping of DNN layers to visual areas.
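
A minimal sketch of the layer-combination analysis, under simplifying assumptions: one summary value per image per layer stands in for the Weibull-based layer representations, and cross-validated linear regression compares how well individual layers versus their combination predict a (synthetic) brain response.

```python
# Minimal sketch: compare individual DNN-layer summaries vs. their combination
# as predictors of a brain response, via cross-validated regression (synthetic data).
import numpy as np

rng = np.random.default_rng(3)
n_images, n_layers = 300, 3

# One summary value per image per layer (stand-in for per-layer statistics).
layer_summaries = rng.standard_normal((n_images, n_layers))
# A brain response depending nonlinearly on a mix of layers, plus noise.
brain = np.tanh(layer_summaries @ np.array([0.6, 0.3, 0.8])) + 0.3 * rng.standard_normal(n_images)

def cv_r2(X, y, k=5):
    """k-fold cross-validated R^2 of an ordinary-least-squares fit with intercept."""
    idx = np.arange(len(y))
    rng.shuffle(idx)
    folds = np.array_split(idx, k)
    ss_res = ss_tot = 0.0
    for f in folds:
        train = np.setdiff1d(idx, f)
        Xtr = np.column_stack([np.ones(len(train)), X[train]])
        Xte = np.column_stack([np.ones(len(f)), X[f]])
        beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        pred = Xte @ beta
        ss_res += np.sum((y[f] - pred) ** 2)
        ss_tot += np.sum((y[f] - y[train].mean()) ** 2)
    return 1 - ss_res / ss_tot

for l in range(n_layers):
    print(f"layer {l + 1} alone: R^2 = {cv_r2(layer_summaries[:, [l]], brain):.3f}")
print(f"all layers combined: R^2 = {cv_r2(layer_summaries, brain):.3f}")
```

With a response that mixes several layers, the combined model explains more variance than any single layer, which is the pattern reported for higher brain areas.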

< Back to 2016 Symposia

What can we learn from #TheDress – in search for an explanation

Time/Room: Friday, May 13, 2016, 2:30 – 4:30 pm, Pavilion
Organizer(s): Annette Werner; Institute for Ophthalmic Research, Tübingen University, Germany
Presenters: Annette Werner, Anya Hurlbert, Christoph Witzel, Keiji Uchikawa, Bevil Conway, Lara Schlaffke

< Back to 2016 Symposia

Symposium Description

Few topics in colour research have generated so much interest in the science community and public alike as the recent #TheDress. The phenomenon refers to the observation that observers cannot agree on colour names for a dress seen in a particular photograph, i.e. colour judgements fall into at least two categories, namely blue&black and white&gold. Although individual differences in colour perception are well known, this phenomenon is still unprecedented since it uncovers a surprising ambiguity in colour vision – surprising because our visual brain was thought to reconstruct surface colour so successfully that it is experienced by the naive observer as an inherent property of objects. Understanding the origin of the perceptual dichotomy of #TheDress is therefore important not only in the context of the phenomenon itself but also for our comprehension of the neural computations of colour in general. Since its discovery, a number of hypotheses have been put forward in order to explain the phenomenon; these include individual differences in peripheral or sensory properties, such as variations in the entoptic filters of the eye or in the spectral sensitivities of the chromatic pathways; “high-end” explanations concern differences at cognitive stages, e.g. regarding the interpretation of the light field in a scene, or the use of priors for estimating the illuminant or the surface colour. The ambiguity in the case of #TheDress may arise because of the peculiar distribution of surface colours in the photo and the lack of further information in the background. The symposium shall gather the current experimental evidence and provide a profound basis for the discussion and evaluation of existing and novel hypotheses. The topic will be introduced by the organizer and concluded by a general discussion of the experimental findings of all presentations. Because of the widespread interest in the topic of #TheDress and its general importance for colour vision, we expect a large VSS audience, including students, postdocs, and senior scientists from all fields in vision research.

Presentations

The #Dress phenomenon – an empirical investigation into the role of the background

Speaker: Annette Werner; Institute for Ophthalmic Research, Tübingen University, Germany
Authors: Alisa Schmidt, Institute for Ophthalmic Research, Tübingen University, Germany

The #TheDress phenomenon refers to a dichotomy in colour perception which is specific to the photo of a blue&black dress, namely that most observers judge its colours as either blue/black or white/gold. Hypotheses explaining the phenomenon include individual variations in information processing at sensory as well as cognitive stages. In particular, it has been proposed that the lack of, or ambiguity in, background information leads observers to different conclusions about the illuminant and the light field. We will present results of matching experiments involving presentations of the real blue/black dress, mounted on differently coloured backgrounds and under the illuminations of two slide projectors, thereby mimicking the ambiguity of the photo. The results identify the use of information from the background as a source of the observed individual differences. The results are discussed in the context of the acquisition, the content and the use of “scene knowledge”.

Is that really #thedress? Individual variations in colour constancy for real illuminations and objects

Speaker: Anya Hurlbert; Institute of Neuroscience, University of Newcastle upon Tyne, UK
Authors: Stacey Aston, Bradley Pearce; Institute of Neuroscience, University of Newcastle upon Tyne, UK

One popular explanation for the individual variation in reported colours of #thedress is an individual variation in the underlying colour constancy mechanisms, which cause differences in the illumination estimated and subsequently discounted. Those who see the dress as ‘white/gold’ are discounting a ‘blueish’ illumination, while those who see it as ‘blue/black’ are discounting a ‘yellowish’ illumination. These underlying differences are brought into relief by the ambiguity of the original photograph. If this explanation holds, then similarly striking individual differences in colour constancy might also be visible in colour matching and naming tasks using real objects under real illuminations, and the conditions under which they are elicited may help to explain the particular power of #thedress. I will discuss results of colour constancy measurements using the real dress, which is almost universally reported to be ‘blue/black’ when illuminated by neutral, broad-band light, yet elicits variability in colour naming similar to the original photograph across observers within certain illumination conditions, most markedly for ambiguous and/or atypical illuminations. Colour constancy by both naming and matching is in fact relatively poor for the real dress and other unfamiliar items of clothing, but better for “blueish” illuminations than for other chromatic illuminations or ambiguous multiple-source illuminations. Overall, individual variations in colour constancy are significant, and depend on age and other individual factors.

Variation of subjective white-points along the daylight axis and the colour of the dress

Speaker: Christoph Witzel; Laboratoire Psychologie de la Perception, University Paris Descartes, France
Authors: Sophie Wuerger, University of Liverpool, UK, Anya Hurlbert, Institute of Neuroscience, University of Newcastle upon Tyne, UK

We review the evidence, from different data sets collected under different viewing conditions, illumination sources, and measurement protocols, for intra- and interobserver variability in “generic subjective white-point” settings along the daylight locus. By “generic subjective white-point” we mean the subjective white-point independent of the specific context. We specifically examine the evidence across all datasets for a “blue” bias in subjective white-points (i.e. increased variability or reduced sensitivity in the bluish direction). We compare the extent of daylight-locus variability generally and variability in the “bluish” direction specifically of subjective white points across these data sets (for different luminance levels and light source types). The variability in subjective white-point may correspond to subjective “priors” on illumination chromaticity. In turn, individual differences in assumptions about the specific illumination chromaticity on “the dress” (in the recent internet phenomenon) is widely thought to explain the individual differences in reported dress colours. We therefore compare the variability in generic white-point settings collated across these datasets with the variability in generic white-point settings made in the specific context of the dress (Witzel and O’Regan, ECVP 2015). Our analysis suggests that (1) there is an overall “blue” bias in generic subjective white-point settings and (2) the variability in generic subjective white-point settings is insufficient to explain the variability in reported dress colours. Instead, the perceived colors of the dress depend on assumptions about the illumination that are specific to that particular photo of the dress.

Prediction for individual differences in appearance of the “dress” by the optimal color hypothesis

Speaker: Keiji Uchikawa; Department of Information Processing, Tokyo Institute of Technology, Japan
Authors: Takuma Morimoto, Tomohisa Matsumoto; Department of Information Processing, Tokyo Institute of Technology, Japan

When the luminances of pixels in the blue-black/white-gold “dress” image were plotted on the MacLeod-Boynton chromaticity diagram, they appeared to form two clusters, corresponding to the white/blue and the gold/black parts. The approach we took to solve the dress problem was to apply our optimal color hypothesis to estimate the illuminant in the dress image. In the optimal color hypothesis, the visual system picks the optimal color distribution that best fits the scene luminance distribution. The peak of the best-fit optimal color distribution corresponds to the illuminant chromaticity. We tried to find the best-fit optimal color distribution for the dress color distribution. When the illuminant level was assumed to be low, the best-fit color temperature was high (20000K); under this dark-blue illuminant the dress colors should look white-gold. When the illuminant level was assumed to be high, a lower-temperature optimal color distribution (5000K) fitted best; under this bright-white illuminant the dress colors should appear blue-black. Thus, for the dress image the best-fit optimal color distribution changed depending on illuminant intensity. These two stable illuminant estimates may cause the individual differences in the appearance of the dress: if you choose a bright (or dark) illuminant, the dress appears blue-black (or white-gold). When the chromaticity of the dress was rotated by 180 degrees in the chromaticity diagram, it appeared blue-gold without individual differences; in this case the optimal color hypothesis predicted an illuminant with almost no ambiguity. We tested individual differences using simple patterns in experiments, and the results supported our prediction.

Mechanisms of color perception and cognition covered by #thedress

Speaker: Bevil Conway; Department of Brain and Cognitive Sciences, MIT, Cambridge MA, USA
Authors: Rosa Lafer-Sousa, Katherine Hermann

Color is notoriously ambiguous – many color illusions exist – but until now it has been thought that all people with normal color vision experience color illusions the same way. How does the visual system resolve color ambiguity? Here, we present work that addresses this question by quantifying people’s perception of a particularly ambiguous image, ‘the dress’ photograph. The colors of the individual pixels in the photograph when viewed in isolation are light-blue or brown, but popular accounts suggest the dress appears either white/gold or blue/black. We tested more than 1400 people, both online and under controlled laboratory conditions. Subjects first completed the sentence: “this is a ___ and ___ dress”. Then they performed a color-matching experiment that did not depend on language. Surprisingly, the results uncovered three groups of subjects: white/gold observers, blue/black observers and blue/brown observers. Our findings show that the brain resolves ambiguity in ‘the dress’ into one of three stable states; a minority of people (~11%) switched which colors they saw. It is clear that what we see depends on both retinal stimulation and internal knowledge about the world. Cases of multi-stability such as ‘the dress’ provide a rare opportunity to investigate this interplay. In particular, we go on to demonstrate that ‘the dress’ photograph can be used as a tool to discover that skin reflectance is a particularly important implicit cue that the brain uses to estimate the color of the light source and resolve color ambiguity, shedding light on the role of high-level cues in color perception.

The Brain’s Dress Code: How The Dress allows us to decode the neuronal pathway of an optical illusion

Speaker: Lara Schlaffke; Department of Neurology, BG University Hospital Bergmannsheil, Bochum, Germany
Authors: Anne Golisch, Lauren M. Haag, Melanie Lenz, Stefanie Heba, Silke Lissek, Tobias Schmidt-Wilcke, Ulf T. Eysel, Martin Tegenthoff

Optical illusions have broadened our understanding of the brain’s role in visual perception1–3. A modern-day optical illusion emerged from a posted photo of a striped dress, which some perceived as white and gold and others as blue and black. Theories of the differences have been proposed, including colour constancy, contextual integration, and the principle of ambiguous forms4; however, no consensus has yet been reached. The fact that one group sees a white/gold dress, instead of the actual blue/black dress, provides a control and therefore a unique opportunity in vision research, where two groups perceive the same object differently. Using functional magnetic resonance imaging (fMRI) we can identify human brain regions that are involved in this optical illusion of colour perception and investigate the neural correlates that underlie the observed differences. Furthermore, open questions in visual neuroscience concerning the computation of complex visual scenes can be addressed. Here we show, using fMRI, that those who perceive The Dress as white/gold (n=14) have higher activation in response to The Dress in brain regions critically involved in visual processing and conflict management (V2, V4, as well as frontal and parietal brain areas), as compared to those who perceive The Dress as blue/black (n=14). These results are consistent with the theory of top-down modulation5 and extend the Retinex theory6 to include differing strategies the brain uses to form a coherent representation of the world around us. This provides a fundamental building block for studying interindividual differences in visual processing.

< Back to 2016 Symposia

The parietal cortex in vision, cognition, and action

Time/Room: Friday, May 13, 2016, 5:00 – 7:00 pm, Pavilion
Organizer(s): Yaoda Xu and David Freedman; Harvard University and University of Chicago
Presenters: Sabine Kastner, Yaoda Xu, Jacqueline Gottlieb, David Freedman, Peter Janssen, Melvyn Goodale

< Back to 2016 Symposia

Symposium Description

The primate parietal cortex has been associated with a diverse set of operations. Early evidence highlighted the role of this brain region in spatial, attentional, and action-related processing. More recent evidence, however, suggests a role for parietal cortex in non-spatial and cognitive functions such as object representation, categorization, short-term memory, number processing and decision making. How should we understand its function, given the wide array of sensory, cognitive and motor signals found to be encoded in parietal areas? Are there functionally dissociable regions within the primate parietal cortex, each participating in distinct functions? Or are the same parietal regions involved in multiple functions? Is it possible to form a unified account of parietal cortex’s role in perception, action and cognition? In this symposium, by bringing together researchers from monkey neurophysiology and human brain imaging, we will first ask the speakers to present our current understanding of the role of parietal cortex in visual spatial, non-spatial and cognitive functions. We will then ask the speakers whether the framework they have developed to understand parietal involvement in a particular task setting can help explain its role in other task contexts, and whether there are fundamental features of parietal cortex that enable it to participate in such a diverse set of tasks and functions. There will be a total of six speakers. Sabine Kastner will address spatial mapping, attention priority signals and object representations in human parietal cortex. Yaoda Xu will describe human parietal cortex’s involvement in visual short-term memory and object representation and their correspondence with behavior. Jacqueline Gottlieb will describe attention and decision related signals in monkey parietal cortex. David Freedman will examine monkey parietal cortex’s involvement in visual categorization, category learning, and working memory and its interaction with other cortical areas. Peter Janssen will detail the functional organization of the monkey intraparietal sulcus in relation to grasping and 3D object representation. Melvyn Goodale will discuss the role of the parietal cortex in the control of action.

Presentations

Comparative studies of posterior parietal cortex in human and non-human primates

Speaker: Sabine Kastner; Department of Psychology and The Princeton Neuroscience Institute, Princeton University

The primate parietal cortex serves many functions, ranging from integrating sensory signals and deriving motor plans to playing a critical role in cognitive functions related to object categorization, attentional selection, working memory or decision making. This brain region undergoes significant changes during evolution and can therefore serve as a model for a better understanding of the evolution of cognition. I will present comparative studies obtained in human and non-human primates using essentially identical methods and tasks, addressing topographic and functional organization, neural representation of object information and attention-related signals. Topographic and functional mapping studies identified not only the parietal regions that primate species have in common, but also revealed several human-specific areas along the intraparietal sulcus. fMRI studies on parietal object representations show that in humans they resemble those typically found in ventral visual cortex and appear to be more complex than those observed in non-human primates, suggesting advanced functionality possibly related to the evolving human-specific tool network. Finally, electrophysiological signatures of parietal attention signals in space-based attention tasks are similar in many respects across primate species, providing evidence for preserved functionality in this particular cognitive domain. Together, our comparative studies contribute to a more profound understanding of the evolution of cognitive domains related to object perception and attention in primates.

Decoding Visual Representations in the Human Parietal Cortex

Speaker: Yaoda Xu; Psychology Department, Harvard University

Although visual processing has been mainly associated with the primate occipital/temporal cortices, the processing of sophisticated visual information in the primate parietal cortex has also been reported by a number of studies. In this talk, I will examine the range of visual stimuli that can be represented in the human parietal cortex and the nature of these representations in terms of their distractor resistance, task dependency and behavioral relevance. I will then directly compare object representation similarity between occipital/temporal and parietal cortices. Together, these results argue against a “content-poor” view of parietal cortex’s role in attention. Instead, they suggest that parietal cortex is “content-rich” and capable of directly participating in goal-driven visual information representation in the brain. This view has the potential to help us understand the role of parietal cortex in other tasks such as decision-making and action, both of which demand the online processing of visual information. Perhaps one way to understand the function of parietal cortex is to view it as a global workspace where sensory information is retained, integrated, and evaluated to guide the execution of appropriate actions.

Multi-dimensional parietal signals for coordinating attention and decision making

Speaker: Jacqueline Gottlieb; Department of Neuroscience, Kavli Institute for Brain Science, Columbia University

In humans and non-human primates, the parietal lobe plays a key role in spatial attention – the ability to extract information from regions of space. This role is thought to be mediated by “priority” maps that highlight attention-worthy locations, and provide top-down feedback for motor orienting and attention allocation. Traditionally, priority signals have been characterized as being purely spatial – i.e., encoding the desired locus of gaze or attention regardless of the context in which the brain generates that selection. Here I argue, however, based on non-spatial modulations found in the monkey lateral intraparietal area, that non-spatial responses are critical for allowing the brain to coordinate attention with action – i.e., to estimate the significance and relative utility of competing sensory cues in the immediate task context. The results prompt an integrative view whereby attention is not a disembodied entity that acts on sensory or motor representations, but an organically emerging process that depends on dynamic interactions within sensorimotor loops.

Categorical Decision Making and Category Learning in Parietal and Prefrontal Cortices

Speaker: David Freedman; Department of Neurobiology and Grossman Institute for Neuroscience, Quantitative Biology, and Human Behavior, The University of Chicago

We have a remarkable ability to recognize the behavioral significance, or category membership, of incoming sensory stimuli. In the visual system, much is known about how simple visual features (such as color, orientation and direction of motion) are processed in early stages of the visual system. However, much less is known about how the brain learns and recognizes categorical information that gives meaning to incoming stimuli. This talk will discuss neurophysiological and behavioral experiments aimed at understanding the mechanisms underlying visual categorization and decision making, with a focus on the impact of category learning on underlying neuronal representations in the posterior parietal cortex (PPC) and prefrontal cortex (PFC). We recorded from PPC both before and after training on a visual categorization task. This revealed that categorization training influenced both visual and cognitive encoding in PPC, with a marked enhancement of memory-related delay-period encoding during the categorization task, which was not observed during a motion discrimination task prior to categorization training. In contrast, the PFC exhibited strong delay-period encoding during both discrimination and categorization tasks. This reveals a dissociation between the roles of PFC and PPC in decision making and short-term memory, with generalized engagement of PFC across a wider range of tasks, in contrast with more task-specific and training-dependent mnemonic encoding in PPC.

The functional organization of the intraparietal sulcus in the macaque monkey

Speaker: Peter Janssen; Laboratory for Neuro- and Psychophysiology, Department of Neurosciences, KU Leuven

The lateral bank of the anterior intraparietal sulcus (IPS) is critical for object grasping. Functional magnetic resonance imaging (fMRI) (Durand et al., 2007) and single-cell recording studies (Srivastava, Orban, De Maziere, & Janssen, 2009) in macaque monkeys have demonstrated that neurons in the anterior intraparietal area (AIP) are selective for disparity-defined three-dimensional (3D) object shape. Importantly, the use of the same stimuli and tasks in macaque monkeys and humans has enabled us to infer possible homologies between the two species. I will review more recent studies combining fMRI, single-cell recordings, electrical microstimulation and reversible inactivation that have shed light on the functional organization of the IPS. Using an integrated approach (Premereur, Van Dromme, Romero, Vanduffel, & Janssen, 2015), we could identify differences in the effective connectivity between nearby patches of neurons with very similar response properties, resolving a long-standing controversy between anatomical and physiological studies with respect to the spatial extent of neighboring areas AIP and LIP. In addition, the effective connectivity of the different IPS sectors has clarified the functional organization of the anterior IPS. Finally, reversible inactivation during fMRI can demonstrate how visual information flows within the widespread functional network involved in 3D object processing. These results are not only critical to understand the role of the macaque parietal cortex, but will also contribute to a better understanding of the parietal cortex in humans.

The role of the posterior parietal cortex in the control of action

Speaker: Melvyn Goodale; The Brain and Mind Institute, The University of Western Ontario

A long history of neuropsychological research has shown that the visual control of grasping and other skilled movements depends on the integrity of visual projections to the dorsal visual stream in the posterior parietal cortex. Patients with lesions to the dorsal stream are unable to direct their hands towards or grasp visual targets in the contralesional visual field, despite being able to describe the size, orientation, and location of those targets. Other patients with lesions to the ventral stream are able to grasp objects accurately and efficiently despite being unable to report the very object features controlling their actions. More recent imaging studies of both neurological patients and healthy controls have confirmed the role of the dorsal stream in transforming visual information into the required coordinates for action. In this presentation, I will discuss research from our lab showing that visual information about the metrical properties of goal objects may reach the dorsal stream via pathways that bypass the geniculostriate pathway. I will go on to show that manual interactions with some classes of objects, such as tools, require that visual information about those objects be processed by circuits in both the ventral and the dorsal stream. Finally, I will speculate that some of the other higher-order functions of the parietal lobe, such as its evident role in numerical processing and working memory, may have evolved from the need to plan actions to multiple goals.

< Back to 2016 Symposia

Boundaries in Spatial Navigation and Visual Scene Perception

Time/Room: Friday, May 13, 2016, 12:00 – 2:00 pm, Pavilion
Organizer(s): Soojin Park, Johns Hopkins University and Sang Ah Lee, University of Trento
Presenters: Sang Ah Lee, Joshua B Julian, Nathaniel J. Killian, Tom Hartley, Soojin Park, Katrina Ferrara

< Back to 2016 Symposia

Symposium Description

The ability to navigate in the world using vision is intrinsically tied to the ability to analyze spatial relationships within a scene. For the past few decades, navigation researchers have shown that humans and nonhuman animals alike compute locations by using a spontaneously encoded geometry of the 3D environmental boundary layouts. This finding has been supported by neural evidence showing boundary-specific inputs to hippocampal place-mapping. More recently, researchers in visual scene perception have shown that boundaries play an important role not only in defining geometry for spatial navigation, but also in visual scene perception. How are boundary representations in scene perception related to those in navigation? What are the defining features of boundaries, and what are their neural correlates? The aim of this symposium is to bridge research from various subfields of cognitive science to discuss the specific role of boundaries in the processing of spatial information and to converge on a coherent theoretical framework for studying visual representations of boundaries. To achieve this, we have brought together an interdisciplinary group of speakers to present studies of boundary representations in a broad range of subject populations, from rodents, to primates, to individuals with genetic disorders, using various experimental methods (developmental, behavioral, fMRI, TMS, single-cell and population coding). The theoretical flow of the symposium will start with behavioral studies showing the specificity and primacy of boundaries in spatial navigation and memory in both humans and a wide range of nonhuman vertebrates. Then, we will ask whether neural representations of boundary geometry can be derived from visual input, as opposed to active navigation, using primates’ saccadic eye movements and human scene perception. Lastly, we will present evidence of spatial impairment marked by a dysfunction of boundary-processing mechanisms in Williams Syndrome. We believe that this symposium will be of great interest to VSS attendees for the following reasons: First, these convergent findings from independent research approaches to spatial representations and their neural correlates will make a powerful impact on theories of spatial information processing, from visual perception to hippocampal spatial mapping. Second, a better understanding of boundary geometry can broadly inform any research that involves visuo-spatial representations, such as studies on spatial perspective and saccadic eye movements. Finally, the methodological breadth of this symposium, and its aim to integrate these approaches into a coherent picture, will provide a new perspective on the power of multidisciplinary research in visual and cognitive sciences.

Presentations

Boundaries in space: A comparative approach

Speaker: Sang Ah Lee; Center for Mind/Brain Sciences, University of Trento

Spatial navigation provides a unique window into the evolutionary and developmental origins of complex behaviors and memory, due to its richness in representation and computation, its striking similarities between distantly related species, its neural specificity, and its transformation across human development. Environmental boundaries have been shown to play a crucial role in both neural and behavioral studies of spatial representation. In this talk, I will discuss evidence on boundary coding on three different levels: First, I will share my findings showing the primacy and specificity of visual representations of 3D environmental “boundaries” in early spatial navigation in children. Second, I will argue that the cognitive mechanisms underlying boundary representations are shared and widespread across the phylogenetic tree. Finally, I will bring together insights gathered from behavioral findings to investigate the neural underpinnings of boundary coding. From the firing of neurons in a navigating rat’s brain, to a child’s developing understanding of abstract space, I will argue that boundary representation is a fundamental, evolutionarily ancient ability that serves as a basis for spatial cognition and behavior.

Mechanisms for encoding navigational boundaries in the mammalian brain

Speaker: Joshua B Julian; Department of Psychology, University of Pennsylvania
Authors: Alex T Keinath, Department of Psychology, University of Pennsylvania; Jack Ryan, Department of Psychology, University of Pennsylvania; Roy H Hamilton, Department of Neurology, University of Pennsylvania; Isabel A Muzzio, Department of Biology, University of Texas: San Antonio; Russell A Epstein, Department of Psychology, University of Pennsylvania

Thirty years of research suggests that environmental boundaries exert powerful control over navigational behavior, often to the exclusion of other navigationally-relevant cues, such as objects or visual surface textures. Here we present findings from experiments in mice and humans demonstrating the existence of specialized mechanisms for processing boundaries during navigation. In the first study, we examined the navigational behavior of disoriented mice trained to locate rewards in two chambers with geometrically identical boundaries, distinguishable based on the visual textures along one wall. We observed that although visual textures were used to identify the chambers, those very same cues were not used to disambiguate facing directions within a chamber. Rather, recovery of facing directions relied exclusively on boundary geometry. These results provide evidence for dissociable processes for representing boundaries and other visual cues. In a second line of work, we tested whether the human visual system contains neural regions specialized for processing of boundaries. Specifically, we tested the prediction that the Occipital Place Area (OPA) might play a critical role in boundary-based navigation, by extracting boundary information from visual scenes. To do so, we used transcranial magnetic stimulation (TMS) to interrupt processing in the OPA during a navigation task that required participants to learn object locations relative to boundaries and non-boundary cues. We found that TMS of the OPA impaired learning of locations relative to boundaries, but not relative to landmark objects or large-scale visual textures. Taken together, these results provide evidence for dedicated neural circuitry for representing boundary information.

Neuronal representation of visual borders in the primate entorhinal cortex

Speaker: Nathaniel J. Killian; Department of Neurosurgery, Massachusetts General Hospital-Harvard Medical School
Authors: Elizabeth A Buffalo, Department of Physiology and Biophysics, University of Washington

The entorhinal cortex (EC) is critical to the formation of memories for complex visual relationships. Thus we might expect that EC neurons encode visual scenes within a consistent spatial framework to facilitate associations between items and the places where they are encountered. In particular, encoding of visual borders could provide a means to anchor visual scene information in allocentric coordinates. Studies of the rodent EC have revealed neurons that represent location, heading, and borders when an animal is exploring an environment. Because of interspecies differences in vision and exploratory behavior, we reasoned that the primate EC may represent visual space in a manner analogous to the rodent EC, but without requiring physical visits to particular places or items. We recorded activity of EC neurons in non-human primates (Macaca mulatta) that were head-fixed and freely viewing novel photographs presented in a fixed external reference frame. We identified visual border cells, neurons that had increased firing rate when gaze was close to one or more image borders. Border cells were co-localized with neurons that represented visual space in a grid-like manner and with neurons that encoded the angular direction of saccadic eye movements. As a population, primate EC neurons appear to represent gaze location, gaze movement direction, and scene boundaries. These spatial representations were detected in the presence of changing visual content, suggesting that the EC provides a consistent spatial framework for encoding visual experiences.

Investigating cortical encoding of visual parameters relevant to spatial cognition and environmental geometry in humans.

Speaker: Tom Hartley; Department of Psychology, University of York, UK
Authors: David Watson, Department of Psychology, University of York, UK; Tim Andrews, Department of Psychology, University of York, UK

Studies of the firing properties of cells in the rodent hippocampal formation indicate an important role for “boundary cells” in anchoring the allocentric firing fields of place and grid cells. To understand how spatial variables such as the distance to local boundaries might be derived from visual input in humans, we are investigating links between the statistical properties of natural scenes and patterns of neural response in scene-selective visual cortex. In our latest work we used a data-driven analysis to select clusters of natural scenes from a large database, solely on the basis of their image properties. Although these visually-defined clusters did not correspond to typical experimenter-defined categories used in earlier work, we found that they elicited distinct and reliable patterns of neural response in parahippocampal cortex, and that the relative similarity of the response patterns was better explained in terms of low-level visual properties of the images than by local semantic information. Our results suggest that human parahippocampal cortex encodes visual parameters (including properties relevant to environmental geometry). Our approach opens the way to isolating these parameters and investigating their relationship to spatial variables.
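
A rough sketch of this data-driven approach, with strong simplifying assumptions: images are summarized by a few crude low-level statistics, clustered on those statistics alone, and the similarity structure of the descriptors is compared with the similarity structure of (here, synthetic) response patterns. The descriptor, cluster count, and all data are illustrative, not those used in the study.

```python
# Minimal sketch: cluster scenes by low-level image statistics and relate
# descriptor similarity to (synthetic) neural response-pattern similarity.
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
n_images = 60
images = rng.random((n_images, 64, 64))          # stand-in grayscale "scenes"

def lowlevel_descriptor(img):
    """Crude low-level summary: mean, contrast, gradient energy, coarse spectral energy."""
    gy, gx = np.gradient(img)
    return np.array([img.mean(), img.std(),
                     np.abs(gx).mean(), np.abs(gy).mean(),
                     np.abs(np.fft.fft2(img))[1:8, 1:8].mean()])

desc = np.array([lowlevel_descriptor(im) for im in images])
desc = (desc - desc.mean(0)) / desc.std(0)

# Cluster scenes purely on image properties (k = 4 chosen arbitrarily here).
_, cluster_labels = kmeans2(desc, 4, minit='points')

# Compare image-property similarity with similarity of response patterns
# (synthetic here; in the study, parahippocampal response patterns).
neural_patterns = desc @ rng.standard_normal((5, 40)) + rng.standard_normal((n_images, 40))
rho, _ = spearmanr(pdist(desc), pdist(neural_patterns, metric='correlation'))
print("cluster sizes:", np.bincount(cluster_labels), "  descriptor-neural RSA rho:", round(rho, 3))
```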

Complementary neural representation of scene boundaries

Speaker: Soojin Park; Department of Cognitive Science, Johns Hopkins University
Authors: Katrina Ferrara, Center for Brain Plasticity and Recovery, Georgetown University

Environmental boundaries play a critical role in defining spatial geometry and restrict our movement within an environment. Developmental research with 4-year-olds shows that children are able to reorient themselves by the geometry of a curb that is only 2 cm high, but fail to do so when the curb boundary is replaced by a flat mat on the floor (Lee & Spelke, 2011). In this talk, we will present evidence that such fine-grained sensitivity to a 3D boundary cue is represented in visual scene processing regions of the brain, parahippocampal place area (PPA) and retrosplenial cortex (RSC). First, we will present univariate and multivoxel pattern data from both regions to suggest that they play complementary roles in the representation of boundary cues. The PPA shows disproportionately strong sensitivity to the presence of a slight vertical boundary, demonstrating a neural signature that corresponds to children’s behavioral sensitivity to slight 3D vertical cues (i.e., the curb boundary). RSC did not display this sensitivity. We will argue that this sensitivity does not simply reflect low-level image differences across conditions. Second, we investigate the nature of boundary representation in RSC by parametrically varying the height of boundaries in the vertical dimension. We find that RSC’s response matches a behavioral categorical decision point for the boundary’s functional affordance (e.g., whether the boundary limits the viewer’s potential navigation or not). Collectively, this research serves to highlight boundary structure as a key component of space that is represented in qualitatively different ways across two scene-selective brain regions.

Neural and behavioral sensitivity to boundary cues in Williams syndrome

Speaker: Katrina Ferrara; Center for Brain Plasticity and Recovery, Georgetown University
Authors: Barbara Landau, Department of Cognitive Science, Johns Hopkins University; Soojin Park, Department of Cognitive Science, Johns Hopkins University

Boundaries are fundamental features that define a scene and contribute to its geometric shape. Our previous research using fMRI demonstrates a distinct sensitivity to the presence of vertical boundaries in scene representation by the parahippocampal place area (PPA) in healthy adults (Ferrara & Park, 2014). In the present research, we show that this sensitivity to boundaries is impaired by genetic deficit. Studying populations with spatial disorders can provide insight to potential brain/behavior links that may be difficult to detect in healthy adults. We couple behavioral and neuroimaging methods to study individuals with Williams syndrome (WS), a disorder characterized by the deletion of 25 genes and severe impairment in a range of spatial functions. When both humans and animals are disoriented in a rectangular space, they are able to reorient themselves by metric information conveyed by the enclosure’s boundaries (e.g., long wall vs. short wall). Using this reorientation process as a measure, we find that individuals with WS are unable to reorient by a small boundary cue, in stark contrast to the behavior of typically developing (TD) children (Lee & Spelke, 2011). Using fMRI, we find a linked neural pattern in that the WS PPA does not detect the presence of a small boundary within a scene. Taken together, these results demonstrate that atypical patterns of reorientation correspond with less fine-grained representation of boundaries at the neural level in WS. This suggests that sensitivity to the geometry of boundaries is one of the core impairments that underlies the WS reorientation deficit.

< Back to 2016 Symposia
