Symposium Submission Guidelines & Policies

Policies

The symposium organizer must be a current 2025 member in good standing.
Invited speakers must register for the meeting but need not be members.
No speaker or organizer can participate in more than one symposium.
Speaker substitutions are not allowed unless due to unforeseeable, valid reasons after submission.
Online presentations are not allowed.
If a symposium talk has more than one author, it must be presented by the first author.
Submitting a symposium proposal or speaking in a symposium does not prevent you from submitting an abstract for a talk or poster presentation at VSS.
Before submitting a proposal, organizers must ensure that all speakers are committed to participating in the symposium and registering for the meeting.

For questions about Symposium Submission Policies, please contact us at

Symposia are presented at the VSS 2025 Annual Meeting on the first day (Friday).

VSS 2025 will be a fully in-person meeting with no virtual components.

Four to six symposia will be scheduled, each in a 2-hour time slot.

Symposia can be organized along the lines of content or methodology, but in every case, talks within a symposium should focus on broader conceptual themes than a typical VSS presentation.

Proposals are evaluated by the VSS Board of Directors based on the following criteria:

Scientific merit
Theoretical and/or methodological innovation
Timeliness
Breadth and appeal to a substantial number of VSS attendees
Lack of overlap with the regular program and recent symposia
A slate of speakers that provides appropriately broad representation of the VSS membership’s national and international provenance and range of approaches.

Symposium Format

The recommended format is four to six talks, followed by a panel discussion that involves both the speakers and the audience. The allocation of time for talks and Q&A within the 2-hour session is flexible and can be determined at the organizers’ discretion. This may include Q&A after each individual talk, concentrating it at the end of the session through a panel discussion, or a combination of both approaches. Other formats will be considered, but proposals for other formats should include a clear rationale. Proposals from early career investigators are encouraged.

Symposium Information

The symposium proposal is submitted using a multi-page form that includes information describing the symposium and the talks in the symposium. A symposium may have a maximum of three organizers. A minimum of four talks is required; up to six talks are allowed. Talks must be entered in the order they will be presented. It is best to collect information about the individual talks from their authors before you start the submission process.

As the symposium organizer, you must have prior approval from your talk presenters that they consent to participate in this symposium (and no other symposium), and to register for and attend VSS 2025 in person to present their talk. The organizer must also agree to the Symposium Policies and disclose any Conflicts of Interest.

Required symposium information:

Symposium title.
Brief description of the symposium (maximum 100 words). Appears on the symposium overview page.
Full description of the symposium (maximum 500 words). Appears on the symposium detail page.
Estimated attendance.
Name, affiliation, and contact information for the organizers (maximum 3 organizers).
Acknowledgements (optional).

Required information for each talk:

Talk title.
Talk abstract (maximum 250 words).
Author names and affiliations.
Email and country of citizenship of the talk presenter (first author).
The presenter must agree to an Ethics statement.
The presenter must disclose any Conflicts of Interest.

Organizer and Speaker Requirements

The symposium organizers (maximum of three) must be current VSS members (for 2025), but invited speakers are not required to be VSS members. All speakers are required to register for the meeting. Submitting a symposium proposal or speaking in a symposium does not prevent organizers or speakers from submitting an abstract for a talk or poster presentation at VSS. An individual may participate as organizer or speaker in only one symposium. The symposium organizer(s) may be a speaker in the symposium.

Organizers must ensure that all speakers are committed to participating in the symposium before submitting a proposal, and organizers must also ensure that speakers have not agreed to participate in more than one symposium. If a symposium talk has more than one author, it must be presented by the first author. For each speaker, please provide up to three references to published articles relevant to the proposed talk.

Disclosure of Conflicts of Interest

It is the responsibility of the symposium organizer to disclose any relevant commercial relationships or other conflicts of interest for all speakers and their co-authors in the submitted proposal. This information must include the name of any organization with which a commercial relationship exists for both the speaker and each co-author.

Each symposium speaker must verbally disclose all relevant commercial relationships or conflicts of interest at the beginning of their talk. This information should also be included on a slide during the presentation, naming the organization(s) involved in any commercial relationship(s) for the speaker and each co-author.

Compliance with this policy is a requirement.

Symposium Review

Proposals for Symposia are evaluated by the VSS Board of Directors. In addition to considering the VSS 2025 focus outlined above, our criteria include scientific merit, timeliness, theoretical innovation and breadth, methodological innovation, lack of overlap with the regular program and recent symposia, and speaker composition.

Submission Schedule

Submissions Open: October 22, 2024
Submissions Close: November 15, 2024
Notification of Accepted Symposia: November 25, 2024

Submitting a Symposium

To submit a symposium, log in to your MyVSS Account or create a new MyVSS Account, pay for your 2025 membership, and then click the Submit a Symposium button.

For questions about Symposium Submissions, please contact us at

2024 Symposia

Neurodiversity in visual functioning: Moving beyond case-control studies

Friday, May 17, 2024, 12:00 – 2:00 pm, Talk Room 1

Organizers: Catherine Manning¹, Michael-Paul Schallmo²; ¹University of Reading, UK, ²University of Minnesota

Visual functioning in psychiatric and developmental conditions is typically studied by comparing a single diagnosis against a control group. However, this approach cannot tell us whether atypical visual functioning is condition-specific or shared across conditions, and it neglects co-occurrence and heterogeneity. Accordingly, recent conceptualisations have moved away from traditional diagnostic boundaries towards considering transdiagnostic dimensions of neurodiversity. This symposium will bring these recent conceptual advances to the broader VSS community, through cutting-edge work spanning conditions and methods. We will first present studies that directly compare conditions to uncover convergence and divergence, before moving towards transdiagnostic studies of visual functioning.

Large-scale visual neural datasets: where do we go from here?

Friday, May 17, 2024, 12:00 – 2:00 pm, Talk Room 2

Organizers: Alessandro Gifford¹, Kendrick Kay²; ¹Freie Universität Berlin, ²University of Minnesota

Recently, there has been an increase in both collection and use of large-scale visual neural datasets (LSVNDs), suggesting that the field of vision science is entering a new era of big open data. This transformation raises new and exciting questions about LSVNDs: their potential strengths, their potential pitfalls, how they can promote theory formation, and what LSVNDs we need most. This symposium addresses these questions through six talks from both LSVND creators and users, along with a guided interactive discussion with the audience aimed at sharing knowledge among VSS members and setting community-centered goals.

The temporal evolution of visual perception

Friday, May 17, 2024, 2:30 – 4:30 pm, Talk Room 1

Organizers: Lina Teichmann¹, Chris Baker¹; ¹Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, USA

Visual processing is highly dynamic with neural representations evolving over several hundred milliseconds. This symposium will present multiple perspectives on the time course of visual processing, giving insight into how light falling on the retina ultimately gives rise to rich visual percepts. The speakers will focus on different methods (including EEG, MEG, intracranial recordings, and behaviour) and different domains of vision (including colour, object recognition, social perception and attention). Collectively, the work presented in this symposium will provide novel insights into the dynamic nature of visual perception across the visual hierarchy.

Attention: accept, reject, or major revisions?

Friday, May 17, 2024, 2:30 – 4:30 pm, Talk Room 2

Organizers: Alon Zivony¹; ¹University of Sheffield

The concept of “attention” has been criticized as theoretically incoherent and even unsuitable for scientific research. How should we, as a field, respond to these criticisms? Should we avoid using the concept, change the way we conceptualize attention, or simply continue with our research as usual? Our speakers bring different perspectives from psychology and philosophy in an attempt to answer this question. We provide an overview of some of the difficulties with conceptualizing attention, as well as practical and theoretical solutions to these problems. In doing so, we hope to promote a better science of attention and related phenomena.

The Multifaceted effects of blindness and how sight might be restored

Friday, May 17, 2024, 5:00 – 7:00 pm, Talk Room 1

Organizer: Ella Striem-Amit¹; ¹Georgetown University

Congenital blindness illustrates the developmental roots of visual cortex functions. In our symposium, speakers from diverse academic careers present perspectives on the multifaceted effects of blindness on the brain and behavior. The speakers will describe the effect of sight loss on multisensory properties and on the visual cortex, highlighting differential effects in areas typically responding to motion and faces, and divergence of plasticity across individuals. We will also discuss the limitations of visual prostheses for restoring sight, and how they may be addressed. Altogether, our symposium will call attention to the substantial impact of plasticity and possibilities to overcome it.

Using deep networks to re-imagine object-based attention and perception

Friday, May 17, 2024, 5:00 – 7:00 pm, Talk Room 2

Organizers: Hossein Adeli¹, Seoyoung Ahn², Gregory Zelinsky²; ¹Columbia University, ²Stony Brook University

What are the computational mechanisms that transform visual features into coherent object percepts, and what role does attention play in this process? The speakers in this symposium will use various state-of-the-art deep neural network models to reexamine the cognitive and neural mechanisms underlying object-based attention and perception. They will also explore new computational mechanisms for how the visual system groups visual features into coherent object percepts. This symposium will lay the foundation for the next generation of object-based attention models, ones that harness recent computational tools to advance our understanding of the object-centric nature of human perception.

Using deep networks to re-imagine object-based attention and perception

Symposium: Friday, May 17, 2024, 5:00 – 7:00 pm, Talk Room 2

Organizers: Hossein Adeli¹, Seoyoung Ahn², Gregory Zelinsky²; ¹Columbia University, ²Stony Brook University
Presenters: Patrick Cavanagh, Frank Tong, Paolo Papale, Alekh Karkada Ashok, Hossein Adeli, Melissa Le-Hoa Võ

What can Deep Neural Network (DNN) methods tell us about the brain mechanisms that transform visual features into object percepts? Using different state-of-the-art models, the speakers in this symposium will reexamine different cognitive and neural mechanisms of object-based attention (OBA) and perception and consider new computational mechanisms for how the visual system groups visual features into coherent object percepts. Our first speaker, Patrick Cavanagh, helped create the field of OBA and is therefore uniquely suited to give a perspective on how this question, essentially the feature-binding problem, has evolved over the years and has been shaped by paradigms and available methods. He will conclude by outlining his vision for how DNN architectures create new perspectives on understanding OBA. The next two speakers will review the recent behavioral and neural findings on object-based attention and feature grouping. Frank Tong will discuss the neural and behavioral signatures of OBA through the utilization of fMRI and eye tracking methods. He will demonstrate how the human visual system represents objects across the hierarchy of visual areas. Paolo Papale will discuss neurophysiological evidence for the role of OBA and grouping in object perception. Using stimuli systematically increasing in complexity from lines to natural objects (against cluttered backgrounds) he shows that OBA and grouping are iterative processes. Both talks will also include discussions of current modeling efforts, and what additional measures may be needed to realize more human-like object perception. The following two talks will provide concrete examples of how DNNs can be used to predict human behavior during different tasks. Lore Goetschalckx will focus on the importance of considering the time-course of grouping in object perception and will discuss her recent work on developing a method to analyze dynamics of different models. 
Using this method, she shows how a deep recurrent model trained on an object grouping task predicts human reaction time. Hossein Adeli will review modeling work on three theories of how OBA binds features into objects: one that implements object-files, another that uses generative processes to reconstruct an object percept, and a third model of spreading attention through association fields. In the context of these modeling studies, he will describe how each of these mechanisms was implemented as a DNN architecture. Lastly, Melissa Võ will drive home the importance of object representations and how they collectively create an object context that humans use to control their attention behavior in naturalistic settings. She shows how GANs can be used to study the hidden representations underlying our perception of objects. This symposium is timely because the advances in computational methods have made it possible to put old theories to the test and to develop new theories of OBA mechanisms that engage the role played by attention in creating object-centric representations.

Talk 1

The Architecture of Object-Based Attention

Patrick Cavanagh¹, Gideon P. Caplovitz², Taissa K. Lytchenko², Marvin R. Maechler³, Peter U. Tse³, David R. Sheinberg⁴; ¹Glendon College, York University, ²University of Nevada, Reno, ³Dartmouth College, ⁴Brown University

Evidence for the existence of object-based attention raises several important questions: what are objects, how does attention access them, and what anatomical regions are involved? What are the “objects” that attention can access? Several studies have shown that items in visual search tasks are only loose collections of features prior to the arrival of attention. Nevertheless, findings from a wide variety of paradigms, including unconscious priming and cuing, have overturned this view. Instead, the targets of object-based attention appear to be fully developed object representations that have reached the level of identity prior to the arrival of attention. Where do the downward projections of object-based attention originate? Current research indicates that the control of object-based attention must come from ventral visual areas specialized in object analysis that project downward to early visual areas. If so, how can feedback from object areas accurately target the object’s early locations and features when the object areas have only crude location information? Critically, recent work on autoencoders has made this plausible as they are capable of recovering the locations and features of the target objects from the high level, low dimensional codes in the object areas. I will outline the architecture of object-based attention, the novel predictions it brings, and discuss how it works in parallel with other attention pathways.

Talk 2

Behavioral and neural signatures of object-based attention in the human visual system

Frank Tong¹, Sonia Poltoratski¹, David Coggan¹, Lasyapriya Pidaparthi¹, Elias Cohen¹; ¹Vanderbilt University

How might one demonstrate the existence of an object representation in the visual system? Does objecthood arise preattentively, attentively, or via a confluence of bottom-up and top-down processes? Our fMRI work reveals that orientation-defined figures are represented by enhanced neural activity in the early visual system. We observe enhanced fMRI responses in the lateral geniculate nucleus and V1, even for unattended figures, implying that core aspects of scene segmentation arise from automatic perceptual processes. In related work, we find compelling evidence of object completion in early visual areas. fMRI response patterns to partially occluded object images resemble those evoked by unoccluded objects, with comparable effects of pattern completion found for unattended and attended objects. However, in other instances, we find powerful effects of top-down attention. When participants must attend to one of two overlapping objects (e.g., face vs. house), activity patterns from V1 through inferotemporal cortex are biased in favor of the covertly attended object, with functional coupling of the strength of object-specific modulation found across brain areas. Finally, we have developed a novel eye-tracking paradigm to predict the focus of object-based attention while observers view two dynamically moving objects that mostly overlap. Estimates of the precision of gaze following suggest that observers can entirely filter out the complex motion signals arising from the task-irrelevant object. To conclude, I will discuss whether current AI models can adequately account for these behavioral and neural properties of object-based attention, and what additional measures may be needed to realize more human-like object processing.

Talk 3

The spread of object attention in artificial and cortical neurons

Paolo Papale¹, Matthew Self¹, Pieter Roelfsema¹; ¹Netherlands Institute for Neuroscience

A crucial function of our visual system is to group local image fragments into coherent perceptual objects. Behavioral evidence has shown that this process is iterative and time-consuming. A simple theory suggested that visual neurons can solve this challenging task relying on recurrent processing: attending to an object could produce a gradual spread of enhancement across its representation in the visual cortex. Here, I will present results from a biologically plausible artificial neural network that can solve object segmentation by attention. This model was able to identify and segregate individual objects in cluttered scenes with extreme accuracy, only using modulatory top-down feedback as observed in visual cortical neurons. Then, I will present comparable results from large-scale electrophysiology recordings in the macaque visual cortex. We tested the effect of object attention with stimuli of increasing complexity, from lines to natural objects against cluttered backgrounds. Consistent with behavioral observations, the iterative model correctly predicted the spread of attentional modulation in visual neurons for simple stimuli. However, for more complex stimuli containing recognizable objects, we observed asynchronous but not iterative modulation. Thus, we produced a set of hybrid stimuli, combining local elements of two different objects, that we alternated with the presentation of stimuli of intact objects. By doing so, we made local information unreliable, forcing the monkey to solve the task iteratively. Indeed, we observed that this set of stimuli induced iterative attentional modulations. These results provide the first systematic investigation on object attention in both artificial and cortical neurons.

Talk 4

Time to consider time: Comparing human reaction times to dynamical signatures from recurrent vision models on a perceptual grouping task

Alekh Karkada Ashok¹, Lore Goetschalckx¹, Lakshmi Narasimhan Govindarajan¹, Aarit Ahuja¹, David Sheinberg¹, Thomas Serre¹; ¹Brown University

To make sense of its retinal inputs, our visual system organizes perceptual elements into coherent figural objects. This perceptual grouping process, like many aspects of visual cognition, is believed to be dynamic and at least partially reliant on feedback. Indeed, cognitive scientists have studied its time course through reaction time (RT) measurements and have associated it with a serial spread of object-based attention. Recent progress in biologically inspired machine learning has put forward convolutional recurrent neural networks (cRNNs) capable of exhibiting and mimicking visual cortical dynamics. To understand how the visual routines learned by cRNNs compare to humans, we need ways to extract meaningful dynamical signatures from a cRNN and study temporal human-model alignment. We introduce a framework to train, analyze, and interpret cRNN dynamics. Our framework triangulates insights from attractor-based dynamics and evidential learning theory. We derive a stimulus-dependent metric, ξ, and directly compare it to existing human RT data on the same task: a grouping task designed to study object-based attention. The results reveal a “filling-in” strategy learned by the cRNN, reminiscent of the serial spread of object-based attention in humans. We also observe a remarkable alignment between ξ and human RT patterns for diverse stimulus manipulations. This alignment emerged purely as a byproduct of the task constraints (no supervision on RT). Our framework paves the way for testing further hypotheses on the mechanisms supporting perceptual grouping and object-based attention, as well as for inter-model comparisons looking to improve the temporal alignment with humans on various other cognitive tasks.

Talk 5

Three theories of object-based attention implemented in deep neural network models

Hossein Adeli¹, Seoyoung Ahn², Gregory Zelinsky², Nikolaus Kriegeskorte¹; ¹Columbia University, ²Stony Brook University

Understanding the computational mechanisms that transform visual features into coherent object percepts requires the implementation of theories in scalable models. Here we report on implementations, using recent deep neural networks, of three previously proposed theories in which the binding of features is achieved (1) through convergence in a hierarchy of representations resulting in object-files, (2) through a reconstruction or a generative process that can target different features of an object, or (3) through the elevation of activation by spreading attention within an object via association fields. First, we present a model of object-based attention that relies on capsule networks to integrate features of different objects in the scene. With this grouping mechanism the model is able to learn to sequentially attend to objects to perform multi-object recognition and visual reasoning. The second modeling study shows how top-down reconstructions of object-centric representations in a sequential autoencoder can target different parts of the object in order to have a more robust and human-like object recognition system. The last study demonstrates how object perception and attention could be mediated by flexible object-based association fields at multiple levels of the visual processing hierarchy. Transformers provide a key relational and associative computation that may be present also in the primate brain, albeit implemented by a different mechanism. We observed that representations in transformer-based vision models can predict the reaction time behavior of people on an object grouping task. We also show that the feature maps can model the spreading of attention in an object.

Talk 6

Combining Generative Adversarial Networks (GANs) with behavior and brain recordings to study scene understanding

Melissa Le-Hoa Võ¹, Aylin Kallmayer¹; ¹Goethe University Frankfurt

Our visual world is a complex conglomeration of objects that adhere to semantic and syntactic regularities, a.k.a. scene grammar, according to which scenes can be decomposed into phrases – i.e., smaller clusters of objects forming conceptual units – which again contain so-called anchor objects. These usually large and stationary objects further anchor predictions regarding the identity and location of most other smaller objects within the same phrase and play a key role in guiding attention and boosting perception during real-world search. They therefore provide an important organizing principle for structuring real-world scenes. Generative adversarial networks (GANs) trained on images of real-world scenes learn the scenes’ latent grammar to then synthesize images that mimic images of real-world scenes increasingly well. Therefore, GANs can be used to study the hidden representations underlying object-based perception, serving as testbeds to investigate the role that anchor objects play in both the generation and understanding of scenes. We will present some recent work in which we presented participants with real and generated images, recording both behavior and brain responses. Modelling behavioral responses from a range of computer vision models, we found that mostly high-level visual features and the strength of anchor information predicted human scene understanding of generated scenes. Using EEG to investigate the temporal dynamics of these processes revealed initial processing of anchor information which generalized to subsequent processing of the scene’s authenticity. These new findings imply that anchors pave the way to scene understanding and that models predicting real-world attention and perception should become more object-centric.

The Multifaceted effects of blindness and how sight might be restored

Symposium: Friday, May 17, 2024, 5:00 – 7:00 pm, Talk Room 1

Organizer: Ella Striem-Amit¹; ¹Georgetown University
Presenters: Lara Coelho, Santani Teng, Woon Ju Park, Elizabeth J. Saccone, Ella Striem-Amit, Michael Beyeler

Congenital blindness illustrates the developmental roots of visual cortex functions. Here, a group of early-career researchers will present various perspectives on the multifaceted effects of blindness on the brain and behavior. To start off the symposium, Coelho will describe the effect of sight loss on multisensory properties, and the reliance on vision to develop an intact multisensory body representation. This presentation will highlight the dependence across modalities, revealing rich interactions between vision and body representations. Discussing a unique manifestation of compensation in blindness, Teng will discuss how echolocation functions in naturalistic settings and its properties of active sensing. Continuing the theme of integration across senses and diving into visual cortical reorganization, Park will argue for partial dependence and partial independence on vision for the development of motion processing in hMT+. Saccone will show evidence for a functional takeover of language over typically face-selective FFA in blindness, showing plasticity beyond sensory representations. Together, these two talks will highlight different views of brain plasticity in blindness. Adding to our discussion of the multifaceted nature of plasticity, Striem-Amit will discuss whether plasticity in the visual cortex is consistent across different blind individuals, showing evidence for divergent visual plasticity and stability over time in adulthood. The last speaker will discuss the challenges and potential for sight restoration using visual prostheses. Beyeler will discuss how some of the challenges of sight restoration can be addressed through perceptual learning of implant inputs. This talk highlights how understanding plasticity in the visual system and across the brain has direct applications for successfully restoring sight. 
Together, the symposium will bring different theoretical perspectives to illustrate the effects of blindness, revealing the extent and diversity of neural plasticity, and clarify the state-of-the-art capacities for sight restoration.

Talk 1

Implications of visual impairment on body representation

Lara Coelho¹, Monica Gori; ¹Unit for visually impaired people, Italian Institute of Technology, Genova, Italy

In humans, vision is the most accurate sensory modality for constructing our representation of space. It has been shown that visual impairment negatively influences daily living and quality of life. For example, spatial and locomotor skills are reduced in this population. One possibility is that these deficiencies arise from a distorted representation of the body. Body representation is fundamental for motor control, because we rely on our bodies as a metric guide for our actions. While body representation is a by-product of multisensory integration, it has been proposed that vision is necessary to construct an accurate representation of the body. In the MySpace project, we are investigating the role of visual experience on haptic body representations in sighted and visually impaired (VI) participants. To this end, we employ a variety of techniques to investigate two key aspects of body representation: 1) size perception, and 2) the plasticity of the proprioceptive system. These techniques include landmark localization, psychophysics, and the rubber hand illusion. Our results in sighted participants show distortions in haptic but not visual body representation. In the VI participants there are distortions when estimating forearm, hand, and foot size in several different haptic tasks. Moreover, VI children fail to update their perceived body location in the rubber hand illusion task. Collectively, our findings support the hypothesis that vision is necessary to reduce distortions in haptic body representations. Moreover, we propose that VI children may develop with impaired representations of their own bodies. We discuss possible opportunities for reducing this impairment.

Talk 2

Acoustic glimpses: The accumulation of perceptual information in blind echolocators

Santani Teng¹; ¹Smith-Kettlewell Eye Research Institute

Blindness imposes constraints on the acquisition of sensory information from the environment. To mitigate those constraints, some blind people employ active echolocation, a technique in which self-generated sounds, like tongue “clicks,” produce informative reflections. Echolocating observers integrate over multiple clicks, or samples, to make perceptual decisions that guide behavior. What information is gained in the echoacoustic signal from each click? Here, I will draw from similar work in eye movements and ongoing studies in our lab to outline our approaches to this question. In a psychoacoustic and EEG experiment, blind expert echolocators and sighted control participants localized a virtual reflecting object after hearing simulated clicks and echoes. Left-right lateralization improved on trials with more click repetitions, suggesting a systematic precision benefit to multiple samples even when each sample delivered no new sensory information. In a related behavioral study, participants sat in a chair but otherwise moved freely while echoacoustically detecting, then orienting toward a reflecting target located at a random heading in the frontal hemifield. Clicking behavior and target size (therefore sonar strength) strongly influenced the rate and precision of orientation convergence toward the target, indicating a dynamic interaction between motor-driven head movements, click production, and the resulting echoacoustic feedback to the observer. Taken together, modeling these interactions in blind expert practitioners suggests similar properties, and potential shared mechanisms, between active sensing behavior in visual and echoacoustic domains.

Talk 3

Constraints of cross-modal plasticity within hMT+ following early blindness

Woon Ju Park¹, Kelly Chang, Ione Fine; ¹Department of Psychology, University of Washington

Cross-modal plasticity following early blindness has been widely documented across numerous visual areas, highlighting our brain’s remarkable adaptability to changes in sensory environment. In many of these areas, functional homologies have been observed between the original and reorganized responses. However, the mechanisms driving these homologies remain largely unknown. Here, we will present findings that aim to answer this question within the area hMT+, which responds to visual motion in sighted individuals and to auditory motion in early blind individuals. Our goal was to examine how the known functional and anatomical properties of this area influence the development of cross-modal responses in early blind individuals. Using a multimodal approach that encompasses psychophysics, computational modeling, and functional and quantitative MRI, we simultaneously characterized perceptual, functional, and anatomical selectivity to auditory motion within early blind and sighted individuals. We find that some anatomical and functional properties of hMT+ are inherited, while others are altered in those who become blind early in life.

Talk 4

Visual experience is necessary for dissociating face- and language-processing in the ventral visual stream

Elizabeth J. Saccone¹, Akshi¹, Judy S. Kim², Mengyu Tian³, Marina Bedny¹; ¹Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, USA, ²Center for Human Values, Princeton University, Princeton, NJ, USA, ³Center for Educational Science and Technology, Beijing Normal University at Zhuhai, China

The contributions of innate predispositions versus experience to face-selectivity in vOTC are hotly debated. Recent studies with people born blind suggest face specialization emerges regardless of experience. In blindness the FFA is said to process face shape, accessed through touch or sound, or maintain its behavioral role in person recognition by specializing for human voices. We hypothesized instead that in blind people the anatomical location of the FFA responds to language. While undergoing fMRI, congenitally blind English speakers (N=12) listened to spoken language (English), foreign speech (Russian, Korean, Mandarin), non-verbal vocalizations (e.g., laughter) and control non-human scene sounds (e.g., forest sounds) during a 1-back repetition task. Participants also performed a ‘face localizer’ task by touching 3D printed models of faces and control scenes and a language localizer (spoken words > backwards speech, Braille > tactile shapes). We identified individual-subject ROIs inside an FFA mask generated from sighted data. In people born blind, the anatomical location of the FFA showed a clear preference for language over all other sounds, whether human or not. Responses to spoken language were higher than to foreign speech or non-verbal vocalizations, which were not different from scene sounds. This pattern was observed even in parts of vOTC that responded more to touching faces. Specialization for faces in vOTC is influenced by experience. In the absence of vision, lateral vOTC becomes implicated in language. We speculate that shared circuits that evolved for communication specialize for either face recognition or language depending on experience.

Talk 5

Individual differences of brain plasticity in early visual deprivation

Ella Striem-Amit¹; ¹Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20057, USA

Early-onset blindness leads to reorganization in visual cortex connectivity and function. However, this has mostly been studied at the group level, largely ignoring differences in brain reorganization across early blind individuals. To test whether plasticity manifests differently in different blind individuals, we studied resting-state functional connectivity (RSFC) from the primary visual cortex in a large cohort of blind individuals. We find increased individual differences in connectivity patterns, corresponding to areas that show reorganization in blindness. Further, using a longitudinal approach in repeatedly sampled blind individuals, we showed that such individual patterns of organization and plasticity are stable over time, to the degree of decoding individual participant identity over 2 years. Together, these findings suggest that visual cortex reorganization is not ubiquitous, highlighting the potential diversity in brain plasticity and the importance of harnessing individual differences for fitting rehabilitation approaches for vision loss.

Talk 6

Learning to see again: The role of perceptual learning and user engagement in sight restoration

Michael Beyeler¹; ¹University of California, Santa Barbara

Retinal and cortical implants show potential in restoring a rudimentary form of vision to people living with profound blindness, but the visual sensations (“phosphenes”) produced by current devices often seem unnatural or distorted. Consequently, the ability of implant users to learn to make use of this artificial vision plays a critical role in whether some functional vision is successfully regained. In this talk, I will discuss recent work detailing the potential and limitations of perceptual learning in helping implant users learn to see again. Although the abilities of visual implant users tend to improve with training, there is little evidence that this is due to distortions becoming less perceptually apparent; instead, it may be due to better interpretation of distorted input. Unlike those with natural vision, implant recipients must accommodate various visual anomalies, such as inconsistent spatial distortions and phosphene fading. Furthermore, perceptual measures such as grating acuity and motion discrimination, which are often used with the intention of objectively assessing visual function, may be modulated via gamification, highlighting the importance of user engagement in basic psychophysical tasks. Gamification may be particularly effective at engaging reward systems in the brain, potentially fostering greater plasticity through more varied stimuli and active attentional engagement. However, the effectiveness of such gamified approaches varies, suggesting a need for personalized strategies in visual rehabilitation.

Attention: accept, reject, or major revisions?

Symposium: Friday, May 17, 2024, 2:30 – 4:30 pm, Talk Room 2

Organizers: Alon Zivony¹; ¹University of Sheffield
Presenters: Britt Anderson, Ruth Rosenholtz, Wayne Wu, Sarah Shomstein, Alon Zivony

Is attention research in crisis? After more than a century, we have come full circle from the intuition that “everybody knows what attention is” (James, 1890) to the conclusion that “nobody knows what attention is” (Hommel et al., 2019). It has been suggested that attention is an incoherent and sterile concept, or unsuitable for scientific research. And yet, attention research continues as strongly as ever with little response to these critiques. Is the field ignoring glaring theoretical problems, or does the current conception of attention merely require some revisions? In this symposium, our speakers bring different perspectives to examine this critical question. Rather than merely raising issues with the concept of attention, each also suggests practical and theoretical solutions, which can hopefully inform future research. Each speaker will present either a critical view or defence of the concept of attention, and suggest whether attention should be abandoned, kept as is, or redefined. Our first two speakers will argue that scientists may be better off without the concept of attention. Britt Anderson will criticize the use of attention as an explanation of observed phenomena. He will suggest that the common usage is non-scientific and results in circular logic. He offers in its place an attention-free account of so-called attention effects. Ruth Rosenholtz argues that recent work, for example on peripheral vision, calls into question many of the basic tenets of attention theory. She will talk about her year of banning ‘attention’ in order to rethink attention from the ground up. The second group of speakers will question the common understanding of attention but will argue in favour of it as a scientific concept. Wayne Wu will suggest that our shared methodology of studying attention commits us to the Jamesian functional conceptualization of attention.
He will argue that attention can and should be retained if we locate it at the right level of analysis in cognitive explanation. Sarah Shomstein will discuss “attentional platypuses”, empirical abnormalities that do not fit into current attention research. These abnormalities reveal the need for a new way of thinking about attention. Alon Zivony will argue that many of the conceptual problems with attention stem from the standard view that equates attention with selection. Moving away from this definition will allow us to retain attention but will also require a change in our thinking. Each talk will conclude with a take-home message about what attention is and isn’t, a verdict of whether it should be abandoned or retained, and suggestions of how their understanding of attention can be applied in future research. We will conclude with a panel discussion.

Talk 1

Attention: Idol of the Tribe

Britt Anderson¹; ¹Dept of Psychology and Centre for Theoretical Neuroscience, University of Waterloo

The term ‘attention’ has been a drag on our science ever since the early days of experimental psychology. Our frequent offerings and sacrifices (articles and the debates they provoke), and our unwillingness to abandon our belief in this reified entity, indicate the aptness of the Jamesian phrase “idol of the tribe.” While causal accounts of attention are empty, attention might be, as suggested by Hebb, a useful label. It could be used to indicate that some experimental observable is not immediately explained by the excitation of receptor cells. However, labeling something as ‘attention’ means there is something to be explained, not that something has been explained. Common experimental manipulations used to provoke visual selective attention (instructions, cues, and reward) are in fact the guide to explaining away ‘attention’. Such manipulations frequently induce behavioral performance differences not explainable in terms of differences in retinal stimulation. These manipulations are economically summarized as components of a process in which base rates, evidence, value, and plausibility combine to determine perceptual experience. After briefly reviewing the history of how attention has been confusing from the start, I will summarize the notion of conceptual fragmentation and show how it applies. I will then review how the traditional conditions of an attentional experiment provide the basis for a superior, attention-free account of the phenomena of interest, and I will present some of the opportunities for the use of more formal descriptions that should lead to better theoretically motivated experimental investigations.

Talk 2

Attention in Crisis

Ruth Rosenholtz¹; ¹NVIDIA Research

Recent research on peripheral vision has led to a paradigm-shifting conclusion: that vision science as a field must rethink the concept of visual attention. Research has uncovered significant anomalies not explained by existing theories, and some methods for studying attention may instead have uncovered mechanisms of peripheral vision. Nor can a summary statistic representation in peripheral vision solve these problems on its own. A year of banning “attention” in my lab allowed us to rethink attention from the ground up; this talk will conclude with some of the resulting insights.

Talk 3

Attention Unified

Wayne Wu¹; ¹Department of Philosophy and Neuroscience Institute, Carnegie Mellon University

For over a century, scientists have expressed deep misgivings about attention. A layperson would find this puzzling, for they know what attention is as well as those with sight know what seeing is. People visually attend all the time. Attention is real, we know what it is, and we can explain it. I shall argue that the problem of attention concerns the conceptual and logical structure of the scientific theory of attention. Because of shared methodology, we are committed to a single functional conception of attention, what William James articulated long ago. I show how this shared conception provides a principle of unification that links empirical work. To illustrate this, I show how two cueing paradigms tied to “external” and “internal” attention, spatial cueing and retro-cueing, are instances of the same kind of attention. Against common skepticism, I demonstrate that we are all committed to the existence of attention as a target of explanation. Yet in step with the skeptic, I show that attention is not an explainer in the sense that it is not a neural mechanism. Locating attention at the right level of analysis in cognitive explanation is key to understanding what it is and how science has made massive progress in understanding it.

Talk 4

What does a platypus have to do with attention?

Sarah Shomstein¹; ¹Department of Psychological and Brain Sciences, George Washington University

Decades of research on understanding the mechanisms of attentional selection have focused on identifying the units (representations) on which attention operates in order to guide prioritized sensory processing. These attentional units fit neatly to accommodate our understanding of how attention is allocated in a top-down, bottom-up, or historical fashion. In this talk, I will focus on attentional phenomena that are not easily accommodated within current theories of attentional selection. We call these phenomena attentional platypuses, as they allude to an observation that within biological taxonomies the platypus does not fit into either mammal or bird categories. Similarly, attentional phenomena that do not fit neatly within current attentional models suggest that current models need to be revised. We list a few instances of the ‘attentional platypuses’ and then offer a new approach that we term Dynamically Weighted Prioritization, stipulating that multiple factors impinge onto the attentional priority map, each with a corresponding weight. The interaction between factors and their corresponding weights determines the current state of the priority map, which subsequently constrains/guides attention allocation. We propose that this new approach should be considered as a supplement to existing models of attention, especially those that emphasize categorical organizations.
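
As a rough illustration of the weighting idea described above, the sketch below combines several factor maps into one priority map using a weighted sum. It is only a minimal toy under stated assumptions, not the speaker's model: the function names (`combine_priority_maps`, `peak_location`), the factor names ("goal", "salience"), the 2x2 grids, and the choice of a normalised weighted sum are all hypothetical.

```python
from typing import Dict, List, Tuple

Map = List[List[float]]  # 2D grid of priority values, indexed [row][col]


def combine_priority_maps(factor_maps: Dict[str, Map], weights: Dict[str, float]) -> Map:
    """Weighted sum of per-factor maps, with weights normalised to sum to 1.

    Assumption: factors combine linearly; the abstract does not specify this.
    """
    total = sum(weights[name] for name in factor_maps)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    first = next(iter(factor_maps.values()))
    rows, cols = len(first), len(first[0])
    combined = [[0.0] * cols for _ in range(rows)]
    for name, fmap in factor_maps.items():
        w = weights[name] / total
        for r in range(rows):
            for c in range(cols):
                combined[r][c] += w * fmap[r][c]
    return combined


def peak_location(priority_map: Map) -> Tuple[int, int]:
    """Return the (row, col) of the highest value in the map."""
    cells = [(r, c) for r in range(len(priority_map)) for c in range(len(priority_map[0]))]
    return max(cells, key=lambda rc: priority_map[rc[0]][rc[1]])


# Hypothetical factor maps: the task goal favours the top-left cell,
# while physical salience favours the bottom-right cell.
factors = {
    "goal": [[1.0, 0.0], [0.0, 0.0]],
    "salience": [[0.0, 0.0], [0.0, 1.0]],
}

goal_driven = combine_priority_maps(factors, {"goal": 0.8, "salience": 0.2})
salience_driven = combine_priority_maps(factors, {"goal": 0.2, "salience": 0.8})
```

With the same factor maps, changing only the weights moves the peak of the combined map from the goal-relevant cell to the salient cell, which is the sense in which the weights, and not the factors alone, set the current state of the map.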

Talk 5

It’s time to redefine attention

Alon Zivony¹; ¹Department of Psychology, University of Sheffield

Many models of attention assume that attentional selection takes place at a specific moment in time which demarcates the critical transition from pre-attentive to attentive processing of sensory inputs. In this talk, I will argue that this intuitively appealing assumption is not only incorrect, but it is also the reason behind the conceptual confusion about what attention is, and how it should be understood in psychological science. As an alternative, I will offer a “diachronic” framework that views attention as a modulatory process that unfolds over time, in tandem with perceptual processing. This framework breaks down the false dichotomy between pre-attentive and attentive processing, and as such, offers new solutions to old problems in attention research (the early vs. late selection debate). More importantly, by situating attention within a broader context of selectivity in the brain, the diachronic account can provide a unified and conceptually coherent account of attention. This will allow us to keep the concept of attention but will also require serious rethinking about how we use attention as a scientific concept.

The temporal evolution of visual perception

Symposium: Friday, May 17, 2024, 2:30 – 4:30 pm, Talk Room 1

Organizers: Lina Teichmann¹, Chris Baker¹; ¹Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, USA
Presenters: Lina Teichmann, Iris I. A. Groen, Diana Dima, Tijl Grootswagers, Rachel Denison

The human visual system dynamically processes input over the course of a few hundred milliseconds to generate our perceptual experience. Capturing the dynamic aspects of the neural response is therefore imperative to understand visual perception. By bringing five speakers together who use a diverse set of methods and approaches, the symposium aims to elucidate the temporal evolution of visual perception from different angles. All five speakers (four female) are early-career researchers based in Europe, Australia, the US, and Canada. Speakers will be allotted 18 minutes of presentation time plus 5 minutes of questions after each talk. In contrast to a lot of the current neuroimaging work, the symposium talks will focus on temporal dynamics rather than localization. Collectively, the work presented will demonstrate that the complex and dynamic nature of visual perception requires data that matches its temporal granularity. In the first talk, Lina Teichmann will present data from a large-scale study focusing on how individual colour-space geometries unfold in the human brain. Linking densely-sampled MEG data with psychophysics, her work on colour provides a test case to study the subjective nature of visual perception. Iris Groen will discuss findings from intracranial EEG studies that characterize neural responses across the visual hierarchy. Applying computational models, her work provides fundamental insights into how the visual response unfolds over time across visual cortex. Diana Dima will speak about how responses evoked by observed social interactions are processed in the brain. Using temporally-resolved EEG data, her research shows how visual information is modulated from perception to cognition. Tijl Grootswagers will present on studies investigating visual object processing. Using rapid series of object stimuli and linking EEG and behavioural data, his work shows the speed and efficiency of the visual system to make sense of the things we see. 
To conclude, Rachel Denison will provide insights into how we employ attentional mechanisms to prioritize relevant visual input at the right time. Using MEG data, she will highlight how temporal attention affects the dynamics of evoked visual responses. Overall, the symposium aims to shed light on the dynamic nature of visual processing at all levels of the visual hierarchy. It will be a chance to discuss benefits and challenges of different methodologies that will allow us to gain a comprehensive insight into the temporal aspects of visual perception.

Talk 1

The temporal dynamics of individual colour-space geometries in the human brain

Lina Teichmann¹, Ka Chun Lam², Danny Garside³, Amaia Benitez-Andonegui⁴, Sebastian Montesinos¹, Francisco Pereira², Bevil Conway³,⁵, Chris Baker¹,⁵; ¹Laboratory of Brain and Cognition, National Institute of Mental Health, Bethesda, USA, ²Machine Learning Team, National Institute of Mental Health, Bethesda, USA, ³Laboratory of Sensorimotor Research, National Eye Institute, Bethesda, USA, ⁴MEG Core Facility, National Institute of Mental Health, Bethesda, USA, ⁵equal contribution

We often assume that people see the world in a similar way to us, as we can effectively communicate how things look. However, colour perception is one aspect of vision that varies widely among individuals, as shown by differences in colour discrimination, colour constancy, colour appearance and colour naming. Further, the neural response to colour is dynamic and varies over time. Many attempts have been made to construct formal, uniform colour spaces that aim to capture universally valid similarity relationships, but there are discrepancies between these models and individual perception. Combining magnetoencephalography (MEG) and psychophysical data, we examined the extent to which these discrepancies can be accounted for by the geometry of the neural representation of colour and its evolution over time. In particular, we used a dense sampling approach and collected neural responses to hundreds of colours to reconstruct individual fine-grained colour-space geometries from neural signals with millisecond accuracy. In addition, we collected large-scale behavioural data to assess perceived similarity relationships between different colours for every participant. Using a computational modelling approach, we extracted similarity embeddings from the behavioural data to model the neural signal directly. We find that colour information is present in the neural signal from approximately 70 ms onwards but that neural colour-space geometries unfold non-uniformly over time. These findings highlight the gap between theoretical colour spaces and colour perception and represent a novel avenue to gain insights into the subjective nature of perception.

Talk 2

Delayed divisive normalisation accounts for a wide range of temporal dynamics of neural responses in human visual cortex

Iris I. A. Groen¹, Amber Brands¹, Giovanni Piantoni², Stephanie Montenegro³, Adeen Flinker³, Sasha Devore³, Orrin Devinsky³, Werner Doyle³, Patricia Dugan³, Daniel Friedman³, Nick Ramsey², Natalia Petridou², Jonathan Winawer⁴; ¹Informatics Institute, University of Amsterdam, Amsterdam, Netherlands, ²University Medical Center Utrecht, Utrecht, Netherlands, ³New York University Grossman School of Medicine, New York, NY, USA, ⁴Department of Psychology and Center for Neural Science, New York University, New York, NY, USA

Neural responses in visual cortex exhibit various complex, non-linear temporal dynamics. Even for simple static stimuli, responses decrease when a stimulus is prolonged in time (adaptation), are reduced for stimuli that are repeated (repetition suppression), and rise more slowly for low contrast stimuli (slow dynamics). These dynamics also vary depending on the location in the visual hierarchy (e.g., lower vs. higher visual areas) and the type of stimulus (e.g., contrast pattern stimuli vs. real-world object, scene, and face categories). In this talk, I will present two intracranial EEG (iEEG) datasets in which we quantified and modelled the temporal dynamics of neural responses across the visual cortex at millisecond resolution. Our work shows that many aspects of these dynamics are accurately captured by a delayed divisive normalisation model in which neural responses are normalised by recent activation history. I will highlight how fitting this model to the iEEG data unifies multiple disparate temporal phenomena in a single computational framework, thereby revealing systematic differences in temporal dynamics of neural population responses across the human visual hierarchy. Overall, these findings suggest a pervasive role of history-dependent delayed divisive normalisation in shaping neural response dynamics across the cortical visual hierarchy.
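
To make the idea of normalising a response by its own recent history concrete, here is a minimal discrete-time sketch. It is a simplified toy, not the model fitted in the talk: the function name `delayed_divisive_normalisation`, the parameters `tau`, `sigma`, `n` and `dt`, their default values, and the use of a single exponential low-pass filter as the "recent history" term are all assumptions made for illustration.

```python
from typing import List


def delayed_divisive_normalisation(
    stimulus: List[float],
    tau: float = 50.0,   # assumed time constant (ms) of the history pool
    sigma: float = 0.1,  # assumed semi-saturation constant
    n: float = 2.0,      # assumed exponent
    dt: float = 1.0,     # time step (ms)
) -> List[float]:
    """Divide the current drive by a low-pass-filtered copy of past drive."""
    alpha = dt / tau
    pool = 0.0  # running summary of recent activation history
    response = []
    for drive in stimulus:
        magnitude = abs(drive)
        # The pool used here reflects only earlier time steps (the delay).
        response.append(magnitude ** n / (sigma ** n + pool ** n))
        # Update the history pool after computing the current response.
        pool += alpha * (magnitude - pool)
    return response


# Sustained stimulus: 100 ms blank, then 500 ms at full contrast.
sustained = [0.0] * 100 + [1.0] * 500
sustained_response = delayed_divisive_normalisation(sustained)

# Repeated stimulus: two 100 ms pulses separated by a 50 ms gap.
repeated = [1.0] * 100 + [0.0] * 50 + [1.0] * 100
repeated_response = delayed_divisive_normalisation(repeated)
first_peak = max(repeated_response[:100])
second_peak = max(repeated_response[150:])
```

In this toy version, the response to the sustained stimulus peaks at onset and then declines as the history pool fills (an adaptation-like transient), and the second pulse of the repeated stimulus evokes a smaller peak than the first because the pool has not fully decayed during the gap (a repetition-suppression-like effect).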

Talk 3

How natural action perception unfolds in the brain

Diana Dima¹, Yalda Mohsenzadeh¹; ¹Western University, London, ON, Canada

In a fraction of a second, humans can recognize a wide range of actions performed by others. Yet actions pose a unique complexity challenge, bridging visual domains and varying along multiple perceptual and semantic features. What features are extracted in the brain when we view others’ actions, and how are they processed over time? I will present electroencephalography work using natural videos of human actions and rich feature sets to determine the temporal sequence of action perception in the brain. Our work shows that action features, from visual to semantic, are extracted along a temporal gradient, and that different processing stages can be dissociated with artificial neural network models. Furthermore, using a multimodal approach with video and text stimuli, we show how conceptual action representations emerge in the brain. Overall, these data reveal the rapid computations underlying action perception in natural settings. The talk will highlight how a temporally resolved approach to natural vision can uncover the neural computations linking perception and cognition.

Talk 4

Decoding rapid object representations

Tijl Grootswagers¹, Amanda K. Robinson²; ¹The MARCS Institute for Brain, Behaviour and Development, School of Computer, Data and Mathematical Sciences, Western Sydney University, Sydney, NSW, Australia, ²Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia

Humans are extremely fast at recognising objects, and can do this very reliably. Information about objects and object categories emerges within 200 milliseconds in the human visual system, even under difficult conditions such as occlusion or low visibility. These neural representations can be highly complex and multidimensional, despite relying on limited visual information. Understanding emerging object representations necessitates time-resolved neuroimaging methods with millisecond precision, such as EEG and MEG. Recent time-resolved neuroimaging work has used decoding methods in rapid serial visual presentation designs to show that relevant object information about multiple sequentially presented objects is robustly encoded by the brain. This talk will highlight recent research on the time course of object representations in rapid image sequences, focusing on three key findings: (1) object representations are highly automatic, with robust representations emerging even with fast-changing visual input; (2) emerging object representations are highly robust to changes in context and task, suggesting strong reliance on feedforward processes; and (3) object representational structures are highly consistent across individuals, to the extent that neural representations are predictive of independent behavioural judgments on a variety of tasks. Together, these findings suggest that the first sweep of information through the visual system contains highly robust information that is readily available for read-out in behavioural decisions.

Talk 5

Isolating neural mechanisms of voluntary temporal attention

Rachel Denison¹,², Karen Tian¹,², Jiating Zhu¹, David Heeger², Marisa Carrasco²; ¹Boston University, Department of Psychological and Brain Sciences, USA, ²New York University, Department of Psychology and Center for Neural Science, USA

To handle the continuous influx of visual information, temporal attention prioritizes visual information at task-relevant moments in time. We first introduce a probabilistic framework that clarifies the conceptual distinction and formal relation between temporal attention, linked to timing relevance, and temporal expectation, linked to timing predictability. Next, we present two MEG studies in which we manipulated temporal attention while keeping expectation constant, allowing us to isolate neural mechanisms specific to voluntary temporal attention. Participants were cued to attend to one of two sequential grating targets with predictable timing, separated by a 300 ms SOA. The first study used time-resolved steady-state visual evoked responses (SSVER) to investigate how temporal attention modulates anticipatory visual activity. In the pre-target period, visual activity (measured with a background SSVER probe) steadily ramped up as the targets approached, reflecting temporal expectation. Furthermore, we found a low-frequency modulation of visual activity, which shifted approximately 180 degrees in phase according to which target was attended. The second study used time-resolved decoding and source reconstruction to examine how temporal attention affects dynamic target representations. Temporal attention to the first target enhanced its orientation representation within a left fronto-cingulate region ~250 ms after stimulus onset, perhaps protecting it from interference from the second target within the visual cortex. Together these studies reveal how voluntary temporal attention flexibly shapes pre-target periodic dynamics and post-target routing of stimulus information to select a task-relevant stimulus within a sequence.

Large-scale visual neural datasets: where do we go from here?

Symposium: Friday, May 17, 2024, 12:00 – 2:00 pm, Talk Room 2

Organizers: Alessandro Gifford¹, Kendrick Kay²; ¹Freie Universität Berlin, ²University of Minnesota
Presenters: Eline R. Kupers, Won Mok Shim, Ian Charest, Tomas Knapen, Jacob Prince, Alessandro T. Gifford

Vision science has witnessed an increase in worldwide initiatives collecting and publicly releasing large-scale visual neural datasets (LSVNDs). These initiatives have allowed thousands of vision scientists to readily harness LSVNDs, enabling new investigations and resulting in novel discoveries. This suggests vision science is entering a new era of inquiry characterized by big open data. The rapid growth in the collection and use of LSVNDs spawns urgent questions, the answers to which will steer the direction of the field. How can different researchers across the vision sciences spectrum benefit from these datasets? What are the opportunities and pitfalls of LSVNDs for theory formation? Which kinds of LSVNDs are missing, and what characteristics should future LSVNDs have to maximize their impact and utility? How can LSVNDs support a virtuous cycle between neuroscience and artificial intelligence? This symposium invites the VSS community to engage these questions in an interactive and guided community process. We will start with a short introduction (5 minutes), followed by six brief, thought-provoking talks (each 9 minutes plus 3 minutes for Q&A). Enriched by these perspectives, the symposium will then move to a highly interactive 30-minute discussion where we will engage the audience to discuss the most salient open questions on LSVNDs, generate and share insights, and foster new collaborations. Speakers from diverse career stages will cover a broad range of perspectives on LSVNDs, including dataset creators (Kupers, Shim), dataset users (Prince), and researchers playing both roles (Gifford, Charest, Knapen). Eline Kupers will expose behind-the-scenes knowledge on a particular LSVND that has received substantial traction in the field, the Natural Scenes Dataset (NSD), and will introduce ongoing efforts for a new large-scale multi-task fMRI dataset called Visual Cognition Dataset. 
Won Mok Shim will introduce the Naturalistic Perception Action and Cognition (NatPAC) 7T fMRI dataset, and discuss how this dataset allows investigation of the impact of goal-directed actions on visual representations under naturalistic settings. Ian Charest will present recent results on semantic representations enabled by NSD, as well as ongoing large-scale data collection efforts inspired by NSD. Tomas Knapen will demonstrate how combining LSVNDs with other datasets incites exploration and discovery, and will present ongoing large-scale data collection efforts in his group. Jacob Prince will provide a first-hand perspective on how researchers external to the data collection process can apply LSVNDs for diverse research aims across cognitive neuroscience, neuroAI, and neuroimaging methods development. Finally, Alessandro Gifford will highlight broad opportunities that LSVNDs offer to the vision sciences community, and present a vision for the future of large-scale datasets. This symposium will appeal to any VSS member interested in neural data, as it will expose opportunities and limitations of LSVNDs and how they relate to smaller, more narrowly focused datasets. Our goal is to align the VSS community with respect to open questions regarding LSVNDs, and help incentivize and coordinate new large-scale data collection efforts. We believe this symposium will strengthen the impact of LSVNDs on the field of vision science, and foster a new generation of big-data vision scientists.

Talk 1

The Natural Scenes Dataset: Lessons Learned and What’s Next?

Eline R. Kupers¹,², Celia Durkin², Clayton E Curtis³, Harvey Huang⁴, Dora Hermes⁴, Thomas Naselaris², Kendrick Kay²; ¹Stanford University, ²University of Minnesota, ³New York University, ⁴Mayo Clinic

Release and reuse of rich neuroimaging datasets have rapidly grown in popularity, enabling researchers to ask new questions about visual processing and to benchmark computational models. One highly used dataset is the Natural Scenes Dataset (NSD), a 7T fMRI dataset where 8 subjects viewed more than 70,000 images over the course of a year. Since its recent release in September 2021, NSD has gained 1700+ users and resulted in 55+ papers and pre-prints. Here, we share behind-the-scenes considerations and inside knowledge from the NSD acquisition effort that helped ensure its quality and impact. This includes lessons learned regarding funding, designing, collecting, and releasing a large-scale fMRI dataset. Complementing the creator’s perspective, we also highlight the user’s viewpoint by revealing results from a large anonymous survey distributed amongst NSD users. These results will provide valuable (and often unspoken) insights into both positive and negative experiences interacting with NSD and other publicly available datasets. Finally, we discuss ongoing efforts towards two new large-scale datasets: (i) NSD-iEEG, an intracranial electroencephalography dataset with extensive electrode coverage in cortex and sub-cortex using a similar paradigm to NSD and (ii) Visual Cognition Dataset, a 7T fMRI dataset that samples a large diversity of tasks on a common set of visual stimuli (in contrast to NSD which samples a large diversity of stimuli during a single task). By sharing these lessons and ideas, we hope to facilitate new data collection efforts and enhance the ability of these datasets to support new discoveries in vision and cognition.

Talk 2

Exploring naturalistic vision in action with the 7T Naturalistic Perception, Action, and Cognition (NatPAC) Dataset

Won Mok Shim¹,², Royoung Kim¹,², Jiwoong Park¹,²; ¹Institute of Basic Science, Republic of Korea, ²Sungkyunkwan University

Large-scale human neuroimaging datasets have provided invaluable opportunities to examine brain and cognitive functions. Our recent endeavor, the 7T NatPAC project, is designed to provide high-resolution human MRI structural and functional datasets using moderately dense sampling (12–16 2-hr sessions per subject) across a broad range of tasks. While previous large-scale datasets have featured sparse sampling of cognitive functions, our goal is to encompass a more extensive spectrum of cognitive and affective processes through diverse tasks, spanning both structured and naturalistic paradigms. Notably, we incorporated naturalistic tasks to probe a variety of higher-order cognitive functions including watching movies, freely speaking, and interactive 3D video game playing within a Minecraft environment. Through a collection of innovative Minecraft-based games simulating real-world behaviors, we aim to investigate the neural mechanisms of perception, action, and cognition as an integrative process that unfolds in naturalistic contexts. In this talk, I will focus on a shepherding game, where participants engage in strategic planning with hierarchical subgoals and adaptively update their strategies while navigating a virtual world. In combination with high-precision eye tracking data corrected for head motion, we explore how visual responses, including population receptive field (pRF) mapping, are modulated in the visual cortex and frontoparietal regions during free viewing and complex goal-directed behaviors compared to passive viewing of game replays and conventional pRF experiments. I will discuss the broader implications of the impact of goal-directed actions on visual representations and how large-scale datasets enable us to examine such effects in naturalistic settings.

Talk 3

Exploiting large-scale neuroimaging datasets to reveal novel insights in vision science

Ian Charest¹,², Peter Brotherwood¹, Catherine Landry¹, Jasper van den Bosch¹, Shahab Bakhtiari¹,², Tim Kietzmann³, Frédéric Gosselin¹, Adrien Doerig³; ¹Université de Montréal, ²Mila – Québec AI Institute, ³University of Osnabrück

Building quantitative models of neural activity in the visual system is a long-standing goal in neuroscience. Though this research program is fundamentally limited by the small scale and low signal-to-noise of most existing datasets, with the advent of large-scale datasets it has become possible to build, test, and discriminate increasingly expressive competing models of neural representation. In this talk I will describe how the scale of the 7T fMRI Natural Scenes Dataset (NSD) has made possible novel insights into the mechanisms underlying scene perception. We harnessed recent advancements in linguistic artificial intelligence to construct models that capture progressively richer semantic information, ranging from object categories to word embeddings to scene captions. Our findings reveal a positive correlation between a model’s capacity to capture semantic information and its ability to predict NSD data, a feature then replicated with recurrent convolutional networks trained to predict sentence embeddings from visual inputs. This collective evidence suggests that the visual system, as a whole, is better characterized by an aim to extract rich semantic information rather than merely cataloging object inventories from visual inputs. Considering the substantial power of NSD, collecting additional neuroimaging and behavioral data using the same image set becomes highly appealing. We are expanding NSD through the development of two innovative datasets: an electroencephalography dataset called NSD-EEG, and a mental imagery vividness ratings dataset called NSD-Vividness. Datasets like NSD not only provide fresh insights into the visual system but also inspire the development of new datasets in the field.

Talk 4

Farewell to the explore-exploit trade-off in large-scale datasets

Tomas Knapen¹,², Nick Hedger³, Thomas Naselaris⁴, Shufan Zhang¹,², Martin Hebart⁵,⁶; ¹Vrije Universiteit, ²Royal Dutch Academy of Arts and Sciences, ³University of Reading, ⁴University of Minnesota, ⁵Justus Liebig University, ⁶Max Planck Institute for Human Cognitive and Brain Sciences

LSVNDs are a very powerful tool for discovery science. Due to their suitability for exploration, large datasets synergize well when supplemented with more exploitative datasets focused on small-scale hypothesis testing that can confirm exploratory findings. Similar synergy can be attained when combining findings across datasets, where one LSVND can be used to confirm and extend discoveries from another LSVND. I will showcase how we have recently leveraged several large-scale datasets in unison to discover principles of topographic visual processing throughout the brain. These examples demonstrate how LSVNDs can be used to great effect, especially in combination across datasets. In our most recent example, we combined the HCP 7T fMRI dataset (a “wide” dataset with 180 participants, 2.5 hrs of whole-brain fMRI each) with NSD (a “deep” dataset with 8 participants, 40 hrs of whole-brain fMRI each) to investigate visual body-part selectivity. We discovered homuncular maps in high-level visual cortex through connectivity with primary somatosensory cortex in HCP, and validated the body-part tuning of these maps using NSD. This integration of wide and deep LSVNDs allows inference about computational mechanisms at both the individual and population levels. For this reason, we believe the field needs a variety of LSVNDs. I will briefly present ongoing work from my lab collecting new ‘deep’ LSVND contributions: a brief (2.5-s) video watching dataset and a retinotopic mapping dataset, each with up to 10 sessions of 7T fMRI in 8 subjects.

Talk 5

Large datasets: a Swiss Army knife for diverse research aims in neuroAI

Jacob Prince¹, Colin Conwell², Talia Konkle¹; ¹Harvard University, ²Johns Hopkins University

This talk provides a first-hand perspective on how users external to the data collection process can harness LSVNDs as foundation datasets for their research aims. We first highlight recent evidence that these datasets help address and move beyond longstanding debates in cognitive neuroscience, such as the nature of category selective regions, and the visual category code more broadly. We will show evidence that datasets like NSD have provided powerful new insight into how items from well-studied domains (faces, scenes) are represented in the context of broader representational spaces for objects. Second, we will highlight the potential of LSVNDs to answer urgent, emergent questions in neuroAI – for example, which inductive biases are critical for obtaining a good neural network model of the human visual system? We will describe a series of controlled experiments leveraging hundreds of open-source DNNs, systematically varying inductive biases to reveal the factors that most directly impact brain predictivity at scale. Finally, for users interested in neuroimaging methods development, we will highlight how the existence of these datasets has catalyzed rapid progress in methods for fMRI signal estimation and denoising, as well as for basic analysis routines like PCA and computing noise ceilings. We will conclude by reflecting on both the joys and pain points of working with LSVNDs, in order to help inform the next generation of these datasets.

Talk 6

What opportunities do large-scale visual neural datasets offer to the vision sciences community?

Alessandro T. Gifford¹, Benjamin Lahner², Pablo Oyarzo¹, Aude Oliva², Gemma Roig³, Radoslaw M. Cichy¹; ¹Freie Universität Berlin, ²MIT, ³Goethe Universität Frankfurt

In this talk I will provide three complementary examples of the opportunities that LSVNDs offer to the vision sciences community. First, LSVNDs of naturalistic (thus more ecologically valid) visual stimulation allow the investigation of novel mechanisms of high-level visual cognition. We are extensively recording human fMRI and EEG responses for short naturalistic movie clips; modeling results reveal that semantic information such as action understanding or movie captions is embedded in neural representations. Second, LSVNDs contribute to the emerging field of NeuroAI, advancing research in vision sciences through a symbiotic relationship between visual neuroscience and computer vision. We recently collected a large and rich EEG dataset of neural responses to naturalistic images, using it on the one hand to train deep-learning-based end-to-end encoding models directly on brain data, thus aligning visual representations in models and the brain, and on the other hand to increase the robustness of computer vision models by exploiting inductive biases from neural visual representations. Third, LSVNDs make possible critical initiatives such as challenges and benchmarks. In 2019 we founded the Algonauts Project, a platform where scientists from different disciplines can cooperate and compete in creating the best predictive models of the visual brain, thus advancing the state-of-the-art in brain modeling as well as promoting cross-disciplinary interaction. I will end with some forward-looking thoughts on how LSVNDs might transform the vision sciences.

Neurodiversity in visual functioning: Moving beyond case-control studies

Symposium: Friday, May 17, 2024, 12:00 – 2:00 pm, Talk Room 1

Organizers: Catherine Manning¹, Michael-Paul Schallmo²; ¹University of Reading, UK, ²University of Minnesota
Presenters: Catherine Manning, Michael-Paul Schallmo, Victor Pokorny, Brian Keane, Beier Yao, Alice Price

Although vision science has a rich history of investigating atypical functioning in developmental and psychiatric conditions, these studies have tended to compare a single diagnosis against a normative comparison group (the case-control approach). However, by studying diagnoses in isolation, we cannot determine whether case-control differences are condition-specific, or instead reflect neural changes that occur across multiple conditions. A related challenge to the case-control approach is the growing recognition that categorical diagnoses are not biologically or psychologically discrete entities: multiple diagnoses commonly co-occur within individuals, considerable heterogeneity is found among individuals with the same diagnosis, and similarities are often found between diagnosed individuals and those with subclinical traits. Moreover, categorical diagnoses do not clearly map onto the underlying biology (e.g., genes, neural function). Accordingly, there has been a recent conceptual shift away from the traditional case-control approach towards considering continuous, transdiagnostic dimensions of neurodiversity, which might better reflect the underlying biology (cf. NIH’s Research Domain Criteria framework). By studying dimensions of visual functioning across conditions, we will elucidate the mechanisms implicated in cases of atypical visual functioning, while also helping to understand individual differences in the non-clinical population. This symposium will bring together cutting-edge research that goes beyond the traditional case-control approach to demonstrate this recent conceptual shift. Speakers representing diverse career stages, scientific approaches and nationalities will present research encompassing a range of conditions (e.g., autism, dyslexia, schizophrenia, bipolar disorder, migraine) and methods (EEG, fMRI, psychophysics, computational modelling, questionnaires).
Cathy Manning will first introduce the traditional case-control approach and its limitations, before presenting EEG and behavioural work identifying both convergence and divergence in autistic and dyslexic children’s visual motion processing and decision-making. Second, Michael-Paul Schallmo will show that weaker surround suppression is shared by adults with autism and adults with schizophrenia, and is linked to continuous dimensions of psychiatric symptoms. Third, Victor Pokorny will describe a recent meta-analysis that found surprisingly weak evidence for generally weakened use of visuospatial context in schizophrenia, bipolar disorder, and related sub-clinical populations, but stronger evidence for specific alterations in contrast perception. Fourth, Brian Keane will describe how functional connectivity involving a higher-order visual network is aberrant in psychosis patients, regardless of diagnosis. Fifth, Beier Yao will present a visuomotor mechanism that is altered across psychosis diagnoses and relates to positive symptoms. Finally, Alice Price will describe how factors of the visual Cardiff Hypersensitivity Scale differ across conditions and in the general population. We will finish with a panel discussion drawing out overall themes and covering theoretical and practical considerations for advancing investigations into neurodiversity in visual functioning. The symposium will inform a richer understanding within the VSS community of visual function in psychiatric and neurodevelopmental conditions, and individual differences more broadly. The presentations and discussion will benefit both junior and senior vision scientists by highlighting cutting-edge methods and emerging theories of neurodiversity. The symposium is timely not only because of the recent “transdiagnostic revolution” (Astle et al., 2022), but also due to the increasing prevalence of diagnoses (e.g., autism, mental health difficulties).

Talk 1

Visual processing and decision-making in children with autism and dyslexia: Insights from cross-syndrome approaches

Catherine Manning¹,²; ¹University of Reading, UK, ²University of Birmingham, UK

Atypical visual processing has been reported in a range of developmental conditions, including autism and dyslexia. One explanation for this is that certain neural processes are vulnerable to atypical development, leading to shared effects across developmental conditions. However, few studies make direct comparisons between developmental conditions, or use sufficiently sensitive methods, to conclude whether visual processing is affected differently in these conditions or similarly, in which case it would reflect a more general marker of atypical development. After evaluating the current state of the science, I will present findings from two sets of studies that apply computational modelling approaches (equivalent noise modelling and diffusion modelling) and measure EEG data in matched groups of autistic, dyslexic and typically developing children aged 6 to 14 years (n ≈ 50 per group). These methods help pinpoint the component processes involved in processing visual information and making decisions about it, while linking brain and behaviour. The results identify areas of both convergence and divergence in autistic and dyslexic children’s visual processing and decision-making. For example, both autistic and dyslexic children show differences in late stimulus-locked EEG activity in response to coherent motion stimuli, which may reflect reduced segregation of signal from noise. However, only dyslexic children (and not autistic children) show a reduced accumulation of sensory evidence, which is reflected in a shallower build-up of activity in a centro-parietal EEG component. Therefore, while there may be some shared effects across conditions, there are also condition-specific effects, which will require refined theories.

Talk 2

Weaker visual surround suppression in both autism spectrum and psychosis spectrum disorders

Michael-Paul Schallmo¹; ¹University of Minnesota

Issues with sensory functioning and attention are common in both autism spectrum and psychosis spectrum disorders. Despite important differences in symptoms and developmental time course, these conditions share a number of common features with regard to visual perception. One such phenomenon that we and others have observed in both populations is a reduced effect of surrounding spatial context during the perception of basic visual features such as contrast or motion. In this talk, we will consider whether these differences in visual function may have a common source. In a series of psychophysical and brain imaging experiments, we found that young adults with ASD showed weaker visual surround suppression during motion perception, as compared to neurotypical individuals. This was reflected by differences in behavioral task performance and fMRI responses from area MT. Likewise, across multiple experiments in people with psychosis, we have found that individuals with schizophrenia show weaker behavioral and neural surround suppression during visual contrast perception. Recently, we used a divisive normalization model to show that narrower spatial attention may be sufficient to explain weaker surround suppression in ASD. This theory was subsequently given support by another group who showed weaker suppression for narrow vs. broad attention conditions in healthy adults. Previous studies have also found narrower spatial attention both in people with ASD and in schizophrenia. Thus, we suggest narrower attention may be a common sensory difference that is sufficient to account for weaker surround suppression across both ASD and schizophrenia, versus neurotypicals.

Talk 3

Atypical use of visuospatial context in schizophrenia, bipolar disorder, and subclinical populations: A meta-analysis

Victor Pokorny¹, Sam Klein¹, Collin Teich², Scott Sponheim¹,², Cheryl Olman¹, Sylia Wilson¹; ¹University of Minnesota, ²Minneapolis Veterans Affairs Health Care System

Visual perception in people with psychotic disorders is thought to be minimally influenced by surrounding visual elements (i.e., visuospatial context). Visuospatial context paradigms have unique potential to clarify the neural bases of psychotic disorders because (a) the neural mechanisms are well-studied in both animal and human models and (b) generalized cognitive deficits are unlikely to explain altered performance. However, the published literature on the subject is conflicting and heterogeneous, such that a systematic consolidation and evaluation of the published evidence is needed. We conducted a systematic review and meta-analysis of 46 articles spanning over fifty years of research. Articles included behavioral, fMRI and EEG reports in schizophrenia, bipolar disorder, and subclinical populations. When pooling across all paradigm types, we found little evidence of reduced use of visuospatial context in schizophrenia (Hedges’ g=0.20), and marginal evidence for bipolar disorder (g=0.25). The strongest evidence was observed for altered contrast perception paradigms in schizophrenia (g=0.73). With respect to subclinical populations, we observed immense heterogeneity in populations of interest, individual-difference measures, and study designs. Our meta-analysis provided surprisingly weak evidence for the prevailing view that psychotic disorders are associated with a general reduction in use of visuospatial context. Instead, we observed strongest evidence for a specific alteration in the effect of visuospatial context during contrast perception. We propose altered feedback to primary visual cortex as a potential neural mechanism of this effect.

Talk 4

A novel somato-visual functional connectivity biomarker for affective and non-affective psychosis

Brian Keane¹, Yonatan Abrham¹, Michael Cole², Brent Johnson¹, Carrisa Cocuzza³; ¹University of Rochester, ²The State University of New Jersey, ³Yale University

People with psychosis are known to exhibit thalamo-cortical hyperconnectivity and cortico-cortical hypoconnectivity with sensory networks; however, it remains unclear if this applies to all sensory networks, whether it impacts affective and non-affective psychosis equally, or whether such differences could form the basis of a viable biomarker. To address these questions, we harnessed data from the Human Connectome Early Psychosis Project and computed resting-state functional connectivity (RSFC) matrices for healthy controls and affective/non-affective psychosis patients who were within 5 years of illness onset. Primary visual, secondary visual (“visual2”), auditory, and somatomotor networks were defined via a recent brain network partition. RSFC was determined for 718 regions (358 subcortical) via multiple regression. Both patient groups exhibited cortico-cortical hypoconnectivity and thalamo-cortical hyperconnectivity in somatomotor and visual2 networks. The patient groups were similar on every RSFC comparison. Across patients, a robust psychosis biomarker emerged when thalamo-cortical and cortico-cortical connectivity values were averaged across the somatomotor and visual2 networks, normalized, and subtracted. Four thalamic regions linked to the same two networks disproportionately drove the group difference (p=7e-10, Hedges’ g=1.10). This “somato-visual” biomarker was present in antipsychotic-naive patients and discoverable in a 5-minute scan; it could differentiate psychosis patients from healthy or ADHD controls in two independent data sets. The biomarker did not depend on comorbidities, had moderate test-retest reliability (ICC=.59), and could predict patient status in a held-out sample (sensitivity=.66, specificity=.82, AUC=.83). These results show that, across psychotic disorder diagnoses, an RSFC biomarker can differentiate patients from controls by the early illness stages.

Talk 5

Abnormal oculomotor corollary discharge signaling as a trans-diagnostic mechanism of psychosis

Beier Yao¹,²,³, Martin Rolfs⁴, Rachael Slate⁵, Dominic Roberts³, Jessica Fattal⁶, Eric Achtyes⁷,⁸, Ivy Tso⁹, Vaibhav Diwadkar¹⁰, Deborah Kashy³, Jacqueline Bao³, Katharine Thakkar³; ¹McLean Hospital, ²Harvard Medical School, ³Michigan State University, ⁴Humboldt University, ⁵Brigham Young University, ⁶Northwestern University, ⁷Cherry Health, ⁸Western Michigan University Homer Stryker M.D. School of Medicine, ⁹The Ohio State University, ¹⁰Wayne State University

Corollary discharge signals (CD) are “copies” of motor signals sent to sensory areas to predict the corresponding input. Because they are used to distinguish actions generated by oneself versus external forces, altered CDs are a hypothesized mechanism for agency disturbances in psychosis (e.g., delusion of alien control). We focused on the visuomotor system because the CD relaying circuit has been identified in primates, and the CD influence on visual perception can be quantified using psychophysical paradigms. Previous studies have shown a decreased influence of CD on visual perception in (especially more symptomatic) individuals with schizophrenia. We therefore hypothesized that altered CDs may be a trans-diagnostic mechanism of psychosis. We examined oculomotor CDs (using the trans-saccadic localization task) in 49 participants with schizophrenia or schizoaffective disorder (SZ), 36 psychotic bipolar participants (BPP), and 40 healthy controls (HC). Participants made a saccade to a visual target. Upon saccade initiation, the target disappeared and reappeared at a horizontally displaced position. Participants indicated the direction of displacement. With intact CDs, participants can remap the pre-saccadic target and make accurate perceptual judgements. Otherwise, participants may use saccade landing site as a proxy of pre-saccadic target. We found that both SZ and BPP were less sensitive to target displacement than HC. Regardless of diagnosis, patients with more severe positive symptoms were more likely to rely on saccade landing site. These results suggest a reduced influence of CDs on visual perception in SZ and BPP and, thus, that altered CD may be a trans-diagnostic mechanism of psychosis.

Talk 6

The four factors of visual hypersensitivity: definition and measurement across 16 clinical diagnoses and areas of neurodiversity

Alice Price¹, Petroc Sumner¹, Georgie Powell¹; ¹Cardiff University

Subjective sensitivity to visual stimuli, including repeating patterns and bright lights, is known to associate with several clinical conditions (e.g., migraine, anxiety, autism), and also occurs in the general population. Anecdotal reports suggest that people might be sensitive to different types of visual stimuli (e.g., to motion vs lights). The Cardiff Hypersensitivity Scale-Visual (CHYPS-V) was developed to define and measure the different factors of visual hypersensitivity, using questions which focus upon functional impact rather than affective changes. Across five samples (n > 3000), we found four highly replicable factors using bifactor modelling. These were brightness (e.g., sunlight), repeating patterns (e.g., stripes), strobing (e.g., light flashes), and intense visual environments (e.g., supermarkets). The CHYPS-V and its subscales show very good reliability (α > .80, ω > .80) and improved correlations with measures of visual discomfort. We also used the CHYPS-V to delineate how these factors may differentiate clinical diagnoses and areas of neurodiversity from each other, and from the general population. Differences from individuals reporting no clinical diagnoses were most pronounced for the intense visual environments subscale, with individuals reporting a diagnosis of autism, fibromyalgia, or persistent postural-perceptual dizziness (PPPD) scoring highest. Whilst many conditions showed a similar pattern of visual sensitivity across factors, some conditions (e.g., migraine, PPPD) show evidence of condition-specific sensitivities (e.g., to pattern, or to strobing). In addition to identifying the factor structure of visual hypersensitivity, the CHYPS-V can be used to help investigate underlying mechanisms which give rise to these differences in visual experience.

Critical Perspectives On Vision Science: Towards Unbiasing Our Methods and Role in Knowledge Production

Symposium: Friday, May 19, 2023, 12:00 – 2:00 pm, Talk Room 1

Organizers: Eline Kupers¹, Kathryn Graves², Kimele Persaud³; ¹Stanford University, ²Yale University, ³Rutgers University
Presenters: Sholei Croom, Pawan Sinha, Jasmine Kwasa, Joel E Martinez, Vassiki S Chauhan

Reckoning with a global pandemic and widespread social inequality has resulted in increased consciousness around issues such as racism, sexism, ableism, and discrimination against the LGBTQ+ community. As scientists, academics and industry professionals, our work operates within the larger context of these societal issues. Therefore, we must consider how our science perpetuates or mitigates the systemic oppressions that those most marginalized among us struggle against. Failing to use a critical lens to understand how these systems impoverish our science risks our science reifying oppressive systems. This symposium, organized by a team of historically underrepresented researchers and their allies, brings together complementary perspectives to call attention to how racial, ethnic, gender, and other systemic biases are perpetuated in our current methods and practices, and to discuss approaches to ameliorate them across the scientific cycle: specifically, in our theoretical frameworks, visual stimuli, and data collection (both behavioral and neuroimaging). Speakers will present recent empirical and theoretical work in service of important questions such as: What are the bottlenecks in making our human participant pool more inclusive? How do our visual stimuli, a critical component of our scientific hypotheses, reflect historic and structural imbalances in our society? How does the way we study human perception affect how we perceive people? And how may these representations reflect and justify social oppression, or motivate social change? We argue that addressing systematic biases in science and higher education institutions is not only a moral obligation, but an epistemic one: By critically examining our tools and frameworks, and making a conscious effort to promote equity, our science will become more effective, innovative, and impactful.
By providing a platform to leaders in our field who are taking this challenge head-on, we hope that more researchers across the vision science community will feel empowered to enact much-needed change. This symposium will start with a brief overview of the history of vision science, delivered by one of the organizers, Sholei Croom. Four speakers will then present their work in 20-minute presentations, plus 5 minutes for audience questions. Speakers will engage with different aspects of our overarching theme, focusing on current disparities as well as suggesting specific solutions. A final talk will be given by Vassiki Chauhan (one of the organizers), to suggest ways the audience can promote diversity and inclusivity in their own research and scientific community.

Presentations

Making the Case for Critical Vision Science: Beyond Diversity, Equity and Inclusion

Sholei Croom¹; ¹Johns Hopkins University

The principles of “Diversity, Equity and Inclusion” (DEI) have become standard across industry and academic spaces to promote awareness and advocacy around issues of identity. Many institutions, including VSS, have dedicated DEI committees whose focus is to foster diversity in communities that have historically been homogenous. While such initiatives have certainly led to positive institutional changes and more cultural competency, a primary focus on diversity can obfuscate—rather than illuminate—the myriad ways in which power dynamics shape our field. Further, fixating on inclusion as the remedy to oppressive structures ignores opportunities within our research practices to promote social justice. In this respect, this symposium urges our community to adopt a more critical frame. Borrowing from contemporary perspectives in critical psychology, philosophy, and sociology, this introductory talk explores the premise that our current methods and theoretical frameworks in vision science reflect back the social conditions in which they are produced. Rather than framing structural forces as external to our scientific practice, revisiting the history of vision science reveals that such forces necessarily inform the way we perform our research. From the advent of psychophysics, to technological advances in neuroscience, to the cognitive revolution and the subsequent rise of computational modeling, each step in the intellectual history of our field has been shaped by ideological and socio-historical factors. By elevating this perspective, we hope that researchers in our vision science community can see our field in a new light; one that embraces rather than ignores context in service of positive social change.

Looking Beyond Parochial Participant Pools

Pawan Sinha¹; ¹Massachusetts Institute of Technology (MIT)

Our inclusion criteria for study participants are typically strongly biased by geographic and cultural locality. This is understandably dictated by convenience and logistics, but the upshot is that we end up studying those who are most like us. In holding up our studies as general contributions to the study of cognition, the unstated assumption is that we are adequately representative of the human species at large. In succumbing to this hubris, we not only run the risk of gathering data that do not, in fact, capture a general view of human cognition, but also may miss out on novel opportunities that exist beyond our local catchment areas. I shall present an initiative from my lab, Project Prakash, that illustrates some of the benefits that can accrue by going beyond the constraint of parochiality. The project has proactively enlisted participation from marginalized populations in the Global South and, in doing so, has been able to pursue scientific avenues that would not otherwise have been accessible. I shall also discuss the challenges inherent in operationalizing efforts of this kind, potential approaches to overcome them, and ethical considerations that must necessarily be addressed. The overall takeaway is that although moving beyond parochial participant pools can be difficult, the potential benefits of doing so make it worth the effort.

Addressing Racial and Phenotypic Bias in Human Neuroscience Methods

Jasmine Kwasa¹; ¹Carnegie Mellon University

Typical EEG systems, the standard of care for neurological monitoring and a popular modality in vision science, do not work well for individuals with the coarse, dense, and curly hair common in the Black population (Etienne et al., 2020; IEEE EMBC). With more than 1 billion individuals of African descent across the globe, this not only compromises neurological care for a significant portion of the population, but also excludes these groups from basic neuroscience research studies. Our team developed the first solution to this problem by creating Sèvo Systems, a simple yet effective set of devices that leverage the strength of braided hair to improve scalp contact during brain recordings in individuals with coarse, dense, and curly hair. In this talk, I will briefly describe the Sèvo system and outline our ongoing assessments of its effectiveness in both research and clinical settings. Our work is the first step towards mitigating phenotypic biases embedded in this popular technology that may lead to misunderstandings of brain science and the exclusion of marginalized groups from human neuroscience and psychology research. I will also speak to other examples of phenotypic bias in neurotechnologies that we are seeking to improve at Carnegie Mellon, including functional near-infrared spectroscopy (fNIRS) and pulse oximetry. I will outline ways that vision scientists can join the cause and use equitable and inclusive methodologies based on published work (Webb et al., 2022; Nature Neuroscience) and my personal experience in preparing different hair textures for neuroscience research.

Facecraft: Race Reification in Psychological Research with Faces

Joel E Martinez¹; ¹Harvard University

Faces are socially important surfaces of the body to which various meanings are attached. The widespread physiognomic belief that faces inherently contain socially predictive value is why they make a generative stimulus for perception research. However, critical problems arise in studies that simultaneously investigate faces and race. Researchers studying race and racism inadvertently engage in various research practices that transform faces with specific phenotypes into straightforward representatives of their presumed race category, thereby taking race and its phenotypic associations for granted. I argue that research practices that map race categories onto faces using bio-essentialist ideas of racial phenotype constitute a form of racecraft ideology, whose dubious reasoning presupposes the reality of race and mystifies the causal relation between race and racism. In considering how to study racism without reifying race in face studies, this talk describes how these practices reproduce racecraft ideology and impair theoretical inferences, then explores preliminary ideas for counter-practices.

Scientists in Context

Vassiki S Chauhan¹; ¹Barnard College

This final talk will be a reflection on the topics that have been discussed throughout the course of the symposium. We have collectively recognized the importance of putting issues about representation at the forefront of scientific discourse, mapped some of the current disparities in scientific practice, and heard about necessary and creative ways to address these imbalances. There are certain subdisciplines of vision science where these issues are more apparent than others, but regardless of the discipline we work in, the society within which scientific inquiry occurs shapes how it occurs. To conclude the discussion, I will describe how lived experience shapes our ability to participate in the scientific process and the role scientists can and should play in society. I will go over some existing resources that practitioners at all levels of academia can benefit from in centering equitability and fairness in their work and their lives.

