Friday, May 16, 2025, 8:00 – 10:00 am, Talk Room 1

Organizers: Oscar Woolnough¹, Alex L White²; ¹UTHealth Houston, ²Barnard College, Columbia University

Our ability to communicate information about the visual world with other people is dependent on interconnection between our brain’s visual and language networks. This is crucial for our ability to name objects and recall their abstract semantic associations, and also for our ability to read written words. This symposium will cover multiple aspects of the vision-language interface, including modulation of visual cortex activation by the demands of linguistic tasks, changes in visual cortex function and white matter connectivity as we learn to read, and representations of semantics in higher-level visual cortex. More…

A vision scientist walks into a clinic...

Friday, May 16, 2025, 8:00 – 10:00 am, Talk Room 2

Organizer: Benjamin Backus^1,2; ¹Vivid Vision Inc, ²SUNY College of Optometry (emeritus)

For Helmholtz, eye doctoring and vision science informed one another. Today, most vision scientists without a joint degree focus on basic research. In this symposium, vision scientists describe their research that has a clinical element. Aside from the science itself, there will be interesting facts about the ways vision can go wrong; practical tips for working with clinicians; and intimations about how new technologies are opening doors for vision scientists to help test, preserve, and improve people's vision. More…

Selective Visual Attention, Alive and Well!

Friday, May 16, 2025, 10:30 am – 12:30 pm, Talk Room 1

Organizers: Marisa Carrasco¹, Miguel Eckstein², Anna C Nobre³; ¹New York University, ²University of California Santa Barbara, ³Yale University

Visual attention is critical for perception, enabling prioritized and selective information processing. We discuss recent studies on attention in human and non-human primates, focusing on different types: spatial [Carrasco, Briggs, Eckstein], feature-based [Ling, Briggs], and temporal [Nobre], across various analysis levels and methods: psychophysics [Carrasco, Briggs, Nobre], neuroimaging [fMRI, Carrasco, Ling], MEG/EEG [Nobre], neurostimulation [TMS, Carrasco], multi-unit recording [Briggs], and computational [CNNs, Eckstein]. Each presentation illustrates how attention systems are defined, operationalized, and manipulated. We highlight the interplay among different analytical levels and methods, showcasing how they collectively inform our understanding of the psychological and neural substrates of attention. More…

25 Years of Seeing ‘Stuff’—Advances and Challenges in Material Perception

Friday, May 16, 2025, 10:30 am – 12:30 pm, Talk Room 2

Organizers: Vivian C. Paulun¹, Roland Fleming²; ¹Massachusetts Institute of Technology, ²Justus Liebig University Giessen and Center for Mind, Brain and Behavior (CMBB), Universities of Marburg, Giessen and TU Darmstadt

"Our world contains both things [e.g., chairs, apples, hammers] and stuff [e.g., snow, honey, cream cheese], but things tend to get the attention.” With his paper “On Seeing Stuff”, Adelson (2001) laid the foundation for the study of material perception. This symposium celebrates the upcoming anniversary of Adelson’s seminal publication by exploring the most exciting advances in the field of material perception through a variety of approaches and perspectives. Six talks will highlight the deep connections between material perception and many areas in vision science, e.g., object, shape, motion, color, scene perception and intuitive physics, and discuss its unique challenges. More…

From insects to fish to mammals: Active vision in non-primate organisms

Friday, May 16, 2025, 1:00 – 3:00 pm, Talk Room 1

Organizers: Lisa Kroell¹, Lisa Fenk¹; ¹Max Planck Institute for Biological Intelligence

The link between perception and action is extensively discussed in human and non-human primate vision. Among the remaining 99.98% of animal species, however, an entire universe of active visual behaviors waits to be discovered. In this symposium, we combine the expertise of six active vision specialists who investigate visual perception in behaving fruit flies, dung beetles, Monarch butterflies, zebrafish, ferrets and mice. Despite their diverse backgrounds, all speakers pursue a common aim: understanding how movements of the eyes, body, paws, wings, tails or fins shape and even support vision. More…

Model-optimized stimuli: more than just pretty pictures

Friday, May 16, 2025, 1:00 – 3:00 pm, Talk Room 2

Organizers: William Broderick¹, Jenelle Feather¹; ¹Center for Computational Neuroscience, Flatiron Institute

Experiments in vision science rely on designing the correct stimulus set to test the properties of the underlying representation. With data collection time inherently limited, choosing stimuli that get the most "bang for your buck" when collecting behavioral or neural data is imperative. In this symposium, we highlight recent progress in utilizing model-optimized synthetic stimuli to investigate properties of visual systems. Speakers will highlight how such stimuli can probe aspects of behavioral, neural, and computational model responses, demonstrating the power of model-optimized stimuli for improving our understanding of visual processing. More…

Symposium Submission Policies

The organizer must be a current 2025 member in good standing.
Invited speakers must register for the meeting but need not be members.
No speaker or organizer can participate in more than one symposium.
Speaker substitutions are not allowed.
If a symposium talk has more than one author, it must be presented by the first author.
Submitting a symposium proposal or speaking in a symposium does not prevent you from submitting an abstract for a talk or poster presentation at VSS.
Before submitting a proposal, organizers must ensure that all speakers are committed to participating in the symposium and registering for the meeting before submitting a proposal.

For questions about Symposium Submission Policies, please contact us.

Symposium Submission Guidelines

VSS is seeking proposals for symposia to be presented at the VSS 2025 Annual Meeting, which begins Friday, May 16, 2025. VSS 2025 will be a fully in-person meeting with no virtual components. Four to six symposia will be scheduled, each in a two-hour time slot.

Symposia can be organized along the lines of content or methodology, but in every case talks within a symposium should focus on broader conceptual themes than a typical VSS presentation.

Proposals are evaluated by the VSS Board of Directors based on the following criteria:

Scientific merit
Theoretical and/or methodological innovation
Timeliness
Breadth and appeal to a substantial number of VSS attendees
Lack of overlap with the regular program and recent symposia
Diversity of the speakers

Symposium Format

The recommended format is four to six talks. Talk time and Q&A time can be scheduled within the 2-hour session according to the organizers’ discretion (e.g., after individual talks, concentrated at the end of a session, or a mixture of the two approaches). Other formats will be considered, but proposals for other formats should include a clear rationale. Symposia often benefit from a diversity of perspectives and institutional affiliations. Proposals from young investigators are encouraged.

Symposium Information

The symposium is submitted using a multi-page form that includes information describing the symposium and the talks in the symposium. A symposium may have a maximum of three organizers. A minimum of four talks is required; up to six talks are allowed. Talks must be entered in the order they will be presented. It is best to collect information about the individual talks from their authors before you start the submission process.

As the symposium organizer, you must have prior approval from your talk presenters they consent to participate in this symposium (and no other symposium), and to register and attend VSS 2025 in person to present their talk. The organizer must also agree to the Symposium Policies and disclose any Conflicts of Interest.

Required symposium information:

Symposium title.
Brief description of the symposium (maximum 100 words). Appears on the symposium overview page.
Full description of the symposium (maximum 500 words). Appears on the symposium detail page.
Estimated attendance.
Name, affiliation, and contact information for the organizers (maximum 3 organizers).
Acknowledgements (optional).

Required information for each talk:

Talk title.
Talk abstract (maximum 250 words).
Author names and affiliations.
Email and country of citizenship of the talk presenter (first author).
The presenter must agree to an Ethics statement and disclose any Conflicts of Interest.

Organizer and Speaker Requirements

The symposium organizers (maximum of three) must be current VSS members (for 2025), but invited speakers are not required to be VSS members. All speakers are required to register for the meeting. Submitting a symposium proposal or speaking in a symposium does not prevent organizers or speakers from submitting an abstract for a talk or poster presentation at VSS. An individual may participate as organizer or speaker in only one symposium. The symposium organizer(s) may be a speaker in the symposium.

Organizers must ensure that all speakers are committed to participating in the symposium before submitting a proposal, and organizers must also ensure that speakers have not agreed to participate in more than one symposium. If a symposium talk has more than one author, it must be presented by the first author. For each speaker, please provide up to three references to published articles relevant to the proposed talk.

Per the VSS Disclosure of Conflicts of Interest Policy, speakers must reveal any commercial interests or other potential conflicts of interest that they have related to the work described. Any conflicts of interest must be declared on your title slide.

While preparing submissions, organizers are encouraged to use a checklist to ensure that their proposals meet the requirements.

Symposium Review

Proposals are evaluated by the VSS Board of Directors. In addition to considering the VSS 2025 focus outlined above, our criteria include scientific merit, timeliness, theoretical innovation and breadth, methodological innovation, lack of overlap with the regular program and recent symposia, and diversity of the speakers.

Submission Schedule

Submissions Open: October 15, 2024
Submissions Close: November 15, 2024
Notification of Accepted Symposia: November 26, 2024

Submitting a Symposium

To submit a symposium, Log in to your MyVSS Account or Create a New MyVSS Account, pay for your 2025 membership, and then click the Submit a Symposium button.

For questions about Symposium Submissions, please contact us at .

Continuous psychophysics

< Back to 2023 Symposia

Symposium: Friday, May 19, 2023, 5:00 – 7:00 pm, Talk Room 2

Organizers: Johannes Burge¹, Kathryn Bonnen²; ¹University of Pennsylvania, ²Indiana University
Presenters: Johannes Burge, Constantin Rothkopf, David Burr, Clara Mestre, Pascal Mamassian

This symposium will highlight continuous psychophysics, a recent and potentially paradigm-shifting methodological advance in the science of perception and action. While traditional psychophysics (e.g. forced-choice, two-alternative tasks) usually acquires measurements on the time-scale of seconds, the computations driving perception and action often take place on the time-scale of milliseconds. Continuous psychophysics closes this temporal gap, providing information about temporal dynamics with millisecond-scale precision. Thus, behavior can be measured at the time-scale at which the computations unfold. There are also a number of additional benefits to continuous psychophysics. It facilitates the collection of large volumes of high-quality data in a short period of time, opening up the study of questions that may otherwise be impractical. It also substantially expands the population of participants from which psychophysical data can be collected, all with low-cost, easy-to-obtain equipment. Excitement about continuous psychophysics has spread quickly. It is now being applied to a startling array of topics in sensory-perception, visuomotor control, development, cognition, and memory. And it is being used to study non-traditional and/or non-verbal participant populations: babies, multiple-sclerosis patients, and non-human-primates. This symposium will showcase prominent examples of many problems that can be tackled with the continuous psychophysics approach, in hopes of increasing awareness and maximizing its contribution to science. Five talks will introduce the method and a sampling of current applications. Burge & Bonnen will describe the development of continuous psychophysics, present recent applications of it in their labs, and—in a rare VSS treat—conduct a rapid in-lecture audience-involved web-based data collection demo that will demonstrate the power and ease of the approach. Constantin Rothkopf will discuss recently developed control-theoretic tools for analyzing data which enable practitioners to explicitly model the effects on tracking performance of movement costs and mistaken assumptions about stimulus dynamics. David Burr will demonstrate how continuous psychophysics can rapidly reveal both the center-surround properties of motion processing, and provide estimates of the processing time required by grouping mechanisms in the context of numerosity judgments. Clara Mestre will discuss applications investigating visual development in human infants and young children, with specific emphasis on how optical mismatches between the eyes interfere with the normal development of oculomotor function and binocular vision. And Pascal Mamassian will speak on the application of continuous psychophysics to the study of sensorimotor confidence, a topic of rapidly increasing interest to the metacognition community. Together, these talks will provide a survey of the current research areas benefiting from continuous psychophysics and will help to stimulate ideas for how the community can benefit from it in the future.

Presentations

Continuous psychophysics: Past, Present, and Future

Johannes Burge¹, Kathryn Bonnen²; ¹University of Pennsylvania, ²Indiana University

The kickoff lecture of the symposium will introduce continuous psychophysics, describe recent results from the Burge and Bonnen laboratories, and conclude with a real-time, audience-involved, data-collection and data-analysis demonstration. Continuous psychophysics is a new methodological approach gaining traction in vision science. Participants continuously track a rapidly changing target with an effector (e.g. a mouse, a finger, the eyes) while the target and response are monitored over time. Traditional psychophysical variables can be estimated rapidly, while simultaneously collecting information about temporal processing dynamics that is typically absent from traditional datasets. Strikingly, the value of the variables estimated with continuous psychophysics often tightly track those estimated with traditional forced-choice methods. Continuous psychophysics is positioned to become a powerful tool for measuring perception and behavior. Bonnen, Burge and others (including those speaking in the symposium) have used continuous psychophysics to investigate a variety of topics. Bonnen has used it to obtain accurate estimates of location sensitivity, and show that processing motion in depth is slower than processing motion in the frontal plane. Burge has used continuous psychophysics to obtain estimates of visual processing latency and the duration of temporal integration having millisecond-scale precision. We will briefly discuss these results and preview the work of other symposium participants. The real-time demonstration will engage symposium attendees in data collection across two experimental conditions. We will show how continuous psychophysics can be used to estimate temporal response functions, temporal integration windows, and the differences in temporal processing across conditions.

Putting perception into action: Inverse optimal control for continuous psychophysics

Constantin Rothkopf¹, Dominik Straub¹; ¹Technical University of Darmstadt, Germany

Psychophysical methods are a cornerstone of vision science, psychology, and neuroscience where they have been used to quantify behavior and its neural correlates for a vast range of mental phenomena. Their power derives from the combination of controlled experiments and rigorous analysis through signal detection theory. Unfortunately, they require many tedious trials and preferably highly trained participants. A recently developed approach, continuous psychophysics, promises to transform the field by abandoning the rigid trial structure involving binary responses and replacing it with continuous behavioral adjustments to dynamic stimuli. However, because behavior now unfolds within the perception and action cycle, classic signal detection theory is not applicable. In this talk, we present our recently developed computational analysis framework for continuous psychophysics based on Bayesian inverse optimal control. We start by formalizing an ideal observer account of these tasks and then move to ideal actors. In the tradition of rational analysis, we subsequently allow for subjects being influenced by internal cognitive costs and, finally, that subjects potentially possess false beliefs about experimental stimulus dynamics. Carrying out inference over these models and applying rigorous model comparison allows a principled explanation of individuals’ behavior and reconciles descriptive with normative models. We show via simulations and on previously published data that this recovers perceptual thresholds and additionally estimates subjects’ action variability, internal behavioral costs, and subjective beliefs. Taken together, we provide further evidence for the importance of including acting uncertainties, subjective beliefs, and, crucially, the intrinsic costs of behavior, even in experiments seemingly only investigating perception.

Continuous tracking as a general tool to study the dynamics and context effects of human perception

David Burr¹, Pierfrancesco Ambrosi¹, Guido Marco Cicchini²; ¹University of Florence, Florence, Italy, ²National Research Council, Pisa, Italy

Continuous tracking is a newly developed technique that measures the correlation between a randomly changing stimulus property (usually 2-D position) and the response of participants tracking the object. This technique can in principle by generalised to measure any dynamic aspect of a stimulus, to provide useful information not only about sensitivity, but also dynamics and contextual effects. Here we apply it to motion and numerosity. Participants tracked the direction of motion of 1-D noise moving randomly over a randomly moving background, target and background following independent motion trajectories. Observer responses correlated positively with the target motion, and negatively with the background motion, demonstrating and quantifying surround inhibition of motion. Separately, participants tracked on a number-line the perceived numerosity of a cloud of dots. Some dot-pairs were connected by lines, producing an illusory reduction of the apparent numerosity of the dot clouds: both the number of dots and the proportion connected by lines varied over time, following independent random walks. The tracking correlations showed that grouping dots by connecting lines caused a robust underestimation of numerosity. The tracking response to the illusion created by connection was about 150 ms slower than to the physical numerosity, suggesting that this time was utilised in processing the grouping effect. Finally, we developed an ideal observer model that closely models human results, providing a generalized framework for modelling the effects on tracking data, and to study the divergence of human participants from ideal behavior.

Applications of continuous tracking to typical and atypical visual development

Clara Mestre¹, Colin Downey¹; ¹Indiana University

Continuous tracking provides a time-efficient paradigm for assessing the reflex behavior of human infants and children. We have used this approach to study the development of simultaneously measured vergence and accommodation responses, demonstrating robust almost adultlike responses to movement of a screen in depth by 3 months of age. In the context of atypical development, we have used computational simulation of optical defocus to understand its impact on the eye alignment and vergence responses of children in the critical period of binocular development. While matched defocus in the two eyes had only a mild effect on vergence tracking performance, unilateral defocus clearly disrupted the ability to continuously realign the eyes in response to small random changes in disparity, especially for the eye with the defocused image. These studies have important implications for the two age groups studied and for adults experiencing symptoms after presbyopia correction with monovision. The continuous tracking approach can be used in less than 2 minutes of testing to assess factors placing infants at risk for atypical development.

Visuo-motor confidence

Pascal Mamassian¹, Shannon Locke¹, Alexander Goettker², Karl Gegenfurtner²; ¹CNRS & École normale supérieure, Paris, France, ²Justus-Liebig University Giessen, Giessen, Germany, ³New York University, New York, NY

Confidence refers to our ability to evaluate the validity of our own performance and there are multiple benefits of successful estimates. In particular, reliable confidence could be used as an internal feedback signal to update our model of the world. In perception, confidence sensitivity can be relatively high, and in some visual tasks, as high as one would predict from the ideal confidence observer. We tested whether these perceptual results would generalize to two visuo-motor conditions. In the first condition, participants used their hand to track the center of a cloud of dots that followed an unpredictable horizontal trajectory. After tracking for several seconds, they reported their confidence as being better or worse than their average performance. The analysis of these confidence judgments indicated that participants were able to monitor their tracking performance, but not optimally. We replicated this manual tracking task in a second condition where participants had to track the cloud of dots with their eyes. Here again, confidence sensitivity was above chance, but barely so. Overall, it appears that human participants have only limited access to their visuo-motor performance, and they are comparatively worse than for purely visual tasks. This limitation might reflect the cost of fast and accurate visuo-motor tracking.

< Back to 2023 Symposia

Object representations in the parietal cortex

< Back to 2023 Symposia

Symposium: Friday, May 19, 2023, 5:00 – 7:00 pm, Talk Room 1

Organizers: Erez Freud¹, Maryam Vaziri Pashkam², Yaoda Xu³; ¹York University, ²National Institute of Mental Health, ³Yale University
Presenters: Maryam Vaziri-Pashkam, Vladislav Ayzenberg, Anne B. Sereno, Erez Freud, Stefania Bracci, Yaoda Xu

Although the primate ventral visual cortex is well known for its role in visual object processing and representations, a host of recent studies have reported that robust visual object information exists in the primate parietal cortex as well. What kind of object information is present in the parietal cortex, and how does it relate to visual representations formed in the ventral visual cortex? What is its functional significance in perception, tasks and visuomotor planning? How is parietal object representation formed during development? The proposed symposium aims to tackle these critical questions and to inform an updated view of the role of the parietal cortex in object vision and cognition by surveying and incorporating recent evidence. This will be accomplished by bringing together researchers working on this topic from different perspectives and using a diverse set of approaches, including visual psychophysics, brain imaging, human neuropsychology, monkey neurophysiology, and computational modeling. Each speaker will present their findings related to object representations in the parietal cortex and provide answers to the questions raised above. They will also share their thoughts on what they think are the critical and unanswered questions, whether it is possible to form a unified view of the role of the parietal cortex in object processing, and what is needed to push the field forward. There will be a total of six speakers, including four female and two male speakers from North America and Europe at both early and mid-careers. Erez Freud will give a short introduction to the symposium Maryam Vaziri Pashkam will present a data-driven approach to explore object responses in the two pathways. Vlad Ayzenberg will discuss fMRI and computational evidence on the contribution of the parietal cortex to object configural perception. Anne Sereno will discuss physiological and computational findings that suggest that object representations in the parietal cortex are independent of ventral stream representations and are necessary for accurate and optimal performance on object and spatial tasks. Erez Freud will provide a developmental perspective by discussing evidence on the emergence of parietal visual representations under typical and atypical development. Stefania Bracci will discuss the role of behavioral goals in shaping the object space in the two visual pathways. Yaoda Xu will further describe the adaptive nature of parietal object representations by showing how attention, task and working memory may shape these representations.

Presentations

Two pathways for processing object shapes

Maryam Vaziri-Pashkam¹; ¹Laboratory of Brain and Cognition, National Institute of Mental Health

The current dominant view on object processing entails that abstract object shape is represented in the ventral visual stream (occipito-temporal cortex). This bias towards the ventral stream has diverted the attention of the field from rigorous study of object responses in the dorsal stream (parietal cortex). To fill the gap in our understanding of object responses in the dorsal stream, we ran a series of experiments using functional MRI to comprehensively study object responses in the human dorsal stream and compared them to those in the ventral stream. We found robust object responses in both streams. Responses in the dorsal parietal cortex, similar to those in the ventral occipitotemporal cortex, were tolerant to changes in position, size, low-level features, and attentional tasks and were sensitive to both static and dynamic cues in the stimuli. In a data-driven approach, we compared the representational structures across the brain. The structure of the visual system, including the visual hierarchy and the dorsal-ventral distinction, emerged from this analysis. Dorsal Stream Regions differed from those in the ventral pathway and early visual cortices. In fact, the structure of the visual hierarchy and the dorsal-ventral stream distinction could be recovered through this bottom-up analysis. The rediscovery of the two-pathway structure using an entirely bottom-up approach demonstrates that first, two pathways exist in the brain for processing object shape information, and second, the dorsal pathway represents features in the visual input distinct from those represented in the ventral pathway.

Dorsal and ventral visual pathways: An expanded neural framework for object recognition

Vladislav Ayzenberg¹, Marlene Behrmann^1,2; ¹Neuroscience Institute and Psychology Department, Carnegie Mellon University, ²Department of Ophthalmology, University of Pittsburgh

Human object recognition is supported by a robust representation of shape that is tolerant to variations in an object’s appearance. Such ‘global’ shape representations are achieved by describing objects via the spatial arrangement of their local features, or structure, rather than by the appearance of the features themselves. Yet, despite its importance for object recognition, the neural mechanisms that support such robust representations of shape remain poorly understood. Here, I will present evidence that the dorsal visual pathway – the primary substrate underlying visuospatial processing and action – plays a crucial role in computing shape information for object recognition. Using fMRI, we find that the dorsal pathway contains regions selective for global shape, which are independent of regions that compute other properties represented by the dorsal pathway (e.g., allocentric relations). Moreover, the multivariate response within dorsal regions is sufficient to both categorize objects at levels comparable to the ventral pathway, as well as mediate representations of shape in the ventral pathway. These results are consistent with an emerging view of object processing, in which complete object representations are formed through the interactions of dorsal and ventral pathways. In this updated view, the ventral pathway is best described as a basis set of local image features, and shape information is, instead, computed by the dorsal pathway. In the final portion of the talk, I will review evidence for this new framework by drawing on neuroimaging data from different levels-of-analysis (single-unit, population-coding level), as well as data from neuropsychology patients.

Independence, not interactions: What simulations suggest about ventral and dorsal pathways.

Anne B. Sereno^1,2, Zhixian Han²; ¹Psychological Sciences Department, Purdue University, ²Weldon School of Biomedical Engineering, Purdue University

Extensive research suggests visual processing proceeds along two relatively segregated streams into temporal and parietal cortices, important for object and spatial processing, respectively. However, recent evidence suggests that object and spatial processing is present in both visual pathways. The functional significance of the presence of object and spatial properties in both pathways is not yet fully understood. Findings using a population decoding approach in physiology suggest there are fundamental differences between ventral and dorsal processing of both shape and space. Using artificial neural networks, we try to address whether the representations of object in dorsal stream and space in ventral stream play a functional role in spatial and object recognition, respectively. Our simulation results show that a model ventral and a model dorsal pathway, separately trained to do object and spatial recognition, respectively, each actively retained information about both identity and space. In addition, we show that these networks retained different amounts and kinds of identity and spatial information. Finally, we show that this differently retained information about object and space in a two-pathway model (as opposed to single-pathway model) was necessary to accurately and optimally recognize, localize, and, in multiple object displays, successfully bind objects and locations. A computational approach provides a framework to test the functional consequences of two independent visual pathways (with no cross connections) and shows that the findings can provide insight into recent contradictory physiological findings. Critical unanswered questions and implications for current and future understanding of object and spatial processing will be discussed.

Object representations in the dorsal pathway are subject to a protracted and susceptible developmental trajectory.

Erez Freud¹; ¹Department of Psychology and the Centre for Vision Research, York University

The dorsal visual pathway extends from the occipital lobe to the parietal lobe and generates object representations that promote different visual functions, including visually guided actions, shape recognition and spatial processing. In my talk, I will address two outstanding questions. First, how do dorsal pathway representations emerge throughout development? Second, how does the emergence of these representations modulate perception, action and the dissociation between these functions? To tackle these questions, we conducted a series of behavioral and neuroimaging investigations with typically developed children alongside individuals with neurodevelopmental disorders that affect their early visual experience (i.e., amblyopia) or cortical organization (i.e., cortical resections, ASD). Across the different studies, we find evidence that object representations in the dorsal pathway are not matured even in school-age children. Additionally, we show that visuomotor behaviors, associated with computations carried out by the dorsal pathway, are more susceptible to atypical development than perceptual behaviors. This greater susceptibility was also evident in terms of a reduced functional dissociation between perception and action in children with neurodevelopmental conditions. To conclude, our findings suggest that object representations derived by the dorsal pathway are subject to protracted development. This longer maturation rate might account for the susceptibility of these representations, and their associated behaviors, to neurodevelopmental disorders.

The role of behavioral goals in shaping object representations in the two visual pathways.

Stefania Bracci¹, Hans Op de Beeck²; ¹Center for Mind/Brain Sciences – CIMeC, University of Trento, Rovereto (TN), Italy, ²KU Leuven, Leuven Brain Institute, Brain & Cognition Research Unit, Leuven, 3000, Belgium.

Classic theories in object vision propose a division between ventral and dorsal stream computations. This standpoint has influenced the way scientists have studied object perception characterizing, in the ventral pathway, representations that support recognition of “what” we see, and in the dorsal pathway, spatial- and action-related representations that support “how” we interact with objects. Recent studies, however, show that this distinction might not fully capture the complexity of ventral and dorsal stream object representations. What if instead to understand this complexity we need to consider object vision in the context of the full repertoire of behavioral goals that underlie human behavior, running far beyond the “what” and “where/how” distinction? In this talk, I will discuss the role of behavioral goals in shaping the organization of object space in the two visual pathways. Complementary, I will show that although object-trained deep neural networks (DNNs) equate humans at object recognition, they struggle at explaining the richness of representational content observed in the visual cortex. I conclude by suggesting that understanding how the brain represents objects needs to not separate object-specific computations from human behavior goals.

Adaptive visual object representation in the human posterior parietal cortex

Yaoda Xu¹; ¹Psychology Department, Yale University

At any given moment, only a fraction of the visual input may be useful to guide our thoughts and actions. Having the entire visual input available could be distracting and disruptive. To extract the most useful information at each moment, visual processing needs to be selective and adaptive. Here I will show that, compared to visual object representations in the human occipito-temporal cortex (OTC), those in the human posterior parietal cortex (PPC) are more adaptive to serve the task at hand. Specifically, attention and task alter how objects are represented in PPC, with both the differences among the objects and task determining the object representational structure in PPC and with object representations in PPC better tracking the perceived object similarity when objects are task-relevant. Similarly, while the representations of a pair of objects shown together may be predicted by the representations of each constituent object shown in isolation in OTC, this is true in PPC only when the task context is equated. In visual working memory tasks, PPC object representations have been shown to be more resilient to distraction. A recent study shows that while VWM representations of target objects in OTC are entangled with those of distractor objects, those in PPC, however, are more distractor invariant. Such a representational scheme could potentially support PPC’s resilience to distraction in VWM tasks. The adaptive nature of visual object representation in PPC thus allows PPC to play a unique and significant role in supporting goal-directed visual information processing in the primate brain.

< Back to 2023 Symposia

The development of categorical object representations: bridging visual neuroscience and deep learning

< Back to 2023 Symposia

Symposium: Friday, May 19, 2023, 2:30 – 4:30 pm, Talk Room 2

Organizers: Marieke Mur¹; ¹Western University
Presenters: Heather L Kosakowski, Michael J Arcaro, Katharina Dobs, Talia Konkle, Marieke Mur

It is well established that object representations in high-level primate visual cortex emphasize categories of ecological relevance such as faces and animals. Yet how these categorical object representations emerge over the course of development is less well understood. The aim of this symposium is to integrate the latest work on visual object learning in the fields of developmental and computational neuroscience. Speakers will address the following key questions: When do category-selective responses emerge during development? What can we learn from the structure of visual experience alone? What constraints may additionally shape the development of categorical object representations in visual cortex? Two of our speakers are pioneers in developmental visual neuroscience, and the other two are at the forefront of developing deep learning models of object vision. Talks will highlight recent experimental work in human and nonhuman primates on developmental trajectories of object category formation in visual cortex as well as recent computational work on the learning objectives that drive object category formation. Heather Kosakowski will discuss her work on awake fMRI in human infants, which shows that category-selective regions for faces, places, and bodies are present in infants as young as 2-9 months of age. She will argue that the early emergence of category-selective regions should be understood within the broader context of cognitive development, which relies on parallel development of brain regions outside of ventral temporal cortex. Michael Arcaro will subsequently discuss his work on the development of the human and nonhuman primate visual systems. His findings suggest that categorical object representations arise from a proto-architecture present at birth that is modified by visual experience to become selectively responsive to frequently encountered elements of the environment. We will then shift from an empirical to a computational focus. Katharina Dobs will discuss her work on modeling human face perception with deep artificial neural networks. Her results show that networks need both visual experience with faces and training for face identification to show behavioral signatures of human face perception. Talia Konkle will subsequently discuss her work on building an integrated empirical-computational framework for understanding how we learn to recognize objects in the world. Her work shows that many artificial neural networks have a similar capacity for brain predictivity, including fully self-supervised visual systems with no specialized architectures, but that no networks yet capture all the signatures of the data. The fifth and final talk will conclude with a synthesis of the presented work to stimulate discussion among the speakers and audience. Speakers will be allotted 17 minutes for their talks followed by 5 minutes of question time. We will end with a 10-minute general discussion period. Bringing these speakers together will yield fruitful discussions on the current challenges and future directions for bridging developmental and computational approaches to investigating visual object learning.

Presentations

Parallel development of cortical regions that support higher-level vision and cognition

Heather L Kosakowski¹; ¹Harvard University

After birth, infants’ brains must parse large amounts of sensory input into meaningful signals. Traditional bottom-up, serial models of cortical development suggest that the statistical regularities in sensory input guide development of high-level visual categories. My work shows that infants have face-selective responses in the fusiform face area, scene-selective responses in the parahippocampal place area, and body-selective response in the extrastriate body area. Thus, under a bottom-up, serial account, 2- to 9-months of visual experience must be sufficient to develop category-selective regions. However, behavioral evidence that infants discriminate complex visual features within days of birth and use abstract knowledge to guide where they look poses a substantial problem for the bottom-up view. For example, shortly after birth infants discriminate familiar faces from visually similar unfamiliar faces and choose which faces to spend more time looking at. Consistent with these results, my recent work indicates that 2- to 4-month-old infants have face-selective responses in superior temporal sulcus and medial prefrontal cortex, regions that support social-emotional cognition in adults. Taken together, a parallel model of cortical development provides a better explanation of these data than traditional bottom-up serial models.

Topographic constraints on visual development

Michael J Arcaro¹; ¹University of Pennsylvania

Primates are remarkably good at recognizing faces and objects in their visual environment, even after just a brief glimpse. How do we develop the neural circuitry that supports such robust perception? The anatomical consistency and apparent modularity of face and object processing regions indicate that intrinsic constraints play an important role in the formation of these brain regions. Yet, the neonate visual system is limited and develops throughout childhood. Here, I will discuss work on the development of the human and nonhuman primate visual systems. These studies demonstrate that regions specialized in the processing of faces are the result of experience acting on an intrinsic but malleable neural architecture. Subcortical and cortical topographic connectivity play a fundamental role, providing an early scaffolding that guides experience-driven modifications. Within the visual system, this connectivity reflects an extensive topographic organization of visual space and shape features that are present at birth. During postnatal development, this proto-architecture is modified by daily experience to become selectively responsive to frequently encountered elements of the environment. Anatomical localization is governed by correspondences between maps of low-level feature selectivity and where in visual space these features are typically viewed. Thus, rather than constituting rigidly pre-specified modules, face and object processing regions instead reflect an architecture that builds on topographic scaffolds to learn and adapt to the regularities of our visual environment.

Using deep neural networks to test possible origins of human face perception

Katharina Dobs¹; ¹Justus-Liebig University Giessen

Human face recognition is highly accurate, and exhibits a number of distinctive and well documented behavioral and neural “signatures” such as the face-inversion effect, the other-race effect and neural specialization for faces. How does the remarkable human ability of face recognition arise in development? Is experience with faces required, and if so, what kind of experience? We cannot straightforwardly manipulate visual experience during development in humans, but we can ask what is possible in machines. Here, I will present our work testing whether convolutional neural networks (CNNs) optimized on different tasks with varying visual experience capture key aspects of human face perception. We find that only face-trained – not object-trained or untrained – CNNs achieved human-level performance on face recognition and exhibited behavioral signatures of human face perception. Moreover, these signatures emerged only in CNNs trained for face identification, not in CNNs that were matched in the amount of face experience but trained on a face detection task. Critically, similar to human visual cortex, CNNs trained on both face and object recognition spontaneously segregated themselves into distinct subsystems for each. These results indicate that humanlike face perception abilities and neural characteristics emerge in machines and could in principle arise in humans (through development or evolution or both) after extensive training on real-world face recognition without face-specific predispositions, but that experience with objects alone is not sufficient. I will conclude by discussing how this computational approach offers novel ways to illuminate how and why visual recognition works the way it does.

Leveraging deep neural networks for learnability arguments

Talia Konkle¹, Colin Connell¹, Jacob Prince¹, George Alvarez¹; ¹Harvard University

Deep neural network models are powerful visual representation learners – transforming natural image input into usefully formatted latent spaces. As such, these models give us new inferential purchase on arguments about what is learnable from the experienced visual input, given the inductive biases of different architectural connections, and the pressures of different task objectives. I will present our current efforts to collect the models of the machine learning community for opportunistic controlled-rearing experiments, comparing hundreds of models to human brain responses to thousands of images using billions of regressions. Surprisingly, we find many models have a similar capacity for brain predictivity – including fully self-supervised visual systems with no specialized architectures, that learn only from the structure in the visual input. As such, these results provide computational plausibility for an origin story in which domain-general experience-dependent learning mechanisms guide visual representation, without requiring specialized architectures or domain-specialized category learning mechanisms. At the same time, no models capture all the signatures of the data, inviting testable speculation for what is missing – specified in terms of architectural inductive biases, functional objectives, and distributions of visual experience. As such, this empirical-computational enterprise brings exciting new leverage into the origins underlying our ability to recognize objects in the world.

Bridging visual developmental neuroscience and deep learning: challenges and future directions

Marieke Mur¹; ¹Western University

I will synthesize the work presented in this symposium and provide an outlook for the steps ahead in bridging visual developmental neuroscience and deep learning. I will first paint a picture of the emerging understanding of how categorical object representations in visual cortex arise over the course of development. The answer to this question can be considered to lie on a continuum, with one extreme suggesting that we are born with category-selective cortical modules, and the other extreme suggesting that categorical object representations in visual cortex arise from the structure of visual experience alone. Emerging evidence from both experimental and computational work suggests that the answer lies in between: categorical object representations may arise from an interplay between visual experience and constraints imposed by behavioral pressures as well as inductive biases built into our visual system. This interplay may yield the categorical object representations we see in adults, which emphasize natural categories of ecological relevance such as faces and animals. Deep learning provides a powerful computational framework for putting this hypothesis to the test. For example, unsupervised learning objectives may provide an upper bound on what can be learnt from the structure of visual experience alone. Furthermore, within the deep learning framework, we can selectively turn on constraints during the learning process and examine effects on the learnt object representations. I will end by highlighting challenges and opportunities in realizing the full potential of deep learning as a modeling framework for the development of categorical object representations.

< Back to 2023 Symposia

The Active Fovea

< Back to 2023 Symposia

Symposium: Friday, May 19, 2023, 12:00 – 2:00 pm, Talk Room 2

Organizers: Martina Poletti¹, Martin Rolfs², Jude Mitchell¹; ¹University of Rochester, ²Humboldt-Universität
Presenters: Wolf Harmening, Martina Poletti, Hamutal Slovin, Lisa Kroell, Shanna Coop, Tong Zhang

It is well established that vision is an active process at the macroscopic scale; humans relocate the center of gaze to actively sample the visual scene. On the other hand, foveal vision is often regarded as passive: at each fixation the visual system simply receives a high-resolution snapshot of different portions of the visual scene. In this symposium we will survey converging evidence demonstrating that even during brief fixation periods vision is an active process and oculomotor behavior is an integral part of foveal processing at multiple stages starting from the retina. The research featured in this symposium uses cutting-edge technologies and experimental techniques to examine the sophistication of foveal vision and its underlying neural mechanisms. We will start with Wolf Harmening who uses high-resolution retinal imaging and psychophysics to show that the seemingly erratic drift of the eye during fixation is modulated in a way that is optimal for visual acuity, bringing high-acuity stimuli toward regions of higher cone density in the central fovea. Second, Martina Poletti with high-precision eye tracking and a gaze-contingent display system, will show how vision and attention are modulated across the foveola and will discuss the contribution of fixational eye movements and attention to fine spatial vision. Third, we will move from the initial stages of foveal processing to examine how activity in foveal and extrafoveal V1 is modulated by microsaccades. Using voltage-sensitive dye imaging (VSDI) to measure the spatio-temporal patterns of activation of neural populations in V1, Hamutal Slovin will showcase work on how extraretinal signals accompanying microsaccades shape neural activity and how this can aid visual stabilization during fixation. Notably, foveal vision is not only active as a result of fine oculomotor behavior and it is not only influenced by stimuli at the center of gaze. Fourth, based on a dynamic noise paradigm and reverse correlations, Lisa Kroell will present psychophysical evidence that foveal vision anticipates visual information that is available at the target of the next saccade, facilitating perceptual continuity across large-scale eye movements. Fifth, by using gaze-contingent eye tracking in combination with neural recording, Shanna Coop will inspect the neural counterpart of such foveal predictions, and show how MT neural activity in the fovea is tuned to the saccade target characteristics as a result of pre-saccadic attention. Finally, using gaze contingent manipulation of the visual input while recording from the Superior Colliculus, Tong Zhang will present findings showing how its foveal representation is modulated by the appearance of the saccadic target allowing for peripheral-to-foveal transfer of visual information across saccades. In sum, this symposium proposes a new view according to which, (a) vision across the foveola is not homogenous and represents a microcosm of its own, (b) fine oculomotor behavior is influenced by foveal anatomy, (c) peripheral stimulation informs foveal vision and can influence the way stimuli at the center of gaze are perceived, and (d) oculomotor behavior both at the micro and macroscopic scale shapes foveal vision and neural activity at different stages of visual processing.

Presentations

Non-random fixational drift and sub-cone resolution in the human fovea

Wolf Harmening¹, Jenny Witten¹; ¹University of Bonn, Department of Ophthalmology, Ernst-Abbe-Str. 2, 53127 Bonn, Germany

When we fixate small visual objects, incessant fixational eye movements translate tens to hundreds of foveal photoreceptors across the retinal image. Perhaps counter intuition, this constant visual jitter does not harm visual performance but in fact improves resolution. With simultaneous adaptive optics foveal cone-resolved imaging and micro-psychophysics, we here studied the direct relationship between visual resolution, photoreceptor topography in the central fovea, and fixational drift in a number of healthy eyes. Across subjects, we find that visual resolution was mainly governed by the photoreceptor sampling capacity of the individual eye. Resolution was highly correlated between fellow eyes, with the dominant eye performing better. When ocular aberrations were removed, resolution acuity was below the Nyquist sampling limit in all eyes, an effect that can in part be attributed to the spatiotemporal information produced by drift. We found that fixational drift showed a directional component that optimized retinal sampling from lower to higher cone density areas, an observation challenging the view that drift is primarily a result of random motor jitter.

The nonhomogeneous foveola and the need for active vision at this scale

Martina Poletti¹, Ashley Clark¹, Sanjana Kapisthalam¹, Yue Zhang¹; ¹University of Rochester

We will review evidence showing that vision is an active process even at its finest scale in the foveola. The need for an active foveola does not only originate from the fact that the visual system is primarily sensitive to changes and absence of retinal motion impairs visual perception in the fovea, but it also originates from the non-uniformity of fine spatial vision across the foveola. Using high-precision eye-tracking and a system for gaze-contingent display capable of localizing the line of sight with arcminute precision, we first demonstrate that visual crowding does not affect the whole foveola equally, as a result, under normal viewing conditions with crowded foveal stimuli, acuity drops considerably even a few arcminutes away from the preferred locus of fixation. We then illustrate the mechanisms through which active vision at this scale is achieved and its benefits. In particular, we show that ocular drift, the incessant jitter of the eye, enhances fine spatial vision to the point that acuity can be directly predicted from this oculomotor behavior. We show that microsaccades, saccades smaller than half degree, are actively used by the visuomotor system to explore the foveal stimulus and are driven by visual saliency and relevance effectively implementing strategies for visual search at this tiny scale. Finally, we discuss the interplay of attention and microsaccades in the foveola. The benefits of microsaccades also come from the ultra-fine resolution of pre-microsaccadic attention, which leads to highly localized perceptual enhancements around the goal location of microsaccades.

A two-phase extra-retinal input into monkey’s V1: the effect of fixational saccades on population responses

Hamutal Slovin¹, Nativ Yarden¹, Bouhnik Tomer¹; ¹The Leslie and Gonda (Goldschmied) Multidisciplinary Brain Res. Ctr., Bar-Ilan Univ., Ramat Gan, Israel

During natural viewing the eyes scan the visual scene, leading to a continuous image motion over the retina. Yet, even during fixation periods, miniature fast eye movements (EM) known as microsaccades (MSs) displace the image across the fovea. Despite this constant image shift, our visual perception of the world is stable, suggesting the existence of an extra-retinal input to the visual cortex that can correct for the image motion and produce perceptual stability. Here we investigated the existence of an extra-retinal input into the primary visual cortex (V1) of fixating monkeys during MSs. We used voltage-sensitive dye imaging (VSDI) to measure the spatio-temporal patterns of neural population in V1 aligned on MSs onset, in the absence or presence of a visual stimulus. VSDI enables to measure the population response at a high spatial (meso-scale) and temporal (ms) resolution. Interestingly, in the absence of a visual stimulus, the VSD signal showed that MSs induced a spatio-temporal modulation in V1, comprised of two phases: an early suppression followed by an enhancement of the neural response. Interestingly, this modulation exhibited a non-homogenous pattern: foveal regions showed mainly the enhancement transient, whereas more parafoveal regions showed a suppression that was followed by a delayed enhanced neural activation. Neural synchronization increased during this modulation. We then compared the MSs modulation in the presence and absence of visual stimulus within stimulated and unstimulated sites at the imaged cortical area. Our results reveal a distinct extra-retinal source that can be involved in visual and perceptual stabilization.

Foveal vision anticipates defining features of eye movement targets: converging evidence from human psychophysics

Lisa Kroell^1-2, Martin Rolfs^1,2,3,4; ¹Department of Psychology, Humboldt-Universität zu Berlin, Germany, ²Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Germany, ³Exzellenzcluster Science of Intelligence, Technische Universität Berlin, Germany, ⁴Bernstein Center for Computational Neuroscience Berlin, Germany

Perceptual modulations at the target of an impending, large-scale eye movement (saccade) have been studied extensively. Until recently, however, little was known about the concurrent development of visual perception in the pre-saccadic center of gaze. Based on converging evidence from several investigations, we suggest that pre-saccadic foveal vision operates predictively: defining features of a saccade target are enhanced at the pre-saccadic fixation location. Four main findings support this idea: First, using a dynamic large-field noise paradigm, we observed higher Hit Rates for foveal probes with target-congruent orientation and a sensitization to incidental, target-like orientation information in foveally presented noise. Second, by densely sampling the (para-)foveal space, we demonstrate that enhancement is confined to the center of gaze and its immediate vicinity. Moreover, taking single-trial saccade landing errors into account revealed that enhancement is aligned to the fovea, not to the future retinal (predictively remapped) location of the saccade target. Third, foveal enhancement during saccade preparation emerges faster and is more pronounced than enhancement during passive fixation. Lastly, the foveally predicted signal relies on instantaneous peripheral input: as the eccentricity of the saccade target increases (i.e., as its pre-saccadic resolution decreases), the foveal orientation prediction manifests in a progressively lower spatial frequency range. Our findings suggest that, right before a saccade, peripheral information is available for foveal processing, possibly via feedback connections to foveal retinotopic cortex. By commencing foveal target processing before the movement is executed, this mechanism enables a seamless transition once the center of gaze reaches the target.

Enhanced feature tuning for saccade targets in foveal but not peripheral visual neurons

Shanna Coop¹, Jacob Yates², Jude Mitchell³; ¹Neurobiology, Stanford University, USA, ²Department of Biology, University of Maryland College Park, USA, ³Brain and Cognitive Sciences, University of Rochester, USA

Each saccadic eye movement brings peripheral targets to the fovea for inspection. Before each saccade, visual neurons with peripheral receptive fields overlapping the target show enhancements in firing. Previously, we examined neural tuning of peripheral MT neurons during this pre-saccadic attention. We found gain enhancements that were uniform across motion direction consistent with neural studies of covert attention. However, pre-saccadic attention is also thought to involve feature-specific perceptual enhancements concentrated around the saccade target’s features (Li, Barbot, & Carrasco, 2016; Ohl, Kuper, & Rolfs, 2017). Here we examined if feature-specific enhancements might occur in foveal representations where the target is anticipated. We recorded from MT neurons with foveal receptive fields as marmoset monkeys performed a saccade to one of three equally eccentric motion dot fields. During saccade flight we manipulated the motion direction of the saccade target to test if post-saccadic responses were biased towards its predicted motion. In “predictive” trials the stimulus was unchanged during the saccade while in “unexpected” trials we swapped its motion for an orthogonal direction (+/- 90 degrees). If foveal representations exhibited feature-specific enhancements for the target then we would expect enhanced tuning for the predicted targets. We find that for predicted trials MT neurons increase their response around the preferred motion direction while suppressing non-preferred directions. These findings show that saccades can produce feature-specific enhancements in post-saccadic foveal processing that favor processing for the predicted target. This mechanism could support continuity of target selection across the saccades.

From the fovea to the periphery and back: mechanisms of trans-saccadic visual information transfer in the superior colliculus

Tong Zhang^1-2, Ziad Hafed^1-2; ¹Werner Reichardt Center for Integrative Neuroscience, University of Tübingen, Tübingen, Germany 72076, ²Hertie Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany 72076

The superior colliculus (SC) possesses visual machinery supporting both foveal analysis and peripheral object detection. This structure also emits movement-related discharge that is relayed to both the downstream oculomotor control network and upstream cortical areas. This places the SC in an ideal position to both control orienting responses as well as bridge periods of sensory uncertainty associated with rapid eyeball rotations. Yet, the mechanisms with which foveal state influences peripheral visual sensitivity, or with which peripheral visual information is trans-saccadically relayed to foveal SC visual representations, are not fully understood. Here we will first describe how foveal SC state can have a strong impact on peripheral SC visual sensitivity. Using real-time gaze-contingent image control of instantaneous foveal eye position error, we will demonstrate how a foveal error of only 1-2 min arc is sufficient to not only drive microsaccades, but also strongly influence peripheral SC visual responses and orienting response efficiency. We will then show how SC movement-related discharge is itself not a pure neuronal movement command, but instead represents the sensory state of the periphery at the time of saccades. Thus, SC “motor” bursts not only represent where gaze will shift towards, but they also provide a peripheral preview of the visual appearance of saccade targets. Once these targets are foveated, our final set of results will demonstrate how foveal SC visual representations are predictively sensitive to pre-saccadic peripheral target appearance. Thus, the SC encompasses a full active vision loop, from the fovea to the periphery and back.

< Back to 2023 Symposia

How does the brain combine generative models and direct discriminative computations for visual inference?

< Back to 2023 Symposia

Symposium: Friday, May 19, 2023, 12:00 – 2:00 pm, Talk Room 2

Organizers: Benjamin Peters¹, Nikolaus Kriegeskorte¹; ¹Columbia University
Presenters: Benjamin Peters, Ralf Haefner, Divya Subramanian, Doris Tsao, Thomas Naselaris

How might the brain combine generative models and direct discriminative computations for visual inference? Using primate neurophysiology, human neuroimaging, psychophysics, and computational modeling, the speakers of this symposium will address the questions from different angles. Benjamin Peters will introduce the main question and discuss the importance of scaling visual tasks to real-world complexity. Ralf Haefner will talk about the role of feedback in a hierarchical generative model of the visual system. Divya Subramanian will show how generative and discriminative models can support stability across saccades. Doris Tsao will report evidence of how ambiguous, noisy, and internally generated images evoked by electrical stimulation, drug-induced hallucinations, and dreams are processed in the macaque face patch system. Thomas Naselaris will conclude the panel by discussing how mental imagery might be integral, irrelevant, or even obstructive to visual inference. Each 15-min talk will be followed by a five-minute Q&A session. The talks will be followed by a 20-minute panel discussion about the kind of experiments and data we need to discern whether and how vision combines generative models and discriminative computations. The topic is particularly timely since the recent surge in deep generative and discriminative models in machine learning has led to an enriched explanatory framework for understanding primate vision that integrates established vision theories with progress in statistical inference and engineering. The advent of high-density neural recording techniques and large-scale computational models creates unprecedented opportunities for making substantial progress over the coming years. The symposium aims to develop a unified perspective on the two conceptions of vision, inspire novel hybrid models, and establish agreement on what evidence we should pursue experimentally. The symposium will benefit junior and senior members of the vision science community alike. Speakers will present cutting-edge research relevant to a broad audience while introducing the debate didactically and touching upon deeper questions of models and explanations in vision science.

Presentations

Naturalistic primate vision combines generative and discriminative computations

Benjamin Peters¹; ¹Columbia University

Biological and machine vision needs to be statistically and computationally efficient, enabling both robust and rapid perception from sensory evidence. The normative ideal, as embodied by the generative approach, interprets evidence optimally in the context of a statistical model of the world. This ideal is computationally expensive or even intractable for naturalistic and dynamic vision. The discriminative approach emphasizes rapid inference without the use of an explicit generative model. I will introduce generative and discriminative models as an explanatory framework characterizing visual inference at different levels of analysis. Models of primate vision can be understood as points in a vast space of possible inference models that contains classical algorithms like predictive coding, the feedforward cascade of filters in a convolutional neural network, along with more novel concepts from engineering such as amortized inference. I will clarify what kind of evidence is required to discern whether primate visual inference is generative. A key insight is that “more is different”: what seems viable as a visual inference algorithm for an abstracted toy task may not scale to naturalistic real-world vision. We should therefore scale our tasks to be more naturalistic and dynamic, exposing the brain’s unique combination of generative models and discriminative computations.

Behavioral and neural evidence that the visual system performs approximate inference in a hierarchical generative model

Ralf Haefner¹; ¹University of Rochester

Whether visual processing in cortex is best modeled as Bayesian inference using a generative model, or a discriminative model, is an important open question (DiCarlo et al. 2021, CCN/GAC). A critical clue to answering this question lies in the functional role of the ubiquitous feedback connections in cortex (Felleman & Van Essen 1991). Inference in a hierarchical generative framework suggests that their role is to communicate top-down expectations (Lee & Mumford 2003). I will present recent behavioral and neurophysiological results that are compatible with this hypothesis; results that are difficult to explain in the context of alternative hypotheses about the role of feedback signals, attention or learning. Our behavioral results in the context of a classic discrimination task strongly suggest that expectations are communicated from decision-related areas to sensory areas on a time scale of 10s to 100s of milliseconds. In the context of classic evidence integration tasks, this feedback leads to a perceptual confirmation bias that is measurable as both a primacy effect (Lange et al. 2021) and overconfidence (Chattoraj et al. 2021). Importantly, the strength of this bias depends on the nature of the sensory inputs in a way that is predicted by approximate hierarchical inference, and that can explain a range of seemingly contradictory findings about the nature of temporal biases. Finally, I will present empirical evidence for a surprising neural signature of feedback-related expectation signals, namely that they induce information-limiting correlations between sensory neurons, again as predicted from approximate hierarchical inference (Lange et al. 2022).

Bayesian and Discriminative Models for Visual Stability across Saccades

Divya Subramanian^1,2, John M. Pearson¹, Marc A. Sommer¹; ¹Duke University, ²National Institutes of Health (NIH)

The brain interprets sensory inputs to guide behavior, but behavior itself disrupts sensory inputs. Perceiving a coherent world while acting in it constitutes active perception. For example, saccades constantly displace the retinal image and yet, we perceive a stable world. The visual system must compare the predicted sensory consequence of each saccade with the incoming sensory input to judge whether a mismatch occurred. This process is vulnerable to sensory uncertainty from two potential sources: external noise in the world (“image noise”) and internal uncertainty due to one’s own movements (“motor-driven noise”). Since Bayesian models have been influential in explaining how priors can compensate for sensory uncertainty, we tested whether they are used for visual stability across saccades. We found that non-human primates (2 rhesus macaques) used priors to compensate for internal, motor-driven noise in a Bayesian manner. For external, image noise, however, they were anti-Bayesian. Instead, they likely used a discriminative strategy, suggesting that vision across saccades is governed by both Bayesian and discriminative strategies. Next, we tested whether single neurons in the Frontal Eye Field (FEF), which receives both internal saccade-related and external visual information, support either the Bayesian or anti-Bayesian (discriminative) strategies. We found that FEF neurons predict the anti-Bayesian, but not Bayesian, behavior. Taken together, the results demonstrate that Bayesian and discriminative computations for visual stability are dissociable at the behavioral and neural levels and situate FEF along a pathway that selectively supports the discriminative contribution.

Probing for the existence of a generative model in the macaque face patch system

Doris Tsao^1,2; ¹UC Berkeley, ²Howard Hughes Medical Center

The idea that the brain contains a generative model of reality is highly attractive, explaining both how a perceptual system can converge on the correct interpretation of a scene through an iterative generate-and-compare process, and how it can learn to represent the world in a self-supervised way. Moreover, if consciousness corresponds directly to top-down generated contents, this would elegantly explain the mystery of why our perception of ambiguous images is always consistent across all levels. However, experimental evidence for the existence of a generative model in the visual system remains lacking. I will discuss efforts by my lab to fill this gap through experiments in the macaque face patch system, a set of regions in inferotemporal cortex dedicated to processing faces. This system is strongly connected in both feedforward and feedback directions, providing an ideal testbed to probe for the existence of a generative model. Our experiments leverage simultaneous recordings from multiple face patches with high channel-count Neuropixels probes to address representation in three realms: (1) representation of ambiguous images, (2) representation of noisy/degraded images, (3) representation of internally generated images evoked by electrical stimulation, drug-induced hallucinations, and dreams. In each case, we ask: is the content and dynamics of representation across the face patch network consistent with a generative model of reality?

Why is the human visual system generative?

Thomas Naselaris¹; ¹University of Minnesota

The human visual system is obviously generative: most humans can and do generate imagery in the absence of retinal stimulation, and the internal generation of imagery clearly engages the entire visual cortex. However, we know very little about what the brain does with its ability to generate images. We will consider the hypothesis that the ability of visual cortex to generate imagery is a consequence of housing a generative model of the world that is needed to see. We will present empirical evidence from mental imagery experiments that imagery and vision rely upon the same generative model to make inferences that are conditioned on unseen and seen data, respectively. We will then consider evidence for the alternate hypothesis that generativity is not for seeing, and may even obstruct seeing. According to this hypothesis, non-visual systems may route their process-specific variables through the visual cortex to non-visual solve tasks. These extra-visual inputs may evoke visual imagery, and may even use visual imagery, while contributing nothing to or even obstructing seeing. We review evidence from the animal literature that appears to support this view, and propose several novel experiments to adjudicate between the two hypotheses.

< Back to 2023 Symposia

Announcements

VSS recognizes its founders, Ken Nakayama and Thomas Sanocki for the 25th Anniversary.

VSS honors Hoover Chan with 25th Anniversary Lifetime Service Award

Information for International Travelers is now available. You can request a Letter of Invite from your MyVSS account.

Thank you to our 2025 Sponsors and Exhibitors.

Vote in the 2025 Board of Directors Election.

Tatiana Pasternak receives the 25th Anniversary Lifetime Achievement Award

Leyla Isik is awarded the 2025 Elsevier/VSS Young Investigator Award.

Jody Culham is awarded the 2025 Davida Teller Award.

J. Anthony Movshon is awarded the 2025 Ken Nakayama Medal for Excellence in Vision Science.