VSS, May 13-18
Talk 1, 2:30 pm, 34.21
Orientation selectivity in human V1 revisited
Zvi N. Roth1, Kendrick Kay2, Elisha P. Merriam1; 1Laboratory of Brain and Cognition, National Institute of Mental Health, NIH, Bethesda, MD, USA, 2Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota, Minneapolis, MN, USA
Orientation selectivity in primate visual cortex is organized in cortical columns that form pinwheel motifs along the cortical surface. While cortical columns are at a finer spatial scale than the sampling resolution of standard BOLD fMRI measurements, numerous fMRI analysis approaches have been proposed to peer past these spatial resolution limitations of the technique. It was recently reported that these methods are predominantly sensitive to an interaction of the oriented stimulus with the aperture edge—an effect called vignetting (Roth et al., 2018). Beyond vignetting, it is not clear whether, and to what degree, orientation-selective neural responses contribute to BOLD measurements. As a result, the field is at an impasse. Here, we leverage a large dataset of visual cortical responses, measured using high-field 7T fMRI, and preprocessed using state-of-the-art methods (Allen et al., 2021). We fit these responses using two image-computable models based on the steerable pyramid (Simoncelli et al., 1992). The constrained model, which includes both location tuning and spatial frequency tuning, but pools across orientation-selective filters, is sensitive to the effects of stimulus vignetting. The full model, which includes orientation tuning, is sensitive to any additional orientation-selective responses beyond the effects of stimulus vignetting. Using these two models, we compensate for vignetting but nonetheless find clear evidence for reliable tuning for orientation at the macroscopic scale probed by fMRI. Our results further reveal a striking widespread map of orientation preference. This map possesses signatures of both a radial bias and a cardinal bias, and may constitute the neural basis for perceptual orientation anisotropies. Taken together, our findings settle a long-standing debate in human neuroimaging, and lay the groundwork for a deeper understanding of stimulus feature encoding and organizational principles in human visual cortex.
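The comparison between the constrained and full models can be illustrated with a minimal sketch on synthetic data. This is not the authors' steerable-pyramid pipeline: the filter energies, the tuning vector, the ordinary least-squares fit, and all names (`fit_predict`, `r_squared`, `energy`) are assumptions made for illustration. The constrained model sees only energy summed across orientation channels, while the full model gets one weight per channel, so any held-out gain of the full model reflects orientation-selective signal.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept; returns predictions for X_test."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    A_test = np.column_stack([X_test, np.ones(len(X_test))])
    weights, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ weights

def r_squared(y, y_hat):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_images, n_orient = 400, 8

# Synthetic stand-in for orientation-channel energies at one location and
# spatial frequency (hypothetical values, not real filter outputs).
energy = rng.gamma(shape=2.0, scale=1.0, size=(n_images, n_orient))

# Synthetic voxel whose response weights the channels unequally (orientation tuning).
tuning = np.array([3.0, 2.0, 1.0, 0.5, 0.5, 0.5, 1.0, 2.0])
response = energy @ tuning + rng.normal(scale=1.0, size=n_images)

train, test = slice(0, 300), slice(300, None)

# Constrained model: energy pooled across orientation channels.
pooled = energy.sum(axis=1, keepdims=True)
r2_constrained = r_squared(
    response[test], fit_predict(pooled[train], response[train], pooled[test])
)

# Full model: one weight per orientation channel.
r2_full = r_squared(
    response[test], fit_predict(energy[train], response[train], energy[test])
)

print(f"constrained R^2 = {r2_constrained:.2f}, full R^2 = {r2_full:.2f}")
```

For a simulated voxel with flat tuning (all entries of `tuning` equal), the two models would be expected to perform about equally on held-out images, which is the logic of using the constrained model as a baseline.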
Acknowledgements: This research was supported by the Intramural Research Program of the NIH (ZIA-MH002966), under National Institute of Mental Health Clinical Study Protocol 93-M-1070 (NCT00001360). Collection of the NSD dataset was supported by NSF IIS-1822683 and NSF IIS-1822929.
Talk 2, 2:45 pm, 34.22
Snakes fade slower than gratings: perceptual correlates of neuroimaging support normalization by orientation variance
Recent work using functional magnetic resonance imaging (fMRI) observed stronger blood-oxygen-level-dependent (BOLD) signals in visual cortex when observers were shown curved band-passed contours (snakes) compared to straight and parallel band-passed contours (gratings). To account for these patterns of neural activation, Fang et al (2021) propose a divisive normalization model in which the responses of visual cortical neurons are scaled by the variance in contrast across orientation channels, rather than by the local stimulus contrast. These fMRI findings and the proposed normalization model both predict that snakes and gratings should differ perceptually, and that these differences should be strongest at suprathreshold contrasts. To test these predictions, we performed psychophysical experiments in human participants with suprathreshold stimuli (n=5, 2300 trials). Observers reported which of two briefly flashed stimuli had higher contrast (contrast discrimination), which of two superimposed patterns appeared to be in front (monocular rivalry), and the time it took for a fixated pattern to begin fading (perceptual fading). Contrast discrimination thresholds and monocular rivalry were both biased toward snakes compared to gratings (p<0.001). Further, perceptual fading of these stimuli was proportional to the strength of fMRI (r=0.82, p<0.01) and model (r=0.91, p<0.001) responses. Specifically, during prolonged fixation, gratings faded faster than snakes from perception, and these differences were strongest at high contrasts and high densities. Taken together, these results demonstrate a new class of perceptual biases, validate the normalization model proposed by Fang et al (2021), and link BOLD signals to perception.
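One possible reading of normalization by orientation variance is sketched below. The exact functional form used by Fang et al (2021) is not given in this abstract, so the formula (summed energy divided by a constant plus the variance of energy across orientation channels), the constant `sigma`, and the eight-channel stimuli are all assumptions for illustration only.

```python
import numpy as np

def normalized_response(channel_energy, sigma=0.1):
    """Summed contrast energy divided by (sigma + variance across orientation channels).

    Illustrative form only; not the published model equation.
    """
    energy = np.asarray(channel_energy, dtype=float)
    drive = energy.sum()
    orientation_variance = energy.var()
    return drive / (sigma + orientation_variance)

# Same total energy, distributed differently across 8 hypothetical orientation channels.
grating = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # one orientation
snake = np.full(8, 1.0 / 8)                                    # many orientations

print(normalized_response(grating), normalized_response(snake))
```

Under this assumed form, a grating concentrates energy in one channel, which produces high variance across channels and stronger suppression, whereas a snake spreads energy across channels and is suppressed less.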
Acknowledgements: This research was supported by the DFG (IRTG-1901: ‘The Brain in Action’, SFB-TRR-135: ‘Cardinal Mechanisms of Perception’), and an ERC Consolidator Award (ERC-2015-CoG-682859: ‘SHAPE’). Author KRS was supported by an Alexander von Humboldt fellowship.
Talk 3, 3:00 pm, 34.23
Linking cortical magnification in human primary visual cortex with contrast sensitivity
Marc Himmelberg1, Jonathan Winawer1, Marisa Carrasco1; 1New York University
Goal. A central question in neuroscience is how the organisation of cortical maps relates to perception, for which primary visual cortex (V1) is an ideal model system. V1 nonuniformly samples the retinal image, with greater cortical magnification (mm² of cortex per deg² of visual field) at the fovea than in the periphery, and at the horizontal than the vertical meridian. V1 size and cortical magnification vary greatly across individuals; this variation should have important consequences for visual perception. Methods. We combined fMRI with psychophysics to relate individual differences in the organisation of V1 to contrast sensitivity at the four polar angle meridians. In 29 observers, we employed an orientation discrimination task to measure contrast sensitivity at the four cardinal meridians and, in the same observers, used fMRI to measure their V1 maps. We calculated overall V1 size, and the amount of V1 surface area dedicated to processing the same angular locations as the contrast measurements. Across observers, we correlated contrast sensitivity with V1 measurements. Results. First, contrast sensitivity (averaged across locations) was positively correlated with the size of V1 (r=0.47); observers with higher contrast sensitivity had a larger V1. Second, contrast sensitivity was positively correlated with the amount of surface area dedicated to the corresponding meridian (r=0.60); higher contrast sensitivity was associated with greater dedicated local surface area. Third, we computed the horizontal-vertical meridian asymmetry for contrast sensitivity and V1 surface area. The two measures were correlated (r=0.60); a stronger horizontal-vertical asymmetry in contrast sensitivity corresponded to a stronger horizontal-vertical asymmetry in the distribution of V1 surface. Conclusions. These data reveal that individual differences in contrast sensitivity are linked to individual differences in V1 surface area at global and local scales, and more broadly, show that differences in visual perception are rooted in the organisation of V1.
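The third analysis (correlating horizontal-vertical asymmetries across observers) can be sketched as follows. The asymmetry index used here (difference divided by the mean, in percent) is an assumed definition, not one stated in the abstract, and the 29 "observers" are synthetic values generated only so the code runs.

```python
import numpy as np

def asymmetry_index(horizontal, vertical):
    """Percent difference between horizontal and vertical values, relative to their mean.

    Assumed definition for illustration; the abstract does not specify the formula.
    """
    horizontal = np.asarray(horizontal, dtype=float)
    vertical = np.asarray(vertical, dtype=float)
    return 100.0 * (horizontal - vertical) / ((horizontal + vertical) / 2.0)

def pearson_r(x, y):
    """Pearson correlation between two per-observer measures."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic values for 29 hypothetical observers (not the study's data).
rng = np.random.default_rng(1)
n_observers = 29
shared = rng.normal(size=n_observers)  # latent factor shared by both measures

sens_h = 40.0 + 5.0 * shared + rng.normal(scale=2.0, size=n_observers)
sens_v = 30.0 + rng.normal(scale=2.0, size=n_observers)
area_h = 120.0 + 15.0 * shared + rng.normal(scale=6.0, size=n_observers)
area_v = 90.0 + rng.normal(scale=6.0, size=n_observers)

hva_sensitivity = asymmetry_index(sens_h, sens_v)
hva_surface = asymmetry_index(area_h, area_v)
r_hva = pearson_r(hva_sensitivity, hva_surface)
print(f"correlation between asymmetries: r = {r_hva:.2f}")
```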
Acknowledgements: NIH R01-EY027401 to MC and JW
Talk 4, 3:15 pm, 34.24
Numerosity selective responses elicited from viewing of natural images
Shir Hofstetter1, Serge Dumoulin1,2,3; 1Spinoza Center for Neuroimaging, Amsterdam, The Netherlands, 2Utrecht University, The Netherlands, 3VU University Amsterdam
Numerosity (the set size of items in a group) is essential for behavior and decision making. Neurons selectively tuned to numerosity have been found in animals and humans. In humans, these neurons have been shown to be organized in a network of topographic maps. Since many visual features (e.g., circumference, area) change with numerosity, numerosity studies usually use simple and well-controlled stimuli (e.g., dots of similar size). Here we challenge the ecological validity of these stimuli and ask whether the numerosity-tuned neural populations within the numerosity maps also respond to the numerosity of items present in natural images. Seven participants were scanned in a 7T MRI scanner while they viewed 6 types of stimuli presented in a randomised block design: (1) natural images with 1-3 main objects; (2) natural images with many objects (mean = 19.42, SD = 8.8); (3) natural images of scenery (vague numerosity); (4) 1-3 dots; (5) 20 dots; (6) 10-42 dots. Participants were asked to respond when the same image was presented repeatedly (1-back task). No numerosity judgment was required. Numerosity maps had previously been acquired for all participants and were used here to localize the neural populations within the maps that are tuned to numerosities of 1-3. We compared the responses of these populations to low vs. high numerosities as presented in the natural images and dots conditions. We find significantly higher responses to low vs. high numerosities in the 5 maps covering the occipito-temporal and parietal lobes (p<0.05, Wilcoxon signed rank test, FDR corrected). Only the map in the frontal lobe did not show a significant response to the numerosity of objects in the natural images. Our results reinforce the role of tuned neural populations in numerosity perception, expand the ecological validity of numerosity studies, and thus advance our understanding of numerosity perception.
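The statistical test named above (Wilcoxon signed rank, FDR corrected) can be sketched for a small sample as follows. This is a generic illustration, not the authors' analysis code: the exact two-sided p-value is obtained by enumerating all sign assignments, the FDR step is assumed to be the Benjamini-Hochberg procedure (the abstract does not say which FDR method was used), and the difference scores and p-values are hypothetical.

```python
from itertools import product

import numpy as np

def signed_rank_exact_p(differences):
    """Exact two-sided Wilcoxon signed-rank p-value by enumerating sign flips.

    Zero differences are dropped; tied absolute values share their average rank.
    """
    d = np.asarray([x for x in differences if x != 0], dtype=float)
    n = d.size
    abs_d = np.abs(d)
    ranks = np.empty(n)
    ranks[np.argsort(abs_d)] = np.arange(1, n + 1)
    for value in np.unique(abs_d):
        tied = abs_d == value
        ranks[tied] = ranks[tied].mean()
    expected = ranks.sum() / 2.0
    observed_dev = abs(ranks[d > 0].sum() - expected)
    count = 0
    for signs in product((0, 1), repeat=n):
        w_plus = float(np.dot(signs, ranks))
        if abs(w_plus - expected) >= observed_dev - 1e-12:
            count += 1
    return count / 2 ** n

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity starting from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted

# Hypothetical low-minus-high response differences for 7 participants in one map.
differences = [0.5, 1.2, 0.3, 2.0, 0.9, 1.5, 0.7]
p_map = signed_rank_exact_p(differences)

# Hypothetical p-values for six maps, corrected across maps.
p_adjusted = benjamini_hochberg([0.016, 0.016, 0.031, 0.016, 0.047, 0.30])
print(p_map, p_adjusted)
```

With seven participants and no ties, the smallest achievable two-sided exact p-value is 2/2^7 = 0.015625, which occurs when all seven differences share the same sign.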
Acknowledgements: This work was supported by the Netherlands Organization for Scientific Research (016.Vici.185.050 to S.O.D.) and the Royal Netherlands Academy of Arts and Sciences (Ammodo award to S.O.D.).
Talk 5, 3:30 pm, 34.25
Data-driven component modeling reveals the functional organization of high-level visual cortex
Prior work has identified regions of high-level visual cortex selectively responsive to faces, places, bodies, and words. However, this largely hypothesis-driven work cannot reveal how prominent these category selectivities are in the overall functional organization of visual cortex, or what other unhypothesized selectivities exist. Further, standard voxel-wise tests cannot detect selective neural populations that coexist with functionally distinct populations within voxels. To overcome these limitations, we applied data-driven voxel decomposition analyses and generalized canonical correlation analysis to identify a robust set of canonical response profiles consistent across subjects in a recently released public data set of fMRI responses in eight participants to thousands of complex photographic stimuli (Allen et al., 2021). Because these analyses permit many degrees of freedom, our strategy is to freely explore only four of the participants, register our hypotheses, and test them on the held-out participants. To date, the first four participants reveal components in the ventral pathway clearly selective for people, scenes, and words, replicating prior results, as well as an intriguing novel component that appears to respond selectively to images of food. Although accounts of this “food” component in terms of low- or mid-level visual properties remain possible, it does not emerge from similar analyses of V1/V2 or from activations to these stimuli in early layers of AlexNet. Analyses of lateral visual cortex reveal components apparently selective for implied motion and social groups, along with other novel components. We find no evidence of components selectively responsive to animals or tools, or to “stubby” or “elongated” shapes. The hypotheses emerging from these analyses about neural selectivities (and lacks thereof) will be refined, registered, and then tested in the held-out participants. We expect our data-driven analyses to powerfully validate some but not all previously reported selectivities and to identify novel selectively-responsive neural populations in high-level visual cortex.
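The general idea of a voxel decomposition, expressing a stimulus-by-voxel response matrix as a small number of response profiles with per-voxel weights, can be sketched as below. The abstract does not specify the decomposition algorithm, so non-negative matrix factorization with multiplicative updates is swapped in here purely as a stand-in; the data are synthetic and the names `nmf` and `relative_error` are hypothetical.

```python
import numpy as np

def nmf(data, n_components, n_iter=300, seed=0, eps=1e-9):
    """Non-negative factorization data ~ profiles @ weights (multiplicative updates).

    profiles: stimuli x components (response profile of each component)
    weights:  components x voxels  (contribution of each component to each voxel)
    """
    rng = np.random.default_rng(seed)
    n_stimuli, n_voxels = data.shape
    profiles = rng.random((n_stimuli, n_components)) + eps
    weights = rng.random((n_components, n_voxels)) + eps
    for _ in range(n_iter):
        weights *= (profiles.T @ data) / (profiles.T @ profiles @ weights + eps)
        profiles *= (data @ weights.T) / (profiles @ weights @ weights.T + eps)
    return profiles, weights

def relative_error(data, profiles, weights):
    """Frobenius norm of the residual divided by the norm of the data."""
    return np.linalg.norm(data - profiles @ weights) / np.linalg.norm(data)

# Synthetic responses: 100 stimuli x 500 voxels built from 3 hidden components.
rng = np.random.default_rng(42)
true_profiles = rng.random((100, 3))
true_weights = rng.random((3, 500))
data = true_profiles @ true_weights

profiles, weights = nmf(data, n_components=3)
print(f"relative reconstruction error: {relative_error(data, profiles, weights):.3f}")
```

Because each voxel receives a weight on every component, a decomposition of this kind can in principle separate response profiles that are mixed within voxels, which is the limitation of voxel-wise tests noted in the abstract.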
Acknowledgements: NIH Pioneer Award NIH DP1HD091957; NSF Science and Technology Center—Center for Brains, Minds, and Machines Grant NSF CCF-1231216
Talk 6, 3:45 pm, 34.26
Computational modeling of traveling waves using MEG-EEG in human
Laetitia Grabot1, Garance Merholz1, Jonathan Winawer2,3, David Heeger2,3, Laura Dugué1,4; 1Université de Paris, INCC UMR 8002, CNRS, F-75006 Paris, France, 2Department of Psychology, New York University, New York, NY 10003, United States, 3Center for Neural Science, New York University, New York, NY 10003, United States, 4Institut Universitaire de France (IUF), Paris, France
The role of brain oscillations in various cognitive functions including visual perception is extensively studied. However, their spatial organization is rarely scrutinized. Recent studies suggest that brain oscillations can travel across the cortex. Mesoscopic waves, traveling within cortical areas, are mainly observed with invasive measurements (e.g., electrocorticography), which limits their investigation. Measuring traveling waves non-invasively in humans, such as with magneto- and electro-encephalography (MEG, EEG), is particularly challenging due to technical and biophysical constraints (e.g., source summation, volume conduction). To address these issues, we developed a novel model-based neuroimaging approach. First, in a two-stage computational model, (1) the putative neural sources of a propagating 5 Hz oscillation were modeled within the early visual region (V1) using individual retinotopic mapping from functional MRI recordings (encoding model); and (2) the modeled sources were projected onto the MEG-EEG sensor space to predict the resulting MEG-EEG signal (forward biophysical head model). Second, we tested our model by comparing its predictions against the MEG-EEG signal obtained when participants viewed a radial visual stimulus consisting of a black-and-white sinusoidal wave oscillating at 5 Hz and propagating from the center to the periphery of the screen. This “traveling” stimulus was used to elicit a 5 Hz neural oscillation traveling across the retinotopic space. A “standing” stimulus, oscillating at the same frequency with the same phase across the visual field, was used as control. Correlations on amplitude and phase between predicted and measured data revealed a good performance of the model. Crucially, the model was able to distinguish MEG-EEG recordings while participants viewed a traveling stimulus compared to a standing stimulus. Our model aims at bridging the gap between mesoscopic (neuronal populations) and macroscopic (full brain recordings) scales, to facilitate a better understanding of the functional role of brain oscillations for cognition.
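The two-stage structure of the model (source time courses defined on a retinotopic map, then projection to sensors) can be sketched as follows. Everything numerical here is an assumption for illustration: the eccentricity values stand in for individual retinotopic maps, the spatial frequency of the wave is arbitrary, and the random `lead_field` matrix is a placeholder for a real biophysical head model.

```python
import numpy as np

def source_timecourses(eccentricity_deg, times_s, temporal_freq_hz=5.0, spatial_freq_cpd=0.0):
    """Sinusoidal source activity whose phase lags with eccentricity.

    spatial_freq_cpd > 0 gives a wave traveling from fovea to periphery;
    spatial_freq_cpd = 0 gives a standing oscillation (same phase everywhere).
    """
    ecc = np.asarray(eccentricity_deg, dtype=float)[:, None]
    t = np.asarray(times_s, dtype=float)[None, :]
    return np.sin(2.0 * np.pi * (temporal_freq_hz * t - spatial_freq_cpd * ecc))

rng = np.random.default_rng(0)

# Stage 1 (encoding model): hypothetical V1 sources labeled by eccentricity.
eccentricity = np.linspace(0.5, 10.0, 50)   # degrees of visual angle
times = np.arange(0.0, 1.0, 0.005)          # 1 s sampled at 200 Hz

traveling = source_timecourses(eccentricity, times, spatial_freq_cpd=0.2)
standing = source_timecourses(eccentricity, times, spatial_freq_cpd=0.0)

# Stage 2 (forward model): placeholder lead field, sensors x sources.
n_sensors = 20
lead_field = rng.normal(size=(n_sensors, eccentricity.size))
sensors_traveling = lead_field @ traveling
sensors_standing = lead_field @ standing

print(sensors_traveling.shape, sensors_standing.shape)
```

In the standing case every source shares one time course, so each simulated sensor is a scaled copy of the same 5 Hz sinusoid; in the traveling case the sources are phase-shifted, so the sensor signals differ in both amplitude and phase, which is the kind of difference such a model can be compared against recorded data.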
Acknowledgements: This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement N° 852139 to Laura Dugué).
Talk 7, 4:00 pm, 34.27
Purely Perceptual Machines Robustly Predict Human Visual Arousal, Valence, and Aesthetics
Our experience of a beautiful, moving, or aversive image clearly evokes affective processes beyond vision, but the relative contributions of factors along the spectrum from input (image statistics) to ideation (abstract thought) remain a matter of debate. Machine vision systems, lacking both emotion and higher-order cognitive processes, provide an empirical testbed for isolating the contributions of a purely perceptual representation. How well can we predict human affective responses to an image from the purely perceptual response of a machine? Here, we address this question with a comprehensive survey of deep neural networks (e.g. ConvNets, Transformers, MLP-Mixers) trained on a variety of computer vision tasks (e.g. vision-language contrastive learning, segmentation), examining the degree to which they can predict aesthetic judgment, arousal, and valence for images from multiple categories across two distinct datasets. Importantly, we use the features of these pre-trained models without any additional fine-tuning or retraining, probing whether affective information is immediately latent in the structure of the perceptual representation. We find that these networks have features sufficient to linearly predict (even with nonparametric mappings) average ratings of aesthetics, arousal, and valence with remarkably high accuracy across the board, at or near the predictions we would make based on the responses of the most representative (‘taste-typical’) human subjects. Models trained on object and scene classification, and modern contrastive learning models, produce the best overall features for prediction, while randomly-initialized models yield far lower predictive accuracies. Aesthetic judgments are the most predictable of the affective responses (followed by arousal, then valence), and we can predict these responses with greater accuracy for ‘taste-typical’ subjects than for less ‘taste-typical’ subjects. Taken together, these results suggest that the fundamental locus of visually evoked affective experience may be located more proximately to the perceptual system than abstract cognitive accounts of these experiences might otherwise suggest.
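Linear prediction of ratings from frozen network features can be sketched with cross-validated ridge regression. The abstract does not state the regression method, regularization, or scoring used, so closed-form ridge, five folds, and a Pearson correlation score are assumptions, and the "features" and "ratings" below are synthetic stand-ins rather than outputs of any real network or dataset.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge weights for centered inputs."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cross_validated_r(features, ratings, alpha=1.0, n_folds=5, seed=0):
    """Correlation between held-out ridge predictions and ratings (k-fold CV).

    The feature extractor is treated as frozen: only the linear readout is fit.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(ratings)), n_folds)
    predictions = np.empty(len(ratings))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        mu_x = features[train].mean(axis=0)
        mu_y = ratings[train].mean()
        w = ridge_fit(features[train] - mu_x, ratings[train] - mu_y, alpha)
        predictions[test] = (features[test] - mu_x) @ w + mu_y
    return float(np.corrcoef(predictions, ratings)[0, 1])

# Synthetic stand-ins: 200 images, 50-dimensional features, mean ratings.
rng = np.random.default_rng(3)
features = rng.normal(size=(200, 50))
true_readout = rng.normal(size=50)
ratings = features @ true_readout + rng.normal(scale=1.0, size=200)

score = cross_validated_r(features, ratings)
print(f"cross-validated r = {score:.2f}")
```

Fitting the readout only on training folds and scoring on held-out images is what allows a statement about information latent in the features rather than about the flexibility of the mapping.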
Acknowledgements: We thank George Alvarez and Gabriel Kreiman for comments and feedback.