Color, Light and Materials: Neural mechanisms, models

Talk Session: Saturday, May 18, 2024, 10:45 am – 12:30 pm, Talk Room 2
Moderator: Bevil Conway, National Eye Institute, NIH

Talk 1, 10:45 am, 22.21

Natural retinal cone distributions emerge from optical and neural limits to vision

Yazhou Zhao1,2, Zeyu Yun2, Ruichang Sun2, Dasheng Bi2; 1University of Hong Kong, 2University of California, Berkeley

Biological visual systems develop highly efficient solutions in response to physical limitations. In particular, the human retinal cone mosaic supports both high spatial acuity and color vision in an imperfect optical environment. Here, we show that naturalistic cone distributions can emerge from simple constraints such as chromatic aberration (CA) and performance on a naturalistic behavioral task. We model key components of the visual system with a CA-constrained optical simulation, learnable cone mosaic sampling, and a state-of-the-art deep artificial neural network. We also designed a custom dataset, ImageNet-Bird, by selecting images that require both high spatial acuity and high color acuity for correct classification. By training our model on this visual task, we show that the cone mosaic that emerges in the model resembles the cone mosaic found in humans. One important characteristic is the relative deficiency of S cones compared to M and L cones. Moreover, in a performance comparison using fixed cone mosaics with different S-cone ratios, we showed that performance is consistently better when the S-cone ratio is lower. Finally, we also observed that artificial neural networks have a different set of limitations from biological neural systems due to inductive biases imposed by the network architecture; for example, the model requires stationarity in the cone mosaic to achieve good performance. More generally, our results serve as a concrete instance in which the functional organization of vision is driven by inherent optical and neural limitations, and may provide a new framework for understanding observed statistics of the visual system.
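
For readers who want a concrete picture of what "learnable cone mosaic sampling" can look like, the sketch below shows one hypothetical parameterization in PyTorch. Module and variable names are illustrative assumptions, not the authors' implementation: per-location logits over cone types are softened into a weighted mix of L/M/S channels so the mosaic can be optimized jointly with a downstream classifier.

# Illustrative sketch (not the authors' code): a learnable cone mosaic layer.
import torch
import torch.nn as nn

class LearnableConeMosaic(nn.Module):
    """Hypothetical module that learns a cone-type assignment per retinal location."""
    def __init__(self, height, width, n_cone_types=3):
        super().__init__()
        # One learnable logit per cone type at every mosaic location.
        self.logits = nn.Parameter(torch.zeros(n_cone_types, height, width))

    def forward(self, lms_image, temperature=1.0):
        # lms_image: (batch, 3, H, W) image already converted to L/M/S cone planes,
        # e.g., after an optics simulation that applies chromatic aberration.
        weights = torch.softmax(self.logits / temperature, dim=0)  # (3, H, W)
        # Each location responds with a weighted mix of cone signals; annealing the
        # temperature pushes the mosaic toward a discrete cone-type assignment.
        return (lms_image * weights.unsqueeze(0)).sum(dim=1, keepdim=True)

# The single-channel mosaic response could then feed a standard image classifier,
# and the learned S/M/L proportions could be read off from the logits.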

Talk 2, 11:00 am, 22.22

Midget retinal ganglion cell surrounds in macaque: cone-selective or not?

Nicolas Cottaris1, Brian Wandell2, David Brainard1; 1University of Pennsylvania, 2Stanford University

Despite decades of study of macaque midget retinal ganglion cells (mRGCs), significant knowledge gaps remain regarding their receptive field (RF) properties. One example is the controversy regarding cone pooling in mRGC surrounds. Anatomy and in-vitro physiology, the latter in peripheral retina, indicate that L- and M-cones contribute non-selectively to mRGC RF surrounds, whereas in-vivo physiology in more central retina indicates that the RF surrounds are highly cone-type selective. To better understand mRGCs, we developed a model of their linear spatiochromatic RFs. We model the cone inputs to mRGCs based on anatomical and physiological data, taking into account the impact of physiological optics. Knowledge of these factors allows us to model mRGC RFs across a large part of the visual field. We use the model to compute responses of synthetic mRGCs to cone-isolating grating and m-sequence stimuli, matched to those that have been employed in in-vivo physiological studies. Simulation enables us to compute the expected in-vivo responses for mRGCs with different surround L- to M-cone ratios. We perform the simulations over a range of eccentricities, taking into account the eccentricity dependence of the physiological optics, the cone fundamentals used to derive cone-isolating stimuli, and the mRGC RF structure. Our results reveal that near the fovea, where RF centers receive one or two cone inputs, physiological optics significantly enlarges the stimulus-referred RF center, thereby attenuating the antagonistic responses from surround cones of the same type as the center cone. For this reason, the surround measured in vivo can appear heavily biased toward selective pooling of cones of the non-center type. In particular, this happens for models in which the simulated RF surrounds draw indiscriminately on L- and M-cones. This phenomenon, which we observed in both m-sequence and drifting-grating simulations, provides a plausible explanation for the discrepancy in conclusions across studies.

Acknowledgements: Funded by AFOSR FA9550-22-1-0167

Talk 3, 11:15 am, 22.23

Neurophysiological mechanisms of vision at the center-of-gaze in macaque V1

Felix Bartsch1,2, Ramon Bartolo Orozco2, Ethan Lott1, Jacob Yates3, Daniel Butts1, Bevil Conway2; 1Department of Biology, University of Maryland, 2Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, 3Herbert Wertheim School of Optometry and Vision Science, UC Berkeley

Almost nothing is known about the spatial structure of V1 receptive fields at the center of gaze, yet these neurons are the building blocks for high-acuity vision. This knowledge gap is due to the technical challenges of measuring receptive fields (RFs) at the resolution of the single-cone inputs present in the retina (~1 arcminute). Given the distinct anatomical and physiological characteristics of the foveal retina, it is possible that RFs in foveal V1 are not simply finer versions of parafoveal cells and may support distinctive aspects of visual processing. Here, we developed a model-based approach that leverages neurophysiological data to refine eye-position estimates beyond the limits of hardware-based eye trackers, using chronic array recordings of foveal RFs in the awake, fixating macaque. We presented spatiochromatic noise stimuli while recording across cortical layers, using acute laminar arrays sampling many different columns. We applied data-driven nonlinear models to investigate how subcortical inputs are integrated in V1. We recovered detailed spatial RF structure from 429 cells spanning the very center of gaze up to 2° eccentricity. Foveal V1 cells showed a diversity of RF types: some cells were unmodulated by cone-opponent signals ("luminance-only"), and others were modulated by both luminance and cone-opponent signals. RFs were as small as 4 arcminutes, with features subtending ~2 arcminutes. Luminance-only cells had RFs with the finest spatial structure (median RF width = 6 arcmin) and typically showed nonlinear responses; cells modulated by cone-opponent signals integrated over larger areas (median RF width = 12 arcmin) and were more likely to be linear. Even within individual cells, the luminance component had finer spatial RF structure than the cone-opponent components, suggesting limits on spatial acuity for cone-opponent computations. Our measurements offer the first detailed observations of spatiochromatic processing in foveal V1 and offer clues to how V1 RFs are constructed from the photoreceptor mosaic.

Acknowledgements: NIH Intramural Program; NSF IIS-2113197

Talk 4, 11:30 am, 22.24

Octopus electroencephalography permits detection of light-induced Steady State Visually Evoked Potentials

Peter Tse1, Jay Vincelli2, Gideon Caplovitz3, Walt Besio1; 1Dartmouth College, 2University of Rhode Island School of Engineering, 3University of Nevada, Reno

We describe the first underwater electroencephalography (EEG) of octopuses. We can detect stimulus-frequency-dependent correlates of flickering LEDs with electrodes positioned near the animals' skin under saltwater. This activity is biological in origin: control experiments rule out the possibility that it is generated by the flickering light alone in the absence of an octopus. Rather than placing electrodes onto the skin of an octopus, we place the octopus between two layers of fixed electrodes under saltwater, while presenting the octopus with visual input from outside its enclosure. We flickered light at various fixed frequencies outside a transparent enclosure that held individual Octopus bimaculoides, loosely sandwiched between two plates containing embedded tripolar EEG electrodes. Neural activity entrained to the displayed frequencies can be detected as potentials from electrodes situated on or near the midpoint between the two eyes of the octopus, but not from electrodes situated below the brain of the octopus. We are able to detect steady-state visually evoked potentials (SSVEPs) at multiple tested frequencies. EEG offers the major advantage that it is not invasive, so octopuses need not be fixed in place or anaesthetized. The parallel alignment of neurons in the brain's vertical lobe allows summation of the Local Field Potential (LFP) that emerges from MSF/MIF axons synapsing on the AM cells of the vertical lobe. There is a similar co-alignment of neurons in some of the other octopus brain lobes, which may also permit LFP summation. Conclusion: an electrode placed near the midpoint between the two eyes can detect neural responses to light flickering at various frequencies. Octopus EEG may eventually prove as fruitful as human EEG has been for deciphering the neural correlates of complex cognition.
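
The abstract does not spell out the signal analysis, but a standard way to quantify SSVEPs is to compare spectral power at the flicker frequency against neighboring frequency bins. The sketch below illustrates that generic approach; it is an assumption for illustration, not necessarily the authors' pipeline.

# Minimal sketch of a standard SSVEP detection analysis (illustrative only):
# compare power at the flicker frequency with neighboring bins to get an SNR.
import numpy as np

def ssvep_snr(eeg, fs, flicker_hz, n_neighbors=5):
    """eeg: 1-D voltage trace from one electrode; fs: sampling rate in Hz."""
    spectrum = np.abs(np.fft.rfft(eeg * np.hanning(len(eeg)))) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    target = np.argmin(np.abs(freqs - flicker_hz))
    # Average power of neighboring bins on either side, excluding the target bin.
    neighbors = np.r_[spectrum[target - n_neighbors:target],
                      spectrum[target + 1:target + 1 + n_neighbors]]
    return spectrum[target] / neighbors.mean()

# An SNR well above 1 at the flicker frequency, absent in no-octopus control
# recordings, would indicate an entrained biological response.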

Acknowledgements: This work was supported by NSF grant 2122962

Talk 5, 11:45 am, 22.25

Can we improve luminance? Online and lab experiments

Shuchen Guan1, Robert Ennis1, Karl Gegenfurtner1; 1Justus-Liebig Universität, Gießen

Luminance has served as the standard measure of light intensity for 100 years. Nevertheless, it has long been known to have substantial flaws in predicting the perceived intensity of lights with different spectral distributions. Here, we wanted to evaluate potential improvements in the weighting of the cone inputs for heterochromatic brightness perception. To reach a large number of observers, we made measurements in the lab and online, using the same observers. We presented 144 patches encompassing 12 hues and 12 intensities in RGB space. Each trial involved 12 patches varying in both hue and intensity, and 43 observers ranked them by perceived brightness across 66 trials. Observers completed the experiment online on their personal displays and in a well-controlled lab environment on an OLED. They also brought their personal displays to the lab for calibration. In the lab session, testing observers with a calibrated sRGB display revealed that luminance predicted 76.3% of observer rankings correctly. Radiance predicted more accurately (78.5%), and a non-linear weighted maxRGB model performed best (84.2%). The optimal weights fitted to RGB were [0.40, 0.45, 0.15]. Compared to V(λ), the contributions of the L- and S-cones to heterochromatic brightness were increased. Test-retest reliability, measured with a subset of 20 observers, was 83.9% in these lab-based experiments. For the home session, we first investigated stimulus consistency across displays. Based on the calibration data, the patches presented on observers' displays had a consistency of 90% to 97% with those on the lab OLED across all predictors. Intra-observer response consistency across online and lab sessions was 80.8%, and inter-observer consistency was 77.6%. Again, the weighted maxRGB model consistently outperformed luminance. We conclude that luminance systematically underestimates the contributions of the L- and S-cones to heterochromatic brightness. Our results also indicate that online color experiments may be feasible for certain paradigms.
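
As an illustration, a weighted maxRGB brightness predictor using the reported weights [0.40, 0.45, 0.15] can be written in a few lines. The exact non-linearity used in the study is not given in the abstract, so the exponent below is a placeholder assumption, and the function is only a sketch of the model class.

# Sketch of a weighted maxRGB brightness predictor (weights from the abstract;
# the non-linearity is a placeholder assumption).
import numpy as np

WEIGHTS = np.array([0.40, 0.45, 0.15])  # R, G, B weights reported in the abstract

def max_rgb_brightness(rgb, exponent=1.0):
    """rgb: array of shape (..., 3) with linearized R, G, B values in [0, 1]."""
    weighted = WEIGHTS * np.asarray(rgb)
    return np.max(weighted, axis=-1) ** exponent

# Patches can then be ranked by predicted brightness and the ranking compared
# with observers' orderings, as in the reported prediction accuracies.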

Talk 6, 12:00 pm, 22.26

Deciphering visual representations behind subjective perception using reconstruction methods

Fan L. Cheng1,2, Tomoyasu Horikawa1,3, Kei Majima2, Misato Tanaka1,2, Mohamed Abdelhack2,4, Shuntaro C. Aoki1,2, Jin Hirano2, Yukiyasu Kamitani1,2; 1ATR Computational Neuroscience Laboratories, 2Kyoto University, 3NTT Communication Science Laboratories, 4Krembil Centre for Neuroinformatics

Reconstruction techniques have been widely used to recover physical sensory inputs from brain signals. Numerous studies have progressively refined methods to achieve image reconstructions that faithfully mirror the presented image at the pixel level. An intriguing extension of these techniques is their potential application to subjective mental contents, a domain that has proven especially challenging. Here, we introduce a general framework that can be used to reconstruct subjective perceptual content. This framework translates, or decodes, brain activity into deep neural network (DNN) representations and then converts them into images using a generator. Through our research on visual illusions (a classic form of subjective perception defined by a discrepancy between sensory inputs and actual perception), we demonstrate how we successfully reconstructed visual features that were absent from the sensory inputs. Our work shows the potential of reconstruction techniques as invaluable tools for delving into visual mechanisms. The use of natural images as training data and the choice of DNNs were key to obtaining successful reconstructions. While extensive research has probed the neural underpinnings of visual illusions using qualitative hypotheses, our approach materializes mental content in formats amenable to visual interpretation and quantitative analysis. Reconstructions from individual brain areas shed light on the strength of the illusory representation and how it is shared with representations of real features at different stages of processing, providing a means to decipher the visual representations underlying illusory perception.

Acknowledgements: Japan Society for the Promotion of Science KAKENHI grant JP20H05705 and JP20H05954, Japan Science and Technology Agency CREST grant JPMJCR18A5 and JPMJCR22P3 and SPRING grant JPMJSP2110, New Energy and Industrial Technology Development Organization Project JPNP20006

Talk 7, 12:15 pm, 22.27

Integrated Gradient Correlation: a Method for the Interpretability of fMRI Decoding Deep Models

Pierre Lelièvre1, Chien-Chung Chen1; 1Visual Neuroscience Lab, Department of Psychology, National Taiwan University, Taipei, Taiwan

Deep learning has reached the domain of visual perception with artificial models trained on image classification tasks that, interestingly, express some degree of similarity with human mechanisms. Currently, encoding/decoding of fMRI activation to features of interest usually relies on individual linear regressions per voxel/vertex. Modelers mitigate the associated limitations with carefully hand-crafted linearizing features; however, the multidimensionality and intrinsic non-linearities of artificial neural networks could further improve domain adaptation, and even capture interactions between brain areas. One reason for favoring simple models is the lack of interpretability of deep learning, i.e., of the ability to compare informational content between different brain areas for one feature, and across different features. We overcome this issue by introducing a new method called Integrated Gradient Correlation (IGC), which extends the original Integrated Gradients (IG) attribution method. We also demonstrate the relevance of our approach by investigating the representation of image statistics using the NSD dataset, a public fMRI dataset consisting of 70k BOLD activations acquired during a long-term image recognition task. We particularly focused on surface-based data (fsaverage), limited to visual cortex ROIs (e.g., V1-V4, bodies, places). The statistics under scrutiny encompassed the first three moments of image luminance distributions usually associated with human texture perception (i.e., mean luminance, contrast, and skewness), as well as a higher-level statistic related to the spatial distribution of luminance (the 1/f slope). We then evaluated several decoding models: traditional individual linear regressors, multidimensional linear models trained per ROI and on the whole visual cortex, and finally different deep architectures (sequences of fully connected layers and/or graph convolutional layers). IGC results show that deep models provide significantly more accurate decoding predictions and more informative/selective brain activation patterns, consistent with the literature. Consequently, our method could find applications beyond visual neuroscience and become beneficial to any scientific inquiry using deep models.
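
For orientation, the sketch below shows the standard Integrated Gradients attribution that IGC builds on; the dataset-wise correlation step that defines IGC itself is the authors' contribution and is not reproduced here. The model interface, tensor shapes, and function name are assumptions for illustration.

# Sketch of standard Integrated Gradients (IG) attribution for a decoding model
# that maps fMRI activation (e.g., one vector of vertex values) to a scalar feature.
import torch

def integrated_gradients(model, x, baseline=None, steps=64):
    """Attribute a scalar model output to each input dimension (e.g., each vertex)."""
    if baseline is None:
        baseline = torch.zeros_like(x)
    # Interpolate between baseline and input, then average gradients along the path.
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    path = (baseline + alphas * (x - baseline)).detach().requires_grad_(True)
    outputs = model(path).sum()
    grads = torch.autograd.grad(outputs, path)[0]       # (steps, *x.shape)
    # Riemann approximation of the IG path integral.
    return (x - baseline) * grads.mean(dim=0)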

Acknowledgements: Supported by NSTC.