Multisensory Processing

Talk Session: Tuesday, May 21, 2024, 5:15 – 7:15 pm, Talk Room 1
Moderator: Shinsuke Shimojo, Caltech

Talk 1, 5:15 pm, 55.11

Perceptual mechanisms underlying human click-based echolocation

Haydee Garcia-Lazaro1, Pushpita Bhattacharyya1, Brendyn Chao2, Santani Teng1; 1Smith-Kettlewell Eye Research Institute, 2University of Washington

Echolocation is an active sensing strategy used by some blind individuals to navigate their surroundings. Human echolocators emit tongue clicks, leveraging the echoes to detect, discriminate, and localize objects within their environment. Proficient blind echolocators outperform non-expert blind and sighted individuals in most echo-acoustic tasks; although visual experience and expertise play significant roles in echolocation performance, the underlying mechanisms of this advantage remain unclear. Recent research from our lab suggests that the emitted click masks the subsequent, fainter echo, and that superior performance among experts may reflect a more effective release from masking relative to novices. We explored this hypothesis by evaluating the influence of two aspects of masking on echolocation performance: click-echo signal-to-noise ratio (SNR) and click-echo temporal separation. Sighted novices completed an echo-acoustic localization task. Each trial consisted of 2, 5, 8, or 11 synthesized mouth clicks with spatialized echoes from reflectors 1 m away and 5°–25° from the midsagittal plane. Participants indicated the reflector’s location (left vs. right). In Experiment 1, the click amplitude was variably attenuated relative to its natural amplitude. In Experiment 2, the click amplitude was fixed while the click-echo time delay varied from ~6–60 ms, equivalent to reflector distances of 1–10 m. We hypothesized that performance would improve as the click was increasingly attenuated relative to the echo (Exp. 1) and as the click-echo time delay increased (Exp. 2). Our results revealed that sighted novices, at echo-click level differences > -2 dB or click-echo time delays > 52 ms, performed similarly to proficient echolocators presented with naturalistic stimuli (~25 dB; ~6 ms). These findings suggest that a well-tuned click-echo relationship, alongside an optimized click-echo temporal integration window, enhances echolocation performance. Future research will explore their combined or separate roles in auditory filtering. Finely tuned click-echo SNRs and a narrower click-echo temporal integration window may underlie echolocation proficiency through improved click-echo segregation and echo representation.
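
To make the acoustics above concrete, the sketch below (not the authors' code) computes the round-trip click-echo delay for a reflector at a given distance and converts an echo/click amplitude ratio into a level difference in dB; the only assumption is the standard speed of sound in air (~343 m/s).

    # Minimal sketch (not the authors' code): relating reflector distance to
    # click-echo delay, and echo/click amplitude ratios to level differences in dB.
    import math

    SPEED_OF_SOUND = 343.0  # m/s; approximate speed of sound in air

    def click_echo_delay_ms(distance_m):
        """Round-trip travel time of the echo from a reflector at distance_m."""
        return 2.0 * distance_m / SPEED_OF_SOUND * 1000.0

    def level_difference_db(echo_amplitude, click_amplitude):
        """Echo-minus-click level difference in dB (negative when the echo is fainter)."""
        return 20.0 * math.log10(echo_amplitude / click_amplitude)

    for d in (1, 5, 10):  # meters; the abstract reports ~6-60 ms delays for 1-10 m
        print(f"{d} m reflector -> {click_echo_delay_ms(d):.1f} ms click-echo delay")
    print(f"Echo at 1/20 of the click amplitude -> {level_difference_db(1, 20):.1f} dB")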

Acknowledgements: E. Matilda Ziegler Foundation for the Blind, National Eye Institute 1R21EY032282-01, Smith-Kettlewell Eye Research Institute

Talk 2, 5:30 pm, 55.12

Reduced contextual effects and cross-modal calibration demonstrate atypical sensory processing in autism

Avni Ben Zvi Inbar1, Hagit Hel-Or, Bat-Sheva Hadad; 1Student

Introduction: Sensory symptoms are part of the core phenotype of autism, but their underlying mechanisms are unknown. We examined whether altered perception of magnitude in autism arises from modulations in the biases and contextual effects known to calibrate perceptual sensitivity in neurotypicals. Specifically, we asked whether calibration of duration perception by context is generalized across modalities or rather mediated by modality-specific mechanisms. Sensitivity of duration perception for visual and auditory stimuli was tested while context was manipulated within and between the sensory modalities. Method: Individuals with and without autism performed a two-interval forced-choice task, judging the longer of two temporal signals, either visual or auditory. Three factors were manipulated: 1) the central standard was presented in one of two modalities, visual or auditory; 2) the contextual standards formed either a wide or a narrow range around the central standard; and 3) the contextual standards were presented in either modality, visual or auditory, independently of the central standard, forming “same-modality” or “between-modality” conditions. Thresholds were determined using two psychophysical procedures: the method of constant stimuli and QUEST. Results: For neurotypicals, thresholds in the auditory domain were smaller than in vision, suggesting an auditory specialization. Importantly, the narrower context enhanced sensitivity for standards within the same modality but had no effect on standards of a different modality, suggesting that perceptual magnitude normally follows a modality-specific calibration process. For individuals with autism, thresholds were similar in the auditory and visual domains, suggesting no auditory specialization. Context only mildly affected sensitivity, and to a similar degree for between- and within-modality contexts, suggesting an amodal, general calibration mechanism. These results suggest that, in contrast to the specialized, modality-specific calibration processes of neurotypicals, autism shows an overall reduced and less specialized (amodal) calibration process, which may account for sensory dysregulation and symptoms.
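
As an illustration of the threshold estimation described above, the following sketch (hypothetical data, not the authors' analysis) fits a cumulative-Gaussian psychometric function to two-interval forced-choice responses collected with the method of constant stimuli and reads off a duration-discrimination threshold; QUEST would instead estimate the threshold adaptively during the run.

    # Minimal sketch with hypothetical data (not the authors' analysis): fitting a
    # cumulative-Gaussian psychometric function to 2IFC duration judgments collected
    # with the method of constant stimuli, then reading off a discrimination threshold.
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import norm

    def psychometric(duration_diff_ms, pse, sigma):
        # Probability of judging the comparison as longer than the standard
        return norm.cdf(duration_diff_ms, loc=pse, scale=sigma)

    # Hypothetical data: comparison-minus-standard duration (ms), proportion "longer"
    diffs = np.array([-120, -80, -40, 0, 40, 80, 120], dtype=float)
    p_longer = np.array([0.05, 0.15, 0.35, 0.50, 0.70, 0.88, 0.97])

    (pse, sigma), _ = curve_fit(psychometric, diffs, p_longer, p0=[0.0, 50.0])
    threshold = sigma * norm.ppf(0.75)  # 75%-correct duration-discrimination threshold
    print(f"PSE = {pse:.1f} ms, sigma = {sigma:.1f} ms, threshold = {threshold:.1f} ms")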

Talk 3, 5:45 pm, 55.13

Vision Without Photoreceptors: Crossmodal Perception Within the Blind Spots

Ailene Chan1, Noelle R. B. Stiles1,2, Carmel A. Levitan3, Armand R. Tanguay, Jr.1,4, Shinsuke Shimojo1; 1California Institute of Technology, Division of Biology and Biological Engineering, 2Rutgers University, Departments of Neurology, Ophthalmology and Visual Science, and Biomedical Engineering; Center for Advanced Human Brain Imaging Research in the Brain Health Institute, 3Occidental College, Cognitive Science, 4University of Southern California, Departments of Electrical Engineering, Chemical Engineering and Materials Science, Biomedical Engineering, Ophthalmology, and Physics and Astronomy; Neuroscience Graduate Program

Multisensory illusions are a key tool for investigating crossmodal integration and plasticity, given their resilience across manipulations and unique spatial adaptability. We previously tested the classic Double Flash Illusion across retinal locations in low-vision participants, who reported stronger double-flash perception in areas of visual impairment relative to neurotypical participants. To examine whether such illusions can span regions with no light perception, the present study induced multisensory interactions within the blind spots of neurotypical participants using a postdictive illusion. The Audiovisual Rabbit Illusion consists of the sequence [beep-flash, beep, beep-flash]; the second beep, located between the first and second beep-flash pairs, induces an illusory flash (all beeps are centrally located). The illusion is postdictive in that a later sensory stimulus alters the perception of an already-presented stimulus. We mapped each participant’s blind spots (with one eye blindfolded) and placed the beep-flash pairs 0.5° outside the borders. We tested four sequences (left-to-right, right-to-left, top-to-bottom, and bottom-to-top) and three conditions: zero-beep, two-flash (0B2F; control); two-beep, two-flash (2B2F; control); and three-beep, two-flash (3B2F; illusion). Participants reported strong illusory percepts within their blind spots as well as at visible locations, and performance was comparable between these locations. In the 0B2F and 2B2F conditions, participants reported perceiving ~2 flashes, confirming that the illusion only occurs when visual and auditory information are incongruent within a beep-flash sequence; there were no significant differences in the number of flashes perceived between the 0B2F and 2B2F conditions. These results support the hypothesis that filling-in within blind areas can be multisensory. In this case, audition may play a key role in inducing visual propagation across visual space, even into regions incapable of receiving visual input. The blind spot provides an interesting test case for how the brain interprets blind regions of the retina, particularly in comparison to scotomas caused by eye disease.
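
The sketch below (illustrative only; the timing and position labels are assumptions, not values from the study) lays out the event sequences for the three conditions, with the two flashes placed just outside the mapped blind-spot border and all beeps at the central location.

    # Illustrative sketch only (timing and position labels are assumptions, not
    # values from the study): event sequences for the three conditions, with two
    # flashes placed just outside the mapped blind-spot border and all beeps central.
    SOA_MS = 80  # illustrative stimulus-onset asynchrony between successive events

    def rabbit_sequence(condition, flash_pos_1, flash_pos_2):
        """Return (time_ms, event, position) triples for one trial."""
        flashes = [(0, "flash", flash_pos_1), (2 * SOA_MS, "flash", flash_pos_2)]
        if condition == "0B2F":          # control: no beeps
            beeps = []
        elif condition == "2B2F":        # control: beeps accompany both flashes
            beeps = [(0, "beep", "center"), (2 * SOA_MS, "beep", "center")]
        elif condition == "3B2F":        # illusion: middle beep induces an illusory flash
            beeps = [(0, "beep", "center"), (SOA_MS, "beep", "center"),
                     (2 * SOA_MS, "beep", "center")]
        else:
            raise ValueError(condition)
        return sorted(flashes + beeps)

    for event in rabbit_sequence("3B2F", "0.5 deg left of blind spot",
                                 "0.5 deg right of blind spot"):
        print(event)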

Acknowledgements: Croucher Scholarships for Doctoral Study, Dominic Orr Graduate Fellowship in BBE, The National Institutes of Health, The National Eye Institute

Talk 4, 6:00 pm, 55.14

Neural dynamics of supramodal conscious perception

Andreas Wutz1; 1University of Salzburg

Is the conscious perception of seeing a flash, hearing a sound, or feeling a touch associated with one common core activity pattern in the brain? Here, I present novel magnetoencephalography (MEG) data that reveal such supramodal neural correlates of conscious perception. On each trial, different visual, auditory, or tactile stimuli were presented at individual perceptual thresholds, such that about half of the stimuli were consciously detected while the other half were missed. Four different stimuli per modality were used (i.e., different Gabor patches, sound frequencies, and stimulated fingers) in order to subsequently leverage representational similarity analysis (RSA) to differentiate modality-specific sensory processing from supramodal conscious experiences, which are similar across modalities. As expected, there was stronger evoked MEG activity for detected vs. missed stimuli during sensory processing (<0.5 s) in the respective sensory cortices. Moreover, consistent with previous work, there was stronger alpha-band power (8–13 Hz) for missed vs. detected trials in the pre-stimulus period and in a later time window after stimulus onset (>0.5 s) for all three modalities. Critically, the RSA distinguished activity patterns related to modality-specific sensory processing shortly after stimulus onset (<0.5 s) from later supramodal conscious processing (>0.5 s). Overall, our findings suggest a three-stage model of conscious multisensory experience, involving pre-stimulus alpha oscillations, modality-specific sensory processing upon stimulus onset, and later supramodal conscious perception. This temporal processing cascade may serve to integrate and update pre-stimulus brain states, which presumably reflect top-down predictions about upcoming sensory events, with subsequent conscious experiences, irrespective of the specific sensory modality.
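
The sketch below is a schematic of the RSA logic with random placeholder data, not the authors' pipeline: a neural representational dissimilarity matrix is compared against a modality model and a detected-vs-missed (supramodal awareness) model.

    # Schematic of the RSA logic with random placeholder data (not the authors'
    # pipeline): compare a neural representational dissimilarity matrix against a
    # modality model and a detected-vs-missed (supramodal awareness) model.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    # 24 conditions: 3 modalities x 4 stimuli x detected/missed; 100 simulated sensors
    modalities = np.repeat([0, 1, 2], 8)
    detected = np.tile([0, 0, 0, 0, 1, 1, 1, 1], 3)
    patterns = rng.standard_normal((24, 100))  # placeholder condition-averaged patterns

    neural_rdm = pdist(patterns, metric="correlation")           # 1 - Pearson r
    modality_rdm = pdist(modalities[:, None], metric="hamming")  # same vs. different modality
    awareness_rdm = pdist(detected[:, None], metric="hamming")   # detected vs. missed

    # With random patterns both fits hover near zero; real data would be expected to
    # favor the modality model early (<0.5 s) and the awareness model later (>0.5 s).
    print("modality model fit:", spearmanr(neural_rdm, modality_rdm)[0])
    print("awareness model fit:", spearmanr(neural_rdm, awareness_rdm)[0])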

Acknowledgements: This research was supported by project funding from the FWF - the Austrian Science Fund. Grant agreement number: P36214

Talk 5, 6:15 pm, 55.15

Cross-Modal Tuning in Early Visual and Somatosensory Cortices

Stephanie Badde1, Ilona Bloem2,3, Jonathan Winawer2, Michael S Landy2; 1Tufts University, 2New York University, 3Netherlands Institute for Neuroscience

Conflicts between the senses shape our perception. We used functional magnetic resonance imaging to test whether exposure to spatially offset visual and tactile stimuli shifts population-level spatial tuning in early visual and somatosensory cortices. Participants fixated a marker at the center of a sketched outline of a right hand. During visual stimulation, yellow circles expanding and contracting at 4 Hz were superimposed on one fingertip of the displayed outline. Tactile stimuli were amplitude-modulated vibrations at the fingertips of the participants’ right hand, also pulsating at 4 Hz. Stimuli swept across the fingers, moving from one finger to the next every 4 s, in ascending or descending order. Within a 4-min run, visual and tactile stimuli were presented either in isolation or synchronously. Visual-tactile stimulus pairs were either always located at the same finger or always located at adjacent fingers, with the visual stimulus shifted either toward the thumb or the little finger. Population receptive field (PRF) mapping confirmed topographically organized neural populations tuned to tactile stimulation of one finger in somatosensory but not visual cortex, and vice versa for visual stimulation. Maps from unisensory stimulation agreed well with those from congruent tactile-visual stimulation. Visual-tactile spatial discrepancy resulted in a PRF shift in all participants. Shift direction was independent of sweep direction, ruling out prediction of the upcoming stimulus as the source of the effect. Rather, PRFs in somatosensory cortex were shifted toward the neighboring finger, consistent with tuning for combined visual-haptic locations, and vice versa in visual cortex. In sum, our results reveal cross-modal effects on population-level spatial tuning in early visual and somatosensory cortices.
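
A minimal sketch of the PRF logic (assumed model form and hypothetical responses, not the authors' code): each voxel's responses across the five fingers are fit with a one-dimensional Gaussian, and a shift of the fitted center under discrepant visual-tactile stimulation would correspond to the PRF shift reported above.

    # Minimal sketch with an assumed model form and hypothetical responses (not the
    # authors' code): a 1D Gaussian population receptive field over the five fingers,
    # fit to one voxel's responses during the finger sweep.
    import numpy as np
    from scipy.optimize import curve_fit

    fingers = np.arange(1, 6, dtype=float)  # thumb = 1 ... little finger = 5

    def prf(finger, center, width, amplitude):
        return amplitude * np.exp(-0.5 * ((finger - center) / width) ** 2)

    # Hypothetical responses (e.g., beta weights) to stimulation of each finger
    responses = np.array([0.2, 0.9, 1.8, 1.0, 0.3])

    (center, width, amplitude), _ = curve_fit(prf, fingers, responses, p0=[3.0, 1.0, 1.5])
    print(f"PRF center = finger {center:.2f}, width = {width:.2f} fingers")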

Acknowledgements: R01EY08266, R01MH111417

Talk 6, 6:30 pm, 55.16

High-level sensory and motor regions encode object mass after real-world object interactions

Shubhankar Saha1, Prithu Purkait1, SP Arun1; 1Indian Institute of Science

We experience real-world objects not just by seeing them but by interacting with them. Such interactions give us information about their physical properties such as mass. Are such physical properties integrated into the underlying object representations? To investigate this fundamental question, we performed wireless brain recordings from two monkeys with electrodes implanted into high-level sensory and motor regions before and after they interacted with real-world objects of varying mass. We created five water bottles painted with different colors and added weights (100–500 g) chosen to be uncorrelated with their (R, G, B) colors. We then recorded neural responses to images of these bottles on a screen while each animal passively viewed these images, prior to any interaction with these bottles. Each bottle was then loaded with a small juice reward and presented to each monkey in randomized order. Monkeys readily interacted with these bottles, lifting them up to drink the juice, thereby ensuring that they had experience with the varying masses of these bottles. Following these interactions, we again recorded neural responses to images of these bottles on a screen as before. We hypothesized that neural activity would show a greater correspondence with the experienced mass of these objects following the real-world interaction compared to before the interaction. To this end, we calculated the correlation between the multiunit firing rate from each electrode and the object mass. Our main finding is that neural responses showed an increased correlation with object mass after real-world interactions. This effect was present in the premotor/prefrontal cortex (PMv/vlPFC) as well as in the inferior temporal cortex (IT). Taken together, our results show that object mass is rapidly encoded into both high-level sensory and motor regions of the brain following real-world interactions with objects.
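
The core analysis can be illustrated with a short sketch (hypothetical firing rates, not the recorded data): for each electrode, the mean multiunit firing rate to each bottle image is correlated with the bottle's mass, separately before and after the real-world interaction.

    # Minimal sketch with hypothetical firing rates (not the recorded data): for one
    # electrode, correlate the mean multiunit firing rate to each bottle image with
    # the bottle's mass, separately before and after the real-world interaction.
    import numpy as np
    from scipy.stats import pearsonr

    mass = np.array([100, 200, 300, 400, 500], dtype=float)  # grams

    rates_before = np.array([21.0, 19.5, 22.3, 20.1, 21.7])  # spikes/s, hypothetical
    rates_after = np.array([18.2, 20.4, 22.9, 24.1, 26.0])

    r_before, p_before = pearsonr(rates_before, mass)
    r_after, p_after = pearsonr(rates_after, mass)
    print(f"before interaction: r = {r_before:.2f} (p = {p_before:.3f})")
    print(f"after interaction:  r = {r_after:.2f} (p = {p_after:.3f})")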

Acknowledgements: This research was funded through a Senior Fellowship from the DBT-Wellcome India Alliance (Grant # IA/S/17/1/503081) and the DBT-IISc partnership programme to SPA, Prime Minister’s Research Fellowship to SS and KVPY Fellowship to PP.

Talk 7, 6:45 pm, 55.17

Informative vision alters tactile perception

Anupama Nair1, Jared Medina2; 1University of Delaware, 2Emory University

Previous studies have shown that individuals are more likely to detect a near-threshold tactile stimulus when seeing touch at the same location, leading to the hypothesis that informative vision enhances tactile perception. However, such results could also be explained by a more liberal response criterion when seeing touch. To examine whether viewed touch enhances tactile perception, we presented participants with two tasks. Vibrotactile stimuli were presented at varying intensities to both the index and ring fingers while participants watched videos of a hand being touched on one cued finger. In the comparative judgment task, participants indicated which finger received the more intense tactile stimulus on their own hand. Across multiple experiments, participants consistently demonstrated a shifted point of subjective equality, reporting that the tactile stimulus associated with the cued finger was more intense. These results provide evidence that the cue clearly altered performance but are agnostic as to whether this reflects a shift in response bias or a perceptual enhancement. In the equality judgment task, which is more resistant to response bias, participants indicated whether the stimulus intensities on their fingers were the same or different while watching the videos. For equality judgment performance, we found evidence for noise processes at the tails, as participants were more likely to judge tactile stimuli as equal when they were either near threshold or well above threshold. For stimuli outside this noise regime, we found a significant shift in the peak of the equality judgment curve (alpha) such that participants were most likely to respond ‘equal’ when the cued stimulus was less intense than the uncued stimulus. These findings suggest that viewing informative touch enhances tactile perception.
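
The two analyses can be sketched as follows (hypothetical data, not the authors' code): a cumulative Gaussian fit to the comparative judgments yields the point of subjective equality, and a scaled Gaussian fit to the proportion of 'equal' responses yields the peak location (alpha) of the equality-judgment curve.

    # Minimal sketch with hypothetical data (not the authors' analysis): estimate the
    # comparative-judgment PSE and the equality-judgment peak (alpha).
    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import norm

    # Cued-minus-uncued intensity difference (arbitrary units)
    diff = np.array([-3, -2, -1, 0, 1, 2, 3], dtype=float)

    # Comparative judgments: proportion "cued finger more intense"
    p_cued = np.array([0.10, 0.22, 0.45, 0.68, 0.85, 0.94, 0.98])
    (pse, slope), _ = curve_fit(lambda d, m, s: norm.cdf(d, m, s), diff, p_cued, p0=[0.0, 1.0])

    # Equality judgments: proportion "equal", modeled as a scaled Gaussian whose
    # peak location alpha indexes the perceived match
    def equality_curve(d, alpha, width, height):
        return height * np.exp(-0.5 * ((d - alpha) / width) ** 2)

    p_equal = np.array([0.08, 0.30, 0.70, 0.58, 0.28, 0.12, 0.06])
    (alpha, width, height), _ = curve_fit(equality_curve, diff, p_equal, p0=[-1.0, 1.0, 0.7])

    print(f"comparative-judgment PSE = {pse:.2f}, equality-curve peak alpha = {alpha:.2f}")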

Acknowledgements: This material is based upon work supported by the National Science Foundation under grant no. 1632849

Talk 8, 7:00 pm, 55.18

Common computations in automatic cue combination and metacognitive confidence reports 

Yi Gao1, Kai Xue1, Brian Odegaard2, Dobromir Rahnev1; 1Georgia Institute of Technology, 2University of Florida

Sensory stimuli introduce varying degrees of uncertainty, and it is crucial to accurately estimate and utilize this sensory uncertainty for appropriate perceptual decision making. Previous research has examined the estimation of uncertainty in both low-level multisensory cue combination and metacognitive estimation of confidence. However, it remains unclear whether these two forms of uncertainty estimation share the same computations. To address this question, we used a well-established method to induce a dissociation between confidence and accuracy by manipulating energy levels in a random-dot kinematogram. Subjects (N = 99) completed a direction discrimination task for visual stimuli with low vs. high overall motion energy. We found that the high-energy stimuli led to higher confidence but lower accuracy in a visual-only task. Importantly, we also investigated the impact of these visual stimuli on auditory motion perception in a separate task, where the visual stimuli were irrelevant to the auditory task. The results showed that both the high- and low-energy visual stimuli influenced auditory judgments, presumably through automatic low-level mechanisms. Critically, the high-energy visual stimuli had a stronger influence on auditory judgments compared to the low-energy visual stimuli. This effect was in line with the confidence differences but contrary to the accuracy differences between the high- and low-energy stimuli in the visual-only task. These effects were captured by a simple computational model that assumes that common computations underlie confidence reports and multisensory cue combination. Our results reveal a deep link between automatic sensory processing and metacognitive confidence reports, and suggest that vastly different stages of perceptual decision making rely on common computational principles.
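
The modeling idea can be illustrated with a toy sketch (not the authors' model; all parameter values are invented): a single internal estimate of visual reliability, inflated by overall motion energy, simultaneously drives confidence in the visual-only task and the weight given to the visual cue when it is automatically combined with the auditory signal.

    # Toy illustration (not the authors' model; all parameter values are invented):
    # one internal estimate of visual reliability, inflated by overall motion energy,
    # drives both confidence in the visual-only task and the weight given to the
    # visual cue when it is automatically combined with the auditory signal.

    def perceived_reliability(motion_energy):
        # Assumption: higher overall motion energy inflates perceived visual reliability,
        # even when actual discrimination accuracy is lower.
        return 1.0 + 0.5 * motion_energy

    def confidence(motion_energy):
        return perceived_reliability(motion_energy)  # shared computation

    def visual_weight(motion_energy, auditory_reliability=1.0):
        r_v = perceived_reliability(motion_energy)
        return r_v / (r_v + auditory_reliability)  # reliability-weighted combination

    for label, energy in (("low-energy", 0.2), ("high-energy", 1.0)):
        print(f"{label}: confidence ~ {confidence(energy):.2f}, "
              f"visual weight on auditory judgment ~ {visual_weight(energy):.2f}")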

Acknowledgements: We thank Minzhi Wang for his help with data collection. This work was supported by the National Institutes of Health (award: R01MH119189) and the Office of Naval Research (award: N00014-20-1-2622).