Using hearing and vision for localization, motion perception, and motion prediction

Poster Presentation 26.408: Saturday, May 18, 2024, 2:45 – 6:45 pm, Pavilion
Session: Multisensory Processing: Audiovisual behavior


Yichen Yuan¹, Nathan Van der Stoep¹, Surya Gayet¹; ¹Experimental Psychology, Helmholtz Institute, Utrecht University

Predicting motion in noisy environments is essential to everyday behavior, for instance when participating in traffic. Although many objects provide both auditory and visual information, it remains unknown how humans use multisensory input to track moving objects, and how this depends on sensory interruption or interference (e.g., occlusion). In four experiments, we systematically investigated localization performance for auditory, visual, and audiovisual targets in three situations: (1) locating static targets, (2) locating moving targets, and (3) predicting the location of targets moving under occlusion. Performance for audiovisual targets was compared to the performance predicted by Maximum Likelihood Estimation (MLE). In Experiment 1, we found a substantial multisensory benefit when participants localized static audiovisual targets, indicating near-optimal audiovisual integration. In Experiment 2, we found no multisensory precision benefit when participants localized moving audiovisual targets; localization estimates were nonetheless in line with MLE predictions. In Experiment 3A, moving targets were occluded by an audiovisual occluder at an unpredictable time point, and participants had to infer the final target location from target speed and occlusion duration. Here, participants relied exclusively on the visual component of the audiovisual target, even though the auditory component demonstrably provided useful location information when presented in isolation. In contrast, when a visual-only occluder was used in Experiment 3B, participants relied primarily on the auditory component of the audiovisual target (which remained available during visual occlusion), even though the visual component demonstrably provided useful location information during occlusion when presented in isolation. In sum, observers use both hearing and vision when tracking moving objects and localizing static objects, but use only unisensory input when predicting motion under occlusion, perhaps to minimize short-term memory load. Moreover, observers can flexibly prioritize one sense over the other in anticipation of modality-specific interference.
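For context, the MLE benchmark referenced above is the standard optimal cue-combination rule from the multisensory literature; the notation below is ours, not taken from the poster. The integrated audiovisual location estimate is a reliability-weighted average of the unisensory estimates, and its predicted variance is never worse than that of the more reliable sense:

\hat{x}_{AV} = w_A \hat{x}_A + w_V \hat{x}_V, \qquad w_A = \frac{1/\sigma_A^2}{1/\sigma_A^2 + 1/\sigma_V^2}, \qquad w_V = 1 - w_A

\sigma_{AV}^2 = \frac{\sigma_A^2 \, \sigma_V^2}{\sigma_A^2 + \sigma_V^2} \;\le\; \min(\sigma_A^2, \sigma_V^2)

On this reading, the occlusion task in Experiments 3A and 3B amounts to dead reckoning: extrapolating the last pre-occlusion location estimate by the perceived speed over the occlusion duration, \hat{x}_{\mathrm{final}} = \hat{x}_{\mathrm{occ}} + \hat{v} \, t_{\mathrm{occ}}, which may explain why maintaining a single sensory estimate is favored when memory load is a constraint.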