Separate neural representations for physical and communicative social interactions along the lateral visual pathway: evidence from data-driven voxel decomposition

Poster Presentation 23.349: Saturday, May 18, 2024, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Face and Body Perception: Neural mechanisms of social cognition

Yuanfang Zhao1, Emalie McMahon1, Leyla Isik1; 1Johns Hopkins University

Recognizing social interactions is remarkable for both its adaptive significance and visual complexity. Previous studies have suggested that the lateral visual cortex and superior temporal sulcus (STS) are involved in social interaction perception. However, it has been difficult to disentangle neural responses to different types of social interaction with hypothesis-driven approaches, due to challenges with feature labeling, sampling, and experimenter bias. To overcome these issues, we apply a data-driven voxel decomposition technique (i.e., non-negative matrix factorization) to a large-scale naturalistic fMRI dataset of participants freely viewing two hundred 3-second video clips. These naturalistic videos depicted two individuals engaging in various social and nonsocial activities sampled from everyday scenes. Our analysis of the lateral visual cortex and STS revealed two components with distinct functional profiles that were shared across all participants. We used extensive dataset annotations and free-response captions to characterize these components. The first component responds strongly to videos that are rated as highly communicative in feature annotations and are captioned primarily as “talking”. Voxel weight analysis revealed that this component is weighted most heavily in anterior STS. Conversely, the second component responds strongly to joint physical actions between people in the videos. This component correlates significantly with the “joint action” feature annotations, even after controlling for motion energy, and its top-responding videos are captioned as physical interactions such as “dancing”. These results are particularly noteworthy since neural responses to the labeled feature “joint action” have not been identified before. Voxel weight analysis indicates that this component is most strongly weighted in mid-level regions of the lateral stream, including the middle temporal area (MT) and the extrastriate body area (EBA).
Together, our findings suggest that joint action and communication represent two distinct forms of social interaction that are encoded differently in posterior to anterior regions along the lateral visual pathway.
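
As a rough illustration of the voxel decomposition idea described above, the following is a minimal sketch of non-negative matrix factorization using Lee-Seung multiplicative updates. It is not the authors' pipeline: the data are synthetic, and the matrix orientation (videos by voxels), the function name `nmf`, and all sizes other than the 200 videos and 2 components are assumptions made for illustration only.

```python
import numpy as np


def nmf(X, n_components, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (videos x voxels) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius loss.
    W (videos x components) holds each component's response profile
    across videos; H (components x voxels) holds the voxel weights.
    """
    rng = np.random.default_rng(seed)
    n_videos, n_voxels = X.shape
    W = rng.random((n_videos, n_components)) + eps
    H = rng.random((n_components, n_voxels)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative because it only
        # multiplies by ratios of non-negative quantities.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a response matrix (NOT real fMRI data):
# 200 videos, a hypothetical 500 voxels, and 2 underlying components.
rng = np.random.default_rng(1)
n_videos, n_voxels, k = 200, 500, 2
true_W = rng.random((n_videos, k))
true_H = rng.random((k, n_voxels))
X = true_W @ true_H + 0.01 * rng.random((n_videos, n_voxels))

W, H = nmf(X, n_components=k)

# Relative reconstruction error of the two-component model.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Indices of the five videos that drive the first component most strongly;
# in a real analysis these would be compared against annotations or captions.
top_videos = np.argsort(W[:, 0])[::-1][:5]
```

In this sketch, the columns of `W` play the role of the component response profiles that would be compared with feature annotations, and the rows of `H` play the role of the voxel weights that would be mapped back onto cortex. How the number of components is chosen and how components are matched across participants are not covered here.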

Acknowledgements: This work is supported by NIH grant R01MH132826.