Social and Perceptual Attributions Derived from Moving Shapes: A Language Model Analysis
Poster Presentation 36.333: Sunday, May 19, 2024, 2:45 – 6:45 pm, Banyan Breezeway
Session: Scene Perception: Virtual environments, intuitive physics
Emily D Grossman1, Sajjad Torabian1, Zhenze Zhang1, John A Pyles2, Hongjing Lu3; 1University of California Irvine, 2University of Washington, 3University of California Los Angeles
Introduction. In 1944, Heider and Simmel showed that humans spontaneously generate social interpretations when viewing sparse animations of moving shapes. Later studies have investigated how motion trajectories are recognized as human actions (Roemmele et al. 2016), how cuing can elicit social meaning in simple animations (Tavares et al. 2008), and how motion within a context can drive attributions of beliefs (Baker et al. 2017). Here, we combine and compare attributions of actions, intentions, emotions, and beliefs elicited by animations, because these attributions have rarely been studied together. Specifically, we transform humans' descriptions of Heider-Simmel-like animations into a semantic space, where we can examine the representational structures that underlie perceptual and cognitive processes.
Methods. Each participant viewed two subsets of 100 animations, labeling the gist of each animation with single keywords in the first phase and choosing from a predefined list of labels in the second phase. The list was created based on previous literature and was broadly categorized into action, intention, and belief. Human labels of each animation were then embedded into a semantic vector using Google's Universal Sentence Encoder (USE) language model. We generated three models capturing emotion, interactivity, and mental-state attribution, and correlated each with the semantic similarity structure of the animations.
Results. A network frequency analysis showed that participants most saliently identified emotional narratives from the sequences, with nodes of negative emotion and avoidant action appearing as hubs. The semantic structure of the animations, as described by the participants, was most strongly correlated with the emotion model, followed by the interactivity model and, more weakly, the mentalistic model.
Conclusion. Humans are more sensitive to emotional attributes in animations of moving shapes than to action- and belief-based attributes.
Acknowledgements: NSF BCS-1658560 to EG and BCS-1658078 to JP