What are you doing? Recent advances in visual action recognition research.
The visual recognition of actions is critical for humans when interacting with their physical and social environment. The unraveling of the underlying processes has sparked wide interest in several fields, including computational modeling, neuroscience, and psychology. Recent research on how people recognize actions provides important insights into the mechanisms underlying action recognition. Moreover, it suggests new ideas for man-machine interfaces and has implications for artificial intelligence. The aim of the symposium is to provide an integrative view of recent advances in our understanding of the psychological and neural processes underlying action recognition. Speakers will discuss new and related developments in the recognition of mainly object- and human-directed actions from behavioral, neuroscientific, and modeling perspectives. These developments include, among other things, a shift from the investigation of isolated actions to the examination of action recognition under more naturalistic conditions, including contextual factors and the human ability to read social intentions from recognized actions. These findings are complemented by neuroscientific work examining action representation in motor cortex. Finally, a novel theory of goal-directed actions will be presented that integrates results from various action recognition experiments. The symposium will first discuss behavioral and neuroscientific aspects of action recognition and then shift its attention to the modeling of the underlying processes. More specifically, Nick Barraclough will present research on action recognition using adaptation paradigms with object-directed and locomotive actions. He will talk about the influence of the observer's mental state on action recognition, using displays that present the action as naturalistically as possible. Cristina Becchio will talk about actions and their ability to convey social intentions.
She will present research on the translation of social intentions into the kinematic patterns of two interacting persons and discuss observers' ability to use these visual kinematic cues to infer social intentions. Stephan de la Rosa will focus on social actions and talk about the influence of social and temporal context on the recognition of social actions. Moreover, he will present research on the visual representation underlying the recognition of social interactions. Ehud Zohary will discuss the representation of actions within the motor pathway using fMRI and the sensitivity of the motor pathway to visual and motor aspects of an action. Martin Giese will wrap up the symposium by presenting a physiologically plausible neural theory for the perception of goal-directed hand actions and discussing this theory in light of recent physiological findings. The symposium is targeted at the general VSS audience and provides a comprehensive and integrative view of an essential ability of human visual functioning.
Other people's actions interact within our visual system
Speaker: Nick Barraclough; Department of Psychology, University of York, York, UK
Perception of actions relies on the behavior of neurons in the temporal cortex that respond selectively to the actions of other individuals. It is becoming increasingly clear that visual adaptation, well known for influencing early visual processing of simpler stimuli, also appears to have an influence at later processing stages where actions are coded. In a series of studies, we and others have been using visual adaptation techniques to characterize the mechanisms underlying our ability to recognize and interpret information from actions. Action adaptation generates action aftereffects in which the perception of subsequent actions is biased; these show many of the characteristics of both low-level and high-level face aftereffects, increasing logarithmically with the duration of action observation and declining logarithmically over time. I will discuss recent studies in which we investigated the implications of action adaptation in naturalistic social environments. We used high-definition, orthostereoscopic presentation of life-sized photorealistic actors on a 5.3 x 2.4 m screen in order to maximize immersion in a Virtual Reality environment. We find that action recognition and the judgments we make about the internal mental states of other individuals are changed in a way that can be explained by action adaptation. Our ability to recognize and interpret the actions of an individual depends not only on what that individual is doing, but also on the effect that other individuals in the environment have on our current brain state. Whether or not two individuals are actually interacting in the environment, it seems they interact within our visual system.
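The logarithmic buildup and decay of the aftereffect described above can be sketched as a toy model. This is purely illustrative and not taken from the studies; the function, parameter names, and coefficient values are all hypothetical assumptions chosen only to show the qualitative pattern (longer adaptation yields a larger aftereffect, longer delay a smaller one).

```python
import math

def aftereffect_magnitude(adapt_duration, test_delay, k_build=1.0, k_decay=0.5):
    """Toy model (hypothetical, not from the abstract): the aftereffect
    grows logarithmically with adaptation duration and declines
    logarithmically with the delay before the test stimulus."""
    buildup = k_build * math.log(1.0 + adapt_duration)   # log growth with exposure
    decay = k_decay * math.log(1.0 + test_delay)         # log decline over time
    return max(0.0, buildup - decay)                     # aftereffect cannot go negative

# Longer adaptation -> larger bias; longer delay -> smaller bias.
strong = aftereffect_magnitude(adapt_duration=16.0, test_delay=1.0)
weak = aftereffect_magnitude(adapt_duration=2.0, test_delay=1.0)
```

In this sketch, `strong` exceeds `weak`, mirroring the reported dependence of aftereffect size on observation duration.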
On seeing intentions in others' movements
Speaker: Cristina Becchio; Centre for Cognitive Science, Department of Psychology, University of Torino, Torino, Italy; Department of Robotics, Brain, and Cognitive Science, Italian Institute of Technology, Genova, Italy
Starting from Descartes, philosophers, psychologists, and more recently neuroscientists have often emphasized the idea that intentions are not things that can be seen: they are mental states, and perception cannot be smart enough to reach mental states that are hidden away (imperceptible) in another person's mind. Based on this assumption, standard theories of social cognition have mainly focused on the contribution of higher-level cognition to intention understanding. Only recently has it been recognized that intentions are deeply rooted in the actions of interacting agents. In this talk, I present findings from a new line of research showing that intentions translate into differential kinematic patterns. Observers are especially attuned to kinematic information and can use early differences in visual kinematics to anticipate what another person will do next. This ability is crucial not only for interpreting the actions of individual agents, but also for predicting how, in the context of a social interaction between two agents, the actions of one agent relate to those of the other.
The influence of context on the visual recognition of social actions.
Speaker: Stephan de la Rosa; Department Human Perception, Cognition and Action; Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Actions do not occur out of the blue. Rather, they are often part of human interactions and are therefore embedded in an action sequence. Previous research on visual action recognition has primarily focused on elucidating the perceptual and cognitive mechanisms underlying the recognition of individual actions. Surprisingly, the social and temporal context in which actions are embedded has received little attention. I will present studies examining the importance of context for action recognition. Specifically, we examined the influence of social context (i.e., competitive vs. cooperative interaction settings) on the observation of actions during real-life interactions and found that social context modulates action observation. Moreover, we investigated the influence of temporal context (i.e., action context as provided by visual information about preceding actions) on action recognition using an adaptation paradigm. Our results provide evidence that the resulting adaptation effects are modulated by temporal context. These results suggest that action recognition is guided not only by the immediate visual information but also by temporal and social contexts.
On the representation of viewed action in the human motor pathways
Speaker: Ehud Zohary; Department of Neurobiology, Alexander Silberman Institute of Life Sciences, Hebrew University of Jerusalem, Israel
I will present our research on the functional properties of brain structures that are involved in object-directed actions. Specifically, we explore the nature of viewed-action representation using functional magnetic resonance imaging (fMRI). One cortical region involved in action recognition is the anterior intraparietal (AIP) cortex. The principal factor determining the response in AIP is the identity of the observed hand. Similar to classical motor areas, AIP displays a clear preference for the contralateral hand during motor action (i.e., object manipulation) without visual feedback. This dual visuomotor grasping representation suggests that AIP may be involved in the specific motor simulation of hand actions. Furthermore, viewing object-directed actions from an egocentric viewpoint (as in self-action) elicits a similar selectivity for the contralateral hand. However, if the viewed action is seen from an allocentric viewpoint (i.e., performed by another person facing the viewer), greater activation in AIP is found for the ipsilateral hand. Such a mapping may be useful for imitating a hand action (e.g., finger tapping) made by someone facing us, which is more accurate when using the opposite (mirror-image) hand. Finally, using the standard "center-out" task requiring visually guided hand movements in various directions, we show that primary motor cortex (M1) is sensitive to both motor and visual components of the task. Interestingly, the visual aspects of movement are encoded in M1 only when they are coupled with motor consequences. Together, these studies indicate that both perceptual and motor aspects are encoded in the patterns of activity in the cortical motor pathways.
Neural theory for the visual perception of goal-directed actions and perceptual causality
Speaker: Martin A. Giese; Section for Computational Sensomotorics, Dept. for Cognitive Neurology, HIH and CIN, University Clinic Tübingen, Germany
The visual recognition of goal-directed movements, even from impoverished stimuli, is a central visual function with high importance for survival and motor learning. In cognitive neuroscience and brain imaging, a number of speculative theories have been proposed that suggest possible computational processes that might underlie this function. However, these theories typically leave it completely open how the proposed functions might be implemented by local cortical circuits. Complementing these approaches, we present a physiologically inspired neural theory for the visual processing of goal-directed actions, which provides a unifying account of existing neurophysiological results on the visual recognition of hand actions in monkey cortex. The theory motivated, and partly correctly predicted, specific computational properties of action-selective neurons in monkey cortex, which later could be verified physiologically. In contrast to several dominant theories in the field, the model demonstrates that robust view-invariant action recognition from monocular videos can be accomplished without a reconstruction of the three-dimensional structure of the effector and without assigning critical importance to an internal simulation of motor programs. As a 'side-effect', the model also reproduces simple forms of causality perception, predicting that causality stimuli might be processed by neural structures similar to those that process natural hand actions. Consistent with this prediction, F5 mirror neurons can be shown to respond selectively to such stimuli. This suggests that the processing of goal-directed actions might be accounted for by relatively simple neural mechanisms that are accessible by electrophysiological experimentation.