Action

Talk Session: Wednesday, May 22, 2024, 8:15 – 10:00 am, Talk Room 1

Talk 1, 8:15 am

Predicting action type from visual perception: a kinematic study.

Annalisa Bosco1,2, Elena Aggius Vella1, Patrizia Fattori1,2; 1Department of Biomedical and Neuromotor Sciences, University of Bologna, Bologna, Italy, 2Alma Mater Research Institute for Human-Centered Artificial Intelligence (Alma Human AI)

The planning of a movement toward an object influences the visual perception of the object properties relevant for the action. This suggests a bidirectional interaction between the motor and the visual systems. In the present study, we investigate whether this interaction can be decoded even during the visual estimation of the object properties before the onset of the movement. To this aim, we tested 15 healthy right-handed participants (males=5, females=10; mean age=21.12) in a task consisting of two successive phases: 1) a perceptual phase, in which the participants manually estimated the size and orientation of a visual stimulus by extending the index finger and thumb and, simultaneously, rotating the grip and 2) an action phase, in which participants performed a grasping or a reaching movement (according to the instruction given at the trial onset) towards the same stimulus. A motion capture system recorded the participant’s hand position and movement. In order to test if the action type can be predicted during the estimation phase, i.e. if the type of action requested influences the object estimation, we applied a Random Forest classification model to the perceptual phase. The size and orientation estimations, and the velocity of index finger and thumb (calculated during the perceptual phase) were used as predictors. We found that the model accuracy in classifying the reaching and grasping was on average 99% for the testing dataset. The corresponding sensitivity (ability to correctly classify true positives) and specificity (ability to correctly classify true negatives) of the model were 99.5% and 100%, respectively. The most informative predictor was the orientation estimation, which contributed 99.94%, followed by the size estimation (78.02%) and the index finger and thumb velocities (1.2% and 0.6%, respectively). These results suggest that action-based perceptual information can be optimally used to extract action intentions well before the onset of the movement.
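As an illustration of the reported metrics, the following minimal sketch computes accuracy, sensitivity, and specificity from predicted and true labels. The labels, the choice of "grasp" as the positive class, and the function name `binary_metrics` are assumptions made for this example; the abstract does not state which action was treated as positive, and this is not the authors' analysis code.

```python
def binary_metrics(y_true, y_pred, positive="grasp"):
    """Accuracy, sensitivity and specificity for a two-class problem.

    `positive` names the class counted as a positive (an assumption here).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive trial classified as positive
            else:
                fn += 1  # positive trial missed
        else:
            if pred == positive:
                fp += 1  # negative trial classified as positive
            else:
                tn += 1  # negative trial classified as negative
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical test-set labels, for illustration only.
y_true = ["grasp"] * 4 + ["reach"] * 4
y_pred = ["grasp", "grasp", "grasp", "reach", "reach", "reach", "reach", "reach"]
print(binary_metrics(y_true, y_pred))
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can hide a classifier that favors one of the two action types.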

Acknowledgements: MAIA project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement No 951910; work supported by Ministry of University and Research, PRIN2020-20208RB4N9 and by National Recovery and Resilience Plan (NRRP), project MNESYS (PE0000006).

Talk 2, 8:30 am

Neural representations of visual motion for perception and interception

Deborah A. Barany1, Casey Delaney1, Haleh Mahmoudi1, Michelle Marneweck2; 1University of Georgia, 2University of Oregon

Eye movements are critical for guiding interactions with moving objects according to a behavioral goal, such as tracking an object to perceive its speed or to intercept it with the hand. There are overlapping brain areas involved in motion perception, eye movements, and visually-guided reaching, yet little is known about how these brain areas govern eye-hand interactions with moving objects in different behavioral contexts. Here, we used functional magnetic resonance imaging (fMRI) to investigate how the task goal (perceive/act) and eye movement (fixation/pursuit) modulate neural representations of visual motion. Participants (N = 20) either passively observed (View) or actively intercepted (Intercept-Go) a target moving at a constant rightward or leftward velocity toward an interception zone while their right-hand position and force were recorded on an MR-compatible tablet. On some interception trials, the target changed color prior to movement initiation, indicating that participants should inhibit their planned interception (Intercept-NoGo). Across trials, participants fixated their eyes on the interception zone (Fixate) or smoothly pursued the moving target (Pursue). In-scanner tablet recordings of hand movements and post hoc decoding of eye movements from the MR signal confirmed adherence to the task conditions. Bayesian variational representational similarity analyses of the fMRI data showed that during the initial target motion phase, neural activity patterns in primary visual cortex and human middle temporal area were specific to the eye movement (Fixate vs. Pursue) and motion direction (Right vs. Left), whereas patterns in motor, premotor, and parietal regions were most sensitive to the task goal (View vs. Intercept-NoGo). Analysis of the execution phase of the Intercept-Go trials showed that neural activity patterns in primary visual and motor cortices were strongly sensitive to the direction of the target and hand movement. Together, these results reveal distinct eye- and goal-dependent representations for processing visual motion along the sensorimotor hierarchy.

Acknowledgements: University of Georgia Mary Frances Early College of Education and University of Georgia Office of Research

Talk 3, 8:45 am

Activity in primate visual areas is modulated during running

Declan Rowley1, Jake Yates2, Alex Huk1; 1UCLA, 2UCB

Introduction: Running roughly doubles activity in mouse V1, acting as a ~2x gain change (Niell & Stryker, 2010). We recently tested whether these profound modulations of V1 activity are also present in primates, by recording responses in foveal V1 while marmosets running on a treadmill viewed visual stimuli. We found only modest modulations, with hints of suppression. However, whether running affects peripheral representations and/or later visual areas remains unknown. Methods: Following our recent work (Liska, Rowley, et al., 2023), we presented drifting gratings of various orientations to three marmosets while they were perched on a treadmill. Using Neuropixels probes, we recorded from the foveal and peripheral representations of V1, as well as from V2 and MT. Results: We tested whether baseline activity and visually-driven responses were different during running versus not running. In foveal V1, we replicated our finding of little-to-no running effect. However, in the peripheral representation of V1, activity was higher during running in both stimulus viewing and blank periods. The Running:Stationary firing rate ratio was 1.196 during stimulus viewing ([1.126, 1.265], 95% CI, p = 1.8e-6, 63 cells) and 1.399 during blanks ([1.312, 1.477], p = 2e-10). V2 activity was even more strongly modulated during stimulus viewing (1.252, [1.173, 1.351], p = 0.00016, 26 cells) and during blanks (1.676, [1.416, 1.757], p = 0.00014). Saccade frequency and amplitudes did not differ strongly between conditions, arguing against these modulations arising from different patterns of eye movements when running. In MT, responses were not significantly affected by running (mean firing rate ratio of 1.032 [0.960, 1.109], 77 cells). Conclusion: Although primate foveal V1 is not much affected by running, peripheral V1 and V2 show clear running-correlated modulations. This forms a connection with the striking results found in mice, and calls for a comprehensive dissection of potential modulatory sources and effects.
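A Running:Stationary ratio with a confidence interval of the kind reported above can be sketched as follows. The abstract does not say how the ratio or its interval was computed, so this example makes two assumptions: the ratio is taken between mean firing rates across cells, and the interval comes from a percentile bootstrap over cells. The per-cell rates and the function names are hypothetical.

```python
import random
from statistics import mean


def rate_ratio(running, stationary):
    """Ratio of mean firing rate while running to mean rate while stationary."""
    return mean(running) / mean(stationary)


def bootstrap_ci(running, stationary, n_boot=2000, alpha=0.05, seed=0):
    """Approximate percentile bootstrap CI, resampling cells with replacement.

    Each cell contributes a (running, stationary) pair, so both rates of a
    cell are resampled together.
    """
    rng = random.Random(seed)
    n = len(running)
    ratios = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ratios.append(rate_ratio([running[i] for i in idx],
                                 [stationary[i] for i in idx]))
    ratios.sort()
    lo = ratios[int((alpha / 2) * n_boot)]
    hi = ratios[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical per-cell firing rates in spikes/s, for illustration only.
running = [12.0, 26.0, 5.5, 19.0, 9.0]
stationary = [10.0, 20.0, 5.0, 15.0, 8.0]
print(rate_ratio(running, stationary), bootstrap_ci(running, stationary))
```

A ratio whose interval excludes 1 indicates a reliable difference between running and stationary periods under these assumptions.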

Talk 4, 9:00 am

Dynamic Sequential Interactions of Spatial Uncertainties Explain Human Navigation Strategies, Errors, and Variability

Fabian Kessler1, Julia Frankenstein1, Constantin Rothkopf1; 1Centre for Cognitive Science TU Darmstadt

Human spatial navigation involves integrating visual cues about our motion and position relative to landmarks with internal signals from self-motion to form a sense of location and direction. However, navigating in the dark or trying to return to a starting point in an environment reveals the uncertainty of these multisensory inferences. Previous studies have revealed many navigational behaviors, including beaconing and path integration, and puzzling patterns of errors and variability in navigation. Ideal observer accounts of navigation have found evidence for perceptual cue integration, but some studies have reported single cues often dominating homing responses. However, purely perceptual accounts do not explicitly account for internal representations, motor planning, and the sequentiality of perception and action. Here, we present an ideal actor model of goal-directed navigation in terms of path planning in the framework of optimal control under uncertainty. This model explicitly accounts for state estimation and learning (Where am I? Where is my goal?) and planning and control (Where should I go? How do I get there?) while taking uncertainty in perception, action, and representation into account. Through simulation of five different triangle-completion experiments from three different laboratories with a single set of biologically plausible parameters, we demonstrate that the observed patterns of navigation are caused by the continuous and dynamic interaction of these three uncertainties. Contrary to ideal observer models, which attribute human endpoint variability to perceptual cue combination processes only, our ideal actor model provides a unifying account of a wide range of phenomena while considering variability in perception, action, and internal representations jointly. Importantly, these findings highlight how dynamic interactions of spatial uncertainties profoundly shape goal-directed navigation behavior and how active vision results from shaping uncertainties along the navigation trajectory, impacting cognitive maps, route planning, movement execution, and ultimately observed behavioral variability.
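For readers unfamiliar with triangle completion, the geometry of the task can be sketched as below: a walker covers two legs separated by a turn and must then return to the start. This is only a toy path-integration example, not the ideal actor model described above. The noise model (multiplicative noise on remembered leg lengths, additive noise on the remembered turn), the parameter names, and the function names are all assumptions for illustration.

```python
import math
import random


def homing_vector(leg1, turn_deg, leg2):
    """Return (distance, left turn in degrees) needed to walk back to the start.

    The walker starts at the origin heading along +x, walks leg1,
    turns left by turn_deg, then walks leg2.
    """
    heading = math.radians(turn_deg)
    x = leg1 + leg2 * math.cos(heading)
    y = leg2 * math.sin(heading)
    distance = math.hypot(x, y)
    home_dir = math.atan2(-y, -x)  # direction from current position to start
    # Wrap the required turn into [-180, 180) degrees.
    turn = math.degrees((home_dir - heading + math.pi) % (2 * math.pi) - math.pi)
    return distance, turn


def endpoint_error(leg1, turn_deg, leg2, sd_len=0.0, sd_turn=0.0, rng=None):
    """Distance from the start after homing with a noisy internal estimate."""
    rng = rng or random.Random(0)
    # Noisy remembered outbound path (assumed noise model).
    est1 = leg1 * (1 + rng.gauss(0, sd_len))
    est2 = leg2 * (1 + rng.gauss(0, sd_len))
    est_turn = turn_deg + rng.gauss(0, sd_turn)
    dist, turn = homing_vector(est1, est_turn, est2)
    # Execute the planned response from the true position and heading.
    heading = math.radians(turn_deg)
    x = leg1 + leg2 * math.cos(heading)
    y = leg2 * math.sin(heading)
    heading += math.radians(turn)
    x += dist * math.cos(heading)
    y += dist * math.sin(heading)
    return math.hypot(x, y)


print(homing_vector(3.0, 90.0, 4.0))
print(endpoint_error(3.0, 90.0, 4.0, sd_len=0.1, sd_turn=10.0))
```

With both noise parameters set to zero the walker returns exactly to the start; any endpoint error in this toy version comes only from the noisy memory of the outbound path.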

Acknowledgements: Calculations for this research were conducted on the Lichtenberg high performance computer of the TU Darmstadt. This research was supported by the European Research Council (ERC; Consolidator Award 'ACTOR'-project number ERC-CoG-101045783).

Talk 5, 9:15 am

Unconstrained visually-guided grasping is not precision grip

Fabrizio Lepori1, Frieder Hartmann2, Kira Dehn2, Manuela Chessa3, Roland W. Fleming2,4, Guido Maiello1; 1Department of Psychology, University of Southampton, 2Department of Experimental Psychology, Justus Liebig University Giessen, 3Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genoa, 4Centre for Mind, Brain and Behaviour (CMBB), University of Marburg and Justus Liebig University, Giessen

In everyday life our hands serve as versatile tools, interacting with objects in numerous ways. Yet most research on visually-guided grasping focuses on precision grips, constraining participants to use only their thumb and index finger. We asked whether insights gained from precision grip studies extend to situations in which participants are free to grasp objects however they choose. To test this, we used a subset of 3D objects from a recent study investigating how participants select precision grips on multi-material (brass and wood) objects [Klein, Maiello et al., 2020]. Twenty participants grasped these objects while we tracked their hand movements using a Qualisys passive marker motion capture system. In a first, unconstrained grasping session, participants were free to grasp the stimulus objects however they wanted. In a second, precision grip session, participants were required to grasp the objects using only thumb and index finger. We find that in unconstrained sessions participants rarely employed two-digit precision grips, which accounted for only 9.5% of unconstrained trials (p<.001), and the average position of the digits on the objects differed significantly between precision and unconstrained sessions (p=.019). Nevertheless, in both precision grip and unconstrained sessions participants shifted their grasps towards the objects’ center of mass to minimize grip torque (p=.023). Our data thus confirmed the influence of object visual material appearance, previously observed in precision grip experiments, and extended this result to unconstrained grasping. Additionally, upon closer inspection we found that the position of the thumb and index finger on the stimulus objects did not significantly differ between precision and unconstrained sessions (p=.218), suggesting that the remaining fingers primarily provided a support function. Thus, while participants may rarely spontaneously choose two-digit grasps, previous insights gained from precision grip experiments may still extend to natural, unconstrained grasping behaviours.

Talk 6, 9:30 am

Modeling the information-based control of steering through multiple waypoints

Brett Fajen1, AJ Jansen1; 1Rensselaer Polytechnic Institute

Modeling the visual control of steering has been an active area of research for decades, but the majority of work up to this point has focused on steering to a single target or along a winding road. There is much less work on the strategies used to steer through multiple waypoints, which is relevant in locomotor tasks such as slalom skiing. A critical open issue for modeling the multiple-waypoint task is how to capture the influence of information from waypoints that lie beyond the most immediate one. Recently, we found that humans do use such information, often altering their approach to the nearest waypoint, affording a smoother trajectory through the subsequent waypoint. The aim of the present study was to develop and test competing models that capture human steering behavior observed in multiple-waypoint tasks. We consider four models: (1) the behavioral dynamics model (Fajen & Warren, 2003) with a single goal (most immediate waypoint), (2) the behavioral dynamics model with two goals (two upcoming waypoints), (3) a pure-pursuit controller with a single goal (fixated waypoint) (Tuhkanen et al., 2023), and (4) a new model that relies on information about the constant-radius path that passes through the two upcoming waypoints. We simulated all four models and compared the model-generated trajectories to those produced by human subjects in a task that involves steering through multiple waypoints. Only Model 4 captures the shape of the human trajectories, initially veering away from the nearest waypoint before turning back, setting up a smoother trajectory through both waypoints. The other three models either do not anticipate (Model 1) or are influenced by the future waypoint but not in a way that is consistent with human behavior (Models 2 and 3). The findings demonstrate that human-like anticipation of multiple waypoints can be captured within an information-based framework.
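The geometry behind two of the model ingredients named above can be sketched in a few lines. The first function finds the unique circle through three points, here read as the current position and the two upcoming waypoints; this reading of the "constant-radius path" is an assumption, and the sketch is not the authors' Model 4. The second function gives the standard pure-pursuit curvature toward a single target. Function names and coordinates are hypothetical.

```python
import math


def circle_through(p0, p1, p2):
    """Center and radius of the circle through three points.

    Returns None when the points are (nearly) collinear, in which case
    the constant-radius path degenerates to a straight line.
    """
    (ax, ay), (bx, by), (cx, cy) = p0, p1, p2
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)


def pure_pursuit_curvature(pos, heading, target):
    """Signed curvature (positive = left turn) of the arc that leaves `pos`
    tangent to `heading` (radians) and passes through `target`."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    alpha = math.atan2(dy, dx) - heading  # target bearing relative to heading
    return 2.0 * math.sin(alpha) / dist


# Hypothetical layout: observer at the origin, two waypoints ahead.
print(circle_through((0.0, 0.0), (1.0, 1.0), (2.0, 0.0)))
print(pure_pursuit_curvature((0.0, 0.0), 0.0, (1.0, 1.0)))
```

The pure-pursuit arc depends only on the nearest target and the current heading, whereas the three-point circle is fixed by both waypoints, which is one way to see why a two-waypoint path can differ from a single-target controller.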

Acknowledgements: NSF 2218220

Talk 7, 9:45 am

Modularity of Brain Networks for Egocentric and Allocentric Memory-guided Reaching

Lina Musa1,3, Amirhossein Ghaderi1, Ying Chen6, J. Douglas Crawford1-5; 1Centre for Vision Research, York University, Toronto, ON, Canada, 2Vision Science to Applications (VISTA), York University, Toronto, ON, Canada, 3Department of Psychology, York University, Toronto, ON, Canada, 4Department of Biology, York University, Toronto, ON, Canada, 5Department of Kinesiology, York University, Toronto, ON, Canada, 6Centre for Neuroscience Studies, Queen’s University

The brain encodes targets for reaching in egocentric (EGO) and/or allocentric (ALLO) reference frames (Byrne and Crawford 2010). Differences in the cortical activation, but not functional organization, of these two representations have been described (Chen et al., 2014; Neggers et al., 2006). Based on previous findings, we expected increased integration and hubness in the ventral visual stream in ALLO brain networks. Here, we performed a secondary analysis of an event-related fMRI task (Chen et al., 2014). The paradigm consisted of 3 tasks with identical stimulus display but different instructions: remember absolute target location (EGO), remember target location relative to a visual landmark (ALLO), and a nonspatial control, color report. We performed a graph theoretical analysis (GTA) on contrast reduced, time-series data during the memory delay period. GTA measures, including hubness, clustering coefficient, and efficiency, were computed, as well as the organization of the network into modules. Dynamical measures of network connectivity (synchrony and complexity) were quantified for individual task modules. EGO and ALLO brain networks showed increased functional segregation and integration, relative to control. Contrary to expectations, there were no inferotemporal modules in either task; rather, the network was largely segregated into occipito-dorsal-parietal (ODP) and temporo-frontal (TF) modules. The ALLO network demonstrated significantly higher modularity and hubs in the ODP module, relative to the EGO network. When subtracting the common baseline correlation, the EGO network showed segregation of occipital brain areas from the ODP module, but the ALLO network did not. Our results demonstrate that rather than increased ALLO encoding of visual reach targets in the ventral stream, there is increased specialization in the interaction between early visual brain areas and dorsal parietal brain areas. There was also an increase in desynchronization and complexity in the ODP module in the ALLO task, indicating an increase in difficulty of information processing.
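Two of the graph measures mentioned above, the clustering coefficient and modularity, can be illustrated on a small toy network. This sketch assumes an unweighted, undirected graph and a given partition into modules, which is simpler than a functional connectivity analysis of fMRI time series and is not the authors' pipeline. The graph, the partition, and the function names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Adjacency sets for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def modularity(adj, communities):
    """Newman modularity Q of a given partition into communities."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for comm in communities:
        comm = set(comm)
        # Each within-community edge is seen from both endpoints, hence / 2.
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


# Toy network: two triangles joined by a single bridge edge (2-3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = build_adjacency(edges)
print(clustering_coefficient(adj, 0), clustering_coefficient(adj, 2))
print(modularity(adj, [{0, 1, 2}, {3, 4, 5}]))
```

Higher modularity means more connections fall within modules than expected by chance given node degrees, which is the sense in which one network can be called more segregated than another.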

Acknowledgements: Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council (NSERC), Vision: Science to Applications (VISTA) program.