Action

Talk Session: Sunday, May 17, 2026, 5:15 – 7:15 pm, Talk Room 2
Moderator: Cristina de la Malla, Universitat de Barcelona

Talk 1, 5:15 pm, 35.21

Contact-point selection in grasping under degraded vision

Carla Lanca1,2, Zoltan Derzsi1,2,3, Robert Volcic1,2,3; 1Division of Science, New York University Abu Dhabi, 2Center for Brain and Health, New York University Abu Dhabi, 3Center for Artificial Intelligence and Robotics, New York University Abu Dhabi

Successful grasping relies not only on motor execution but also on the selection of effective grasp contact positions on the object surface from available visual information. When vision is degraded, the visuomotor system must compensate to maintain successful object interaction. We hypothesized that, under full vision, participants would engage in refined planning, selecting object regions and approach trajectories that maximize stability and overall grasp quality. To test this, we examined how visuomotor choices change when vision is reduced using controlled blur. Participants (n=23) performed reach-to-grasp movements toward 3D-printed objects presented at different orientations and two levels of contrast while hand kinematics were recorded. Under full-vision conditions, behavior showed clear signatures of visually guided selection of grasp contact positions. Movements reflected active consideration of optimal approach paths and contact regions. Grip shaping unfolded later in the reach, consistent with the use of detailed visual cues to refine grasp contact positions up to the moment of grasp. Importantly, grasping angles sometimes deviated intentionally from each participant’s natural grasp axis (NGA), reflecting visually driven choices to optimize grasp contact quality. When vision was degraded, behavior shifted toward a more conservative strategy. Movement initiation slowed and maximum grip aperture occurred earlier, indicating reduced reliance on fine-grained visual monitoring. Grasps were less clustered around the center of mass and grasping angles aligned more closely with each participant’s NGA, suggesting increased reliance on habitual grasp configurations. Together, these patterns suggest that when vision is available, the sensorimotor system prioritizes grasp quality by refining the selection of grasp contact positions. Under degraded vision, the system simplifies movements, trades spatial precision for robustness, and reverts to natural grasp strategies to maintain successful object interaction. The findings highlight the central role of visual information in shaping the selection of grasp contact positions and in structuring grasping behavior.
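
For illustration only (this is not the authors' analysis code; the fingertip coordinates, units, and planar simplification are assumptions), the deviation of a grasp axis from a participant's NGA might be computed from thumb and index contact points roughly as follows, in Python:

    import numpy as np

    def grasp_axis_angle(thumb_xy, index_xy):
        """Orientation (deg, 0-180) of the thumb-index grasp axis in the object plane."""
        dx, dy = np.asarray(index_xy, dtype=float) - np.asarray(thumb_xy, dtype=float)
        return np.degrees(np.arctan2(dy, dx)) % 180.0   # the grasp axis has no direction

    def deviation_from_nga(grasp_deg, nga_deg):
        """Smallest angular difference between the grasp axis and the NGA (0-90 deg)."""
        d = abs(grasp_deg - nga_deg) % 180.0
        return min(d, 180.0 - d)

    # Example: a grasp axis of ~35 deg against an NGA of 10 deg deviates by ~25 deg
    g = grasp_axis_angle((0.0, 0.0), (np.cos(np.radians(35)), np.sin(np.radians(35))))
    print(deviation_from_nga(g, 10.0))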

We acknowledge the support of the NYUAD Center for Artificial Intelligence and Robotics and the NYUAD Center for Brain and Health, funded by Tamkeen under the NYUAD Research Institute Awards CG010 and CG012.

Talk 2, 5:30 pm, 35.22

Dissociating the effects of haptic object size and orientation on grasp aperture and orientation

Robert Whitwell1,2, Angus Lau1; 1Department of Physiology & Pharmacology, Western University, London, ON, Canada, 2Department of Psychology, Western University, London, ON, Canada

When we reach for an object, visual input about the object's size and orientation informs the hand’s in-flight grasp aperture and orientation, respectively. The visuomotor relationships between these object properties and their corresponding grasp features rely on distinct but overlapping areas of intraparietal cortex. Nevertheless, grasps do not rely solely on a visual analysis of the target. For example, a persistent absence of a tangible, real object induces variable and exaggerated scaling of grasp aperture to object size in normally-sighted participants, and in the case of 'DF', who has visual form agnosia, an absence of grip scaling altogether. Are object size and orientation coded independently in the haptic domain, as they are in the visual domain? We used a dual adaptation paradigm to address this question. Normally-sighted participants (N=96) reached for visual virtual targets whose real (haptic) sizes and/or orientations were congruent or incongruent with their visual counterparts. Adaptation to incongruent haptic object orientation and size was assessed by measuring aftereffects on grasp orientation and aperture, respectively. Using the logic of dual adaptation, when compensatory updates require changes to the same underlying circuit, interference ensures that one or both aftereffects are null. By this same token, independent circuitry allows both compensatory updates to occur and, accordingly, produces aftereffects for both manipulations. Consistent with independence, haptic changes in target size and orientation induced aftereffects on grasp aperture and orientation, respectively. Furthermore, aftereffects on grasp orientation and aperture under dual adaptation conditions were not significantly different from the effects observed when each haptic target feature was manipulated in isolation. Moreover, inconsistent with a simpler visuomotor memory-based explanation, aftereffects on grasp aperture and grasp orientation extended to non-trained visual target sizes and orientations. Overall, these findings indicate that haptic object size and orientation contribute independently to updating motor plans for grasp orientation and aperture.
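
A minimal sketch of the aftereffect measure (assumed data and variable names, not the authors' pipeline): an aftereffect is the mean shift in a grasp feature on probe trials after adaptation relative to pre-adaptation baseline, and the independence prediction is that the same computation applied to grasp orientation also yields a reliable shift under dual adaptation.

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical probe trials: peak grip aperture (mm) before and after adapting
    # to an incongruently sized haptic target (values invented for illustration).
    pre_aperture = 80 + rng.normal(0, 2, 20)
    post_aperture = 84 + rng.normal(0, 2, 20)

    def aftereffect(pre, post):
        """Mean shift in a grasp feature from pre- to post-adaptation probe trials."""
        return float(np.mean(post) - np.mean(pre))

    print(f"aperture aftereffect: {aftereffect(pre_aperture, post_aperture):.1f} mm")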

This work was supported by an NSERC Discovery Grant to RLW (RGPIN-2022-05050).

Talk 3, 5:45 pm, 35.23

Visual representations reflect the potential for action

Michael L. Paavola1, Kade T. Tanke1, J. Toby Mordkoff1, Cathleen M. Moore1; 1University of Iowa

The ultimate function of vision is the guidance of action. Using the Simon Effect as a tool, we tested the hypothesis that spatial representations include information about whether and how objects can be acted on in addition to their locations in space. The Simon Effect is the finding that responses made with effectors that are on the same side of space as the stimuli on which the responses are based tend to be faster and more accurate than responses made with opposite-side effectors. Critically, Simon Effects occur despite location being irrelevant to the task. The task may be, for example, to make a right-hand response to orange stimuli and a left-hand response to blue stimuli; location is irrelevant. The Simon Effect, therefore, provides a measure of spatial information that is incorporated into visual representations independent of instruction. The question addressed here was: what is the nature of that spatial information? Using virtual reality to present stimuli at variable distances from the observer, we showed that the Simon Effect was substantially reduced when stimuli appeared at unreachable distances. In another study, again using virtual reality, we presented colored balls that started far away from the observer on the left or right side. The balls moved toward the observer in either a straight trajectory toward the hand on the same side or in a crossed trajectory toward the hand on the opposite side. The Simon Effect for cross-trajectory balls was almost completely reversed relative to same-trajectory balls. Thus, the effective spatial information was not where the balls were in space, which was the same across trajectory conditions, but where they would be if acted on. These studies demonstrate that spatial representations of objects in three-dimensional dynamic environments reflect the potential for acting on the objects, not simply where they are in space.
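
As a hedged sketch (invented reaction times, not the reported data), the Simon Effect reduces to a simple difference between mean response times for opposite-side and same-side effector-stimulus pairings:

    import numpy as np

    rng = np.random.default_rng(1)
    # Hypothetical reaction times (ms): responses made with the effector on the same
    # side as the stimulus versus the opposite side (location is task-irrelevant).
    rt_same = 420 + rng.normal(0, 30, 200)
    rt_opposite = 450 + rng.normal(0, 30, 200)

    simon_effect = rt_opposite.mean() - rt_same.mean()   # positive = standard Simon Effect
    print(f"Simon Effect: {simon_effect:.1f} ms")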

Talk 4, 6:00 pm, 35.24

Pantomimed actions recruit intuitive knowledge about visuomotor feedback

Sholei Croom1, Chaz Firestone1; 1Johns Hopkins University

Visually guided behavior arises from a complex synergy between perceptual and motor systems; when someone grabs a cup to take a drink, for example, the mechanics of their reach are updated online in response to evolving perceptual input. To what extent are ordinary observers aware of this aspect of others’ goal-directed actions? Here, we explored these questions through “pantomimed actions”, in which people perform actions with imaginary objects. We created a stimulus set of videos where agents performed both genuine object-directed actions (e.g., stepping over a box) and pantomimes of those actions (e.g., stepping over an imagined box). We asked (a) whether naive observers who watch these videos can distinguish real actions from pantomimed actions, and (b) which kinds of information underwrite this performance. In Experiment 1, subjects watched raw video of real and pantomimed actions side-by-side, with a black ‘censor bar’ covering the real (or imagined) object’s location. Under these conditions, subjects were able to distinguish the two action types at rates above chance; for example, they could tell whether someone was interacting with a real (vs merely imagined) box, and also whether someone was shuffling between two real (vs merely imagined) poles. Moreover, subjects’ text responses reflected rich inferences about which features of the movement should be diagnostic. However, in Experiment 2, subjects viewed the same actions but with body movements instead depicted by simple ‘pose skeletons’ (dots connected by lines on a black background) generated from the original videos. Under these conditions, observer performance dropped to chance, despite the kinematic information being preserved across experiments. Together, these results suggest that ordinary people can relate differences in action kinematics to differences in sensory conditions, but that this capacity must be grounded in contextual information about the actor’s relation to their environment.
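
A minimal sketch of an above-chance test (the accuracies below are invented, not the reported data, and the specific test the authors used is not assumed here):

    import numpy as np
    from scipy import stats

    # Hypothetical per-subject accuracies for telling real from pantomimed actions
    # in the side-by-side videos; chance performance is 0.5.
    acc = np.array([0.68, 0.55, 0.72, 0.61, 0.58, 0.66, 0.70, 0.63])
    t, p = stats.ttest_1samp(acc, popmean=0.5, alternative="greater")
    print(f"mean accuracy = {acc.mean():.2f}, t = {t:.2f}, p = {p:.4f}")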

Talk 5, 6:15 pm, 35.25

Dissociable systems for visually guided navigation and reaching in human parietal cortex

Hee Kyung Yoon1, Yaelan Jung1, Daniel Dilks1; 1Emory University

The parietal cortex is widely thought to support visually guided actions, but whether it contains distinct regions specialized for different action classes – such as navigation, reaching, and grasping – remains unknown. Prior work implicates the superior parietal lobule (SPL) in navigation and the superior parieto-occipital cortex (SPOC) in reaching, yet whether these regions are truly dissociable is unclear because neither region has been tested for the other function. Here, we addressed this question using fMRI in human adults. Participants viewed four types of stimuli: Dynamic Scenes (video clips of first-person motion through scenes), Static Scenes (static images taken from these same movies, rearranged such that first-person motion could not be inferred), Contextual Reaching (video clips of first-person reaching motion on a scene background), and Isolated Reaching (video clips of the same actions on a black background). A clear double dissociation emerged: SPL responded significantly more to Dynamic than Static Scenes – consistent with its role in visually guided navigation – and, critically, more to Dynamic Scenes than either reaching condition. By contrast, SPOC responded significantly more to both reaching conditions than to either scene condition. Resting-state functional connectivity further supported this double dissociation: SPL showed stronger connectivity with “leg-motor cortex” than with “arm-motor cortex”, whereas SPOC showed the opposite pattern. Together, these findings reveal two distinct parietal systems: SPL for visually guided navigation and SPOC for visually guided reaching – clarifying how the parietal cortex organizes visually guided action.
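
For illustration of the double-dissociation logic only (the ROI values below are invented, not the reported estimates):

    # Hypothetical mean ROI responses (arbitrary units) to the four conditions.
    conds = ["dynamic_scenes", "static_scenes", "contextual_reaching", "isolated_reaching"]
    spl = dict(zip(conds, [1.2, 0.5, 0.6, 0.4]))    # navigation-preferring profile
    spoc = dict(zip(conds, [0.4, 0.3, 1.1, 1.0]))   # reaching-preferring profile

    def navigation_selective(roi):
        """Dynamic scenes exceed static scenes and both reaching conditions."""
        return roi["dynamic_scenes"] > max(roi["static_scenes"],
                                           roi["contextual_reaching"],
                                           roi["isolated_reaching"])

    def reaching_selective(roi):
        """Both reaching conditions exceed both scene conditions."""
        return (min(roi["contextual_reaching"], roi["isolated_reaching"]) >
                max(roi["dynamic_scenes"], roi["static_scenes"]))

    print(navigation_selective(spl), reaching_selective(spoc))   # True True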

R01 EY29724

Talk 6, 6:30 pm, 35.26

Environment and task structure natural gaze behavior: Spatial and spatio-temporal patterns in stair vs. hill climbing

Mahdis Dadfar1, Kathryn Bonnen2, Trenton D Wirth1; 1Department of Psychology, University of Cincinnati, 2Department of Optometry, Indiana University

During locomotion over complex terrain, surface geometry constrains gaze behavior: walkers adapt their strategies to locomotor demands, with gaze tuned to foothold availability (Matthis, Yates & Hayhoe, CB 2018; Wirth & Matthis, VSS 2022). But how does a highly regular, constructed surface like a staircase, compared to a natural slope, constrain the spatio-temporal structure of gaze during walking? This study (N=5) examined gaze strategies as participants ascended and descended 30 m of stairs (18°) and an adjacent sloped hill (22°). A mobile eye tracker and IMU (Pupil Neon, Berlin) recorded eye elevation and head pitch during each walk; their sum defined gaze angle. Because stairs impose greater foot-placement constraints, we predicted more structured gaze patterns on the stairs than on the hill. Mean gaze angle showed strong effects of terrain and direction, with steeper, more downward-angled gaze on stairs and during descent, and a significant interaction (all p<.01). We analyzed spatio-temporal gaze behavior using recurrence quantification analysis (RQA) of z-scored gaze angle, assessing recurrence rate (RR; ε=0.07, Lmin=5) in a mixed-effects model. RR was higher on the staircase than on the hill, F(1,192)=60.64, p<.001, with more revisitation during descent and a larger descent–ascent difference on stairs (all p<.001). This finding supports our hypothesis by demonstrating greater spatial regularity on stairs. Next, we fixed RR at 0.05, allowing ε to vary. We found only walking direction to be significant; ascending walks exhibited greater determinism, entropy, and laminarity (all p<.001), indicating more regular and complex gaze sequences during ascent, independent of terrain. Additionally, mean gaze autocorrelation (lags 0.1–5 s) displayed only a terrain × direction interaction (p≈.015), suggesting context-specific temporal persistence. Taken together, terrain predictability and locomotor demands jointly shape gaze dynamics, but in different ways. Stair walking produced more spatial structure (revisitation) and was direction-sensitive, while ascent elicited greater spatio-temporal structure within those recurrences.
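
A simplified sketch of the recurrence-rate measure on a one-dimensional gaze-angle trace (the signal below is synthetic, and this omits the embedding, Lmin, and the line-based RQA measures such as determinism and laminarity):

    import numpy as np

    def recurrence_rate(signal, eps=0.07):
        """Fraction of off-diagonal point pairs of a z-scored 1-D series within eps."""
        x = np.asarray(signal, dtype=float)
        x = (x - x.mean()) / x.std()
        d = np.abs(x[:, None] - x[None, :])      # pairwise distance matrix
        n = len(x)
        return (np.count_nonzero(d <= eps) - n) / (n * (n - 1))

    rng = np.random.default_rng(2)
    eye_elevation = 10 * np.sin(np.linspace(0, 20, 1500))   # deg, invented trace
    head_pitch = -25 + rng.normal(0, 1, 1500)               # deg, invented trace
    gaze_angle = eye_elevation + head_pitch                 # their sum defines gaze angle
    print(f"RR = {recurrence_rate(gaze_angle):.3f}")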

University of Cincinnati Office of Research Collaborative Pilots Grant 1022124

Talk 7, 6:45 pm, 35.27

Retinal curl as a direct control variable for locomotion: evidence from gaze-contingent biases

Joan Lopez-Moliner1, Kontessa Ioanna Zorpala1; 1Universitat de Barcelona

Traditional models of visually guided locomotion assume that the brain must recover the Focus of Expansion (FOE) by filtering out rotational flow (curl) caused by eye movements. We propose an alternative control procedure in which the visual system uses retinal curl directly to steer locomotion, rendering the explicit recovery of heading direction unnecessary. We exposed participants (n=10) to visual stimuli simulating walking along straight and curved paths (1 m/s), presented on a large screen (2.03 m x 1.16 m; PROPixx at 120 Hz) displaying a natural-like textured floor generated with simplex noise. Participants maintained fixation on a ground target at different eccentricities relative to their path, a natural behavior that introduces sustained retinal curl. Participants continuously reported their perceived direction of motion via a rotational encoder. To isolate the role of rotational flow, we employed a real-time manipulation: in critical trials, we cancelled the curl component centered on the fovea while preserving translational flow, effectively removing the rotational signature of gaze stabilization. Under natural flow conditions, participants exhibited systematic heading biases opposite to the direction of gaze (e.g., fixating left induced a rightward bias). Crucially, these biases vanished in the 'cancelled curl' condition, identifying retinal curl as the specific driver of the perceptual shift. We successfully modeled these trajectories using a simple feedback controller that steers locomotion proportionally to the instantaneous mean retinal curl, without any explicit reconstruction of heading direction. The model fits demonstrate that a system reacting solely to retinal curl, treating it as an error signal rather than noise, automatically produces the observed behaviors. These findings challenge the assumption that the visual system must decompose flow into rotation and translation. Instead, by exploiting the geometry of gaze stabilization, the brain may simplify navigation into a low-dimensional control task where recovering the true heading vector is computationally superfluous.
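
A minimal sketch of such a controller (the gain and curl values are assumptions, not the authors' fitted parameters): heading is updated each frame in proportion to the mean retinal curl, with no recovery of the FOE.

    import numpy as np

    def steer_from_curl(mean_curl_per_frame, gain=1.5, dt=1/120, heading0=0.0):
        """Integrate heading (rad), turning in proportion to the instantaneous
        mean retinal curl treated as an error signal."""
        heading = heading0
        headings = []
        for curl in mean_curl_per_frame:    # mean curl of the retinal flow (rad/s)
            heading -= gain * curl * dt     # proportional feedback; no FOE recovery
            headings.append(heading)
        return np.array(headings)

    # Example: one second of constant curl (as induced by eccentric gaze) at 120 Hz
    print(steer_from_curl(np.full(120, 0.05))[-1])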

JLM was supported by Grant PID2023-150081NB-I00 funded by MICIU/AEI/10.13039/501100011. KIZ was supported by fellowship PREP2023-001890 from the Ministry of Science and Innovation.

Talk 8, 7:00 pm, 35.28

Visual model of collision avoidance generalizes to mutual avoidance

Kyra Veprek1, William Warren1; 1Brown University

Walking pedestrians rely on visual information to avoid collisions with moving obstacles. A visual model based on the obstacle’s (i) change in bearing angle and (ii) optical expansion rate closely reproduced avoidance behavior with single and multiple moving obstacles, and crossing flows of pedestrians (Bai, 2022; Veprek & Warren, VSS 2024, 2025). However, we developed the model for participants avoiding non-responsive obstacles that moved on linear trajectories. Here, we investigate whether the model generalizes to mutual collision avoidance between two responding pedestrians. We recorded pairs of participants who simultaneously walked across a room and avoided colliding with each other. Participants walked between virtual targets in a mixed-reality environment (Meta Quest 3), where the physical room and the other participant were visible. Dyads started each trial at different locations on a circle (r = 2.75 m) and we manipulated their angle of crossing (60, 90, 120, 150, 180 degrees) and walking speed (‘fast’ and ‘normal’) to vary the potential collisions. We tested the visual model against the mutual avoidance data under three model-fitting conditions: (1) fixed parameters previously fit to single-avoidance data (Bai, 2022), (2) refitting the optical-expansion threshold for initiating avoidance, and (3) refitting all parameters to the mutual-avoidance data. Position error, defined as the mean Euclidean distance between predicted and observed trajectories (0.30 m with fixed parameters), was only marginally reduced (~0.01 m) by refitting. This result indicates that the single-avoidance model also explains mutual avoidance behavior. Consistent with model predictions, the difference in Time-to-Collision-Point, defined as the projected time to the intersection of the two paths, was a strong predictor of which participant passed in front of the other (logistic regression, accuracy = 92.96%). These findings show that the same visual control principles governing single-obstacle avoidance extend to mutual pedestrian interactions.
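
A minimal sketch of the position-error metric (the trajectories below are invented three-sample examples, not the recorded data):

    import numpy as np

    def position_error(predicted, observed):
        """Mean Euclidean distance (m) between model-predicted and observed
        trajectories sampled at the same time points (N x 2 arrays)."""
        diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
        return float(np.mean(np.linalg.norm(diff, axis=1)))

    pred = np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.2]])
    obs = np.array([[0.0, 0.1], [0.6, 0.1], [1.1, 0.3]])
    print(f"position error = {position_error(pred, obs):.2f} m")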

This research was supported by NIH R01EY029745.