Gaze Reveals How Task Demands Shape Learning from Demonstrations

Poster Presentation 16.319: Friday, May 15, 2026, 3:45 – 6:00 pm, Banyan Breezeway
Session: Eye Movements: Cognition

Inga Ibs1, Ingolf Tegtmeier1, Constantin A. Rothkopf1; 1Technical University of Darmstadt

In natural, everyday tasks, perception, cognition, and action are tightly intertwined. For example, when learning how to act from demonstrations, people need to select relevant information with their gaze based on their internal beliefs, reason about the demonstrations, and plan their subsequent actions. Here, we investigated how participants select information with their gaze in a controlled experiment that uses a navigation game to manipulate task difficulty and the informativeness of visual demonstrations. Participants (n=20) were shown trajectories of a fictional robot moving through environments with different terrains. They were instructed to infer the robot's movement costs associated with the different terrains and then guide the robot through a new environment. Task difficulty was manipulated by varying the number of terrains, and the informativeness of demonstrations was manipulated by how well an ideal-observer model could infer movement costs from the provided demonstrations. Analysis of the gaze data revealed a two-phase strategy: participants first gathered information and then executed their plan at the end of each trial. Furthermore, attention was directed more toward the most informative parts of the demonstrations, as quantified by the ideal-observer model. Gaze allocation was sensitive to the two task manipulations: in complex environments, participants switched more frequently between the demonstrations and the environment they had to navigate, with shorter fixation sequences allocated to each surface, whereas in less complex environments, they switched less often with longer sequences. Uninformative demonstrations led to extended switching, reflecting a more thorough search for information. In contrast, informative demonstrations led to slightly longer fixation sequences, particularly in lower-complexity environments.
These results indicate that participants adapted their visual attention strategies to support their inference and planning in response to task demands. Finally, using extensions of the ideal-observer model, we show how gaze patterns relate to participants' solutions to the navigation task.
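The abstract does not specify the ideal-observer model's implementation, but the core idea of inferring terrain costs from demonstrated trajectories can be illustrated with a minimal Bayesian sketch. All names, candidate costs, and the softmax-optimal demonstrator assumption below are illustrative, not the authors' actual model.

```python
import math

def path_cost(path, costs):
    """Total movement cost of a path, given per-terrain costs."""
    return sum(costs[t] for t in path)

def posterior_over_costs(demo, alternatives, candidate_costs, beta=2.0):
    """P(costs | demo) for each candidate cost assignment, assuming the
    demonstrator picks among paths with probability ~ exp(-beta * cost)
    and a uniform prior over candidates (an illustrative assumption)."""
    likelihoods = []
    for costs in candidate_costs:
        scores = [math.exp(-beta * path_cost(p, costs))
                  for p in [demo] + alternatives]
        likelihoods.append(scores[0] / sum(scores))
    z = sum(likelihoods)
    return [l / z for l in likelihoods]

# Hypothetical example: the demonstrated route stays on grass even though
# a shorter route through mud exists.
demo = ['grass', 'grass', 'grass']          # observed demonstration
alts = [['mud', 'mud']]                     # shorter mud route, not taken
candidates = [{'grass': 1.0, 'mud': 1.0},   # mud as cheap as grass
              {'grass': 1.0, 'mud': 3.0}]   # mud costly
post = posterior_over_costs(demo, alts, candidates)

# One way to score a demonstration's informativeness is how peaked the
# resulting posterior is, e.g. its negative entropy.
entropy = -sum(p * math.log(p) for p in post if p > 0)
```

Avoiding the shorter mud route shifts posterior mass toward the "mud is costly" hypothesis; a demonstration that fails to discriminate between candidates leaves the posterior flat and hence uninformative.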

Acknowledgements: This work was supported by the LOEWE research priority program “WhiteBox” [grant number LOEWE/2/13/519/03/06.001(0010)/77] (funded by the Hessian Ministry of Higher Education, Research, Science and the Arts).