V-VSS, June 1-2

Perception and Action, Eye Movements

Talk Session: Wednesday, June 1, 2022, 8:30 – 9:45 am EDT, Zoom Session

Talk 1, 8:30 am, 71.61

Measuring and modelling fixational eye movements

Allie C. Hexley1, Laura K. Young2, Tom C. B. McLeish3, Hannah E. Smithson1; 1University of Oxford, 2Newcastle University, 3University of York

Introduction: Fixational eye movements (FEMs) comprise periods of drift (slow, meandering motion), with superimposed tremor (fast oscillations), interrupted by microsaccades (fast, jump-like movements). Here, we record FEMs at high spatial and temporal resolution using an adaptive optics scanning laser ophthalmoscope (AOSLO). We use the data to develop models of FEMs, with emphasis on characterising the drift component. Methods: From each of 10 participants, we recorded 50 two-second AOSLO movies during foveal fixation. FEM traces were extracted in post-processing, relative to separately collected retinal image montages. Within each trace we classified periods of microsaccades, drift and superimposed tremor, and tracking failures, using both existing and new automated techniques. We validated each technique against ground-truth data, generated with an AOSLO simulator (ERICA; Young and Smithson, 2021), and against manual labelling. Drift periods of 500 ms were isolated from each trace and evaluated against candidate models selected from the random walk models that are common in the statistical (especially polymer) physics literature. Different models capture different behaviours, such as persistence or anti-persistence, self-avoidance, and bounding or localisation. Such differences in behaviour can characterise either drift under varying conditions or different timescales of the same drift response. Models were evaluated using diagnostic plots, such as the autocorrelation function and log-log plots of mean-squared displacement against time-lag. Results: Diagnostic plots showed high levels of consistency across participants once microsaccades were removed. There were individual differences in mean drift velocity, but random walk characteristics were largely preserved across participants. These characteristics are poorly fitted by existing models, and their extraction depends on microsaccade detection. Conclusions: We analysed FEMs against different random walk models to characterise drift, and we report the model that best fits the new data. Improved methods for extracting high-resolution drift traces from AOSLO recordings are important in delivering data that can discriminate between candidate models of ocular drift.
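
The mean-squared-displacement diagnostic described above can be computed in a few lines. Below is a minimal sketch (with hypothetical trace data, not the authors' pipeline) that estimates the MSD scaling exponent from a 2-D drift segment; an exponent near 1 indicates Brownian-like motion, above 1 persistence, and below 1 anti-persistence or bounded drift.

    import numpy as np

    def msd(trace, max_lag):
        """Mean-squared displacement of a 2-D eye trace (N x 2 array)
        for sample lags 1..max_lag."""
        lags = np.arange(1, max_lag + 1)
        out = np.empty(len(lags))
        for i, lag in enumerate(lags):
            d = trace[lag:] - trace[:-lag]          # displacements at this lag
            out[i] = np.mean(np.sum(d**2, axis=1))  # mean squared length
        return lags, out

    # Hypothetical 500 ms drift segment sampled at 1 kHz (Brownian-like toy data).
    rng = np.random.default_rng(0)
    trace = np.cumsum(rng.normal(scale=0.05, size=(500, 2)), axis=0)

    lags, m = msd(trace, max_lag=100)
    # Slope of log(MSD) vs log(lag): ~1 for Brownian motion, >1 for
    # persistent drift, <1 for anti-persistent (bounded) drift.
    alpha = np.polyfit(np.log(lags), np.log(m), 1)[0]
    print(f"MSD scaling exponent: {alpha:.2f}")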

Acknowledgements: This project has received funding from the EPSRC (EP/W004534/1); the Leverhulme Trust (VP1-2019-057); a Reece Foundation Fellowship in Translational Systems Neuroscience (Newcastle University) and a UKRI Future Leaders Fellowship (MR/T042192/1).

Talk 2, 8:45 am, 71.62

The saccadic eye movement system is ineffective for the perception of depth from motion parallax

Mark Delisi1, Mark Nawrot1; 1Center for Visual and Cognitive Neuroscience, North Dakota State University

Retinal image motion, absent other visual depth cues, is perceptually depth-sign ambiguous. For motion parallax (MP), pursuit eye movement signals disambiguate the depth-sign of the opposing directions of retinal motion. Disambiguation is achieved with extraordinarily brief stimulus presentations: 30 msec with unimpeded viewing and 70 msec with processing disrupted by masking. Transcranial magnetic stimulation (TMS) of the slow-eye-movement region of the frontal eye fields (FEFsem) disrupts both the generation of pursuit eye movement signals and the perception of depth from MP. Given the proximity of the FEF regions for pursuit and saccadic eye movements, the involvement of saccadic eye movements in the disruption of MP must be explored. To investigate the role of saccadic eye movements in the perception of depth-sign from MP, we varied stimulus presentation duration (PD) in an adaptive staircase. PD started at 167 msec and moved between a floor of 17 msec and a ceiling of 333 msec. The random-dot stimulus depicted a vertically oriented sinusoid with a peak local dot velocity of 4.7 deg/s. To elicit pursuit, the stimulus window translated at 9.4 deg/s. To elicit saccades, the window stepped laterally 1 deg. Five conditions were compared, including a stationary stimulus window, lateral window translation (pursuit), and stepped window movement (saccades). Eye position was monitored. Absent stimulus window translation, PD tracked towards the ceiling, meaning these stimuli were perceptually ambiguous. In the pursuit conditions, PD tracked towards the floor, meaning these stimuli were depth-sign unambiguous. In the saccade condition, PD tracked towards the ceiling, meaning that saccades do not disambiguate the depth-sign of MP. Saccadic eye movements therefore do not disambiguate depth from MP as pursuit eye movements do. This means that any disruption of MP produced by TMS of FEFsem is linked to disruption of pursuit, not to a possible disruption of the saccadic system.
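
For readers unfamiliar with the procedure, here is a minimal sketch of a one-up/one-down adaptive staircase on presentation duration using the floor, ceiling, and starting values reported above; the step size and the toy observer are illustrative assumptions, not the authors' exact protocol.

    import random

    FLOOR_MS, CEILING_MS, START_MS = 17, 333, 167
    STEP_MS = 17  # illustrative assumption (one frame at ~60 Hz)

    def run_staircase(observer, n_trials=60):
        """One-up/one-down staircase on presentation duration (PD).
        `observer(pd_ms)` returns True if depth-sign was reported correctly."""
        pd, history = START_MS, []
        for _ in range(n_trials):
            correct = observer(pd)
            history.append((pd, correct))
            # Correct -> shorter PD (towards floor); incorrect -> longer PD.
            pd = max(FLOOR_MS, pd - STEP_MS) if correct else min(CEILING_MS, pd + STEP_MS)
        return history

    # Toy observer for an unambiguous (pursuit-like) condition: reports stay
    # mostly correct, so PD is driven towards the floor; an ambiguous
    # stimulus would drive PD towards the ceiling instead.
    history = run_staircase(lambda pd: random.random() < 0.9)
    print("final PD:", history[-1][0], "ms")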

Acknowledgements: Supported by NIH NEI R15 EY031129 and NIH NIGMS P30 GM114748

Talk 3, 9:00 am, 71.63

Oculomotor influences on the dynamics of visual sensitivity

Michele A. Cox1,2, Janis Intoy1,2, Yuanhao H. Li1,2, Scott Murdison3, Bin Yang1,2, Zhetuo Zhao1,2, Michele Rucci1,2; 1Department of Brain and Cognitive Sciences, University of Rochester, USA, 2Center for Visual Science, University of Rochester, USA, 3Reality Labs, Redmond, WA, USA

Humans continually move their eyes, alternating saccades with a smooth fixational motion known as ocular drift. The saccade/drift cycle modulates the visual input to the retina in a highly specific manner, delivering spatiotemporal signals whose power shifts from low to high spatial frequencies over the course of post-saccadic fixation. Recent research has shown that these signals contribute to coarse-to-fine dynamics of contrast sensitivity when stimuli are presented in the central visual field. Here we show that the saccade/fixation cycle carries similar perceptual consequences across eccentricities. We measured contrast sensitivity to low and high spatial frequencies (2 or 10 cycles/deg) at three visual eccentricities (0, 4, and 8 deg) and various delays (50, 150, or 500 ms) following an instructed saccade (6.6 degrees). Subjects were asked to report the presence/absence of a circular grating embedded within a naturalistic noise field while their eye movements were recorded with a digital dual-Purkinje-image (DPI) eyetracker. To elicit a normal saccade transient while preventing visibility of the grating before the saccade, the grating appeared at saccade onset and remained on the display for a fixed interval following saccade offset. Sensitivity to the low spatial frequency was high immediately following the saccade and uniform across eccentricities; continued exposure during fixation provided minimal improvement. In contrast, sensitivity to the high spatial frequency declined with increasing eccentricity, as expected; however, it improved with fixation duration at a similar rate at all eccentricities. To examine the origins of these effects, we exposed models of retinal ganglion cells (magno- and parvocellular, ON and OFF) to reconstructions of the visual input signals experienced by subjects in the experiments. A standard decision-making model that accumulated responses over the saccade/fixation cycle accurately replicated the visual dynamics. Dissection of the model shows that the oculomotor-shaped luminance dynamics are primarily responsible for these effects.
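
Sensitivity in a presence/absence task of this kind is commonly summarised with signal-detection d'. The sketch below is a generic illustration, not the authors' analysis code: it computes d' at each post-saccadic delay from hypothetical hit and false-alarm counts.

    import numpy as np
    from scipy.stats import norm

    def dprime(hits, misses, fas, crs):
        """Signal-detection d' for a yes/no (grating present/absent) task,
        with a small-count correction to avoid infinite z-scores."""
        hit_rate = (hits + 0.5) / (hits + misses + 1.0)
        fa_rate = (fas + 0.5) / (fas + crs + 1.0)
        return norm.ppf(hit_rate) - norm.ppf(fa_rate)

    # Hypothetical counts (hits, misses, false alarms, correct rejections)
    # at the three post-saccadic delays; values are illustrative only.
    for delay, counts in {50: (45, 5, 8, 42), 150: (44, 6, 9, 41),
                          500: (46, 4, 7, 43)}.items():
        print(f"{delay:3d} ms: d' = {dprime(*counts):.2f}")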

Acknowledgements: This work was supported by Reality Labs. MR and JI contributions were supported by National Institutes of Health grants EY018363 and EY029565.

Talk 4, 9:15 am, 71.64

Virtual visual navigation during context-dependent learning in the human hippocampus using intracranial recordings (SEEG)

Nasim Mortazavi1, Milad Khaki1, Greydon Gilmore2, Jorge Burneo3, David Steven3, Ana Suller-Marti3, Julio Martinez-Trujillo; 1Schulich School of Medicine and Dentistry, Western University, 2Department of Biomedical Engineering, Western University, 3Department of Clinical Neurological Sciences, London Health Sciences Centre, Western University

The hippocampal cognitive map can reflect spatial and nonspatial task conditions by binding relevant aspects of experience within a context. Sharp-wave ripples (SWRs) are the most synchronous neural pattern observed during memory consolidation. Using a visual context-associative learning paradigm, we examined the temporal relationship between the incidence of hippocampal SWRs and target collection across trials. This study investigates how activity in the human hippocampus changes with contextual conditions within the same space. We anticipated that learning the task would increase the likelihood of recording these task-related activities. As part of Western University's epilepsy program, participants were implanted with depth intracranial electrodes using stereoelectroencephalography (SEEG) for preoperative evaluation. Participants navigated the boundaries of a circular maze while collecting treasure boxes and earning points. Contextual information was displayed on the maze walls, and coloured targets were then displayed in the decision zone after participants left the navigation zone. We developed an algorithm for detecting SWRs and synchronized its output with a behavioural state-space model. The 35th of 42 trials was identified as the initial learning trial, after which performance was 99% correct. Across four electrodes implanted in the right and left hippocampus, 86% of SWRs were detected before the initiation point. The ripple rate peaked when the player reached 30% of the task's total duration. At the significant change point, the normalized event rate was about 10 times greater in incorrect than in correct trials. Preliminary findings from one patient's recordings indicate that the rate of SWRs increases as learning occurs. Two specific increases in SWRs were detected, in both successful and unsuccessful trials. Moreover, the significant change in ripple rate was associated with unsuccessful trials, illustrating the importance of failed trials during the learning process.
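
For context, here is a minimal sketch of a standard ripple-band detection approach (band-pass filter, analytic-signal envelope, duration-gated threshold); the 80-150 Hz band and all parameters are generic assumptions, not the authors' algorithm.

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def detect_swrs(lfp, fs, band=(80, 150), thresh_sd=3.0, min_dur_ms=20):
        """Return (start_s, stop_s) pairs of candidate sharp-wave ripples:
        band-pass the LFP, take the analytic-signal envelope, and keep
        z-scored excursions above `thresh_sd` lasting >= `min_dur_ms`."""
        b, a = butter(3, np.array(band) / (fs / 2), btype="band")
        envelope = np.abs(hilbert(filtfilt(b, a, lfp)))
        z = (envelope - envelope.mean()) / envelope.std()
        above = z > thresh_sd
        edges = np.flatnonzero(np.diff(above.astype(int)))  # state changes
        if above[0]:
            edges = np.r_[0, edges]
        if above[-1]:
            edges = np.r_[edges, len(above) - 1]
        min_len = int(min_dur_ms * fs / 1000)
        return [(s / fs, e / fs) for s, e in zip(edges[::2], edges[1::2])
                if e - s >= min_len]

    # Usage with hypothetical data: 10 s of noise sampled at 1 kHz.
    rng = np.random.default_rng(1)
    events = detect_swrs(rng.normal(size=10_000), fs=1000)
    print(f"{len(events)} candidate ripple events")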

Talk 5, 9:30 am, 71.65

Gaze Grammars - Is there an invariant hierarchical sequential structure of human visual attention in natural tasks?

John Harston1, Roshan Chainani1, Aldo Faisal1,2,3; 1Brain and Behaviour Lab, Imperial College London, 2UKRI Centre in AI for Healthcare, Imperial College London, 3MRC London Institute of Medical Sciences

Human visual attention is highly structured around gathering information relevant to the goals or sub-goals one wishes to accomplish. Typically, this has been modelled using either qualitative top-down saliency models or highly reductionist psychophysical experiments. Modelling an information-gathering process in these ways, however, often ignores the rich and complex repertoire of behaviour that makes up ecologically valid gaze. We propose a new way of analysing natural data, suitable for characterising the temporal structure of visual attention in complex tasks with freely moving observers. To achieve this, we capture visual information from subjects performing an unconstrained task in the real world; in this case, cooking in a kitchen. We use eye-tracking glasses with a built-in scene camera (SMI ETG 2W @ 120 Hz) to record n=15 subjects setting up, cooking breakfast, and eating in a real-world kitchen. We process the visual data using a deep-learning-based pipeline (Auepanwiriyakul et al., 2018, ETRA) to obtain the stream of objects in the field of view. Eye tracking gives us the sequence of objects on which people focus their overt attention throughout the task; we resolve ambiguities using pixel-level object segmentation and classification techniques. We analyse these sequences using HMM and context-free grammar induction models (IGGI), revealing a potential hierarchical structure that is invariant across subjects. We compare this grammatical structure against a “ground truth”, the WordNet lexical database of semantic relationships between objects (Miller et al., 1995), and find some surprising similarities and counterintuitive differences between the attention-derived and text-based structures of objects, suggesting that the relationship between how we look at tasks and how we verbally reason about them remains an open question in cognition and attention.
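
As a simple illustration of one step in such a pipeline, the sketch below estimates first-order transition probabilities over a sequence of fixated object labels; the labels and data are hypothetical, and the HMM and grammar-induction models used in the study capture structure well beyond these first-order statistics.

    from collections import Counter, defaultdict

    def transition_matrix(sequence):
        """First-order transition probabilities between fixated object
        labels, as {label: {next_label: probability}}."""
        counts = defaultdict(Counter)
        for cur, nxt in zip(sequence, sequence[1:]):
            counts[cur][nxt] += 1
        return {cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
                for cur, c in counts.items()}

    # Hypothetical fixation sequence from one subject making breakfast.
    fixations = ["kettle", "cup", "kettle", "cup", "spoon", "cup",
                 "toaster", "bread", "toaster", "plate", "bread", "plate"]
    for obj, nexts in transition_matrix(fixations).items():
        print(obj, "->", nexts)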

Acknowledgements: EPSRC