Attention: Mechanisms and models

Talk Session: Sunday, May 21, 2023, 5:15 – 7:15 pm, Talk Room 1
Moderator: Sarah Shomstein, The George Washington University

Talk 1, 5:15 pm, 35.11

Direct attention-independent expectation effects on visual perception

Alon Zivony1, Martin Eimer1; 1Birkbeck, University of London

It is often claimed that probabilistic expectations can affect visual perception directly, without any role for selective attention. However, these claims have recently been disputed, as effects of expectation and attention are notoriously hard to dissociate experimentally. Thus, despite the voluminous research on the topic, clear-cut demonstrations of direct expectation effects are still needed. In this study, we used a new approach to separate expectations from attention. In three experiments (N=45), participants searched for a digit or a letter defined by a low-level cue (color or shape) in a rapid serial visual presentation (RSVP) stream and had to report its identity. Expectations about the alphanumeric category of the target were probabilistically manipulated. Since the target was embedded among many distractors that shared its category, and since category membership is a high-level feature, we predicted that targets from the expected category should not attract attention more than targets from the unexpected category. In all three experiments, expected targets were more likely to be identified than unexpected targets, indicative of a direct attention-independent expectation effect on perception. In Experiments 2 and 3, attention and expectation effects were measured separately using behavioral and electrophysiological measures. The results showed that category-based expectations had no modulatory effect on indices of attentional capture or attentional engagement, confirming that the observed expectation effects were not mediated by attentional modulations. Expectations did, however, affect processing at later encoding-related stages. Alternative interpretations of the observed expectation effects in terms of repetition priming or response bias were also ruled out. Together, these observations provide new evidence for direct expectation effects on perception. We suggest that even when expectations partially overlap with attentional mechanisms, they also uniquely affect the speed with which expected target objects are encoded in working memory.

Acknowledgements: This work received funding from the European Union's Horizon 2020 research and innovation programme (grant No. 896192) and from the ESRC (grant No. ES/V002708/1)

Talk 2, 5:30 pm, 35.12

Attentional Ungluing: Uncertainty modulates task-irrelevant object representations in human early visual cortex

Xiaoli Zhang1, Andrew J. Collegio1, Dwight J. Kravitz1, Sarah Shomstein1; 1The George Washington University

Recent behavioral evidence suggests that object representations more strongly constrain attention when the spatial location of the target is uncertain, making the direct prediction that spatial uncertainty should modulate the strength of object representations (Shomstein, Zhang, & Dubbelde, 2022). This prediction, however, runs counter to established ‘binding’ theories of attention postulating that binding an object’s features necessitates selecting its location. Here, we test the prediction that the strength of object representations is higher under high-uncertainty conditions. We obtained fMRI data and focused on changes in neural activity patterns in object-selective lateral occipital complex (LOC), spatial-selective intraparietal sulcus (IPS), and early visual cortex (EVC), as a function of spatial uncertainty. On each trial, one vertically-shaped object was presented at the center of the screen. Two target Gabor patches were superimposed, respectively, over the central fixation and one peripheral end of the object. Participants reported whether the patches’ orientations matched. The uncertainty of the peripheral Gabor patch location was manipulated: (1) high uncertainty - the target appeared with 50% probability on either end of the object; (2) low uncertainty - the target appeared with 75% probability on one end. Using multivoxel pattern analysis, we found that, while completely task-irrelevant, object identities could be decoded significantly above chance in EVC. Critically, and as predicted, decoding accuracy was higher in the high spatial uncertainty condition than in the low uncertainty condition. This finding runs counter to ‘binding’ theories positing that spatial attention automatically “glues” and facilitates object representations; rather, explicit spatial attention guidance resulted in weaker object representations. This discrepancy, likely arising from the task-irrelevant nature of the objects in our study, suggests that the automaticity of attentional binding has to be reevaluated. Current models of attention should be revised to incorporate contributions from task-irrelevant aspects of the environment, which may dynamically interact with ongoing cognitive selection processes.
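
As an illustration of the kind of multivoxel pattern analysis described above, the sketch below decodes a binary object identity from simulated trial-by-voxel patterns with a cross-validated linear classifier. All data and parameter values here are invented for demonstration; the abstract does not specify the authors' actual pipeline.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Hypothetical data: trials x voxels response patterns from one ROI (e.g., EVC),
# with a binary object-identity label per trial. A real analysis would use
# per-trial GLM estimates, z-scored within runs.
n_trials, n_voxels = 80, 200
y = np.repeat([0, 1], n_trials // 2)
X = rng.normal(size=(n_trials, n_voxels))
X[y == 1, :20] += 0.8  # weak identity signal carried by a subset of voxels

# Cross-validated decoding accuracy (stratified 8-fold).
clf = LinearSVC(max_iter=10000)
acc = cross_val_score(clf, X, y, cv=8).mean()
print(f"decoding accuracy: {acc:.2f} (chance = 0.50)")
```

In a real analysis, above-chance decoding would be established against a permutation null, and accuracies would then be compared between the high- and low-uncertainty conditions.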

Acknowledgements: This work is supported by National Science Foundation (BCS-1921415 and BCS-2022572 to SS).

Talk 3, 5:45 pm, 35.13

Perceptual awareness occurs along a graded continuum: Evidence from psychophysical scaling

Michael Cohen1,2, Jonathan Keefe3, Timothy Brady3; 1Amherst College, Department of Psychology, 2MIT, Department of Brain and Cognitive Sciences, 3UCSD, Department of Psychology

Does sensory information reach conscious awareness in a discrete, all-or-nothing manner, or in a gradual, continuous manner? To answer this question, researchers have used numerous paradigms that render stimuli invisible (e.g., backwards masking) and modeled the data from these paradigms using probabilistic mixture models (Zhang & Luck, 2008). This approach takes participants’ responses on a continuous reproduction task and models the errors using a combination of a von Mises distribution and a uniform distribution. Modeling responses in this manner allows researchers to quantify the precision of represented items and the rate at which items are not represented at all. Using this approach, numerous studies have claimed that paradigms that render stimuli invisible do so by affecting the guess rate of these mixture models, but not the precision parameter. In other words, these findings suggest that information reaches conscious awareness in a quantal, all-or-nothing manner. Recently, however, work by Schurgin et al. (2020) has undermined foundational assumptions of mixture models and shown that this modeling approach does not reveal two distinct psychological processes. Specifically, Schurgin et al. (2020) showed that precision and guess rate always change together, as though they are different reflections of a single underlying construct that varies along a single continuum. Thus, in the current work, we asked how well this continuous model fits the data from four radically different paradigms that manipulate visual awareness: the attentional blink, backwards masking, the Sperling paradigm, and retro-cueing. In each case, formal model comparisons showed that the continuous model outperforms the models that had been used to support an all-or-nothing view of consciousness. This result held even when re-analyzing data from prior studies that had argued for a discrete view of perceptual awareness. These results suggest that information is accessed by conscious awareness along a graded continuum.
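
The mixture-model approach cited above (Zhang & Luck, 2008) models continuous-report errors as a von Mises component (precision kappa) plus a uniform guess component (guess rate g). A minimal maximum-likelihood sketch, fit to simulated data rather than any study's actual responses:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import vonmises

def neg_log_likelihood(params, errors):
    """Mixture: (1 - g) * vonMises(kappa) + g * uniform over (-pi, pi]."""
    g, kappa = params
    pdf = (1 - g) * vonmises.pdf(errors, kappa) + g / (2 * np.pi)
    return -np.sum(np.log(pdf + 1e-12))

def fit_mixture(errors):
    """Maximum-likelihood estimates of guess rate g and precision kappa."""
    res = minimize(neg_log_likelihood, x0=[0.2, 5.0], args=(errors,),
                   bounds=[(0.0, 1.0), (0.01, 200.0)])
    return res.x  # (g, kappa)

# Simulated observer: 70% precise responses (kappa = 8), 30% random guesses.
rng = np.random.default_rng(1)
n = 2000
precise = vonmises.rvs(8.0, size=n, random_state=rng)
guesses = rng.uniform(-np.pi, np.pi, size=n)
errors = np.where(rng.uniform(size=n) < 0.7, precise, guesses)

g, kappa = fit_mixture(errors)
print(f"guess rate ~ {g:.2f}, kappa ~ {kappa:.1f}")
```

Schurgin et al.'s critique is that g and kappa estimated this way covary systematically, so a single-continuum signal-strength model can fit the same data with fewer degrees of freedom.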

Acknowledgements: This work was supported by NSF-BCS-1829470 and a CIFAR Fellowship to M.A.C. and an NSF-BCS-1829434 to T.F.B. Thanks to Sarah Cormiea for comments on the manuscript.

Talk 4, 6:00 pm, 35.14

Consequences of relaying top-down attentional modulations via neurons with high-dimensional selectivity

Sunyoung Park1, John Serences1; 1University of California San Diego

When we select a specific visual feature as the focus of attention, neural responses in early visual areas to similar stimuli are amplified across the entire visual field. Top-down gain modulations originating in parietal/prefrontal cortex have been suggested as one mechanism underlying this global effect of feature-based attention. However, neurons in parietal/prefrontal cortex typically have mixed selectivity for multiple features, so propagating top-down feedback via these neurons may also cause gain modulations in early sensory neurons that are tuned to behaviorally irrelevant features. To test this, we used a recurrent spiking neural network (e.g., Bouchacourt & Buschman, 2019) consisting of a sensory layer with random projections to a second ‘random’ layer. The sensory layer consisted of eight ring attractor sub-networks, in which neurons were topographically arranged by stimulus selectivity in a circular feature space. The ‘random’ layer consisted of neurons that were randomly and reciprocally connected to multiple sensory neurons. These connections to multiple sets of sensory neurons gave rise to linear mixed selectivity for multiple features in random-layer neurons. As a result, top-down modulations originating in the random layer excite or inhibit sensory neurons across many sub-networks. To simulate feature-based attention, we adopted a classical feature-based attention task (Treue & Martínez Trujillo, 1999), presenting two stimuli in one sub-network and applying attentional gain to one of them. Consistent with previous fMRI findings, we could decode the attended stimulus from firing-rate patterns of the unattended, unstimulated sub-networks (Serences & Boynton, 2007). Importantly, these patterns differed from the stimulus-driven patterns evoked when the same stimulus was presented to the unattended sub-networks. This implies that feedback from neurons with high-dimensional tuning imposes structure on unstimulated neurons that is consistently, but idiosyncratically, related to the attended feature. Our findings highlight previously unrecognized consequences of relaying top-down feedback via neurons with high-dimensional tuning functions.
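
The core architectural point above, that feedback relayed through randomly connected mixed-selectivity neurons lands in unstimulated sub-networks, can be illustrated with a toy rate-based sketch. The actual model is a recurrent spiking network; the connection density and layer sizes below are arbitrary assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sub, n_per = 8, 64                 # 8 ring sub-networks, 64 neurons each
n_sensory, n_random = n_sub * n_per, 256

# Sparse random sensory->random connectivity; feedback uses the transpose,
# so each random-layer neuron reaches sensory neurons in many sub-networks.
W = (rng.uniform(size=(n_random, n_sensory)) < 0.05).astype(float)

# Drive a bump of activity (one feature value) in sub-network 0 only.
sensory = np.zeros(n_sensory)
sensory[:n_per] = np.exp(-0.5 * ((np.arange(n_per) - 32) / 4) ** 2)

# Relay the pattern up and back down: the top-down modulation pattern.
feedback = W.T @ (W @ sensory)

# Total feedback landing in each sub-network, including unstimulated ones.
per_sub = feedback.reshape(n_sub, n_per).sum(axis=1)
print(np.round(per_sub, 1))
```

Because W is unstructured, the feedback pattern in sub-networks 1-7 is nonzero and idiosyncratic to the driven feature, which is the property that supports decoding from unstimulated sub-networks.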

Acknowledgements: This research was supported by National Eye Institute grant R01 EY025872.

Talk 5, 6:15 pm, 35.15

Frontocentral EEG activity phase predicts subsequent visual target detection in healthy participants but not in schizophrenia

Eric Reavis1,2, Jonathan Wynn2,1, Michael Green1,2; 1University of California, Los Angeles, 2VA Greater Los Angeles Healthcare System

Healthy individuals’ ability to detect visual targets is influenced by the phase of ongoing oscillatory brain activity. Specifically, previous work has found that the phase of 6-10 Hz frontocentral electroencephalographic (EEG) activity during the period from 300 ms to 50 ms prior to the onset of a visual target differs significantly between hit trials, in which the participant subsequently detects the target, and miss trials, in which the target goes undetected. This relationship between ongoing brain activity and target detection has not previously been investigated in schizophrenia, although specific deficits in attention and perception are known to occur in the disorder. In the present study, individuals with schizophrenia (n=30) and healthy controls (n=20) performed a visual target detection task while EEG activity was recorded at 1024 Hz. We preprocessed the EEG data using standard methods, then performed a time-frequency analysis of the trial epochs using a continuous wavelet transform-based approach. Phase angles were contrasted between hit and miss trials with a bootstrapping technique. In healthy controls, the data clearly replicated published findings, showing a significant phase difference in ongoing 6-10 Hz frontocentral brain activity between hit and miss trials beginning about 300 ms prior to target onset. However, the schizophrenia group showed no such relationship between the phase of pre-target EEG activity and subsequent target detection. These results demonstrate that the normal relationship between ongoing oscillatory brain activity and visual target detection is disrupted in schizophrenia. This disrupted relationship could contribute to known attentional deficits and perceptual abnormalities in the disorder.
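
A sketch of the generic ingredients of the analysis named above: a complex Morlet wavelet to extract single-trial phase, and a permutation contrast of hit- versus miss-trial mean phases. This is a generic recipe, not the authors' code; the wavelet window, frequencies, and simulated phase distributions are illustrative assumptions.

```python
import numpy as np

def morlet_phase(signal, fs, freq, n_cycles=5):
    """Instantaneous phase at `freq` via convolution with a complex Morlet
    wavelet (assumes the signal is longer than the 1-s wavelet window)."""
    t = np.arange(-0.5, 0.5, 1 / fs)
    sigma = n_cycles / (2 * np.pi * freq)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))
    return np.angle(np.convolve(signal, wavelet, mode="same"))

def phase_difference(hit_phases, miss_phases, n_perm=1000, seed=0):
    """Angular distance between hit- and miss-trial mean phases, with a
    permutation p-value obtained by shuffling trial labels."""
    rng = np.random.default_rng(seed)
    mean_angle = lambda ph: np.angle(np.mean(np.exp(1j * ph)))
    dist = lambda a, b: np.abs(np.angle(np.exp(1j * (mean_angle(a) - mean_angle(b)))))
    observed = dist(hit_phases, miss_phases)
    pooled, n_hit = np.concatenate([hit_phases, miss_phases]), len(hit_phases)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(pooled)
        null[i] = dist(perm[:n_hit], perm[n_hit:])
    return observed, np.mean(null >= observed)

# Hypothetical demo: pre-target phases cluster at 0 rad on hit trials and at
# pi/2 rad on miss trials.
rng = np.random.default_rng(4)
hits = rng.vonmises(0.0, 5.0, size=100)
misses = rng.vonmises(np.pi / 2, 5.0, size=100)
obs, p = phase_difference(hits, misses)
print(f"angular distance = {obs:.2f} rad, p = {p:.3f}")
```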

Acknowledgements: Supported by an internal UCLA grant (the Stephen R. Mallory Award for Schizophrenia Research) to EAR. During the initial phase of the research, EAR was supported by a Ruth L. Kirschstein National Research Service Award (NRSA) fellowship from the National Institute of Mental Health (F32MH108317).

Talk 6, 6:30 pm, 35.16

Top-down effects on cross-modal stimulus processing: A predictive coding framework

Soukhin Das1,2, Sreenivasan Meyyappan1,2, Mingzhou Ding3, George R. Mangun1,2; 1Center for Mind and Brain, University of California Davis, 2Department of Psychology, University of California Davis, 3Pruitt Family Department of Biomedical Engineering, University of Florida

Studies have shown that attention can operate across different sensory modalities, such as vision and audition, and play a crucial role in our ability to integrate and process multisensory information. However, the neural mechanisms underlying cross-modal attention remain largely unknown. We used event-related potentials (ERPs) to investigate the neural basis of cross-modal attention in a 2x2 design where auditory cues (HEAR or SEE) or visual cues (H or S) indicated the modality (visual/auditory) of the to-be-attended target. After a random delay, in 80% of trials, auditory tones or visual gratings were presented as target stimuli in the cued modality. In the remaining 20% of trials, targets were presented in the un-cued modality (invalid trials). Participants (n=30) were instructed to discriminate the spatial frequency (wide versus narrow) of the visual gratings or the pitch (high versus low) of the auditory stimuli on all trials, irrespective of cue validity. The ERPs to targets (validly vs. invalidly cued) showed effects of attention in both modalities. In the auditory modality, significant differences between validly and invalidly cued trials were observed in the N100 and P300 components over central channels (Cz, CPz) and in late positive potentials (LPP) over posterior channels (CPz, Pz). For visual targets, cueing effects were prominent in the N1-P2 and P300 over posterior and occipital channels, along with posterior LPPs. Furthermore, the amplitudes of these ERP components (auditory N100 and P300; visual N1-P2 and P300) were enhanced for invalidly cued targets compared with validly cued targets. Such differences may indicate a re-orientation of top-down cross-modal signals to match the incongruent target and an updating of internal goals and predictions based on prior cues. Our findings of top-down modulations of early sensory processing can be interpreted within a predictive coding framework, in terms of the mismatch between the predicted (cued) and actual (un-cued) stimuli.

Acknowledgements: This work was supported by NIMH grant MH117991.

Talk 7, 6:45 pm, 35.17

A novel eye-tracking paradigm to investigate the focus of object-based attention

Lasyapriya Pidaparthi1, Frank Tong1; 1Vanderbilt University

We use object-based attention in our daily lives to selectively attend to and interact with task-relevant objects in our environment. Previous research has suggested that common neural mechanisms serve to guide eye movements and covert shifts of spatial attention, but might eye movements also be used to determine the focus of object-based attention? We investigated this question by developing a novel eye-tracking paradigm wherein two partially overlapping objects (a face and a flower) followed pseudo-randomly generated trajectories. At random times, either object could be briefly spatially distorted. In separate blocks, participants had to attend to the face only, the flower only, or both objects to perform a change detection task while their free-viewing eye movements were monitored. We hypothesized that the selectivity of object-based attention would be reflected in the strength of the correlation between eye position and stimulus location. Using a sliding-window correlation analysis (N=11), we found that gaze trajectories were highly correlated with the trajectory of the task-relevant stimulus (mean r = 0.7061), weakly correlated with that of the irrelevant stimulus (mean r = 0.1612), and intermediately correlated with both stimuli in the attend-both condition (mean r = 0.4278). Behavioral performance also revealed a two-object cost, with higher accuracy for single-object tracking than for the attend-both condition (mean face accuracy, 79% vs. 57%; mean flower accuracy, 66% vs. 42%). To measure how precisely gaze following indexed the attentional focus, we binned behavioral accuracy by the average correlation of each trial (10 trials/bin). This analysis revealed that trials with higher stimulus-gaze correlations had better behavioral performance for the attended stimulus. In the attend-both condition, the degree of gaze following for each of the two stimuli was also predictive of detection accuracy (p < .05). Overall, we demonstrate a promising novel method for evaluating the trial-by-trial focus of object-based attention using gaze tracking.
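
A sliding-window correlation between gaze position and stimulus trajectory, of the general kind described above, can be sketched as follows. The trajectories, sampling rate, noise level, and window length are invented for demonstration and are not the study's parameters.

```python
import numpy as np

def sliding_corr(gaze, stim, win=60):
    """Pearson correlation between gaze and stimulus position traces,
    computed in a sliding window of `win` samples (1 s at 60 Hz)."""
    out = np.empty(len(gaze) - win + 1)
    for i in range(out.size):
        out[i] = np.corrcoef(gaze[i:i + win], stim[i:i + win])[0, 1]
    return out

# Hypothetical trial: gaze noisily tracks the attended object's trajectory
# and is unrelated to the ignored object's trajectory.
rng = np.random.default_rng(2)
n = 600                                          # 10 s at 60 Hz
attended = np.cumsum(rng.normal(size=n))         # pseudo-random walk
ignored = np.cumsum(rng.normal(size=n))
gaze = attended + rng.normal(scale=2.0, size=n)

r_att = sliding_corr(gaze, attended).mean()
r_ign = sliding_corr(gaze, ignored).mean()
print(f"mean r: attended = {r_att:.2f}, ignored = {r_ign:.2f}")
```

Averaging the windowed correlations per trial gives a single tracking index per stimulus, which is the quantity that can then be related to change-detection accuracy.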

Acknowledgements: NIH R01EY029278 (FT)

Talk 8, 7:00 pm, 35.18

Reconstruction-guided attention improves the object recognition robustness of neural networks

Seoyoung Ahn1, Hossein Adeli1, Gregory Zelinsky1; 1Stony Brook University

Many visual phenomena suggest that humans use top-down generative or reconstructive processes to create visual percepts (e.g., imagery, object completion, pareidolia), but little is known about the role reconstruction plays in robust object recognition. We built an iterative encoder-decoder network that generates an object reconstruction and uses it as top-down attentional feedback to route the most relevant spatial and feature information to feed-forward object recognition processes. We tested this model on the challenging out-of-distribution object recognition datasets MNIST-C (handwritten digits under corruptions) and IMAGENET-C (real-world objects under corruptions). Our model showed strong generalization performance across various image corruptions and significantly outperformed other feedforward convolutional neural network models (e.g., ResNet) on both datasets. The model's robustness was particularly pronounced under high levels of distortion, where it showed a maximum 20% accuracy improvement over the baseline model in the noisiest conditions of IMAGENET-C. Ablation studies further revealed two complementary roles of spatial and feature-based attention in robust object recognition: the former is largely consistent with spatial masking benefits in the attention literature (the reconstruction serves as a mask), while the latter mainly contributes to the model's inference speed (i.e., the number of time steps needed to reach a given confidence threshold) by reducing the space of possible object hypotheses. Finally, the proposed model also yields high behavioral correspondence with humans, as evaluated by the correlation between human and model response times (Spearman's r=0.36, p<.001) and by the types of errors made. By infusing an AI model with a powerful attention mechanism, we show how reconstruction-based feedback can be used to explore the role of generation in human visual perception.
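
The iterative reconstruction-as-attention loop described above can be caricatured as: encode, decode a reconstruction, use the reconstruction as a multiplicative spatial mask on the input, and repeat. A toy numpy sketch under these assumptions; the functions, weights, and sizes are hypothetical stand-ins, not the authors' architecture.

```python
import numpy as np

def encode(x, W_enc):
    """Toy feed-forward encoder: flatten the image and apply one nonlinear layer."""
    return np.tanh(W_enc @ x.ravel())

def decode(z, W_dec, shape):
    """Toy generative decoder: map the latent code back to image space."""
    return (W_dec @ z).reshape(shape)

def recognize_with_reconstruction(x, W_enc, W_dec, n_steps=3):
    """Iteratively reweight the input by its own normalized reconstruction,
    so input regions the generative pathway cannot explain are attenuated."""
    att = np.ones_like(x)
    for _ in range(n_steps):
        z = encode(x * att, W_enc)
        recon = decode(z, W_dec, x.shape)
        att = np.abs(recon)            # reconstruction as a soft spatial mask
        att = att / (att.max() + 1e-8)
    return z, recon

# Toy demo with random weights; a trained model would learn W_enc and W_dec.
rng = np.random.default_rng(5)
W_enc = rng.normal(scale=0.1, size=(16, 64))
W_dec = rng.normal(scale=0.1, size=(64, 16))
x = rng.normal(size=(8, 8))
z, recon = recognize_with_reconstruction(x, W_enc, W_dec)
```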