Comparing Convolutional Neural Networks to Traditional Models of Covert Attention During Visual Search

Poster Presentation 56.330: Tuesday, May 21, 2024, 2:45 – 6:45 pm, Banyan Breezeway
Session: Visual Search: Mechanisms, models

Ansh K. Soni1, Sudhanshu Srivastava1, Miguel P. Eckstein1; 1University of California - Santa Barbara

Introduction: Performance degradation in visual search with an increasing number of distractors (the set-size effect) is often used to make inferences about the properties of covert attention. Recent studies have used Convolutional Neural Network (CNN) models (Srivastava, 2023; Nicholson, 2022; Poder, 2022), although their relationship to traditional models is not well understood. Our goal is to compare CNN models to traditional models of visual search.
Methods: We implemented six models of covert attention during visual search to predict target detection accuracy (yes/no) in feature and conjunction tasks. For each model, we adjusted the feature values in the two feature searches (line angle or luminance) to match model performance. These feature values were then used to build the conjunction task (angle and luminance). Three of the models are image-computable, acting directly on the pixels of the images: an image-computable Bayesian ideal observer (IC-BIO), a fully trained 5-layer CNN, and a large network trained on image classification (VGG-16) with transfer learning. The remaining three models operate on assumed extracted feature values (normally distributed activations from the target and each distractor): a Signal Detection Theory model without capacity limits (SDT; Green and Swets, 1966), the same model with capacity limits (SDTc; Poder, 2019), and a Guided Search Accuracy model (GSA; Wolfe, 1989; Eckstein, 2000).
Results: For feature and conjunction search, the CNN and VGG-16 models showed set-size effects similar to those of the unlimited-capacity models (IC-BIO/SBIO and SDT) and smaller than those of the models with capacity limitations (SDTc and GSA with serial attention). Ordered from lowest to highest, the accuracy degradation from feature to conjunction search was: IC-BIO, CNN, VGG-16, SDT (assuming independent processing of features; Eckstein, 1998), and SDTc.
Conclusion: Our findings benchmark newer CNN models against traditional search models, showing a correspondence between CNN set-size effects and Signal Detection/Ideal Observer models but distinct feature/conjunction search accuracy relationships.
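Illustration: The set-size effect of an unlimited-capacity SDT observer, as described in the Methods, can be sketched with a short Monte Carlo simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes unit-variance Gaussian activations, a maximum-of-outputs decision rule (a common SDT simplification rather than the ideal observer), equal target priors, and an accuracy-maximizing criterion found by grid search. The function name, the default d' of 2.0, and the criterion grid are hypothetical choices made for illustration.

```python
import numpy as np


def sdt_yes_no_accuracy(set_size, d_prime=2.0, n_trials=20000, seed=0):
    """Estimate yes/no accuracy of an unlimited-capacity SDT max-rule observer.

    Assumptions (illustrative, not taken from the abstract): each item yields
    one activation ~ N(0, 1); the target adds d_prime to its mean; the observer
    says "yes" when the maximum activation exceeds a criterion.
    """
    rng = np.random.default_rng(seed)

    # Target-absent trials: every item is a distractor.
    absent = rng.standard_normal((n_trials, set_size))

    # Target-present trials: item 0 is the target, the rest are distractors.
    present = rng.standard_normal((n_trials, set_size))
    present[:, 0] += d_prime

    max_absent = absent.max(axis=1)
    max_present = present.max(axis=1)

    # Grid-search the criterion that maximizes accuracy with equal priors.
    criteria = np.linspace(-1.0, d_prime + 4.0, 201)
    hit_rate = (max_present[:, None] > criteria).mean(axis=0)
    correct_rejection_rate = (max_absent[:, None] <= criteria).mean(axis=0)
    accuracy = 0.5 * (hit_rate + correct_rejection_rate)
    return float(accuracy.max())


if __name__ == "__main__":
    for n_items in (1, 2, 4, 8, 16, 32):
        print(n_items, round(sdt_yes_no_accuracy(n_items), 3))
```

In this sketch accuracy declines as the set size grows even though per-item sensitivity is held fixed, because each added distractor is another chance for noise to exceed the criterion. A capacity-limited variant would additionally reduce d_prime as set size increases, which is one way to obtain the larger set-size effects attributed to SDTc above.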