Integrating vision and decision-making models with end-to-end trainable recurrent neural networks

Poster Presentation 53.444: Tuesday, May 21, 2024, 8:30 am – 12:30 pm, Pavilion
Session: Decision Making: Perceptual decision making 3

Yu-Ang Cheng1 (), Ivan Felipe Rodriguez1, Takeo Watanabe1, Thomas Serre1,2; 1Brown University, 2Carney instititue for Brain Science

Historically, research in visual perception and perceptual decision-making has been pursued independently. Models of visual perception have primarily focused on developing neurocomputational mechanisms for visual processing, particularly in object and face recognition. These models, however, largely approximate only human accuracy levels, not fully utilizing reaction time data. In contrast, decision-making models have sought to replicate both accuracy and reaction times in human behavior, but they do not adequately address underlying visual processing mechanisms. Here, we bridge this gap and introduce an integrated end-to-end trainable recurrent neural network model. First, we optimize a vision module, a convolutional neural network, for a well-known perceptual decision-making task, i.e., the random dot motion task (Britten et al., 1992). We show that fitting a straightforward nonlinear reaction time function (Goetschalckx et al., 2023) to the vision module outputs fails to capture the distributions of human reaction times for the same task. However, fitting the drift-diffusion model (Ratcliff & Rouder, 1998), a traditional cognitive model significantly improves the goodness of fit. We further turn to a discrete-time recurrent neural network (RNN) approximation of the Wong-Wang circuit (Wong & Wang, 2006) for decision-making, which we optimize end-to-end together with the vision module using human behavioral data. We show that this combination offers a better fit for experimental data. In addition, analyzing the weights of the resulting model yields novel insights about the underlying integration process’s time course and the image features driving these decisions. Our integrated RNN model of vision and decision-making represents a first step towards a complete computational model of perceptual decision-making.