Recurrent processing reshapes tuning over time in deep neural networks
Poster Presentation 23.443: Saturday, May 16, 2026, 8:30 am – 12:30 pm, Pavilion
Session: Face and Body Perception: Models
Amirhossein Farzmahdi¹, Hossein Adeli¹, Wang Boran¹, Chase King¹, Nikolaus Kriegeskorte¹; ¹Zuckerman Mind Brain Behavior Institute, Columbia University
The primate ventral visual stream is rich in recurrent connections. Unlike feedforward models, recurrent networks change their internal representations over time, supporting dynamic processing of visual information. Here we ask: how does recurrence transform neural network representations across the hierarchy and over time during naturalistic object recognition? We trained a 6-layer recurrent convolutional neural network with bottom-up and lateral connections to recognize different face identities. Using a controlled stimulus set of faces and objects with systematically varying pose (yaw, pitch, roll) and identity, we probed the model’s internal representations across layers and recurrent steps at both the population and single-unit levels. Early convolutional layers showed largely stable representations over time, indicating that recurrence did little to change low-level feature tuning. In contrast, deeper layers exhibited pronounced tuning shifts: as recurrence unfolded, representations transitioned from supporting coarse face detection to supporting fine identity discrimination, accompanied by increased invariance of face identity to in-depth and in-plane pose changes. At the same time, object representations became less distinctive and less invariant under the same transformations. Importantly, recurrent processing in deeper layers also produced tuning shifts at the single-unit level: for example, a unit's selectivity narrowed from many faces to a smaller set of face identities. Importance maps computed with RISE (Randomized Input Sampling for Explanation) suggested a mechanism for these shifts: the diagnostic image features and face parts driving each unit’s response also changed over time (e.g., moving from eyes to mouth). Together, these unit-level effects reveal genuine tuning shifts, rather than simple sharpening of a fixed preference, underlying the coarse-to-fine transition in the population geometry. These results show that recurrent processing can flexibly reshape representational tuning over time, especially in deeper layers of a network.
The global-to-specific transition and increasing face invariance replicate neurophysiological findings in primate IT cortex and suggest that recurrent dynamics are a key mechanism for achieving brain-like visual recognition.
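For readers unfamiliar with RISE-style importance maps, the following is a minimal sketch of the general idea, not the authors' implementation. It masks an image with random coarse binary masks and weights each mask by the response it leaves intact. The names `rise_importance` and `toy_unit`, the parameter values, and the toy image are all hypothetical, and the masks are upsampled by simple block repetition rather than the shifted bilinear upsampling used in the original RISE method.

```python
import numpy as np

def rise_importance(image, score_fn, n_masks=2000, grid=4, p_keep=0.5, seed=0):
    """RISE-style importance map for a scalar response score_fn(image).

    Assumes a 2-D image whose height and width are divisible by `grid`.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    cell_h, cell_w = h // grid, w // grid
    saliency = np.zeros((h, w))
    for _ in range(n_masks):
        # Coarse binary mask, upsampled by block repetition (a simplification).
        coarse = (rng.random((grid, grid)) < p_keep).astype(float)
        mask = np.kron(coarse, np.ones((cell_h, cell_w)))
        # Pixels that are visible when the response is high accumulate weight.
        saliency += score_fn(image * mask) * mask
    # Normalize by the expected number of times each pixel was visible.
    return saliency / (n_masks * p_keep)

def toy_unit(img):
    # Hypothetical stand-in for a model unit: driven only by the top-left 8x8 patch.
    return img[:8, :8].mean()

image = np.ones((32, 32))
importance = rise_importance(image, toy_unit)
```

To examine how diagnostic features change over time in a recurrent model, one could, under these assumptions, call the function once per recurrent step with `score_fn` returning the unit's response at that step, and then compare the resulting maps.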
Acknowledgements: This work was supported in part by the National Institute of Neurological Disorders and Stroke, National Institutes of Health, under award RF1NS128897.