An Expanded Triple-N Dataset Reveals Hierarchical Dynamics and Cross-Species Correspondence from V1 to IT

Poster Presentation 23.439: Saturday, May 16, 2026, 8:30 am – 12:30 pm, Pavilion
Session: Functional Organization of Visual Pathways: Cortical visual processing 2

Pinglei Bao¹ (pbao@pku.edu.cn), Yipeng Li¹, Xieyi Liu¹, Jia Yang¹; ¹Peking University

The initial release of the Triple-N (Non-human Primate Neural Responses to Natural Scenes) dataset targeted high-level vision, focusing on inferotemporal (IT) cortex (Li et al., 2025) and leaving the upstream stages that feed IT much less well characterized under matched conditions. Here, we introduce a new component of the Triple-N dataset that centers on early to intermediate visual areas, and we use it to quantitatively link macaque V1, V2, and V4 to human visual cortex and to deep convolutional neural networks. Using Neuropixels recordings in two macaques, we targeted V1, V2, and V4 based on each animal’s anatomy. Across 19 sessions, we obtained 8,794 visually responsive units with split-half reliability greater than 0.4 while the animals passively viewed 1,000 shared images from the Natural Scenes Dataset (NSD). With this dataset, we first confirmed that temporal dynamics and representational structure evolve systematically along the ventral stream. Response onset and peak latencies became progressively earlier from IT to V4, V2, and V1, confirming a robust temporal hierarchy under naturalistic stimulation. Encoding models based on a deep convolutional network revealed a clear layer-by-layer mapping: V1 and V2 were best predicted by early convolutional layers, V4 by intermediate layers, and IT by deep fully connected layers. In addition, receptive fields estimated from spike-triggered averages of local contrast showed orderly, stable spatial tuning within each session. Correlating macaque recordings with human fMRI responses to the same stimuli further revealed an organized cross-species correlation map, with high correspondence in both visual hierarchy and retinotopic organization. Taken together, the expanded Triple-N dataset delivers a continuous, experimentally matched set of neural measurements from V1 through IT, allowing quantitative alignment with human visual cortex and deep networks. Recordings from V1, V2, and V4 thus provide the essential bridge from early feature processing to the high-level representations observed in IT.
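
The unit-selection criterion mentioned above (split-half reliability above 0.4) can be illustrated with a minimal sketch. This is not the authors' pipeline: the array layout (units × images × repeats), the number of random splits, the use of an uncorrected Pearson correlation (no Spearman-Brown step), and the synthetic data are all assumptions made for illustration, and the function name `split_half_reliability` is hypothetical.

```python
import numpy as np

def split_half_reliability(responses, n_splits=100, seed=0):
    """Mean split-half Pearson r per unit.

    responses: array of shape (n_units, n_images, n_repeats) holding one
    response value per trial (assumed layout, not from the source).
    """
    rng = np.random.default_rng(seed)
    n_units, n_images, n_repeats = responses.shape
    half = n_repeats // 2
    r_sum = np.zeros(n_units)
    for _ in range(n_splits):
        # Randomly assign repeats to two disjoint halves.
        perm = rng.permutation(n_repeats)
        a = responses[:, :, perm[:half]].mean(axis=2)
        b = responses[:, :, perm[half:2 * half]].mean(axis=2)
        # Pearson correlation across images, computed per unit.
        a = a - a.mean(axis=1, keepdims=True)
        b = b - b.mean(axis=1, keepdims=True)
        denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
        r = np.divide((a * b).sum(axis=1), denom,
                      out=np.zeros(n_units), where=denom > 0)
        r_sum += r
    return r_sum / n_splits

# Synthetic example: one image-tuned unit and one pure-noise unit.
rng = np.random.default_rng(1)
n_images, n_repeats = 200, 6
tuning = rng.normal(size=(1, n_images, 1))
tuned_unit = tuning + 0.5 * rng.normal(size=(1, n_images, n_repeats))
noise_unit = rng.normal(size=(1, n_images, n_repeats))
data = np.concatenate([tuned_unit, noise_unit], axis=0)

reliability = split_half_reliability(data)
keep = reliability > 0.4  # threshold value taken from the abstract
```

In this toy case the tuned unit passes the threshold and the noise unit does not. Averaging over many random splits reduces the dependence of the estimate on any single partition of the repeats.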