VSS 2022, May 13-18

Representation of object motion in the macaque ventral visual stream

Poster Presentation 26.339: Saturday, May 14, 2022, 2:45 – 6:45 pm EDT, Banyan Breezeway Poster 4
Session: Motion: Object motion, biological motion

Kohitij Kar1, Lynn K. A. Sörensen2, James J. DiCarlo1; 1Massachusetts Institute of Technology, 2University of Amsterdam

Primates seamlessly integrate dynamic visual information about moving objects to guide their daily activities. However, we currently lack a mechanistic understanding of how the brain supports the joint representation of object identity and position across time, yielding a unified percept of a moving object. Building on previous reports of behaviorally explicit object identity (Majaj et al., 2015; Kar et al., 2019) and object position information (Hong et al., 2016) in the macaque inferior temporal (IT) cortex, here we explicitly tested whether object velocities can be approximated from distributed IT population activity. We repeatedly showed 600 movies (300 ms long), each containing one of ten objects moving in one of eight directions at varying speeds, to monkeys (n=3) that passively fixated a central dot. We simultaneously measured large-scale neural activity (using chronic multielectrode arrays) from areas V4 (155 sites), IT (212 sites), and ventrolateral PFC (vlPFC; 78 sites) across these monkeys. First, we observed that a nonlinear temporal integration model could dynamically transform V4, IT, and vlPFC population activity into object velocity readouts. Interestingly, however, unlike the V4- and vlPFC-based decodes, object velocity could also be decoded linearly from instantaneous (~10 ms) IT population activity (peaking ~300 ms post-movie onset), indicating the presence of a precomputed velocity signal in the IT population activity pattern. Consistent with previous studies, the corresponding object identity decodes from IT significantly preceded (by ~150 ms) these motion signals. In addition, we observed that IT-like layers of two-stream convolutional neural network models (of action recognition) also support simultaneous readouts of object identity and velocity, establishing these models as good baseline hypotheses for primate object motion processing. These results challenge the common functional segregation of primate visual processing into the ventral (“what”) and dorsal (“where”) pathways and motivate the development of integrated (dorsal+ventral) models to study dynamic scene perception.
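For illustration only, below is a minimal Python sketch of the kind of cross-validated linear velocity readout described above. The neural data are simulated, not the study's recordings; the population size (212 IT sites) and movie count (600) follow the abstract, while the speed range, noise level, decoder choice (ridge regression), and cross-validation scheme are assumptions.

# Minimal sketch (not the authors' code) of a linear velocity decode
# from an instantaneous (~10 ms) population response bin, predicting
# 2D object velocity (vx, vy) per movie. All neural data below are
# simulated stand-ins for recorded IT spike counts.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_movies, n_sites = 600, 212  # counts taken from the abstract

# Ground-truth velocities: one of 8 motion directions at varying
# speeds, expressed as (vx, vy) regression targets.
directions = rng.choice(np.arange(8) * np.pi / 4, size=n_movies)
speeds = rng.uniform(2.0, 10.0, size=n_movies)  # deg/s, assumed range
velocity = np.stack([speeds * np.cos(directions),
                     speeds * np.sin(directions)], axis=1)

# Simulated population activity in one ~10 ms bin (~300 ms
# post-movie-onset): a noisy linear mixture of velocity.
mixing = rng.normal(size=(2, n_sites))
responses = velocity @ mixing + rng.normal(scale=3.0,
                                           size=(n_movies, n_sites))

# Cross-validated linear readout: decodability of held-out movies is
# the operational evidence for an explicit velocity signal.
decoder = RidgeCV(alphas=np.logspace(-2, 3, 12))
pred = cross_val_predict(decoder, responses, velocity, cv=10)
r = [np.corrcoef(velocity[:, k], pred[:, k])[0, 1] for k in range(2)]
print(f"held-out r (vx, vy): {r[0]:.2f}, {r[1]:.2f}")

Above-chance correlation between predicted and true (vx, vy) on held-out movies is the criterion for calling the velocity signal "explicit" (i.e., linearly decodable), mirroring how explicit identity information was assessed in the cited IT decoding work.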

Acknowledgements: This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216, and by Simons Foundation grant SCGB-542965 (J.J.D.).