Border ownership signals emerge in an artificial neural network trained to predict future visual input

Poster Presentation 63.302: Wednesday, May 22, 2024, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Perceptual Organization: Segmentation, shapes, objects

Zeyuan Ye1, Ralf Wessel1, Tom P. Franken1; 1Washington University in St. Louis

To identify the objects that surround us, the brain needs to segment visual scenes into organized collections of objects. A dominant segmentation signal in the visual cortex of the primate brain is border ownership: border ownership neurons signal which side of a border is owned by a foreground surface. Neural border ownership is known to display hysteresis with scene changes or object motion, suggesting that these neurons aid in processing dynamic visual scenes. Here we explore whether border ownership signals emerge in a neural network trained to predict future visual input. We evaluated whether units in PredNet, an artificial neural network trained to predict the next frame in natural videos, are selective for border ownership. We measured the responses of PredNet units to static scenes of an isoluminant square on an isoluminant background. We focused our analysis on units whose classical receptive field contained only a luminance contrast border, which could, in different trials, be owned by a square on one side or the other (border ownership). Scenes also varied in border orientation and luminance contrast polarity. We found that R and E units in PredNet are often selective for border ownership, irrespective of luminance contrast polarity. The preferred side of ownership was remarkably tolerant to border orientation, similar to what we observe in our data from the non-human primate brain. The proportion of border ownership units was higher in deeper layers than in the first layer (L1, L2, L3 > L0), and the magnitude of border ownership signals increased with depth (L3 > L2 > L1 > L0). Our data show that units selective for border ownership emerge in PredNet even though this network was not explicitly trained to segment visual scenes, but instead to predict the next frame in natural videos. This suggests that border ownership units in neural networks aid in efficiently predicting future input in natural videos.
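
The abstract does not state how selectivity was quantified. As an illustration only, the following minimal sketch computes a contrast-style border ownership index for a single unit and checks whether its preferred side is the same for both contrast polarities. The index formula, the function names, the side labels "A" and "B", the polarity labels, and the assumption of non-negative responses are all hypothetical choices made for this sketch, not the authors' analysis.

```python
from statistics import mean

# Hypothetical condition labels (assumptions for this sketch).
SIDES = ("A", "B")                          # which side of the border the square is on
POLARITIES = ("light_dark", "dark_light")   # luminance contrast polarity of the border


def side_mean(responses, side):
    """Mean response over all trials and polarities with the square on `side`."""
    values = [r for (s, _pol), trials in responses.items() if s == side for r in trials]
    return mean(values)


def border_ownership_index(responses):
    """Assumed index: (R_A - R_B) / (R_A + R_B), with responses taken as non-negative.

    Positive values mean the unit prefers side A, negative values side B.
    """
    r_a = side_mean(responses, "A")
    r_b = side_mean(responses, "B")
    total = r_a + r_b
    if total == 0:
        return 0.0  # unit is silent in all conditions
    return (r_a - r_b) / total


def polarity_invariant(responses):
    """True if the side preference has the same nonzero sign for both polarities."""
    diffs = [
        mean(responses[("A", pol)]) - mean(responses[("B", pol)])
        for pol in POLARITIES
    ]
    return diffs[0] * diffs[1] > 0


# Made-up responses of one unit: (side, polarity) -> list of trial responses.
example_responses = {
    ("A", "light_dark"): [3.0, 3.0],
    ("A", "dark_light"): [3.0, 3.0],
    ("B", "light_dark"): [1.0, 1.0],
    ("B", "dark_light"): [1.0, 1.0],
}

boi = border_ownership_index(example_responses)
same_side = polarity_invariant(example_responses)
```

In this toy case the unit responds more when the square is on side A for both polarities, so the index is positive and the polarity check passes. Tolerance to border orientation could be probed in the same spirit by repeating the computation per orientation and comparing the sign of the index.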

Acknowledgements: This work was supported by NIH grant R00EY031795 (TPF) and the "Incubator for Transdisciplinary Futures: Toward a Synergy Between Artificial Intelligence and Neuroscience" (RW).