No true representations: Storing objects as either items or holistic representations

Poster Presentation: Tuesday, May 19, 2026, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Visual Working Memory: Models, neural

Ian Deal1, Ziyi Su1, Brad Wyble1; 1The Pennsylvania State University

Visual working memory (VWM) is the process that temporarily stores visual information needed by other cognitive processes. The nature of representations in VWM is debated, but VWM is typically described as maintaining discrete item representations. Drawing on recent advances in deep-learning architectures and training, we model the interaction between the visual system and VWM using a generative deep-learning neural network: Model for Latent Representation (MLR). MLR consists of four main components: a spatial transformer for attending to objects or groups, a modified variational autoencoder for compressing visual representations, a classifier for categorizing colors and objects, and a memory binding pool for storing information. This architecture allows simulations to encode, store, and reconstruct visual scenes with multiple discrete elements. The model offers several ways to encode a single scene: as multiple items or, alternatively, as one holistic representation of the whole scene. Importantly, even when using holistic representations, MLR can simulate effects traditionally viewed as supporting item-based storage. For example, when simulating multi-object change-detection experiments, MLR exhibits a set-size effect even when using a single holistic representation, because details are lost as the complexity of the stored representation increases. Moreover, holistic encoding replicates the item-based conjunction advantage reported by Olson and Jiang (2002). We also account for data from a traditional change-detection task that challenge item-based encoding. When subjects are shown six items simultaneously, accuracy is higher when they are probed on the whole display than when they are probed on half of the items, and this partial-probe impairment disappears when the initial display is shown sequentially. MLR implies that memory representations can be configured for a specific task, suggesting that a single true, underlying representation in VWM may not exist.
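The claim that one holistic representation can produce a set-size effect can be illustrated with a toy simulation. The sketch below is not MLR and is not the authors' code: it assumes a hypothetical fixed-size pool of units that stores an entire scene through random projection weights and reads it back through the same weights. The function names, the pool size, and the number of features per item are all illustrative assumptions. It only demonstrates the general point that a fixed-capacity store loses detail as the scene it holds grows more complex.

```python
import random


def store_and_reconstruct(scene, pool_size, rng):
    """Store a whole scene in one fixed-size pool, then read it back out.

    Hypothetical stand-in for a holistic memory: random projection weights
    are an assumption of this sketch, not the MLR binding pool.
    """
    dim = len(scene)
    weights = [[rng.gauss(0.0, 1.0) / pool_size ** 0.5 for _ in range(dim)]
               for _ in range(pool_size)]
    # Encode: every pool unit holds a weighted sum of all scene features.
    pool = [sum(w * x for w, x in zip(row, scene)) for row in weights]
    # Decode: read each feature back through the same weights.
    return [sum(weights[k][i] * pool[k] for k in range(pool_size))
            for i in range(dim)]


def mean_squared_error(set_size, features_per_item=4, pool_size=64,
                       trials=100, seed=0):
    """Average per-feature reconstruction error for scenes of a given size."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # A scene is the concatenated feature vectors of all its items.
        scene = [rng.uniform(-1.0, 1.0)
                 for _ in range(set_size * features_per_item)]
        recon = store_and_reconstruct(scene, pool_size, rng)
        total += sum((a - b) ** 2 for a, b in zip(scene, recon)) / len(scene)
    return total / trials


if __name__ == "__main__":
    # Pool size stays fixed while the number of items in the scene grows.
    for n in (1, 2, 4, 8):
        print(n, round(mean_squared_error(n), 3))
```

In this sketch the pool never divides the scene into item slots, yet per-feature error rises with the number of items because every added feature interferes with the others in the shared pool. Turning that error into change-detection accuracy would require a further decision rule, which is left out here.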

Acknowledgements: This paper was written with the generous support of the National Science Foundation, grant number BCS-2216127.