No true representations: Storing objects as either items or holistic representations

Poster Presentation 53.353: Tuesday, May 19, 2026, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Visual Working Memory: Models, neural



Ian Deal¹ (imd5205@psu.edu), Ziyi Su¹, Brad Wyble¹; ¹The Pennsylvania State University

Visual working memory (VWM) is the process that temporarily stores visual information needed by other cognitive processes. The nature of representations in VWM is debated, but VWM is typically described as maintaining discrete item representations. Drawing on recent advances in deep-learning architectures and training, we model the interaction between the visual system and VWM using a generative deep-learning neural network: Model for Latent Representation (MLR). MLR consists of four main components: a spatial transformer for attending to objects or groups, a modified variational autoencoder for compressing visual representations, a classifier for categorizing colors and objects, and a memory binding pool for storing information. This architecture allows simulations to encode, store, and reconstruct visual scenes with multiple discrete elements. The model offers several ways to encode a single scene: as multiple items or, alternatively, as one holistic representation of the whole scene. Importantly, even when using holistic representations, MLR can simulate effects traditionally viewed as supporting item-based storage. For example, in simulated multi-object change detection experiments, MLR exhibits a set-size effect even when using a single holistic representation, because details are lost as the complexity of the stored representation increases. Moreover, holistic encoding replicates the item-based conjunction advantage reported by Olson and Jiang (2002). The model also accounts for data from a traditional change detection task that challenge item-based encoding: when subjects are shown six items simultaneously, accuracy is higher when they are probed on the whole display than when they are probed on half of the items, and this partial-probe impairment disappears when the initial display is presented sequentially. These simulations imply that memory representations can be configured for a specific task, suggesting that a true, underlying representation of VWM may not exist.
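
The set-size claim can be illustrated with a toy sketch. This is not the MLR implementation: it assumes that a holistic scene representation is a single vector whose length grows with the number of objects, and that it is stored in a fixed-size pool through a random projection and read back with the transpose of that projection. All names and parameters here (store_and_retrieve, reconstruction_error, pool_size, features_per_item) are hypothetical and chosen only for illustration.

```python
import random


def store_and_retrieve(latent, pool_size, rng):
    """Store one vector in a fixed-size pool and read it back.

    Assumption: binding is a random Gaussian projection and retrieval is
    its transpose. This is a stand-in, not MLR's actual binding pool.
    """
    dim = len(latent)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
               for _ in range(pool_size)]
    # Encode: each pool unit holds a weighted sum of the latent features.
    pool = [sum(w * z for w, z in zip(row, latent)) for row in weights]
    # Decode: transpose readout, scaled so the expected output equals the input.
    return [sum(weights[p][d] * pool[p] for p in range(pool_size)) / pool_size
            for d in range(dim)]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def reconstruction_error(set_size, features_per_item=4, pool_size=200,
                         trials=50, seed=0):
    """Average reconstruction error for one holistic vector of a whole scene."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # One holistic vector: its length grows with the number of objects.
        scene = [rng.uniform(-1.0, 1.0)
                 for _ in range(set_size * features_per_item)]
        recalled = store_and_retrieve(scene, pool_size, rng)
        total += mean_squared_error(scene, recalled)
    return total / trials


errors = {n: reconstruction_error(n) for n in (1, 2, 4, 6)}
for n, err in errors.items():
    print(f"set size {n}: mean squared error {err:.4f}")
```

In this sketch the pool size is fixed, so a longer scene vector produces more cross-talk between features at readout, and the reconstruction error tends to rise with set size even though only one representation is ever stored. It shows the general mechanism only and makes no claim about the magnitude of the effect in MLR or in human data.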

Acknowledgements: This paper was written with the generous support of the National Science Foundation grant number BCS-2216127
