The time course of visual image generation
Poster Presentation 56.305: Tuesday, May 19, 2026, 2:45 – 6:45 pm, Banyan Breezeway
Session: Visual Memory: Imagery
Yueyang Zhang1, David Melcher1; 1New York University Abu Dhabi
We investigated the time course of visual image generation: does imagery involve instantaneous retrieval of a stored image, or is it a gradual, constructive process? To map the time course of imagery vividness, we moved beyond traditional self-report vividness scales and directly compared imagery with the processing steps of diffusion-based image-generation models. In two behavioral experiments, participants viewed images of common objects (e.g., animals, tools, foods) that were created using Stable Diffusion 1.4 and CLIP ViT-B/32. Prompts followed the structure “a + color + object”. Each image was produced with 10 inference steps, and the outputs from steps 1–10 served as our vividness scale. In Experiment 1 (N = 33), participants heard an auditory prompt, imagined the object for 1 s or 4 s, answered a catch question to ensure comprehension, and then rated the vividness of their mental image by comparing it to the diffusion-step images, focusing solely on vividness. A linear mixed model controlling for familiarity showed that longer imagery time significantly increased vividness, and mean vividness was correlated with VVIQ scores. Experiment 2 (N = 38) extended the design to four imagery durations: 1, 2, 3, and 4 s. Again, reported vividness increased with imagery time, with the largest gain between 1 and 2 s and diminishing gains at longer durations. These findings support a gradual-construction view of imagery, in which representations develop from unclear to detailed images, and suggest that visual imagery is a stepwise, constructive process rather than an instantaneous retrieval.
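As a rough illustration of how ratings on the 10-step diffusion scale could be summarized by imagery duration, the sketch below averages ratings per duration and computes the gain between successive durations. It is not the authors' analysis (the abstract reports a linear mixed model controlling for familiarity); the function names and the example ratings are hypothetical placeholders invented for illustration and are not data from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_vividness_by_duration(trials):
    """Average the 1-10 diffusion-step rating for each imagery duration.

    `trials` is an iterable of (duration_in_seconds, rating) pairs, where the
    rating is the diffusion step (1-10) chosen as matching the mental image.
    """
    by_duration = defaultdict(list)
    for duration_s, rating in trials:
        if not 1 <= rating <= 10:
            raise ValueError(f"rating {rating} is outside the 1-10 step scale")
        by_duration[duration_s].append(rating)
    return {d: mean(r) for d, r in sorted(by_duration.items())}


def successive_gains(means):
    """Difference in mean vividness between each pair of adjacent durations."""
    durations = sorted(means)
    return {(a, b): means[b] - means[a] for a, b in zip(durations, durations[1:])}


# Placeholder ratings, invented for illustration only (NOT data from the study).
example_trials = [(1, 3), (1, 5), (2, 6), (2, 8), (3, 7), (3, 9), (4, 8), (4, 9)]

means = mean_vividness_by_duration(example_trials)
gains = successive_gains(means)
largest_gain_interval = max(gains, key=gains.get)
```

With real data, a per-participant summary like this would only be a first look; a mixed model with participant-level random effects, as described in the abstract, would be needed to test the effect of imagery time.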