Object-Based Information Predicts Delayed Neural Decoding of Scenes
Poster Presentation 33.333: Sunday, May 17, 2026, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Scene Perception: Neural mechanisms
Sundari Ruth¹, Gillian Rosenberg¹, Michelle Greene¹; ¹Barnard College
Although scene understanding is rapid and possibly automatic, there is substantial evidence that some scenes are processed faster than others (Greene et al., 2015; Caddigan et al., 2017). This work tests the hypothesis that scenes with higher information content are processed more slowly. Prior work shows that visually information-rich scenes are detected and categorized more slowly and less accurately than visually simpler scenes. In contrast, semantically rich scenes show an early detection advantage but a later categorization disadvantage (Aronson et al., 2025). Here, we examine the role of object-based information, since objects carry both visual and semantic information. We began by comprehensively labeling all objects in each of 67,000 scenes using multimodal algorithms. To reduce labeling noise, we used two pipelines: one combined the Segment Anything Model (SAM; Kirillov et al., 2023) with GroundingDINO (Liu et al., 2024), and the other used Detectron2 (Kirillov et al., 2020). Both pipelines output the proportion of pixel area covered by each of ~1300 object categories in each scene. We computed the object-based entropy of each scene separately for each pipeline and averaged the two values. We selected the 20 highest-entropy and the 20 lowest-entropy images for our EEG experiment. Each image was shown to participants 30 times, in randomized order, for 500 ms per presentation while we recorded 128-channel EEG and observers performed a border color change detection task. We conducted a whole-brain decoding analysis using a linear support vector machine to assess how well image identity could be read out at each time point and across conditions. Strikingly, scenes with low object information were decoded very early (maximum accuracy: 112 ms post-stimulus onset), whereas scenes with high object information were decoded with a significant delay (maximum accuracy: 284 ms). This suggests that greater object variety is associated with slower scene processing.
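The stimulus-selection step can be illustrated with a minimal sketch. The abstract does not give the exact entropy formula, so this assumes Shannon entropy in bits over each scene's object pixel-area proportions, renormalized to sum to 1. The function names (`object_entropy`, `mean_entropy`, `select_extremes`) and the toy image IDs are hypothetical and are not taken from the authors' code.

```python
import math


def object_entropy(proportions):
    """Shannon entropy (bits) of one scene's object pixel-area proportions.

    Assumption: proportions are renormalized over labeled objects so they
    sum to 1; the abstract does not say how unlabeled pixels are handled.
    """
    total = sum(p for p in proportions if p > 0)
    if total == 0:
        return 0.0  # no labeled objects: treat as zero entropy
    entropy = 0.0
    for p in proportions:
        if p > 0:
            q = p / total
            entropy -= q * math.log2(q)
    return entropy


def mean_entropy(props_pipeline_a, props_pipeline_b):
    """Average the entropy estimates from two labeling pipelines."""
    return 0.5 * (object_entropy(props_pipeline_a) + object_entropy(props_pipeline_b))


def select_extremes(scores, k):
    """Return (k lowest-scoring ids, k highest-scoring ids) from {image_id: score}."""
    ranked = sorted(scores, key=scores.get)
    return ranked[:k], ranked[-k:]


# Toy example with made-up proportions for three hypothetical scenes.
scores = {
    "scene_a": mean_entropy([1.0], [1.0]),                # one object dominates
    "scene_b": mean_entropy([0.5, 0.5], [0.5, 0.5]),      # two equal objects
    "scene_c": mean_entropy([0.25] * 4, [0.25] * 4),      # four equal objects
}
low, high = select_extremes(scores, k=1)
```

Under these assumptions, a scene dominated by a single object scores near zero, while a scene whose area is split evenly across many objects scores highest, which matches the abstract's reading of entropy as object variety.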