Early Task-specific Scene Semantics Revealed Through ERP Encoding

Poster Presentation 33.326: Sunday, May 17, 2026, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Scene Perception: Neural mechanisms

Vivian Gao¹, Gillian Rosenberg¹, Bruce Hansen², Michelle Greene¹; ¹Barnard College, ²Colgate University

Visual scene categorization is rapid and automatic, yet it is not the endpoint of understanding. Scene meaning is contextual and depends on an observer’s goals (a realtor and a burglar may “see” the same home differently). Here, we employ EEG encoding with variance partitioning to map the time course of goal-dependent scene semantics. Observers provided verbal descriptions of 40 scenes according to one of four prompts: affordances (what can you do?), affect (what would you feel?), mental simulation (what happens next?), or multisensory experience (what would you hear, smell, or taste?). All descriptions were encoded using MPNet (Song et al., 2020). We computed shared scene semantics as the average embedding of all descriptions of each image across tasks. Task-specific representations were then computed as the residual between each task’s embedding and this shared cross-task mean. In our experiment, 128-channel EEG was recorded while observers (N=30) viewed scenes and performed an orthogonal border-color change detection task. We first predicted brain activity from the shared image features; then, we predicted the residual brain activity from the task-specific features. Thus, unique R² reflects only the additional variance explained by task-specific semantic framing. Overall, the affordance and mental simulation features were reflected earlier in the epoch (111–144 ms) and explained over twice as much unique variance as the affective and multisensory features, consistent with the importance of affordances and mental simulation for scene understanding (Greene et al., 2016; Battaglia et al., 2013). Notably, we observed task-specific semantic activity for all prompts before 200 ms, even though several required inference (emotion, prediction, multisensory experience) rather than purely perceptual processing, implying an earlier emergence of task-specific semantics than suggested by previous literature (e.g., Kutas & Federmeier, 2011).
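
The two-stage analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the regression method, so ridge regression is an assumption here, as are the function names (`split_embeddings`, `ridge_fit_predict`, `unique_r2`), the penalty `alpha`, and the array shapes. The sketch also assumes one mean embedding per task and image, so the shared embedding is the mean over tasks, and it computes in-sample fits only, whereas a real encoding analysis would typically be cross-validated.

```python
import numpy as np


def split_embeddings(task_emb):
    """Split per-task embeddings into shared and task-specific parts.

    task_emb: array of shape (n_tasks, n_images, dim), assumed to hold the
    mean description embedding for each task and image.
    """
    shared = task_emb.mean(axis=0)        # (n_images, dim): mean over tasks
    specific = task_emb - shared[None]    # residual of each task from the mean
    return shared, specific


def ridge_fit_predict(X, y, alpha=1.0):
    """Fit ridge regression (an assumed model choice) and return in-sample predictions."""
    Xc = X - X.mean(axis=0)
    y_mean = y.mean(axis=0)
    yc = y - y_mean
    # Closed-form ridge solution: (X'X + alpha * I) w = X'y
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return Xc @ w + y_mean


def unique_r2(shared, specific_k, eeg, alpha=1.0):
    """Additional variance explained by one task's features after the shared stage.

    eeg: (n_images,) or (n_images, n_outputs), e.g. one value per
    electrode and time point for each image.
    """
    # Stage 1: predict brain activity from shared features.
    resid = eeg - ridge_fit_predict(shared, eeg, alpha)
    # Stage 2: predict the stage-1 residual from task-specific features.
    resid2 = resid - ridge_fit_predict(specific_k, resid, alpha)
    ss_tot = np.sum((eeg - eeg.mean(axis=0)) ** 2, axis=0)
    # Drop in residual sum of squares, as a fraction of total variance.
    return (np.sum(resid ** 2, axis=0) - np.sum(resid2 ** 2, axis=0)) / ss_tot
```

Running `unique_r2` once per task, electrode, and time point would give a time course of task-specific variance of the kind the abstract reports, under the assumptions stated above.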

Acknowledgements: NSF 2522311

Vision Sciences Society
