A nonlinear predictive model of natural scenes and visual saliency and search
53.3019, Tuesday, 19-May, 8:30 am - 12:30 pm, Banyan Breezeway
Jinhua Xu1,2, Zhiyong Yang1; 1Brain and Behavior Discovery Institute, James and Jean Culver Vision Discovery Institute, Department of Ophthalmology, Georgia Regents University, 2Department of Computer Science and Technology, East China Normal University
We now have models of visual neurons that account for responses to simple laboratory stimuli, but we do not know how and why neuronal responses are affected by a range of contexts. At the perceptual level, there is a long list of percepts that are influenced by context. This issue of visual encoding is also tied to visual search and attention. Much recent research on visual search has focused on visual saliency, a signal that is computed from an input stimulus and is proposed to guide the deployment of visual attention. A variety of measures of visual saliency have been proposed, including self-information, Bayesian surprise, and discriminant power. However, it is unclear how these measures are related to the neural encoding of natural scenes. To address these related issues, we propose a predictive model of natural scenes, neuronal responses, and visual saliency. In this model, 1) independent components (ICs) of natural scenes are the basic visual features; 2) the ICs of the center are modeled as predictive, nonlinear functions of the ICs of the surround in a hexagonal center-surround configuration; 3) neurons estimate the probabilities of the prediction errors; and 4) visual saliency, the perceptual quality that makes some items in visual scenes stand out from their immediate contexts, is based on the probabilities of the prediction errors. We performed a statistical analysis of natural scenes and developed the predictive model. We then derived a measure of saliency from the model and used it to predict human behavior during free viewing of static and dynamic natural scenes and during visual search in natural contexts. We found that the proposed predictive model accounts well for human performance on these tasks. This result suggests that, rather than detecting features, visual neurons estimate the probabilities of visual stimuli in terms of prediction errors based on natural scene statistics.
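The four modeling steps listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the nonlinear predictor, the probability estimator, or the exact saliency formula. The sketch assumes that IC coefficients are already available for a center patch and its six hexagonal neighbors, substitutes a quadratic feature expansion with ridge regression for the nonlinear predictor, models the prediction errors as independent zero-mean Gaussians, and takes saliency to be the negative log-probability of the error. All function names, parameter values, and the synthetic data are hypothetical.

```python
import numpy as np

def expand(surround):
    """Hypothetical nonlinearity: features [1, s, s^2] of the surround ICs."""
    ones = np.ones((surround.shape[0], 1))
    return np.hstack([ones, surround, surround ** 2])

def fit_predictor(surround, center, ridge=1e-3):
    """Ridge regression from expanded surround features to center ICs."""
    phi = expand(surround)
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    return np.linalg.solve(gram, phi.T @ center)

def prediction_error(surround, center, weights):
    """Difference between observed and predicted center ICs."""
    return center - expand(surround) @ weights

def saliency(errors, sigma):
    """Negative log-probability of the errors under an assumed Gaussian model."""
    nll = 0.5 * (errors / sigma) ** 2 + np.log(sigma) + 0.5 * np.log(2 * np.pi)
    return nll.sum(axis=1)

# Synthetic stand-in for IC coefficients (assumption: 4 ICs per patch,
# 6 surround patches arranged hexagonally around each center patch).
rng = np.random.default_rng(0)
n_patches, n_ics, n_neighbors = 2000, 4, 6
surround = rng.standard_normal((n_patches, n_ics * n_neighbors))
mixing = rng.standard_normal((n_ics * n_neighbors, n_ics))
mixing /= np.sqrt(n_ics * n_neighbors)
center = np.tanh(surround @ mixing) + 0.1 * rng.standard_normal((n_patches, n_ics))

# Fit the center-from-surround predictor and the error scale per IC.
weights = fit_predictor(surround, center)
train_err = prediction_error(surround, center, weights)
sigma = train_err.std(axis=0)

# Patches that match their surround versus patches whose center is shifted
# far from what the surround predicts.
typical = saliency(train_err, sigma)
odd = saliency(prediction_error(surround, center + 10.0, weights), sigma)
```

In this sketch, a center patch that the surround predicts poorly has an improbable error and therefore a high saliency value, which mirrors the abstract's claim that saliency is based on the probabilities of the prediction errors. A Gaussian error model is only the simplest choice; any density estimate of the errors could be substituted.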