Perceptual Organization: Bistability, representation

Talk Session: Saturday, May 20, 2023, 8:15 – 9:45 am, Talk Room 2
Moderator: Cathleen Moore, University of Iowa

Talk 1, 8:15 am, 21.21

Perceptual Organization is Limited in Peripheral Vision

Cathleen Moore1, Qingzi Zheng1, Nicole Jardine; 1University of Iowa

The abstraction of 3D scene structure from 2D image information (e.g., representing the relative depths of surfaces, the continuation of occluded surfaces behind other surfaces, and the grouping of discontinuous contrast regions as reflecting a single object in the world) is a critical pre-categorical component of establishing internal representations of the external world that guide successful action, i.e., the general function of vision. Much of what we know about these perceptual organization processes, however, has been learned from experiments that present stimuli at or near fixation. The quality of image-level representations, on which mid-level processes depend, declines sharply from central to peripheral regions of the visual field due to decreased acuity and, more critically, to the spatial uncertainty that leads to visual crowding. This raises the possibility that perceptual organization processes do not function in peripheral vision in the same way that they function in central vision. Using the configural-superiority effect as a metric, we measured multiple perceptual organization processes, including closure, surface completion, 3D structure from 2D geometry, and surface scission from transparency, for stimuli presented at fixation versus 15° to 24° in the periphery, controlling for cortical magnification. We found substantial differences for most of these organization processes, consistent with the possibility that the periphery is perceptually unorganized; it may be represented in terms of non-unitized segmented textures rather than perceptual units (a.k.a. objects). Downstream processing consequences of this possibility, including failures of object correspondence and object-mediated representational updating, are discussed.

Acknowledgements: NIH R21 EY029432

Talk 2, 8:30 am, 21.22

Perceptual popout may be linked to de-suppression of orientation-untuned surround suppression in macaque V1

Xingnan ZHAO1, Shenghui ZHANG1, Shiming Tang1,2,4, Cong Yu1,3,4; 1PKU-Tsinghua Center for Life Sciences, Peking University, 2School of Life Sciences, 3School of Psychological and Cognitive Sciences, 4IDG-McGovern Institute for Brain Research, Peking University

A line target surrounded by orthogonal lines can be detected effortlessly. This perceptual pop-out phenomenon has been linked to weaker suppression of V1 neuronal responses by cross-orientation surrounds than by iso-orientation surrounds, each compared to the target-only condition. Here we studied the “neuronal” pop-out effect with two-photon calcium imaging in V1 of awake fixating macaques, investigating its response properties and mechanisms from a population perspective. The stimulus was a Gabor target centered on the classical RFs, with iso- or cross-orientation surrounds of various sizes placed outside the classical RFs. Data collected from four macaques show that besides weakening iso-surround suppression, cross surrounds also reduce trial-by-trial variability (Fano factor). Further, response enhancement by cross surrounds is accompanied by suppression of responses to the cross surrounds per se, suggesting a push-pull effect previously reported in contour integration. Comparing population orientation tuning functions indicates that iso surrounds mostly produce orientation-untuned suppression, reducing responses of neurons tuned to all orientations, plus some target orientation-tuned suppression. Meanwhile, cross surrounds mainly reduce orientation-untuned suppression by scaling up all neurons’ responses, with little impact on orientation-tuned suppression. These changes can be nicely described by a simplified divisive gain control model R_s(θ) = R_t(θ)^n / k, in which R_t(θ) is the measured target-only population orientation tuning function, and changes of n and k represent orientation-tuned and untuned surround modulation, respectively. While iso surrounds induce changes in n and more substantial changes in k, cross surrounds mainly reduce k.
These population results suggest mostly orientation-untuned iso surround suppression, inconsistent with extant models of surround modulation, but in line with some recent evidence for V1 horizontal connections targeting heterogeneous orientation domains, especially at longer distances (Chavane et al., 2022). Meanwhile, cross surrounds create orientation discontinuity, which may boost all neurons’ responses to de-suppress the orientation-untuned suppression and produce neuronal pop-out, likely through separate feedback modulation.
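
The roles of n and k in the model R_s(θ) = R_t(θ)^n / k can be illustrated with a small numerical sketch. Everything below other than the model equation is an assumption made for illustration: the von Mises-shaped tuning curve, the helper names (`target_tuning`, `surround_response`, `modulation_ratio`), and the parameter values, none of which are fitted to the reported data.

```python
import math

def target_tuning(theta_deg, pref_deg=0.0, kappa=2.0):
    """Hypothetical target-only population tuning R_t(theta), with peak value 1.

    Orientation is 180-degree periodic, hence the doubled angle."""
    delta = math.radians(2.0 * (theta_deg - pref_deg))
    return math.exp(kappa * (math.cos(delta) - 1.0))

def surround_response(r_t, n, k):
    """Simplified divisive gain control from the abstract: R_s = R_t ** n / k."""
    return r_t ** n / k

def modulation_ratio(r_t, n, k):
    """Response with surround divided by target-only response."""
    return surround_response(r_t, n, k) / r_t

thetas = [0, 30, 60, 90]  # preferred-orientation offsets in degrees
r_t = [target_tuning(t) for t in thetas]

# Illustrative values only. With n = 1, changing k rescales every
# orientation channel by the same factor (orientation-untuned modulation).
untuned = [modulation_ratio(r, n=1.0, k=2.0) for r in r_t]

# With n != 1 the ratio depends on R_t, and therefore on orientation
# (orientation-tuned modulation). The direction of the effect depends on
# how R_t is scaled; here R_t <= 1, so n < 1 suppresses the peak the most.
tuned = [modulation_ratio(r, n=0.9, k=2.0) for r in r_t]

for t, u, v in zip(thetas, untuned, tuned):
    print(f"theta={t:>2} deg  n=1.0: {u:.3f}  n=0.9: {v:.3f}")
```

In this toy setting the n = 1 ratios are identical across orientations, while the n = 0.9 ratios grow with distance from the target orientation, which is one way to read the separation of tuned and untuned components.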

Talk 3, 8:45 am, 21.23

Neural representation of occluded objects in visual cortex

Courtney Mansfield1, Tim Kietzmann2, Jasper van den Bosch3, Ian Charest4, Marieke Mur5, Nikolaus Kriegeskorte6, Fraser Smith1; 1University of East Anglia, 2University of Osnabruck, 3University of Birmingham, 4Universite de Montreal, 5Brain and Mind Institute, Western University, 6Zuckerman Institute, Columbia University

The ability of the human visual system to recognize occluded objects is striking, yet current models of vision struggle to account for it. Previous studies investigating occlusion at both the behavioural and neural levels typically used simple shapes or cutouts as occluders, rather than other objects. The goal of the present study was to understand what best explains neural representations of occluded objects under more realistic occlusion, i.e., when objects occlude other objects. We approached this by explicitly relating activity patterns evoked by occluded objects (e.g., a cup occluding a face) to those generated when viewing the same objects in isolation (the cup or the face). In an event-related fMRI design, participants (N=12) performed a one-back task while viewing objects presented in isolation (un-occluded), occluded by another object, or cut out by a corresponding object silhouette. We defined anatomical regions of interest in EVC (V1-V3), mid-visual regions (V4/LO1-3) and IT. Decoding analyses showed that EVC responses to occluded objects were better determined by the visible features, whereas in IT inferred features also explained the responses well. Our data also showed strong effects of competition across multiple object representations in EVC, although these were significantly weaker in IT. Separate linear regression analyses further showed that the weights assigned to occluded objects in IT were well predicted by independent categorization judgements (higher weights corresponded to lower accuracy and slower RT). In EVC, by contrast, weights were predicted by the magnitude of occlusion, with smaller weights assigned as the percentage of the object occluded increased. In sum, our results demonstrate that IT better decouples responses to real-world occluded objects, with robust representations evident across multiple competing objects.
Thus, our data support the importance of investigating neural mechanisms underlying object recognition under more naturalistic occlusion scenarios.
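
As a rough illustration of what per-object regression weights can mean, the sketch below models the response pattern to an occluding/occluded pair as a weighted sum of the two isolated-object patterns and solves for the weights by least squares. This is an assumed formulation and not necessarily the authors' pipeline; the function name `fit_two_weights`, the variables `front` and `back`, and the toy patterns are all hypothetical.

```python
def fit_two_weights(pair, front, back):
    """Least-squares weights (w_front, w_back) minimising
    ||pair - w_front * front - w_back * back||^2, solved with the
    2x2 normal equations. Inputs are equal-length response patterns."""
    ff = sum(a * a for a in front)
    bb = sum(b * b for b in back)
    fb = sum(a * b for a, b in zip(front, back))
    fp = sum(a * p for a, p in zip(front, pair))
    bp = sum(b * p for b, p in zip(back, pair))
    det = ff * bb - fb * fb
    if abs(det) < 1e-12:
        raise ValueError("isolated-object patterns are collinear")
    w_front = (fp * bb - bp * fb) / det
    w_back = (bp * ff - fp * fb) / det
    return w_front, w_back

# Toy response patterns across four measurement channels (made-up numbers).
front = [1.0, 0.2, -0.5, 0.8]   # occluding object shown in isolation
back = [-0.3, 0.9, 0.4, 0.1]    # occluded object shown in isolation

# Synthetic pair pattern built with known weights so the fit can be checked.
pair = [0.7 * f + 0.3 * b for f, b in zip(front, back)]

w_front, w_back = fit_two_weights(pair, front, back)
print(f"w_front={w_front:.3f}, w_back={w_back:.3f}")
```

Under this reading, the weight on the occluded object is the quantity that would then be related to behaviour or to the percentage of the object that is hidden.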

Talk 4, 9:00 am, 21.24

Gestalt formation promotes awareness of suppressed visual stimuli during binocular rivalry

Mar Nikiforova1, Rosemary Cowell, David Huber; 1University of Massachusetts, Amherst

Continuous flash suppression leverages binocular rivalry to render observers unaware of a static image for several seconds. To achieve this effect, rapidly flashing noise masks are presented to the dominant eye while a static stimulus is presented to the non-dominant eye. Eventually "breakthrough" occurs, wherein awareness shifts to the static image shown to the non-dominant eye. We tested the hypothesis that Gestalt formation can promote breakthrough. In two experiments, we presented “pacman”-shaped objects that might or might not align to form illusory Kanizsa bars, finding that breakthrough was faster when the pacmen were aligned. To measure the inception of breakthrough, observers were instructed to press a key at the moment of partial breakthrough. After pressing the key, which stopped the trial, observers reported how many pacmen were seen and where they were located. Supporting the Gestalt hypothesis, observers more often reported pairs of pacmen if they were aligned. To address whether these effects reflected illusory shape perception, a computational model was applied to the pacman report distributions and breakthrough times for an experiment with four pacmen. A full account of the data required an increased joint probability of reporting all four pacmen, suggesting an influence of a perceived illusory cross induced by the figures.

Talk 5, 9:15 am, 21.25

How many perceptual categories do observers experience during visual multistability?

Jan Skerswetat1, Peter J. Bex1; 1Northeastern University, USA

Multistable perception, e.g., binocular rivalry, is a phenomenon widely used in visual neuroscience. Classic methods use experimenter-determined perceptual categories and track when and for how long principal categories (exclusive and mixed percepts) were seen. Unfortunately, these methods bias observers toward experimenter-defined categories and thus may not correspond with an observer’s experience; they neither generate continuous data nor record gradual changes within mixed percepts. We recently published the InFoRM (Indicate-Follow-Replay Me) method, which measures multistability near-continuously, captures gradual between- and within-percept changes, and generates introspection maps. Here, we introduce an a priori method to estimate how many perceptual states people perceive during physical and perceptual rivalry. Twenty-eight participants performed eight 1-min trials for three different contrast conditions while viewing obliquely oriented sinusoidal gratings whose spatial composition changed over time either physically or perceptually. Participants were trained to continuously indicate six perceptual categories (chosen based on reports in the literature) by tilting a joystick. We applied k-means clustering, an unsupervised machine-learning method, to partition the 2D joystick data into clusters for each trial (3600 samples/trial), participant, and contrast condition, and compared the results between perceptual-rivalry and physical-replay data. Averaged across trials, participants, and conditions, six clusters explained 96.41% ± 0.29σ and 96.42% ± 0.34σ of the data for physical and perceptual rivalry, respectively. We then applied silhouette value analysis and, for each observer, fitted the silhouette values as a polynomial function of the number of clusters to determine the minimum cluster separation. Averaged across trials, observers, and conditions, 9 clusters (range across observers: 2-10) had minimum silhouette values for both perceptual rivalry and physical replay, with an intra-observer agreement of 12/28.
InFoRM’s novel approach allows autonomous inter- and intra-individual classification and counting of perceptual clusters during multistable perception.
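
The clustering steps described above (k-means on 2D joystick samples, variance explained, and silhouette values as a function of the number of clusters) can be sketched as follows. This is a minimal standard-library illustration on synthetic data, not the InFoRM implementation: the three-blob data set, the farthest-point seeding, the reading of "explained" as one minus within-cluster over total sum of squares, and all function names are assumptions. The polynomial-fitting step is omitted.

```python
import math
import random

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first sample, then repeatedly
    add the sample farthest from all centres chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers

def explained_variance(points, labels, centers):
    """Assumed definition: 1 - within-cluster SS / total SS."""
    grand = tuple(sum(axis) / len(points) for axis in zip(*points))
    total = sum(math.dist(p, grand) ** 2 for p in points)
    within = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return 1.0 - within / total

def mean_silhouette(points, labels):
    """Mean silhouette value; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Synthetic stand-in for joystick samples: three tight blobs in the 2D plane.
rng = random.Random(1)
blob_centres = [(-0.7, 0.0), (0.7, 0.0), (0.0, 0.9)]
samples = [(cx + rng.gauss(0, 0.05), cy + rng.gauss(0, 0.05))
           for cx, cy in blob_centres for _ in range(30)]

scores = {}
for k in range(2, 7):
    labels, centers = kmeans(samples, k)
    scores[k] = (explained_variance(samples, labels, centers),
                 mean_silhouette(samples, labels))

# Conventional reading: a higher mean silhouette means better-separated clusters.
best_k = max(scores, key=lambda k: scores[k][1])
print(best_k, scores[best_k])
```

Note that variance explained can only stay the same or rise as clusters are added, which is why a separation measure such as the silhouette value is needed to compare different cluster counts.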

Acknowledgements: Supported by NIH grant R01EY029713. InFoRM (Indicate-Follow-Replay Me) is disclosed as a patent held by Northeastern University, Boston, USA. Both authors are founders and shareholders of the company PerZeption Inc. (USA).

Talk 6, 9:30 am, 21.26

Increasing Interocular Grouping Demands during Binocular Rivalry with MEG

Eric Mokri1, Jason da Silva Castanheria1, Janine D. Mendola1; 1McGill University

Binocular rivalry is a phenomenon in which two incompatible images are presented simultaneously, one to each eye, and elicit perceptual alternations between image dominance and suppression. Remarkably, binocular rivalry accommodates interocular grouping, so that if portions of two globally coherent images are shown to each eye, subjects still perceive the global pattern far more often than would be expected by chance. In this study, we recorded subjects' perceptual reports (N=48) and MEG brain activity (N=30) while they viewed classic rivalry (BR) or rivalry with interocular grouping (IOG). These were compared to conditions with increasing grouping demands, in which stimuli were divided into 2, 4, or 6 complementary patches shown to each eye. In all cases, participants reported the frequency and durations of dominant or mixed percepts, and stimuli consisted of flickering red or green orthogonal gratings at 5 or 6.7 Hz, respectively. This allowed for analysis of tagged fundamental and intermodulation frequencies in the SSVEP with MEG. During BR, subjects perceived one of the two coherent images ~75% of the time, while for IOG, the coherent global percepts were seen at least 40% of the time. Compared to BR, IOG produced weaker MEG power at the fundamental frequencies in early visual cortex. However, IOG produced greater MEG signal beyond early visual cortex. Behavioral data demonstrate that grouping across the vertical image meridian is slightly more robust than across the horizontal meridian, and this is supported by differences in MEG topography. Specifically, topography for vertical meridian IOG more closely resembles that obtained for BR. The reduced power for IOG in V1/V2 might be expected from less time spent perceiving/attending a coherent image, but patterns of intermodulation power suggest enhanced binocular integration, consistent with prior evidence from fMRI.
We interpret these data with regard to collinearity and the balance of excitation/inhibition between early and later visual areas.
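
For readers unfamiliar with frequency tagging, intermodulation terms are the combination frequencies n·f1 + m·f2 (n and m non-zero integers) that can appear only if signals driven by the two tagged inputs interact nonlinearly. The sketch below enumerates the low-order terms for the 5 Hz and 6.7 Hz tags mentioned in the abstract. The function name `intermodulation_terms` and the choice of maximum order are assumptions for illustration; the abstract does not state which terms were analysed.

```python
from itertools import product

def intermodulation_terms(f1, f2, max_order=2):
    """Return {(n, m): frequency_hz} for n*f1 + m*f2 with n and m both
    non-zero and |n| + |m| <= max_order, keeping positive frequencies only."""
    terms = {}
    for n, m in product(range(-max_order, max_order + 1), repeat=2):
        if n == 0 or m == 0 or abs(n) + abs(m) > max_order:
            continue
        freq = n * f1 + m * f2
        if freq > 0:
            terms[(n, m)] = round(freq, 6)  # tidy floating-point residue
    return terms

F_RED, F_GREEN = 5.0, 6.7  # tagging frequencies in Hz, taken from the abstract

second_order = intermodulation_terms(F_RED, F_GREEN, max_order=2)
third_order = intermodulation_terms(F_RED, F_GREEN, max_order=3)

for (n, m), freq in sorted(third_order.items(), key=lambda item: item[1]):
    print(f"{n:+d}*f1 {m:+d}*f2 = {freq} Hz")
```

The second-order terms are the difference and sum frequencies, 1.7 Hz and 11.7 Hz. Neither is a harmonic of either tag alone, which is why power at such frequencies is commonly taken as a marker of integration across the two eyes' inputs.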

Acknowledgements: NSERC Discovery Grant & NSERC CGSM