3D Perception

Talk Session: Monday, May 20, 2024, 10:45 am – 12:15 pm, Talk Room 1

Talk 1, 10:45 am

Competition between priors for convexity and rigidity in Structure-From-Motion

Ryne Choi1,2, Jacob Feldman1,2, Manish Singh1,2; 1Rutgers University - New Brunswick, 2Rutgers University, Center for Cognitive Science

We investigated the competition between priors for convexity and rigidity in Structure-From-Motion (SFM). We found that a preference for convexity can override the ubiquitous rigidity assumption: a rigidly rotating plane with a convex hill and a concave valley is perceived as a surface with two hills, moving non-rigidly. Our SFM stimuli consisted of two vertically elongated parts (both half-ellipsoids or both bivariate Gaussians; each a convex “hill” or a concave “valley”), centered in the left and right halves of a square plane. Competition between priors occurs with “one hill–one valley” stimuli: when convexity wins, the surface is seen as moving non-rigidly, with two hills. When the surface is rotated about its central vertical axis (Experiments 1 and 2), the non-rigid percept is of “folding” along that axis; when it is rotated about its horizontal axis (Experiments 3 and 4), the percept is of “twisting.” We manipulated the strength of convexity through the shape of the parts (half-ellipsoids, more convex, vs. bivariate Gaussians, less convex) and non-rigidity through the angular range of rotation (larger ranges lead to greater non-rigidity). Observers reported whether the surface was “rigid” or “non-rigid,” and whether the parts were “hills” or “valleys.” Under orthographic projection (Experiments 1 and 3), observers perceived non-rigid motion on a large percentage of trials for both vertical (93%) and horizontal (53%) axis rotation. The proportion of “non-rigid” responses was higher for the ellipsoids than for the Gaussians, and for smaller ranges of rotation (consistent with the expected effects of convexity and rigidity, respectively). Perspective projection (Experiments 2 and 4) lowered the overall percentage of non-rigid responses (31% and 10% for vertical and horizontal axis rotation, respectively), but it remained significantly above zero (the prediction of the rigidity assumption), and the trends observed under orthographic projection were maintained.
The results demonstrate that even when a rigid interpretation is available, and even when perspective supports that interpretation, a convexity bias can overcome both, leading to a non-rigid percept.
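
One way to see why a valley can be read as a hill under orthographic projection, but less readily under perspective, is the standard depth-reversal ambiguity: reversing depth and reversing the rotation direction leaves the orthographic image unchanged. The sketch below illustrates this for a single point. It is not the authors' stimulus code; the point coordinates, rotation angle, viewing distance, and function names are all assumptions chosen for illustration.

```python
import math

def rotate_about_vertical(point, theta):
    """Rotate an (x, y, z) point by theta radians about the vertical (y) axis."""
    x, y, z = point
    return (x * math.cos(theta) + z * math.sin(theta),
            y,
            -x * math.sin(theta) + z * math.cos(theta))

def orthographic(point):
    """Orthographic projection: keep x and y, discard depth z."""
    x, y, _ = point
    return (x, y)

def perspective(point, viewing_distance=5.0):
    """Perspective projection for a viewer on the +z axis (distance is an assumed value)."""
    x, y, z = point
    scale = viewing_distance / (viewing_distance - z)
    return (x * scale, y * scale)

theta = math.radians(20.0)        # assumed rotation angle
valley_point = (0.5, 0.2, -0.3)   # hypothetical point on a concave valley
hill_point = (0.5, 0.2, 0.3)      # the same point with depth reversed (convex hill)

# The valley rotates by +theta; the depth-reversed hill rotates by -theta.
valley_rotated = rotate_about_vertical(valley_point, theta)
hill_rotated = rotate_about_vertical(hill_point, -theta)

# Orthographic images coincide; perspective images do not.
print(orthographic(valley_rotated), orthographic(hill_rotated))
print(perspective(valley_rotated), perspective(hill_rotated))
```

Because the reversed part must rotate in the opposite direction to produce the same orthographic image, seeing the valley as a hill implies that the two halves of the surface rotate oppositely, which is consistent with the reported “folding” percept for vertical-axis rotation.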

Talk 2, 11:00 am

Can “prior knowledge” of isotropy be varied trial by trial? Yes, in slant from texture

Zihan SHEN1, Zhongting Chen1; 1School of Psychology and Cognitive Science, East China Normal University

When perceiving slant from texture, observers tend to presume that elements of texture are initially isotropic. Multiple studies confirm this presumption by showing that varying the aspect ratio of the texture (i.e., deviating from isotropy) alters slant perception from texture. However, there is as yet little literature on how this “prior knowledge” is formed. The current study addressed this issue by attempting to convey “knowledge” of isotropy/anisotropy to observers. We introduced a set of two-folded surfaces whose upper parts were slanted and whose lower parts were frontoparallel. Both parts were planar and were connected smoothly by a curved surface. The whole surface was covered with Voronoi textures, and the aspect ratio of the texture was manipulated independently for the upper and lower surfaces, in a range from 0.8 (compressed) to 1.0 (isotropic). The aspect ratio of the texture on the connecting part changed gradually to avoid any abrupt change in texture. In both experiments (N = 30 for Experiment 1; N = 47 for Experiment 2), observers viewed the two-folded surfaces and estimated the 3D slant of the upper surfaces by aligning their hand with the orientation of the upper surfaces; they were asked to ignore the lower surfaces. The estimates from both experiments showed that a decrease in the aspect ratio of the upper surfaces led observers to overestimate surface slant, as previous studies have shown. Most interestingly, a decrease in the aspect ratio of the lower surfaces, which had no direct relation to the task, made observers significantly underestimate the slant of the upper surface, even when the aspect ratios of the lower surfaces varied trial by trial. These findings indicate that isotropy is not fixed knowledge but is more likely to be contextual information. Observers can flexibly choose whether to adhere to the presumption of isotropy, depending on their understanding of the environment.
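
The overestimation reported for compressed textures follows from simple projection geometry, sketched below. This is an illustration under stated assumptions, not the authors' model: it assumes an orthographic approximation, texture compression along the direction of slant, and an observer who treats the texture as fully isotropic. The function name and the example slant of 40 degrees are hypothetical.

```python
import math

def inferred_slant_deg(true_slant_deg, texture_aspect_ratio):
    """Slant inferred by an observer who assumes isotropic texture.

    An isotropic element on a surface slanted by s projects with aspect
    ratio cos(s). If the texture itself is compressed by a factor a along
    the slant direction, the projected ratio becomes a * cos(s), and an
    isotropy-assuming observer infers acos(a * cos(s)).
    """
    projected_ratio = texture_aspect_ratio * math.cos(math.radians(true_slant_deg))
    return math.degrees(math.acos(projected_ratio))

true_slant = 40.0  # assumed example slant, in degrees
for aspect_ratio in (1.0, 0.9, 0.8):  # spans the range used in the study
    print(aspect_ratio, round(inferred_slant_deg(true_slant, aspect_ratio), 1))
```

With an aspect ratio of 1.0 the inferred slant equals the true slant; lower aspect ratios yield larger inferred slants, matching the direction of the effect for the upper surfaces. The sketch does not capture the contextual effect of the lower surface, which is the study's main finding.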

Acknowledgements: the Shanghai Municipal Natural Science Foundation [grant No. 23ZR1417900]

Talk 3, 11:15 am

Shading and contour cooperate to modulate the perceived 3D shape of disparity-defined surfaces

Celine Aubuchon1, Roland W. Fleming2,3, Fulvio Domini1; 1Cognitive, Linguistic, and Psychological Sciences, Brown University, 2Department of Experimental Psychology, Justus Liebig University Giessen, 3Centre for Mind, Brain and Behaviour, University of Marburg and Justus Liebig University Giessen

Multiple cues are used to estimate 3D shape, and these cues can mutually constrain each other. Notably, contours have a prominent effect on perceived shape from shading [1], where covariations (coordinated changes) in luminance and contour create a strong impression of 3D shape. Conversely, a bumpy shading gradient can be flattened by a smooth contour in special cases where this covariation is disrupted. However, we found no such effect of contour on disparity, which is already highly specified at close distances. This is not surprising given that contour does not generally constrain disparity information. Still, disparity is expected to constrain the interpretation of shading. Here, we tested whether the effect of covariation between contour and luminance would be strong enough to override the shape specified by disparity. We started by cropping a periodic luminance pattern with either a covarying corrugated contour or a smooth contour. This effectively modulated whether the image was perceived as a shaded 3D corrugated surface or as a smooth surface with light and dark blurry stripes. We then combined these images with disparity fields that were either corrugated or smooth. Observers were asked to report the shape of the surface in a 2AFC task. Remarkably, we found that when the surface specified by disparity was smooth, the presence of covarying luminance and corrugated contour information significantly increased the rate at which observers responded ‘corrugated’. This was the case despite the null effect of contour on the perceived shape from disparity when the luminance pattern was not present. Together, these findings suggest that the covariation between contour and luminance supports a 3D interpretation that is strong enough to override the otherwise powerful disparity cue.

[1] Todorović, D. (2014). How shape from contours affects shape from shading. Vision Research, 103, 1-10.

Acknowledgements: This material is based upon work supported by the National Science Foundation under Grant No. 2120610

Talk 4, 11:30 am

Failures in depth magnitude estimation in 3D displays

Arleen Aksay1, Deborah Giaschi2,3, Laurie M. Wilcox1; 1York University, 2The University of British Columbia, 3British Columbia Children's Hospital

Using naturalistic 3D ‘thicket’ and ‘branch’ stimuli, we have shown that experienced observers generate accurate depth magnitude estimates for fused targets viewed in virtual reality (VR). However, we have observed that individuals with little to no experience with 3D displays exhibit striking errors in estimating depth from disparity. We conducted a set of experiments with inexperienced viewers to quantify and better understand their poor performance. Our first study was a replication of our experiment with experienced observers, in which novice participants viewed low- and high-complexity stimuli using a VR headset. In the ‘branch’ condition, two branches were presented, one on either side of a central reference branch. The more complex ‘thickets’ were composed of two clusters of overlapping branches centred on a reference branch. We varied the separation between the branches, and within the thickets, from 1.5 to 12 cm by displacing their components equally in front of and behind the fixation point. Sixteen inexperienced observers indicated the overall depth of the structures with a virtual ruler. In another study we evaluated the role of depth averaging by displacing the branches and thickets in a single direction relative to fixation. Unlike our previous results with experienced observers, in both experiments we found that the association between perceived depth and increasing disparity was weak. This, together with the fact that all participants could perceive depth from all disparities in this range in a depth-order (near/far) discrimination task, argues against an explanation based on depth averaging. We conclude that cue conflicts, particularly related to the contribution of vergence to estimation of viewing distance, interfere with inexperienced participants’ ability to compute depth from disparity. Our working hypothesis is that with extended experience observers learn to disregard such conflicts; how they do this is the focus of ongoing research.

Acknowledgements: Natural Sciences and Engineering Research Council (NSERC) Grant # RGPIN-2019-06694; CF-REF program Vision Sciences to Applications (VISTA)

Talk 5, 11:45 am

Canonical perspectives of rendered 3D objects are related to affordance

Athanasios Bourganos1, Dirk B. Walther1; 1University of Toronto

Humans prefer to view objects from some perspectives but not others. Palmer, Rosch, and Chase (1981) were the first to use the term “canonical perspectives” to describe these preferred viewing angles. More recently, this phenomenon has been studied as it relates to perspective invariance, object identification (human and algorithmic), and navigation. Contemporary studies rely on some of the foundational observations of early canonical perspective research. However, those original results are contradictory in several respects. Past literature includes contradictory findings on between-observer agreement on preferred perspectives, reaction time effects, and support for mental rotation theories of 3D object perception. To address those contradictions and improve our understanding of canonical perspectives, we constructed a digital dataset of three-dimensional objects from three categories: graspable familiar objects, non-graspable familiar objects, and graspable unfamiliar objects. We rendered the objects as viewed from 26 different orientations, covering the full range of viewing angles. We collected canonical perspective ratings via a pairwise comparison task, in which participants indicated their preference between two displayed views in a two-alternative forced-choice task. We presented 325 pairs of views of each object. Ratings were highly consistent between observers. Some viewing angles of graspable objects (e.g., a coffee mug) were rated differently by left- and right-handed participants, based on experienced handle placement. This result indicates a significant connection between canonical perspective and affordance. We see a similar, although slightly weaker, effect when comparing canonical viewing angles between participants of different body heights. Taller participants are biased toward views from the top, shorter participants toward views from the front.
In summary, our results suggest that viewing angle influences people’s aesthetic preference for viewing objects, and that the preferred canonical perspective is frequently related to the individually specific affordance of a particular view.
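
The 325 pairs per object correspond to every unordered pair of the 26 rendered views, since 26 × 25 / 2 = 325. A minimal check is sketched below; the integer view indices are placeholders, not the authors' labels for the orientations.

```python
from itertools import combinations
from math import comb

n_views = 26  # rendered orientations per object, as stated in the abstract

# Every unordered pair of distinct views, using placeholder indices 0..25.
view_pairs = list(combinations(range(n_views), 2))

print(len(view_pairs))   # 325
print(comb(n_views, 2))  # 325
```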

Acknowledgements: This work was supported by NSERC Discovery Grant (RGPIN-2020-04097) and SSHRC Insight Grant (435-2023-0015) to DBW.

Talk 6, 12:00 pm

Late development of sensitivity to relative disparity in human visual cortex in the face of precocial development of sensitivity to absolute disparity

Anthony Norcia1, Milena Kaestner, Yulan Chen, Caroline Clement; 1Stanford University

Introduction: Immaturities exist at multiple levels of the developing human visual pathway, starting with immaturities in photon efficiency and spatial sampling in the retina and continuing through immaturities in early and later stages of cortical processing. Here we use Steady-State Visual Evoked Potentials (SSVEPs) and controlled visual stimuli to determine the degree to which sensitivity to horizontal retinal disparity is limited by the visibility of the monocular half-images, the ability to encode absolute disparity, or the ability to encode relative disparity. Methods: Responses were recorded from male and female participants at average ages of 5 months, 5 years, and 25 years. SSVEPs were recorded in response to contrast and blur modulation of dynamic random dot patterns to measure sensitivity to the spatio-temporal content of the monocular half-images. Disparity sensitivity was measured using planar stereograms that modulated absolute disparity and stereograms portraying disparity gratings that additionally contained relative disparity. Results: Disparity thresholds derived from SSVEP amplitude vs. disparity response functions for planar stimuli modulating absolute disparity changed little over development, but those for grating stimuli modulating relative disparity changed by a factor of ~10. Equating subjective contrasts between infants, children and adults did not equate disparity sensitivity. Disparity sensitivity at age 5 was adult-like, but disparity tuning at supra-threshold levels was not. Conclusion: The protracted developmental sequence for relative disparity coding shown in our measurements is not simply inherited from immaturities in encoding absolute disparity, but rather reflects immaturities in the computations needed to represent relative disparity that likely involve extra-striate cortical areas where relative disparity is first extracted.

Acknowledgements: This research was supported by grant EY018875 from the National Eye Institute, National Institutes of Health. The authors would like to thank Vladimir Vildavski and Alexandra Yakovleva for the development of instrumentation used in the experiments.