Monocular estimation of ground surface orientation in natural scenes

Poster Presentation 23.419: Saturday, May 16, 2026, 8:30 am – 12:30 pm, Pavilion
Session: 3D Shape and Space Perception: Surfaces, objects

David N. White1, Seyed Hosseini1, Richard Murray1, James H. Elder1; 1York University

Accurate estimation of 3D scene structure is essential for effective interaction with the environment. 3D reconstruction of the ground surface is particularly important for navigation and for anchoring the locations of objects resting on the ground. In the near field, stereo and motion cues provide rich depth information, but these cues become weaker in the far field, suggesting a role for monocular cues to surface orientation. To better understand this monocular contribution, we measured human slant discrimination performance for natural ground surfaces and compared it with the performance of two computational models tested with the same stimuli and task. In a two-interval task, observers judged whether a test stimulus was more or less slanted than a reference stimulus. Both stimuli were presented monocularly within a 3D virtual environment. The test stimuli were natural image patches sampled from roughly planar ground-surface regions in scenes from the SYNS image database. Reference stimuli were planar surfaces painted with a high-contrast checkerboard pattern known to elicit relatively accurate percepts of surface orientation. The tilt of the reference stimulus was matched to the tilt of the test stimulus, and the slant of the reference was systematically varied to sweep out a psychometric function for slant discrimination, providing estimates of both bias and sensitivity relative to the reference. Observers repeated each stimulus condition twice, allowing for analysis of both intra- and inter-observer variability. Results were compared with two alternative image-computable surface orientation models. The first relies upon an analysis of changes in the shape of the spectral density as a function of depth. The second relies upon perspective distortions in orientation cues.
Our results provide a human performance benchmark for the important visual task of monocular ground surface orientation estimation and offer insight into the computational mechanisms underlying this performance.
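
As an illustration of how bias and sensitivity can be read off a psychometric function for slant discrimination, the sketch below fits a cumulative Gaussian to response counts by maximum likelihood. This is a minimal sketch under stated assumptions, not the analysis used in this study: the abstract does not specify the fitting procedure, so the cumulative Gaussian form, the grid search, all function and variable names, and all numbers (including the test slant) are hypothetical and synthetic.

```python
import math

def cumulative_gaussian(x, mu, sigma):
    """Probability of judging the reference as more slanted than the test."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def neg_log_likelihood(mu, sigma, slants, n_more, n_trials):
    """Binomial negative log-likelihood of the response counts."""
    nll = 0.0
    for x, k, n in zip(slants, n_more, n_trials):
        p = cumulative_gaussian(x, mu, sigma)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep log() finite
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_psychometric(slants, n_more, n_trials, mu_grid, sigma_grid):
    """Exhaustive grid search for the maximum-likelihood (mu, sigma)."""
    best = None
    for mu in mu_grid:
        for sigma in sigma_grid:
            nll = neg_log_likelihood(mu, sigma, slants, n_more, n_trials)
            if best is None or nll < best[0]:
                best = (nll, mu, sigma)
    return best[1], best[2]

# Synthetic illustration data (NOT from the study):
# reference slants in degrees, and the number of trials (out of 40)
# on which the reference was judged more slanted than the test.
reference_slants = [30, 35, 40, 45, 50, 55, 60, 65, 70]
n_more_slanted = [0, 1, 4, 11, 20, 29, 36, 39, 40]
n_trials = [40] * len(reference_slants)

mu_grid = [30 + 0.5 * i for i in range(81)]    # 30 to 70 deg
sigma_grid = [2 + 0.5 * i for i in range(37)]  # 2 to 20 deg

pse, sigma = fit_psychometric(
    reference_slants, n_more_slanted, n_trials, mu_grid, sigma_grid
)

test_slant_deg = 48.0          # hypothetical ground-truth slant of the test patch
bias = pse - test_slant_deg    # signed error in perceived slant
print(f"PSE = {pse:.1f} deg, bias = {bias:.1f} deg, sigma = {sigma:.1f} deg")
```

In this sketch the fitted mean is the point of subjective equality (the reference slant that appears equal to the test), so its offset from the true test slant serves as the bias estimate, and the fitted standard deviation is inversely related to sensitivity. A grid search is used only to keep the example dependency-free; a gradient-based optimizer would be a natural substitute.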

Acknowledgements: Supported by the VISTA postdoctoral fellowship at York University