Effect or artifact? Assessing the stability of comparison-based scales

Poster Presentation 33.338: Sunday, May 19, 2024, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Color, Light and Materials: Neural mechanisms, models, disorders


David-Elias Künstle1, Felix A. Wichmann1; 1University of Tübingen

Measuring the subjective similarity of stimuli—for example, the visual impression of materials or the categories in object images—can be achieved through multidimensional scales. These scales represent each stimulus as a point, with inter-point distances reflecting similarity measurements from behavioral experiments. An intuitive task used in this context is the ordinal triplet comparison: "Is stimulus i more similar to stimulus j or to stimulus k?". Modern ordinal embedding algorithms infer the (metric) scale from a subset of (ordinal) triplet comparisons and remain robust to observer response noise. However, the unknown residual errors raise concerns about interpreting the scale's exact shape and about whether additional data would be necessary or helpful. These concerns call for an examination of the scale's stability. Here, we present an approach to visualize the variation of comparison-based scales via bootstrapping techniques and a probabilistic model. Simulation experiments demonstrate that variation is broadly captured by an ensemble of scales estimated from resampled trials. However, common methods to align the ensemble members in size, rotation, and translation can distort the local variation: standardization forces zero variation at some points while inflating it at others, and Procrustes analysis spreads the "noise" uniformly across all scale points. To address this, we propose a probabilistic model that identifies the variation at individual scale points. In essence, we "wiggle" scale points and observe how the triplet correspondence changes, which indicates their level of stability. These localized estimates are combined across the ensemble to provide a robust measure of variation. Simulations validate our approach, while case studies on behavioral datasets demonstrate its practical relevance: we visualize perceptual estimates as regions instead of points and identify the most variable stimuli or trials. Beyond data analysis, our stability measures enable downstream tasks such as adaptive trial selection to expedite experiments.
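To illustrate the bootstrapping step described above, the following is a minimal sketch, not the authors' implementation: it resamples triplet trials with replacement, re-fits an ordinal embedding for each bootstrap sample via a hypothetical `embed_triplets` routine (any ordinal-embedding method, e.g., from a library such as cblearn, could be plugged in), aligns the ensemble with Procrustes analysis, and summarizes the spread at each scale point. As the abstract notes, this alignment step can smear variation uniformly across points, which motivates the proposed probabilistic model.

```python
# Minimal sketch (illustrative, not the authors' method): bootstrap an ensemble
# of comparison-based scales from resampled triplets, align the ensemble with
# Procrustes analysis, and summarize per-point variation.
# `embed_triplets` is a hypothetical stand-in for an ordinal-embedding routine.
import numpy as np
from scipy.spatial import procrustes


def bootstrap_scale_variation(triplets, embed_triplets, n_boot=100, dim=2, seed=0):
    """Estimate per-point variation of an ordinal embedding via bootstrapping.

    triplets : (n_trials, 3) int array of indices (i, j, k), meaning
               "stimulus i is more similar to stimulus j than to stimulus k".
    embed_triplets : callable mapping (triplets, dim) to an (n_points, dim) scale;
                     assumed to return coordinates for all stimuli.
    """
    triplets = np.asarray(triplets)
    rng = np.random.default_rng(seed)
    reference = embed_triplets(triplets, dim)  # scale fitted on all trials

    aligned = []
    for _ in range(n_boot):
        # Resample trials with replacement and re-fit the scale.
        resampled = triplets[rng.integers(0, len(triplets), size=len(triplets))]
        scale = embed_triplets(resampled, dim)
        # Procrustes removes differences in translation, rotation, and size;
        # note the caveat that this can spread variation uniformly over points.
        _, scale_aligned, _ = procrustes(reference, scale)
        aligned.append(scale_aligned)

    aligned = np.stack(aligned)            # shape: (n_boot, n_points, dim)
    centroids = aligned.mean(axis=0)       # mean position of each scale point
    # Per-point spread: mean distance of bootstrapped points to their centroid.
    variation = np.linalg.norm(aligned - centroids, axis=-1).mean(axis=0)
    return centroids, variation
```

The returned per-point spread can then be drawn as regions around the scale points, in the spirit of the region-based visualizations mentioned for the case studies.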

Acknowledgements: Funded by EXC number 2064/1 – project number 390727645 and by the German Federal Ministry of Education and Research (BMBF): Tübingen AI Center, FKZ: 01IS18039A. Supported by the International Max Planck Research School for Intelligent Systems (IMPRS-IS).