Stimulus distributions affect uncertainty sampling approaches to adaptive estimation of classification images

Poster Presentation 53.441: Tuesday, May 21, 2024, 8:30 am – 12:30 pm, Pavilion
Session: Decision Making: Perceptual decision making 3

Rabea Turon1, Lars Reining1, Thomas S. A. Wallis1,2, Frank Jäkel1; 1Technical University of Darmstadt, 2Centre for Mind, Brain and Behaviour (CMBB), Universities of Marburg, Giessen and Darmstadt, Germany

In binary decision tasks, it is often assumed that humans perform template matching by comparing a presented stimulus to a template in their mind. In effect, the template matching process defines a decision boundary in stimulus space. Many different methods have been proposed to measure decision boundaries and relevant features (reverse correlation, bubbles, sparse regression, etc.), but to work well, all require exhaustive experimental testing. One way to measure decision boundaries efficiently is to adaptively sample stimuli that are maximally informative. A popular strategy for adaptive sampling in machine learning is “uncertainty sampling”, in which stimuli are chosen that lie close to the current estimate of the decision boundary (e.g., as used in active learning for support vector machines). Here we show that this strategy can fail to adequately constrain the decision boundary, producing estimates that are biased even relative to random sampling, depending on the stimulus distribution and observer model. Using simulations in two dimensions, we show that when class distributions are far apart in the stimulus space, uncertainty sampling repeatedly samples only a few stimuli close to the decision boundary, even for deterministic observers. This poorly constrains the direction of the decision boundary (a problem that becomes more acute in higher dimensions). Uncertainty sampling does work when the stimulus distributions densely populate the area around the decision boundary. However, in some psychophysical settings this may not be possible, for example when only a single fixed pool of stimuli is available, such as faces sampled from a limited set of identities. We propose a solution to this problem: minimizing the entropy of the posterior distribution over the parameters of a multidimensional psychometric function. This implicitly constrains uncertainty over the whole stimulus space and selects informative stimuli from the available pool.
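
To make the two sampling strategies concrete, the following is a minimal, self-contained sketch in Python/NumPy (not the authors' implementation). It assumes a 2-D stimulus space, a linear decision boundary through the origin parameterized by a single angle, a deterministic simulated observer, a fixed stimulus pool consisting of two class clusters far from the boundary, and a grid posterior over the boundary angle with a steep logistic psychometric function. All names and parameter values are illustrative choices, not taken from the poster. The sketch contrasts uncertainty sampling (choosing the pool stimulus closest to the current boundary estimate) with greedy minimization of the expected posterior entropy.

    # Minimal sketch: uncertainty sampling vs. expected-posterior-entropy
    # minimization for estimating a linear decision boundary in 2-D from a
    # fixed stimulus pool. Illustrative only; parameters are assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    # --- Simulated deterministic observer: boundary normal at theta_true ---
    theta_true = 0.3
    w_true = np.array([np.cos(theta_true), np.sin(theta_true)])

    def observer_response(x):
        """Deterministic observer: respond 1 iff the stimulus lies on the +w side."""
        return int(x @ w_true > 0)

    # --- Fixed stimulus pool: two class clusters far from the boundary -----
    pool = np.vstack([
        rng.normal(loc=+3 * w_true, scale=0.5, size=(200, 2)),
        rng.normal(loc=-3 * w_true, scale=0.5, size=(200, 2)),
    ])

    # --- Grid posterior over the boundary angle ----------------------------
    thetas = np.linspace(-np.pi / 2, np.pi / 2, 181)   # candidate boundary angles
    W = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)

    def likelihood_y1(x, slope=8.0):
        """P(response = 1 | x, theta) for every candidate theta,
        using a steep logistic psychometric function (near-deterministic)."""
        return 1.0 / (1.0 + np.exp(-slope * (W @ x)))

    def posterior_entropy(log_post):
        p = np.exp(log_post - log_post.max())
        p /= p.sum()
        return -(p * np.log(p + 1e-12)).sum()

    def run(strategy, n_trials=30):
        log_post = np.zeros_like(thetas)               # flat prior over theta
        for _ in range(n_trials):
            p = np.exp(log_post - log_post.max())
            p /= p.sum()
            if strategy == "uncertainty":
                # Pick the pool stimulus closest to the current MAP boundary.
                w_map = W[np.argmax(log_post)]
                idx = int(np.argmin(np.abs(pool @ w_map)))
            else:
                # Pick the pool stimulus minimizing the expected posterior entropy.
                scores = []
                for x in pool:
                    l1 = likelihood_y1(x)              # per-theta P(y=1 | x)
                    p1 = float(p @ l1)                 # predictive P(y=1 | x)
                    h1 = posterior_entropy(log_post + np.log(l1 + 1e-12))
                    h0 = posterior_entropy(log_post + np.log(1.0 - l1 + 1e-12))
                    scores.append(p1 * h1 + (1.0 - p1) * h0)
                idx = int(np.argmin(scores))
            x = pool[idx]
            y = observer_response(x)
            l1 = likelihood_y1(x)
            log_post += np.log((l1 if y == 1 else 1.0 - l1) + 1e-12)
        return posterior_entropy(log_post)

    for strategy in ["uncertainty", "entropy"]:
        print(f"{strategy:12s} final posterior entropy: {run(strategy):.3f}")

Because the pool clusters sit far from the true boundary, the uncertainty-sampling rule keeps returning to the same few stimuli nearest the current boundary estimate, whereas the entropy criterion scores every pool stimulus by how much it is expected to shrink the posterior over the boundary angle.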

Acknowledgements: Funded by the Hessian research priority program LOEWE within the project “WhiteBox”.