An experimental investigation of how stimulus shape modulates perceptual stability in the silhouette illusion

Poster Presentation 26.428: Saturday, May 16, 2026, 2:45 – 6:45 pm, Pavilion
Session: Perceptual Organization: Features, parts, wholes, objects

Kohei Ronden1, Kiyofumi Miyoshi1, Shin'ya Nishida1; 1Kyoto University

In bistable perception, the interpretation of an ambiguous stimulus alternates spontaneously over time. Perceptual stability has been theoretically explained in terms of attractor “well depths” in an energy landscape, as well as model parameters such as input strength, adaptation, and noise. However, how perceptual stability is determined by physical stimulus properties remains unclear. Here, using the silhouette dancer illusion, in which perceived rotation switches between clockwise (CW) and counterclockwise (CCW), we explored how stimulus shape and representational format (silhouette vs. dot cloud) modulate perceptual stability. Forty-five participants viewed nine ambiguous rotating stimuli for three 60-second trials each and reported perceptual changes via keypress. All nine stimuli (five biological and four non-biological variants, each set including one dot-cloud version) were constructed in Blender from a baseline silhouette reproducing the Spinning Dancer pose. Perceptual states were categorized as CW, CCW, Swing (an oscillatory motion between lateral endpoints), or other. Dominance durations, pooled across CW and CCW percepts, were modeled with a Gamma-family GLMM, and all stimulus pairs were compared with FDR correction. Compared to the baseline spinning-dancer stimulus, a symmetric pose and a cylinder-simplified dancer showed shorter dominance durations, whereas an inverted, unfamiliar pose yielded longer durations. One non-biological stimulus (a chain of five connected cylinders) also showed significantly longer durations. A scrambled dot-cloud stimulus showed shorter durations than a structured dot-cloud dancer, while no significant difference emerged between the silhouette and dot-cloud dancer versions. PredNet next-frame prediction accuracies were computed for the seven silhouette stimuli, and a subset showed positive correlations with dominance duration.
The correlation suggests that shape complexity modulates predictability, which in turn may influence the depth of attractor wells and thereby regulate perceptual stability. These results suggest that structural complexity and familiarity, rather than silhouette format or biological relevance, drive differences in perceptual stability.
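
As an illustration of the pairwise comparison step, the sketch below enumerates all stimulus pairs and applies a false discovery rate adjustment to a list of p-values. The abstract does not state which FDR procedure was used; the Benjamini-Hochberg step-up procedure is assumed here, and the stimulus labels and the function name `benjamini_hochberg` are hypothetical. With nine stimuli there are 36 pairwise contrasts.

```python
from itertools import combinations


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg adjustment (assumed procedure, not confirmed by the abstract).

    Returns (adjusted_p_values, reject_flags) in the same order as the input.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    reject = [q <= alpha for q in adjusted]
    return adjusted, reject


# Hypothetical labels standing in for the nine stimuli.
stimuli = [f"stimulus_{i}" for i in range(1, 10)]
pairs = list(combinations(stimuli, 2))  # every unordered stimulus pair

# Toy p-values for demonstration only; these are not results from the study.
toy_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p, rejected = benjamini_hochberg(toy_p)
```

In practice the input would be one p-value per pairwise contrast from the fitted model, so the list passed to the function would have the same length as `pairs`.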

Acknowledgements: This work was supported by JSPS KAKENHI (JP24H00721).