Talk 1, 2:30 pm
Sensor-based quantization of a color image disturbs material perception
Materials in our daily environments undergo diverse color changes depending on environmental context. For instance, water is inherently colorless, but wetting a surface changes its colors through optical interactions. Previous studies have explored the effect of color on material perception by following the object-recognition literature, i.e., examining the effect of adding categorical colors to a grayscale image. Unlike in object recognition, however, categorical colors are not always diagnostic of material changes because of this context dependence. To address this issue, this study explores color dimensions that are diagnostic of material perception. Building on recent studies showing that material perception depends on image color entropy (Sawayama et al., 2017), this study investigated the extent to which modulating image color entropy, defined by color quantization in a sensor color space (e.g., RGB or LMS), affects material estimation. Specifically, the experiment leveraged a zero-shot prediction paradigm using pre-trained vision-and-language models, with 2AFC text prompts related to material perception, such as wet/dry or glossy/matte. The FMD (Sharan et al., 2014) and THINGS (Hebart et al., 2019) datasets provided the visual images. Color quantization was applied to each image using the median-cut algorithm, reducing the number of quantized colors from 128 to 2. Additionally, grayscale images were created from the original images. Results showed that prediction probabilities were diversely distributed for original and grayscale images across all dataset images. When an original image was modulated by color quantization, however, the distribution became heavily biased towards specific attributes, particularly dry and matte. Further experiments confirmed that color quantization has less impact on zero-shot object recognition performance.
These findings suggest that diverse material perception of an object image requires high color entropy in a color space that mixes chromatic and luminance components.
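The median-cut quantization step described above can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation; the function name and the pixel-list representation (a list of RGB tuples) are assumptions made for the example.

```python
def median_cut(pixels, n_colors):
    """Median-cut palette: recursively split RGB pixels along the widest channel."""
    def channel_range(box, ch):
        vals = [p[ch] for p in box]
        return max(vals) - min(vals)

    boxes = [list(pixels)]
    while len(boxes) < n_colors:
        # split the box with the largest spread in any single channel
        box = max(boxes, key=lambda b: max(channel_range(b, c) for c in range(3)))
        if len(box) < 2:
            break  # cannot split a single-pixel box further
        ch = max(range(3), key=lambda c: channel_range(box, c))
        box.sort(key=lambda p: p[ch])
        mid = len(box) // 2
        boxes.remove(box)
        boxes += [box[:mid], box[mid:]]
    # represent each box by its mean color
    return [tuple(sum(p[c] for p in b) // len(b) for c in range(3)) for b in boxes]
```

Reducing `n_colors` from 128 down to 2, as in the experiment, progressively collapses the image's color distribution and hence its color entropy.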
Talk 2, 2:45 pm
Cortical representations of core visual material dimensions
Hua-Chun Sun1,2, Filipp Schmidt1,2, Alexandra C. Schmid3, Martin N. Hebart1,2,4, Roland W. Fleming1,2; 1Justus Liebig University Giessen, 2Center for Mind, Brain and Behavior (CMBB), University of Marburg and Justus Liebig University Giessen, 3Laboratory of Brain and Cognition, National Institute of Mental Health, National Institutes of Health, 4Vision and Computational Cognition Group, Max Planck Institute for Human Cognitive and Brain Sciences
In every waking moment, we perceive numerous visual materials from the objects, surfaces and environment around us. How does the brain represent the great diversity of materials and their properties? We recently addressed the mental representation of materials using a novel dataset consisting of 600 images spanning 200 material categories (the STUFF dataset), by crowdsourcing over 1.8 million material similarity judgments. This revealed 36 core dimensions that capture similarity relationships between materials (Schmidt, Hebart, Schmid & Fleming, 2023). To determine the neural representation of these dimensions in the human brain, here we acquired a densely sampled functional MRI dataset using these images, which we paired with an encoding model of the 36 material dimensions. Each of the 600 images was presented 14 times to each of six participants across multiple scanning sessions. The whole-brain activation map of each material dimension was then obtained by modeling the dimension score of each image in each of the 36 dimensions (Schmidt et al., 2023). Comparing the voxel-wise activation intensity across material dimensions revealed superimposed cortical maps associated with each of the dimensions. We found that dimensions related to the fine-scale granularity of the material are particularly represented in early visual areas (V1-V3). In contrast, dimensions related to hard shapes preferentially activated lateral occipital (LO) cortex, indicating a dichotomy between cortical regions associated with shape and fine texture. Flexible and soft material dimensions exhibited particularly strong responses in area hMT+/V5, suggesting that motion-sensitive regions also encode the capacity of materials to deform. Finally, color dimensions, which span diverse material categories, were represented less consistently across participants, suggesting that material properties might actually be a more consistent organizing principle than color.
Together, our findings provide a comprehensive mapping of material representations across cortical regions in the human brain.
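In its simplest form, an encoding model of one material dimension reduces to a per-voxel linear regression of the voxel's response on each image's dimension score. The sketch below shows that single-dimension case only; the study fits all 36 dimensions jointly, and the function name and data layout are illustrative assumptions.

```python
def encode_dimension(scores, voxel_responses):
    """Closed-form slope of a simple linear regression: voxel response on dimension score.

    A large positive slope means the voxel responds more strongly to images
    that score high on this material dimension.
    """
    n = len(scores)
    mean_s = sum(scores) / n
    mean_r = sum(voxel_responses) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(scores, voxel_responses))
    var = sum((s - mean_s) ** 2 for s in scores)
    return cov / var
```

Repeating this over every voxel yields the whole-brain activation map for that dimension; comparing maps across dimensions gives the superimposed cortical maps described above.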
Acknowledgements: This research is funded by the DFG (222641018 – SFB/TRR 135 TP C1), the HMWK (“The Adaptive Mind”) and European Research Council Grant ERC-2022-AdG “STUFF” (project number 101098225).
Talk 3, 3:00 pm
Perception of material properties from dynamic line drawings
Amna Malik1, Ying Yu5, Huseyin Boyaci1,2,3,4, Katja Doerschner1; 1Department of Psychology, Justus Liebig University, Giessen, Germany, 2Interdisciplinary Neuroscience Program, Bilkent University, Ankara, Turkey, 3Aysel Sabuncu Brain Research Center & National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara, Turkey, 4Department of Psychology, Bilkent University, Ankara, Turkey, 5EYSZ Inc., United States
Recent studies have shown that people are able to recognize material qualities based on motion information alone (Schmid et al., 2018; Bi et al., 2018). In these experiments, stimuli were deforming nonrigid objects (e.g. liquids, jelly, cloth) with dots ‘stuck on’ at random places in and on the object. The dots ‘inherited’ the kinematic material properties and global shape deformations of the object, and the overall motion pattern yielded vivid nonrigid material qualities, such as wobbliness. Dynamic dot materials contain both interior and boundary motion. However, how much does each of these types of motion contribute to the percept of a given material quality? To answer this question, we contrast ratings of material qualities of dynamic dot stimuli with those for dynamic (out)line drawings of the same object-material deformations, as well as with ratings for corresponding full-texture (i.e. color & reflectance) renderings (for sample stimuli, see https://jlubox.uni-giessen.de/getlink/fi9xJ5W9kN1drXPBodkz1HCY/). Animations of five material categories (fabrics, hard breakables, jelly, liquids, smoke) were each rated six times on eight material attributes (dense, flexible, wobbly, fluid, airy motion, motion coherence, oscillatory motion, and motion dynamics), blocked by rendering style (dots, lines, full). Comparing dissimilarity matrices and cluster analyses of attribute ratings between the three rendering conditions suggests that (1) animated line drawings also vividly convey mechanical material properties, and (2) similarity in material judgements between line drawings and fully textured animations was larger than that between dynamic dots and fully textured stimuli. We conclude that boundary motion might play a critical role in the perception of mechanical material qualities.
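The dissimilarity analysis described above can be illustrated with a minimal sketch: each stimulus is represented by its vector of eight attribute ratings, and pairwise Euclidean distances between those vectors form the dissimilarity matrix that is then compared across rendering conditions. The function name and distance metric are assumptions for illustration, not the authors' exact analysis.

```python
def dissimilarity_matrix(rating_vectors):
    """Pairwise Euclidean distances between per-stimulus attribute-rating vectors."""
    n = len(rating_vectors)
    return [[sum((a - b) ** 2 for a, b in zip(rating_vectors[i], rating_vectors[j])) ** 0.5
             for j in range(n)]
            for i in range(n)]
```

Computing one such matrix per rendering style (dots, lines, full) and correlating them quantifies how similarly materials are judged across the three conditions.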
Talk 4, 3:15 pm
Perceiving materials and objects from semi-visible interactions
Visual material perception is computationally complex because physical properties such as rigidity or friction are not directly observable. In many cases, however, viewing a dynamic interaction between different objects reveals their internal properties. If a couch cushion deforms dynamically under the weight of a box, we infer the cushion’s stiffness as well as the weight of the box. This indicates that the brain jointly interprets the interplay of multiple objects in a physical scene. Can the brain infer the physical structure when only one of the interacting objects is visible, while all others are artificially rendered invisible? To answer this question, we leveraged computer graphics: First, we simulated short interactions of liquid, granular, and non-rigid materials with rigid objects of various shapes. Then, crucially, we rendered only the target material while the remaining scene was black. We presented the videos to 100 observers and asked them to identify which of two alternative interactions showed the same target material as the test video. Match and distractor varied in their material properties (e.g., cohesion), thus implicitly requiring inference of those parameters. Observers were as accurate in judging these videos as they were when presented with fully rendered versions. Strikingly, we found that observers not only perceived the target material in rich detail; in most cases, they were also able to select which of two alternative 3D shapes was underlying the observed interaction. This finding suggests that the brain imputes the hidden objects in a physically plausible manner. In comparison, a distance-based classifier based on features from pretrained neural networks showed overall lower performance in both tasks, and its pattern of errors differed from that of human observers. Taken together, our results are consistent with the hypothesis that people use an internal generative physics model in online perception.
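The distance-based baseline can be sketched as a nearest-neighbor rule in feature space: the model chooses whichever alternative's features lie closer to the test video's. The cosine distance and function names here are assumptions for illustration; in the study the feature vectors would come from a pretrained neural network.

```python
def cosine_distance(u, v):
    """1 minus cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return 1.0 - dot / (norm_u * norm_v)

def match_2afc(test_feat, alt_a, alt_b):
    """Pick the alternative whose features lie closer to the test video's."""
    if cosine_distance(test_feat, alt_a) < cosine_distance(test_feat, alt_b):
        return 'a'
    return 'b'
```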
Acknowledgements: This work was supported by the German Research Foundation (grant PA 3723/1-1) and National Science Foundation NSF NCS Project 6945933.
Talk 5, 3:30 pm
The sound of shininess: Cross-modal influence of auditory pitch on the perception of gloss
When shopping for jewelry online, we typically just see pictures of items, and cannot hear or feel them. One might assume that this is not a big loss—at the very least, that removing auditory or tactile information should not influence our perception of intuitively visual properties, such as gloss. However, it is also possible that perceivers irresistibly integrate auditory information, such as pitch, into their perception of gloss. We investigated this in two experiments. In Experiment 1, subjects saw pairs of spheres, which were both rendered in the same material (metal, wood, or leather). The spheres differed slightly in glossiness, and subjects reported which was glossier, while ignoring concomitant sounds. In one condition, the glossier sphere was paired with a high-pitched sound and the less glossy sphere with a low-pitched sound; in the other condition, the pairings were reversed. Surprisingly, subjects could not ignore the sounds when discriminating gloss. Rather, they were more accurate when the glossier sphere was paired with the high-pitched sound, suggesting an automatic association between higher pitch and higher gloss. These objects were computer-generated; does the same association also hold when viewing pictures of real objects? In Experiment 2, subjects viewed pictures of jewelry from the Metropolitan Museum of Art’s digital archive. Each item was paired once with a high-pitched sound and once with a low-pitched sound. Subjects were instructed to ignore the sounds, and to rate the gloss of each item from “Not at all shiny” to “Very shiny”. They rated jewelry items as much shinier when paired with the high-pitched sound, indicating that the association between high pitch and high gloss also holds for pictures of real objects. We conclude that when displaying objects in digital spaces such as online stores and museum catalogs, auditory pitch can be used to drive impressions of gloss.
Talk 6, 3:45 pm
Lightness constancy can be very weak in an immersive VR environment
Previous studies have revealed important differences between how viewers perceive real and virtual scenes. Virtual reality (VR) plays a growing role in performance-critical applications such as medical training and vision research, and so it is crucial to characterize perceptual differences between real and VR environments. We compared lightness constancy in real and VR environments. We used a demanding task that required observers to compensate for the orientation of a reference patch relative to a light source in a complex scene. On each trial the reference patch had reflectance 0.40 or 0.58, and a range of 3D orientations (azimuth −50° to 50°). Ten observers adjusted a grey match patch to match the perceived grey of the reference patch. We used a custom-built physical apparatus, and four VR conditions: All-Cues (replicated the physical apparatus); Reduced-Depth (zero disparity, no parallax); Shadowless (no cast shadows); and Reduced-Context (no surrounding objects). Scenes were rendered in Unity and shown in a Rift S headset. Surprisingly, constancy was weak, and approximately the same in all conditions. The mean Thouless ratio (0 = no constancy, 1 = perfect constancy) was 0.40, with no significant differences between conditions. The above-zero constancy in the Reduced-Context condition, with no cues to support constancy, suggested that observers learned environmental lighting cues in some conditions and transferred this knowledge to other conditions. Accordingly, we re-tested the All-Cues and Reduced-Context conditions in VR, with 10 new observers per condition, and each observer ran in just one condition. Here we found substantially reduced constancy (average Thouless ratio 0.14). We conclude that lightness constancy can be weak in VR, and that observers may use lighting information from real environments to guide performance in virtual environments.
We are currently developing experiments with high-performance VR configurations to test whether constancy improves with more realistic rendering of lights and materials.
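The Thouless ratio reported above measures how far the observer's match shifts from the no-constancy (proximal) prediction toward the perfect-constancy prediction. One standard formulation, sketched below, computes this in log space; the argument names are illustrative, and the abstract does not state which exact variant the authors used.

```python
import math

def thouless_ratio(match, no_constancy_pred, full_constancy_pred):
    """Thouless ratio in log space.

    0 = no constancy (match equals the proximal prediction),
    1 = perfect constancy (match equals the constancy prediction).
    """
    return (math.log(match) - math.log(no_constancy_pred)) / (
        math.log(full_constancy_pred) - math.log(no_constancy_pred))
```

On this scale, the reported drop from 0.40 to 0.14 means matches moved much closer to the no-constancy prediction when observers saw only one VR condition.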
Acknowledgements: NSERC, VISTA
Talk 7, 4:00 pm
Representational momentum is domain-general: Evidence from brightness
A classic finding in visual cognition is “representational momentum”. Show people a photo of a wave crashing on the beach and they are prone to confuse it with a photo taken a moment later. Abruptly mask a video of a rotating shape or a rapidly melting ice cube and people will overestimate how far they saw the shape rotate or the ice melt. Anticipated motion affects what we see, or at least what we remember seeing. Prior research has argued that representational momentum is strictly limited to anticipated motion, in part by appealing to experiments that found no representational momentum for changes in brightness. Here, we refute this claim with new evidence. In five experiments, we demonstrate that richer stimuli and a more sensitive task reveal people to experience representational momentum for changes in brightness. Participants watched animations of a stationary, achromatic shape that increased in brightness before being masked. Using a slider, they selected a specific frame from each animation to indicate precisely how bright they thought the shape was when it disappeared. Participants reliably judged the shape to have been brighter than what they had truly been shown. We found analogous representational momentum effects for darkening shapes, the brightness of chromatic stimuli, and—generalizing our results beyond 2D shapes—the brightness of the ambient light illuminating a 3D object in a computer-rendered scene. These findings are unlikely to be an artifact of the task design because representational momentum for brightness replicated in a 2AFC version of the task and, most importantly, no analogous effects were observed for changes in hue, which are less intuitively directional and predictable. These results suggest representational momentum is a domain-general phenomenon related to anticipatable change, not one that is narrowly limited to motion. The mind actively anticipates changes in many perceptual domains.