Layer-wise representation of glossiness features in the deep neural network for object recognition.
Poster Presentation 26.445: Saturday, May 16, 2026, 2:45 – 6:45 pm, Pavilion
Session: Color, Light and Materials: Material perception
Hiroaki Kiyokawa1, Kai Yoshida1, Ichiro Kuriki1; 1Saitama University
Glossiness is one of the most salient features of object surfaces and helps identify an object's material. However, the key image features for glossiness perception are still under debate. This study addressed this issue by using the hierarchical structure of a DNN model, VGG19 (Simonyan & Zisserman, 2014), to elucidate the image features necessary for glossiness. VGG19 is known for its high similarity to the hierarchical structure of object representations in the human brain (Nonaka et al., 2020). To visualize the image features in each layer, we employed a style-transfer technique (Gatys et al., 2015) to transplant the image features of a glossy surface onto a non-glossy one, while manipulating which layers the image features were transferred from. Glossy object images were rendered using measured BRDFs (EPFL database) of metallic and non-metallic objects. Five layers after the pooling process in the VGG19 model were manipulated under two experimental conditions: transferring features from only one layer (one-layer condition) and transferring features from all but one of the five layers (one-layer-exclude condition). Human observers rated the glossiness of the style-transferred images on a 9-point scale. Perceived glossiness across all materials was highest when image features in the third layer were used under the one-layer condition. Glossiness scores were also significantly higher for metallic materials when the fourth layer was used, and for non-metallic materials when the second layer was used. In contrast, no significant difference was observed when only one layer was excluded. These results imply that the glossiness representation is spread widely across layers.
Additionally, considering that shallower DNN layers represent more spatially localized information than deeper layers, the need for a deeper layer for metallic objects and a shallower layer for non-metallic objects in the style transfer suggests that spatially distributed features are more important for rendering glossiness on metallic surfaces than on non-metallic ones.
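The two layer-selection conditions can be illustrated with a minimal sketch of a Gram-matrix style loss in the spirit of Gatys et al. (2015). This is not the authors' code: the layer names, the function names, the feature layout (a list of channels, each a flat list of activations), and the simplified normalization are all assumptions made for illustration, and feature extraction from VGG19 is left out.

```python
# Illustrative sketch only; all names below are hypothetical.
# Placeholder names for the five manipulated VGG19 layers.
LAYERS = ["layer1", "layer2", "layer3", "layer4", "layer5"]


def gram_matrix(features):
    """Channel-by-channel correlations of one layer's activations.

    `features` is a list of C channels, each a flat list of N activations.
    Dividing by N is a simplified normalization (an assumption).
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def layer_weights(condition, layer):
    """Per-layer style weights for the two conditions in the abstract."""
    if condition == "one-layer":
        # Transfer style features from the chosen layer only.
        return {name: 1.0 if name == layer else 0.0 for name in LAYERS}
    if condition == "one-layer-exclude":
        # Transfer style features from every layer except the chosen one.
        return {name: 0.0 if name == layer else 1.0 for name in LAYERS}
    raise ValueError(f"unknown condition: {condition}")


def style_loss(generated, style, weights):
    """Weighted mean squared difference between Gram matrices."""
    total = 0.0
    for name, w in weights.items():
        if w == 0.0:
            continue  # this layer does not contribute to the transfer
        g = gram_matrix(generated[name])
        s = gram_matrix(style[name])
        c = len(g)
        diff = sum((g[i][j] - s[i][j]) ** 2 for i in range(c) for j in range(c))
        total += w * diff / (c * c)
    return total
```

In a full pipeline, a loss of this kind would be minimized with respect to the generated image; setting a layer's weight to zero removes that layer's features from the transfer, which is how the one-layer and one-layer-exclude conditions differ in this sketch.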