Exploring Individual Differences in DNN Representations

Poster Presentation 56.420: Tuesday, May 19, 2026, 2:45 – 6:45 pm, Pavilion
Session: Object Recognition: Models

Yinuo Peng1, Jason Chow1, Thomas Palmeri1; 1Vanderbilt University

Deep neural networks (DNNs) are used to explore individual differences in visual cognition. We quantitatively explore differences in representational dissimilarity matrices (RDMs) for (1) the same objects across different model architectures (19 models spanning the AlexNet, VGG, ResNet, Transformer, InceptionNet, CLIP, and CORnet families), (2) different images sampled from the same object categories (from ImageNet), and (3) the same objects subjected to the kinds of transformations used as augmentation during training. We previously explored differences in DNN representations caused by random differences in initial weights and training-image order in (small) trained DNNs, arguing that meaningful individual differences in representations attributable to what might be considered trait differences need to exceed a baseline comparable to what might be considered state differences. We chose image transformations (augmentations) that maintained >95% network classification performance. We first examined the robustness of object representations (penultimate-layer representations) to transformations, testing N=19 models from different architectural families on 50 image sets and 100 image-transformation sets, comparing RDMs of the original image sets with RDMs of the transformed versions of each image set. We observed large differences in the robustness of representations to image transformation across model architectures; linear mixed-effects modeling attributed most of the variance in RDM distances to model architecture (96.5%), with little variance from image sampling (1.3%) or transformation set (2.3%). We then related differences in object representations (RDMs) across image samples from the same category, and across model architectures, to these baseline transformation effects. Representational differences across image samples exceed those caused by modest image transformation (akin to state differences).
Representational differences across models diverge and depend on architectural differences between models; for most model pairs, especially those with vastly different architectures, representational differences exceed those caused by image transformation and by differences in image sampling.
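The core RDM comparison described above can be sketched as follows. This is a minimal illustration with synthetic features, not the authors' pipeline: the abstract does not specify the dissimilarity measure used to build each RDM or the metric used to compare RDMs, so correlation distance for the RDM and Spearman correlation of the upper triangles for the RDM-to-RDM distance are assumptions (both are common choices in representational similarity analysis).

```python
import numpy as np

def rdm(features):
    # RDM from an (n_images, n_features) matrix of penultimate-layer
    # activations: 1 - Pearson correlation between each pair of images.
    # (Assumed dissimilarity; the abstract does not name one.)
    return 1.0 - np.corrcoef(features)

def rdm_distance(rdm_a, rdm_b):
    # Distance between two RDMs: 1 - Spearman correlation of their
    # upper triangles (assumed comparison metric).
    iu = np.triu_indices_from(rdm_a, k=1)
    a, b = rdm_a[iu], rdm_b[iu]
    # Spearman correlation = Pearson correlation of the ranks.
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return 1.0 - np.corrcoef(ranks_a, ranks_b)[0, 1]

# Hypothetical example: features for one 50-image set, and the same
# set after a mild transformation (simulated here as small noise).
rng = np.random.default_rng(0)
feats_orig = rng.standard_normal((50, 512))
feats_transformed = feats_orig + 0.1 * rng.standard_normal((50, 512))

baseline = rdm_distance(rdm(feats_orig), rdm(feats_transformed))
print(f"RDM distance under mild transformation: {baseline:.4f}")
```

In the study's logic, this transformation-induced RDM distance serves as the "state" baseline; differences across image samples or across architectures count as meaningful only when they exceed it.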