Learning Behavioral Representations of Color Combinations With Sequential Modeling

Poster Presentation 26.452: Saturday, May 16, 2026, 2:45 – 6:45 pm, Pavilion
Session: Color, Light and Materials: Affect, cognition

Henry A.S. Lewinsohn1, Erin Goddard2, Bevil R. Conway1; 1National Institutes of Health, 2University of New South Wales

Color preferences are important for a wide variety of perceptual behaviors. They underwrite individual choices about clothing, influence the development of technology, and impact many aspects of Western civilization. Although single-color preferences follow a universally predictable pattern and evolve consistently with age (Ling and Hurlbert, 2011), little is known about how people evaluate combinations of colors and about the underlying factors that shape these preferences. To address this question, we created a novel web-based color game played by same-sex monozygotic (MZ, N = 424) and dizygotic (DZ, N = 127) twin pairs. Participants completed two tasks: (1) they created 3×3 grids following the prompts, “make grids that you do (don’t) like,” using a palette of 37 colors sampling color space (hue, chroma, tone) and including focal basic colors; and (2) they provided aesthetic ratings of both individual colors and grids created by others. Individual color preferences replicated prior cross-cultural findings (Schloss et al., 2018). Twin modeling showed that most variance in both single-color and color-combination preferences is attributable to environmental factors, aligning with results from face-preference heritability studies (Germine et al., 2015). To characterize individual variance in decision-making behavior during grid construction, we applied sequential models originally developed for natural language processing. In addition to quantifying behavior, the sequential models learned latent representations of both color and spatial position, allowing us to derive embeddings that capture how participants implicitly structure the task space. By modeling the entire grid-construction sequence, we obtained image representations grounded in both spatial and temporal data. This approach opens avenues for comparing representations learned from sequential behavior with those derived from supervised or contrastive visual learning methods on static images, and for integrating modern sequential models with behavioral quantification.
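The abstract does not specify the sequential architecture; the sketch below is one plausible reading, assuming an autoregressive transformer over (color, cell) placement tokens from the grid-construction sequences. All names, dimensions, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed, not the authors' code): an autoregressive model over
# grid-construction steps. Each placement is a (color, cell) pair; the model
# predicts the next placement from earlier ones, learning latent embeddings
# of the 37 palette colors and the 9 grid positions along the way.
import torch
import torch.nn as nn

NUM_COLORS = 37   # palette size described in the abstract
NUM_CELLS = 9     # 3x3 grid positions
EMB_DIM = 32      # embedding size (assumed)


class GridSequenceModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.color_emb = nn.Embedding(NUM_COLORS, EMB_DIM)
        self.cell_emb = nn.Embedding(NUM_CELLS, EMB_DIM)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=EMB_DIM, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.next_color = nn.Linear(EMB_DIM, NUM_COLORS)
        self.next_cell = nn.Linear(EMB_DIM, NUM_CELLS)

    def forward(self, colors, cells):
        # colors, cells: (batch, steps) integer indices of placements so far
        x = self.color_emb(colors) + self.cell_emb(cells)
        # causal mask so each step only attends to earlier placements
        mask = nn.Transformer.generate_square_subsequent_mask(colors.size(1))
        h = self.encoder(x, mask=mask)
        return self.next_color(h), self.next_cell(h)


if __name__ == "__main__":
    model = GridSequenceModel()
    # toy batch: 4 grids, 8 placements each (the 9th placement is the target)
    colors = torch.randint(0, NUM_COLORS, (4, 8))
    cells = torch.randint(0, NUM_CELLS, (4, 8))
    color_logits, cell_logits = model(colors, cells)
    print(color_logits.shape, cell_logits.shape)  # (4, 8, 37) (4, 8, 9)
    # After training, model.color_emb.weight and model.cell_emb.weight would
    # serve as the learned color and spatial-position embeddings.
```

Under this reading, the learned embedding tables play the role of the latent color and position representations mentioned in the abstract, and pooling the encoder states over a full construction sequence would give a grid-level representation combining spatial and temporal information.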

Acknowledgements: Supported by NIH IRP (1ZIAEY000558 to BRC), NSF (0918064 to BRC) and NIH (R01 EY023322 to BRC). Contributions of NIH authors are considered Works of the United States Government. The findings and conclusions are those of the authors and do not necessarily reflect the views of the NIH or the U.S. Department of Health and Human Services.