The quest for 'the average person' in vision science: Promise and pitfalls
Symposium: Friday, May 15, 2026, 1:45 – 3:45 pm, Talk Room 2
Organizers: Jeremy Wilmer1, Michael Herzog2; 1Wellesley College, 2Swiss Federal Institute of Technology (EPFL)
Presenters: Michael Herzog, Jeremy Wilmer, Jonathan Winawer, Anna Kosovicheva, Michael Webster, Alasdair Clarke
In everyday language, average means typical, usual, and ordinary. In data analysis, the average is a mathematically defined summary statistic. A good summary statistic should match the ordinary meaning of average, capturing the main tendencies while “dispensing with needless details” (Oxford Languages). But does the scientific practice of describing the human mind and brain through the average reasonably capture what is important? Human variation is ubiquitous. It demands explanation, gives life color, and inspires scientific curiosity. Moreover, as soon as one considers multiple dimensions—even highly correlated ones—the "average person" quickly ceases to exist (Rose, 2015; Downey, 2024). In this sense, the average person may, in fact, represent no one at all. At the same time, data descriptions that retain all details can be uninterpretable; they may lead one to miss the forest for the trees. In this symposium, we critically examine the scientific reliance on averaging across individuals in vision research. When does this approach succeed? When does it fall short? And what complementary methods might it require? The talks in this symposium tackle these questions via diverse methodologies (neural, behavioral, computational) and across multiple content areas (faces, illusions, color, scenes, graphs, eye movements). Herzog and colleagues show that the classic N1 component of visual evoked potentials is an artifact of averaging, and that there is no common factor in visual illusions; in both cases, the average response is atypical and deviations from it constitute stable individual traits. Wilmer and colleagues show that the average viewer’s reading of plotted averages masks massive variability and misperception, concluding that such graphs are a suboptimal method of visual data communication. 
Winawer and Benson show that standard cortical atlases, derived from population averages, often have large errors when applied to individual brains, and explore parcellation methods that balance efficiency with individual-level accuracy. Kosovicheva and colleagues show that even processes as simple as localization vary considerably between individuals and are tied to other forms of spatial judgment, suggesting that results relying on the average observer do not reflect individuals’ perceptual reality. Webster and colleagues review the individual differences that are manifest at different levels of color vision and its neural coding, and how these differences can be compensated for to bring observers closer to a common color experience. Finally, Clarke and colleagues consider what we can infer from correlational analyses on averaged data and propose new modeling solutions for studying individual differences. Taken together, these talks reveal a wide array of rich insights embedded in non-average-ness; that is, in deviations from the average. They show, for example, that patterns of covariance across individuals can reveal mechanisms; that proper stratification before averaging can reap the benefits of averaging, such as an increased signal-to-noise ratio, without compromising naturalism; and that the existence and magnitude of stable variation can be a phenomenon worthy of study in its own right. We hope that this symposium will stimulate ongoing discussion about the benefits and risks of averaging data across individuals, and about the complementarity between traditional average-based approaches and approaches that embrace individual differences.
Talk 1
Inter-individual variability is (often) not noise
Michael Herzog1, Melissa Faggella1, Simona Garrobio1; 1Swiss Federal Institute of Technology (EPFL)
Classically, vision scientists treat inter-individual differences as a nuisance and eliminate them by averaging across individuals (the grand average). Behind this procedure is the assumption that there is one “true” waveform hidden in noise, i.e., that vision research is ergodic: conclusions based on the mean are also valid for single participants, and vice versa. Here, we tested this assumption. We analysed visual evoked potentials (VEPs) recorded during a backward masking task and found that different observers had different numbers of peaks in their waveforms. When averaging, we obtained a single peak, the classic N1 component. We tested the same participants 5 and 10 years later and obtained virtually the same waveforms, i.e., a participant with 3 peaks still had 3 peaks a decade later. Hence, differences in waveforms are traits, not noise, and the N1 is an artifact of averaging. In fact, many participants did not show an N1 at all. Similar results hold for behavioural tests. Participants performed a battery of illusions. We found almost no correlations between the illusions: an observer may experience a strong Ebbinghaus but a weak Müller-Lyer illusion. Hence, there is no common factor for illusions. Still, within each participant, the respective illusion magnitudes were stable across a year. We will discuss the implications of heterogeneity for vision research, including how to measure such heterogeneity properly, and we will show how inter-individual differences may open new avenues in vision research.
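The averaging artifact described here can be illustrated with a toy simulation (the waveform shapes, peak latencies, and observer counts below are invented for illustration; they are not the study's data). Observers with one, two, or three stable negative deflections at idiosyncratic latencies yield, once averaged, a single broad deflection that resembles none of them:

```python
import numpy as np
from scipy.signal import find_peaks

t = np.linspace(0.05, 0.30, 1000)  # time (s) post-stimulus

def wave(latencies, sigma=0.01):
    """Hypothetical VEP: one negative Gaussian deflection per peak latency."""
    return -sum(np.exp(-0.5 * ((t - mu) / sigma) ** 2) for mu in latencies)

# Five hypothetical observers with idiosyncratic, stable peak latencies
observers = [
    wave([0.13]),
    wave([0.15, 0.21]),
    wave([0.14, 0.17, 0.20]),
    wave([0.16, 0.19]),
    wave([0.12, 0.18, 0.22]),
]

grand_average = np.mean(observers, axis=0)

def n_troughs(w):
    # count prominent negative deflections
    peaks, _ = find_peaks(-w, prominence=0.3)
    return len(peaks)

# Individual observers show 1-3 troughs; the grand average shows only one.
```

The single trough in the grand average is not the typical waveform; it is a smooth compromise between mutually cancelling individual latencies.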
Talk 2
Graphed averages sow disagreement: The average reading of an average is atypical
Jeremy Wilmer1, Sarah Kerns1, Ken Nakayama2; 1Wellesley College, 2University of California, Berkeley
A key aim when graphing data is that, as far as possible, different viewers should gain a comparable conception of the data from the same graph. The average (mean) has long been the most commonly computed and plotted value in vision research. Yet little is known about how plotted averages are perceived and understood, especially by viewers who vary in statistical training. Here, we first developed an accessible yet information-rich drawing-based measure of how viewers read graphs of averages. We then gathered readings from a broad array of participants. Surprisingly, roughly 1 in 5 viewers, across a wide range of education levels, misinterpreted the bar in a bar chart of averages as depicting the range of the data, with the average frequently placed near the middle of the bar. The average reading of the average, therefore, was inside the bar, a location with few to no actual responses. A second surprising finding was that viewers both wildly underestimated and disagreed about the amount of variation around individual averages. For example, for a graph taken directly from the most popular Introductory Psychology textbook, a gender difference was falsely perceived, by the average viewer, to have zero overlap, as if all men outperformed all women. Moreover, across many graphs and levels of statistical training, viewers’ conceptions of variation and overlap differed so much that, again, the average response reflected virtually no actual responses. We conclude that average-only graphs routinely fail at their basic task of consistent and correct data communication.
Talk 3
Common patterns and individual differences in visual cortical maps
Jonathan Winawer1, Noah Benson2; 1New York University, 2University of Washington
About a quarter of the human cerebral cortex is visual, consisting of multiple retinotopic maps and other areas that are highly responsive to certain classes of stimuli such as faces. Identifying these areas in individual participants has many applications in neuroscience and medicine: interpreting visual disorders; tracking cortical changes over development and aging; localizing intracranial electrodes in patients; characterizing how different areas respond to stimuli or tasks; tracing EEG or MEG signals back to specific cortical maps. Functional MRI experiments can identify most of these areas in any healthy individual, but this can be time-consuming and costly. An alternative way to identify visual areas is by standardized atlases, which take advantage of the tendency for specific visual areas to align with particular sulci and gyri. The atlases rely on two levels of averaging: one for the cortical surface itself, and another for the location of visual maps on this surface. Such atlases are now used in many research studies. However, atlases without functional localization at the individual level raise a number of questions: How often do they mislabel cortical areas? When they err, how large is the error? In short, how much do we lose by treating each individual as if they are the average? We argue that for some applications, cortical atlases without individual functional localizer data are insufficiently accurate. We then consider new methods that have some of the efficiency advantages of atlases while still respecting individual differences.
Talk 4
Individual differences in localization reveal links between fundamental visual processes
Anna Kosovicheva1; 1University of Toronto Mississauga
Approaches to examining individual differences in visual perception often emphasize high-level processes like object recognition, but what variability can we observe in fundamental visual tasks, and are these between-observer differences simply random noise? In a striking example of such variability, several studies demonstrate that individuals exhibit consistent, idiosyncratic patterns of directional error in position judgments. In a series of laboratory studies, participants reported the positions of briefly flashed peripheral targets, and the directional errors in these tasks were highly stable over several months and consistent across measurement methods (Kosovicheva & Whitney, 2017). These errors are only weakly correlated between observers, such that they largely cancel out and the “average observer” appears accurate. Importantly, these directional biases have been linked to individual differences in other fundamental processes, including perceived size and acuity (Wang, Murai, & Whitney, 2020), as well as the strength of visual crowding (Haseeb, Wolfe, & Kosovicheva, 2023). More recently, we have shown that these idiosyncrasies exist not only in laboratory samples but also in much larger, less-controlled tasks (over 9,400 observers and 4.5 million trials), demonstrating that stable individual variation is a meaningful, underacknowledged element of localization. In this online sample, we found that directional localization errors were, on average, only weakly correlated between all possible pairs of participants, yet were highly consistent and stable within each observer. Focusing on the average observer has revealed much about vision, but understanding visual function requires tackling the complexity and variability of individuals to reveal important links between different processes.
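The pattern of stable-within, weak-between correlations can be sketched with a toy split-half simulation (the observer counts, location counts, and noise levels below are invented for illustration; here idiosyncratic biases are drawn independently, so between-observer correlations hover near zero rather than being merely weak):

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_loc, n_rep = 30, 24, 40

# Stable idiosyncratic directional bias (deg) at each probed location
bias = rng.normal(0.0, 1.0, (n_obs, n_loc))

# Two sessions: the same stable bias plus independent trial noise,
# averaged over repeated presentations at each location
noise = rng.normal(0.0, 2.0, (2, n_obs, n_loc, n_rep)).mean(axis=-1)
sessions = bias[None, :, :] + noise

# Within-observer consistency: correlate each observer's two sessions
within = np.mean([np.corrcoef(sessions[0, i], sessions[1, i])[0, 1]
                  for i in range(n_obs)])

# Between-observer similarity: correlate all pairs of different observers
between = np.mean([np.corrcoef(sessions[0, i], sessions[0, j])[0, 1]
                   for i in range(n_obs) for j in range(i + 1, n_obs)])
```

Within-observer correlations come out high while between-observer correlations sit near zero, mirroring the qualitative pattern described above: the error field is a trait of the observer, not shared structure across observers.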
Talk 5
Interpreting and accounting for individual differences in color vision
Michael Webster1, Kara Emery2, Camilla Simoncelli1; 1University of Nevada, Reno, 2New York University
The concept of a “standard” or average observer is central to color science and its applications, but it masks the marked variability present even among observers with “typical” trichromatic color vision. These differences arise independently at many levels and affect diverse aspects of color coding, from genetics and sensitivity to color perception and cognition. As a result, two observers viewing the same stimulus may have very different experiences. Measurements of these variations have provided important and often surprising clues about the mechanisms and representational structure of color vision. The differences can be amplified on modern wide-gamut displays, and there is increasing recognition of, and interest in, accounting for them to better display and communicate information about color. One approach is to build the individual differences into the stimulus itself, tailoring it to each observer so that two observers, each viewing their own stimulus, have more standard color experiences.
Talk 6
Generative models can explain (or at least help us understand) individual differences in cognition
Alasdair Clarke1, Anna Hughes1, Amelia Hunt2; 1University of Essex, 2University of Aberdeen
Nearly all experiments in cognition and visual perception involve taking multiple measurements from each participant. This includes taking multiple recordings of the same quantity (e.g., recording response time on a number of trials) and taking multiple measurements per trial (e.g., response time and accuracy). The number of measurements expands as we add eye-tracking metrics, EEG statistics, and so on. Researchers interested in individual differences typically assume that these measurements are i) mutually independent and ii) free from confounds, before taking simple summary statistics and then applying correlation procedures. A general theme of research on individual differences in our field has been a lack of correlations. I argue that, on reflection, this shouldn’t be surprising, as the measurements we put into our correlations are imprecise composites of multiple causal factors. For example, it is well established that reaction times are influenced by learning and by inter-trial serial dependencies. This means that participants in a hypothetical visual search experiment may all have identical “skill” in search, yet exhibit wildly different RTs due to differences in learning rates and serial-dependency effects. I present simulated data to demonstrate how such processes can both mask real correlations and produce spurious ones. A potential solution is the use of generative models with latent parameters. I provide an example in the form of FoMo, a model that decomposes performance in visual foraging into a small number of per-participant parameters.
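The composite-measurement problem can be sketched with a toy simulation in the spirit of the argument above (the generative form, parameter ranges, and numbers are invented for illustration; this is not the FoMo model itself). Every simulated participant has identical "skill" in two tasks, yet summary RTs spread widely and correlate across tasks, driven entirely by a shared practice-effect trait:

```python
import numpy as np

rng = np.random.default_rng(1)
n_subj, n_trials = 100, 200
trials = np.arange(n_trials)

# Every participant has IDENTICAL underlying "skill" in both tasks...
skill_a, skill_b = 0.6, 0.9  # hypothetical baseline RTs (s)

# ...but idiosyncratic practice effects (a person-level trait shared across tasks)
amp = rng.uniform(0.2, 1.5, n_subj)    # size of each person's practice effect
tau_a = rng.uniform(10, 60, n_subj)    # learning time constant, task A
tau_b = rng.uniform(10, 60, n_subj)    # learning time constant, task B

def simulate(base, tau):
    """RT = constant skill + decaying practice effect + trial noise."""
    return (base
            + amp[:, None] * np.exp(-trials / tau[:, None])
            + rng.normal(0, 0.05, (n_subj, n_trials)))

mean_a = simulate(skill_a, tau_a).mean(axis=1)
mean_b = simulate(skill_b, tau_b).mean(axis=1)

spread = mean_a.std()                  # wide, despite identical skill
r = np.corrcoef(mean_a, mean_b)[0, 1]  # positive, yet carries no skill signal
```

A naive reading of `r` would infer a shared "search ability"; a generative model with explicit per-participant learning parameters would instead attribute the covariance to the practice effect and recover the (absent) skill differences correctly.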