Face Perception: Neural mechanisms and models

Talk Session: Wednesday, May 24, 2023, 10:45 am – 12:30 pm, Talk Room 2
Moderator: Richard Krauzlis, NIH

Talk 1, 10:45 am, 62.21

Reconstructing the neurodynamics of face perception during real world vision in humans using intracranial EEG recordings

Arish Alreja1, Michael J. Ward2, Jhair A. Colan3, Qianli Ma1, R. Mark Richardson4, Louis-Philippe Morency1, Avniel S. Ghuman3; 1Carnegie Mellon University, 2University of California, Los Angeles, 3University of Pittsburgh, 4Harvard University and Massachusetts General Hospital

We use face perception to see and understand people around us during natural behavior in the real world. Here, we take advantage of the unique opportunity afforded by intracranial recordings in two epilepsy patients to assess the neural basis of face perception during natural, unscripted interactions in real-world settings with friends, family, and experimenters. With eye-tracking glasses, we captured what subjects saw, time-locked to the corresponding neural activity, on a fixation-by-fixation basis for hours during these interactions. We restricted the analysis to face fixations, identified using a combination of manual annotation and computer vision. After fitting a bidirectional canonical correlation analysis (CCA) model on training fixations, we sought to reconstruct an image of the face people were seeing from the corresponding pattern of neural activity, and to reconstruct the neural activity from the corresponding face image, on a fixation-by-fixation basis in a left-out test sample of fixations. We observed significant reconstruction of both the face image subjects were seeing (out-of-sample R = 0.46; 0.26) and the neural activity (out-of-sample R = 0.29; 0.14). By assessing which features are reconstructed accurately, we find that parietal, temporal, and occipital cortices around 200 ms after fixation onset are important for face processing during natural social interactions. Individual canonical components of the model enable a more granular breakdown to examine which specific face features are coded by which particular aspects of neural activity. We will use this approach to test norm-based and metric code models of face perception during natural face perception. Our results lay the foundation for understanding the neural basis of visual perception during natural behavior in the real world.

Acknowledgements: NIH R01MH107797, NIH R21EY030297, NSF 1734907

Talk 2, 11:00 am, 62.22

Rapid face preference during visual object processing by the primate superior colliculus

Gongchen Yu1, Leor Katz1, Christian Quaia1, Adam Messinger1, Richard Krauzlis1; 1Laboratory of Sensorimotor Research, National Eye Institute, NIH

We recently found that inactivation of the macaque superior colliculus (SC) reduces responses to visual objects in temporal cortex neurons, suggesting that visual object processing may be an important component of how the SC contributes to visual attention and orienting. Here, to investigate how SC activity is modulated by visual object stimuli, we recorded visually responsive neurons in the superficial and intermediate layers of the SC of two rhesus macaques while they passively viewed images presented in the neurons’ visual receptive fields. We used 150 grayscale images of objects belonging to 5 categories that have been extensively used to test visual object representation in the temporal cortex: face, body, hand, fruit/vegetable, and human-made objects. Crucially, the 30 images in each category were matched across categories in their distributions of low-level features (RMS contrast, size, power in three spatial frequency bands), allowing us to determine how SC responses varied with object category independently of low-level visual features. We found that many SC neurons exhibited an object category preference – specifically, a preference for faces – within 60 ms of stimulus onset. A linear classifier using SC spike counts in the interval from 40 to 80 ms after stimulus onset distinguished faces from each of the other 4 object categories with accuracies well above chance, but could not reliably distinguish among the other object categories. Together, our results reveal that the primate SC, the most important subcortical brain structure for controlling where we look, signals the presence of a face with an extremely short latency, providing a plausible neural substrate for primates’ well-recognized ability to rapidly detect and orient towards faces. By biasing us to look at faces, the SC may also play a crucial role in how cortical face processing operates, especially during development.

Talk 3, 11:15 am, 62.23

Comparing iEEG responses and deep networks with Bayesian statistics challenges the view that lateral face-selective regions are specialized for facial expression recognition over identity recognition

Emily Schwartz1, Arish Alreja2,3,4, R. Mark Richardson5,6, Avniel Ghuman3,4, Stefano Anzellotti1; 1Boston College, 2Carnegie Mellon University, 3University of Pittsburgh, 4University of Pittsburgh Medical Center, 5Massachusetts General Hospital, 6Harvard Medical School

According to the classical view, face identity and facial expression recognition are performed by separate neural mechanisms. However, some neuroimaging studies suggest that identity and expression recognition may not be disjoint processes: response patterns in ventral and lateral temporal pathways decode valence (Skerry & Saxe, 2014) and identity (Anzellotti & Caramazza, 2017), respectively. If the ventral pathway is identity-specialized, as the classical view suggests, deep neural networks (DNNs) trained to recognize identity should provide a better model of neural responses in these regions than networks trained to recognize expressions. Conversely, if the lateral pathway is expression-specialized, expression-trained DNNs should provide a better model of lateral region responses. Importantly, the classical view therefore predicts an interaction between DNN type and brain region. We used intracranial electroencephalography (iEEG) to compare neural representations with representations from DNNs trained to recognize identity or expressions. Patients were shown face images while data were collected from face-selective ventral temporal and lateral regions. For each electrode, over sliding temporal windows, we compared neural representational dissimilarity matrices (RDMs) to RDMs obtained from identity-trained models and expression-trained models. Similarity between RDMs from DNN layers and iEEG RDMs was analyzed at multiple timepoints. We evaluated how similar each electrode was to identity and expression RDMs using semi-partial tau-b. The Schwarz criterion (Bayesian information criterion) was then used to assess whether these correlations were better explained by modeling ventral and lateral electrodes separately or by combining the two sets of electrodes. Critically, the data were better explained by a single slope that combined ventral and lateral electrodes. The relative contribution of the models did not differ between ventral and lateral electrodes, and identity models accounted for both ventral and lateral responses better than expression models did. These results deviate from the classical view, under which lateral electrodes should be better explained by expression models.
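
As an illustration of the RDM comparison described above, the following is a minimal sketch rather than the authors' pipeline. It assumes Euclidean distance for building RDMs and uses plain Kendall's tau-b, not the semi-partial variant reported in the abstract; all function names and the toy patterns are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rdm(patterns):
    """Build an RDM from one response pattern per stimulus.

    Euclidean distance is an assumed dissimilarity measure.
    """
    n = len(patterns)
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(patterns[i], patterns[j])))
         for j in range(n)]
        for i in range(n)
    ]


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation, with the standard tie correction."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pairs tied in both variables drop out of both terms
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical toy data: 4 stimuli, 2 features each.
neural_patterns = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
identity_features = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2]]
expression_features = [[0.5, 0.5], [0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]

neural_vec = upper_triangle(rdm(neural_patterns))
tau_identity = kendall_tau_b(neural_vec, upper_triangle(rdm(identity_features)))
tau_expression = kendall_tau_b(neural_vec, upper_triangle(rdm(expression_features)))
```

Only the upper triangle is compared because an RDM is symmetric with a zero diagonal; a rank correlation is a common choice here because it does not assume a linear relation between model and neural dissimilarities.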

Acknowledgements: This work was supported by the National Science Foundation CAREER Grant 1943862 to S.A., National Institutes of Health R01MH107797 and R21EY030297 to A.G., and the National Science Foundation 1734907 to A.G.

Talk 4, 11:30 am, 62.24

Reversed contributions of visual and semantic information to the representations of familiar faces in perception and memory

Adva Shoham1, Idan Daniel Grosbard1, Yuval Navon1, Galit Yovel1; 1Tel Aviv University

Familiar faces can be described by their visual features and their biographical information. This information can be retrieved from their images or, in the absence of an image, recalled from memory based on their names. But what is the relative contribution of visual and semantic information to mental representations in perception and memory, and in what order are they retrieved? These questions are hard to answer, as visual and semantic information are intermixed in human mental representations. Here we addressed these questions in two studies. In Study 1, participants rated the visual similarity of familiar faces based on their pictures (perception) or by recalling their visual appearance from memory based on their names (memory). To disentangle the contributions of visual and semantic information, we used visual and semantic deep neural networks (DNNs) as predictors of human representations in perception and memory. A face-trained DNN (VGG-16) was used to measure the representational geometry of visual information based on the face images. A natural language processing (NLP) DNN was used to measure the representational geometry of semantic information, based on the Wikipedia descriptions of the famous identities. We found a larger contribution of visual than semantic information in human perception but the reverse pattern in memory. In Study 2, participants made speeded visual (e.g., hair color) or semantic (e.g., occupation) decisions about familiar faces based on their images (perception) or their names (memory). Reaction times were faster for visual than semantic decisions in the perception condition, but faster for semantic than visual decisions in the memory condition. Taken together, our studies demonstrate reversed contributions and retrieval order of visual and semantic information in mental representations of familiar faces in perception and memory. Our approach can be used to study the same questions for other categories, including objects and scenes as well as voices and sounds.

Acknowledgements: The study was supported by an ISF grant 971/21

Talk 5, 11:45 am, 62.25

A narrow band of image dimensions is critical for the learning and recognition of face identity

Dan Rogers1, Tim Andrews1, Mila Mileva2; 1The University of York, 2The University of Plymouth

A key theoretical challenge in human face recognition is to determine what information is critical for judgements of identity. For example, as we move about or as gaze or expression changes, the size and shape of a face image on the retina also change. The visual system must ignore these ambient sources of image variation to facilitate recognition. In this study, we used principal components analysis to reveal the image dimensions from a large set of naturally varying face images. In Experiment 1 (n=78), we asked how the recognition of familiar faces was affected when we systematically removed image dimensions from faces. We found that recognition increased when the early image dimensions were removed. These image dimensions appear to reflect ambient variation in images that is not important for recognition. However, recognition of faces then decreased when intermediate dimensions were removed, suggesting that these image dimensions contain the critical information for recognizing familiar faces. In Experiment 2 (n=102), we asked the complementary question of which image dimensions are important when learning new faces. Again, we found that removing early image dimensions from the training images had a minimal effect on learning new faces (when tested with unmanipulated images). In contrast, removing an intermediate band of image dimensions significantly reduced subsequent recognition of learnt faces. Finally, in Experiment 3 (n=78), we asked whether these critical intermediate image dimensions are organized according to a norm-based or an exemplar-based model. The prediction from a norm-based model was that recognition should increase when the intermediate image dimensions are caricatured. However, we found that recognition rates decreased when the critical intermediate dimensions were caricatured. These findings support an exemplar-based model in which a narrow band of image dimensions is critical for the learning and the subsequent recognition of face identity.
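
The two image manipulations described above, removing a band of image dimensions and caricaturing it, can be sketched as follows. This is an illustrative sketch only: it assumes each face is already expressed as principal-component coefficients relative to the mean face, and the function names, band indices, gain, and toy values are all hypothetical.

```python
def remove_band(coeffs, start, stop):
    """Zero the PCA coefficients with indices in [start, stop)."""
    return [0.0 if start <= i < stop else c for i, c in enumerate(coeffs)]


def caricature_band(coeffs, start, stop, gain):
    """Scale coefficients in [start, stop) away from the mean face.

    A gain above 1 exaggerates the face's deviation from the mean along
    those dimensions; a gain of 1 leaves the face unchanged.
    """
    return [c * gain if start <= i < stop else c for i, c in enumerate(coeffs)]


def reconstruct(mean_face, components, coeffs):
    """Rebuild pixel values: mean_face + sum_k coeffs[k] * components[k]."""
    image = list(mean_face)
    for coeff, component in zip(coeffs, components):
        for pixel, value in enumerate(component):
            image[pixel] += coeff * value
    return image


# Hypothetical toy example: a 3-pixel "image" with 3 orthonormal components.
mean_face = [0.5, 0.5, 0.5]
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
face_coeffs = [0.2, -0.1, 0.05]

without_early = reconstruct(mean_face, components, remove_band(face_coeffs, 0, 1))
caricatured_mid = reconstruct(
    mean_face, components, caricature_band(face_coeffs, 1, 2, gain=1.5)
)
```

Working in coefficient space keeps both manipulations to a single pass over the coefficient vector, and the same reconstruction step serves removal and caricature alike.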

Talk 6, 12:00 pm, 62.26

A familiar face and person processing area in the human temporal pole

Ben Deen1,2, Winrich A Freiwald1; 1Rockefeller University, 2Tulane University

How does the brain process the faces of familiar people? Neuropsychological studies have argued for an area of the temporal pole (TP) linking faces with person identities, but magnetic susceptibility artifacts in this region have hampered its study with fMRI. We address this question using data acquisition and analysis methods optimized to overcome this artifact, including multi-echo sequences that substantially boost signal quality in the anterior temporal lobes. To precisely characterize functional organization in individual human brains, we scanned N = 10 participants using fMRI on a range of perceptual and cognitive tasks, across three scan sessions (7.5 hours per participant). Tasks involved visual perception, semantic judgment, and episodic simulation of close familiar people and places, and everyday objects. The resulting data identify a familiar face response in TP, reliably observed in every individual participant. This area responds more strongly to visual images of familiar faces than to images of unfamiliar faces, objects, and scenes, but also responds to a variety of abstract cognitive tasks that involve thinking about people, including semantic judgment and episodic simulation. In contrast, a nearby region of perirhinal cortex (PR) – consistent in location with the previously described “anterior temporal lobe face area” – responds specifically to faces (familiar and unfamiliar), but not to social cognition tasks. This result argues for two separate streams for person and face processing within anterior temporal cortex. Face responses in TP and PR had a similar functional organization to regions our lab has previously observed in macaques, suggesting a possible homology across species. This work identifies a missing link in the human familiar face processing system that is well placed to integrate visual information about faces with higher-order conceptual information about other people.

Talk 7, 12:15 pm, 62.27

Thinking outside of the face network: face recognition deficits are related to reduced connectivity between high-level face areas and non-face-selective sensory, memory, and social processing regions

Alison Campbell1,2, Xian Li3, Michael Esterman1,2,4, Joseph DeGutis1,5; 1Boston Attention and Learning Laboratory, VA Boston Healthcare System, Boston, MA, 2Department of Psychiatry, Boston University School of Medicine, Boston, MA, 3Department of Psychological and Brain Sciences, Johns Hopkins University, Baltimore, MD, 4National Center for PTSD, VA Boston Healthcare System, Boston, MA, 5Department of Psychiatry, Harvard Medical School, Boston, MA

Despite severe face recognition deficits, the neural basis of developmental prosopagnosia (DP) remains a matter of debate. As the majority of studies have sought to characterize abnormalities within the face network, the interface between the face network and broader brain areas is often overlooked. Such pathways are of particular interest given recent evidence that, during face recognition, recollection processes needed to access contextual and person-related information are especially compromised in DP. Using resting-state fMRI, we first compared functional connectivity (FC) within the functionally localized face-selective network, including bilateral OFA, FFA, pSTS, and ATL. DPs (N=35) showed reduced FC throughout the network compared to controls (N=24). We also found that several connections within this network predicted face memory among controls (measured using the Cambridge Face Memory Test, CFMT), confirming that mechanisms associated with face recognition deficits are also related to normal variation in this ability. Next, we examined brain-wide FC to each face region. Broad group differences were found, with controls showing greater FC between the face regions and other areas of the brain, relative to DPs. We then examined whether these connections predicted individual differences in face memory across controls. This revealed that FC between the ATL and several brain areas, including anterior and middle STS, left insula, and left inferior frontal gyrus, predicted CFMT scores. These regions, especially those identified in the left hemisphere, have been implicated in voice and speech processing, social processing, and episodic memory. These results shed new light on the neural basis of DP, and indicate that face memory deficits arise in the context of reduced connectivity both within the face network and to broader brain regions involved in memory and social processing.
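
For readers unfamiliar with resting-state FC, the sketch below shows one common way to compute it and to relate it to a behavioral score. It is not the authors' analysis: it assumes FC is the Fisher z-transformed Pearson correlation between two region time series, and every function name and value is hypothetical.

```python
from math import atanh, sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def functional_connectivity(ts_a, ts_b):
    """Fisher z-transformed correlation between two region time series.

    The correlation is clipped just inside (-1, 1) so atanh stays finite.
    """
    r = max(min(pearson_r(ts_a, ts_b), 0.999999), -0.999999)
    return atanh(r)


# Hypothetical toy data: one FC value and one CFMT score per participant.
atl_sts_fc = [0.10, 0.25, 0.30, 0.45, 0.50]
cfmt_scores = [48, 55, 53, 62, 66]

# Across-participant association between connectivity and face memory.
fc_behavior_r = pearson_r(atl_sts_fc, cfmt_scores)
```

The Fisher transform is used because it makes correlation values approximately normally distributed, which suits group comparisons and across-participant regressions.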

Acknowledgements: This work was supported by a grant to JD from the National Eye Institute (R01 EY032510-02).