Comparing Human Face and Body Recognition at Various Distance and Rotation Viewing Conditions.

Poster Presentation 23.342: Saturday, May 18, 2024, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Face and Body Perception: Bodies

Matthew Groth1, Michal Fux1, Hojin Jang1, Suayb S. Arslan1, Walt Dixon2, Joydeep Munshi2, Pawan Sinha1; 1Massachusetts Institute of Technology, 2GE

Deep networks trained on large face datasets have achieved impressive performance on recognition tasks. However, as we reported at last year's meeting (Fux et al., 2023), humans still outperform DNNs when viewing conditions are challenging, as with large distances, non-frontal viewpoints, and atmospheric turbulence. In the current study, we investigate the recognition performance of humans and deep networks on images of whole bodies. This task is akin to the 'person re-identification' challenge of great interest to the machine vision community. We worked with a large database of images acquired at a variety of distances and from multiple yaw/pitch angles. We ran an online behavioral study in which participants were asked to rank whole-body images of people from most to least likely to be the same identity as the person in a query image. The query images depicted individuals in three conditions: whole body, head occluded, and body occluded. Distance to the camera ranged from 10 m to 500 m. The results enable an analysis of the relative contributions of the head and body to recognition as a function of viewing distance. A comparative analysis of humans against a whole-body-trained DNN establishes a clear superiority of human performance across all tested distances. Moreover, the comparison reveals differences in the relative importance of the head and body regions: humans derive significant identity information from the head region, whereas the DNN does not. These preliminary results point to potentially divergent strategies employed by humans and DNNs, offering insights into distant person identification and implications for the design of future machine models.
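
For readers less familiar with re-identification-style evaluation, the sketch below illustrates one common way such a ranking task is scored for a machine model: gallery images are ordered by similarity to a query embedding and rank-1 accuracy is computed. This is an illustrative sketch only, not the authors' analysis pipeline; the embedding representation, cosine-similarity metric, and rank-1 scoring are assumptions.

    import numpy as np

    def rank_gallery(query_emb: np.ndarray, gallery_embs: np.ndarray) -> np.ndarray:
        """Return gallery indices ordered from most to least similar to the query,
        using cosine similarity between L2-normalized embeddings."""
        q = query_emb / np.linalg.norm(query_emb)
        g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
        sims = g @ q                      # cosine similarity per gallery image
        return np.argsort(-sims)          # indices sorted by descending similarity

    def rank1_accuracy(rankings, query_ids, gallery_ids) -> float:
        """Fraction of queries whose top-ranked gallery image shares the query identity."""
        hits = [gallery_ids[r[0]] == qid for r, qid in zip(rankings, query_ids)]
        return float(np.mean(hits))

Human responses in the study were rankings produced directly by participants rather than by an embedding model, but they can be scored with the same rank-based measure, which is what makes the human-vs-DNN comparison straightforward.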

Acknowledgements: This research is supported by ODNI and IARPA. The views expressed are those of the authors and should not be interpreted as representing the official policies of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.