Excitation-inhibition balanced pruning yields sparse, accurate category computing circuits with emergent animate-inanimate routes

Poster Presentation 43.304: Monday, May 18, 2026, 8:30 am – 12:30 pm, Banyan Breezeway
Session: Object recognition: Categories

Jeffery Andrade1, Parisa Vaziri1, George Alvarez1; 1Harvard University

The ventral stream maps patterns of light to untangled high-level category representations through hierarchical processing stages. What is the nature of the circuitry underlying this untangling process? We address this question in a “model organism,” AlexNet, to identify computational motifs of category-computing circuits in a “white-box” system. Specifically, we ask which connections between layers must be preserved for correct classification and what the resulting minimal circuits reveal about shared wiring. Prior work extracted sparse category-selective circuits by pruning connections with low importance scores computed from forward activations and backward gradients (Hamblin et al., 2023). We improve this method using (i) “sign-aware pruning,” which separately prunes positive and negative links to maintain excitation–inhibition balance, and (ii) a cross-layer pruning scheme that uses a marginal-loss metric to allocate pruning across layers. These additions yield more consistent, sparser, and more accurate circuits than prior approaches. Dissecting 1000 circuits (one per ImageNet category), we obtain circuits that require 6% of the original network while being 2% more accurate than the unpruned network. Analyzing the circuit structure, we see a common macro-motif: circuits retain an average of ~93% of weights in layer 1, and retention declines across the convolutional hierarchy to dramatic sparsity in the fully connected layers (5.2% in layer 6). We find that shared wiring between categories declines across the hierarchy, with the median link appearing in only 3 of 1000 circuits by layer 6. Considering the aggregate animate and inanimate circuits, we observe increasing separation through early convolutional stages and a bimodal distribution in later fully connected layers, with many links computing nearly exclusively animate or inanimate categories.
Broadly, this method provides a computational approach toward quantifying the size, overlap, and distributedness of category circuits in deep neural network systems, and offers new experimental targets for predictions of competition, overlap, and priming effects in biological vision.
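
As a rough illustration of the sign-aware pruning idea described above, the following minimal sketch keeps the highest-scoring positive links and the highest-scoring negative links separately, so neither sign can be pruned away entirely. It is not the authors' implementation: the function name `sign_aware_mask`, the `keep_fraction` parameter, and the toy weights and importance scores are all hypothetical, the scores are taken as given rather than computed from activations and gradients, and the cross-layer marginal-loss allocation is not modeled.

```python
import numpy as np

def sign_aware_mask(weights, scores, keep_fraction):
    """Return a boolean mask that keeps the top `keep_fraction` of positive
    links and, separately, the top `keep_fraction` of negative links,
    ranked by the (hypothetical) importance `scores`."""
    mask = np.zeros(weights.shape, dtype=bool)
    for sign_group in (weights > 0, weights < 0):
        n_group = int(sign_group.sum())
        if n_group == 0:
            continue
        # Keep at least one link per sign so neither sign vanishes.
        n_keep = max(1, int(round(keep_fraction * n_group)))
        # Links outside this sign group can never rank in the top n_keep.
        group_scores = np.where(sign_group, scores, -np.inf)
        top = np.argpartition(group_scores.ravel(), -n_keep)[-n_keep:]
        mask.flat[top] = True
    return mask

# Toy layer: 3 positive (excitatory) and 3 negative (inhibitory) links.
weights = np.array([[0.5, -0.2, 0.1],
                    [-0.7, 0.3, -0.4]])
# Assumed importance scores; the two largest both sit on positive links.
scores = np.array([[0.9, 0.1, 0.2],
                   [0.3, 0.8, 0.4]])

mask = sign_aware_mask(weights, scores, keep_fraction=1 / 3)
pruned = weights * mask
```

In this toy case a sign-blind top-2 selection would keep only positive links (scores 0.9 and 0.8), whereas the sign-aware mask keeps one positive and one negative link.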

Acknowledgements: This work was supported by funding from the Kempner Institute and NSF-CAREER 7835425-01 (to Talia Konkle).