Time/Room: Friday, May 11, 1:00 – 3:00 pm, Royal Ballroom 6-8
Organizer: Johan Wagemans, Laboratory of Experimental Psychology, University of Leuven
Presenters: Johan Wagemans, Charles E. Connor, Scott O. Murray, James R. Pomerantz, Jacob Feldman, Shaul Hochstein
With his famous paper on phi motion, Wertheimer (1912) launched Gestalt psychology, arguing that the whole is different from the sum of the parts. In fact, wholes were considered primary in perceptual experience, even determining what the parts are. Gestalt claims about global precedence and configural superiority are difficult to reconcile with what we now know about the visual brain, with a hierarchy from lower areas processing smaller parts of the visual field to higher areas responding to combinations of these parts in ways that are gradually more invariant to low-level changes to the input and correspond more closely to perceptual experience. What exactly are the relationships between parts and wholes then? Are wholes constructed from combinations of the parts? If so, to what extent are the combinations additive, what does superadditivity really mean, and how does it arise along the visual hierarchy? How much of the combination process occurs in incremental feedforward iterations or horizontal connections, and at what stage does feedback from higher areas kick in? What happens to the representation of the lower-level parts when the higher-level wholes are perceived? Do they become enhanced or suppressed (“explained away”)? Or do wholes occur before the parts, as argued by Gestalt psychologists? But what does this global precedence really mean in terms of what happens where in the brain? Does the primacy of the whole apply only to consciously perceived figures or objects, and are the more elementary parts still combined somehow during an unconscious step-wise processing stage? A century later, tools that were not at the Gestaltists’ disposal are available to address these questions.
In this symposium, we will take stock and try to provide answers from a diversity of approaches, including single-cell recordings from V4, posterior and anterior IT cortex in awake monkeys (Ed Connor, Johns Hopkins University), human fMRI (Scott Murray, University of Washington), human psychophysics (James Pomerantz, Rice University), and computational modeling (Jacob Feldman, Rutgers University). Johan Wagemans (University of Leuven) will introduce the theme of the symposium with a brief historical overview of the Gestalt tradition and a clarification of the conceptual issues involved. Shaul Hochstein (Hebrew University) will end with a synthesis of the current literature, in the framework of Reverse Hierarchy Theory. The scientific merit of addressing such a central issue, which has been around for over a century, from a diversity of modern perspectives and in light of the latest findings should be obvious. The celebration of the centennial anniversary of Gestalt psychology also provides an excellent opportunity to do so. We believe our line-up of speakers, addressing a set of closely related questions from a wide range of methodological and theoretical perspectives, promises to attract a large crowd, including students and faculty working in psychophysics, neuroscience and modeling. In comparison with other proposals taking this centennial anniversary as a window of opportunity, ours is probably more focused and allows for a more coherent treatment of a central Gestalt issue, which has been bothering vision science for a long time.
Part-whole relationships in vision science: A brief historical review and conceptual analysis
Johan Wagemans, Laboratory of Experimental Psychology, University of Leuven
Exactly 100 years ago, Wertheimer’s paper on phi motion (1912) effectively launched the Berlin school of Gestalt psychology. Arguing against elementalism and associationism, its members maintained that experienced objects and relationships are fundamentally different from collections of sensations. Going beyond von Ehrenfels’s notion of Gestalt qualities, which involved one-sided dependence on sense data, they held that true Gestalts are dynamic structures in experience that determine what will be wholes and parts. From the beginning, this two-sided dependence between parts and wholes was believed to have a neural basis. The Gestaltists spoke of continuous “whole-processes” in the brain, and argued that research needed to try to understand these from top (whole) to bottom (parts) rather than the other way around. However, Gestalt claims about global precedence and configural superiority are difficult to reconcile with what we now know about the visual brain, with a hierarchy from lower areas processing smaller parts of the visual field to higher areas responding to combinations of these parts in ways that are gradually more invariant to low-level changes to the input and correspond more closely to perceptual experience. What exactly are the relationships between parts and wholes then? In this talk, I will briefly review the Gestalt position and analyse the different notions of part and whole, and the different views on part-whole relationships maintained in a century of vision science since the start of Gestalt psychology. This will provide some necessary background for the remaining talks in this symposium, which will all present contemporary views based on new findings.
Ventral pathway visual cortex: Representation by parts in a whole object reference frame
Charles E. Connor, Department of Neuroscience and Zanvyl Krieger Mind/Brain Institute, Johns Hopkins University; Anitha Pasupathy; Scott L. Brincat; Yukako Yamane; Chia-Chun Hung
Object perception by humans and other primates depends on the ventral pathway of visual cortex, which processes information about object structure, color, texture, and identity. Object information processing can be studied at the algorithmic, neural coding level using electrode recording in macaque monkeys. We have studied information processing in three successive stages of the monkey ventral pathway: area V4, PIT (posterior inferotemporal cortex), and AIT (anterior inferotemporal cortex). At all three stages, object structure is encoded in terms of parts, including boundary fragments (2D contours, 3D surfaces) and medial axis components (skeletal shape fragments). Area V4 neurons integrate information about multiple orientations to produce signals for local contour fragments. PIT neurons integrate multiple V4 inputs to produce representations of multi-fragment configurations. Even neurons in AIT, the final stage of the monkey ventral pathway, represent configurations of parts (as opposed to holistic object structure). However, at each processing stage, neural responses are critically dependent on the position of parts within the whole object. Thus, a given neuron may respond strongly to a specific contour fragment positioned near the right side of an object but not at all when it is positioned near the left. This kind of object-centered position tuning would serve an essential role by representing spatial arrangement within a distributed, parts-based coding scheme. Object-centered position sensitivity is not imposed by top-down feedback, since it is apparent in the earliest responses at lower stages, before activity begins at higher stages. Thus, while the brain encodes objects in terms of their constituent parts, the relationship of those parts to the whole object is critical at each stage of ventral pathway processing.
Long-range, pattern-dependent contextual effects in early human visual cortex
Scott O. Murray, Department of Psychology, University of Washington; Sung Jun Joo; Geoffrey M. Boynton
The standard view of neurons in early visual cortex is that they behave like localized feature detectors. We will discuss recent results that demonstrate that neurons in early visual areas go beyond localized feature detection and are sensitive to part-whole relationships in images. We measured neural responses to a grating stimulus (“target”) embedded in various visual patterns as defined by the relative orientation of flanking stimuli. We varied whether or not the target was part of a predictable sequence by changing the orientation of distant gratings while maintaining the same local stimulus arrangement. For example, a vertically oriented target grating that is flanked locally with horizontal flankers (HVH) can be made to be part of a predictable sequence by adding vertical distant flankers (VHVHV). We found that even when the local configuration (e.g. HVH) around the target was kept the same, there was a smaller neural response when the target was part of a predictable sequence (VHVHV). Furthermore, when making an orientation judgment of a “noise” stimulus that contains no specific orientation information, observers were biased to “see” the orientation that deviates from the predictable orientation, consistent with computational models of primate cortical processing that incorporate efficient coding principles. Our results suggest that early visual cortex is sensitive to global patterns in images in a way that is markedly different from the predictions of standard models of cortical visual processing, and indicate that it plays an important role in coding part-whole relationships in images.
The computational and cortical bases for configural superiority
James R. Pomerantz, Department of Psychology, Rice University; Anna I. Cragin, Department of Psychology, Rice University; Kimberley D. Orsten, Department of Psychology, Rice University; Mary C. Portillo, Department of Social Sciences, University of Houston-Downtown
In the configural superiority effect (CSE; Pomerantz et al., 1977; Pomerantz & Portillo, 2011), people respond more quickly to a whole configuration than to any one of its component parts, even when the parts added to create a whole contribute no information by themselves. For example, people discriminate an arrow from a triangle more quickly than a positive from a negative diagonal, even when those diagonals constitute the only difference between the arrows and triangles. How can a neural or other computational system be faster at processing information about combinations of parts – wholes – than about parts taken singly? We consider the results of Kubilius et al. (2011) and discuss three possibilities: (1) direct detection of wholes through smart mechanisms that compute higher order information without performing seemingly necessary intermediate computations; (2) the “sealed channel hypothesis” (Pomerantz, 1978), which holds that part information is extracted prior to whole information in a feedforward manner but is not available for responses; and (3) a closely related reverse hierarchy model holding that conscious experience begins with higher cortical levels processing wholes, with parts becoming accessible to consciousness only after feedback to lower levels is complete (Hochstein & Ahissar, 2002). We describe a number of CSEs and elaborate on both the mechanisms that might explain them and how these mechanisms might be confirmed experimentally.
Computational integration of local and global form
Jacob Feldman, Dept. of Psychology, Center for Cognitive Science, Rutgers University – New Brunswick; Manish Singh; Vicky Froyen
A central theme of perceptual theory, from the Gestaltists to the present, has been the integration of local and global image information. While neuroscience has traditionally viewed perceptual processes as beginning with local operators with small receptive fields before proceeding on to more global operators with larger ones, a substantial body of evidence now suggests that supposedly later processes can impose decisive influences on supposedly earlier ones, suggesting a more complicated flow of information. We consider this problem from a computational point of view. Some local processes in perceptual organization, like the organization of visual items into a local contour, can be well understood in terms of simple probabilistic inference models. But for a variety of reasons nonlocal factors such as global “form” resist such simple models. In this talk I’ll discuss constraints on how form- and region-generating probabilistic models can be formulated and integrated with local ones. From a computational point of view, the central challenge is how to embed the corresponding estimation procedure in a locally-connected network-like architecture that can be understood as a model of neural computation.
The rise and fall of the Gestalt gist
Shaul Hochstein, Departments of Neurobiology and Psychology, Hebrew University; Merav Ahissar
Reviewing the current literature, one finds physiological bases for Gestalt-like perception, but also much that seems to contradict the predictions of this theory. Some resolution may be found in the framework of Reverse Hierarchy Theory, which distinguishes between implicit processes, of which we are unaware, and explicit representations, which enter perceptual consciousness. It is the conscious percepts that appear to match Gestalt predictions – recognizing wholes even before the parts. We now need to study the processing mechanisms at each level, and, importantly, the feedback interactions which equally affect and determine the plethora of representations that are formed, and to analyze how they determine conscious perception. Reverse Hierarchy Theory proposes that initial perception of the gist of a scene – including whole objects, categories and concepts – depends on rapid bottom-up implicit processes, which seem to follow (determine) Gestalt rules. Since lower level representations are initially unavailable to consciousness – and may become available only with top-down guidance – perception seems to immediately jump to Gestalt conclusions. Nevertheless, vision in the blink of an eye is the result of many layers of processing, though introspection is blind to these steps, failing to see the trees within the forest. Later, slower perception, focusing on specific details, reveals the source of Gestalt processes – and destroys them at the same time. Details of recent results, including micro-genesis analyses, will be reviewed within the framework of Gestalt and Reverse Hierarchy theories.