Intuitive physical reasoning in brains, minds, and machines

Symposium: Friday, May 15, 2026, 8:00 – 10:00 am, Talk Room 1

Organizers: RT Pramod1, Nancy Kanwisher1; 1MIT
Presenters: Tomer Ullman, Shari Liu, Kelsey Allen, SP Arun, RT Pramod

How do we perceive, predict, and plan actions in the physical world, such as pouring milk, catching a ball, or playing Jenga? Inferring the physical state of the world to make predictions and generate explanations is called ‘intuitive physics’. From infancy, humans expect objects to be solid, continuous, and subject to gravity, and they distinguish between things and stuff, recognizing that rigid objects maintain their shape while substances deform or pour. They also understand relational concepts such as containment, support, and collision, revealing an early foundation for causal reasoning about the physical world. In adults, research has begun to reveal both the sophistication and the limits of intuitive physics, inspiring computational frameworks that model how humans reason flexibly about the physical world. Competing accounts differ on whether intuitive physics relies on simple perceptual heuristics and pattern recognition, or on structured internal models that perform approximate simulations for prediction, analogous to the physics engines used in video games. In the latter view, intuitive physical reasoning requires a rich understanding of the physical scene, including the shapes, physical properties, relationships, and dynamics of objects, and the ability to use this information to predict future states through approximate probabilistic simulations.

These theoretical distinctions have motivated new empirical investigations into how such computations might be implemented in the brain. Over the past decade, neuroimaging studies have identified a network of fronto-parietal regions in the human brain, called the Physics Network, that encodes object properties, spatial relationships, and predicted future states. This network has been proposed as a neural substrate for simulation-based intuitive physical reasoning, yet key questions remain: What computations do these regions perform, and how do these computations depend on the participant’s current goals? Are these regions not only involved in but also necessary for intuitive physical reasoning?

This symposium brings together researchers from cognitive science, neuroscience, and computational modeling to synthesize emerging findings, highlight convergent themes across disciplines, and chart a path forward for understanding how brains, minds, and machines reason about the physical world. The organizers will kick off the symposium by introducing the topic and outlining the symposium (5 minutes). Tomer Ullman will lay out how the human mind might be solving intuitive physics through various approximations under constraints of time, memory, and computation. Shari Liu will present behavioral evidence for interaction between intuitive physics and intuitive psychology and propose a computational framework that captures the interaction. Kelsey Allen will describe computational modeling studies that probe how people use intuitive physics to solve novel problems. SP Arun will share some exciting results on the representation of physical scene properties in the non-human primate brain using wireless recordings in freely moving monkeys. Finally, RT Pramod will review neuroimaging evidence for the ‘physics engine’ hypothesis in the human brain. Each speaker will present for 20 minutes with 3 additional minutes for Q&A. This symposium should be of interest to VSS members from various subfields, including 3D perception, Object and Scene perception, Spatial vision, Material perception, Motion, Events, Relations, Perception and Action, Visual neuroscience, and Computational models of vision.

Talk 1

Good enough: Approximations in mental simulation and intuitive physics

Tomer Ullman1; 1Harvard University

From spreading mayo on toast to dodging an errant frisbee, people handle the everyday physical world with remarkable ease. Without a sense of intuitive physics, every day would be a series of small disasters. How do people do it? One current model of intuitive physics supposes that people are carrying out a kind of mental simulation, moving objects in the mind step by step. While this model has been successful in several cases, even the people who champion it recognize that humans can't possibly be running a perfect simulation. But what sort of approximations might humans be using? Engineers concerned with 'good enough' simulations use principled shortcuts and workarounds to get around constraints of time, memory, and computation. I argue that these provide a good starting point for considering how the human mind, which is trying to create good-enough simulations under the same constraints, may implement mental simulation. In this talk, I consider specifically approximate bodies in tracking, partial simulation, lazy evaluation in imagery, and bounds on the number of objects that can be simulated at once. I will also discuss the computational models that capture these approximations, which include a proposed fundamental split between physics and graphics.

Talk 2

Naive psychology depends on naive physics

Shari Liu1; 1Johns Hopkins University

Across the cognitive sciences, researchers have studied naive psychology (making sense of other people’s actions in terms of their mental states) and naive physics (making sense of physical events in terms of their underlying mechanics and dynamics) as two separate processes. In this talk, I’ll review evidence for these domain-specific systems for reasoning about the psychological and physical worlds, with distinct computational goals, representations, and neural substrates. At the same time, I propose that, from early in human development, people navigate the social world by using two distinct but interacting systems for reasoning about other agents’ immaterial minds and their material bodies. I’ll review research from developmental psychology and cognitive neuroscience that provides evidence for this interaction: (1) human minds and brains represent the bodies of animate agents as objects, and their actions as physical events, and (2) we use physical knowledge to make inferences about other minds, including what other people want, feel, and know, how hard they are trying, and how much danger they are in. I will also discuss Bayesian computational models of theory of mind, which articulate a formal hypothesis for this interaction. The talk will end by discussing key future empirical tests of this proposal.

Talk 3

Computational models of human intuitive physics

Kelsey Allen1; 1University of British Columbia

Intuitive physics lets us do more than predict how the world will unfold; it also allows us to solve problems. In this talk, I will describe my lab's work investigating the computational underpinnings of how people use intuitive physics to select and create novel objects to solve video game puzzles. Across multiple timescales of learning, we find that people adapt the ways that they sample possible solutions, guided by their predictions of how those possibilities will affect the game environment. Finally, I will conclude by drawing comparisons with modern large-scale machine learning methods applied to the same puzzles to illustrate the importance of structured reasoning in human behavior.

Talk 4

Representation of intuitive physics properties in monkey inferotemporal cortex

SP Arun1; 1Indian Institute of Science

We see and recognize objects every day, but we also form expectations about how they will handle when touched, or how they will interact with other objects when moved. Where are these expectations created in the brain? One strong contender is the inferior temporal (IT) cortex, a region known for its highly object-selective neurons that are invariant to properties like size, position, and viewpoint. While we know that IT neurons encode interesting visual features, whether they integrate physics expectations into their representations is unknown. Here, I will describe some studies in which we have performed wireless brain recordings from high-level sensory and motor regions in freely moving monkeys. We find that (1) after monkeys physically interact with objects, their IT neurons start encoding the objects' mass; (2) when monkeys watch objects bouncing at various angles, their IT neurons show congruent tuning for incidence and bounce angles; and (3) when monkeys watch a ball disappear into an object or bounce off it, their IT neurons dynamically infer that the object is hollow or solid. Taken together, these observations indicate that high-level visual cortex not only encodes visual properties, but more broadly encodes all object properties and even carries expectations about how objects will behave in the real world.

Talk 5

Intuitive physics in the human brain

RT Pramod1; 1MIT

Intuitive physics has long fascinated researchers studying human behavior, yet its neural basis is still not well understood. In this talk, I will focus on a set of regions in the fronto-parietal cortex of the human brain, called the ‘Physics Network (PN)’, and review neuroimaging (fMRI) evidence that it represents a variety of physical scene properties. Specifically, I will show that the PN encodes information about object properties such as mass and material, their configurations (including stability and contact relationships), and their predicted future states. In addition, I will present evidence that these regions respond not only to visual but also to auditory and linguistic stimuli, indicating multi-modal processing. Our findings also reveal that the PN responds strongly to causal relationships (versus non-causal ones) specifically in the physical (versus social) domain, suggesting that these regions represent the causal structure of the physical world. Finally, I will provide anatomical, functional, and network-level evidence that the PN is at least partially dissociable from the neighboring Multiple Demand (MD) system, which supports domain-general reasoning. Altogether, these results suggest that the PN is the brain's ‘Physics Engine’, representing latent physical properties and the causal structure of the physical scene, and enabling predictions through forward simulation.