Ilker Yildirim (Department of Brain & Cognitive Sciences, MIT)
From a quick glance, the touch of an object, or a brief snippet of sound, our minds construct scene representations composed of rich and detailed shapes and surfaces. These representations are not only the targets of perception; they also support aspects of cognition including reasoning about the physics of objects, planning actions, and manipulating objects -- as in the paradigmatic case of using or making tools. A longstanding view in the psychology of perception and cognition holds that in order to compute such rich representations, the brain must draw on internal causal models of the outside physical world. How do we build and use such causal models of the world in the mind and brain?
In this talk, I will begin to answer this question by presenting a novel approach that synthesizes a diverse range of tools, including generative models, simulation engines, and deep neural networks. For one key high-level visual capacity, the perception of faces, I will show that this approach explains human behavioral data, multiple levels of neural processing in non-human primates, and a classic illusion, the "hollow face" effect. I will then show that the approach extends naturally to reverse-engineering computations in other domains: high-level vision more generally, multisensory perception, and aspects of cognition beyond perception, such as intuitive physical reasoning and understanding goal-directed actions.