The Nonlinear Geometry of Visual Neurons
This work is generously funded by a Google Faculty Research Award to David J. Field.
The Field Lab is currently working on a new approach to understanding how the brain represents the visual world. Since the experiments of Hubel & Wiesel, the field has been engaged in an effort to build mathematical models whose properties resemble the observed physiological responses of neurons in mammalian visual cortex. A great deal of progress has been made on this front by building models that follow the principles of efficient coding, which have their origins in the work of Horace Barlow (among others). A number of models have been constructed that successfully learn response properties similar to those of actual V1 neurons based on efficient coding of natural images. Multi-layer extensions to these models, or deep belief networks, have become the field's state-of-the-art models for object recognition tasks.
Despite these advances, there are fundamental theoretical questions about visual processing that remain unanswered. The biggest may be how the visual system is able to recognize objects, or, more specifically, how information about the identity of an object is extracted from the retinal image by processing in V1, V2, V4 and IT. Similarly, the deep networks that have recently dominated the ImageNet classification challenge are not completely understood. The intermediate and high-level features that are learned both in the brain and in deep networks have proved resistant to principled explanations, although some progress has been made.
A common theme that has emerged in both the physiology literature and machine learning literature is that of selectivity and tolerance (or invariance). There is widespread agreement that high-level neurons in the brain and deep networks must have responses that are both selective to some features and tolerant to others. However, there is not a good theoretical basis for defining or measuring what selectivity and tolerance are beyond a max operation and local spatial pooling, respectively.
We argue that a geometric approach to understanding a neuron's response can both provide insight into how object recognition is accomplished and provide the mathematical formalism necessary to quantify these properties, for both neurons in deep networks and in the brain.
Hubel & Wiesel observed both end-stopping/hyperselectivity and position invariance in V1 responses, and both are nonlinear. A host of other nonlinear responses have been observed in V1, including cross-orientation inhibition, contrast gain control, and bandwidth variation with stimulus basis set. How does each of these nonlinearities arise? Are there separate mechanisms that can be used to explain each of them? The responses of V2, V4 and IT neurons generally become even more nonlinear. We argue that this geometric approach to understanding responses of neurons may obviate the need for the myriad nonlinear models necessary to predict responses for different stimulus sets. A geometrical description of a nonlinear operation is equivalent to writing down the equations that describe it, but our approach may allow us to circumvent this problem and provide a more parsimonious framework to describe neural responses in high-dimensional image space.
The figure above shows a user interface that was developed to probe the curved responses of neurons in a high-dimensional space. Three neurons were selected, shown in the bottom left. The basis function representing each neuron is a vector in the high-dimensional space, and a 3D subspace was found using Gram-Schmidt orthogonalization and then probed. The iso-response surfaces of the three neurons within the subspace are shown. The 3D lattice represents the Euclidean space of images and how it is distorted in the subspace by the representation in the network. With the movement of the first slider, the iso-response surfaces curve and the lattice is increasingly distorted as the network finds a sparser representation. With the movement of the second slider, the subspace is rotated through an angle in the high-dimensional space.
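The subspace construction described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the lab's implementation: the neuron vectors are random stand-ins, and `toy_response` is a hypothetical nonlinearity (a linear drive divided by a term that grows with stimulus energy orthogonal to the neuron's vector), chosen only because it gives curved iso-response surfaces. The names `gram_schmidt`, `toy_response`, and `curvature` are invented for this example.

```python
import numpy as np


def gram_schmidt(vectors):
    """Orthonormalize the rows of `vectors` (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float).copy()
        for b in basis:
            # Remove the component of w along each earlier basis vector.
            w -= np.dot(w, b) * b
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append(w / norm)
    return np.stack(basis)


def toy_response(images, weights, curvature=0.0):
    """Hypothetical neuron (an assumption, not the model from the text).

    With curvature == 0 the response is a plain dot product, so the
    iso-response surfaces are flat planes. With curvature > 0 the response
    is reduced by stimulus energy orthogonal to `weights`, which bends
    those surfaces.
    """
    w_unit = weights / np.linalg.norm(weights)
    drive = images @ w_unit
    ortho_energy = np.maximum(np.sum(images**2, axis=-1) - drive**2, 0.0)
    return drive / (1.0 + curvature * ortho_energy)


rng = np.random.default_rng(0)
dim = 64                                   # dimensionality of image space (assumed)
neurons = rng.standard_normal((3, dim))    # stand-ins for three basis functions
basis = gram_schmidt(neurons)              # orthonormal axes of the 3D subspace

# Regular lattice of coordinates inside the 3D subspace.
axis = np.linspace(-2.0, 2.0, 9)
coeffs = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
coeffs = coeffs.reshape(-1, 3)

# Map each lattice point back into the high-dimensional image space.
images = coeffs @ basis

# Probe the first neuron on the lattice, with and without curvature.
linear = toy_response(images, neurons[0], curvature=0.0)
curved = toy_response(images, neurons[0], curvature=0.5)
```

Because the first Gram-Schmidt axis is parallel to the first neuron's vector, the linear response depends only on the first subspace coordinate, so its iso-response surfaces are planes. In the curved case, a surface of constant response bends along the other two coordinates. Contouring `linear` or `curved` over the lattice (for example with a marching-cubes routine) would give surfaces analogous to those in the interface, though the actual network responses would differ from this toy model.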