Four Not-So-Easy Pieces: Improving Neural Network Models of the Visual System
Convolutional Neural Networks (CNNs) optimized for invariant category recognition have turned out to be reasonable, if imperfect, models of neural responses in the ventral visual pathway of humans and non-human primates. However, these very imperfections are the touchstones for a series of deep unsolved questions about neural architecture, visual cognition, and biological learning. In fact, every major component of the visual pathway-qua-task-optimized system is problematic in the standard CNN model: (1) from an architectural point of view, the lack of recurrent and feedback connections; (2) from a task-objective point of view, the need for large numbers of supervised labels; (3) from an environment point of view, the reliance on offline batch learning of stereotyped image data; and (4) from a learning-rule point of view, the failures of back-propagation as a biologically realistic process. In this talk, I will discuss each of these problems in turn, together with initial forays toward their solution. Ultimately, I hope to communicate a sketch of what a more holistically plausible model of the ventral visual pathway might look like.