Tagged: Object recognition

Perceptual Decision Making Investigated via Sparse Decoding of a Spiking Neuron Model of V1

Recent empirical evidence supports the hypothesis that invariant visual object recognition might result from non-linear encoding of the visual input followed by linear decoding [1]. This hypothesis has received theoretical support through the development of neural network architectures based on a non-linear encoding of the input via recurrent network dynamics followed by a linear decoder [2], [3]. In this paper we consider such an architecture, in which the visual input is non-linearly encoded by a biologically realistic spiking model of V1 and mapped to a perceptual decision via a sparse linear decoder. The novel aspects of this work are that we 1) use a large-scale, conductance-based spiking neuron model of V1 whose classical and extra-classical response properties have been well characterized, and 2) use the model to investigate decoding over a large population of neurons. We assess the decoding performance of the model system against human performance by comparing neurometric and psychometric curves.
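As a rough illustration of what a sparse linear decoder over a neural population can look like, the sketch below fits an L1-penalized logistic regression by proximal gradient descent and traces a neurometric curve (decoding accuracy versus stimulus strength). Everything in it is an assumption made for illustration: the Poisson spike counts stand in for the output of the V1 model, and the names `simulate_counts`, `fit_sparse_decoder`, `coherences`, and the penalty `lam` are hypothetical. The abstract does not specify the decoder's training procedure, so this is not the paper's method.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_sparse_decoder(X, y, lam=0.05, n_iter=500):
    """L1-penalized logistic regression fitted by proximal gradient descent (ISTA).

    X: (trials, neurons) standardized spike counts (hypothetical input).
    y: (trials,) array of 0/1 stimulus labels.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Step size 1/L, with L an upper bound on the Lipschitz constant
    # of the gradient of the mean logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        # Gradient step on the loss, then shrinkage to enforce sparsity.
        w = soft_threshold(w - step * grad_w, step * lam)
        b -= step * grad_b
    return w, b

def simulate_counts(rng, n_trials, n_neurons, n_informative, coherence):
    """Toy Poisson population: only the first n_informative neurons carry signal."""
    y = rng.integers(0, 2, size=n_trials)
    rates = np.full((n_trials, n_neurons), 5.0)
    rates[:, :n_informative] += coherence * 4.0 * (2 * y[:, None] - 1)
    return rng.poisson(rates).astype(float), y

rng = np.random.default_rng(0)
coherences = [0.0, 0.1, 0.2, 0.4, 0.8]  # assumed stimulus strengths
neurometric, n_selected = [], []
for c in coherences:
    X_train, y_train = simulate_counts(rng, 400, 200, 10, c)
    X_test, y_test = simulate_counts(rng, 400, 200, 10, c)
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0) + 1e-9
    w, b = fit_sparse_decoder((X_train - mu) / sd, y_train)
    pred = ((X_test - mu) / sd) @ w + b > 0
    neurometric.append(float(np.mean(pred == y_test)))
    n_selected.append(int(np.count_nonzero(w)))

for c, acc, k in zip(coherences, neurometric, n_selected):
    print(f"coherence={c:.1f}  accuracy={acc:.3f}  neurons used={k}")
```

Plotting `neurometric` against `coherences` gives the model-side curve; the comparison described in the abstract would set this against a psychometric curve measured from human observers on the same stimuli.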

Hierarchical multi-resolution models for object recognition: Applications to mammographic computer-aided diagnosis

A fundamental problem in image analysis is the integration of information across scales to detect and classify objects. We have developed, within a machine learning framework, two classes of multiresolution models for integrating scale information for object detection and classification: a discriminative model called the hierarchical pyramid neural network and a generative model called a hierarchical image probability model. Using receiver operating characteristic analysis, we show that these models can significantly reduce the false positive rates for a well-established computer-aided diagnosis system.
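The receiver operating characteristic comparison mentioned above can be sketched as follows: compute ROC curves for two detectors and read off the false positive rate at a fixed sensitivity. The scores are synthetic Gaussian draws, not mammographic data, and `roc_curve`, `auc`, and `fpr_at_sensitivity` are hypothetical helper names introduced for this example; it assumes untied scores.

```python
import numpy as np

def roc_curve(scores, labels):
    """False and true positive rates as the threshold sweeps from high to low.

    Assumes no tied scores; labels are 0 (negative) or 1 (positive).
    """
    order = np.argsort(-np.asarray(scores), kind="stable")
    sorted_labels = np.asarray(labels)[order]
    tps = np.cumsum(sorted_labels)
    fps = np.cumsum(1 - sorted_labels)
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def fpr_at_sensitivity(fpr, tpr, target_tpr):
    """False positive rate at the first threshold reaching the target sensitivity."""
    idx = np.searchsorted(tpr, target_tpr, side="left")
    return float(fpr[idx])

rng = np.random.default_rng(0)
labels = np.concatenate([np.zeros(1000, dtype=int), np.ones(1000, dtype=int)])
# Synthetic detector outputs: the second detector separates the classes better.
base_scores = np.concatenate([rng.normal(0, 1, 1000), rng.normal(1, 1, 1000)])
two_stage_scores = np.concatenate([rng.normal(0, 1, 1000), rng.normal(2, 1, 1000)])

fpr_b, tpr_b = roc_curve(base_scores, labels)
fpr_t, tpr_t = roc_curve(two_stage_scores, labels)
auc_base, auc_two = auc(fpr_b, tpr_b), auc(fpr_t, tpr_t)
fp_base = fpr_at_sensitivity(fpr_b, tpr_b, 0.9)
fp_two = fpr_at_sensitivity(fpr_t, tpr_t, 0.9)
print(f"AUC: {auc_base:.3f} -> {auc_two:.3f}")
print(f"FPR at 90% sensitivity: {fp_base:.3f} -> {fp_two:.3f}")
```

Fixing the sensitivity and comparing false positive rates mirrors how a second-stage model is typically judged when it post-processes the candidates of an existing detection system.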

Hierarchical image probability (HIP) models

We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To capture long-range dependencies, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found with maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problems of detecting various objects in SAR images and target recognition in optical aerial images.
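The factorization described above can be written out as follows. The notation is assumed for illustration and is not taken from the paper: $I_0$ is the original image, $I_{l+1}$ is the lower-resolution image at pyramid level $l+1$, $F_l$ collects the feature vectors at level $l$, and the map $I_l \mapsto (F_l, I_{l+1})$ is taken to be invertible with Jacobian determinant $|J_l|$. The second expression is one plausible way to introduce hidden labels $a_l(x)$ arranged on a tree; the conditioning variables shown there (a local context $c_l(x)$ and a parent position $\mathrm{pa}(x)$) are hypothetical.

```latex
% Exact factorization across pyramid levels (change of variables at each level):
\[
\begin{aligned}
p(I_l) &= |J_l| \, p(F_l, I_{l+1}) = |J_l| \, p(F_l \mid I_{l+1}) \, p(I_{l+1}), \\
p(I_0) &= \Bigl[ \prod_{l=0}^{L-1} |J_l| \, p(F_l \mid I_{l+1}) \Bigr] \, p(I_L).
\end{aligned}
\]

% Tractable approximation: factor over positions x, with hidden labels a_l(x)
% carrying the long-range dependencies down the tree (assumed form):
\[
p(F_l \mid I_{l+1}) \approx \sum_{A_l} \prod_{x}
  p\bigl(f_l(x) \mid c_l(x), a_l(x)\bigr) \,
  p\bigl(a_l(x) \mid a_{l+1}(\mathrm{pa}(x))\bigr).
\]
```

The first line applies the chain rule once per level, which is why the factorization is exact; the approximation enters only when each level is factored over positions, and the sum over label configurations $A_l$ is what EM handles during maximum likelihood estimation.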