
Application of Information Theory to Improve Computer-Aided Diagnosis

Mammographic computer-aided diagnosis (CAD) systems offer a low-cost approach to double reading. Though results to date have been promising, current systems often suffer from unacceptably high false positive rates. Better methods are needed for setting the system parameters, particularly for the statistical models that are common elements of most CAD systems. In this research project we developed a framework for building hierarchical pattern recognizers for CAD based on information-theoretic criteria such as the minimum description length (MDL). As part of this framework, we developed a hierarchical image probability (HIP) model. HIP models are well suited to information-theoretic methods because they are generative. We developed architecture search algorithms based on information theory and applied them to mammographic CAD. The resulting mass detection algorithm, for example, reduced the false positive rate of a CAD system by 30% with no loss of sensitivity. We showed that the information-theoretic criteria reliably correlate with performance on new data. The framework also allows many applications not possible with most pattern recognition algorithms, including rejection of novel examples that cannot be reliably classified, synthesis of artificial images to investigate the structure learned by the model, and compression that is as good as JPEG.
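As a rough illustration of how a description-length criterion can rank candidate model architectures, the sketch below scores each candidate with a common two-part MDL approximation: the negative log-likelihood of the data plus (k/2) log n for k parameters and n samples. The abstract does not state the exact criterion used in the project, so this form is an assumption. The function names and the numbers are hypothetical and are not results from the project.

```python
import math


def description_length(neg_log_likelihood, n_params, n_samples):
    """Approximate two-part code length in nats (assumed form, not the project's).

    Cost of the data given the model, plus a (k/2) log n cost for the parameters.
    """
    return neg_log_likelihood + 0.5 * n_params * math.log(n_samples)


def select_by_mdl(candidates, n_samples):
    """Pick the candidate with the shortest description length.

    candidates maps a name to (neg_log_likelihood, n_params).
    Returns (best_name, scores), where scores maps each name to its length.
    """
    scores = {
        name: description_length(nll, k, n_samples)
        for name, (nll, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical candidates: larger models fit better but cost more to describe.
candidates = {
    "small": (5200.0, 10),
    "medium": (5000.0, 40),
    "large": (4990.0, 160),
}
best, scores = select_by_mdl(candidates, n_samples=1000)
print(best, {name: round(s, 1) for name, s in scores.items()})
```

With these made-up numbers the medium-sized candidate wins: the large model's small gain in likelihood does not pay for its extra parameters, which is the trade-off a description-length criterion is meant to capture.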

Mammographic mass detection with a hierarchical image probability (HIP) model

We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level), conditioned on the image information at lower resolutions. We would like to factor this further over positions within each pyramid level to make it tractable, but such a factoring may miss long-range dependencies. To fix this, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found by maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problem of detecting masses in mammograms.
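The two steps described above can be written out as follows. This is a sketch in assumed notation, not necessarily the authors' own. Here $I_l$ is the image at pyramid level $l$ (with $I_0$ the original image and larger $l$ meaning lower resolution), $F_l$ is the set of feature vectors at level $l$, $f_l(x)$ and $a_l(x)$ are the feature vector and hidden class label at position $x$, $c_l(x)$ stands for whatever lower-resolution image information is used at $x$, and $\mathrm{pa}(x)$ is the position at the next coarser level that contains $x$. The first line assumes the map from $I_l$ to $(F_l, I_{l+1})$ is invertible with a constant change-of-variables factor. The second line is one plausible form of the label-conditioned factoring, with the conditioning on level $L+1$ dropped at the top of the pyramid.

```latex
\begin{align}
  % Exact coarse-to-fine factorization over pyramid levels
  \Pr(I_0) &\propto \Pr(I_L) \prod_{l=0}^{L-1} \Pr\left(F_l \mid I_{l+1}\right) \\
  % Factoring over positions, with hidden labels carrying long-range dependencies
  \Pr(I_0) &\approx \sum_{A_0, \dots, A_L} \prod_{l=0}^{L} \prod_{x \in I_l}
      \Pr\left(f_l(x) \mid c_l(x),\, a_l(x)\right)
      \Pr\left(a_l(x) \mid a_{l+1}(\mathrm{pa}(x))\right)
\end{align}
```

Because each label depends only on its parent label at the next coarser level, the labels form a tree. That tree structure is what makes the model resemble a hidden Markov model on a tree and lets the sum over labels be handled level by level within EM.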