Conferences

  • Deep Learning Methods to Predict Cognitive Performance in Midlife

    August 29, 2020 | Sharath Koorathota, Paul Sajda, G. Liu, M. Lachman, R.P. Sloan

    Koorathota, S., Sajda, P., Liu, G., Lachman, M., Sloan, R.P. Deep Learning Methods to Predict Cognitive Performance in Midlife. Psychosomatic Medicine, 82(2), 2020.

  • Sequence Models in Eye Tracking: Predicting Pupil Diameter During Learning

    August 29, 2020 | Sharath Koorathota, Kaveri Thakoor, P Adelman, Y Mao, Xueqing Liu, Paul Sajda

    Koorathota, S., Thakoor, K., Adelman, P., Mao, Y., Liu, X., Sajda, P. (2020). Sequence Models in Eye Tracking: Predicting Pupil Diameter During Learning. ACM Symposium on Eye Tracking Research and Applications, 3. https://doi.org/10.1145/3379157.3391653

  • Impact of Reference Standard, Data Augmentation, and OCT Input on Glaucoma Detection Accuracy by CNNs on a New Test Set

    August 21, 2020 | Kaveri Thakoor, Emmanouil Tsamis, C.G. De Moraes, Paul Sajda, Donald C. Hood

    Thakoor, K.A., Tsamis, E.M., De Moraes, C.G., Sajda, P., Hood, D.C. Impact of Reference Standard, Data Augmentation, and OCT Input on Glaucoma Detection Accuracy by CNNs on a New Test Set. Investigative Ophthalmology and Visual Science, 61(7), p. 4540, 2020.

  • Assessing the Ability of Convolutional Neural Networks to Detect Glaucoma from OCT Probability Maps

    February 28, 2020 | Kaveri Thakoor, Q Zheng, L Nan, X Li, Emmanouil Tsamis, R. Rajshekhar, I. Dwivedi, I. Drori, Paul Sajda, Donald C. Hood

    Thakoor, K.A., Zheng, Q., Nan, L., Li, X., Tsamis, E.M., Rajshekhar, R., Dwivedi, I., Drori, I., Sajda, P. and Hood, D.C. Assessing the Ability of Convolutional Neural Networks to Detect Glaucoma from OCT Probability Maps. Investigative Ophthalmology and Visual Science, 60(9), p. 1464, 2019.

  • Enhancing the Accuracy of Glaucoma Detection from OCT Probability Maps using Convolutional Neural Networks

    July 28, 2019 | Kaveri Thakoor, Xinhui Li, Emmanouil Tsamis, Paul Sajda, Donald C. Hood

    Thakoor, K.A., Li, X., Tsamis, E., Sajda, P. and Hood, D.C. (2019) Enhancing the Accuracy of Glaucoma Detection from OCT Probability Maps using Convolutional Neural Networks. 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 2036-2040.

  • An EEG-fMRI-TMS system for investigating BOLD response to alpha phase-locked TMS

    June 27, 2018 | Yida Lin, Josef Faller

  • Relating deep neural network representations to EEG-fMRI spatiotemporal dynamics in a perceptual decision-making task

    June 14, 2018 | Tao Tu, Jonathan Koss, Paul Sajda

    The hierarchical architecture of deep convolutional neural networks (CNNs) resembles the multi-level processing stages of the human visual system during object recognition. Converging evidence suggests that this hierarchical organization is key to CNNs achieving human-level performance in object categorization. In this paper, we leverage the hierarchical organization of the CNN to investigate the spatiotemporal dynamics of rapid visual processing in the human brain. Specifically, we focus on perceptual decisions associated with different levels of visual ambiguity. Using simultaneous EEG-fMRI, we demonstrate the temporal and spatial hierarchical correspondences between the multi-stage processing in the CNN and the activity observed in the EEG and fMRI. The hierarchical correspondence suggests a processing pathway during rapid visual decision-making that involves the interplay between sensory regions, the default mode network (DMN), and the frontoparietal control network (FPCN).

  • Unsupervised Adaptive Transfer Learning for Steady-State Visual Evoked Potential Brain-Computer Interfaces

    March 19, 2018 | Nick Waytowich, Josef Faller, J.O. Garcia, J.M. Vettel, Paul Sajda

    Recent advances in signal processing for the detection of steady-state visual evoked potentials (SSVEPs) have moved away from traditional calibrationless methods, such as canonical correlation analysis (CCA), and toward algorithms that require substantial training data. In general, this has improved detection rates, but SSVEP-based brain-computer interfaces (BCIs) now suffer from the requirement of costly calibration sessions. Here, we address this issue by applying transfer learning techniques to SSVEP detection. Our novel Adaptive-C3A method incorporates an unsupervised adaptation algorithm that requires no calibration data. Our approach learns SSVEP templates for the target user and provides robust class separation in feature space, leading to increased classification accuracy. Our method achieves significant improvements in performance over a standard CCA method, as well as over a transfer variant of the state-of-the-art Combined-CCA method for calibrationless SSVEP detection.

  • Development of an EEG-fMRI-TMS system for optimal alpha phase synchronized activation of anterior cingulate cortex

    March 6, 2018 | James McIntosh, Josef Faller, GT Saber, J Doose, Yida Lin, H Moss, Robin Goldman, MS George, Paul Sajda, Truman R. Brown

    Rostral anterior cingulate cortex (rACC) is an integral part of the depression network. rACC activity may predict antidepressant effect. Stimulation of left dorsolateral prefrontal cortex (lDLPFC) causes a reciprocal change in certain parts of the ACC. We hypothesise that improvements in the method of transcranial magnetic stimulation (TMS) delivery to the lDLPFC may produce an antidepressant effect.

  • Level of TMS-evoked activation in anterior cingulate cortex depends on timing of TMS delivery relative to frontal alpha phase

    March 6, 2018 | GT Saber, James McIntosh, J Doose, Josef Faller, Yida Lin, H Moss, Robin Goldman, MS George, Paul Sajda, Truman R. Brown

    To test whether the level of activation of the anterior cingulate cortex (ACC) following a TMS pulse delivered to the dorsolateral prefrontal cortex depends on precise timing of its delivery relative to an individual’s alpha rhythm, we developed an integrated EEG-fMRI-TMS instrument capable of acquiring simultaneous EEG-fMRI while delivering TMS pulses in the scanner. We found a statistically significant effect of BOLD signal change in ACC dependent on individual subject frontal alpha phase just prior to TMS delivery. Specifically, TMS-evoked BOLD response in the ACC increased when TMS pulse was synchronized to the rising slope of the frontal alpha oscillation.

  • SigViewer – Current Status and Recent Developments

    March 6, 2018 | Clemens Brunner, Yida Lin, Paul Sajda, Josef Faller

    SigViewer is an open-source, cross-platform biosignal viewer designed to visualize and annotate biomedical data streams. It supports a wide variety of file formats, including BDF, EDF, GDF, CNT, BrainVision, and BCI2000. Recently, support for loading multi-stream XDF data has been added. Besides visualizing raw data, SigViewer supports loading, displaying, creating, and editing events that can be used to annotate specific segments within a signal. Other useful tools include offset removal, computation of event-related potentials, and calculation of power spectral densities. To our knowledge, SigViewer is the only open-source, cross-platform, multi-format biosignal viewer currently available that supports XDF files. Furthermore, SigViewer is completely free in that it does not depend on any proprietary software such as MATLAB. SigViewer is actively maintained and widely used across the globe (as measured by monthly downloads). Filtering data in the frequency domain before visualization, e.g. to remove line noise or excessive drift, is one of the next planned features for a future release.

  • Deep Reinforcement Learning Using Neurophysiological Signatures of Interest

    We present a study where human neurophysiological signals are used as implicit feedback to alter the behavior of a deep learning based autonomous driving agent in a simulated virtual environment.

  • Your eyes give you away: pupillary responses, EEG dynamics, and applications for BCI

    March 6, 2018 | Paul Sajda

    As we move through an environment, we are constantly making assessments, judgments, and decisions about the things we encounter. Some are acted upon immediately, but many more become mental notes or fleeting impressions — our implicit “labeling” of the world. In this talk I will describe our work using physiological correlates of this labeling to construct a hybrid brain-computer interface (hBCI) system for efficient navigation of a 3-D environment. Specifically, we record electroencephalographic (EEG), saccadic, and pupillary data from subjects as they move through a small part of a 3-D virtual city under free-viewing conditions. Using machine learning, we integrate the neural and ocular signals evoked by the objects they encounter to infer which ones are of subjective interest. These inferred labels are propagated through a large computer vision graph of objects in the city, using semi-supervised learning to identify other, unseen objects that are visually similar to those that are labeled. Finally, the system plots an efficient route so that subjects visit similar objects of interest. We show that by exploiting the subjects’ implicit labeling, the median search precision is increased from 25% to 97%, and the median subject need only travel 40% of the distance to see 84% of the objects of interest. We also find that the neural and ocular signals contribute in a complementary fashion to the classifiers’ inference of subjects’ implicit labeling. In summary, we show that neural and ocular signals reflecting subjective assessment of objects in a 3-D environment can be used to inform a graph-based learning model of that environment, resulting in an hBCI system that improves navigation and information delivery specific to the user’s interests.

  • NEXUS: A tool for simulating large-scale hybrid neural networks

    January 18, 2017 | Paul Sajda, K. Sakai, L. H. Finkel
  • Computer Simulations of Object Discrimination by Visual Cortex

    January 18, 2017 | L. H. Finkel, Paul Sajda

    We present computer simulations of how the visual cortex may discriminate objects based on depth-from-occlusion. We propose neural mechanisms for how the visual system binds edges into contours, and binds contours and surfaces into objects. The model is simulated by a system of physiologically-based neural networks which feature feedback connections from higher to lower cortical areas, a distributed representation of depth, and phase-locked cortical neuronal firing. The system demonstrates psychophysical properties consistent with human perception of real and illusory visual scenes. The model addresses both the binding problem and the problem of object segmentation.

  • Training neural networks for computer-aided diagnosis: experience in the intelligence community

    January 17, 2017 | Paul Sajda, C. Spence

    Neural networks are often used in computer-aided diagnosis (CAD) systems for detecting clinically significant objects. They have also been applied in the intelligence community to cue image analysts (IAs) for assisted target recognition and wide-area searching. Given the similarity between the applications in the two communities, there are a number of common issues that must be considered when training these neural networks. Two such issues are: (1) exploiting information at multiple scales (e.g. context and detail structure), and (2) dealing with uncertainty (e.g. errors in truth data). We address these two issues, transferring architectures and training algorithms originally developed for assisting IAs in search applications, to improve CAD for mammography. These include hierarchical pyramid neural net (HPNN) architectures that automatically learn and integrate multi-resolution features for improving microcalcification and mass detection in CAD systems. These networks are trained using an uncertain object position (UOP) error function for the supervised learning of image searching/detection tasks when the position of the objects to be found is uncertain or ill-defined. The results show that the HPNN architecture trained using the UOP error function reduces the false-positive rate of a mammographic CAD system by 30%-50% without any significant loss in sensitivity. We conclude that the transfer of assisted target recognition technology from the intelligence community to the medical community can significantly impact the clinical utility of CAD systems.

  • Multiresolution neural networks for mammographic mass detection

    January 17, 2017 | C. Spence, Paul Sajda

    We have previously presented a hierarchical pyramid/neural network (HPNN) architecture which combines multi-scale image processing techniques with neural networks. This coarse-to-fine HPNN was designed to learn large-scale context information for detecting small objects. We have developed a similar architecture to detect mammographic masses (malignant tumors). Since masses are large, extended objects, the coarse-to-fine HPNN architecture is not suitable for the problem. Instead we constructed a fine-to-coarse HPNN architecture which is designed to learn small-scale detail structure associated with the extended objects. Our initial results applying the fine-to-coarse HPNN to mass detection are encouraging, with detection performance improvements of about 30%. We conclude that the ability of the HPNN architecture to integrate information across scales, from fine to coarse in the case of masses, makes it well suited for detecting objects which may have detail structure occurring at scales other than the natural scale of the object.

  • Converging evidence of linear independent components in EEG

    January 17, 2017 | L. Parra, Paul Sajda

    Blind source separation (BSS) has been proposed as a method to analyze multi-channel electroencephalography (EEG) data. A basic issue in applying BSS algorithms is the validity of the independence assumption. In this paper we investigate whether EEG can be considered to be a linear combination of independent sources. Linear BSS can be obtained under the assumptions of non-Gaussian, non-stationary, or non-white independent sources. If the linear independence hypothesis is violated these three different conditions will not necessarily lead to the same result. We show, using 64 channel EEG data, that different algorithms which incorporate the three different assumptions lead to the same results, thus supporting the linear independence hypothesis.
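    The non-Gaussianity route to linear BSS discussed in this abstract can be illustrated with FastICA on a synthetic instantaneous linear mixture. The two source waveforms and the mixing matrix below are invented for illustration; since BSS recovers sources only up to permutation and scale, the check compares absolute correlations against the true sources.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 5000
t = np.linspace(0, 8, n)

# Two non-Gaussian sources: a sawtooth (sub-Gaussian) and heavy-tailed noise
s1 = np.mod(t, 1.0) - 0.5
s2 = rng.laplace(size=n)
S = np.column_stack([s1, s2])

# Linear instantaneous mixture, as the BSS model assumes: X = S A^T
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)          # estimated sources

def best_match_corr(a, B):
    """Best |correlation| of one true source against any estimated source."""
    return max(abs(np.corrcoef(a, B[:, j])[0, 1]) for j in range(B.shape[1]))

corrs = [best_match_corr(S[:, i], S_hat) for i in range(2)]
```

    If the linear independence assumption held only approximately, the recovered sources would correlate poorly with the ground truth; here the mixture is exactly linear, so recovery is near perfect.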

  • Single-trial Event Detection of Visual Object Recognition in EEG

    January 17, 2017 | Adam Gerson, L. Parra, Paul Sajda
  • The role of the LGN on the spatial frequency dependence of surround suppression in V1: Investigations using a computational model

    January 17, 2017 | Jim Wielaard, Paul Sajda

    Using a large scale model of macaque V1, we have shown (Wielaard & Sajda 2003, 2005) how only local short-range (

  • A probabilistic network model of the influence of local figure-ground representations on the perception of motion

    January 17, 2017 | Kay Baek, Paul Sajda

    Psychophysical experiments have shown that motion signals, distributed across space, must be integrated with form cues, such as those associated with figure-ground segregation. These experiments have led several investigators to conclude that mechanisms exist which enable form cues to ‘veto’ or completely suppress ambiguous motion signals. We present a probabilistic network model in which local figure-ground representations encoded by direction-of-figure (Sajda and Finkel, 1995) modulate the degree of certainty of local motion signals. In particular, we consider the modulation at junctions where line terminators are defined as either intrinsic or extrinsic (Shimojo, Silverman, and Nakayama, 1989). The strength of local motion suppression at extrinsic terminators is a function of the belief in the local direction-of-figure, which is defined as the strength of the evidence for surface occlusion. Unlike previous studies/models investigating the influence of motion signals at terminators and occlusion cues (Grossberg, Mingolla, and Viswanathan, 2001; Lidén and Pack, 1999), our model directly exploits the uncertainties in the observations (i.e. figure-ground cues) leading to uncertainty in the inferred direction-of-figure, which for the case of terminators provides a smooth transition between intrinsic and extrinsic classes. Simulation results show that our model can account for the continuum of perceptual bias seen for motion coherence and perceived direction of motion in psychophysical experiments (McDermott, Weiss, and Adelson, 2001; Lidén and Mingolla, 1998).

  • Large-scale simulation of the primary visual cortex

    January 17, 2017 | Jim Wielaard, Paul Sajda

    We have developed a large-scale computational model of a 4×4 mm² patch of a primary visual cortex (V1) input layer. The model is constructed from basic established anatomical and physiological data. Based on numerical simulations with this model we are able to suggest neural mechanisms for a wide variety of classical response properties of V1, as well as for a number of extraclassical receptive field phenomena. The nature of our model is such that we are able to address stationary as well as dynamical behaviour of V1, both on the single cell level and on a population level of up to about 10⁵ cells.

  • Automatic Segmentation of Drusen in Fundus Image Using Non–Negative Matrix Factorization

    January 17, 2017 | S. Du, Paul Sajda, J.P. Koniarek, R.T. Smith

    Purpose: The segmentation and quantitation of drusen are important for diagnosis and monitoring of retinal disease, particularly age-related macular degeneration (AMD). Many approaches to drusen segmentation have been based on heuristics, such as morphological analysis and image thresholding, and require significant parameter tuning. Here we describe an unsupervised method for segmenting drusen in fundus images. Methods: Color fundus images were acquired using the Topcon50EX fundus camera and digitized on the Nikon 2000 Coolscan. Two sets of images were considered: a) leveled (using a published quadratic and spline model of the green (G) channel background (R.T. Smith, Arch. Ophthalmol., 2005; 123:200-6)) and b) unleveled. A “gold standard” was constructed by manual segmentation of the drusen by a retinal specialist. Data from the three channels (R, G, B) were processed using non-negative matrix factorization (NMF; D. Lee, NIPS, 2001; 13:556-62). NMF decomposes a multivariate data set (X) into two matrices: a matrix of spectral signatures (S) and their corresponding spatial distributions (A). The spatial distribution matrix (A) was analyzed using K-means clustering to label all pixels in the image into one of three classes. The entire method is unsupervised and does not require manual intervention. The method has previously been demonstrated for NMR-based metabolomics (S. Du, Proc. EMBS, Sept. 2005). Results: Visual inspection showed that the labelings produced by the algorithm tended to correspond to drusen, blood vessels, and normal retinal tissue. Comparison with the manually defined gold standard showed a range of sensitivity for drusen detection (leveled data 51%-88%, unleveled 70%-75%) and specificity (leveled data 78%-97%, unleveled 71%-97%) across four cases. Interestingly, in many cases false negatives produced by the algorithm were along the borders of the gold-standard-defined drusen, indicating that individual drusen were detected by the algorithm, though their size was underestimated. Conclusions: The NMF method is able to recover spectral signatures of drusen, as well as other anatomical structures in the retina, using only the three bands in color fundus images. Related work in our group is exploring the use of hyperspectral imaging, which provides richer spectral signatures and is even better suited for an NMF-based decomposition/segmentation.
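    The factorization described in the Methods (a data matrix decomposed into spectral signatures and spatial distributions, with K-means applied to the latter) can be sketched generically with scikit-learn. The three-channel “pixels”, signature values, and noise level below are invented stand-ins, not the paper’s fundus data.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)

# Toy stand-in for an RGB fundus image: three pixel classes, each with its
# own hypothetical non-negative spectral signature over the R, G, B channels.
signatures = np.array([[0.9, 0.7, 0.2],    # "drusen"-like
                       [0.6, 0.1, 0.1],    # "vessel"-like
                       [0.4, 0.5, 0.4]])   # "background"-like
labels_true = rng.integers(0, 3, size=2000)
X = signatures[labels_true] + 0.02 * rng.random((2000, 3))  # pixels x channels

# X ≈ A S : A holds per-pixel abundances, S the recovered spectral signatures
nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
A = nmf.fit_transform(X)       # spatial distribution (abundance) matrix
S = nmf.components_            # spectral signature matrix

# Cluster pixels in abundance space into three tissue classes
labels_pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(A)
ari = adjusted_rand_score(labels_true, labels_pred)
```

    The pipeline is fully unsupervised, mirroring the abstract’s point that no manual intervention is needed once the number of classes is chosen.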

  • Hyperspectral Signatures of Rabbit Retina Sections

    January 17, 2017 | J.P. Koniarek, S. Du, Paul Sajda, P. Gouras, R.T. Smith

    Purpose: These studies were carried out to determine the spectral signatures of retinal structures, which can be used for analysis and automated diagnosis of retinal disease. Since conventional pathology cannot determine the nature of retinal lesions in situ, non-invasive methods must be used to quantify retinal pathology. One such method is the application of multispectral and hyperspectral imaging with automated data analysis. Methods: Unstained cross-sections of rabbit retinas mounted on glass slides were placed under a microscope and illuminated by white light. A monochromatic CCD camera combined with a liquid crystal tunable filter operating in the visible range was used to record images at 10 nm intervals between 440 nm (blue) and 720 nm (red). Two methods were used to characterize the spectral signatures of the constituent tissues. The first required manual segmentation and consisted of determining the gray-scale values as a function of frequency of the reflected light for the neural retina, the RPE, the choroid, and the sclera. The second used an unsupervised decomposition called non-negative matrix factorization (NMF) for the same four layers. NMF decomposes a multivariate data set into two matrices: a matrix of spectral signatures and their corresponding spatial distributions. Results: The reflectance spectrum of each of the tissue layers obtained by the manual method formed a characteristic curve (signature), distinct in the frequency range studied and different for each layer. The signatures recovered using NMF have spatial distributions consistent with those obtained with manual segmentation. Both methods were consistent in recovering four distinct signatures. Conclusions: The spectral signature of each retinal layer investigated appears to be unique by both methods. As such, these signatures lend themselves to being a tool for diagnosing retinal lesions that may have a different neural retina, RPE, choroidal, or scleral component.

  • Electrooculogram based system for computer control using a multiple feature classification model

    January 17, 2017 | A. R. Kherlopian, J. P. Gerrein, M. Yue, K. E. Kim, J. W. Kim, M. Sukumaran, Paul Sajda

    This paper discusses the creation of a system for computer-aided communication through automated analysis and processing of electrooculogram signals. In situations of disease or trauma, there may be an inability to communicate with others through standard means such as speech or typing. Eye movement tends to be one of the last remaining active muscle capabilities for people with neurodegenerative disorders such as amyotrophic lateral sclerosis (ALS), also known as Lou Gehrig’s disease. Thus, there is a need for eye-movement-based systems to enable communication. To meet this need, the Telepathix system was designed to accept eye movement commands denoted by looking to the left, looking to the right, and looking straight ahead to navigate a virtual keyboard. Using a ternary virtual keyboard layout and a multiple feature classification model, a typing speed of 6 letters per minute was achieved.

  • Spectral separation resolves partial volume effect in MRSI: A validation study

    January 13, 2017 | K. Sasan, S. Du, S. Thakur, Y. Su

    Magnetic resonance spectroscopic imaging (MRSI) is utilized clinically in conjunction with anatomical MRI to assess the presence and extent of brain tumors and evaluate treatment response. Unfortunately, the clinical utility of MRSI is limited by significant variability of in vivo spectra. Spectral profiles show increased variability due to partial coverage of large voxel volumes, infiltration of normal brain tissue by tumors, innate tumor heterogeneity and measurement noise. This study investigates spectral separation as a novel quantification tool, addressing these problems directly by quantifying the abundance (i.e. volume fraction) within a voxel for each tissue type instead of the conventional estimation of metabolite concentrations from spectral resonance peaks. Present results on 20 clinical cases of brain tumors show reduced cross-subject variability. This reduced variability leads to improved discrimination between high and low-grade gliomas, confirming the physiological relevance of the extracted spectra. Further validation on phantom data demonstrates the accuracy of the estimated abundances. These results show that the proposed spectral analysis method can improve the effectiveness of MRSI as a diagnostic tool.
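    Estimating per-voxel tissue abundances from a measured spectrum, given a set of reference tissue spectra, is a linear unmixing problem that can be sketched with non-negative least squares. The Gaussian-peak “tissue” profiles and the 70/30 mixture below are invented for illustration and are not the spectra extracted by the paper’s spectral separation method.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Hypothetical tissue spectra (columns): each voxel spectrum is modeled as a
# non-negative combination of these profiles plus measurement noise.
n_points = 120
freq = np.linspace(0.0, 1.0, n_points)

def peak(center, width):
    """Gaussian resonance peak on the normalized frequency axis."""
    return np.exp(-((freq - center) ** 2) / (2 * width ** 2))

tissue_spectra = np.column_stack([
    peak(0.25, 0.03) + 0.5 * peak(0.70, 0.05),   # "normal"-like profile
    2.0 * peak(0.55, 0.04),                      # "tumor"-like profile
])

# Simulate a voxel that is 70% tissue 1 and 30% tissue 2 (partial volume)
true_abundance = np.array([0.7, 0.3])
voxel = tissue_spectra @ true_abundance + 0.01 * rng.standard_normal(n_points)

# Non-negative least squares recovers the volume fractions within the voxel
abundance, residual = nnls(tissue_spectra, voxel)
```

    Quantifying fractions this way, rather than fitting individual metabolite peaks, is what lets a partial-volume voxel be described directly in terms of its tissue composition.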

  • Assessing the Cortical Response to Macular Disease via a Large-Scale Spiking Neuron Model of V1

    January 13, 2017 | Paul Sajda, Jim Wielaard, R.T. Smith, Jianing Shi

    Purpose: Current efforts for assessing macular disease have focused on the retina, for instance quantitation of drusen distributions. Retinal imaging, however, does not provide a complete picture of the nature of the expected vision loss. Important to consider is how the visual cortex responds to the resulting scotomata and distortion of the retinal input. Methods: In this study we used an anatomically and physiologically detailed spiking neuron model of V1 (Wielaard and Sajda, Cerebral Cortex, 2006; 16(11):1531-1545) to investigate the effect of macular disease on cortical activity, tuning, and selectivity. We segmented fundus images and used them as “masks” for input to our cortical simulations. The model was probed using simulated drifting sinusoidal grating stimuli. All simulations were done using monocular input. We analyzed the firing rates and orientation selectivity of cells in parvocellular (4Cβ) and magnocellular (4Cα) versions of the cortical model as a function of normal and abnormal retinal input. To analyze orientation selectivity we computed the circular variance (CV) across the population of cells. Results: For the magnocellular model we found an overall reduction in the firing rates of all cortical neurons. However, there were no obvious “holes” of activity indicative of clusters of inactive neurons whose spatial position could be correlated with the spatial distribution of drusen. Analysis of orientation selectivity showed a dramatic reduction in selectivity for the abnormal relative to the normal cases: for the abnormal cases there was a shift of the CV distribution toward 1.0, indicating poorer orientation selectivity of the cells in 4Cα. For 4Cβ the results were somewhat different. Unlike the magnocellular model, the parvocellular model showed clusters of inactivity that correlated with the spatial distribution of drusen. However, orientation selectivity was not significantly affected, with distributions for normal and abnormal cases being indistinguishable. Conclusions: The magno system appears to fill in spatial information, though at the cost of a loss of orientation selectivity, whereas the parvo system maintains orientation selectivity but with scotomata present in the cortical activity. This analysis is only “first order” in that drusen are treated purely as masking the visual input, when in fact their effect on retinal ganglion cell activity can be more complex. Nonetheless, the simulations offer some insight into how responses of cortical neurons are affected by retinal disease.

  • Automated Analysis of ¹H Magnetic Resonance Metabolic Imaging Data as an Aid to Clinical Decision-Making in the Evaluation of Intracranial Lesions

    January 13, 2017 | D.C. Shungu, S. Du, X. Mao, L. A. Heier, S. C. Pannullo, Paul Sajda

    Proton magnetic resonance spectroscopic imaging (¹H MRSI) is a noninvasive metabolic imaging technique that has emerged as a potentially powerful tool for complementing structural magnetic resonance imaging (MRI) in the clinical evaluation of neurological disorders and diagnostic decision making. However, the relative complexity of methods that are currently available for analyzing the derived multi-dimensional metabolic imaging data has slowed incorporation of the technique into routine clinical practice. This paper discusses this impediment to widespread clinical use of ¹H MRSI and then describes an automated data analysis approach that promises to facilitate use of the technique in the evaluation of intracranial lesions, with the potential to enhance the specificity of MRI and improve clinical decision-making.

  • Sparse decoding of neural activity in a spiking neuron model of V1

    January 13, 2017 | Jianing Shi, Jim Wielaard, Paul Sajda

    We investigate using a previously developed spiking neuron model of layer 4 of primary visual cortex (V1) [1] as a recurrent network whose activity is subsequently linearly decoded, given a set of complex visual stimuli. Our motivation is based on the following: 1) linear decoders have proven useful in analyzing a variety of neural signals, including spikes, firing rates, local field potentials, voltage-sensitive dye imaging, and scalp EEG; 2) linear decoding of activity generated from highly recurrent, nonlinear networks with fixed connections has been shown to provide universal computational capabilities, with such methods termed liquid state machines (LSMs) [2] and echo state networks (ESNs) [3]; 3) in LSMs and ESNs, little is often assumed about the recurrent network architecture. However, it is likely that for a given type of stimulus/input the architecture of a biologically constrained recurrent network is important, since it shapes the spatio-temporal correlations across the neuronal population, which can potentially be exploited efficiently by an appropriate decoder. We conduct experiments using a two-alternative forced choice paradigm of face and car discrimination, where a set of 12 face (Max Planck Institute face database) and 12 car grey-scale images is used [4]. All the images (512 x 512 pixels, 8 bits/pixel) have identical Fourier magnitude spectra. The phase spectra of the images are manipulated using the weighted mean phase method to introduce noise, resulting in a set of images graded by phase coherence. The sequence of images is presented to the V1 model (detailed in [1]) in a block design, where a face or car image is flashed for 50 ms, followed by an interval of 200 ms in which a mean luminance background is shown. We use a linear decoder to map the spatio-temporal activity in the recurrent V1 model to a decision on whether the input stimulus is a face or a car. We employ a sparsity constraint on the decoder in order to control the dimension of the effective feature space. Sparse decoding is also consistent with previous research efforts on decoding multi-unit recordings and optical imaging data. We evaluate the accuracy of the linear decoding of activity in the V1 model and compare it to a set of psychophysical data using the same stimuli. We construct a neurometric function for the decoder, with the variable of interest being the stimulus phase coherence. We find that linear decoding of neural activity in a recurrent V1 model can yield discrimination accuracy that is at least as good as, if not better than, human psychophysical performance for relatively complex visual stimuli. Thus substantial information for super-accurate decoding remains at the level of V1, and the loss of information needed to better match behavioral performance is predicted to occur downstream in the decision-making process. We also find a small improvement in discrimination accuracy when a spatio-temporal word is used relative to a spatial-only word, providing insight into the utility of a temporal vs. a rate code for behaviorally relevant decoding.
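    A sparsity-constrained linear decoder of the kind described above can be sketched as an L1-regularized logistic regression over spike-count “words”. The neuron counts, firing rates, and regularization strength below are illustrative assumptions, not parameters of the V1 model; the L1 penalty drives most decoding weights to exactly zero, so only a small subset of neurons contributes to the decision.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_neurons, n_informative = 400, 200, 10

# Synthetic spike counts: most neurons fire at a baseline rate; a small
# informative subset shifts its rate with the stimulus class ("face" vs "car").
y = rng.integers(0, 2, size=n_trials)
rates = np.full((n_trials, n_neurons), 5.0)
rates[:, :n_informative] += 3.0 * y[:, None]       # class-dependent rate shift
X = rng.poisson(rates).astype(float)

# L1 penalty enforces sparsity: few neurons receive nonzero decoding weights
decoder = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
decoder.fit(X, y)

n_used = int(np.count_nonzero(decoder.coef_))      # effective feature count
train_acc = decoder.score(X, y)
```

    Sweeping the stimulus difficulty (here, the rate shift) and plotting accuracy against it would produce a neurometric function analogous to the one constructed in the abstract.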

  • Using a Spiking Neuron Model of V1 as a Substrate for Mapping Visual Stimuli to Perception

    January 13, 2017 | Jianing Shi, Jim Wielaard, M. Busuioc, R.T. Smith, Paul Sajda

    Purpose: How visual stimuli map to neural activity and ultimately perception is important not only for understanding normal visual function but also for assessing how abnormalities and pathologies, for instance those arising in the retina, may ultimately affect perception. In this study we use a model of primary visual cortex (V1) as a substrate for mapping visual stimuli to a large population of neural activity, and subsequently compare the accuracy of decoding this activity to the accuracy of human subjects on the same visual discrimination task. Methods: We use a previously developed spiking neuron model of V1 as a recurrent network whose activity is then linearly decoded, providing a link to perception in the context of a visual discrimination task. We introduce a sparsity constraint in the decoder, given the hypothesis that information is sparsely distributed in the highly recurrent network of V1. A spatio-temporal word is constructed from the population spike trains, as input to the sparse decoder, to exploit the full dynamics of the model. We evaluate decoding accuracy using a two-alternative forced choice paradigm (face versus car discrimination) in which we control the difficulty of the task by modulating the phase coherence of the images. We compare neurometric functions, constructed via sparse decoding of the neural activity in the model, to psychometric functions obtained from 10 human subjects. Results: In general, we find that relatively small fractions of the neurons are required for highly accurate decoding of the visual stimuli. We find that linear decoding of neural activity in a recurrent V1 model can yield discrimination accuracy that is at least as good as, if not better than, human psychophysical performance for relatively complex visual stimuli. Thus substantial information for highly accurate decoding remains at the level of V1, and the loss of information needed to better match behavioral performance is predicted to occur downstream, in the decision-making process. We also find marginally better decoding accuracy when fully utilizing the spatio-temporal dynamics compared with a static decoding strategy. Conclusions: We have demonstrated how we can link the visual stimulus to perception via a mapping through a spiking neuron model of the early visual system. Future work will consider this framework for analyzing the perceptual effect of retinal vision loss in patients with mild yet progressive macular disease, comparing predictions to those obtained strictly from the analysis of the spatial distribution of retinal abnormalities such as drusen.
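    The sparse decoding step can be sketched as L1-penalized logistic regression applied to a flattened spatio-temporal word of binned spike counts. This is a minimal illustration on synthetic data, not the paper's model; the population size, bin count, and Poisson spike statistics are all assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic "spatio-temporal word": trials x neurons x time bins of
# binned spike counts; only a small subset of neurons is informative.
n_trials, n_neurons, n_bins = 200, 50, 10
X = rng.poisson(2.0, size=(n_trials, n_neurons, n_bins)).astype(float)
y = rng.integers(0, 2, n_trials)          # face (1) vs car (0)
X[y == 1, :5, :] += 1.5                   # informative neurons

X = X.reshape(n_trials, -1)               # flatten to the decoder's input

# The L1 penalty implements the sparsity constraint on the linear decoder
dec = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
dec.fit(X, y)

frac_used = np.mean(np.abs(dec.coef_) > 1e-8)
print(f"fraction of weights used: {frac_used:.2f}")
print(f"training accuracy: {dec.score(X, y):.2f}")
```

    Consistent with the Results, only a small fraction of the weights (and hence neurons) survives the sparsity penalty while accuracy remains high.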

  • Perceptual Consequences of Macular Disease Evaluated Using a Model of V1

    January 13, 2017 | Jianing Shi, Jim Wielaard, M. Busuioc, R.T. Smith, Paul Sajda

    Purpose: Clinical assessment of macular disease typically relies on direct analysis of retinal imaging, which does not necessarily provide a complete picture of expected vision loss. A potential advancement is a framework for predicting how retinal disease affects cortical activity and ultimately perceptual performance. Methods: Fundus images for low-vision patients with macular disease were segmented to create masks, used to simulate disease-specific distortion at the level of the retina. A 2-AFC perceptual task was designed with the goal of discriminating face and car images in the presence of noise. 10 subjects with normal vision performed the task and their results were assessed via psychometric curves. We simulated the cortical activity given the stimuli and used linear decoding of spike trains to generate neurometric curves for the model. The sparse linear decoder was optimized to maximize discrimination, not to match subjects’ psychometric curves. We simulated the cortical activity of low-vision subjects using the mask-distorted stimuli and carried out the decoding analysis in the same manner as for normal subjects. Results: Shown are the mean psychometric curve for normal subjects (red), individual subjects (light red), the mean neurometric curve for simulated “normal” subjects (black), and a simulated “low-vision” subject (gray). The mean simulated “normal” subject has a neurometric curve that is a reasonable match to normal subjects, for the most part falling within the inter-subject variation. For the simulated “low-vision” case, the neurometric curve is shifted to the right, indicating degradation in perceptual performance. Conclusions: Our results are promising in that they predict healthy subject perceptual performance and also show systematic shifts in performance for simulated “low-vision” cases. Future work will quantify the predictive value of the model for a population of low-vision patients.

  • We find before we Look: Neural signatures of target detection preceding saccades during visual search

    January 13, 2017 | An Luo, L. Parra, Paul Sajda

    We investigated neural correlates of target detection in the electroencephalogram (EEG) during a free-viewing search task and analyzed signals locked to saccadic events. We adopted stimuli similar to ones we used previously to study target detection in serial presentations of briefly flashed images. Subjects performed the search task for multiple random scenes while we simultaneously recorded 64 channels of EEG and tracked subjects’ eye position. For each subject we identified target saccades (TS) and distractor saccades (DS). TS were always saccades directly to the target that were followed by a correct behavioral response (button press); for DS, we used saccades in correctly responded trials having no target (28% of the trials). We sampled the sets of TS and DS saccades such that they were equalized/matched for saccade direction and duration, ensuring no information in the saccade properties themselves was discriminating for their type. We aligned EEG to the saccade and used logistic regression (LR), in the space of the 64 electrodes, to identify components discriminating a TS from a DS on a single-trial basis. Specifically, LR was applied to narrow time windows (50 ms), and discrimination was done for windows having varying latencies relative to the saccade. We found significant discriminating activity in the EEG both before and after the saccade—average discriminability across 7 subjects was AUC = 0.64, 80 ms before the saccade, and AUC = 0.68, 60 ms after the saccade (p < 0.01, established using bootstrap resampling). Between these time periods we saw a substantial reduction in discriminating activity (for 7 subjects, mean AUC = 0.59). We conclude that we can identify neural signatures of detection both before and after the saccade, indicating that the subject anticipates where the target is before making the final saccade to foveate and respond.
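    The sliding-window analysis can be sketched as logistic regression on 50 ms windows of saccade-locked data, each window scored by cross-validated AUC. Synthetic data; the sampling rate, channel count, and the placement of the "discriminating" interval are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)

# Synthetic saccade-locked EEG: trials x channels x samples (assume 1 kHz)
n_trials, n_chan, n_samp = 300, 64, 400   # -200..+200 ms around the saccade
eeg = rng.standard_normal((n_trials, n_chan, n_samp))
is_target = rng.integers(0, 2, n_trials)  # TS (1) vs DS (0)
eeg[is_target == 1, :8, 100:150] += 0.3   # discriminating activity in one window

win = 50  # 50 ms window at 1 kHz
aucs = []
for start in range(0, n_samp - win + 1, win):
    # average within the window -> one value per channel per trial
    Xw = eeg[:, :, start:start + win].mean(axis=2)
    lr = LogisticRegression(max_iter=1000)
    yhat = cross_val_predict(lr, Xw, is_target, cv=5, method="predict_proba")[:, 1]
    aucs.append(roc_auc_score(is_target, yhat))

print([f"{a:.2f}" for a in aucs])
```

    As in the abstract, AUC traced across window latencies localizes when discriminating activity is present relative to the saccade.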

  • Coupling Retinal Imaging With Psychophysics to Assess Perceptual Consequences of AMD

    January 13, 2017 | Jianing Shi, Jim Wielaard, R.T. Smith, Paul Sajda

    Purpose: Retinal imaging does not necessarily provide a complete picture of expected vision loss for macular disease. We use a psychophysics test coupled with computational modeling to relate pathologies, found via fundus imaging, to expected perceptual function for a group of AMD patients. Methods: We recruited 10 low-vision patients with mild yet progressive AMD, as well as 10 age-matched healthy controls, at the Edward Harkness Eye Institute, Columbia Presbyterian Medical Center. Both patients and controls, whose ages ranged from 65 to 84, had corrected visual acuity between 20/20 and 20/50. All subjects participated in a 2-AFC perceptual task, in monocular mode, where they were required to discriminate face and car images in the presence of variable noise. Color fundus photographs were collected using a Zeiss FF 450 Plus camera. Fundus images were segmented using a robust and automated algorithm to quantify disease-specific pathologies on the retina. We mapped each patient’s retinal pathology to cortical activity and neurometric curves using a computational model of V1 and a decoding framework. We compared the psychometric curves between controls and patients, and investigated the quality of the neurometric predictions. We further analyzed the correlation between the neurometric curves and statistics of drusen in the masks. Results: AMD patients had substantially lower discrimination accuracies compared to controls. Moreover, the degradation in the discrimination accuracy of AMD patients was much more pronounced at higher signal-to-noise ratio (SNR) levels of the stimulus. We observed a positive correlation (r = 0.67) between the fraction of drusen-free area on the mask and the predicted perceptual discrimination at the highest SNR level for the stimulus.
Conclusions: The psychophysics and modeling framework we developed provides a quantitative assessment for the perceptual consequences of AMD and can potentially serve as a method for relating clinical findings in retinal imaging to perceptual function.

  • On Hyperspectral Signatures of Drusen

    January 13, 2017 | A. A. Fawzi, Paul Sajda, A. Laine, G. Martin, M. S. Humayun, R.T. Smith

    Purpose: Drusen, the hallmark lesions of age-related macular degeneration (AMD), are biochemically heterogeneous, and the identification of their biochemical distribution is key to understanding AMD. The challenge is to develop imaging technology and analysis tools that respect the physical generation of the hyperspectral signal in the presence of noise and multiple mixed sources, while maximally exploiting the full data dimensionality to uncover clinically relevant spectral signatures. Methods: 7 patient eyes with drusen were imaged with the snapshot hyperspectral camera previously described (doi:10.1117/1.2434950). Regions of interest (ROIs) of drusen were identified in each image. Multiple images were acquired of one eye. We performed statistical intra-subject analysis to investigate the reproducibility of non-negative matrix factorization (NMF) in AMD patients with different types of drusen. Given a data matrix D and a positive integer r, the NMF problem is to compute a decomposition D ≈ WH, with r the rank of the factorization, W the basis vectors, and H the linear encoding representing the mixing coefficients. Results: Figure 1 shows central slices of 5 different ROIs for patient P=c. In each ROI a drusen sensitivity spectrum was recovered with a response peak between 550 and 600 nm. This spectrum had low variability across different ROIs within a patient (mean standard error σ = 0.01) and between patients (σ = 0.041). Conclusions: Snapshot hyperspectral images analyzed with NMF, which imposes physically realistic positivity constraints on the mixing process, recovered spectral profiles that reliably identified drusen. The recovered spectra were consistently similar for drusen in different areas of the macula from the same eye and also in different eyes. Our results suggest that hyperspectral imaging can detect biochemically meaningful components of drusen.
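    The factorization D ≈ WH can be sketched with scikit-learn's NMF on a hypothetical ROI flattened to pixels x spectral channels; the orientation of W and H below follows the library's convention, and all matrix sizes are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(2)

# Hypothetical ROI: 900 pixels x 31 spectral channels (420-720 nm in 10 nm steps)
n_pix, n_chan, r = 900, 31, 3
W_true = rng.random((n_pix, r))            # per-pixel mixing coefficients
H_true = rng.random((r, n_chan))           # source spectra
D = W_true @ H_true + 0.01 * rng.random((n_pix, n_chan))

# NMF: non-negative W (n_pix x r) and H (r x n_chan) with D ≈ WH
model = NMF(n_components=r, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(D)
H = model.components_

err = np.linalg.norm(D - W @ H) / np.linalg.norm(D)
print(f"relative reconstruction error: {err:.3f}")
```

    The non-negativity of both factors is the "physically realistic positivity constraint" the Conclusions refer to: spectra and mixing weights cannot be negative.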

  • Simultaneous decomposition of multiple retinal pigment epithelium (RPE) autofluorescence hyperspectral datasets for fluorophor discovery

    January 11, 2017 | R. Post, A. Johri, B. Ganti, Paul Sajda, T. Ach, C. Curcio, T. Smith

    Purpose: Excitation of RPE autofluorescence with different wavelengths produces different but closely related spectral data. We hypothesized that simultaneous decomposition of multiple hyperspectral datasets into major spectral signatures and their spatial distributions with non-negative matrix factorization (NMF) could exploit these relationships to recover results superior to factoring any single hypercube. Methods: Pure RPE/BrM flat mounts were separately excited at 436-460 nm and 480-510 nm, and hyperspectral emission data were captured by methods described in detail in the Johri and Agarwal abstracts. Standard NMF factors a hypercube A into the product of matrices W and H (Fig 1a), where W contains the spectra of the recovered sources and H carries their spatial localizations (abundance images). In our formulation, we always retrieve 4 spectral signatures for RPE and one for BrM. We paired each signal found at 436 nm excitation to its corresponding signal at 480 nm, and linked the two datasets by requiring that the spatial localizations of the paired signals be exactly the same, because they come from the same compound (Fig 1b). Results: Fig. 2 (a, b) shows the 5 spectra recovered from the fovea of a 34 y/o female donor at 436 nm and 480 nm with standard NMF. The spectra are clearly paired according to the emission maxima. Fig 2c shows the results when the data are decomposed simultaneously: 10 abundant spectra are clearly paired in shape and location, suggesting single species. Each pair corresponds to one clearly defined abundance image. Conclusions: Simultaneous decomposition of multiple RPE hyperspectral datasets is superior to standard NMF at breaking down a complex spectrum representing a mixture of fluorophors into its individual spectral signals, hence providing better candidates for biochemical identification.
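    The shared-localization constraint can be sketched by concatenating the two hypercubes along the spectral axis: a single NMF then yields one shared abundance matrix and a pair of spectra per source. This is a toy illustration of the linking idea, not the paper's algorithm; matrix sizes are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(3)

# Two hypothetical hypercubes of the same field (436 nm and 480 nm excitation),
# flattened to pixels x spectral channels. Same sources, same locations.
n_pix, n_chan, r = 600, 31, 5
S_true = rng.random((n_pix, r))                 # shared spatial abundances
H1, H2 = rng.random((r, n_chan)), rng.random((r, n_chan))
A1, A2 = S_true @ H1, S_true @ H2

# Concatenating along the spectral axis forces one shared abundance matrix S:
# [A1 | A2] ≈ S [H1 | H2], so each source's two spectra are tied to the same
# spatial localization by construction.
A = np.hstack([A1, A2])
model = NMF(n_components=r, init="nndsvda", max_iter=1000, random_state=0)
S = model.fit_transform(A)
H = model.components_
H436, H480 = H[:, :n_chan], H[:, n_chan:]       # paired spectra per source

err = np.linalg.norm(A - S @ H) / np.linalg.norm(A)
print(f"relative reconstruction error: {err:.3f}")
```

    Each row pair (H436[i], H480[i]) shares the abundance image S[:, i], which is exactly the "same compound, same location" requirement described in the Methods.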

  • Hyperspectral Imaging Evidence for Four Abundant Retinal Pigment Epithelium (RPE) Fluorophors Across Age and Retinal Location

    January 11, 2017 | K. Agarwal, R. Post, A. Johri, A. Cowley, Paul Sajda, T. Ach, C. Curcio, T. Smith

    Purpose: Isolate and compare individual candidate autofluorescence (AF) signals from human RPE/Bruch’s membrane (BrM) flat mounts with hyperspectral AF imaging and mathematical modeling, across age, retinal location, and two excitation wavelengths. Methods: RPE/BrM-only flat mounts from 11 belts of normal chorioretinal human tissue (5 donors < 50 yrs, 6 > 80 yrs; 8 females, 3 males) were prepared by removing the retina and choroid under photographic control to maintain foveal position. Spectral microscopy and hyperspectral AF imaging were performed at 2 excitation bands, 436-460 nm and 480-510 nm, with emissions captured using the Nuance FX camera (Caliper Life Sciences, US) between 420-720 nm in 10 nm intervals at 3 locations: fovea, parafovea (2-4 mm superior to fovea, at the rod peak), and periphery (8-10 mm superior, at the highest rod:cone ratio), giving 66 hyperspectral data sets, each consisting of photon counts per second recorded at each spatial pixel and wavelength in the 40X field. Results: Gaussian mixture modeling and mathematical factorization of the hypercubes were applied to extract four RPE candidate spectra for lipofuscin at each location for each donor (see abstract by Johri et al for details). The four peaks were seen at average wavelengths of 566±6 nm, 604±27 nm, 645±8 nm, and 701±8 nm at the 436-460 nm excitation (Fig. 1), and 558±8 nm, 606±5 nm, 646±8 nm, and 694±13 nm at the 480-510 nm excitation across all donors. The peak near 600 nm (A2E-like) was generally the smallest of the four. The emission maxima varied for donors across age and locations, but all spectra were present in all but 6/108 data sets. There were no consistent regional or age trends in peak intensities. Conclusions: Hyperspectral AF imaging analysis of the RPE ex vivo consistently reports the presence of at least 4 abundant fluorophors with well-defined emission maxima across all studied ages, retinal locations, and excitation wavelengths. 
Determining the actual abundant source molecules that produce these signals will be important in understanding RPE physiology.

  • Hyperspectral image analysis of ex vivo autofluorescence (AF) of human Bruch’s membrane (BrM)

    January 11, 2017 | A. Fatoo, A. Johri, R. Post, Paul Sajda, C. Curcio, T. Ach, T. Smith

    Purpose: Identify and quantify candidate AF signals from BrM in RPE/BrM flat mounts of human donor eyes using ex vivo hyperspectral AF imaging and mathematical modeling. Methods: Flat mounts from 11 human eyes lacking chorioretinal pathology (6 donors 80 yrs) were prepared by removing the retina and choroid and studied at 3 locations with distinct photoreceptor content in the overlying retina: fovea, perifovea (2-4 mm superior to fovea), and periphery (10-12 mm superior to fovea). RPE was further removed from a region at each location to provide 33 samples of isolated BrM for spectral microscopy on a Zeiss Axio Imager A2 microscope (Carl Zeiss, Jena, Germany) with Plan-Apochromat objective optics (excitation: 430 nm; emission: long-pass fluorescence filter). Hyperspectral AF images were captured at emissions between 420 and 720 nm in 10 nm intervals using the Nuance FX camera (Caliper Life Sciences, US) and saved as data hypercubes with two spatial and one wavelength dimension. Results: Gaussian mixture modeling and mathematical factorization of the hypercubes were applied to extract 4 dominant BrM candidate spectra from each sample. Comparison with lipofuscin spectra independently obtained from these locations showed two shorter-wavelength peaks that were unique to BrM: one always present near 533 nm, and another near 488 nm that was present at a statistically significantly higher rate in the older donor population (Fisher exact test, p = 0.0272) (Table 1). There was also a trend for the 488 nm peak to be present more peripherally (Table 2). Two other peaks were found near 600 nm and 690 nm. The mean values (nm) of peak centers were: Fovea: 698 ± 22.4, 605 ± 15.7, 534 ± 4.0, 489 ± 2.9; Parafovea: 688 ± 28.2, 603 ± 19.4, 536 ± 7.2, 492 ± 4.4; Periphery: 689 ± 12.6, 602 ± 11.4, 534 ± 3.7, 489 ± 3.3. Conclusions: Candidate individual emission spectra for BrM suggest a population of fluorophors. A well-defined source with emission at 488 nm appears to increase with age. Peaks at 600-690 nm resemble those independently determined for RPE lipofuscin at the same locations. Whether these represent bis-retinoids requires further elucidation in tissues subject to lipid extraction. Biochemical identification of these species will be important in understanding BrM physiology in health and disease and for interpreting clinical hyperspectral imaging.

  • Mathematical modeling of retinal pigment epithelium (RPE) autofluorescence (AF) with Gaussian mixture models and non-negative matrix factorization (NMF)

    January 11, 2017 | A. Johri, R. Post, B. Ganti, A. A. Fawzi, Paul Sajda, T. Ach, C. Curcio, T. Smith

    Purpose: To devise a mathematical algorithm that can extract individual spectral fluorophor components and their spatial localizations from hyperspectral autofluorescence (AF) emission data taken from RPE and Bruch’s membrane (BrM) human donor flat mounts (ex vivo). Methods: Step 1, hyperspectral cube acquisition: The AF of eleven pure human RPE/BrM flat mounts was studied at 3 locations (fovea, parafovea, and periphery) via excitation at wavelengths 436-460 nm and 480-510 nm at 40X magnification. The corresponding hyperspectral emission data (hypercubes of two spatial and one spectral dimension) were captured using the Nuance FX camera (Caliper Life Sciences, US); further details are in the K. Agarwal abstract. Step 2, Gaussian modeling: We fit the original RPE spectra with mixtures of four Gaussian curves (Fig. 1), which provided single-peak, smooth candidates for individual fluorophor components. Step 3, NMF modeling: We used these candidate spectra to initialize an NMF technique that factors the entire hypercube to recover constituent source spectra and their spatial localizations while minimizing reconstruction error. We also initialized the NMF with the emission signal from a patch of bare BrM, because BrM, underlying the RPE, contributes its signal throughout. Results: NMF models with Gaussian/BrM initialization consistently decomposed RPE AF hypercubes into smooth individual candidate spectra with histologically plausible localizations within the flat-mount images (Fig. 2). For example, the shorter-wavelength spectral component C3 localized to BrM (Fig. 2, Spatial Abundance C3), while the other four, emitting from 575 nm to 700 nm, localized to the lipofuscin compartment. Conclusions: The Gaussian/NMF mixture model enabled consistent recovery of candidate spectra for individual RPE fluorophor emission signals with histologically plausible localizations. These spectra should now be matched to their corresponding biochemical components with techniques such as imaging mass spectrometry.
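    The Gaussian-modeling step (Step 2) can be sketched as a least-squares fit of a sum of four Gaussians to an emission spectrum. The toy spectrum below is built around the peak wavelengths reported in the companion abstracts; the amplitudes, widths, and noise level are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss_mix(wl, *p):
    """Sum of four Gaussians; p = (amp, center, width) repeated four times."""
    out = np.zeros_like(wl, dtype=float)
    for a, c, s in zip(p[0::3], p[1::3], p[2::3]):
        out += a * np.exp(-0.5 * ((wl - c) / s) ** 2)
    return out

# Hypothetical emission spectrum sampled 420-720 nm in 10 nm steps,
# with four peaks near the wavelengths reported for RPE lipofuscin
wl = np.arange(420.0, 721.0, 10.0)
true = (1.0, 566, 20, 0.4, 604, 18, 0.7, 645, 20, 0.5, 701, 22)
rng = np.random.default_rng(4)
spectrum = gauss_mix(wl, *true) + 0.01 * rng.standard_normal(wl.size)

# Initial guesses near the expected peaks; the fitted single-peak curves are
# the smooth candidates handed to the NMF initialization in Step 3
p0 = (1, 560, 15, 0.5, 600, 15, 0.5, 650, 15, 0.5, 700, 15)
popt, _ = curve_fit(gauss_mix, wl, spectrum, p0=p0, maxfev=20000)
centers = sorted(popt[1::3])
print([f"{c:.0f}" for c in centers])
```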

  • Quantitation of Hyperspectral Autofluorescence (AF) from Human Retinal Pigment Epithelium (RPE) Ex Vivo

    January 11, 2017 | C. Nabati, A. Johri, R. Post, Paul Sajda, T. Ach, C. Curcio, T. Smith

    Purpose: Quantify the hyperspectral AF signal from RPE/Bruch’s membrane (BrM) flat mounts. Methods: Hyperspectral AF images (hypercubes) were captured from 66 40X fields in 11 RPE/BrM flat mounts from human donor eyes using techniques described in detail in the abstract submitted by K. Agarwal. Briefly, for each 40X field the hypercube has the two spatial dimensions of the field and, at each spatial point, the photon counts recorded at each wavelength, hence the third or spectral dimension. For reproducible quantification of these data, exposure times were calibrated so that photon counts per spectral channel fell within the 12-bit linear range of the detector and were then offset by the dark current. Scaled counts-per-second were determined by exposure time (Eqn. 1) and calibrated to a standard fluorescent reference (courtesy of F Delori) to correct for any variation in the power of the excitation light, yielding quantified hypercubes with units of photon counts per second at each point and wavelength. Results: The root mean square (RMS) difference of quantified hypercubes from repeat imaging of the same location was within the noise level (dark current) of the Nuance detector, establishing reproducibility. Separation of the RPE signal from BrM (Fig. 2) and further mathematical analyses of the hypercubes (see abstract of A. Johri) therefore extracted reliable quantitative RPE lipofuscin spectra for individual constituents and their corresponding spatial co-localizations. Conclusions: Hyperspectral AF images of human RPE flat mounts may be reliably quantified for use as surrogates for measuring abundant lipofuscin components. Such quantitative information can help guide the analysis of RPE physiology and biochemistry.
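    The calibration described (presumably the form of Eqn. 1, which is not reproduced here) amounts to counts-per-second = (raw − dark) / exposure, scaled by a reference factor. The function below is a hypothetical rendering of that arithmetic, not the paper's exact equation; all array sizes and noise levels are assumptions.

```python
import numpy as np

def counts_per_second(raw, dark, exposure_s, ref_factor=1.0):
    # Hypothetical calibration: subtract the dark-current offset, divide by
    # exposure time, then scale by a factor from the fluorescent reference
    return ref_factor * (raw - dark) / exposure_s

rng = np.random.default_rng(5)
# Toy hypercube: two spatial dimensions x 31 wavelengths, 12-bit raw counts
raw = rng.integers(200, 4000, size=(64, 64, 31)).astype(float)
dark = 100.0       # detector dark-current offset
exposure = 0.25    # seconds

q = counts_per_second(raw, dark, exposure)

# Reproducibility check in the spirit of the Results: RMS difference between
# quantified repeat acquisitions (same scene plus simulated detector noise)
repeat = counts_per_second(raw + rng.normal(0, 5, raw.shape), dark, exposure)
rms = np.sqrt(np.mean((q - repeat) ** 2))
print(q.shape, f"RMS difference: {rms:.1f} counts/s")
```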

  • Neural Correlates of Spatiotemporal Event Recognition: Application to Brain-Computer Interfaces for Video Exploitation

    January 11, 2017 | Paul Sajda

    Detection of events of interest in video involves evidence accumulation across space and time; the observer is required to integrate features from both motion and form to decide whether a behavior constitutes a target event. Do such events that extend in time elicit evoked responses of similar strength as evoked responses associated with instantaneous events, such as the presentation of a static target image? Using a set of simulated scenarios, with avatars/actors having different behaviors, we identified evoked neural activity discriminative of target vs. distractor events (behaviors) at discrimination levels comparable to static imagery. EEG discriminative activity was largely in the time-locked evoked response and not in oscillatory activity, with the exception of very low EEG frequency bands such as delta and theta, which simply represent bands dominating the event-related potential (ERP). The discriminative evoked response activity we see is observed in all target/distractor conditions and is robust across different recordings from the same subjects. The results suggest that we have identified a robust neural correlate of target detection in video, at least in terms of the stimulus set we used, i.e., dynamic behavior of an individual in a low-clutter environment. We discuss implications for using such a neural correlate to build a brain-computer interface (BCI) to search and annotate video. This work was done with Lucas Parra of the City College of New York (CCNY) and Dan Rosenthal and Paul DeGuzman of Neuromatters, LLC.

  • Advanced Technologies for Brain Research [Scanning the Issue]

    January 6, 2017 | Metin Akay, Paul Sajda, S. Micera, J. M. Carmena

    We believe that this special issue will serve to increase public awareness and foster discussion of the multiple worldwide BRAIN initiatives, both within and outside the IEEE, providing an impetus for the development of long-term, cost-effective healthcare solutions. We also believe that the topics presented in this special issue will serve as scientific evidence, for health and policy advocates, of the value of neurotechnologies for improving the neurological and mental health and wellbeing of the general population. Below we briefly highlight the papers and technologies in this special issue.

  • Correlating Speaker Gestures in Political Debates with Audience Engagement Measured via EEG

    January 3, 2017 | John R. Zhang, Jason Sherwin, Jacek Dmochowski, Paul Sajda, John R. Kender

    We hypothesize that certain speaker gestures can convey significant information that is correlated with audience engagement. We propose gesture attributes, derived from speakers’ tracked hand motions, to automatically quantify these gestures from video. We then demonstrate a correlation between gesture attributes and an objective measure of audience engagement: electroencephalography (EEG), in the domain of political debates. We collect 47 minutes of EEG recordings from each of 20 subjects watching clips of the 2012 U.S. Presidential debates. The subjects are examined in aggregate and in subgroups according to gender and political affiliation. We find statistically significant correlations between gesture attributes (particularly extremal pose) and our EEG-derived feature of engagement, both with and without audio. For some stratifications, the Spearman rank correlation reaches as high as ρ = 0.283 with p < 0.05, Bonferroni corrected. From these results, we identify gestures that can be used to measure engagement, principally those that break habitual gestural patterns.
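    The correlation analysis can be sketched with a Spearman rank correlation per gesture attribute and a Bonferroni correction over the number of tests. The data below are synthetic, and the number of clips and attributes are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(6)

# Hypothetical per-clip gesture attributes and an EEG engagement score;
# attribute 0 is constructed to be informative, the rest are noise
n_clips, n_attrs = 60, 5
attrs = rng.standard_normal((n_clips, n_attrs))
engagement = 0.6 * attrs[:, 0] + rng.standard_normal(n_clips)

results = []
for j in range(n_attrs):
    rho, p = spearmanr(attrs[:, j], engagement)
    results.append((j, rho, min(1.0, p * n_attrs)))  # Bonferroni correction

sig = [(j, rho) for j, rho, p_corr in results if p_corr < 0.05]
print(sig)
```

    Multiplying each p-value by the number of tests (capped at 1) is the standard Bonferroni adjustment the abstract refers to.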

  • Neuro-Robotic Technologies and Social Interactions

    January 3, 2017 | K. McDowell, A. R. Marathe, B. Lance, J. S. Metcalfe, Paul Sajda

    The current bandwidth for understanding the cognitive and emotional context of a person is much more limited between robots and humans than among humans. Advances in human sensing technologies over the past two decades hold promise for providing online and unique information sources that can lead to deeper insights into human cognitive and emotional state than are currently attainable. However, blind application of human sensing technologies alone is not a solution. Here, we focus on the integration of neuroscience with robotic technologies for improving social interactions. We discuss the issue of uncertainty in human state detection and the need to develop approaches to estimate and integrate knowledge of that uncertainty. We illustrate this by discussing two application areas and the potential neuro-robotic technologies that could be developed within them.

  • Fusing Simultaneous EEG-fMRI by Linking Multivariate Classifiers

    Multivariate pattern analysis (MVPA) has typically been used in neuroimaging to draw inferences from a single modality, e.g., functional magnetic resonance imaging (fMRI) or electroencephalography (EEG). As simultaneous acquisition of different neuroimaging modalities becomes more common, one consideration is how to apply MVPA methods to analyze the resulting multimodal dataspaces. We present a multi-modal fusion technique that seeks to simultaneously train a multivariate classifier and identify correlated components across the two modalities. We validate our approach on a real simultaneous EEG-fMRI dataset.

  • Feature selection for gaze, pupillary, and EEG signals evoked in a 3D environment

    January 3, 2017 | David Jangraw, Paul Sajda

    As we navigate our environment, we are constantly assessing the objects we encounter and deciding on their subjective interest to us. In this study, we investigate the neural and ocular correlates of this assessment as a step towards their potential use in a mobile human-computer interface (HCI). Past research has shown that multiple physiological signals are evoked by objects of interest during visual search in the laboratory, including gaze, pupil dilation, and neural activity; these have been exploited for use in various HCIs. We use a virtual environment to explore which of these signals are also evoked during exploration of a dynamic, free-viewing 3D environment. Using a hierarchical classifier and sequential forward floating selection (SFFS), we identify a small, robust set of features across multiple modalities that can be used to distinguish targets from distractors in the virtual environment. The identification of these features may serve as an important factor in the design of mobile HCIs.
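    The feature-selection step can be sketched with greedy forward selection, the forward core of SFFS (full SFFS additionally "floats" by dropping previously chosen features). The feature names in the comment are hypothetical, as are the data dimensions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def forward_select(X, y, k, cv=5):
    """Greedy forward selection: repeatedly add the single feature that
    most improves cross-validated accuracy of the classifier."""
    clf = LogisticRegression(max_iter=1000)

    def score(idx):
        return cross_val_score(clf, X[:, idx], y, cv=cv).mean()

    sel = []
    while len(sel) < k:
        cand = [j for j in range(X.shape[1]) if j not in sel]
        sel.append(max(cand, key=lambda j: score(sel + [j])))
    return sel

rng = np.random.default_rng(10)
# Toy multimodal feature set: columns 0-2 informative (say, dwell time,
# pupil dilation, and one EEG component), the remaining columns noise
X = rng.standard_normal((200, 12))
y = rng.integers(0, 2, 200)
X[y == 1, :3] += 0.8

picked = forward_select(X, y, k=3)
print(picked)
```

    A small selected subset like this is what the study uses to distinguish targets from distractors across modalities.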

  • A System for Measuring the Neural Correlates of Baseball Pitch Recognition and Its Potential Use in Scouting and Player Development

    In this paper we use state-of-the-art multimodal neuroimaging to tease apart the spatio-temporal sequence of neural activity that “goes through a hitter’s mind” when they recognize a baseball pitch. Specifically we utilize electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) to investigate the neural networks activated for correct and incorrect pitch classifications. Our previous analysis has shown where in the trajectory of a pitch the hitter’s neural activity correctly discriminates a pitch type (e.g. fastball, curveball or slider). Here, we show that correct classifications correlate with a neural network including both visual and sub-cortical motor areas, likely demonstrating a link between visual identification and the required rapid motor response. Conversely, we find that not only is this activity lacking in incorrect classifications, but that it is instead replaced by prefrontal cortex activity, which has been shown to be responsible for more deliberative conflict resolution. Synthesizing these and other results, we hypothesize the potential uses of this technology in the form of a brain computer interface (BCI) to measure and enhance baseball player performance.

  • Fast, Exact Model Selection and Permutation Testing for l2-Regularized Logistic Regression

    January 3, 2017 | Bryan Conroy, Paul Sajda

    Regularized logistic regression is a standard classification method used in statistics and machine learning. Unlike regularized least squares problems such as ridge regression, the parameter estimates cannot be computed in closed form and instead must be estimated using an iterative technique. This paper addresses a computational problem commonly encountered with regularized logistic regression in model selection and classifier statistical significance testing, in which a large number of related logistic regression problems must be solved. Our proposed approach solves the problems simultaneously through an iterative technique, which also garners computational efficiencies by leveraging the redundancies across the related problems. We demonstrate analytically that our method provides a substantial complexity reduction, which is further validated by our results on real-world datasets.
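    The iterative technique the abstract refers to is typically Newton's method / IRLS. Below is a minimal sketch for a single l2-regularized problem on synthetic data; the paper's contribution, solving many related problems simultaneously, is not shown.

```python
import numpy as np

def l2_logreg_irls(X, y, lam=1.0, n_iter=50):
    """Newton/IRLS for l2-regularized logistic regression: repeatedly solve
    (X'SX + lam*I) step = gradient, where S holds the per-sample variances."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))                # predicted probabilities
        g = X.T @ (p - y) + lam * w                     # gradient
        s = p * (1 - p)
        H = X.T @ (X * s[:, None]) + lam * np.eye(d)    # Hessian
        step = np.linalg.solve(H, g)
        w -= step
        if np.linalg.norm(step) < 1e-8:                 # converged
            break
    return w

rng = np.random.default_rng(7)
X = rng.standard_normal((500, 10))
w_true = rng.standard_normal(10)
y = (X @ w_true + 0.5 * rng.standard_normal(500) > 0).astype(float)

w = l2_logreg_irls(X, y, lam=1.0)
acc = np.mean((X @ w > 0) == (y == 1))
print(f"training accuracy: {acc:.2f}")
```

    In model selection or permutation testing, this solve is repeated for many values of lam or many permuted label vectors, which is exactly the redundancy the paper exploits.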

  • Multivariate Analysis of fMRI using Fast Simultaneous Training of Generalized Linear Models (FaSTGLZ)

    January 3, 2017 | Bryan Conroy, Paul Sajda

    We present an efficient algorithm for simultaneously training elastic-net-regularized generalized linear models across many related problems, which may arise from bootstrapping, cross-validation, and nonparametric permutation testing. Our approach leverages the redundancies across problems to obtain ≈10x computational improvements relative to solving the problems sequentially by the standard glmnet algorithm of Friedman et al. (2010). We demonstrate our fast simultaneous training of generalized linear models (FaSTGLZ) algorithm for multivariate analysis of fMRI, and run otherwise computationally intensive bootstrapping and permutation test analyses that are typically necessary for obtaining statistically rigorous classification results and meaningful interpretation.
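    A sequential baseline for the kind of analysis FaSTGLZ accelerates can be sketched with scikit-learn's elastic-net logistic regression plus a small permutation test; the data sizes, penalties, and permutation count are assumptions, and this loop is exactly the redundant work the paper's algorithm shares across problems.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)

# Toy "fMRI" data: trials x voxels, with 10 informative voxels
X = rng.standard_normal((120, 200))
y = rng.integers(0, 2, 120)
X[y == 1, :10] += 0.8

def cv_score(X, y):
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=0.5, C=0.5, max_iter=5000)
    return cross_val_score(clf, X, y, cv=4).mean()

true_score = cv_score(X, y)

# Sequential permutation test: one full re-fit per shuffled label vector
null = [cv_score(X, rng.permutation(y)) for _ in range(19)]
p = (1 + sum(s >= true_score for s in null)) / (1 + len(null))
print(f"cv accuracy = {true_score:.2f}, permutation p = {p:.3f}")
```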

  • Learning EEG Components for Discriminating Multi-Class Perceptual Decisions

    Logistic regression has been used as a supervised method for extracting EEG components predictive of binary perceptual decisions. However, often perceptual decisions require a choice between more than just two alternatives. In this paper we present results using multinomial logistic regression (MLR) for learning EEG components in a 3-way visual discrimination task. Subjects were required to decide between three object classes (faces, houses, and cars) for images which were embedded with varying amounts of noise. We recorded the subjects’ EEG while they were performing the task and then used MLR to predict the stimulus category, on a single-trial basis, for correct behavioral responses. We found an early component (at 170ms) that was consistent across all subjects and with previous binary discrimination paradigms. However a later component (at 300-400ms), previously reported in the binary discrimination paradigms, was more variable across subjects in this three-way discrimination task. We also computed forward models for the EEG components, with these showing a difference in the spatial distribution of component activity for the different categorical decisions. In summary, we find that logistic regression, generalized to the arbitrary N-class case, can be a useful approach for learning and analyzing EEG components underlying multi-class perceptual decisions.
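    A minimal sketch of multinomial logistic regression on synthetic single-trial features, together with Haufe-style forward models (a = cov(X)w / var(wᵀx)) projecting each discriminant back onto the sensors. The data, class structure, and the specific forward-model formula are assumptions, not the paper's exact computation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)

# Toy single-trial EEG features: trials x channels, three classes
n_trials, n_chan = 600, 32
y = rng.integers(0, 3, n_trials)           # face / house / car
X = rng.standard_normal((n_trials, n_chan))
X[y == 0, :4] += 0.7                       # class-specific "topographies"
X[y == 1, 4:8] += 0.7

mlr = LogisticRegression(max_iter=2000)    # fits a multinomial model for 3 classes
mlr.fit(X, y)
acc = mlr.score(X, y)

# Forward model per class: project each discriminant back onto the sensors,
# yielding an interpretable spatial pattern (rows of A, one per class)
A = np.array([np.cov(X.T) @ w / np.var(X @ w) for w in mlr.coef_])
print(f"accuracy: {acc:.2f}, forward-model shape: {A.shape}")
```

    The rows of A play the role of the per-category forward models whose spatial distributions the abstract compares across decisions.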

  • A 3-D Immersive Environment for Characterizing EEG Signatures of Target Detection

    January 3, 2017 | David Jangraw, Paul Sajda

    Visual target detection is one of the most studied paradigms in human electrophysiology. Electroencephalographic (EEG) correlates of target detection include the well-characterized N1, P2, and P300. In almost all cases the experimental paradigms used for studying visual target detection are extremely well-controlled – very simple stimuli are presented so as to minimize eye movements, and scenarios involve minimal active participation by the subject. However, to characterize these EEG correlates for real-world scenarios, where the target or the subject may be moving and the two may interact, a more flexible paradigm is required. The environment must be immersive and interactive, and the system must enable synchronization between events in the world, the behavior of the subject, and simultaneously recorded EEG signals. We have developed a hardware/software system that enables us to precisely control the appearance of objects in a 3D virtual environment, which subjects can navigate while the system tracks their eyes and records their EEG activity. We are using this environment to investigate a set of questions which focus on the relationship between the visibility, salience, and affect of the target; the agency and eye movements of the subject; and the resulting EEG signatures of detection. In this paper, we describe the design of our system and present some preliminary results regarding the EEG signatures of target detection.

  • Leveraging Brain Computer Interaction Technologies for Military Applications

    January 3, 2017 | A. Ries, B. Lance, Paul Sajda
  • Discriminant Multitaper Component Analysis of EEG

    January 3, 2017 | Mads Dyrholm, Paul Sajda

    This work extends Bilinear Discriminant Component Analysis to the case of oscillatory activity with allowed phase-variability across trials. The proposed method learns a spatial profile together with a multitaper basis which can integrate oscillatory power in a band-limited fashion. We demonstrate the method for predicting the handedness of a subject’s button press given multivariate EEG data. We show that our method learns multitapers sensitive to oscillatory activity in the 8–12 Hz range with spatial filters selective for lateralized motor cortex. This finding is consistent with the well-known mu-rhythm, whose power is known to modulate as a function of which hand a subject plans to move, and thus is expected to be discriminative (predictive) of the subject’s response.

  • In vivo snapshot hyperspectral image analysis of age-related macular degeneration

    January 3, 2017 | N. Lee, Jim Wielaard, A. A. Fawzi, Paul Sajda, A. Laine, G. Matin, M. S. Humayun, R.T. Smith

    Drusen, the hallmark lesions of age-related macular degeneration (AMD), are biochemically heterogeneous, and the identification of their biochemical distribution is key to the understanding of AMD. The challenge is to develop imaging technology and analytics that respect the physical generation of the hyperspectral signal in the presence of noise, artifacts, and multiple mixed sources, while maximally exploiting the full data dimensionality to uncover clinically relevant spectral signatures. This paper reports on the statistical analysis of hyperspectral signatures of drusen and anatomical regions of interest using snapshot hyperspectral imaging and non-negative matrix factorization (NMF). We propose physically meaningful priors as initialization schemes for NMF for finding low-rank decompositions that capture the underlying physiology of drusen and the macular pigment. Preliminary results show that snapshot hyperspectral imaging in combination with NMF is able to detect biochemically meaningful components of drusen and the macular pigment. To our knowledge, this is the first reported in vivo demonstration of the separate absorbance peaks for lutein and zeaxanthin in macular pigment.

  • Mapping visual stimuli to perceptual decisions via sparse decoding of mesoscopic neural activity

    January 3, 2017 | Paul Sajda

    In this talk I will describe our work investigating sparse decoding of neural activity, given a realistic mapping of the visual scene to neuronal spike trains generated by a model of primary visual cortex (V1). We use a linear decoder which imposes sparsity via an L1 norm. The decoder can be viewed as a decoding neuron (linear summation followed by a sigmoidal nonlinearity) in which there are relatively few non-zero synaptic weights. We find: (1) the best decoding performance is for a representation that is sparse in both space and time, (2) decoding of a temporal code results in better performance than a rate code and is also a better fit to the psychophysical data, (3) the number of neurons required for decoding increases monotonically as signal-to-noise in the stimulus decreases, with as few as 1% of the neurons required for decoding at the highest signal-to-noise levels, and (4) sparse decoding results in a more accurate decoding of the stimulus and is a better fit to psychophysical performance than a distributed decoding, for example, one imposed by an L2 norm. We conclude that sparse coding is well-justified from a decoding perspective in that it results in a minimum number of neurons and maximum accuracy when sparse representations can be decoded from the neural dynamics.
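The L1-versus-L2 contrast can be illustrated with a toy linear readout (ordinary Lasso/Ridge regression on simulated spike counts; this is not the V1 model or the sigmoidal decoder of the talk, just the sparsity mechanism):

```python
# Sparse (L1) vs distributed (L2) decoding of a simulated population in
# which only ~1% of units carry the signal.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(2)
n_trials, n_neurons = 400, 500
rates = rng.poisson(5.0, (n_trials, n_neurons)).astype(float)
informative = rng.choice(n_neurons, 5, replace=False)
stimulus = rates[:, informative].sum(axis=1) + rng.standard_normal(n_trials)

lasso = Lasso(alpha=0.1).fit(rates, stimulus)     # L1: sparse weights
ridge = Ridge(alpha=1.0).fit(rates, stimulus)     # L2: distributed weights
n_used_l1 = np.count_nonzero(lasso.coef_)
n_used_l2 = np.count_nonzero(ridge.coef_)
print(n_used_l1, n_used_l2)   # the L1 readout uses far fewer "synapses"
```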

  • Combining computer and human vision into a BCI: Can the whole be greater than the sum of its parts?

    January 3, 2017 | Eric Pohlmeyer, David Jangraw, J. Wang, S-F Chang, Paul Sajda

    Our group has been investigating the development of BCI systems for improving information delivery to a user, specifically systems for triaging image content based on what captures a user’s attention. One of the systems we have developed uses single-trial EEG scores as noisy labels for a computer vision image retrieval system. In this paper we investigate how the noisy nature of the EEG-derived labels affects the resulting accuracy of the computer vision system. Specifically, we consider how the precision of the EEG scores affects the resulting precision of images retrieved by a graph-based transductive learning model designed to propagate image class labels based on image feature similarity and sparse labels.

  • The Bilinear Brain: Towards Subject-Invariant Analysis

    January 3, 2017 | C. Christoforou, R. Haralick, Paul Sajda, L. Parra

    A major challenge in single-trial electroencephalography (EEG) analysis and Brain Computer Interfacing (BCI) is the so-called inter-subject/inter-session variability, i.e., the large variability in measurements obtained during different recording sessions. This variability restricts the number of samples available for single-trial analysis to the limited number that can be obtained during a single session. Here we propose a novel method that distinguishes between subject-invariant features and subject-specific features, based on a bilinear formulation. The method allows one to combine multiple recordings of EEG to estimate the subject-invariant parameters, hence addressing the issue of inter-subject variability, while reducing the complexity of estimation for the subject-specific parameters. The method is demonstrated on 34 datasets from two different experimental paradigms: a perceptual categorization task and a rapid serial visual presentation (RSVP) task. We show significant improvements in classification performance over state-of-the-art methods. Further, our method extracts neurological components never before reported on the RSVP task, thus demonstrating the ability of our method to extract novel neural signatures from the data.

  • Brain State Decoding for Rapid Image Retrieval

    January 3, 2017 | J. Wang, Eric Pohlmeyer, B. Hanna, Y-G Jiang, Paul Sajda, S-F Chang

    Human visual perception is able to recognize a wide range of targets under challenging conditions, but has limited throughput. Machine vision and automatic content analytics can process images at high speed, but suffer from inadequate recognition accuracy for general target classes. In this paper, we propose a new paradigm to explore and combine the strengths of both systems. A single-trial EEG-based brain-computer interface (BCI) subsystem is used to detect objects of interest of arbitrary classes from an initial subset of images. The EEG detection outcomes are used as input to a graph-based pattern mining subsystem to identify, refine, and propagate the labels to retrieve relevant images from a much larger pool. The combined strategy is unique in its generality, robustness, and high throughput. It has great potential for advancing the state of the art in media retrieval applications. We have evaluated and demonstrated significant performance gains of the proposed system with multiple and diverse image classes over several data sets, including those from the Internet (Caltech 101) and remote sensing images. In this paper, we will also present insights learned from the experiments and discuss future research directions.

  • Signal processing challenges for single-trial analysis of simultaneous EEG/fMRI

    January 2, 2017 | Paul Sajda

    A relatively new neuroimaging modality is simultaneous EEG and fMRI. Though such a multi-modal acquisition is attractive, given that it can exploit the temporal resolution of EEG and the spatial resolution of fMRI, it comes with unique signal processing and pattern classification challenges. In this paper I review some of our work on developing signal processing and pattern recognition methods for the analysis of simultaneous EEG and fMRI, with a focus on algorithms enabling a single-trial analysis of the neural signal. In general, these algorithms exploit the multivariate nature of the EEG, removing MR-induced artifacts and classifying event-related signals that can then be correlated with the BOLD signal to yield specific fMRI activations.

  • Perceptual Decision Making Investigated via Sparse Decoding of a Spiking Neuron Model of V1

    January 2, 2017 | Jianing Shi, Jim Wielaard, R.T. Smith, Paul Sajda

    Recent empirical evidence supports the hypothesis that invariant visual object recognition might result from non-linear encoding of the visual input followed by linear decoding [1]. This hypothesis has received theoretical support through the development of neural network architectures which are based on a non-linear encoding of the input via recurrent network dynamics followed by a linear decoder [2], [3]. In this paper we consider such an architecture in which the visual input is non-linearly encoded by a biologically realistic spiking model of V1 and mapped to a perceptual decision via a sparse linear decoder. What is novel is that we 1) utilize a large-scale conductance-based spiking neuron model of V1 which has been well characterized in terms of classical and extra-classical response properties, and 2) use the model to investigate decoding over a large population of neurons. We compare decoding performance of the model system to human performance by comparing neurometric and psychometric curves.

  • Do We See Before We Look?

    January 2, 2017 | An Luo, Paul Sajda

    We investigated neural correlates of target detection in the electroencephalogram (EEG) during a free viewing search task and analyzed signals locked to saccadic events. Subjects performed a search task across multiple random scenes while we simultaneously recorded 64 channels of EEG and tracked their eye position. For each subject we identified target saccades (TS) and distractor saccades (DS). We sampled the sets of TS and DS saccades such that they were equalized/matched for saccade direction and duration, ensuring that no information in the saccade properties themselves was discriminating for their type. We aligned EEG to the saccade onset and used logistic regression (LR), in the space of the 64 electrodes, to identify activity discriminating a TS from a DS on a single-trial basis. We found significant discriminating activity in the EEG both before and after the saccade. We also saw a substantial reduction in discriminating activity when the saccade was executed. We conclude that we can identify neural signatures of detection both before and after the saccade, indicating that subjects anticipate the target before the final saccade, which serves to foveate and confirm the target identity.

  • Second Order Bilinear Discriminant Analysis for Single-trial EEG

    January 2, 2017 | C. Christoforou, Paul Sajda, L. Parra

    Traditional analysis methods for single-trial classification of electroencephalography (EEG) focus on two types of paradigms: phase-locked methods, in which the amplitude of the signal is used as the feature for classification, i.e., event-related potentials; and second-order methods, in which the feature of interest is the power of the signal, i.e., event-related (de)synchronization. The process of deciding which paradigm to use is ad hoc and driven by knowledge of neurological findings. Here we propose a unified method in which the algorithm learns the best first- and second-order spatial and temporal features for classification of EEG based on a bilinear model. The efficiency of the method is demonstrated on simulated and real EEG from a benchmark data set for brain-computer interfaces.

  • Automated analysis of 1H magnetic resonance metabolic imaging data as an aid to clinical decision-making in the evaluation of intracranial lesions

    January 2, 2017 | D.C. Shungu, S. Du, X. Mao

    Proton magnetic resonance spectroscopic imaging (1H MRSI) is a noninvasive metabolic imaging technique that has emerged as a potentially powerful tool for complementing structural magnetic resonance imaging (MRI) in the clinical evaluation of neurological disorders and diagnostic decision-making. However, the relative complexity of methods that are currently available for analyzing the derived multi-dimensional metabolic imaging data has slowed incorporation of the technique into routine clinical practice. This paper discusses this impediment to widespread clinical use of 1H MRSI and then describes an automated data analysis approach that promises to facilitate use of the technique in the evaluation of intracranial lesions, with the potential to enhance the specificity of MRI and improve clinical decision-making.

  • A system for single-trial analysis of simultaneously acquired EEG and fMRI

    January 2, 2017 | Paul Sajda, Robin Goldman, Marios Philiastides, Adam Gerson, Truman R. Brown

    In this paper we describe a system for simultaneously acquiring EEG and fMRI and evaluate it in terms of discriminating, single-trial, task-related neural components in the EEG. Using an auditory oddball stimulus paradigm, we acquire EEG data both inside and outside a 1.5T MR scanner and compare both power spectra and single-trial discrimination performance for the two conditions. We find that EEG activity acquired inside the MR scanner during echo planar image acquisition is of high enough quality to enable single-trial discrimination performance that is 95% of that acquired outside the scanner. We conclude that EEG acquired simultaneously with fMRI is of high enough fidelity to permit single-trial analysis.

  • Classifying single-trial ERPs from visual and frontal cortex during free viewing

    January 2, 2017 | A. Tang, M.T. Sutherland, C.J. McKinney, L. Jing-Yu, W. Yan, L. Parra, Adam Gerson, Paul Sajda

    Event-related potentials (ERPs) recorded at the scalp are indicators of brain activity associated with event-related information processing; hence they may be suitable for the assessment of changes in cognitive processing load. While measuring and classifying ERPs in a laboratory setting is straightforward, such a task presents major challenges in a “real world” setting where the EEG signals are recorded while subjects freely move their eyes and the sensory inputs are continuously, as opposed to discretely, presented. Here we demonstrate that with the aid of second-order blind identification (SOBI), a blind source separation (BSS) algorithm: (1) we can extract ERPs from such challenging data sets; (2) we can obtain meaningful single-trial ERPs in addition to averaged ERPs; and (3) we can estimate the spatial origins of these ERPs. Finally, using back-propagation neural networks as classifiers, we show that these single-trial ERPs from specific brain regions can be used to determine moment-to-moment changes in cognitive processing load during a complex “real world” task.

  • Analysis of a gain control model of V1: Is the goal redundancy reduction?

    January 2, 2017 | Jianing Shi, Jim Wielaard, Paul Sajda

    In this paper we analyze a popular divisive normalization model of V1 with respect to the relationship between its underlying coding strategy and the extraclassical physiological responses of its constituent modeled neurons. Specifically we are interested in whether the optimization goal of redundancy reduction naturally leads to reasonable neural responses, including reasonable distributions of responses. The model is trained on an ensemble of natural images and tested using sinusoidal drifting gratings, with metrics such as suppression index and contrast dependent receptive field growth compared to the objective function values for a sample of neurons. We find that even though the divisive normalization model can produce “typical” neurons that agree with some neurophysiology data, distributions across samples do not agree with experimental data. Our results suggest that redundancy reduction itself is not necessarily causal of the observed extraclassical receptive field phenomena, and that additional optimization dimensions and/or biological constraints must be considered.

  • Using single-trial EEG to estimate the timing of target onset during rapid serial visual presentation

    January 2, 2017 | An Luo, Paul Sajda

    The timing of a behavioral response, such as a button press in reaction to a visual stimulus, is highly variable across trials. In this paper we describe a methodology for single-trial analysis of electroencephalography (EEG) which can be used to reduce the error in the estimation of the timing of the behavioral response and thus reduce the error in estimating the onset time of the stimulus. We consider a rapid serial visual presentation (RSVP) paradigm consisting of concatenated video clips in which subjects are instructed to respond when they see a predefined target. We show that a linear discriminator, with inputs distributed across sensors and time and chosen via an information-theoretic feature selection criterion, can be used in conjunction with the response to yield a lower-error estimate of the onset time of the target stimulus compared to the response time alone. We compare our results to response time and to previous EEG approaches using fixed windows in time, showing that our method has the lowest estimation error. We discuss potential applications, specifically with respect to cortically-coupled computer vision based triage of large image databases.
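A rough scikit-learn analogue of the discriminator construction (information-theoretic feature selection over sensor-time inputs followed by a linear discriminant), on synthetic stand-in data rather than the RSVP recordings:

```python
# Select high mutual-information sensor-time features, then apply a
# linear discriminant to detect target-present trials.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 64 * 10))   # 64 sensors x 10 time bins
y = rng.integers(0, 2, 200)               # target present / absent
X[y == 1, :20] += 1.0                     # signal in a few sensor-time bins

pipe = make_pipeline(SelectKBest(mutual_info_classif, k=30),
                     LinearDiscriminantAnalysis())
acc = cross_val_score(pipe, X, y, cv=5).mean()
print(round(acc, 2))                      # detection well above chance
```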

  • Neural mechanisms of contrast dependent receptive field size in V1

    January 2, 2017 | Jim Wielaard, Paul Sajda

    Based on a large-scale spiking neuron model of the input layers 4Cα and 4Cβ of macaque, we identify neural mechanisms for the observed contrast dependent receptive field size of V1 cells. We observe a rich variety of mechanisms for the phenomenon and analyze them based on the relative gain of excitatory and inhibitory synaptic inputs. We observe an average growth in the spatial extent of excitation and inhibition for low contrast, as predicted from phenomenological models. However, contrary to phenomenological models, our simulation results suggest this is neither sufficient nor necessary to explain the phenomenon.

  • Spatio-temporal linear discrimination for inferring task difficulty from EEG

    January 2, 2017 | An Luo, Paul Sajda

    We present a spatio-temporal linear discrimination method for single-trial classification of multi-channel electroencephalography (EEG). No prior information about the characteristics of the neural activity is required, i.e., the algorithm requires no knowledge of the timing and/or spatial distribution of the evoked responses. The algorithm finds a temporal delay/window onset time for each EEG channel and then spatially integrates the channels for each channel-specific onset time. The algorithm can be seen as learning discrimination trajectories defined within the space of EEG channels. We demonstrate the method for detecting auditory evoked neural activity and discriminating task difficulty in a complex visual-auditory environment.

  • Consistency of extracellular and intracellular classification of simple and complex cells

    Using a rectification model and an experimentally measured distribution of the extracellular modulation ratio (F1/F0), we investigate the consistency between extracellular and intracellular modulation metrics for classifying cells in primary visual cortex (V1). We first demonstrate that the shape of the distribution of the intracellular metric χ is sensitive to the specific form of the bimodality observed in F1/F0. When the proper mapping between F1/F0 and χ is applied to the experimentally measured F1/F0 data, χ is weakly bimodal. We then use a two-class mixture model to estimate physiological response parameters given the F1/F0 distribution. We show, once again, that a weak bimodality is present in χ. Finally, using the estimated parameters for the two cell classes, we show that simple and complex cell class assignment in F1/F0 is more-or-less preserved in a heavy-tailed f1/f0 distribution, with complex cells being in the core of the f1/f0 distribution and simple cells in the tail (misclassification error in f1/f0 = 19%). Class assignment in f1/f0 is likewise consistent (misclassification error in F1/F0 = 15%). Our results provide computational support for the conclusion that extracellular and intracellular metrics are relatively consistent measures for classifying cells in V1 as either simple or complex.
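For reference, the modulation ratio itself is simple to compute; a sketch on idealized cycle-averaged responses (a half-wave rectified, simple-cell-like response versus a weakly modulated, complex-cell-like one), not the experimental data:

```python
# F1/F0: first-harmonic amplitude of the cycle-averaged response divided
# by its mean firing rate.
import numpy as np

def f1_f0(response):
    """Modulation ratio from one stimulus cycle of firing rate."""
    f0 = response.mean()
    f1 = 2 * np.abs(np.fft.rfft(response)[1]) / len(response)
    return f1 / f0

t = np.linspace(0, 2 * np.pi, 256, endpoint=False)
simple = np.maximum(np.sin(t), 0)         # half-wave rectified: ratio pi/2
complex_like = 1.0 + 0.1 * np.sin(t)      # weakly modulated: ratio 0.1
print(round(f1_f0(simple), 2), round(f1_f0(complex_like), 2))  # 1.57 0.1
```

The half-rectified sine gives F1/F0 = π/2 ≈ 1.57 (above the conventional simple/complex boundary of 1), while the weakly modulated response gives 0.1.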

  • Recovery of metabolomic spectral sources using non-negative matrix factorization

    January 2, 2017 | S. Du, Paul Sajda, R. Stoyanova, Truman R. Brown

    1H magnetic resonance spectra (MRS) of biofluids contain rich biochemical information about the metabolic status of an organism. Through the application of pattern recognition and classification algorithms, such data have been shown to provide information for disease diagnosis as well as the effects of potential therapeutics. In this paper we describe a novel approach, using non-negative matrix factorization (NMF), for rapidly identifying metabolically meaningful spectral patterns in 1H MRS. We show that the intensities of these identified spectral patterns can be related to the onset of, and recovery from, toxicity in both a time-related and dose-related fashion. These patterns can be seen as a new type of biomarker for the biological effect under study. We demonstrate, using k-means clustering, that the recovered patterns can be used to characterize the metabolic status of the animal during the experiment.

  • Large-scale simulation of the visual cortex: Classical and extraclassical phenomena

    January 2, 2017 | Jim Wielaard, Paul Sajda
  • Comparison of supervised and unsupervised linear methods for recovering task-relevant activity in EEG

    January 2, 2017 | An Luo, Adam Gerson, Paul Sajda

    In this paper we compare three linear methods, independent component analysis (ICA), common spatial patterns (CSP), and linear discrimination (LD) for recovering task relevant neural activity from high spatial density electroencephalography (EEG). Each linear method uses a different objective function to recover underlying source components by exploiting statistical structure across a large number of sensors. We test these methods using a dual-task event-related paradigm. While engaged in a primary task, subjects must detect infrequent changes in the visual display, which would be expected to evoke several well-known event-related potentials (ERPs), including the N2 and P3. We find that though each method utilizes a different objective function, they in fact yield similar components. We note that one advantage of the LD approach is that the recovered component is easily interpretable, namely it represents the component within a given time window which is most discriminating for the task, given a spatial integration of the sensors. Both ICA and CSP return multiple components, of which the most discriminating component may not be the first. Thus, for these methods, visual inspection or additional processing is required to determine the significance of these components for the task.
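The finding that the different objective functions converge on similar components can be illustrated on toy mixtures (FastICA versus a linear discriminant; CSP is omitted for brevity, and all data are synthetic):

```python
# Compare the task-related component recovered unsupervised (ICA) with
# the one recovered supervised (linear discriminant).
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
n_trials, n_sensors = 500, 8
y = rng.integers(0, 2, n_trials)
source = 2.0 * y + rng.standard_normal(n_trials)   # task-related source
noise = rng.standard_normal((n_trials, 3))         # task-unrelated sources
S = np.column_stack([source, noise])
A = rng.standard_normal((4, n_sensors))
X = S @ A + 0.01 * rng.standard_normal((n_trials, n_sensors))

ica = FastICA(n_components=4, random_state=0).fit(X)
ica_comps = ica.transform(X)
ld = LinearDiscriminantAnalysis().fit(X, y)
ld_comp = X @ ld.coef_.ravel()
best = max(abs(np.corrcoef(ld_comp, c)[0, 1]) for c in ica_comps.T)
print(round(best, 2))   # some ICA component closely matches the LD one
```

As the abstract notes, the supervised component is directly interpretable, whereas with ICA one must search the returned components for the task-related one, here by correlation.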

  • Multi-resolution hierarchical blind recovery of biochemical markers of brain cancer in MRSI

    January 2, 2017 | S. Du, Paul Sajda, X. Mao, D.C. Shungu

    We present a multi-resolution hierarchical application of the constrained non-negative matrix factorization (cNMF) algorithm for blindly recovering constituent source spectra in magnetic resonance spectroscopic imaging (MRSI). cNMF is an extension of non-negative matrix factorization (NMF) that includes a positivity constraint on amplitudes of recovered spectra. We apply cNMF hierarchically, with spectral recovery and subspace reduction constraining which observations are used in the next level of processing. The decomposition model recovers physically meaningful spectra which are highly tissue-specific, for example spectra indicative of tumor proliferation, given a processing hierarchy that proceeds coarse-to-fine. We demonstrate the decomposition procedure on 1H long TE brain MRS data. The results show recovery of markers for normal brain tissue, low proliferative tissue and highly proliferative tissue. The coarse-to-fine hierarchy also makes the algorithm computationally efficient, thus it is potentially well-suited for use in diagnostic work-up.

  • Blind recovery of biochemical markers of brain cancer in MRSI

    January 2, 2017 | S. Du, X. Mao, D.C. Shungu, Paul Sajda

    We present an algorithm for blindly recovering constituent source spectra from magnetic resonance spectroscopic imaging (MRSI) of human brain. The algorithm is based on the non-negative matrix factorization (NMF) algorithm, extending it to include a constraint on the positivity of the amplitudes of the recovered spectra and mixing matrices. This positivity constraint enables recovery of physically meaningful spectra even in the presence of noise that causes a significant number of the observation amplitudes to be negative. The algorithm, which we call constrained non-negative matrix factorization (cNMF), does not enforce independence or sparsity, though it recovers sparse sources quite well. It can be viewed as a maximum likelihood approach for finding basis vectors in a bounded subspace. In this case the optimal basis vectors are the ones that envelope the observed data with a minimum deviation from the boundaries. We incorporate the cNMF algorithm into a hierarchical decomposition framework, showing that it can be used to recover tissue-specific spectra, e.g., spectra indicative of malignant tumor. We demonstrate the hierarchical procedure on 1H long echo time (TE) brain absorption spectra and conclude that the computational efficiency of the cNMF algorithm makes it well-suited for use in diagnostic work-up.
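A minimal stand-in for the decomposition step, using scikit-learn's standard NMF on synthetic peaked spectra (the paper's cNMF additionally tolerates observation amplitudes driven negative by noise, which standard NMF does not):

```python
# Recover two non-negative "metabolite" source spectra from mixtures.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(5)
freq = np.linspace(0, 1, 200)
s1 = np.exp(-((freq - 0.3) / 0.02) ** 2)       # peaked source spectrum 1
s2 = np.exp(-((freq - 0.7) / 0.02) ** 2)       # peaked source spectrum 2
S = np.vstack([s1, s2])
C = rng.random((100, 2))                       # non-negative concentrations
V = C @ S                                      # observed voxel spectra

model = NMF(n_components=2, init="nndsvd", max_iter=1000)
W = model.fit_transform(V)                     # estimated concentrations
H = model.components_                          # recovered source spectra
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(round(err, 4))                           # near-perfect reconstruction
```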

  • Inferring direction of figure using a recurrent integrate-and-fire neural circuit

    January 2, 2017 | Kay Baek, D.H. Kim, Paul Sajda

    Several theories of early visual perception hypothesize neural circuits that are responsible for assigning ownership of an object’s occluding contour to a region which represents the “figure”. Previously, we presented a Bayesian network model which integrates multiple cues and uses belief propagation to infer direction of figure (DOF) along an object’s occluding contour. In this paper, we use a linear integrate-and-fire model to demonstrate how such inference mechanisms could be carried out in a biologically realistic neural circuit. The circuit, modeled after the network proposed by Rao, maps the membrane potentials of individual neurons to log probabilities and uses recurrent connections to represent transition probabilities. The network’s “perception” of DOF is demonstrated for several examples, including perceptually ambiguous figures, with results qualitatively consistent with human perception.

  • A probabilistic network model for integrating visual cues and inferring intermediate-level representations

    January 2, 2017 | Kay Baek, Paul Sajda

    Psychophysical data have demonstrated that our visual system must integrate multiple, spatially local and non-local cues to construct the visual scene. In this paper we describe a probabilistic network model which integrates visual cues to infer intermediate-level visual representations. We demonstrate the network model for two example problems: inferring “direction of figure” (DOF) [15] and estimating perceived velocity. One can consider the assignment of DOF as essentially a problem in probabilistic inference, with DOF being a hidden variable, assigning “ownership” of an object’s occluding boundary to a region which represents the “figure”. The DOF is not directly observed but can potentially be inferred from local observations and “message passing”. For example, our model combines contour convexity and similarity/proximity cues to form observations, with belief propagation (BP) used to integrate these observations with state probabilities to infer the DOF. We extend the network model, integrating form and motion streams, to explain the coherence based motion effects first demonstrated by McDermott et al. [11]. The extended model consists of two interacting network chains (streams), one for inferring DOF and the other for inferring scene motion. The local figure-ground relationships estimated in the DOF stream are subsequently used by the motion stream as evidence for surface occlusion, modulating the covariance of a Gaussian distribution used to model the velocity at apertures located at junction points. The distribution of scene motion ultimately is represented in velocity space as a mixture of these form-modulated Gaussians. Simulation results show that the network’s integration of cues can account for several examples of perceptual ambiguity in DOF, consistent with human perception. Also, the integration of form and motion representations qualitatively accounts for psychophysical results showing surface dependent motion coherence of oscillating edges [11]. We also show that the model naturally integrates top-down cues, leading to perceptual bias in interpreting ambiguous figures, such as Rubin’s vase, as well as bias in the perceived coherence of object motion.

  • Recovery of constituent spectra in 3D chemical shift imaging using non-negative matrix factorization

    January 2, 2017 | Paul Sajda, S. Du, L. Parra, R. Stoyanova, Truman R. Brown

    In this paper we describe a non-negative matrix factoriza- tion (NMF) for recovering constituent spectra in 3D chem- ical shift imaging (CSI). The method is based on the NMF algorithm of Lee and Seung (1), extending it to include a constraint on the minimum amplitude of the recovered spectra. This constrained NMF (cNMF) algorithm can be viewed as a maximum likelihood approach for finding ba- sis vectors in a bounded subspace. In this case the opti- mal basis vectors are the ones that envelope the observed data with a minimum deviation from the boundaries. Re- sults for P human brain data are compared to Bayesian Spectral Decomposition (BSD) (2) which considers a full Bayesian treatment of the source recovery problem and re- quires computationally expensive Monte Carlo methods. The cNMF algorithm is shown to recover the same con- stituent spectra as BSD, however in about less com- putational time.

  • Converging evidence of independent sources in EEG

    January 2, 2017 | L. Parra, Paul Sajda

    Blind source separation (BSS) has been proposed as a method to analyze multi-channel electroencephalography (EEG) data. A basic issue in applying BSS algorithms is the validity of the independence assumption. We investigate whether EEG can be considered to be a linear combination of independent sources. Linear BSS can be obtained under the assumptions of non-Gaussian, non-stationary, or non-white independent sources. If the linear independence hypothesis is violated, these three different conditions will not necessarily lead to the same result. We show, using 64 channel EEG data, that different algorithms which incorporate the three different assumptions lead to the same results, thus supporting the linear independence hypothesis.

  • Perceptual salience as novelty detection in cortical pinwheel space

    December 30, 2016 | Paul Sajda, F. Han

    We describe a filter-based model of orientation processing in primary visual cortex (V1) and demonstrate that novelty in cortical “pinwheel” space can be used as a measure of perceptual salience. In the model, novelty is computed as the negative log likelihood of a pinwheel’s activity relative to the population response. The population response is modeled using a mixture of Gaussians, enabling the representation of complex, multi-modal distributions. Hidden variables that are inferred in the mixture model can be viewed as grouping or “binding” pinwheels which have similar responses within the distribution space. Results are shown for several stimuli that illustrate well-known contextual effects related to perceptual salience, as well as results for a natural image.
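    The novelty computation can be illustrated with a fixed one-dimensional mixture of Gaussians standing in for the modeled population response; the weights, means, and response values below are hypothetical, and the model's actual pinwheel responses are multi-dimensional.

```python
import numpy as np

def mixture_nll(x, weights, means, variances):
    """Novelty of responses x as the negative log-likelihood under a
    1-D mixture of Gaussians modeling the population response."""
    x = np.asarray(x, float)[:, None]
    w = np.asarray(weights)[None, :]
    mu = np.asarray(means)[None, :]
    var = np.asarray(variances)[None, :]
    comp = w * np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return -np.log(comp.sum(axis=1))

# Hypothetical population response: two clusters of pinwheel activity.
weights, means, variances = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]
responses = [0.1, 5.2, 2.5]     # the last lies between the two modes
novelty = mixture_nll(responses, weights, means, variances)
```

    Responses near either mode of the population distribution get low novelty, while the response between the modes is poorly explained and scores as more salient.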

  • High-throughput image search via single-trial event detection in a rapid serial visual presentation task

    December 30, 2016 | Paul Sajda, Adam Gerson, L. Parra

    We describe a method, using linear discrimination, for detecting single-trial EEG signatures of object recognition events in a rapid serial visual presentation (RSVP) task. We record EEG using a high spatial density array (87 electrodes) during the rapid presentation (50-200 msec per image) of natural images. Subjects were instructed to release a button when they recognized a target image (an image with a person/people). Trials consisted of 100 images each, with a 50% chance of a single target being in a trial. Subject EEG was analyzed on a single-trial basis with an optimal spatial linear discriminator learned at multiple time windows after the presentation of an image. Linear discrimination enables the estimation of a forward model and thus allows for an approximate localization of the discriminating activity. Results show multiple loci for discriminating activity (e.g. motor and visual). Using these detected EEG signatures, we show that in many cases we can detect targets more accurately than the overt response (button release) and that such signatures can be used to prioritize images for high-throughput search.
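    A rough sketch of a spatial linear discriminator and its forward model for one time window, on simulated "EEG": the least-squares weights, the electrode count, and the forward-model projection a = Xᵀy / (yᵀy) are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_elec = 200, 8
# Hypothetical single-window data: targets carry a fixed spatial pattern.
pattern = rng.standard_normal(n_elec)
labels = np.repeat([1.0, 0.0], n_trials // 2)
X = rng.standard_normal((n_trials, n_elec)) + np.outer(labels, pattern)

# Least-squares spatial discriminator for this time window.
w = np.linalg.lstsq(X, labels - labels.mean(), rcond=None)[0]
y = X @ w                                  # discriminant component
a = (X.T @ y) / (y @ y)                    # forward model (scalp projection)

acc = np.mean((y > np.median(y)) == (labels == 1))
```

    The forward model recovers something close to the spatial pattern driving the discrimination, which is what allows approximate localization of the discriminating activity.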

  • Simulated optical imaging of orientation preference in a model of V1

    December 30, 2016 | Paul Sajda, Jim Wielaard

    Optical imaging studies have played an important role in mapping the orientation selectivity and ocular dominance of neurons across an extended area of primary visual cortex (V1). Such studies have produced images with a more or less smooth and regular spatial distribution of relevant neuronal response properties. This is in spite of the fact that results from electrophysiological recordings, though limited in their number and spatial distribution, show significant scatter/variability in the relevant response properties of nearby neurons. In this paper we present a simulation of the optical imaging experiments of ocular dominance and orientation selectivity using a computational model of the primary visual cortex. The simulations assume that the optical imaging signal is proportional to the averaged response of neighboring neurons. The model faithfully reproduces ocular dominance columns and orientation pinwheels in the presence of realistic scatter of single cell preferred responses. In addition, we find the simulated optical imaging of orientation pinwheels to be remarkably robust, with the pinwheel structure maintained even under substantial added random scatter in the orientation preference of single cells. Our results suggest that an optical imaging result does not necessarily, by itself, provide any obvious upper bound for the scatter of the underlying neuronal response properties on local scales.

  • Mechanisms for surround suppression in a Spiking Neuron Model of Macaque Striate Cortex (V1)

    December 30, 2016 | Paul Sajda, Jim Wielaard
  • Recovery of constituent spectra using non-negative matrix factorization

    December 30, 2016 | Paul Sajda, S. Du, L. Parra

    In this paper a constrained non-negative matrix factorization (cNMF) algorithm for recovering constituent spectra is described together with experiments demonstrating the broad utility of the approach. The algorithm is based on the NMF algorithm of Lee and Seung, extending it to include a constraint on the minimum amplitude of the recovered spectra. This constraint enables the algorithm to deal with observations having negative values by assuming they arise from the noise distribution. The cNMF algorithm does not explicitly enforce independence or sparsity, instead only requiring the source and mixing matrices to be non-negative. The algorithm is very fast compared to other “blind” methods for recovering spectra. cNMF can be viewed as a maximum likelihood approach for finding basis vectors in a bounded subspace. In this case the optimal basis vectors are the ones that envelope the observed data with a minimum deviation from the boundaries. Results for Raman spectral data, hyperspectral images, and 31P human brain data are provided to illustrate the algorithm’s performance.

  • Spatial signatures of visual object recognition events learned from single-trial analysis of EEG

    December 30, 2016 | Paul Sajda, Adam Gerson, L. Parra

    In this paper we use linear discrimination for learning EEG signatures of object recognition events in a rapid serial visual presentation (RSVP) task. We record EEG using a high spatial density array (63 electrodes) during the rapid presentation (50-200 msec per image) of natural images. Each trial consists of 100 images, with a 50% chance of a single target being in a trial. Subjects are instructed to press a left mouse button at the end of the trial if they detected a target image, otherwise they are instructed to press the right button. Subject EEG was analyzed on a single-trial basis with an optimal spatial linear discriminator learned at multiple time windows after the presentation of an image. Analysis of discrimination results indicated a periodic fluctuation (time-localized oscillation) in Az performance. Analysis of the EEG using the discrimination components learned at the peaks of the Az fluctuations indicates 1) the presence of a positive evoked response, followed in time by a negative evoked response in strongly overlapping areas and 2) a component which is not correlated with the discriminator learned during the time-localized fluctuation. Results suggest that multiple signatures, varying over time, may exist for discriminating between target and distractor trials.

  • Capturing contextual dependencies in medical imagery using hierarchical multi-scale models

    December 30, 2016 | Paul Sajda, C. Spence, L. Parra

    In this paper we summarize our results for two classes of hierarchical multi-scale models that exploit contextual information for detection of structure in mammographic imagery. The first model, the hierarchical pyramid neural network (HPNN), is a discriminative model which is capable of integrating information either coarse-to-fine or fine-to-coarse for microcalcification and mass detection. The second model, the hierarchical image probability (HIP) model, captures short-range and contextual dependencies through a combination of coarse-to-fine factoring and a set of hidden variables. The HIP model, being a generative model, has broad utility, and we present results for classification, synthesis and compression of mammographic mass images. The two models demonstrate the utility of the hierarchical multi-scale framework for computer assisted detection and diagnosis.

  • Higher-order statistical properties arising from the non-stationarity of natural signals

    December 30, 2016 | L. Parra, C. Spence, Paul Sajda

    We present evidence that several higher-order statistical properties of natural images and signals can be explained by a stochastic model which simply varies the scale of an otherwise stationary Gaussian process. We discuss two interesting consequences. The first is that a variety of natural signals can be related through a common model of spherically invariant random processes, which have the attractive property that the joint densities can be constructed from the one-dimensional marginal. The second is that in some cases the non-stationarity assumption can be exploited, using only second-order methods, to find a linear basis that is equivalent to independent components obtained with higher-order methods. This is demonstrated on spectro-temporal components of speech.
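    The scale-modulation idea can be demonstrated numerically: a stationary Gaussian has near-zero excess kurtosis, while multiplying it by a random positive scale produces the heavy tails characteristic of natural signals. The lognormal, i.i.d. scale below is an illustrative choice; natural signals vary their scale slowly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
g = rng.standard_normal(n)            # stationary Gaussian process
s = np.exp(rng.standard_normal(n))    # random positive scale (lognormal)
x = s * g                             # scale-modulated signal

def excess_kurtosis(v):
    """Sample excess kurtosis; zero for a Gaussian."""
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

k_gauss = excess_kurtosis(g)
k_mod = excess_kurtosis(x)
```

    The modulated signal's excess kurtosis is large and positive even though, conditioned on the scale, every sample is Gaussian.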

  • Detection, synthesis and compression in mammographic image analysis with a hierarchical image probability model

    December 30, 2016 | C. Spence, L. Parra, Paul Sajda

    We develop a probability model over image spaces and demonstrate its broad utility in mammographic image analysis. The model employs a pyramid representation to factor images across scale and a tree-structured set of hidden variables to capture long-range spatial dependencies. This factoring makes the computation of the density functions local and tractable. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters are found with maximum likelihood estimation using the EM algorithm. The utility of the model is demonstrated for three applications: 1) detection of mammographic masses in computer-aided diagnosis, 2) qualitative assessment of model structure through mammographic synthesis, and 3) compression of mammographic regions of interest.

  • Unmixing hyperspectral data

    December 30, 2016 | L. Parra, K-R Mueller, C. Spence, A. Ziehe, Paul Sajda

    In hyperspectral imagery one pixel typically consists of a mixture of the reflectance spectra of several materials, where the mixture coefficients correspond to the abundances of the constituting materials. We assume linear combinations of reflectance spectra with some additive normal sensor noise and derive a probabilistic MAP framework for analyzing hyperspectral data. As the material reflectance characteristics are not known a priori, we face the problem of unsupervised linear unmixing. The incorporation of different prior information (e.g. positivity and normalization of the abundances) naturally leads to a family of interesting algorithms, in one case yielding an algorithm that can be understood as constrained independent component analysis (ICA). Simulations underline the usefulness of our theory.
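    A simplified sketch of the positivity-constrained step, assuming the endmember spectra are known (the paper treats them as unknown, which is the harder, unsupervised problem); projected gradient descent enforces the non-negativity prior on the abundances. The spectra and mixing weights below are synthetic.

```python
import numpy as np

def unmix(pixel, E, n_iter=1000):
    """Estimate non-negative abundances a with pixel ~ E @ a by
    projected gradient descent (positivity prior on abundances)."""
    k = E.shape[1]
    a = np.full(k, 1.0 / k)
    lr = 1.0 / np.linalg.norm(E.T @ E, 2)   # step from the spectral norm
    for _ in range(n_iter):
        a -= lr * (E.T @ (E @ a - pixel))
        a = np.maximum(a, 0.0)              # enforce positivity
    return a

# Two hypothetical endmember reflectance spectra over 50 bands.
t = np.linspace(0, 1, 50)
E = np.column_stack([0.5 + 0.5 * np.sin(3 * t), 0.5 + 0.5 * np.cos(2 * t)])
true_a = np.array([0.7, 0.3])
pixel = E @ true_a + 0.001 * np.random.default_rng(0).standard_normal(50)
a_hat = unmix(pixel, E)
```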

  • Mammographic mass detection with a hierarchical image probability (HIP) model

    December 28, 2016 | C. Spence, L. Parra, Paul Sajda

    We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To fix this, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found with maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problems of detecting masses in mammograms.

  • Hierarchical multi-resolution models for object recognition: Applications to mammographic computer-aided diagnosis

    December 28, 2016 | Paul Sajda, C. Spence, L. Parra, RM Nishikawa

    A fundamental problem in image analysis is the integration of information across scale to detect and classify objects. We have developed, within a machine learning framework, two classes of multiresolution models for integrating scale information for object detection and classification: a discriminative model called the hierarchical pyramid neural network and a generative model called a hierarchical image probability model. Using receiver operating characteristic analysis, we show that these models can significantly reduce the false positive rates for a well-established computer-aided diagnosis system.

  • Hierarchical image probability (HIP) models

    December 28, 2016 | C. Spence, L. Parra, Paul Sajda

    We formulate a model for probability distributions on image spaces. We show that any distribution of images can be factored exactly into conditional distributions of feature vectors at one resolution (pyramid level) conditioned on the image information at lower resolutions. We would like to factor this over positions in the pyramid levels to make it tractable, but such factoring may miss long-range dependencies. To capture long-range dependencies, we introduce hidden class labels at each pixel in the pyramid. The result is a hierarchical mixture of conditional probabilities, similar to a hidden Markov model on a tree. The model parameters can be found with maximum likelihood estimation using the EM algorithm. We have obtained encouraging preliminary results on the problems of detecting various objects in SAR images and target recognition in optical aerial images.

  • The role of feature selection in building pattern recognizers for computer-aided diagnosis

    December 28, 2016 | C. Spence, Paul Sajda

    In this paper we explore the use of feature selection techniques to improve the generalization performance of pattern recognizers for computer-aided diagnosis. We apply a modified version of the sequential forward floating selection (SFFS) of Pudil et al. to the problem of selecting an optimal feature subset for mass detection in digitized mammograms. The complete feature set consists of multi-scale tangential and radial gradients in the mammogram region of interest. We train a simple multi-layer perceptron (MLP) using the SFFS algorithm and compare its performance, using a jackknife procedure, to an MLP trained on the complete feature set (35 features). Results indicate that a variable number of features is chosen in each of the jackknife sets (12 +/- 4) and the test performance, Az, using the chosen feature subset is no better than the performance using the entire feature set. These results may be attributed to the fact that the feature set is noisy and the data set used for training/testing is small. We next modify the feature selection technique by using the results of the jackknife to compute the frequency at which different features are selected. We construct a classifier by choosing the top N features, selected most frequently, which maximize performance on the training data. We find that by adding this 'hand-tuning' component to the feature selection process, we can reduce the feature set from 35 to 8 features and at the same time have a statistically significant increase in generalization performance (p < 0.015).
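    The frequency-based selection step can be sketched as follows; the greedy correlation ranking is a toy stand-in for the modified SFFS, and the data, feature count, and fold scheme below are synthetic assumptions.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
n, d = 120, 10
X = rng.standard_normal((n, d))
# Hypothetical labels driven by features 0 and 3 plus noise.
y = (X[:, 0] + X[:, 3] + 0.5 * rng.standard_normal(n) > 0).astype(float)

def select_features(X, y, k=3):
    """Toy stand-in for SFFS: greedily pick the k features most
    correlated with the label (the paper uses a modified SFFS)."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return list(np.argsort(scores)[::-1][:k])

# Count how often each feature is chosen across jackknife splits,
# then keep the most frequently selected features.
counts = Counter()
for i in range(10):
    mask = np.ones(n, bool)
    mask[i * (n // 10):(i + 1) * (n // 10)] = False   # hold out one block
    counts.update(select_features(X[mask], y[mask]))

top2 = [f for f, _ in counts.most_common(2)]
```

    Features that drive the label are selected in every split, so selection frequency separates them cleanly from noise features even when any single split's subset is unstable.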

  • Multi-resolution neural networks for mammographic mass detection

    December 28, 2016 | C. Spence, Paul Sajda

    We have previously presented a hierarchical pyramid/neural network (HPNN) architecture which combines multi-scale image processing techniques with neural networks. This coarse-to-fine HPNN was designed to learn large-scale context information for detecting small objects. We have developed a similar architecture to detect mammographic masses (malignant tumors). Since masses are large, extended objects, the coarse-to-fine HPNN architecture is not suitable for the problem. Instead we constructed a fine-to-coarse HPNN architecture which is designed to learn small-scale detail structure associated with the extended objects. Our initial results applying the fine-to-coarse HPNN to mass detection are encouraging, with detection performance improvements of about 30%. We conclude that the ability of the HPNN architecture to integrate information across scales, from fine to coarse in the case of masses, makes it well suited for detecting objects which may have detail structure occurring at scales other than the natural scale of the object.

  • Applications of multi-resolution neural networks to mammography

    December 28, 2016 | Paul Sajda, C. Spence

    We have previously presented a coarse-to-fine hierarchical pyramid/neural network (HPNN) architecture which combines multi-scale image processing techniques with neural networks. In this paper we present applications of this general architecture to two problems in mammographic Computer-Aided Diagnosis (CAD). The first application is the detection of microcalcifications. The coarse-to-fine HPNN was designed to learn large-scale context information for detecting small objects like microcalcifications. Receiver operating characteristic (ROC) analysis suggests that the hierarchical architecture improves detection performance of a well established CAD system by roughly 50%. The second application is to detect mammographic masses directly. Since masses are large, extended objects, the coarse-to-fine HPNN architecture is not suitable for this problem. Instead we construct a fine-to-coarse HPNN architecture which is designed to learn small-scale detail structure associated with the extended objects. Our initial results applying the fine-to-coarse HPNN to mass detection are encouraging, with detection performance improvements of about 36%. We conclude that the ability of the HPNN architecture to integrate information across scales, both coarse-to-fine and fine-to-coarse, makes it well suited for detecting objects which may have contextual clues or detail structure occurring at scales other than the natural scale of the object.

  • Dealing with uncertainty and error in truth data when training neural networks for computer-aided diagnosis application

    December 28, 2016 | C. Spence, Paul Sajda, RM Nishikawa
  • Exploiting context in mammograms: a hierarchical neural network for detecting microcalcifications

    December 28, 2016 | Paul Sajda, C. Spence, J. Pearson, RM Nishikawa

    Microcalcifications are important cues used by radiologists for early detection of breast cancer. Individually, microcalcifications are difficult to detect, and often contextual information (e.g. clustering, location relative to ducts) can be exploited to aid in their detection. We have developed an algorithm for constructing a hierarchical pyramid/neural network (HPNN) architecture to automatically learn context information for detection. To test the HPNN we first examined if the hierarchical architecture improves detection of individual microcalcifications and if context is in fact extracted by the network hierarchy. We compared the performance of our hierarchical architecture versus a single neural network receiving input from all resolutions of a feature pyramid. Receiver operating characteristic (ROC) analysis shows that the hierarchical architecture reduces false positives by a factor of two. We examined hidden units at various levels of the processing hierarchy and found what appears to be representations of ductal location. We next investigated the utility of the HPNN if integrated as part of a complete computer-aided diagnosis (CAD) system for microcalcification detection, such as that being developed at the University of Chicago. Using ROC analysis, we tested the HPNN’s ability to eliminate false positive regions of interest generated by the computer, comparing its performance to the neural network currently used in the Chicago system. The HPNN achieves an area under the ROC curve (Az) of 0.94 and a false positive fraction (FPF) of 0.21 at a true positive fraction (TPF) of 1.0. This is in comparison to the results reported for the Chicago network: Az of 0.91 and FPF of 0.43 at TPF of 1.0. These differences are statistically significant. We conclude that the HPNN algorithm is able to utilize contextual information for improving microcalcification detection and potentially reduce the false positive rates in CAD systems.

  • Integrating multi-resolution and contextual information for improved microcalcification detection

    December 28, 2016 | Paul Sajda, C. Spence, J. Pearson, RM Nishikawa
  • Comparison of gender recognition by PDP and radial basis function network

    December 28, 2016 | S.C. Yen, Paul Sajda, L. H. Finkel

    Despite a long history of neurological, psychological, and computational efforts, no satisfactory explanation has been offered for the extraordinary ability of humans to recognize other human faces. However, a number of different network-based approaches (Turk and Pentland, 1991; Brunelli and Poggio, 1993; Buhmann et al., 1989) have achieved surprisingly good ability to recognize faces, at least under certain restricted conditions. We decided to compare the solutions developed by different network architectures including PDP and radial basis function (RBF) networks to the problem of gender classification. Given a picture of a face, including external features such as hair, beard, jewelry, etc., the network must learn to distinguish male from female. This is a simpler problem than general face recognition, and there is some evidence that it is carried out by a separate population of cells in the inferior temporal cortex (Damasio et al., 1990). Several investigators have previously applied PDP networks to the problem of gender classification (Golomb et al., 1989; Cottrell and Metcalfe, 1989). However, the hidden unit representations developed in those models were not analyzed in detail. Moreover, we wanted to directly compare the representations developed by different types of networks (PDP, RBF) when confronted with the exact same training and test set.

  • Extracting contextual information in digital imagery: Applications to automatic target recognition and mammography

    December 28, 2016 | C. Spence, Paul Sajda, J. Pearson

    An important problem in image analysis is finding small objects in large images. The problem is challenging because (1) searching a large image is computationally expensive, and (2) small targets (on the order of a few pixels in size) have relatively few distinctive features which enable them to be distinguished from non-targets. To overcome these challenges we have developed a hierarchical neural network (HNN) architecture which combines multi-resolution pyramid processing with neural networks. The advantages of the architecture are: (1) both neural network training and testing can be done efficiently through coarse-to-fine techniques, and (2) such a system is capable of learning low-resolution contextual information to facilitate the detection of small target objects. We have applied this neural network architecture to two problems in which contextual information appears to be important for detecting small targets. The first problem is one of automatic target recognition (ATR), specifically the problem of detecting buildings in aerial photographs. The second problem focuses on a medical application, namely searching mammograms for microcalcifications, which are cues for breast cancer. Receiver operating characteristic (ROC) analysis suggests that the hierarchical architecture improves the detection accuracy for both the ATR and microcalcification detection problems, reducing false positive rates by a significant factor. In addition, we have examined the hidden units at various levels of the processing hierarchy and found what appears to be representations of road location (for the ATR example) and ductal/vasculature location (for mammography), both of which are in agreement with the contextual information used by humans to find these classes of targets. We conclude that this hierarchical neural network architecture is able to automatically extract contextual information in imagery and utilize it for target detection.

  • A hierarchical neural network architecture that learns target context: Applications to digital mammography

    December 28, 2016 | Paul Sajda, C. Spence, J. Pearson

    An important problem in image analysis is finding small objects in large images. The problem is challenging because: 1) searching a large image is computationally expensive; and 2) small targets (on the order of a few pixels in size) have relatively few distinctive features which enable them to be distinguished from non-targets. To overcome these challenges the authors have developed a hierarchical neural network architecture which combines multiresolution pyramid processing with neural networks. Here the authors discuss the application of their hierarchical neural network architecture to the problem of detecting microcalcifications in digital mammograms. Microcalcifications are cues for breast tumors. 30% to 50% of breast carcinomas have microcalcifications visible in mammograms while 60% to 80% of all breast tumors eventually show microcalcifications via histology. Similar to the building/ATR problem, microcalcifications are generally very small point-like objects…

  • Construction of illusory surfaces by intermediate-level visual cortical networks

    December 28, 2016 | Paul Sajda, L. H. Finkel

    A model is proposed which directly links the perception of illusory contours to intermediate-level cortical processes for visual surface discrimination. An important assertion of the model is that illusory contours are reentered, via feedback, into surface discrimination processes with the result being the construction of illusory surfaces. The model is tested in a number of simulations which demonstrate surface completion, generation of illusory contours, and interactions with depth cues from stereopsis.

  • Cortical mechanisms for surface segmentation

    December 28, 2016 | Paul Sajda, L. H. Finkel

    Physiology has shown that the neural machinery of “early vision” is well suited for extracting edges and determining orientation of contours in the visual field. However, when looking at objects in a scene our perception is not dominated by edges and contours but rather by surfaces. Previous models have attributed surface segmentation to filling-in processes, typically based on diffusion. Though diffusion related mechanisms may be important for perceptual filling-in [4], it is unclear how such mechanisms would discriminate multiple, overlapping surfaces, as might result from occlusion or transparency. For the case of occlusion, surfaces exist on either side of a boundary and the problem is not to fill-in the surfaces but to determine which surface “owns” the boundary [1][3]. This problem of boundary “ownership” can also be considered a special case of the binding problem, with a surface being “bound” to a contour.

  • Dual mechanisms for neural binding and segmentation and their role in cortical integration

    December 27, 2016 | Paul Sajda, L. H. Finkel

    We propose that the binding and segmentation of visual features is mediated by two complementary mechanisms: a low resolution, spatial-based, resource-free process and a high resolution, temporal-based, resource-limited process. In the visual cortex, the former depends upon the orderly topographic organization in striate and extrastriate areas while the latter may be related to observed temporal relationships between neuronal activities. Computer simulations illustrate the role the two mechanisms play in figure/ground discrimination, depth-from-occlusion, and the vividness of perceptual completion.

  • Texture discrimination and binding by a modified energy model

    December 27, 2016 | K. Sakai, Paul Sajda, L. H. Finkel

    The model presented shows how textured regions can be discriminated and textured surfaces created by the visual cortex. The model addresses two major processes: texture segmentation and texture binding. Textures are detected by using a version of the energy model of J. R. Bergen and E. H. Adelson (1988) and J. R. Bergen and M. S. Landy (1991), which was modified to include ON and OFF center cells, and units selective for line endings. A novel neural mechanism is described for binding a texture pattern together. Simulation results demonstrated the ability of the networks to segment and bind a well-known texture pattern.

  • A neural network model of object segmentation and feature binding in visual cortex

    December 27, 2016 | Paul Sajda, L. H. Finkel

    The authors present neural network simulations of how the visual cortex may segment objects and bind attributes based on depth-from-occlusion. They briefly discuss one particular subprocess in the occlusion-based model most relevant to segmentation and binding: determination of the direction of figure. They propose that the model allows addressing a central issue in object recognition: how the visual system defines an object. In addition, the model was tested on illusory stimuli, with the network’s response indicating the existence of robust psychophysical properties in the system.

  • Object segmentation and binding within a biologically-based neural network model of depth-from-occlusion

    December 18, 2016 | Paul Sajda, L. H. Finkel

    The problems of object segmentation and binding are addressed within a biologically based network model capable of determining depth from occlusion. In particular, the authors discuss two subprocesses most relevant to segmentation and binding: contour binding and figure direction. They propose that these two subprocesses have intrinsic constraints that allow several underdetermined problems in occlusion processing and object segmentation to be uniquely solved. Simulations that demonstrate the role these subprocesses play in discriminating objects and stratifying them in depth are reported. The network is tested on illusory stimuli, with the network’s response indicating the existence of robust psychological properties in the system.

  • Correlating Speaker Gestures in Political Debates with Audience Engagement Measured via EEG

    September 23, 2016 | John R. Zhang, Jason Sherwin, Jacek Dmochowski, Paul Sajda, John R. Kender

    We hypothesize that certain speaker gestures can convey significant information that is correlated with audience engagement…