Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions

We introduce a novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions. Our method learns vector space representations for multi-word phrases. In sentiment prediction tasks these representations outperform other state-of-the-art approaches on commonly used datasets, such as movie reviews, without using any pre-defined sentiment lexica or polarity shifting rules. […]

Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection

Paraphrase detection is the task of examining two sentences and determining whether they have the same meaning. In order to obtain high accuracy on this task, thorough syntactic and semantic analysis of the two statements is needed. We introduce a method for paraphrase detection based on recursive autoencoders (RAE). Our unsupervised RAEs are based on […]

Unsupervised Learning Models of Primary Cortical Receptive Fields and Receptive Field Plasticity

The efficient coding hypothesis holds that neural receptive fields are adapted to the statistics of the environment, but is agnostic to the timescale of this adaptation, which occurs on both evolutionary and developmental timescales. In this work we focus on that component of adaptation which occurs during an organism’s lifetime, and show that a number […]

Sparse Filtering

Unsupervised feature learning has been shown to be effective at learning representations that perform well on image, video and audio classification. However, many existing feature learning algorithms are hard to use and require extensive hyperparameter tuning. In this work, we present sparse filtering, a simple new algorithm which is efficient and only has one hyperparameter, […]
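The core idea of sparse filtering, as described in the paper, is to optimize only for sparsity of normalized features rather than modeling the data distribution. A minimal sketch of the objective (matrix names and shapes are illustrative, not from the paper):

```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    """Sparse filtering objective for a linear feature map.

    W: (n_features, n_inputs) weight matrix.
    X: (n_inputs, n_examples) data matrix.
    """
    F = W @ X                      # linear feature activations
    F = np.sqrt(F ** 2 + eps)      # smooth absolute value
    # Normalize each feature (row) across examples to equalize feature scales,
    F = F / np.linalg.norm(F, axis=1, keepdims=True)
    # then normalize each example (column) so examples lie on the unit L2 ball.
    F = F / np.linalg.norm(F, axis=0, keepdims=True)
    # Minimize the L1 norm of the normalized features (sparsity penalty).
    return np.abs(F).sum()
```

The only hyperparameter is the number of features (rows of `W`); the objective itself would be minimized with an off-the-shelf optimizer such as L-BFGS.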

ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning

Independent Components Analysis (ICA) and its variants have been successfully used for unsupervised feature learning. However, standard ICA requires an orthonormality constraint to be enforced, which makes it difficult to learn overcomplete features. In addition, ICA is sensitive to whitening. These properties make it challenging to scale ICA to high dimensional data. In this paper, […]
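The idea of replacing the hard orthonormality constraint with a soft reconstruction penalty can be sketched as an unconstrained objective (a rough illustration; variable names and the weighting parameter `lam` are assumptions, not taken from the paper):

```python
import numpy as np

def rica_objective(W, X, lam=0.1):
    """ICA-style objective with a reconstruction cost.

    W: (n_features, n_inputs) filter matrix; may be overcomplete
       (n_features > n_inputs), which hard orthonormality forbids.
    X: (n_inputs, n_examples) data matrix.
    """
    # L1 penalty encourages sparse, independent-looking responses.
    sparsity = np.abs(W @ X).sum()
    # Soft reconstruction cost replaces the constraint W W^T = I:
    # it is zero when the features losslessly reconstruct the data.
    recon = W.T @ (W @ X) - X
    return lam * sparsity + (recon ** 2).sum()
```

Because the constraint is soft, the same objective applies unchanged to overcomplete filter matrices and to unwhitened data, which is the scaling advantage the abstract alludes to.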

Selecting Receptive Fields in Deep Networks

Recent deep learning and unsupervised feature learning systems that learn from unlabeled data have achieved high performance in benchmarks by using extremely large architectures with many features (hidden units) at each layer. Unfortunately, for such large architectures the number of parameters can grow quadratically in the width of the network, thus necessitating hand-coded “local receptive […]

Learning Feature Representations with K-means

Many algorithms are available to learn deep hierarchies of features from unlabeled data, especially images. In many cases, these algorithms involve multi-layered networks of features (e.g., neural networks) that are sometimes tricky to train and tune and are difficult to scale up to many machines effectively. Recently, it has been found that K-means clustering […]
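One common way to turn K-means centroids into a feature encoder is the "triangle" activation used in this line of work. A minimal sketch, assuming patches have already been extracted and preprocessed (function and variable names are illustrative):

```python
import numpy as np

def kmeans_triangle_features(X, centroids):
    """Encode examples against K-means centroids.

    X:         (n_examples, d) preprocessed patches.
    centroids: (k, d) cluster centers learned by K-means.
    Returns a (n_examples, k) sparse, non-negative feature matrix.
    """
    # Euclidean distance from every example to every centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    # Per-example mean distance to all centroids.
    mu = dists.mean(axis=1, keepdims=True)
    # "Triangle" activation: respond only to closer-than-average centroids,
    # which zeroes out roughly half the features per example.
    return np.maximum(0.0, mu - dists)
```

The encoder is cheap (one distance computation per centroid) and has essentially one hyperparameter, the number of centroids `k`, which is why this approach scales so easily compared with multi-layered networks.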

Deep Learning of Invariant Features via Simulated Fixations in Video

We apply salient feature detection and tracking in videos to simulate fixations and smooth pursuit in human vision. With tracked sequences as input, a hierarchical network of modules learns invariant features using a temporal slowness constraint. The network encodes invariances that become increasingly complex higher in the hierarchy. Although learned from videos, our features are spatial instead […]

Emergence of Object-Selective Features in Unsupervised Feature Learning

Recent work in unsupervised feature learning has focused on the goal of discovering high-level features from unlabeled images. Much progress has been made in this direction, but in most cases it is still standard to use a large amount of labeled data in order to construct detectors sensitive to object classes or other complex patterns […]

Word-level Acoustic Modeling with Convolutional Vector Regression Learning Workshop

We introduce a model that maps variable-length word utterances to a word vector space using convolutional neural networks. Our approach models entire word acoustics rather than short windows as in previous work. We introduce the notion of mapping these word inputs to a word vector space, rather than trying to solve the massively multi-class problem of word classification.

Recurrent Neural Networks for Noise Reduction in Robust ASR

We introduce a model which uses a deep recurrent autoencoder neural network to denoise input features for robust ASR. We demonstrate that the model is competitive with existing feature denoising approaches on the Aurora2 task, and outperforms a tandem approach where deep networks are used to predict phoneme posteriors directly. A.L. Maas, Q.V. Le, T.M. O'Neil, O. Vinyals, P. Nguyen, and Andrew Y. Ng in Interspeech 2012.

Large Scale Distributed Deep Networks

We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, achieving state-of-the-art performance on ImageNet.