Grasping with Application to an Autonomous Checkout Robot

In this paper, we present a novel grasp selection algorithm to enable a robot with a two-fingered end-effector to autonomously grasp unknown objects. Our approach requires as input only the raw depth data obtained from a single frame of a 3D sensor. Additionally, our approach uses no explicit models of the objects and does not […]

A Low-cost Compliant 7-DOF Robotic Manipulator

We present the design of a new low-cost series- elastic robotic arm. The arm is unique in that it achieves reasonable performance for the envisioned tasks (backlash-free, sub-3mm repeatability, moves at 1.5m/s, 2kg payload) but with a significantly lower parts cost than comparable manipulators. The paper explores the design decisions and tradeoffs made in achieving […]

Learning Word Vectors for Sentiment Analysis

Unsupervised vector-based approaches to semantics can model rich lexical meanings, but they largely fail to capture sentiment information that is central to many word meanings and important for a wide range of NLP tasks. We present a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term–document information […]

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

A great deal of research has focused on algorithms for learning features from unlabeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning algorithms and deep models. In this paper, however, we show that several very simple factors, such as the number of hidden […]

Learning Hierarchical Spatio-Temporal Features for Action Recognition with Independent Subspace Analysis

Previous work on action recognition has focused on adapting hand-designed local features, such as SIFT or HOG, from static images to the video domain. In this paper, we propose using unsupervised feature learning as a way to learn features directly from video data. More specifically, we present an extension of the Independent Subspace Analysis algorithm […]

On Random Weights and Unsupervised Feature Learning

Recently two anomalous results in the literature have shown that certain feature learning architectures can perform very well on object recognition tasks, without training. In this paper we pose the question, why do random weights sometimes do so well? Our answer is that certain convolutional pooling architectures can be inherently frequency selective and translation invariant, […]

Multimodal Deep Learning

Deep networks have been successfully applied to unsupervised feature learning for single modalities (e.g., text, images or audio). In this work, we propose a novel application of deep networks to learn features over multiple modalities. We present a series of tasks for multimodal learning and show how to train a deep network that learns features […]

Learning Deep Energy Models

Deep generative models with multiple hidden layers have been shown to be able to learn meaningful and compact representations of data. In this work we propose deep energy models, which use deep feedforward neural networks to model the energy landscapes that define probabilistic models. We are able to efficiently train all layers of our model […]

On Optimization Methods for Deep Learning

The predominant methodology in training deep learning advocates the use of stochastic gradient descent methods (SGDs). Despite its ease of implementation, SGDs are difficult to tune and parallelize. These problems make it challenging to develop, debug and scale up deep learning algorithms with SGDs. In this paper, we show that more sophisticated off-the-shelf optimization methods […]

The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization

While vector quantization (VQ) has been applied widely to generate features for visual recognition problems, much recent work has focused on more powerful methods. In particular, sparse coding has emerged as a strong alternative to traditional VQ approaches and has been shown to achieve consistently higher performance on benchmark datasets. Both approaches can be split […]

Parsing Natural Scenes and Natural Language with Recursive Neural Networks

Recursive structure is commonly found in the inputs of different modalities such as natural scene images or natural language sentences. Discovering this recursive structure helps us to not only identify the units that an image or sentence contains but also how they interact to form a whole. We introduce a max-margin structure prediction architecture based […]

Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning

Reading text from photographs is a challenging problem that has received a significant amount of attention. Two key components of most systems are (i) text detection from images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, we apply methods recently […]

Grasping with Application to an Autonomous Checkout Robot￼

A Low-cost Compliant 7-DOF Robotic Manipulator￼