NIPS 2010 - Label Embedding Trees for Large Multi-Class Tasks

Label Embedding Trees for Large Multi-Class Tasks by Samy Bengio, Jason Weston and David Grangier is a paper about coping with classification in the presence of huge amounts of data and very many classes. Of course the ImageNet challenge comes to mind, but what they actually did is classification on all of ImageNet, which has about 16,000 classes and over a million instances. This work focuses on very fast recall but is very expensive to train; apparently the model was trained on a Google cluster. The paper explores two ideas: hierarchical classification and low-dimensional embeddings. For the hierarchical classification part, a tree of label sets is built in which the leaves are single classes. Starting from all classes at the root, the set of classes is split into subsets of similar classes, which are represented by new nodes. This is done by training a one-vs-rest classifier and inspecting its confusion matrix: classes that are easy to confuse are put into the same node. At test-...
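To make the splitting step concrete, here is a minimal sketch of grouping classes by how often they are confused. It assumes a confusion matrix is already available from some one-vs-rest classifier; the matrix values, the function names `symmetrize` and `split_labels`, and the greedy merging rule are all illustrative assumptions and not necessarily the clustering procedure used in the paper.

```python
from itertools import combinations


def symmetrize(confusion):
    """Return A = (C + C^T) / 2 with the diagonal set to zero."""
    n = len(confusion)
    return [
        [0.0 if i == j else (confusion[i][j] + confusion[j][i]) / 2.0
         for j in range(n)]
        for i in range(n)
    ]


def split_labels(confusion, num_groups):
    """Greedily merge the most-confused label groups until num_groups remain.

    This is a simple stand-in for a real clustering step (assumption).
    """
    affinity = symmetrize(confusion)
    groups = [[label] for label in range(len(confusion))]

    def avg_affinity(a, b):
        # Mean pairwise confusion between two groups of labels.
        return sum(affinity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(groups) > num_groups:
        a, b = max(
            combinations(range(len(groups)), 2),
            key=lambda pair: avg_affinity(groups[pair[0]], groups[pair[1]]),
        )
        merged = sorted(groups[a] + groups[b])
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return sorted(groups)


# Toy confusion matrix with made-up numbers: rows are true classes,
# columns are predicted classes. Classes 0 and 1 are mostly confused
# with each other, and so are classes 2 and 3.
toy_confusion = [
    [80, 15, 3, 2],
    [12, 82, 4, 2],
    [2, 3, 75, 20],
    [1, 4, 18, 77],
]

children = split_labels(toy_confusion, num_groups=2)
print(children)  # [[0, 1], [2, 3]]
```

Each resulting group would become a child node of the current node, and the same split could be applied recursively to each child until the leaves hold single classes.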