
Showing posts from August, 2011

[CVML] Florent Perronnin on explicit feature maps

There is apparently a "new" trend in computer vision that I seem to have missed: explicit feature maps. So what is this about? We all know and love the kernel trick: take some vector representation X of your data and some algorithm (like the SVM) that relies only on dot products in the input space. Replace the dot products by some positive definite kernel, and by Mercer's theorem there is some Hilbert space of functions Y and some mapping phi: X -> Y such that your kernelized algorithm works as if you had mapped your input through phi. This extremely powerful method is now used everywhere, most of all for SVMs. But it has several significant drawbacks in this case: Training with a non-linear kernel is a lot slower [O(n^3)] than training a linear SVM [O(n)]. Kernel SVMs take a lot of memory to store, since they are "non-parametric" in a sense: in addition to the Lagrange multipliers alpha, all the support vectors have to be stored. Recall is painfully
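To make the contrast concrete, here is a minimal sketch of the idea (my own toy example, not from the talk), assuming scikit-learn's kernel approximation tools: an approximate explicit feature map (random Fourier features via RBFSampler) feeding a plain linear SVM, next to an exact RBF-kernel SVM. The data, parameters and the choice of approximation are all assumptions for illustration.

import numpy as np
from sklearn.svm import SVC, LinearSVC
from sklearn.kernel_approximation import RBFSampler

# toy two-class problem (made up for illustration)
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# exact kernel SVM: roughly O(n^2)-O(n^3) training, stores support vectors
kernel_svm = SVC(kernel="rbf", gamma=0.1).fit(X, y)

# explicit (approximate) feature map + linear SVM:
# linear training time, only a single weight vector is stored
feature_map = RBFSampler(gamma=0.1, n_components=100, random_state=0)
X_mapped = feature_map.fit_transform(X)
linear_svm = LinearSVC().fit(X_mapped, y)

print(kernel_svm.score(X, y), linear_svm.score(X_mapped, y))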

Ensemble of Exemplar-SVMs for Object Detection and Beyond

I was just reading the paper in the title ( pdf ), which Alexei Efros talked about at CVML. It's from this year's ICCV. The basic idea of the paper is to do object detection by training a linear SVM on HOG features for each positive example. I thought the idea was pretty cool and wanted to write something about it... and then I saw there is already a post by the first author, Tomasz Malisiewicz, on his own blog . He also discusses some of his (matlab :( ) code for non-maximum suppression. So check out his blog for more details on this cool paper :)
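Since non-maximum suppression comes up here, a quick sketch of the standard greedy overlap-based variant in numpy (this is my own minimal version, not the code from his blog; the box format and threshold are assumptions):

import numpy as np

def non_max_suppression(boxes, scores, overlap_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop all boxes that
    overlap it by more than overlap_thresh (IoU), then repeat.
    boxes is an (n, 4) array of [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the best remaining box with all others
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= overlap_thresh]
    return keep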

Unbiased Look at Dataset Bias

Somehow I neglected writing about the CVML summer school (which is now what, 3 weeks ago?). I didn't really have the time to go through all of my notes yet. When I got to my notes on Alexei Efros' talk on Large Scale Recognition, I saw that he mentioned some joint work with Antonio Torralba on dataset biases in the computer vision community. The paper is called "Unbiased Look at Dataset Bias" ( pdf ) and appeared at CVPR this year. If you haven't heard of it: it is a MUST READ! Read it now! From the paper: "Disclaimer: No graduate students were harmed in the production of this paper. Authors are listed in order of increasing procrastination ability."

IPython 0.11 with qt shell :) Edit: Working now.

Today I read about the "new" IPython 0.11 release. It is pretty awesome: not only does it feature new parallel computing capabilities, it also has a very nifty new qt shell. The shell has cool highlighting, tool-tip help and (drum roll) inline matplotlib figures! I urge you to check out the release notes and upgrade (or install). I didn't find any packages for maverick (yeah, some of our machines are a little behind), which I took as an excuse to set up a ppa. It contains natty and maverick 0.11 packages. [edit] All working now :) [/edit] I hope I can use the ppa to distribute more code - I feel there are not enough Python packages for vision research out there. But most of my code relies on CUDA, which I guess could be a problem for creating source packages. We'll see. By the way, I used stdeb to create a Debian package from the IPython source package. Using it, turning a Python package into a Debian package becomes a one-liner. I find this very helpful sinc
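For illustration, a tiny sketch of the inline figure feature; the launch command in the comment is how I remember the 0.11 syntax, so treat it as an assumption and check the release notes if it differs:

# assumed launch command for the qt shell with inline figures:
#   ipython qtconsole --pylab=inline
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
plt.plot(x, np.sin(x))
plt.title("rendered inline in the qt shell")
plt.show()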

[CVML] Ivan Laptev on Human Action Recognition

Last Thursday, Ivan Laptev talked about "Human Action Recognition" at the CVML. This is not really my area, so while I really liked his very enthusiastic talk, I won't say much about it ;) What I liked most about it was how it was motivated. From a computer vision perspective, I felt human action recognition was somewhat peripheral to current research - surely with many interesting applications, but not central to the field. However, Ivan Laptev gave two arguments for human action recognition being a centerpiece of visual understanding: First, most of the data out there - in particular videos - shows people: 35% of the pixels in TV and movies belong to people, 40% on YouTube. Laptev concludes that video analysis is human action analysis. Second, the semantics of objects can often be inferred from humans interacting with them. Instead of the classical "chair" example, Laptev showed a "luggage train": the thing in the airport you pick up your luggage from. Even if

Region connectivity graphs in Python [edit: minor bug]

[edit] Nowadays you can find much better implementations of this over at scikit-image. [/edit] Recently I started playing with CRFs on superpixels for image segmentation. While doing this I noticed that Python has very few functions for morphological operations on images. For example, I did not find any function to extract connected components inside images. For CRFs one obviously needs the superpixel neighbourhood graph to work on, and I didn't find any ready-made function to obtain it. After some pondering, I came up with something. Since I didn't find anything else online, I thought I'd share it.

import numpy as np

def make_graph(grid):
    # get unique labels and map them to [0, ..., num_labels - 1]
    vertices = np.unique(grid)
    reverse_dict = dict(zip(vertices, np.arange(len(vertices))))
    grid = np.array([reverse_dict[x] for x in grid.flat]).reshape(grid.shape)
    # create edges between vertically and horizontally adjacent pixels
    down = np.c_[grid[:-1, :].ravel(), grid[1:, :].ravel()]
    right = np.c_[grid[:, :-1].ravel(), grid[:, 1:].ravel()]
    all_edges = np.vstack([right, down])
    # drop self-edges, then hash the sorted pairs to remove duplicates
    all_edges = np.sort(all_edges[all_edges[:, 0] != all_edges[:, 1]], axis=1)
    num_vertices = len(vertices)
    edge_hash = np.unique(all_edges[:, 0] + num_vertices * all_edges[:, 1])
    # undo the hashing to recover pairs of original labels
    edges = [[vertices[x % num_vertices], vertices[x // num_vertices]]
             for x in edge_hash]
    return vertices, edges
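A toy usage example, continuing from the function above (the label grid is made up just to show the output format):

# three "superpixels" labeled 10, 20 and 30
labels = np.array([[10, 10, 20],
                   [10, 20, 20],
                   [30, 30, 20]])

vertices, edges = make_graph(labels)
print(vertices)  # the unique labels: 10, 20, 30
print(edges)     # neighbouring label pairs: (10, 20), (10, 30), (20, 30)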

[CVML] Quotes

This is my favourite part of every conference and workshop: quotes and fun facts :) There are some things researchers will never write in a paper but really like to tell you. Also, many professors actually have a pretty good sense of humor (or at least one that is as nerdy as mine). Please note that even though I put the quotes into quotation marks, they might not be completely accurate. Most lecturers can talk pretty fast and I am usually taking notes in the old-school paper way...

Ponce: About learned dictionaries and filters: "Dictionary elements don't have semantic meaning. People like to look at them, I don't know why." About denoising using structured sparsity: "We don't know anything about image processing. The Finnish guys are way better. But the sparse model still works better."

Lambert: About 1-vs-all training for multi-class classification: "Everyone is using that. But no one knows why it works."

Francis Bach (?): Abo