ICCV! In Barcelona! Regrettably, I had to stay home in cold Bonn. Today, I went through the accepted papers, and one of the many I found interesting was "Ask the locals: multi-way local pooling for image recognition" by Y-Lan Boureau, Nicolas Le Roux, Francis Bach, Jean Ponce and Yann LeCun. Many big names on this one :) In this work, the authors highlight a feature shared by many recent coding algorithms for visual descriptors: locality in feature space. They formulate the encoding as a max pooling operation that is local in the image as well as in feature space, using a coarse k-means clustering on the features (which are histograms of sparse codes, if I understood correctly). The paper reports very good results on Caltech 101 and 256 and on the scenes dataset. In particular, good results are achieved with quite small dictionaries, e.g. of size 256. My colleague Hannes pointed out that the feature space binning is basically a layer of an RBF network. Which is not menti...