Posts

Mahotas: Image Processing for numpy

Today I stumbled upon mahotas, a Python image processing library, while researching local binary patterns on Wikipedia. I haven't found many libraries like it yet, and this one seems to be very active at the moment and quite mature. The library includes I/O and has many low-level vision features, like thresholding, watershed, labeling, distance transform, convex hulls, etc. I was missing something like this, so I quite like it. Descriptors such as SURF and, obviously, LBP are also included, which is very nice since I often have a hard time finding nice Python bindings for image features! I hope work on this library continues and maybe it'll include more features and more segmentation methods in the future :)

John Langford: Research Directions for Machine Learning and Algorithms

John Langford published a great article on his blog today: http://hunch.net/?p=1822 Don't miss out on it ;)

K-means on CUDA with CUV

Since I felt that our CUDA library, CUV, was a little lifeless without some examples of how it can be used, I started to work on a K-means implementation using it. The idea is to have something like

    from cuv_python import kmeans
    [clusters, assignements]=kmeans(data)

run k-means on the GPU. There are some papers on k-means on the GPU out there, but I thought I'd see how far a little coding would get me. As a (maybe a little atypical) example, I chose MNIST as a dataset. My experiments were with k=10 or k=20, but this should not be a restriction on the implementation. First, I tried to do it without adding anything to CUV, just using simple matrix operations that were already there. The result is something like this:

    clusters=mnist[:,rand_indices]
    mnist_dev=cp.push(mnist)
    # copy('F') is necessary so we can slice later on
    clusters_dev=cp.push(clusters.copy("F"))
    norms = cp.dev_matrix_cmf(mnist_dev.w, 1)
    cp.reduce_to_row(norms.vec,mnist_dev,cp.reduce_functor.ADD_S...
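For readers without a GPU at hand, the "simple matrix operations" idea can be sketched on the CPU with plain NumPy. This is not the CUV code from the post. It is a minimal illustration that assumes, as the slicing `mnist[:,rand_indices]` suggests, that every column of the data matrix is one data point. The function name `kmeans` and the parameters `k`, `n_iter` and `seed` are chosen here for illustration only. Squared distances are computed from precomputed norms and a single matrix product, using ||c - x||^2 = ||c||^2 - 2 c.x + ||x||^2.

```python
import numpy as np


def kmeans(data, k, n_iter=20, seed=0):
    """Lloyd's k-means on a (n_features, n_points) matrix, one point per column."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)

    # Initialise the centres with k distinct, randomly chosen data points.
    rand_indices = rng.choice(data.shape[1], size=k, replace=False)
    clusters = data[:, rand_indices].copy()

    # Squared norms of the data points never change, so compute them once.
    data_norms = (data ** 2).sum(axis=0)

    def assign(centres):
        # (k, n_points) matrix of squared distances via one matrix product.
        centre_norms = (centres ** 2).sum(axis=0)
        dists = centre_norms[:, None] - 2.0 * (centres.T @ data) + data_norms[None, :]
        return dists.argmin(axis=0)

    for _ in range(n_iter):
        assignments = assign(clusters)
        for j in range(k):
            members = data[:, assignments == j]
            if members.shape[1] > 0:  # leave empty clusters where they are
                clusters[:, j] = members.mean(axis=1)

    # Recompute the assignments so they match the returned centres.
    return clusters, assign(clusters)
```

Called as `clusters, assignments = kmeans(data, k=10)`, this returns the centres as columns of `clusters` and one cluster index per data point. The matrix product dominates the running time, which is why the same formulation maps well onto a GPU matrix library.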

Finding a Function by its Symbol

This one is from the life of a C++ programmer. As you might remember, I am working on the CUV Library for CUDA programming in Python. The library uses quite a lot of templates and meta-programming. Since we want to have all functionality in Python, we need to instantiate all the templates. After some refactoring, I got the helpful error message

    _cuv_python.so: undefined symbol: _ZN3cuv6detail20apply_binary_functorINS_6vectorIfNS_16dev_memory_spaceEjEES4_NS2_IhS3_jEEhEEvRT_RKT0_RKT1_RKNS_13BinaryFunctorERKiRKT2_SL_

Ah, right. That one. Well, often it is possible to guess which function this is about. But since the instantiations are not so straightforward, I had no idea which function that was. So I wanted to look that up somehow. I tried to google how to find the corresponding function, but without much success. Probably I was missing the right keywords. After some fiddling around, I finally got it: Load your program/library in gdb. You can get a list of all symbols in t...
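A different route from the gdb one described in the post is to demangle the symbol directly. The sketch below assumes that the `c++filt` tool from GNU binutils is installed and on the PATH. The helper name `demangle` is made up for this example, and it returns None when the tool cannot be found.

```python
import shutil
import subprocess


def demangle(symbol):
    """Demangle a C++ symbol with c++filt; return None if c++filt is missing."""
    tool = shutil.which("c++filt")
    if tool is None:
        return None
    # c++filt prints the demangled form of each symbol given as an argument.
    result = subprocess.run([tool, symbol], capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    mangled = (
        "_ZN3cuv6detail20apply_binary_functorINS_6vectorIfNS_16dev_memory_spaceEjEE"
        "S4_NS2_IhS3_jEEhEEvRT_RKT0_RKT1_RKNS_13BinaryFunctorERKiRKT2_SL_"
    )
    print(demangle(mangled))
```

The readable name shows the function and its template arguments, which is usually enough to spot the instantiation that is missing.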

David Barber: Bayesian Reasoning and Machine Learning

While reading around MetaOptimize, I found a reference to the book "Bayesian Reasoning and Machine Learning" by David Barber. You can download the pdf for free, and it will be published soon by Cambridge University Press. It comes with many Matlab examples and seems worth taking a look at :)

Python and Matlab bindings for Damascene, Global Probability of Boundary on GPU

So I'm still playing around with constrained parametric min-cuts for object segmentation. A major bottleneck of this algorithm is gPb, the global probability of boundary operator from Malik's group. Luckily, there already is a CUDA implementation of gPb out there: Damascene. Damascene provides a command line interface to apply gPb to ppm images. Since I wanted to include it directly with cpmc, I wrote some mex-wrappers for Damascene. And since I would love to see more algorithms done in Python instead of Matlab, I wrote some Python wrappers, too. You can find both, together with the current version of Damascene, here. You need a working CUDA setup and the acm library to compile it. Set your paths in common.mk and just "make" it. For the Matlab wrappers, you also need to adjust the Matlab path in "bindings/Makefile" and "make gpb_mex" in that directory. For the Python wrappers, you have to compile Damascene with "make shared=1...

"Single Layer Networks in Unsupervised Feature Learning" online!

The paper "Single Layer Networks in Unsupervised Feature Learning" by Coates, Lee, and Ng, which I talked about in this post, is now available online (pdf)! Thanks to Olivier Grisel from MetaOptimize for pointing that out.