Showing posts from September, 2012

Recap of my first Kaggle Competition: Detecting Insults in Social Commentary [update 3]

Recently I entered my first Kaggle competition. For those who don't know it, Kaggle is a site that runs machine learning competitions: a data set and a time frame are provided, and the best submission wins a cash prize, often somewhere between $5,000 and $50,000.

I found the approach quite interesting and could definitely use a new laptop, so I entered Detecting Insults in Social Commentary.
My weapon of choice was Python with scikit-learn - for those who haven't read my blog before: I am one of the core devs of the project and never shut up about it.

Scikit-learn 0.12 released

Last night I uploaded the new version 0.12 of scikit-learn to PyPI. The updated website is also up and running, and development now starts towards 0.13.

The new release has some nifty new features (see whatsnew):
* Multidimensional scaling
* Multi-Output random forests (like these)
* Multi-task Lasso
* More loss functions for ensemble methods and SGD
* Better text feature extraction
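
To give a feel for three of the features listed above, here is a minimal sketch on synthetic data. It assumes the API of a recent scikit-learn release (class names and defaults may differ from 0.12), and the data, the `alpha` value and the variable names are made up purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import MultiTaskLasso
from sklearn.manifold import MDS

rng = np.random.RandomState(0)

# Synthetic data (illustrative only): 60 samples, 10 features, and 3 related
# targets that all depend on the first two features only.
X = rng.randn(60, 10)
true_coef = np.zeros((3, 10))
true_coef[:, :2] = 3.0 + rng.rand(3, 2)
Y = X @ true_coef.T + 0.01 * rng.randn(60, 3)

# Multidimensional scaling: embed the 10-dimensional samples in 2 dimensions.
embedding = MDS(n_components=2, random_state=0).fit_transform(X)

# Multi-output random forest: a single forest predicts all three targets.
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, Y)
forest_pred = forest.predict(X[:5])

# Multi-task Lasso: the penalty encourages the same features to be kept
# (or dropped) for every target. alpha=0.1 is an arbitrary choice.
lasso = MultiTaskLasso(alpha=0.1).fit(X, Y)
selected = np.flatnonzero(np.any(lasso.coef_ != 0, axis=0))

print(embedding.shape, forest_pred.shape, lasso.coef_.shape)
print("features kept by the multi-task Lasso:", selected)
```

The multi-task Lasso stores one row of coefficients per target in `coef_`, so checking which columns are non-zero shows which features were kept jointly across all targets.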

Segmentation Algorithms in scikits-image

Recently some segmentation and superpixel algorithms I implemented were merged into scikits-image. You can see the example here.

I reimplemented Felzenszwalb's fast graph-based method, quickshift and SLIC.
The goal was to have easy access to some successful methods to make comparison easier and encourage experimenting with the algorithms.
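
As a rough sketch of what such a comparison can look like, the snippet below runs all three methods on the same image. It assumes a recent release of the package (now named scikit-image and imported as `skimage`), uses a small synthetic image instead of a real photo, and the parameter values are arbitrary choices rather than recommended settings.

```python
import numpy as np
from skimage.segmentation import felzenszwalb, quickshift, slic

rng = np.random.RandomState(0)

# Hypothetical test image: four coloured quadrants plus mild noise,
# stored as a float RGB array with values in [0, 1].
image = np.zeros((64, 64, 3))
image[:32, :32] = [0.9, 0.1, 0.1]
image[:32, 32:] = [0.1, 0.9, 0.1]
image[32:, :32] = [0.1, 0.1, 0.9]
image[32:, 32:] = [0.9, 0.9, 0.1]
image = np.clip(image + 0.02 * rng.randn(*image.shape), 0, 1)

# Each function returns an integer label image with one label per segment.
segments = {
    "felzenszwalb": felzenszwalb(image, scale=100, sigma=0.5, min_size=50),
    "quickshift": quickshift(image, kernel_size=3, max_dist=6, ratio=0.5),
    "slic": slic(image, n_segments=50, compactness=10),
}

for name, labels in segments.items():
    print(name, "->", len(np.unique(labels)), "segments")
```

Because all three functions take an image and return a label array of the same height and width, swapping one method for another is a one-line change, which is what makes side-by-side experiments cheap.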

Here is a comparison of my implementations against the original implementations on Lena (downscaled by a factor of 2). The first row is my implementation, the second the original.