Comments on "Peekaboo: Another look at MNIST" by Andreas Mueller (5 comments)

Nicolau (2017-02-09):
I think it's a valid point. MNIST is really not that challenging compared to a "generic" classification problem, because the classes form nice clusters. I am looking for a problem whose classes consist of more convex regions, and that is probably what happens when you aggregate 0-4 and 5-9, as pointed out before.

One important thing to notice here is that you re-calculated the PCA for each pair. So you found the single optimal direction to discriminate between each pair. But what that means in the original domain is that you need all 10*9 = 90 optimal features to make your classification. And indeed, 90 components seems to be the point where you start to "explain" less than 90% of the variance. Having even more possible negative samples would probably also make the problem a little more challenging, e.g. mixing in some letters or CIFAR images. It's more noise to filter out...

Anonymous (2013-02-27):
What we see from here is that _most_ of the samples are easily separable, but not all of them. It is very common in practice that, given a realistic dataset, you can easily reach a 90% recognition rate with the simplest algorithm you have, but every next percent is very difficult to achieve.

Anonymous (2013-02-01):
Check this: http://arxiv.org/abs/1301.3342 (Barnes-Hut-SNE, Laurens van der Maaten). It embeds all 70,000 points with non-linear dimensionality reduction, among other datasets!

Jotaf (2013-01-16):
I know it's a quick hack, but a much harder problem on MNIST is to separate two classes: digits 0-4 versus digits 5-9. Just give it a try!

The same test was done on USPS by Chapelle, in the now-classic paper "Training a Support Vector Machine in the Primal".

Anonymous (2012-12-26):
Nice work and observation. It happens in classification practice that different algorithms often yield similar accuracy measures. Most of the time the algorithm doesn't matter as much.
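Jotaf's suggested 0-4 versus 5-9 split can be sketched in a few lines. As an assumption to keep the example self-contained, this uses scikit-learn's small built-in digits dataset rather than the full MNIST the comment refers to, and a plain logistic regression as the classifier:

```python
# Sketch of the 0-4 vs 5-9 binary task from the comments, on the small
# built-in digits dataset (a stand-in for full MNIST).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
y_binary = (y >= 5).astype(int)  # 0 for digits 0-4, 1 for digits 5-9

X_train, X_test, y_train, y_test = train_test_split(
    X, y_binary, random_state=0)
clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("0-4 vs 5-9 accuracy:", clf.score(X_test, y_test))
```

Even on this easier 8x8 dataset the aggregated task is noticeably harder than most one-vs-one digit pairs, since each aggregated class is a union of five clusters rather than a single compact one.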
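Nicolau's per-pair observation can be sketched as follows: fit a separate one-component PCA on the samples of each unordered pair of digit classes, yielding one direction per pair (45 directions for 10 classes; the comment's 10*9 = 90 counts ordered pairs). Again the small digits dataset stands in for MNIST as an assumption:

```python
# One PCA direction per unordered pair of digit classes, as discussed
# in the comments (digits dataset as a stand-in for MNIST).
from itertools import combinations

import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)

directions = []
for a, b in combinations(range(10), 2):
    mask = (y == a) | (y == b)          # samples of just this pair
    pca = PCA(n_components=1).fit(X[mask])
    directions.append(pca.components_[0])

directions = np.asarray(directions)
print(directions.shape)  # (45, 64): one 64-dim direction per pair
```

A classifier that works in the original input space would need access to all of these directions at once, which is the point the comment makes about the cumulative explained variance.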
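The Barnes-Hut approximation from the linked paper is what scikit-learn's `TSNE` uses by default (`method='barnes_hut'`), which is why it scales to all 70,000 MNIST points. A minimal run, again on the small digits dataset to stay self-contained:

```python
# Barnes-Hut t-SNE embedding of the digits dataset into 2-D,
# mirroring (at small scale) the MNIST embedding in the linked paper.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
embedding = TSNE(n_components=2, method='barnes_hut',
                 random_state=0).fit_transform(X)
print(embedding.shape)  # one 2-D point per input sample
```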