[update] This post is a bit old, but many people still seem interested. So just a short update: Nowadays I would use Python and scikit-learn to do this. Here is an example of how to do cross-validation for SVMs in scikit-learn. Scikit-learn even downloads MNIST for you. [/update]

MNIST is, for better or worse, one of the standard benchmarks for machine learning and is also widely used in the neural networks community as a toy vision problem. Just for the unlikely case that anyone is not familiar with it: It is a dataset of handwritten digits, 0-9, in black on a white background. It looks something like this:

There are 60000 training and 10000 test images, each 28x28 pixels in grayscale. There are roughly the same number of examples of each category in the test and training datasets. I used it in some papers myself even though there are some reasons why it is a little weird.

Some not-so-obvious (or maybe they are) facts are:

- The images actually contain a 20x20 patch of digi