As with all feature extraction algorithms, it is of course of utmost importance to be able to learn Gabor filters from data.
Inspired by his work and Natural Image Statistics, a great book on the topic of feature extraction from images, I wanted to see how hard it is to learn Gabor filters with my beloved scikit-learn.
I chose independent component analysis (ICA), since it is discussed in some depth in the book.
Luckily, mldata had some natural image patches that I could use for the task.
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_mldata
from sklearn.decomposition import FastICA

# fetch natural image patches
image_patches = fetch_mldata("natural scenes data")
X = image_patches.data  # 1000 patches of 32x32 pixels

# not that much data; reshape into 16000 patches of 8x8
X = X.reshape(1000, 4, 8, 4, 8)
X = np.rollaxis(X, 3, 2).reshape(-1, 8 * 8)

# perform ICA
ica = FastICA(n_components=49)
ica.fit(X)
filters = ica.unmixing_matrix_

# plot the filters
plt.figure()
for i, f in enumerate(filters):
    plt.subplot(7, 7, i + 1)
    plt.imshow(f.reshape(8, 8), cmap="gray")
    plt.axis("off")
plt.show()
```
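The reshape/rollaxis trick above cuts each 32x32 image into sixteen non-overlapping 8x8 tiles. A quick sanity check on a single synthetic image (the `img` array here is just an illustration, not part of the dataset):

```python
import numpy as np

# one 32x32 "image" whose values encode position, so tiles are identifiable
img = np.arange(32 * 32).reshape(1, 32, 32)
X = img.reshape(1, 32 * 32)

# same trick as above: 1 image -> 16 patches of 8x8
# axes after reshape: (image, row_block, row_in_block, col_block, col_in_block)
X = X.reshape(1, 4, 8, 4, 8)
# move col_block next to row_block so each patch is contiguous
X = np.rollaxis(X, 3, 2).reshape(-1, 8 * 8)

print(X.shape)  # (16, 64)
# the first patch is exactly the top-left 8x8 tile of the image
assert np.array_equal(X[0].reshape(8, 8), img[0, :8, :8])
```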
And the result (takes ~40sec):
As Andrej suggested, here are the k-means filters on whitened data:
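For reproducing this, a minimal sketch of the whiten-then-cluster pipeline: PCA-whiten the patches, run k-means, and map the cluster centers back to pixel space for plotting. The random `X` below is only a stand-in for the 8x8 natural image patches, and the exact recipe (number of clusters, whitening details) is an assumption, not necessarily what produced the figure above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# stand-in for the 16000 flattened 8x8 patches; use the real data in practice
X = rng.randn(2000, 64)

# whiten with PCA so k-means does not just pick up the dominant variances
pca = PCA(whiten=True)
X_white = pca.fit_transform(X)

# cluster the whitened patches; the centers act as filters
kmeans = KMeans(n_clusters=49, n_init=10, random_state=0)
kmeans.fit(X_white)

# project the centers back to pixel space for visualization
filters = pca.inverse_transform(kmeans.cluster_centers_)
print(filters.shape)  # (49, 64), each row reshapes to an 8x8 filter
```

Each row of `filters` can then be reshaped to 8x8 and plotted with the same subplot loop as in the ICA example.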
For reference, the PCA whitening filters:
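The PCA whitening filters themselves are just the principal components scaled by the inverse square root of their variances. A sketch of how one might compute them with scikit-learn (again with random data standing in for the real patches):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# stand-in for the flattened 8x8 patches
X = rng.randn(2000, 64)

pca = PCA(n_components=49, whiten=True)
pca.fit(X)

# whitening filters: components scaled by 1 / sqrt(variance),
# so that projecting onto them yields unit-variance coordinates
whitening_filters = pca.components_ / np.sqrt(pca.explained_variance_[:, np.newaxis])
print(whitening_filters.shape)  # (49, 64)
```

Applying `whitening_filters` to centered data reproduces `pca.transform`, and each row can be reshaped to 8x8 and plotted like the other filters.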
You can find the updated gist here.
There is also a nice example of learning similar filters from the Lena image using scikits-image.