Graphcuts for Python: pygco (slight update)
I have been using the excellent gco library for energy minimization with graph cuts for quite some time. Finally I got around to clean up / rewrite some of my Python wrappers so that maybe someone else can use them, too.
So what does this library do?
It does discrete energy minimization on loopy graphs. This is an important topic in computer vision, since it can be used for segmentation, stereo vision and optical flows. When you read "CRF" in a computer vision paper and there is no learning involved, it just means they minimized an energy function (probably using gco). Often this is just some form of "smoothing" but a lot more can be done with it.
Let's talk about pairwise energy functions. These functions take the form
E = \sum v_i(y_i) + \sum w_ij(y_i, w_j)
with v_i and w_ij are some fixed parameters and y_i are the multinomial variables over which we want to minimize this energy. Typically the i run over all pixels in an image, and w_ij is only nonzero for adjacent pixels.
In general, think of v and w as tables (of size n_labels or n_labels x n_labels) for each node resp. edge in the graph. In simple models, these are the same tables for all nodes / edges, in more complicated ones they vary.
In general this is NP hard, but there are many interesting cases, in which this problem can (somewhat surprisingly) be solved efficiently and exactly.
If y_i are binary and the energy is submodular, the exact solution can be found efficiently.
If the y_i are not binary, but the energy fulfills a related property, it is possible to efficiently find good, if not optimal, solutions. See the seminal paper of Yuri Boykov, Olga Veksler and Ramin Zabih Fast approximate energy minimisation via graph cuts.
This is where gco comes in: it implements two move-making algorithms, alpha-expansion and alpha-beta-swaps, that solve a series of binary problems to obtain a solution to a multi-label problem.
Going back to the applications I mentioned above, these labels could be different depths in stereo vision, different directions of movements in optical flow, or different object classes in semantic segmentation.
There are several ways to specify the energy for gco and I don't want to go into to much detail here. Have a look at the README if you are interested.
For the moment, I have ported two ways to Python (but the rest should come soon). One is for general graphs, specified by listing all edges, the other one is for 2D grid graphs (images).
In all cases, the unary potentials (the v_i above) are just given as one number for each vertex (pixel in the grid graph case) and state.
In the simple case that I wrapped for Python, the pairwise potentials (the w_ij)
are fixed over the whole graph and only depend on the combination of labels y_i and y_j.
This contains the case of the simple Pott's potential, where w_ij is one if y_i = y_j and zero otherwise.
If you want to use the code, you need to download the original gco library and my wrappers. There is one thing you have to keep in mind when working with gco, though: all potentials are expected to be integers, so you need to round them!
Here are some examples (available at my git repo):
This is a trivial example of smoothing. We have a binary image, add some noise and want to recover the original image. The example contains a "grid graph" version and a version where I explicitly construct the edges.
The same works with multinomial input. Here I used two different kind of pairwise potentials, the second from the right used Pott's potentials, while the one on the very right knows that blue-green and green-red edges are more likely than blue-red.
And finally, a not-as-trivial-looking example:
The first two images on the top are consecutive frames from a video sequence from the middlebury benchmark. This example tries to find depth layers in the image by using paralax.
The second image is translated in x direction and the best fit for each pixel is found. The image on the top right gives the best translation for each individual pixel. The bottom images show the results obtained with alpha-expansion and Pott's potentials (right) and a 1d topology (left).
This is not really how you would do stereo vision (the potentials I used are very naive) but I feel it is a nice example (about 10 lines in Python).
I hope some people find this helpful an there will be more computer vision code in Python soon :)
So what does this library do?
It does discrete energy minimization on loopy graphs. This is an important topic in computer vision, since it can be used for segmentation, stereo vision and optical flows. When you read "CRF" in a computer vision paper and there is no learning involved, it just means they minimized an energy function (probably using gco). Often this is just some form of "smoothing" but a lot more can be done with it.
Let's talk about pairwise energy functions. These functions take the form
E = \sum v_i(y_i) + \sum w_ij(y_i, w_j)
with v_i and w_ij are some fixed parameters and y_i are the multinomial variables over which we want to minimize this energy. Typically the i run over all pixels in an image, and w_ij is only nonzero for adjacent pixels.
In general, think of v and w as tables (of size n_labels or n_labels x n_labels) for each node resp. edge in the graph. In simple models, these are the same tables for all nodes / edges, in more complicated ones they vary.
In general this is NP hard, but there are many interesting cases, in which this problem can (somewhat surprisingly) be solved efficiently and exactly.
If y_i are binary and the energy is submodular, the exact solution can be found efficiently.
If the y_i are not binary, but the energy fulfills a related property, it is possible to efficiently find good, if not optimal, solutions. See the seminal paper of Yuri Boykov, Olga Veksler and Ramin Zabih Fast approximate energy minimisation via graph cuts.
This is where gco comes in: it implements two move-making algorithms, alpha-expansion and alpha-beta-swaps, that solve a series of binary problems to obtain a solution to a multi-label problem.
Going back to the applications I mentioned above, these labels could be different depths in stereo vision, different directions of movements in optical flow, or different object classes in semantic segmentation.
There are several ways to specify the energy for gco and I don't want to go into to much detail here. Have a look at the README if you are interested.
For the moment, I have ported two ways to Python (but the rest should come soon). One is for general graphs, specified by listing all edges, the other one is for 2D grid graphs (images).
In all cases, the unary potentials (the v_i above) are just given as one number for each vertex (pixel in the grid graph case) and state.
In the simple case that I wrapped for Python, the pairwise potentials (the w_ij)
are fixed over the whole graph and only depend on the combination of labels y_i and y_j.
This contains the case of the simple Pott's potential, where w_ij is one if y_i = y_j and zero otherwise.
If you want to use the code, you need to download the original gco library and my wrappers. There is one thing you have to keep in mind when working with gco, though: all potentials are expected to be integers, so you need to round them!
Here are some examples (available at my git repo):
This is a trivial example of smoothing. We have a binary image, add some noise and want to recover the original image. The example contains a "grid graph" version and a version where I explicitly construct the edges.
The same works with multinomial input. Here I used two different kind of pairwise potentials, the second from the right used Pott's potentials, while the one on the very right knows that blue-green and green-red edges are more likely than blue-red.
And finally, a not-as-trivial-looking example:
The second image is translated in x direction and the best fit for each pixel is found. The image on the top right gives the best translation for each individual pixel. The bottom images show the results obtained with alpha-expansion and Pott's potentials (right) and a 1d topology (left).
This is not really how you would do stereo vision (the potentials I used are very naive) but I feel it is a nice example (about 10 lines in Python).
I hope some people find this helpful an there will be more computer vision code in Python soon :)
Hi,
ReplyDeleteI failed to install pycgo on my machine.
It fails on python setup.py build step, it can't compile gco library properly.
any hints ?
Hi Kostia.
ReplyDeleteIn writing the wrapper I was a bit lazy. You have to compile gco yourself. Unfortunately it doesn't come with a makefile iirc.
I'm not on my own box at the moment so I can't look up what I used but I think it should be a single gcc line (something along the lines of gcc --shared GCOptimization.cpp -O2 -o gco.so).
Hi Andreas,
DeleteI tried this, also I tried smth like this
gcc -fPIC -c GCoptimization.cpp \
LinkedBlockList.cpp \
graph.cpp \
maxflow.cpp
g++ -shared GCoptimization.o maxflow.o .o LinkedBlockList.o graph.o -o gco.so
but still it didn't compile for me...
What did not compile? gco.so or the Cython part?
DeleteAnd what error message did you get?
gco.so didn't compile...
DeleteI got a lot of them...
for instance,
the beginning of it is
GCoptimization.h:389:16: error: ‘ptrdiff_t’ does not name a type
GCoptimization.h:414:16: error: ‘ptrdiff_t’ does not name a type
GCoptimization.cpp:31:23: warning: ‘GCO_CLOCKS_PER_SEC’ initialized and declared ‘extern’ [enabled by default]
GCoptimization.cpp: In member function ‘GCoptimization::SiteID GCoptimization::GreedyIter::feasibleSites() const’:
GCoptimization.cpp:555:73: error: no match for ‘operator-’ in ‘((const GCoptimization::GreedyIter*)this)->GCoptimization::GreedyIter::m_siteend - ((const GCoptimization::GreedyIter*)this)->GCoptimization::GreedyIter::m_site’
GCoptimization.cpp: In member function ‘GCoptimization::EnergyTermType GCoptimization::DataCostFnSparse::search(GCoptimization::DataCostFnSparse::DataCostBucket&, GCoptimization::SiteID)’:
GCoptimization.cpp:1823:19: error: ‘cLinearSearchSize’ was not declared in this scope
In file included from GCoptimization.h:108:0,
from GCoptimization.cpp:4:
energy.h: In instantiation of ‘int Energy::get_var(Energy::Var) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int]’:
GCoptimization.cpp:197:20: required from here
energy.h:328:91: error: ‘what_segment’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
energy.h:328:91: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
energy.h:328:91: note: use ‘this->what_segment’ instead
energy.h: In instantiation of ‘void Energy::add_term1(Energy::Var, Energy::Value, Energy::Value) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int; Energy::Value = int]’:
GCoptimization.cpp:280:22: required from here
energy.h:207:2: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
energy.h:207:2: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
energy.h:207:2: note: use ‘this->add_tweights’ instead
energy.h: In instantiation of ‘void Energy::add_term2(Energy::Var, Energy::Var, Energy::Value, Energy::Value, Energy::Value, Energy::Value) [with captype = int; tcaptype = int; flowtype = long long int; Energy::Var = int; Energy::Value = int]’:
GCoptimization.cpp:304:42: required from here
energy.h:220:2: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
energy.h:220:2: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
energy.h:220:2: note: use ‘this->add_tweights’ instead
energy.h:235:3: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
energy.h:235:3: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
energy.h:235:3: note: use ‘this->add_tweights’ instead
energy.h:236:3: error: ‘add_tweights’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
energy.h:236:3: note: declarations in dependent base ‘Graph’ are not found by unqualified lookup
Which of the two commands did that? The second should just link, right? And did you try the first with g++ ?
Deleteit did the first command... Yes, I tried...
DeleteI also googled for it, but without success.
I didn't use GCO before. May be It couldn't be compiled with gcc/g++ ?
Did you compile it under linux ?
This comment has been removed by the author.
DeleteI found that in order to compile gco with a recent version of gcc, you need to include stddef.h in GCoptimization.h, then it's just a matter of compiling and linking into a shared object.
DeleteThis comment has been removed by the author.
ReplyDeleteThe problem was because of my gcc 4.7. I managed to compile it, but didn't tested a lot, but at first glance it works.
ReplyDeleteI ran into the same problem with gcc 4.7. I was able to compile by editing energy.h and adding "this->" in front of the following functions: what_segment(), add_edge(), add_tweights()
DeleteKostia Could you please let me know how you solved the compilation problem. I am using 4.4.6. Thanks. I am new to the Linux.
DeleteAnonymous, I tried putting this in front of the functions. Still I am facing compilation problem. Could you please paste your modified code
Andreas,
DeleteCould you please let me know how GCO will help in segmenting the image and assigning labels. When I am looking at the code no where Image is being taken as input. If so how it will segment the image. Please let me know. Thanks.
This comment has been removed by the author.
ReplyDeleteA cross-platform solution for compiling gco, just use the following CMakeLists.txt
ReplyDeletecmake_minimum_required (VERSION 2.6)
PROJECT(gco)
SET( gcoSources
GCoptimization.h GCoptimization.cpp
graph.h graph.cpp
LinkedBlockList.h LinkedBlockList.cpp
energy.h
block.h
maxflow.cpp )
SET( LIBRARY_OUTPUT_PATH ${CMAKE_CURRENT_BINARY_DIR}/lib CACHE PATH
"Output for libraries" )
ADD_LIBRARY(gco SHARED ${gcoSources})
I also had to follow the instructions at
http://web.archive.org/web/20101221084639/http://nukeit.org/compile-python-2-7-packages-with-visual-studio-2010-express/
to get distutils to work with windows otherwise very easy. Thanks for the blog!
Thanks :) I was a bit to lazy to do this. If you want, a pull request for github with some instructions would be very welcome.
ReplyDeleteHey Andreas,
ReplyDeleteThanks for the wrapper. I added edge weights as well which was easy, so though of sharing it with you. It uses the setSmoothCostVH function.
Thanks again, Amir
#############
def cut_simple_vh(np.ndarray[np.int32_t, ndim=3, mode='c'] unary_cost,
np.ndarray[np.int32_t, ndim=2, mode='c'] pairwise_cost,
np.ndarray[np.int32_t, ndim=2, mode='c'] costV,
np.ndarray[np.int32_t, ndim=2, mode='c'] costH,
n_iter=5,
algorithm='expansion'):
"""
Apply multi-label graphcuts to grid graph.
Parameters
----------
unary_cost: ndarray, int32, shape=(width, height, n_labels)
Unary potentials
pairwise_cost: ndarray, int32, shape=(n_labels, n_labels)
Pairwise potentials for label compatibility
costV: ndarray, int32, shape=(width, height)
Vertical edge weights
costH: ndarray, int32, shape=(width, height)
Horizontal edge weights
n_iter: int, (default=5)
Number of iterations
algorithm: string, `expansion` or `swap`, default=expansion
Whether to perform alpha-expansion or alpha-beta-swaps.
"""
if unary_cost.shape[2] != pairwise_cost.shape[0]:
raise ValueError("unary_cost and pairwise_cost have incompatible shapes.\n"
"unary_cost must be height x width x n_labels, pairwise_cost must be n_labels x n_labels.\n"
"Got: unary_cost: (%d, %d, %d), pairwise_cost: (%d, %d)"
%(unary_cost.shape[0], unary_cost.shape[1], unary_cost.shape[2],
pairwise_cost.shape[0], pairwise_cost.shape[1]))
if pairwise_cost.shape[1] != pairwise_cost.shape[0]:
raise ValueError("pairwise_cost must be a square matrix.")
cdef int h = unary_cost.shape[1]
cdef int w = unary_cost.shape[0]
cdef int n_labels = pairwise_cost.shape[0]
if (pairwise_cost != pairwise_cost.T).any():
raise ValueError("pairwise_cost must be symmetric.")
if costV.shape[0] != w or costH.shape[0] != w or costV.shape[1] != h or costH.shape[1] != h:
raise ValueError("incorrect costV or costH dimensions.")
cdef GCoptimizationGridGraph* gc = new GCoptimizationGridGraph(h, w, n_labels)
gc.setDataCost(unary_cost.data)
gc.setSmoothCostVH(pairwise_cost.data, costV.data, costH.data)
if algorithm == 'swap':
gc.swap(n_iter)
elif algorithm == 'expansion':
gc.expansion(n_iter)
else:
raise ValueError("algorithm should be either `swap` or `expansion`. Got: %s" % algorithm)
cdef np.npy_intp result_shape[2]
result_shape[0] = w
result_shape[1] = h
cdef np.ndarray[np.int32_t, ndim=2] result = np.PyArray_SimpleNew(2, result_shape, np.NPY_INT32)
cdef int * result_ptr = result.data
for i in xrange(w * h):
result_ptr[i] = gc.whatLabel(i)
return result
ha, the comment section seems to screw all the indentations :(
ReplyDeleteThanks for sharing. If you want, you can add a pull-requestion on github. Otherwise it might take a while until I have time to include it.
ReplyDeleteDear Andreas,
ReplyDeletethis blog is very cool. I'm interested in this gco program, but I didn't find a good example in c++, like middlebury. So ff I have a grayscale image and lets say 4 labels, then how can I run this program, to segment the image into 4 region. I know, that you interested in python, but do you know some examples in the net?
Thanks.
Thanks :)
DeleteYou can find code for middlebury that compares several approaches here: http://vision.middlebury.edu/MRF/
I haven't looked at it in detail but it should be possible to reproduce the middlebury results with it.
Hi there. I'm trying to use your wrappers but after compiling and building them i'm having always the same error : "DLL load failed: The specified module could not be found". I used the depency walker tools on the pygco.pyd and says that At least one required implicit or forwarded dependency was not found: LIBSTDC++-6.DLL, LIBGCC_S_SJLJ-1.DLL. When i look it up on google most says that these are problems of compilers versions but i dont know how to resolve them. Can you help me?
ReplyDeleteThanks
Sorry, I have no idea. You are building using GCC under Windows? Cygwin? Btw, maybe you should give http://hci.iwr.uni-heidelberg.de/opengm2/ a shot. They have a branch containing python wrappers: https://github.com/DerThorsten/opengm/tree/RC2.1
DeleteWith windows. I'm no exepert with C++ and compilers but it worked for me one time, now i don't know whats rong. I cmpiled the files using Windows Visual studio 2012, i Build the wraper and copied the pygco.pyd file to my python interpreter directory. Isn't that what you supose to do?
ReplyDeleteThanks for your comment. The problem is that I can not maintain the source code of the actual implementation for license reasons. You can try to contact the people at western university that actually maintain the software to fix this. Fetching the source is complicated enough, I don't want to maintain a set of patches on their version.
DeleteI had a compilation error on Mac OS.
ReplyDeletegco_src/energy.h:245:3: error: use of undeclared identifier 'add_tweights'
Here are some notes how to install gco_python on mac and resolve this problem:
git clone https://github.com/amueller/gco_python.git
cd gco_python
mkdir gco_src
cd gco_src
curl -O http://vision.csd.uwo.ca/code/gco-v3.0.zip
unzip gco-v3.0.zip
curl -O https://raw.github.com/mjirik/pyseg_base/master/distr/energy.patch
patch energy.h < energy.patch
cd ..
sudo python setup.py install
https://github.com/mjirik/pyseg_base/blob/master/INSTALL
Hi Andy,
ReplyDeleteThe wrapper is awesome. Thank you for sharing.
However, when I feed certain matrices to the cut_simple function, the programme always throw this error message.
terminate called after throwing an instance of 'GCException'
Do you have any idea what happened?
Cheers.
Hi Thanapong.
DeleteI would assume that is because the potentials encoded by your matrix are not submodular. Have you checked that? In that case I would recommend you try pyqpbo. If you are lucky you can just "pip install pyqpbo".
Cheers,
Andy
Hi,
DeleteThank you very much. I will try the pyqpbo.
Same error but installing pyqpbo did not work, any further leads?
DeleteThis comment has been removed by the author.
ReplyDeleteThis comment has been removed by the author.
DeleteHello, may I have question. Some how I tried the function cut_from_graph(*args, **kwargs) but in the end it returns only 2-class instead e.g 3 or more I asked... Any advice? Thanks
ReplyDeleteBtw, I can provide the sample code to have look, but there is no room here to upload it or put images (illustration).
thanks so much but can you give me link for python source i tried alot and not good for my machine
ReplyDeletejeddah transfer furniture company , jeddah cleanning tanks company , jeddah pest control company