Changing Python search path - for good. Or: the magic of the 'site' package.
After using Python for about two years now and being a somewhat active developer, I still frequently run into problems with my Python search path.
Luckily, I usually have root on the boxes I work on, so I could do some hacks.
At the moment, I'm at the IST Austria, where I can use the cluster, but I don't have root. So this time I needed a real solution.
Here it goes:
The problem is the following: I have some locally installed packages, like scikit-learn and joblib, that are not globally installed. That would be easy enough to solve by setting the PYTHONPATH environment variable in my .profile, pointing to the install location.

But I also have newer versions of already installed packages, like IPython. Here, modifying PYTHONPATH didn't help in my case: the globally installed versions (for instance eggs registered through an easy-install.pth file) still ended up ahead of my PYTHONPATH entries in the search path. A somewhat hacky solution is to insert your package dir into the search path at the beginning of each script, like so:

    import sys
    sys.path.insert(0, "MYPATH")

That is somewhat nasty, as you have to insert it into every file. Also, I found it gives you trouble when trying to use parallel computing.
So here is the REAL solution:
*drumroll*
Check out the site package. It tells you how to configure "sites", which are places where Python looks for packages.
Additionally, "site" directories can hold ".pth" files. Those are files that tell Python where to look for additional packages.
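To see what a ".pth" file does without touching a real installation, here is a minimal sketch. The directory names and the file name demo.pth are made up for the demo and live in a temporary directory. It uses site.addsitedir, which registers a directory as a site and processes the ".pth" files inside it.

```python
import os
import site
import sys
import tempfile

# Hypothetical layout, created in a temp dir purely for this demo.
base = tempfile.mkdtemp()
site_dir = os.path.join(base, "my-site")    # plays the role of a site directory
extra_dir = os.path.join(base, "checkout")  # a directory the .pth file points to
os.makedirs(site_dir)
os.makedirs(extra_dir)

# A .pth file lists one directory per line; directories that do not exist are skipped.
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write(extra_dir + "\n")

# addsitedir appends site_dir to sys.path and processes every *.pth file in it.
site.addsitedir(site_dir)

print(site_dir in sys.path)
print(extra_dir in sys.path)
```

Both directories end up at the end of sys.path, which is why a plain ".pth" file alone is not enough to override an already installed package.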
You can add sites by using site.addsitedir, but each user also has a standard site; on Linux it is ~/.local/lib/pythonX.Y/site-packages. You can also get this dir by looking at site.USER_SITE.
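A quick way to look up the user site on a given machine is sketched below. It assumes Python 2.7 or newer, where site.getusersitepackages() is available; on older interpreters you would read site.USER_SITE directly.

```python
import site
import sys

# Per-user site-packages directory, e.g. ~/.local/lib/pythonX.Y/site-packages on Linux.
user_site = site.getusersitepackages()
print("user site:", user_site)

# False or None means the interpreter will not add the user site (e.g. started with -s).
print("enabled:", site.ENABLE_USER_SITE)

# The directory is only put on sys.path if it actually exists.
print("on sys.path:", user_site in sys.path)
```

If the last line prints False, the directory probably does not exist yet or the user site is disabled for this interpreter.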
What I did was create a link at ~/.local/lib/pythonX.Y/site-packages that points at my local install location.
As not all my packages are installed there (the things that are git checkouts and built in place), I also added a .pth file there that points to my other directories.

To make this a bit more concrete:
My locally installed packages are in ~/python_packages/lib/python2.6/site-packages/.
So I ran (the link target comes first, the link name second; this assumes ~/.local/lib/python2.6/site-packages does not exist yet):

    $ ln -s ~/python_packages/lib/python2.6/site-packages/ ~/.local/lib/python2.6/site-packages

And created a new .pth file, ~/python_packages/lib/python2.6/site-packages/local.pth, containing:

    import sys; sys.__plen = len(sys.path)
    /clusterhome/amueller/checkout/joblib
    /clusterhome/amueller/checkout/scikit-learn
    ./IPython
    import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

Each import line has to stay on a single line, because .pth files are processed line by line. The first and last lines are taken from the
easy-install.pth file that setuptools writes, and they basically move the paths added in between to the front of the search path. (Only lines starting with import are executed in .pth files.)
The other lines add directories to the path.
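To make the trick in the first and last lines easier to follow, here is a small simulation on a plain list. The function name simulate_pth and the example lists are made up for illustration, and the sketch ignores that site also skips missing or duplicate directories.

```python
def simulate_pth(path, pth_dirs, egginsert=0):
    """Mimic the __plen/__egginsert lines on a plain list (sys.path is not touched)."""
    path = list(path)
    plen = len(path)                 # first line: remember the current length
    path.extend(pth_dirs)            # middle lines: site appends each directory
    new = path[plen:]                # last line: cut off what was just appended...
    del path[plen:]
    path[egginsert:egginsert] = new  # ...and splice it in at the insert position
    return path, egginsert + len(new)


before = ["/usr/lib/python2.6", "/usr/lib/python2.6/dist-packages"]
mine = [
    "/clusterhome/amueller/checkout/joblib",
    "/clusterhome/amueller/checkout/scikit-learn",
]

after, next_insert = simulate_pth(before, mine)
print(after)        # the two checkout directories now come first
print(next_insert)  # 2: where the next .pth file using this trick would insert
```

The updated insert position is what keeps several .pth files from reversing each other's order: each one inserts right behind the entries the previous one moved to the front.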
You can check whether you were successful in IPython:

    In [1]: import sys

    In [2]: sys.path
    Out[2]:
    ['',
     '/clusterhome/amueller/python_packages/bin',
     '/clusterhome/amueller/.local/lib/python2.6/site-packages/pyflakes-0.5.0-py2.6.egg',
     '/clusterhome/amueller/checkout/joblib',
     '/clusterhome/amueller/checkout/scikit-learn',
     '/clusterhome/amueller/.local/lib/python2.6/site-packages/IPython',
     '/usr/local/lib/python2.6/dist-packages/scikit_learn-0.9-py2.6-linux-x86_64.egg',
     '/usr/lib/python2.6',
     '/usr/lib/python2.6/plat-linux2',
     '/usr/lib/python2.6/lib-tk',
     '/usr/lib/python2.6/lib-old',
     '/usr/lib/python2.6/lib-dynload',
     '/clusterhome/amueller/.local/lib/python2.6/site-packages',
     '/usr/local/lib/python2.6/dist-packages',
     '/usr/lib/python2.6/dist-packages',
     '/usr/lib/python2.6/dist-packages/PIL',
     '/usr/lib/pymodules/python2.6',
     '/usr/lib/pymodules/python2.6/gtk-2.0',
     '/usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode',
     '/clusterhome/amueller/.local/lib/python2.6/site-packages/IPython/extensions']

Success :) Hope that helped anyone!
Why not use virtualenv instead?
I feel virtualenv is somewhat overkill for this simple problem. I just want one environment, which is mine.
That is true. But wouldn't a virtualenv be easier to back up and transfer to other machines?
Yeah, probably.
I think using virtualenv has a lot of advantages.
The first off the top of my head would be the ability to replicate your environment very quickly, using pip freeze to get all the packages, and then upgrade only the one you need to try, without losing your current working environment.
virtualenvwrapper also makes it super easy to create a new virtualenv and switch between them.
However you knew the package already, so maybe I'm missing something here!
Maybe I should use virtualenv. To be honest, I never really used it. I try to keep things simple and just adding some directories to my search path was all I wanted ;)