Monday, October 03, 2011

Installation of Boost.Python on Mac OS X

With the current MacPorts version of Boost 1.47.0, I can't follow the Boost.Python installation instructions. I installed the three relevant ports: boost +python27, boost-build, and boost-jam. The installation instructions recommend starting from the python/quickstart directory, but the include paths in its Jamfile don't exist. The Jamfile in the port is even missing the "import python" statement needed to load the python-extension rule. The lesson is that it's OK to give up on a MacPorts installation when you are using an unusual feature.

Install boost-1.47.0 from source. Given that there is already a MacPorts installation, whose default install directory is /opt/local, put the boost_1_47_0 directory directly into /opt as /opt/boost_1_47_0. Follow the build instructions to make Boost's bjam and b2, so that the whole lot ends up in /opt/bin, /opt/include, and /opt/lib. My build command was:

sudo ./ --with-bjam=/opt/local/bin/bjam --with-toolset=darwin --with-python=/opt/local/bin/python2.7 --prefix=/opt --without-libraries=mpi,regex

I was trying to use the MacPorts bjam, but Boost built its own, anyway, which turns out to be good because it builds the newer b2 version of bjam. Boost.Python defaults to the Mac OS X default Python, so why not specify your favorite version? Then return to the Boost.Python installation instructions. I had to set the path so that the newer Boost is earlier:

export DYLD_LIBRARY_PATH=/opt/lib
export PATH=/opt/bin:$PATH
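If you re-source your profile a few times, /opt/bin can pile up in PATH or end up behind /opt/local/bin again. Here is a minimal sketch of a helper that always puts one directory first; prepend_path is a name I made up for illustration, not something Boost or MacPorts provides.

```shell
# Hypothetical helper (not part of Boost or MacPorts): put a directory
# first in a colon-separated list and drop any later copy of it.
# The body runs in a subshell so IFS and the variables do not leak.
prepend_path() (
    dir=$1
    list=$2
    result=$dir
    IFS=:
    for entry in $list; do
        # Skip empty entries and any existing copy of the directory
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            result="$result:$entry"
        fi
    done
    printf '%s\n' "$result"
)

PATH=$(prepend_path /opt/bin "$PATH")
export PATH
# Show which b2 the shell would now run, if any
command -v b2 || echo "b2 not found on PATH"
```

The same helper should work for DYLD_LIBRARY_PATH, since it only manipulates the string.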

There are still going to be errors about conflicts with isspace and other functions in localefwd.h. These come from a conflict with newer definitions in pyport.h that are designed to handle UTF-8. I got around these by installing the MacPorts port for gcc45. Then, in the Jamroot of the sample directory, add:

using darwin : 4.5.3 : g++-mp-4.5 ;

The darwin toolset is derived from the gcc toolset, so you can give it pretty much the same options. In this case, it points directly to g++. Once this is done, you can run "sudo bjam" in the quickstart directory to build Boost.Python. This builds the libboost_python library, so from then on you can run bjam without sudo in your project's working directory.

Still not done. If you are using NumPy arrays in Boost.Python, then you need to include the correct headers, meaning '#include "numpy/arrayobject.h"'. These are installed in a separate place on my Mac, again likely by MacPorts, but the compiler can find them after a change to the python-extension rule:

python-extension myproject : file.cpp
: <include>/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/include ;

How do you find these things? Use Spotlight's mdfind, the Mac's whole-machine search, on the command line.

mdfind -name arrayobject.h -onlyin /opt
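mdfind only exists on the Mac and only sees what Spotlight has indexed. As a rough alternative, plain find can do the same job and also strip the hit down to the directory the compiler needs. This is a sketch; find_include_dir is a hypothetical name, and it assumes the header lives under a path ending in numpy/arrayobject.h.

```shell
# Hypothetical helper: print the directory to use as an include path
# for a header such as numpy/arrayobject.h found below a root directory.
# Prints nothing and returns 1 when the header is not found.
find_include_dir() (
    header=$1   # e.g. numpy/arrayobject.h
    root=$2     # e.g. /opt
    hit=$(find "$root" -path "*/$header" 2>/dev/null | head -n 1)
    [ -n "$hit" ] || exit 1
    # Remove the trailing /numpy/arrayobject.h to get the include directory
    printf '%s\n' "${hit%/$header}"
)
```

Calling it as "find_include_dir numpy/arrayobject.h /opt" should print the directory to hand to the python-extension rule.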

In the end, the boost-build.jam contains "boost-build /opt/boost_1_47_0/tools/build/v2 ;" and the project Jamroot uses "use-project boost : /opt/boost_1_47_0".


Wednesday, March 09, 2011

skipping incompatible in /usr/lib64

I was building a 32-bit executable on a 64-bit machine, using the -m32 switch, and thought I knew how to deal with this error from the g++ or gcc compiler.

/usr/bin/ld: skipping incompatible /usr/lib64/ when searching for -lelf
/usr/bin/ld: skipping incompatible /usr/lib64/libelf.a when searching for -lelf
/usr/bin/ld: cannot find -lelf

I checked the link line for any binaries that might be 64-bit by running the file command, as in "file tau_run.o". They all looked 32-bit. I checked my LIBRARY_PATH environment variable, which tells the linker in which directories to find libraries.
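If the file command is not handy, the word size can be read straight from the ELF header: the byte at offset 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit. The sketch below does that with od; elf_class is a name I made up, and it only handles plain ELF objects and shared libraries, not .a archives, whose members would have to be extracted first.

```shell
# Hypothetical helper: report whether an ELF file is 32-bit or 64-bit
# by reading the EI_CLASS byte of its header.
elf_class() (
    # The first four bytes of any ELF file are 0x7f 'E' 'L' 'F'
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d '[:space:]')
    if [ "$magic" != 7f454c46 ]; then
        echo not-elf
        exit 0
    fi
    # EI_CLASS is the byte at offset 4: 1 = 32-bit, 2 = 64-bit
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d '[:space:]')
    case $class in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
)
```

Running "elf_class tau_run.o" on each object from the link line gives the same answer I was reading out of the file command.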

The problem turned out to be not that the compiler was looking in the wrong place but that it was looking for a file that did not exist. The dyninstAPI library has a file called, but they did not include a link from to, which is what the compiler wants to find. The compiler was doing a great job of rejecting 64-bit libraries, and its error message was actually rather clear that it could not find the -lelf it was looking for.
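For -lNAME, the linker looks in each search directory for libNAME.so and then libNAME.a, and a versioned file name alone does not count. Here is a small sketch of that lookup, so you can see what the linker would see. find_link_lib is a hypothetical name, and the sketch ignores the architecture check that produces the "skipping incompatible" lines.

```shell
# Hypothetical helper: mimic the name lookup the linker does for -lNAME.
# Usage: find_link_lib NAME DIR...
# Prints the first lib<NAME>.so or lib<NAME>.a found, or returns 1.
find_link_lib() (
    name=$1
    shift
    for dir in "$@"; do
        # In each directory the shared library is tried before the archive
        for candidate in "$dir/lib$name.so" "$dir/lib$name.a"; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$candidate"
                exit 0
            fi
        done
    done
    exit 1
)
```

Call it with elf and the directories from your link line. If it prints nothing for the directory that holds the 32-bit library, the unversioned link is what is missing.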

As usual, HTH.

Tuesday, February 08, 2011

How Does R Build a Package with C Source Code?

According to Writing R Extensions, I put C files into a package subdirectory and supply a and, and the command “R CMD INSTALL” will compile the extensions. I’m working in Linux, so it uses configure somehow, but what exactly does it do? Where do the defaults come from and how can I change them? As of R version 2.12.1, the build scripts are in the R tools package, no longer written in Perl.

There are a lot of files to put in an R package, from help files to a general description of the library. We are concerned with these three, as an example, for a library called foo.
  • foo/ - Input to create a configure script.
  • foo/src/ - The configure script will create Makevars from this file.
  • foo/src/foo.c - This is our C source to compile.
There are three commands R makes available to work with source while you develop it.
  • R CMD build foo - This creates a tar file of the directory suitable for installing later.
  • R CMD INSTALL foo - This compiles and installs foo into a directory.
  • R CMD check foo - This does all kinds of detailed checks on the health of the R package, all listed in Writing R Extensions, and it also calls R CMD INSTALL, so it’s a good way to smoke test compilation.
If you call “R CMD check foo”, then it calls the library tools:::.check_packages() which eventually installs the library into a local directory by calling R again:

R CMD INSTALL -l '/home/username/Documents/rlib/foo.Rcheck' --no-html --no-multiarch '/Users/ajd27/Documents/rlib/foo'

As an aside, calling “R CMD blah” sets R environment variables and then invokes a shell script which looks for a script called blah in R’s bin directory. If it doesn’t find one, it just executes whatever you passed it. Try “R CMD ls -la .” or “R CMD env|sort” to see what environment variables R defines.
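The dispatch rule is simple enough to sketch in a few lines. This is my own illustration of the behavior described above, not R's actual script: r_cmd_sketch is a made-up name, the bin directory is passed in by hand, and the environment setup is left out.

```shell
# Illustration only, not R's real code: run a script from the given
# bin directory if one exists, otherwise run the command as typed.
# Usage: r_cmd_sketch BINDIR COMMAND [ARGS...]
r_cmd_sketch() (
    bindir=$1
    cmd=$2
    shift 2
    if [ -x "$bindir/$cmd" ]; then
        "$bindir/$cmd" "$@"
    else
        "$cmd" "$@"
    fi
)
```

With this, "r_cmd_sketch /some/bin ls -la ." falls through to the ordinary ls, just as “R CMD ls -la .” does.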

The install command is implemented in tools:::.install_packages(). The easiest way to see what it does is to look in the R source code, in the src/library/tools/R/install.R. It executes these steps on your behalf.
  1. Define R-specific variables. These are listed below for one sample.
  2. Call autoconf to create foo/configure from foo/
  3. Call foo/configure, whose main goal is to make foo/src/Makevars from foo/src/
  4. Look for makefiles in foo/src and call make in foo/src to create shared libraries.
We can specify arguments to configure on the INSTALL command line, with --configure-args and --configure-vars. For instance, typing

R CMD INSTALL "--configure-args=--enable-lizards --disable-frogs" foo

will call

./configure --enable-lizards --disable-frogs

Use of quotation marks varies depending on the shell. The only way R modifies the execution of the configure command is to define the variables listed at the end of this post. The Guide to Writing R Extensions, however, recommends that authors of configure scripts use R to set defaults using R’s config command. Try

R CMD config --help

to see a list of variables R remembers from when it was configured and compiled.
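Inside a configure script, that recommendation boils down to asking R first and falling back to something sensible. A minimal sketch, assuming a made-up helper name r_config_or_default and a fallback value of my own choosing:

```shell
# Hypothetical helper for a configure script: ask R for a build variable
# such as CC, and fall back to a default if R is missing or has no answer.
# Usage: r_config_or_default VARIABLE FALLBACK
r_config_or_default() (
    var=$1
    fallback=$2
    value=
    if command -v R >/dev/null 2>&1; then
        # "R CMD config CC" prints the C compiler R was built with
        value=$(R CMD config "$var" 2>/dev/null) || value=
    fi
    printf '%s\n' "${value:-$fallback}"
)

CC=$(r_config_or_default CC cc)
echo "C compiler: $CC"
```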

When R calls make, it tacks a few files together. The first is the Makevars that configure just customized. The next is a list of variables, mostly from when R, itself, was configured. The last, shlib, is the target to build a shared library.

make -f Makevars -f /opt/local/lib/R/etc/x86_64/Makeconf -f /opt/local/lib/R/share/make/ SHLIB='' OBJECTS='foo.o'

The only variables not defined explicitly with Makeconf are
  • PKG_CFLAGS - Flags for the C compiler.
  • PKG_CPPFLAGS - For the C preprocessor; -I include directories and -D definitions normally go here.
  • PKG_CXXFLAGS - For the C++ compiler.
  • PKG_OBJCFLAGS - Objective C’s CFLAGS.
  • PKG_OBJCXXFLAGS - Objective C++’s CFLAGS.
  • PKG_LIBS - Where we put libraries and the directories that hold them.
These are the only variables we should bother to define within Everything else, from CXX to CFLAGS, is already explicitly within Makeconf, so tough cookies if we want to change it, unless we are willing to make a custom target in our own Makefile in the src directory.

For our package, foo, we want to give the person installing the software a way to customize the include directories and library locations, so we probably want to check in our for the existence of FOO_CFLAGS and FOO_LIBS and assign those values to PKG_CFLAGS and PKG_LIBS. Using package-specific names helps when multiple packages are installed, and using variables at all saves people installing the package from figuring out how to pass command-line arguments to R.
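To make the hand-off concrete, here is a hand-rolled stand-in for what autoconf's substitution does: read a Makevars template on standard input and replace @PKG_CFLAGS@ and @PKG_LIBS@ with the values of FOO_CFLAGS and FOO_LIBS. It is a sketch, not what autoconf generates; fill_makevars and escape_sed are names I made up, and unset variables simply become empty.

```shell
# Escape characters that are special in a sed replacement (\, & and the
# | delimiter used below) so paths and flags pass through unchanged.
escape_sed() {
    printf '%s' "$1" | sed 's/[\\&|]/\\&/g'
}

# Hypothetical stand-in for autoconf substitution: copy a Makevars
# template from stdin to stdout, filling in the two placeholders.
fill_makevars() (
    cflags=$(escape_sed "${FOO_CFLAGS:-}")
    libs=$(escape_sed "${FOO_LIBS:-}")
    sed -e "s|@PKG_CFLAGS@|$cflags|g" -e "s|@PKG_LIBS@|$libs|g"
)
```

Someone installing foo would then export FOO_CFLAGS and FOO_LIBS before running R CMD INSTALL, and the generated Makevars picks them up.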

Sample Variables Defined Before Configure and Make

EGREP=/usr/bin/grep -E
LN_S=ln -s