Saturday, December 04, 2010

Installing Rmpi under Linux on Ranger at TACC

Rmpi is a package for the R language to allow it to use MPI to run in parallel. Installation on Ranger is complicated by Ranger's selection of compilers, MPI libraries, and default installed packages. I got it to work, so here are hints.

R's default compiler options for installed packages are the same as those used to compile R. The Ranger copy of R was compiled with gcc. MPI libraries on Ranger are compiled with either PGI or Intel compilers, and there can be incompatibilities loading shared libraries from PGI and Intel into gcc code (especially the version of PGI on Ranger). An easy way to get started healthily is to recompile R with the Intel compilers, so "module swap pgi intel" and configure a home directory installation of R using mpicc, mpicxx, and mpif90 with the Known Good Intel options for Ranger, "-O2 -xW -fPIC".

The Rmpi library depends on the R wrapper for the SPRNG library which, of course, depends on SPRNG itself. It also needs gmp. Ranger's gmp library is just fine, so "module add gmp". If you want to know includes and libs for any Ranger library, use "module help ". The SPRNG library has a problem, though, because it wasn't compiled with -fPIC, which is necessary to build shared libraries on this platform.

Rmpi needs SPRNG 2.0, no other version. The Ranger sprng2.0 library is compiled without -fPIC, so we make our own. Download it with "curl http://sprng.cs.fsu.edu/Version2.0/sprng2.0b.tar.gz|tar zxf -". Build the MPI version with the Intel compilers, again, using the Known Good Intel Options, "-O2 -xW -fPIC" and mpicc, mpicxx, mpif90, but add another option Fortran flags to help Fortran compatibility, "-assume 2underscores". This comes from a little bug in the SPRNG autoconf, but it only affects Fortran builds. You have to set this information in sprng2.0/make.CHOICES and sprng2.0/src/make.INTEL.

SPRNG doesn't make a shared executable. No worries. We can rebuild it as a shared library. Go to the directory of libsprng.a, and repack it with a reference to the gmp library.

ar -x libsprng.a
icc -shared -fPIC -Wl,-rpath,$TACC_GMP_LIB -Wl,-zmuldefs *.o -L${TACC_GMP_LIB} -lgmp -o libsprng.so

Were you to forget to add the reference to -lgmp, you would later see that libsprng.so cannot be loaded by R because it cannot find a function __gmp_cmp.

To build rsprng, first download the tar file. Then invoke R's installer with hints about the location of sprng. Mine looked as follows.

~/R/bin/R CMD INSTALL --configure-vars='CFLAGS="-I/opt/apps/gmp/4.2.4/include" LDFLAGS="-L/opt/apps/gmp/4.2.4/lib"' --configure-args='--with-sprng=/share/home/00933/tg459569/sprng' rsprng_1.0.tar.gz

If you get the paths correct, that should build. Now the Rmpi library wants the same attention as Rsprng. First download it. Then let R build it.

~/R/bin/R CMD INSTALL --configure-vars='CFLAGS="-O2 -xW -fPIC" LDFLAGS="-O2 -xW -fPIC"' --configure-args='--with-Rmpi-type=MPICH' Rmpi_0.5-9.tar.gz

When R installs, it will fail its first test because miprun won't work on a Ranger login node. Just go to http://math.acadiau.ca/ACMMaC/Rmpi/index.html and try the tutorial in a batch job.

HTH
Drew

2 comments:

Anonymous said...

I found this post using Google and I'm trying to follow them for my own cluster but I'm hitting a wall in trying to rebuild sprng as a shared library. I can de-archive, but when I try to recompile using gcc (not icc), I get:

/usr/bin/ld: checkid.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC

I'd appreciate any help you might be able to offer; I'm not too experienced when it comes to all this low-level linking/sharing.

Zack said...

Great Post! Can you elaborate on how to install R using mpicc...

Do I need to change all the gcc to mpicc?

Thanks.