author | Ricardo Wurmus <rekado@elephly.net> | 2017-03-24 11:09:03 +0100 |
---|---|---|
committer | Ricardo Wurmus <rekado@elephly.net> | 2017-03-24 11:09:03 +0100 |
commit | 4dea9d78fa2eb3fecc57bbc6ecc28425a190ca4e (patch) | |
tree | 7d56c5dcacbdbd276c60888a329d9c20886732fa /posts | |
parent | d48827971ffd2093e19d073c13faedccb9668b4f (diff) |
posts: Add "Using R with Guix".
Diffstat (limited to 'posts')
-rw-r--r-- | posts/2017-03-24-r-with-guix.skr | 174 |
1 files changed, 174 insertions, 0 deletions
diff --git a/posts/2017-03-24-r-with-guix.skr b/posts/2017-03-24-r-with-guix.skr
new file mode 100644
index 0000000..3fd806b
--- /dev/null
+++ b/posts/2017-03-24-r-with-guix.skr
@@ -0,0 +1,174 @@
+(post
+ :title "Using R with Guix"
+ :date (string->date* "2017-03-24 10:00")
+ :tags '("gnu"
+         "planet-fsfe-en"
+         "free software"
+         "guix"
+         "r")
+
+ (h2 [Introducing the actors])
+
+ (p [For the past few years I have been working on
+,(ref "https://gnu.org/software/guix" "GNU Guix"), a functional
+package manager.  One of the most obvious benefits of a functional
+package manager is that it allows you to install any number of
+variants of a software package into a separate environment, without
+polluting any other environment on your system.  This feature makes a
+lot of sense in the context of scientific computing where it may be
+necessary to use different versions or variants of applications and
+libraries for different projects.])
+
+ (p [Many programming languages come with their own package management
+facilities that some users rely on despite their obvious limitations.
+In the case of GNU R the built-in ,(code "install.packages") procedure
+makes it easy for users to quickly install packages from CRAN, and the
+third-party ,(code "devtools") package extends this mechanism to
+install software from other sources, such as a git repository.])
+
+ (h2 [ABI incompatibilities])
+
+ (p [Unfortunately, limitations in how binaries are executed and
+linked on GNU+Linux systems make it hard for people to continue to use
+the package installation facilities of the language when also using R
+from Guix on a distribution of the GNU system other than GuixSD.
+Packages that are installed through ,(code "install.packages") are
+built on demand.  Some of these packages provide bindings to other
+libraries, which may be available at the system level.  When these
+bindings are built R uses the compiler toolchain and the libraries the
+system provides.  All software in Guix, on the other hand, is
+completely independent from any libraries the host system provides,
+because that's a direct consequence of implementing functional package
+management.  As a result, binaries from Guix do not have binary
+compatibility with binaries built using system tools and linked with
+system libraries.  In other words: due to the lack of a shared ABI
+between Guix binaries and system binaries, packages built with the
+system toolchain and linked with non-Guix libraries cannot be loaded
+into a process of a Guix binary (and vice versa).])
+
+ (p [Of course, this is not always a problem, because not all R
+packages provide bindings to other libraries; but the problem usually
+strikes with more complicated packages where using Guix makes a lot of
+sense as it covers the whole dependency graph.])
+
+ (p [Because of this nasty problem, which cannot be solved without a
+redesign of compiler toolchains and file formats, I have been
+recommending people to just use Guix for everything and avoid mixing
+software installation methods.  Guix comes with many R packages and
+for those that it doesn't include it has an importer for the CRAN and
+Bioconductor repositories, which makes it easy to create Guix package
+expressions for R packages.  While this is certainly valid advice, it
+ignores the habits of long-time R users, who may be really attached to
+,(code "install.packages") or ,(code "devtools").])
+
+
+ (h2 [Schroedinger's Cake])
+
+ (p [There is another way; you can have your cake and eat it too.  The
+problem arises from using the incompatible libraries and toolchain
+provided by the operating system.  So let's just ,(em "not") do this,
+mmkay?  As long as we can make R from Guix use libraries and the
+compiler toolchain from Guix we should not have any of these
+ABI problems when using ,(code "install.packages").])
+
+ (p [Let's create an environment containing the current version of R,
+the GCC toolchain, and the GNU Fortran compiler with Guix.  We could
+use ,(code "guix environment --ad-hoc") here, but it's better to use a
+persistent profile.])
+
+ (pre (code [$ guix package -p /path/to/.guix-profile \
+    -i r gcc-toolchain gfortran]))
+
+ (p [To "enter" the profile I recommend using a sub-shell like this:])
+
+ (pre (code [$ bash
+$ source /path/to/.guix-profile/etc/profile
+$ …
+$ exit]))
+
+ (p [When inside the sub-shell we see that we use both the GCC
+toolchain and R from Guix:])
+
+ (pre (code [$ which gcc
+/gnu/store/…-profile/bin/gcc
+$ which R
+/gnu/store/…-profile/bin/R
+]))
+
+ (p [Note that this is a ,(em "minimal") profile; it contains the GCC
+toolchain with a linker that ensures that e.g. the GNU C library from
+Guix is used at link time.  It does not actually contain any of the
+libraries you may need to build certain packages.])
+
+ (p [Take the R package "Cairo", which provides bindings to the Cairo
+rendering libraries, as an example.  Trying to build this in this new
+environment will fail, because the Cairo libraries are not found.  To
+provide the required libraries we exit the environment, install the
+Guix packages providing the libraries and re-enter the environment.])
+
+ (pre (code [$ exit
+$ guix package -p /path/to/.guix-profile -i cairo libxt
+$ bash
+$ source /path/to/.guix-profile/etc/profile
+$ R
+> install.packages("Cairo")
+…
+ * DONE (Cairo)
+> library(Cairo)
+>]))
+
+ (p [Yay!  This should work for any R package with bindings to any
+libraries that are in Guix.  For this particular case you could have
+installed the ,(code "r-cairo") package using Guix, of course.])
+
+ (h2 [Potential problems and potential solutions])
+
+ (p [What happens if the system provides the required header files and
+libraries?  Will the GCC toolchain from Guix use them?  Yes.  But
+that's okay, because it won't be able to compile and link the binaries
+anyway.  When the files are provided by both Guix ,(em
+"and") the system the toolchain prefers the Guix stuff.])
+
+ (p [It is ,(em "possible") to prevent the R process and all its
+children from ever seeing system libraries, but this requires the use
+of containers, which are not available on somewhat older kernels that
+are commonly used in scientific computing environments.  Guix provides
+support for containers, so if you use a modern Linux kernel on your
+GNU system you can avoid some confusion by using either ,(code "guix
+environment --container") or ,(code "guix container").  Check out
+,(ref
+"http://www.gnu.org/software/guix/manual/html_node/Invoking-guix-environment.html"
+"the glorious manual").])
+
+ (p [Another problem is that the packages you build manually do not
+come with the benefits that Guix provides.  This means, for example,
+that these packages won't be bit-reproducible.  If you want
+bit-reproducible software environments: use Guix and don't look
+back.])
+
+ (h2 [Summary])
+
+ (ul [,(li [Don't mix Guix with system things to avoid ABI conflicts.])
+
+      ,(li [If you use ,(code "install.packages") let R from Guix use
+            the GCC toolchain and libraries from Guix.])
+
+      ,(li [We do this by installing the toolchain and all libraries we
+            need into a separate Guix profile.  R runs inside of that
+            environment.])])
+
+ (h2 [Learn more!])
+
+ (p [If you want to learn more about GNU Guix I recommend taking a
+     look at the excellent ,(ref "https://www.gnu.org/software/guix/"
+     "GNU Guix project page"), which offers links to talks, papers,
+     and the manual.  Feel free to contact me if you want to learn
+     more about packaging scientific software for Guix.  It is not
+     difficult and we all can benefit from joining efforts in adopting
+     this usable, dependable, hackable, and liberating platform for
+     scientific computing with free software.])
+
+ (p [The Guix community is very friendly, supportive, responsive and
+     welcoming.  I encourage you to visit the project’s ,(ref
+     "https://webchat.freenode.net?channels=#guix" "IRC channel #guix
+     on Freenode"), where I go by the handle “rekado”.]))