(post
 :title "Using R with Guix"
 :date (string->date* "2017-03-24 10:00")
 :tags '("gnu"
         "planet-fsfe-en"
         "free software"
         "guix"
         "r")

 (h2 [Introducing the actors])

 (p [For the past few years I have been working on
,(ref "https://gnu.org/software/guix" "GNU Guix"), a functional
package manager.  One of the most obvious benefits of a functional
package manager is that it allows you to install any number of
variants of a software package into a separate environment, without
polluting any other environment on your system.  This feature makes a
lot of sense in the context of scientific computing where it may be
necessary to use different versions or variants of applications and
libraries for different projects.])

 (p [Many programming languages come with their own package management
facilities, which some users rely on despite their obvious limitations.
In the case of GNU R the built-in ,(code "install.packages") procedure
makes it easy for users to quickly install packages from CRAN, and the
third-party ,(code "devtools") package extends this mechanism to
install software from other sources, such as a git repository.])

 (h2 [ABI incompatibilities])

 (p [Unfortunately, limitations in how binaries are executed and
linked on GNU+Linux systems make it hard for people to continue to use
the package installation facilities of the language when also using R
from Guix on a distribution of the GNU system other than GuixSD.
Packages that are installed through ,(code "install.packages") are
built on demand.  Some of these packages provide bindings to other
libraries, which may be available at the system level.  When these
bindings are built, R uses the compiler toolchain and the libraries the
system provides.  All software in Guix, on the other hand, is
completely independent of any libraries the host system provides;
this is a direct consequence of implementing functional package
management.  As a result, binaries from Guix do not have binary
compatibility with binaries built using system tools and linked with
system libraries.  In other words: due to the lack of a shared ABI
between Guix binaries and system binaries, packages built with the
system toolchain and linked with non-Guix libraries cannot be loaded
into a process of a Guix binary (and vice versa).])

 (p [Of course, this is not always a problem, because not all R
packages provide bindings to other libraries; but the problem usually
strikes with more complicated packages where using Guix makes a lot of
sense as it covers the whole dependency graph.])

 (p [Because of this nasty problem, which cannot be solved without a
redesign of compiler toolchains and file formats, I have been
recommending that people just use Guix for everything and avoid mixing
software installation methods.  Guix comes with many R packages, and
for those that it doesn't include it has an importer for the CRAN and
Bioconductor repositories, which makes it easy to create Guix package
expressions for R packages.  While this is certainly valid advice, it
ignores the habits of long-time R users, who may be really attached to
,(code "install.packages") or ,(code "devtools").])
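
 (p [As a rough sketch of that importer workflow, the commands below
print a Guix package expression for a CRAN package and for a
Bioconductor package.  The package names are only illustrative, and the
generated expressions usually need a review before they can be used.])

 (pre (code [$ guix import cran Cairo
$ guix import cran --archive=bioconductor GenomicRanges]))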
         

 (h2 [Schroedinger's Cake])

 (p [There is another way; you can have your cake and eat it too.  The
problem arises from using the incompatible libraries and toolchain
provided by the operating system.  So let's just ,(em "not") do this,
mmkay?  As long as we can make R from Guix use libraries and the
compiler toolchain from Guix we should not have any of these
ABI problems when using ,(code "install.packages").])

 (p [Let's create an environment containing the current version of R,
the GCC toolchain, and the GNU Fortran compiler with Guix.  We could
use ,(code "guix environment --ad-hoc") here, but it's better to use a
persistent profile.])

 (pre (code [$ guix package -p /path/to/.guix-profile \
    -i r gcc-toolchain gfortran]))
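
 (p [For comparison, the throw-away variant mentioned above would look
roughly like this.  It spawns a shell with the same packages, but the
environment itself does not persist once that shell exits.  The rest
of this post assumes the persistent profile.])

 (pre (code [$ guix environment --ad-hoc r gcc-toolchain gfortran]))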

 (p [To "enter" the profile I recommend using a sub-shell like this:])

 (pre (code [$ bash
$ source /path/to/.guix-profile/etc/profile
$ …
$ exit]))

 (p [When inside the sub-shell we see that we use both the GCC
toolchain and R from Guix:])

 (pre (code [$ which gcc
/gnu/store/…-profile/bin/gcc
$ which R
/gnu/store/…-profile/bin/R
]))

 (p [Note that this is a ,(em "minimal") profile; it contains the GCC
toolchain with a linker that ensures that e.g. the GNU C library from
Guix is used at link time.  It does not actually contain any of the
libraries you may need to build certain packages.])

 (p [Take the R package "Cairo", which provides bindings to the Cairo
rendering libraries, as an example.  Trying to build it in this new
environment will fail, because the Cairo libraries are not found.  To
provide the required libraries we exit the environment, install the
Guix packages providing the libraries, and re-enter the environment.])

 (pre (code [$ exit
$ guix package -p /path/to/.guix-profile -i cairo libxt
$ bash
$ source /path/to/.guix-profile/etc/profile
$ R
> install.packages("Cairo")
…
 * DONE (Cairo)
> library(Cairo)
>]))

 (p [Yay!  This should work for any R package with bindings to any
libraries that are in Guix.  For this particular case you could have
installed the ,(code "r-cairo") package using Guix, of course.])
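
 (p [To double-check that the freshly built bindings really link
against Guix libraries, one could inspect the shared object that R
installed.  This is only a sketch: it assumes that ,(code "ldd") and
,(code "grep") are available in the sub-shell and that the package
ships its library as ,(code "libs/Cairo.so"), which may differ on your
setup.])

 (pre (code [$ so=$(Rscript -e 'cat(system.file("libs", "Cairo.so", package = "Cairo"))')
$ ldd "$so" | grep cairo]))

 (p [If everything worked as described, the paths printed should point
into ,(code "/gnu/store") rather than into a system library
directory.])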

 (h2 [Potential problems and potential solutions])

 (p [What happens if the system provides the required header files and
libraries?  Will the GCC toolchain from Guix use them?  Yes.  But
that's okay, because it won't be able to compile and link the binaries
anyway.  When the files are provided by both Guix ,(em
"and") the system, the toolchain prefers the versions from Guix.])

 (p [It is ,(em "possible") to prevent the R process and all its
children from ever seeing system libraries, but this requires the use
of containers, which are not available on somewhat older kernels that
are commonly used in scientific computing environments.  Guix provides
support for containers, so if you use a modern Linux kernel on your
GNU system you can avoid some confusion by using either ,(code "guix
environment --container") or ,(code "guix container").  Check out
,(ref
"http://www.gnu.org/software/guix/manual/html_node/Invoking-guix-environment.html"
"the glorious manual").])

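 (p [As a minimal sketch of the container approach, the command below
spawns an isolated shell that contains only the listed Guix packages.
The package list simply mirrors the profile built earlier and is an
assumption; building R packages inside the container will probably
require adding further build tools to it.  The ,(code "--network")
option shares the host's network with the container, which a download
from CRAN would need.])

 (pre (code [$ guix environment --container --network \
    --ad-hoc r gcc-toolchain gfortran cairo libxt]))
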
 (p [Another problem is that the packages you build manually do not
come with the benefits that Guix provides.  This means, for example,
that these packages won't be bit-reproducible.  If you want
bit-reproducible software environments: use Guix and don't look
back.])

 (h2 [Summary])

 (ul [,(li [Don't mix Guix with system things to avoid ABI conflicts.])
 
      ,(li [If you use ,(code "install.packages") let R from Guix use
            the GCC toolchain and libraries from Guix.])

      ,(li [We do this by installing the toolchain and all libraries we
            need into a separate Guix profile.  R runs inside of that
            environment.])])

 (h2 [Learn more!])
 
  (p [If you want to learn more about GNU Guix I recommend taking a
     look at the excellent ,(ref "https://www.gnu.org/software/guix/"
     "GNU Guix project page"), which offers links to talks, papers,
     and the manual.  Feel free to contact me if you want to learn
     more about packaging scientific software for Guix.  It is not
     difficult and we all can benefit from joining efforts in adopting
     this usable, dependable, hackable, and liberating platform for
     scientific computing with free software.])

 (p [The Guix community is very friendly, supportive, responsive and
     welcoming.  I encourage you to visit the project’s ,(ref
     "https://webchat.freenode.net?channels=#guix" "IRC channel #guix
     on Freenode"), where I go by the handle “rekado”.])

 (p [Read ,(ref "/tags/guix.html" "more posts
     about GNU Guix here").]))