(post
 :title "Using R with Guix"
 :date (string->date* "2017-03-24 10:00")
 :tags '("gnu"
         "planet-fsfe-en"
         "free software"
         "guix"
         "r")
(h2 [Introducing the actors])
(p [For the past few years I have been working on
,(ref "" "GNU Guix"), a functional
package manager. One of the most obvious benefits of a functional
package manager is that it allows you to install any number of
variants of a software package into a separate environment, without
polluting any other environment on your system. This feature makes a
lot of sense in the context of scientific computing where it may be
necessary to use different versions or variants of applications and
libraries for different projects.])
(p [Many programming languages come with their own package management
facilities, which some users rely on despite their obvious limitations.
In the case of GNU R the built-in ,(code "install.packages") procedure
makes it easy for users to quickly install packages from CRAN, and the
third-party ,(code "devtools") package extends this mechanism to
install software from other sources, such as a git repository.])
(h2 [ABI incompatibilities])
(p [Unfortunately, limitations in how binaries are executed and
linked on GNU+Linux systems make it hard for people to continue to use
the package installation facilities of the language when also using R
from Guix on a distribution of the GNU system other than GuixSD.
Packages that are installed through ,(code "install.packages") are
built on demand. Some of these packages provide bindings to other
libraries, which may be available at the system level. When these
bindings are built, R uses the compiler toolchain and the libraries the
system provides. All software in Guix, on the other hand, is
completely independent from any libraries the host system provides;
this is a direct consequence of implementing functional package
management. As a result, binaries from Guix do not have binary
compatibility with binaries built using system tools and linked with
system libraries. In other words: due to the lack of a shared ABI
between Guix binaries and system binaries, packages built with the
system toolchain and linked with non-Guix libraries cannot be loaded
into a process of a Guix binary (and vice versa).])
(p [Of course, this is not always a problem, because not all R
packages provide bindings to other libraries; but the problem usually
strikes with more complicated packages where using Guix makes a lot of
sense as it covers the whole dependency graph.])
(p [Because of this nasty problem, which cannot be solved without a
redesign of compiler toolchains and file formats, I have been
recommending that people just use Guix for everything and avoid mixing
software installation methods. Guix comes with many R packages, and
for those that it doesn't include, it has an importer for the CRAN and
Bioconductor repositories, which makes it easy to create Guix package
expressions for R packages. While this is certainly valid advice, it
ignores the habits of long-time R users, who may be really attached to
,(code "install.packages") or ,(code "devtools").])
(h2 [Schroedinger's Cake])
(p [There is another way; you can have your cake and eat it too. The
problem arises from using the incompatible libraries and toolchain
provided by the operating system. So let's just ,(em "not") do this,
mmkay? As long as we can make R from Guix use libraries and the
compiler toolchain from Guix we should not have any of these
ABI problems when using ,(code "install.packages").])
(p [Let's use Guix to create an environment containing the current
version of R, the GCC toolchain, and the GNU Fortran compiler. We could
use ,(code "guix environment --ad-hoc") here, but it's better to use a
persistent profile.])
(pre (code [$ guix package -p /path/to/.guix-profile \
    -i r gcc-toolchain gfortran]))
(p [To "enter" the profile I recommend using a sub-shell like this:])
(pre (code [$ bash
$ source /path/to/.guix-profile/etc/profile
$ …
$ exit]))
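(p [To see why the sub-shell matters, here is a minimal sketch that
runs without Guix. It uses a throwaway directory as a stand-in for
,(code "/path/to/.guix-profile") and a one-line ,(code "etc/profile");
both are assumptions made for illustration, since a real profile
exports more than just ,(code "PATH").])

```shell
#!/usr/bin/env bash
# Sketch: sourcing a profile inside a sub-shell leaves the parent shell untouched.
# The directory below is a throwaway stand-in for /path/to/.guix-profile.
fake_profile=$(mktemp -d)
mkdir -p "$fake_profile/etc" "$fake_profile/bin"

# Stand-in for the real etc/profile: it only prepends the profile's bin directory.
printf 'export PATH="%s/bin:$PATH"\n' "$fake_profile" > "$fake_profile/etc/profile"

path_before=$PATH

# The child bash sources the profile; its environment changes end with it.
first_entry_in_child=$(bash -c '. "$1/etc/profile"; printf "%s" "${PATH%%:*}"' _ "$fake_profile")

path_after=$PATH

echo "first PATH entry inside the sub-shell: $first_entry_in_child"
if [ "$path_before" = "$path_after" ]; then
  echo "parent PATH unchanged"
fi

rm -rf "$fake_profile"
```

(p [Leaving the sub-shell with ,(code "exit") is therefore enough to
get back to the environment you started from.])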
(p [When inside the sub-shell we see that we use both the GCC
toolchain and R from Guix:])
(pre (code [$ which gcc
/gnu/store/…-profile/bin/gcc
$ which R
/gnu/store/…-profile/bin/R
]))
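(p [The same check can be scripted. The helper below is a hypothetical
convenience, not something Guix ships: ,(code "same_bin_dir") succeeds
only if all the commands you name resolve to one and the same directory
on ,(code "PATH"). It assumes the names refer to executables rather
than shell builtins or aliases.])

```shell
# Hypothetical helper: succeed only if every named command resolves
# to the same directory on PATH, and print that directory.
same_bin_dir() {
  local first_dir tool resolved dir
  first_dir=""
  for tool in "$@"; do
    # command -v prints the full path for executables found on PATH.
    resolved=$(command -v "$tool") || { echo "$tool: not found" >&2; return 1; }
    dir=$(dirname "$resolved")
    if [ -z "$first_dir" ]; then
      first_dir=$dir
    elif [ "$dir" != "$first_dir" ]; then
      echo "$tool comes from $dir, expected $first_dir" >&2
      return 1
    fi
  done
  echo "$first_dir"
}
```

(p [Inside the sub-shell you would call it as
,(code "same_bin_dir gcc R gfortran"); a non-zero exit status means at
least one of the tools is being picked up from somewhere else.])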
(p [Note that this is a ,(em "minimal") profile; it contains the GCC
toolchain with a linker that ensures that e.g. the GNU C library from
Guix is used at link time. It does not actually contain any of the
libraries you may need to build certain packages.])
(p [Take the R package "Cairo", which provides bindings to the Cairo
rendering libraries, as an example. Trying to build it in this new
environment will fail, because the Cairo libraries are not found. To
provide the required libraries we exit the environment, install the
Guix packages providing the libraries and re-enter the environment.])
(pre (code [$ exit
$ guix package -p /path/to/.guix-profile -i cairo libxt
$ bash
$ source /path/to/.guix-profile/etc/profile
$ R
> install.packages("Cairo")
* DONE (Cairo)
> library(Cairo)
>]))
(p [Yay! This should work for any R package with bindings to any
libraries that are in Guix. For this particular case you could have
installed the ,(code "r-cairo") package using Guix, of course.])
(h2 [Potential problems and potential solutions])
(p [What happens if the system provides the required header files and
libraries? Will the GCC toolchain from Guix use them? Yes. But
that's okay, because it won't be able to compile and link the binaries
anyway. When the files are provided by both Guix ,(em
"and") the system, the toolchain prefers the Guix stuff.])
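(p [If you would rather check than trust, you can inspect what a
compiled package actually resolves at load time. The sketch below is a
hypothetical helper that filters the output of ,(code "ldd") and lists
every library that does not live under ,(code "/gnu/store"). It assumes
the default store location and the usual
,(code "name => /path (address)") line format; the sample input is made
up for illustration.])

```shell
# Hypothetical helper: read ldd-style output on stdin and print every
# library that resolves to a path outside /gnu/store (assumed store location).
non_store_libs() {
  awk '
    # Only consider "name => /absolute/path ..." lines; skip "not found"
    # entries and lines without "=>" (such as the vdso).
    $2 == "=>" && $3 ~ /^\// && $3 !~ /^\/gnu\/store\// { print $1 " -> " $3 }
  '
}

# Made-up sample input; HASH stands in for a real store hash.
non_store_libs <<'EOF'
	linux-vdso.so.1 (0x00007ffc0000)
	libcairo.so.2 => /gnu/store/HASH-cairo/lib/libcairo.so.2 (0x00007f000000)
	libXt.so.6 => /usr/lib/libXt.so.6 (0x00007f100000)
	libm.so.6 => not found
EOF
```

(p [With a real package you would pipe ,(code "ldd") on the shared
object that ,(code "install.packages") built into
,(code "non_store_libs"); empty output means nothing from the host
system was picked up.])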
(p [It is ,(em "possible") to prevent the R process and all its
children from ever seeing system libraries, but this requires the use
of containers, which are not available on somewhat older kernels that
are commonly used in scientific computing environments. Guix provides
support for containers, so if you use a modern Linux kernel on your
GNU system you can avoid some confusion by using either
,(code "guix environment --container") or ,(code "guix container").
Check out ,(ref "" "the glorious manual").])
(p [Another problem is that the packages you build manually do not
come with the benefits that Guix provides. This means, for example,
that these packages won't be bit-reproducible. If you want
bit-reproducible software environments: use Guix and don't look
back.])
(h2 [Summary])
(ul [,(li [Don't mix Guix with system things to avoid ABI conflicts.])
,(li [If you use ,(code "install.packages"), let R from Guix use
the GCC toolchain and libraries from Guix.])
,(li [We do this by installing the toolchain and all libraries we
need into a separate Guix profile. R runs inside of that
environment.])])
(h2 [Learn more!])
(p [If you want to learn more about GNU Guix I recommend taking a
look at the excellent ,(ref ""
"GNU Guix project page"), which offers links to talks, papers,
and the manual. Feel free to contact me if you want to learn
more about packaging scientific software for Guix. It is not
difficult and we all can benefit from joining efforts in adopting
this usable, dependable, hackable, and liberating platform for
scientific computing with free software.])
(p [The Guix community is very friendly, supportive, responsive and
welcoming. I encourage you to visit the project’s
,(ref "" "IRC channel #guix on Freenode"), where I go by the handle
“rekado”.])
(p [Read ,(ref "/tags/guix.html" "more posts
about GNU Guix here").]))