-rw-r--r--  doc/ref/compiler.texi  106
1 file changed, 58 insertions, 48 deletions
diff --git a/doc/ref/compiler.texi b/doc/ref/compiler.texi
index 75fd4e5cb..6696360bd 100644
--- a/doc/ref/compiler.texi
+++ b/doc/ref/compiler.texi
@@ -939,7 +939,8 @@ respectively.
We describe programs in Guile's CPS language as being a kind of ``soup''
because all continuations in the program are mixed into the same
-``pot'', so to speak. A program in CPS is a map from continuation
+``pot'', so to speak, without explicit markers as to what function or
+scope a continuation is in. A program in CPS is a map from continuation
labels to continuation values. As discussed in the introduction, a
continuation label is an integer. No label may be negative.
@@ -951,16 +952,18 @@ refer to other functions, either via @code{$fun} and @code{$rec} in
higher-order CPS, or via @code{$closure} and @code{$callk} in
first-order CPS. The program logically contains all continuations of
all functions reachable from the entry function. A compiler pass may
-leave unreachable continuations in a program, but analyses should
-usually either apply only to reachable continuations, or should make
-translations that are oblivious as to whether a continuation is
-reachable or not.
+leave unreachable continuations in a program; subsequent compiler passes
+should ensure that their transformations and analyses only take
+reachable continuations into account. It is fine, though, for a
+transformation to run over all continuations, provided that including
+the unreachable continuations has no effect on the result for the live
+continuations.
@cindex intmap
-The map itself is implemented as an @dfn{intmap}, a functional
-array-mapped trie specialized for integer keys. Currently intmaps are a
-private data structure only used by the CPS phase of the compiler. To
-work with intmaps, load the @code{(language cps intmap)} module:
+The ``soup'' itself is implemented as an @dfn{intmap}, a functional
+array-mapped trie specialized for integer keys. Intmaps associate
+integers with values of any kind. Currently intmaps are a private data
+structure only used by the CPS phase of the compiler. To work with
+intmaps, load the @code{(language cps intmap)} module:
@example
(use-modules (language cps intmap))
@@ -992,14 +995,14 @@ as-is; in that way, one can detect whether the set operation produced a
new result simply by checking with @code{eq?}. This makes intmaps
useful when computing fixed points.
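For example, the following sketch (assuming the module exports
@code{empty-intmap}, @code{intmap-add}, and @code{intmap-ref}, used as
shown) checks that re-adding an association that is already present
returns the original intmap:
@example
(define m (intmap-add empty-intmap 0 'zero))
(intmap-ref m 0)                ; @result{} zero
(eq? m (intmap-add m 0 'zero))  ; @result{} #t; no new intmap is built
@end example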
-If a key is present in both intmaps and the key is not the same in the
-sense of @code{eq?}, the resulting value is determined by a ``meet''
-procedure, which is the optional last argument to @code{intmap-union},
-@code{intmap-intersect}, and also to @code{intmap-add},
-@code{intmap-replace}, and similar functions. The meet procedure will
-be called with the two values and should return the intersected or
-unioned value in some appropriate way. If no meet procedure is given,
-the default meet procedure will raise an error.
+If a key is present in both intmaps and the associated values are not
+the same in the sense of @code{eq?}, the resulting value is determined
+by a ``meet'' procedure, which is the optional last argument to
+@code{intmap-union}, @code{intmap-intersect}, and also to
+@code{intmap-add}, @code{intmap-replace}, and similar functions. The
+meet procedure will be called with the two values and should return the
+intersected or unioned value in some domain-specific way. If no meet
+procedure is given, the default meet procedure will raise an error.
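+As a sketch (reusing @code{empty-intmap} and @code{intmap-ref} from the
+example above), here we union two intmaps, resolving the one common key
+with @code{max} as the meet procedure:
+@example
+(define a (intmap-add (intmap-add empty-intmap 1 10) 2 20))
+(define b (intmap-add (intmap-add empty-intmap 2 30) 3 40))
+(intmap-ref (intmap-union a b max) 2)   ; @result{} 30
+@end example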
To traverse over the set of values in an intmap, there are the
@code{intmap-next} and @code{intmap-prev} procedures. For example, if
@@ -1033,15 +1036,16 @@ This operation ensures that the result of existing computations is not
affected by future computations: no mutation is ever visible to user
code. This is a great property in a compiler data structure, as it lets
us hold a copy of a program before a transformation and use it while we
-build a post-transformation program.
+build a post-transformation program. Updating an intmap is O(log
+@var{n}) in the size of the intmap.
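+A small sketch of this persistence in action, again with the procedures
+named above:
+@example
+(define before (intmap-add empty-intmap 1 'one))
+(define after (intmap-replace before 1 'uno))
+(intmap-ref before 1)   ; @result{} one; the original is untouched
+(intmap-ref after 1)    ; @result{} uno
+@end example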
-However, the allocation costs are sometimes too much, especially in
-cases when we know that we can just update the intmap in place. As an
-example, say we have an intmap mapping the integers 1 to 100 to the
-integers 42 to 141. Let's say that we want to transform this map by
-adding 1 to each value. There is already an efficient @code{intmap-map}
-procedure in the @code{(language cps utils}) module, but if we didn't
-know about that we might do:
+However, the O(log @var{n}) allocation costs are sometimes too much,
+especially in cases when we know that we can just update the intmap in
+place. As an example, say we have an intmap mapping the integers 1 to
+100 to the integers 42 to 141. Let's say that we want to transform this
+map by adding 1 to each value. There is already an efficient
+@code{intmap-map} procedure in the @code{(language cps utils)} module,
+but if we didn't know about that we might do:
@example
(define (intmap-increment map)
@@ -1062,21 +1066,21 @@ state with the last one, and we could update in place. Guile allows
this kind of interface via @dfn{transient intmaps}, inspired by
Clojure's transient interface (@uref{http://clojure.org/transients}).
-The @code{intmap-add!} and @code{intmap-replace!} procedures return
-transient intmaps. If one of these in-place procedures is called on a
-normal persistent intmap, a new transient intmap is created. This is an
-O(1) operation. In all other respects the interface is like their
+The in-place @code{intmap-add!} and @code{intmap-replace!} procedures
+return transient intmaps. If one of these in-place procedures is called
+on a normal persistent intmap, a new transient intmap is created. This
+is an O(1) operation. In all other respects these procedures behave like their
persistent counterparts, @code{intmap-add} and @code{intmap-replace}.
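A sketch of creating a transient, with the procedures used in the
earlier examples:
@example
(define p (intmap-add empty-intmap 1 'one))  ; a persistent intmap
(define t (intmap-add! p 2 'two))            ; O(1): returns a fresh transient
(intmap-ref t 2)                             ; @result{} two; p is unchanged
@end example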
-
If an in-place procedure is called on a transient intmap, the intmap is
-mutated in-place and the same value is returned. If a persistent
-operation like @code{intmap-add} is called on a transient intmap, the
-transient's mutable substructure is then marked as persistent, and
-@code{intmap-add} then runs on a new persistent intmap sharing structure
-but not state with the original transient. Mutating a transient will
-cause enough copying to ensure that it can make its change, but if part
-of its substructure is already ``owned'' by it, no more copying is
-needed.
+mutated in-place and the same value is returned.
+
+If a persistent operation like @code{intmap-add} is called on a
+transient intmap, the transient's mutable substructure is then marked as
+persistent, and @code{intmap-add} then runs on a new persistent intmap
+sharing structure but not state with the original transient. Mutating a
+transient will cause enough copying to ensure that it can make its
+change, but if part of its substructure is already ``owned'' by it, no
+more copying is needed.
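+A sketch of this freezing behavior, with hypothetical values:
+@example
+(define t (intmap-add! (intmap-add empty-intmap 1 'one) 2 'two))
+(intmap-add! t 3 'three)           ; mutates t in place and returns it
+(define p (intmap-add t 4 'four))  ; persistent op: freezes t's
+                                   ; substructure, returns a persistent map
+;; p maps keys 1 to 4; t still maps only 1 to 3 and may still be
+;; mutated, at the cost of some copying.
+@end example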
We can use transients to make @code{intmap-increment} more efficient.
The two changed elements have been marked @strong{like this}.
@@ -1098,6 +1102,10 @@ call @code{persistent-intmap} on the incoming map, to ensure that if it
were already transient, the mutations in the body of
@code{intmap-increment} wouldn't affect the incoming value.
+In summary, programs in CPS are intmaps whose values are continuations.
+See the source code of @code{(language cps utils)} for a number of
+useful facilities for working with CPS values.
+
@node Compiling CPS
@subsubsection Compiling CPS
@@ -1114,16 +1122,18 @@ variables (in Tree-IL, locals that are @code{<lexical-set>}) are
converted to being boxed values on the heap. @xref{Variables and the
VM}.
-After CPS conversion, Guile runs some optimization passes. The major
-optimization performed on CPS is contification, in which functions that
-are always called with the same continuation are incorporated directly
-into a function's body. This opens up space for more optimizations, and
-turns procedure calls into @code{goto}. It can also make loops out of
-recursive function nests.
-
-At the time of this writing (2014), most high-level optimization in
-Guile is done on Tree-IL. We would like to rewrite many of these passes
-to operate on CPS instead, as it is easier to reason about CPS.
+After CPS conversion, Guile runs some optimization passes over the CPS.
+Most optimization in Guile is done on the CPS language. The one major
+exception is partial evaluation, which for historical reasons is done on
+Tree-IL.
+
+The major optimization performed on CPS is contification, in which
+functions that are always called with the same continuation are
+incorporated directly into a function's body. This opens up space for
+more optimizations, and turns procedure calls into @code{goto}. It can
+also make loops out of recursive function nests. Guile also does dead
+code elimination, common subexpression elimination, loop peeling and
+invariant code motion, and range and type inference.
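+As a purely illustrative sketch, a local helper such as @code{lp} below
+is always called with the same continuation (the return point of
+@code{count-up}), so contification can turn its calls into jumps and the
+recursion into a loop:
+@example
+(define (count-up n)
+  (define (lp i acc)
+    (if (< i n)
+        (lp (1+ i) (cons i acc))
+        (reverse acc)))
+  (lp 0 '()))
+@end example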
The rest of the optimization passes are really cleanups and
canonicalizations. CPS spans the gap between high-level languages and