summaryrefslogtreecommitdiff
path: root/doc/guile-vm.texi
diff options
context:
space:
mode:
Diffstat (limited to 'doc/guile-vm.texi')
-rw-r--r--doc/guile-vm.texi1042
1 files changed, 1042 insertions, 0 deletions
diff --git a/doc/guile-vm.texi b/doc/guile-vm.texi
new file mode 100644
index 000000000..927c09e88
--- /dev/null
+++ b/doc/guile-vm.texi
@@ -0,0 +1,1042 @@
+\input texinfo @c -*-texinfo-*-
+@c %**start of header
+@setfilename guile-vm.info
+@settitle Guile VM Specification
+@footnotestyle end
+@setchapternewpage odd
+@c %**end of header
+
+@set EDITION 0.6
+@set VERSION 0.6
+@set UPDATED 2005-04-26
+
+@c Macro for instruction definitions.
+@macro insn{}
+Instruction
+@end macro
+
+@c For Scheme procedure definitions.
+@macro scmproc{}
+Scheme Procedure
+@end macro
+
+@c Scheme records.
+@macro scmrec{}
+Record
+@end macro
+
+@ifinfo
+@dircategory Scheme Programming
+@direntry
+* Guile VM: (guile-vm). Guile's Virtual Machine.
+@end direntry
+
+This file documents Guile VM.
+
+Copyright @copyright{} 2000 Keisuke Nishida
+Copyright @copyright{} 2005 Ludovic Court`es
+
+Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+@ignore
+Permission is granted to process this file through TeX and print the
+results, provided the printed document carries a copying permission
+notice identical to this one except for the removal of this paragraph
+(this paragraph not being relevant to the printed manual).
+
+@end ignore
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the
+entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation
+approved by the Free Software Foundation.
+@end ifinfo
+
+@titlepage
+@title Guile VM Specification
+@subtitle for Guile VM @value{VERSION}
+@author Keisuke Nishida
+
+@page
+@vskip 0pt plus 1filll
+Edition @value{EDITION} @*
+Updated for Guile VM @value{VERSION} @*
+@value{UPDATED} @*
+
+Copyright @copyright{} 2000 Keisuke Nishida
+Copyright @copyright{} 2005 Ludovic Court`es
+
+Permission is granted to make and distribute verbatim copies of this
+manual provided the copyright notice and this permission notice are
+preserved on all copies.
+
+Permission is granted to copy and distribute modified versions of this
+manual under the conditions for verbatim copying, provided that the
+entire resulting derived work is distributed under the terms of a
+permission notice identical to this one.
+
+Permission is granted to copy and distribute translations of this manual
+into another language, under the above conditions for modified versions,
+except that this permission notice may be stated in a translation
+approved by the Free Software Foundation.
+@end titlepage
+
+@contents
+
+@c *********************************************************************
+@node Top, Introduction, (dir), (dir)
+@top Guile VM Specification
+
+This document would like to correspond to Guile VM @value{VERSION}.
+However, be warned that important parts still correspond to version
+0.0 and are not valid anymore.
+
+@menu
+* Introduction::
+* Variable Management::
+* Instruction Set::
+* The Compiler::
+* Concept Index::
+* Function and Instruction Index::
+* Command and Variable Index::
+
+@detailmenu
+ --- The Detailed Node Listing ---
+
+Instruction Set
+
+* Environment Control Instructions::
+* Branch Instructions::
+* Subprogram Control Instructions::
+* Data Control Instructions::
+
+The Compiler
+
+* Overview::
+* The Language Front-Ends::
+* GHIL::
+* Compiling Scheme Code::
+* GLIL::
+* The Assembler::
+
+@end detailmenu
+@end menu
+
+@c *********************************************************************
+@node Introduction, Variable Management, Top, Top
+@chapter What is Guile VM?
+
+A Guile VM has a set of registers and its own stack memory. Guile may
+have more than one VM's. Each VM may execute at most one program at a
+time. Guile VM is a CISC system so designed as to execute Scheme and
+other languages efficiently.
+
+@unnumberedsubsec Registers
+
+@itemize
+@item pc - Program counter ;; ip (instruction poiner) is better?
+@item sp - Stack pointer
+@item bp - Base pointer
+@item ac - Accumulator
+@end itemize
+
+@unnumberedsubsec Engine
+
+A VM may have one of three engines: reckless, regular, or debugging.
+Reckless engine is fastest but dangerous. Regular engine is normally
+fail-safe and reasonably fast. Debugging engine is safest and
+functional but very slow.
+
+@unnumberedsubsec Memory
+
+Stack is the only memory that each VM owns. The other memory is shared
+memory that is shared among every VM and other part of Guile.
+
+@unnumberedsubsec Program
+
+A VM program consists of a bytecode that is executed and an environment
+in which execution is done. Each program is allocated in the shared
+memory and may be executed by any VM. A program may call other programs
+within a VM.
+
+@unnumberedsubsec Instruction
+
+Guile VM has dozens of system instructions and (possibly) hundreds of
+functional instructions. Some Scheme procedures such as cons and car
+are implemented as VM's builtin functions, which are very efficient.
+Other procedures defined outside of the VM are also considered as VM's
+functional features, since they do not change the state of VM.
+Procedures defined within the VM are called subprograms.
+
+Most instructions deal with the accumulator (ac). The VM stores all
+results from functions in ac, instead of pushing them into the stack.
+I'm not sure whether this is a good thing or not.
+
+@node Variable Management, Instruction Set, Introduction, Top
+@chapter Variable Management
+
+FIXME: This chapter needs to be reviewed so that it matches reality.
+A more up-to-date description of the mechanisms described in this
+section is given in @ref{Instruction Set}.
+
+A program may have access to local variables, external variables, and
+top-level variables.
+
+@section Local/external variables
+
+A stack is logically divided into several blocks during execution. A
+"block" is such a unit that maintains local variables and dynamic chain.
+A "frame" is an upper level unit that maintains subprogram calls.
+
+@example
+ Stack
+ dynamic | | | |
+ chain +==========+ - =
+ | |local vars| | |
+ `-|block data| | block |
+ /|frame data| | |
+ | +----------+ - |
+ | |local vars| | | frame
+ `-|block data| | |
+ /+----------+ - |
+ | |local vars| | |
+ `-|block data| | |
+ /+==========+ - =
+ | |local vars| | |
+ `-|block data| | |
+ /|frame data| | |
+ | +----------+ - |
+ | | | | |
+@end example
+
+The first block of each frame may look like this:
+
+@example
+ Address Data
+ ------- ----
+ xxx0028 Local variable 2
+ xxx0024 Local variable 1
+ bp ->xxx0020 Local variable 0
+ xxx001c Local link (block data)
+ xxx0018 External link (block data)
+ xxx0014 Stack pointer (block data)
+ xxx0010 Return address (frame data)
+ xxx000c Parent program (frame data)
+@end example
+
+The base pointer (bp) always points to the lowest address of local
+variables of the recent block. Local variables are referred as "bp[n]".
+The local link field has a pointer to the dynamic parent of the block.
+The parent's variables are referred as "bp[-1][n]", and grandparent's
+are "bp[-1][-1][n]". Thus, any local variable is represented by its
+depth and offset from the current bp.
+
+A variable may be "external", which is allocated in the shared memory.
+The external link field of a block has a pointer to such a variable set,
+which I call "fragment" (what should I call?). A fragment has a set of
+variables and its own chain.
+
+@example
+ local external
+ chain| | chain
+ | +-----+ .--------, |
+ `-|block|--+->|external|-'
+ /+-----+ | `--------'\,
+ `-|block|--' |
+ /+-----+ .--------, |
+ `-|block|---->|external|-'
+ +-----+ `--------'
+ | |
+@end example
+
+An external variable is referred as "bp[-2]->variables[n]" or
+"bp[-2]->link->...->variables[n]". This is also represented by a pair
+of depth and offset. At any point of execution, the value of bp
+determines the current local link and external link, and thus the
+current environment of a program.
+
+Other data fields are described later.
+
+@section Top-level variables
+
+Guile VM uses the same top-level variables as the regular Guile. A
+program may have direct access to vcells. Currently this is done by
+calling scm_intern0, but a program is possible to have any top-level
+environment defined by the current module.
+
+@section Scheme and VM variable
+
+Let's think about the following Scheme code as an example:
+
+@example
+ (define (foo a)
+ (lambda (b) (list foo a b)))
+@end example
+
+In the lambda expression, "foo" is a top-level variable, "a" is an
+external variable, and "b" is a local variable.
+
+When a VM executes foo, it allocates a block for "a". Since "a" may be
+externally referred from the closure, the VM creates a fragment with a
+copy of "a" in it. When the VM evaluates the lambda expression, it
+creates a subprogram (closure), associating the fragment with the
+subprogram as its external environment. When the closure is executed,
+its environment will look like this:
+
+@example
+ block Top-level: foo
+ +-------------+
+ |local var: b | fragment
+ +-------------+ .-----------,
+ |external link|---->|variable: a|
+ +-------------+ `-----------'
+@end example
+
+The fragment remains as long as the closure exists.
+
+@section Addressing mode
+
+Guile VM has five addressing modes:
+
+@itemize
+@item Real address
+@item Local position
+@item External position
+@item Top-level location
+@item Constant object
+@end itemize
+
+Real address points to the address in the real program and is only used
+with the program counter (pc).
+
+Local position and external position are represented as a pair of depth
+and offset from bp, as described above. These are base relative
+addresses, and the real address may vary during execution.
+
+Top-level location is represented as a Guile's vcell. This location is
+determined at loading time, so the use of this address is efficient.
+
+Constant object is not an address but gives an instruction an Scheme
+object directly.
+
+[ We'll also need dynamic scope addressing to support Emacs Lisp? ]
+
+
+Overall procedure:
+
+@enumerate
+@item A source program is compiled into a bytecode.
+@item A bytecode is given an environment and becomes a program.
+@item A VM starts execution, creating a frame for it.
+@item Whenever a program calls a subprogram, a new frame is created for it.
+@item When a program finishes execution, it returns a value, and the VM
+ continues execution of the parent program.
+@item When all programs terminated, the VM returns the final value and stops.
+@end enumerate
+
+
+@node Instruction Set, The Compiler, Variable Management, Top
+@chapter Instruction Set
+
+The Guile VM instruction set is roughly divided two groups: system
+instructions and functional instructions. System instructions control
+the execution of programs, while functional instructions provide many
+useful calculations.
+
+@menu
+* Environment Control Instructions::
+* Branch Instructions::
+* Subprogram Control Instructions::
+* Data Control Instructions::
+@end menu
+
+@node Environment Control Instructions, Branch Instructions, Instruction Set, Instruction Set
+@section Environment Control Instructions
+
+@deffn @insn{} link binding-name
+Look up @var{binding-name} (a string) in the current environment and
+push the corresponding variable object onto the stack. If
+@var{binding-name} is not bound yet, then create a new binding and
+push its variable object.
+@end deffn
+
+@deffn @insn{} variable-ref
+Dereference the variable object which is on top of the stack and
+replace it by the value of the variable it represents.
+@end deffn
+
+@deffn @insn{} variable-set
+Set the value of the variable on top of the stack (at @code{sp[0]}) to
+the object located immediately before (at @code{sp[-1]}).
+@end deffn
+
+As an example, let us look at what a simple function call looks like:
+
+@example
+(+ 2 3)
+@end example
+
+This call yields the following sequence of instructions:
+
+@example
+(link "+") ;; lookup binding "+"
+(variable-ref) ;; dereference it
+(make-int8 2) ;; push immediate value `2'
+(make-int8 3) ;; push immediate value `3'
+(tail-call 2) ;; call the proc at sp[-3] with two args
+@end example
+
+@deffn @insn{} local-ref offset
+Push onto the stack the value of the local variable located at
+@var{offset} within the current stack frame.
+@end deffn
+
+@deffn @insn{} local-set offset
+Pop the Scheme object located on top of the stack and make it the new
+value of the local variable located at @var{offset} within the current
+stack frame.
+@end deffn
+
+@deffn @insn{} external-ref offset
+Push the value of the closure variable located at position
+@var{offset} within the program's list of external variables.
+@end deffn
+
+@deffn @insn{} external-set offset
+Pop the Scheme object located on top of the stack and make it the new
+value of the closure variable located at @var{offset} within the
+program's list of external variables.
+@end deffn
+
+@deffn @insn{} make-closure
+Pop the program object from the stack and assign it the current
+closure variable list as its closure. Push the result program
+object.
+@end deffn
+
+Let's illustrate this:
+
+@example
+(let ((x 2))
+ (lambda ()
+ (let ((x++ (+ 1 x)))
+ (set! x x++)
+ x++)))
+@end example
+
+The resulting program has one external (closure) variable, i.e. its
+@var{nexts} is set to 1 (@pxref{Subprogram Control Instructions}).
+This yields the following code:
+
+@example
+ ;; the traditional program prologue with NLOCS = 0 and NEXTS = 1
+
+ 0 (make-int8 2)
+ 2 (external-set 0)
+ 4 (make-int8 4)
+ 6 (link "+") ;; lookup `+'
+ 9 (vector 1) ;; create the external variable vector for
+ ;; later use by `object-ref' and `object-set'
+ ...
+ 40 (load-program ##34#)
+ 59 (make-closure) ;; assign the current closure to the program
+ ;; just pushed by `load-program'
+ 60 (return)
+@end example
+
+The program loaded here by @var{load-program} contains the following
+sequence of instructions:
+
+@example
+ 0 (object-ref 0) ;; push the variable for `+'
+ 2 (variable-ref) ;; dereference `+'
+ 3 (make-int8:1) ;; push 1
+ 4 (external-ref 0) ;; push the value of `x'
+ 6 (call 2) ;; call `+' and push the result
+ 8 (local-set 0) ;; make it the new value of `x++'
+ 10 (local-ref 0) ;; push the value of `x++'
+ 12 (external-set 0) ;; make it the new value of `x'
+ 14 (local-ref 0) ;; push the value of `x++'
+ 16 (return) ;; return it
+@end example
+
+At this point, you should know pretty much everything about the three
+types of variables a program may need to access.
+
+
+@node Branch Instructions, Subprogram Control Instructions, Environment Control Instructions, Instruction Set
+@section Branch Instructions
+
+All the conditional branch instructions described below work in the
+same way:
+
+@itemize
+@item They take the Scheme object located on the stack and use it as
+the branch condition;
+@item If the condition if false, then program execution continues with
+the next instruction;
+@item If the condition is true, then the instruction pointer is
+increased by the offset passed as an argument to the branch
+instruction;
+@item Finally, when the instruction finished, the condition object is
+removed from the stack.
+@end itemize
+
+Note that the offset passed to the instruction is encoded on two 8-bit
+integers which are then combined by the VM as one 16-bit integer.
+
+@deffn @insn{} br offset
+Jump to @var{offset}.
+@end deffn
+
+@deffn @insn{} br-if offset
+Jump to @var{offset} if the condition on the stack is not false.
+@end deffn
+
+@deffn @insn{} br-if-not offset
+Jump to @var{offset} if the condition on the stack is false.
+@end deffn
+
+@deffn @insn{} br-if-eq offset
+Jump to @var{offset} if the two objects located on the stack are
+equal in the sense of @var{eq?}. Note that, for this instruction, the
+stack pointer is decremented by two Scheme objects instead of only
+one.
+@end deffn
+
+@deffn @insn{} br-if-not-eq offset
+Same as @var{br-if-eq} for non-equal objects.
+@end deffn
+
+@deffn @insn{} br-if-null offset
+Jump to @var{offset} if the object on the stack is @code{'()}.
+@end deffn
+
+@deffn @insn{} br-if-not-null offset
+Jump to @var{offset} if the object on the stack is not @code{'()}.
+@end deffn
+
+
+@node Subprogram Control Instructions, Data Control Instructions, Branch Instructions, Instruction Set
+@section Subprogram Control Instructions
+
+Programs (read: ``compiled procedure'') may refer to external
+bindings, like variables or functions defined outside the program
+itself, in the environment in which it will evaluate at run-time. In
+a sense, a program's environment and its bindings are an implicit
+parameter of every program.
+
+@cindex object table
+In order to handle such bindings, each program has an @dfn{object
+table} associated to it. This table (actually a Scheme vector)
+contains all constant objects referenced by the program. The object
+table of a program is initialized right before a program is loaded
+with @var{load-program}.
+
+Variable objects are one such type of constant object: when a global
+binding is defined, a variable object is associated to it and that
+object will remain constant over time, even if the value bound to it
+changes. Therefore, external bindings only need to be looked up once
+when the program is loaded. References to the corresponding external
+variables from within the program are then performed via the
+@var{object-ref} instruction and are almost as fast as local variable
+references.
+
+Let us consider the following program (procedure) which references
+external bindings @code{frob} and @var{%magic}:
+
+@example
+(lambda (x)
+ (frob x %magic))
+@end example
+
+This yields the following assembly code:
+
+@example
+(make-int8 64) ;; number of args, vars, etc. (see below)
+(link "frob")
+(link "%magic")
+(vector 2) ;; object table (external bindings)
+...
+(load-program #u8(20 0 23 21 0 20 1 23 36 2))
+(return)
+@end example
+
+All the instructions occurring before @var{load-program} (some were
+omitted for simplicity) form a @dfn{prologue} which, among other
+things, pushed an object table (a vector) that contains the variable
+objects for the variables bound to @var{frob} and @var{%magic}. This
+vector and other data pushed onto the stack are then popped by the
+@var{load-program} instruction.
+
+Besides, the @var{load-program} instruction takes one explicit
+argument which is the bytecode of the program itself. Disassembled,
+this bytecode looks like:
+
+@example
+(object-ref 0) ;; push the variable object of `frob'
+(variable-ref) ;; dereference it
+(local-ref 0) ;; push the value of `x'
+(object-ref 1) ;; push the variable object of `%magic'
+(variable-ref) ;; dereference it
+(tail-call 2) ;; call `frob' with two parameters
+@end example
+
+This clearly shows that there is little difference between references
+to local variables and references to externally bound variables since
+lookup of externally bound variables if performed only once before the
+program is run.
+
+@deffn @insn{} load-program bytecode
+Load the program whose bytecode is @var{bytecode} (a u8vector), pop
+its meta-information from the stack, and push a corresponding program
+object onto the stack. The program's meta-information may consist of
+(in the order in which it should be pushed onto the stack):
+
+@itemize
+@item optionally, a pair representing meta-data (see the
+@var{program-meta} procedure); [FIXME: explain their meaning]
+@item optionally, a vector which is the program's object table (a
+program that does not reference external bindings does not need an
+object table);
+@item either one immediate integer or four immediate integers
+representing respectively the number of arguments taken by the
+function (@var{nargs}), the number of @dfn{rest arguments}
+(@var{nrest}, 0 or 1), the number of local variables (@var{nlocs}) and
+the number of external variables (@var{nexts}) (@pxref{Environment
+Control Instructions}).
+@end itemize
+
+@end deffn
+
+@deffn @insn{} object-ref offset
+Push the variable object for the external variable located at
+@var{offset} within the program's object table.
+@end deffn
+
+@deffn @insn{} return
+Free the program's frame.
+@end deffn
+
+@deffn @insn{} call nargs
+Call the procedure, continuation or program located at
+@code{sp[-nargs]} with the @var{nargs} arguments located from
+@code{sp[0]} to @code{sp[-nargs + 1]}. The
+procedure/continuation/program and its arguments are dropped from the
+stack and the result is pushed. When calling a program, the
+@code{call} instruction reserves room for its local variables on the
+stack, and initializes its list of closure variables and its vector of
+externally bound variables.
+@end deffn
+
+@deffn @insn{} tail-call nargs
+Same as @code{call} except that, for tail-recursive calls to a
+program, the current stack frame is re-used, as required by RnRS.
+This instruction is otherwise similar to @code{call}.
+@end deffn
+
+
+@node Data Control Instructions, , Subprogram Control Instructions, Instruction Set
+@section Data Control Instructions
+
+@deffn @insn{} make-int8 value
+Push @var{value}, an 8-bit integer, onto the stack.
+@end deffn
+
+@deffn @insn{} make-int8:0
+Push the immediate value @code{0} onto the stack.
+@end deffn
+
+@deffn @insn{} make-int8:1
+Push the immediate value @code{1} onto the stack.
+@end deffn
+
+@deffn @insn{} make-false
+Push @code{#f} onto the stack.
+@end deffn
+
+@deffn @insn{} make-true
+Push @code{#t} onto the stack.
+@end deffn
+
+@itemize
+@item %push
+@item %pushi
+@item %pushl, %pushl:0:0, %pushl:0:1, %pushl:0:2, %pushl:0:3
+@item %pushe, %pushe:0:0, %pushe:0:1, %pushe:0:2, %pushe:0:3
+@item %pusht
+@end itemize
+
+@itemize
+@item %loadi
+@item %loadl, %loadl:0:0, %loadl:0:1, %loadl:0:2, %loadl:0:3
+@item %loade, %loade:0:0, %loade:0:1, %loade:0:2, %loade:0:3
+@item %loadt
+@end itemize
+
+@itemize
+@item %savei
+@item %savel, %savel:0:0, %savel:0:1, %savel:0:2, %savel:0:3
+@item %savee, %savee:0:0, %savee:0:1, %savee:0:2, %savee:0:3
+@item %savet
+@end itemize
+
+@section Flow control instructions
+
+@itemize
+@item %br-if
+@item %br-if-not
+@item %jump
+@end itemize
+
+@section Function call instructions
+
+@itemize
+@item %func, %func0, %func1, %func2
+@end itemize
+
+@section Scheme built-in functions
+
+@itemize
+@item cons
+@item car
+@item cdr
+@end itemize
+
+@section Mathematical buitin functions
+
+@itemize
+@item 1+
+@item 1-
+@item add, add2
+@item sub, sub2, minus
+@item mul2
+@item div2
+@item lt2
+@item gt2
+@item le2
+@item ge2
+@item num-eq2
+@end itemize
+
+
+
+@node The Compiler, Concept Index, Instruction Set, Top
+@chapter The Compiler
+
+This section describes Guile-VM's compiler and the compilation process
+to produce bytecode executable by the VM itself (@pxref{Instruction
+Set}).
+
+@menu
+* Overview::
+* The Language Front-Ends::
+* GHIL::
+* Compiling Scheme Code::
+* GLIL::
+* The Assembler::
+@end menu
+
+@node Overview, The Language Front-Ends, The Compiler, The Compiler
+@section Overview
+
+Compilation in Guile-VM is a three-stage process:
+
+@cindex intermediate language
+@cindex assembler
+@cindex compiler
+@cindex GHIL
+@cindex GLIL
+@cindex bytecode
+
+@enumerate
+@item the source programming language (e.g. R5RS Scheme) is read and
+translated into GHIL, @dfn{Guile's High-Level Intermediate Language};
+@item GHIL code is then translated into a lower-level intermediate
+language call GLIL, @dfn{Guile's Low-Level Intermediate Language};
+@item finally, GLIL is @dfn{assembled} into the VM's assembly language
+(@pxref{Instruction Set}) and bytecode.
+@end enumerate
+
+The use of two separate intermediate languages eases the
+implementation of front-ends since the gap between high-level
+languages like Scheme and GHIL is relatively small.
+
+@vindex guilec
+From an end-user viewpoint, compiling a Guile program into bytecode
+can be done either by using the @command{guilec} command-line tool, or
+by using the @code{compile-file} procedure exported by the
+@code{(system base compile)} module.
+
+@deffn @scmproc{} compile-file file . opts
+Compile Scheme source code from file @var{file} using compilation
+options @var{opts}. The resulting file, a Guile object file, will be
+name according the application of the @code{compiled-file-name}
+procedure to @var{file}. The possible values for @var{opts} are the
+same as for the @code{compile-in} procedure (see below, @pxref{The Language
+Front-Ends}).
+@end deffn
+
+@deffn @scmproc{} compiled-file-name file
+Given source file name @var{file} (a string), return a string that
+denotes the name of the Guile object file corresponding to
+@var{file}. By default, the file name returned is @var{file} minus
+its extension and plus the @code{.go} file extension.
+@end deffn
+
+@cindex self-hosting
+It is worth noting, as you might have already guessed, that Guile-VM's
+compiler is written in Guile Scheme and is @dfn{self-hosted}: it can
+compile itself.
+
+@node The Language Front-Ends, GHIL, Overview, The Compiler
+@section The Language Front-Ends
+
+Guile-VM comes with a number of @dfn{language front-ends}, that is,
+code that can read a given high-level programming language like R5RS
+Scheme, and translate it into a lower-level representation suitable to
+the compiler.
+
+Each language front-end provides a @dfn{specification} and a
+@dfn{translator} to GHIL. Both of them come in the @code{language}
+module hierarchy. As an example, the front-end for Scheme is located
+in the @code{(language scheme spec)} and @code{(language scheme
+translate)} modules. Language front-ends can then be retrieved using
+the @code{lookup-language} procedure of the @code{(system base
+language)} module.
+
+@deftp @scmrec{} <language> name title version reader printer read-file expander translator evaluator environment
+Denotes a language front-end specification a various methods used by
+the compiler to handle source written in that language. Of particular
+interest is the @code{translator} slot (@pxref{GHIL}).
+@end deftp
+
+@deffn @scmproc{} lookup-language lang
+Look for a language front-end named @var{lang}, a symbol (e.g,
+@code{scheme}), and return the @code{<language>} record describing it
+if found. If @var{lang} does not denote a language front-end, an
+error is raised. Note that this procedure assumes that language
+@var{lang} exists if there exist a @code{(language @var{lang} spec)}
+module.
+@end deffn
+
+The @code{(system base compile)} module defines a procedure similar to
+@code{compile-file} but that is not limited to the Scheme language:
+
+@deffn @scmproc{} compile-in expr env lang . opts
+Compile expression @var{expr}, which is written in language @var{lang}
+(a @code{<language>} object), using compilation options @var{opts},
+and return bytecode as produced by the assembler (@pxref{The
+Assembler}).
+
+Options @var{opts} may contain the following keywords:
+
+@table @code
+@item :e
+compilation will stop after the code expansion phase.
+@item :t
+compilation will stop after the code translation phase, i.e. after
+code in the source language @var{lang} has been translated into GHIL
+(@pxref{GHIL}).
+@item :c
+compilation will stop after the compilation phase and before the
+assembly phase, i.e. once GHIL has been translated into GLIL
+(@pxref{GLIL}).
+@end table
+
+Additionally, @var{opts} may contain any option understood by the
+GHIL-to-GLIL compiler described in @xref{GLIL}.
+@end deffn
+
+
+@node GHIL, Compiling Scheme Code, The Language Front-Ends, The Compiler
+@section Guile's High-Level Intermediate Language
+
+GHIL has constructs almost equivalent to those found in Scheme.
+However, unlike Scheme, it is meant to be read only by the compiler
+itself. Therefore, a sequence of GHIL code is only a sequence of GHIL
+@emph{objects} (records), as opposed to symbols, each of which
+represents a particular language feature. These records are all
+defined in the @code{(system il ghil)} module and are named
+@code{<ghil-*>}.
+
+Each GHIL record has at least two fields: one containing the
+environment (Guile module) in which it is considered, and one
+containing its location [FIXME: currently seems to be unused]. Below
+is a list of the main GHIL object types and their fields:
+
+@example
+;; Objects
+(<ghil-void> env loc)
+(<ghil-quote> env loc obj)
+(<ghil-quasiquote> env loc exp)
+(<ghil-unquote> env loc exp)
+(<ghil-unquote-splicing> env loc exp)
+;; Variables
+(<ghil-ref> env loc var)
+(<ghil-set> env loc var val)
+(<ghil-define> env loc var val)
+;; Controls
+(<ghil-if> env loc test then else)
+(<ghil-and> env loc exps)
+(<ghil-or> env loc exps)
+(<ghil-begin> env loc exps)
+(<ghil-bind> env loc vars vals body)
+(<ghil-lambda> env loc vars rest body)
+(<ghil-call> env loc proc args)
+(<ghil-inline> env loc inline args)
+@end example
+
+As can be seen from this examples, the constructs in GHIL are pretty
+close to the fundamental primitives of Scheme.
+
+It is the role of front-end language translators (@pxref{The Language
+Front-Ends}) to produce a sequence of GHIL objects from the
+human-readable, source programming language. The next section
+describes the translator for the Scheme language.
+
+@node Compiling Scheme Code, GLIL, GHIL, The Compiler
+@section Compiling Scheme Code
+
+The language object for Scheme, as returned by @code{(lookup-language
+'scheme)} (@pxref{The Language Front-Ends}), defines a translator
+procedure that returns a sequence of GHIL objects given Scheme code.
+Before actually performing this operation, the Scheme translator
+expands macros in the original source code.
+
+The macros that may be expanded can come from different sources:
+
+@itemize
+@item core Guile macros, such as @code{false-if-exception};
+@item macros defined in modules used by the module being compiled,
+e.g., @code{receive} in @code{(ice-9 receive)};
+@item macros defined within the module being compiled.
+@end itemize
+
+@cindex macro
+@cindex syntax transformer
+@findex define-macro
+@findex defmacro
+The main complexity in handling macros at compilation time is that
+Guile's macros are first-class objects. For instance, when using
+@code{define-macro}, one actually defines a @emph{procedure} that
+returns code; of course, unlike a ``regular'' procedure, it is
+executed when an S-exp is @dfn{memoized} by the evaluator, i.e.,
+before the actual evaluation takes place. Worse, it is possible to
+turn a procedure into a macro, or @dfn{syntax transformer}, thus
+removing, to some extent, the boundary between the macro expansion and
+evaluation phases, @inforef{Internal Macros, , guile}.
+
+[FIXME: explain limitations, etc.]
+
+
+@node GLIL, The Assembler, Compiling Scheme Code, The Compiler
+@section Guile's Low-Level Intermediate Language
+
+A GHIL instruction sequence can be compiled into GLIL using the
+@code{compile} procedure exported by the @code{(system il compile)}
+module. During this translation process, various optimizations may
+also be performed.
+
+The module @code{(system il glil)} defines record types representing
+various low-level abstractions. Compared to GHIL, the flow control
+primitives in GLIL are much more low-level: only @code{<glil-label>},
+@code{<glil-branch>} and @code{<glil-call>} are available, no
+@code{lambda}, @code{if}, etc.
+
+
+@deffn @scmproc{} compile ghil environment . opts
+Compile @var{ghil}, a GHIL instruction sequence, within
+environment/module @var{environment}, and return the resulting GLIL
+instruction sequence. The option list @var{opts} may be either the
+empty list or a list containing the @code{:O} keyword in which case
+@code{compile} will first go through an optimization stage of
+@var{ghil}.
+
+Note that the @code{:O} option may be passed at a higher-level to the
+@code{compile-file} and @code{compile-in} procedures (@pxref{The
+Language Front-Ends}).
+@end deffn
+
+@deffn @scmproc{} pprint-glil glil . port
+Print @var{glil}, a GLIL sequence instructions, in a human-readable
+form. If @var{port} is passed, it will be used as the output port.
+@end deffn
+
+
+Let's consider the following Scheme expression:
+
+@example
+(lambda (x) (+ x 1))
+@end example
+
+The corresponding (unoptimized) GLIL code, as shown by
+@code{pprint-glil}, looks like this:
+
+@example
+(@@asm (0 0 0 0)
+ (@@asm (1 0 0 0) ;; expect one arg.
+ (@@bind (x argument 0)) ;; debugging info
+ (module-ref #f +) ;; lookup `+'
+ (argument-ref 0) ;; push the argument onto
+ ;; the stack
+ (const 1) ;; push `1'
+ (tail-call 2) ;; call `+', with 2 args,
+ ;; using the same stack frame
+ (@@source 15 33)) ;; additional debugging info
+ (return 0))
+@end example
+
+This is not unlike the VM's assembly language described in
+@ref{Instruction Set}.
+
+@node The Assembler, , GLIL, The Compiler
+@section The Assembler
+
+@findex code->bytes
+
+The final compilation step consists in converting the GLIL instruction
+sequence into VM bytecode. This is what the @code{assemble} procedure
+defined in the @code{(system vm assemble)} module is for. It relies
+on the @code{code->bytes} procedure of the @code{(system vm conv)}
+module to convert instructions (represented as lists whose @code{car}
+is a symbol naming the instruction, e.g. @code{object-ref},
+@pxref{Instruction Set}) into binary code, or @dfn{bytecode}.
+Bytecode itself is represented using SRFI-4 byte vectors,
+@inforef{SRFI-4, SRFI-4 homogeneous numeric vectors, guile}.
+
+
+@deffn @scmproc{} assemble glil environment . opts
+Return a binary representation of @var{glil} (bytecode), either in the
+form of an SRFI-4 @code{u8vector} or a @code{<bytespec>} object.
+[FIXME: Why is that?]
+@end deffn
+
+
+
+@c *********************************************************************
+@node Concept Index, Function and Instruction Index, The Compiler, Top
+@unnumbered Concept Index
+@printindex cp
+
+@node Function and Instruction Index, Command and Variable Index, Concept Index, Top
+@unnumbered Function and Instruction Index
+@printindex fn
+
+@node Command and Variable Index, , Function and Instruction Index, Top
+@unnumbered Command and Variable Index
+@printindex vr
+
+@bye
+
+@c Local Variables:
+@c ispell-local-dictionary: "american";
+@c End:
+
+@c LocalWords: bytecode