summaryrefslogtreecommitdiff
path: root/doc/ref/statprof.texi
diff options
context:
space:
mode:
authorAndy Wingo <wingo@pobox.com>2016-02-01 14:58:34 +0100
committerAndy Wingo <wingo@pobox.com>2016-02-01 15:12:36 +0100
commit8998f1539f9b998a9ec5f867d3933cdd8f06fc41 (patch)
tree1b238bba8b10adf073d9549aeaf48a90ffc45a5f /doc/ref/statprof.texi
parent4066ee31920d52ec0549ce882f883b92992f894b (diff)
Update statprof documentation; deprecate `with-statprof'
* module/statprof.scm: Remove most of the commentary, as it was duplicated in the manual and was getting out of date. (stats): Remove self-secs-per-call and cum-secs-per-call fields as they can be computed from the other fields. (statprof-call-data->stats): Adapt. (statprof-stats-self-secs-per-call): (statprof-stats-cum-secs-per-call): New functions. (statprof-display/flat): Don't print the seconds-per-call fields, as we are no longer stopping the clock around call counters. Anyway these times were quite misleading. (with-statprof): Deprecate. It took its keyword arguments at the beginning; very complicated! Better to use the `statprof' function. (`statprof' was introduced after `with-statprof' and then `with-statprof' was adapted to use it.) * doc/ref/statprof.texi (Statprof): Port this documentation away from the automatically generated text and update it for the new interfaces like #:display-style. * module/system/base/syntax.scm (record-case): Remove comment that referenced with-statprof. Add comment indicating that record-case should be replaced. * doc/ref/scheme-using.texi (Profile Commands): Update to mention keyword arguments and to link to the statprof documentation.
Diffstat (limited to 'doc/ref/statprof.texi')
-rw-r--r--doc/ref/statprof.texi414
1 files changed, 190 insertions, 224 deletions
diff --git a/doc/ref/statprof.texi b/doc/ref/statprof.texi
index 5b99fb6b8..6282811d1 100644
--- a/doc/ref/statprof.texi
+++ b/doc/ref/statprof.texi
@@ -6,279 +6,245 @@
@node Statprof
@section Statprof
-@code{(statprof)} is a fairly simple statistical profiler for Guile.
+Statprof is a statistical profiler for Guile.
A simple use of statprof would look like this:
-@example
-(statprof-reset 0 50000 #t)
-(statprof-start)
-(do-something)
-(statprof-stop)
-(statprof-display)
+@example
+(use-modules (statprof))
+(statprof (lambda ()
+ (map 1+ (iota 1000000))
+ #f))
@end example
-This would reset statprof, clearing all accumulated statistics, then
-start profiling, run some code, stop profiling, and finally display a
-gprof flat-style table of statistics which will look something like
-this:
-
-@example
- % cumulative self self total
- time seconds seconds calls ms/call ms/call name
- 35.29 0.23 0.23 2002 0.11 0.11 -
- 23.53 0.15 0.15 2001 0.08 0.08 positive?
- 23.53 0.15 0.15 2000 0.08 0.08 +
- 11.76 0.23 0.08 2000 0.04 0.11 do-nothing
- 5.88 0.64 0.04 2001 0.02 0.32 loop
- 0.00 0.15 0.00 1 0.00 150.59 do-something
- ...
+This would run the thunk with statistical profiling, finally displaying
+a flat table of statistics which could look something like this:
+
+@example
+% cumulative self
+time seconds seconds procedure
+ 57.14 39769.73 0.07 ice-9/boot-9.scm:249:5:map1
+ 28.57 0.04 0.04 ice-9/boot-9.scm:1165:0:iota
+ 14.29 0.02 0.02 1+
+ 0.00 0.12 0.00 <current input>:2:10
+---
+Sample count: 7
+Total time: 0.123490713 seconds (0.201983993 seconds in GC)
@end example
All of the numerical data with the exception of the calls column is
statistically approximate. In the following column descriptions, and in
-all of statprof, "time" refers to execution time (both user and system),
-not wall clock time.
-
-@table @asis
-@item % time
-The percent of the time spent inside the procedure itself (not counting
-children).
-
-@item cumulative seconds
-The total number of seconds spent in the procedure, including children.
-
-@item self seconds
-The total number of seconds spent in the procedure itself (not counting
-children).
-
-@item calls
-The total number of times the procedure was called.
-
-@item self ms/call
-The average time taken by the procedure itself on each call, in ms.
-
-@item total ms/call
-The average time taken by each call to the procedure, including time
-spent in child functions.
-
-@item name
-The name of the procedure.
-
-@end table
+all of statprof, ``time'' refers to execution time (both user and
+system), not wall clock time.
+
+The @code{% time} column indicates the percentage of the run-time time
+spent inside the procedure itself (not counting children). It is
+calculated as @code{self seconds}, measuring the amount of time spent in
+the procedure, divided by the total run-time.
+
+@code{cumulative seconds} also counts time spent in children of a
+function. For recursive functions, this can exceed the total time, as
+in our example above, because each activation on the stack adds to the
+cumulative time.
+
+Finally, the GC time measures the time spent in the garbage collector.
+On systems with multiple cores, this time can be larger than the run
+time, because it counts time spent in all threads, and will run the
+``marking'' phase of GC in parallel. If GC time is a significant
+fraction of the run time, that means that most time in your program is
+spent allocating objects and cleaning up after those allocations. To
+speed up your program, one good place to start would be to look at how
+to reduce the allocation rate.
+
+Statprof's main mode of operation is as a statistical profiler. However
+statprof can also run in a ``precise'' mode as well. Pass the
+@code{#:count-calls? #t} keyword argument to @code{statprof} to record
+all calls:
+
+@example
+(use-modules (statprof))
+(statprof (lambda ()
+ (map 1+ (iota 1000000))
+ #f)
+ #:count-calls? #t)
+@end example
-The profiler uses @code{eq?} and the procedure object itself to identify
-the procedures, so it won't confuse different procedures with the same
-name. They will show up as two different rows in the output.
+The result has an additional @code{calls} column:
+
+@example
+% cumulative self
+time seconds seconds calls procedure
+ 82.26 0.73 0.73 1000000 1+
+ 11.29 420925.80 0.10 1000001 ice-9/boot-9.scm:249:5:map1
+ 4.84 0.06 0.04 1 ice-9/boot-9.scm:1165:0:iota
+[...]
+---
+Sample count: 62
+Total time: 0.893098065 seconds (1.222796536 seconds in GC)
+@end example
-Right now the profiler is quite simplistic. I cannot provide call-graphs
-or other higher level information. What you see in the table is pretty
-much all there is. Patches are welcome :-)
+As you can see, the profile is perturbed: @code{1+} ends up on top,
+whereas it was not marked as hot in the earlier profile. This is
+because the overhead of call-counting unfairly penalizes calls. Still,
+this precise mode can be useful at times to do algorithmic optimizations
+based on the precise call counts.
@section Implementation notes
+
The profiler works by setting the unix profiling signal
@code{ITIMER_PROF} to go off after the interval you define in the call
-to @code{statprof-reset}. When the signal fires, a sampling routine is
-run which looks at the current procedure that's executing, and then
-crawls up the stack, and for each procedure encountered, increments that
-procedure's sample count. Note that if a procedure is encountered
-multiple times on a given stack, it is only counted once. After the
-sampling is complete, the profiler resets profiling timer to fire again
-after the appropriate interval.
-
-Meanwhile, the profiler keeps track, via @code{get-internal-run-time},
-how much CPU time (system and user -- which is also what
-@code{ITIMER_PROF} tracks), has elapsed while code has been executing
-within a statprof-start/stop block.
-
-The profiler also tries to avoid counting or timing its own code as much
-as possible.
+to @code{statprof-reset}. When the signal fires, a sampling routine
+runs which crawls up the stack, recording all instruction pointers into
+a buffer. After the sample is complete, the profiler resets profiling
+timer to fire again after the appropriate interval.
+
+Later, when profiling stops, that log buffer is analyzed to produce the
+``self seconds'' and ``cumulative seconds'' statistics. A procedure at
+the top of the stack counts toward ``self'' samples, and everything on
+the stack counts towards ``cumulative'' samples.
+
+While the profiler is running it measures how much CPU time (system and
+user -- which is also what @code{ITIMER_PROF} tracks) has elapsed while
+code has been executing within the profiler. Only run time counts
+towards the profile, not wall-clock time. For example, sleeping and
+waiting for input or output do not cause the timer clock to advance.
@section Usage
-@anchor{statprof statprof-active?}@defun statprof-active?
-Returns @code{#t} if @code{statprof-start} has been called more times
-than @code{statprof-stop}, @code{#f} otherwise.
-@end defun
-
-@anchor{statprof statprof-start}@defun statprof-start
-Start the profiler.@code{}
-
-@end defun
+@deffn {Scheme Procedure} statprof thunk @
+ [#:loop loop=1] [#:hz hz=100] @
+ [#:port port=(current-output-port)] @
+ [#:count-calls? count-calls?=#f] @
+ [#:display-style display-style='flat]
+Profile the execution of @var{thunk}, and return its return values.
-@anchor{statprof statprof-stop}@defun statprof-stop
-Stop the profiler.@code{}
+The stack will be sampled @var{hz} times per second, and the thunk
+itself will be called @var{loop} times.
-@end defun
+If @var{count-calls?} is true, all procedure calls will be recorded.
+This operation is somewhat expensive.
-@anchor{statprof statprof-reset}@defun statprof-reset sample-seconds sample-microseconds count-calls? [full-stacks?]
-Reset the statprof sampler interval to @var{sample-seconds} and
-@var{sample-microseconds}. If @var{count-calls?} is true, arrange to
-instrument procedure calls as well as collecting statistical profiling
-data. If @var{full-stacks?} is true, collect all sampled stacks into a
-list for later analysis.
+After the @var{thunk} has been profiled, print out a profile to
+@var{port}. If @var{display-style} is @code{flat}, the results will be
+printed as a flat profile. Otherwise if @var{display-style} is
+@code{tree}, print the results as a tree profile.
+@end deffn
-Enables traps and debugging as necessary.
+Profiling can also be enabled and disabled manually.
-@end defun
+@deffn {Scheme Procedure} statprof-active?
+Returns @code{#t} if @code{statprof-start} has been called more times
+than @code{statprof-stop}, @code{#f} otherwise.
+@end deffn
-@anchor{statprof statprof-accumulated-time}@defun statprof-accumulated-time
-Returns the time accumulated during the last statprof run.@code{}
+@deffn {Scheme Procedure} statprof-start
+@deffnx {Scheme Procedure} statprof-stop
+Start or stop the profiler.
+@end deffn
-@end defun
+@deffn {Scheme Procedure} statprof-reset sample-seconds sample-microseconds count-calls?
+Reset the profiling sample interval to @var{sample-seconds} and
+@var{sample-microseconds}. If @var{count-calls?} is true, arrange to
+instrument procedure calls as well as collecting statistical profiling
+data.
+@end deffn
-@anchor{statprof statprof-sample-count}@defun statprof-sample-count
-Returns the number of samples taken during the last statprof run.@code{}
+If you use the manual @code{statprof-start}/@code{statprof-stop}
+interface, an implicit statprof state will persist starting from the
+last call to @code{statprof-reset}, or the first call to
+@code{statprof-start}. There are a number of accessors to fetch
+statistics from this implicit state.
-@end defun
+@deffn {Scheme Procedure} statprof-accumulated-time
+Returns the time accumulated during the last statprof run.
+@end deffn
-@anchor{statprof statprof-fold-call-data}@defun statprof-fold-call-data proc init
-Fold @var{proc} over the call-data accumulated by statprof. Cannot be
-called while statprof is active. @var{proc} should take two arguments,
-@code{(@var{call-data} @var{prior-result})}.
+@deffn {Scheme Procedure} statprof-sample-count
+Returns the number of samples taken during the last statprof run.
+@end deffn
-Note that a given proc-name may appear multiple times, but if it does,
-it represents different functions with the same name.
+@deffn {Scheme Procedure} statprof-fold-call-data proc init
+Fold @var{proc} over the call-data accumulated by statprof. This
+procedure cannot be called while statprof is active.
-@end defun
+@var{proc} will be called with arguments, @var{call-data} and
+@var{prior-result}.
+@end deffn
-@anchor{statprof statprof-proc-call-data}@defun statprof-proc-call-data proc
+@deffn {Scheme Procedure} statprof-proc-call-data proc
Returns the call-data associated with @var{proc}, or @code{#f} if none
is available.
+@end deffn
-@end defun
-
-@anchor{statprof statprof-call-data-name}@defun statprof-call-data-name cd
-@end defun
-
-@anchor{statprof statprof-call-data-calls}@defun statprof-call-data-calls cd
-@end defun
+@deffn {Scheme Procedure} statprof-call-data-name cd
+@deffnx {Scheme Procedure} statprof-call-data-calls cd
+@deffnx {Scheme Procedure} statprof-call-data-cum-samples cd
+@deffnx {Scheme Procedure} statprof-call-data-self-samples cd
+Accessors for the fields in a statprof call-data object.
+@end deffn
-@anchor{statprof statprof-call-data-cum-samples}@defun statprof-call-data-cum-samples cd
-@end defun
-
-@anchor{statprof statprof-call-data-self-samples}@defun statprof-call-data-self-samples cd
-@end defun
-
-@anchor{statprof statprof-call-data->stats}@defun statprof-call-data->stats call-data
+@deffn {Scheme Procedure} statprof-call-data->stats call-data
Returns an object of type @code{statprof-stats}.
+@end deffn
+
+@deffn {Scheme Procedure} statprof-stats-proc-name stats
+@deffnx {Scheme Procedure} statprof-stats-%-time-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-cum-secs-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-self-secs-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-calls stats
+@deffnx {Scheme Procedure} statprof-stats-self-secs-per-call stats
+@deffnx {Scheme Procedure} statprof-stats-cum-secs-per-call stats
+Accessors for the fields in a @code{statprof-stats} object.
+@end deffn
+
+@deffn {Scheme Procedure} statprof-display @
+ [port=(current-output-port)] [#:style style=flat]
+Displays a summary of the statistics collected. Possible values for
+@var{style} include:
-@end defun
-
-@anchor{statprof statprof-stats-proc-name}@defun statprof-stats-proc-name stats
-@end defun
-
-@anchor{statprof statprof-stats-%-time-in-proc}@defun statprof-stats-%-time-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-cum-secs-in-proc}@defun statprof-stats-cum-secs-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-self-secs-in-proc}@defun statprof-stats-self-secs-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-calls}@defun statprof-stats-calls stats
-@end defun
-
-@anchor{statprof statprof-stats-self-secs-per-call}@defun statprof-stats-self-secs-per-call stats
-@end defun
-
-@anchor{statprof statprof-stats-cum-secs-per-call}@defun statprof-stats-cum-secs-per-call stats
-@end defun
-
-@anchor{statprof statprof-display}@defun statprof-display . _
-Displays a gprof-like summary of the statistics collected. Unless an
-optional @var{port} argument is passed, uses the current output port.
-
-@end defun
-
-@anchor{statprof statprof-display-anomolies}@defun statprof-display-anomolies
-A sanity check that attempts to detect anomolies in statprof's
-statistics.@code{}
-
-@end defun
+@table @code
+@item flat
+Display a traditional gprof-style flat profile.
+@item anomalies
+Find statistical anomalies in the data.
+@item tree
+Display a tree profile.
+@end table
+@end deffn
-@anchor{statprof statprof-fetch-stacks}@defun statprof-fetch-stacks
+@deffn {Scheme Procedure} statprof-fetch-stacks
Returns a list of stacks, as they were captured since the last call to
@code{statprof-reset}.
+@end deffn
-Note that stacks are only collected if the @var{full-stacks?} argument
-to @code{statprof-reset} is true.
-
-@end defun
-
-@anchor{statprof statprof-fetch-call-tree}@defun statprof-fetch-call-tree
+@deffn {Scheme Procedure} statprof-fetch-call-tree [#:precise precise?=#f]
@verbatim
Return a call tree for the previous statprof run.
-The return value is a list of nodes, each of which is of the type:
-@@code
- node ::= (@@var@{proc@} @@var@{count@} . @@var@{nodes@})
-@@end code
+The return value is a list of nodes. A node is a list of the form:
+@code
+ node ::= (@var{proc} @var{count} . @var{nodes})
+@end code
+
+The @var{proc} is a printable representation of a procedure, as a
+string. If @var{precise?} is false, which is the default, then a node
+corresponds to a procedure invocation. If it is true, then a node
+corresponds to a return point in a procedure. Passing @code{#:precise?
+#t} allows a user to distinguish different source lines in a procedure,
+but usually it is too much detail, so it is off by default.
@end verbatim
-@end defun
-
-@anchor{statprof statprof}@defun statprof thunk [#:loop] [#:hz] [#:count-calls?] [#:full-stacks?]
-Profile the execution of @var{thunk}, and return its return values.
-
-The stack will be sampled @var{hz} times per second, and the thunk
-itself will be called @var{loop} times.
-
-If @var{count-calls?} is true, all procedure calls will be recorded.
-This operation is somewhat expensive.
-
-If @var{full-stacks?} is true, at each sample, statprof will store away
-the whole call tree, for later analysis. Use
-@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
-retrieve the last-stored stacks.
+@end deffn
-@end defun
+@deffn {Scheme Procedure} gcprof thunk [#:loop]
+Like the @code{statprof} procedure, but instead of profiling CPU time,
+we profile garbage collection.
-@anchor{statprof with-statprof}@defspec with-statprof args
-Profile the expressions in the body, and return the body's return
-value.
-
-Keyword arguments:
-
-@table @code
-@item #:loop
-Execute the body @var{loop} number of times, or @code{#f} for no looping
-
-default: @code{#f}
-
-@item #:hz
-Sampling rate
-
-default: @code{20}
-
-@item #:count-calls?
-Whether to instrument each function call (expensive)
-
-default: @code{#f}
-
-@item #:full-stacks?
-Whether to collect away all sampled stacks into a list
-
-default: @code{#f}
-
-@end table
-
-@end defspec
-
-@anchor{statprof gcprof}@defun gcprof thunk [#:loop] [#:full-stacks?]
-Do an allocation profile of the execution of @var{thunk}.
-
-The stack will be sampled soon after every garbage collection, yielding
-an approximate idea of what is causing allocation in your program.
+The stack will be sampled soon after every garbage collection during the
+evaluation of @var{thunk}, yielding an approximate idea of what is
+causing allocation in your program.
Since GC does not occur very frequently, you may need to use the
@var{loop} parameter, to cause @var{thunk} to be called @var{loop}
times.
-
-If @var{full-stacks?} is true, at each sample, statprof will store away
-the whole call tree, for later analysis. Use
-@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
-retrieve the last-stored stacks.
-
-@end defun
+@end deffn