author    | Andy Wingo <wingo@pobox.com> | 2016-02-01 14:58:34 +0100
committer | Andy Wingo <wingo@pobox.com> | 2016-02-01 15:12:36 +0100
commit    | 8998f1539f9b998a9ec5f867d3933cdd8f06fc41
tree      | 1b238bba8b10adf073d9549aeaf48a90ffc45a5f /doc/ref/statprof.texi
parent    | 4066ee31920d52ec0549ce882f883b92992f894b
Update statprof documentation; deprecate `with-statprof'
* module/statprof.scm: Remove most of the commentary, as it was
duplicated in the manual and was getting out of date.
(stats): Remove self-secs-per-call and cum-secs-per-call fields as
they can be computed from the other fields.
(statprof-call-data->stats): Adapt.
(statprof-stats-self-secs-per-call):
(statprof-stats-cum-secs-per-call): New functions.
(statprof-display/flat): Don't print the seconds-per-call fields, as
we are no longer stopping the clock around call counters. Anyway
these times were quite misleading.
(with-statprof): Deprecate. It took its keyword arguments at the
beginning; very complicated! Better to use the `statprof' function.
(`statprof' was introduced after `with-statprof' and then
`with-statprof' was adapted to use it.)
* doc/ref/statprof.texi (Statprof): Port this documentation away from
the automatically generated text and update it for the new interfaces
like #:display-style.
* module/system/base/syntax.scm (record-case): Remove comment that
referenced with-statprof. Add comment indicating that record-case
should be replaced.
* doc/ref/scheme-using.texi (Profile Commands): Update to mention
keyword arguments and to link to the statprof documentation.
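
For reference, the `with-statprof' deprecation described above amounts to roughly the following change in user code. This is only an illustrative sketch; `expensive-computation' is a placeholder for whatever body is being profiled:

    ;; Deprecated: with-statprof takes its keyword arguments up front,
    ;; followed by the body expressions to profile.
    (use-modules (statprof))
    (with-statprof #:hz 100 #:count-calls? #f
      (expensive-computation))

    ;; Preferred: pass a thunk to the statprof procedure instead.
    (statprof (lambda () (expensive-computation))
              #:hz 100 #:count-calls? #f)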
Diffstat (limited to 'doc/ref/statprof.texi')
-rw-r--r-- | doc/ref/statprof.texi | 414
1 file changed, 190 insertions, 224 deletions
diff --git a/doc/ref/statprof.texi b/doc/ref/statprof.texi
index 5b99fb6b8..6282811d1 100644
--- a/doc/ref/statprof.texi
+++ b/doc/ref/statprof.texi
@@ -6,279 +6,245 @@
 @node Statprof
 @section Statprof
-@code{(statprof)} is a fairly simple statistical profiler for Guile.
+Statprof is a statistical profiler for Guile.
 A simple use of statprof would look like this:
-@example
-(statprof-reset 0 50000 #t)
-(statprof-start)
-(do-something)
-(statprof-stop)
-(statprof-display)
+@example
+(use-modules (statprof))
+(statprof (lambda ()
+            (map 1+ (iota 1000000))
+            #f))
 @end example
-This would reset statprof, clearing all accumulated statistics, then
-start profiling, run some code, stop profiling, and finally display a
-gprof flat-style table of statistics which will look something like
-this:
-
-@example
-  %   cumulative      self             self     total
- time   seconds      seconds   calls  ms/call  ms/call  name
- 35.29      0.23        0.23    2002     0.11     0.11  -
- 23.53      0.15        0.15    2001     0.08     0.08  positive?
- 23.53      0.15        0.15    2000     0.08     0.08  +
- 11.76      0.23        0.08    2000     0.04     0.11  do-nothing
-  5.88      0.64        0.04    2001     0.02     0.32  loop
-  0.00      0.15        0.00       1     0.00   150.59  do-something
- ...
+This would run the thunk with statistical profiling, finally displaying
+a flat table of statistics which could look something like this:
+
+@example
+%     cumulative   self
+time   seconds     seconds  procedure
+ 57.14  39769.73      0.07  ice-9/boot-9.scm:249:5:map1
+ 28.57      0.04      0.04  ice-9/boot-9.scm:1165:0:iota
+ 14.29      0.02      0.02  1+
+  0.00      0.12      0.00  <current input>:2:10
+---
+Sample count: 7
+Total time: 0.123490713 seconds (0.201983993 seconds in GC)
 @end example
 All of the numerical data with the exception of the calls column is
 statistically approximate. In the following column descriptions, and in
-all of statprof, "time" refers to execution time (both user and system),
-not wall clock time.
-
-@table @asis
-@item % time
-The percent of the time spent inside the procedure itself (not counting
-children).
-
-@item cumulative seconds
-The total number of seconds spent in the procedure, including children.
-
-@item self seconds
-The total number of seconds spent in the procedure itself (not counting
-children).
-
-@item calls
-The total number of times the procedure was called.
-
-@item self ms/call
-The average time taken by the procedure itself on each call, in ms.
-
-@item total ms/call
-The average time taken by each call to the procedure, including time
-spent in child functions.
-
-@item name
-The name of the procedure.
-
-@end table
+all of statprof, ``time'' refers to execution time (both user and
+system), not wall clock time.
+
+The @code{% time} column indicates the percentage of the run-time time
+spent inside the procedure itself (not counting children).  It is
+calculated as @code{self seconds}, measuring the amount of time spent in
+the procedure, divided by the total run-time.
+
+@code{cumulative seconds} also counts time spent in children of a
+function.  For recursive functions, this can exceed the total time, as
+in our example above, because each activation on the stack adds to the
+cumulative time.
+
+Finally, the GC time measures the time spent in the garbage collector.
+On systems with multiple cores, this time can be larger than the run
+time, because it counts time spent in all threads, and will run the
+``marking'' phase of GC in parallel.  If GC time is a significant
+fraction of the run time, that means that most time in your program is
+spent allocating objects and cleaning up after those allocations.  To
+speed up your program, one good place to start would be to look at how
+to reduce the allocation rate.
+
+Statprof's main mode of operation is as a statistical profiler.  However
+statprof can also run in a ``precise'' mode as well.  Pass the
+@code{#:count-calls? #t} keyword argument to @code{statprof} to record
+all calls:
+
+@example
+(use-modules (statprof))
+(statprof (lambda ()
+            (map 1+ (iota 1000000))
+            #f)
+          #:count-calls? #t)
+@end example
-The profiler uses @code{eq?} and the procedure object itself to identify
-the procedures, so it won't confuse different procedures with the same
-name. They will show up as two different rows in the output.
+The result has an additional @code{calls} column:
+
+@example
+%     cumulative   self
+time   seconds     seconds    calls  procedure
+ 82.26      0.73      0.73  1000000  1+
+ 11.29 420925.80      0.10  1000001  ice-9/boot-9.scm:249:5:map1
+  4.84      0.06      0.04        1  ice-9/boot-9.scm:1165:0:iota
+[...]
+---
+Sample count: 62
+Total time: 0.893098065 seconds (1.222796536 seconds in GC)
+@end example
-Right now the profiler is quite simplistic. I cannot provide call-graphs
-or other higher level information. What you see in the table is pretty
-much all there is. Patches are welcome :-)
+As you can see, the profile is perturbed: @code{1+} ends up on top,
+whereas it was not marked as hot in the earlier profile.  This is
+because the overhead of call-counting unfairly penalizes calls.  Still,
+this precise mode can be useful at times to do algorithmic optimizations
+based on the precise call counts.
 @section Implementation notes
+
 The profiler works by setting the unix profiling signal
 @code{ITIMER_PROF} to go off after the interval you define in the call
-to @code{statprof-reset}. When the signal fires, a sampling routine is
-run which looks at the current procedure that's executing, and then
-crawls up the stack, and for each procedure encountered, increments that
-procedure's sample count. Note that if a procedure is encountered
-multiple times on a given stack, it is only counted once. After the
-sampling is complete, the profiler resets profiling timer to fire again
-after the appropriate interval.
-
-Meanwhile, the profiler keeps track, via @code{get-internal-run-time},
-how much CPU time (system and user -- which is also what
-@code{ITIMER_PROF} tracks), has elapsed while code has been executing
-within a statprof-start/stop block.
-
-The profiler also tries to avoid counting or timing its own code as much
-as possible.
+to @code{statprof-reset}.  When the signal fires, a sampling routine
+runs which crawls up the stack, recording all instruction pointers into
+a buffer.  After the sample is complete, the profiler resets profiling
+timer to fire again after the appropriate interval.
+
+Later, when profiling stops, that log buffer is analyzed to produce the
+``self seconds'' and ``cumulative seconds'' statistics.  A procedure at
+the top of the stack counts toward ``self'' samples, and everything on
+the stack counts towards ``cumulative'' samples.
+
+While the profiler is running it measures how much CPU time (system and
+user -- which is also what @code{ITIMER_PROF} tracks) has elapsed while
+code has been executing within the profiler.  Only run time counts
+towards the profile, not wall-clock time.  For example, sleeping and
+waiting for input or output do not cause the timer clock to advance.
 @section Usage
-@anchor{statprof statprof-active?}@defun statprof-active?
-Returns @code{#t} if @code{statprof-start} has been called more times
-than @code{statprof-stop}, @code{#f} otherwise.
-@end defun
-
-@anchor{statprof statprof-start}@defun statprof-start
-Start the profiler.@code{}
-
-@end defun
+@deffn {Scheme Procedure} statprof thunk @
+  [#:loop loop=1] [#:hz hz=100] @
+  [#:port port=(current-output-port)] @
+  [#:count-calls? count-calls?=#f] @
+  [#:display-style display-style='flat]
+Profile the execution of @var{thunk}, and return its return values.
-@anchor{statprof statprof-stop}@defun statprof-stop
-Stop the profiler.@code{}
+The stack will be sampled @var{hz} times per second, and the thunk
+itself will be called @var{loop} times.
-@end defun
+If @var{count-calls?} is true, all procedure calls will be recorded.
+This operation is somewhat expensive.
-@anchor{statprof statprof-reset}@defun statprof-reset sample-seconds sample-microseconds count-calls? [full-stacks?]
-Reset the statprof sampler interval to @var{sample-seconds} and
-@var{sample-microseconds}. If @var{count-calls?} is true, arrange to
-instrument procedure calls as well as collecting statistical profiling
-data. If @var{full-stacks?} is true, collect all sampled stacks into a
-list for later analysis.
+After the @var{thunk} has been profiled, print out a profile to
+@var{port}.  If @var{display-style} is @code{flat}, the results will be
+printed as a flat profile.  Otherwise if @var{display-style} is
+@code{tree}, print the results as a tree profile.
+@end deffn
-Enables traps and debugging as necessary.
+Profiling can also be enabled and disabled manually.
-@end defun
+@deffn {Scheme Procedure} statprof-active?
+Returns @code{#t} if @code{statprof-start} has been called more times
+than @code{statprof-stop}, @code{#f} otherwise.
+@end deffn
-@anchor{statprof statprof-accumulated-time}@defun statprof-accumulated-time
-Returns the time accumulated during the last statprof run.@code{}
+@deffn {Scheme Procedure} statprof-start
+@deffnx {Scheme Procedure} statprof-stop
+Start or stop the profiler.
+@end deffn
-@end defun
+@deffn {Scheme Procedure} statprof-reset sample-seconds sample-microseconds count-calls?
+Reset the profiling sample interval to @var{sample-seconds} and
+@var{sample-microseconds}.  If @var{count-calls?} is true, arrange to
+instrument procedure calls as well as collecting statistical profiling
+data.
+@end deffn
-@anchor{statprof statprof-sample-count}@defun statprof-sample-count
-Returns the number of samples taken during the last statprof run.@code{}
+If you use the manual @code{statprof-start}/@code{statprof-stop}
+interface, an implicit statprof state will persist starting from the
+last call to @code{statprof-reset}, or the first call to
+@code{statprof-start}.  There are a number of accessors to fetch
+statistics from this implicit state.
-@end defun
+@deffn {Scheme Procedure} statprof-accumulated-time
+Returns the time accumulated during the last statprof run.
+@end deffn
-@anchor{statprof statprof-fold-call-data}@defun statprof-fold-call-data proc init
-Fold @var{proc} over the call-data accumulated by statprof. Cannot be
-called while statprof is active. @var{proc} should take two arguments,
-@code{(@var{call-data} @var{prior-result})}.
+@deffn {Scheme Procedure} statprof-sample-count
+Returns the number of samples taken during the last statprof run.
+@end deffn
-Note that a given proc-name may appear multiple times, but if it does,
-it represents different functions with the same name.
+@deffn {Scheme Procedure} statprof-fold-call-data proc init
+Fold @var{proc} over the call-data accumulated by statprof.  This
+procedure cannot be called while statprof is active.
-@end defun
+@var{proc} will be called with arguments, @var{call-data} and
+@var{prior-result}.
+@end deffn
-@anchor{statprof statprof-proc-call-data}@defun statprof-proc-call-data proc
+@deffn {Scheme Procedure} statprof-proc-call-data proc
 Returns the call-data associated with @var{proc}, or @code{#f} if none
 is available.
+@end deffn
-@end defun
-
-@anchor{statprof statprof-call-data-name}@defun statprof-call-data-name cd
-@end defun
-
-@anchor{statprof statprof-call-data-calls}@defun statprof-call-data-calls cd
-@end defun
+@deffn {Scheme Procedure} statprof-call-data-name cd
+@deffnx {Scheme Procedure} statprof-call-data-calls cd
+@deffnx {Scheme Procedure} statprof-call-data-cum-samples cd
+@deffnx {Scheme Procedure} statprof-call-data-self-samples cd
+Accessors for the fields in a statprof call-data object.
+@end deffn
-@anchor{statprof statprof-call-data-cum-samples}@defun statprof-call-data-cum-samples cd
-@end defun
-
-@anchor{statprof statprof-call-data-self-samples}@defun statprof-call-data-self-samples cd
-@end defun
-
-@anchor{statprof statprof-call-data->stats}@defun statprof-call-data->stats call-data
+@deffn {Scheme Procedure} statprof-call-data->stats call-data
 Returns an object of type @code{statprof-stats}.
+@end deffn
+
+@deffn {Scheme Procedure} statprof-stats-proc-name stats
+@deffnx {Scheme Procedure} statprof-stats-%-time-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-cum-secs-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-self-secs-in-proc stats
+@deffnx {Scheme Procedure} statprof-stats-calls stats
+@deffnx {Scheme Procedure} statprof-stats-self-secs-per-call stats
+@deffnx {Scheme Procedure} statprof-stats-cum-secs-per-call stats
+Accessors for the fields in a @code{statprof-stats} object.
+@end deffn
+
+@deffn {Scheme Procedure} statprof-display @
+  [port=(current-output-port)] [#:style style=flat]
+Displays a summary of the statistics collected.  Possible values for
+@var{style} include:
-@end defun
-
-@anchor{statprof statprof-stats-proc-name}@defun statprof-stats-proc-name stats
-@end defun
-
-@anchor{statprof statprof-stats-%-time-in-proc}@defun statprof-stats-%-time-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-cum-secs-in-proc}@defun statprof-stats-cum-secs-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-self-secs-in-proc}@defun statprof-stats-self-secs-in-proc stats
-@end defun
-
-@anchor{statprof statprof-stats-calls}@defun statprof-stats-calls stats
-@end defun
-
-@anchor{statprof statprof-stats-self-secs-per-call}@defun statprof-stats-self-secs-per-call stats
-@end defun
-
-@anchor{statprof statprof-stats-cum-secs-per-call}@defun statprof-stats-cum-secs-per-call stats
-@end defun
-
-@anchor{statprof statprof-display}@defun statprof-display . _
-Displays a gprof-like summary of the statistics collected. Unless an
-optional @var{port} argument is passed, uses the current output port.
-
-@end defun
-
-@anchor{statprof statprof-display-anomolies}@defun statprof-display-anomolies
-A sanity check that attempts to detect anomolies in statprof's
-statistics.@code{}
-
-@end defun
+@table @code
+@item flat
+Display a traditional gprof-style flat profile.
+@item anomalies
+Find statistical anomalies in the data.
+@item tree
+Display a tree profile.
+@end table
+@end deffn
-@anchor{statprof statprof-fetch-stacks}@defun statprof-fetch-stacks
+@deffn {Scheme Procedure} statprof-fetch-stacks
 Returns a list of stacks, as they were captured since the last call to
 @code{statprof-reset}.
+@end deffn
-Note that stacks are only collected if the @var{full-stacks?} argument
-to @code{statprof-reset} is true.
-
-@end defun
-
-@anchor{statprof statprof-fetch-call-tree}@defun statprof-fetch-call-tree
+@deffn {Scheme Procedure} statprof-fetch-call-tree [#:precise precise?=#f]
 @verbatim
 Return a call tree for the previous statprof run.
-The return value is a list of nodes, each of which is of the type:
-@@code
- node ::= (@@var@{proc@} @@var@{count@} . @@var@{nodes@})
-@@end code
+The return value is a list of nodes.  A node is a list of the form:
+@code
+ node ::= (@var{proc} @var{count} . @var{nodes})
+@end code
+
+The @var{proc} is a printable representation of a procedure, as a
+string.  If @var{precise?} is false, which is the default, then a node
+corresponds to a procedure invocation.  If it is true, then a node
+corresponds to a return point in a procedure.  Passing @code{#:precise?
+#t} allows a user to distinguish different source lines in a procedure,
+but usually it is too much detail, so it is off by default.
 @end verbatim
-@end defun
-
-@anchor{statprof statprof}@defun statprof thunk [#:loop] [#:hz] [#:count-calls?] [#:full-stacks?]
-Profile the execution of @var{thunk}, and return its return values.
-
-The stack will be sampled @var{hz} times per second, and the thunk
-itself will be called @var{loop} times.
-
-If @var{count-calls?} is true, all procedure calls will be recorded.
-This operation is somewhat expensive.
-
-If @var{full-stacks?} is true, at each sample, statprof will store away
-the whole call tree, for later analysis. Use
-@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
-retrieve the last-stored stacks.
+@end deffn
-@end defun
+@deffn {Scheme Procedure} gcprof thunk [#:loop]
+Like the @code{statprof} procedure, but instead of profiling CPU time,
+we profile garbage collection.
-@anchor{statprof with-statprof}@defspec with-statprof args
-Profile the expressions in the body, and return the body's return
-value.
-
-Keyword arguments:
-
-@table @code
-@item #:loop
-Execute the body @var{loop} number of times, or @code{#f} for no looping
-
-default: @code{#f}
-
-@item #:hz
-Sampling rate
-
-default: @code{20}
-
-@item #:count-calls?
-Whether to instrument each function call (expensive)
-
-default: @code{#f}
-
-@item #:full-stacks?
-Whether to collect away all sampled stacks into a list
-
-default: @code{#f}
-
-@end table
-
-@end defspec
-
-@anchor{statprof gcprof}@defun gcprof thunk [#:loop] [#:full-stacks?]
-Do an allocation profile of the execution of @var{thunk}.
-
-The stack will be sampled soon after every garbage collection, yielding
-an approximate idea of what is causing allocation in your program.
+The stack will be sampled soon after every garbage collection during the
+evaluation of @var{thunk}, yielding an approximate idea of what is
+causing allocation in your program.
 Since GC does not occur very frequently, you may need to use the
 @var{loop} parameter, to cause @var{thunk} to be called @var{loop}
 times.
-If @var{full-stacks?} is true, at each sample, statprof will store away
-the whole call tree, for later analysis. Use
-@code{statprof-fetch-stacks} or @code{statprof-fetch-call-tree} to
-retrieve the last-stored stacks.
-
-@end defun
+@end deffn
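
To summarize the interfaces documented in the updated statprof.texi above, here is a short sketch of the two usage styles; `run-benchmark' is a placeholder for the code being profiled:

    (use-modules (statprof))

    ;; One-shot profiling of a thunk, printing a tree-style profile.
    (statprof (lambda () (run-benchmark))
              #:hz 100
              #:display-style 'tree)

    ;; Manual control over the implicit profiler state.
    (statprof-reset 0 10000 #f)   ; 10 ms sample interval, no call counting
    (statprof-start)
    (run-benchmark)
    (statprof-stop)
    (statprof-display (current-output-port) #:style 'flat)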