If you want to profile a Python application, you can get some basic functionality with the cProfile and pstats modules, as described in the Python Profilers documentation. After some curious experimenting, I discovered that the pstats module provides a handy interactive mode. To my surprising, I didn't found any substantial documentation about this time saver, so let me entertain you with a dump of my findings.
In the documentation you'll find a
Instant User's Manual
to get the impatient up an running in no time. E.g. to profile the
entry point foo()
, execute the following in your script/module:
import cProfile
cProfile.run('foo()', 'foo.profile')
This will profile the foo()
function, and save the profiling data to a
file foo.profile
.
If you want to profile the whole application, you even don't have to
edit your code, as you can run cProfile as a script with your application,
say barbazzr.py
, as argument:
python -m cProfile -o barbazzr.profile barbazzr.py
This will save the profiling data to barbazzr.profile
.
Now, to examine the profile data (foo.profile
or barbazzr.profile
in the examples above), you can use the pstats module as described in the
Instant User's Manual.
But I've never liked typing these long chained function calls.
Much easier is using the interactive mode of the pstats module by running it as a script with the profile data file as argument:
python -m pstats barbazzr.profile
And voila, you are in an interactive profile statistics browsing session:
Welcome to the profile statistics browser.
%
Use the help command to get the supported commands:
% help
Documented commands (type help <topic>):
========================================
EOF add callees callers quit read reverse sort stats strip
Undocumented commands:
======================
help
Get more help on a command, e.g. the stats command:
% help stats
Print statistics from the current stat object.
Arguments may be:
* An integer maximum number of entries to print.
* A decimal fractional number between 0 and 1, controlling
what fraction of selected entries to print.
* A regular expression; only entries with function names
that match it are printed.
Strip the long file paths to make the reports easier on the eyes:
% strip
Let's sort. Get the sorting options with the sort command (without arguments):
% sort
Valid sort keys (unique prefixes are accepted):
stdname -- standard name
nfl -- name/file/line
pcalls -- call count
file -- file name
calls -- call count
time -- internal time
line -- line number
cumulative -- cumulative time
module -- file name
name -- function name
Sort on internal time:
% sort time
And show the top 10 functions:
% stats 10
Thu Jan 21 17:54:33 2010 barbazrr.profile
18195692 function calls (18084048 primitive calls) in 37.803 CPU seconds
Ordered by: internal time
List reduced from 979 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function)
22 7.644 0.347 7.645 0.347 arff.py:382(as_dict)
19964 1.664 0.000 2.483 0.000 backend_ps.py:456(_convert_path)
124176 1.541 0.000 2.307 0.000 arff.py:191(append)
73 1.044 0.014 1.044 0.014 {method 'close' of 'file' objects}
597349 0.978 0.000 1.475 0.000 labelhistogram.py:137(observe)
22 0.967 0.044 2.275 0.103 labelhistogram.py:40(observe_from_arff)
1368524 0.959 0.000 1.312 0.000 labelhistogram.py:21(observe)
22 0.809 0.037 0.809 0.037 {built-in method write_png}
39700 0.774 0.000 0.774 0.000 numeric.py:1500(identity)
95736 0.719 0.000 0.719 0.000 {numpy.core.multiarray.array}
Sort on cumulative time and get the top 5:
% sort cumulative
% stats 5
Thu Jan 21 17:54:33 2010 barbazrr.profile
18195692 function calls (18084048 primitive calls) in 37.803 CPU seconds
Ordered by: cumulative time
List reduced from 979 to 5 due to restriction <5>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.097 0.097 37.803 37.803 <string>:1(<module>)
1 0.002 0.002 37.706 37.706 annotation_shifts.py:301(main)
11 0.092 0.008 34.746 3.159 annotation_shifts.py:44(analyse)
22 0.001 0.000 18.514 0.842 htmlreport.py:159(addPylabFigure)
44 0.001 0.000 18.500 0.420 figure.py:927(savefig)
Very handy indeed.
Playing with the other commands or options are left as an exercise for the reader (I've always wanted to say this).
Some notes
- Although the cProfile and pstats modules are listed on the
Python Standard Library
page, they were not installed by default on my setup (Ubuntu).
A simple install of the
python-profiler
package did the trick. -
Running your application with
python -m cProfile -o barbazzr.profile barbazzr.py
appears to mess with the
__file__
variable: it evaluates to the file name of the cProfile module (/usr/lib/python2.6/cProfile.py
in my case), instead of the file name of the script/module where__file__
variable is used. The workaround is to do the profiling with the first approach:import cProfile cProfile.run('foo()', 'foo.profile')