Think you python code is slow?
Learn how to measure and improve the performance
Profiling
- Profiling is form of a dynamic analysis that measures the various aspects of the program like the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls
- That means measuring the time spent in each of its functions
- This will give you data about where your program is spending time, and what area might be worth optimising
Profilers in python
- cProfile and profile (python standard library modules) are generally used for determining the time that takes every Python function when running the code
- line_profiler (3rd party open source library) is a module for doing line-by-line profiling of functions
- memory_profiler (3rd party open source library) is a module for monitoring memory consumption of a process as well as line-by-line analysis
cProfile
Profiling code with cProfile is really quite easy. All you need to do is import the module and call its run function
import cProfile
#cProfile is recommended for most users
#It’s a C extension with reasonable overhead
#that makes it suitable for profiling
#long-running programs
def fib(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib(n-1) + fib(n-2)
def fib_seq(n):
seq = [ ]
if n > 0:
seq.extend(fib_seq(n-1))
seq.append(fib(n))
return seq
print '=' * 80
cProfile.run('print fib_seq(25); print')
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
57355 function calls (65 primitive calls) in 0.017 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.017 0.017 <string>:1(<module>)
21/1 0.000 0.000 0.017 0.017 fib.py:11(fib_seq)
57291/21 0.016 0.000 0.016 0.001 fib.py:3(fib)
21 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
20 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
The line after the output indicates that 57355 calls were monitored. Of those calls, 65 were primitive, meaning that the call was not induced via recursion The next line: Ordered by: standard name, indicates that the text string in the far right column was used to sort the output ncalls - the number of calls tottime - the total time spent in the given function (and excluding time made in calls to sub-functions) percall - is the quotient of tottime divided by ncalls
cumtime - is the cumulative time spent in this and all subfunctions (from invocation till exit)
percall - is the quotient of cumtime divided by primitive calls
filename:lineno(function) - provides the respective data of each function
When there are two numbers in the first column (for example 21/1), it means that the function recursed. The second value is the number of primitive calls and the former is the total number of calls. Note that when the function does not recurse, these two values are the same, and only the single figure is printed.
profile
Running profile is the same as cProfile, except that it adds significant overhead to profiled programs
import profile
#A pure Python module whose interface is imitated by cProfile
#but which adds significant overhead to profiled programs
#If you’re trying to extend the profiler in some way
#the task might be easier with this module
def fib(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib(n-1) + fib(n-2)
def fib_seq(n):
seq = [ ]
if n > 0:
seq.extend(fib_seq(n-1))
seq.append(fib(n))
return seq
print '=' * 80
profile.run('print fib_seq(25); print')
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
57356 function calls (66 primitive calls) in 0.187 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
21 0.000 0.000 0.000 0.000 :0(append)
20 0.000 0.000 0.000 0.000 :0(extend)
1 0.001 0.001 0.001 0.001 :0(setprofile)
1 0.000 0.000 0.186 0.186 <string>:1(<module>)
21/1 0.000 0.000 0.186 0.186 fib.py:11(fib_seq)
57291/21 0.185 0.000 0.185 0.009 fib.py:3(fib)
1 0.000 0.000 0.187 0.187 profile:0(print fib_seq(20); print)
0 0.000 0.000 profile:0(profiler)
Same program when profiled using cProfile took 17ms where as profile took 187ms
DEMO
line_profiler
- line_profiler will profile the time individual lines of code take to execute
- The profiler is implemented in C via Cython in order to reduce the overhead of profiling
- Also included is the script kernprof.py which can be used to conveniently profile Python applications and scripts either with line_profiler or with the function-level profiling tools in the python standard library
- LineProfiler can be given functions to profile, and it will time the execution of each individual line inside those functions
- It does need to be explicitly told what functions to profile. The easiest way to get started is to use the kernprof script kernprof -l script_to_profile.py
- kernprof will create an instance of LineProfiler and insert it into the __builtins__ namespace with the name profile. It has been written to be used as a decorator, so in your script, you decorate the functions you want to profile with @profile.
@profile
def fib(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib(n-1) + fib(n-2)
@profile
def fib_seq(n):
seq = [ ]
if n > 0:
seq.extend(fib_seq(n-1))
seq.append(fib(n))
return seq
print '=' * 90
print fib_seq(20)
kernprof -l fib.py
To view the output
kernprof -l -v fib.py
python -m line_profiler fib.py.lprof
(or)
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
Wrote profile results to fib.py.lprof
Timer unit: 1e-06 s
Total time: 0.073717 s
File: fib.py
Function: fib at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 @profile
2 def fib(n):
3 57291 18929 0.3 25.7 if n == 0:
4 10946 3258 0.3 4.4 return 0
5 46345 15603 0.3 21.2 elif n == 1:
6 17710 5398 0.3 7.3 return 1
7 else:
8 28635 30529 1.1 41.4 return fib(n-1) + fib(n-2)
Total time: 0.192911 s
File: fib.py
Function: fib_seq at line 10
Line # Hits Time Per Hit % Time Line Contents
==============================================================
10 @profile
11 def fib_seq(n):
12 21 14 0.7 0.0 seq = [ ]
13 21 11 0.5 0.0 if n > 0:
14 20 60 3.0 0.0 seq.extend(fib_seq(n-1))
15 21 192820 9181.9 100.0 seq.append(fib(n))
16 21 6 0.3 0.0 return seq
Line #: The line number in the file
Hits: The number of times that line was executed
Time: The total amount of time spent executing the line in the timer's units. In the header information before the tables, you will see a line "Timer unit:" giving the conversion factor to seconds
Per Hit: The average amount of time spent executing the line once in the timer's units
% Time: The percentage of time spent on that line relative to the total amount of recorded time spent in the function
Line Contents: The actual source code. Note that this is always read from disk when the formatted results are viewed, not when the code was executed.
DEMO
memory_profiler
-
python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for python programs
-
It is a pure python module and has the psutil module as optional (but highly recommended) dependencies
-
Also included is the script mprof which gives full memory usage reports as a function of time (not line-by-line) of external processes (be it python scripts or not)
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
Filename: fib.py
Line # Mem usage Increment Line Contents
================================================
1 17.332 MiB 0.000 MiB @profile
2 def fib(n):
3 17.332 MiB 0.000 MiB if n == 0:
4 17.332 MiB 0.000 MiB return 0
5 17.332 MiB 0.000 MiB elif n == 1:
6 17.332 MiB 0.000 MiB return 1
7 else:
8 17.332 MiB 0.000 MiB return fib(n-1) + fib(n-2)
Filename: fib.py
Line # Mem usage Increment Line Contents
================================================
10 17.324 MiB 0.000 MiB @profile
11 def fib_seq(n):
12 17.324 MiB 0.000 MiB seq = [ ]
13 17.324 MiB 0.000 MiB if n > 0:
14 17.332 MiB 0.008 MiB seq.extend(fib_seq(n-1))
15 17.332 MiB 0.000 MiB seq.append(fib(n))
16 17.332 MiB 0.000 MiB return seq
Generic timing with time
A quick and dirty method of timing python code - the good old unix utility time
def fib(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib(n-1) + fib(n-2)
def fib_seq(n):
seq = [ ]
if n > 0:
seq.extend(fib_seq(n-1))
seq.append(fib(n))
return seq
print '=' * 90
print fib_seq(20)
time python fib.py
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
python fib.py
0.02s user
0.01s system
0.034 total
total is wall clock time - time from start to finish of the call. This is all elapsed time including time slices used by other processes and time the process spends blocked
user is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only actual CPU time used in executing the process
system is the amount of CPU time spent in the kernel within the process. This means executing CPU time spent in system calls within the kernel, as opposed to library code, which is still running in user-space. Like 'user', this is only CPU time used by the process
Elegant timing with a timing context manager
import time
class Timer(object):
def __init__(self, verbose=False):
self.verbose = verbose
def __enter__(self):
self.start = time.time()
return self
def __exit__(self, *args):
self.end = time.time()
self.secs = self.end - self.start
self.msecs = self.secs * 1000
if self.verbose:
print 'elapsed time: %f ms' % self.msecs
==========================================================================================
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765]
Time taken for fib_seq 0.00767183303833 s
To use it, wrap blocks of code that you want to time with Python’s with keyword and this Timer context manager. It will take care of starting the timer when your code block begins execution and stopping the timer when your code block ends.
from timer import Timer
def fib(n):
if n == 0:
return 0
elif n == 1:
return 1
else:
return fib(n-1) + fib(n-2)
def fib_seq(n):
seq = [ ]
if n > 0:
seq.extend(fib_seq(n-1))
seq.append(fib(n))
return seq
print '=' * 90
with Timer() as t:
print fib_seq(20)
print "Time taken %s s" % t.secs
Think you python code is slow?
By ambi
Think you python code is slow?
- 1,142