Irixo edited this page Dec 26, 2020 · 1 revision

Line-by-line memory usage

The line-by-line memory usage mode works much like line_profiler: first decorate the function you would like to profile with @profile, then run the script through the profiler (in this case by passing specific arguments to the Python interpreter).

In the following example, we create a simple function my_func that allocates lists a, b and then deletes b:

@profile
def my_func():
	a = [1] * (10 ** 6)
	b = [2] * (2 * 10 ** 7)
	del b
	return a

if __name__ == '__main__':
	my_func()

Execute the code by passing the option -m memory_profiler to the Python interpreter. This loads the memory_profiler module and prints the line-by-line analysis to stdout. If the file is named example.py, the command is:

$ python -m memory_profiler example.py

The output looks like this:

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
	 3   38.816 MiB   38.816 MiB           1   @profile
	 4                                         def my_func():
	 5   46.492 MiB    7.676 MiB           1       a = [1] * (10 ** 6)
	 6  199.117 MiB  152.625 MiB           1       b = [2] * (2 * 10 ** 7)
	 7   46.629 MiB -152.488 MiB           1       del b
	 8   46.629 MiB    0.000 MiB           1       return a

The first column is the line number of the profiled code. The second column (Mem usage) is the memory usage of the Python interpreter after that line has been executed. The third column (Increment) is the difference in memory between the current line and the previous one. The fourth column (Occurences) is the number of times the line was executed. The last column (Line Contents) shows the profiled code.
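As a quick check on how Increment relates to Mem usage, the sketch below recomputes the increments from the Mem usage values in the sample output above. The variable names are illustrative, and the zero baseline for the first row is an assumption chosen because the first row reports the same value in both columns.

```python
# Mem usage values (MiB) copied from the sample output, keyed by line number.
mem_usage = [(3, 38.816), (5, 46.492), (6, 199.117), (7, 46.629), (8, 46.629)]

previous = 0.0  # assumed baseline, so the first increment equals the first reading
increments = []
for line_no, mem in mem_usage:
    # Increment = memory after this line minus memory after the previous one.
    increments.append((line_no, round(mem - previous, 3)))
    previous = mem

for line_no, inc in increments:
    print(f"line {line_no}: {inc:+.3f} MiB")
```

The recomputed values match the Increment column, including the negative entry for del b.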

Decorator

A function decorator is also available. Use as follows:

from memory_profiler import profile

@profile
def my_func():
	a = [1] * (10 ** 6)
	b = [2] * (2 * 10 ** 7)
	del b
	return a

In this case the script can be run without specifying -m memory_profiler on the command line.

With the function decorator, you can specify the precision as an argument to the decorator. Use as follows:

from memory_profiler import profile

@profile(precision=4)
def my_func():
	a = [1] * (10 ** 6)
	b = [2] * (2 * 10 ** 7)
	del b
	return a

If a Python script that uses the @profile decorator is run with -m memory_profiler on the command line, the precision parameter is ignored.

Time-based memory usage

Sometimes it is useful to have full memory usage reports as a function of time (not line-by-line) of external processes (be it Python scripts or not). In this case the executable mprof might be useful. Use it like:

mprof run <executable>
mprof plot

The first line runs the executable and records its memory usage over time in a file written to the current directory. Once it has finished, the second line produces a plot. The recorded file carries a timestamp, which allows several profiles to be kept at the same time.

Help on each mprof subcommand can be obtained with the -h flag, e.g. mprof run -h.

In the case of a Python script, using the previous command does not give you any information on which function is executed at a given time. Depending on the case, it can be difficult to identify the part of the code that is causing the highest memory usage.

Adding the profile decorator to a function and running the Python script with

mprof run <script>

will record timestamps when entering/leaving the profiled function. Running

mprof plot

afterward will plot the result, making plots (using matplotlib) similar to these:

or, with mprof plot --flame (the function and timestamp names will appear on hover):

A discussion of these capabilities can be found here.

Warning

If your Python file imports the memory profiler with "from memory_profiler import profile", these timestamps will not be recorded. Comment out the import, leave your functions decorated, and re-run.
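With the import commented out, the same file no longer runs under a plain python invocation, because the name profile is then undefined. One common workaround is a do-nothing fallback, sketched below. It assumes that the profiler runner makes profile available as a builtin, which is not stated above, and it uses smaller allocations than the earlier example to keep the sketch quick.

```python
import builtins

# Assumption: when the script runs under the profiler, `profile` is injected
# into builtins. If it is absent (plain `python script.py`), define a
# decorator that returns the function unchanged.
if not hasattr(builtins, "profile"):
    def profile(func):
        return func

@profile
def my_func():
    # Smaller allocations than the main example, purely for illustration.
    a = [1] * (10 ** 5)
    b = [2] * (2 * 10 ** 6)
    del b
    return a

if __name__ == '__main__':
    my_func()
```

The decorated functions stay in place either way, so nothing needs to be edited when switching between profiled and unprofiled runs.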

The available commands for mprof are:

  • mprof run: running an executable, recording memory usage
  • mprof plot: plotting one of the recorded memory usages (by default, the last one)
  • mprof list: listing all recorded memory usage files in a user-friendly way.
  • mprof clean: removing all recorded memory usage files.
  • mprof rm: removing specific recorded memory usage files
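To illustrate what "the last one" can mean when several recordings sit in the same directory, the sketch below picks the most recent file from a list of names. The file names are hypothetical, the mprofile_<YYYYMMDDhhmmss>.dat naming pattern is an assumption, and this is not presented as how mprof itself selects the file.

```python
# Hypothetical recorded files; the naming pattern is an assumption.
recorded = [
    "mprofile_20201226101500.dat",
    "mprofile_20201225093000.dat",
    "mprofile_20201226174512.dat",
]

def latest_profile(names):
    """Return the most recent file name, or None if the list is empty."""
    # Fixed-width timestamps sort chronologically as plain strings.
    return max(names, default=None)

print(latest_profile(recorded))
```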

Tracking forked child processes

In a multiprocessing context the main process will spawn child processes whose system resources are allocated separately from the parent process. This can lead to an inaccurate report of memory usage, since by default only the parent process is tracked. The mprof utility provides two mechanisms to track the usage of child processes: adding the memory of all children to the parent's usage, or tracking each child individually.

To create a report that combines the memory usage of all the children and the parent, use the include_children flag, either in the profile decorator or as a command line argument to mprof:

mprof run --include-children <script>

The second method tracks each child independently of the main process, serializing child rows by index to the output stream. Use the multiprocess flag and plot as follows:

mprof run --multiprocess <script>
mprof plot

This will create a plot using matplotlib similar to this:

You can combine both the include_children and multiprocess flags to show the total memory of the program as well as each child individually. If using the API directly, note that the return from memory_usage will include the child memory in a nested list along with the main process memory.

Plot settings

By default, the command line call is set as the graph title. If you wish to customize it, you can use the -t option to manually set the figure title.

mprof plot -t 'Recorded memory usage'

You can also hide the function timestamps using the -n flag, such as

mprof plot -n

Trend lines and their numeric slopes can be plotted using the -s flag, such as

mprof plot -s

The intended usage of the -s switch is to check the numerical slope shown in the labels over a significant time period:

  • >0: it might mean a memory leak.
  • ~0: if 0 or near 0, the memory usage may be considered stable.
  • <0: to be interpreted depending on the expected process memory usage patterns; it might also mean that the sampling period is too small.

The trend lines are for illustrative purposes and are plotted as (very) small dashed lines.
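The slope reading described above can be sketched as a least-squares fit over (time, memory) samples. This is a minimal illustration under stated assumptions, not a description of how mprof computes its trend line: the helper names, the sample data, and the tolerance value are all hypothetical.

```python
def trend_slope(times, mems):
    """Least-squares slope of memory (MiB) against time (s)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(times) / n
    mean_m = sum(mems) / n
    cov = sum((t - mean_t) * (m - mean_m) for t, m in zip(times, mems))
    var = sum((t - mean_t) ** 2 for t in times)
    if var == 0:
        raise ValueError("all timestamps are identical")
    return cov / var

def interpret(slope, tolerance=0.01):
    """Map a slope to the three cases listed above."""
    # tolerance (MiB/s) is an arbitrary threshold chosen for illustration.
    if slope > tolerance:
        return "possible leak"
    if slope < -tolerance:
        return "decreasing"
    return "stable"

# Hypothetical samples: one steadily growing process, one flat process.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
growing = [50.0, 52.0, 54.0, 56.0, 58.0]
flat = [50.0, 50.0, 50.0, 50.0, 50.0]

print(trend_slope(times, growing), interpret(trend_slope(times, growing)))
print(trend_slope(times, flat), interpret(trend_slope(times, flat)))
```

As with the plotted trend line, the result is only meaningful over a time window long enough to average out short-lived allocations.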

Setting debugger breakpoints

It is possible to set breakpoints depending on the amount of memory used. That is, you can specify a threshold, and as soon as the program uses more memory than the threshold it will stop execution and drop into the pdb debugger. To use it, decorate the function with @profile as described in the line-by-line section above, then run your script with the option -m memory_profiler --pdb-mmem=X, where X is a number representing the memory threshold in MB. For example:

$ python -m memory_profiler --pdb-mmem=100 my_script.py

will run my_script.py and step into the pdb debugger as soon as the code uses more than 100 MB in the decorated function.