Debugging codes with valgrind

Paul Moeller edited this page Aug 23, 2018 · 2 revisions

If a python program segfaults, especially after its last line has executed, memory was probably corrupted at some point during execution. valgrind is a useful tool for tracking down invalid memory accesses. It requires that python be compiled specially. The steps below build a debug version of python and run the program under valgrind.

First install valgrind:

# yum install valgrind-devel

Download the source for the python version you use from https://www.python.org/downloads/source/ (for example, Python-2.7.14.tgz).

Extract the source and configure python so it is compatible with valgrind. Pass --prefix to configure to set a custom install directory (somewhere in your home directory).

$ tar zxf /vagrant/Python-2.7.14.tgz
$ cd Python-2.7.14
$ ./configure --without-pymalloc --with-pydebug --with-valgrind --prefix=/home/vagrant/pjm2
$ make
$ make install

$ export PATH=~/pjm2/bin/:$PATH
$ which python
python is /home/vagrant/pjm2/bin/python
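
Before going further, it can help to confirm that the python now on your PATH really is the debug build. The sketch below relies on the fact that --with-pydebug builds expose sys.gettotalrefcount while regular builds do not. The function name is_debug_python is made up for this page.

```shell
# Succeeds (exit status 0) if the given interpreter is a --with-pydebug build.
# Debug builds expose sys.gettotalrefcount; regular builds do not.
is_debug_python() {
    "$1" -c 'import sys; sys.exit(0 if hasattr(sys, "gettotalrefcount") else 1)'
}

# Example use (not run here):
# is_debug_python python && echo "debug build"
```

If the check fails, the PATH export above probably did not take effect in the current shell.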

Install pip.

$ curl https://bootstrap.pypa.io/3.3/get-pip.py > get-pip.py
$ python get-pip.py

Install any dependencies for the code:

$ pip install numpy

Build the target code in debug mode. If necessary, edit the Makefile to add "-g" to CFLAGS so that valgrind can report source file names and line numbers.

$ cd ~/src/radiasoft/SRW-light
(edit cpp/gcc/Makefile and add -g to CFLAGS)
$ make clean
$ make fftw
$ make

$ export PYTHONPATH=~/src/radiasoft/SRW-light/env/work/srw_python/
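
One way to check that the rebuilt library was actually compiled with -g is to look for a DWARF debug section in it. This is a minimal sketch, assuming readelf from binutils is installed and that the debug info is stored in the library itself rather than in a separate file. The function name has_debug_info and the path in the usage comment are placeholders.

```shell
# Succeeds if the given shared library or executable contains a
# .debug_info (or compressed .zdebug_info) section, i.e. was built with -g.
has_debug_info() {
    readelf -S "$1" | grep -q -E '\.z?debug_info'
}

# Example use (placeholder path; substitute the module your build produces):
# has_debug_info path/to/your_module.so && echo "debug info present"
```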

Ensure the python program still seg-faults with the debug version of python.

$ python crystal3.py
first mesh size: 8 x 8
running PropagElecField
running CalcIntFromElecField
complete ok
Segmentation fault (core dumped)

Run valgrind on the program.

$ valgrind --tool=memcheck --leak-check=full python crystal3.py &> out.txt

Look in the output (out.txt) for "Invalid write" or "Invalid read". The first such report is usually the most important one.

==7961== Invalid write of size 4
==7961==    at 0x6A1B319: srTRadGenManip::ExtractSingleElecIntensity2DvsXZ(srTRadExtract&) (srradmnp.cpp:519)
==7961==    by 0x6A23BF4: ExtractSingleElecIntensity (srradmnp.h:86)
==7961==    by 0x6A23BF4: srTRadGenManip::ExtractRadiation(int, int, int, int, double, double, double, char*) (srradmnp.cpp:1398)
==7961==    by 0x6949C1A: srwlCalcIntFromElecField (srwlib.cpp:703)
==7961==    by 0x693EB5C: srwlpy_CalcIntFromElecField(_object*, _object*) (srwlpy.cpp:3999)
==7961==    by 0x563F6C: PyCFunction_Call (methodobject.c:81)
==7961==    by 0x4D7638: call_function (ceval.c:4357)
==7961==    by 0x4D20E4: PyEval_EvalFrameEx (ceval.c:2994)
==7961==    by 0x4D4E06: PyEval_EvalCodeEx (ceval.c:3589)
==7961==    by 0x4C77B2: PyEval_EvalCode (ceval.c:669)
==7961==    by 0x507577: run_mod (pythonrun.c:1376)
==7961==    by 0x5074FD: PyRun_FileExFlags (pythonrun.c:1362)
==7961==    by 0x505CEA: PyRun_SimpleFileExFlags (pythonrun.c:948)
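
The valgrind log can be long, so a small grep helper is one way to pull out just these reports. The sketch below assumes the log is in the memcheck format shown above. The function names are made up for this page, and the default of 15 context lines is an arbitrary choice.

```shell
# Print every "Invalid read"/"Invalid write" report in a valgrind log,
# with line numbers, followed by up to N lines of call stack (default 15).
show_invalid_access() {
    grep -n -A "${2:-15}" -E 'Invalid (read|write) of size' "$1"
}

# Count the reports without printing them.
count_invalid_access() {
    grep -c -E 'Invalid (read|write) of size' "$1"
}

# Example use (not run here):
# show_invalid_access out.txt
# count_invalid_access out.txt
```

The report with the lowest line number is the one to start from, since later errors may only be consequences of it.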

Now you know where memory was first corrupted (in this example, the invalid write at srradmnp.cpp:519). Follow the call stack from there, add debug statements to the code, and recompile.
