Finding memory leaks in native code

Topics

  • Manual memory handling
  • Memory leaks
  • Valgrind
  • Massif
  • Examples

Manual memory handling

(for GCed language devs)

Manual memory handling

In C, C++ and some other languages, you need to manage memory allocation and deallocation yourself. It is not handled automatically by the runtime.

In C, this means using malloc() and free(): malloc() allocates a memory block of a given size, and free() deallocates one.
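A minimal sketch of the allocate, use, free pattern. The function name sum_of_squares is made up for illustration; only malloc() and free() come from the C standard library.

```c
#include <stdlib.h>

/* Hypothetical example: sums 0^2 .. (n-1)^2 using a temporary heap array.
   Returns -1 if the allocation fails. */
long sum_of_squares(size_t n) {
  if (n == 0) {
    return 0;                                 /* nothing to allocate */
  }
  long *values = malloc(n * sizeof *values);  /* allocate room for n longs */
  if (values == NULL) {
    return -1;                                /* malloc() can fail */
  }
  long total = 0;
  for (size_t i = 0; i < n; i++) {
    values[i] = (long)(i * i);
    total += values[i];
  }
  free(values);                               /* deallocate exactly once */
  return total;
}
```

Every path that gets past a successful malloc() reaches the single free() before returning.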

 

Some libraries support reference counting: you add and remove references to an object, and it is freed when the count drops to zero.
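A rough sketch of what reference counting can look like in C. RcBuffer and the rc_buffer_* functions are hypothetical names, not the API of any particular library, and this version is not thread-safe.

```c
#include <stdlib.h>

/* Hypothetical reference-counted buffer. */
typedef struct {
  int refcount;
  size_t size;
  char *data;
} RcBuffer;

/* Creates a buffer holding one reference, owned by the caller.
   Returns NULL if allocation fails. */
RcBuffer *rc_buffer_new(size_t size) {
  RcBuffer *buf = malloc(sizeof *buf);
  if (buf == NULL) {
    return NULL;
  }
  buf->data = malloc(size);
  if (buf->data == NULL && size > 0) {
    free(buf);
    return NULL;
  }
  buf->refcount = 1;
  buf->size = size;
  return buf;
}

/* Adds a reference. */
void rc_buffer_ref(RcBuffer *buf) {
  buf->refcount++;
}

/* Drops a reference and frees the buffer when the last one is gone.
   Returns 1 if the buffer was freed, 0 otherwise. */
int rc_buffer_unref(RcBuffer *buf) {
  if (--buf->refcount > 0) {
    return 0;
  }
  free(buf->data);
  free(buf);
  return 1;
}
```

The leak risk does not disappear with this scheme: a forgotten rc_buffer_unref() keeps the count above zero forever.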

Why do it manually

There are a number of reasons for doing manual memory handling, the main one being history. Efficient techniques for automatically handling memory management were not known in the early days of programming.

 

Even with the latest developments, automatic memory management usually has a cost, in time, memory, flexibility, complexity or several of those.

Manual leaks

Although there are some advantages to manual memory management, the largest disadvantage is that it is error-prone: the programmer must remember to free all memory exactly once.

 

If memory is freed twice, it may cause an immediate abort, or it may just corrupt state and cause errors later.

 

If memory is not freed when it should be, it causes a memory leak, and the program will increase in memory consumption over time.

 

Manual leaks

Simple memory management (without reference counting) gets hard when there are complex ownership issues. Who should free the memory?
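One common way to answer that question is to state the ownership rule in each function's contract. A small sketch, with the hypothetical functions join_path and path_length:

```c
#include <stdlib.h>
#include <string.h>

/* Convention 1: the callee allocates, the caller owns the result
   and must free() it. Returns NULL if allocation fails. */
char *join_path(const char *dir, const char *name) {
  size_t len = strlen(dir) + 1 + strlen(name) + 1;  /* dir + '/' + name + NUL */
  char *path = malloc(len);
  if (path == NULL) {
    return NULL;
  }
  strcpy(path, dir);
  strcat(path, "/");
  strcat(path, name);
  return path;
}

/* Convention 2: the function only borrows `path`; the caller keeps
   ownership and is still responsible for freeing it. */
size_t path_length(const char *path) {
  return strlen(path);
}
```

A caller would write `char *p = join_path("/tmp", "x");`, use p, and then call free(p) itself. Problems start when two parts of a program both believe they own p, or neither does.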

Manual leaks

#include <stdlib.h>

int foo(void) {
  char *buffer = malloc(1024);
  int n = 0;

  // do work

  return n;
  // oops, we didn't free `buffer`
}
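For comparison, a sketch of a version that does not leak. The name foo_fixed, the -1 error value, and the placeholder work are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical variant of foo() that frees the buffer before returning. */
int foo_fixed(void) {
  char *buffer = malloc(1024);
  int n = -1;                 /* assumed error value */
  if (buffer == NULL) {
    return n;                 /* nothing was allocated, nothing to free */
  }

  /* do work; here we only touch the buffer as a placeholder */
  buffer[0] = 'x';
  n = 1;

  free(buffer);               /* every path past a successful malloc() ends here */
  return n;
}
```

The more early returns a function has, the easier it is to miss one of them, which is why the buggy form is so common.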

Valgrind

Memory problem detector

Valgrind

Valgrind is a suite of tools which detect and report memory and threading problems:

  • Leaks (memcheck)
  • Excess usage (massif)
  • Data races (helgrind)

Valgrind - memcheck

The simplest of the tools is memcheck, which reports accesses of memory outside allocated blocks, double frees, and optionally memory that is not freed when the program ends.

 

Memcheck is the default tool if none is specified, and --leak-check=yes makes it report each leaked block in detail.

 

valgrind --leak-check=yes myprog arg1 arg2

memcheck - invalid use

When started, your program will run (10-50 times slower than usual), and report problems such as this one for writing off the end of an array:

 

  ==19182== Invalid write of size 4
  ==19182==    at 0x804838F: f (example.c:6)
  ==19182==    by 0x80483AB: main (example.c:11)
  ==19182==  Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
  ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
  ==19182==    by 0x8048385: f (example.c:5)
  ==19182==    by 0x80483AB: main (example.c:11)
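A typical cause of this kind of report is an off-by-one loop bound. The sketch below uses hypothetical function names; with count = 10 on a platform where int is 4 bytes, the block is 40 bytes and the stray write lands immediately after it, which is the shape of the report above.

```c
#include <stdlib.h>

/* Buggy: `i <= count` writes one element past the end of the block.
   Shown for illustration only; calling it is undefined behaviour. */
int *make_zeros_buggy(size_t count) {
  int *x = malloc(count * sizeof *x);
  if (x == NULL) {
    return NULL;
  }
  for (size_t i = 0; i <= count; i++) {
    x[i] = 0;   /* when i == count: invalid write of sizeof(int) bytes */
  }
  return x;
}

/* Fixed: `i < count` stays inside the allocated block.
   The caller owns the result and must free() it. */
int *make_zeros(size_t count) {
  int *x = malloc(count * sizeof *x);
  if (x == NULL) {
    return NULL;
  }
  for (size_t i = 0; i < count; i++) {
    x[i] = 0;
  }
  return x;
}
```

Without memcheck, the buggy version often appears to work, because the extra write may land in heap bookkeeping or padding and only cause trouble much later.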

memcheck - leak

If you forget to free memory and no pointers to it remain, you will get:

  ==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
  ==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
  ==19182==    by 0x8048385: f (a.c:5)
  ==19182==    by 0x80483AB: main (a.c:11)

 

It may also report:

  • Indirectly lost - there is a pointer to it, but only from memory that is itself lost
  • Possibly lost - there is a pointer to it, but to the middle of the block rather than the start
  • Still reachable - a pointer to it still exists at exit, for example in a global variable
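A small sketch of how these categories can arise, using a hypothetical two-node list. How memcheck classifies a given block can vary with what stale pointers happen to remain in memory, so treat the comments as the typical outcome rather than a guarantee.

```c
#include <stdlib.h>

typedef struct Node {
  struct Node *next;
  int value;
} Node;

/* A block stored here and never freed would typically be reported
   as "still reachable" when the program exits. */
static Node *global_node;

/* Builds a two-node list head -> tail. The caller owns both nodes.
   Returns NULL if allocation fails. */
Node *make_list(void) {
  Node *head = malloc(sizeof *head);
  Node *tail = malloc(sizeof *tail);
  if (head == NULL || tail == NULL) {
    free(head);
    free(tail);
    return NULL;
  }
  tail->next = NULL;
  tail->value = 2;
  head->next = tail;
  head->value = 1;
  return head;
}

/* Frees every node. If the caller skipped this and dropped its pointer
   to the head, the head would typically be "definitely lost" and the
   tail "indirectly lost", since only the lost head points to it. */
void free_list(Node *head) {
  while (head != NULL) {
    Node *next = head->next;
    free(head);
    head = next;
  }
}
```

Fixing the definitely lost block usually fixes the indirectly lost ones too, since freeing the head properly means walking the list.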

memcheck - suppressions

It is fairly common to have libraries which allocate memory during initialisation and do not free it, since it is needed for the life of the program, and freeing on exit is (mostly) pointless.

 

To deal with this, you can also use a "suppressions" file, which lists well-known cases of that. Your libraries probably already have one.

memcheck - fixing problems

What do you do if memcheck reports a memory leak or illegal access? Fix it!

Illegal accesses and "definitely lost" reports always indicate problems unless you are doing funny tricks (like XOR-linked lists). Determining the cause may require a lot of thinking about ownership.

 

Other reported leaks may be real or false positives.

massif - memory profiling

Aside from leaks, another kind of problem is "unexpected memory retention" (which Java people call "leaks").

 

This problem does not involve losing pointers, but occurs when memory is not freed at the time it should be, so memory usage is higher than expected/needed.
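A sketch of what this can look like in C, assuming a hypothetical request log that is appended to on every request and never trimmed. All names are invented for illustration.

```c
#include <stdlib.h>

typedef struct Entry {
  struct Entry *next;
  int request_id;
} Entry;

/* Always reachable through this global, so memcheck would not report
   the entries as lost, yet nothing ever removes old ones. */
static Entry *request_log;
static size_t request_log_length;

/* Records one request. Returns 0 on success, -1 if allocation fails. */
int record_request(int request_id) {
  Entry *e = malloc(sizeof *e);
  if (e == NULL) {
    return -1;
  }
  e->request_id = request_id;
  e->next = request_log;
  request_log = e;
  request_log_length++;
  return 0;
}

/* The step that is missing in the retention scenario:
   releasing entries once they are no longer needed. */
void clear_request_log(void) {
  while (request_log != NULL) {
    Entry *next = request_log->next;
    free(request_log);
    request_log = next;
  }
  request_log_length = 0;
}
```

If clear_request_log() is never called, memory use grows with every request even though no pointer is ever lost.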

 

Java gives us good tools to analyse heap dumps and find these, but for native code you need to be more creative.

massif - memory profiling

Massif collects data about memory usage and creates histograms of allocation sites. You can view total memory usage over time.

 

For the point in time which has peak memory usage (and a few others), it can produce a tree-histogram of how much memory is allocated at various sites.

 

Run with

 valgrind --tool=massif ...

massif - memory profiling

If the unexpected memory retention rises over time, but it is not an actual lost-pointer leak, you can compare data over time.

 

If you subtract the memory used early on from the memory used later, the allocation sites whose usage has risen the most are the likely cause of the unexpected retention.
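The subtraction itself is simple arithmetic. A sketch, assuming per-site byte totals have already been extracted from two snapshots into arrays that list the same sites in the same order; the SiteUsage type and largest_growth function are hypothetical and not part of Massif.

```c
#include <stddef.h>

/* Hypothetical per-site total taken from one snapshot. */
typedef struct {
  const char *site;   /* allocation site, e.g. a function name */
  long bytes;         /* bytes attributed to that site */
} SiteUsage;

/* Returns the index of the site whose usage grew the most between the
   early and late snapshots, or -1 if count is 0. */
int largest_growth(const SiteUsage *early, const SiteUsage *late, size_t count) {
  int best = -1;
  long best_delta = 0;
  for (size_t i = 0; i < count; i++) {
    long delta = late[i].bytes - early[i].bytes;  /* later minus earlier */
    if (best < 0 || delta > best_delta) {
      best = (int)i;
      best_delta = delta;
    }
  }
  return best;
}
```

Sites with a large but constant footprint drop out of the comparison, which leaves the ones that keep growing.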

Example

 

Memory retention in native JVM code

Finding them in JVMs

Java has automatic memory handling via garbage collection, but the (Oracle/OpenJDK) JVM itself is implemented in C++ which does not, and the JDK uses native libraries.

 

This means there can be native memory leaks in the JVM. It is one of the trickier programs to find them in, due to self-modifying JIT-compiled code and unusual allocation patterns.

Finding them in JVMs

To use the tools, you need debugging symbols installed, so this works only with OpenJDK and not Oracle JDK.

If you have launcher scripts, use --trace-children=yes.

For the JVM, add -Djava.compiler=NONE, which disables JIT compilation and so makes it very slow.

Example

From a support case where the JVM's resident memory increased over time by several GB.

The JVM was too old to use Native Memory Tracking (available in 7u40+), so Massif was used.

Let it run for as long as possible, to leak as much memory as possible, and then look at the results.

References

MemCheck - http://valgrind.org/docs/manual/mc-manual.html

Massif - http://valgrind.org/docs/manual/ms-manual.html

JVM - https://access.redhat.com/articles/1277173

 

Visualisation - https://projects.kde.org/projects/extragear/sdk/massif-visualizer

 

Future replacement? - http://milianw.de/blog/heaptrack-a-heap-memory-profiler-for-linux

Finding memory leaks in native code

By doctau
