Analyzing memory use with SystemTap

So last year, GLib gained support for SystemTap. I used this for a bit to analyze memory usage in GNOME Shell at the time, but for Fedora 14, we forgot to pass --enable-systemtap (oops!), and so shipped without the support. This is now fixed in Fedora 15, so we can instrument any GLib program “out of the box” in a variety of ways.

Now, tracing and performance analysis are extremely complex subjects, and there are a ton of different tools out there for Linux. Tracing user space in particular is still an area under active development. But what I want to talk about today is using SystemTap specifically on GLib.

SystemTap is very different from a tool like “strace” that you might use to watch a particular process. It’s a full programming language (and a fairly neat one at that), and it’s global to the system. Now, the static probes that we added to GLib give you easy access to important data from the library. Let’s look at an example.

// gmalloc_watch.stp: Print calls to g_malloc
// Usage: stap ./gmalloc_watch.stp

probe glib.mem_alloc {
  // n_bytes is supplied by the GLib tapset; pid() is the calling process
  printf ("g_malloc: pid=%d n_bytes=%d\n", pid(), n_bytes);
}

Compile and run this with: $ stap -v ./gmalloc_watch.stp. What do you see?

g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=68
g_malloc: pid=3598 n_bytes=16
g_malloc: pid=3598 n_bytes=40
g_malloc: pid=3598 n_bytes=40
g_malloc: pid=3598 n_bytes=1
g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=68

All calls to g_malloc from all processes on the system, with very little overhead. This is pretty cool, and it’s just scratching the surface of what we can do. (Note: You will need to add your user to the stapusr group etc. to make the above work; for more documentation see the SystemTap web page linked above).

Okay, so what I wanted was a good way to answer the question “What’s using memory in my GLib program?”. The latest version of my SystemTap script to help answer that is glib-memtrace2.stp. Let’s dive in:

Download it, and try: stap -v -c gtk-demo ./glib-memtrace2.stp. Here are some selections from the output:

$ stap -v -c gtk-demo ~/tmp/glib-memtrace2.stp
Pass 1: parsed user script and 82 library script(s) using 25328virt/16196res/2340shr kb, in 650usr/30sys/696real ms.
Pass 2: analyzed script: 21 probe(s), 5 function(s), 3 embed(s), 9 global(s) using 27328virt/18164res/3360shr kb, in 120usr/500sys/1801real ms.
Pass 3: using cached /home/walters/.systemtap/cache/49/stap_496ad3bd34b95e731521ff2d33066010_13757.c
Pass 4: using cached /home/walters/.systemtap/cache/49/stap_496ad3bd34b95e731521ff2d33066010_13757.ko
Pass 5: starting run.
// glib-memtrace2.stp; target=3703
g_slice: 483652
g_malloc: 578938
GObject GParamObject: 39
GObject GdkDisplayManager: 1
GObject GdkDisplayX11: 1
GObject GParamPointer: 5
GObject GParamDouble: 15
GObject GdkScreenX11: 1
GObject GdkVisual: 32
GObject GtkWindow: 2
# <snip lots of other GObjects>

This is after 5 seconds. What’s it telling me? The gtk-demo process allocated 578938 bytes using g_malloc() in the 5 seconds since it started up. There is also a comparable number of bytes (483652) taken from the slice allocator. Even more interesting, I also have a dump of how many GObjects of each class it allocated. Now, 5 seconds later:

g_slice: 52
g_malloc: -84
GObject GdkPixmapImplX11: 0
GObject GdkPixmap: 0
GObject PangoLayout: 5
# <snip other GObjects>

What it’s printing now is the delta since the earlier statistics. We can see that the g_malloc heap shrank by 84 bytes. The 0 for e.g. GdkPixmap is telling me that one got allocated and freed. Basically, I can interact with apps at nearly full speed and watch in real time how that affects memory usage. Very cool!
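To make the delta reporting concrete, here is a minimal sketch of the idea in Python rather than SystemTap. It assumes the script bumps a per-class counter on object creation, drops it on finalization, and clears everything after each report; the class and method names are made up for illustration and are not taken from glib-memtrace2.stp.

```python
from collections import defaultdict


class IntervalCounters:
    """Per-class GObject deltas for the current reporting interval (hypothetical helper)."""

    def __init__(self):
        # Class name -> (objects created - objects finalized) since the last report.
        self.object_delta = defaultdict(int)

    def object_new(self, type_name):
        self.object_delta[type_name] += 1

    def object_finalize(self, type_name):
        self.object_delta[type_name] -= 1

    def report(self):
        """Return one line per class touched in this interval, then reset the counters."""
        lines = [f"GObject {name}: {delta}" for name, delta in self.object_delta.items()]
        self.object_delta.clear()
        return lines


if __name__ == "__main__":
    counters = IntervalCounters()
    counters.object_new("GdkPixmap")
    counters.object_finalize("GdkPixmap")  # created and freed in the same interval
    for _ in range(5):
        counters.object_new("PangoLayout")
    print("\n".join(counters.report()))
```

A class that shows up with 0 was touched during the interval but ended up with no net change, which is why it is still listed.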

I’ve been using this on GNOME 3, and will be checking for memory leaks for the final release. Let’s analyze some parts of the script, so you can understand not only how this script works, but how you can write SystemTap programs.

First of all, I mentioned earlier that SystemTap is global to the system (your programs become kernel modules). Because we only want to trace one process, we need to do this:

if (target() == pid())

The value of target() is set to whatever the process ID of the program we started with -c was (in the case above, remember we used -c gtk-demo).

Second, keeping track of the g_malloc heap is a little tricky; when the function is called, we are told how many bytes it’s allocating, but when the corresponding g_free is called, we don’t know how much is freed! So how did I do it? Basically we model the heap:

global g_heap[65536]   // chunk address -> size in bytes
global g_malloc_delta

probe glib.mem_alloc {
  g_malloc_delta += n_bytes
  g_heap[mem] = n_bytes
}

probe glib.mem_free {
  g_malloc_delta -= g_heap[mem]
  delete g_heap[mem]
}

The g_heap variable is an associative array, mapping memory addresses of malloc “chunks” to how big they are. Here you can see a sort of limitation of SystemTap in that things will fail if the process has more than 65536 live chunks at once. These fixed limits exist because SystemTap keeps its data in kernel memory.
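If it helps to see the bookkeeping outside of SystemTap, here is a rough Python model of the same heap tracking. It is only a sketch: HeapModel and its methods are hypothetical names, and raising OverflowError merely stands in for the runtime error SystemTap reports when a fixed-size array fills up.

```python
class HeapModel:
    """Python stand-in for the g_heap array plus the g_malloc_delta counter."""

    def __init__(self, max_entries=65536):
        self.sizes = {}              # chunk address -> size in bytes
        self.delta = 0               # net bytes allocated so far
        self.max_entries = max_entries

    def mem_alloc(self, mem, n_bytes):
        # A full table is an error, mirroring the fixed-size array in the script.
        if mem not in self.sizes and len(self.sizes) >= self.max_entries:
            raise OverflowError("heap table full")
        self.delta += n_bytes
        self.sizes[mem] = n_bytes

    def mem_free(self, mem):
        # An address we never saw counts as 0 bytes, just as reading a
        # missing SystemTap array entry yields 0.
        self.delta -= self.sizes.pop(mem, 0)


if __name__ == "__main__":
    heap = HeapModel()
    heap.mem_alloc(0x1000, 104)
    heap.mem_alloc(0x2000, 68)
    heap.mem_free(0x1000)
    print(heap.delta)  # 68
```

Only the size has to be remembered per address, because the free side just needs to know how much to subtract.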

Finally, we set up a timer to print out information every 5 seconds:

probe timer.sec(5) {
  printf ("g_slice: %d\n", g_slice_delta);
  g_slice_delta = 0;  // reset so the next report is a delta
}

Pretty easy. That’s it for now! Again, for more information on SystemTap, check out the web page. For more on the GLib tapset points, see /usr/share/systemtap/tapset/glib.stp and also gobject.stp. Thanks for reading, and happy memory leak hunting!
