Analyzing memory use with SystemTap

March 19, 2011

So last year, GLib gained support for SystemTap. I used this for a bit to analyze memory usage in GNOME Shell at the time, but for Fedora 14, we forgot to pass --enable-systemtap (oops!), and so shipped without the support. This is now fixed in Fedora 15, so we can instrument any GLib program “out of the box” in a variety of ways.

Now, tracing and performance analysis is an extremely complex subject, and there are a ton of different tools out there for Linux. Tracing user space in particular is still an area under active development. But what I want to talk about today is using SystemTap specifically on GLib.

SystemTap is very different from a tool like “strace” that you might use to watch a particular process. It’s a full programming language (and a fairly neat one at that), and it’s global to the system. Now, the static probes that we added to GLib give you easy access to important data from the library. Let’s look at an example.


// gmalloc_watch.stp: Print calls to g_malloc
// Usage: stap ./gmalloc_watch.stp

probe glib.mem_alloc {
  printf ("g_malloc: pid=%d n_bytes=%d\n", pid(), n_bytes);
}

Compile and run this with: $ stap -v ./gmalloc_watch.stp. What do you see?


g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=68
g_malloc: pid=3598 n_bytes=16
g_malloc: pid=3598 n_bytes=40
g_malloc: pid=3598 n_bytes=40
g_malloc: pid=3598 n_bytes=1
g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=104
g_malloc: pid=3598 n_bytes=68
...

All calls to g_malloc from all processes on the system, with very little overhead. This is pretty cool, and it’s just scratching the surface of what we can do. (Note: You will need to add your user to the stapusr group etc. to make the above work; for more documentation see the SystemTap web page linked above).

Okay, so what I wanted was a good way to answer the question “What’s using memory in my GLib program?”. The latest version of my SystemTap script to help answer that is glib-memtrace2.stp. Let’s dive in:

Download, and try: stap -v -c gtk-demo ./glib-memtrace2.stp. Here’s some selections from the output:


$ stap -v -c gtk-demo ~/tmp/glib-memtrace2.stp
Pass 1: parsed user script and 82 library script(s) using 25328virt/16196res/2340shr kb, in 650usr/30sys/696real ms.
Pass 2: analyzed script: 21 probe(s), 5 function(s), 3 embed(s), 9 global(s) using 27328virt/18164res/3360shr kb, in 120usr/500sys/1801real ms.
Pass 3: using cached /home/walters/.systemtap/cache/49/stap_496ad3bd34b95e731521ff2d33066010_13757.c
Pass 4: using cached /home/walters/.systemtap/cache/49/stap_496ad3bd34b95e731521ff2d33066010_13757.ko
Pass 5: starting run.
// glib-memtrace2.stp; target=3703
g_slice: 483652
g_malloc: 578938
GObject GParamObject: 39
GObject GdkDisplayManager: 1
GObject GdkDisplayX11: 1
GObject GParamPointer: 5
GObject GParamDouble: 15
GObject GdkScreenX11: 1
GObject GdkVisual: 32
GObject GtkWindow: 2
# <snip lots of other GObjects>

This is after 5 seconds. What’s it telling me? The gtk-demo process allocated a net 578938 bytes using g_malloc() in the 5 seconds since it started up. A comparable number of bytes (483652) was taken from the slice allocator. Even more interesting, I also have a dump of how many GObjects of each class it allocated. Now, 5 seconds later:


g_slice: 52
g_malloc: -84
GObject GdkPixmapImplX11: 0
GObject GdkPixmap: 0
GObject PangoLayout: 5
# <snip other GObjects>

What it’s printing now is the delta since the earlier statistics. We can see that the g_malloc heap shrank by 84 bytes. The 0 for e.g. GdkPixmap is telling me that one got allocated and freed. Basically, I can interact with apps at nearly full speed and watch in real time how that affects memory usage. Very cool!

I’ve been using this on GNOME 3, and will be checking for memory leaks for the final release. Let’s analyze some parts of the script, so you can understand not only how this script works, but how you can write SystemTap programs.

First of all, I mentioned earlier that SystemTap is global to the system (your programs become kernel modules). Because we only want to trace one process, we need to do this:


if (target() == pid())

The value of target() is set to whatever the process ID of the program we started with -c was (in the case above, remember we used -c gtk-demo).

Second, keeping track of the g_malloc heap is a little tricky; when the function is called, we are told how many bytes it’s allocating, but when the corresponding g_free is called, we don’t know how much is freed! So how did I do it? Basically we model the heap:


global g_heap[65536]
...
probe glib.mem_alloc {
  g_malloc_delta += n_bytes
  g_heap[mem] = n_bytes
}
probe glib.mem_free {
  g_malloc_delta -= g_heap[mem]
  delete g_heap[mem]
}

The g_heap variable is an associative array, mapping the memory addresses of malloc “chunks” to their sizes. Here you can see a sort of limitation of SystemTap: things will fail if the process has more than 65536 chunks live at once. These fixed limits exist because SystemTap keeps its data in kernel space.
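To make the bookkeeping concrete, here is a minimal sketch of the same idea in plain Python rather than SystemTap. The class and method names (HeapModel, on_alloc, on_free, report) are made up for illustration and are not part of the script; the dictionary stands in for g_heap, and the size cap imitates the fixed array limit.

```python
class HeapModel:
    """Toy model of the g_heap bookkeeping (hypothetical names, not SystemTap)."""

    def __init__(self, max_entries=65536):
        self.max_entries = max_entries  # stands in for the fixed array size
        self.live = {}                  # address -> size of each live chunk
        self.delta = 0                  # net bytes since the last report

    def on_alloc(self, address, n_bytes):
        # Like the mem_alloc probe: remember the size so a later free can use it.
        if address not in self.live and len(self.live) >= self.max_entries:
            raise OverflowError("too many live chunks for the fixed-size table")
        self.live[address] = n_bytes
        self.delta += n_bytes

    def on_free(self, address):
        # Like the mem_free probe: the free call carries no size, so look it up.
        # Chunks allocated before tracing began are unknown and count as 0.
        self.delta -= self.live.pop(address, 0)

    def report(self):
        # Like the periodic timer: hand back the delta and start a new interval.
        delta, self.delta = self.delta, 0
        return delta


heap = HeapModel()
heap.on_alloc(0x1000, 104)
heap.on_alloc(0x2000, 68)
heap.on_free(0x1000)
print(heap.report())  # 68
print(heap.report())  # 0
```

The second report prints 0 because nothing happened in that interval, which is the same reason the later script output shows small or zero deltas.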

Finally, we set up a timer to print out information every 5 seconds:


probe timer.sec(5) {
  printf ("g_slice: %d\n", g_slice_delta);
  g_slice_delta = 0;
...

Pretty easy. That’s it for now! Again, for more information on SystemTap, check out the web page. For more on the GLib tapset points, see /usr/share/systemtap/tapset/glib.stp and also gobject.stp. Thanks for reading, and happy memory leak hunting!


In fact, anyone can do it

March 13, 2011

Mark says:

I have little optimism that the internal code dynamics of Gnome can be fixed – I have seen too many cases where a patch which implements something needed by Unity is dissed, then reimplemented differently, or simply left to rot…

I’m presuming here that he’s referring to e.g. this bug, or maybe others. What Mark apparently isn’t understanding here is that this is a totally normal process for all patches to a competently run project, even if the patch is originated by a “maintainer”. It’s called code review.

Let’s take an example from just the “gnome-shell” module. Look at this bug, where I propose a change, and it gets marked “rejected”. I don’t have my feelings hurt – even though I really think my change is right and keep fighting for it (and eventually Dan does fix it in an even better way). You will find many, many examples of this kind of thing if you search Bugzilla. And this isn’t unique to GNOME – this is a large part of what is on say the Linux kernel mailing list.

What’s even more interesting is that anyone from the world can sign up to watch bugs for e.g. GLib, and review patches. By doing this, one can choose to take a stake in the future success of GLib, and not just see it as a code dump. While there is technically a MAINTAINERS file in each module, it’s just an imperfect reflection, not a hard line.

Could someone else show up today and just mark a patch accepted-commit-now? That would be frowned upon – it’s easy to just mark a patch as OK. But if one shows up and gains some history of pointing out issues or problems, that’s all it takes to gain the credibility to become a “maintainer”. Again, anyone can do it; there is no cabal.


Enough is enough

October 30, 2010

What’s an obvious, practical step we can take to put a stop to this, and in one stroke greatly improve global information security?

It’s simple.  All the important browser vendors need to agree to “flashblock” by default.  What is flash blocking?  Basically, it just means you need one extra click to watch (some) internet videos.

If you’re reading this in a desktop browser, stop right now and follow one of these links:

Flash block for:
Firefox
Google Chrome
Internet Explorer
Safari

If you’re an IT manager, there is no excuse for not deploying these across your organization right now.  If you’re a technology enthusiast, help evangelize flash blocking, and install it on the computers of friends and relatives.

But even though these exist, it’s not good enough.  It’s really time for the vendors to come to a rough consensus and agree to do this by default.

I think the Internet Explorer model is actually the best; make it per-user, per-domain.  All Microsoft has to do is toggle that switch in their code.  But the incentive for them to do it is low if users are going to complain that on (some other browser) they need one less click to watch that video.

So consensus needs to be built. Mozilla, Microsoft, Google, and Apple need to agree to take action.  Apple is trying but they can’t do it alone.

The work that’s been going on to replace Flash with HTML 5 is great, and now is the right time to start actively deprecating Flash on the Web. 

The cost is so low.  Just one extra click.  The benefit is millions of consumer computers being notably more secure, again by default.

Further reading on simple, effective tips for improving your security is in this Lifehacker post.

(This post was brought to you by years of having to de-virus my family’s computers)


PyGTK: Performing engine maintenance while the car is running

April 14, 2010

The PyGTK hackfest is happening at the OLPC office in Cambridge, and we’ve had some productive discussions so far. There are two major orthogonal changes happening simultaneously.

Python 3

First, Python 3 support for interacting with the GTK+ stack. As far as I understand it, the Python 3 work is mostly mechanical and uninteresting, so I won’t say much more about it. There’s just a lot of code that needs to be changed.

Introspection

Second, pushing forward the new PyGI stack which among other things is significantly more memory efficient.

So there are two Python binding stacks now with different tradeoffs. In pictures, here are their respective architectures:

PyGTK


In PyGTK, a large generated C library bridges the two worlds. It combines hand-written “overrides” with some metadata (a .defs file) about the C library.

PyGI


In PyGI, everything is done dynamically based on the information from the .typelib file. There are very few custom overrides.

PyGTK-on-PyGI


This is our current concept. In the combined architecture, the API people are used to from PyGTK is preserved, but we begin to “hollow out” the core: for the simple functions that aren’t overridden, instead of generating a static C blob, we look them up dynamically through PyGI. This would be relatively straightforward to do on a per-function level by adding a hook in the metaclass or __getattr__. Doing this on a whole-class level would be more complex, but it would also be the biggest win in terms of memory usage, which is important for everyone, and particularly for Sugar.
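As a rough illustration of the per-function hook, here is a small Python sketch of a namespace that serves hand-written overrides first and falls back to a dynamic lookup for everything else. Nothing here is real PyGTK or PyGI API: HollowNamespace, fake_typelib_lookup, and the function names are all stand-ins invented for the example, and the real hackfest design may well differ.

```python
class HollowNamespace:
    """Hypothetical stand-in for a binding module with a 'hollowed out' core."""

    def __init__(self, overrides, dynamic_lookup):
        self._overrides = dict(overrides)      # hand-written functions
        self._dynamic_lookup = dynamic_lookup  # stand-in for an introspection lookup

    def __getattr__(self, name):
        # Python only calls __getattr__ when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            value = self._overrides[name]
        except KeyError:
            value = self._dynamic_lookup(name)  # may raise AttributeError
        # Cache on the instance so the next access skips this hook entirely.
        setattr(self, name, value)
        return value


lookups = []  # records which names needed a dynamic lookup


def fake_typelib_lookup(name):
    # Pretend metadata table; a real binding would consult introspection data.
    lookups.append(name)
    table = {"main_quit": lambda: "quit via dynamic lookup"}
    try:
        return table[name]
    except KeyError:
        raise AttributeError(name)


gtk = HollowNamespace({"main": lambda: "hand-written override"}, fake_typelib_lookup)
print(gtk.main())       # hand-written override
print(gtk.main_quit())  # quit via dynamic lookup
gtk.main_quit()         # served from the cache this time
print(lookups)          # ['main_quit']
```

Because nothing is built for a function until it is first touched, an application that uses a small slice of the API pays only for that slice, which is the memory argument made above.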

More updates as we hack them. If you’re interested you can join us!


How the Fedora desktop gets made

April 8, 2010

I’d like to illuminate a bit the process by which the Fedora Desktop CD image gets made.

People

Ultimately, the content of the image is created by people. Yes, those names you see on the Internet have real people behind them. After creating content (source code, art, documentation, etc.), it’s typically added to:

Revision Control (git)

A revision control system is a shared online service where people can put changes to a particular component, from applications like Rhythmbox to operating system plumbing like the Linux kernel. From revision control, content is pulled by Fedora into:

Fedora Package CVS

Fedora uses a system of tracking the content of these repositories around the internet called Fedora CVS. This content is pulled periodically at the discretion of a package maintainer. When it’s added to Fedora CVS, it then gets submitted to:

Koji

The Koji service turns the content of the Package CVS into files called .rpm which are consumable components that can be installed individually, but for the purpose of the CD, are combined at a high level using:

Comps

The Fedora Comps is simply a list of these packages. For our purpose here, the most important part is a group called @gnome-desktop which defines the components (out of the universe of packages in the Fedora repositories) that are installed by default on the desktop. From comps, we now turn to:

spin-kickstarts

The spin-kickstarts project uses Kickstart files to combine the comps grouping of packages above with some “extra sauce” such as scripts. (For example, the Live images are specifically modified not to perform software updates while they’re in “Live” mode). The specific kickstart file of interest is fedora-livecd-desktop.ks. This kickstart file is then consumed by:

LiveCD tools

The LiveCD tools consume the kickstart file, a comps listing, and a repository of RPMs, and actually create the final CD image.

This whole process happens nightly, and the results currently appear here. I hope you learned something and/or found this interesting!


Hot on the heels of the previous tarball

March 17, 2010

A quick note, dbus 1.2.22 is out, see the release announcement. OS vendors should use this for GNOME 2.30.

I also wanted to give a shout out to the cool patch from GNOME Shell contributor Maxim Ermilov, which lets you drag and drop windows from the “linear” workspace view, by temporarily zooming out. We had this in the “grid” view for a while, but hopefully this functionality will let us focus on making linear the central view.

Maxim did this in a few hundred lines of JavaScript. I’ve noticed a distinct lack of contributions from some of the old-school GNOME hackers, and I’m just saying…if he can do it so easily, well, you guys are getting schooled by the new kids on the block.



Math is hard

March 1, 2010

Apparently, someone decided to go shopping instead. But really guys…really?


In which new tarballs appear

February 12, 2010

HotSSH 0.2.7

So I took a bit of free time to fix up some things in my semi-toy project hotssh. If you like it, you should upgrade since this release fixes some major bugs with the connection tracking, and some more minor things.

The project’s at the point, though, where if I wanted to do anything noticeably more compelling, I’d have to take the leap of using a real SSH library (maybe libssh?) rather than invoking OpenSSH as a subprocess. The problem is that this gets into a lot of complexity in trying to stay in sync with whatever OpenSSH does (key management, known_hosts etc.). Probably someday though.

dbus 1.2.20

In the category of less user visible but probably more important, a new stable DBus is available. There are fixes larger and smaller (the real changelog is from 1.2.18 which was a paper bag release). I think one of the most important for mobile Linux users is the patch to switch to the monotonic clock; basically DBus will be more reliable if you suspend the system or reset the system clock. Besides other reliability fixes, there are some other small nice things like a better dbus-monitor. Thanks to Tom Hughes of Palm for the former, and Lennart Poettering of Red Hat and Will Thompson of Collabora for the latter!

And now back to some more user-visible GNOME Shell work; as Jon mentioned it looks like some new contributors are outpacing me, while I’ve been working on some of the underlying St toolkit infrastructure. Time to catch up!


On the Fedora Board

January 7, 2010

Now that I’m on the Fedora Project Board, you may be wondering what my plans are. The first answer is – ideally – not much! Ideally, no one posts semi-nude material on the planet, we all cooperate nicely on the mailing lists, and in general the construction of a Free Software operating system and applications basically runs itself, and I can spend most of my time working on code too. However, we aren’t quite in an ideal state, so let me give you a sense of my thoughts and goals.

First, at a high level, I’d like Fedora to be more like Mozilla, which is arguably the most successful Free Software project ever. They do a lot of things extremely well, and we would do well to learn from them. I’ll elaborate a bit more on this later. But I think a lot of us inside the project should ask ourselves “How does Mozilla do it”?

Second, I’ll do all I can to prevent or work through interpersonal conflicts inside the project. These have been a very serious problem for us, and we have to remember that we share the same goals. Being on the Board doesn’t give me any more actual powers here at the moment, possibly just a slightly higher platform from which to say, pretty please. (I have personally been part of the problem in the past, and that’s something I’ve been trying very hard to fix).

Third, be a part of the vision/what-is-Fedora/target audience discussions. I have fairly strong opinions on these in general, so let me give one-sentence descriptions here:

Fedora – A project to produce a Free Software general purpose operating system and applications through the process of rough consensus and working code.
Target Audience – Ok, this one is hard to fit in a single sentence, but I think the current proposal is a good start.

Finally, as mentioned earlier, spend most of my time on the code!


Apparently it didn’t keep the doctor away, but…

January 4, 2010

Andre Klapper did a neat summary of GNOME Bugzilla activity and I ended up with an average of exactly one patch submitted to GNOME a day for 2009. Cool! A fair chunk of that is probably tweaks of existing patches re-submitted, since git bz makes it so easy.

Looking forward to 2010!

