Monthly Archives: January 2013

Limiting LXC Memory Usage

I’ve been playing around with LXC over the past few weeks and one of the things I tried out was limiting the memory that the container is allowed to use. I didn’t plan on explaining all the ins-and-outs of LXC here, but a short description is that LXC provides a virtualizedish environment that is more than a chroot gives you, but less than a full-blown virtual machine. If you want more details, please check out stgraber’s blog post about LXC in 12.04.

Kernel Configuration

The first thing you need to do in order to limit memory usage for LXC is make sure your kernel is properly configured, you need the following flag enabled:


If you plan on also limiting swap space usage, you’ll also need:


These flags are enabled for me in my 12.10 kernel (3.5.0-22) and so presumably you’ll have them in 12.04.

Setting the Cap

First, I’m going to create my container. Following the instructions from stgraber’s blog post, and calling the container “memlimit”:

sudo lxc-create -t ubuntu -n memlimit

Once the container is built, we need to modify the config files. Look at /var/lib/lxc/memlimit/config. We need to add a few lines to that file. I’m going to limit memory to 512M and total usage or memory + swap to 1G. Note the second setting is for overall memory + swap, not just swap usage.

lxc.cgroup.memory.limit_in_bytes = 512M
lxc.cgroup.memory.memsw.limit_in_bytes = 1G

Now let’s start the container and get some debug info out of it to make sure these were set:

sudo lxc-start -n memlimit -l debug -o debug.out

The debug.out file will show up wherever you ran lxc-start from, so let’s see if it picked up our limits:

lxc-start 1359136997.617 DEBUG lxc_conf - cgroup 'memory.limit_in_bytes' set to '512M'
lxc-start 1359136997.617 DEBUG lxc_conf - cgroup 'memory.memsw.limit_in_bytes' set to '1G'

Looks good to me!

Note, I tried setting this once to 1.5G and it seems that the fields are only happy with whole numbers, it complained about 1.5G. That error message appeared in the same debug log I used above.

A list of more of the values you can set in here is available here.

Measuring Memory Usage

The view of /proc/meminfo inside the container and outside the container are the same. This means that you cannot rely on tools like top to show how much memory the container is using. In other words, when run inside the container top will correctly only show processes inside the container, it will also show overall memory usage for the entire system. To get valid information, instead we need to examine some files in /sys:

Current memory usage:

Current memory + swap usage:

Maximum memory usage:

Maximum memory + swap usage:

You can use expr to show it as kb or mb which is easier to read for me:

expr `cat memory.max_usage_in_bytes` / 1024

What Happens When the Limit is Reached?

When the cap is reached, the container simply behaves as if the system ran out of memory. Calls to malloc will start failing (returning -1), leading to strange and bad things happening. Dialog boxes may not open, you may not be able to save files, and more than likely where people didn’t bother to check the returned value from malloc (aka, everyone), you’ll get segfaults. You can alleviate the pressure like normal systems do, by enabling swap inside the container, but once that limit is reached, you’ll have the same problem. In this case the host system’s kernel should start firing up the OOM killer and killing stuff inside the container.

Here is my extremely simple test program to drive up memory usage, install gcc in your container and you can try it too:

int main(void) {
int i;
for (i=0; i<65536; i++) {
char *q = malloc(65536);
printf ("Malloced: %ld\n", 65536*i);

You can simply compiled with: gcc -o foo foo.c

I used the following simple shell construct to watch the memory usage. This needs to be run outside the container and I ran it from the /sys directory mentioned above:

while true; do echo -n "Mem Usage (mb): " \&\& expr `cat memory.usage_in_bytes` / 1024 / 1024; echo -n "Mem+swap Usage (mb): " \&\& expr `cat memory.memsw.usage_in_bytes` / 1024 / 1024; sleep 1; done

With the above shell script runnint, I fired up a bunch of copies of foo one bye one. Here’s the memory usage from that script:

Running a few copies:

Mem+swap Usage (mb): 825
Mem Usage (mb): 511
Mem+swap Usage (mb): 859
Mem Usage (mb): 511

A new copy of foo is starting:

Mem+swap Usage (mb): 899
Mem Usage (mb): 511
Mem+swap Usage (mb): 932
Mem Usage (mb): 511
Mem+swap Usage (mb): 1010
Mem Usage (mb): 511

The OOM killer just said “Nope!”

Mem+swap Usage (mb): 814
Mem Usage (mb): 511
Mem+swap Usage (mb): 825
Mem Usage (mb): 511

At the point where the OOM killer fired up, I see this in my container:
[1] Killed ./foo

So the limits are set, and they’re working.


If you are using LXC or considering using LXC, you can use a memory limit to protect the host from a container run amok. You could also use it to test your code in an artificially restricted environment. In either case, try the tools above and let me know how it works for you.

Tagged , ,

Cairo Perf Testing on the Nexus 7

Last week I was running some cairo perf traces on the Nexus7. Cairo-perf traces are a great way to measure 2d graphics performance and to use those numbers to measure the effects of code, hardware, or driver changes. One other cool thing is that with this tool you can do a benchmark on something like Chromium or Firefox without even needing the application installed.

The purpose of this post is to briefly explain how to build the traces, how to run the tools on Ubuntu, and finally a quick look at some results on the Nexus7.

Before running the tools you need to get setup and build the traces. A full clone and build will use several gigs of disk space. Since the current N7 image only builds a 6G or so filesystem, you may want to build the traces in a pbuilder. The disk I/O on the N7 is also pretty slow, so I found that building in the pbuilder, even though it runs inside a qemu, is much faster on my Core i5 + SSD.

In the steps below I’ve tried to call out the things you can do to reduce the disk space.

Building the traces

1. Setup the build environment

sudo apt­-get install libcairo2-­dev lzma git

2. Grab the traces from git

git clone git://­-traces

3. (Optional) Remove unused files to save on disk space. Don’t do this if you plan on submitting changes back upstream.

cd cairo-­traces
rm -­rf .git

4. Build the benchmarks, I used -j4 on my laptop and -j2 on the Nexus7. I didn’t really investigate the optimal value.

make -j4 benchmarks

5. The benchmark directory is now ready to use for traces. If you built it on a different system, you only need to copy over this directory. You can delete the lzma files if you want.

The traces you are pixman version specific, so if you have a Raring based system like the Nexus7, you can’t re-use them on a Precise based box.

Running cairo-perf-trace

1, Before you start, delete the ocitysmap trace from the benchmarks folder. It uses too much RAM and ended up locking up my N7.

2. If you are at the command line, connected via ssh for example, you need to set the display or it will segfault, simply run export DISPLAY=:0

3. Run the tool, I’d start first with a simple trace to make sure that everything is working.

CAIRO_TEST_TARGET=image cairo-­perf-­trace ­-i3 -­r ./benchmark/gvim.trace > ~/result_image.txt

In that command above we did a few things, first we set the cairo backend. Image is a software renderer, you probably want to use xlib or xcb to test hardware. If you don’t set the CAIRO_TEST_TARGET it will try all the back-ends, this will take a long long time and I don’t recommend doing it. A simple way to get the tool to list them all is to set it to a bad value, for example

mfisch@caprica:~$ CAIRO_TEST_TARGET=mfisch cairo-perf-trace
Cannot find target 'mfisch'.
Known targets: script, xcb, xcb-window, xcb-window&, xcb-render-0.0, xcb-fallback, xlib, xlib-window, xlib-render-0_0, xlib-fallback, image, image16, recording

The next argument, -i3 tells it to run 3 iterations, this gives us a good set of data to work with. -r asks for raw output, which is literally just the amount of time the trace took to run. Finally ./benchmark/gvim.trace shows which trace to run. You can pass in a directory here and it will run them all, but I’d recommend trying that just one until you know that it’s working. When you’re running a long set of traces doing a tail -f on the result file can help assure you that it’s working without placing too heavy of a load on the system. The hardware backend runs took almost all day to finish, so you should always be plugged into a power source when doing this.

The output should look something like this:
[ # ] backend.content test-size ticks-per-ms time(ticks) ...
[*] xlib.rgba chromium-tabs.0 1e+06 1962036000 1948712000 1938894000

Making Pretty Graphs

Once you have some traces you can make charts with cairo-perf-chart. This undocumented tool has several options which I determined by reading the code. I did send a patch to add a usage() statement to this tool, but nobody has accepted it yet. First, the basic usage, then the options:

cairo-perf-chart nexus7_fbdev_xlib.txt nexus7_tegra3_xlib.txt

cairo-perf-chart will build two charts with that command, one will be an absolute chart, on that chart, larger bars indicate worse performance. The second chart, the relative chart takes the first argument as the baseline and compares the rest of the results files against it. On the relative chart, a number below the zero line indicates that the results are slower than the baseline (which is the first argument to cairo-perf-chart.

Now a quick note about the useful arguments. cairo-perf-chart can take as many results files as you want to supply it when building graphs, if you’d like to compare more than two files. If you want to resize the chart, just pass –width= and –height=, defaults are 640×480. Another useful option is –html which generates an HTML comparison chart from the data. The only issue with this option is that you manually need to make a table header and stick it in to a basic HTML document.

Some Interesting Results

Now some results from the Nexus7 and they are actually pretty interesting. I compared the system with and without the tegra3 drivers enabled. Actually I just plain uninstalled the tegra3 drivers to get some numbers with fbdev. My first run used the image backend, pure software rendering. As expected the numbers are almost identical, since the software rendering is just using the same CPU+NEON.

Absolute Results - Tegra3 vs fbdev drivers, image (software) backend

Absolute Results – Tegra3 vs fbdev drivers, image (software) backend

Relative Results - Tegra3 vs fbdev drivers, image (software) backend

Relative Results – Tegra3 vs fbdev drivers, image (software) backend

The second set of results are more interesting. I switched to the xlib backend so we would get hardware rendering. With the tegra3 driver enabled we should expect a massive performance gain, right?

Absolute Results - Tegra3 vs fbdev drivers, xlib backend

Absolute Results – Tegra3 vs fbdev drivers, xlib backend

Relative Results - Tegra3 vs fbdev drivers, xlib backend

Relative Results – Tegra3 vs fbdev drivers, xlib backend

So as it turns out the tegra3 is actually way slower than fbdev and I don’t know why. I think that this could be for a variety of reasons, such as unoptimized 2d driver code or hardware (CPU+NEON vs Tegra3 GPU).

Now that we have a method for gathering data, perhaps we can solve that mystery?

If you want to know more about the benchmarks or see some more analysis, you should read this great post which is where I found out most of the info on running the tools. If you want to know more background about the cairo-perf trace tools you might want to read this excellent blog post.

Tagged , ,