Cairo Perf Testing on the Nexus 7

Last week I was running some cairo-perf traces on the Nexus7. Cairo-perf traces are a great way to measure 2D graphics performance and to quantify the effects of code, hardware, or driver changes. One other cool thing is that with this tool you can benchmark the rendering workload of something like Chromium or Firefox without even needing the application installed.

The purpose of this post is to briefly explain how to build the traces, how to run the tools on Ubuntu, and finally a quick look at some results on the Nexus7.

Before running the tools you need to get set up and build the traces. A full clone and build will use several gigabytes of disk space. Since the current N7 image only builds a filesystem of 6G or so, you may want to build the traces in a pbuilder. The disk I/O on the N7 is also pretty slow, so building in a pbuilder on my Core i5 + SSD was much faster, even though it runs inside qemu.

In the steps below I’ve tried to call out the things you can do to reduce the disk space.

Building the traces

1. Set up the build environment

sudo apt-get install libcairo2-dev lzma git

2. Grab the traces from git

git clone git://anongit.freedesktop.org/cairo-traces

3. (Optional) Remove unused files to save on disk space. Don’t do this if you plan on submitting changes back upstream.

cd cairo-traces
rm -rf .git

4. Build the benchmarks. I used -j4 on my laptop and -j2 on the Nexus7; I didn’t really investigate the optimal value.

make -j4 benchmarks

5. The benchmark directory is now ready to use for traces. If you built it on a different system, you only need to copy over this directory. You can delete the lzma files if you want.

The traces you build are pixman-version specific, so traces built on a Raring-based system like the Nexus7 can’t be re-used on a Precise-based box.

Running cairo-perf-trace

1. Before you start, delete the ocitysmap trace from the benchmarks folder. It uses too much RAM and ended up locking up my N7.

2. If you are at the command line, connected via ssh for example, you need to set the display or the tool will segfault. Simply run export DISPLAY=:0

3. Run the tool. I’d start with a simple trace to make sure that everything is working.

CAIRO_TEST_TARGET=image cairo-perf-trace -i3 -r ./benchmark/gvim.trace > ~/result_image.txt

In the command above we did a few things. First we set the cairo backend. Image is a software renderer, so you probably want to use xlib or xcb to test hardware. If you don’t set CAIRO_TEST_TARGET the tool will try all the backends, which takes a very long time, and I don’t recommend doing it. A simple way to get the tool to list them all is to set the variable to a bad value, for example

mfisch@caprica:~$ CAIRO_TEST_TARGET=mfisch cairo-perf-trace
Cannot find target 'mfisch'.
Known targets: script, xcb, xcb-window, xcb-window&, xcb-render-0.0, xcb-fallback, xlib, xlib-window, xlib-render-0_0, xlib-fallback, image, image16, recording

The next argument, -i3, tells it to run 3 iterations, which gives us a good set of data to work with. -r asks for raw output, which is literally just the amount of time the trace took to run. Finally, ./benchmark/gvim.trace names the trace to run. You can pass in a directory here and it will run them all, but I’d recommend trying just one until you know that it’s working. When you’re running a long set of traces, doing a tail -f on the result file can help assure you that it’s working without placing too heavy a load on the system. The hardware backend runs took almost all day to finish, so you should always be plugged into a power source when doing this.
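If you would rather run the traces one file at a time instead of handing over the whole directory, a loop like the following is one way to do it. This is only a sketch: the function name run_traces and the PERF_TOOL variable are my own inventions (PERF_TOOL lets you swap in echo for a dry run), and it assumes the built traces end in .trace, as gvim.trace does. It also defaults DISPLAY to :0 and skips the ocitysmap trace.

```shell
# Run every trace in a directory against one backend, one trace at a time,
# appending the raw results to a single file.
# run_traces and PERF_TOOL are hypothetical names, not part of cairo.
run_traces() {
    backend=$1      # e.g. image, xlib, xcb
    trace_dir=$2    # e.g. ./benchmark
    out=$3          # results file to append to

    # Over ssh DISPLAY is usually unset; fall back to :0 in that case.
    : "${DISPLAY:=:0}"
    export DISPLAY

    for trace in "$trace_dir"/*.trace; do
        [ -e "$trace" ] || continue    # directory had no .trace files
        case "$trace" in
            *ocitysmap*) continue ;;   # used too much RAM on my N7
        esac
        CAIRO_TEST_TARGET="$backend" \
            "${PERF_TOOL:-cairo-perf-trace}" -i3 -r "$trace" >> "$out"
    done
}
```

For example, PERF_TOOL=echo run_traces xlib ./benchmark /tmp/dry_run.txt writes the arguments each run would get without running anything, and run_traces xlib ./benchmark ~/result_xlib.txt does the real run.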

The output should look something like this:
[ # ] backend.content test-size ticks-per-ms time(ticks) ...
[*] xlib.rgba chromium-tabs.0 1e+06 1962036000 1948712000 1938894000
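The raw numbers are in ticks, so they are not very readable on their own. Here is a small awk sketch that turns each result line into a fastest and mean time in milliseconds by dividing by the ticks-per-ms column. It assumes every result line has exactly the layout shown above (a leading [*], then backend, trace name, ticks-per-ms, and one tick count per iteration); summarize_raw is just a name I picked.

```shell
# Summarize a raw cairo-perf-trace results file: one line per trace with
# the fastest and the mean run time in milliseconds.
# Assumed line layout: [*] backend.content trace ticks-per-ms ticks ticks ...
summarize_raw() {
    awk '
        # Only result lines start with "[*]"; the header line does not.
        $1 == "[*]" && NF >= 5 {
            ticks_per_ms = $4
            min = $5 / ticks_per_ms
            sum = 0
            for (i = 5; i <= NF; i++) {
                ms = $i / ticks_per_ms
                sum += ms
                if (ms < min) min = ms
            }
            printf "%s %s min=%.1fms mean=%.1fms\n", $2, $3, min, sum / (NF - 4)
        }
    ' "$1"
}
```

Run it as summarize_raw ~/result_image.txt once a run has finished.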

Making Pretty Graphs

Once you have some traces you can make charts with cairo-perf-chart. This undocumented tool has several options which I determined by reading the code. I did send a patch to add a usage() statement to this tool, but nobody has accepted it yet. First, the basic usage, then the options:

cairo-perf-chart nexus7_fbdev_xlib.txt nexus7_tegra3_xlib.txt

cairo-perf-chart will build two charts with that command. The first is an absolute chart, on which larger bars indicate worse performance. The second, the relative chart, takes the first argument as the baseline and compares the rest of the results files against it. On the relative chart, a number below the zero line indicates that the results are slower than the baseline (which is the first argument to cairo-perf-chart).
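If you just want the relative numbers without the picture, you can get a rough equivalent straight from two raw results files. The sketch below is not the formula cairo-perf-chart uses (I have not tried to reproduce it); it simply divides the fastest baseline run by the fastest run in the second file for each trace, so a value below 1.00x means the second file is slower than the baseline. compare_raw is a made-up name, and the raw line layout is assumed to match the sample output shown earlier.

```shell
# Compare two raw cairo-perf-trace results files trace by trace.
# $1 is the baseline, $2 is the file being compared against it.
# Prints baseline_fastest / other_fastest; below 1.00x means $2 is slower.
compare_raw() {
    awk '
        # Ignore the header and anything that is not a result line.
        $1 != "[*]" || NF < 5 { next }
        {
            # Fastest iteration of this trace, converted to milliseconds.
            min = $5 + 0
            for (i = 6; i <= NF; i++) if ($i + 0 < min) min = $i + 0
            ms = min / $4
            key = $2 " " $3
        }
        FNR == NR   { base[key] = ms; next }    # first file: remember baseline
        key in base { printf "%s %.2fx\n", key, base[key] / ms }
    ' "$1" "$2"
}
```

With the file names from the command above, that would be compare_raw nexus7_fbdev_xlib.txt nexus7_tegra3_xlib.txt.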

Now a quick note about the useful arguments. cairo-perf-chart can take as many results files as you want to supply when building graphs, if you’d like to compare more than two files. If you want to resize the chart, just pass --width= and --height=; the defaults are 640×480. Another useful option is --html, which generates an HTML comparison chart from the data. The only issue with this option is that you need to manually make a table header and stick the output into a basic HTML document.
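The manual HTML step can be scripted. The sketch below is based on an assumption I want to flag clearly: that --html prints bare table rows to standard output. If your build behaves differently, adjust accordingly. wrap_html_table is a hypothetical helper that reads rows on stdin, takes a page title plus one argument per column heading, and writes a complete HTML document.

```shell
# Wrap bare <tr> rows read from stdin in a minimal HTML page.
# Usage: wrap_html_table TITLE HEADING [HEADING ...]
# wrap_html_table is a hypothetical helper, not part of cairo.
wrap_html_table() {
    title=$1
    shift
    printf '<!DOCTYPE html>\n<html>\n<head><title>%s</title></head>\n' "$title"
    printf '<body>\n<table border="1">\n<tr>'
    for col in "$@"; do
        printf '<th>%s</th>' "$col"     # one header cell per heading
    done
    printf '</tr>\n'
    cat                                 # pass the rows through unchanged
    printf '</table>\n</body>\n</html>\n'
}
```

Something like cairo-perf-chart --html nexus7_fbdev_xlib.txt nexus7_tegra3_xlib.txt | wrap_html_table "Nexus7 xlib" trace fbdev tegra3 > compare.html would then give you a page you can open directly. The three column headings here are guesses, so match them to whatever columns the rows actually contain.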

Some Interesting Results

Now some results from the Nexus7 and they are actually pretty interesting. I compared the system with and without the tegra3 drivers enabled. Actually I just plain uninstalled the tegra3 drivers to get some numbers with fbdev. My first run used the image backend, pure software rendering. As expected the numbers are almost identical, since the software rendering is just using the same CPU+NEON.

Absolute Results - Tegra3 vs fbdev drivers, image (software) backend


Relative Results - Tegra3 vs fbdev drivers, image (software) backend


The second set of results is more interesting. I switched to the xlib backend so we would get hardware rendering. With the tegra3 driver enabled we should expect a massive performance gain, right?

Absolute Results - Tegra3 vs fbdev drivers, xlib backend


Relative Results - Tegra3 vs fbdev drivers, xlib backend


So as it turns out the tegra3 is actually way slower than fbdev and I don’t know why. I think that this could be for a variety of reasons, such as unoptimized 2d driver code or hardware (CPU+NEON vs Tegra3 GPU).

Now that we have a method for gathering data, perhaps we can solve that mystery?

If you want to know more about the benchmarks or see some more analysis, you should read this great post which is where I found out most of the info on running the tools. If you want to know more background about the cairo-perf trace tools you might want to read this excellent blog post.


3 thoughts on “Cairo Perf Testing on the Nexus 7”

  1. Steve says:

    That’s been my experience on a number of x86 and ARM chipsets over the years too: a raw framebuffer is fastest for most 2D operations (sometimes dramatically so). 2D code in proprietary drivers often doesn’t get much attention, since 2D performance doesn’t sell silicon, whereas 3D and video decode performance do (see comparisons of the ATI Catalyst drivers to the open-source radeon driver for an example).
