“I don’t always benchmark my Android devices. But when I do, I prefer AndEBench-Pro.”
– The Most Boring Man in the World
Benchmarking, like economics, is a dismal science. Both are important and both are more complicated than the casual observer may expect. EEMBC is an expert at one of these.
The Embedded Microprocessor Benchmark consortium (the second E is purely decorative) is a nonprofit group approaching its 20th birthday. The merry band of benchmarkers has expanded beyond its original remit of creating tiny benchmark kernels for stressing CPU cores, and it now offers a whole catalog of benchmarks large and small for just about anything. There’s an automotive benchmark, a Web browser benchmark, and now two different Android benchmarks.
What’s this, you say? A second Android benchmark? What was wrong with the first one? Nothing at all, but this is Version 2.0. It’s better. And it’s called AndEBench-Pro, to distinguish it from its predecessor, AndEBench.
AndEBench-Pro is a slick, packaged, ready-to-run program that you can download for free from Google Play or the Amazon App Store. Unlike most EEMBC benchmarks that are supplied as source code for engineers, AndEBench-Pro is an idiot-proof test suite that anybody can run on just about any Android-based gizmo. It’s primarily intended for smartphones and tablets and the like, but if your hardware can successfully download apps from the above-mentioned app stores, you’re good to go.
Running the tests couldn’t be simpler. Hit the big red Start button on the screen and wait. There’s a progress thermometer to let you know it’s working and hasn’t hung the machine, but apart from that there’s nothing to see. Well, until the zombies come. Really.
Behind the scenes, AndEBench runs a series of discrete tests (many of them multiple times) that exercise your CPU, GPU, memory latency and bandwidth, and internal storage. The CPU tests exercise both integer and floating-point performance, and both single-core and multiple core/thread behavior (assuming your processor supports them). At the end, you’ll get a detailed accounting of each subtest, as well as a weighted “Device Score” that’s intended to be a single overall metric of goodness that you can share and compare to other devices. People like single numbers, and Device Score provides that.
Because AndEBench is provided as an executable program instead of source code, there’s no opportunity to fine-tune your compiler switches. Normally, benchmarks get tweaked to within an inch of their lives as people fiddle with compiler optimizations hoping to eke out the best possible score. That doesn’t work with AndEench-Pro, so EEMBC did the tweaking for you. Each CPU test is run twice, once with the standard Google Native Development Kit (NDK) compiler and again with a CPU-specific compiler that’s been pre-tweaked to yield the best performance. Normally, duplicating the same executable code twice in the same program would be a waste of space, if not an outright bug, but in the case of AndEBench-Pro, it gives you a good idea of how your hardware actually performs. Since the idea is to benchmark the platform, not the program, doubling up on optimized and non-optimized code makes sense.
The most complex test of the suite is the “platform test,” which is intended to mimic a fairly typical consumer-item use case. It starts by decrypting and decompressing an input file and calculating its hash, as though you had received it via download and wanted to verify its integrity. The decrypted, decompressed file contains a bunch of contact names, images, and an XML file with a to-do list for each. The benchmark then searches the contact list, transforms the images in various ways, packs the whole thing back up into an encrypted, compressed .ZIP file, and then stores the result. You see? Typical Android workload.
Bizarrely, the GPU test doesn’t display any images. Which is a shame, because the test video is a zombies-versus-aliens Jazzercise routine in a rainy dark alleyway. The rendering is done off-screen because different Android platforms have different screen sizes, and it wouldn’t be fair to penalize systems with larger screens (which have more pixels and require more rendering time). Instead, all the work is done assuming an internal 1080p resolution, making times comparable. Happily, the fully rendered video is then scaled and displayed so you can watch the zombies shake a leg.
The memory bandwidth and latency tests measure, um, memory bandwidth and latency. It’s a little more interesting than that, however, since AndEBench measures bandwidth first with one CPU core, then with multiple cores, to see how multithreaded software affects your memory. The latency test works by randomly seeding a big (64 MB) linked list, then traversing the list all the way from start to finish.
So AndEBench-Pro replaces the original AndEBench, right? Nope. The newer version incorporates most of the original’s testing kernels, plus lots and lots more. But the original AndEBench is still available, and still useful. Scores are not comparable between the two, and since more than 5000 users have published their AndEBench scores, it would be a shame to discard all that accumulated data. Over time, we may amass as much information about AndEBench-Pro. If the zombies don’t get us first.
One thought on “Off to the Android Races”
Hi. One quick comment pertaining to your sentence “Since the idea is to benchmark the platform, not the program, doubling up on optimized and non-optimized code makes sense.”
EEMBC uses the terms peak and base, respectively, to refer to the CPU portion of the tests. Base doesn’t mean non-optimized – in fact, the compiler settings were quite carefully chosen to build the APK binary. ‘Base’ more accurately refers to the use of the Google NDK toolchain, which will be most typically used by application developers. On the other hand, ‘Peak’ allows the use of any commercially available compiler. For example, for the Intel code build, we used Intel C++ Android compiler (ICC), which yielded about an 8-10% increase.