“There are lies, damn lies, and benchmarks.” With apologies to Mark Twain (or possibly Benjamin Disraeli or maybe Henry Du Pré Labouchère), benchmarks have been used and abused ever since there have been computers. Like the question about when the first auto race was held (“as soon as the second automobile was built”), the question of who makes the fastest computer has beguiled and bedeviled engineers for ages. Now, just maybe, we may be making progress toward settling that dispute.
The bigger the computer, the bigger the benchmark. Conversely, testing just the microprocessor by itself requires only the simplest of code loops – or so it might appear. But even the simplest benchmark distorts the true nature of the processor you’re testing, as any “marketing engineer” can tell you. No matter what you measure – clock speed, arithmetic agility, procedural proficiency, or what have you – you’re always leaving something out. No synthetic test can truly encapsulate all the goodness (and badness) of a microprocessor.
For most of recorded history, the Dhrystone benchmark was the acid test for microprocessors, and sometimes for complete systems. Dhrystone is woefully inadequate, but that hasn’t stopped millions of curious programmers and incautious marketeers from using it to establish their bona fides. Intended as an integer-only alternative to the Whetstone benchmark (whose name it clearly mimics), Dhrystone is small, simple, easily portable, and utterly useless.
Enter EEMBC, the Embedded Microprocessor Benchmark Consortium. This nonprofit industry group has developed dozens of intricately engineered and skillfully crafted benchmarks for a variety of specialty industries. There are EEMBC benchmarks for automotive applications, industrial controllers, fax machines, and most other types of embedded applications you can name. What the group had never created was a basic, all-purpose benchmark to replace the ubiquitous Dhrystone – until now.
All Hail CoreMark
This week, the group released CoreMark, a free benchmark for measuring your favorite microprocessor or microcontroller. CoreMark is distributed as C source code that anyone can download and compile. In fact, please do. Anyone can post their CoreMark benchmark results at www.CoreMark.org, which acts as a public clearinghouse for CoreMark results from all over the world. For embedded developers shopping for their next processor or microcontroller, the site should become a valuable resource.
CoreMark makes very few assumptions about the target hardware (apart from the existence of a working C compiler), so it works on chips with or without caches, with or without floating-point units, with or without specific peripherals, and so on. In short, CoreMark should run on just about any processor worthy of the name and deliver useful results.
How does CoreMark succeed where Dhrystone fails? By paying attention to the way we write code and the way certain, uh, overeager marketing departments have, shall we say, “optimized” their Dhrystone scores over the years. Because Dhrystone is very small and simple, and because its results are often given so much weight, there’s a big incentive to tweak it – sometimes to the point where it compiles into little more than a sequence of NOPs. Clearly, some kind of legitimate alternative was needed.
In contrast to Dhrystone, CoreMark can’t be optimized away with compiler switches or creative programming. And although it’s still a synthetic benchmark, CoreMark mimics real-life workloads better than Dhrystone ever could. It contains a mixture of integer arithmetic, matrix manipulation, linked lists, state-machine operation with data-dependent branches, and a CRC, among other typical tasks. It’s also fairly smart about timing the right things, like the actual loops and not the overhead of library calls. Dhrystone was often a better test of the compiler than of the processor; CoreMark is designed to invert that relationship.
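To make the "can't be optimized away" idea concrete, here is a minimal sketch of one way a benchmark can defend itself. This is not CoreMark's source code. The function names, the toy workload, and the `runtime_seed` variable are all hypothetical, and the CRC is a generic bitwise CRC-16 (reflected polynomial 0xA001) chosen purely for illustration. The point is only that when every intermediate result is folded into a checksum, and the starting value isn't known until run time, the compiler has little room to delete the work or precompute the answer.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into a running CRC-16 (reflected polynomial 0xA001).
 * A generic textbook CRC, used here only as an illustration. */
static uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++) {
        if (crc & 1u)
            crc = (uint16_t)((crc >> 1) ^ 0xA001u);
        else
            crc = (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Fold a whole buffer into a running CRC-16. */
static uint16_t crc16_buffer(uint16_t crc, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        crc = crc16_update(crc, data[i]);
    return crc;
}

/* Hypothetical seed: reading it through a volatile object means the
 * compiler cannot assume its value and constant-fold the whole run. */
static volatile uint32_t runtime_seed = 7u;

/* Toy workload (an assumption, not a CoreMark kernel): each step takes a
 * data-dependent branch, and each result is folded into the checksum, so
 * no iteration is dead code as long as the caller uses the return value. */
static uint16_t run_workload(uint32_t seed, unsigned iterations)
{
    uint16_t crc = 0;
    uint32_t x = seed;

    for (unsigned i = 0; i < iterations; i++) {
        if (x & 1u)
            x = x * 3u + 1u;   /* odd path */
        else
            x >>= 1;           /* even path */
        crc = crc16_update(crc, (uint8_t)(x & 0xFFu));
    }
    return crc;
}
```

A harness built on this sketch would call something like `run_workload(runtime_seed, n)`, time only that call, and then print or compare the returned checksum. A run that reports the wrong checksum can be thrown out, which is a lot harder to game than a loop whose results nobody ever looks at.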
I’m sure CoreMark isn’t perfect – no benchmark is – and I’m equally sure that many programmers and marketing dweebs will cry foul and point to some arcane feature of some equally arcane chip that isn’t adequately represented in the test. Frankly, I don’t much care. Anything that displaces the miserable Dhrystone test is okay in my book. The EEMBC folks have shown themselves to be very clever when it comes to crafting practical and reliable benchmarks. We have much to be hopeful for.