Your Basic $99 Supercomputer

Adapteva’s Epiphany-IV Chip Rocks Floating-Point Math

What do you get when you combine a floating-point processor with a mesh network? The Adapteva Epiphany-IV microprocessor, apparently. This Boston-based startup, composed of four refugees from Analog Devices, has developed a brand-new high-performance processor that should help smartphones and other mobile devices get even smarter.

Epiphany-IV comes in chip form or as licensed IP. You can also get an evaluation board for $99 (described below), which the company proudly boasts is the world’s most power-efficient supercomputer. Big words from such a little company. But from tiny acorns do mighty oak trees grow, as a certain British competitor can attest.

The Epiphany architecture combines simple floating-point units connected by a point-to-point mesh network. The idea is that each CPU core works on a small part of a larger data set, exchanging data with its neighbors as it goes. Like a silicon hive mind, Epiphany relies on tiny amounts of work, multiplied.
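
To make the "small part of a larger data set" idea concrete, here is a minimal sketch of carving a two-dimensional array into tiles, one per core. The function name `split_into_tiles`, the 64x64 data set, and the 4x4 grid of cores are all assumptions made for illustration; they are not Adapteva's tools or API.

```python
def split_into_tiles(rows, cols, grid_rows, grid_cols):
    """Map each core (r, c) to the half-open row and column ranges it owns.

    Hypothetical helper: the grid shape and the even split are assumptions.
    """
    tiles = {}
    for r in range(grid_rows):
        for c in range(grid_cols):
            row_range = (r * rows // grid_rows, (r + 1) * rows // grid_rows)
            col_range = (c * cols // grid_cols, (c + 1) * cols // grid_cols)
            tiles[(r, c)] = (row_range, col_range)
    return tiles


# A 64x64 data set spread over an assumed 4x4 grid of cores.
tiles = split_into_tiles(64, 64, 4, 4)
print(len(tiles))     # 16 tiles, one per core
print(tiles[(1, 2)])  # ((16, 32), (32, 48))
```

Each core would then work only on its own 16x16 tile and trade edge values with the cores holding the adjacent tiles.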

As any fifth-grader can tell you, floating-point math (i.e., fractions) is a lot harder than basic integer math. Intel’s x86 family, for instance, is notable for its miserable floating-point performance. But Intel can’t change its Paleolithic FPU architecture without breaking software compatibility. Adapteva can.

Epiphany is so focused on floating-point operations that it can’t even do basic integer multiplication or division. (It does do integer ops; just not very complex ones.) Epiphany’s dual-issue CPU dispatches one integer instruction and one FP instruction every cycle. Since memory loads and stores are considered integer operations, the bulk of the integer half of the pipeline will likely be kept busy transferring data for the floating-point half.
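
When hardware lacks an integer multiplier, software can still build one out of simpler operations. The sketch below shows the classic shift-and-add technique for unsigned 32-bit values. It is only an illustration of the general approach; whether Adapteva's compiler does anything like this is an assumption, and the function name is made up.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, as a 32-bit register would


def shift_add_multiply(a, b):
    """Multiply two unsigned 32-bit values using only shifts, adds, and bit tests."""
    a &= MASK32
    b &= MASK32
    product = 0
    while b:
        if b & 1:                      # lowest bit of b set: add the shifted a
            product = (product + a) & MASK32
        a = (a << 1) & MASK32          # next bit of b is worth twice as much
        b >>= 1
    return product


print(shift_add_multiply(6, 7))  # 42
```

The loop runs once per significant bit of the multiplier, which is why a software multiply costs many simple integer instructions instead of one.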

The CPU doesn’t have any of the aggressive hardware we often see in microprocessors today. It doesn’t reorder instructions, for example, nor does it have any kind of branch prediction. In true RISC fashion, Epiphany relies on Adapteva’s compiler to figure all this stuff out. The hardware does whatever it’s told.

According to the EEMBC CoreMark benchmark, Epiphany’s integer performance is… lackluster. It’s not even half as fast as an ARM Cortex-A9, for example (1.3 CoreMarks/MHz versus 2.9 for the ARM core). But integer arithmetic isn’t really Epiphany’s forte. It’s intended for floating-point work, and on that it excels.

Each CPU can spit out one floating-point multiply-accumulate operation (MAC) every cycle, pretty much like a DSP. And there are either 16 or 64 of these CPU cores on each Epiphany chip. Taken all together, Epiphany is a pretty speedy little FP coprocessor. Current samples of Epiphany-IV run at 600 to 700 MHz in 28nm silicon.
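
Those figures multiply out to a rough peak number. The sketch below assumes every core sustains one MAC on every cycle (a best case that real code rarely reaches) and follows the common convention of counting a MAC as two floating-point operations; the function name is hypothetical.

```python
def peak_macs_per_second(cores, clock_mhz):
    """Best-case MACs per second, assuming one MAC per core per cycle."""
    return cores * clock_mhz * 1_000_000


for cores in (16, 64):
    for clock_mhz in (600, 700):
        macs = peak_macs_per_second(cores, clock_mhz)
        # Counting each MAC as two floating-point operations (a convention).
        print(f"{cores} cores @ {clock_mhz} MHz: "
              f"{macs / 1e9:.1f} GMAC/s, {2 * macs / 1e9:.1f} GFLOPS peak")
```

On those assumptions, the 64-core part at 700 MHz tops out at 44.8 billion MACs per second.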

Programmers of a certain age may remember the Intel i860, a roughly similar device that briefly adorned the Intel product catalog in the late 1980s. Like Epiphany, the i860 had a “naked” RISC pipeline that was streamlined and efficient, but that also leaned heavily on its compiler to know what to do. This approach yields smaller and more power-efficient hardware than most CPUs today, but it also lays several traps for unwary programmers. Where power efficiency is the goal, stripped-down hardware is a fair tradeoff. Where ease of programming is important, it’s a quagmire. Given that Adapteva has no installed software base to protect, and its own in-house compiler to play with, it can manage the complexity. If Epiphany becomes more popular, third-party developers will have to port their code carefully.

As an example of Epiphany’s less-is-faster approach, the chip has no cache, only SRAM. Specifically, each of the 16 or 64 CPU cores has its own 32KB block of SRAM that it can access directly. That memory can also be accessed by the 15 or 63 other CPUs on the same chip, via the on-chip mesh network. But they have to ask for it. In other words, there’s no cache snooping among all those SRAM blocks, so no power is wasted keeping them all coherent. That also means no cache tags, no lookup logic, and so on. SRAM is simple, predictable, and easy to manufacture, whereas caches add complexity, size, and power. Caches might be easier for programmers to manage, and they effectively multiply the size of the memory, but they’re not as simple as SRAMs.
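
With no cache to hide a working set that is too big, the programmer has to budget that 32KB by hand. Here is a minimal sketch of the arithmetic, assuming 1KB means 1,024 bytes and that the data is 4-byte single-precision floats. The `reserved_bytes` parameter (space set aside for code and stack) is a hypothetical knob, not a figure from Adapteva.

```python
SRAM_BYTES_PER_CORE = 32 * 1024  # 32KB per core, taking 1KB as 1,024 bytes
FLOAT32_BYTES = 4                # assumption: single-precision data


def total_sram_bytes(cores):
    """On-chip SRAM summed across all cores."""
    return cores * SRAM_BYTES_PER_CORE


def max_float32_values(reserved_bytes=0):
    """How many 4-byte floats fit in one core's SRAM after a reservation."""
    return (SRAM_BYTES_PER_CORE - reserved_bytes) // FLOAT32_BYTES


print(total_sram_bytes(16) // 1024)    # 512 (KB across a 16-core chip)
print(total_sram_bytes(64) // 1024)    # 2048 (KB across a 64-core chip)
print(max_float32_values())            # 8192
print(max_float32_values(8 * 1024))    # 6144, with a hypothetical 8KB reserved
```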

The on-chip mesh network also sacrifices transparency on the altar of simplicity. While each CPU can connect to its immediate neighbor to the north, south, east, or west, it can’t transfer data any farther than that without multiple hops. Like a miniature version of Internet routers, each CPU in Epiphany must forward packets not intended for itself. Every hop adds a clock cycle, so there’s an incentive for programmers to design software with physical, as well as logical, layouts in mind. It’s not often that you see the physical structure of a program affecting its performance.
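
The cost of a poor layout is easy to estimate. The sketch below assumes the cores sit in a square grid (4x4 for 16 cores, 8x8 for 64, which the description above does not actually state) and that packets take a shortest path, so the hop count is the Manhattan distance between two cores. The function names are made up for illustration.

```python
def neighbors(r, c, grid_rows, grid_cols):
    """Return the north/south/west/east neighbors that exist on the grid."""
    candidates = {
        "north": (r - 1, c),
        "south": (r + 1, c),
        "west": (r, c - 1),
        "east": (r, c + 1),
    }
    return {
        direction: (nr, nc)
        for direction, (nr, nc) in candidates.items()
        if 0 <= nr < grid_rows and 0 <= nc < grid_cols
    }


def hop_count(src, dst):
    """Minimum hops between two cores: the Manhattan distance."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def worst_case_hops(grid_rows, grid_cols):
    """Corner-to-corner distance, the longest shortest path on the mesh."""
    return (grid_rows - 1) + (grid_cols - 1)


print(neighbors(0, 0, 4, 4))     # {'south': (1, 0), 'east': (0, 1)}
print(hop_count((0, 0), (2, 3))) # 5
print(worst_case_hops(4, 4))     # 6
print(worst_case_hops(8, 8))     # 14
```

At one clock cycle per hop, tasks that talk to each other a lot belong on adjacent cores, not in opposite corners.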

As we’ve said before on these pages, designing a microprocessor is the easy part. Getting software support for it is hard. The company has taken the unusual step of making its lack of software someone else’s problem. To help develop Epiphany code, Adapteva has launched a Kickstarter program to raise $750,000 for the company. A $99 “pledge” gets you an Epiphany evaluation board, while a check for $499 entitles the donor to a “limited edition” evaluation board with a low serial number. Wooo.

If floating-point is your thing, and small die size and minimal power consumption are your metrics for success, Adapteva’s Epiphany design may be just the ticket. It’s far smaller than an ARM processor in terms of transistor count and silicon area (proving, once again, that ARM is not the tiny CPU that many imagine it to be), but there are drawbacks to that austere minimalism. Other processor designers don’t add transistors gratuitously; all that extra silicon is there for a reason. And in most cases, it’s to make the CPU easier to program and less likely to crash. Most CPUs have caches, MMUs, bus interfaces, peripheral I/O, and other productive uses for their transistors. Epiphany has none of those features, and as long as its users know that going in, they’ll likely be able to brag they’ve got the smallest, fastest, cheapest, and most obscure supercomputer on the block. 
