feature article

ARMing a New Generation

Altera Announces Processor Architecture for Gen X

The future of FPGA-based processors is coming into focus. Altera just announced that the processor architecture for their upcoming “Generation 10” Stratix FPGA family will be the ARM Cortex-A53. Pretty clear, eh? OK, that’s it. End of article. Move on along. 

What? Still have questions? What does it all mean?

To review, Altera has announced that their next Stratix family (Generation 10) will be built on Intel’s 14nm Tri-Gate (FinFET) process. As we have discussed before, Altera is currently in a race with archrival Xilinx, whose first FinFET FPGAs will be riding in on TSMC’s 16nm FinFET process. Which horse is faster? Intel is widely believed to have superior process technology and has already been shipping 22nm FinFET-based devices. Those points go to Intel. TSMC, on the other hand, has vastly more experience as a merchant fab and has announced that they are working closely with Xilinx to accelerate their FinFET program, in a blitz whose marketing name is “FinFAST.”

At this point, therefore, it is unclear who will be shipping first (and, except for bragging rights between the two companies, probably few people care). It is likely that we will not see production devices from either company before 2015, so we are definitely in “future” mode here. It is also unclear how the performance attributes of the two companies’ offerings will stack up. Altera has shown more of their hand thus far, and their predictions are impressive: a 1GHz programmable fabric with up to four million LUT-4 equivalents, 56Gbps SerDes, better power efficiency, tons-o-RAM, and a high-powered processing subsystem in the SoC version. What does the processing subsystem look like? That’s why we are gathered here today.

There was speculation that the architecture might be other-than-ARM since the manufacturer is none-other-than-Intel. As far as we know, Intel hasn’t historically been too keen on manufacturing competing processor architectures. However, two other, more important market forces are at work in this situation. First, Altera has made a huge commitment to the ARM architecture with their current-generation SoC FPGAs. Getting their customers committed to the ARM/FPGA architecture and then jumping ship and forcing them to migrate after only one generation would be a major inconvenience for those customers, and it would be a big black eye for Altera. It is very unlikely that Altera would have inked the Intel deal knowing that they couldn’t continue their ARM commitment.

Second, Intel is obviously trying to make a go of it in the merchant fab business. If the company had a hard-and-fast policy of never manufacturing a chip with an ARM architecture on board, they’d be severely limiting their market. While Intel has already been building FPGAs for both Tabula and Achronix, getting Altera in their stable is a whole ‘nuther deal. Putting aside petty concerns about processor architecture is a small price to pay for better street cred in the merchant fab business.

So, this week, Altera announced that they are planning a quad-core, 64-bit ARM Cortex-A53 as the SoC engine inside Stratix 10. Given the performance specs of Intel’s 14nm Tri-Gate process, this is likely to be a tiger of a processor. The Cortex-A53 is the architecture used in the “LITTLE” end of ARM’s 64-bit big.LITTLE configuration pair, with the Cortex-A57 as the “big” part. The A53 has a simple, in-order, 8-stage pipeline (compared with the A57’s complex, out-of-order, multi-issue pipeline). Altera’s choice of the A53 comes down to power. The A53 delivers a crazy level of performance with very high power efficiency. Altera says the A53 will deliver up to 6 times the performance of current-generation SoC FPGA processors. While the A57 could deliver more performance, it would come at a severe power penalty. In an SoC FPGA, when you want the big performance, you use hardware acceleration in the FPGA fabric, so the A53 will be able to deliver a lot of processing at very low power, and it will dovetail nicely with super-fast, hardware-accelerated algorithms that will virtually sip power as well.

The A53 can also run in 32-bit mode – and, in that mode, it will be code-compatible with the A9 processor in Altera’s current SoC FPGAs. That should allow a smooth migration path for current customers, with the option to kick things up to 64-bit power at the cost of a modest software port. It isn’t clear how many customers have developed or deployed solutions based on the Altera/Cortex-A9 combination, so we don’t know how widely this compatibility will be used.

With this much processing power on board, we are seeing a philosophical departure from the FPGA norm. Previous-generation FPGA SoCs feel like connectivity devices with a helpful processor on board. With this level of capability, however, what we really have is a true heterogeneous computing device – a high-powered processor with impressive acceleration capabilities in programmable logic fabric. Boosting those capabilities will be Altera’s already-announced hard-wired floating-point DSP cores which, when combined with the ARM subsystem, should give these new devices a crazy amount of processing capability.

These days, power efficiency is even more important than computing power. Many systems are form-factor constrained and are already at the limit of the amount of heat that can be extracted or the amount of power that can be supplied. That means increases in computing power have to be accompanied by offsetting improvements in power efficiency. This metric is where these new SoC FPGAs should really shine.
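
To put a rough number on that trade-off, here is a back-of-the-envelope sketch in Python. It assumes a deliberately simple model in which power equals throughput divided by performance-per-watt; the function name and the "power headroom" parameter are ours, invented for illustration, and nothing here comes from Altera's data. The 6x figure simply borrows the "up to 6 times" performance claim quoted earlier as a sample input.

```python
def required_efficiency_gain(speedup: float, power_headroom: float = 1.0) -> float:
    """Factor by which performance-per-watt must improve to deliver `speedup`
    while total power grows by at most `power_headroom` (1.0 = no extra power).

    Assumed model (illustrative only): power = throughput / perf_per_watt.
    """
    if speedup <= 0 or power_headroom <= 0:
        raise ValueError("speedup and power_headroom must be positive")
    # new_power / old_power = speedup / efficiency_gain <= power_headroom
    return speedup / power_headroom


# A system already at its thermal limit (no headroom) that wants 6x the compute
# needs a 6x efficiency gain; with 50% more power available, it needs 4x.
at_limit = required_efficiency_gain(6.0)
with_headroom = required_efficiency_gain(6.0, power_headroom=1.5)
```

In other words, for a box that is already at its power ceiling, every bit of the speedup has to come from efficiency.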

The tricky part of harnessing all that power and efficiency, however, is the programming. Most design teams don’t have the wherewithal to do the complex software/hardware development that would be required to take full advantage of this kind of device using a traditional HDL-based approach for designing the hardware acceleration bits. Altera’s go-to answer for this problem is their OpenCL support, which promises to allow software developers to use the language – popular for GPU applications – to write algorithms that can be accelerated with FPGA fabric. The degree to which this approach is able to take advantage of the capabilities of the upcoming Stratix 10 FPGA SoC has yet to be seen, of course, but Altera has already gained considerable experience with the OpenCL approach on their current-generation devices.
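
For readers who haven't met the OpenCL style of programming, the core idea is that you write a small "kernel" describing the work for one element, and the host launches it across a whole range of indices. The sketch below imitates that model in plain Python. It is only a stand-in: `saxpy_kernel` and `launch` are hypothetical names of our own, not the OpenCL API and not anything from Altera's tools. In real OpenCL C, the kernel would get its index from `get_global_id(0)` rather than from an argument.

```python
from typing import Callable, List, Sequence


def saxpy_kernel(gid: int, a: float, x: Sequence[float],
                 y: Sequence[float], out: List[float]) -> None:
    """One work-item: computes a single output element, out[gid] = a*x[gid] + y[gid]."""
    out[gid] = a * x[gid] + y[gid]


def launch(kernel: Callable[..., None], global_size: int, *args) -> None:
    """Hypothetical host-side stand-in for enqueuing a kernel over a 1-D range.

    Here the work-items simply run one after another; an accelerator is free
    to run them in parallel because no work-item depends on another.
    """
    for gid in range(global_size):
        kernel(gid, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
# out is now [12.0, 24.0, 36.0, 48.0]
```

The appeal for software developers is that the kernel says nothing about hardware; how the independent work-items get mapped onto the device is left to the tools.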

There are still a large number of unknowns about the next generation of FPGA technology, but if the disclosures we’ve seen so far are any indication, 2014/2015 will be a wild ride in programmable logic land.

