Heterogeneous Processing, SoCs and FPGAs

A Path to Flexible System Implementation

Firstly – if you are an existing FPGA user, you may not find much that is new in this piece, but really, it is not aimed at you. What would be useful is if you share it with your system architect colleagues and your software colleagues, for whom much of this may well be new and useful.

You are beginning a new project – let’s say a motor control system. You can assemble components on a board – possibly a processor, a DSP, an FPGA for peripherals, and a networking ASIC. The result is a relatively large board, with the inherent reliability issues and a high BoM cost. If you are intending to produce many of these, the alternative might be an SoC/ASIC, which, even if you go down the low-cost route (see Cheap Chips: ASICs for the Rest of Us), is still going to take time and involve considerable NRE investment, and the end result is relatively inflexible.

There is another way: the SoC FPGA. FPGAs have been around for a long time, and they have moved from being merely the glue logic on a board to becoming significant devices in their own right, with massive deployments in a wide range of applications, particularly in communications. There are huge quantities of IP for a wide range of functions and, over time, the FPGA companies have added hard-wired functions – for communication, for example. IP for processors and for DSP functionality has long been available, but in 2011, FPGA market leader Xilinx launched the first SoC FPGA, which they call Zynq. At its heart is a silicon implementation of a standard ARM Cortex-A9 processor, and sharing the chip is a chunk of FPGA fabric. At power-up, the processor boots first and then configures functions in the programmable fabric. In one light it could be seen as a standard ARM-based microcontroller, with the option of choosing interfaces and on-chip functions, such as DSP, rather than accepting what a microcontroller company has chosen to offer. But in another light, it can be seen as an SoC, since the FPGA fabric can be used to implement a range of functionality.

Altera followed Xilinx, except that they chose to use the phrase SoC FPGA and introduced the approach in each of their product families, so today we have Cyclone V, Arria V, Arria 10 and Stratix 10 SoC FPGAs.

And Microsemi uses the SoC FPGA label for its SmartFusion devices.

So, what are the advantages of SoC FPGAs, and where should you use them? I spoke about this recently with two Altera people: Danny Biran, Senior Vice President of Corporate Strategy and Marketing, and Chris Balough, Senior Director, System-on-Chip Product Marketing. (Naturally they were both enthusiastic about SoC FPGAs.)

There are three starting points for looking at an SoC FPGA: as an alternative to a microcontroller, as a replacement for a processor and an FPGA that you are already using together, and as a replacement for an ASIC. In each case an SoC FPGA may make your life simpler.

For a start, there is the enormous cost of building a new “conventional” SoC. An IC built for a 20nm process node costs around $156 million and takes many man-years to complete. This means that you are going to have to achieve sales of around $750 million if you are to keep development costs at around 20% of revenue. So you are going to have to sell a lot of chips to make your numbers. At 14nm, these figures double.
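The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration using the numbers quoted above; the function name and the assumption that development cost is a fixed share of revenue are ours. Dividing $156 million by 20% gives about $780 million, in line with the rounded "around $750 million", and doubling the cost for 14nm doubles the revenue needed.

```python
def required_revenue(dev_cost_musd: float, dev_share: float) -> float:
    """Revenue (in $ millions) at which dev_cost_musd equals dev_share of revenue.

    Hypothetical helper: it assumes development cost is the only figure
    that matters and that it is held to a fixed fraction of revenue.
    """
    if not 0 < dev_share <= 1:
        raise ValueError("dev_share must be in the range (0, 1]")
    return dev_cost_musd / dev_share


COST_20NM_MUSD = 156.0   # development cost quoted in the article
DEV_SHARE = 0.20         # development cost as a fraction of revenue

revenue_20nm = required_revenue(COST_20NM_MUSD, DEV_SHARE)
# "At 14nm, these figures double": double the cost, same share of revenue.
revenue_14nm = required_revenue(2 * COST_20NM_MUSD, DEV_SHARE)

print(f"20nm: about ${revenue_20nm:.0f}M in sales needed")
print(f"14nm: about ${revenue_14nm:.0f}M in sales needed")
```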

Developing an SoC FPGA is going to be far faster and cheaper. Once the design is complete, initial production devices are immediately available: there is no waiting while expensive mask sets are created and the first silicon works its way through the production process. With lower development costs and without the need, inherent in silicon processing, to manufacture many devices at the same time, it is possible to have much smaller markets and still be profitable. And it is relatively easy to create variants of the basic design to match segments of your market more precisely.

Ah – but aren’t FPGAs slower and more power hungry than an SoC? Indeed they can be, at a particular process node. But if you are a process node junkie, you will have seen that FPGAs are always among the first products at each process node. The Stratix 10 SoC FPGAs that are just about to emerge are built using Intel’s 14nm Tri-Gate (FinFET) technology – probably the most advanced process in production in the world. (This is under a deal struck before Intel bought Altera.) Arria 10 devices are in 20nm, and the Arria V and Cyclone V are in 28nm. By contrast, many mainstream SoCs are being designed in 40nm or 65nm, and Altera argues that this gap is increasing.

While these are technical benefits of the SoC FPGAs, there are also underlying currents of demand that these devices are well placed to meet. Mobile applications are creating rapidly increasing demands for large-scale transmission, processing, and storage of data.

Processing requirements have so far been fulfilled by multi-core homogeneous processors, but there is an increasing demand for other forms of processing, including ARM’s big.LITTLE approach, Digital Signal Processing, and graphical processing. These have two objectives: to match processing methods to the application needs, and to save power by taking the load from the main CPU. For example: with big.LITTLE, the largest processor is normally in sleep or hibernation mode, being awakened only when the smaller processor identifies an application that needs more processing power than it has.
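As a rough illustration of the big.LITTLE idea described above, here is a toy Python model. It is not how a real operating-system scheduler works: the `Core` class, the `dispatch` function, and the capacity numbers are all hypothetical, and the only behaviour modelled is "keep the big core asleep unless the little core cannot cope".

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    capacity: int          # hypothetical units of work the core can handle
    awake: bool = False


def dispatch(demand: int, little: Core, big: Core) -> str:
    """Return the name of the core that should run a task of this size.

    The big core is woken only when the task exceeds what the little
    core can handle; otherwise it is left (or put back) asleep.
    """
    if demand <= little.capacity:
        big.awake = False
        return little.name
    big.awake = True
    return big.name


little = Core("little", capacity=10, awake=True)
big = Core("big", capacity=100)

for demand in (3, 50, 8):
    chosen = dispatch(demand, little, big)
    print(f"demand={demand:3d} -> {chosen} core (big awake: {big.awake})")
```

The power saving comes from the first branch: as long as the workload stays within the little core's capacity, the big core never leaves its sleep state.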

SoC FPGAs with hard-wired CPUs and programmable fabric are ideal for building heterogeneous computing systems. IP for DSPs and for GPUs is available, and Altera also has implementations of the Nios II 32-bit processor, to provide another processor option.

Now that we have these complex devices, how on earth do we implement them, programme them and debug them?

Well, Altera offers a range of tools that are matched to different users’ expertise. The OpenCL SDK gives programmers the power to develop systems without getting involved with the details of the implementation. OpenCL is an open standard, based on the C programming language, for programming heterogeneous systems. The SDK, in addition to programming the processors on the SoC FPGA, can also generate code to program the FPGA fabric.
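For readers who have not met OpenCL, the programming model is easier to grasp with a small example. The sketch below is a pure-Python emulation, not OpenCL code and not the Altera SDK: `vector_add_kernel` and `enqueue_nd_range` are hypothetical stand-ins. In real OpenCL the kernel would be written in OpenCL C and would obtain its index by calling `get_global_id(0)`, and the runtime would be free to execute the work-items in parallel on a CPU, GPU, or FPGA fabric.

```python
def vector_add_kernel(gid, a, b, out):
    """Work done by one work-item; gid plays the role of get_global_id(0)."""
    out[gid] = a[gid] + b[gid]


def enqueue_nd_range(kernel, global_size, *args):
    """Stand-in for the host-side launch: run the kernel once per index.

    This emulation runs the work-items one after another; the point of
    the model is that they are independent and could run concurrently.
    """
    for gid in range(global_size):
        kernel(gid, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)

enqueue_nd_range(vector_add_kernel, len(a), a, b, out)
print(out)
```

The programmer describes what happens for a single index and leaves it to the tools to decide how that work is mapped onto the hardware, which is why the same description can target either the processors or the fabric.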

An alternative route is through the SoC Embedded Design Suite. This incorporates the ARM DS-5 Altera Edition, which is an Eclipse-based IDE (Integrated Development Environment) that provides the tools to programme, debug and optimise applications for the ARM Core. There are also a hardware library, configuration tools, and implementation examples.

For people working in MathWorks’ MATLAB and Simulink, there is DSP Builder, which takes DSP algorithms from Simulink and compiles them into VHDL for the FPGA.

And there is the Quartus II design tool suite, which is standard for all Altera FPGAs. (In fact a significant number of Altera engineers are working not on hardware, but on software.)

Companies within the Altera and ARM ecosystems are creating and deploying other tools for development and debugging of SoC FPGAs.

The Altera SoC FPGA family starts with the Cyclone V, which uses a dual-core ARM Cortex-A9 and is targeted at industrial motor control, video converter and capture, and hand-held and portable devices. The three versions vary in the number of logic elements in the FPGA fabric and in the amount of hard-wired interfacing.

Next up is the Arria V, designed for lowest-power implementation of applications such as remote radio units, 10G/40G line cards, medical imaging, and broadcast studio equipment. Again there is a dual-core ARM Cortex-A9, hard-wired memory controllers, and high-speed, low-power communication interfaces. The two versions are for high throughput or lower power.

The Arria 10 SoC uses the same processors, but architectural innovations and a move from the Arria V SoC’s 28nm process to the TSMC 20nm process give higher performance and lower power. There is more on-chip memory and more interfacing and hard-wired communication.

The top of the family is the Stratix 10 SoC. This combines a 64-bit quad-core ARM Cortex-A53 with Altera’s new HyperFlex FPGA architecture, and it is fabricated in a 14nm Tri-Gate process to give some very exciting performance figures.

One of Altera’s claims for its SoC FPGAs is that the company has a road map to provide developers with the confidence of future support. This, without any dates, is divided into three families, codenamed Sequoia, Oak, and Cedar (see what they did there?).

Sequoia is at the top end, aiming for cloud applications, Terabit Systems, and military signal processing. To meet this, there will be an “Enterprise-class multi-core CPU system” and vast quantities of the HyperFlex FPGA architecture, and it will be fabricated in Intel’s 10nm Tri-Gate process.

Oak is midrange, for industrial IoT and automotive applications, and it will be in 14nm Tri-Gate and have an ARM 64-bit quad-core processor.

Cedar will be fabricated in a TSMC low cost/low-power process and have an ARM 64-bit dual-core processor.

All of them, Altera says, will have advances in security and power efficiency, and they will all have the Altera Common SoC Architecture (ACSA). This will allow code portability, scalability, common tools, and a common development flow. Developers will be able to begin their project without deciding on the final target until they have carried out sufficient implementation to have a firm idea of the performance they want from the device.

SoC FPGAs, and not just those from Altera, are now sufficiently mature, and possess appropriate supporting tools, to be taken seriously as an alternative route to system implementation. The long awaited cross-over point between FPGAs and ASICs has probably arrived.

17 thoughts on “Heterogeneous Processing, SoCs and FPGAs”

  1. Why does almost everybody say Zynq was first SoC FPGA?
    First SoC FPGA was Altera’s Excalibur 😉
    Nevertheless, thank you very much for the great article!

  2. I agree with Mr. Atanasov – unless I misunderstand the definition of an SoC FPGA, the Xilinx Virtex-2 Pro (with the PowerPC 405 core) and the Altera Excalibur (with an ARM9 core) predated Zynq by a decade. While Excalibur was dropped early on, its legacy continued for many years in the name of Altera’s erstwhile IP assembly tool, SOPC Builder (SOPC was the abbreviation for “System on Programmable Chip”).

    Also predating Zynq was Actel’s SmartFusion family — Actel (now Microsemi SoC) featured an ARM Cortex-M3 on those devices and subsequently continued to use the M3 in its SmartFusion2.

    What Xilinx did very notably was to give their SoC FPGAs the unique Zynq brand, and to aggressively position their Zynq devices by steering clear of FPGA terminology and pursuing sockets outside of the normal FPGA space.

  3. While there is a lot of good information in the article, a modern SOC does not cost anywhere near $156M to develop. Even using a 14 nm node, one can build one for less than $20M if they know what they are doing. This does not count any specialized software that you might use in the SOC; however, you have that cost whether it is an FPGA or an SOC.

    I have done quite a few SOCs over the years and have recently completed a state of the art SOC for far less than that amount. <$16M. Perhaps if the person had no experience and went completely turnkey and just sat back and watched the SOC being built by an outside company, it can take that half that much. The arguments being made are from an FPGA point of view where they want to show the economics poorly to highlight their case. My designs use an SOC FPGA in most of them as an assist to my SOCs and they are well done.

  4. @atanasov and @ecigan

    I think the difference between Excalibur and SmartFusion and the latest round of products is that the amount of programmable fabric left over after the hardwired core was in place was quite small. This dramatically limited the scope for additional logic and interfacing options.
    With Zynq and the Altera families you have quite large programmable fabric resources, equivalent in fact to entire FPGAs of an earlier generation. It is this, I feel, that is the game changer.

  5. @Asic1designer

    Thanks for your input.

    I was given the numbers and used them in good faith.

    I am interested that you were able to get a product into production for “much less”.

    Did this include the mask set?

    Did you have to do a respin?

    Did you use a lot of IP, either bought in or from your own earlier products?

    Many thanks

