feature article

Why Use an 8-bit Core When 32 Bits Are Better?

You are designing a new product as an SoC and need some processing power – not a huge amount – and you have tight power and real estate budgets. So you drop in an 8051 core. Job done? Well, not according to the folks at Cortus. This multinational team, based in the southern French town of Montpellier, includes people who have worked on processors for Intel, Bosch, Infineon, Siemens, and Synopsys, and they are likely to say that you have made a poor move. Your real estate and power budgets can be met with a processor that will also give you a great deal more processing horsepower and a lower overall cost of ownership – their APS3 32-bit core.

Now this seems a bit counter-intuitive, so how do they justify their attitude? Let’s get some physical things out of the way first. The basic core is implemented in 7.9k gates (which, they say, is the same size as an 8051 and around 60 per cent of an ARM Cortex-M0) and it occupies only 0.04 mm² in a 65nm process. They compare this with a “high end” 32-bit core, which will be approaching 50k gates. Power consumption depends on silicon area and process, but the figures quoted for the core are 22 μW/MHz and 19 μW/DMIPS. (These figures are for an implementation in a TSMC 130nm process.) The design also cuts power by minimising the areas of the CPU that need to switch.
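
As a rough cross-check of the two power figures, dividing μW/MHz by μW/DMIPS gives the implied throughput in DMIPS/MHz. The sketch below does that arithmetic and estimates core power at a given clock. The function names are illustrative, and the linear scaling of power with clock frequency is a simplifying assumption, not a Cortus specification.

```python
# Figures quoted in the article for the APS3 in a TSMC 130nm process.
UW_PER_MHZ = 22.0    # microwatts per MHz of clock
UW_PER_DMIPS = 19.0  # microwatts per Dhrystone MIPS


def implied_dmips_per_mhz(uw_per_mhz, uw_per_dmips):
    """(uW/MHz) / (uW/DMIPS) leaves DMIPS/MHz."""
    return uw_per_mhz / uw_per_dmips


def core_power_uw(clock_mhz, uw_per_mhz=UW_PER_MHZ):
    """Estimated core power in microwatts.

    Assumption: power scales linearly with clock frequency and
    leakage is ignored, which is only a first-order approximation.
    """
    return clock_mhz * uw_per_mhz


if __name__ == "__main__":
    ratio = implied_dmips_per_mhz(UW_PER_MHZ, UW_PER_DMIPS)
    print(f"Implied throughput: {ratio:.2f} DMIPS/MHz")
    print(f"Estimated power at 10 MHz: {core_power_uw(10):.0f} uW")
```

The two quoted figures imply a little under 1.16 DMIPS/MHz, and on the linear assumption a 10 MHz clock would cost about 220 μW for the core alone.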

Then comes the claim that really summarises the Cortus argument: the APS3 does around 40 times more work per cycle than an 8-bit core. This means that, for a given workload, you can slow the clock right down, and so burn less power. And, if you are running more slowly, there is no need for expensive fast memory.
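
To see how the “slow the clock right down” argument plays out, the sketch below works out the clock each core would need for the same workload. It uses the DMIPS/MHz figures cited in the reader discussion under this article (0.026 for a common 4-cycle 8051 and 1.15 for the basic APS3) and the 22 μW/MHz figure quoted earlier. The 1 DMIPS workload is an arbitrary illustrative value, and no 8051 power figure is given in the article, so only the APS3 power is estimated.

```python
# DMIPS/MHz figures cited in the reader discussion below the article.
APS3_DMIPS_PER_MHZ = 1.15
CORE_8051_DMIPS_PER_MHZ = 0.026
# APS3 power figure quoted in the article (TSMC 130nm process).
APS3_UW_PER_MHZ = 22.0


def work_ratio(fast_dmips_per_mhz, slow_dmips_per_mhz):
    """How many times more Dhrystone work per clock the faster core does."""
    return fast_dmips_per_mhz / slow_dmips_per_mhz


def clock_needed_mhz(workload_dmips, dmips_per_mhz):
    """Clock frequency (MHz) needed to sustain a given Dhrystone workload."""
    return workload_dmips / dmips_per_mhz


if __name__ == "__main__":
    workload = 1.0  # hypothetical workload in DMIPS, chosen for illustration

    ratio = work_ratio(APS3_DMIPS_PER_MHZ, CORE_8051_DMIPS_PER_MHZ)
    f_8051 = clock_needed_mhz(workload, CORE_8051_DMIPS_PER_MHZ)
    f_aps3 = clock_needed_mhz(workload, APS3_DMIPS_PER_MHZ)

    print(f"Work per clock ratio: {ratio:.1f}x")
    print(f"8051 clock needed: {f_8051:.2f} MHz")
    print(f"APS3 clock needed: {f_aps3:.2f} MHz")
    # Assumes power scales linearly with clock frequency.
    print(f"APS3 core power at that clock: {f_aps3 * APS3_UW_PER_MHZ:.1f} uW")
```

On these figures the ratio comes out at roughly 44, so a workload that needs a clock of about 38.5 MHz on the 8051 would need under 1 MHz on the APS3.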

But they haven’t stopped with the silicon. Right from the start, the Cortus team has been creating the software development tool chain in parallel with the core. Arguing that most users will want to use a high level language for faster system development and easier application maintenance, Cortus has been developing a version of GCC (GNU Compiler Collection) to compile C and C++ to run on the core. The compiler, like all the other tools, is built as a plug-in for the Eclipse IDE. Also available are an Instruction Set Simulator (ISS) and other customized GNU development tools, all within Eclipse. Licensees of the APS3 core can repackage and rebrand these elements of the tool chain, so that their customers will have a fully supported product.

There is already a choice of operating systems. The basic OS available from Cortus is FreeRTOS, which has a code footprint in ROM of 11,084 bytes. Cortus compares this with the same RTOS on an 8051, which occupies 26,007 bytes. Other operating systems available include OPENRTOS (a commercial version of FreeRTOS); Micrium’s μC/OS-II, an RTOS that is specifically aimed at the safety-critical market; and the MMU-less μClinux, which makes the APS3 the smallest core to support Linux. (You will need about 1 MByte of ROM to hold it, though.) They have also worked with Lauterbach, the simulator and debugger company, to ensure that debug tools are available.
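
The FreeRTOS footprint comparison is easier to judge as a saving. The short sketch below works it out from the two byte counts quoted above; the function name is illustrative only.

```python
# FreeRTOS ROM footprints quoted in the article, in bytes.
FREERTOS_ROM_APS3 = 11_084
FREERTOS_ROM_8051 = 26_007


def footprint_saving(small_bytes, large_bytes):
    """Return (bytes saved, fraction of the larger footprint saved)."""
    saved = large_bytes - small_bytes
    return saved, saved / large_bytes


if __name__ == "__main__":
    saved, fraction = footprint_saving(FREERTOS_ROM_APS3, FREERTOS_ROM_8051)
    print(f"ROM saved: {saved} bytes ({fraction:.1%} of the 8051 footprint)")
```

That is 14,923 bytes saved, or a little over 57 per cent of the 8051 footprint.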

Although the RISC instruction set is already rich, Cortus provides a coprocessor interface and a coprocessor (a full Freescale 56000-like 16-bit DSP) so that it can be extended. The company will even develop the algorithms and instructions needed. One of the current licensees, security specialist Certicom, a subsidiary of RIM, has used the coprocessor to take care of calculations in the elliptic curve encryption/decryption algorithm that it has developed.

The core exists in a number of flavours, including a floating-point version, and these are supported by a list of peripherals, including simple and complex bus interfaces (from UARTs to Ethernet and USB), a JTAG interface (and Cortus also sells an EtherTag, a JTAG/Ethernet debug interface for system development), an instruction cache, a data cache, and an MMU.

Cortus’s APS3 has been designed into products that have been shipping in volume since the end of 2008, and the company has some interesting plans for extending its products. There are routes available to versions of the cores for what Cortus calls “more demanding applications.” These are often made by coupling some of the more complex peripherals more closely to the core. The team is also talking about routes to multi-core, and they provide a coherent data cache for dual- and quad-core implementations, which are supported by the existing tool chain. In all cases the company seems willing to develop, from the existing building blocks and, if necessary, from custom design, a core that meets the licensee’s needs.

For a company that has been shipping licenses for only three years, the web site has a long reference list of happy customers: APS3 licensees include Schneider, Taegee/MagnaChip, E2V, Coronis/Elster, SightIC/Broadcom, Certicom, PointChips, Discretix, CEA, and StarChip, among others. They are all building chips that are deployed in deeply embedded applications, such as touch screen controllers, utility power monitoring, Bluetooth, SIM cards, and Pay TV smart cards.

All the IP is available as RTL for integration into a standard ASIC/SoC design flow and also for Microsemi SoC products (formerly Actel FPGAs).

The company is gearing up for even more growth, and it closed a new round of investment last September, for an undisclosed amount, from French sources. The announcement at the time was that the new funds were going to be used to support international expansion.

4 thoughts on “Why Use an 8-bit Core When 32 Bits Are Better?”

  1. 40 times more work per clock? That claim needs to be backed up with at least a tiny bit of hard data. 4 times, easily, 8 times, maybe – but 40 times?? I’m dubious.

  2. It is reasonable to compare benchmark performance data such as DMIPS/MHz. The most commonly used 8051 cores use 4 cycles/instruction and deliver 0.026 DMIPS/MHz. The basic APS3 delivers 1.15 DMIPS/MHz, so it does approximately 40 times as much per clock cycle.

  3. In other news, they hold another kind of 8k mcu demo compo in France; spooky computation at EU distance.

    The Microprocessor Report abstracts on Lilliputian mcu are generous, but it is nice to see this as an option to a Cadence module suggestion I can’t out-optimize (refuse.) The number one reason has to be that data with 128bit indices (at least in time code) is getting common; 24-bit sound subcoding, 30-bitplane stereo images; who wants to glue up to that 8 bits at a time (though that is the minimum DDR3 bitline chunk…)
