feature article
Subscribe Now

MIPS Moves on Multi-Core

MIPS32 1004K

At first, the concept of “multi-core” from a processor IP company might seem a bit confusing.  Couldn’t we already put multiple MIPS cores on our devices?  If your concept of multi-core ends with putting more than one processor on a chip, you may not be yet dialed into the subtleties.

This week, MIPS launched their highest-performance solution ever with the new MIPS32 1004K “Coherent Processing System” – a multi-core, multi-threaded IP solution.  The challenge of keeping all your cores busy in a symmetric multi-processing (SMP) system is actually made much easier when the multi-processor is combined with multi-threading.  With monolithic processors continuing to pound frequency and particularly power consumption walls, multi-core technology is permeating every segment of computing.  Embedded applications have long taken advantage of multiple processors – tasking different cores to perform completely independent system functions.  More general multi-core processing is relatively new, however, and the proliferation of higher-end applications running on sophisticated operating systems makes multi-core an imperative for power-sensitive, performance-hungry embedded applications.

Last year, MIPS rolled out their highest-performance monolithic, single-threaded core – the MIPS32 74K.  The 74K gets its performance the old fashioned ways – higher frequencies (up to 1GHz) and deeper, more sophisticated pipelines.  This approach, however, runs into power problems as you continue to boost your performance.  When processes can be efficiently parallelized, multiple cores can do the same work at lower frequencies or more work at the same frequencies with much better power-per-performance metrics.  For those applications, MIPS is now rolling out the 1004K. 

1004K can provide up to four processor cores in either single- multi-threaded configurations.  The real excitement comes in with combining multi-cores with multi-threading.  MIPS’s new product pivots around a Coherence Manager (CM) that coordinates the activities of the (up to four) processor cores.  Additionally, an optional I/O Coherence Unit (IOCU) can coordinate coherence for I/O peripherals.

In each core, an intervention port connects to the coherence manager.  In the CM, read and response requests come and go from CPUs or from the IOCU via a Request Unit (RQU).  A Memory Interface Unit (MIU) communicates with physical memory and receives coherent read/writes from the IVU and non-coherent read/writes and speculative coherent reads from the RQU.  The MIU then hands read responses to a Response Unit (RSU) that passes data on to CPUs or the IOCU as applicable.

The IOCU enables hardware I/O coherence via bridging I/O subsystems to the CM.  It translates OCP 2.2 non-coherent requests into OCP 3.0 coherent/non-coherent requests.  It breaks up bursts and unaligned accesses into cacheline/dword transactions, which minimizes impact to the coherence fabric by structuring I/O data to the coherent system.  Attributes are applied per-transaction, and requests can be tagged to snoop L1+L2, L2 only, or neither.    I/O parking gives I/O transactions priority over the CPU cores in the coherence manager.

The “MIPS32” part at the beginning of “MIPS32 1004K” tells us that the new multi-core processor is compliant with existing software developed for MIPS 24K, 24KE and 34K families.  Multi-threading is enabled by virtual processing elements (VPEs).  Each individual core can be configured in a wide variety of ways – one or two VPEs for single- or multi-threaded operation.  An FPU is available, and the CPU/FPU clock ratio can be configured.  You can select and size TLB, caches, and scratchpad RAM and also create user-defined instructions for your application.

There is also a Global Interrupt Controller.  CPU access to the GIC is managed through a relocatable memory-mapped address range.  The GIC can connect to the CM or elsewhere in the system.  The GIC supports system-level and inter-processor interrupts and routs interrupts to the particular core or VPE.  The number of system interrupts is configurable (up to 256).

Example specs, for a system configured with two cores, and each core set up with two VPEs for multi-threading, caches, coherence manager, and global interrupt controller, the base cores can operate at 800MHz, achieving a DMIPS rating of >2400.  Using TSMC 65nm, 9-track, low Vt the total area is about 3.8mm2. 

MIPS is a big player in the Linux club, and the new cores are friendly with open-source SMP Linux.  MIPS also has a complete software debug environment, including in-system debug from MIPS FS2.  FS2’s PDtrace has coherence awareness for compatibility with the multi-core environment.  The software tools are GNU-based, including an Eclipse Navigator IDE. 

The 1004K core is available now.  Early RTL was delivered in Q4 last year.

Leave a Reply

featured blogs
Aug 15, 2018
https://youtu.be/6a0znbVfFJk \ Coming from the Cadence parking lot (camera Sean) Monday: Jobs: Farmer, Baker Tuesday: Jobs: Printer, Chocolate Maker Wednesday: Jobs: Programmer, Caver Thursday: Jobs: Some Lessons Learned Friday: Jobs: Five Lessons www.breakfastbytes.com Sign ...
Aug 15, 2018
VITA 57.4 FMC+ Standard As an ANSI/VITA member, Samtec supports the release of the new ANSI/VITA 57.4-2018 FPGA Mezzanine Card Plus Standard. VITA 57.4, also referred to as FMC+, expands upon the I/O capabilities defined in ANSI/VITA 57.1 FMC by adding two new connectors that...
Aug 15, 2018
The world recognizes the American healthcare system for its innovation in precision medicine, surgical techniques, medical devices, and drug development. But they'€™ve been slow to adopt 21st century t...
Aug 14, 2018
I worked at HP in Ft. Collins, Colorado back in the 1970s. It was a heady experience. We were designing and building early, pre-PC desktop computers and we owned the market back then. The division I worked for eventually migrated to 32-bit workstations, chased from the deskto...
Jul 30, 2018
As discussed in part 1 of this blog post, each instance of an Achronix Speedcore eFPGA in your ASIC or SoC design must be configured after the system powers up because Speedcore eFPGAs employ nonvolatile SRAM technology to store its configuration bits. The time required to pr...