
MIPS Moves on Multi-Core

MIPS32 1004K

At first, the concept of “multi-core” from a processor IP company might seem a bit confusing. Couldn’t we already put multiple MIPS cores on our devices? If your concept of multi-core ends with putting more than one processor on a chip, you may not yet be dialed into the subtleties.

This week, MIPS launched their highest-performance solution ever with the new MIPS32 1004K “Coherent Processing System”, a multi-core, multi-threaded IP solution. The challenge of keeping all your cores busy in a symmetric multi-processing (SMP) system is actually made much easier when the multi-processor is combined with multi-threading. With monolithic processors continuing to run up against frequency walls and, in particular, power-consumption walls, multi-core technology is permeating every segment of computing. Embedded applications have long taken advantage of multiple processors, tasking different cores to perform completely independent system functions. More general multi-core processing is relatively new, however, and the proliferation of higher-end applications running on sophisticated operating systems makes multi-core an imperative for power-sensitive, performance-hungry embedded applications.

Last year, MIPS rolled out their highest-performance monolithic, single-threaded core: the MIPS32 74K. The 74K gets its performance the old-fashioned way, through higher frequencies (up to 1GHz) and deeper, more sophisticated pipelines. This approach, however, runs into power problems as you continue to boost your performance. When processes can be efficiently parallelized, multiple cores can do the same work at lower frequencies, or more work at the same frequencies, with much better power-per-performance metrics. For those applications, MIPS is now rolling out the 1004K.

The 1004K can provide up to four processor cores in either single- or multi-threaded configurations. The real excitement comes from combining multiple cores with multi-threading. MIPS’s new product pivots around a Coherence Manager (CM) that coordinates the activities of the (up to four) processor cores. Additionally, an optional I/O Coherence Unit (IOCU) can coordinate coherence for I/O peripherals.

In each core, an intervention port connects to the coherence manager. In the CM, requests from the CPUs or from the IOCU enter through a Request Unit (RQU). A Memory Interface Unit (MIU) communicates with physical memory; it receives coherent reads and writes from the Intervention Unit (IVU), and non-coherent reads and writes and speculative coherent reads from the RQU. The MIU then hands read responses to a Response Unit (RSU) that passes data on to the CPUs or the IOCU as applicable.
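The request flow described above can be pictured with a small model. This is only a sketch of the unit-to-unit ordering as the paragraph describes it, not a description of the actual hardware: the function name `cm_path` is hypothetical, the hop from the RQU to the IVU for coherent requests is an assumption inferred from the text, and speculative coherent reads (which go from the RQU straight to the MIU) are left out for simplicity.

```python
from typing import List


def cm_path(requester: str, op: str, coherent: bool) -> List[str]:
    """Return the sequence of CM units a request visits (illustrative only).

    requester: label for the source, e.g. "CPU0" or "IOCU" (hypothetical names)
    op:        "read" or "write"
    coherent:  True for coherent requests, False for non-coherent ones
    """
    if op not in ("read", "write"):
        raise ValueError("op must be 'read' or 'write'")

    # Every request enters the Coherence Manager through the Request Unit.
    path = [requester, "RQU"]

    # Assumption: coherent traffic passes through the Intervention Unit
    # before reaching memory; non-coherent traffic goes to the MIU directly.
    if coherent:
        path.append("IVU")

    # The Memory Interface Unit talks to physical memory.
    path.append("MIU")

    # Only reads produce data that the Response Unit returns to the requester.
    if op == "read":
        path.extend(["RSU", requester])

    return path
```

For example, `cm_path("CPU0", "read", True)` yields the RQU, IVU, MIU, RSU sequence and ends back at the requesting CPU, while a non-coherent write from the IOCU stops at the MIU.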

The IOCU enables hardware I/O coherence by bridging I/O subsystems to the CM. It translates OCP 2.2 non-coherent requests into OCP 3.0 coherent/non-coherent requests. It breaks up bursts and unaligned accesses into cacheline/dword transactions, which minimizes the impact on the coherence fabric by structuring I/O data to suit the coherent system. Attributes are applied per transaction, and requests can be tagged to snoop L1+L2, L2 only, or neither. I/O parking gives I/O transactions priority over the CPU cores in the coherence manager.
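To make the burst-splitting idea concrete, here is a minimal sketch of how an unaligned I/O burst might be carved into cacheline and dword pieces. The splitting policy, the names `Txn` and `split_burst`, the 32-byte cacheline, and the 8-byte dword are all assumptions chosen for illustration; the article does not give the IOCU's actual rules or sizes.

```python
from dataclasses import dataclass
from typing import List

CACHELINE_BYTES = 32  # assumption: line size is not stated in the article
DWORD_BYTES = 8       # assumption: "dword" taken to mean 64 bits here


@dataclass(frozen=True)
class Txn:
    addr: int   # start address of the transaction
    size: int   # number of bytes covered
    kind: str   # "cacheline" or "dword"


def split_burst(addr: int, length: int) -> List[Txn]:
    """Split the byte range [addr, addr + length) into transactions.

    Aligned, full cachelines become one "cacheline" transaction each.
    Unaligned heads and short tails become "dword" transactions that never
    cross a dword boundary (and therefore never cross a cacheline boundary).
    """
    txns: List[Txn] = []
    end = addr + length
    while addr < end:
        if addr % CACHELINE_BYTES == 0 and end - addr >= CACHELINE_BYTES:
            txns.append(Txn(addr, CACHELINE_BYTES, "cacheline"))
            addr += CACHELINE_BYTES
        else:
            # Advance to the next dword boundary, but not past the end.
            next_dword = (addr // DWORD_BYTES + 1) * DWORD_BYTES
            stop = min(next_dword, end)
            txns.append(Txn(addr, stop - addr, "dword"))
            addr = stop
    return txns
```

Under these assumptions, a 48-byte burst starting at address 0x1C becomes a 4-byte partial dword, one full cacheline at 0x20, and two dword pieces covering the tail.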

The “MIPS32” part at the beginning of “MIPS32 1004K” tells us that the new multi-core processor is compliant with existing software developed for MIPS 24K, 24KE and 34K families.  Multi-threading is enabled by virtual processing elements (VPEs).  Each individual core can be configured in a wide variety of ways – one or two VPEs for single- or multi-threaded operation.  An FPU is available, and the CPU/FPU clock ratio can be configured.  You can select and size TLB, caches, and scratchpad RAM and also create user-defined instructions for your application.

There is also a Global Interrupt Controller (GIC). CPU access to the GIC is managed through a relocatable memory-mapped address range. The GIC can connect to the CM or elsewhere in the system. The GIC supports system-level and inter-processor interrupts and routes interrupts to the particular core or VPE. The number of system interrupts is configurable (up to 256).
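The routing behavior can be sketched as a simple lookup table from interrupt source to a (core, VPE) target. This is a toy software model, not the GIC's register interface: the class `InterruptRouter` and its methods are hypothetical names, and only the 256-source ceiling comes from the article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MAX_SYSTEM_INTERRUPTS = 256  # upper limit quoted in the article

Target = Tuple[int, int]  # (core index, VPE index)


@dataclass
class InterruptRouter:
    """Toy model: each interrupt source is routed to one core/VPE pair."""
    num_sources: int
    routes: Dict[int, Target] = field(default_factory=dict)
    delivered: List[Tuple[int, Target]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 < self.num_sources <= MAX_SYSTEM_INTERRUPTS:
            raise ValueError("num_sources must be between 1 and 256")

    def _check(self, source: int) -> None:
        if not 0 <= source < self.num_sources:
            raise ValueError(f"source {source} is out of range")

    def map_source(self, source: int, core: int, vpe: int) -> None:
        """Route an interrupt source to a particular core and VPE."""
        self._check(source)
        self.routes[source] = (core, vpe)

    def raise_interrupt(self, source: int) -> Target:
        """Record delivery of a mapped source and return its target."""
        self._check(source)
        if source not in self.routes:
            raise KeyError(f"source {source} has no route configured")
        target = self.routes[source]
        self.delivered.append((source, target))
        return target
```

In this model an inter-processor interrupt is just another source: software on one core raises a source that has been mapped to a VPE on a different core.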

As an example, consider a system configured with two cores, each set up with two VPEs for multi-threading, along with caches, coherence manager, and global interrupt controller. The base cores can operate at 800MHz, achieving a DMIPS rating of >2400. In TSMC 65nm with a 9-track, low-Vt library, the total area is about 3.8 mm².
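A quick back-of-the-envelope calculation puts these figures in more familiar terms. The numbers below are taken directly from the example configuration; because the DMIPS rating is quoted as “>2400” and the area as “about 3.8”, the results should be read as rough lower bounds rather than vendor specifications. The helper names are hypothetical.

```python
# Figures quoted for the example configuration.
TOTAL_DMIPS = 2400.0   # quoted as ">2400", so treat as a lower bound
FREQ_MHZ = 800.0
CORES = 2
VPES_PER_CORE = 2
AREA_MM2 = 3.8         # quoted as "about 3.8"


def dmips_per_mhz(total_dmips: float, freq_mhz: float, units: int = 1) -> float:
    """DMIPS per MHz, optionally divided evenly across `units` cores or VPEs."""
    return total_dmips / (freq_mhz * units)


def dmips_per_mm2(total_dmips: float, area_mm2: float) -> float:
    """Area efficiency in DMIPS per square millimeter."""
    return total_dmips / area_mm2


system = dmips_per_mhz(TOTAL_DMIPS, FREQ_MHZ)                            # 3.0
per_core = dmips_per_mhz(TOTAL_DMIPS, FREQ_MHZ, CORES)                   # 1.5
per_vpe = dmips_per_mhz(TOTAL_DMIPS, FREQ_MHZ, CORES * VPES_PER_CORE)    # 0.75
density = dmips_per_mm2(TOTAL_DMIPS, AREA_MM2)                           # roughly 630
```

So the quoted rating works out to at least 3.0 DMIPS/MHz for the two-core system, or 1.5 DMIPS/MHz per core, with both VPEs on each core contributing to that per-core figure.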

MIPS is a big player in the Linux club, and the new cores are friendly with open-source SMP Linux.  MIPS also has a complete software debug environment, including in-system debug from MIPS FS2.  FS2’s PDtrace has coherence awareness for compatibility with the multi-core environment.  The software tools are GNU-based, including an Eclipse Navigator IDE. 

The 1004K core is available now.  Early RTL was delivered in Q4 last year.
