MIPS Moves on Multi-Core

MIPS32 1004K

At first, the concept of “multi-core” from a processor IP company might seem a bit confusing.  Couldn’t we already put multiple MIPS cores on our devices?  If your concept of multi-core ends with putting more than one processor on a chip, you may not yet be dialed into the subtleties.

This week, MIPS launched their highest-performance solution ever with the new MIPS32 1004K “Coherent Processing System” – a multi-core, multi-threaded IP solution.  The challenge of keeping all your cores busy in a symmetric multi-processing (SMP) system is actually made much easier when multiple cores are combined with multi-threading.  With monolithic processors continuing to run into frequency walls and, in particular, power-consumption walls, multi-core technology is permeating every segment of computing.  Embedded applications have long taken advantage of multiple processors – tasking different cores to perform completely independent system functions.  More general multi-core processing is relatively new, however, and the proliferation of higher-end applications running on sophisticated operating systems makes multi-core an imperative for power-sensitive, performance-hungry embedded applications.

Last year, MIPS rolled out their highest-performance monolithic, single-threaded core – the MIPS32 74K.  The 74K gets its performance the old-fashioned way – higher frequencies (up to 1GHz) and deeper, more sophisticated pipelines.  This approach, however, runs into power problems as you continue to boost performance.  When processes can be efficiently parallelized, multiple cores can do the same work at lower frequencies, or more work at the same frequency, with much better power-per-performance metrics.  For those applications, MIPS is now rolling out the 1004K.
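To make the power argument concrete, here is a rough sketch using the textbook first-order model for dynamic power, P ≈ C·V²·f.  The capacitance, voltages, and frequencies below are purely illustrative assumptions (they are not MIPS figures), and the sketch assumes that a core clocked at half the frequency can also run at a lower supply voltage.

```python
def dynamic_power(c_eff, voltage, freq_hz, cores=1):
    """First-order dynamic power in watts: cores * C * V^2 * f."""
    return cores * c_eff * voltage ** 2 * freq_hz


# Illustrative numbers only (assumptions, not vendor data).
C_EFF = 1e-9  # effective switched capacitance per core, in farads

# One core at 1 GHz and 1.2 V.
single_core_w = dynamic_power(C_EFF, voltage=1.2, freq_hz=1e9)

# Two cores at 500 MHz each, assumed able to run at 1.0 V.
dual_core_w = dynamic_power(C_EFF, voltage=1.0, freq_hz=5e8, cores=2)

print(f"single core: {single_core_w:.2f} W")
print(f"dual core:   {dual_core_w:.2f} W")
print(f"ratio:       {dual_core_w / single_core_w:.2f}")
```

Under these assumed numbers, the two slower cores deliver the same aggregate clock cycles per second for roughly 70 percent of the dynamic power.  The saving comes entirely from the voltage term, which is why it only materializes when the workload actually parallelizes.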

The 1004K can provide up to four processor cores in either single- or multi-threaded configurations.  The real excitement comes from combining multiple cores with multi-threading.  MIPS’s new product pivots around a Coherence Manager (CM) that coordinates the activities of the (up to four) processor cores.  Additionally, an optional I/O Coherence Unit (IOCU) can coordinate coherence for I/O peripherals.

In each core, an intervention port connects to the coherence manager.  In the CM, requests from the CPUs or from the IOCU enter through a Request Unit (RQU).  A Memory Interface Unit (MIU) communicates with physical memory; it receives coherent reads and writes from the Intervention Unit (IVU), and non-coherent reads and writes and speculative coherent reads from the RQU.  The MIU then hands read responses to a Response Unit (RSU) that passes data on to the CPUs or the IOCU as applicable.

The IOCU enables hardware I/O coherence by bridging I/O subsystems to the CM.  It translates OCP 2.2 non-coherent requests into OCP 3.0 coherent/non-coherent requests.  It breaks up bursts and unaligned accesses into cacheline/dword transactions, which minimizes the impact on the coherence fabric by structuring I/O data for the coherent system.  Attributes are applied per-transaction, and requests can be tagged to snoop L1+L2, L2 only, or neither.  I/O parking gives I/O transactions priority over the CPU cores in the coherence manager.
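The burst-splitting idea can be sketched in a few lines.  This is a simplified illustration, not the IOCU's actual logic: the function name split_burst is hypothetical, the 32-byte line size is an assumption, and the sketch only cuts at cache-line boundaries without modeling the dword-level transactions mentioned above.

```python
def split_burst(addr, length, line_size=32):
    """Split a byte range into chunks that never cross a cache-line boundary.

    Returns a list of (start_address, byte_count) tuples.
    line_size is an assumed value for illustration.
    """
    if addr < 0 or length < 0 or line_size <= 0:
        raise ValueError("addr and length must be >= 0, line_size must be > 0")

    chunks = []
    end = addr + length
    while addr < end:
        # First address of the next cache line after `addr`.
        next_line = (addr // line_size + 1) * line_size
        chunk_end = min(end, next_line)
        chunks.append((addr, chunk_end - addr))
        addr = chunk_end
    return chunks


# An unaligned 80-byte burst starting 16 bytes into a line.
for start, size in split_burst(0x1010, 80):
    print(f"0x{start:04x}: {size} bytes")
```

Here the unaligned burst becomes one partial-line access followed by two full-line accesses, so each transaction touches exactly one cache line and can be snooped on its own.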

The “MIPS32” part at the beginning of “MIPS32 1004K” tells us that the new multi-core processor is compatible with existing software developed for the MIPS 24K, 24KE and 34K families.  Multi-threading is enabled by virtual processing elements (VPEs).  Each individual core can be configured in a wide variety of ways, starting with one or two VPEs for single- or multi-threaded operation.  An FPU is available, and the CPU/FPU clock ratio can be configured.  You can select and size the TLB, caches, and scratchpad RAM, and also create user-defined instructions for your application.

There is also a Global Interrupt Controller (GIC).  CPU access to the GIC is managed through a relocatable memory-mapped address range.  The GIC can connect to the CM or elsewhere in the system.  The GIC supports system-level and inter-processor interrupts and routes each interrupt to a particular core or VPE.  The number of system interrupts is configurable (up to 256).
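As a mental model of that routing, consider the toy sketch below.  It is a software analogy only: the class and method names are hypothetical and say nothing about the GIC's real register interface.  The limits it enforces (up to 256 system interrupts, up to four cores, one or two VPEs per core) are the ones quoted in this article.

```python
from dataclasses import dataclass

MAX_SYSTEM_INTERRUPTS = 256  # upper bound quoted in the article


@dataclass(frozen=True)
class Target:
    core: int
    vpe: int


class InterruptRouter:
    """Toy model: maps each system interrupt to one (core, VPE) target."""

    def __init__(self, num_interrupts, num_cores, vpes_per_core):
        if not 1 <= num_interrupts <= MAX_SYSTEM_INTERRUPTS:
            raise ValueError("num_interrupts must be between 1 and 256")
        if not 1 <= num_cores <= 4:
            raise ValueError("num_cores must be between 1 and 4")
        if vpes_per_core not in (1, 2):
            raise ValueError("vpes_per_core must be 1 or 2")
        self.num_interrupts = num_interrupts
        self.num_cores = num_cores
        self.vpes_per_core = vpes_per_core
        self._routes = {}

    def _check_irq(self, irq):
        if not 0 <= irq < self.num_interrupts:
            raise ValueError(f"interrupt {irq} is out of range")

    def route(self, irq, core, vpe):
        """Bind an interrupt source to a specific core and VPE."""
        self._check_irq(irq)
        if not 0 <= core < self.num_cores:
            raise ValueError(f"core {core} is out of range")
        if not 0 <= vpe < self.vpes_per_core:
            raise ValueError(f"VPE {vpe} is out of range")
        self._routes[irq] = Target(core, vpe)

    def deliver(self, irq):
        """Return the Target for an interrupt, or None if it is unrouted."""
        self._check_irq(irq)
        return self._routes.get(irq)


router = InterruptRouter(num_interrupts=64, num_cores=2, vpes_per_core=2)
router.route(5, core=1, vpe=0)
print(router.deliver(5))
```

The point of the model is simply that routing is per interrupt and per VPE, so with two VPEs on each of two cores, software sees four distinct interrupt targets.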

Some example specs: for a system configured with two cores, each set up with two VPEs for multi-threading, plus caches, coherence manager, and global interrupt controller, the base cores can operate at 800MHz, achieving a DMIPS rating of >2400.  Using TSMC 65nm, 9-track, low Vt, the total area is about 3.8mm².
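A little arithmetic puts those numbers in perspective.  The sketch below assumes that the >2400 DMIPS figure is the aggregate for both cores (the article does not say so explicitly) and treats 2400 as a lower bound; the function name is hypothetical.

```python
def derive_metrics(cores, freq_mhz, total_dmips, area_mm2):
    """Derive per-core efficiency and area density from headline specs."""
    return {
        "dmips_per_mhz_per_core": total_dmips / (cores * freq_mhz),
        "dmips_per_mm2": total_dmips / area_mm2,
    }


# Figures quoted in the article; 2400 is a lower bound (">2400").
metrics = derive_metrics(cores=2, freq_mhz=800, total_dmips=2400, area_mm2=3.8)

print(f"DMIPS/MHz per core: at least {metrics['dmips_per_mhz_per_core']:.2f}")
print(f"DMIPS per mm^2:     at least {metrics['dmips_per_mm2']:.0f}")
```

Under that assumption, each core delivers at least 1.5 DMIPS/MHz, and the whole configuration works out to roughly 630 DMIPS per square millimeter.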

MIPS is a big player in the Linux club, and the new cores are friendly with open-source SMP Linux.  MIPS also has a complete software debug environment, including in-system debug from MIPS FS2.  FS2’s PDtrace is coherence-aware, for compatibility with the multi-core environment.  The software tools are GNU-based, including an Eclipse Navigator IDE.

The 1004K core is available now.  Early RTL was delivered in Q4 last year.
