ARM Dips Toe Into Configurability Pool

Cortex-M33 Adds Support for User-Created Instructions… Sort Of.

“My ghast was flabbered.” – Anthony Grayling

User-configurable microprocessors are a subject near and dear to me, so I was excited to hear that ARM, a company renowned for its iron-fisted control over its CPU architecture, was loosening its grip and allowing users to create their own custom instructions. Could it be? Had the company really joined the ranks of the user-configurable army pioneered by Tensilica, ARC, RISC-V, and others? 

Well… no, not really. 

While ARM will, for the first time, allow its licensees to create their own instructions to extend the processor’s functionality, it’s a far cry from the fully user-configurable CPUs we’ve seen from other companies. In fact, it’s about the least you could do and still be able to check off the “user-configurable” box on a developer’s wish list. 

Still, it’s a step in the right direction and an indication that “ARM is changing,” in the words of Thomas Ensergueix, the company’s Senior Director for Embedded, Automotive, and IoT Business. The company wants to see its nearly ubiquitous processors make inroads into edge devices that often have unusual, domain-specific tasks to accomplish. Sensor aggregation and motor control, for instance, aren’t well suited to generic CPU architectures. Customers often wind up using a DSP or an FPGA, or they design custom hardware – including user-defined CPUs. ARM wants to elbow into that business. 

ARM has supported coprocessors for a long time, but that’s not the same thing as new instructions. A coprocessor like an FPU or a DSP (also from ARM) operates alongside, and somewhat independently of, the main Cortex CPU. Coprocessors have the advantage of being able to run in parallel with the main CPU so that long-latency instructions (a floating-point divide, for example) don’t hang up the main processing pipeline. But they also have the disadvantage of operating at arm’s length, so to speak. Operands must be explicitly moved into and out of the coprocessor over the CPU bus, which takes time and burns energy. They’re like big peripherals that consume data and instructions. 

In contrast to all that, new user-created instructions will be part of the processor core and execute as part of its CPU instruction stream. New instructions have direct access to the CPU’s register file and don’t need to be spoon-fed operands like coprocessors do. This should permit faster and more efficient instructions that also use less energy. New instructions can be arbitrarily complex. Sounds like a win. 

But there are several limitations. 

For one, this newfound freedom applies only to the Cortex-M33. No other ARM processors support this new feature, nor will it be retroactively added to any of them. Even Cortex-M33 doesn’t really support it yet; that upgrade comes next year. Looking ahead, future Cortex-M designs will support user-defined instructions, but only Cortex-M designs. Although future Cortex-A or R-series processors might get this ability, I wouldn’t bet on it. 

Second, your user-defined extensions can’t access memory. Or control registers. They can access the processor’s main data registers (r0 through r12), which means you can do arithmetic or logic operations on register data, but not much else. You could implement your own bit-twiddle instructions, but only if they don’t need to touch memory. You can’t do flow control (no if/then/else), nor can you tweak the processor’s control registers or secure areas. Massage register data all you want, but ARM draws the line there. 
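
To make that restriction concrete, it helps to picture a custom instruction as a pure function of register values: it reads its operands from the register file, writes one result back, and touches nothing else. The Python sketch below is only a behavioral model built on that assumption. The names `bit_reverse32` and `execute_custom`, and the dictionary standing in for r0 through r12, are hypothetical illustrations and not any part of ARM's actual interface.

```python
MASK32 = 0xFFFFFFFF


def bit_reverse32(value):
    """Reverse the bit order of a 32-bit word (a typical bit-twiddle operation)."""
    value &= MASK32
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def execute_custom(regs, dest, src):
    """Model a hypothetical register-to-register custom instruction.

    It reads one data register and writes one data register. There is no
    path to memory, control registers, or the program counter.
    """
    allowed = {f"r{i}" for i in range(13)}  # r0 through r12 only
    if dest not in allowed or src not in allowed:
        raise ValueError("custom instructions may only use r0 through r12")
    updated = dict(regs)  # leave the caller's register file untouched
    updated[dest] = bit_reverse32(regs[src])
    return updated


regs = {f"r{i}": 0 for i in range(13)}
regs["r1"] = 0x00000001
regs = execute_custom(regs, dest="r0", src="r1")
print(hex(regs["r0"]))  # 0x80000000
```

Anything that fits this shape (registers in, register out, no side effects) is fair game. Anything that needs a load, a store, or a branch is not.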

On the plus side, whatever you do create is your property; ARM has no rights to the IP, and you could theoretically sell your handiwork to others if you want to. ARM hopes, somewhat optimistically, that third-party design houses might do a nice side business hot rodding Cortex-M processors with various custom ISA upgrades. That’s a fun idea, although history suggests it won’t happen. 

You’re also on your own for software development. None of ARM’s programming tools support user-defined instructions – obviously – nor do any third-party tools. Any instructions you create will have to be hand-coded using assembler or intrinsics. Same goes for debugging: nobody’s debugger will understand what your new FUBAR operation does or how it should behave, and most will treat it as either an illegal operation or as a coprocessor operation. (User-defined instructions overlay coprocessor functions in the opcode map.) Planned updates will allow the tools to tolerate user-created instructions, but they still won’t understand them. 

How many Cortex-M33 licensees will start creating their own instructions? Probably not very many. That’s not a slam against ARM’s implementation; it’s just the reality of user-configurable processors in general. Everybody likes the idea of souping up their processor and adding their own secret sauce, but few developers actually do it. Paradoxically, even the processors that are known for being user-configurable, such as ARC (from Synopsys) and Tensilica (from Cadence), are overwhelmingly used in their default “factory configuration.” Like an SUV that’s never taken off-road, it sounds like a great idea right up until it’s time to leave the beaten path. 

Which is a shame, because customizable processors have real, tangible advantages. It’s not all marketing glitter. A motor-control loop can benefit hugely from just one or two custom instructions (sine or cosine, for example) that aren’t part of the standard Cortex-M instruction set. Cryptography applications, sensor data acquisition, network filtering, and dozens of other obscure corners of the embedded world can all benefit from processors that are tweaked to handle their unique data types or oddball arithmetic operations. Tenfold performance improvements are not unheard of. Just eliminating the back-and-forth latency of the coprocessor interface can make a difference. 
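
As a rough illustration of the kind of operation a motor-control designer might fold into a single instruction, here is a behavioral sketch of a fixed-point sine built on a quarter-wave lookup table. Everything about it is an assumption made for the example: the Q15 output format, the 16-bit "fraction of a turn" angle, the 256-entry table, and the name `sin_q15`. It is not a description of any ARM or licensee hardware, only a picture of a register-in, register-out function that would otherwise cost many ordinary instructions.

```python
import math

TABLE_BITS = 8                 # hypothetical: 256 steps per quarter turn
TABLE_SIZE = 1 << TABLE_BITS
Q15_ONE = 32767                # largest positive Q15 value

# Quarter-wave table covering 0 to 90 degrees inclusive.
QUARTER_SINE = [
    round(math.sin(math.pi / 2 * i / TABLE_SIZE) * Q15_ONE)
    for i in range(TABLE_SIZE + 1)
]


def sin_q15(angle):
    """Return sin(angle) in Q15.

    The angle is a 16-bit fraction of a full turn:
    0x0000 is 0 degrees, 0x4000 is 90 degrees, 0x8000 is 180 degrees.
    """
    angle &= 0xFFFF
    quadrant = angle >> 14                          # top two bits pick the quadrant
    offset = (angle & 0x3FFF) >> (14 - TABLE_BITS)  # table index within the quadrant
    if quadrant & 1:                                # quadrants 1 and 3 mirror the table
        offset = TABLE_SIZE - offset
    value = QUARTER_SINE[offset]
    return -value if quadrant & 2 else value        # lower half of the circle is negative


print(sin_q15(0x4000))  # 32767
```

The function takes one register-sized input and produces one register-sized output, with no memory traffic beyond a small fixed table that could live inside the custom logic itself.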

And yet… ARM’s move feels like it’s only a very small step in that direction. On one hand, it’s a big philosophical change for a company that has always avoided branches in the family tree, mostly to ensure that (almost) all ARM processors are (mostly) binary compatible with one another. It’s a guiding principle that’s served the company well. Diverting from that path is kind of a big deal. 

There’s also the substantial engineering work that went into this thing. Designing a CPU that can tolerate tacked-on third-party hardware is no easy feat. ARM’s cores are pretty tightly optimized, but there’s no telling how sloppy a customer’s additions might be. Plus, ARM will take on a whole new support burden fielding calls from DIY processor architects tampering with its products. 

But this also feels like a bit of PR handwaving, a move intended to blunt the appeal of RISC-V, Tensilica, ARC, and other modular or customizable processors. Cortex-M33 is nowhere near as adjustable as those others, despite coming to the market 20 years later. But now ARM can credibly join the conversation. For designers who think they want user-configurability – whether they really use it or not – Cortex-M33 can now make the short list. 

One thought on “ARM Dips Toe Into Configurability Pool”

  1. It’s like the guy who invented universal solvent — could not package it to ship or sell.
    What is really needed is certainly not at the level of diddling registers.
    The overhead of moving data between memory and registers is a big part of the problem.
    Since the data comes from somewhere outside memory, the sensible thing is to at least do some processing as it moves through the input path.
    Putting raw data into memory just so it can be read from memory for processing is dumb.
