Tilera Gets Its Gonzo On

Go big or go home. That could be the company motto for Tilera, a Boston-based startup that makes the most gonzo processor you’ve probably ever seen.

Tilera’s new GX-100 chip contains one hundred – count ’em – identical microprocessors, each connected to the others through a massive terabit network. And these processors aren’t just wimpy little cores, either. They’re big 64-bit RISC machines, each with its own L1 and L2 caches, TLB, and 64-bit instruction set. Roughly speaking, each Tilera processor is about equivalent to a good PowerPC, MIPS, or Intel Xeon. In other words, a serious processor. In a big crowd of serious processors.

What would you do with this much horsepower in a single package? If you have to ask, this chip probably isn’t for you. Step aside, sonny, and let the real engineers do their work. But if your business card says Cisco, AT&T, Google, or Nokia, you probably have some good ideas about where this chip might fit. It’s aimed at high-end networking and telecommunications gear, products that have to massage a lot of data in record time.

Ushering in the Tile Era

As the company name suggests, Tilera’s big chip is made of “tiles,” or repeating arrays of processor, cache, and interconnect. The chip is homogeneous, in the sense that all 100 tiles are identical. It’s organized as a 10×10 array with each processor tile connecting directly to its neighbor to the north, south, east, and west. Processors along the edges of the chip connect to peripheral I/O (more on this later).

Each processor tile is happy to operate all on its own. In fact, they often do. After all, each one is a self-contained island with all the execution resources it needs, including two levels of local cache. If one processor needs to communicate with another, it takes only one clock cycle per “hop.” Naturally, you’d want to talk to your neighboring processors as much as possible, but even a trip straight across the chip, from one edge to the opposite edge, needs only about ten hops (the true worst case, corner to corner on a 10×10 grid of nearest-neighbor links, is closer to 18). That may sound like a lot, but even normal system buses on standard processor chips need 10 cycles for a bus transaction. Tilera’s chip can do dozens of these transactions all at once, from any processor to any processor.
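
To put rough numbers on those hops, here is a minimal Python sketch. It assumes messages travel only over the north/south/east/west links and take the shortest path, so the hop count between two tiles is simply their grid (Manhattan) distance. The function names are made up for illustration and are not part of any Tilera tool.

```python
GRID = 10  # the article's 10x10 tile array


def hops(src, dst):
    """Hop count between two (row, col) tiles, assuming shortest-path
    routing over nearest-neighbor links (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def worst_case_hops(grid=GRID):
    """Longest shortest path on the grid: one corner to the opposite corner."""
    return hops((0, 0), (grid - 1, grid - 1))


def average_hops(grid=GRID):
    """Mean hop count over all ordered pairs of distinct tiles."""
    tiles = [(r, c) for r in range(grid) for c in range(grid)]
    total = sum(hops(a, b) for a in tiles for b in tiles if a != b)
    return total / (len(tiles) * (len(tiles) - 1))


if __name__ == "__main__":
    print("neighbor to neighbor:", hops((4, 4), (4, 5)))
    print("straight across a row:", hops((0, 0), (0, GRID - 1)))
    print("corner to corner:", worst_case_hops())
    print("average over all pairs:", round(average_hops(), 2))
```

Under these assumptions, a straight run across one row is 9 hops and opposite corners are 18 apart. At one clock cycle per hop, that is the range of latencies a placement decision is playing with.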

The religiously regular nature of the GX-100 chip lends itself to task partitioning. If your task requires, say, six processors, you’ll probably want them next to each other in a 2×3 block. This shortens the communication paths among processors and avoids cutting out oddly shaped “holes” that might orphan some tiles in the 10×10 grid. There’s no requirement that cooperating processor tiles be contiguous; it’s just a good idea. With 100 tiles to work with, you can add or remove processing power in 1% increments.
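
The block-carving idea can be sketched in a few lines of Python. This is a hypothetical first-fit allocator, not Tilera’s software: the `find_block` and `claim_block` helpers and the boolean free-map are assumptions made for illustration only.

```python
GRID = 10  # the article's 10x10 tile array


def find_block(free, rows, cols):
    """Return the top-left (row, col) of the first fully free
    rows x cols rectangle, scanning row by row, or None if none fits."""
    for r in range(GRID - rows + 1):
        for c in range(GRID - cols + 1):
            if all(free[r + dr][c + dc]
                   for dr in range(rows) for dc in range(cols)):
                return (r, c)
    return None


def claim_block(free, rows, cols):
    """Mark a rows x cols rectangle as taken and return its tiles,
    or None if no contiguous rectangle of that shape is free."""
    origin = find_block(free, rows, cols)
    if origin is None:
        return None
    r, c = origin
    tiles = [(r + dr, c + dc) for dr in range(rows) for dc in range(cols)]
    for tr, tc in tiles:
        free[tr][tc] = False
    return tiles


if __name__ == "__main__":
    free = [[True] * GRID for _ in range(GRID)]
    network = claim_block(free, 2, 3)   # six tiles in a 2x3 block
    print("network tiles:", network)
    print("tiles still free:", sum(cell for row in free for cell in row))
```

Keeping each task in a rectangle means no two of its tiles are more than a few hops apart, and the leftover space stays in large, regular pieces rather than orphaned single tiles.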

Adjoining processor tiles can also ignore each other. In other words, the six tiles that handle your network protocols might have nothing to do with the adjacent ten tiles running the operating system. Or the neighboring four tiles handling the user interface. And so on. You can even duplicate tasks, so that two separate 20-tile blocks are both running Apache server software, with both being independent and unaware of each other.

How you actually partition your tasks is mostly up to you. Tilera doesn’t enforce any kind of rigor either in its hardware or its software. If you’re running a multiprocessing operating system (a few of which have been ported to Tilera), it will see the chip as 100 separate processor cores and will launch and kill threads as it sees fit. SMP Linux, for example, will move tasks around as they spawn and die, like a large-scale game of Life.

Under the Hood

If you’ve been following Tilera, you may be familiar with its tile-based approach to multiprocessing. But you’ll still be surprised at the processors themselves. The company has completely ditched the MIPS architecture it used in its previous chips (shipping since 2007) and designed its own 64-bit architecture from scratch. That means all new software tools, and it means existing Tilera code won’t run on the new chips.

Tilera doesn’t see this as much of a problem. For one, there aren’t that many existing Tilera customers around, so there isn’t much existing Tilera code to port. Second, what code there is was probably written in C, and Tilera’s new C compiler will handle the recompilation, no problem. It’s hard to imagine anyone hand-tweaking assembly code for such a beast, but, if you have, you’ll need to rewrite it for the new instruction set.

Paradoxically, Tilera touts the massive GX-100 chip as a “green” power-saving alternative to competing processors. Odd as it sounds, they may have a point. Even with 100 processors all running at 1.2 GHz or so, the chip dissipates about 55 watts. That’s not terrible, and a whole lot less than 100 (or even two) Intel Xeon chips would draw. The company claims the GX-100 is also more power-efficient than anything Cavium, Freescale, or RMI produces. At 55 watts, the GX-100 will need a good strong fan, but it won’t require exotic liquid cooling or its own power station.

So what do you hook this thing up to, apart from several stout power leads? Just about anything you want to, as long as it’s communications-related. The periphery of the GX-100 chip is peppered with every networking controller known to man, including XAUI (eight of them), Interlaken (two 10-lane interfaces), Gigabit Ethernet (32 of those), PCI Express (two 8-lanes and a 4-lane), DDR3 (four separate controllers), two independent crypto accelerators, plus the usual assortment of UARTs, USB, JTAG, I2C, SPI, and so on. What, no RS-232?

For the adventurous but less fiscally independent engineer, Tilera is also planning scaled-back versions of the GX-100 that have just 16, 36, or 64 cores. The number of I/O controllers diminishes with the reduction in tile count (mostly because there’s no room on the die or in the package), but otherwise these chips are identical to their big brother. They’re still awesome, just less awesome. 

For programmers, tackling a Tilera chip must be a daunting task. It’s an entirely new instruction set and processor architecture, combined with the vagaries of interprocessor communication, shared caches, and load balancing. It doesn’t have to be complex, but it could very rapidly become so. Dabbling with the 16-tile version may be the way to start. The Tilera chips are nothing if not scalable, so learning one means you’ve cracked them all. Or you could just try overclocking that 8051 for a few more years.
