feature article

## Lessons From the Old Clock Tower

Cyclos Brings the Past Into the Future

Back in the old days, they really knew how to make clocks. Their energy sources were less than ideal – usually a big tensioned spring or an elevated mass on a chain. They wanted whatever energy they stored there to last as long as possible, as it was usually recharged manually by humans. Their go-to solution was simple harmonic motion – usually in the form of a pendulum. As long as they tuned the resonant frequency of the pendulum to the frequency they needed for their clock, the system would tick and tock for days – at a steady pace – using very little of the precious stored energy.

If they’d tried to drive those pendulums at some frequency other than the resonant frequency (or a harmonic) – well, the results would have been pretty silly. The amount of energy required to push and pull that system would have gone up exponentially.

Fast forward a few centuries and we discover that we have not really learned to fully apply the simple lessons of those clockmakers of the past. When we design a complex SoC, we put a bunch of metal down in the form of clock trees, and we frantically pump charge in and out of the resulting big’ol capacitor with no regard to resonance. The result? Our clock trees use a whole bunch of power, and the resulting clock skew makes timing of our designs a nightmare.

Wouldn’t it be cool if we could apply some of the lessons of the old clockmakers? What if our clock lines were tuned to resonate at the exact frequency at which we were driving them? Physics 101 tells us that they’d ring – happily resonating at that frequency and using a tiny fraction of the energy required for a non-resonant clock tree.

This simple observation gave birth to a startup called “Cyclos.” Cyclos decided to take the capacitor we unintentionally create when we put down our clock lines and pair it with an inductor to make a parallel LC tank circuit – yep, the same one we all built in our first EE lab course. With a whopping 2 components, (both passive,) designing one of these things isn’t exactly rocket science. The results are predictable – a clock net tuned to resonate at our target frequency will run faster with less energy. What’s more, we can potentially then abandon our current notion of “clock trees,” with multiple active drivers at nodes along the way, and replace them with “clock meshes” – an idea for reducing clock skew that had been mostly discarded because of the huge power requirements.

A clock mesh is just about what the name implies – a grid of metal lines spanning the circuit layout. The good thing about a clock mesh is that it eliminates the active components from the clock tree and, as a result, almost totally eliminates clock skew. The bad thing about a clock mesh is that it’s a big capacitor that takes a lot of energy to charge and discharge at high frequencies… unless you tune it with an LC tank circuit.

Cyclos didn’t invent the clock mesh – they just made it practical again. If you lay down a clock mesh, and then choose the right inductor to fabricate so that the resulting resonant frequency matches your clock, it’s like free money. Your clock will ring away – just sipping coulombs from your battery, and your skew will be virtually eliminated – allowing your circuit to run faster with smaller guardbands limiting your timing freedom. That all sounds too good to be true, doesn’t it?

Usually, when something sounds that good, there are some troubling side effects that come back to rain on our parade. In this case, though, the biggest side effect is actually a bonus. Since a clock mesh is completely passive (even one tuned as an LC tank), we don’t have a problem with process variation affecting the performance. Variation in the performance of transistors doesn’t affect the clock tree if the clock tree doesn’t contain any – simple as that.

The downside, of course, is that you have to do the math and put the correct inductors in your design to make the magic happen. This requires a good deal of process knowledge, and, if you miss your target, you can end up with a big boat anchor attached to your clock lines. That’s not a very good reason for a re-spin, so you should be careful. Also, of course, inductors are pretty big. Although, the investment of a little silicon area is well worth it if you get to reap the benefits.

For those of us using FPGAs and similar technologies, it isn’t clear how we might take advantage of this simple idea. The clock tree is already built on our chip, so we can’t really plop down an inductor to our specifications to get our clock lines to resonate. The FPGA company can’t do it either – they don’t know what frequency our clocks will be, and, since clock lines are connected by programmable switches, the capacitance of the clock network can change on the fly. Furthermore, in FPGAs, there are other power hogs hanging around, so the clock tree doesn’t account for as much of the overall power consumption and hence isn’t as large a target for power savings.

For designs that can take advantage of this technique, though, Cyclos claims that power savings of 25%-35% have been achieved on things like ARM cores, with a die area increase of 4%-5% (to account for the clock mesh and the inductors). They claim that users have achieved 5%-10% more performance to boot, as a result of reduced skew and the resulting increased timing slack. As clock frequencies ascend into the gigahertz, the benefits get larger – as traditional clock tree schemes get more complex to create, skew increases, and clock tree power consumption turns skyward.

It will be interesting to see if Cyclos gets traction in the custom IC world, and whether or how the benefits of this old and simple yet revolutionary idea find their way into a broader section of design practices. It’s worth watching!

## 3 thoughts on “Lessons From the Old Clock Tower”

1. kevin says:

Is tuning clock trees with LC tanks just too simple and elegant to be true? Why haven’t we been doing this all along? Can we extend the application range of this idea?

2. gabor@alacron.com says:

In the second to last paragraph about clock frequencies increasing into the gigaHertz range you forgot to mention the additional benefit of the smaller associated inductor (inductance should relate to one over square root of frequency for the same capacitance).

3. kevin says:

@gabor,

True, so the higher frequencies should benefit more from the technique, and it should have less overhead at those frequencies as well. Double win.

featured blogs
Aug 16, 2018
Learn about the challenges and solutions for integrating and verification PCIe(r) Gen4 into an Arm-Based Server SoC. Listen to this relatively short webinar by Arm and Cadence, as they describe the collaboration and results, including methodology and technology for speeding i...
Aug 16, 2018
All of the little details were squared up when the check-plots came out for "final" review. Those same preliminary files were shared with the fab and assembly units and, of course, the vendors have c...
Aug 15, 2018
VITA 57.4 FMC+ Standard As an ANSI/VITA member, Samtec supports the release of the new ANSI/VITA 57.4-2018 FPGA Mezzanine Card Plus Standard. VITA 57.4, also referred to as FMC+, expands upon the I/O capabilities defined in ANSI/VITA 57.1 FMC by adding two new connectors that...
Aug 14, 2018
I worked at HP in Ft. Collins, Colorado back in the 1970s. It was a heady experience. We were designing and building early, pre-PC desktop computers and we owned the market back then. The division I worked for eventually migrated to 32-bit workstations, chased from the deskto...
Jul 30, 2018
As discussed in part 1 of this blog post, each instance of an Achronix Speedcore eFPGA in your ASIC or SoC design must be configured after the system powers up because Speedcore eFPGAs employ nonvolatile SRAM technology to store its configuration bits. The time required to pr...