Back in the old days, they really knew how to make clocks. Their energy sources were less than ideal – usually a big tensioned spring or an elevated mass on a chain. They wanted whatever energy they stored there to last as long as possible, as it was usually recharged manually by humans. Their go-to solution was simple harmonic motion – usually in the form of a pendulum. As long as they tuned the resonant frequency of the pendulum to the frequency they needed for their clock, the system would tick and tock for days – at a steady pace – using very little of the precious stored energy.
It was all about resonance.
If they’d tried to drive those pendulums at some frequency other than the resonant frequency (or a harmonic) – well, the results would have been pretty silly. The amount of energy required to push and pull that system would have gone up exponentially.
Fast forward a few centuries and we discover that we have not really learned to fully apply the simple lessons of those clockmakers of the past. When we design a complex SoC, we put a bunch of metal down in the form of clock trees, and we frantically pump charge in and out of the resulting big’ol capacitor with no regard to resonance. The result? Our clock trees use a whole bunch of power, and the resulting clock skew makes timing of our designs a nightmare.
Wouldn’t it be cool if we could apply some of the lessons of the old clockmakers? What if our clock lines were tuned to resonate at the exact frequency at which we were driving them? Physics 101 tells us that they’d ring – happily resonating at that frequency and using a tiny fraction of the energy required for a non-resonant clock tree.
This simple observation gave birth to a startup called “Cyclos.” Cyclos decided to take the capacitor we unintentionally create when we put down our clock lines and pair it with an inductor to make a parallel LC tank circuit – yep, the same one we all built in our first EE lab course. With a whopping 2 components, (both passive,) designing one of these things isn’t exactly rocket science. The results are predictable – a clock net tuned to resonate at our target frequency will run faster with less energy. What’s more, we can potentially then abandon our current notion of “clock trees,” with multiple active drivers at nodes along the way, and replace them with “clock meshes” – an idea for reducing clock skew that had been mostly discarded because of the huge power requirements.
A clock mesh is just about what the name implies – a grid of metal lines spanning the circuit layout. The good thing about a clock mesh is that it eliminates the active components from the clock tree and, as a result, almost totally eliminates clock skew. The bad thing about a clock mesh is that it’s a big capacitor that takes a lot of energy to charge and discharge at high frequencies… unless you tune it with an LC tank circuit.
Cyclos didn’t invent the clock mesh – they just made it practical again. If you lay down a clock mesh, and then choose the right inductor to fabricate so that the resulting resonant frequency matches your clock, it’s like free money. Your clock will ring away – just sipping coulombs from your battery, and your skew will be virtually eliminated – allowing your circuit to run faster with smaller guardbands limiting your timing freedom. That all sounds too good to be true, doesn’t it?
Usually, when something sounds that good, there are some troubling side effects that come back to rain on our parade. In this case, though, the biggest side effect is actually a bonus. Since a clock mesh is completely passive (even one tuned as an LC tank), we don’t have a problem with process variation affecting the performance. Variation in the performance of transistors doesn’t affect the clock tree if the clock tree doesn’t contain any – simple as that.
The downside, of course, is that you have to do the math and put the correct inductors in your design to make the magic happen. This requires a good deal of process knowledge, and, if you miss your target, you can end up with a big boat anchor attached to your clock lines. That’s not a very good reason for a re-spin, so you should be careful. Also, of course, inductors are pretty big. Although, the investment of a little silicon area is well worth it if you get to reap the benefits.
For those of us using FPGAs and similar technologies, it isn’t clear how we might take advantage of this simple idea. The clock tree is already built on our chip, so we can’t really plop down an inductor to our specifications to get our clock lines to resonate. The FPGA company can’t do it either – they don’t know what frequency our clocks will be, and, since clock lines are connected by programmable switches, the capacitance of the clock network can change on the fly. Furthermore, in FPGAs, there are other power hogs hanging around, so the clock tree doesn’t account for as much of the overall power consumption and hence isn’t as large a target for power savings.
For designs that can take advantage of this technique, though, Cyclos claims that power savings of 25%-35% have been achieved on things like ARM cores, with a die area increase of 4%-5% (to account for the clock mesh and the inductors). They claim that users have achieved 5%-10% more performance to boot, as a result of reduced skew and the resulting increased timing slack. As clock frequencies ascend into the gigahertz, the benefits get larger – as traditional clock tree schemes get more complex to create, skew increases, and clock tree power consumption turns skyward.
It will be interesting to see if Cyclos gets traction in the custom IC world, and whether or how the benefits of this old and simple yet revolutionary idea find their way into a broader section of design practices. It’s worth watching!