I want you to put your “imagining hat” on. If you don’t own an “imagining hat,” you’ll just have to imagine that you have one proudly perched on the top of your head. Let’s start by imagining an accelerator card whose core processing device demands 2,000 amps (my eyes are already watering).
Now imagine multiple accelerator cards on a motherboard in a cabinet or server tray, multiple cabinets or trays in a rack, and tens of thousands of these racks in a data center (a large data center can have 10,000 racks or more, while a hyperscale data center can have 50,000 racks or more). All I can say is that we are talking about a lot of amps, a lot of power, a lot of heat, and a lot of money.
As an aside, and as I may have mentioned before, my first job circa 1980 was as a member of a design team at International Computers Limited (ICL). We were creating CPUs for mainframe computers. The first computer I worked on consumed 2,000 amps at 5 volts. We could tweak the voltage using a ginormous rheostat (like a 2-terminal steroid-pumping version of a 3-terminal potentiometer).
That was 2,000 amps for the entire computer. The thought of 2,000 amps on a single accelerator card (which itself isn’t something we would have envisaged) would have had us rolling around on the floor in laughter (or gibbering in the corner in terror). We all thought power consumption was going to go down over the years. In a way, we were correct—power consumption has indeed gone down—but only if we think in terms of units of WPOPCI (that’s “watts-per-operation-per-cubic-inch,” which is a unit I just made up). Just be thankful we aren’t obliged to work in units of CPCPC (that’s “coulombs-per-cogitation-per-congius”) like my dear old dad.
The problem is that today’s processors have dramatically shrunk in physical size. At the same time, however, they are now performing trillions of operations per second. So, even though each operation consumes a minuscule amount of energy, when we are doing trillions of operations in a small amount of space, the power (and heat) starts to mount.
There were many interesting sessions at DesignCon 2025, which took place just last week as I pen these words. The one I’m currently thinking of was a panel titled: Power Delivery Strategies for AI and Data Center ASICs: Horizontal vs. Vertical.
This started by referencing a CNBC article, Data Centers Powering Artificial Intelligence Could Use More Electricity Than Entire Cities, from which we learn that (a) Google, Amazon, Microsoft, and Oracle are all investing in nuclear solutions to power their data centers; (b) Oracle is building a gigawatt data center powered by three small nuclear reactors; and (c) a gigawatt data center running at 85% of its peak demand consumes about as much electricity as 710,000 U.S. households (roughly 1.8 million people), which implies that data centers will indeed consume more power than small cities.
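If you’re wondering where that 710,000 figure comes from, the arithmetic works out if we assume an average U.S. household consumes around 10,500 kWh per year (call it ~1.2kW of continuous draw). The following back-of-the-envelope Python sketch is mine, not CNBC’s, and that household figure is an assumption on my part:

```python
# Where does "710,000 households" come from? A sanity check, assuming an
# average U.S. household consumes roughly 10,500 kWh per year (about 1.2 kW
# of continuous draw). That household figure is my assumption, not a number
# quoted in the article.

gigawatt_w = 1e9
utilization = 0.85                          # 85% of peak demand, as quoted
avg_household_w = 10_500 * 1_000 / 8_760    # kWh per year -> average watts

households = gigawatt_w * utilization / avg_household_w
print(f"~{households:,.0f} households")     # ~709,000 -- close to the quoted 710,000
```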
Also referenced was a Goldman Sachs article, AI is Poised to Drive 160% Increase in Data Center Power Demand, from which we learn that (a) ChatGPT queries consume ~10x as much electricity as traditional Google searches and (b) AI is expected to account for ~19% of total data center power demand by 2028. Also, at CES 2025, Nvidia showcased its Blackwell GPU, which boasts two GPU chiplets with a 10 terabyte-per-second (TBps) link enabling them to function as a single processor. The B100 boasts 8 HBM3E stacks, while the B200 flaunts 12 HBM3E stacks. An Nvidia DGX B200 chassis with eight B200 Blackwell GPUs will consume roughly 15kW! A rack of these will consume around 60kW! Oh, my stars!
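To put those chassis and rack numbers into some sort of perspective, here’s another quick back-of-the-envelope sketch. The four-chassis-per-rack count and the 10,000-rack scale-up are my own illustrative assumptions, not Nvidia’s published figures:

```python
# Back-of-the-envelope tally based on the figures quoted above: ~15 kW per
# DGX B200 chassis of eight B200 GPUs. The four-chassis-per-rack count (which
# is what the ~60 kW rack figure implies) and the 10,000-rack scale-up are
# my own illustrative assumptions.

GPUS_PER_CHASSIS = 8
CHASSIS_POWER_KW = 15.0      # quoted above for a DGX B200 chassis
CHASSIS_PER_RACK = 4         # assumed: 60 kW / 15 kW

per_gpu_kw = CHASSIS_POWER_KW / GPUS_PER_CHASSIS
rack_kw = CHASSIS_POWER_KW * CHASSIS_PER_RACK

print(f"~{per_gpu_kw:.1f} kW per GPU slot (chassis power / 8, so it also covers CPUs, NICs, fans, etc.)")
print(f"~{rack_kw:.0f} kW per rack")

# Scale it up: a hypothetical data center with 10,000 such racks
racks = 10_000
print(f"~{rack_kw * racks / 1e6:.1f} GW for {racks:,} such racks (illustrative only)")
```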
Just to remind ourselves before we plunge headfirst into the fray with gusto and abandon (and aplomb), a very high-level summary of recent Nvidia GPU architectures is as follows:
Volta (2017): V100
Ampere (2020): A100
Hopper (2022/2023): H100
Blackwell (2024/2025): B100, B200
So, what can we do about this increasing demand for power (other than running gibbering into the night)? I’m glad you asked. I was just chatting with Eelco Bergman, who is Chief Business Officer at Saras Micro Devices. Although Saras was founded only four short years ago in 2021, it seems poised to make its presence felt.
We commenced by looking at an image of an H100-based accelerator card. As shown below, the H100 processor itself is the big device in the middle of the board. All the areas bounded in green (which essentially means the rest of the board) contain power delivery and conversion components.
H100-based accelerator card with power conversion and delivery components bounded in green (Source: Saras Micro Devices)
The area in the upper right-hand corner takes 48V coming in and converts it to 12V. All the other areas step this 12V down to the point-of-load voltages. The smaller packages are semiconductor converters, and the larger packages are inductors. The back of the board is festooned with capacitors.
In this case, we are looking at >1,000W for the board. If we assume an average point-of-load voltage of 1V, this equates to 1,000A. This is an example of traditional horizontal power delivery; that is, the power is conveyed from the converters to the processor via long horizontal traces, thereby incurring significant IR losses, which equate to wasted heat that needs to be removed.
And things are getting worse because core voltages are dropping. For example, 1,000W at 0.7V = ~1,500A. When we combine this with next-generation processing capability (the Blackwell boasts 208 billion transistors—excluding the HBM stacks), we will be looking at 2,000A before you know it.
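The underlying arithmetic is simply I = P / V, so as the core voltage falls, the current climbs. Here’s a minimal sketch (the list of voltages is mine, chosen purely for illustration):

```python
# The arithmetic behind these numbers is simply I = P / V. The 1,000 W board
# power is quoted above; the list of core voltages is mine, chosen to show
# how the current climbs as voltages drop.

board_power_w = 1_000.0

for core_voltage_v in (1.0, 0.8, 0.7, 0.5):
    current_a = board_power_w / core_voltage_v
    print(f"{board_power_w:.0f} W at {core_voltage_v:.1f} V -> ~{current_a:,.0f} A")

# 1000 W at 1.0 V -> ~1,000 A
# 1000 W at 0.7 V -> ~1,429 A (the "~1,500 A" mentioned above)
```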
This is where the guys and gals at Saras leap onto the center of the stage with a ballyhoo of basset horns (a sound that lingers long after the players have retreated). Their goal is to take the discrete inductors and capacitors that consume so much valuable board real estate and embed them in the board or package substrates instead, thereby paving the way for vertical power delivery, whose much shorter paths dramatically reduce IR losses.
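Why do shorter paths matter so much? Because resistive losses scale as I²R, so at 1,000A every micro-ohm counts. The following sketch uses made-up placeholder resistances (real power delivery network values depend on plane thickness, via counts, and layout) purely to illustrate the trend:

```python
# Illustrative-only comparison of resistive (I^2 * R) losses for horizontal
# vs. vertical power delivery. The resistance values are made-up placeholders
# chosen only to show the trend; real PDN resistances depend on plane
# thickness, via counts, trace lengths, and so on.

current_a = 1_000.0   # point-of-load current from the example above

paths = {
    "horizontal (long lateral traces/planes)": 200e-6,  # assumed 200 micro-ohms
    "vertical (short through-substrate path)":  20e-6,  # assumed 20 micro-ohms
}

for name, resistance_ohm in paths.items():
    loss_w = current_a ** 2 * resistance_ohm
    print(f"{name}: ~{loss_w:.0f} W lost as heat")

# Because losses scale with the square of the current, every extra micro-ohm
# in the delivery path hurts far more at 1,000 A than it ever did at 100 A.
```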
Other companies have tried embedding discrete components in substrates before. The difference here is that the chaps and chapesses at Saras have developed something they call the Saras Tile, or STILE.
Introducing the STILE (Source: Saras Micro Devices)
Each STILE can comprise multiple high-capacity capacitors and/or inductors. Now, rather than embedding discrete components in the substrate, you embed STILEs. Since each application is different, the folks at Saras work with their users to create custom STILEs (width, length, thickness, and the numbers/types/values of capacitive/inductive elements). In addition to being placed side by side, capacitors and inductors can also be stacked on top of each other and connected in series. The STILE has double-sided contacts, thereby providing a true 3D pass-through.
There are multiple STILE deployment options as illustrated below. These include being embedded in the printed circuit board (PCB), in the processor package, and in the power modules.
There are multiple STILE deployment options (Source: Saras Micro Devices)
Well, I don’t know about you, but this has certainly given me much to meditate on. I’m currently working on a hobby project whose modest 1.5A requirement is giving me more than enough to worry about, so I’m still reeling at the thought of 1,000A on an accelerator board, with 1,500A and 2,000A heading our way.
How about you? Do you have any cogitations and ruminations you’d care to share with the rest of us? As always, I welcome your captivating comments.