feature article

Lattice Leaps Ahead

Launches New ECP4 Mid-Range FPGA

One might think that a launch of a new 65nm FPGA family these days would be an anachronism.  After all, haven’t we been flooded for over a year with news about super-high-density 28nm and even 22nm FPGA families with logic cell counts into the millions?  We’ve all been trained by Mr. Moore for over four decades to believe in the mantra that Moore is more, and therefore the next process node is always better than the last one in terms of cost, performance, and power consumption.

Why, then, is Lattice coming out and bragging about a new mid-range FPGA family (dubbed ECP4) that is fabricated with creaky-old 65nm process technology?  Aren’t they embarrassed to be doing that in public?  Didn’t 65nm go out of style with Hammer pants? 

Ah, not so fast there, MC Process Node… 

It turns out that newer is not always better when it comes to semiconductor processes.  In fact, for quite a while after their first volume production, newer, smaller geometries have a lot of properties that make them worse than older, more mature processes.  First, it stands to reason that good yield takes a while to achieve on any new process, while older processes just keep yielding better and better.  If one also factors in other contributors like the dramatically higher cost of the fabrication equipment that must be amortized over a production run, it takes quite a while for a newer, smaller process node (like 28nm) to overtake an older, more mature one (like 65nm) in terms of cost per working device, even though each die obviously consumes far less silicon.
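
To make the yield argument concrete, here is a minimal sketch of cost per good die. Every number in it (wafer prices, die areas, yields) is hypothetical and chosen only to show the shape of the tradeoff; none of it comes from Lattice or any foundry, and the function name is ours.

```python
import math

def cost_per_good_die(wafer_cost, die_area_mm2, yield_fraction, wafer_diameter_mm=300):
    """Wafer cost spread over the dies that actually work (edge loss ignored)."""
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2
    gross_dies = wafer_area_mm2 / die_area_mm2
    return wafer_cost / (gross_dies * yield_fraction)

# Hypothetical figures: a cheap, high-yield mature wafer versus a pricier
# new-node wafer whose smaller die starts out with poor yield.
mature_node = cost_per_good_die(wafer_cost=3000, die_area_mm2=100, yield_fraction=0.90)
new_node_early = cost_per_good_die(wafer_cost=6000, die_area_mm2=40, yield_fraction=0.30)
new_node_later = cost_per_good_die(wafer_cost=6000, die_area_mm2=40, yield_fraction=0.90)

print(f"mature node:           {mature_node:.2f} per good die")
print(f"new node, early yield: {new_node_early:.2f} per good die")
print(f"new node, later yield: {new_node_later:.2f} per good die")
```

With these made-up inputs, the smaller die loses on cost while its yield is poor and wins only once yield catches up, which is the crossover the paragraph above describes.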

Lattice’s new ECP4 family has a lot to brag about, actually.  Compared with its predecessor, the highly successful ECP3, the new family has 10x the hard IP, 7x more DSP, 2x faster SerDes, 66% more logic (up to 250K LUTs), and 33% faster DDR interfaces.  Lattice is going after a particular segment of the communications and networking market with these devices, and their feature set reflects that.  One could compare these new low-cost ECP4 FPGAs to the flagship Virtex and Stratix devices from Xilinx and Altera only a couple of years back.  Communications companies that previously designed those expensive devices into their equipment and are now looking for a cost reduction in the FPGA part should flock to ECP4 in droves.

When comparing Lattice’s new family with competitive offerings fabricated on newer processes, it is interesting to see Lattice’s strategy at play.  When designing on a new, advanced semiconductor process, FPGA companies tend to optimize their architecture for the largest/fastest devices and then scale down from there to their mid-range and low-cost devices.  Lattice optimizes their design specifically for these mid-range, low-cost devices in the first place, which they claim results in lower cost and lower power consumption than similar-sized FPGAs from competitors. 

In the FPGA market, as you know well if you use them, cost is a fuzzy thing.  While FPGA companies design with cost as a major design goal, the amount any given customer pays for any given FPGA is more akin to the prices of airline tickets.  On paper, however, it looks like Lattice’s devices should cost you less than comparably sized FPGAs from other vendors – if all distribution variables are equal.  For example, since Lattice is designing for a maximum density of about 250K LUTs, they stayed with a LUT4 architecture.  The tradeoff between LUT4 and wider logic elements comes down to logic density versus routing density.  Looking at logic alone, building your design out of a larger number of smaller cells has proven to be inherently more efficient than trying to use a smaller number of wider cells.  Too often, space in the wider LUT is wasted when only a 2- or 3-input function is required.  However, when you take routing resources into account, particularly on larger FPGAs, it turns out that a wider logic cell uses fewer resources overall.  Lattice is designing smaller FPGAs, so they claim it is more efficient for them to use LUT4.  Other vendors are optimizing for larger FPGAs, so they use wider (LUT6-ish) cells.
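
The wasted-space point can be put in rough numbers. A k-input LUT is a truth table with 2^k entries, so a small function dropped into a wide LUT leaves most of the table unused. The sketch below is a crude proxy under our own assumptions: it counts truth-table bits only, ignores routing entirely, and ignores the fact that some wider-LUT architectures can split one cell into smaller ones. The function names are ours.

```python
def lut_table_bits(lut_inputs):
    """A k-input LUT stores one output bit per input combination: 2**k bits."""
    return 2 ** lut_inputs

def table_utilization(function_inputs, lut_inputs):
    """Fraction of the LUT's truth table that a smaller function actually needs."""
    if function_inputs > lut_inputs:
        raise ValueError("function does not fit in a single LUT of this width")
    return lut_table_bits(function_inputs) / lut_table_bits(lut_inputs)

for lut_inputs in (4, 6):
    for function_inputs in (2, 3):
        used = table_utilization(function_inputs, lut_inputs)
        print(f"{function_inputs}-input function in a LUT{lut_inputs}: "
              f"{used:.1%} of the table used")
```

A 3-input function needs half of a LUT4's table but only one eighth of a LUT6's. That is the logic-side cost of wider cells, before the routing savings they bring on large devices are counted.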

Also, packaging is less expensive in ECP4 than in most mid-range FPGAs because of Lattice’s choice to stick with wirebond packaging.  “Aha! What about multi-gigabit SerDes?” (We hear you ask…) Lattice claims to have optimized their SerDes with the expectation of wirebond packaging, whereas most competitive offerings assume the device will be packaged in a flip chip package.  Lattice still offers flip chip options, however, which can improve signal integrity enough to make the leap from chip-to-chip to backplane applications.  Since ECP4’s SerDes tops out at 6 Gbps, the company claims excellent signal integrity at those data rates – even using wirebond packaging.  ECP4 offers four to sixteen channels of 6 Gbps SerDes, which is plenty for most of the targeted communications applications.  This kind of “enough is more” thinking is obvious across the board with this new family.  Lattice designed these FPGAs to solve very specific market needs without a lot of extras.  Therefore, if your design will work with one of these devices, it will probably cost less than with a competitive FPGA.  As we said above, however, your mileage may vary a lot depending on how and where you buy yours.

On the power front, because Lattice’s devices are fabricated on 65nm technology, they should inherently have lower leakage current than the same device on a smaller process.  The tradeoff comes with dynamic power, where the higher operating voltages of the older process mean more switching power.  By designing SerDes to top out at 6 Gbps, the company claims they were also able to choose a design that optimizes power consumption for that performance point.  When you blend all these factors together, subtract out the power-specific optimizations done by all FPGA companies, and stir in the enormous number of variables affecting power consumption in an FPGA design, Lattice claims lower overall power consumption than competitive devices of the same size across the ECP4 family.
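
The voltage side of that tradeoff follows from the textbook CMOS switching-power estimate, P = activity x C x V^2 x f. The sketch below uses that general formula, not anything Lattice has published. The two supply voltages, the capacitance, the activity factor, and the clock rate are illustrative assumptions, and holding capacitance equal across the two cases is a deliberate simplification.

```python
def dynamic_power(activity, capacitance_farads, voltage, frequency_hz):
    """Textbook CMOS switching-power estimate: activity * C * V^2 * f (watts)."""
    return activity * capacitance_farads * voltage ** 2 * frequency_hz

# Illustrative assumptions only: same activity, capacitance, and clock for
# both cases, so the supply voltage is the only thing that differs.
higher_voltage = dynamic_power(activity=0.15, capacitance_farads=1e-9,
                               voltage=1.2, frequency_hz=200e6)
lower_voltage = dynamic_power(activity=0.15, capacitance_farads=1e-9,
                              voltage=1.0, frequency_hz=200e6)

ratio = higher_voltage / lower_voltage
print(f"switching power ratio, 1.2 V vs 1.0 V: {ratio:.2f}x")
```

Because voltage enters as a square, a 20% higher supply costs about 44% more switching power in this simplified comparison. That is why leakage savings alone do not settle the question.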

ECP4 gets a big boost in DSP power compared with previous families.  In addition to almost doubling the number of multipliers, the company added “booster logic,” which allows the DSP blocks to be clocked in “double data rate” mode.  The new DSP blocks also include “pre-adder logic,” which the company claims can double DSP performance in common functions like FIR filters. 
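
Why a pre-adder helps FIR filters: when the coefficients are symmetric, as in common linear-phase designs, two samples share each coefficient, so they can be added first and multiplied once. The sketch below is a generic illustration of that arithmetic in software, not a model of Lattice's DSP block. The function names and test values are ours.

```python
def fir_direct(samples, coeffs):
    """One FIR output the plain way: one multiply per tap."""
    return sum(c * x for c, x in zip(coeffs, samples))

def fir_preadd(samples, coeffs):
    """Same output for symmetric coeffs: add mirrored samples, then multiply once."""
    taps = len(coeffs)
    half = taps // 2
    acc = 0
    for i in range(half):
        acc += coeffs[i] * (samples[i] + samples[taps - 1 - i])  # pre-add, then multiply
    if taps % 2:
        acc += coeffs[half] * samples[half]  # centre tap has no mirror partner
    return acc

def multiplies_needed(taps, use_preadder):
    """Multiplier count per output sample for a symmetric FIR."""
    return (taps + 1) // 2 if use_preadder else taps

coeffs = [1, 3, 5, 3, 1]    # symmetric, hypothetical 5-tap filter
samples = [2, 4, 6, 8, 10]  # hypothetical window of input samples

print(fir_direct(samples, coeffs), fir_preadd(samples, coeffs))
print(multiplies_needed(8, use_preadder=False), multiplies_needed(8, use_preadder=True))
```

Both functions return the same result, but an 8-tap symmetric filter needs 4 multiplies per output instead of 8. That matches the "double DSP performance" claim for this class of function.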

With ECP4, Lattice has also continued their strategy of kinda-hard IP blocks called “MACO” blocks – where hardened versions of specific IP blocks are included in a metal-programmed portion of the device.  MACO blocks do the heavy lifting on functions like communications protocols, kicking up performance, reducing power consumption, and saving the programmable fabric for other uses.  Functions like PCIe 2.1 x4, SRIO x4, Tri-Speed MACs, and 10Gb MACs are included in various devices as MACO blocks.  Lattice says that hardening these functions reduces power and cost for these blocks by an estimated 90%, and saves over 100K LUTs on the largest devices.

By the numbers, ECP4 offers 6 device sizes ranging from 33K to 241K LUTs, with 1.2 to 10.6 Mbits of embedded memory.  On the DSP front, the devices feature from 64 to 576 18×18 multipliers, which the company claims provide capability equivalent to 256 to 2304 of the previous generation (ECP3) multipliers.  Maximum user IO ranges from 224 to 512 pins, with 4 to 16 SerDes transceivers and 18 to 40 1.25 Gbps CDRs.  The devices can support up to 1066 Mbps DDR3 as well – allowing for the use of fast commodity memory in many applications.
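
A quick bit of arithmetic on those figures shows what the claims imply. The sketch below only divides and multiplies the numbers quoted above. Taking the low and high ends of each range together is our assumption, not a statement about which SerDes count ships with which device, and the SerDes totals are raw line rate with no allowance for encoding overhead.

```python
SERDES_GBPS = 6  # per-channel top rate quoted for ECP4

# Low and high ends of the ranges quoted in the article.
range_ends = {
    "low end":  {"multipliers": 64,  "ecp3_equivalent": 256,  "serdes_channels": 4},
    "high end": {"multipliers": 576, "ecp3_equivalent": 2304, "serdes_channels": 16},
}

for label, figures in range_ends.items():
    dsp_factor = figures["ecp3_equivalent"] / figures["multipliers"]
    raw_serdes_gbps = figures["serdes_channels"] * SERDES_GBPS
    print(f"{label}: each multiplier claimed worth {dsp_factor:.0f} ECP3 multipliers, "
          f"{raw_serdes_gbps} Gbps raw SerDes")
```

Both ends work out to the same implied factor of four per multiplier. That is consistent with a 2x gain from double-rate clocking stacked on a 2x gain from the pre-adder, though the article does not break the claim down that way.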

On the tools side, version 1.4 of Lattice’s “Diamond” design tool suite supports ECP4, and it is in beta now with selected customers.  Device samples are expected in the first half of 2012, with production following in the second half.  In recent years, Lattice has had an excellent track record of delivering what they announced on the schedule they announced, so we’d have a high degree of confidence in these schedule estimates. 

All of these carefully constructed compromises in ECP4 allow Lattice to deliver most of the capabilities of high-end FPGAs at price points closer to low-cost FPGAs.  By focusing specifically on cost-reduced applications like remote radio heads, cellular base stations, and video signal processing, the new devices should score some big wins with customers in those areas for whom cost and power consumption are constantly dueling priorities.  If your design could benefit from the unique capabilities of ECP4, don’t be put off by thinking it’s un-cool to be using a 65nm chip.  These days, retro is in.

2 thoughts on “Lattice Leaps Ahead”

  1. I wonder if the cost argument holds here. Given these devices are targeted at telecom applications, and those projects take years to go to full production, one ought to ask if 65-nm will really be cost competitive 2-3 years down the road…
