Tools and Transceivers

Dual Xilinx Announcements

In the old days, we had only two kinds of FPGA – small and smaller, also known as slow and slower, or hot and hotter…  The technology was useful for a few high-value applications, but limits on all three fronts – density (cost), speed, and power – kept it from attacking a much broader market.  As we waltzed along to the three-count meter of Moore’s Law, all three critical parameters improved.  Density went up, clocks got faster, power cooled down, and people got happier.

After a few rounds, however, FPGAs had pretty well saturated the bigger, faster, cooler crowd.  In order to reach a broader audience now, we needed to teach our favorite semiconductors some new tricks.  For the DSP people, we designed hardened high-speed multipliers.  For connectivity hogs, we grafted on multi-gigabit serial transceivers.  For embedded designers, we dropped in a processor core or two – and all of those people needed big blocks of memory to make effective use of the new features. 

The addition of all those features, however, gave us the FPGA that was something akin to a semiconductor Swiss Army Knife.  It had the features required to do just about any job you could name.  Unfortunately, almost no project required all the features, so you ended up buying (and powering) a bunch of extra features that you didn’t need.  Ever use the toothpick feature on your pocketknife?  Nope.  Me neither.

Luckily, FPGA vendors didn’t like throwing away money any more than we did, so they came back with a variety of different FPGA families with different mixtures of the features we might require for different applications.  Beginning with their Virtex-4 generation, Xilinx divided their line into several segments.  Need a bunch of plain-old LUT fabric?  LX was for you – lots of LUTs and not much else.  Doing DSP?  The SX family had more multipliers and memory in the mix.  Piping big amounts of data around?  FX included gigabit serial transceivers and PowerPC processors so you could blast bits anywhere you wanted them at a furious pace. 

Now, with 65nm Virtex-5, we’ve got even more options to choose from.  LX is still there, but even grade-school kids need SerDes these days, so Xilinx has added LXT and SXT – the Lotsa LUTs and DSP-dialed versions with a few SerDes transceivers thrown in.  All of these families contain the whole range of features, but each has a different mixture, making it better suited to particular classes of applications. 

Xilinx has just rolled out their newest, biggest, baddest member of the Virtex-5 generation, the FXT.  FXT has all the goodies and the kitchen sink.  Its distinguishing features are faster transceivers than the other “T” families and built-in hard-wired PowerPC cores.  New are the PowerPC 440, more robust RocketIO transceivers, and a whole buncha built-in memory. 

One of the most common complaints about FPGA-based processors is a lack of performance, so Xilinx worked to build in a significant performance boost with FXT.  The platform has up to two PPC 440 processor cores, each with 32KB instruction and 32KB data caches.  If you’re down with Dhrystones, Xilinx claims that each core can deliver up to 1100 DMIPS at 550MHz – considerably more oomph than previous-generation FPGA-based processors. 
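For a quick sanity check on that claim, the quoted figures can be normalized by clock rate.  The following is a minimal Python sketch of the arithmetic, using only the numbers Xilinx quotes above; the function name is made up for illustration, and the two-core total is a best case that assumes the workload splits perfectly across both cores.

```python
def dmips_per_mhz(dmips: float, clock_mhz: float) -> float:
    """Normalize a Dhrystone score by clock frequency."""
    return dmips / clock_mhz


# Figures as claimed by Xilinx for one PowerPC 440 core in Virtex-5 FXT.
CLAIMED_DMIPS = 1100
CLAIMED_CLOCK_MHZ = 550

per_core = dmips_per_mhz(CLAIMED_DMIPS, CLAIMED_CLOCK_MHZ)
print(f"{per_core} DMIPS/MHz per core")

# Best-case aggregate for the two-core parts (assumes perfect scaling).
print(f"{2 * CLAIMED_DMIPS} DMIPS across two cores")
```

Normalizing this way makes it easier to compare against processors quoted at other clock rates.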

One of the big advantages of building an embedded system-on-chip in an FPGA is the availability of incredible amounts of connectivity for stitching processor to peripherals and on-chip memory.  In that spirit, Xilinx has included a crossbar interconnect structure that gives simultaneous access to I/O and memory.  It includes dedicated master and slave processor local bus interfaces, four DMA ports, and a dedicated memory bus interface.  As before, Xilinx has included their Auxiliary Processor Unit (APU) controller for interfacing your processor environment with any of the wide variety of co-processors and accelerators you can construct using the FPGA fabric. 

Another featured attraction in FXT is the set of RocketIO GTX transceivers.  These are high-performance, relatively low-power SerDes blocks that bring significantly more performance to the table than those in the LXT and SXT lines.  The additional performance brings us the ability to support standards like XAUI, Fibre Channel, SONET, Serial RapidIO, and of course the ubiquitous PCI Express.  Xilinx says the GTX transceivers consume less than 200mW typical at 6.5Gbps and contain features such as 4-tap DFE equalization on the receive end and pre-emphasis on the transmit side.  A multi-code physical coding sublayer also allows support of both 64B/66B and 64B/67B encoding/decoding schemes without resorting to the LUT fabric for implementation.
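To put the transceiver numbers in perspective, here is a rough Python sketch of what the two line codes cost in bandwidth and what the quoted power figure means per bit.  It assumes, purely for illustration, that a link runs at the 6.5Gbps line rate quoted above and that the full 200mW is consumed (Xilinx says "less than"), so the energy figure is an upper bound.  The function names are hypothetical.

```python
def payload_rate_gbps(line_rate_gbps: float, payload_bits: int, coded_bits: int) -> float:
    """Usable data rate once the line code's extra bits are subtracted."""
    return line_rate_gbps * payload_bits / coded_bits


def energy_per_bit_pj(power_mw: float, line_rate_gbps: float) -> float:
    """mW divided by Gbps is numerically equal to picojoules per bit."""
    return power_mw / line_rate_gbps


LINE_RATE_GBPS = 6.5   # quoted GTX rate; using it for both codes is an assumption
POWER_MW = 200.0       # quoted typical ceiling per transceiver

for name, coded_bits in (("64B/66B", 66), ("64B/67B", 67)):
    rate = payload_rate_gbps(LINE_RATE_GBPS, 64, coded_bits)
    overhead_pct = 100 * (coded_bits - 64) / coded_bits
    print(f"{name}: {rate:.3f} Gbps payload, {overhead_pct:.2f}% overhead")

print(f"at most {energy_per_bit_pj(POWER_MW, LINE_RATE_GBPS):.1f} pJ per bit")
```

Either way, both codes give up only a few percent of the line rate to framing bits.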

For you DSP dudes, FXT contains up to 384 DSP slices along with up to 16.5Mb of on-chip memory.  The DSP48E slice includes MAC, multiply-add/subtract, three-input add, and barrel shift, as well as wide counters and comparators.  Massively parallel datapath operations in the fabric, high-speed conventional processing on the dual PowerPCs, and a high-performance, low-latency interconnect between the two add up to a daunting performer.  Mate this power with the ability to get data on and off chip via very fast multi-gigabit SerDes channels and you have the kind of device that… well, there’s the next question.  Exactly who needs this much FPGA?

One of the interesting trends with the segmentation of FPGA families is that the target domain for these full-boat devices actually narrows.  Since many applications are now well served by the LX, LXT, and SXT families, FXT is left as the purview of only those who need almost the entire gamut of features.  Today, this means the “old school” FPGA customers – people bolting big bit pipes to backplanes – as well as very-high-throughput signal processing applications like radar or surveillance video processing.

A new family would be no fun at all without new tools, and Xilinx is rolling out their new ISE 10.1 almost in lock-step with Virtex-5 FXT.  Don’t be confused into thinking that 10.1 is all about FXT support, however.  10.1 is a full-fledged release of Xilinx’s entire tool suite supporting their entire range of products.  In fact, Xilinx has brought together their previously somewhat asynchronous release of tools into a single, unified release with 10.1 – presumably making life much easier for customers trying to keep their tool chain consistent while using a variety of Xilinx products. 

10.1 includes the traditional ISE Foundation and PlanAhead tools for basic hardware design, the EDK and Platform Studio tools for stitching together embedded designs with FPGAs, and a set of system-level design tools including System Generator for DSP and AccelDSP Synthesis. 

Two major priorities with 10.1 were boosting runtime performance to decrease the cycle time for design turns and improving quality of results to boost the performance of the resulting designs.  Both goals were reached, in part, by taking advantage of multi-machine and multi-core processing to run parallel attempts at compilation.  SmartXplorer allows multiple tuning strategies to be explored on multiple processors, improving both overall turnaround time and resulting design performance. 
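The general idea of exploring several strategies in parallel and keeping the winner can be sketched in a few lines of Python.  This is only a conceptual illustration, not SmartXplorer's actual interface: the strategy names, the slack numbers, and both functions are invented placeholders, and a real flow would launch the implementation tools and parse their timing reports where the stand-in function simply looks up a value.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical strategy names with made-up worst-case slack values (ns).
# These are placeholders, not measurements from any real tool run.
FAKE_RESULTS_NS = {
    "default": -0.42,
    "timing_focused": 0.11,
    "area_focused": -0.87,
    "extra_effort": 0.05,
}


def run_strategy(name: str) -> tuple[str, float]:
    """Stand-in for one full implementation run: (strategy, worst slack in ns)."""
    return name, FAKE_RESULTS_NS[name]


def explore(strategies: list[str], workers: int = 4) -> tuple[str, float]:
    """Run every strategy concurrently and keep the one with the largest slack."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_strategy, strategies))
    return max(results, key=lambda result: result[1])


best_name, best_slack = explore(list(FAKE_RESULTS_NS))
print(f"best strategy: {best_name} ({best_slack:+.2f} ns slack)")
```

Threads are enough here because each worker would mostly wait on an external tool process.  Total wall-clock time then approaches that of the slowest single run rather than the sum of all runs, which is where the turnaround improvement comes from.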

Xilinx also created a new “lite” version of their PlanAhead floorplanning tool that brings out the pin planning features of “PinAhead” to a broader audience.  PinAhead facilitates pin/package planning early in the process and performs design rule checks early in the design cycle.  Apparently, the pin planning functionality was well enough received that the company wanted to make it available to more than the usual “floorplanners” who spring for the optional PlanAhead – thus, we now have PlanAhead Lite.

As design has become more complex, finding the appropriate balance of constraints including area, delay, power, and tool runtime has become far more difficult.  Over the years, tools have sprouted a bevy of features, settings, and options that can be daunting for even the most experienced designers.  10.1 has a new “strategy-based implementation” that helps sort out the settings to give you your desired results.

Finally, the new 10.1 suite includes enhanced power analysis and optimization capabilities.  A “2nd generation” XPower Analyzer gives improved accuracy and sports a new user interface with multiple views of power consumption and thermal analysis and a view of power by voltage rail.  Xilinx claims that new power optimization delivers a 10% reduction in total power for Virtex-5 devices and a 12% reduction for Spartan-3A.

The 10.1 ISE tools are available immediately, and the Virtex-5 FXT series is now at the “shipping samples” stage for FX30T and FX70T devices, with FX100T, FX130T and FX200T scheduled for the next six months.  Production devices are slated for Q3 2008.  FX30T will list for US$159 in 1,000-unit volumes by the second half of 2009.  Xilinx will also make Virtex-5 FXT devices available through their EasyPath cost-reduction program for higher-volume applications.
