Space Silicon

Most of us know Moore’s law to have only one independent variable – time. Mr. Moore predicts that, as time goes by, the complexity of our electronic designs increases exponentially. One thing Mr. Moore never mentioned in public, however, is the secret second factor. Aerospace engineers have long been aware that Moore’s law has a little-known term in the denominator. This factor is something like “one plus altitude squared.” As your design gets farther from the ground, the fabrication technology goes back in time along the line of progress. Until recently, by the time you got your system into orbit, your design was effectively a couple decades behind on the technology curve.

However, over the past decade or so, FPGAs have changed all that. Now, you can create your space-bound design with near state-of-the-art semiconductor technology and still have a fighting chance of meeting the demanding requirements of electronics for space flight. As we discussed last year in our “FPGAs in Space” article, modern programmable logic devices offer an attractive alternative to high-NRE ASIC designs with space-proven processes. The typical ASIC NRE, amortized over the small production run of a typical satellite system, for example, dominates the cost equation. Zero-NRE programmable logic devices are far more economically attractive, regardless of their unit cost.
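
To make the amortization argument concrete, here is a minimal sketch of the per-unit arithmetic. The function name and every dollar figure are hypothetical placeholders chosen only for illustration; they are not real ASIC or FPGA prices.

```python
def cost_per_unit(nre: float, unit_cost: float, volume: int) -> float:
    """Effective cost of one device when NRE is spread over the whole run."""
    if volume <= 0:
        raise ValueError("volume must be a positive number of units")
    return nre / volume + unit_cost


# Hypothetical placeholder figures, NOT real pricing.
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 500.0   # high NRE, cheap parts
FPGA_NRE, FPGA_UNIT = 0.0, 10_000.0        # zero NRE, expensive parts
RUN_SIZE = 10                              # a small satellite-style run

asic = cost_per_unit(ASIC_NRE, ASIC_UNIT, RUN_SIZE)
fpga = cost_per_unit(FPGA_NRE, FPGA_UNIT, RUN_SIZE)
print(f"ASIC per unit: ${asic:,.0f}   FPGA per unit: ${fpga:,.0f}")
```

With these made-up inputs, the NRE term swamps the ASIC's per-unit figure at a run of ten, even though each programmable part is assumed to cost twenty times as much. The crossover moves only when the production volume grows well beyond what a typical satellite program builds.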

Now there are even more options for aerospace electronics engineers, and an even greater measure of controversy. Space, you see, is a very tough neighborhood. Radiation levels that can confuse the coulombs out of even the most stable of earthbound electronics lurk around every corner. Vibration encountered en route will shake your solder joints to the bones. Heat sinks and fans that would cool your chips on the ground are useless in the vacuum of space. Added to all of this is the harsh reality that your favorite technician will never, ever be able to make a service call.

When it comes to cost, you may think it’s expensive to FedEx your laptop power supply from San Jose to Boston (now, why would you ever need to do that?), but that’s just peanuts compared to the $10K or more per pound your company will spend getting your science fair project into orbit. If your design has a post-launch problem, the career that’s cut short will almost certainly be yours.

It is interesting to study the trials and tribulations of space-based design, because some of the issues that space designers have faced for decades are slowly making their way down to earth with each passing process node. Radiation effects such as single-event upsets (SEUs) are now becoming a concern even for some designers of ground-based systems. Also, with higher density devices now being designed on a routine basis, many high-reliability techniques are useful and prudent even for consumer-level products, particularly those such as automobiles with safety considerations.

The first design decision in most projects is silicon selection. Before programmable logic entered the picture, we had a very small number of companies with space-proven processes, usually several generations behind the current commercial process node, who would fabricate an ASIC for you with a radiation-tolerant process, thermal packaging, and the inspection and qualification processes required to meet the rigorous standards of space flight. Then antifuse-based programmable logic entered the picture, and designers flocked to the new solution in droves. Antifuse programmable logic, such as that offered by Actel, provided rapid design turnaround, small volume economics, and excellent performance and radiation immunity.

Some projects, however, particularly those with demanding DSP or data capture requirements, required more density, higher DSP performance, or in-system reconfigurability. Because these applications’ needs seemed a good fit for more conventional commercial-style FPGAs, Xilinx set about qualifying some of their SRAM-based FPGAs for space use. Space and SRAM are definitely not a match made in heaven, however. Because the FPGA’s configuration is stored in something like SRAM memory cells, the configuration itself is vulnerable to SEUs. An errant particle passing by can clobber the cell, switching it to the opposite logic value, and suddenly your circuit has a new and unintended netlist. In order to get these devices to operate satisfactorily in space, a host of mitigation techniques have been developed. These techniques range from various flavors of triple-module-redundancy (TMR) to regular “scrubbing” (partial reconfiguration to eliminate accumulated errors).
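
The idea behind TMR and scrubbing can be sketched in a few lines of software. This is only a toy model under stated assumptions: real TMR is built from replicated hardware and voter logic, and real scrubbing rewrites configuration memory through partial reconfiguration. The names `majority_vote`, `flip_bit`, and `scrub` are hypothetical, and `scrub` here is a simplified stand-in that restores all three copies from the voted value.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: each output bit takes the value held by
    at least two of the three redundant copies."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of a word."""
    return word ^ (1 << bit)


def scrub(copies: list[int]) -> list[int]:
    """Simplified scrub: overwrite every copy with the voted value so a
    lone upset cannot linger and pair up with a later one."""
    voted = majority_vote(*copies)
    return [voted, voted, voted]


good = 0b10110010

# One upset in one copy: the voter masks it.
one_hit = [good, flip_bit(good, 3), good]
print(majority_vote(*one_hit) == good)

# Without scrubbing, a second upset at the same bit in another copy
# outvotes the one remaining good copy.
two_hits = [flip_bit(good, 3), flip_bit(good, 3), good]
print(majority_vote(*two_hits) == good)

# Scrubbing between the two events removes the first upset before it
# can accumulate.
repaired = scrub(one_hit)
repaired[0] = flip_bit(repaired[0], 3)
print(majority_vote(*repaired) == good)
```

The second case is the reason scrubbing is paired with TMR: voting tolerates one bad copy per bit, so errors have to be cleaned out before a second upset lands on the same bit.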

LSI Logic is expected to announce plans for a space-qualified version of its RapidChip structured ASIC family later this month. Structured ASIC would appear to be an excellent platform for space-based design. It offers the low-risk, low-NRE advantages of programmable logic, the higher densities and performance of ASIC design methodologies, and the mask-programmed metal-to-metal interconnect immunity to radiation effects. RapidChip also benefits from a rich IP library, high-performance DSP resources, and a simple, FPGA-like design tool flow.

Actel, long the most heavily invested programmable logic vendor in the space market, made two significant announcements this week advancing their stake in space. First, they announced an approximate doubling of the maximum density available in their antifuse-based RTAX-S FPGA family. According to Actel, this density increase takes the antifuse line to 4 million system gates, approximately equivalent to 500K ASIC gates. Second, they announced plans for the first flash-based space-qualified FPGA line. Flash is intriguing as a space technology because it is expected to have much greater SEU resistance in configuration than SRAM-based technologies while offering the advantages of reprogrammability missing in antifuse FPGAs.

Looking at the silicon scenario in light of these new announcements, Xilinx sits on one end of the spectrum with SRAM-based FPGAs (although Xilinx points out that the design of the configuration logic differs from SRAM in read/write characteristics, making it more stable than strict SRAM). These devices are attractive if your application is not mission-critical, if you need high-performance DSP resources on-chip, and if you require in-flight reprogrammability. However, these FPGAs require significant mitigation such as TMR of both registers and critical configuration circuitry, and give away a considerable portion of their density, system performance, and I/O capability to mitigation.

Next along the line comes Actel’s planned flash-based family. Although details are not yet announced, it is reasonable to assume that these devices will be based on the company’s commercial ProASIC3 architecture with significant modifications for radiation tolerance. They will offer reprogrammability, making them the only direct competition for Xilinx SRAM-based FPGAs. They will likely have advantages such as single-chip, non-volatile operation and greater radiation immunity and will probably lag their SRAM competition in operating frequency and DSP capabilities.

If your design does not require reprogrammability, Actel’s antifuse lines will be joined by LSI Logic’s expected structured ASIC family. Even with Actel’s newly announced density increase, LSI Logic’s offering will probably start in density where the Actel line leaves off and go up from there. Both lines are likely to have much better radiation immunity than either of the reprogrammable options, and significantly better timing performance as well. Although Actel has recently struggled with problems of its own in its high-rel line, its family will still have the benefit of years of space-proven operation and refinement, while LSI Logic’s family will be a comparatively unproven newcomer.

LSI Logic’s product may eventually benefit from the very rich IP offering currently available for the RapidChip line, but IP will have to be space-qualified as well, so don’t assume that every capability available in the commercial product line will be available or appropriate for space use. One such capability, which promises to improve reliability and performance of space-based systems overall, is SerDes I/O. SerDes has revolutionized communications backplane design by replacing problematic wide parallel bus structures with faster, more reliable differential pair serial signals. In space, the promise of lower pin counts and simpler signal integrity solutions is attractive from both a PCB design and a system weight perspective. SerDes hasn’t yet been proven for space use, however, and considerable engineering is required to understand and resolve SerDes I/O and signal integrity issues in high-reliability applications.

Although the space silicon landscape is on the verge of significant change, the logic design side of the picture still poses significant challenges for engineers. Mitigation techniques are required for memory, registers, core logic, and I/O for designs required to operate reliably in radiation-intensive environments. Melanie Berg, Chief Staff Electrical Engineer at NASA’s Goddard/Muniz Space Flight Center, points out that no mitigation technique is foolproof. “People will throw in a technique like TMR and claim that their design is ‘immune’ to radiation effects,” observes Berg, “but what happens if an SET [single event transient] hits right at a clock edge?” Every mitigation technique has holes that make it vulnerable to some kind of event. The key is to understand the benefits and vulnerabilities associated with the techniques you choose, and to weigh the risks associated with those vulnerabilities.

Some techniques may actually result in a net loss of reliability. “You hear that binary-encoded state machines are more reliable,” continues Berg, “but the reliability increase from the smaller number of state bits is offset by worse behavior in the event of an upset. In a binary-encoded state machine, the majority of the states are used, so a single bit upset is more likely to land you in an unexpected valid state instead of an [error-trapping] invalid one.” By creating a state machine that doesn’t trap a radiation-induced error, you risk having your design jump into unpredictable and potentially unsafe states, particularly when there is a hierarchy of interdependent state machines. “I personally prefer one-hot,” Berg concludes. “Although it uses a few more bits for the state variable, in most reasonable sized state machines this is insignificant. The benefit is that you can parity check the state register and know when you’ve encountered an error, then send your system into error-recovery mode.”
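
Berg's comparison can be checked with a small enumeration. The sketch below is a software illustration, not her method or any vendor's flow: it assumes an 8-state machine, counts only single-bit upsets, and uses hypothetical helper names. A binary-encoded register is treated as valid whenever its value is below the state count, and a one-hot register is checked with odd parity, as in the quote.

```python
def single_bit_upsets(value: int, width: int) -> list[int]:
    """Every value reachable from `value` by flipping exactly one bit."""
    return [value ^ (1 << i) for i in range(width)]


def parity_ok(state: int) -> bool:
    """One-hot words have exactly one bit set, so their parity is odd."""
    return bin(state).count("1") % 2 == 1


def undetected_binary(n_states: int) -> int:
    """Count single-bit upsets that land a binary-encoded machine in
    another valid state, where no validity check can flag them."""
    width = max(1, (n_states - 1).bit_length())
    return sum(
        1
        for state in range(n_states)
        for upset in single_bit_upsets(state, width)
        if upset < n_states
    )


def undetected_one_hot(n_states: int) -> int:
    """Count single-bit upsets of a one-hot machine that still pass
    the parity check."""
    return sum(
        1
        for i in range(n_states)
        for upset in single_bit_upsets(1 << i, n_states)
        if parity_ok(upset)
    )


N = 8  # assumed state count for illustration
print("binary  undetected:", undetected_binary(N))
print("one-hot undetected:", undetected_one_hot(N))
```

For eight states, all 24 possible single-bit upsets of the 3-bit binary register land in another valid state, while none of the one-hot upsets pass the parity check, because flipping one bit of a one-hot word always leaves zero or two bits set. The price is eight state bits instead of three, which matches Berg's point that the extra bits are usually a small cost for detectability.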

Analysis and characterization of the effectiveness of various mitigation strategies, both in logic design and in semiconductor process, is a complex business. Test designs in silicon have to be operated at speed in artificially accelerated radiation environments. Errors have to be detected, and enough data must be collected to allow subsequent analysis of failure mechanisms. Ultimately, such analysis will lead to proven recipes for design success, but today high-reliability design is still a somewhat black art, practiced and studied by specialists like Berg and her colleagues at NASA.

An improved understanding of proper high-reliability design techniques, combined with advances in underlying silicon technology, promises to bring capabilities previously available only to commercial logic designers to the space domain. More capable and reliable spacecraft with lower design and construction costs will likely be the result. At the same time, ground-based design can benefit from the lessons learned in space to manage the complexity and increased vulnerability of future semiconductor processes. High reliability, while essential in space design, should be a priority in all of our engineering efforts.