Product group : Software
ASIC Prototyping Using Off-the-Shelf FPGA Boards
In a survey commissioned by Synplicity Inc. in December 2004, more than 20,000 developers around the world were questioned as to their hardware-assisted ASIC verification strategy. The results showed that one third of today's ASIC designs are verified by means of an FPGA-based prototype.
01/11/2006
Reference: 20280

Even though ASIC designs are increasing in size and complexity, recent advances in the capacity and performance of modern FPGAs mean that two thirds of these designs can be modeled using a single FPGA. However, this still leaves one third of these designs requiring a multi-FPGA-based prototyping board. In the not-so-distant past, the predominant solution in these cases was for the ASIC design teams to internally create their own custom multi-FPGA prototyping boards. Today, however, using off-the-shelf multi-FPGA prototyping boards, such as the ones produced by Synplicity's partners in prototyping, in conjunction with the appropriate design tools can save weeks if not months of verification time and tens of thousands of dollars in NRE charges. This paper first discusses the predominant techniques available for ASIC verification. Next, the paper considers the advantages and disadvantages of creating a custom multi-FPGA prototyping board as compared to using an off-the-shelf product. Finally, the paper introduces the current state-of-the-art in design tools for the partitioning and synthesizing of large designs for verification using either internally developed or off-the-shelf multi-FPGA prototyping boards.

Alternative verification techniques

Today's high-end ASICs, such as those used in cell phones, communications, graphics subsystems, and signal processing applications, often contain multiple CPU and DSP cores combined with multiple hardware accelerator, peripheral, interface, and memory management cores. (For the purpose of these discussions, the term ASIC is assumed to encompass ASSP and SoC devices.) Thus, in order to meet the chip's market window, it is necessary to develop, port, integrate, debug, and verify any embedded software content as early in the design process as possible. Full functional verification of the ASIC's RTL, on its own and in the context of any embedded software, is one of the most time-consuming and difficult parts of the ASIC design process. Statistics show that 70 percent of today's ASIC designs require a re-spin. In addition to being extremely expensive, re-spins can cause the project to miss its market window, which can severely damage a company's reputation and financial bottom line. The three main verification options open to ASIC designers are simulation, emulation, and FPGA-based prototypes.

Simulation

Software-based simulation is widely used, but even when running on a high-end (and correspondingly expensive) computer platform, it runs six to ten orders of magnitude slower than the actual ASIC hardware, which makes it an extremely time-consuming and inefficient technique. To provide a sense of scale, software simulation of the entire system can typically achieve equivalent speeds of only a few Hz (that is, a few cycles of the design's system clock for each second of real time). In practice, this means that extensive software verification can be performed on only small portions of the design.

Emulation

Hardware-based emulation is another alternative, but it is still at least three orders of magnitude slower than the actual ASIC hardware, because the massive amounts of multiplexing involved slow the verification speed down to only 500kHz to 2MHz. Furthermore, this approach is extremely expensive, both in terms of budget and resources (depending on the size of the emulator, the cost can be anywhere from 25 cents to $1 per equivalent gate). What designers need is an alternative that will allow them to get to market quickly with low risk and at low cost.
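
To put the simulation and emulation speeds quoted above in perspective, the following sketch estimates the wall-clock time needed to run a fixed number of design clock cycles at each rate. The workload of one billion cycles and the 5 Hz value chosen to stand in for "a few Hz" are illustrative assumptions; only the 500kHz to 2MHz emulation range comes from the text.

```python
# Rough wall-clock estimate for running a fixed workload at different
# verification speeds. WORKLOAD_CYCLES and the 5 Hz simulation rate are
# illustrative assumptions; the emulation range is the one quoted in the text.

SECONDS_PER_DAY = 24 * 60 * 60

WORKLOAD_CYCLES = 1_000_000_000  # hypothetical: one billion design clock cycles

RATES_HZ = {
    "software simulation (assumed 5 Hz)": 5.0,
    "emulation, low end (500 kHz)": 500e3,
    "emulation, high end (2 MHz)": 2e6,
}


def wall_clock_seconds(cycles: int, rate_hz: float) -> float:
    """Time in seconds to execute `cycles` design clock cycles at `rate_hz`."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return cycles / rate_hz


def estimate_all(cycles: int = WORKLOAD_CYCLES) -> dict:
    """Wall-clock seconds for the same workload at every rate in RATES_HZ."""
    return {name: wall_clock_seconds(cycles, hz) for name, hz in RATES_HZ.items()}


for name, seconds in estimate_all().items():
    print(f"{name}: {seconds:,.0f} s ({seconds / SECONDS_PER_DAY:,.2f} days)")
```

Under these assumptions the simulated workload would take more than six years, while the emulator would finish it in roughly 8 to 33 minutes.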

FPGA-based Prototypes

In many cases, it is necessary to verify the design "at-speed." In the case of a video processing chip, for example, part of the verification may involve evaluating the subjective quality of the video output stream. Similarly, verifying the hardware in the context of the embedded software mandates extreme speed. The answer is to use multi-FPGA prototyping boards running at speeds from 10 to 80MHz, which is equal (or comparable) to real-time ASIC speeds ("real stimulus in, real responses out"). When it comes to designing a custom board versus using an off-the-shelf board, the latter, when coupled with the appropriate design tools, can shave weeks if not months off verification time and (at a typical value of under 1 cent per equivalent gate) save tens of thousands of dollars in NRE charges. Also of interest is the fact that, in addition to providing a platform for software development and hardware-software verification, the company designing the ASIC may simply require access to a fully-functional implementation of the design as soon as possible; for example, demonstration hardware may be required to take to a tradeshow.

Full Custom vs Off-the-Shelf Prototyping Boards

Three to five years ago at the time of this writing, all multi-FPGA prototyping boards were of the "grow your own" full-custom variety. In contrast, today there is a thriving community of off-the-shelf multi-FPGA prototyping board vendors. To provide a point of reference, traditional hardware emulation is currently a $100 million a year market. By comparison, over the last few years, off-the-shelf multi-FPGA prototyping boards have grown to be an approximately $75 million a year market. To put this another way, the off-the-shelf multi-FPGA prototyping board industry has grown to be three-quarters the size of the hardware emulation market without anyone really noticing. It is the nature of the engineer to think that anything generic is sub-optimal. In particular, engineers often wish to build their own custom prototyping boards because they think the performance will be better, they believe that it will be easier to interface to the real world and such interfaces will be closer to what they want, they think it will reduce the cost of the project, and they think it will reduce time-to-market. Let's take these points in order:

Better Performance

In the case of prototyping boards involving anything more than two or three FPGAs, it is extremely unlikely that a custom implementation will out-perform any of its off-the-shelf counterparts. This is because designing such a board requires a very high level of knowledge and expertise, which is only gained by designing generations of such boards over the years.

Ease of Creation

If an ASIC design will fit in a single FPGA, then designing and implementing a custom board is relatively simple. By comparison, in the case of an ASIC design that requires two FPGAs, the problem becomes significantly more interesting; and things become exponentially more complicated when using three or more FPGAs.

Ease of Interfacing

If an ASIC design will fit in a single FPGA, then there are some compelling reasons to design a custom board. One of these reasons is that it will often make sense to implement this FPGA along with any interfacing logic on the same card. In the case of a multi-FPGA prototyping board solution, however, the interfacing problem is almost invariably simplified by taking a known-good off-the-shelf board and focusing one's efforts on the design of a special interfacing card.

Reducing Costs

Designing and implementing a high-end multi-FPGA prototyping board can require a large team of specialist design engineers and layout designers, which ends up costing significantly more than simply purchasing an off-the-shelf board.

Reducing Time-to-Market

Even for a company that specializes in designing and implementing multi-FPGA prototyping boards, the creation of a high-end board can easily take 9 months (and this assumes multiple engineers and layout designers working multiple shifts). Not surprisingly, a non-specialist team will almost certainly take much longer, which can easily cause a project to slip its schedule and miss its market window.

As an example of the complexity of the multi-FPGA prototyping board design problem, consider the DN8000K10 board from the Dini Group (Figure 1). The Dini Group is a member of the Synplicity Partners in Prototyping Program. The DN8000K10 is a USB 2.0-hosted logic prototyping system that can be populated with two to sixteen high capacity FPGAs. In its highest configuration, this board can be used to prototype designs representing a conservative value of 24 million equivalent ASIC gates. The design and implementation of the DN8000K10 took nine months elapsed time. As part of this project, six layout designers worked two shifts for several months. The final product was a 28-layer board in which the chip-to-chip communication was implemented using low-voltage differential signals (LVDS) running at 350 MHz. (In the case of designs that are pin limited, each LVDS pin pair supports integrated SERDES that can provide up to 10:1 multiplexing.) At this level of complexity, addressing noise considerations and signal integrity issues requires an extreme level of knowledge and expertise. A board of this caliber is fully one to two orders of magnitude beyond the point where one of today's state-of-the-art auto-routers can find a solution; thus, every pin was "hand-picked" and every track was "hand-stitched"; there was no use of auto-routing (except around the board's periphery).

Hand-Partitioning and Synthesizing Multi-FPGA Designs

In the case of a hand-partitioning environment, any ASIC-centric constructs in the original RTL source code (gated clocks, Synopsys' DesignWare instantiations, etc.) have to be hand-translated into their FPGA equivalents prior to the partitioning taking place. Apart from anything else, this immediately results in two separate code streams that can lose synchronization, thereby resulting in functional differences between the FPGA prototype and the ASIC it is intended to represent.

When it comes to the partitioning process itself, the engineers attempt to gather different groups of functional blocks (modules) together, where each group is to be implemented in a different FPGA. This grouping (partitioning) was traditionally performed at the gate level. More recently, some flows support grouping at the RT level, in which case the resulting groups are each passed through a traditional FPGA synthesis tool, and it is only at this point that the actual resource utilization of the various FPGAs is known. One problem with both of these scenarios is that the engineers are "flying blind" as to the area and resource impacts of the various groups, which results in lots of time-consuming iterations. First, the engineers generate "guesstimates" along the lines of "Block A will probably consume 'xxx' amount of resources, while Block B may require 'yyy' amount of resources." These guesstimates are followed by large numbers of "group" commands, then synthesis (in the case of RTL-based partitioning), then analysis of the results, and then lots of "ungroup" and "regroup" commands to evaluate a different implementation.

The task is further complicated by the fact that such prototypes are often limited by the number of I/O pins on the FPGAs; an inefficient solution can easily consume 100% of the I/O resources on a device while at the same time utilizing only a relatively small amount of its internal logic resources. In order to surmount these I/O limitations, it may be necessary to multiplex groups of I/Os together and/or to replicate the same block of logic in multiple FPGAs. (Logic replication is also often required in order to achieve specific performance goals.) Given that each FPGA used in this type of prototype is likely to have more than 1000 pins, a spreadsheet approach to managing connectivity can easily contain thousands of cells. Not surprisingly, keeping track of the blocks assigned to each FPGA and keeping track of the connectivity matrix (the connectivity between the various FPGAs) is a huge task that is resource-intensive, time-consuming, and prone to error.
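
The bookkeeping described above (which block sits in which FPGA, how much logic each device uses, and how many signals cross between devices) can be sketched in a few lines. This is only a minimal illustration of the spreadsheet-style tracking a team might do by hand; the function name, block names, sizes, and capacities are all hypothetical and do not come from any tool or board.

```python
# Bookkeeping sketch for a hand partition. All names and numbers used with
# this function are hypothetical; it models only the tracking problem, not
# any real partitioning tool.
from collections import defaultdict


def summarize_partition(block_size, assignment, nets, capacity):
    """Return (per-FPGA utilization report, inter-FPGA connectivity matrix).

    block_size: {block: estimated logic resources}
    assignment: {block: fpga}
    nets:       [(block_a, block_b, signal_width), ...]
    capacity:   {fpga: (logic_capacity, io_pins)}
    """
    logic = defaultdict(int)
    for block, fpga in assignment.items():
        logic[fpga] += block_size[block]

    io = defaultdict(int)
    crossing = defaultdict(int)  # keyed by the sorted pair of FPGA names
    for a, b, width in nets:
        fa, fb = assignment[a], assignment[b]
        if fa != fb:  # only nets that leave a device consume I/O pins
            io[fa] += width
            io[fb] += width
            crossing[tuple(sorted((fa, fb)))] += width

    report = {}
    for fpga, (logic_cap, io_cap) in capacity.items():
        report[fpga] = {
            "logic_pct": 100.0 * logic[fpga] / logic_cap,
            "io_pct": 100.0 * io[fpga] / io_cap,
        }
    return report, dict(crossing)


# Hypothetical four-block design split across two FPGAs.
sizes = {"cpu": 400, "dsp": 300, "dma": 100, "video": 500}
placed = {"cpu": "F1", "dma": "F1", "dsp": "F2", "video": "F2"}
wires = [("cpu", "dma", 64), ("cpu", "dsp", 128),
         ("dma", "video", 256), ("dsp", "video", 64)]
limits = {"F1": (1000, 512), "F2": (1000, 512)}

report, matrix = summarize_partition(sizes, placed, wires, limits)
print(report)
print(matrix)
```

In this made-up example FPGA F1 is only half full of logic yet already uses three quarters of its I/O pins, which is the kind of imbalance the text describes.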

Automatically Partitioning and Synthesizing Multi-FPGA Designs

When the Certify RTL Prototyping tool was created in the late 1990s, there weren't any off-the-shelf multi-FPGA prototyping boards available to ASIC design teams. At that time, the Certify software was conceived to be an aid to ASIC teams designing their own custom multi-FPGA prototyping boards. Using the Certify software, engineers could define the number and type of FPGAs on the board and the interconnections between them. This data was subsequently used to automatically partition the RTL for the ASIC design across the multiple FPGAs and to synthesize the partitioned RTL into the configuration files used to program the FPGAs. Once the engineers had used the Certify tool to define the underlying architecture of the board, one of the outputs from the software was a netlist describing the FPGAs and the connectivity between them. The format of this netlist, which was described in Verilog, was defined by Synplicity and became known as the *.vb (Verilog Board) format.

Design teams wishing to create their own custom boards still use this technique today. The point is that Synplicity's *.vb format quickly became the de facto industry standard for this class of application. Now, every off-the-shelf multi-FPGA prototyping board vendor delivers their boards with corresponding *.vb files, which are read into the Certify software as input to define each board's architecture. The Certify tool can be used with Verilog, VHDL, and mixed-language designs. The first step in the flow is to employ Certify software to automatically convert any ASIC-specific coding into equivalent FPGA structures. In the case of an existing off-the-shelf multi-FPGA prototyping board, the user simply informs the software as to the type of board being used from a pull-down list that encompasses offerings from all of the major third-party vendors. (Alternatively, if this is to be a custom board, the Certify tool has the ability to create a "virtual" multi-FPGA board on-the-fly, where this virtual board can subsequently be used as the basis for creating the real board.) Next, the Certify software is used to automatically partition the design across the multiple FPGAs (Figure 2).

Tightly integrated with the Certify software is Synplicity's HDL Analyst utility, which automatically generates technology-independent graphical views of the design in the form of high-level hierarchical block diagrams and, following synthesis, corresponding gate-level schematics. The Certify and HDL Analyst tools support full bi-directional cross-probing between the HDL source code and the block-level and gate-level schematics, thereby allowing designers to quickly navigate around the design and locate signals and logical functions of interest. In addition to a variety of other views of the design, the Certify software provides a graphical representation of the FPGAs forming the prototype board (Figure 3). Each of these virtual components has two associated "thermometer-type" displays: one reflects the I/O utilization and the other the area/resources utilization of the device. Based on its knowledge of the I/O and logic resources associated with the FPGAs and the routing resources between FPGAs, the Certify software can automatically perform pin assignments and automatically implement a first-pass partitioning using its state-of-the-art Quick Partitioning Technology (QPT). Alternatively, the user can perform the partitioning interactively, by simply dragging blocks of code and dropping them on the different FPGAs, or a mixture of both techniques may be employed.

The Certify software offers a number of extremely powerful tools to aid in the partitioning task. Following partitioning, for example, the software can analyze the results and present the user with opportunities to use Certify Pin Multiplexing (CPM), in which multiple sets of signals are multiplexed together to alleviate loading on the I/O resources associated with a device. In addition to facilitating logic replication in multiple devices, the Certify tool also provides bit-slicing utilities in which wide data-path structures can be broken apart into smaller entities. Moreover, the Certify software offers sophisticated "zippering" capabilities whereby large blocks can be broken up into smaller pieces (these pieces can subsequently be assigned to different FPGAs). Another very useful feature is that, as a candidate partitioning implementation is created, it can be named and saved. This allows users to maintain control over multiple partitioning scenario alternatives. This capability can be used in conjunction with the Certify software's impact analysis feature, which allows users to assess the impact of placing and/or moving a piece of logic with respect to the available area and I/O on the multi-FPGA board. Rather than the user having to guess as to which FPGA this logic should be assigned, impact analysis generates specific information upon which partitioning decisions can be based.

Once partitioning has been performed, the Certify software is used to synthesize the code streams associated with the different FPGA devices. The tool employs the same underlying synthesis technology that is featured in Synplicity's market-leading Synplify Pro FPGA synthesis engine. For example, the Certify software takes full advantage of Synplicity's BEST (Behavior Extracting Synthesis Technology) algorithms, which analyze the RTL and implement high-level optimizations in advance of the main synthesis step. The Certify tool also boasts all of the Synplify Pro software's advanced synthesis capabilities, such as resource sharing, register balancing, retiming, replication, and re-synthesis. A key aspect of this process is that the Certify software regards the various FPGAs as simply being an additional layer in the design hierarchy. This means that the tool offers the unique ability to optimize timing paths for performance, even when those paths cross multiple FPGAs (the Certify software can also provide a timing report that informs designers as to the performance the prototype can achieve prior to the hardware being programmed).
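
The ideas behind pin multiplexing and impact analysis can be illustrated with a little arithmetic. The sketch below is not the Certify software's algorithm or interface; it is a hypothetical model, with made-up function names and data, that shows how many physical pins a group of signals needs at a given multiplexing ratio and how moving one block changes logic usage and the number of signals crossing between FPGAs. The 10:1 ratio used in the example is the value mentioned earlier for the DN8000K10's LVDS pairs and serves only as a sample input.

```python
# Hypothetical model of pin multiplexing and of the effect of moving a block.
# Function names, block names, and numbers are illustrative assumptions.


def pins_needed(signals: int, mux_ratio: int = 1) -> int:
    """Physical pins required when `mux_ratio` signals time-share each pin
    (a ratio of 1 means no multiplexing)."""
    if mux_ratio < 1:
        raise ValueError("mux_ratio must be at least 1")
    return (signals + mux_ratio - 1) // mux_ratio  # integer ceiling


def move_impact(block, target, assignment, block_size, nets):
    """Return (logic change per FPGA, change in crossing signals) if `block`
    were moved to `target`. A negative crossing change means fewer signals
    would have to leave a device after the move."""
    source = assignment[block]
    if source == target:
        return {source: 0}, 0

    logic_delta = {source: -block_size[block], target: block_size[block]}

    crossing_delta = 0
    for a, b, width in nets:
        if block not in (a, b) or a == b:
            continue  # net does not involve the block being moved
        other = b if a == block else a
        crossed_before = assignment[other] != source
        crossed_after = assignment[other] != target
        crossing_delta += width * (int(crossed_after) - int(crossed_before))
    return logic_delta, crossing_delta


# Hypothetical design: what happens if "dma" moves from F1 to F2?
sizes = {"cpu": 400, "dsp": 300, "dma": 100, "video": 500}
placed = {"cpu": "F1", "dma": "F1", "dsp": "F2", "video": "F2"}
wires = [("cpu", "dma", 64), ("cpu", "dsp", 128),
         ("dma", "video", 256), ("dsp", "video", 64)]

print(move_impact("dma", "F2", placed, sizes, wires))
print(pins_needed(384), pins_needed(384, mux_ratio=10))
```

In this example the move shifts 100 units of logic to F2 and removes 192 crossing signals, and multiplexing 384 signals at 10:1 cuts the pin requirement from 384 to 39.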


Synplicity Europe, Ltd.

Chiltern House
45 Station Road
RG9 1AT Henley on Thames - United Kingdom -Oxfordshire
