Part of the  

Chip Design Magazine

  Network

About  |  Contact

Posts Tagged ‘C/C++’

Formal, Logic Simulation, hardware emulation/acceleration. Benefits and Limitations

Wednesday, July 27th, 2016

Stephen Bailey, Director of Emerging Technologies, Mentor Graphics

Verification and Validation are key terms used and have the following differentiation:  Verification (specifically, hardware verification) ensures the design matches R&D’s functional specification for a module, block, subsystem or system. Validation ensures the design meets the market requirements, that it will function correctly within its intended usage.

Software-based simulation remains the workhorse for functional design verification. Its advantages in this space include:

-          Cost:  SW simulators run on standard compute servers.

-          Speed of compile & turn-around-time (TAT):  When verifying the functionality of modules and blocks early in the design project, software simulation has the fastest turn-around-time for recompiling and re-running a simulation.

-          Debug productivity:  SW simulation is very flexible and powerful in debug. If a bug requires interactive debugging (perhaps due to a potential UVM testbench issue with dynamic – stack and heap memory based – objects), users can debug it efficiently & effectively in simulation. Users have very fine level controllability of the simulation – the ability to stop/pause at any time, the ability to dynamically change values of registers, signals, and UVM dynamic objects.

-          Verification environment capabilities: Because it is software simulation, a verification environment can easily be created that peeks and pokes into any corner of the DUT. Stimulus, including traffic generation / irritators can be tightly orchestrated to inject stimulus at cycle accuracy.

-          Simulation’s broad and powerful verification and debug capabilities are why it remains the preferred engine for module and block verification (the functional specification & implementation at the “component” level).

If software-based simulation is so wonderful, then why would anyone use anything else?  Simulation’s biggest negative is performance, especially when combined with capacity (very large, as well as complex designs). Performance, getting verification done faster, is why all the other engines are used. Historically, the hardware acceleration engines (emulation and FPGA-based prototyping) were employed latish in the project cycle when validation of the full chip in its expected environment was the objective. However, both formal and hardware acceleration are now being used for verification as well. Let’s continue with the verification objective by first exploring the advantages and disadvantages of formal engines.

-          Formal’s number one advantage is its comprehensive nature. When provided a set of properties, a formal engine can exhaustively (for all of time) or for a, typically, broad but bounded number of clock cycles, verify that the design will not violate the property(ies). The prototypical example is verifying the functionality of a 32-bit wide multiplier. In simulation, it would take far too many years to exhaustively validate every possible legal multiplicand and multiplier inputs against the expected and actual product for it to be feasible. Formal can do it in minutes to hours.

-          At one point, a negative for formal was that it took a PhD to define the properties and run the tool. Over the past decade, formal has come a long way in usability. Today, formal-based verification applications package properties for specific verification objectives with the application. The user simply specifies the design to verify and, if needed, provides additional data that they should already have available; the tool does the rest. There are two great examples of this approach to automating verification with formal technology:

  • CDC (Clock Domain Crossing) Verification:  CDC verification uses the formal engine to identify clock domain crossings and to assess whether the (right) synchronization logic is present. It can also create metastability models for use with simulation to ensure no metastability across the clock domain boundary is propagated through the design. (This is a level of detail that RTL design and simulation abstract away. The metastability models add that level of detail back to the simulation at the RTL instead of waiting for and then running extremely long full-timing, gate-level simulations.)
  • Coverage Closure:  During the course of verification, formal, simulation and hardware accelerated verification will generate functional and code coverage data. Most organizations require full (or nearly 100%) coverage completion before signing-off the RTL. But, today’s designs contain highly reusable blocks that are also very configurable. Depending on the configuration, functionality may or may not be included in the design. If it isn’t included, then coverage related to that functionality will never be closed. Formal engines analyze the design, in its actual configuration(s) that apply, and does a reachability analysis for any code or (synthesizable) functional coverage point that has not yet been covered. If it can be reached, the formal tool will provide an example waveform to guide development of a test to achieve coverage. If it cannot be reached, the manager has a very high-level of certainty to approving a waiver for that coverage point.

-          With comprehensibility being its #1 advantage, why doesn’t everyone use and depend fully on formal verification:

  • The most basic shortcoming of formal is that you cannot simulate or emulate the design’s dynamic behavior. At its core, formal simply compares one specification (the RTL design) against another (a set of properties written by the user or incorporated into an automated application or VIP). Both are static specifications. Human beings need to witness dynamic behavior to ensure the functionality meets marketing or functional requirements. There remains no substitute for “visualizing” the dynamic behavior to avoid the GIGO (Garbage-In, Garbage-Out) problem. That is, the quality of your formal verification is directly proportional to the quality (and completeness) of your set of properties. For this reason, formal verification will always be a secondary verification engine, albeit one whose value rises year after year.
  • The second constraint on broader use of formal verification is capacity or, in the vernacular of formal verification:  State Space Explosion. Although research on formal algorithms is very active in academia and industry, formal’s capacity is directly related to the state space it must explore. Higher design complexity equals more state space. This constraint limits formal usage to module, block, and (smaller or well pruned/constrained) subsystems, and potentially chip levels (including as a tool to help isolate very difficult to debug issues).

The use of hardware acceleration has a long, checkered history. Back in the “dark ages” of digital design and verification, gate-level emulation of designs had become a big market in the still young EDA industry. Zycad and Ikos dominated the market in the late 1980’s to mid/late-1990’s. What happened?  Verilog and VHDL plus automated logic synthesis happened. The industry moved from the gate to the register-transfer level of golden design specification; from schematic based design of gates to language-based functional specification. The jump in productivity from the move to RTL was so great that it killed the gate-level emulation market. RTL simulation was fast enough. Zycad died (at least as an emulation vendor) and Ikos was acquired after making the jump to RTL, but had to wait for design size and complexity to compel the use of hardware acceleration once again.

Now, 20 years later, it is clear to everyone in the industry that hardware acceleration is back. All 3 major vendors have hardware acceleration solutions. Furthermore, there is no new technology able to provide a similar jump in productivity as did the switch from gate-level to RTL. In fact, the drive for more speed has resulted in emulation and FPGA prototyping sub-markets within the broader market segment of hardware acceleration. Let’s look at the advantages and disadvantages of hardware acceleration (both varieties).

-          Speed:  Speed is THE compelling reason for the growth in hardware acceleration. In simulation today, the average performance (of the DUT) is perhaps 1 kHz. Emulation expectations are for +/- 1 MHz and for FPGA prototypes 10 MHz (or at least 10x that of emulation). The ability to get thousands of more verification cycles done in a given amount of time is extremely compelling. What began as the need for more speed (and effective capacity) to do full chip, pre-silicon validation driven by Moore’s Law and the increase in size and complexity enabled by RTL design & design reuse, continues to push into earlier phases of the verification and validation flow – AKA “shift-left.”  Let’s review a few of the key drivers for speed:

  • Design size and complexity:  We are well into the era of billion gate plus design sizes. Although design reuse addressed the challenge of design productivity, every new/different combination of reused blocks, with or without new blocks, creates a multitude (exponential number) of possible interactions that must be verified and validated.
  • Software:  This is also the era of the SoC. Even HW compute intensive chip applications, such as networking, have a software component to them. Software engineers are accustomed to developing on GHz speed workstations. One MHz or even 10’s of MHz speeds are slow for them, but simulation speeds are completely intolerable and infeasible to enable early SW development or pre-silicon system validation.
  • Functional Capabilities of Blocks & Subsystems:  It can be the size of input data / simuli required to verify a block’s or subsystem’s functionality, the complexity of the functionality itself, or a combination of both that drives the need for huge numbers of verification cycles. Compute power is so great today, that smartphones are able to record 4k video and replay it. Consider the compute power required to enable Advanced Driver Assistance Systems (ADAS) – the car of the future. ADAS requires vision and other data acquisition and processing horsepower, software systems capable of learning from mistakes (artificial intelligence), and high fault tolerance and safety. Multiple blocks in an ADAS system will require verification horsepower that would stress the hardware accelerated performance available even today.

-          As a result of these trends which appear to have no end, hardware acceleration is shifting left and being used earlier and earlier in the verification and validation flows. The market pressure to address its historic disadvantages is tremendous.

  • Compilation time:  Compilation in hardware acceleration requires logic synthesis and implementation / mapping to the hardware that is accelerating the simulation of the design. Synthesis, placement, routing, and mapping are all compilation steps that are not required for software simulation. Various techniques are being employed to reduce the time to compile for emulation and FPGA prototype. Here, emulation has a distinct advantage over FPGA prototypes in compilation and TAT.
  • Debug productivity:  Although simulation remains available for debugging purposes, you’d be right in thinking that falling back on a (significantly) slower engine as your debug solution doesn’t sound like the theoretically best debug productivity. Users want a simulation-like debug productivity experience with their hardware acceleration engines. Again, emulation has advantages over prototyping in debug productivity. When you combine the compilation and debug advantages of emulation over prototyping, it is easy to understand why emulation is typically used earlier in the flow, when bugs in the hardware are more likely to be found and design changes are relatively frequent. FPGA prototyping is typically used as a platform to enable early SW development and, at least some system-level pre-silicon validation.
  • Verification capabilities:  While hardware acceleration engines were used primarily or solely for pre-silicon validation, they could be viewed as laboratory instruments. But as their use continues to shift to earlier in the verification and validation flow, the need for them to become 1st class verification engines grows. That is why hardware acceleration engines are now supporting:
    • UPF for power-managed designs
    • Code and, more appropriately, functional coverage
    • Virtual (non-ICE) usage modes which allow verification environments to be connected to the DUT being emulated or prototyped. While a verification environment might be equated with a UVM testbench, it is actually a far more general term, especially in the context of hardware accelerated verification. The verification environment may consist of soft models of things that exist in the environment the system will be used in (validation context). For example, a soft model of a display system or Ethernet traffic generator or a mass storage device. Soft models provide advantages including controllability, reproducibility (for debug) and easier enterprise management and exploitation of the hardware acceleration technology. It may also include a subsystem of the chip design itself. Today, it has become relatively common to connect a fast model written in software (usually C/C++) to an emulator or FPGA prototype. This is referred to as hybrid emulation or hybrid prototyping. The most common subsystem of a chip to place in a software model is the processor subsystem of an SoC. These models usually exist to enable early software development and can run at speeds equivalent to ~100 MHz. When the processor subsystem is well verified and validated, typically a reused IP subsystem, then hybrid mode can significantly increase the verification cycles of other blocks and subsystems, especially driving tests using embedded software and verifying functionality within a full chip context. Hybrid mode can rightfully be viewed as a sub-category of the virtual usage mode of hardware acceleration.
    • As with simulation and formal before it, hardware acceleration solutions are evolving targeted verification “applications” to facilitate productivity when verifying specific objectives or target markets. For example, a DFT application accelerates and facilitates the validation of test vectors and test logic which are usually added and tested at the gate-level.

In conclusion, it may seem that simulation is being used less today. But, it is all relative. The total number of verification cycles is growing exponentially. More simulation cycles are being performed today even though hardware acceleration and formal cycles are taking relatively larger pieces of the overall verification pie. Formal is growing in appeal as a complementary engine. Because of its comprehensive verification nature, it can significantly bend the cost curve for high-valued (difficult/challenging) verification tasks and objectives. The size and complexity of designs today require the application of all verification engines to the challenges of verifying and validating (pre-silicon) the hardware design and enabling early SW development. The use of hardware acceleration continues to shift-left and be used earlier in the verification and validation flow causing emulation and FPGA prototyping to evolve into full-fledged verification engines (not just ICE validation engines).

Rapid Prototyping is an Enduring Methodology

Thursday, September 24th, 2015

Gabe Moretti, Senior Editor

When I started working in the electronic industry hardware development had prototyping using printed circuit boards (PCB) as its only verification tool.   The method was not “rapid” since it involved building and maintaining one or more PCBs.  With the development of the EDA industry simulators became an alternative method, although they really only achieved popularity in the ‘80s with the introduction of hardware description languages like Verilog and VHDL.

Today the majority of designs are developed using software tools, but rapid prototyping is still a method used in a significant portion of designs.  In fact hardware based prototyping is a growing methodology, mostly due to the increased power and size of FPGA devices.  It can now really be called “rapid prototyping.”

Rapid Prototyping Defined

Lauro Rizzatti, a noted expert on the subject of hardware based development of electronics, reinforces the idea in this way: “Although one of the oldest methods used to verify designs, dating back to the days of breadboarding, FPGA prototyping is still here today, and more and more it will continue to be used in the future. “

Saba Sharifi, VP Business Development, Logic Business Unit, System LSI Group at Toshiba America Electronic Components, describes the state of rapid prototyping as follows: “While traditional virtual prototyping involves using CAD and CAE tools to validate a design before building a physical prototype, rapid prototyping is growing in popularity as a method for prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development. In a rapid prototyping environment, a user may start development using an FPGA-based system, and then choose either to keep the design in the FPGA, or to transfer it into a hard-coded solution such as an ASIC. There are a number of different ways to achieve this end.”

To support the hardware based prototyping methodology Toshiba has introduced a new type of Device: Toshiba’s Fast Fit Structured Array (FFSA).   The FFSA technology utilizes metal-configurable standard cell (MCSC) SoC platform technology for designing ASICs and ASSPs, and for replacing FPGAs. Designed with FPGA capabilities in mind, FFSA provides pre-developed wrappers that can realize some of the key FPGA functionality, as well as pre-defined master sizes, to facilitate the conversion process.

According to Saba “In a sense, it’s an extension to traditional FPGA-to-ASIC rapid prototyping – the second portion of the process can be achieved significantly faster using the FFSA approach.  The goal with FFSA technology is to enable developers to reduce their time to market and non-recurring engineering (NRE) costs by minimizing customizable layers (to four metal layers) while delivering the performance, power and lower unit costs associated with standard-cell ASICs.   An FFSA device speeds front-end development due to faster timing closure compared to a traditional ASIC, while back-end prototyping is improved via the pre-defined master sizes. In some cases, customers pursue concurrent development – they get the development process started in the FPGA and begin software development, and then take the FPGA into the FFSA platform. The platform takes into consideration the conversion requirements to bring the FPGA design into the hard-coded FFSA so that the process can be achieved more quickly and easily.  FFSA supports a wide array of interfaces and high speed Serdes (up to 28G), making it well suited for wired and wireless networking, SSD (storage) controllers, and a number of industrial consumer applications. With its power, speed, NRE and TTM benefits, FFSA can be a good solution anywhere that developers have traditionally pursued an FPGA-based approach, or required customized silicon for purpose-built applications with moderate volumes.”

According to Troy Scott, Product Marketing Manager, Synopsys: “FPGA-based prototypes are a class of rapid prototyping methods that are very popular for the promise of high-performance with a relatively low cost to get started. Prototyping with FPGAs is now considered a mainstream verification method which is implemented in a myriad of ways from relatively simple low-cost boards consisting of a single FPGA and memory and interface peripherals, to very complex chassis systems that integrate dozens of FPGAs in order to host billion-gate ASIC and SoC designs. The sophistication of the development tools, debug environment, and connectivity options vary as much as the equipment architectures do. This article examines the trends in rapid prototyping with FPGAs, the advantages they provide, how they compare to other simulation tools like virtual prototypes, and EDA standards that influence the methodology.

According to prototyping specialists surveyed by Synopsys the three most common goals they have are to speed up RTL simulation and verification, validate hardware/software systems, and enable software development. These goals influence the design of a prototype in order to address the unique needs of each use case. “

Figure 1. Top 3 Goals for a Prototype  (Courtesy of Synopsys, Inc)

Advantages of Rapid Prototyping

Frank Schirrmeister, Group Director for Product Marketing, System Development Suite, Cadence provides a business  description of the advantages: “The growth of FPGA-based prototyping is really primarily driven by software needs because software developers are, in terms of numbers of users, the largest consumer base. FPGA-based prototyping also provides satisfactory speed while delivering the real thing at RTL accuracy and at a reasonable replication cost.”

Stephen Bailey, Director of Emerging Technologies, DVT at Mentor graphics puts it as follows: “Performance, more verification cycles in a given period of time, especially for software development, drives the use of rapid prototypes.  Rapid prototyping typically provides 10x (~10 MHz) performance advantage over emulation which is 1,000x (~1 MHz) faster than RTL software simulation (~1 kHz).
Once a design has been implemented in a rapid prototype, that implementation can be easily replicated across as many prototype hardware platforms as the user has available.  No recompilation is required.  Replicating or cloning prototypes provides more platforms for software engineers, who appreciate having their own platform but definitely want a platform available whenever they need to debug.”

Lauro points out that: “The main advantage of rapid prototyping is very fast execution speed that comes at the expense of a very long setup time that may take months on very large designs. Partitioning and clock mapping require an uncommon expertise. Of course, the assumption is that the design fits in the maximum configuration of the prototyping board; otherwise the approach would not be applicable. The speed of FPGA prototyping makes it viable for embedded software development. They are best used for validating application software and for final system validation.”

Troy sees the advantages of the methodology as: “To address verification tasks the prototype helps to characterize timing and pipeline latencies that are not possible with more high-level representations of the design and perhaps more dramatically the prototype is able to reach execution speeds in the hundreds of megahertz. Typical prototype architectures for verification tasks rely on a CPU-based approach where traffic generation for the DUT is written as a software program. The CPU might be an external workstation or integrated with the prototype itself. Large memory ICs adjacent to the FPGAs store input traffic and results that is preloaded and read back for analysis. Prototypes that can provide an easy to implement test infrastructure that includes memory ICs and controllers, a high-bandwidth connection to the workstation, and a software API will accelerate data transfer and monitoring tasks by the prototyping team.

Software development and system validation tasks will influence the prototype design as well. The software developer is seeking an executable representation to support porting of legacy code and develop new device drivers for the latest interface protocol implementation. In some cases the prototype serves as a way for a company to deploy an architecture design and software driver examples to partners and customers. Both schemes demand high execution speed and often real world interface PHY connections. For example, consumer product developers will seek USB, HDMI, and MIPI interfaces while an industrial product will often require ADC/DAC or Ethernet interfaces. The prototype then must provide an easy way to connect to accessory interfaces and ideally a catalog of controllers and reference designs. And because protocol validation may require many cycles to validate, a means to store many full milliseconds of hardware trace data helps compliance check-out and troubleshooting.”

Rapid Prototyping versus Virtual Prototyping

According to Steve “Rapid prototyping, also called FPGA prototyping, is based on a physical representation of the design under test (DUT) that gets mapped on an array of FPGA devices. Virtual prototyping is based on a virtual representation of the DUT. The source code of an FPGA prototype is RTL code, namely synthesizable design. The source code of a virtual prototype is a design description at a higher level of abstraction, either based on C/C++/SystemC or SystemVerilog languages, that is not synthesizable.  Its rather limited support for hardware debugging hinders its ability to verify drivers and operating systems, where hardware emulation excels.”

Troy describes Synopsys position as such: “To address verification tasks the prototype helps to characterize timing and pipeline latencies that are not possible with more high-level representations of the design and perhaps more dramatically the prototype is able to reach execution speeds in the hundreds of megahertz. Typical prototype architectures for verification tasks rely on a CPU-based approach where traffic generation for the DUT is written as a software program. The CPU might be an external workstation or integrated with the prototype itself. Large memory ICs adjacent to the FPGAs store input traffic and results that is preloaded and read back for analysis. Prototypes that can provide an easy to implement test infrastructure that includes memory ICs and controllers, a high-bandwidth connection to the workstation, and a software API will accelerate data transfer and monitoring tasks by the prototyping team.

Software development and system validation tasks will influence the prototype design as well. The software developer is seeking an executable representation to support porting of legacy code and develop new device drivers for the latest interface protocol implementation. In some cases the prototype serves as a way for a company to deploy an architecture design and software driver examples to partners and customers. Both schemes demand high execution speed and often real world interface PHY connections. For example, consumer product developers will seek USB, HDMI, and MIPI interfaces while an industrial product will often require ADC/DAC or Ethernet interfaces. The prototype then must provide an easy way to connect to accessory interfaces and ideally a catalog of controllers and reference designs. And because protocol validation may require many cycles to validate, a means to store many full milliseconds of hardware trace data helps compliance check-out and troubleshooting.”

Lauro kept is answer short and to the point.  “Rapid prototyping, also called FPGA prototyping, is based on a physical representation of the design under test (DUT) that gets mapped on an array of FPGA devices. Virtual prototyping is based on a virtual representation of the DUT. The source code of an FPGA prototype is RTL code, namely synthesizable design. The source code of a virtual prototype is a design description at a higher level of abstraction, either based on C/C++/SystemC or SystemVerilog languages, that is not synthesizable.”

Cadence’s position is represented by Frank in his usual thorough style.  “Once RTL has become sufficiently stable, it can be mapped into an array of FPGAs for execution. This essentially requires a remapping from the design’s target technology into the FPGA fabric and often needs memories remodeled, different clock domains managed, and smart partitioning before the mapping into the individual FPGAs happens using standard software provided by the FPGA vendors. The main driver for the use of FPGA-based prototyping is software development, which has changed the dynamics of electronics development quite fundamentally over the last decade. Its key advantage is its ability to provide a hardware platform for software development and system validation that is fast enough to satisfy software developers. The software can reach a range of tens of MHz up to 100MHz and allows connections to external interfaces like PCIe, USB, Ethernet, etc. in real time, which leads to the ability to run system validation within the target environments.

When time to availability is a concern, virtual prototyping based on transaction-level models (TLM) can be the clear winner because virtual prototypes can be provided independently of the RTL that the engines on the continuum require. Everything depends on model availability, too. A lot of processor models today, like ARM Fast Models, are available off-the-shelf. Creating models for new IP often delays the application of virtual prototypes due to their sometimes extensive development effort, which can eliminate the time-to-availability advantage during a project. While virtual prototyping can run in the speed range of hundreds of MIPS, not unlike FPGA-based prototypes, the key differences between them are the model fidelity, replication cost, and the ability to debug the hardware.

Model fidelity often determines which prototype to use. There is often no hardware representation available earlier than virtual prototypes, so they can be the only choice for early software bring-up and even initial driver development. They are, however, limited by model fidelity – TLMs are really an abstraction of the real thing as expressed in RTL. When full hardware accuracy is required, FPGA-based prototypes are a great choice for software development and system validation at high speed. We have seen customers deliver dozens if not hundreds of FPGA-based prototypes to software developers, often three months or more prior to silicon being available.

Two more execution engines are worth mentioning. RTL simulation is the more accurate, slower version of virtual prototyping. Its low speed in the Hz or KHz range is really prohibitive for efficient software development. In contrast, due to the high speed of both virtual and FPGA-based prototypes, software development is quite efficient on both of them. Emulation is the slower equivalent of FPGA-based prototyping that can be available much earlier because its bring-up is much easier and more automated, even from not-yet-mature RTL. It also offers almost simulation-like debug and, since it also provides speed in the MHz range, emulation is often the first appropriate engine for software and OS bring-up used for Android, Linux and Windows, as well as for executing benchmarks like AnTuTu. Of course, on a per project basis, it is considered more expensive than FPGA-based prototyping, even though it can be more cost efficient from a verification perspective when considering multiple projects and a large number of regression workloads.”

Figure 2: Characteristics of the two methods (Courtesy of Synopsys Inc.)

Growth Opportunities

For Lauro the situation boils down to this: “Although one of the oldest methods used to verify designs, dating back to the days of breadboarding, FPGA prototyping is still here today, and more and more it will continue to be used in the future. The dramatic improvement in FPGA technology that made possible the manufacturing of devices of monstrous capacity will enable rapid prototyping for ever larger designs. Rapid prototyping is an essential tool in the modern verification/validation toolbox of chip designs.”
Troy thinks that there are growth opportunities and explained: “A prototype’s high-performance and relatively low investment to fabricate has led them to proliferate among IP and ASIC development teams. Market size estimates show that 70% of all ASIC designs are prototyped to some degree using an FPGA-based system today. Given this demand several commercial offerings have emerged that address the limitations exhibited by custom-built boards. The benefits of almost immediate availability, better quality, modularity for better reuse, and the ability to out-source support and maintenance are big benefits. Well documented interfaces and usage guidelines make end-users largely self-sufficient. A significant trend for commercial systems now is development and debugging features of the EDA software being integrated or co-designed along with the hardware system. Commercial systems can demonstrate superior performance, debug visibility, and bring-up efficiency as a result of the development tools using hardware characterization data, being able to target unique hardware functions of the system, and employing communication infrastructure to the DUT. Commercial high-capacity prototypes are often made available as part of the IT infrastructure so various groups can share or be budgeted prototype resources as project demands vary. Network accessibility, independent management of individual FPGA chains in a “rack” installation, and job queue management are common value-added features of such systems.

Another general trend in rapid prototyping is to mix a transaction-level model (TLM) and RTL model abstractions in order to blend the best of both to accelerate the validation task. How do Virtual versus Physical prototypes differ? The biggest contrast is often the model’s availability during the project.  In practice the latest generation CPU architectures are not available as synthesizable RTL. License and deployment restrictions can limit access or the design is so new the RTL is simply not yet available from the vendor. For these reasons virtual prototypes of key CPU subsystems are a practical alternative.  For best performance and thus for the role of software development tasks, hybrid prototypes typically join an FPGA-based prototype, a cycle-accurate implementation in hardware with a TLM prototype using a loosely-timing (LT) coding style. TLM abstracts away the individual events and phases of the behavior of the system and instead focuses on the communication transactions. This may be perfectly acceptable model for the commercial IP block of a CPU but may not be for new custom interface controller-to-PHY design that is being tailored for a particular application. The team integrating the blocks of the design will assess that the abstraction is appropriate to satisfy the verification or validation scenarios.  Although one of the oldest methods used to verify designs, dating back to the days of breadboarding, FPGA prototyping is still here today, and more and more it will continue to be used in the future. The dramatic improvement in FPGA technology that made possible the manufacturing of devices of monstrous capacity will enable rapid prototyping for ever larger designs. Rapid prototyping is an essential tool in the modern verification/validation toolbox of chip designs.”

Steve’s described his opinion as follows: “Historically, rapid prototyping has been utilized for designs sized in the 10’s of millions of gates with some advanced users pushing capacity into the low 100M gates range.  This has limited the use of rapid prototyping to full chips on the smaller range of size and IP blocks or subsystems of larger chips.  For IP block/subsystems, it is relatively common to combine virtual prototypes of the processor subsystem with a rapid prototype of the IP block or subsystem.  This is referred to as “hybrid prototyping.
With the next generation of FPGAs such as Xilinx’s UltraScale and Altera’s Stratix-10 and the continued evolution of prototyping solutions, creating larger rapid prototypes will become practical.  This should result in expanded use of rapid prototyping to cover more full chip pre-silicon validation uses.
In the past, limited silicon visibility made debugging difficult and analysis of various aspects of the design virtually impossible with rapid prototypes.  Improvements in silicon visibility and control will improve debug productivity when issues in the hardware design escape to the prototype.  Visibility improvements will also provide insight into chip and system performance and quality that were previously not possible.”

Frank concluded that: “The growth of FPGA-based prototyping is really primarily driven by software needs because software developers are, in terms of numbers of users, the largest consumer base. FPGA-based prototyping also provides satisfactory speed while delivering the real thing at RTL accuracy at a reasonable replication cost. Second, the rollout of Cadence’s multi-fabric compiler that maps RTL both into the Palladium emulation platform and into the Protium FPGA-based prototyping platform significantly eases the trade-offs with respect to speed and hardware debug between emulation and FPGA-based prototyping. This gives developers even more options than they ever had before and widens the applicability of FPGA-based prototyping. The third driver for growth in prototyping is the advent of hybrid usage of, for example, virtual prototyping with emulation, combining fast execution for some portions of the design (like the processors) with accuracy for other aspects of the design (like graphics processing units).

Overall, rapid or FPGA-based prototyping has its rightful place in the continuum of development engines, offering users high-speed execution of an accurate representation. This advantage makes rapid or FPGA-based prototyping a great platform for software development that requires hardware accuracy, as well as for system validation.”

Conclusion

All four of the contributors painted a positive figure of rapid prototyping.  The growth of FPGA devices, both in size and speed, has been critical in keeping this type of development and verification method applicable to today’s designs.  It is often the case that a development team will need to use a variety of tools as it progresses through the task, and rapid prototyping has proven to be useful and reliable.

System Level Power Budgeting

Wednesday, March 12th, 2014

Gabe Moretti, Contributing Editor

I would like to start by thanking Vic Kulkarni, VP and GM at Apache Design a wholly owned subsidiary of ANSYS, Bernard Murphy, Chief Technology Officer at Atrenta,and Steve Brown, Product Marketing Director at Cadence for contributing to this article.

Steve began by nothing that defining a system level power budget for a SoC starts from chip package selection and the power supply or battery life parameters. This sets the power/heat constraint for the design, and is selected while balancing functionality of the device, performance of the design, and area of the logic and on-chip memories.

Unfortunately, as Vic points out semiconductor design engineers must meet power specification thresholds, or power budgets, that are dictated by the electronic system vendors to whom they sell their products.   Bernard wrote that accurate pre-implementation IP power estimation is almost always required. Since almost all design today is IP-based, accurate estimation for IPs is half the battle. Today you can get power estimates for RTL with accuracy within 15% of silicon as long as you are modeling representative loads.

With the insatiable demand for handling multiple scenarios (i.e. large FSDB files) like GPS, searches, music, extreme gaming, streaming video, download data rates and more using mobile devices, dynamic power consumed by SOCs continues to rise in spite of strides made in reducing the static power consumption in advanced technology nodes. As shown in Figure 1, the end user demand for higher performance mobile devices that have longer battery life or higher thermal limit is expanding the “power gap” between power budgets and estimated power consumption levels.

Typical “chip power budget” for a mobile application could be as follows (Ref: Mobile companies): Active power budget = 700mW @100Mbps for download with MIMO, 100mW @IDLE-mode; Leakage power <5mW with all power-domain off etc.

Accurate power analysis and optimization tools must be employed during all the design phases from system level, RTL-to-gate level sign-off to model and analyze power consumption levels and provide methodologies to meet power budgets.

Skyrocketing performance vs. limited battery & thermal limit (ref. Samsung- Apache Tech Forum)

The challenge is to find ways to abstract with reasonable accuracy for different types of IP and different loads. Reasonable methods to parameterize power have been found for single and multiple processor systems, but not for more general heterogeneous systems. Absent better models, most methods used today are based on quite simple lookup tables, representing average consumption. Si2 is doing work in defining standards in this area.

Vic is convinced that careful power budgeting at a high level also enables design of the power delivery network in the downstream design flow. Power delivery with reliable and consistent power to all components of ICs and electronic systems while meeting power budgets is known as power delivery integrity.  Power delivery integrity is analogous to the way in which an electric power grid operator ensures that electricity is delivered to end users reliably, consistently and in adequate amounts while minimizing loss in the transmission network.  ICs and electronic systems designed with inadequate power delivery integrity may experience large fluctuations in supply voltage and operating power that can cause system failure. For example, these fluctuations particularly impact ICs used in mobile handsets and high performance computers, which are more sensitive to variations in supply voltage and power.  Ensuring power delivery integrity requires accurate modeling of multiple individual components, which are designed by different engineering teams, as well as comprehensive analysis of the interactions between these components.

Methods To Model System Behavior With Power

At present engineers have a few approaches at their disposal.  Vic points out that the designer must translate the power requirements into block-level power budgeting to come up with specific metrics.

Dynamic power estimation per operating power mode, leakage power and sleep power estimation at RTL, power distribution at a glance, identification of high-power consuming areas, power domains, frequency-scaling feasibility for each IP, retention flop design trade-off, power-delivery network planning, required current consumption per voltage source and so on.

Bernard thinks that Spreadsheet Modeling is probably the most common approach. The spreadsheet captures typical application use-cases, broken down into IP activities, determined from application simulations/emulations. It also represents, for each IP in the system, a power lookup table or set of curves. Power estimation simply sums across IP values in a selected use-case. An advantage is no limitation in complexity – you can model a full smart phone including battery, RF and so on. Disadvantages are the need to understand an accurate set of use-cases ahead of deployment, and the abstraction problem mentioned above.  But Steve points out that these spreadsheets are difficult to create and maintain, and fall short for identifying outlier conditions that are critical for the end users experience.

Steve also points out that some companies are adapting virtual platforms to measure dynamic power, and improve hardware / software partitioning decisions. The main barrier to this solution remains creation of the virtual platform models, and then also adding the notion of power to the models. Reuse of IP enables reuse of existing models, but they still require effort to maintain and adapt power calculations for new process nodes.

Bernard has experienced engineers that run the full RTL against realistic software loads, dump activity for all (or a large number) of nodes and compute power based on the dump. An advantage is that they can skip the modeling step and still get an estimate as good as for RTL modeling. Disadvantages include needing the full design (making it less useful for planning) and significant slowdown in emulation when dumping all nodes, making it less feasible to run extensive application experiments.  Steve concurs.  Dynamic power analysis is a particularly useful technique, available in emulation and simulation. The emulator provides MHz performance enabling analysis of many cycles, often times with test driver software to focus on the most interesting use cases.

Bernard is of the opinion that while C/C++/SystemC Modeling seems an obvious target, it also suffers from the abstraction problem. Steve thinks that a likely architecture in this scenario has the virtual platform containing the processing subsystem and memory subsystem and executes as 100s of MHz, and the emulator contains the rest of the SoC and a replication of the memory subsystem and executes at higher speeds and provides cycle accurate power analysis and functional debugging.

Again,  Bernard wants to underscore, progress has been made for specialized designs, such as single and multiple processors, but these approaches have little relevance for more common heterogeneous systems. Perhaps Si2 work in this area will help.


Extension Media websites place cookies on your device to give you the best user experience. By using our websites, you agree to placement of these cookies and to our Privacy Policy. Please click here to accept.