Published in the June 2008 issue of Chip Design Magazine
For many years, good coding practices have been an essential component of design methodology. Register-transfer-level (RTL) structures have been known to cause unintended bugs as well as DFT, power, clock-structure, and sometimes even place-and-route issues. As the industry rapidly adopts multi-voltage techniques for low power, how will these rules and guidelines change? In this article, we’ll examine some common coding practices and evaluate how they’ll change in the low-power era.
Many testbenches are written to detect an X on various critical signals and give error messages. In some instances, they may abruptly end the testcase with a fail status. This is an outright conflict with low-power design practices, which rely on X and Z corruption to reflect logic values in shutdown. Modifying such X-detection routines to account for shutdown is one of the most common changes to current coding practices in testbenches.
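As a minimal sketch of such a modification, the checker below reports an X or Z on a critical signal only while the monitored domain is powered. The module names and the pd_on qualifier are hypothetical; in a real environment, the qualifier would come from the power-management logic or from the simulator's power-aware infrastructure.

```verilog
// Hypothetical shutdown-aware X checker. All names are illustrative assumptions.
module x_check_pd (
    input wire       clk,
    input wire       pd_on,   // assumed: 1 while the monitored domain is powered
    input wire [7:0] sig      // critical signal driven from that domain
);
    always @(posedge clk) begin
        // The reduction XOR evaluates to X if any bit of sig is X or Z.
        if ((pd_on === 1'b1) && ((^sig) === 1'bx))
            $display("%0t ERROR: X/Z on sig while domain is on: %b", $time, sig);
    end
endmodule

// Small driver that corrupts sig during a shutdown window.
module tb_x_check;
    reg       clk   = 0;
    reg       pd_on = 1;
    reg [7:0] sig   = 8'h00;

    x_check_pd chk (.clk(clk), .pd_on(pd_on), .sig(sig));

    always #5 clk = ~clk;

    initial begin
        #12 pd_on = 0; sig = 8'hxx;   // shutdown corruption: not reported
        #20 pd_on = 1;                // power is back, sig not yet re-initialized
        #20 sig = 8'h5a;              // domain drives a known value again
        #20 $finish;
    end
endmodule
```

In this driver, the signal is still X for a while after power returns, so the checker reports it. A qualifier that covers only the off state is therefore rarely enough on its own; it usually has to span the wake-up and reset sequence as well.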
For starters, consider the ubiquitous 1’b1 and 1’b0 constants that are used all over the RTL. This approach was absolutely fine in the days when the entire chip (or at least the core) had a single supply voltage and a single ground. In a multi-voltage design, there’s no such thing as a single VDD or ground connection. In addition, rails like back-bias nets or retention supplies might not even drive 1s and 0s. They may carry arbitrary voltages that aren’t equal to the VDD/VSS values, which makes it questionable to declare them as supply1/supply0 nets.
Note that emerging standards like the Unified Power Format (UPF) do define power nets/rails. In addition, a type/value can be assigned to them. This alleviates some of the difficulty in analyzing the power nets and their connections. But the burden of avoiding hardwired constants is still on the RTL designer.
In most cases, the answer seems to be to connect the constant to the local VDD/VSS of the standard cell. That may work well most of the time, especially in static multi-domain designs. However, consider the case of power-gated domains, in which both the source VDD and the switched VDD are considered supply1. It would be legal to connect either one to a supply1 connection, yet only the source VDD stays up when the domain is shut down.
Additionally, consider the case in which the constant is connected across from another domain in the design. Place-and-route tools in particular grapple with this issue in their “flat” view of the design. In the worst case, the parent module is in one power domain and instantiates constants in the portmap of a module that’s partitioned into another domain.
Note that synthesis/physical synthesis optimizes constants away. This may no longer be valid for some multi-voltage designs. This situation must be treated differently depending on whether the constant is local and subject to being turned off and whether there’s any interaction with other domains (see Figure 1).
Figure 1: Here are various possibilities for a 1’b1 in a low-power design.
One solution that will work is the creation of TIE_HI_VDDx or TIE_LO_VSSx nets. This approach will force RTL designers to explicitly identify constants and think through the implications. It also will serve as an unambiguous guide to the verification and implementation tools.
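The fragment below sketches what this could look like in RTL. The net names follow the TIE_HI_VDDx/TIE_LO_VSSx idea with a hypothetical supply pair VDD1/VSS1; the modules and ports are invented for illustration. How each named net is eventually bound to a tie cell or supply in a given domain is flow-specific and is not shown.

```verilog
// Illustrative only: cfg_block, top, and all port names are hypothetical.
module cfg_block (
    input  wire       tie_hi,    // constant 1, supplied explicitly by the parent
    input  wire       tie_lo,    // constant 0, supplied explicitly by the parent
    input  wire [1:0] mode_in,
    output wire [3:0] mode
);
    // The upper two mode bits are strapped through the named tie inputs.
    assign mode = {tie_hi, tie_lo, mode_in};
endmodule

module top (
    input  wire [1:0] mode_in,
    output wire [3:0] mode
);
    // One named tie net per supply, declared in a single place, instead of
    // anonymous 1'b1 / 1'b0 literals scattered through port maps.
    wire TIE_HI_VDD1;
    wire TIE_LO_VSS1;
    assign TIE_HI_VDD1 = 1'b1;
    assign TIE_LO_VSS1 = 1'b0;

    cfg_block u_cfg (
        .tie_hi (TIE_HI_VDD1),
        .tie_lo (TIE_LO_VSS1),
        .mode_in(mode_in),
        .mode   (mode)
    );
endmodule
```

Because the constant now has a name tied to a specific supply, a reviewer or tool can see which rail it is meant to follow when u_cfg sits in a different domain from top.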
To summarize this section, here are a few simple rules for multi-voltage low-power designs:
There are designs in which the first stage of logic is recommended to be a storage element. Although this used to help timing, it doesn’t work very well if the power domain containing the logic is turned off and the sender of the data is still powered on. In fact, it could be outright dangerous if the flip flop’s first stage is a pass transistor.
A deeper look tells us that the problem has two aspects. One is the case in which the eventual target library has a pass transistor connection at its first stage (D or scan input). When a live/on domain drives this connection and when the domain with the first-stage flip flop is off, there could be a sneak path for current. After all, the state of the gate is unknown. In rare and extreme cases, this can cause device breakdown. But the normal symptom is power wastage (see Figure 2).
Figure 2: First-stage flip flops in off domains can be problematic.
Speaking of the state of the pass transistor’s gate, it depends on the clock condition. If the clock to the domain is wiggling, it potentially connects to many first-stage CMOS gates. Thus, a lot of capacitance is wiggling even though the domain is off. In addition, this could keep opening the pass transistors described in the earlier paragraph if the external clock directly drives them. This situation is a pure waste of power and must be avoided.
Here are a few rules for coding IP blocks:
It’s customary to write testbench code to monitor various functions in the code. Similarly, assertions may be written either at the testbench level or deep in the code. Unfortunately, when most of these assertions or monitors were written, there may not have been any multi-voltage architecture planned. The verification engineer is likely to encounter many tricky situations when migrating such a testbench/environment to the multi-voltage world.
First, consider monitor statements that directly access named nets hierarchically (ouch, bad coding to start!). If the domain goes to shutdown, these nets may be assigned to z or x, throwing off code written in the monitor. Similarly, assertions may not factor in shutdown conditions. It’s not as simple as factoring in x and z values in the code, however. A power-state transition like shutdown goes through a number of pre- and post-shutdown management events, such as clock gating, multiple resets, retention/restore sequence, etc. The monitor or assertion set needs to “stall” or account for these transitionary states. In fact, new assertion and monitor code may be needed to factor in the power-state tables.
Broadly speaking, the change in monitor/assertion code is that there may be the following: code that’s “always” monitoring the block, code that’s “off” when the block is off, code that’s “on” when the block is off, and any further code to monitor transitionary states. Extending this concept further, force statements at the testbench level may make cross-module references. Such references are done particularly to set up pin strap options, device ID bits, etc. These force statements can conflict with any assignments that are being done by the simulator, especially in shutdown and retention situations. Even without low-power design, using cross-module force statements is a big no-no. Low-power design adds further twists to the usage of this construct.
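The categories of monitor code described above can be sketched as a single checker keyed on the power state. Everything here is an assumption made for illustration: the four-state encoding of pstate, the rule that an isolated output clamps to 0 when the domain is off, and the signal names. The checker receives its signals through ports instead of reaching into the design hierarchically.

```verilog
// Hypothetical power-state-aware monitor. Encoding and names are assumptions.
module pd_monitor (
    input wire       clk,
    input wire [1:0] pstate,    // power state of the monitored block
    input wire       iso_out,   // isolated output, assumed to clamp to 0 when off
    input wire       valid      // functional signal from the block
);
    localparam [1:0] PS_ON        = 2'd0,
                     PS_GOING_OFF = 2'd1,
                     PS_OFF       = 2'd2,
                     PS_WAKING    = 2'd3;

    always @(posedge clk) begin
        // "Always" check: the power state itself must never be unknown.
        if ((^pstate) === 1'bx)
            $display("%0t ERROR: power state is unknown", $time);

        case (pstate)
            // Check that is active only while the block is on.
            PS_ON:
                if (valid !== 1'b0 && valid !== 1'b1)
                    $display("%0t ERROR: valid is X/Z while block is on", $time);

            // Check that is active only while the block is off.
            PS_OFF:
                if (iso_out !== 1'b0)
                    $display("%0t ERROR: isolation clamp not held while off", $time);

            // Transition states: functional checks are stalled. Checks on the
            // shutdown or wake-up sequence itself would be added here.
            PS_GOING_OFF, PS_WAKING: begin
            end

            default: ; // unknown pstate was already reported above
        endcase
    end
endmodule
```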
Almost every testbench infrastructure utilizes initial blocks. Often, initial statements are used to load memories, set constants, set finish times/stop times, etc. If the initial block (along with readmem) is used for a block that’s off by default and wakes up only later, any initializations must be deferred until the actual wakeup. Similarly, for a block that can be turned off, any memory initialization must be repeated after every power up. In addition, such an initialization needs to be sensitive to any handshake with power good and reset signals applied to the block. Indeed, this handshake is often a source of bugs. It’s therefore best to avoid such readmem-based initializations. In the interest of simulation performance, at least a few tests must cover the actual hardware-based initialization sequence.
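Where a readmem-based shortcut is kept anyway, a sketch like the following shows one way to tie it to power good and reset instead of a one-time initial block. The signals pwr_good and rst_n, the memory size, and the file name boot.hex are all assumptions; a real block would follow its own handshake.

```verilog
// Hypothetical memory model. pwr_good, rst_n, and "boot.hex" are assumptions.
module boot_mem (
    input  wire       clk,
    input  wire       pwr_good,  // assumed: high while the domain supply is valid
    input  wire       rst_n,     // active-low reset applied after power up
    input  wire [3:0] addr,
    output reg  [7:0] dout
);
    reg [7:0] mem [0:15];
    integer   i;

    // Power loss corrupts the contents, as a shutdown would.
    always @(negedge pwr_good)
        for (i = 0; i < 16; i = i + 1)
            mem[i] = 8'hxx;

    // Reload on every reset release that occurs with power good,
    // not once at time 0 in an initial block.
    always @(posedge rst_n)
        if (pwr_good === 1'b1)
            $readmemh("boot.hex", mem);

    // Reads return X while the domain is unpowered.
    always @(posedge clk)
        dout <= (pwr_good === 1'b1) ? mem[addr] : 8'hxx;
endmodule
```

With this structure, a test that powers the block down and back up without releasing reset again reads X, which is the kind of handshake bug the shortcut would otherwise hide.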
In contrast, storage such as nonvolatile memory bits, laser fuse bits, and one-time-programmable bits must not be corrupted by shutdown. Unfortunately, current HDLs do not provide a simulation semantic for such bits in the first place. In the era of low-power design, recognizing and supporting such bits is essential to accurate verification. Note further that these bits don’t wake up instantly. There’s a point of activation along the rising power rail as it turns on. In addition, the protocol often involves power good and reset signals to “latch” these bits, adding further complexity to how this mechanism can be verified.
Extending this concept further, any PLI routines that form behavioral models or collect data (including debug/coverage routines) must be aware of the shutdown conditions. For example, a CPU simulation model may be built in C and hidden inside a wrapper. A shutdown of the CPU may completely escape such a model. In fact, the model may not only fail to shut down accurately; it may wake up or reset incorrectly as well.
Retention is an altogether new semantic that’s being applied to sequential elements. Consider a sequential element like a flip flop being assigned to be a retention element. In this case, the flip flop is probably coded (in Verilog HDL) as an always block triggered on the posedge or negedge of the clock. However, the intended behavior is that when the domain goes to shutdown, there are additional save/restore signals hooked up to the actual sequential elements that retain and restore the value of the bit. There are numerous implementations of retention elements available. They change the protocol that’s followed for save/restore and further impact the behavior of clock and reset (and scan in some cases). So, the same RTL may need to be simulated differently, depending on the actual behavior of the element being used in the context of instantiation. For example, a reset may clear the output of the flip flop but not the retention element. In addition, there may be a special reset pin needed to flush the retention element itself.
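To make the semantic concrete, here is a behavioral sketch of one possible retention flip flop. It is not a model of any specific library cell: the pin names, the edge-triggered save and restore, the priority of restore over reset, and the choice that reset clears only the output are all assumptions. Other retention cells follow different protocols.

```verilog
// Behavioral sketch of one possible retention flop. Simulation only.
module ret_ff (
    input  wire clk,
    input  wire rst_n,     // clears q only, not the retained copy (assumption)
    input  wire pwr_on,    // assumed: 1 while the switched supply is up
    input  wire save,      // rising edge copies q into the shadow latch
    input  wire restore,   // rising edge copies the shadow latch back into q
    input  wire d,
    output reg  q
);
    reg shadow;            // modeled as powered by the always-on supply

    always @(posedge save)
        if (pwr_on === 1'b1)
            shadow <= q;

    always @(posedge clk or negedge rst_n or negedge pwr_on or posedge restore) begin
        if (pwr_on !== 1'b1)       q <= 1'bx;    // shutdown corrupts the output
        else if (restore === 1'b1) q <= shadow;  // restore wins over reset and clock
        else if (!rst_n)           q <= 1'b0;
        else                       q <= d;
    end
endmodule

// Driver: save, power down, power up, restore.
module tb_ret_ff;
    reg clk = 0, rst_n = 0, pwr_on = 1, save = 0, restore = 0, d = 0;
    wire q;

    ret_ff dut (.clk(clk), .rst_n(rst_n), .pwr_on(pwr_on),
                .save(save), .restore(restore), .d(d), .q(q));

    always #5 clk = ~clk;

    initial begin
        $monitor("%0t pwr_on=%b save=%b restore=%b d=%b q=%b",
                 $time, pwr_on, save, restore, d, q);
        #12 rst_n = 1; d = 1;               // q captures 1 on the next clock edge
        #10 save = 1;                       // retain the current value of q
        #2  save = 0; pwr_on = 0; d = 0;    // shut the domain down
        #22 pwr_on = 1;                     // supply returns, q is still unknown
        #2  restore = 1;                    // retained value comes back
        #10 restore = 0;
        #12 $finish;
    end
endmodule
```

The ordinary always-block coding of a flip flop carries none of this behavior, which is why the same RTL can simulate differently depending on which retention element and protocol are applied to it.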
The complexity, of course, also stems from the fact that the original RTL doesn’t have save/restore pins instantiated locally in the first place. This implies that such a “connection” is done by the UPF file on the side. While this is extremely convenient and useful for the overall flow, RTL and gate-level simulation results may vary based on how the save/restore signals are connected in the netlist.
It’s common practice to synchronize asynchronous signals while crossing domains. Yet power-management control loops involve many asynchronous signals whose “state” is relevant to the power-management unit. Although synchronizers can still be used, one must recognize that there may be additional isolation latches in the path, making the synchronizers somewhat redundant. Furthermore, the design may enter a deadlock state by gating the clock to the synchronizers while waiting for a wakeup event that never makes it past the gated synchronizers.
The industry as a whole is going through a phase in which the representations for voltage are being hashed out in various standard formats, such as UPF. It’s also fair to say that emerging power formats will go through an evolution in coding guidelines themselves, apart from the changes noted here. This is just the beginning. We must proactively evolve best practices for multi-voltage design. This article has focused mostly on Verilog coding practices and verification-centric guidelines. Yet additional guidelines will generally evolve all across the flow. Semiconductor and EDA companies are just formulating these rules and undergoing the learning process. Hopefully, there won’t be many painful mistakes along the way for design teams.
Srikanth Jadcherla is group director of R&D in the Verification Group of Synopsys.