Published on May 27th, 2008
Designing in 65-nm and smaller geometries can be
intimidating. But engineers can ensure first-time silicon
success by paying attention to certain aspects of their design
methodologies. Many of those methodologies have already
been used for 130- and 90-nm designs. They simply need
more attention at 65 nm and beyond (see Figure 1). Power
management, for example, is a key aspect of design—especially
if the application is battery powered. There are two kinds of
power: active and leakage. A variety of techniques exist to
manage both types.
Figure 1: A 65-nm flow must take into
consideration low power, timing, and
For active power, it’s essential to incorporate power-management
techniques at the architectural level. Yet additional techniques are
available during the physical implementation of the design. Because
a significant portion of the power dissipated is due to net capacitance,
high-activity signals are routed to minimize interconnect
capacitance. In addition, logic is restructured to minimize the fanout
of high-activity signals.
During timing optimization, cell sizing and buffer insertion
are performed with power taken into consideration. Because
a significant portion of active power is in the clock network,
several methods are used to reduce clock-tree power. Clock
gating optimizes clock power while integrated clock-gate cells
eliminate the need for close placement of the flip flop and gating
logic. In addition, clock gating is integrated with clock-tree
synthesis to optimize power with timing, skew, and insertion
delay constraints. Other tools create custom flip flops to reduce
power even further. Another useful method is using multiple
voltage islands, as power is proportional to the square of the
voltage. Yet that approach requires multiple power supplies, and
more than two generally isn’t practical.
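As a rough illustration of that square-law dependence, the sketch below uses the common first-order switching-power estimate, P = activity x C x V^2 x f. The function name and every numeric value are assumptions chosen for illustration; none come from a real design.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """First-order switching power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz

# Hypothetical block: 1 nF of switched capacitance, 20% activity, 500 MHz clock.
nominal = dynamic_power(0.2, 1e-9, 1.2, 500e6)     # island at 1.2 V
low_island = dynamic_power(0.2, 1e-9, 1.0, 500e6)  # same logic at 1.0 V
savings = 1 - low_island / nominal

print(f"1.2 V island: {nominal * 1e3:.1f} mW")
print(f"1.0 V island: {low_island * 1e3:.1f} mW")
print(f"Relative saving: {savings:.1%}")
```

With these assumed numbers, lowering the island supply from 1.2 V to 1.0 V cuts switching power by roughly 30%, before accounting for the cost of the second supply.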
Leakage power is a concern as geometries shrink and the transistor’s
gate-oxide thickness and operating voltage get
smaller. At 65 nm and below, it is a significant component of
the total power—especially at higher operating temperatures.
The most effective way to reduce leakage power is to use power
islands and shut off sections of the design when they’re not
active. This approach reduces total power consumption including
leakage power. Two considerations need to be taken into account
with power islands. First, critical data should be saved in data-retention
flip flops before shutting down. Second, when the main power
island is shut down, its output signals go into a
high-impedance state and leave the inputs of the active island
in a floating state. Depending on the voltage level at these floating
inputs, short-circuit currents could occur in the active island’s
input stages. They would consume significant power. As a result,
inputs are always gated with sleep-mode signals to ensure that
the signals are always in a known state. The wake-up time is
typically in the range of a few milliseconds.
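The input gating described above can be modeled behaviorally. This is a minimal sketch rather than a circuit description: the function `isolate`, its parameters, and the use of `None` to stand for a floating net are all assumptions made for illustration.

```python
from typing import Optional

def isolate(signal: Optional[int], sleep: int, clamp: int = 0) -> int:
    """Behavioral model of an isolation gate on an island boundary.

    signal: 0 or 1 when the driving island is powered, None when it floats.
    sleep:  1 while the driving island is shut down, 0 otherwise.
    clamp:  known value forced onto the active island's input during sleep.
    """
    if sleep:
        # The sleep-mode signal overrides whatever is on the wire.
        return clamp
    if signal is None:
        # A floating net reaching active logic is the failure the gating prevents.
        raise ValueError("floating input while not in sleep mode")
    return signal

print(isolate(1, sleep=0))     # driving island awake: value passes through
print(isolate(None, sleep=1))  # driving island off: input held at the clamp value
```

The point of the model is that the active island never sees `None`: the sleep signal must be asserted before the driving island loses power and released only after it is restored.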
Another way to reduce leakage power is to use footer switches with
the logic. In this design, the logic cells are connected to ground by
a low-leakage footer switch. That switch is implemented using a
high-threshold transistor. In the normal mode of operation, this
transistor is on and the logic cells have a common virtual ground. In
sleep mode, the footer switch is off—causing the logic to go inactive.
Because the footer uses high-threshold transistors, it minimizes
leakage in the logic circuitry.
Although this approach reduces leakage, it negatively affects area
and performance. To be effective, the footer switch is distributed
across the layout. Each switch controls a group of logic cells.
Additional resources are required for routing the sleep signal and
virtual ground across all switches. The switch presence causes
some degradation to the logic speed as well. As in the power-island
scenario, retention flip flops are required. But the switches
are typically fast, with wake-up times in the range of microseconds.
A third method of reducing leakage is by applying a back-gate bias
to the substrate. The MOSFET has four terminals. Yet it’s typically
treated as a three-terminal device by tying the source and substrate
terminals together. By applying a separate voltage to the substrate
terminal, the threshold voltage can be increased—thereby reducing
leakage. This approach has area and performance penalties, however.
An internal charge pump is required to generate the substrate voltage. Additional routing resources also are needed to distribute this voltage
across the logic cells. Unfortunately, the logic slows down as well.
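To see why a substrate bias lowers leakage, the sketch below combines two textbook first-order models: the long-channel body-effect equation for threshold voltage and the exponential dependence of subthreshold leakage on threshold voltage. All parameter values (zero-bias threshold, body-effect coefficient, Fermi potential, slope factor) are hypothetical, and real 65-nm devices deviate from these simple equations.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19

def threshold_voltage(vt0, gamma, phi_f, v_sb):
    """Body-effect model: Vt = Vt0 + gamma * (sqrt(2*phi_f + Vsb) - sqrt(2*phi_f))."""
    return vt0 + gamma * (math.sqrt(2 * phi_f + v_sb) - math.sqrt(2 * phi_f))

def leakage_ratio(delta_vt, slope_factor=1.5, temperature_k=300.0):
    """Subthreshold leakage relative to the unbiased case: exp(-dVt / (n * kT/q))."""
    thermal_v = BOLTZMANN_J_PER_K * temperature_k / ELECTRON_CHARGE_C
    return math.exp(-delta_vt / (slope_factor * thermal_v))

# Hypothetical NMOS: Vt0 = 0.30 V, gamma = 0.2 V^0.5, phi_f = 0.4 V.
vt_unbiased = threshold_voltage(0.30, 0.2, 0.4, v_sb=0.0)
vt_biased = threshold_voltage(0.30, 0.2, 0.4, v_sb=0.5)  # 0.5 V reverse bias
ratio = leakage_ratio(vt_biased - vt_unbiased)

print(f"Vt without bias: {vt_unbiased:.3f} V")
print(f"Vt with bias:    {vt_biased:.3f} V")
print(f"Leakage relative to unbiased: {ratio:.2f}")
```

The same rise in threshold voltage that cuts leakage also reduces drive current, which is the speed penalty noted above.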
The final method for reducing leakage is the use of multi-threshold logic
cells. This approach invites very little penalty in area or performance.
During the physical design process, multiple threshold-voltage libraries
are used for logic synthesis and optimization. Standard or low-threshold
devices are selected for critical portions of the design, while
high-threshold cells are used in non-critical sections. Typically only 20%
of the design uses the standard or low-threshold devices. The remaining
cells are therefore high threshold, enabling lower leakage.
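The selection can be pictured as a slack-driven assignment. The sketch below is a deliberately simplified assumption: it treats each cell independently and swaps it to a high-threshold version whenever its slack covers an assumed delay penalty. Real optimization tools work on whole paths, because cells on a shared path draw on the same slack. The cell names, slack values, and penalty are hypothetical.

```python
def assign_thresholds(slack_ps_by_cell, hvt_penalty_ps):
    """Pick 'HVT' for cells whose slack absorbs the high-Vt delay penalty, else 'SVT'."""
    return {
        cell: ("HVT" if slack_ps >= hvt_penalty_ps else "SVT")
        for cell, slack_ps in slack_ps_by_cell.items()
    }

# Hypothetical per-cell timing slack (ps) with standard-threshold cells.
slacks = {"u_alu_add": 5, "u_irq": 60, "u_dec": 120, "u_cfg_reg": 400, "u_dbg": 900}
assignment = assign_thresholds(slacks, hvt_penalty_ps=50)

svt_share = sum(1 for vt in assignment.values() if vt == "SVT") / len(assignment)
print(assignment)
print(f"Standard-threshold share: {svt_share:.0%}")
```

In this made-up example only the timing-critical adder stays on the faster, leakier cell; the rest of the cells move to high threshold.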
Typically, timing signoff is achieved by performing analyses at the
fast and slow corners of a design. In 65-nm and smaller geometries, the
worst-case corner isn’t clearly defined. It also is design dependent—
especially with a large operating-voltage range due to multiple voltage
islands. The corner for the worst-case timing path becomes circuit
dependent. In addition, the circuit timing behavior is no longer
monotonic. As a result, there may be multiple conditions when the
worst-case timing occurs. Such conditions arise due to temperature
inversion, in which a transistor becomes slower at a lower temperature
and affects the delay of a path depending on its location. Clearly,
timing analysis needs to be performed at more corners.
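A small sketch shows why one slow corner is no longer enough. The path delays below are invented, and the corner names are hypothetical labels; the only point is that with temperature inversion, different paths can find their worst setup delay at different corners.

```python
def worst_corner(delays_by_corner):
    """Return the corner name with the largest path delay (setup check)."""
    return max(delays_by_corner, key=delays_by_corner.get)

def setup_report(paths, clock_period_ps):
    """Map each path to (worst corner, setup slack in ps at that corner)."""
    report = {}
    for path_name, delays in paths.items():
        corner = worst_corner(delays)
        report[path_name] = (corner, clock_period_ps - delays[corner])
    return report

# Hypothetical delays in ps. path_b is slowest when cold (temperature inversion).
paths = {
    "path_a": {"slow_hot": 950, "slow_cold": 910, "fast_cold": 600},
    "path_b": {"slow_hot": 880, "slow_cold": 930, "fast_cold": 590},
}
report = setup_report(paths, clock_period_ps=1000)
for name, (corner, slack) in report.items():
    print(f"{name}: worst at {corner}, slack {slack} ps")
```

Analyzing only the hot slow corner would overstate the slack of `path_b` in this example, which is why the corner list has to grow.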
Signal integrity, crosstalk, and on-chip variations also become
significant at smaller geometries. Variations arise from process,
geometry, and power-supply changes due to IR-drop and temperature
fluctuations. The relative magnitude of these variations can be as
much as 20%. Although static timing analysis uses additional margins
for on-chip variations, it could be overly pessimistic. In other words,
achieving signoff may require additional area overhead or may simply
not be possible. Statistical-timing-analysis techniques are emerging
that promise to address both of these issues. These techniques treat
each timing path as a statistical variable and determine the probability
of it meeting the timing requirements.
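The statistical view can be sketched under two strong assumptions that real tools relax: each stage delay is Gaussian, and stage variations are independent. Under those assumptions the path delay is also Gaussian, and the probability of meeting timing follows from the normal cumulative distribution. The stage values and clock period are hypothetical.

```python
import math

def path_distribution(stages):
    """Combine (mean_ps, sigma_ps) stages, assuming independent Gaussian delays."""
    mean = sum(m for m, _ in stages)
    sigma = math.sqrt(sum(s * s for _, s in stages))
    return mean, sigma

def timing_yield(mean_ps, sigma_ps, period_ps):
    """Probability that a Gaussian path delay is within the clock period."""
    z = (period_ps - mean_ps) / sigma_ps
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical two-stage path checked against an 800 ps clock period.
mean, sigma = path_distribution([(300, 30), (400, 40)])
probability = timing_yield(mean, sigma, period_ps=800)

print(f"Path delay: mean {mean} ps, sigma {sigma:.0f} ps")
print(f"Probability of meeting timing: {probability:.4f}")
```

A corner-based check that simply adds a worst-case margin to every stage would report a single pass or fail for this path; the statistical result instead states how likely a failure is, which is what allows the margin to be relaxed.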
Normally, design-for-manufacturability (DFM) rules are incorporated
in a technology’s design rules. While it’s necessary for a design to
pass the technology’s design rules, that requirement isn’t sufficient
to achieve good yield. Although a design may pass a rule, how closely
it adheres to that rule and the statistical distribution of its deviation
from the rule have an impact on yield. Many foundries don’t
mandate DFM analysis for 90-nm and larger geometries. Yet it is
critical for 65-nm and smaller geometries (see Figure 2). Performing
DFM analysis for these designs can identify simple changes that will
improve overall yield. DFM rules are useful in the early stages of a
technology. As the technology matures, the manufacturing process
is under tighter control and becomes more robust.
Two types of DFM rules exist. The first set affects critical parameters
of the circuit and its operation. It is used by extraction tools to more
accurately model the physical layout. This set also is applied to cell
libraries, analog circuits, and custom layout designs. The second set of
rules isn’t critical. But it should improve yield.
Figure 2: DFM-aware design flows are a necessity
at 65 nm to address emerging physical effects
earlier in the design process.
Many foundries perform DFM analysis for a design and make
suggestions to improve yield. Some of those suggestions can be
implemented by modifying the layout without increasing the design
area. Such capabilities also have been introduced in the layout tools
based on a target foundry. These can be applied throughout the
design flow—from cell level to completed chip. Wire spreading and
wire widening, for example, can be performed to reduce critical areas
during multiple stages of the flow. Examples of such stages include
global and detailed routing. Lithography process check (LPC) is an
analysis that’s superior to rule-based systems, which don’t address
design complexity and variability well. By using a model-based
approach that utilizes lithography simulation, these tools provide
better accuracy. Lithography-aware routing and extraction provide a
more robust design and sign-off.
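The link between critical area and yield can be illustrated with the simple Poisson yield model, Y = exp(-A x D), where A is the critical area and D is the defect density. This model is an assumption brought in for illustration, not one named by the foundries or tools discussed here, and the areas and defect density below are hypothetical.

```python
import math

def poisson_yield(critical_area_cm2, defect_density_per_cm2):
    """Poisson yield estimate: exp(-critical_area * defect_density)."""
    return math.exp(-critical_area_cm2 * defect_density_per_cm2)

# Hypothetical design: wire spreading trims critical area from 0.40 to 0.30 cm^2.
before = poisson_yield(0.40, defect_density_per_cm2=0.5)
after = poisson_yield(0.30, defect_density_per_cm2=0.5)

print(f"Yield before spreading: {before:.1%}")
print(f"Yield after spreading:  {after:.1%}")
```

Because the change touches only wire spacing and width, a gain of this kind can come without any increase in die area.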
In summary, designing in 65 nm and beyond isn’t as hard as one
would imagine. There are more steps with more pitfalls. As long
as the designer is aware of them and proactively addresses them
in the methodology, however, he or she has a good chance of
first-time silicon success.