Gabe Moretti, Senior Editor
As transistors shrink and their structures change, power management becomes more critical. As I polled various EDA vendors, it became clear that most offer solutions for analyzing power requirements and software-based methods to manage power use, while at least one offers a hardware-based solution. I struggled to find a way to present their responses to my questions coherently, and decided that extracting only selected pieces of their written responses would not be fair. So I organized a type of virtual round table, and I present their complete answers in this article.
The companies submitting responses are Cadence, Flex Logix, Mentor, Silvaco, and Sonics. Some of the companies also presented their own understanding of the problem. I am including that portion of their contributions as well, to provide better context for the descriptions of their solutions.
Krishna Balachandran, product management director for low power solutions at Cadence, provided the following contribution.
Not too long ago, low power design and verification involved coding a power intent file, driving a digital design from RTL to final place-and-route, and having each tool in the flow understand and correctly and consistently interpret the directives specified in the power intent file. Low power techniques such as power shutdown, retention, standby and Dynamic Voltage and Frequency Scaling (DVFS) had to be supported in the power formats and EDA tools. Today, the semiconductor industry has coalesced around CPF and the IEEE 1801 standard, which evolved from UPF and includes the CPF contributions as well. However, this has not equated to problem solved and case closed. Far from it! Challenges abound. Power reduction and low power design, once the bailiwick of mobile designers, have moved front-and-center into almost every semiconductor design imaginable, be it a mixed-signal device targeting the IoT market or large chips targeting the datacenter and storage markets. With competition mounting, differentiation comes in the form of end-products and systems that consume less power.
There is an increasing realization that power needs to be tackled at the earliest stages in the design cycle. Waiting to measure power after physical implementation is usually a recipe for multiple, non-converging iterations because power is fundamentally a trade-off vs. area or timing or both. The traditional methodology of optimizing for timing and area first and then dealing with power optimization is causing power specifications to be non-convergent and product schedules to slip. However, having a good handle on power at the architecture or RTL stage of design is not a guarantee that the numbers will meet the target after implementation. In other words, it is becoming imperative to start early and stay focused on managing power at every step.
It goes without saying that what can be measured accurately can be well-optimized. Therefore, the first and necessary step to managing power is to get an accurate and consistent picture of power consumption from RTL to gate level. Most EDA flows in use today use a combination of different power estimation/analysis tools at different stages of the design. Many of the available power estimation tools at the RTL stage of design suffer from inaccuracies because physical effects like timing, clock networks, library information and place-and-route optimizations are not factored in, leading to overly optimistic or pessimistic estimates. Popular implementation tools (synthesis and place-and-route) perform optimizations based on measures of power using built-in power analysis engines. There is poor correlation between these disparate engines leading to unnecessary or incorrect optimizations. In addition, mixed EDA-vendor flows are plagued by different algorithms to compute power, making the designer’s task of understanding where the problem is and managing it much more complicated. Further complications arise from implementation algorithms that are not concurrently optimized for power along with area and timing. Finally, name-mapping issues prevent application of RTL activity to gate-level netlists, increasing the burden on signoff engineers to re-create gate-level activity to avoid poor annotation and incorrect power results.
To get a good handle on the power problem, the industry needs a highly accurate but fast power estimation engine at the RTL stage that helps evaluate and guide the design’s micro-architecture. That requires the tool to be cognizant of physical effects – timing, libraries, clock networks, even place-and-route optimizations at the RTL stage. To avoid correlation problems, the same engine should also measure power after synthesis and place-and-route. An additional requirement to simplify and shorten the design flow is for such a tool to be able to bridge the system-design world with signoff and to help apply RTL activity to a gate-level netlist without any compromise. Implementation tools, such as synthesis and place-and-route, need to have a “concurrent power” approach – that is, consider power as a fundamental cost-factor in each optimization step side-by-side with area and timing. With access to such tools, semiconductor companies can put together flows that meet the challenges of power at each stage and eliminate iterations, leading to a faster time-to-market.
Geoff Tate, Co-founder and CEO of Flex Logix, is the author of the following contribution.
Our company is a relatively new entrant in the embedded FPGA market, and it uses TSMC as a foundry. Microcontrollers and IoT devices being designed in TSMC’s new ultra-low power 40nm process (TSMC 40ULP) need:
• The flexibility to reconfigure critical RTL, such as I/O
• The ability to achieve performance at lowest power
Flex Logix has designed a family of embedded FPGAs to meet this need. The validation chip to prove out the IP is in wafer fab now.
Many products fabricated with this process are battery operated: there are brief periods of performance-sensitive activity interspersed with long periods of very low power mode while waiting for an interrupt.
Flex Logix’s embedded FPGA core provides options to enable customers to optimize power and performance based on their application requirements.
To address this requirement, the following architectural enhancements were included in the embedded FPGA core:
• Power management with 5 different power states:
- Off state, where the EFLX core is completely powered off.
- Deep Sleep state, where the VDDH supply to the EFLX core can be lowered from its nominal 0.9V/1.1V to 0.5V while retaining state.
- Sleep state, which gates the supply (VDDL) that controls all the performance logic, such as the LUTs, DSP and interconnect switches of the embedded FPGA, while retaining state. The latency to exit Sleep is shorter than that to exit Deep Sleep.
- Idle state, which idles the clocks to cut power but is ready to move into dynamic mode more quickly than the Sleep state.
- Dynamic state, where power is the highest of the 5 power management states but latency is the shortest; it is used during periods of performance-sensitive activity.
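The ordering of these states can be sketched in a few lines of Python. This is only an illustration of the trade-off described above, not Flex Logix's control logic: the names `PowerState`, `RETAINS_STATE`, `wake_rank` and `pick_state` are hypothetical, the wake-up figures are relative ranks rather than measured latencies, and treating Off as the slowest state to wake and as losing its state is an assumption.

```python
from enum import IntEnum


class PowerState(IntEnum):
    """The five states listed above, ordered from lowest to highest power."""
    OFF = 0
    DEEP_SLEEP = 1
    SLEEP = 2
    IDLE = 3
    DYNAMIC = 4


# Deep Sleep and Sleep retain state per the text; Idle and Dynamic stay powered.
# OFF losing its state is an assumption, not a statement from the source.
RETAINS_STATE = {
    PowerState.OFF: False,
    PowerState.DEEP_SLEEP: True,
    PowerState.SLEEP: True,
    PowerState.IDLE: True,
    PowerState.DYNAMIC: True,
}


def wake_rank(state: PowerState) -> int:
    """Relative wake-up latency, 0 being the fastest. Ranks only, not times."""
    return int(PowerState.DYNAMIC) - int(state)


def pick_state(max_wake_rank: int, need_retention: bool) -> PowerState:
    """Return the lowest-power state that wakes fast enough and, if required,
    retains flip-flop and configuration state."""
    candidates = [
        s for s in PowerState
        if wake_rank(s) <= max_wake_rank
        and (RETAINS_STATE[s] or not need_retention)
    ]
    return min(candidates)  # lowest enum value means lowest power
```

For example, `pick_state(max_wake_rank=1, need_retention=True)` selects Idle, because Sleep and the deeper states wake too slowly for that budget.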
The other architectural features available in the EFLX-100 embedded FPGA to optimize power-performance are:
• State retention for all flip flops and configuration bits at voltages well below the operating range.
• Ability to directly control body bias voltage levels (Vbp, Vbn). Controlling the body bias further controls leakage power
• 5 combinations of threshold voltage (VT) devices to optimize power and performance for the static/performance logic of the embedded FPGA. The higher the threshold voltage (eHVT, HVT), the lower the leakage power and the lower the performance; the lower the threshold voltage (SVT), the higher the leakage and the higher the performance.
In addition to the architectural features, various EDA flows and tools are used to optimize the Power, Performance and Area (PPA) of the Flex Logix embedded FPGA:
• The embedded FPGA was implemented using a combination of standard floor-planning and P&R tools to place and route the configuration cells, DSP and LUT macros and network fabric switches. This resulted in higher density, thereby reducing IR drops and the need for larger drive strengths, and so optimizing power.
• Longer (non-minimum) channel length devices were designed and used, which further helps reduce leakage power with minimal to no impact on performance.
• The EFLX-100 core was designed with an optimized power grid to effectively use metal resources for power and signal routing. Optimal power grids reduce DC/AC supply drops, which further increases performance.
Arvind Narayanan, Architect, Product Marketing, Mentor Graphics, contributed the following viewpoint.
One of the biggest challenges in IC design at advanced nodes is the complexity inherent in effective power management. Whether the goal is to reduce on-chip power dissipation or to provide longer battery life, power is taking its place alongside timing and area as a critical design dimension.
While low-power design starts at the architectural level, the low-power design techniques continue through RTL synthesis and place and route. Digital implementation tools must interpret the power intent and implement the design correctly, from power aware RTL synthesis, placement of special cells, routing and optimization across power domains in the presence of multiple corners, modes, and power states.
With the introduction of every new technology node, existing power constraints are also tightened to optimize power consumption and maximize performance. The 3D transistors (FinFETs) introduced at smaller technology nodes have higher input pin capacitance than their planar counterparts, so the dynamic power component is higher relative to leakage.
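The link between capacitance and dynamic power can be made explicit with the standard first-order CMOS power model. The symbols are not from the contribution and are introduced here for illustration: \alpha is the switching activity factor, C the switched capacitance, V_{DD} the supply voltage, f the clock frequency and I_{\text{leak}} the total leakage current.

```latex
P_{\text{total}} = P_{\text{dyn}} + P_{\text{leak}}, \qquad
P_{\text{dyn}} = \alpha \, C \, V_{DD}^{2} \, f, \qquad
P_{\text{leak}} = V_{DD} \, I_{\text{leak}}
```

In this model a higher pin capacitance raises C and therefore the dynamic term directly, while the quadratic dependence on V_{DD} is why supply voltage reduction is such an effective lever on dynamic power.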
Power Reduction Strategies
A good strategy to reduce power consumption is to perform power optimization at multiple levels during the design flow, including software optimization, architecture selection, RTL-to-GDS implementation and process technology choices. The biggest power savings are usually obtained early in the development cycle, at the ESL and RTL stages (Fig. 1). During the physical implementation stage there is comparatively less opportunity for power optimization, so choices made earlier in the design flow are critical. Technology selection, such as the device structure (FinFET, planar), the choice of device material (HiK, SOI) and the technology node, also plays a key role.
Figure 1. Power reduction opportunities at different stages of the design flow
Studies have shown that only optimizations applied early in the design cycle, when a design’s architecture is not yet fixed, have the potential for radical power reduction. To make intelligent decisions in power optimization, the tools have to simultaneously consider all factors affecting power, and they have to be applied early in the design cycle. Finding the best architecture makes it possible to properly balance functionality, performance and power metrics.
RTL-to-GDS Power Reduction
There are a wide variety of low-power optimization techniques that can be utilized during RTL to GDS implementation for both dynamic and leakage power reduction. Some of these techniques are listed below.
RTL Design Space Exploration
During the early stages of the design, the RTL can be modified to employ architectural optimizations, such as replacing a single instantiation of a high-powered logic function with multiple instantiations of low-powered equivalents. A power-aware design environment should facilitate “what-if” exploration of different scenarios to evaluate the area/power/performance tradeoffs.
Multi-voltage design, a popular technique to reduce total power, is a complex task because many blocks are operating at different voltages, or intermittently shut off. Level shifter and isolation cells need to be used on nets that cross domain boundaries if the supply voltages are different or if one of the blocks is being shut down. DVFS is another technique where the supply voltage and frequency can vary dynamically to save power. Power gating using multi-threshold CMOS (MTCMOS) switches involves switching off certain portions of an IC when that functionality is not required, then restoring power when that functionality is needed.
Figure 2. Multi-voltage layout shown in a screen shot from the Nitro-SoC™ place and route system.
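The rule for domain-crossing nets described above can be sketched as a small Python check. This is a simplified illustration and not the behavior of any particular place and route tool: the `PowerDomain` type and `cells_for_crossing` function are hypothetical, and inserting isolation only when the driving domain can be shut off is a common simplification assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PowerDomain:
    name: str
    voltage: float      # nominal supply voltage in volts
    switchable: bool    # True if the domain can be powered down


def cells_for_crossing(src: PowerDomain, dst: PowerDomain) -> List[str]:
    """Special cells needed on a net driven from src and received in dst."""
    cells = []
    # Different supply voltages: the signal level must be translated.
    if abs(src.voltage - dst.voltage) > 1e-9:
        cells.append("level_shifter")
    # Assumption: isolation clamps the net when the driver may be powered off.
    if src.switchable:
        cells.append("isolation")
    return cells
```

With an always-on domain and a switchable domain at a different voltage, a net leaving the switchable domain needs both cell types, while a net entering it needs only a level shifter under this simplified rule.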
MCMM Based Power Optimization
Because each voltage supply and operational mode implies different timing and power constraints on the design, multi-voltage methodologies cause the number of design corners to increase exponentially with the addition of each domain or voltage island. The best solution is to analyze and optimize the design for all corners and modes concurrently. In other words, low-power design inherently requires true multi-corner/multi-mode (MCMM) optimization for both power and timing. The end result is that the design should meet timing and power requirements for all the mode/corner scenarios.
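A minimal Python sketch shows why the scenario count grows so quickly and what "meeting requirements for all scenarios" means in practice. All names here are hypothetical, and the pass/fail criterion (non-negative worst slack and power within a budget) is an assumed simplification of real signoff checks.

```python
from itertools import product


def enumerate_scenarios(modes, corners):
    """Every (mode, corner) pair that must be analyzed concurrently."""
    return list(product(modes, corners))


def supply_combinations(levels_per_domain):
    """Number of supply-voltage combinations across all domains.

    The result is the product of the per-domain level counts, so adding a
    domain multiplies the total: growth is exponential in the domain count.
    """
    count = 1
    for levels in levels_per_domain:
        count *= levels
    return count


def failing_scenarios(results, power_budget):
    """Scenarios that violate timing or power.

    results maps a scenario to (worst_slack, total_power). A scenario fails
    when its worst slack is negative or its power exceeds the budget.
    """
    return [
        scenario
        for scenario, (slack, power) in results.items()
        if slack < 0 or power > power_budget
    ]
```

The design closes only when `failing_scenarios` returns an empty list, which is the sense in which optimization has to consider all modes and corners at once rather than one at a time.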
FinFET aware Power Optimization
FinFET aware power optimization flow requires technologies such as activity driven placement, multi-bit flop support, clock data optimization, interleaved power optimization and activity driven routing to ensure that the dynamic power reduction is optimal. The tools should be able to use transforms with objective costing to make trade-offs between dynamic power, leakage power, timing, and area for best QoR.
Optimizing power at all stages of the design flow, especially at the architecture stage, is critical for optimal power reduction. Architecture selection, along with the complete set of technologies for RTL-to-GDS implementation, greatly impacts the ability to effectively manage power.
Seena Shankar, Technical Marketing Manager at Silvaco, is the author of this contribution.
Analysis of IR-drop, electro-migration and thermal effects has traditionally been a significant bottleneck in the physical verification of transistor-level designs such as analog circuits, high-speed IOs, custom digital blocks, memories and standard cells. From the 28 nm node down, all designers are concerned about power, EM/IR and thermal issues. Even at the 180 nm node, high-current designs in LDMOS require analysis of EM effects, EM rules and thermal issues. The FinFET architecture has increased concerns regarding EM, IR and thermal effects because of complex DFM rules and increased current and power density. With a higher probability of failure, EM/IR effects need to be analyzed and managed even more carefully. This kind of analysis and testing usually occurs at the end of the design flow. Discovering these issues at that critical time makes it difficult to stick to the schedule and causes expensive rework. How can we resolve this problem?
Power integrity issues must be addressed as early in the design cycle as possible to avoid expensive design and silicon iterations. Silvaco’s InVar Prime is an early design stage power integrity analysis solution for layout engineers. Designers can estimate EM, IR and thermal conditions before the sign-off stage. It performs checks such as early IR-drop analysis, checks of the resistive parameters of supply networks and point-to-point resistance checks, and it also estimates current densities. It also helps find and fix issues that are not detectable with a regular LVS check, such as missing vias, isolated metal shapes, inconsistent labeling, and detour routing.
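The quantities behind these checks follow from Ohm's law, and a back-of-the-envelope version can be written in Python. This is a textbook-style sketch and does not represent how InVar Prime computes its results: the function names are hypothetical, the supply rail is assumed to be a simple resistor chain fed from one end, and current density is averaged over the wire cross-section.

```python
def segment_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """Resistance of a straight wire segment: R = Rs * (L / W)."""
    return sheet_res_ohm_per_sq * length_um / width_um


def rail_ir_drop(segment_res_ohm, tap_currents_a):
    """Cumulative voltage drop at each tap of a rail fed from one end.

    segment_res_ohm[i] is the resistance between tap i-1 (or the source,
    for i == 0) and tap i; tap_currents_a[i] is the current drawn at tap i.
    """
    if len(segment_res_ohm) != len(tap_currents_a):
        raise ValueError("need one segment resistance per tap")
    drops = []
    total = 0.0
    for i, resistance in enumerate(segment_res_ohm):
        # Each segment carries the current of every tap at or beyond it.
        downstream = sum(tap_currents_a[i:])
        total += resistance * downstream
        drops.append(total)
    return drops


def current_density(current_a, width_um, thickness_um):
    """Average current density in A/um^2 through the wire cross-section."""
    return current_a / (width_um * thickness_um)
```

The last entry returned by `rail_ir_drop` is the worst-case drop on the rail, and comparing `current_density` against a technology limit is the essence of an EM rule check.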
InVar Prime can be used for a broad range of designs including processors, wired and wireless network ICs, power ICs, sensors and displays. Its hierarchical methodology accurately models IR-drop, electro-migration and thermal effects for designs ranging from a single block to full-chip. Its patented concurrent electro-thermal analysis simulates multiple physical processes together. This is critical for today’s designs in order to capture important interactions between power and thermal 2D/3D profiles. The result is physical measurement-like accuracy with high speed, even on extremely large designs, and applicability to all process nodes including FinFET technologies.
InVar Prime requires the following inputs:
● Layout- GDSII
● Technology- ITF or iRCX
● Supplementary data- Layer mapping file for GDSII, supply net names, locations and nominal values of voltage sources, area-based current consumption for P/G nets
Figure 3. Reliability Analysis provided by InVar Prime
InVar Prime enables three types of analysis on a layout database: EM, IR and Thermal. A layout engineer could start using InVar to help in the routing and planning of the power nets, VDD and VSS. IR analysis with InVar will provide them early analysis on how good the power routing is at that point. This type of early analysis flags potential issues that might otherwise appear after fabrication and result in silicon re-spins.
The InVar EM/IR engine provides comprehensive analysis and retains full visibility of supply networks from top-level connectors down to each transistor. It provides a unique approach to hierarchical block modeling to reduce runtime and memory while keeping the accuracy of a true flat run. Programmable EM rules enable easy adaptation to new technologies.
The InVar Thermal engine scales from single cell design to full chip and provides lab-verified accuracy of thermal analysis. Feedback from the thermal engine to the EM/IR engines provides unprecedented overall accuracy. This helps designers understand and analyze how thermal 2D/3D profiles across the design affect IR drop and temperature-dependent EM constraints.
The main benefits of InVar Prime are:
● Accuracy verified in lab and foundries
● Full chip sign-off with accurate and high performance analysis
● Analysis available early in the back end design, when more design choices are available
● Pre-characterization not required for analysis
● User-friendly environment designed to assist quick turn-around-times
● Effective prevention of power integrity issues
● Broad range of technology nodes supported
● Reduces backend verification cycle time
● Improves probability of first silicon success
Scott Seiden contributed his company’s viewpoint. Sonics has developed a dynamic power management solution that is hardware based.
Sonics has developed the industry’s first Energy Processing Unit (EPU), based on the ICE-Grain Power Architecture. ICE stands for Instant Control of Energy.
Sonics’ ICE-G1 product is a complete EPU enabling rapid design of system-on-chip (SoC) power architecture and implementation and verification of the resulting power management subsystem.
No amount of wasted energy is affordable in today’s electronic products. Designers know that their circuits are idle a significant fraction of time, but have no proven technology that exploits idle moments to save power. An EPU is a hardware subsystem that enables designers to better manage and control circuit idle time. Where the host processor (CPU) optimizes the active moments of the SoC components, the EPU optimizes the idle moments of the SoC components. By construction, an EPU delivers lower power consumption than software-controlled power management. EPUs possess the following characteristics:
- Fine-grained power partitioning maximizes SoC energy savings opportunities
- Autonomous hardware-based control provides orders of magnitude faster power up and power down than software-based control through a conventional processor
- Aggregation of architectural power savings techniques ensures minimum energy consumption
- Reprogrammable architecture supports optimization under varying operating conditions and enables observation-driven adaptation to the end system.
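One common way to reason about why faster power-up and power-down matter is the break-even idle time: gating a block only saves energy if the idle period is long enough to repay the cost of the transition. The Python sketch below illustrates that general model under stated assumptions; it is not a description of the ICE-G1 implementation, and all function and parameter names are hypothetical.

```python
def break_even_time(p_idle_w, p_gated_w, e_transition_j):
    """Idle duration (s) beyond which gating saves energy.

    Gating saves (p_idle_w - p_gated_w) watts while the block is off but
    costs e_transition_j joules to power down and back up.
    """
    if p_idle_w <= p_gated_w:
        raise ValueError("gated power must be lower than idle power")
    return e_transition_j / (p_idle_w - p_gated_w)


def worth_gating(idle_s, p_idle_w, p_gated_w, e_transition_j, t_transition_s):
    """True if an idle interval is long enough to be exploited.

    The interval must exceed both the energy break-even time and the time
    the power-down/power-up transition itself takes.
    """
    threshold = max(
        break_even_time(p_idle_w, p_gated_w, e_transition_j),
        t_transition_s,
    )
    return idle_s > threshold
```

Under this model, reducing the transition time and energy lowers the threshold, so a controller that switches faster can exploit shorter idle intervals that a slower controller would have to skip.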
The Sonics’ ICE-G1 EPU accelerates the development of power-sensitive SoC designs using configurable IP and an automated methodology, which produces EPUs and operating results that improve upon the custom approach employed by expert power design teams. As the industry’s first licensable EPU, ICE-G1 makes sophisticated power savings techniques accessible to all SoC designers in a complete subsystem solution. Using ICE-G1, experienced and first-time SoC designers alike can achieve significant power savings in their designs.
Markets for ICE-G1 include:
- Application and Baseband Processors
- Tablets, Notebooks
- EnergyStar compliant systems
- Form factor constrained systems—handheld, battery operated, sealed case/no fan, wearable.
ICE-G1 key product features are:
- Intelligent event and switching controllers (power grain controllers, event matrix, interrupt controller, software register interface): configurable and programmable hardware that dynamically manages both active and leakage power.
- SonicsStudio SoC development environment—graphical user interface (GUI), power grain identification (import IEEE-1801 UPF, import RTL, described directly), power architecture definition, power grain controller configuration (power modes and transition events), RTL and UPF code generation, and automated verification test bench generation tools. A single environment that streamlines the EPU development process from architectural specification to physical implementation.
- Automated SoC power design methodology integrated with standard EDA functional and physical tool flows (top down and bottom up)—abstracts the complete set of power management techniques and automatically generates EPUs to enable architectural exploration and continuous iteration as the SoC design evolves.
- Technical support and consulting services—including training, energy savings assessments, architectural recommendations, and implementation guidance.
As can be seen from the contributions, the analysis and management of power is multi-faceted. Dynamic control of power, especially in battery-powered IoT devices, is critical, since some of these devices will be in locations that are not readily reachable by an operator.