Part of the  

Chip Design Magazine

  Network

About  |  Contact

Posts Tagged ‘IEEE1801’

What Is Not Testable Is Not Fixable

Wednesday, September 17th, 2014

Gabe Moretti, Senior Editor

In the past I have mused that the three letter acronyms used in EDA like DFT, DFM, DFY and so on are superfluous since the only one that counts is DFP (Design For Profit).  This of course may be obvious since every IC reason for existence is to generate income for the seller.  But it is also true that the above observation is superficial since the IC must be testable, not only manufacturable and must also reach a yield figure that makes it cost effective.  Breaking down profitability into major design characteristics is an efficient approach, since a specific tool is certainly easier to work with than a generic one.

Bassilios Petrakis, Product Marketing Director at Cadence told me that: “DFT includes a number of requirements including manufacturing test, acceptance test, and power-on test.  In special cases it may be necessary to test the system while it is in operation to isolate faults or enable redundancies within mission critical systems.  For mission critical applications, usage of logic and memory build-in-self-test (BIST) is a commonly used approach to perform in-system test. Most recently, a new IEEE standard, P1687, was introduced to standardize the integration and instrument access protocol for IPs. Another IEEE proposed standard, P1838, has been initiated to define DFT IP for testing 3D-IC Through-Silicon-Via (TSV) based die stacks.”

Kiran Vittal, Senior Director of Product Marketing at Atrenta Inc. pointed out that: “The test coverage goals for advanced deep submicron designs are in the order of 99% for stuck-at faults and over 80% for transition faults and at-speed defects. These high quality goals can only be achieved by analyzing for testability and making design changes at RTL. The estimation of test coverage at RTL and the ability to find and fix issues that impact testability at RTL reduces design iterations and improves overall productivity to meet design schedules and time to market requirements.”

An 80% figure may seem an under achievement, but it points out the difficulty of proving full testability in light of other more demanding requirements, like area and power to name just two.

Planning Ahead

I also talked with Bassilios  about the need for a DFT approach in design from the point of view of the architecture of an IC.  The first thing he pointed out was that there are innumerable considerations that affect the choice of an optimal Design for Test (DFT) architecture for a design.   Designers and DFT engineers have to grapple with some considerations early in the design process.

Bassilios noted that “The number of chip pins dedicated to test is often a determining factor in selecting the right test architecture. The optimal pin count is determined by the target chip package, test time, number of pins supported by the automated test equipment (ATE), whether wafer multi-site test will be targeted, and, ultimately, the end-market segment application.”

He continued by noting that: “For instance, as mixed signal designs are mostly dominated by Analog IOs, digital test pins are at a premium and hence require an ultra low pin count test solution. These types of designs might not offer an IEEE1149.1 JTAG interface for board level test or a standardized test access mechanism. In contrast, large digital SoC designs have fewer restrictions and more flexibility in test pin allocation.  Once the pin budget has been established, determining the best test compression architecture is crucial for keeping test costs down by reducing test time and test data volume. Lower test times can be achieved by utilizing higher test compression ratios – typically 30-150X – while test data volume can be reduced by deploying sequential-based scan compression architectures. Test compression architectures are also available for low pin count designs by using the scan deserializer/serializer interface into the compression logic. Inserting test points that target random resistant faults in a design can often help reduce test pattern count (data volume).”

The early exploration of DFT architectures to meet design requirements – like area, timing, power, and testability – is facilitated by modern logic synthesis tools. Most DFT IP like JTAG boundary scan, memory BIST collars, logic BIST and compression macros are readily integrated into the design netlist and validated during the logic synthesis process per user’s recipe. Such an approach can provide tremendous improvements to designer productivity. DFT design rule checks are run early and often to intercept and correct undesirable logic that can affect testability.

Test power is another factor that needs to be considered by DFT engineers early on. Excessive scan switching activity can inadvertently lead to test pattern failures on an ATE. Testing one or more core or sub-block in a design in isolation together with power-aware Automatic Test Pattern Generation (ATPG) techniques can help mitigate power-related issues. Inserting core-wrapping (or isolation logic) using IEEE1500 is a good way to enable a core-based test, hierarchical test, and general analog mixed signal interactions.

For designs adopting advanced multi-voltage island techniques, DFT insertion has to be power domain-aware and construct scan chains appropriately levering industry standard power specifications like Common Power Format (CPF) and IEEE1801. A seamless integration between logic synthesis and downstream ATPG tools helps prime the test pattern validation and delivery.

ATPG

Kiran delved in greater details into the subject of testability by giving particular attention to issues with Automatic Test Pattern Generation (ATPG).  “For both stuck-at and transition faults, the presence of hard to detect faults has a substantial impact on overall ATPG performance in terms of pattern count, runtime, and test coverage, which in turn has a direct impact on the cost of manufacturing test. The ability to measure the density of hard to detect faults in a design early at the RTL stage, is extremely valuable. It gives RTL designers the opportunity to make design changes to address the issue while enabling them to quickly measure the impact of the changes.”

The performance of the ATPG engine is often measured by the following criteria:

- How close it comes to finding tests for all testable faults, i.e. how close the ATPG fault coverage comes to the upper bound.  This aspect of ATPG performance is referred to as its efficiency. If the ATPG engine finds tests for all testable faults, its efficiency is 100%.

- How long it has to run to generate the tests.  Full ATPG runs need to be completed within a certain allocated time, so the quest for finding a test is sometimes abandoned for some hard to test faults after the ATPG algorithm exceeds a pre-determined time limit.

- The larger the number of hard to test faults, the lower the ATPG efficiency.  The total number of tests (patterns) needed to test all testable faults. Note that a single test pattern can detect many testable faults.

To give a better idea of how test issues can be addressed Kiran provided me with an example.

Figure 1 (Courtesy of Atrenta)

Consider Figure 1, which has wide logic cones of flip flops and black boxes (memories or analog circuits) feeding a downstream flip flop. ATPG finds it extremely difficult to generate ‘exhaustive’ patterns and the test generation time is either long or the fault coverage is compromised. These kinds of designs can be analyzed early at RTL to find areas in the design that have poor controllability and observability, so that the designer can make design changes to improve the efficiency (test data and time) of downstream ATPG tools to generate optimum patterns to not only improve the quality of test, but also be economical to lower the cost of manufacturing test.

Figure 2 (Courtesy of Atrenta)

Figure 2 shows the early RTL analysis using Atrenta’s SpyGlass DFT tool suite. This figure highlights the problem through the schematic representation of the design and shows a thermal map on the low control/observe areas, which the designer can fix easily by recoding the RTL.

The analysis of the impact of hard to test faults at RTL can save significant design time in fixing low fault coverage and improving ATPG effectiveness for runtime and pattern count early in the design cycle, resulting in over 50x more efficiency in the design flow to meet the required test quality goals.

Conclusion

Bassilios concluded that “further improvements to testability can be achieved by performing a “what if” analysis with test point insertion and committing the test points once the desired coverage goals are met. Both top-down and bottom-up hierarchical test synthesis approaches can be supported. Early physical placement floorplan information can be imported into the synthesis cockpit to perform physically aware synthesis as well as scan ordering and congestion-free compression logic placement.”

One thing is certain: engineers will not rest.   DFT continues to evolve to address the increased complexity of SoC and 3D-IC design, its realization, and the emergence of new fault models required for sub-20nm process nodes.  With every advance, whether in the form of a new algorithm or new IP modules, the EDA tools will need to be updated and, probably, the approach to the IC architecture will need to be changed.  As the rate of cost increase of new processes continues to grow, designers will have to be more creative in developing better testing techniques to improve the utilization of already established processes.

Power Analysis and Management

Thursday, August 25th, 2016

Gabe Moretti, Senior Editor

As the size of a transistor shrinks and modifies, power management becomes more critical.  As I was polling various DA vendors, it became clear that most were offering solutions for the analysis of power requirements and software based methods to manage power use, at least one, was offering a hardware based solution to power use.  I struggled to find a way to coherently present their responses to my questions, but decided that extracting significant pieces of their written responses would not be fair.  So, I organized a type of virtual round table, and I will present their complete answers in this article.

The companies submitting responses are; Cadence, Flex Logix, Mentor, Silvaco, and Sonics.  Some of the companies presented their own understanding of the problem.  I am including that portion of their contribution as well to provide a better meaning to the description of the solution.

Cadence

Krishna Balachandran, product management director for low power solutions at Cadence  provided the following contribution.

Not too long ago, low power design and verification involved coding a power intent file and driving a digital design from RTL to final place-and-route and having each tool in the flow understand and correctly and consistently interpret the directives specified in the power intent file. Low power techniques such as power shutdown, retention, standby and Dynamic Voltage and Frequency Scaling (DVFS) had to be supported in the power formats and EDA tools. Today, the semiconductor industry has coalesced around CPF and the IEEE 1801 standard that evolved from UPF and includes the CPF contributions as well. However, this has not equated to problem solved and case closed. Far from it! Challenges abound. Power reduction and low power design which was the bailiwick of the mobile designers has moved front-and-center into almost every semiconductor design imaginable – be it a mixed-signal device targeting the IoT market or large chips targeting the datacenter and storage markets. With competition mounting, differentiation comes in the form of better (lower) power-consuming end-products and systems.

There is an increasing realization that power needs to be tackled at the earliest stages in the design cycle. Waiting to measure power after physical implementation is usually a recipe for multiple, non-converging iterations because power is fundamentally a trade-off vs. area or timing or both. The traditional methodology of optimizing for timing and area first and then dealing with power optimization is causing power specifications to be non-convergent and product schedules to slip. However, having a good handle on power at the architecture or RTL stage of design is not a guarantee that the numbers will meet the target after implementation. In other words, it is becoming imperative to start early and stay focused on managing power at every step.

It goes without saying that what can be measured accurately can be well-optimized. Therefore, the first and necessary step to managing power is to get an accurate and consistent picture of power consumption from RTL to gate level. Most EDA flows in use today use a combination of different power estimation/analysis tools at different stages of the design. Many of the available power estimation tools at the RTL stage of design suffer from inaccuracies because physical effects like timing, clock networks, library information and place-and-route optimizations are not factored in, leading to overly optimistic or pessimistic estimates. Popular implementation tools (synthesis and place-and-route) perform optimizations based on measures of power using built-in power analysis engines. There is poor correlation between these disparate engines leading to unnecessary or incorrect optimizations. In addition, mixed EDA-vendor flows are plagued by different algorithms to compute power, making the designer’s task of understanding where the problem is and managing it much more complicated. Further complications arise from implementation algorithms that are not concurrently optimized for power along with area and timing. Finally, name-mapping issues prevent application of RTL activity to gate-level netlists, increasing the burden on signoff engineers to re-create gate-level activity to avoid poor annotation and incorrect power results.

To get a good handle on the power problem, the industry needs a highly accurate but fast power estimation engine at the RTL stage that helps evaluate and guide the design’s micro-architecture. That requires the tool to be cognizant of physical effects – timing, libraries, clock networks, even place-and-route optimizations at the RTL stage. To avoid correlation problems, the same engine should also measure power after synthesis and place-and-route. An additional requirement to simplify and shorten the design flow is for such a tool to be able to bridge the system-design world with signoff and to help apply RTL activity to a gate-level netlist without any compromise. Implementation tools, such as synthesis and place-and-route, need to have a “concurrent power” approach – that is, consider power as a fundamental cost-factor in each optimization step side-by-side with area and timing. With access to such tools, semiconductor companies can put together flows that meet the challenges of power at each stage and eliminate iterations, leading to a faster time-to-market.

Flex Logix

Geoff Tate, Co-founder and CEO of Flex Logix is the author of the following contribution.  Our company is a relatively new entry in the embedded FPGA market.  It uses TSMC as a foundry.  Microcontrollers and IOT devices being designed in TSMC’s new ultra-low power 40nm process (TSMC 40ULP) need

•             The flexibility to reconfigure critical RTL, such as I/O

•          The ability to achieve performance at lowest power

Flex Logix has designed a family of embedded FPGA’s to meet this need. The validation chip to prove out the IP is in wafer fab now.

Many products fabricated with this process are battery operated: there are brief periods of performance-sensitive activity interspersed with long periods of very low power mode while waiting for an interrupt.

Flex Logix’s embedded FPGA core provides options to enable customers to optimize power and performance based on their application requirements.

To address this requirement, the following architectural enhancements were included in the embedded FPGA core:

•             Power Management containing 5 different power states:

  • Off state where the EFLX core is completely powered off.
  • Deep Sleep state where VDDH supply to the EFLX core can be lowered from nominal of 0.9V/1.1V to 0.5V while retaining state
  • Sleep state, gates the supply (VDDL) that controls all the performance logic such as the LUTs, DSP and interconnect switches of the embedded FPGA while retaining state. The latency to exit Sleep is shorter than that that to exit from Deep Sleep
  • Idle state, idles the clocks to cut power but is ready to move into dynamic mode quicker than the Sleep state
  • Dynamic state where power is highest of the 4 power management states but where the latency is the shortest and used during periods of performance sensitive activity

The other architectural features available in the EFLX-100 embedded FPGA to optimize power-performance are:

•             State retention for all flip flops and configuration bits at voltages well below the operating range.

•          Ability to directly control body bias voltage levels (Vbp, Vbn). Controlling the body bias further controls leakage power

•             5 combinations of threshold voltage(VT) devices to optimize power and performance for static/performance logic of the embedded FPGA. Higher the threshold voltage (eHVT, HVT) lower the leakage power and lower performance while lower the threshold voltage (SVT) device, higher the leakage and higher the performance.

•             eHVT/eHVT

•             HVT/HVT

•             HVT/SVT

•             eHVT/SVT

•             SVT/SVT

In addition to the architectural features various EDA flows and tools are used to optimize the Power Performance and Area (PPA) of the FlexLogix embedded FPGA:

•             The embedded FPGA was implemented using a combination of standard floor-planning and P&R tools to place and route the configuration cells, DSP and LUTs macros and network fabric switches. This resulted in higher density thereby reducing IR drops and the need for larger drive strengths thereby optimizing power

•          Design and use longer (non-minimum) channel length devices which further help reduce leakage power with minimal to no impact to the performance

•          The EFLX-100 core was designed with an optimized power grid to effectively use metal resources for power and signal routing. Optimal power grids reduce DC/AC supply drops which further increase performance.

Mentor

Arvind Narayanan, Architect, Product Marketing, Mentor Graphics contributed the following viewpoint.

One of the biggest challenges in IC design at advanced nodes is the complexity inherent in effective power management. Whether the goal is to reduce on-chip power dissipation or to provide longer battery life, power is taking its place alongside timing and area as a critical design dimension.

While low-power design starts at the architectural level, the low-power design techniques continue through RTL synthesis and place and route. Digital implementation tools must interpret the power intent and implement the design correctly, from power aware RTL synthesis, placement of special cells, routing and optimization across power domains in the presence of multiple corners, modes, and power states.

With the introduction of every new technology node, existing power constraints are also tightened to optimize power consumption and maximize performance. 3D transistors (FinFETs) that were introduced at smaller technology nodes have higher input pin capacitance compared to their planar counterpart, resulting in the dynamic power component to be higher compared to leakage.

Power Reduction Strategies

A good strategy to reduce power consumption is to perform power optimization at multiple levels during the design flow including software optimization, architecture selection, RTL-to-GDS implementation and process technology choices. The biggest power savings are usually obtained early in the development cycle at the ESL & RTL stages. (Fig 1). During physical implementation stage there is less opportunity for power optimization in comparison and hence choices made earlier in the design flow are critical. Technology selection such as the device structure (FinFET, planar), choice of device material (HiK, SOI) and technology node selection all play a key role.

Figure 1. Power reduction opportunities at different stages of the design flow

Architecture selection

Studies have shown that only optimizations applied early in the design cycle, when a design’s architecture is not yet fixed, have the potential for radical power reduction.  To make intelligent decisions in power optimization, the tools have to simultaneously consider all factors affecting power, and apply early in the design cycle. Finding the best architecture enables to properly balance functionality, performance and power metrics.

RTL-to-GDS Power Reduction

There are a wide variety of low-power optimization techniques that can be utilized during RTL to GDS implementation for both dynamic and leakage power reduction. Some of these techniques are listed below.

RTL Design Space Exploration

During the early stages of the design, the RTL can be modified to employ architectural optimizations, such as replacing a single instantiation of a high-powered logic function with multiple instantiations of low-powered equivalents. A power-aware design environment should facilitate “what-if” exploration of different scenarios to evaluate the area/power/performance tradeoffs

Multi-VDD Flow

Multi-voltage design, a popular technique to reduce total power, is a complex task because many blocks are operating at different voltages, or intermittently shut off. Level shifter and isolation cells need to be used on nets that cross domain boundaries if the supply voltages are different or if one of the blocks is being shut down. DVFS is another technique where the supply voltage and frequency can vary dynamically to save power. Power gating using multi-threshold CMOS (MTCMOS) switches involves switching off certain portions of an IC when that functionality is not required, then restoring power when that functionality is needed.

Figure 2. Multi-voltage layout shown in a screen shot from the Nitro-SoC™ place and route system.

MCMM Based Power Optimization

Because each voltage supply and operational mode implies different timing and power constraints on the design, multi-voltage methodologies cause the number of design corners to increase exponentially with the addition of each domain or voltage island. The best solution is to analyze and optimize the design for all corners and modes concurrently. In other words, low-power design inherently requires true multi-corner/multi-mode (MCMM) optimization for both power and timing. The end result is that the design should meet timing and power requirements for all the mode/corner scenarios.

FinFET aware Power Optimization

FinFET aware power optimization flow requires technologies such as activity driven placement, multi-bit flop support, clock data optimization, interleaved power optimization and activity driven routing to ensure that the dynamic power reduction is optimal. The tools should be able to use transforms with objective costing to make trade-offs between dynamic power, leakage power, timing, and area for best QoR.

Using the strategy to optimize power at all stages of the design flow, especially at the architecture stage is critical for optimal power reduction.  Architecture selection along with the complete set of technologies for RTL-to-GDS implementation greatly impact the ability to effectively manage power.

Silvaco

Seena Shankar, Technical Marketing Manager, is the author of this contribution.

Problem:

Analysis of IR-drop, electro-migration and thermal effects have traditionally been a significant bottleneck in the physical verification of transistor level designs like analog circuits, high-speed IOs, custom digital blocks, memories and standard cells. Starting from 28 nm node and lower, all designers are concerned about power, EM/IR and thermal issues. Even at the 180 nm node if you are doing high current designs in LDMOS then EM effects, rules and thermal issues need to be analyzed. FinFET architecture has increased concerns regarding EM, IR and thermal effects. This is because of complex DFM rules, increased current and power density. There is a higher probability of failure. Even more so EM/IR effects need to be carefully analyzed and managed. This kind of analysis and testing usually occurs at the end of the design flow. Discovering these issues at that critical time makes it difficult to stick to schedule and causing expensive rework. How can we resolve this problem?

Solution:

Power integrity issues must be addressed as early in the design cycle as possible, to avoid expensive design and silicon iterations. Silvaco’s InVar Prime is an early design stage power integrity analysis solution for layout engineers. Designers can estimate EM, IR and thermal conditions before sign-off stage. It performs checks like early IR-drop analysis, check of resistive parameters of supply networks, point to point resistance check, and also estimate current densities. It also helps in finding and fixing issues that are not detectable with regular LVS check like missing vias, isolated metal shapes, inconsistent labeling, and detour routing.

InVar Prime can be used for a broad range of designs including processors, wired and wireless network ICs, power ICs, sensors and displays. Its hierarchical methodology accurately models IR-drop, electro-migration and thermal effects for designs ranging from single block to full-chip. Its patented concurrent electro-thermal analysis performs simulation of multiple physical processes together. This is critical for today’s’ designs in order to capture important interactions between power and thermal 2D/3D profiles. The result is physical measurement-like accuracy with high speed even on extremely large designs and applicability to all process nodes including FinFET technologies.

InVar Prime requires the following inputs:

●      Layout- GDSII

●      Technology- ITF or iRCX

●      Supplementary data- Layer mapping file for GDSII, Supply net names, Locations and nominal of voltage sources, Area based current consumption for P/G nets

Figure 3. Reliability Analysis provided by InVar Prime

InVar Prime enables three types of analysis on a layout database: EM, IR and Thermal. A layout engineer could start using InVar to help in the routing and planning of the power nets, VDD and VSS. IR analysis with InVar will provide them early analysis on how good the power routing is at that point. This type of early analysis flags potential issues that might otherwise appear after fabrication and result in silicon re-spins.

InVar EM/IR engine provides comprehensive analysis and retains full visibility of supply networks from top-level connectors down to each transistor. It provides a unique approach to hierarchical block modeling to reduce runtime and memory while keeping accuracy of a true flat run. Programmable EM rules enable easy adaptation to new technologies.

InVar Thermal engine scales from single cell design to full chip and provides lab-verified accuracy of thermal analysis. Feedback from thermal engine to EM/IR engines provides unprecedented overall accuracy. This helps designers understand and analyze various effects across design caused by how thermal 2D/3D profiles affect IR drop and temperature dependent EM constraints.

The main benefits of InVar Prime are:

●      Accuracy verified in lab and foundries

●      Full chip sign-off with accurate and high performance analysis

●      Analysis available early in the back end design, when more design choices are available

●      Pre-characterization not required for analysis

●      User-friendly environment designed to assist quick turn-around-times

●      Effective prevention of power integrity issues

●      Broad range of technology nodes supported

●      Reduces backend verification cycle time

●      Improves probability of first silicon success

Sonics

Scott Seiden contributed his company viewpoint.  Sonics has developed a dynamic power management solution that is hardware based.

Sonics has Developed Industry’s First Energy Processing Unit (EPU) Based on the ICE-Grain Power Architecture.  The EPUICE stands for Instant Control of Energy.

Sonics’ ICE-G1 product is a complete EPU enabling rapid design of system-on-chip (SoC) power architecture and implementation and verification of the resulting power management subsystem.

No amount of wasted energy is affordable in today’s electronic products. Designers know that their circuits are idle a significant fraction of time, but have no proven technology that exploits idle moments to save power. An EPU is a hardware subsystem that enables designers to better manage and control circuit idle time. Where the host processor (CPU) optimizes the active moments of the SoC components, the EPU optimizes the idle moments of the SoC components. By construction, an EPU delivers lower power consumption than software-controlled power management. EPUs possess the following characteristics:

  • Fine-grained power partitioning maximizes SoC energy savings opportunities
  • Autonomous hardware-based control provides orders of magnitude faster power up and power down than software-based control through a conventional processor
  • Aggregation of architectural power savings techniques ensures minimum energy consumption
  • Reprogrammable architecture supports optimization under varying operating conditions and enables observation-driven adaptation to the end system.

About ICE-G1

The Sonics’ ICE-G1 EPU accelerates the development of power-sensitive SoC designs using configurable IP and an automated methodology, which produces EPUs and operating results that improve upon the custom approach employed by expert power design teams. As the industry’s first licensable EPU, ICE-G1 makes sophisticated power savings techniques accessible to all SoC designers in a complete subsystem solution. Using ICE-G1, experienced and first-time SoC designers alike can achieve significant power savings in their designs.

Markets for ICE-G1 include:

- Application and Baseband Processors
- Tablets, Notebooks
- IoT
- Datacenters
- EnergyStar compliant systems
- Form factor constrained systems—handheld, battery operated, sealed case/no fan, wearable.

-ICE-G1 key product features are:Intelligent event and switching controllers–power grain controllers, event matrix, interrupt controller, software register interface—configurable and programmable hardware that dynamically manages both active and leakage power.

- SonicsStudio SoC development environment—graphical user interface (GUI), power grain identification (import IEEE-1801 UPF, import RTL, described directly), power architecture definition, power grain controller configuration (power modes and transition events), RTL and UPF code generation, and automated verification test bench generation tools. A single environment that streamlines the EPU development process from architectural specification to physical implementation.

- Automated SoC power design methodology integrated with standard EDA functional and physical tool flows (top down and bottom up)—abstracts the complete set of power management techniques and automatically generates EPUs to enable architectural exploration and continuous iteration as the SoC design evolves.

- Technical support and consulting services—including training, energy savings assessments, architectural recommendations, and implementation guidance.

Conclusion

As can be seen from the contributions analysis and management of power is multi-faceted.  Dynamic control of power, especially in battery powered IoT devices is critical, since some of there devices will be in locations that are not readily reachable by an operator.