
Posts Tagged ‘Cadence’


Blog Review – Monday November 24, 2014

Monday, November 24th, 2014

Call for new technology for SoC verification; Four steps to integrated design; Securing the IoT; My Big Data is bigger than yours; Immigration issues

Cadence fellow Mike Stellfox is the subject of an interesting Q&A, relayed by Richard Goering, Cadence, in which he talks about UVM at the SoC and system level and the need for a new approach.

Essential tips from a keynote in Japan by architect Cristiano Ceccato are put to good use by Akio, Dassault Systèmes. It turns out that there are parallels with bricks and mortar for those dealing with IP blocks and design teams.

An optimistic note for a secure IoT is sounded by Zach Shelby, ARM, as he details the component parts in this blog.

Perhaps growing tired of empty boasting, Michael Ford looks at just how big data collection is on today’s factory floor, and how savings can be made.

The fact that 3,000 copies of its virtual prototyping book have been distributed is the least noteworthy item in the blog by Tom De Schutter, Synopsys. A follow-up survey has produced some interesting views on software challenges for virtual prototyping.

Taking a different view from Europe, which is currently wrestling with immigration controls, limits and quotas, Peter Muller, Intel, and Brian Toohey, SIA, welcome President Obama’s initiatives to expand the skilled visa program, which should benefit the industry.

Blog Review – Monday, Nov. 17 2014

Monday, November 17th, 2014

Harking back to analog; What to wear in wearables week; Multicore catch-up; Trusting biometrics
By Caroline Hayes, Senior Editor.

Adding a touch of nostalgia, Richard Goering, Cadence, reviews a keynote that Boris Murmann gave at the Mixed-Signal Summit at Cadence HQ. His ideas for reinvigorating the role of analog make interesting reading.

As if there wasn’t enough stress about what to wear, ARM adds to it with its Wearables Week. David Blaza, however, finds that Shane Walker, IHS, is pretty relaxed, offering a positive view of the wearables and medical market.

Practice makes perfect, believes Colin Walls, Mentor, who uses his blog to highlight common misconceptions about C++, multicore and MCAPI for communication and synchronisation between cores.

Biometrics are popular and ubiquitous, but Thomas Suwald, NXP, looks at what needs to be done for secure integration and the future of authentication.

Blog Review – Monday, Nov. 03 2014

Monday, November 3rd, 2014

Intel cooks up a vision of the IoT; Cadence turns up the verification volume; Synopsys celebrates being ‘-free’; ARM and AMD join RapidIO.org; Imagination adds some details to wearable devices.

Envisioning the future scenario of the connected kitchen, Dylan Jarson considers inter-cuisine communication, but also the demands this will place on data centers and the changes that this may mean. (His vision of appliances talking to you and to each other makes a refreshing change from the teenager monologue heard in our kitchen: “What is there to eat? When’s lunch? I’m hungry/starving/famished”.)

Cadence, just like Pink, wants to get this party started. Steve Carlson celebrates and urges everyone to join the SoC verification party, and even provides a comprehensive list of ingredients (and diagrams) needed for a good mix of progress and innovation.

An interesting premise is proposed by Tom De Schutter: enjoy what is not there. He is talking about hardware-free software development and adapts a gluten-free marketing slogan for engineers who might be hardware-intolerant.

Steve Leibson delivers RapidIO news – ARM and AMD have joined the switched fabric interconnect organization. Xilinx, as a RapidIO.org member, will track the 64-bit processor developments in preparation for the 64-bit processor specification.

Part of a series, Alexandru Voica adds some stats about wearable devices to pad out a ‘teaser blog’ with links to two SoCs from Imagination’s partners.

Pushing the Performance Boundaries of ARM Cortex-M Processors for Future Embedded Design

Friday, October 31st, 2014

By Ravi Andrew and Madhuparna Datta, Cadence Design Systems

One of the toughest challenges in the implementation of any processor is balancing the need for the highest performance with the conflicting demands for the lowest possible power and area. Inevitably, there is a tradeoff between power, performance, and area (PPA). This paper examines two unique challenges for design automation methodologies in the new ARM® Cortex®-M processor: how to get maximum performance while designing for a set power budget, and how to get maximum power savings while optimizing for a set target frequency.

Introduction

The ARM® Cortex®-M7 processor is the latest embedded processor from ARM, specifically developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities. The ARM Cortex-M7 processor has been designed with a large variety of highly efficient signal processing features, which demands a very power-efficient design.

Figure 1: ARM Cortex-M7 Block Diagram

The energy-efficient, easy-to-use microprocessors in the ARM Cortex-M series have received a large amount of attention recently as portable, wireless, and embedded applications have gained market share. In high-performance designs, power has become an issue, since at high frequencies power dissipation can easily reach several tens of watts. The efficient handling of these power levels requires complex heat dissipation techniques at the system level, ultimately resulting in higher costs and potential reliability issues. In this section, we will isolate the different components of power consumption on a chip to demonstrate why power has become a significant issue. The remaining sections will discuss how we approached this problem and resolved it using Cadence® implementation tools, along with other design techniques.

We began the project with the objective of addressing two simultaneous challenges:

1. Reach, as fast as possible, a performance level with optimal power (AFAP)

2. Reduce power to the minimum for a lower frequency scenario (MinPower)

Before getting into the details of how we achieved the desired frequency and power numbers, let’s first examine the components which contribute to dynamic power and the factors which gate the frequency push. This experiment was conducted on the ARM Cortex-M7 processor, which has achieved 5 CoreMark/MHz – 2000 CoreMark* in 40LP – and typically twice the digital signal processing (DSP) performance of the ARM Cortex-M4 processor.

Dynamic power components

In high-performance microprocessors, there are several key reasons for rising power dissipation. First, the presence of a large number of devices and wires integrated on a big chip results in an overall increase in the total capacitance of the design. Second, the drive for higher performance leads to increasing clock frequencies, and dynamic power is directly proportional to the rate of charging capacitances (in other words, the clock frequency). A third reason that may lead to higher power consumption is inefficient use of gates. The total switching device capacitance consists of gate oxide capacitance, overlap capacitance, and junction capacitance. In addition, we consider the impact of internal nodes of a complex logic gate. For example, the junction capacitance of the series-connected NMOS transistors in a NAND gate contributes to the total switching capacitance, although it does not appear at the output node.

Dynamic power is consumed when a gate switches; it occurs as a result of charging capacitive loads at the output of gates. These capacitive loads take the form of wiring capacitance, junction capacitance, and the input (gate) capacitance of the fan-out gates. Interest has risen in the physical design area in making better use of the available gates by increasing the ratio of clock cycles in which a gate actually switches, and this increased device activity also raises power consumption. Dynamic power is the largest component of total chip power consumption (the other components are short-circuit power and leakage power). Since leakage is <2% of total power, the focus of this collaboration was only on dynamic power.

The expression for dynamic power is:

Pdyn = α · C · Vdd² · f     (1)

In (1), C denotes the capacitance being charged/discharged, Vdd is the supply voltage, f is the frequency of operation, and α is the switching activity factor. This expression assumes that the output load experiences a full voltage swing of Vdd. If this is not the case, and there are circuits that take advantage of this fact, the Vdd² term in (1) becomes proportional to (Vdd * Vswing). A brief discussion of the switching factor α is in order at this point. The switching factor is defined in this model as the probability of a gate experiencing an output low-to-high transition in an arbitrary clock cycle. For instance, a clock buffer sees both a low-to-high and a high-to-low transition in each clock cycle. Therefore, α for a clock signal is 1, as there is unity probability that the buffer will have an energy-consuming transition in a given cycle. Fortunately, most circuits have activity factors much smaller than 1. Some typical values for logic might be about 0.5 for data path logic and 0.03 to 0.05 for control logic. In most instances we will use a default value of 0.15 for α, which is in keeping with values reported in the literature for static CMOS designs [1,2,3]. Notable exceptions to this assumption will be cache memories, where read/write operations take place nearly every cycle, and clock-related circuits.
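
To make equation (1) and the activity factors above concrete, here is a minimal Python sketch that evaluates dynamic power for a few values of α. The switched capacitance and supply voltage are illustrative assumptions, not figures from the Cortex-M7 design.

# Toy dynamic-power estimate using equation (1): P = alpha * C * Vdd^2 * f.
# C_TOTAL and VDD are illustrative assumptions, not data from the paper.

def dynamic_power(c_farads, vdd_volts, freq_hz, alpha):
    """Dynamic power in watts for a switched capacitance."""
    return alpha * c_farads * vdd_volts**2 * freq_hz

C_TOTAL = 2e-9   # total switched capacitance, 2 nF (assumed)
VDD = 1.1        # supply voltage in volts (assumed)
FREQ = 400e6     # the 400MHz AFAP target from this paper

# Activity factors quoted in the text above
for label, alpha in [("clock", 1.0), ("data path", 0.5),
                     ("control", 0.04), ("default", 0.15)]:
    watts = dynamic_power(C_TOTAL, VDD, FREQ, alpha)
    print(f"{label:>9}: {watts * 1e3:6.1f} mW")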

Here are the five key components of dynamic power consumption, a few of which we addressed:

• Standard cell logic and local wiring

• Global interconnect (mainly busses, inter-modular routing, and other control)

• Global clock distribution (drivers + interconnect + sequential elements)

• Memory (on-chip caches) — this is constant in our case

• I/Os (drivers + off-chip capacitive loads) — this is constant in our case

Timing closure components

One fundamental issue of timing closure is the modeling of physical overcrowding. The problem involves, among other factors, the representation and handling of layout issues. These issues include placement congestion, overlapping of arbitrary-shaped components, routing congestion due to power/ground, clock distribution, signal interconnect, prefixed wires over components, and forbidden regions arising from engineering concerns. While a clean and universal mathematical model of these physical constraints remains an open problem, we tend to formulate the layout problem using multiple constraints whose sophisticated details complicate the implementation.

We need to consider multiple constraints with a unified objective function for a timing-closure design process. This is essential because many constraints are mutually conflicting if we view and handle their effects only on the surface. For example, to ease the routing congestion of a local area, we tend to move components out of the area to leave more room for routing. However, with multi-layer routing technology, removing components does not save much routing area, while the spreading of components actually increases the wire length and demands more routing space. The resultant effect can work against the goals of the original design; in fact, the timing can become much worse. Consequently, we need an intelligent operation that identifies both the component to move out and the component to move in to improve the design.

Accurately predicting detail-routed signal-integrity (SI) effects, and their impact on timing, before detail routing happens is of key interest, because any sizable misprediction of timing before the detail route will show up as timing jumps after the routing is done. Historically, designs for which it is tough to close timing have relied solely on post-route optimization to salvage setup/hold timing. With the advent of “in-route optimization”, timing closure can be addressed earlier, during the routing step itself, using track assignment. In addition, if we can reduce the wire lengths and make good judgment calls based on the timing profiles, we can find opportunities to further reduce power. This paper will walk through the Cadence digital implementation flow and new tool options used to generate performance benefits for the design. The paper will also discuss the flow and tool changes that were made to get the best performance and power efficiency out of the ARM Cortex-M7 processor implementation.

Better Placement and Reduced Wirelength for Better Timing and Lower Power

As discussed in the introduction, wire capacitance and gate capacitance are among the key factors that impact dynamic power, while also affecting wire delays. While evaluating the floorplan and cell placement, we noticed that the floorplan was bigger than needed and the cell placement density was uniform. These two aspects could lead to cells spreading out, resulting in longer wirelength and higher clock latencies. In order to improve the placement densities, certain portions of the design were soft-blocked, and the standard cell densities were kept above 75%.
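
The arithmetic behind soft-blocking is simple enough to sketch in a few lines of Python: removing placeable area raises density toward a target value. The areas below are illustrative assumptions chosen to echo the ~76% starting density, not figures from this design.

# Toy utilization arithmetic behind the soft-blocking step.
# Areas are illustrative assumptions, not data from the design.

cell_area = 0.38         # total standard-cell area, mm^2 (assumed)
floorplan_area = 0.50    # original placeable area, mm^2 (assumed)

density = cell_area / floorplan_area
print(f"original density: {density:.0%}")          # 76%

# Soft-blocking removes placeable area until density hits the target.
target = 0.85
blocked = floorplan_area - cell_area / target
print(f"area to soft-block: {blocked:.3f} mm^2 "
      f"({blocked / floorplan_area:.0%} of the floorplan)")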

Figure 2: Soft-Blocked Floorplan

Standard cell placement plays a vital role: if the placement is done right, it will eventually pay off in terms of better Quality of Results (QoR) and reduced wirelength. If the placement algorithm can take into account some of the power dissipation-related issues, such as reducing the wirelength and considering the overall slack profile of the design, and make the right moves during placement, the results improve tremendously. This is the core principle behind the “GigaPlace” placement engine, available in Cadence Encounter® Digital Implementation System 14.1, which places cells in a timing-driven mode by building up the slack profile of the paths and performing placement adjustments based on these timing slacks. We introduced this new placement engine on the ARM Cortex-M7 design and saw good improvements in the overall wirelength and Total Negative Slack (TNS).
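
GigaPlace’s internals are proprietary, but the “slack profile” idea can be illustrated with the common textbook formulation of timing-driven placement: weight each net’s wirelength by its timing criticality so that critical nets are pulled shorter. A minimal sketch, with made-up nets and weighting parameters:

# Slack-weighted wirelength objective -- a generic illustration of
# timing-driven placement, not Cadence's actual GigaPlace algorithm.

def net_weight(slack_ns, critical_range_ns=0.5, max_weight=4.0):
    """Nets with worse (more negative) slack get pulled harder."""
    if slack_ns >= 0:
        return 1.0
    criticality = min(1.0, -slack_ns / critical_range_ns)
    return 1.0 + criticality * (max_weight - 1.0)

def placement_cost(nets):
    """nets: list of (half-perimeter wirelength in um, worst slack in ns)."""
    return sum(hpwl * net_weight(slack) for hpwl, slack in nets)

nets = [(120.0, 0.3), (80.0, -0.1), (200.0, -0.4)]   # illustrative nets
print(f"slack-weighted cost: {placement_cost(nets):.1f}")

Minimizing this weighted cost instead of raw wirelength is what lets a placer trade a little total wirelength for better TNS.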

Figure 3: “GigaPlace” Placement Engine

With a reduced floorplan, the removal of uniform placement, and the new GigaPlace technology, we were able to reduce the wirelength significantly. This helped push the frequency as well as reduce the power. But there were still more opportunities to further improve the frequency and dynamic power numbers.

Figure 4: Wirelength Reduced with “GigaPlace and Soft-Blocked” Placement

Figure 5: Total Negative Slack (ns) Chart

In-Route Optimization: SI-Aware Optimization Before Routing to Achieve Final Frequency Target

“In-route optimization” performs timing optimization before detail routing begins, on a very close representation of the real routes that lacks only the DRC fixes and leaf-cell pin access. This enables us to get an accurate view of timing/SI and make bigger changes without disrupting the routes. These changes are then committed to a full detail route. In-route optimization technology utilizes an internal extraction engine for more effective RC modeling. The timing QoR improvement observed after post-route optimization was significant, at the expense of a slight runtime increase (currently observed at only 2%). The successful use of an internal extraction model during in-route optimization helped reduce the timing divergence seen going from the pre-route to the post-route stage. This optimization technology pushed the design to achieve the targeted frequency.

Figure 6: In-Route Optimization Flow Chart

Design Changes and Further Dynamic Power Reduction

In the majority of present-day electronic design automation (EDA) tools, timing closure is the top priority, and hence many of these tools trade power and area away in favor of timing. However, opportunities exist to reduce area and gate capacitance by swapping cells to lower-gate-cap variants and by reducing the wirelength. To address dynamic power reduction in the design, three major sets of experiments were done to examine these aspects.
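
The cell-swapping opportunity can be pictured with a toy downsizing pass: replace a cell with a lower-drive, lower-gate-cap variant whenever the path’s positive slack can absorb the added delay. The library values and cells below are illustrative assumptions, not the tool’s actual algorithm or real 40LP data.

# Toy post-route downsizing pass: swap to a lower-drive (lower gate-cap)
# cell when positive slack can absorb the delay penalty.
# Library numbers (gate cap in fF, delay in ns) are assumed.

LIB = {"INVX4": (4.0, 0.05), "INVX2": (2.0, 0.08), "INVX1": (1.0, 0.12)}
DOWNSIZE = {"INVX4": "INVX2", "INVX2": "INVX1"}   # next smaller variant

def downsize(cells):
    """cells: list of [cell_name, slack_ns]; returns gate cap saved (fF)."""
    saved_cap = 0.0
    for cell in cells:
        name, slack = cell
        while name in DOWNSIZE:
            smaller = DOWNSIZE[name]
            delay_penalty = LIB[smaller][1] - LIB[name][1]
            if slack < delay_penalty:       # swap would violate timing
                break
            slack -= delay_penalty
            saved_cap += LIB[name][0] - LIB[smaller][0]
            name = smaller
        cell[0], cell[1] = name, slack      # commit the swap
    return saved_cap

cells = [["INVX4", 0.20], ["INVX4", 0.02], ["INVX2", 0.10]]
print(f"gate cap saved: {downsize(cells):.1f} fF")   # less cap, less dynamic power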

In the first set of experiments, two main tool features were used in the process of reducing dynamic power: the “dynamic power optimization engine”, along with the “area reclaim” feature in the post-route stage. These options helped save 5% of dynamic power @400MHz and enabled us to nearly halve the gap that previously existed between the actual and desired power target.

Figure 7: Example of Power Optimization

In the second set of experiments, the floorplan was soft-blocked by 100 microns to reduce the wirelength. This was discussed in detail in an earlier section. This floorplan shrink resulted in:

• Increasing the density from ~76% to 85%

• Wirelength reduction of 5.1% post-route

• Area shrinkage of ~4% post-route (combining the first set of experiments and the shrink)

This helped save an additional 2% of dynamic power @400MHz, and the impact was similar across the frequency sweep.
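
The rough link between the 5.1% wirelength reduction and the extra ~2% power saving can be reproduced with back-of-the-envelope arithmetic, if one assumes wiring contributes a given share of the switched capacitance in equation (1). The share used below is an illustrative assumption:

# Back-of-the-envelope link between wirelength and dynamic power.
wire_cap_fraction = 0.4     # share of switched capacitance in wiring (assumed)
wirelength_delta = -0.051   # 5.1% post-route wirelength reduction (from text)

# Wire capacitance scales roughly with wirelength, and dynamic power
# scales linearly with switched capacitance per equation (1).
power_delta = wire_cap_fraction * wirelength_delta
print(f"estimated dynamic power change: {power_delta:.1%}")   # about -2%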

The third set of experiments was related to design changes: flop sizes were downsized to a minimum at pre-CTS optimization, and the remaining flops of higher drive strengths were set to “don’t use”. This helped further reduce the sequential power. An important point to note is that the combinational power did not increase significantly. After we introduced this technique, we were able to reduce power significantly, as shown in the charts below.

Results

By using these latest tool technologies and design techniques, we were able to achieve 10% better frequency and reduce the dynamic power by 10%. Results are shown here for dynamic power reduction at 400MHz and 200MHz.

Table 1: Dynamic Power Reduction Results

The joint ARM/Cadence work started with addressing challenges at two points/scenarios on the PPA curve:

1. Frequency focus with optimal power (400MHz)

2. Lowest power at reduced frequency (200MHz)

For scenario #1, out-of-the-box 14.1 allowed us to reach 400MHz, and with the PowerOpt technology available in Encounter Digital Implementation System 14.1, we were able to reduce power to an optimal number. For scenario #2, the additional use of GigaPlace technology, and inherently better SI management allowing a relaxed clock slew, made much higher power reduction at 200MHz possible. With the combination of ARM design techniques and Cadence tool features, we were able to show a 38% dynamic power reduction (for standard cells) going from the 400MHz 13.2-based run to the 200MHz 14.2 best-power-recipe run.

Summary

Reducing the wirelength, placing cells based on the slack profile, and predicting the detailed routing impact in the early phases of the design are important ways to improve performance and reduce dynamic power consumption. Tools perform better when given the right floorplan along with the proper directives at the appropriate places. With a combination of design changes, advanced tools, and engineering expertise, today’s physical design engineers have the means to thoroughly address the challenges associated with timing closure while keeping the dynamic power consumption of their designs low.

Figure 8: Dynamic Power (Normalized) for Logic

Several months of collaborative work between ARM and Cadence, driven by many trials, have led to optimized PPA results. Cadence tools – Encounter RTL Compiler and Encounter Digital Implementation System 14.1 – produced better results out of the box compared to Encounter RTL Compiler and Encounter Digital Implementation System 13.x. The continuous refinement of the flow, along with design techniques such as floorplan reduction and clock slew relaxation, allowed a 38% dynamic power reduction. The ARM/Cadence implementation Reference Methodology (iRM) flow uses a similar recipe for both scenarios: lowest power (MinPower) and highest frequency (AFAP).

References

[1] D. Liu and C. Svensson, “Power consumption estimation in CMOS VLSI chips,” IEEE Journal of Solid-State Circuits, vol. 29, pp. 663-670, June 1994.

[2] A.P. Chandrakasan and R.W. Broderson, “Minimizing power consumption in digital CMOS circuits,” Proc. of the IEEE, vol. 83, pp. 498-523, April 1995.

[3] G. Gerosa, et al., “250 MHz 5-W PowerPC microprocessor,” IEEE Journal of Solid-State Circuits, vol. 32, pp. 1635-1649, Nov. 1997.

Blog Review – Monday October 27 2014

Monday, October 27th, 2014

Synopsys won’t let the hybrid debate mess with your head; automating automotive verification; the write stuff; software’s role in wearable medical technology; ARM’s bandwidth stretching.
By Caroline Hayes, Senior Editor

Playing with your mind, Michael Posner, Synopsys, relishes a mashup blog, with a lion/zebra image to illustrate IP validation in software development. He does not tease the reader all through the blog, though, and gives some sound advice on mixing it up with ARM-based systems for development, FPGAs for validation, and combinations in-between.

Indulging in a little bit of a promo-blog, Richard Goering deconstructs the Incisive additions of the Functional Safety Simulator and Functional Safety Analysis for vManager. We will let him off the indulgence, though, as the informative, well-researched piece is as much a blog for vehicle designers as it is for verification professionals.

Not that he needs much practice in a writing studio, Hamilton Carter is still turning up for class and finds parallels between the beauty of prose and the analysis of code. Instead of one replacing the other, he advocates supplementing one with the other so that the message and intent are clear for all.

Taking an appreciative step back, Helene at Dassault reviews the medical market and how the wearable trend might influence it. She also looks at how the company’s software helps designers understand what is needed and create it.

There are plenty of diagrams to illustrate the point that Jakublamik is making in his blog on bandwidth consumption. After clearly setting out the culprits for bandwidth hunger, he lays out the ARM Mali GPU appetizers in a conversational yet detailed and very useful blog (with a Chinese version available too).

Cortex-M processor Family at the Heart of IoT Systems

Saturday, October 25th, 2014

Gabe Moretti, Senior Editor

One cannot have a discussion about the semiconductor industry without hearing the word IoT. It is really not a word, as language lawyers will be ready to point out, but an abbreviation that stands for Internet of Things. And, of course, the abbreviation is fundamentally incorrect, since the “things” will be connected in a variety of ways, not just the Internet. In fact, it is already clear that devices, grouped to form an intelligent subsystem of the IoT, will be connected using a number of protocols, such as 6LoWPAN, ZigBee, WiFi, and Bluetooth. ARM has developed the Cortex®-M processor family, which is particularly well suited to providing processing power to devices that must consume very little power in their duties of physical data acquisition. This is an instrumental function of the IoT.

Figure 1. The heterogeneous IoT: lots of “things” inter-connected. (Courtesy of ARM)

Figure 1 shows the vision the semiconductor industry holds of the IoT. I believe that the figure shows a goal the industry has set for itself, and a very ambitious goal it is. At the moment the complete architecture of the IoT is undefined, and rightly so. The IoT re-introduces a paradigm first used when ASIC devices were thought to be the ultimate solution to everyone’s computational requirements. The business of IP started as an enhancement to application-specific hardware, and now general-purpose platforms constitute the core of most systems. IoT lets the application drive the architecture, and companies like ARM provide the core computational block with an off-the-shelf device like a Cortex MCU.

The ARM Cortex-M processor family is a range of scalable, compatible, energy-efficient, easy-to-use processors designed to help developers meet the needs of tomorrow’s smart and connected embedded applications. Those needs include delivering more features at a lower cost, increasing connectivity, better code reuse, and improved energy efficiency. The ARM Cortex-M7 processor is the most recent and highest performance member of the Cortex-M processor family. But while the Cortex-M7 is at the heart of ARM partner SoCs for IoT systems, other connectivity IP is required to complete the intelligent SoC subsystem.

A collection of some of my favorite IoT-related IP follows.

Figure 2. The Cortex-M7 Architecture (Courtesy of ARM)

Development Ecosystem

To efficiently build a system, no matter how small, that can communicate with other devices, one needs IP. ARM and Cadence Design Systems have had a long-standing collaboration in the area of both IP and development tools. In September of this year the companies extended an existing agreement covering more than 130 IP blocks and software. The new agreement covers an expanded collaboration for IoT and wearable devices targeting TSMC’s ultra-low power technology platform. The collaboration is expected to enable the rapid development of IoT and wearable devices by optimizing the system integration of ARM IP and Cadence’s integrated flow for mixed-signal design and verification.

The partnership will deliver reference designs and physical design knowledge to integrate ARM Cortex processors, ARM CoreLink system IP, and ARM Artisan physical IP along with RF/analog/mixed-signal IP and embedded flash in the Virtuoso-VDI Mixed-Signal Open Access integrated flow for the TSMC process technology.

“The reduction in leakage of TSMC’s new ULP technology platform combined with the proven power-efficiency of Cortex-M processors will enable a vast range of devices to operate in ultra energy-constrained environments,” said Richard York, vice president of embedded segment marketing, ARM. “Our collaboration with Cadence enables designers to continue developing the most innovative IoT devices in the market.” One of the fundamental changes in design methodology is the aggregation of capabilities from different vendors into one distribution point, like ARM, that serves as the guarantor of a proven development environment.

Communication and Security

System developers need to know that there are a number of sources of IP when deciding on the architecture of a product.  In the case of IoT it is necessary to address both the transmission capabilities and the security of the data.

As a strong partner of ARM, Synopsys provides low power IP that supports a wide range of low power features, such as configurable shutdown and power modes. The DesignWare family of IP offers both digital and analog components that can be integrated with any Cortex-M MCU. Beyond the extensive list of digital logic, analog IP including ADCs and DACs, plus audio CODECs, plays an important role in IoT applications. Designers also have the opportunity to use Synopsys development and verification tools, which have a strong track record handling ARM-based designs.

The Tensilica group at Cadence has published a paper describing how to use Cadence IP to develop a Wi-Fi 802.11ac transceiver for WLAN (wireless local area network) use. This transceiver design is architected on a programmable platform consisting of Tensilica DSPs, using an anchor DSP from the ConnX BBE family of cores in combination with a smaller specialized DSP and dedicated hardware RTL. Because of the enhanced instruction set and superscalar pipeline in the Cortex-M7, plus the addition of floating point DSP, Cadence radio IP works well with the Cortex-M7 MCU, as intermediate band processing, digital down-conversion, post-processing, or WLAN provisioning can be done by the Cortex-M7.

Accent S.A. is an Italian company that is focused on RF products.  Accent’s BASEsoc RF Platform for ARM enables pre-optimized, field-proven single chip wireless systems by serving as near-finished solutions for a number of applications.  This modular platform is easily customizable and supports integration of different wireless standards, such as ZigBee, Bluetooth, RFID and UWB, allowing customers to achieve a shorter time-to-market. The company claims that an ARM processor-based, complex RF-IC could be fully specified, developed and ramped to volume production by Accent in less than nine months.

Sonics offers a network on chip (NoC) solution that is both flexible in integrating various communication protocols and highly secure. Figure 3 shows how the Sonics NoC provides secure communication in any SoC architecture.

Figure 3.  Security is Paramount in Data Transmission (Courtesy of Sonics)

According to Drew Wingard, Sonics CTO: “Security is one of the most important, if not the most important, considerations when creating IoT-focused SoCs that collect sensitive information or control expensive equipment and/or resources. ARM’s TrustZone does a good job securing the computing part of the system, but what about the communications, media and sensor/motor subsystems? SoC security goes well beyond the CPU and operating system. SoC designers need a way to ensure complete security for their entire design.”

Drew concludes: “The best way to accomplish SoC-wide security is by leveraging on-chip network fabrics like SonicsGN, which has built-in NoCLock features to provide independent, mutually secure domains that enable designers to isolate each subsystem’s shared resources. By minimizing the amount of secure hardware and software in each domain, NoCLock extends ARM TrustZone to provide increased protection and reliability, ensuring that subsystem-level security defects cannot be exploited to compromise the entire system.”

More examples exist, of course, and this is not an exhaustive list of devices supporting protocols that can be used in the intelligent home architecture. The intelligent home, together with wearable medical devices, is the most frequently cited example of an IoT application that could be implemented by 2020. In fact, it is a sure bet that by the time the intelligent home is a reality, many more IP blocks to support the application will be available.

Blog Review – Monday Oct 13, 2014

Monday, October 13th, 2014

Cambridge Wireless discusses wearables; Cadence unmasks Incisive ‘hidden treasures’; ON Semi advocates ESD measures; Synopsys presents at DVCON Europe; 3DIC reveals game-changer move at IEEE S3S

At this month’s Cambridge Wireless SIG (Special Interest Group), David Maidment, ARM, listened to an exchange of ideas on wearables and new business opportunities, tempered by considerations of size, cost and consumer ease of use.

Revealing rampant prejudice for all physical media, Axel Scherer, Cadence, learns a lesson in features that are taken for granted and offers a list of 10 features in Incisive that may not be evident to many users.

Silicon ESD protection needs to consider automotive designs, urges Deres Eshete, explaining the reasons why ON Semiconductor has introduced the ESD7002, ESD7361, and ESD7461 ESD protection devices.

This week, DVCON Europe will include a tutorial from Synopsys about VCS AMS to extend digital verification for mixed-signal SoCs. Hélène Thibiéroz and colleagues from Synopsys, STMicroelectronics and Micronas will present on October 14, but some hints of what to expect are on her blog.

Following on from the 2014 IEEE S3S conference, Zvi Or-Bach discusses how monolithic 3D IC will be a game changer, as he considers how existing fab transistor processes can be used and looks ahead to the EDA efforts for the technology covered at the conference.

Caroline Hayes – Monday October 13, 2014

Blog Review – Monday Oct. 06, 2014

Monday, October 6th, 2014

Real Intent assesses ARM TechCon; processor update; measure, analyse, change – writing and design rules; Richard Goering enjoyed learning more about the mbed IoT platform.
By Caroline Hayes, Senior Editor

ARM TechCon proved a place for Real Intent’s CTO, Pranav Ashar, to reflect on both ARM’s contribution to the SoC space and changes in the SoC paradigm.

Different uses for the ARM Cortex-A17 and Cortex-A12, and how the two will become one – in name anyway – are the topic of a blog by Stefan Rosinger, ARM.

There are parallels between electronic design principles and academic writing, discovers Hamilton Carter. The theories at the heart of both disciplines are discussed. Your assignment this week is to compare, contrast and discuss.

Another person taking notes at ARM TechCon was Richard Goering, Cadence. He reviews a keynote by ARM CTO Mike Muller describing the mbed OS platform for ARM Cortex-M based devices and mbed device server.

What Is Not Testable Is Not Fixable

Wednesday, September 17th, 2014

Gabe Moretti, Senior Editor

In the past I have mused that the three-letter acronyms used in EDA like DFT, DFM, DFY and so on are superfluous, since the only one that counts is DFP (Design For Profit). This of course may be obvious, since every IC’s reason for existence is to generate income for the seller. But the above observation is also superficial, since the IC must not only be manufacturable but also testable, and must reach a yield figure that makes it cost effective. Breaking down profitability into major design characteristics is an efficient approach, since a specific tool is certainly easier to work with than a generic one.

Bassilios Petrakis, Product Marketing Director at Cadence, told me: “DFT includes a number of requirements including manufacturing test, acceptance test, and power-on test. In special cases it may be necessary to test the system while it is in operation to isolate faults or enable redundancies within mission critical systems. For mission critical applications, usage of logic and memory built-in self-test (BIST) is a commonly used approach to perform in-system test. Most recently, a new IEEE standard, P1687, was introduced to standardize the integration and instrument access protocol for IPs. Another proposed IEEE standard, P1838, has been initiated to define DFT IP for testing 3D-IC Through-Silicon-Via (TSV) based die stacks.”

Kiran Vittal, Senior Director of Product Marketing at Atrenta Inc. pointed out that: “The test coverage goals for advanced deep submicron designs are in the order of 99% for stuck-at faults and over 80% for transition faults and at-speed defects. These high quality goals can only be achieved by analyzing for testability and making design changes at RTL. The estimation of test coverage at RTL and the ability to find and fix issues that impact testability at RTL reduces design iterations and improves overall productivity to meet design schedules and time to market requirements.”

An 80% figure may seem an underachievement, but it points out the difficulty of proving full testability in light of other, more demanding requirements, like area and power to name just two.

Planning Ahead

I also talked with Bassilios about the need for a DFT approach from the point of view of the architecture of an IC. The first thing he pointed out was that there are innumerable considerations that affect the choice of an optimal Design for Test (DFT) architecture, and designers and DFT engineers have to grapple with many of them early in the design process.

Bassilios noted that “The number of chip pins dedicated to test is often a determining factor in selecting the right test architecture. The optimal pin count is determined by the target chip package, test time, number of pins supported by the automated test equipment (ATE), whether wafer multi-site test will be targeted, and, ultimately, the end-market segment application.”

He continued by noting: “For instance, as mixed-signal designs are mostly dominated by analog I/Os, digital test pins are at a premium, and hence these designs require an ultra-low pin count test solution. These types of designs might not offer an IEEE 1149.1 JTAG interface for board-level test or a standardized test access mechanism. In contrast, large digital SoC designs have fewer restrictions and more flexibility in test pin allocation. Once the pin budget has been established, determining the best test compression architecture is crucial for keeping test costs down by reducing test time and test data volume. Lower test times can be achieved by utilizing higher test compression ratios – typically 30-150X – while test data volume can be reduced by deploying sequential-based scan compression architectures. Test compression architectures are also available for low pin count designs by using the scan deserializer/serializer interface into the compression logic. Inserting test points that target random-resistant faults in a design can often help reduce test pattern count (data volume).”
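
To see why the compression ratio matters so much for test cost, here is a rough scan-test-time model in Python. All the per-design numbers are illustrative assumptions, and real test time also depends on ATE overheads the model ignores.

# Rough scan-test time model: compression shortens the scan chains,
# so each pattern takes fewer shift cycles. All numbers are assumed.

def scan_test_time(patterns, flops, chains, shift_mhz, compression=1):
    """Seconds to shift all patterns through the scan chains."""
    chain_len = flops / (chains * compression)   # shorter chains w/ compression
    cycles = patterns * (chain_len + 1)          # shift plus capture, roughly
    return cycles / (shift_mhz * 1e6)

for ratio in (1, 30, 150):   # 30-150X is the range quoted above
    t = scan_test_time(patterns=10_000, flops=200_000, chains=8,
                       shift_mhz=50, compression=ratio)
    print(f"{ratio:>4}X compression: {t:7.3f} s")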

The early exploration of DFT architectures to meet design requirements – like area, timing, power, and testability – is facilitated by modern logic synthesis tools. Most DFT IP, like JTAG boundary scan, memory BIST collars, logic BIST, and compression macros, is readily integrated into the design netlist and validated during the logic synthesis process per the user’s recipe. Such an approach can provide tremendous improvements in designer productivity. DFT design rule checks are run early and often to intercept and correct undesirable logic that can affect testability.

Test power is another factor that needs to be considered by DFT engineers early on. Excessive scan switching activity can inadvertently lead to test pattern failures on an ATE. Testing one or more cores or sub-blocks in a design in isolation, together with power-aware Automatic Test Pattern Generation (ATPG) techniques, can help mitigate power-related issues. Inserting core-wrapping (or isolation) logic using IEEE 1500 is a good way to enable core-based test, hierarchical test, and general analog mixed-signal interactions.

For designs adopting advanced multi-voltage island techniques, DFT insertion has to be power domain-aware and construct scan chains appropriately, leveraging industry-standard power specifications like the Common Power Format (CPF) and IEEE 1801. A seamless integration between logic synthesis and downstream ATPG tools helps prime the test pattern validation and delivery.

ATPG

Kiran delved in greater detail into the subject of testability, giving particular attention to issues with Automatic Test Pattern Generation (ATPG): “For both stuck-at and transition faults, the presence of hard-to-detect faults has a substantial impact on overall ATPG performance in terms of pattern count, runtime, and test coverage, which in turn has a direct impact on the cost of manufacturing test. The ability to measure the density of hard-to-detect faults in a design early, at the RTL stage, is extremely valuable. It gives RTL designers the opportunity to make design changes to address the issue while enabling them to quickly measure the impact of the changes.”

The performance of the ATPG engine is often measured by the following criteria (a small worked example follows the list):

- How close it comes to finding tests for all testable faults, i.e. how close the ATPG fault coverage comes to the upper bound.  This aspect of ATPG performance is referred to as its efficiency. If the ATPG engine finds tests for all testable faults, its efficiency is 100%.

- How long it has to run to generate the tests. Full ATPG runs need to be completed within a certain allocated time, so the quest for finding a test is sometimes abandoned for some hard-to-test faults after the ATPG algorithm exceeds a pre-determined time limit. The larger the number of hard-to-test faults, the lower the ATPG efficiency.

- The total number of tests (patterns) needed to test all testable faults. Note that a single test pattern can detect many testable faults.
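
As promised, here is the efficiency figure from the first criterion as a tiny Python calculation; the fault counts are illustrative assumptions.

# ATPG efficiency vs. raw fault coverage, per the criteria above.
# Fault counts are illustrative assumptions.

total_faults = 1_000_000
untestable = 20_000      # provably untestable (e.g. redundant logic)
detected = 975_000       # faults for which ATPG found a test
aborted = total_faults - untestable - detected   # hard-to-test, gave up

testable = total_faults - untestable
efficiency = detected / testable     # 100% if no testable fault is abandoned
coverage = detected / total_faults   # the raw fault coverage

print(f"ATPG efficiency: {efficiency:.2%}")    # 99.49%
print(f"fault coverage:  {coverage:.2%}")      # 97.50%
print(f"aborted hard-to-test faults: {aborted:,}")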

To give a better idea of how test issues can be addressed, Kiran provided me with an example.

Figure 1 (Courtesy of Atrenta)

Consider Figure 1, which has wide logic cones of flip-flops and black boxes (memories or analog circuits) feeding a downstream flip-flop. ATPG finds it extremely difficult to generate ‘exhaustive’ patterns for such structures: either the test generation time is long or the fault coverage is compromised. These kinds of designs can be analyzed early at RTL to find areas with poor controllability and observability, so that the designer can make design changes that improve the efficiency (test data and time) of downstream ATPG tools. The tools can then generate optimum patterns that not only improve the quality of test, but also lower the cost of manufacturing test.
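
What “poor controllability” means can be shown with a toy SCOAP-style calculation, the classic textbook testability measure (SpyGlass’s actual metrics are proprietary). Higher numbers mean a node is harder to set to a given value, and wide cones drive the numbers up quickly:

# Toy SCOAP-style combinational controllability for a wide AND cone.
# CC0/CC1 = "cost" of setting a node to 0/1; primary inputs cost 1.

def and2(a, b):
    """(CC0, CC1) of a 2-input AND gate from its inputs' (CC0, CC1)."""
    return (min(a[0], b[0]) + 1, a[1] + b[1] + 1)

pi = (1, 1)            # a primary input is trivially controllable

# Build a balanced AND tree over 16 inputs -- a "wide logic cone".
level = [pi] * 16
while len(level) > 1:
    level = [and2(level[i], level[i + 1]) for i in range(0, len(level), 2)]

cc0, cc1 = level[0]
print(f"16-input AND cone: CC0={cc0}, CC1={cc1}")   # CC0=5, CC1=31
# CC1 grows with cone width: driving the output to 1 requires setting
# all 16 inputs at once, which is exactly what strains ATPG.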

Figure 2 (Courtesy of Atrenta)

Figure 2 shows the early RTL analysis using Atrenta’s SpyGlass DFT tool suite. This figure highlights the problem through the schematic representation of the design and shows a thermal map of the low control/observe areas, which the designer can fix easily by recoding the RTL.

The analysis of the impact of hard-to-test faults at RTL can save significant design time in fixing low fault coverage and improving ATPG effectiveness for runtime and pattern count early in the design cycle, resulting in over 50x more efficiency in the design flow to meet the required test quality goals.

Conclusion

Bassilios concluded that “further improvements to testability can be achieved by performing a “what if” analysis with test point insertion and committing the test points once the desired coverage goals are met. Both top-down and bottom-up hierarchical test synthesis approaches can be supported. Early physical placement floorplan information can be imported into the synthesis cockpit to perform physically aware synthesis as well as scan ordering and congestion-free compression logic placement.”

One thing is certain: engineers will not rest. DFT continues to evolve to address the increased complexity of SoC and 3D-IC design, its realization, and the emergence of new fault models required for sub-20nm process nodes. With every advance, whether in the form of a new algorithm or new IP modules, the EDA tools will need to be updated and, probably, the approach to the IC architecture will need to change. As the cost of new processes continues to grow, designers will have to be more creative in developing better testing techniques to improve the utilization of already-established processes.

Blog Review – Monday, Sept 01, 2014

Monday, September 1st, 2014

The generation gap for connectivity; seeking medical help; IoT messaging protocol; Cadence discusses IoT design challenges.

While marvelling at the Internet of Things (IoT), Seow Yin Lim, Cadence, writes effectively about its design challenges now that traditional processor architectures and approaches no longer apply.

ARM hosts a guest blog by Jonah McLeod, Kilopass, who discusses MQTT, the messaging protocol standard for the IoT. His detailed post provides context and a simplified breakdown of the protocol.

Bringing engineers together is the motivation for Thierry Marchal, Ansys, who writes an impassioned blog following the biomedical track of the company’s CADFEM Users Meeting in Nuremberg, Germany. We are all in this together, is the theme, so perhaps we should exchange ideas. It just might catch on.

Admitting to stating the obvious, John Day, Mentor Graphics, notes that younger people are more interested in automotive connectivity than older people are. He goes on to share some interesting stats from a recent survey on automotive connectivity.

Caroline Hayes, Senior Editor
