Posts Tagged ‘verification’

Experts At The Table: Low-Power Verification

Thursday, February 9th, 2012

Low-Power Engineering sat down to discuss the problems of identifying and verifying power issues with Barry Pangrle, solutions architect for low-power design at Mentor Graphics; Krishna Balachandran, director of low-power verification marketing at Synopsys; Kalar Rajendiran, senior director of marketing at eSilicon; Will Ruby, senior director of technical sales and support at Apache Design; and Lauro Rizzatti, general manager of EVE-USA. What follows are excerpts of that conversation.

LPE: What’s the big challenge with verifying power in an SoC?
Ruby: Power has a couple of different components. One is how the low-power techniques impact functionality. If you talk about things like power gating, power supply shutoff, multiple supply voltages and so on, this is where you need to understand certain rules of turning on and off power supplies. You need to be able to create retention cells, to be able to retain state, and to retain functionality. That’s one major aspect. The other side is that you have to look at the power consumption itself. How do you verify that you are on target, if you have a target, and that you are not exceeding a specification? And how do you ensure the design has efficiency built in?
Rajendiran: This is all about trying to verify what your intentions were that you stated in the beginning and making sure that has been implemented, and when the chip comes out, making sure it is functioning that way. In the old days we simply meant functional and timing verification. Now, just on the functional side, it has become so complex that just getting it out the door is a challenge. It’s the same with software. No one thinks about verifying it all. That’s the practical problem. The person who is verifying the power states doesn’t have the time to put in the right hooks. We have the Unified Power Format to help, but we still don’t have standardization as to how you verify the states. Tools rely a lot on naming conventions, but even though there are fewer companies there is still no compatibility in reading all of those things. Tools are always playing catch-up, too. The ideal solution will be a combination of great tools and planning. In addition, you can have the best tools, but if you put them in the wrong hands you don’t get results.
Pangrle: There’s a functional part and a physical part of verification. A lot of what is going on in the industry right now, especially with the power formats and the convergence around UPF and the 1801 IEEE working group, has been to keep the power intent separate from what has been the standard part of functional verification. It’s allowing people to use their standard flow, take it to RTL, and still be able to design RTL blocks that can be used in different design scenarios with different power management. You don’t have to hard-code isolation, level-shifting, retention registers into those blocks. You can still design your block the same way, and if in one design you’re going to power down that block that’s okay because the intent information is in a separate format and you can bring that in. From that standpoint, there has been good collaboration between EDA companies and their customers. From the standpoint of putting it all together and being able to support the tools, one of the things we’re seeing is that as EDA companies work with designers there are times where something is a little different and different vendors have created support. That’s where it gets tougher to move designs from one company’s set of tools to another. It also brings up some new questions. From the physical side, if you’re powering up and down blocks it has a real impact on your power grid and whether it’s going to function. Just because logically it looks as if it should work, that doesn’t mean when you get your chip back from the foundry you’re not going to run into other issues. And in terms of the complexity of testing, you can do the standard ATPG, but when you go through the dynamics of running different voltages and frequencies and bringing things up and taking them down, to what extent are you actually going to test that?
Balachandran: Verification is complex enough without low power, stretching the resources from both a verification productivity standpoint as well as IP cost. When you add low power into the mix, it makes things much worse. The complexity of low-power designs has been going up slowly but steadily. Some companies that are on the cutting edge, particularly in the mobile market, started adopting low-power designs about five or six years ago. They were the frontrunners of the whole low-power wave. They put the initial pressure on low-power verification, because now you have to start thinking about verification differently. You have to start thinking about voltages, multiple supplies, and whether things are going to work in all those conditions. Clock gating is the most basic technique, and almost every company you talk with has been doing clock gating. Now that has expanded into more sophisticated techniques to curb the power, and with that comes the burden to verify properly. All it takes is one unverified state or transition or sequence for the design to completely lock up and not function at all.

LPE: How bad is this problem?
Balachandran: It’s becoming more widespread. There are government regulations and green initiatives. Everything is going green. There are demands on specifications, and even on power for devices connected to the wall. That requires chipmakers to make their designs much more power-efficient. Customers typically start with four or five power domains. Some of that verification can be done with static techniques or with some rudimentary simulation. But it’s becoming more complex, and this complexity is increasing for the mainstream market, not just the mobile market. The number of power domains is exploding. We’ve seen designs with 50 power domains, which is potentially 2^50 power states. It’s pretty much impossible to verify all of them. So you need to come up with a really good test plan. When people are confronted with low-power designs the first time, they have no clue about how to write a testbench for low power. Often they need a lot of methodology help, in addition to having the right tools in place, to figure out what they’re going to do, how they’re going to go about doing it, and how they know when they’re done. Then, what is the measure of confidence they have at the end to figure out if they’re really done?
Rizzatti: From the perspective of emulation, this technology has been used for functional verification. Ten years ago, power management was essentially a gated clock. You turned off and on some part of the chip and saved energy there. Around 2001-2002, designs with 10 or 20 of these were called derived clocks. Today we have customers with 100,000 derived clocks. There’s an explosion. But that’s only one problem. Over the past five years, and especially in the past one or two, there are all these new techniques for turning on and off voltages. We had one customer with well more than 100 power domains. The whole industry is changing. Power management is a nightmare, and it makes SoC verification orders of magnitude more difficult.
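
To put the 50-power-domain figure in perspective, here is a minimal sketch of how the state space grows. It assumes, purely for illustration, that every domain is either on or off and switches independently of the others. Real designs constrain many combinations, so this is an upper bound. The function names and the test rate of 1,000 states per second are hypothetical and do not come from the discussion.

```python
# Sketch: growth of the power-state space with the number of power domains.
# Assumption (not from the interview): each domain is simply ON or OFF and
# domains switch independently, so N domains give 2**N combinations.

SECONDS_PER_YEAR = 365 * 24 * 3600


def power_state_count(num_domains: int, states_per_domain: int = 2) -> int:
    """Upper bound on distinct power-state combinations."""
    return states_per_domain ** num_domains


def years_to_cover(num_states: int, states_per_second: float) -> float:
    """Time to visit every combination once at a hypothetical test rate."""
    return num_states / states_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for n in (5, 20, 50):
        count = power_state_count(n)
        # 1,000 states per second is an arbitrary illustrative rate.
        print(f"{n:>3} domains: {count:,} combinations, "
              f"{years_to_cover(count, 1_000):.3g} years at 1,000 states/s")
```

Even under these simplified assumptions, exhaustive coverage is out of reach at 50 domains, which is why a targeted test plan matters more than raw simulation throughput.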

LPE: With a disaggregated supply chain and more IP re-use, does it make it more difficult to verify the design? Not all of the IP is fully characterized for power.
Balachandran: UPF, or IEEE 1801, and CPF have ways to model the power intent of IP. The issue isn’t so much the ability to specify the power intent of IP. Talking to all the major customers, everybody is either integrating internal IP or using third-party IP. Some of the IP blocks have their own power management, too. It has to be communicated to the integrator of that SoC as to what are the legal ways to integrate the IP into the SoC. That information has to be passed along. The power format is not the right way to pass that information. So the industry has to work out a way, together, to solve this problem. The IP companies, the EDA companies and the whole ecosystem have to work on this to communicate how the IP can legally be integrated from a power perspective, and to tell the IP integrator when they are doing something wrong. If IP is coming from a third party and you have no idea what is going on with that IP in terms of its inner functionality or how the power is implemented and what ways you can put it together on the block, then you can shoot yourself in the foot pretty quickly. This is a problem that needs to be solved. One potential solution is to create assertions for an IP block. The IP developer doesn’t know how IP is going to be used, but they do know what is legal or not. They can create assertions for that and ship them with the IP. Then, when the integrator puts it into the SoC and runs the verification, they are able to figure out if they’ve done it properly or not. If they haven’t, then they can have a dialog with the IP company. It’s a way of communicating the data sheet of the IP to the next-level integrator. This is one way of solving the problem. It requires close collaboration between IP partners and EDA and design services companies.
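
The idea of shipping assertions with an IP block can be illustrated with a small sketch. It assumes the IP developer publishes a table of legal power-state transitions and the integrator checks a recorded sequence against it. The state names, the transition table and the function name are all hypothetical. A real flow would more likely use assertions in the simulation environment rather than a Python script.

```python
# Sketch: an IP-level "assertion" expressed as a table of legal power-state
# transitions. All state names and transitions below are hypothetical.

LEGAL_TRANSITIONS = {
    "OFF": {"OFF", "ON"},
    "ON": {"ON", "RETENTION", "OFF"},
    "RETENTION": {"RETENTION", "ON"},  # assumed rule: wake up before power-off
}


def check_power_sequence(sequence):
    """Return a list of (index, from_state, to_state) for each illegal step."""
    violations = []
    for i in range(1, len(sequence)):
        prev, curr = sequence[i - 1], sequence[i]
        # Unknown states have no legal successors and are always flagged.
        if curr not in LEGAL_TRANSITIONS.get(prev, set()):
            violations.append((i, prev, curr))
    return violations


if __name__ == "__main__":
    observed = ["OFF", "ON", "RETENTION", "OFF"]
    for index, prev, curr in check_power_sequence(observed):
        print(f"step {index}: illegal transition {prev} -> {curr}")
```

Keeping the table separate from the checker mirrors the point that the IP developer knows what is legal without knowing how the block will be used.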
Rajendiran: More times than not, people don’t do that. There are many ways that tools can help, too. If some expert designed the IP block, he can provide some input and then a tool can insert assertions back into the RTL. Ideally you want to keep it separate as a companion file. That’s one approach. But the problem is more complex than that when it comes to low-power verification. IP is one issue. There is physical IP where you can’t do much because it’s already hard coded. There’s also soft IP. Each of the classes has its own challenges. With the soft IP, a lot of activity only happens at the gate level. Depending on how the RTL gets synthesized and mapped, you can have a perfectly functioning solution when you use a particular library in a particular foundry, and the same thing may not work somewhere else. You need deep knowledge about this stuff. You need collaboration of tools, the integrator and the IP developer to make sure you at least get the product out to market on time.
Ruby: There is another dimension of IP—the power intent side, which is the functional verification aspect. That’s absolutely essential to ensure the functionality. Time and time again, what I’ve come across is the need for some way to describe the power consumption behavior of IP, as well. It could be technology dependent or technology independent. It could be models that describe assumptions as a function of clock frequency or data rates. From my customer perspective, this is also becoming essential in the power verification area because they’re not just worried about functional intent. They’re also worried about hitting their power specs. They need models for the IP coming in. If they plug IP into their design and they run their clock frequency at a certain rate, what power consumption can they expect? That’s another very important element to this verification challenge.
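
A power model that describes consumption as a function of clock frequency could look like the following minimal sketch. It uses the standard first-order relation for dynamic power, activity factor times switched capacitance times supply voltage squared times frequency, plus a constant leakage term. Every parameter value is a made-up placeholder, not a characterization of any real IP block, and the function names are hypothetical.

```python
# Sketch: first-order power model for an IP block as a function of clock
# frequency. Uses the textbook relation P_dyn = a * C * V^2 * f plus a
# constant leakage term. All default values are hypothetical placeholders.


def dynamic_power_watts(activity, capacitance_farads, vdd_volts, freq_hz):
    """Switching power: activity factor * C * V^2 * f."""
    return activity * capacitance_farads * vdd_volts ** 2 * freq_hz


def ip_power_watts(freq_hz, vdd_volts=1.0, activity=0.15,
                   capacitance_farads=2e-9, leakage_watts=0.005):
    """Total estimated power for the block at a given clock frequency."""
    dynamic = dynamic_power_watts(activity, capacitance_farads,
                                  vdd_volts, freq_hz)
    return dynamic + leakage_watts


if __name__ == "__main__":
    for mhz in (100, 200, 400):
        watts = ip_power_watts(mhz * 1e6)
        print(f"{mhz} MHz: {watts * 1e3:.1f} mW")
```

A model this simple ignores data-dependent activity and temperature effects on leakage, so it is suited to early what-if questions rather than signoff.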

Status Report: Power-Aware Design Flow

Thursday, January 12th, 2012

By Ann Steffora Mutschler
While the term “design flow” can be a moving target, there are some specific requirements for a low-power/power-aware tool flow. Looking at this from a high level, where is the industry today, and where is it headed?

There are really two sides to power, which are almost like two sides of the same coin: power consumption and power integrity. And both of those are global, spanning the system and the package and the increasing convergence of both.

“One thing required in this day and age of ever-shrinking product lifecycles is some degree of predictability,” said William Ruby, senior director of RTL power product engineering at Apache Design. “You want to be able to predict early on, when you’re not even halfway finished with the design, what is your power consumption going to be with a reasonable degree of accuracy? What does the thermal picture look like, even spilling over into power integrity? If I can estimate my power, I should be able to also predict some of the power-induced noise considerations, as well. Looking at the power-aware flow from that perspective, early power analysis for the consumption side as well as the power integrity side is really one of the keys here.”

But what about the tools? The back-of-the-napkin or spreadsheet-type calculations worked to a certain extent when things were not very complicated. There needs to be more precision built in. Apache’s answer to this is the RTL power model (RPM) to get better accuracy and more predictability early on. Ruby explained the RTL description allows for a good power number early on, looking at various operating modes. It takes that data into the power integrity side for early chip power integrity analysis. The predictability comes also with the ability to use RPM throughout the design flow to maintain consistency.

Mary Ann White, director of Galaxy power marketing at Synopsys, said various tools exist today that can deal with many aspects of the complete low-power flow. The problem is that systems engineers don’t tend to think about tools in this way. “Just within the implementation flow, there’s verification and implementation, and we find that those engineers don’t exactly talk and work together as easily, so can you imagine what the challenge would be if it went all the way from system-level to somebody that has to deal with manufacturing and then packaging? Even though we tend to provide solutions in those spaces, we find that customers are still very specialized in their very specific areas.”

What engineers want
Krishna Balachandran, director of low-power verification marketing at Synopsys, said to understand what engineering teams really need it helps to segment customers into different buckets. “There are customers that are very advanced in their needs and there are some other customers who have some low-power needs but they kind of know what they are doing—they’ve been doing low-power for longer than the power formats have existed so they’ve evolved with what has happened in terms of power formats and they’ve started using that. Then there are some new customers that are being forced to think about power not because their devices are by themselves low-power, but by virtue of the fact that they are using smaller geometries to reduce the cost and to take advantage of the wafer pricing which can drop. Those customers think that if they drop down to the lower geometries they’ll have to use some power techniques now in their design, because if they don’t then the leakage power becomes unacceptable. So for these reasons some of these customers are coming into the flow and their requirements are very modest. They are almost able to address in an ad hoc way what they have to deal with, rather than by design aiming for lower power chips. There is a whole range of sophistication when it comes to low-power designs and flows. I see that their needs are very different.”

Barry Pangrle, solutions architect for low-power design at Mentor Graphics, said in the future there will be more emphasis on front-end tools. “That will include architectural-level, system-level type stuff, especially hardware/software tools that will allow designers and even software developers to be able to get a better understanding of how the code they are writing impacts the overall power of the products they are developing. You can have really great hardware and if the software doesn’t take advantage of all the capabilities of the hardware, you throw all that effort away.”

Power formats, mixed-signal designs
In the middle part of the flow, one positive step forward last year was that all major EDA vendors came together to pledge their support for the IEEE 1801 power format standard, which should help with tying everything together. More than just the power format support, the underlying methodology is also critical. Qi Wang, technical marketing group director of solutions marketing at Cadence, said a converged methodology is still needed: a single power-intent description that can be used in every stage of the design flow to provide consistency.

Overall, he said, it looks as if we have all the pieces of the power-aware design flow, but there’s still a long way to go to address the multi-vendor flow. “Right now we have two formats. Even if we have one format there will still be challenges, but that will play out over the years because at least the whole market on the customer side will be adopting the same power format approach. Right now some of them use CPF, some of them use UPF. The methodology shift is happening. That train has left the station; that will not be changed. It just takes time for the vendors to work out this multi-vendor flow.”

However, he pointed out, there still are technical areas that need more investment. “One big important thing is in the area of mixed-signal design. If you look at all the hard products right now, it’s all about mixed-signal and low power: you have a mobile application, you want to access everywhere, you have wireless, you have Wi-Fi here and there. It’s all about being mobile and battery powered. This means low power and mixed signal. Customers have combined these together. The technologies need to be combined, as well.”

Another key area is verification. Erich Marschner, product marketing manager for functional verification at Mentor Graphics, said, “The verification aspects of low power are largely related to methodology because the capabilities in the tools have been developed over the last four or five years to model the effects of low power, power management and active power management. Users are still behind the curve in terms of trying to understand what to do with those capabilities. Most of the low power simulations that are done today are still done in the context of UPF 1.0 – the previous version of the standard.”

In this regard, many users still have a way to go to take full advantage of the technology available today.

Experts At The Table: Verification At 28nm And Beyond

Friday, May 6th, 2011

By Ed Sperling
Low-Power Engineering sat down to discuss issues in verification at 28nm and beyond with Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys, Ran Avinun, marketing group director at Cadence, Prakash Narain, president and CEO of Real Intent, and Lauro Rizzatti, general manager of EVE-USA. What follows are excerpts of that conversation.

LPE: The big challenge is bounding whatever you can?
Schirrmeister: Exactly. One interesting thing about the application domain is it often determines verification requirements. In the wireless area, if I have to reboot my phone once a day that’s annoying. Someone has made a risk analysis of when it’s annoying enough to buy another phone. There are other areas, like in mil/aero, where the verification has to be much more complete. In mil/aero, automotive and medical, designers are much more open to adopting repeatable and checkable processes beyond just RTL. UML adoption is great in those areas. You have a higher-level model, generate your code, and then you have formal checkers.

LPE: Does the verification process become more complex as we move from 28nm to 22nm and eventually into 3D, or is there just more data we have to deal with?
Narain: Yes, it becomes much more difficult.
Avinun: With functional verification it’s based on complexity and size. What’s new is the number of cores and embedded software you have to integrate. We don’t care if it’s 28nm or 22nm. You can have devices that may be more complex and challenging at 28nm than at 22nm.
Narain: Timing closure and power become very different.
Avinun: But I don’t see a bump from one process node to another. For functional verification it’s design complexity, the number of processor cores and the amount of embedded software.
Schirrmeister: But then you have to verify all the other aspects.
Narain: Yes, it’s timing, power, test modes, timing closure—all of these things become serious issues. If you have a 250 million-gate design, how many machines do people have to run all these processes at that level? The way you sign off is very different. You need to put a methodology in place so that when it’s broken down the pieces are done correctly and the overall process is still correct.

LPE: It sounds like we’re talking about a much broader definition of verification, right?
Narain: Definitely. The chips are failing not just because of functional issues. A bad timing constraint is just as bad as a functional issue in the design. Many times you can re-do software. People can mask off functional modes. But how do you solve a clock-domain issue that makes an interface unreliable? That device is dead. It gets less attention because an attempt has been made to make those problems more bounded. But as far as risk factors go, these physical effects are as catastrophic as the functional aspects.
Avinun: At each new node you have more opportunity to get more complexity, more features and more software into your device, so naturally it’s also increasing the functional verification challenges. One of our customers told us that by moving to the next node they got 3x more capacity to lower power. It is easier to implement those features when you have more real estate and more gates. Otherwise it would be more difficult. You will use more real estate because it’s free from a material point of view. But it’s certainly not free from other standpoints.
Schirrmeister: On top of this, all these timing effects have to be verified. One element of making this more bounded is that the components are pre-verified. There are only a couple of IDMs out there. What’s changing is which tasks the foundry user does versus the OEM, and that will change even further. From 28nm to 22nm there will be more need to be bound. More IP will have to be pre-verified so the user doesn’t have to worry anymore that the processor core won’t work.
Narain: As an example, there’s an IP with an asynchronous reset. People can re-use that somewhere else, but when the noise levels went up the reset got asserted. It became a vulnerability point. These are $1 million bugs. You don’t think this can happen, but methodologies are breaking down.
Schirrmeister: The divide is getting bigger. Gary Smith said verification is going up and going down. It’s going down to the electron spin. At the top, at the software verification level you’re worried about having 12 cores on a design. The divide between these two areas is getting bigger. The software can break. Ptolemy (software environment) stopped working at UC Berkeley when they ran it on a multicore PC because there were a couple of deadlocks that were not properly programmed. Once you introduce this complexity, things break at the software and system level.
Narain: EDA vendors develop software. Software bugs tend to be more forgiving because software can be rewritten. It’s much harder to change things in silicon.
Avinun: But if they need to spend two or three months in the lab, it doesn’t matter. It’s forgiving once you find it, but if you miss a market window that’s not forgiving.
Schirrmeister: The guy who has the unforgiving three-month schedule has to design smarter. You have to switch the fabric. In an FPGA fabric the hardware is forgiving because it’s programmable. That’s partly bound because it’s all bound for timing, and you can do a completely new thing where the hardware can be changed.
Rizzatti: It all depends on what kind of software you’re talking about. There are three major classes—the drivers, the operating system and the application software. When you deal with drivers you have less forgiveness. We developed two demos. In the first one we injected an error into the description of the driver. The error shows up in verification as hardware. In the other demo we injected an error in the hardware and it showed up as an error in the driver.

LPE: What happens when we go to stacked die? Is it even harder?
Schirrmeister: It’s just another component to make it more complex. You can model that as an interconnect if you need to, and at the base level you need to make sure the electrons take the right path. It’s another step in verification.
Avinun: It’s not so much functional verification. But there is more complexity.
Narain: The physical effects that were in the second and third order that could be ignored can no longer be ignored. They’re now all first-order effects.

Experts At The Table: Verification At 28nm And Beyond

Thursday, April 14th, 2011

Low-Power Engineering sat down to discuss issues in verification at 28nm and beyond with Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys, Ran Avinun, marketing group director at Cadence, Prakash Narain, president and CEO of Real Intent, and Lauro Rizzatti, general manager of EVE-USA. What follows are excerpts of that conversation.

LPE: Power seems to be an increasing portion of the whole design process. Who’s responsible for creating the power models?
Schirrmeister: The technology provider—TSMC or GlobalFoundries—will have to characterize their libraries for power modeling. There are high-level models for power intent. And then when it comes to libraries there has always been characterization. For high-level synthesis, you have area, performance and power. Some companies like ChipVision enhanced that to include dynamic power, which basically meant you no longer had one number for the power multiplier. It was now dependent on how the inputs were targeted. What we see today in TSMC’s reference flow 11 is they characterize their libraries for low power to make it accessible for transaction-level modeling. Then you add up meaningful power numbers that correlate to those technologies. This is in the early stages today and it’s proprietary. At one point this needs to be standardized.
Avinun: There are two topics here. One is methodology—how to do this. The other is what is the format and who owns the accuracy of the data behind the format. What hasn’t been solved with methodology is beyond TSMC. We have compared our results with real-world results. When you tell the customer, ‘Compare your back-end flow with your libraries,’ that’s not good enough. That’s only going to give you the data about your SoC or ASIC. When they look at power, they look at what they test in the lab, in real environments, and with all the other devices—the noise, the environment beyond this. That’s what you measure in the lab. First, there’s a problem measuring the power. Even the most advanced customers don’t have a good methodology. They also don’t know how to partition the die of the ASIC, so what they measure is the overall power. They don’t know how to partition those components and there is no good way to model those. We’re looking at the ASIC and the die level. But you need to model the whole system. And then, what are the key critical components of the power? No. 1 is the system power, including the chassis and the die. The second is dependent on what you are running. Beyond this, once the methodology is solved, there is an issue of formats and who controls the libraries. The other component that needs to be considered is the memory. Memory is consuming most of the power, and these models are being controlled by the memory vendors.
Schirrmeister: And the access involves software.
Avinun: At the block level there are methodologies addressing these issues. The problem is at the system level. If you go high enough too early you may have 300% error. Or if memory is 80% of your power consumption, if you haven’t decided what memories you’re going to use then all you can do is make the tradeoff analysis. But you can’t say it’s going to consume this much. Over time we will get better at simulating and emulating the different scenarios, but it’s all still in the very early phase. TSMC seems like one of the key companies to take the initiative here. It’s not just ASIC issues.
Narain: I have to question why the spreadsheet approximation will break down. For power estimation you need the characterization of the libraries and all this information from the vendor at the block level—and that’s where you get more precise power estimation information. But if you’re doing planning, why would you move over from spreadsheets if you’re 300% off?
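
The spreadsheet-style planning being debated here can be sketched as a roll-up of per-block estimates, each carried as a low/high range rather than a single number. This is only an illustration of why an undecided memory choice can dominate the uncertainty. The block names and wattages are invented, and the function names are hypothetical.

```python
# Sketch: spreadsheet-style power budget with a low/high range per block.
# Block names and wattages are invented for illustration only.

BLOCKS = {
    "cpu": (0.20, 0.30),
    "memory": (0.80, 2.40),  # wide range: memory type not yet chosen
    "io": (0.05, 0.10),
}


def budget_range(blocks):
    """Sum the low and high estimates across all blocks."""
    low = sum(lo for lo, _ in blocks.values())
    high = sum(hi for _, hi in blocks.values())
    return low, high


def share_of_uncertainty(blocks, name):
    """Fraction of the total low-to-high spread contributed by one block."""
    low, high = budget_range(blocks)
    block_low, block_high = blocks[name]
    return (block_high - block_low) / (high - low)


if __name__ == "__main__":
    low, high = budget_range(BLOCKS)
    print(f"total: {low:.2f} W to {high:.2f} W")
    for name in BLOCKS:
        print(f"{name}: {share_of_uncertainty(BLOCKS, name):.0%} of the spread")
```

Carrying ranges makes it clear when the estimate supports a tradeoff comparison but not a firm consumption figure.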

LPE: Isn’t it just complexity? At 28nm you’ve got electromagnetic interference (EMI), electromagnetic compatibility (EMC), electrostatic discharge (ESD) and all these different power islands.
Narain: But you’ll still be off in accuracy with power. The simplicity of the methodology will still persist.
Schirrmeister: There’s complexity on the technology level. But the functionality is dynamic and cannot be predicted.
Narain: When you want precision you can’t do that at the system level. If you’re doing system-level planning, you’re doing first-order estimates and second-order estimates. How precise does your planning need to be? Spreadsheets should suffice.
Schirrmeister: We’ve already seen that breaking down. Within the model there is dynamic power, depending on the inputs. And even in high-level synthesis you have schedules. You also need predictable success, and then that has to correlate back to the predictions from the beginning. There also are a lot of parameters beyond the system on chip. But if you contain it to the chip and the memory it accesses, a lot of things can be done pre-RTL if you really execute on transaction-level models. Is it as accurate as RTL? No, but the layout designer laughs about the RTL designer. In the end it comes down to predictability and correlation.
Avinun: You need to separate the planning from the analysis. For the planning you can do it with chip-planning approaches. We allow you to take data-sheet components, our legacy components, which are not the complete design, and do tradeoffs between memories and IP. And for new IP, we use high-level synthesis, which allows you to make quick tradeoffs between area and performance. If you optimize for area, then you have to optimize for performance. And then, for full system-level analysis, you need a representation of your system. That could be RTL. It doesn’t have to be high-level synthesis. But as soon as you go to the full system, it’s not accurate.

LPE: Verification has taken 70% of the NRE in design. Will that ever be brought under control at future nodes, or will it just get worse and worse?
Rizzatti: More and more, emulation is replacing simulation, especially for block-level verification. You can do a lot more work in a given amount of time. I would expect that will be a major contributor in containing the cost of verification. In a given amount of time you will achieve higher verification. And because SoCs will have more and more embedded software, it will be more complicated. Just saving one re-spin at 28nm will be $10 million or more.
Narain: If you look at the design cycle times, there was a time when people used to tape out 1 million-gate designs in six to nine months. Then they went to 10 million-gate designs in the same time period. Now it’s approaching 100 million gates and at some point in the future it will go to 1 billion gates. In six to nine months you can’t design 100 million gates from scratch. There’s a lot of re-use. SoC methodologies have improved. Simulation is very important in designing gates from scratch. Emulation is important for system-level design. But at the end of the day you still have to tape out the chip. The signoff requirements aren’t just about functional verification. The number of issues you need to look at with clock-domain crossing may increase 10x, so the signoff time will increase 10x. That’s not possible. These other methodologies are becoming much more important to the overall verification strategies. Verification is no longer just about simulation. It’s a lot of processes that are done in parallel, and they’re better served by an independent methodology to sign off. Power intent verification is one item. Clock domain checking is an item. Methodology management is also going to become very important.
Avinun: We’ve seen verification, integration, and overall hardware-software verification are becoming the key problems. If you look at the block level, most vendors and customers know how to do it. It’s still challenging, but it’s not as challenging as it used to be. The challenge now is in the integration, and we see several trends evolving here. One is on the emulation side. This may be the same solution we had 10 years ago, but it’s still one that customers are using. We also see major pressure from companies to migrate from block-level to SoCs. Some are using acceleration, but they’re also becoming smarter about the way they verify the SoC and do simulation. They partition the problem into multiple domains, they use more off-the-shelf IP and verification IP. In addition, they are moving to a higher level of abstraction. We don’t see this yet as successful as emulation, but companies say this will be one of the ways to solve the problem. They won’t be able to continue to do RTL verification. They’re going to hit the wall, which means they will be spending too much money. It’s a major change but it’s not going to happen overnight.
Schirrmeister: Simulation and hardware emulation are important, but the underlying problem is it’s different machinery to do the same job faster and better. There’s also the issue of becoming smarter about the verification. Verification is an unbounded problem, so the only solution is design. You need to be smarter about how you design the components. With block design, a lot of verification relies on the fact that the block is pre-verified. IP qualification makes sure the IP will work in the new system context. That has become an important part of the overall verification. On top of that, to make it smarter, you want to avoid verifying functional components. Instead of running all variations of an MPEG decoder on a hardware block, you make sure you have the right instructions. You need to separate the functional verification from the structural verification and move the verification into the software.

Verification At 28nm And Beyond

Thursday, April 14th, 2011

Low-Power Engineering looks at the challenges ahead in IC verification with Frank Schirrmeister of Synopsys, Ran Avinun of Cadence, Prakash Narain from Real Intent and Lauro Rizzatti from EVE.

Experts At The Table: Verification Nightmares

Thursday, May 13th, 2010

By Ed Sperling
Low-Power Engineering sat down with Shabtay Matalon, ESL marketing manager in Mentor Graphics’ Design Creation Division; Bill Neifert, CTO at Carbon Design Systems; Terrill Moore, CEO of MCCI Corp., and Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys. What follows are excerpts of that conversation.

LPE: How important is a high-level model in verification?
Matalon: If you have a reference model that is a TLM and you have a good way to find equivalency between the TLM and RTL, then why not give the TLM to software designers? For many applications TLM without timing will be sufficient for certain timing-critical tests. You also need a TLM to model timing accurately for the approximately timed level. But you can create hundreds or thousands of replicates with a TLM platform that are free. The replication is free—or almost free. For the software guys, that’s a very powerful solution.
Schirrmeister: And that’s the challenge to figure out. We haven’t quite figured out how to do equivalency checking against the TLM.
Neifert: The average SoC has tens to hundreds of blocks. If you’re starting from scratch and want to generate your TLMs, that’s a great approach. But what we’re seeing is that companies are only developing 20% to 40% of this IP internally and the rest they’re getting from outside. Who knows what form that stuff is in.
Moore: Isn’t equivalence checking intrinsically hard?
Schirrmeister: Yes, but from a verification perspective everything we do has to add up to less than what we do today. If adding TLMs to your software isn’t helping you to reduce the time you spend on verification, people will be hesitant.
Matalon: Allow me to disagree. First, I’m not talking about using formal methods to validate equivalency between a TLM and RTL. But inherently when you build an OVM environment to validate a block, you need a reference model. What does it mean to do verification of the RTL? It’s a comparison. We always validate RTL by comparison. You can use simulation techniques to say this TLM is functionally equivalent to the RTL that’s getting implemented. If your TLMs allow you to model the registers correctly and things that maybe in the past weren’t done, we can assemble TLMs. It will be the standard practice of every IP provider to provide a TLM 2.0-compatible model. For the re-used IP, which constitutes 80% of the design, I think this opens the door for replication of the TLM.
Schirrmeister: I agree with the equivalence verification. We’re not quite there with IP providers. But the challenge with TLM models is that, because you don’t have synthesis and formal techniques, they are not just an ordinary part of every design flow. They’re an additional effort. And what happens at the end is someone changes the RTL before tapeout and people don’t keep the TLM models in sync with what ends up being implemented.
Neifert: They should be generating this automatically from the RTL. Then you solve that bottom end.
Matalon: If the RTL has changed without updating the reference model, then you haven’t validated your RTL in the context of the system. It’s all about bridging between the transaction level and the RTL. To change functionality, you change two lines of code.
Schirrmeister: For us on the TLM side, it’s always an investment decision.
Moore: It all comes down to economics. Why do people do something stupid like changing the RTL without changing the model? It’s because they think they’ll make money by doing so.
Schirrmeister: And if you have a set of hundreds of them it’s hard to keep them in sync.

LPE: Let’s talk about economics. Verification used to be 70% of the NRE. Is it going up? Or is it now blurred between what’s verification and what isn’t?
Neifert: It’s getting blurred. It’s as much an integration issue as anything else. You’re obviously spending more money now, but the integration task is taking over some of verification because people are using software to drive some of this. Is it a software budget or a verification budget? I don’t think you can draw that line as definitively anymore because you probably re-use some of that stuff in your software once you verify things work.
Schirrmeister: Verification definitely is going up overall. The question is where it’s done. Hardware verification has gone down and the hardware verification manager is thrilled. But the verification nightmare has shifted to software.
Moore: The classic example is a baseband processor in a cell phone. That processor contains boot code and it has to operate the USB and operate the software during mass production. If that doesn’t work you don’t have a product. And because it’s sitting in the mask, that boot routine has got to be right.
Matalon: Verification doesn’t go down as a whole. How can it go down with increased complexity from multicore and functions that are implemented in hardware and software? It really depends on who the verification manager is. If I’m the verification manager and I’m confined to System Verilog and my life is to carve out verification for hardware blocks, my life is easier. Now there are off-the-shelf transactors and you can just add them in. But if you’re the verification manager who has to validate that your design is correct and meeting spec at the system level, and you’re also responsible for meeting performance and low power, then the load is not going down. And if you’re not keeping up with advanced methodologies, you will be in trouble.
Moore: And as each node comes along the absolute cost of failure is escalating.
Matalon: If you don’t validate early and catch what you call stupid bugs, or in some cases nasty bugs, you are in trouble. That’s where the shift is happening. The kind of verification people do will shift from the block level to the newer ESL space where there isn’t maturity yet.
Schirrmeister: Verification is never complete. It’s a question of when you are comfortable enough. But the sword of Damocles is always hanging over you. If you mess up the chip, it’s $3 million for a new mask in non-recurring costs. If it’s software, there’s always service pack two. But in the case of Toyota, the impact can be devastating.
Moore: The economics of Moore’s Law were such that shipping fast was imperative. But it’s not just Toyota. There’s a strong suspicion there have been numerous glitches in drive by wire. If you look into it, there are lots of situations where software problems are present. Verification is a hard problem and you have to really set up your workflow so you’re throwing every tool that’s economically justified at it.
Matalon: I don’t see any tool being taken off the table. It’s all methodology and which tool you use when. Not everyone is using the more innovative technologies. For someone used to waiting for the silicon to come back and then sticking it on a board and validating it, this is a huge transition. Validating by writing a model at the SystemC level and dealing with virtualization is much different. You need to know when to use the tool, use the right tool as early as possible, and take advantage of what you develop earlier during the downstream phase. If you use TLMs early, then you re-use those TLMs when you verify. And if you use silicon validation, which is not going away, then use all the tools that you have used before for reference debugging and running system-level scenarios where you replicate some of the problems you see in silicon by validating your original assumptions. If you have to debug a problem on your silicon, that’s very hard. You can use emulation as a reference for debugging. You can use a transaction-level model with timing and power information to compare what you have received, where you made the mistake and how to fix it.
Neifert: That TLM framework can be used throughout the debug cycle, which is where you get the dollars to justify it. Initially you can look at it as an incremental expense. But when you look at how it scales and you realize you don’t need to generate an independent model for this and an independent model for that, that’s where the real value is.

LPE: What are the bugs that are fatal? Are they power? Design?
Neifert: If you look at the stuff that makes the news, it was the division problem in the Pentium. It was a pure hardware bug. If you applied the verification tools of today, that would have been caught. Today it’s not just hardware. Most engineers think there’s some aspect of software.
Schirrmeister: The fatal bugs are the ones that cannot be corrected in software today and which end careers and are not reported on. Those are the ones you never read about.
Moore: The fear for most companies in our space is the bugs that kill companies. What causes those? Mask spins. And what typically causes those are system-level problems. You go to hook it up to a critical system and it doesn’t work. And it’s down at the RTL level and it’s not accessible because of all the protocols that are running too fast to make it accessible.
Matalon: If I look at a functional bug that can be overcome by software, it’s not a fatal bug. A functional bug that cannot be fixed by software is fatal. But the reality is that if you have a performance issue or a power problem, where does it stem from? It’s probably because you’ve validated hardware in isolation, but not in the context of the software. Those are the things that are fatal and scare users away. We have a customer that designed IP and wanted to find out if it would meet performance in the context of the system. They couldn’t simulate at the gate level so they abstracted to the TLM. There are ways to fix functionality sometimes with software. But I’m not aware of any way to fix a design that hasn’t met performance or power in the context of the software by fixing the software.

Experts At The Table: Verification Nightmares

Thursday, April 8th, 2010

By Ed Sperling
Low-Power Engineering sat down with Shabtay Matalon, ESL marketing manager in Mentor Graphics’ Design Creation Division; Bill Neifert, CTO at Carbon Design Systems; Terrill Moore, CEO of MCCI Corp., and Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys. What follows are excerpts of that conversation.

LPE: What’s the big problem in verification?
Matalon: In today’s modern designs, many functions are implemented in software. Obviously, performance-critical functions are implemented in hardware. We’re seeing designs moving to multicore, multiprocessor. When I look at verification, I look at how you verify your requirements have been met at various levels of implementation. That includes functional verification, meeting performance requirements, and meeting low-power requirements. Those three are getting more and more intertwined unless the designs are not performance-critical or low-power critical, and we don’t see many of those kinds of designs these days.
Schirrmeister: The meanest bugs are at the hardware-software interface. This whole notion of getting something on which to develop the software as early as possible is critical. And it’s verification on both sides. It’s verification of the hardware, the software, and the hardware-software interface. We also see an increase in people using the software to verify the hardware, which augments the traditional System Verilog test benches. It’s not just the function and the driver. That’s a new trend.
Matalon: Why do you think that’s a new trend? People have been using software to verify hardware for as long as I’ve been in EDA, which is a long time. It has been used in emulation, in products like Seamless. Maybe what is new is that there are more advanced techniques like SystemC and TLM 2.0 models and more sophisticated modeling of software. But the need to verify the hardware in the context of the software has been there for quite a long time.
Schirrmeister: You are right. There are new techniques coming in earlier in the flow. But if you look at hardware and software and you think about the drivers and the OS and the application software, what used to be seamless was the low-level drivers. But it is being applied to new techniques.
Neifert: Customers have always been finding hardware bugs by running software on top of it, but mostly that was a by-product. It wasn’t a concerted effort to find bugs in the RTL with the software. What I’m seeing more and more is a blending where they are spending time writing software specifically to test the hardware. There’s great block-level testing with System Verilog test benches, but the interplay between these blocks is really only seen when you get the software running on there. Direct software testing to get the interplay of these blocks is the emerging area.
Moore: We’re a software company, and we get involved in verification with our customers selling complete systems. What we see as the big problem is the 1+1=3 issue. If you have two systems talking to each other, typically you’re only designing the SoC for one of the two. You can’t get to some of the bugs without assembling the complete system and testing it. You can’t get the coverage. Certainly the hardware-software interactions are becoming easier with virtualization, but the stuff that really is a problem is when you take a Windows system and you hook it up to your embedded system and you find there’s something you overlooked.

LPE: In the past, many designs were static. Now the designs are in motion. Power, software and hardware all change along the way. How accurate is the verification?
Matalon: Software provides a lot of configurability. You can change it and implement different functions. We also see a lot of configurability in the hardware. One effect of configurability is that it increases complexity, particularly with verification. You basically need to deploy almost every tool in your inventory to tackle the verification problem. You have to start very early and use abstracted methodologies and technologies such as transaction-level models to model your hardware and to run enough scenarios. You can use emulation. Verifying hardware and software on silicon isn’t going away. And even though it’s less prevalent, verifying software against RTL isn’t going away. All of this is forcing use of all the combined options on nailing the system-level verification problem.
Schirrmeister: There are so many moving targets between hardware and software that what the user is verifying has to be more specific. For example, if you have a configurable Tensilica processor and that gets used to replace a hard-coded RTL block, it has a profound impact on verification. You don’t verify the processor beyond the test benches you got from the vendor. You need to verify the connectivity between the different modules. And the functional verification actually happens a lot in software while the chip is burning in the fab. You don’t need to verify all the Postscript while the chip is running. You verify the connections and the architecture. Now they’re doing software verification for the functionality, and it becomes more specific for what the user wants to verify and has to verify.
Neifert: When you look at the commoditization of the IP blocks, almost every phone has the same core set of chips in it. They really differentiate themselves based on the software. If you look at the ways consumer products are increasingly differentiating themselves, it’s based on the software. The stuff you’ve got to put into the hardware to support that is enormous. Everything out there has USB and graphics, and all the necessary software has to be integrated and support it. You may have software that only uses three-fourths of these, and then you take that same IP that was working fine in another chip, put it in a new environment and it exposes a whole new set of problems. You can have the same chip and new software is going to create new problems.
Moore: With the increasing adoption of open standards, you don’t have a closed system anymore. In the old days, even if the chip was being used in unanticipated ways it was still a closed environment. You could adjust what you were trying to do. With open standards you can’t change what Microsoft is trying to do or what the cell phone is trying to connect to. You might not want to just connect to one model, and you don’t know what they’re going to do next year. And your verification requirements may not anticipate how it’s going to actually be used.
Schirrmeister: So it needs to be future-proof.

LPE: Isn’t that even harder with derivative chips? It’s too expensive to develop one chip and they may have to last until the next node, so how do you future-proof it?
Neifert: But that’s exactly what’s happening. In the wireless space, they’re trying to design three or four years of cell phones in one chip. But few people anticipated four years ago that the iPhone was coming out. There are a lot of features being built in. And there’s a high cost if you guess wrong.
Schirrmeister: One of our customers told us they didn’t predict MP3 and they didn’t go with a programmable solution. Their chip wasn’t bought once the next standard came out. But on the verification side, it depends on what the user is looking to verify as well as the type of software. If you look at the range of software in the stack, the vehicle with which you verify the software changes depending upon the needs of the software. Starting at a very high level, if you’re downloading the iPhone SDK then you get something that’s hardware-dependent. If you go lower, you want to verify if the register fields are okay. The next level down you get to software that needs to understand the performance, so you care about memory management and cache. And when it gets into automotive safety critical, then you need cycle by cycle.
Matalon: The need to verify hardware and software is easier at various stages. But it also depends on what you need to get out first. Sometimes you need to get the hardware out first and you can modify the software. In many other cases, the software is your gating item. If you don’t have the software ready when the hardware is ready, you may not be able to tape out. The hardware may be inadequate to support the software. That’s one of the things we’re seeing today.

LPE: Inadequate in what way?
Matalon: Here’s an example. How would you know the performance of your final design if you are validating without the context of the software? You wouldn’t know the dynamic power being consumed by a device without something that really represents the workload of your application software. If you tape out your chip you might find out later that you’re missing your performance or low-power targets. You cannot validate them without software. You need to validate all three in concert: the hardware, the software, and the interactions between them. You need to create a functional model that can show you are meeting functionality, power and performance that can be used as a reference.

Experts At The Table: Low-Power Management And Verification

Thursday, March 11th, 2010

By Ed Sperling

Low-Power Engineering moderated a panel featuring Bhanu Kapoor, president of Mimasic; John Goodenough, director of design technology at ARM; and Prapanna Tiwari, CAE manager at Synopsys. What follows are excerpts of their presentations, as well as the question-and-answer exchange that followed.

Bhanu Kapoor: There are two types of power you need to consider: Dynamic power, which is consumed because you are doing some useful activity, and leakage power, which gets consumed whether you’re doing something or not.

The dynamic power has dependence on switching activity, the frequency, the capacitance and the supply voltage. There are two components of leakage—sub-threshold and gate tunneling. Gate-tunneling is addressed by advances in process technology such as metal gates. Sub-threshold leakage grows exponentially with the decrease in threshold voltage. At 90nm it was significant, at 65nm it was equal to the dynamic power, and it grows from there.
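
The dependence described above is usually written as P = a · C · V² · f, where a is the switching activity, C the switched capacitance, V the supply voltage and f the clock frequency. The sketch below is only an illustration of that textbook relationship, not something taken from the presentation, and every number in it is a made-up assumption.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """Textbook switching-power estimate in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical values, chosen only to show how the terms scale.
nominal = dynamic_power(activity=0.2, capacitance_f=1e-9,
                        voltage_v=1.0, frequency_hz=500e6)
scaled = dynamic_power(activity=0.2, capacitance_f=1e-9,
                       voltage_v=0.8, frequency_hz=500e6)

print(f"nominal supply: {nominal:.3f} W")
print(f"reduced supply: {scaled:.3f} W ({scaled / nominal:.0%} of nominal)")
```

Because voltage enters as a square, lowering the supply is the most effective of the four knobs, which is why running different modes at different voltages is attractive.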

If you look at the typical smart phone, it’s the same system-on-chip that is running different applications. These different modes of operation have different performance requirements. You can use different voltages to achieve those different levels of performance.

A typical power-managed SoC includes a power-management IC that provides different cores. One core can be a processor. And if it’s an ARM Cortex A9, there is power management in that core, as well. A second core might be for mixed signal, which potentially could require higher performance. And then this power controller, which is on all the time.

All of these power techniques have an implication on verification.

Slide5

If you look at standby leakage, one of the techniques is power gating, which is cutting off power to certain regions. If you don’t need portions of the chip to be on, you can completely shut them down. That is power gating. But that has an effect on performance, because turning on and off a function is a long event compared to a clock cycle. You need to sometimes retain the state so you can come up fairly quickly.

All of this has an effect on verification, as you can see from the following chart.

Slide6

If you can do gate-level simulation, that is very helpful. You need input/output and power connected and you need to have appropriately modified your library definitions so power is one of the variables. With domain isolation, once you shut down you have to make sure you are not sending floating values to other regions. You have to isolate it to proper ones and zeros, which you can check with isolation gates using a rule-based checker.
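
As a rough illustration of what a rule-based isolation check does, the sketch below flags any signal that leaves a switchable power domain without an isolation cell. The data model, domain names and signal names are all hypothetical; a real checker would read the netlist and the power-intent description rather than a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    signal: str
    source_domain: str
    sink_domain: str
    isolated: bool  # True if an isolation cell clamps the signal to 0 or 1


def find_isolation_violations(crossings, switchable_domains):
    """Return signals that leave a switchable domain without isolation."""
    return [
        c.signal
        for c in crossings
        if c.source_domain in switchable_domains
        and c.sink_domain != c.source_domain
        and not c.isolated
    ]


# Hypothetical design: only PD_DSP can be powered down.
crossings = [
    Crossing("dsp_irq", "PD_DSP", "PD_AON", isolated=True),
    Crossing("dsp_data_valid", "PD_DSP", "PD_CPU", isolated=False),
    Crossing("cpu_req", "PD_AON", "PD_DSP", isolated=False),
]
violations = find_isolation_violations(crossings, switchable_domains={"PD_DSP"})
print(violations)
```

Only the unisolated output of the switchable domain is reported. The signal driven from the always-on domain is not flagged by this rule, since its source never floats.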

If you have power in your simulation, a lot of rule-based issues can be addressed right up front. Over the years, simulation was not power aware. In the future, simulation will take a more and more important role. Simulation, by default, will incorporate power.

John Goodenough: We are verifying systems on chip. They’re large. They have lots of power domains to match all the application workloads that are going to be demanded on those devices. They have processors and software. Some of the domains are being switched on and off to meet the energy profile. They have virtually every technique available. The state space you’re trying to validate is therefore exploding by an order of magnitude.

One of the things we think about a lot at ARM is that it’s not so much the techniques that you can apply. It’s how you’re going to scale them to tackle these problems. There are lots of clever ways to validate, but not all of them scale effectively into workflows and onto your infrastructure. Power verification is not just about logical verification.

If you get a chip like the one below, you can mess it up in a lot of different ways.

Slide3

Usually, you can fix it in software. But you also can mess up the connectivity between the power domains. If you get your level shifter or always-on buffer or retention register wired up wrong, it’s not going to work. It’s going to be D.O.A. on the bench. A lot of chip failures are being caused by the failure to verify the integrity of the power network.

That’s a non-standard piece of verification, particularly where that interacts with the logical function of the chip and you’re trying to measure the maximum in-rush current and the average in-rush current. If you’re switching domains on and off, what’s the power domain going to look like from an electrical perspective? Is turning one domain on and turning another domain off going to put the voltages on either side of a level shifter into a pathological state that will damage or degrade the transistors and the level-shifting buffer?

There are some very interesting cross-coverage issues between what is traditionally more of the analog verification space on the power network and the logical verification space. We need, when considering power simulation, to run abstracted analog simulations, SPICE-level simulations, and cross between the two.

Unfortunately, the explosion in power states is also increasing because of the number of software states or the number of field configuration states. From a verification standpoint, not only are you adding a multiplier due to power states, you also have things like a secure or non-secure state. Will they work when a chip is configured for a single package and pinout if it uses another package and pinout? There’s an explosion in these operating modes.

The other pressure we have is making sure you’re going to hit a given schedule. In looking at the power metrics it’s important to see how they can be applied into practical workflows and how you can feed performance metrics from wherever you are in the process back up into reporting and closure reporting. If you combine the need for those two, one of the things it leads to is enterprise scaling, both in terms of infrastructure to support the simulation and how you scale this across workgroups that are not co-located.

The other problem you face is that if you do all of the verification, you’re never going to get the chip out the door. You’ve got to have a verification plan and really narrow down which of the power modes are going to be pathological and which ones can be worked around in software. A major part of this power verification is the integration of a VP of engineering risk-reduction play into a more mainstream verification practice.

We’ve come a long way in a lot of the techniques, but at the end of the day you have a block diagram that needs to be simulated. Today that block diagram consists of RTL and some way of describing the power network or the power intent and power state space of the design. You also have to support the verification IP and transactors. You need coverage across the RTL and the power descriptions. It’s not rocket science. It’s just a more complicated block diagram.

Slide4

Verifying Low-Power Designs

Thursday, January 14th, 2010

By Ed Sperling
Power islands and multiple voltages used to be reserved for cell phone and process companies, but as more companies move to 65nm and 45nm process nodes these approaches to saving power—particularly in chips with multiple cores—are becoming mainstream.

The problem isn’t in the architecture of the chips, although that certainly brings its own set of challenges. More and more, the real holdup is at the verification level. While the percentage of time spent in verification has remained relatively steady, anywhere from 50% to 75% of the total time between architectural design and tapeout, the size of the verification teams has doubled and in some cases tripled.

“Verification is the next big challenge,” said Naveed Sherwani, CEO of Open Silicon. “As an industry we have not done a good job managing verification. A new methodology would be very welcome. We have had to develop methodologies in-house to deal with this.”

Sizing up the problem
All of the major EDA vendors recognize the extent of the problem. They’ve been dealing with horror stories from the field since the 90nm process node. And according to TSMC, about two-thirds of the industry is now at that node or beyond.

The most advanced parts of the semiconductor industry are now working on 32nm and 28nm, with even more power states—on, off, sleep, and sometimes even more in-between states—more power islands and more processor cores. In the most advanced chips, some of those cores are even heterogeneous, which means they may have different voltages and states than the other cores. That allows a system to reduce power consumption overall and concentrate power where and when it’s most needed.

“When you cross 100nm, you’ve got to design this stuff in or you’re not competitive,” said Barry Pangrle, solutions architect for low-power design and verification at Mentor Graphics. “We’ve got a number of people well down the road on this. Larger companies with larger design teams can afford the engineering expense to make this work. But as more people go to more advanced nodes they’re going to be dealing with issues they never had to deal with before.”

The first thing that most designers encounter is complexity. What used to be done on a spreadsheet is much harder to manage now.

“There are a whole series of interrelated topics of increasing complexity,” said Srikanth Jadcherla, group director for R&T at Synopsys. “The state space is huge, and when you start dealing with three or four power islands it’s amazing how quickly the number of states and sequences explodes.”

It’s also amazing how complicated this stuff can get very quickly. Consider, for example, what happens when you’ve got a device and you’re checking e-mail. The processor wakes up a number of mixed signal blocks, then turns off what’s not being used. But that sequence also has to be ordered, which means you also have to order the power islands.

“You may wire it from low to high when you need to go from high to low,” said Jadcherla. “The problem is that you’re trying to predict island orders. You can create a safe graph, which is a set of possible states so you can look at a design and ask, ‘What are the safe ways this will work?’ But when you’re dealing with 36 to 40 islands, there’s no way you can set it up safely.”
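
One way to picture the ordering problem is as a set of safe states plus a check that a power sequence never leaves that set. The sketch below is a toy model under stated assumptions: the island names and their prerequisites are invented, and "safe" is defined here simply as "every powered island has its prerequisites powered," which is only one possible reading of a safe graph.

```python
# Hypothetical islands, each mapped to the islands that must already be on.
REQUIRES = {
    "always_on": set(),
    "cpu": {"always_on"},
    "radio": {"always_on", "cpu"},
    "display": {"cpu"},
}


def is_safe(powered):
    """A state is safe when every powered island has its prerequisites on."""
    return all(REQUIRES[island] <= powered for island in powered)


def check_sequence(steps):
    """Apply ("on" | "off", island) steps in order.

    Returns the index of the first step that leads to an unsafe state,
    or None if the whole sequence stays inside the safe set.
    """
    powered = set()
    for index, (action, island) in enumerate(steps):
        if action == "on":
            powered.add(island)
        else:
            powered.discard(island)
        if not is_safe(powered):
            return index
    return None


good = [("on", "always_on"), ("on", "cpu"), ("on", "radio"), ("off", "radio")]
bad = [("on", "always_on"), ("on", "radio"), ("on", "cpu")]
print(check_sequence(good))
print(check_sequence(bad))
```

With four islands this is easy to reason about by hand. The point of the quote is that the same bookkeeping becomes unmanageable without tooling once the island count reaches the dozens.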

Tales from the crypt
One of the most common mistakes that design teams make in chip engineering involves internal organization and communication. Team design and communication have to reflect what’s going on in the chip design and verification.

“We’ve seen problems in a library group, for example, where they save power in a certain way that’s different from other groups,” said Mike Carroll, product marketing manager for front-end design at Cadence. “Communications between teams is not always the tightest loop. If one group instantiates it the wrong way, you may have power shutoff without state retention.”

In a library, that can be disastrous for a system—or at least some of the system’s functions.
It’s also a big problem in flash. Consider, for example, a smart phone where the low-battery signal is flashing and the system is ready to shut down to keep enough charge in the device to maintain essential data in memory.

“If you get a phone call at that time and you pick it up, it can be disastrous for the system,” said Synopsys’ Jadcherla. “But how do you prove that? It’s not easy. You need to come up with a methodology to test it. That’s where random constraints and testing come in.”

Another problem arises when engineers route signals across other blocks or power domains. Pangrle noted that this may not show up in the block diagram, particularly if the block is powered down.

“The key is to keep the logical hierarchy matching the physical hierarchy,” he said. “But design teams are not experienced with that. Another problem is that the signal may not be the same on one side as the other.”

That can also happen at advanced process nodes with process variations—an issue that no one even paid attention to at 130nm. At 45nm, it can be the difference between a functioning chip and a buggy one.

Advice from the experts
Low-power experts have consistently advised design teams to think about low power at the architectural level, and nothing has changed in that regard. What has changed is the number of possibilities for verification. Adam Sherer, product marketing manager at Cadence, said that the number of possible states is two raised to the number of power domains. So if there are two domains, there are four possible states; three domains yield eight, and so on.
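A few lines of arithmetic show how fast that count grows. This is a minimal sketch that assumes every domain is simply on or off and can switch independently. It ignores retention states and multiple voltage levels, which would push the numbers higher. The function name is made up for the illustration.

```python
def power_state_count(num_domains: int) -> int:
    """Count on/off combinations for independently switchable power domains."""
    if num_domains < 0:
        raise ValueError("num_domains must be non-negative")
    return 2 ** num_domains


# Domain counts chosen to match figures quoted elsewhere on this page.
for n in (2, 6, 30, 40):
    print(f"{n:>2} domains -> {power_state_count(n):,} on/off combinations")
```

Two domains give four combinations and six give 64, but 30 domains already exceed one billion, which is why exhaustive enumeration stops being practical.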

“Verification does not have a theoretical limit, but pragmatically there are limitations,” Sherer said. “The problem is coverage. If you can manage to create a loop, you can extend it to the power domains. We’re seeing the same from the functional teams. Randomization testing is where the functional coverage comes in. As long as there is coverage and you can see functional sequences you have vision into the power domain space. It has to be able to come out of shutdown and on the implementation side it has to work.”

That means establishing power intent so you shut off something at a particular time.
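Sherer’s link between randomization and coverage can be shown with a toy model. The sketch below is an assumption-laden illustration, not a description of how any simulator collects coverage: the domain names are hypothetical, each random step toggles exactly one domain, and coverage is just the fraction of on/off combinations the walk has visited.

```python
import random
from itertools import product

DOMAINS = ("cpu", "modem", "audio")  # hypothetical domain names


def random_power_walk(steps, seed):
    """Toggle one randomly chosen domain per step; return the set of states seen."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    state = {domain: False for domain in DOMAINS}  # start fully shut down
    visited = {tuple(state[d] for d in DOMAINS)}
    for _ in range(steps):
        domain = rng.choice(DOMAINS)
        state[domain] = not state[domain]
        visited.add(tuple(state[d] for d in DOMAINS))
    return visited


def coverage(visited):
    """Fraction of all on/off combinations that were visited."""
    return len(visited) / 2 ** len(DOMAINS)


def uncovered(visited):
    """List the on/off combinations the walk never reached."""
    every_state = set(product((False, True), repeat=len(DOMAINS)))
    return sorted(every_state - visited)


seen = random_power_walk(steps=20, seed=2009)
print(f"coverage: {coverage(seen):.0%}")
print("never reached:", uncovered(seen))
```

The list of unreached states is the useful output: it tells the team which shutdown and wake-up combinations still need a directed or more tightly constrained test.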

All the EDA companies say that a verification methodology helps as well, although each favors its own flavor, whether it’s OVM or VMM. Other higher-level abstraction standards such as CPF, UPF and TLM 2.0 also help significantly.

“With TLM you can figure out what’s in hardware and what’s in software and which blocks run at which voltage,” said Pangrle. “Then you can put in which blocks to shut down entirely and specify the power states.”

And if you can create an effective coverage model based upon those factors, then at least you have a chance of getting a working chip out the door on time and possibly within budget.

Experts At The Table: Rising Complexity Meets Verification

Friday, December 18th, 2009

By Ed Sperling

Low-Power Engineering sat down to discuss rising complexity and its effects on verification with Barry Pangrle, solutions architect for low-power design and verification at Mentor Graphics; Tom Borgstrom, director of solutions marketing at Synopsys; Lauro Rizzatti, vice president of worldwide marketing at EVE; and Prakash Narain, president and CEO of Real Intent. What follows are excerpts of that conversation.

LPE: With multiple power islands and states, how do you ensure your coverage model is complete?
Borgstrom: The overall verification challenge explodes when you get into some of these low-power design styles. It’s not just coverage. It’s also catching the bugs in the first place. Our surveys show more than half the companies are using some sort of low-power design technique. Where a couple of years ago we saw one or two power islands, five or six is more common today. We’ve seen designs with up to 30 different islands. In the past, you could do ad hoc techniques. With even five or six power domains, you need some automation in there, whether it’s UPF-driven flows or multi-voltage simulation. This isn’t just for analog-like verification, either. It’s functional verification that the design actually works.
Pangrle: This is a trend that will continue, too. The amount of circuitry you can put on a chip is continuing to scale. People are using that to put on more processors and more cores. AMD came out with its first Opteron in 2003 and now they’re up to six cores. It doesn’t look like AMD and Intel are about to stop adding more cores. These chips are going into servers and often there are situations where some of these cores are idle. They’re all being measured by how much power they’re consuming in an idle mode. At that point, each one of those cores becomes a candidate for a power island so you can throttle it back or shut it down. The number of islands is just going to continue to scale. Having the process from the beginning, and looking at how you’re going to partition it with a format like UPF to track that information as it’s going through the design flow opens it up for the tools to look at what’s stored there and what the power intent is. That allows you to develop tests around it so you can make sure you’re verifying those different cases and different modes. But the reality is you won’t have all those states operating at the same time, so you can specify these are the allowed modes at one time.
Borgstrom: This also requires a shift in methodology and in how people go about verifying these low-power designs. What are the best practices in architecting a verification environment for low-power designs? You need to make sure you verify the power management unit and all the power transitions. One of the things we’ve done in collaboration with ARM and Renesas is write a book on low-power verification techniques called “VMM for Low Power.”
Rizzatti: The road map from Intel shows that by 2011 it will have 4 billion gates and 128 cores.

LPE: That’s the Larrabee chip, right?
Pangrle: Yes. And Nvidia has 512 cores. It’s 16 streaming processors, each with 32 cores.
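Pangrle’s earlier point, that designers can declare which modes are allowed instead of verifying every combination, can be sketched as a small table. Everything named here is hypothetical: the four core islands, the mode names, and the helper functions. UPF expresses legal power states in its own command syntax, which this sketch does not attempt to reproduce.

```python
ISLANDS = ("core0", "core1", "core2", "core3")  # hypothetical island names

# Allowed modes: each name maps to the exact set of islands that are on.
ALLOWED_MODES = {
    "sleep": frozenset(),
    "idle": frozenset({"core0"}),
    "balanced": frozenset({"core0", "core1"}),
    "full": frozenset(ISLANDS),
}


def mode_of(powered_on):
    """Return the allowed mode matching the powered-on islands, or None if illegal."""
    on = frozenset(powered_on)
    unknown = on - set(ISLANDS)
    if unknown:
        raise ValueError(f"unknown islands: {sorted(unknown)}")
    for name, islands in ALLOWED_MODES.items():
        if islands == on:
            return name
    return None


def state_space_reduction():
    """Return (number of allowed modes, number of raw on/off combinations)."""
    return len(ALLOWED_MODES), 2 ** len(ISLANDS)


print(mode_of({"core0", "core1"}))   # a declared mode
print(mode_of({"core2"}))            # not declared, so flagged as illegal
print(state_space_reduction())
```

Tests then target the declared modes and the transitions between them, and any observed state that maps to no mode is treated as a bug rather than as one more case to cover.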

LPE: But with low power, all of this has to be designed up front. Does verification need to be considered up front, as well?
Narain: Architects have to consider performance, timing and power.

LPE: But it’s also a business case that it has to come out the door on time, right? Do we need verification IP?
Borgstrom: More and more, verification is becoming a limiting factor on the scale and scope of the most complex designs. We’ve talked with customers that have scaled back the functionality of their designs and rejected changes in their design because of the impact that will have on verification schedules.

LPE: You mentioned VMM, and there’s been a lot of talk about how that stacks up against OVM. Does it matter which methodology verification engineers use?
Pangrle: Open standards matter. Mentor has donated technology for UCIS (Unified Coverage Interoperability Standard) and everyone has access to the UCDB (Unified Coverage Database) work that we have put together in terms of helping track information on coverage. Having that kind of format in the industry where you have the freedom to go from one vendor to another helps speed the adoption of these new technologies. If you’re using the tools from only one vendor then you’re at risk because if anything happens to that vendor you’re stuck. If you have a choice, you’ve got options down the road.
Borgstrom: The debate sometimes gets a bit tiresome. The industry seems to love a controversy. The VMM came out in 2005, so it’s been in production for five years. It has more than 500 successful tapeouts, lots of companies using it, and it has evolved and expanded since we first launched it. We’ve had quite a lot of interaction with our customers around methodology as we develop and enhance this. One thing we’ve heard is that customers want a single industry standard methodology that’s driven by an open standards body so they can have interoperable verification environments. When we first released VMM it was the first open methodology with a specification. We published details on the library and the methodology. We then open sourced the library and the applications built on top of it.
Pangrle: Sometimes you guys have nasty ties when you download your software, though. There are strings attached.
Borgstrom: It’s a standard Apache 2.0 license. There are no strings attached.
Pangrle: The .lib parsers supposedly are open, but there are statements in the access language about if there’s ever any dispute arising between the two companies then you immediately lose access.
Borgstrom: I thought we were talking about verification here. I’m not the right person to talk to about .lib. But VMM is available under Apache 2.0. In any event, there are two methodologies that have gotten attention. The cry I hear is from users who want one standard methodology so they can get on with innovating.
Pangrle: Does that mean Synopsys is going to support OVM?
Borgstrom: We support developing an open industry standard driven by an industry organization like Accellera. There has been great work done by the Accellera subcommittee. The next step will be to come up with a common base class library that will go a long way toward bringing unity and progress here.

LPE: What do the other participants think about this?
Narain: We’re neutral. We don’t have a stake here.
Rizzatti: We run a survey each year, and one of the questions is VMM vs. OVM. VMM is ahead of OVM in terms of checkmarks by visitors, but it’s not by much.

LPE: What’s the next big challenge for verification? Is it complexity, is it integration?
Narain: It’s the cost. And verification is such a multidimensional problem today that no one way is going to solve it. The only way to deal with this is to break it up into its pieces and to have the most cost-effective solution for each piece. One of the biggest problems today is simulation. We’re trying to throw everything into the simulation cauldron. We have to find a way around simulation. That’s where formal technology becomes important. But formal is an engine. If you don’t package it properly it’s useless.
Pangrle: It really is a cost-driven process in the end. People are trying to figure out how to get chips out the door with the least amount of expense and in the least amount of time. It really is more than just a point tool solution. Having tools that can go through and automatically look at the testbenches and vectors you’re running for your verification can improve the coverage with a fraction of the tests. Rather than just looking at how fast two simulators are, if you look at the whole system you can cut down the number of tools and get better quality results.
Rizzatti: Part of it is cost, but part of it is saving respins and not being late to market. If you’re two months late you can lose one-third of your revenue. That’s hundreds of millions of dollars.
Borgstrom: Two of the biggest drivers for verification are cost and software. They’re related, and they’re being driven by the complexity of devices today. What’s important is that there are different types of verification done at different phases. Whether it’s algorithm analysis or transaction-level modeling or RTL simulation or analog/mixed signal simulation or hardware-software validation on a virtual prototype—all of those have to work together in a flow. Being able to successfully transition from one phase to the next and making sure all the tools work together is really important.
