Posts Tagged ‘Magma’

Experts At The Table: Billion-Gate Design Challenges

Friday, April 1st, 2011

By Ed Sperling
Low-Power Engineering sat down to discuss billion-gate design challenges with Charles Janac, CEO of Arteris; Jack Browne, senior vice president of sales and marketing at Sonics; Kalar Rajendiran, senior director of marketing at eSilicon; Mark Throndson, director of product marketing at MIPS; and Mark Baker, senior director of business development at Magma. What follows are excerpts of that discussion.

LPE: If you don’t look at this as 1 billion gates, but instead look at it from the standpoint of subsystems, is it easier to justify from a business standpoint?
Browne: Yes, because you know these customers. You’re tier one, two or three in this segment, and you know what to put together. You may be nowhere in another segment. So here you do something original. Here you try something new.
Rajendiran: There is a difference between a derivative and a variant. You can start out with one chip, do a re-spin and get a derivative. A variant is where you start out with a big system and then that hardware is given to all the divisions in the company. Each product line comes up with what to do to create a variant, mostly in software.
Janac: Is a variant a software re-spin?
Rajendiran: That’s what we’re seeing. It’s like a superset.
Browne: We’ve seen that in a lot of companies, too. You don’t know what you need for a particular market so you create a superset.
Rajendiran: Not only don’t you know what you need, but the markets are changing. You don’t have time to figure it out. Are you really going to do a billion-gate design from scratch? Probably not. When you do a new chip the traditional defect density model tells you that your yield is low. So can you easily take what you have and do it in four chips? This isn’t the traditional way of doing integration. If I can make them into four chips and tie them together with 2.5D, then you get better yield.
Browne: Or can company ‘A’ race to market with multiple chips? If not, then the slow and steady guy may win. How far do you jump out ahead before it’s off a cliff?
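
The yield argument above can be made concrete with a small sketch. It assumes the simple Poisson defect-density model, Y = exp(-A × D0), which is one common textbook form and not necessarily the model the panelists had in mind. The die area and defect density are hypothetical, and the comparison ignores interposer cost and assembly yield.

```python
import math


def poisson_yield(area_cm2, defect_density_per_cm2):
    """Fraction of good dies under the Poisson model Y = exp(-A * D0)."""
    return math.exp(-area_cm2 * defect_density_per_cm2)


def silicon_per_good_system(total_area_cm2, num_dies, defect_density_per_cm2):
    """Silicon area (cm^2) fabricated per working system.

    Assumes the logic is split evenly across num_dies, that each die is
    tested before assembly (known good die), and that assembly itself
    never fails. These are simplifying assumptions.
    """
    die_area = total_area_cm2 / num_dies
    die_yield = poisson_yield(die_area, defect_density_per_cm2)
    return num_dies * die_area / die_yield


# Hypothetical numbers: 4 cm^2 of logic, 0.5 defects per cm^2.
TOTAL_AREA = 4.0
D0 = 0.5

monolithic = silicon_per_good_system(TOTAL_AREA, 1, D0)
split_in_four = silicon_per_good_system(TOTAL_AREA, 4, D0)

print(f"single-die yield:   {poisson_yield(TOTAL_AREA, D0):.1%}")
print(f"quarter-die yield:  {poisson_yield(TOTAL_AREA / 4, D0):.1%}")
print(f"silicon per good system, one die:   {monolithic:.1f} cm^2")
print(f"silicon per good system, four dies: {split_in_four:.1f} cm^2")
```

Under these assumptions the four-die version wastes far less silicon per working system, which is the effect Rajendiran describes. Whether that survives the added packaging cost is a separate question the model does not answer.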

LPE: On a multicore/many-core implementation, are these core sizes becoming more heterogeneous?
Throndson: There’s definitely a lot more interest in that area. One of the more popular configurations in the application processor space for these Internet-connected applications is in mobile or the digital home. It may have floating point or no floating point, which can affect a significant chunk of the core size. That works on other features, too.
Browne: It’s hardware vs. software.
Throndson: Yes. Software needs to be a little bit more aware of where those dedicated resources exist, but that’s a manageable problem. It definitely helps to save power and area, though.

LPE: EDA traditionally has been one size fits all. Are the tools moving in those directions?
Baker: System-level change based on applications is very interesting. Right now we’re in a vertical space and there are functional verification, custom design and digital implementation areas. All of us are trying to find ways to automate the process by abstracting it up a level to get to an answer more quickly. The EDA industry needs to make tradeoffs on area, power and cost so we can add productivity to the design teams. Everyone is working on that now.

LPE: It used to be a tradeoff between power and performance. Is performance no longer an issue?
Janac: It depends on the market. If you’re in DTV and you’re operating on a 25% or 30% gross margin, the die size becomes very important because it’s so cost-sensitive. If you’re in a high-margin base station, area is less important. It’s all performance. It depends on the market. But in the billion-gate chip, the big concern will be risk. People get fired for being late and for quality problems.
Browne: But risk is different things for different people. Samsung’s president said his company will be using TSVs in 2013. There are ‘Haves’ and ‘Have Nots.’ If you need to get there first you’re going to have a different risk profile than if you’re a follower. And it’s a whole continuum.
Janac: But whoever is in charge of the Samsung TSV chip is going to get fired if he doesn’t get there by 2013. He’s got to be very cognizant of the implementation risk he’s going to take to do the project.
Browne: And someone else will get fired if the factory isn’t full.
Janac: But the guys who create the design don’t get fired if the factory isn’t full. They get fired for not delivering on time, on scope and with quality.

LPE: Will we be able to get these chips out the door on time with a billion gates?
Browne: We have to improve on quality at the same rate as we improve on dealing with complexity. It’s a marathon race, not a sprint.
Rajendiran: Some companies can afford to take a huge risk. Hopefully other companies will be smarter about how they approach this. It is important to differentiate by market. But there are more ways to get there than just by following Moore’s Law. We don’t have billions of dollars to write off.

LPE: Isn’t some of this about getting more granular in the design?
Janac: The key in a billion-gate design is how you manage the partitioning and the IP re-use. You need to understand the risk of not redoing the IP, as well as the risk of redoing it.
Browne: It’s all about how it works in the system. The guy with more understanding will have the ability to reuse more cleverly.
Baker: Certain companies will rise and succeed because they’ve built the knowledge base internally.

LPE: What happens on the manufacturing side? How do you manage yield issues?
Rajendiran: At any process node it’s the same. One thing the better foundries do is apply their learning to get to a better level of yield. The more chips you do, the more expertise you have, the better you get. We’ve done it and learned it with in-house expertise. You have the building blocks, the tools and the expertise. That’s what sets one company apart from another. Anyone can buy the tools, but can they produce it?

Experts At The Table: Billion-Gate Design Challenges

Friday, March 25th, 2011

By Ed Sperling
Low-Power Engineering sat down to discuss billion-gate design challenges with Charles Janac, CEO of Arteris; Jack Browne, senior vice president of sales and marketing at Sonics; Kalar Rajendiran, senior director of marketing at eSilicon; Mark Throndson, director of product marketing at MIPS; and Mark Baker, senior director of business development at Magma. What follows are excerpts of that discussion.

LPE: Will anyone be able to afford to create these complex chips in the future?
Janac: Sure, but it will be extremely expensive.
Browne: Apple is doing it. They’ve come at it with a systems approach. The user will have a great experience because they’re going to add a whole bunch of devices. But we’ve got to find ways to attach to the software at a higher level. We’re doing a full system design. We’re not hooking up a couple of widgets anymore.
Baker: Apple has moved up the stack. From an EDA standpoint we see all these challenges. We’re actively seeing designs at 28nm, planning for 20nm. We’ve yet to see designs at 14nm. But the complexity of validating one of these devices, whether it’s a single die or a multiple-die approach and in the future 3D, is increasing by orders of magnitude.
Browne: With 100 times the number of elements you can’t just extend the methodologies we use today. You have to define the interactions so you can abstract this. You can’t manage this many power domains when the use models are different for all the users. There may be 200 things you’re turning on and off to reduce leakage and increase battery life. To date, most people haven’t done that. In the rush to get to production people want to know if it runs Android or Angry Birds, not whether you’ve done all the power management stuff up front. We’re back to the speed of execution in getting it almost right and being early.
Rajendiran: That’s correct. Verizon, after years of rumors, finally launched the iPhone. But as they got near to release they said it cannot do multitasking. Who was asleep at the wheel? Then the next day they had a software fix to enable that. Why didn’t they think about it ahead of time? With all these complications we should really partition who does what.
Browne: Yes, it’s a system problem.
Rajendiran: But it’s something people could have easily thought out ahead of time. We need to define the components that need to be addressed and give it to the people who can address it. If you take a processor and optimize it for a set of libraries vs. another set of libraries, for the same performance level, one might take a third of the power of the other one. But who should tell you that? Should it be the company that makes the processor or the company that builds the SoC?

LPE: But increasingly you’re not building the chip. You’re integrating parts.
Throndson: You can see people racing ahead of each other, depending on the pieces you’re considering. Part of it is just a matter of getting to market early with a solution. But in terms of parallel hardware, it’s still way out in front of parallel software. Even with power, part of the answer is going back to better utilize the hardware that’s already there, whether it’s the processor itself or at the larger system level. It’s very difficult to optimize and deliver every component that goes into these systems today.

LPE: From the network-on-chip perspective, will these chips be running at the same node and power or will there be an array of nodes, power and legacy technologies?
Janac: You’re going to be dealing with multiple processes and legacy applications. It doesn’t make sense to put analog IP on a 16nm design. You will have to use multiple die using a system-in-package approach where the digital part of the system is running at the latest nodes optimized for low power and cost and the analog stuff is running on trailing-edge processes where the IP is available.
Browne: We’re building a system using building blocks, and good enough wins if it’s early enough. The more you re-use, theoretically, the quicker you can get there. But the real challenge is how you better enable mix and match in the software area.

LPE: And that ‘good enough’ is also tested well enough?
Browne: Good enough has programmability. The fabric allows reprogramming. We think it’s important to be able to do things in parallel. If you can get enough of them done simultaneously, even if they’re running slower, then you don’t need buffers to manage those serial events and you have less logic and less wires and slower transistors in the linear area of design. That also means there is less leakage.

LPE: Will the tools be able to deal with this kind of structure?
Baker: Re-use has been around for about 15 years. So what’s preventing the re-use? A lot of that scaling and functionality is available today. It’s not a new challenge. The challenge we face is that re-use isn’t happening. We’re redesigning these components with each iteration.
Janac: Once you get past RTL the tools are horizontal. The chain of synthesis, place and route, verification and DFM is applicable to that entire system. Above RTL it’s like the silos of IP. Those tools are not addressing that. The MIPS and ARM processors each have their own tools. Arteris’ NoC has its own tools. You wind up with horizontal silos where the IPs are tied to the tools. Only when they reach RTL do they hit the Magma, Mentor, Synopsys and Cadence tools. There is no horizontal toolset that can handle all of these IPs at the architectural level.
Rajendiran: There’s no reason to keep up with Moore’s Law for things that have already been certified and verified. In the old days we were following it. When Moore came up with that law he wasn’t talking about cost. He was talking about transistors. At that time you could do a chip for $50,000. That’s not the case anymore. People are slowly coming to the realization that if you have a chip working, why bother re-doing all of it? You can put software on it, you can even re-do it on the latest process, and use an interposer to make it work. So 90% of the chip is already validated. You add new software and you get the chip out sooner.
Browne: You also cover more markets, which adds more complexity to the definition. The requirements are different for a smart phone and a tablet computer.

LPE: But some of the functionality may be the same between a smart phone and a set-top box, right?
Browne: Yes, and that’s why the big companies have more data points. They know which subsystems can be re-used. When you’re doing audio on these devices everything works. When you add more cores or video, it’s different. The guys with a bunch of technology in-house just need to add more things out of what they already have.

LPE: How many of these billion-gate designs will be on 2D structures vs. 2.5D or 3D?
Rajendiran: With 3D, the problem is more on the manufacturing side. When you drill a hole there are problems. It’s just a matter of time before full 3D works.
Browne: The fabless community is huge. There are $3 billion fabless companies that have very expensive product portfolios. There are also startups that build similar point devices to try to go after those markets. The difference is the big guys get to run more experiments. The little guy only has one.
Janac: The answer depends on what you’re trying to do. If you’re building a unified chip that fulfills a unique function, throwing it on 16nm process makes sense. If you’re mixing functions that are mixed signal, analog, RF or legacy it makes sense to put it on more die. But fundamentally the mixed-die approach is more expensive than trying to put it all on a single die in 2D, assuming you can use one process and the IP is all packaged correctly.

LPE: How many derivative chips do you need to get these days to make it economically feasible?
Browne: At 28nm the cost is about $80 million. How are you going to get that back?
Janac: People who make wireless chips are spinning them off into automotive and home gateways, so you wind up with seven to 10 derivatives for a successful platform.
Browne: In some cases a subsystem is re-used, in others it’s the same chip.

Experts At The Table: Billion-Gate Design Challenges

Thursday, March 17th, 2011

By Ed Sperling
Low-Power Engineering sat down to discuss billion-gate design challenges with Charles Janac, CEO of Arteris; Jack Browne, senior vice president of sales and marketing at Sonics; Kalar Rajendiran, senior director of marketing at eSilicon; Mark Throndson, director of product marketing at MIPS; and Mark Baker, senior director of business development at Magma. What follows are excerpts of that discussion.

LPE: What are the big issues we need to contend with in billion-gate designs?
Rajendiran: Billion-gate designs are no longer a fantasy. We can do that at 28nm with a 20 x 20 mm chip. But just to put this in perspective, when we first sent a man to the moon they had three computers. The power and the memory those three had together was less than we have in a phone today. So the question you have to ask is, are you really putting that to good use? And from a business perspective, will it work when it comes out and who can help across the business value chain?
Baker: We’re approaching billion-gate designs in the GPU or microprocessor area. In the SoC area, we’re approaching about 100 million gates. In the next generation, we’ll see SoCs with quad cores. Beyond that, there will need to be some very significant changes in what kinds of applications we can apply those to and how we’re going to deal with the power aspects. These will most likely be in the mobile market and we’re going to have to deal with system-level issues like verification, battery life, and power. From an EDA perspective we’re on track for capacity and for some of the turnaround time, but power will need some of the focus.
Throndson: Process migration hasn’t continued to scale forward. We hit a performance wall years ago. Power hasn’t scaled, either, as we reached some of the smaller geometries. Area is the one piece that is scaling better, which enables these large numbers of gates. The keys here are systems integration and multicore processing horsepower.
Browne: When you look at design costs for billion-gate designs you have to look at the markets that are going to drive them. The mobile market has enough volume to handle the cost of these types of designs. It also has a lot of parallelism and concurrency because there is a lot of functionality, and there are a lot of different use scenarios. Traditional EDA is scaling so it can take advantage of this—traditional designs partitioned at a chip boundary in a way that fits well with the system architecture. That’s probably where 80% of us will see business opportunities. The other 20% is where you take a design and partition it across two chips. Their bigger challenge is on the tool and the architecture side and the ability of semiconductor and system companies to manage that level of complexity. When you scale to four or eight cores, there’s a huge amount of parallelism and on-chip memory. The issue we see is how you get that right, and today the solution is a lot of subsystem design. LTE radios are a good example. We’re going to replace GSM radios with LTE radios. They’re going to be 15mm of area and have a half-dozen DSP cores, but it’s going to be a standalone system that allows you to do verification, have a known good block, and which is characterized with the others. But you can’t do this as a billion gates at the top level.
Janac: What I have in my house isn’t a personal computer. My phone is a personal computer, and it will have everything I need in terms of data, family photos, passwords and payment systems. It’s more like a supercomputer and it’s going to be the driver for the billion-gate design. You’ll need storage and the computing power to make this a true PC. There are four criteria for this. The first is processing power. We’re going to have to go to many cores, so you’ll need cache coherency to utilize those cores from a programming perspective. Another key is integration. How do you bring these cities of silicon together, which is where the communication system for the SoC becomes critical? You also need partitioning. As you build more and more functions, those functions have different dynamics. The modem has to go through SoC evaluation, so it’s on an 18-to-24 month cycle, whereas the efficient digital SoC people are going to be on an annual cycle. You have to decide whether you’re going to put it on one die or multiple dies, whether you can stack the functions, and whether you can mix processes in the same dies. The partitioning and the support for the partitioning are going to have to be there. The last part involves the cost of the hardware and software. The hardware cost has been increasing slowly but the software has been increasing rapidly. So how can you use the hardware and the parameters in the hardware to lower the cost of embedded software, if not the operating system?

LPE: Will an increase in granularity in designs, in terms of various core sizes, wider I/O and multiple cores and processors, affect how we build these devices?
Janac: We’re going to have tremendous power, but we’re not going to be able to afford to keep it all on. When you’re doing graphics the GPU will be on and the rest of it needs to be shut off. For audio it will be the same. You need to be able to manage turning on and off of this functionality. And in terms of 3D silicon, some of the high-power parts of the chip such as RF and some of the modems probably need to be on a different die and connected through wide I/O and TSVs (through-silicon vias). These things will need very intelligent and capable power architectures. While you have more transistors you’re still dealing with the same power budgets.

LPE: Won’t it be even tighter budgets? In 3D stacks, the dies are actually thinner?
Browne: The thermals are better in those packages, though. Even though the dies are thinner there is a lot better coefficient with the bonding. But it’s still a problem.
Throndson: But the power source is not scaling with the demands.
Browne: We’re seeing designs today with a dozen to 100 power domains. Those are at 40nm. We have customers starting 14nm designs now. You’re going to have to move to abstractions. There are 1,000 voltage domains. Somebody will have to have a product that generates the HAL (hardware abstraction layer) of software. We generate RTL. Generating RTL and C code are not that different. That’s where you’re going to see a lot of growth in the supply chain.
Rajendiran: If you look at 130nm, we used to have one type of transistor. Now we have multiple types of transistors and different process flavors, which add a level of complexity. You now have a whole bunch of different libraries, depending on which type of transistor you use. That’s an opportunity and a challenge. How are you going to pick and choose your implementation? Then you throw in a billion transistors, and you’re talking about putting it into a single SoC. It’s going to cost a lot of money and you don’t even know if you’re taking the right path to optimize power, performance and the market. And most of it is driven by consumer markets where each person will use a device differently. What you put on the chip affects battery, performance and even leakage. There are great opportunities, but it’s also more complex. It comes down to who can you partner with for the software, for planning the product, and for implementing the chip in hardware. And it really needs to be tied together so you hit the product introduction times.

Billion-Gate Chips

Wednesday, March 16th, 2011

Low-Power Engineering examines hurdles ranging from power to cost in billion-gate IC designs with Charles Janac, CEO of Arteris; Jack Browne, senior vice president of sales and marketing at Sonics; Kalar Rajendiran, senior director of marketing at eSilicon; Mark Throndson, director of product marketing at MIPS; and Mark Baker, senior director of business development at Magma.

Experts At The Table: The Reliability Factor

Thursday, January 28th, 2010

Low-Power Engineering sat down to discuss reliability with Ken O’Neill, director of high-reliability product marketing at Actel; Brani Buric, executive vice president at Virage Logic; Bob Smith, vice president of marketing at Magma, and John Sanguinetti, chief technology officer at Forte Design Systems. What follows are excerpts of that conversation.

LPE: Is a more complex supply chain causing reliability issues?
Sanguinetti: That does happen as a result of disaggregation. There’s third-party IP and the associated issues of putting it all together. The most common problem is when you buy a piece of third-party IP and you want it to do something just a little bit different. You make a change—or a third party makes a change and it isn’t fully tested. That’s the way bugs get introduced.
Buric: This touches the gray area between quality and reliability. That’s why people have standards. TSMC is developing TSMC 9000, which is just one measure of an ISO standard. When you build a complex device that’s how you make sure all sub-components conform to quality standards. That’s part of the equation. If you can’t establish quality standards you cannot manage a design these days.
O’Neill: Quality standards would mean things like toggle coverage and code coverage for validation and test programs?
Buric: Yes. Whatever you may need.

LPE: How disaggregated is the supply chain?
O’Neill: It’s becoming more disaggregated. Years ago when I started with Actel we designed the entire chip. We designed the I/Os, the logic, the routing and the programmable interconnect. Today, because of the complexity of our current and future generations of FPGAs, we’re sourcing pieces of IP from outside of our own company. There’s a supply chain issue there. We have to choose our suppliers wisely, as well as the products we purchase from them, and ensure that we’re consistent with our own verification and validation techniques. Going forward, we will do more and more sourcing of our IP as our products become more complicated.

LPE: As the foundries impose restrictive design rules at future process nodes, how will that affect reliability?
Buric: Restrictive design rules have always been there. They have not always been so obvious, though. They help reliability. If you don’t use them, then you have a fundamental failure in yield. Reliability is a marginal case of yield. They are establishing rules to minimize freedom of implementation on silicon, which will result in higher reliability. As the process matures, some of those rules may be waived. They can learn enough to see it is not impacting anything.
Sanguinetti: Over time, our coding requirements have gotten more strict. They don’t have the effect of design rules, though.

LPE: But isn’t this complexity hitting every part of the flow?
Sanguinetti: Yes, it is, and we’ve had to modify our product over time to put out RTL that is more regular and follows different rules. That has been a maturation process. But it’s on the order of 10s or 100s of rules, not thousands.
O’Neill: We see some movement in the end-user community and among their customers to impose coding standards, as well. If you’re doing a design for an industrial process at a petrochemical plant or something that’s safety critical, you may be working to a certain specification imposed by the intended operator of the plant. They will impose coding standards. Similarly, for commercial aircraft we’re seeing the certifying authorities imposing standards on the contractors, who in turn are purchasing FPGAs or ASICs to implement critical digital logic.
Buric: The European Community has been doing that with their contracts for the past 20 years. Companies have had to follow their coding style or they would not get the contract.
Smith: The challenge for the restrictive design rules is that if the foundries are too restrictive, people will not move to the next node. It will be too restrictive, too expensive, and far less attractive. We have a lot of customers designing at 40nm, but a lot of design is still being done at 180nm. There are reasons to go to 40nm, namely density and power, but there’s no free lunch. It costs more to do designs for all the reasons we’re talking about: verification, reliability and everything else.

LPE: Are the economics of reliability changing? Does it become more expensive to guarantee reliability at future process nodes?
O’Neill: Absolutely.
Smith: The number of rules and the number of cases you need to check for goes up, so it’s more expensive.
Buric: Reliability is becoming a liability. You have to design with that in mind and it costs you money. I have seen people doing it for alpha particles where it’s two-bit detection, one-bit correction. I’m now seeing algorithms with 8-bit detection. People need it or they wouldn’t do it. It costs silicon and it costs in the design.
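
The "two-bit detection, one-bit correction" scheme Buric mentions is commonly built as a SECDED code (single-error correction, double-error detection). As a minimal sketch, the example below uses an extended Hamming (8,4) code on a 4-bit value. Real memories protect much wider words, and the function names here are illustrative, not taken from any vendor's IP.

```python
def encode(data_bits):
    """Encode 4 data bits into an 8-bit SECDED codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming positions (parity at 1, 2 and 4; data at 3, 5, 6 and 7).
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit
    return [overall] + word


def decode(codeword):
    """Return (data_bits, status); data_bits is None on a double error."""
    word = list(codeword)
    syndrome = 0
    for position in range(1, 8):
        if word[position]:
            syndrome ^= position   # XOR of the positions of all set bits
    overall = 0
    for bit in word:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: assume one, located at index `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Even parity but non-zero syndrome: two bits flipped.
        return None, "double_error"
    return [word[3], word[5], word[6], word[7]], status


original = [1, 0, 1, 1]
stored = encode(original)

hit_once = list(stored)
hit_once[5] ^= 1               # one upset, e.g. from an alpha particle
print(decode(hit_once))

hit_twice = list(stored)
hit_twice[2] ^= 1
hit_twice[6] ^= 1              # two upsets in the same word
print(decode(hit_twice))
```

The cost Buric refers to is visible even at this scale: four check bits protect four data bits, plus the logic to compute the syndrome on every read.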

LPE: Is reliability now part of the cost equation in developing a chip?
Buric: Yes.
Smith: For the fabs, definitely. They’re staking their reputation on being able to deliver a product that meets certain standards. Being able to test that and then being able to somehow abstract that and do a bunch of rules so folks like us can take those rules and make sure everyone follows them is a tough job.
O’Neill: From an FPGA standpoint, we’re not designing for a single application. We’re designing for a whole range of applications. We go from consumer electronics applications, which have a lifespan of two years, to military and aerospace systems that have to survive 20 years. We don’t have the luxury of just designing an FPGA that just survives two years. We have to design for the maximum lifetime. That really causes us to look very carefully at the tradeoffs between density, reliability, power and performance.
Sanguinetti: What would you do differently if you could design products just for the consumer electronics market?
O’Neill: Let the parts run hotter so the power density would be higher. There’s a tradeoff there, too, because you need low power to conserve battery.
Buric: You could save on the package, too.

LPE: Is reliability considered essential all the time?
Buric: It is company-specific. It’s hard to see a trend. In the consumer market, cost is critical. In that market, failure is measured against cost of replacement.
Sanguinetti: If you have a TV set that drops frames every 30 seconds or you get a blocky picture, you get a bad reputation and no one buys your products.
Smith: It depends on the application. In military and aerospace, it’s vital to be reliable. On the other extreme, if you go into the Hallmark store and pick up a card with a synthesizer attached to a battery, it has to last through the shelf life, but design for reliability is minimal.

Experts At The Table: The Reliability Factor

Friday, January 22nd, 2010

Low-Power Engineering sat down to discuss reliability with Ken O’Neill, director of high-reliability product marketing at Actel; Brani Buric, executive vice president at Virage Logic; Bob Smith, vice president of marketing at Magma, and John Sanguinetti, chief technology officer at Forte Design Systems. What follows are excerpts of that conversation.

LPE: As we push to the next process nodes, do all of the tricks of power islands and multiple voltages become more common in designs?
Buric: No, because people will look at the cost of implementation. We have customers looking at simulating and figuring out what are the power savings of switching something off and turning it back on. That costs power. They are doing fairly complicated analyses and staying away from these techniques if they don’t have to use them.
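
The analysis Buric describes comes down to a break-even calculation: shutting a block off only pays if it stays off long enough for the leakage saved to exceed the energy spent powering down and back up. The sketch below assumes a constant leakage saving and a fixed transition energy. Both figures are hypothetical and would come from characterization in a real flow.

```python
def break_even_idle_time_s(leakage_saved_w, transition_energy_j):
    """Idle time at which gating a block starts to save energy."""
    return transition_energy_j / leakage_saved_w


def net_energy_saved_j(idle_time_s, leakage_saved_w, transition_energy_j):
    """Energy saved by gating for idle_time_s; negative means a net loss."""
    return leakage_saved_w * idle_time_s - transition_energy_j


# Hypothetical block: gating saves 5 mW of leakage, and one
# power-down/power-up cycle (including state restore) costs 20 uJ.
LEAKAGE_SAVED_W = 5e-3
TRANSITION_ENERGY_J = 20e-6

threshold = break_even_idle_time_s(LEAKAGE_SAVED_W, TRANSITION_ENERGY_J)
print(f"break-even idle time: {threshold * 1e3:.1f} ms")

for idle_ms in (1, 4, 50):
    saved = net_energy_saved_j(idle_ms * 1e-3, LEAKAGE_SAVED_W, TRANSITION_ENERGY_J)
    print(f"idle {idle_ms:>2} ms -> net saving {saved * 1e6:+.1f} uJ")
```

With these numbers, any idle period shorter than about 4 ms costs more energy than it saves, which is why teams stay away from the technique for blocks that wake frequently.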

LPE: If you implement all of these techniques, though, is a device more reliable or less reliable?
Buric: In my opinion, it becomes more difficult to make it reliable. But it all comes back to your capabilities. If you know how to do it and you’ve done it before, it’s more difficult but you can still do it.
Smith: I don’t think it’s de facto less reliable. But it makes it a heck of a lot more difficult to maintain reliability. That requires a lot of work. The other side of this is that low power used to be just battery-powered devices. Now it’s part of the overall Green movement. Everybody is searching for ways to cut power. About 1.5% of all the power generated goes into servers, server farms, and the cooling associated with them. We have a lot of wireless customers. Power is a huge deal. Reliability is a huge deal, too. If you have a phone that dies all the time, the manufacturer is going to go out of business. But we’re starting to see more of a focus on low power for things people plug into the grid, and the government is starting to mandate that. Reducing power without giving up the reliability is very hard.
Sanguinetti: That’s true even in this industry. Quite a number of companies have verification server farms. You run your regression tests with 10,000 processors. That has a power bill of about $1 million a year. That’s a lot of electricity.
Buric: We have 1,000-plus processors just to do fairly straightforward tests. One of the things that is responsible for what’s going on here at the lower technology nodes is that IP memory and logic design and EDA tools help solve reliability problems that may be caused by calculation errors from migration that overload certain parts of the design. It’s much more critical on the current nodes than before. You have to properly characterize IP and use synthesis, high-level synthesis and place-and-route tools to avoid any potential reliability problems in operating conditions.
O’Neill: We’re all coming at this from a variety of different directions, but in our world we see an intersection between low power and reliability. This is from a system-level for high-reliability and outer-space systems. The motivation for achieving low power in the consumer space is battery life and the greater good of the planet. In military and aerospace, battery life can be important in handheld radio systems, but there’s much more interest in reliability. Power becomes a reliability issue for a different reason. There are a lot of very high-performance systems where they can’t have forced cooling. The cooling fan itself is a factor in reliability. Fans can fail and they can increase the risk of foreign debris, which can cause short circuits and add other reliability risks.
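
Sanguinetti's figures of 10,000 processors and roughly $1 million a year can be sanity-checked with simple arithmetic. The electricity price below is an assumption (the panel did not state one), so the implied wattage is only as good as that guess.

```python
HOURS_PER_YEAR = 24 * 365


def implied_average_watts(annual_bill_usd, processors, usd_per_kwh):
    """Average continuous draw per processor implied by a yearly power bill."""
    kwh_per_processor = annual_bill_usd / processors / usd_per_kwh
    return kwh_per_processor * 1000 / HOURS_PER_YEAR


# Figures from the discussion; the $0.10/kWh price is an assumed value.
ANNUAL_BILL_USD = 1_000_000
PROCESSORS = 10_000
ASSUMED_USD_PER_KWH = 0.10

cost_per_processor = ANNUAL_BILL_USD / PROCESSORS
watts = implied_average_watts(ANNUAL_BILL_USD, PROCESSORS, ASSUMED_USD_PER_KWH)

print(f"cost per processor per year: ${cost_per_processor:.0f}")
print(f"implied average draw per processor: {watts:.0f} W")
```

At the assumed price, the bill works out to a little over 100 W of continuous draw per processor, which is plausible once cooling overhead is included.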

LPE: So it’s imperative to lower the power in the parts?
O’Neill: Yes, because as you increase the performance of these systems, more power is being generated. That means more heat, and heat equals reliability risk. There are very few failure mechanisms that decelerate with temperature. Most of them accelerate with temperature, and some of them accelerate exponentially. So minimizing the heat dissipation inside enclosures that have very complex systems running inside them becomes a primary issue. People often come to us seeking low-power FPGAs because there’s some other power-hogging device inside the enclosure generating a lot of heat. They need to minimize that. They can’t afford another heat-generating device.

LPE: Do devices that use lower power last longer?
O’Neill: Given the same process node, if you run at lower power you’re probably going to have better reliability because your junction temperature is lower. That’s going to result in prolonged life.
Buric: If you look at what people are doing with end of life now, they’re saying lower temperature will extend the life. It’s very clear that if you go to overheated conditions where a lot of parts become unpredictably fast then you have a very high chance of a device completely malfunctioning.
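
O’Neill’s remark that some failure mechanisms accelerate exponentially with temperature is commonly modeled with the Arrhenius relation. The sketch below is an illustration under stated assumptions, not something the panelists described: the 0.7 eV activation energy is a placeholder value, and the function name is hypothetical. Real activation energies depend on the failure mechanism and the process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def arrhenius_acceleration(t_cool_c, t_hot_c, activation_energy_ev=0.7):
    """Return how many times faster a thermally activated failure mechanism
    proceeds at junction temperature t_hot_c than at t_cool_c (both in Celsius).

    activation_energy_ev is an assumed placeholder; it varies by mechanism.
    """
    t_cool_k = t_cool_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_cool_k - 1.0 / t_hot_k
    )
    return math.exp(exponent)


if __name__ == "__main__":
    # Compare a 65 C junction against progressively hotter ones.
    for hot_c in (85, 105, 125):
        factor = arrhenius_acceleration(65, hot_c)
        print(f"{hot_c} C vs 65 C: wear-out runs about {factor:.1f}x faster")
```

Under this model a modest drop in junction temperature buys a disproportionate gain in expected life, which is the link between lower power and longer life drawn in the answers above.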

LPE: Where does this get designed in?
Sanguinetti: We don’t have any visibility into this.
Smith: Neither do we. It’s something our customers have to deal with. Our job is to get them from concept to manufacturing. Their job is to figure out the application and the expected lifespan. In some applications, a year is fine. If it’s going into the engine controller of an automobile, a year is not fine. That would be more like 15 or 20 years at a minimum.
Buric: End of life is primarily a process function, and for that reason it is analyzed and characterized at the device level. A lot of people simulate end of life, and those models are typically provided by the process side. These are similar to any other SPICE model.
O’Neill: When we design a chip, we design with a package that comes from the foundry. That package will include things like the design rules, which are decided by tradeoffs between reliability, power, functionality and sheer utilization. You want to cram as much logic into as small a space as possible, but you’d better not exceed the electromigration rules or whatever other rules are in there for reliability.

LPE: In the synthesis world, what’s the biggest reliability issue?
Sanguinetti: Logic errors. It’s getting the design right. The issue that our customers have, aside from the spec being wrong, is the interfaces between blocks or sections of the design that are done with high-level synthesis and those sections that are done with legacy RTL or manually. That’s where the opportunity for errors is the greatest. When you’re within the confines of high-level synthesis, the opportunity for error gets reduced and the verification problem is reduced. But at those interfaces there’s a lot of potential for miscommunication.

LPE: As the industry becomes more disaggregated, does it become more difficult to pinpoint the source of reliability problems?
Buric: No. Problems are well defined and everyone has to take responsibility. If you go back to our discussion about radiation and alpha particles, you design to eliminate that problem. That design can be in the memory space or error correction. You know what the problem is and you solve the problem. If the problem is end of life and the technology provider gives you all the guidelines, then you design with those in mind. If you own the design, you own the problem. It’s in designers’ hands and it’s a well-defined boundary.
Smith: It’s a gray area. In the ideal world, we get a set of rules from the foundry. There are thousands of them, and if you do this and this then they’ll stand behind the process in the design. For a big part of the population, that’s the way it works. But the people with the deeper pockets will go back to the foundry and say, ‘Let’s talk about where these margins really are because we need to do something special for this product line.’ They’ll get the data, analyze it, and they may design outside of those guidelines. The rules are the rules except when you break them to get an advantage, and then you’d better have the time and the money to go analyze everything to make sure you don’t end up with a product that fails or doesn’t yield.
Buric: You actually don’t break rules. You define a new set of rules that are mutually agreed upon. If you set those rules unilaterally, you’re shooting yourself in the foot. Memory cells have violated every rule, but they’re so predictable in manufacturing that they can violate the rules. All of those are mutually agreed upon, though.

Combining Power And Synthesis

Thursday, January 14th, 2010

By Ann Steffora Mutschler

With each passing process node, electronic designs become smaller and more complex, which has made power management a critical design priority, even in the synthesis step of the design flow.

Synthesis has always been an integral part of the design process, particularly at the RTL level. But as chip design has become more complicated, the need to raise the process up a level of abstraction into high-level synthesis also has started gaining traction. What’s new is the inclusion of power management in both steps.

“Just like closing on timing or closing on area, closing on power is a responsibility of the synthesis tool including a high-level synthesis tool,” said Thomas Bollaert, product marketing manager for Catapult C at Mentor Graphics Corp. “Low power is becoming increasingly important on the designer’s radar screen and this is why they are asking more of the synthesis tool when it comes to low power.”

High-level synthesis can now be power-aware. While there are certain widely-adopted low-power techniques – such as clock gating, which is critical to understanding how much dynamic power can be saved – that designers typically use and implement when doing manual RTL coding, these can take a lot of time to do by hand and can be error-prone. High-level synthesis has the potential of making the analysis and the transformation automatically on behalf of the designer thereby generating much more power-efficient designs.
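
To make the clock-gating idea concrete, here is a minimal behavioral sketch in Python rather than RTL. It compares a register that is clocked every cycle and holds its value through a recirculating mux with one whose clock is delivered only when the enable is high. The function names and the sample trace are hypothetical, and counting delivered clock edges is only a rough proxy for dynamic clock power.

```python
def run_enable_register(d_values, enables, q0=0):
    """Ungated register: clocked every cycle; a feedback mux holds q when enable is 0."""
    q, trace, clock_edges = q0, [], 0
    for d, en in zip(d_values, enables):
        clock_edges += 1          # the flop sees a clock edge every cycle
        q = d if en else q        # mux selects new data or recirculates q
        trace.append(q)
    return trace, clock_edges


def run_gated_register(d_values, enables, q0=0):
    """Gated register: the clock edge reaches the flop only when enable is 1."""
    q, trace, clock_edges = q0, [], 0
    for d, en in zip(d_values, enables):
        if en:
            clock_edges += 1      # gated clock fires only on enabled cycles
            q = d
        trace.append(q)
    return trace, clock_edges


if __name__ == "__main__":
    data = [5, 7, 7, 2, 9, 9, 9, 4]
    enable = [1, 0, 0, 1, 0, 0, 0, 1]
    ungated_trace, ungated_edges = run_enable_register(data, enable)
    gated_trace, gated_edges = run_gated_register(data, enable)
    print("same behavior:", ungated_trace == gated_trace)
    print("clock edges delivered:", ungated_edges, "vs", gated_edges)
```

The two versions produce the same output sequence, which is the equivalence a tool has to preserve, and the result has to be verified, when it applies the transformation automatically.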

Gal Hasson, senior director of marketing for synthesis, power and test automation at Synopsys, said this is all about time to results. “If you start optimizing as soon as you start synthesis you end up with better results. For many years customers have been doing MCMM (multi-corner, multi-mode) in place-and-route only. When that is started in synthesis, better results are achieved. Power can no longer be an afterthought.”

Another key to making sure the best results are reached is a tight interaction between synthesis and place-and-route. “When you’re doing a lot of low-power design, you’re doing a lot of clock gating, or you’ve got physical considerations… If all you’re doing is thinking about low-power on the RTL side or the synthesis side and then you throw it over the wall to a totally different tool or totally different group, a lot of the optimizations that you planned in the RTL might not work out,” said Rob Knoth, senior product manager at Magma Design Automation.

While the majority of semiconductor designers are seriously looking at power, they are also trying to figure out what they can do to manage it in the context of their overall design goals, said Jack Erickson, a member of Cadence Design Systems’ Encounter Digital Implementation marketing group. There are plenty of techniques available today for minimizing power, but ultimately designers have goals in terms of meeting a performance spec, a die size and achieving time-to-market, so it comes down to how they manage all of these things and get their chips out the door, he said.

In response to this, in Cadence’s synthesis tool, as with most synthesis tools on the market today, power is not a separate tool or even a separate step within the tool. “When you look at power versus performance, they trade off versus each other orthogonally, and we don’t want our customers to try to have to manage that manually because they’ll never get their chip out the door in time,” said Erickson. “So we built power awareness in.” Cadence includes the notion of cost functions so that when logic is structured it is done so for performance and using cost functions such as area, leakage power and dynamic power. As a result, everything is considered simultaneously as the logic is being built.

To address leading-edge design concerns, Mentor’s Bollaert said customers are now asking for dynamic voltage and frequency scaling. That allows a system to run at a lower voltage or clock frequency, slowing the design down so it consumes less power. Dynamic power consumption is proportional to frequency and to the square of the voltage, so when the frequency is reduced the voltage can usually be lowered as well, and the two reductions compound.

“Doing this is simple when you look at the math, but doing this in hardware is actually very complicated because the proper intelligence is needed in the system to determine when to slow down the design, when to slow down the clock, and when to reduce the voltage. In order to make an intelligent decision, sufficient information about what each of the subsystems are doing and so forth is also needed. It truly requires designing up front for dynamic voltage and frequency scaling, and it also requires fine-tuning intelligence on when to drive the right frequency and voltage. This is truly an advanced low-power design technique. Few are doing it, some are exploring it, and many would like to be able to do it,” he said.
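
The math Bollaert refers to is the standard first-order model of dynamic power, P = a·C·V²·f. A minimal sketch follows; the function names and the capacitance and activity values are illustrative assumptions, and leakage power is ignored.

```python
def dynamic_power(c_eff_farads, voltage, freq_hz, activity=1.0):
    """First-order dynamic (switching) power in watts: a * C * V^2 * f."""
    return activity * c_eff_farads * voltage ** 2 * freq_hz


def scaled_power_ratio(v_scale, f_scale):
    """Power after scaling, relative to the original: (V'/V)^2 * (f'/f)."""
    return v_scale ** 2 * f_scale


def energy_per_task(c_eff_farads, voltage, cycles, activity=1.0):
    """Switching energy in joules for a task of `cycles` clock cycles.

    Power times run time (cycles / f) cancels the frequency, leaving a * C * V^2 * cycles.
    """
    return activity * c_eff_farads * voltage ** 2 * cycles


if __name__ == "__main__":
    # Assumed example values: 1 nF effective switched capacitance, 1.0 V, 1 GHz.
    base = dynamic_power(1e-9, 1.0, 1e9)
    slowed = dynamic_power(1e-9, 0.8, 0.8e9)
    print(f"baseline {base:.3f} W, scaled {slowed:.3f} W")
    print(f"ratio {scaled_power_ratio(0.8, 0.8):.3f}")
```

In this model, frequency scaling alone lowers power but not the switching energy of a fixed task, because the task simply runs longer. The voltage reduction that the lower frequency permits is what cuts energy, which is why the two knobs are scaled together.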

ST-Ericsson used Mentor’s previous-generation Catapult C low-power technology to reduce the area of IP by approximately 30%, according to ST-Ericsson. With new capabilities, Mentor claims a 70% reduction in power consumption for a different customer using Catapult’s clock gating capabilities.

Magma’s Knoth is hearing the same kinds of demands from customers. From the synthesis perspective, making this happen requires a much more robust multi-mode, multi-corner infrastructure, which is something that more synthesis tools have started to provide, he said.

Going from one voltage domain to another requires a lot of extra logic such as level shifters and isolation cells. Knoth said Magma’s synthesis tool is able to consider that extra logic ahead of time, which is significant because those things take up a lot of area, and are typically slow.

“Level shifters and buffers are very slow cells and when Magma’s tool reads in the UPF or CPF, as you are synthesizing those paths, it will allow you to understand the timing and power impacts of all that extra logic,” he explained. “What users are really looking for is higher-level optimization in the RTL for power as well as better verification, because as you’re doing all these complex clock gating or higher-level RTL optimizations you are changing the logic. If you can’t verify it easily, no one is going to tape it out.”
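
As a rough illustration of the extra logic Knoth describes, the sketch below decides which special cells a signal needs when it crosses from one power domain to another. It is a simplified model under stated assumptions, not Magma’s algorithm and not UPF or CPF syntax: the `PowerDomain` type and `crossing_cells` function are hypothetical, and isolation is assumed to be required whenever the driving domain can be switched off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerDomain:
    name: str
    voltage: float     # nominal supply in volts
    switchable: bool   # True if the domain can be powered down


def crossing_cells(driver, receiver):
    """Return the special cells needed on a signal going from driver to receiver."""
    cells = []
    if driver.name == receiver.name:
        return cells                     # same domain: no extra logic
    if driver.voltage != receiver.voltage:
        cells.append("level_shifter")    # translate between supply levels
    if driver.switchable:
        cells.append("isolation")        # clamp the signal while the driver is off
    return cells


if __name__ == "__main__":
    always_on = PowerDomain("aon", 1.0, switchable=False)
    dsp = PowerDomain("dsp", 0.8, switchable=True)
    print("aon -> dsp:", crossing_cells(always_on, dsp))
    print("dsp -> aon:", crossing_cells(dsp, always_on))
```

Every cell returned here adds delay and area on the crossing path, which is why it matters that synthesis accounts for these cells before place-and-route.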

Experts At The Table: The Reliability Factor

Thursday, January 14th, 2010

By Ed Sperling

Low-Power Engineering sat down to discuss reliability with Ken O’Neill, director of high reliability product marketing at Actel; Brani Buric, executive vice president at Virage Logic; Bob Smith, vice president of marketing at Magma, and John Sanguinetti, chief technology officer at Forte Design Systems. What follows are excerpts of that conversation.

LPE: Do chips become less reliable as we start adding power islands, multiple voltages into the mix with smaller line widths and multiple cores?
O’Neill: As we go down the process curve, one of the effects we have observed is that radiation effects are no longer the domain of designers just working on aerospace systems. We’ve seen radiation effects start to become dominant from a reliability standpoint in commercial ground-level systems. These are background neutrons in the atmosphere. That can cause memory cells to have single-bit upsets, and the consequences can be severe. For most applications, you don’t care about a single bit. In consumer electronics, it may be a pixel changing color. But if you have system-critical data in a server or a router it could mean enterprise-class equipment having field failures.
Buric: All of those effects started showing up at 90nm when people started doing a serious analysis of the radiation space. It’s not getting better. It’s getting worse each node. There are two aspects of dealing with this. One is how the system is designed. The other is how system designers can further ensure that nothing has happened.
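
The single-bit upsets O’Neill describes are typically handled with error-correcting codes in memory. As an illustration only (the panelists do not name a specific code), here is a minimal sketch of a Hamming(7,4) code, which stores four data bits in seven and can correct any single flipped bit. Function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected_codeword, error_position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome gives the 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1          # flip the upset bit back
    return c, position


def hamming74_decode(codeword):
    """Correct a single-bit error if present and return the 4 data bits."""
    corrected, _ = hamming74_correct(codeword)
    return [corrected[2], corrected[4], corrected[5], corrected[6]]


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1                    # simulate a single-bit upset in the stored word
    print("recovered:", hamming74_decode(stored) == word)
```

Production memories generally use wider codes that also detect double-bit errors, but the principle is the same: spend extra bits so that a single upset does not reach the system.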

LPE: Does it show up on the design side?
Sanguinetti: We don’t see it directly, but our customers are putting tighter requirements on the style of RTL that we produce. Most of it includes locally produced rules. Those rules have been developed to ensure a smooth transition through the back-end flow. Now they’re starting to add rules that will produce better results. That’s about our only exposure to the physical side. The other piece of reliability is whether you’ve got the logic right, and that’s just verification.
Smith: On the implementation side, reliability is more around design rules. Presumably some of these effects are covered by those rules. What’s most important in our world are the low-power effects—multiple voltage domains, turning blocks on and off, dynamic voltage frequency scaling. The question is how you model that design so it will be reliable under a bunch of different operating conditions. It’s now multi-load, multi-corner and multicore. We’ve seen designs that have up to 100 different models. That affects reliability.

LPE: From the standpoint of all the multiple everything, does it become less reliable as you keep dropping the voltage?
Buric: You’re talking about a badly implemented design. We have customers today that, because of technology scaling, have thousands of memory instances on the same design. This is a nightmare by itself. There is also a lot of process variation. You need to make sure that the same device in place ‘A’ has the same performance in place ‘B.’ You have to consider all of that. If you do it wrong, you don’t get proper yield. But you also might have a chip that escapes detection in the test but which has problems in the field. We have customers making hearing aids. They are trying to drop the voltage as low as they can get, and they are really hitting the limit at low voltages for both how to maintain content in memory and how to operate. Then the design misbehaves because they are pushing the lower corners of performance. On advanced nodes you also have a phenomenon where low voltage combined with other low voltages behaves much worse than a typical array. That creates additional reliability problems.

LPE: Finding all the corners and verifying all the various pieces is getting harder. What’s the impact on reliability with that?
Smith: The amount of time that’s being spent on verification, particularly on small geometry and very complex chips, is rising. Implementation and place and route is still a huge job. But the verification is taking more time. Most of the verification flows are 15 years old. Variability on chip is a huge effect. At 180nm we didn’t have to worry about that very much. At 40nm and 28nm, it’s a big concern. Verification takes a lot more time and more compute power. It now takes a CPU farm. The industry is ripe for big change there. In terms of reliability, it depends on whether you’re going to do a marginal design or a really good design. To do a good design you have to do a lot of verification and you have a lot of operating conditions. And if you’re doing a low-power design and using all these techniques, there are likely to be a lot of operating environments. Then you have to consider all the corners in the process. When you multiply those out you have hundreds of different scenarios.
Sanguinetti: These issues are less important for some products than others. But the line between what we think of as consumer products and other products is blurry. Is an industrial printer a consumer product? If something goes wrong there, you may get degraded quality. But what if it’s a prototype of a device that prints 3D models? A failure there, or an intermittent failure, can result in something a good deal worse. Is a hearing aid a consumer device?
Smith: Or a pacemaker?
Sanguinetti: Reliability also is a bigger issue if you have to reboot something. Do you really want to reboot your television or your hearing aid?
Buric: But when you talk about power islands and multiple voltages, those techniques are much less used than you might expect. It’s extremely difficult to use them in a reliable fashion. We see people staying away from using them, especially in consumer devices. It’s extremely expensive. If there was an easy way to verify them, that would eliminate some of the problems we face now. But right now, there’s much more noise than reality.
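
Smith’s point about multiplying operating modes by process corners can be sketched as a simple enumeration. The mode, corner, voltage and temperature values below are hypothetical placeholders, not taken from any design discussed here.

```python
from itertools import product


def enumerate_scenarios(modes, process_corners, voltages, temperatures_c):
    """Return every mode/corner combination that verification would have to cover."""
    return [
        {"mode": m, "process": p, "voltage": v, "temp_c": t}
        for m, p, v, t in product(modes, process_corners, voltages, temperatures_c)
    ]


if __name__ == "__main__":
    # Hypothetical example: 4 operating modes and a 5 x 3 x 3 corner grid.
    modes = ["active", "low_power", "sleep", "test"]
    process_corners = ["ss", "sf", "tt", "fs", "ff"]
    voltages = [0.81, 0.90, 0.99]
    temperatures_c = [-40, 25, 125]

    scenarios = enumerate_scenarios(modes, process_corners, voltages, temperatures_c)
    print(len(scenarios), "scenarios")   # 4 * 5 * 3 * 3 = 180
    print(scenarios[0])
```

Each added mode or corner multiplies the total rather than adding to it, which is how a low-power design with several operating environments ends up with hundreds of scenarios.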

LPE: Is it the same in the verification world?
O’Neill: There are two aspects to that. There’s the verification we as a company do. And then there’s the verification our customers do on their designs when they’re getting ready to program the design into a part. From the customer perspective, the changes we’ve seen are scaling linearly with the fact that they are doing bigger and faster designs. But it’s not an exponential increase in the amount of verification and validation of those designs. A lot of the deep submicron effects are somewhat shielded from the customers by the fact that there’s a level of silicon design and we just give them a tool to place logic into this part. The onus is on us to design our FPGA with as much competence as possible to mitigate the deep submicron effects. From the point of view of the design teams at Actel, they are spending more time using more complex tools to do verification and validation of their designs. In addition to implementing programmable gates and routing structures we’re adding hard IP into our parts. That includes multiply/accumulate blocks, increasingly sophisticated I/Os, different flavors of SRAM cells and various forms of non-volatile memory. All of those things represent different types of what is effectively ASIC design, and it’s now being done by the Actel design team. We, on the other hand, have experienced an exponential increase in validation and verification.

2×4: Low Power, Complexity And Reliability

Thursday, January 7th, 2010

Low-Power Engineering poses two questions about reliability to Brani Buric, executive vice president at Virage Logic; Ken O’Neill, director of high reliability product marketing at Actel; John Sanguinetti, chief technology officer at Forte Design Systems; and Bob Smith, vice president of marketing at Magma.


Considerations For Choosing The Right Low-Power Tools

Thursday, October 15th, 2009

By Cheryl Ajluni
Regardless of what you are designing these days, one fact holds true: Your design is only as good as the design tools you use.

Gone are the days when a design could be done on the back of a napkin. Today, engineers require an ecosystem of interworking tools to guide them through the complex design flow. This is especially true when it comes to low-power design, as its complexity now permeates every aspect of the design flow, creating challenges that threaten to derail design closure at every turn. Here, automated design tools can play a key role in speeding the design process, selecting an optimal low-power architecture and ensuring design closure.

The problem, of course, is low-power or “power-aware” design tools and flows are still in their infancy—a fact that poses a bit of a dilemma for designers. Not only do they need to figure out what type of power management and low-power design techniques to employ, but they must also determine which tool vendors support those techniques. Then they have to evaluate the possible tool options and make a selection. This can be a stressful and time-consuming process, especially when you consider the decision is critical to the success of any design project and, for that matter, to a company’s overall success and vitality. While there are no hard and fast rules for selecting the right tool, or the right vendor, there are a number of considerations—over and above a tool’s verified functionality—that engineers can use to help simplify their decision. Those considerations include:

  • Cost. A tool’s actual cost and its available pricing options are important considerations when evaluating a design tool. Of course, a tool’s true cost is also impacted by its learning curve and overall reliability—both of which can affect downtime—and therefore must also be considered prior to making a tool purchase.
  • Speed. While it may not always seem like a key consideration, how fast a tool operates can directly impact the designer’s time-to-market schedule as well as overall design costs and therefore should not be overlooked. Was it designed for multicore processors, or simply updated to take advantage of them?
  • Support for Industry Standards. Using a tool built to emerging low-power industry standards, such as the Common Power Format (Cadence and Magma) or the Unified Power Format (Synopsys, Mentor and Magma), helps ensure that it will interoperate with a range of other design tools and flows. It is also smart to select a tool that can be used within industry-accepted reference flows, such as the power-aware reference flows recommended by the Low-Power Coalition (LPC) of Si2 or by Accellera.
  • Ease of Use. Is the design tool easy to use? Does it require special training or low-power design expertise? Does it make you more efficient or productive? Does it support multi-language user interfaces for globally dispersed design team members, and are the user interfaces familiar? Is it easy to deploy, administer and maintain? Does it integrate well with other low-power design tools and design flows? All of these factors should be carefully considered during a tool’s evaluation.
  • Flexibility. Is the tool flexible enough to accommodate changes in technology, and can it adapt to changing business conditions, an especially critical question given the current state of the global economy? Can it support the needs of a globally dispersed design team with features like revision control and policy control for IP management?
  • Customer Support. How responsive a tool vendor is to your support needs can be vitally important to the success or failure of your low-power design. Does the vendor provide quality documentation, training when needed or on-site technical support? Does the vendor have proven expertise in low-power design? Such expertise may prove invaluable if you find yourself facing a difficult low-power design problem.
  • Vendor Credibility. Don’t forget to verify the tool vendor’s reputation with other designers. If they have had trouble with the vendor, then chances are good that you will, too.

Design Tool Options
Despite the fact that low-power design tools and flows are still relatively new, there are a number of options to choose from. A sampling of these tools includes the following:

  • Catapult C Synthesis and SpyGlass-Power, from Mentor Graphics and Atrenta, respectively. SpyGlass-Power is an RTL power estimation and reduction tool that is used to automate multi-level clock gating. Catapult is a high-level synthesis tool that offers a fast path to verified RTL from pure C++. New low-power optimizations enable the tool to thoroughly analyze a design to determine gateable clocks and build the appropriate logic. An interface now exists between these two tools that allows RTL output from Catapult to be handed off to SpyGlass-Power. Static and dynamic power estimates from SpyGlass-Power can then be fed back into Catapult C.
  • Eclypse Low Power Solution from Synopsys. Eclypse is an integrated flow of tools, intellectual property and methodologies that allows designers to include everything from MTCMOS power gating to multiple voltages and dynamic voltage and frequency scaling. The goal is to dramatically simplify design and the increasingly complex verification portion of that design. Eclypse also includes clock gating, low-power clock tree synthesis and leakage power recovery. As you might expect, it includes UPF support, as well as support for the Low-Power Methodology Manual created by Synopsys and ARM.
  • Cadence Low-Power Solution from Cadence Design Systems. Cadence’s Low-Power Solution is a CPF-enabled design-to-signoff methodology that makes it easy to incorporate low-power design techniques in advanced SoCs. It includes tools like the InCyte Chip Estimator for chip planning; Encounter RTL Compiler for logic synthesis; Encounter Conformal Low Power for structural, functional and equivalence checking; the Encounter Digital Implementation System for physical implementation; the Encounter Power System for power rail analysis; and Incisive Formal Verifier for formal property checking (Figure 1).

Figure 1. The Encounter Power System solution accelerates power optimization and signoff with a unified timing and power database. It can be used by front-end logic designers seeking high-quality early power and rail analysis, as well as by back-end physical designers looking for comprehensive signoff analysis and silicon-correlation.

  • PowerPro CG and PowerPro MG, from Calypto Design Systems (www.calypto.com). The PowerPro CG tool reduces power by implementing sequential clock gating logic in the non-memory portions of an RTL design. PowerPro MG is a memory gating tool that automatically generates power-optimized RTL by taking advantage of the low-power modes available in on-chip memories. It works with PowerPro CG to produce the lowest power design possible.
  • Talus Implementation System, from Magma. The Talus implementation system provides a fully integrated RTL-to-GDSII flow for high-performance, high-complexity, low-power nanometer designs. Talus Design and Talus Vortex are key tools in the system. Talus Design is a full-chip synthesis environment, while Talus Vortex is a physical design environment. Another tool, Talus Power Pro, works in conjunction with Talus Design and Talus Vortex to enable optimal power management throughout the flow.
  • PowerArtist-XP and PowerTheater, from Sequence Design (now part of Apache). PowerArtist-XP is an RTL Design For Power (DFP) platform that features fully integrated advanced analysis and automatic reduction (Figure 2). Using it, designers can achieve power savings of 10 to 60 percent or more. PowerTheater is a solution for RTL power analysis.

Figure 2. PowerArtist-XP enables designers to make intelligent design decisions that maximize power savings while minimizing design impact.

The Bottom Line
While designing for low power remains a difficult and complex challenge these days, appropriate use of low-power (power-aware) design tools can help simplify the process. Such tools will only become better and easier to use with time. Of course, selecting the right tool or tools is absolutely critical to a successful low-power design, perhaps just as critical as determining which low-power design and power management techniques to implement. While there is no set criterion to follow when making this decision, the considerations outlined above can serve as a guide in helping to make your decision that much easier.
