Posts Tagged ‘eSilicon’

Experts At The Table: Changing Design

Friday, February 3rd, 2012

By Ed Sperling
System-Level Design sat down to discuss the changing design landscape with Juan Rey, senior director of engineering for Calibre in Mentor Graphics’ Design to Silicon Division; Michael McNamara, vice president and general manager of Cadence’s System-Level Division; Yervant Zorian, chief architect at Synopsys; Prasad Subramaniam, vice president of design technology at eSilicon; and Ravi Varadarajan, an Atrenta fellow. What follows are excerpts of that conversation.

SLD: As we go up in abstraction, will transaction-level modeling be enough? It doesn’t address software or the physical effects, does it?
Zorian: That’s correct. It’s a higher-level model. But it’s good for initial exploration.
Subramaniam: You need to be able to explore all the different multidimensional aspects of the problem—power, performance and area. Being able to look at one aspect is good, but it’s not sufficient. That’s the missing link here.
McNamara: That’s where you get to things like parasitic extraction. You need to lift up the power information from the RTL and the gate-level simulation and project that up to the TLM so you can run simulations there and boot software on it and get a sense of your power. As you switch to a different manufacturer, that also has an effect on the leakage power. You’d like to be able to project that information up, too. That’s crossing nodes with abandon. You’re going from a choice of one transistor to another and projecting that onto system-level power. But for us to tame the complexity we have to have a system that lets us do that. I may not care how fast it works. I may just need more than a day of life on a battery. I may need it to last the flight from here to India.
Varadarajan: The abstraction only works when there is correlation. It enables you to do design exploration. If you make all these choices and you can’t trust and implement those choices when you go to the back end then it’s not useful. The correlation is key as you go down in implementation. In a typical SoC you have IPs and complex bus fabrics switching up these IPs. What designers do is transform the SoC from a hierarchy where they collect IPs and push them to the bus fabrics into larger subsystems that get implemented by design teams. That is a difficult task. One of our customers complained there was a timing signal that went from the lower left corner of the chip to the top and all the way back down. That’s something that should have been addressed up front. When you go down into the implementation of a very complex IP that is coming from a third party, the implementation is going to be a function of how you break down the data flow, how you organize the memory and the floor plan of the entire IP. Being able to capture that and make high-value decisions, like whether you can push it from 400MHz to 500MHz, is going to be a function of how the floor plan is going to look. You need a predictable way of handling that. Once you have taken care of the global topology, all the physical synthesis tools are commodities. If you organize the data flow, they can all converge to the same timing targets. You need to demystify IP and break it down at the abstract level, and have enough confidence you can achieve your targets, whether that’s timing or condition or power, and then be able to implement that predictably when you go into the back end.

SLD: To get a design out faster you’re decoupling IP from the rest of the design. But to do exploration you need to really understand all the tradeoffs of performance, power and area from a system level. How do we bridge those disparate ideas?
McNamara: There’s a tools aspect to it of being able to extract information so it can be shared. The end customer needs to be able to trade off ‘A’ versus ‘B.’ You can assemble one company’s IP with another company’s IP, but you’re going to be just like someone else. So how do you differentiate?
Subramaniam: This is where system-level tools come into play. Today you can figure out performance and what speed you need to run your system at. But these tools don’t have the ability to absorb information that is available at the lower level. What we need to do is to feed the implementation-related information up into the system-level tools. When the system-level designer is exploring the system bandwidth requirements and how to architect the system and organize the software and hardware partition, they should be cognizant of the implementation-level details at that level. That is a missing link today, and it’s a gap that needs to be addressed.
Rey: There is the issue of certification. You need to understand interactions when IP is placed in a certain area. Look at 3D integration, where you have things in different die that have to work together. Even though the general consensus is this is really a cost equation, as soon as you want to stretch the limits a little bit and go to higher frequency the approach breaks down. More research into implementation and testing will be required to make sure it works. When you go to very high frequencies we’re starting to see some effects that were not important before.
Subramaniam: If you look at memory, there are different kinds of memory interfaces. You can have a wide I/O interface in a 3D stack that is running at a much lower speed but giving you the same bandwidth as a much higher-speed interface that is SerDes-based and has a lower pin count. In a 3D environment, both are applicable. In fact, the wide I/O is more amenable to a 3D environment. One of the things system-level tools should enable is for you to determine which makes more sense for this design. Should you use a SerDes-based high-speed serial interface to access memory or a wide I/O low-speed interface? This is where the power-performance-bandwidth-area tradeoff comes into play.
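The bandwidth equivalence described here comes down to pin count times per-pin data rate. The following is a minimal sketch of that arithmetic. The class name, pin counts and per-pin rates are hypothetical values chosen for illustration, not figures from the discussion or from any memory standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryInterface:
    """Illustrative model of a memory interface (all values hypothetical)."""
    name: str
    data_pins: int        # number of data signals
    gbps_per_pin: float   # per-pin data rate in Gbit/s

    def bandwidth_gbytes_per_s(self) -> float:
        # Aggregate bandwidth = pins * per-pin rate, converted from bits to bytes.
        return self.data_pins * self.gbps_per_pin / 8.0


# Assumed example points: many slow pins versus a few fast lanes.
wide_io = MemoryInterface("wide I/O", data_pins=512, gbps_per_pin=0.2)
serdes = MemoryInterface("SerDes", data_pins=16, gbps_per_pin=6.4)

for iface in (wide_io, serdes):
    print(f"{iface.name}: {iface.data_pins} pins, "
          f"{iface.bandwidth_gbytes_per_s():.1f} GB/s")
```

With these assumed numbers both interfaces deliver the same aggregate bandwidth, so the choice between them would rest on the other metrics mentioned above, such as power, pin count and area.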
McNamara: It’s also the implementation. If it’s displaying video you need a lot of data, but it’s also 24 frames per second so you know exactly what bandwidth is needed for the next frame. Wide I/O could make more sense in that case. If you have a low-latency message and you need to get it quickly and it’s small, both would work. You could send the same message over either channel. But you need to see which one is better for the application. We’ve talked about power-performance-area, but there are many other cost metrics—things like routability. One design may be 10% smaller but the routing guys will kill you. You also have reliability. As geometries get smaller and smaller, maybe there’s another way of implementing something that’s more reliable, particularly if this is going into the back office of a bank. If you care about one of these other dimensions, is there a way to extract that information and project it through the system?
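The point about video having a known bandwidth per frame can be made concrete with a short calculation. This is only a sketch: the 24 frames per second comes from the discussion, while the resolution, the color depth and the assumption of uncompressed frames are illustrative choices.

```python
def stream_bandwidth_bits_per_s(width_px: int, height_px: int,
                                bits_per_pixel: int, frames_per_s: int) -> int:
    """Sustained bandwidth needed to deliver uncompressed frames at a fixed rate."""
    bits_per_frame = width_px * height_px * bits_per_pixel
    return bits_per_frame * frames_per_s


# Assumed stream: 1920x1080 pixels, 24 bits per pixel, 24 frames per second.
required = stream_bandwidth_bits_per_s(1920, 1080, 24, 24)
required_mbytes_per_s = required / 8 / 1e6

print(f"Required: {required} bit/s ({required_mbytes_per_s:.1f} MB/s)")
```

Because the requirement is fixed and known ahead of each frame, it can be checked directly against the sustained bandwidth of a candidate interface.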
Subramaniam: You need to define the metrics and have those metrics defined for each one of these subsystems. That’s required for exploration.
Zorian: Exactly. But architecture-level exploration is there already—the prototyping capabilities where you can explore different architecture options, whether it’s 3D or 2D or wide I/O or not wide I/O. You can play with those. We do have virtual prototyping capabilities that allow you to play the game early on—pre-silicon.

SLD: That’s assuming your IP is very well characterized, though. From the big companies like Cadence, Synopsys, Mentor and ARM, that’s going to happen. From other sources, maybe not.
Zorian: When we started IP it was just for re-use. It was a design piece. Today that isn’t the case. There is differentiation between IP providers. I believe in IP completeness. You need to put out a complete package for IP. From a design point of view and from a manufacturing point of view you have to prove it in silicon, you have to have the silicon reports with it, and you have to maintain it internally with built-in self-test and built-in self-repair and debug diagnostics. That makes your IP complete, and it’s a differentiator in the future between one IP provider and another.
Subramaniam: This is where a standard like IP-XACT may come into play. If you have a host of third-party IP providers out there, how can the small IP players enable their IP in the ecosystem? The answer is by using a common standard to characterize the IP and fit their IP into the tools environment. That will enable designers to access that IP in their environment.
McNamara: And you mentioned the subsystem. It’s no longer that I want this PCI Express. I need a compute subsystem that gives me the floating point, the graphics, and so on. There’s IP that’s software for that. Often IP suppliers of these subsystems are giving you demo drivers. They almost work. You’re assembling this device to get to market by Christmas. You figure all you have to do is slap this thing together, put on Android and you’re there. Then you realize the driver doesn’t actually work. Then you integrate it all with the rest of the system and you find out you have to fix the driver again, because while it worked great after you fixed it on a point-to-point link, it doesn’t work as well on a shared bus. There’s also a software component there that has to be tested and proven. When you get to platform-based design, there will be common subsystems and the Linux kernels will already know about these. They’ll already have the device drivers. So then there may be hot IP from another company that gives you 10% more performance with 15% less area, but the Linux kernel doesn’t know about it. That’s now another challenge to the small IP providers. They need Unix developers working for them to deliver all of this and they may not be able to afford them. But how we’ve made these devices better every year is by taming complexity.

SLD: We also have to start dealing with yield issues, right? Multiple known good die could end up in a package or stack as all bad die.
Zorian: It’s not only the dies. It’s also the interconnects. Those TSVs are of a different nature. The defects that can hurt them are different. Our ability to test them at the end is not sufficient because by then you’ve destroyed your whole device. At the stack level you have to test and prove and retest.

SLD: It’s also the interposer, right?
Zorian: Same approach.

SLD: From a designer standpoint, assuming we have subsystems that can plug and play and IP that can be mixed and matched, how do companies differentiate themselves?
McNamara: That’s where having a complete flow is essential. One company did a great memory controller that could handle any device with 15 common memories, and you could use this IP on your device and build something that didn’t require custom memory. What this meant was there was wasted silicon. There were bits available to talk to memories that this device would never talk to. When we have this great system where we have the transformation tools down and the analysis tools up and this way to select IP and put it together, there’s another phase of this optimization that goes across the whole thing. It isn’t just gate-level optimization to clean up a couple bits. You’re looking across the whole stack and realizing the customer is going to run iOS on it and it will never run Android, so maybe there’s something you can get rid of reliably on this design, but keep the IP for another design and delete something else. That’s a differentiator. You need a way to optimize a design.
Rey: There is always the possibility of stretching the technology into limits that are defined in a conservative way by the manufacturer. Yield is one aspect. Another aspect is related to performance. You can see the companies that have an intimate understanding of the technology processes. They can go beyond the recommended rules and the established rules that the foundries have imposed, and it can give them an advantage.

SLD: Can that be done in a disaggregated market, or can it only be done by the large IDMs?
Rey: It can be done in a disaggregated market, and it has been proven by some of the companies we work with. What you need is a certain level of volume so the manufacturers will pay attention to you. Otherwise it’s going to be a lot harder.
Subramaniam: I agree. If you look at the performance of different technology nodes, there’s a significant overlap in performance between neighboring technologies—40nm and 28nm, and 40nm and 65nm. Somebody who makes a smart choice can differentiate their product by implementing it in a cheaper technology versus someone else who uses brute force in a more expensive technology. That’s one way of differentiation. But in spite of there being a lot of re-use, there’s always going to be some aspect of the problem which is unique to the individual—algorithms, software techniques—and those will be designed independently. They will be complementary to the re-usable IP and that will always be there. But when you talk about wasted silicon, that is a by-product of re-use. Not everyone can afford to do a custom design for every design. There will be some customization, but that will be dedicated to a small portion of the overall system.

What’s Changing In System-Level Design

Thursday, January 26th, 2012

System-Level Design talks about what’s changing and what’s needed with Juan Rey of Mentor Graphics; Yervant Zorian of Synopsys; Michael McNamara of Cadence; Prasad Subramaniam of eSilicon; and Ravi Varadarajan of Atrenta.

Will It Work?

Thursday, January 26th, 2012

By Ed Sperling
Estimates of what it takes to verify a complex SoC are still hovering around 70% of the total non-recurring engineering costs, but with more unknowns and more things to verify it’s becoming harder to keep that number from growing.

Verification has always been described as an unbounded problem. You can always verify more, and just knowing when to call it quits is something of an art. Moreover, with software now thrown into the mix, engineering teams have to decide what’s good enough for tapeout and what can be fixed once the chip is already in the market.

Making that decision is becoming tougher, though. The amount that has to be verified is less clear, in part because of the growing amount of outside IP that is now included in designs. Of the 70% to 90% of a complex SoC that is made up of used or re-used IP, less than 50% is now commercially purchased, with the remainder internally developed, often for previous projects. The amount of commercially generated IP is expected to rise over the next few years, though, basically creating a series of black boxes that companies didn’t create internally.

While much of this commercial IP will be sold as pre-verified, what works in one design may not work exactly the same way in another. That’s particularly true with different process technologies. A general-purpose process built for speed may cause IP to behave completely differently than one optimized for low power. And in stacked die, two known good die may no longer work when they are packaged together.

“The new world is a broader supply chain for chips,” said Mike Gianfagna, vice president of marketing at Atrenta. “There is a need for better visibility in the supply chain, including everything from early predictions to yield to the track record of the supplier. There are multiple points of failure. For data management, planning, thermal and mechanical analysis you need fundamental enabling technologies. At the same time there is a re-invention of the industry into smaller, more niche markets.”

Knowing what to verify
Just knowing how much to verify is a challenge. Taher Madrawala, vice president of engineering at Open-Silicon, said this is not a simple decision because file sizes for verification are becoming enormous. That means what gets left out of the verification process may be as strategic as what gets included, because all of this can affect time to market. Verification budgets remain tight, both from a manpower and equipment standpoint.

“On top of that you don’t always have access to all of the functionality,” Madrawala said. “That’s especially true in 3D stacks or system-in-package. You don’t always have access to increased functionality because some things are encapsulated inside the package.”

He noted that from an NRE perspective, the percentage spent on verification has remained constant from 90nm down to 45nm. That has been helped by more standards, including modeling of IP in C or C++, an increased use of emulation, and the ability to run tests on multiprocessing computers. But with compressed schedules and greater complexity, those numbers can change.

There also are differences of opinion about what works, what will continue to work, and what needs to be changed in the future, both from a physical and a functional standpoint. Tools vendors insist that most of the capabilities are already there to do verification, even though they will need to be sped up through better modeling at a higher level of abstraction with a greater reliance on multiprocessing servers. They also say that verification teams need to learn to use the tools that are out there better.

Chipmakers generally acknowledge the need for better training on the tools, but they say the growth in complexity will create the need for additional testbenches. In particular, there will need to be new tools for partitioning designs and verifying the results once stacked die become more mainstream.

“As complexity grows, integration will be the issue,” said Prasad Subramaniam, vice president of design technology at eSilicon. “You will need specialists for each part of the design. People’s specialties will get narrower. And then you will need people to manage more specialties. The generalists, who will be the architects and higher-level engineers, will define the problem. Once they have made the decision about what to do, then the specialists will take over. But there will also be a lot of feedback. This will be an iterative process. There will be meetings where you need to reconcile differences and make adjustments. There will be a lot of collaboration, and verification will start from the get-go.”

Verification strategies
There are two main approaches to verification. One is to verify the pieces. Another is to verify the system. Both are necessary, but the order in which they need to be done as part of a verification flow can vary greatly for even derivative chips.

Samta Bansal, 3D IC lead and silicon realization digital project manager at Cadence, said that in stacked die an incremental approach will be needed to do verification. “If you analyze it all together it overcomplicates the process,” Bansal said. “For one thing, not all of the pieces will be available at the same time. A more feasible approach will be to verify each chip in a stack as part of a verification flow, then focus on the microbumps, TSVs, LVS and DRC for alignment and ultimately create a single file.”

That’s not so simple, of course. In stacked die there are physical verification issues that can complicate the functional verification, notably stress and power. And there is now software to consider in the mix, which is trending toward an increasing portion of the stack.

“Functional and physical verification are both important but independent tasks,” said George Zafiropoulos, vice president of solutions marketing at Synopsys. “In both cases, verification is moving up in system complexity. We’ve gone from blocks to lots of blocks to lots of processes and I/O, and there is more stuff coming. Complex interface IP at the periphery of the chip has gone up by an order of magnitude. The design team can’t verify everything, though.”

Zafiropoulos said design teams used to think there was not enough time to do verification at the block level. He said that putting 100 blocks together increases the challenge exponentially.

“A lot of this is bottom up,” he said. “You build sub-circuits up to the chip and then in multiple chips. You can’t afford to have errors inside these blocks. But you also need to change the scope of what has to be done. In the past, one engineer could comprehend everything on a chip. Now we’ve gone from the guy who knows everything about a chip to teams that are in different companies and maybe different countries.”

The result, he said, will be a gradual change in three areas. First, more and more engineers will do verification, rather than just specific verification teams. Second, all engineers will become more software savvy. And third, new kinds of tools will be introduced, including formal approaches.

Experts At The Table: Changing Design

Thursday, January 26th, 2012

By Ed Sperling
System-Level Design sat down to discuss the changing design landscape with Juan Rey, senior director of engineering for Calibre in Mentor Graphics’ Design to Silicon Division; Michael McNamara, vice president and general manager of Cadence’s System-Level Division; Yervant Zorian, chief architect at Synopsys; Prasad Subramaniam, vice president of design technology at eSilicon; and Ravi Varadarajan, an Atrenta fellow. What follows are excerpts of that conversation.

SLD: What’s changing in design?
McNamara: Software is a component that can’t be ignored anymore. It can’t be done later in the design cycle. The hardware-software co-design is a real problem—and a real opportunity. When you look at battery life, often it’s the software that has to be fixed. It isn’t the hardware. The software is now smarter and does more.
Subramaniam: As you go into smaller geometries you have more choices in terms of technology and complexity. There are more variables in the technology, different libraries that are available, more choices in terms of transistor Vt and channel length. If you look at the combination of what’s available to do a design, finding the optimal point in your design optimization is key. How do you optimize for power, performance and area all at the same time? Many people only see a small portion of the design space that’s available to them, but while it may be good enough for them it may not be the optimal solution.
Varadarajan: What’s changed is the increasing commonality between SoCs today. You see an ARM core, a GPU from ARM or Imagination, and DDRs. The challenge is how to improve the turnaround time for closure. These designs undergo a lot of derivatives over their lifespan, which are incremental changes. The real challenge is how you can close on a new derivative in 10% to 15% of the time it took to do the original design. What can you capture to achieve closure at the front end so every single design is not a complex back-end-to-front-end netlist handoff process? There are some customers going down that path already, putting in methodologies and mechanisms to ease that process.
Rey: There are a lot of tendencies and trends that require the tools to be faster and have new functionality, and the designers need to be aware of that to get an edge over their competition. In addition to that, there is also a need for portability. People don’t want to be tied to a single semiconductor manufacturer. They want more than one source. Designers need to be aware of that and be able to port their design to different places, and the tools need to be able to play in more than one place at the same level.
Zorian: There are two things going on. One is that complexity is going up, and the other is that shrinkage continues. On the complexity side IP usage is going up. It’s now 70% to 90% of a chip. But if you look at vertical markets, IP is not enough. These need to be grouped together into subsystems to improve re-use and reduce design time. On the shrinkage side, maintaining the health of the chip is a problem. The more we shrink, the more important our yields, quality, reliability and testability. We need to ensure those capabilities are available early on in our IP and our subsystems. You have placeholders, so as you move down you can shrink while maintaining the health of the chip.

SLD: How do we deal with all of this complexity? Is it a matter of raising the abstraction or do we need to deal with the problem differently?
Varadarajan: Abstraction is the key message. If you look at all the advancement over 20 years, physical synthesis still closes timing at the cell level. It places and routes standard cells and finally signs off on timing at the standard-cell level. That’s what designers trust, and it has to go down to that level today. Being able to have enough confidence at a higher level of abstraction and having global design closure with confidence is essential. Being able to predictably create subsystems and validate timing closure at that level, before you descend into the detailed implementation, is essential.
Subramaniam: Abstraction alone is not enough. What you need in addition to that is an actual implementation of that abstraction. One of the reasons IP re-use has been successful, particularly in the analog and mixed-signal area, is that you have the abstraction and the actual implementation. You know that when you implement something it’s going to work. We need to extend that to the subsystem level. Today we have it at the individual IP block level. We don’t have it at the level of the subsystem. We need to create hardened or quasi-hardened models, so you can have an abstract view of that subsystem and also know how it will behave. You know its performance characteristics, its power characteristics and all the other critical characteristics. In 3D ICs, you might have pre-implemented versions of subsystems that you can simply drop in. That makes derivatives simple. You have a common subsystem with a standard interface. It allows the designer to focus on one portion of a design, no matter what the technology node.
Zorian: I agree that having abstraction isn’t sufficient. You also need the ability to explore the different options and optimize between them. You need automation capabilities to explore those options and see what the best choices are. It may be 2D or it may be 3D. But with designs today you have so many ways of optimizing them. To explore that is very important.
Rey: On the implementation side, historically there were not robust methods to keep the abstraction level all the way down to the implementation. That is changing. You come up with IP, you know where you can stretch the limits of implementation, and you need to get those details down to the implementation. The foundries are jumping into that. It’s becoming mainstream. Think about keeping several different power domains. You need to make sure information moves all the way from the system level to GDS II to ensure you are not crossing from one power domain to another without the proper protection. That’s happening now.
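The power-domain rule described here, that no signal should cross from one domain to another without proper protection, can be sketched as a simple check over a connectivity list. Everything in this sketch is hypothetical: the block names, the domain map and the set of protected crossings stand in for data a real flow would take from its power-intent description and netlist.

```python
from typing import Dict, List, Set, Tuple

Connection = Tuple[str, str]  # (driver block, receiver block)


def unprotected_crossings(domain_of: Dict[str, str],
                          connections: List[Connection],
                          protected: Set[Connection]) -> List[Connection]:
    """Return connections that cross power domains without listed protection."""
    return [
        (src, dst)
        for src, dst in connections
        if domain_of[src] != domain_of[dst] and (src, dst) not in protected
    ]


# Hypothetical design: three blocks in two power domains.
domain_of = {"cpu": "PD_CORE", "cache": "PD_CORE", "modem": "PD_RADIO"}
connections = [("cpu", "cache"), ("cpu", "modem"), ("modem", "cache")]
# Crossings assumed to have isolation or level-shifting cells inserted.
protected = {("cpu", "modem")}

print(unprotected_crossings(domain_of, connections, protected))
```

In this assumed example the modem-to-cache connection is flagged, because it crosses domains and is not in the protected set.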
McNamara: At each design level we need some way of modeling the design. Then we need a way of verifying the design is behaving the way we think it should behave, whether it’s functional or electrical. There’s a transformation stage, which we call synthesis, where you take that abstraction level and push it down to the next level, whether that’s RTL to gates to polygons. And then you have an analysis or extraction phase that takes information from that level and projects it up to a higher level of abstraction. We’ve all built tools that do pieces of that. But if we want to raise it to a higher level of abstraction we need to have a modeling language and a way of transforming it to the lower levels and abstracting information up. There is a system above every design node that just considers us as a component. That plays all the way from the transistor to the macro cell and up. As we look at increasing complexity, we have to tame this. We need a way of bounding these things and putting rules around them so the higher levels can abstract away that detail but also take advantage of the power. If you have too firm a rule that doesn’t let you take advantage of what the power can do, then you’re leaving performance, area or power on the table. It helps to think of this in four boxes. Do we have a modeling language? Do we have a way of verifying it’s correct? Do we have a way of projecting it to the lower level? And do we have a way of abstracting things up to a higher level?
Subramaniam: The modeling language should be able to address all the key attributes of the design. You need to be able to define a set of metrics for every design, and then you need to be able to look at all the different alternatives that are available for that design and translate each of those implementations into the metrics. Now, when you’re in the exploration phase, you can take a particular design and if you use this combination of transistors or this process from this foundry, what does it look like? That’s really critical as part of the exploration phase. Both tools and language are required for this exploration phase.
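One way to picture this kind of metric-driven exploration is to score each implementation alternative on the same metrics and keep only the alternatives that no other option beats on every metric (a Pareto front). This is a minimal sketch of that idea, not a description of any tool mentioned in the discussion. The candidate names and numbers are hypothetical, and all three metrics are treated as lower-is-better.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One implementation alternative (hypothetical values, lower is better)."""
    name: str
    power_mw: float
    delay_ns: float
    area_mm2: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    a_metrics, b_metrics = a[1:], b[1:]
    return (all(x <= y for x, y in zip(a_metrics, b_metrics))
            and any(x < y for x, y in zip(a_metrics, b_metrics)))


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Assumed alternatives, e.g. different transistor or library choices.
candidates = [
    Candidate("fast_low_vt", power_mw=500, delay_ns=1.5, area_mm2=10),
    Candidate("slow_high_vt", power_mw=300, delay_ns=2.6, area_mm2=10),
    Candidate("balanced", power_mw=350, delay_ns=2.0, area_mm2=9),
    Candidate("oversized", power_mw=520, delay_ns=1.6, area_mm2=12),
]

for c in pareto_front(candidates):
    print(c.name)
```

Here the "oversized" option drops out because "fast_low_vt" is at least as good on all three metrics. The remaining options are genuine tradeoffs that a designer would still have to weigh.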

SLD: Is there a language that does this?
Zorian: We are going up in abstraction. TLM is a high-level description. That same IP has multiple levels of description. But what is being shared for exploration purposes is a transaction-level model. Once you decide to use it, you go down to the detailed models. But there’s a good exchange of information at that level.

High-Speed Bus Architecture And Data Transmission Technology Overview

Wednesday, January 25th, 2012

High speed and low power embedded processors are used frequently in today’s high performance networking and communication systems, digital consumer electronics, and office automation applications. It is extremely important for the equally fast I/O and multiprocessor busses to keep pace with them so as to enable an effective product solution. This report is intended to be a quick reference for a high level understanding of bus architectures, the most widely used data transmission standards and I/O bus solutions. It also includes an extensive glossary and set of references for further research.

Experts At The Table: The Future Of Stacked Die

Friday, January 20th, 2012

By Ed Sperling
System-Level Design sat down to discuss the future of stacked die with Riko Radojcic, director of engineering at Qualcomm; Prasad Subramaniam, vice president of design technology at eSilicon; Mike Gianfagna, vice president of marketing at Atrenta; and Herb Reiter, 3D/TSV working group chair for the GSA. What follows are excerpts of that conversation.

SLD: There will need to be an infrastructure of models that are power-aware, interconnect aware and system-aware, right?
Gianfagna: I don’t see how you can express the parameters for those models. Are they accurate enough? I would say no. There is no well-defined set of parameters that characterize this with sufficient detail to be meaningful.
Subramaniam: It will be a challenge to have two thermally alike chips stacked one on top of another. What may ultimately happen is that one chip will be thermally dominant and one will not. That would be an easier problem to address.
Reiter: We have learned a lot in ASICs over the years, and this will be the same. There will be a lot of learning from the data. That’s one of the problems the EDA vendors have now. They can’t get meaningful data from the suppliers. That’s also why they’re reluctant to do a lot of tools right now. If you don’t have good input, the tools without the data will not produce accurate results.
Gianfagna: We know what high-temperature accelerated life testing is for a chip. But what will that be for a 3D stack?
Radojcic: We’ve been pushing this issue of stress awareness for a while with Sematech, Fraunhofer and Imec. We had a series of workshops to focus on this. The first three were focused on how do we simulate what is basically the DFM. What are the model characteristics you need? The next two were focused on the reliability issues. What are the qualification standards that are required for a 3D stack? Do we need to do something new, or is a 1,000-hour stress test the same?
Gianfagna: I doubt it’s the same. The failure mechanisms are far more varied and subtle.
Radojcic: We’ve already looked at what are the failure mechanisms we should all worry about. There is no doubt stress and the many faces of stress come up on the list. There is a list knocking around of things to worry about. And now we’re going to look at standards.

SLD: Does 3D stacking become harder with 3D structures, such as FinFETs or a deeper channel?
Radojcic: The value proposition for stacking is the same. There is no reason that will go away.
Subramaniam: Those are lower layers. A finFET chip doesn’t look any different than a standard transistor chip.
Radojcic: The reality is no one knows the answer to that. We don’t know if a finFET reacts with a TSV.
Reiter: What happens when we get a much higher software content in 3D? How do we test hardware and software interactions? Software many times causes a lot of headaches. How do we make sure 3D stacking doesn’t get a black eye because the software isn’t mature enough?
Gianfagna: We see the problem very clearly in 2D. My view is 3D won’t be built and thrown over the wall to the software guys. The software guys will be collaborating with the 3D stack design team to build a system. That means the model the software guys are using will not be static and there will have to be an iterative back-and-forth communication with the hardware guys. If software guys are really as invested as the hardware guys, things should be better. It’s not rocket science to solve that problem, but design teams will have to react differently. In my experience, those are the harder problems to solve. To make an organization work differently is much harder than writing a new piece of software.

SLD: How does the business world look at 3D chips? Will it be one vendor producing all of these, or will it be multiple vendors providing lots of different pieces?
Radojcic: Somebody has to be the aggregator. Whether that’s Qualcomm or a third party or the foundry doesn’t matter. But clearly part of the value proposition is intimate interaction with different types of chips, and very few entities design all kinds of chips.

SLD: And the biggest piece of the pie is for the aggregator, right?
Radojcic: Yes. There is money and IP in that space.
Gianfagna: This is like the fabless versus the foundry model. Qualcomm does a lot, but it doesn’t build substrates. Smaller companies will outsource more.

SLD: But your risk goes up with any partnership, right?
Subramaniam: The issue is ownership. Who owns the final product? That will be the aggregator because the aggregator is responsible for putting it all together and making it available to the end customer. In this case, it could be Qualcomm. Or it could be a third party providing it to their own customer.
Radojcic: It also could be an end user like Nokia. That still has to be worked out. It probably will be all of those.
Gianfagna: Let’s look at the seamy underbelly here. The end customer will take all the glory and all the kudos, but they’re not taking the yield risk or the inventory risk. Their supply chain is.
Subramaniam: That’s no different than what’s happening today.
Gianfagna: No, it’s not. But it is more complicated.
Subramaniam: Yes, it makes the problem for the aggregator harder.
Gianfagna: Presumably the aggregator can extract more value for all that additional service.
Reiter: Some companies have already said they will not be the aggregators taking on a lot of responsibility and risk for current margins. The assembly guys already are seeing a big opportunity in this area to become consolidators.
Gianfagna: There are opportunities here for clear thinkers and a little bit of nerve.
Reiter: The problem is some of these guys have just invested millions of dollars in things like wire bonders for SiP, and SiP will largely be replaced.
Radojcic: All of this stuff will co-exist. The TSVs will cost more. There is incremental risk. Some vendors will leverage the incremental value to pay for their costs. But not everything will be that way.
Reiter: But if you look at the I/Os you need SiP. You can’t thin the dies too much because you need to PoP onto this bonding. TSVs, where you are taking the old I/O ring away, make that much more cost-effective. And in two or three years, that will be more cost effective than wire bonding for higher volumes that are architected for 3D.
Radojcic: If you replace the wire bonds with TSVs, the math doesn’t work. It’s too expensive. Wire bonding is a perfectly good, cheap technology.
Subramaniam: For the same number of I/Os, the SiP will be cheaper than the TSV solution. But what TSV offers is the chance to blow up your I/Os. You can have many more I/Os in the same footprint. There will be an incremental cost, but the benefit is significant. You have more connectivity and reduced power. Those are the tradeoffs that need to be made. For applications where that is not necessary, they will tend to use SiP. For applications where you want to take advantage of extra connectivity and low power, that will move to TSVs.
Reiter: What’s the highest pin count in SiP that you have seen? About 500?
Subramaniam: Yes, something like that. Power starts to dominate. There are only so many signals you can put in. You need to have power and ground.
Reiter: From a performance standpoint, TSVs will be much better. The inductance of the bonding wire is huge.

SLD: Who gets hurt by 3D stacking?
Reiter: It’s an incremental opportunity.
Subramaniam: It’s a new technology that offers many more alternatives than are available today. I don’t believe it’s taking away from anybody. It’s adding to the whole space.
Gianfagna: The aggregators will take on more risk with an increased opportunity for reward. The only people who get hurt are the ones who can’t figure out how to use this stuff.
Reiter: Every logic foundry today is a big memory supplier. If memories get pulled out, the logic die gets much smaller and the revenue will get impacted.
Subramaniam: No one will replace SRAM with DRAM. The SRAM will still be there, whether it’s on the same die or different die. You don’t need a different process. I’m not convinced the logic foundries will suffer.
Reiter: But large SRAMs are getting slower than DRAMs because of the wire delay. Having large embedded SRAMs on the SoC is slower than having a DRAM with TSVs on top of it.
Subramaniam: But you can have SRAMs with TSVs through them, as well. And SRAMs are inherently better than DRAMs for that kind of application.
Radojcic: In addition, that’s why you typically partition your SRAM into a whole bunch of smaller instances so you don’t have a wire a mile away.
Reiter: But if you look at the wide I/O standard, you can have a lot of small SRAMs on top of your logic-only die.
Radojcic: You could, and there is talk about re-architecting chips to eliminate L3 cache. But it’s not the same value proposition as what’s being enabled by wide I/O.

SLD: What’s the big advantage of 3D? Is it saving money, or is it performance and power?
Reiter: It’s performance and power.
Gianfagna: Performance and power will lead to saving money, not the other way around.
Subramaniam: There are cases where it saves money. In Xilinx’s case, by breaking up the die into four components they were able to significantly improve yield. There will be many cases where by using smaller pieces of silicon and replicating them, either stacking them horizontally or vertically, you will significantly improve yield and get a cost benefit. It’s all of the above.
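The yield argument in that last answer can be made concrete with a rough calculation. The sketch below uses the simple Poisson yield model, in which yield falls exponentially with die area, and compares the silicon consumed per good unit for one large die against the same area split into four pieces. The defect density and die area are illustrative assumptions, not figures from the discussion, and the function names are hypothetical. The model also assumes each piece is tested before assembly, and it ignores interposer, assembly and test costs.

```python
import math


def poisson_yield(area_cm2, defect_density_per_cm2):
    """Poisson yield model: probability that a die has zero killer defects."""
    return math.exp(-area_cm2 * defect_density_per_cm2)


def silicon_per_good_unit(total_area_cm2, pieces, defect_density_per_cm2):
    """Average silicon area consumed to obtain one good assembled unit.

    Assumes every piece is tested before assembly (known good die) and
    that assembly itself loses nothing, which is optimistic.
    """
    piece_area = total_area_cm2 / pieces
    piece_yield = poisson_yield(piece_area, defect_density_per_cm2)
    # Each good piece costs piece_area / piece_yield of silicon on average.
    return pieces * piece_area / piece_yield


D0 = 0.5    # assumed defects per cm^2 (illustrative only)
AREA = 6.0  # assumed total logic area in cm^2 (illustrative only)

monolithic = silicon_per_good_unit(AREA, 1, D0)
quartered = silicon_per_good_unit(AREA, 4, D0)

print(f"one large die:    {monolithic:.1f} cm^2 of silicon per good unit")
print(f"four smaller die: {quartered:.1f} cm^2 of silicon per good unit")
```

Under these assumptions the split design needs far less silicon per good unit, because a single defect scraps only a quarter of the area. How much of that gain survives once assembly and interposer costs are added depends on the actual product.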

Experts At The Table: The Future Of Stacked Die

Friday, January 13th, 2012

By Ed Sperling
System-Level Design sat down to discuss the future of stacked die with Riko Radojcic, director of engineering at Qualcomm; Prasad Subramaniam, vice president of design technology at eSilicon; Mike Gianfagna, vice president of marketing at Atrenta; and Herb Reiter, 3D/TSV working group chair for the GSA. What follows are excerpts of that conversation.

SLD: Aren’t there physical effects that need to be dealt with by the standards groups, as well?
Radojcic: No, the focus for phase one is to drive standards exchange mechanisms within the supply chain. What does a carrier look like and how does the foundry ship a thin wafer to an assembly house and what does the handler need to have? What are the inspection criteria? There is a set of standards in the manufacturing flow intended to link the chains within the supply chain rather than standardizing inside a link. There isn’t much focus on a standard TSV. It’s how you ship 10 wafers.
Subramaniam: There’s also the issue of who owns the library for the wafer. Now that one company is doing part of the processing and another company is doing part of the processing, who owns the problem if there is an issue with the wafer? That’s an issue that has not been resolved, in my opinion.

SLD: The business issues are interesting in 3D, particularly if you have known good die creating a bad die.
Radojcic: Yes, we totally agree. But before we get too pessimistic about this, the world is shipping billions of PoP parts and there is a perfectly good PoP business model.
Subramaniam: It isn’t that it can’t be worked out. But it still has to be done.
Radojcic: And it could be an opportunity for companies like eSilicon to take responsibility on both sides.
Reiter: Test will be a big issue, too, especially if you have logic chips on top of each other. But this will be two or three years away. One logic layer with a memory stack on top of it is probably manageable.
Gianfagna: Known good die backs up to test, too. We grew up knowing about process monitors and paddle transistors and wafer acceptance tests and probe and package assembly. What else do you need for exchanging slices? Don’t you need more acceptance tests?
Reiter: The test guys are finally interested. People are coming forward to learn about 3D because they realize you can do a lot with built-in self-test and minimize the time of the tester.
Subramaniam: If you look at how we package parts today, we test them on the wafer. We don’t package parts blindly by taking parts and putting them on the package. How is it different? You still have to test something on a die before you go to the next level of assembly. The issue is that with 3D IC you won’t have all the pins accessible.
Reiter: And you can’t use PoP cards anymore.
Subramaniam: Yes, and that’s why I believe we’re heading in the direction of BiST. That enables you to test the die in a 3D environment.
Gianfagna: That backs up into an IP opportunity and a design challenge to make all that work. The business issues can be solved with some technology issues.
Radojcic: A lot of us have questioned how we test logic on logic, or memory on logic. But if you look at Wide I/O, it’s going to be shipped in a package that’s going to be tested like any other package. The only thing that won’t be tested are the microbumps and the drivers that go with it. Otherwise, the coverage is the same. It’s shipped not as a die but as a package unit. You test logic the same way you always did. So when you stack it, you have untested TSVs. That’s an incremental risk. But that shouldn’t be too bad. There’s also a potential loss of memory die due to the assembly process. That shouldn’t be prohibitive, either. But for early implementation, memory and logic shouldn’t be too bad. The hard part will be 2.5D. How are you going to test the interposer? It’s only wires, and its yield is not that good. So you have to take perfectly good die and put them with an untested interposer. That’s the test challenge. The only way to test an interposer is microcoding. But the things most of us spend 90% of our DFT time worrying about aren’t so bad.
Subramaniam: I don’t think it’s that simple. How do you handle temperature test? Generally when you package a product you test it at different temperatures, not just room temperature. But wafer test is only done at room temperature. Now you have good die that are tested only at room temperature, you’re putting them on a 3D package, and you don’t have the ability to test the assembled solution at different temperatures.
Radojcic: There’s nothing to stop you from doing post-assembly testing.
Subramaniam: You need to have all the pins of the die. If you have two die connected internally, you may not have access to those pins externally.
Radojcic: I’m assuming, perhaps foolishly, that whoever has built this thing has designed a scan solution of some kind.
Subramaniam: That may not always be possible.
Radojcic: If you have designed your system so that it is intrinsically untestable, you get everything you deserve.
Subramaniam: That’s why a BiST solution is appropriate. Scan may not do the job.
Radojcic: You have to do a good DFT job. It’s no different than for a 2D SoC. You have to provide the infrastructure to have observability and testability. Whether it’s 2D or 3D doesn’t make any difference.
Subramaniam: But the number of pins you have in 3D that go to the outside will be far fewer. Observability with the outside world is limited.

SLD: Does it become harder to do these tests if you are working with multiple power domains and portions of the chip being turned on and off all the time?
Gianfagna: There’s more opportunity for power domain definitions in a 3D stack than in 2D.
Subramaniam: I don’t see it as being any different.
Gianfagna: The problem gets bigger, but it’s the same problem. There is a related issue of proximity effects, though. Test over temperature gets interesting. How a system will behave over temperature is not obvious. In a 2D chip it’s more obvious.

SLD: Strangely, it may run faster, right?
Gianfagna: Yes, that’s true.
Subramaniam: This is where the analysis tools come into play. You need to model the whole problem.
Gianfagna: Nothing is new here, but it’s a bigger, more complex problem with more interrelationships. The tools will probably break due to capacity.
Radojcic: This is another tool issue or flow issue. Today most of us look at timing where everything is hot or everything is cold. You don’t have to deal with gradients. In 2D you deal with this by doing a good job of characterizing a product. Die A can be perfectly fine. Die B can be perfectly fine. But when you slap them together you may have a gradient that is different and you can’t characterize them separately. You need the ability to have timing characterization based upon thermal gradients, and there is no tool that does that today.

SLD: Don’t we also need to add even finer-grained capabilities into tools? Everything seems to be pointing to a higher level of abstraction, but we need to go the other way, as well.
Radojcic: Yes. I know how to simulate temperature in thermal profiles. I can figure out what the thermal profile will look like and I can superimpose it on my die. We can narrow down the problem to a portion of the die. But I don’t know how to import that into timing, because all of our timing is spatially unaware. That, to me, is one of the key EDA disconnects.
Reiter: I see 3D not as a design or implementation methodology, but as a system design methodology. A lot of things we’re seeing on the board level will be reflected in a 3D IC environment.
Gianfagna: Except the proximity effects are significantly worse.
Reiter: Yes, because people who lived on the range before are suddenly moving into a condo building. They cannot behave the same way.
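Radojcic’s point about timing being spatially unaware can be illustrated with a toy calculation. The sketch below compares a data path and a capture clock path under two uniform temperature corners and then under a gradient, where the data path sits in a hot region and the clock path in a cool one. Everything here is an assumption made for illustration: the linear derating coefficient, the cell delays, the temperatures and the function names. Real cells may even speed up with temperature, as the discussion notes, so the sign of the coefficient is not a given.

```python
def path_delay(cell_delays_ps, cell_temps_c, ref_temp_c=25.0, derate_per_c=0.001):
    """Sum of cell delays, each scaled linearly by its own local temperature.

    derate_per_c is a hypothetical coefficient (fractional delay change per
    degree C); a real flow would take this from characterized libraries.
    """
    return sum(
        delay * (1.0 + derate_per_c * (temp - ref_temp_c))
        for delay, temp in zip(cell_delays_ps, cell_temps_c)
    )


def margin(data_temps_c, clock_temps_c, cell_delays_ps=(100.0, 100.0, 100.0)):
    """Capture clock delay minus data delay for two otherwise identical paths."""
    data = path_delay(cell_delays_ps, data_temps_c)
    clock = path_delay(cell_delays_ps, clock_temps_c)
    return clock - data


# Uniform corners: every cell on both paths sees the same temperature.
all_cold = margin([25.0] * 3, [25.0] * 3)
all_hot = margin([105.0] * 3, [105.0] * 3)

# Gradient: data path under a hot die region, clock path in a cool region.
gradient = margin([105.0] * 3, [25.0] * 3)

print(f"all cold: {all_cold:+.1f} ps, all hot: {all_hot:+.1f} ps, gradient: {gradient:+.1f} ps")
```

Both uniform corners report the same margin because the two paths scale together, while the gradient case exposes a mismatch that neither corner predicts. That gap is the kind of effect a spatially aware timing characterization would need to capture.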

The Week In Review: Dec. 16

Thursday, December 15th, 2011

By Ed Sperling
Mentor Graphics introduced an integrated component-to-system thermal characterization and analysis solution that combines hardware test with its FloTherm software. This is a particularly interesting move for the LED and IC packaging arenas, given the focus on leakage and heat.

Cadence won a deal with Panasonic for its Palladium XP platform, which combines simulation, acceleration and emulation. The tools will be used in a variety of digital consumer electronics.

HiSilicon has licensed eSilicon’s 40nm ternary content-addressable memory macros for its networking chips.

Blu-Wireless has licensed Sonics’ on-chip communications IP for its wireless communications processors aimed at the unlicensed 60 GHz market. Blu-Wireless will use the IP for a new generation of multi-gigabit communications for consumer electronics.

Synopsys claimed a share of the victory in GlobalFoundries’ first complex 20nm tapeout, complete with double patterning. A number of Synopsys tools were used to achieve silicon success.

Experts At The Table: The Future Of Stacked Die

Thursday, December 15th, 2011

By Ed Sperling
System-Level Design sat down to discuss the future of stacked die with Riko Radojcic, director of engineering at Qualcomm; Prasad Subramaniam, vice president of design technology at eSilicon; Mike Gianfagna, vice president of marketing at Atrenta; and Herb Reiter, 3D/TSV working group chair for the GSA. What follows are excerpts of that conversation.

SLD: Where are we with 2.5D and 3D?
Radojcic: I think 2.5D was a misnomer, because that implies they are sequential. It’s clear that what we call 2.5D and 3D are going to co-exist for a long time. Some things make sense with an interposer and some make sense to be 3D.
Reiter: I agree—2.5D is a parallel effort to 3D. Lots of things will not use 3D because it’s too expensive. In 2.5D we will see production this year. With 3D it will take until next year for the first ones. I would guess computing or networking would be the first.
Radojcic: I would think those guys will pursue 2.5D.
Subramaniam: Memory makers are already offering 3D solutions today. If you look at just the memory chip, to increase the size of the memory rather than the die they’re stacking it vertically. That kind of 3D is already in production. It’s the question of co-mingling logic and memory that will take time. The advantage of 2.5D is that it allows afterthought. It allows you to take an existing design and to create a new set of I/Os and put in a 3D type of application.
Radojcic: I see no value in doing that. You’re creating an expensive solution to something you can do more cheaply. If you add the 3D interposer you’re adding another wafer. That’s cost. We can solve that problem with a flip chip. It’s cheaper.
Subramaniam: I disagree. We’ve done the analysis. It allows us to take an existing design, like an ARM subsystem in 28nm, even though surrounding logic doesn’t have to be at that 28nm process node. It can be 40nm or 65nm. Rather than building a new chip at 28nm, I can take my existing design, use it as one component of my 3D IC, and build a second chip in a cheaper, older technology.
Radojcic: Yes, as long as you’ve architected your chip like that, such that you can partition it.
Subramaniam: You can’t take any design, no. There has to be some partitioning in the architecture and some forethought. It’s not 100% an afterthought, but there is still some afterthought there.
Radojcic: You have to architect for it. If you haven’t done that, taking an existing chip will just cost you more. If you have done that, of course there is an avenue to doing things better and more flexibly.
Subramaniam: There is enough flexibility in designs that allow you to partition it in some manner.
Radojcic: True, but before 3D came along most of us wouldn’t have partitioned. We wouldn’t have architected it that way. To be able to leverage that value proposition, you must have 3D in mind.
Gianfagna: That’s true. It’s a premeditated act. If you don’t think it through way up front it doesn’t work.
Subramaniam: Because the SoC has a well-defined architecture, it lends itself to this type of application.
Radojcic: But only if you plan for it ahead of time.

SLD: Is this true in all cases?
Reiter: That’s the view of a high-volume supplier. I see low-volume solutions where they use an existing die, put it face down on an interposer, and connect memory to it. So for low to medium volume, 2.5D works. You call it an afterthought. I call it a customized solution.
Radojcic: Why wouldn’t you do that in a traditional multichip package?
Subramaniam: Because you don’t get the interconnectivity. The advantage of a silicon interposer is that you get thousands of interconnects.
Radojcic: But you have to design it sufficiently so you can leverage the interconnects from die to die. If you had designed for a traditional design, though, you would say, ‘I can’t have thousands of interconnects so I’m going to make a serial interface with 100 pins.’ If you take that design for a 100-pin interconnect and stick in an interposer it’s an expensive way of doing things.
Subramaniam: You may be able to take some internal signals out, which you are not able to do with a traditional MCM (multi-chip module) approach.

SLD: Let’s do a reality check. How far along are we toward stacking?
Gianfagna: Last year we had a hot-wired 3D system that was 2D with a bunch of scripts and manual effort. The customer base had strange, contrived designs and they were trying to see what they could and couldn’t do, and the foundries didn’t know what they wanted to do. A year later we have native 3D planning capability, the customer base has specific designs for implementation this year and next, and the foundries have a laser-sharp focus on process learning, mostly around 2.5D initially. If that’s a metric, things are clearer this year than last year. From an EDA perspective, I still think the market is two years away. But we still think this is big.
Reiter: If you look at the Atom chip with the FPGA from Altera, that’s basically a 2.5D solution. The FPGA is for customizing things. The Atom chip was not designed for this application.
Radojcic: But why use an interposer? Why not use a substrate and a multichip package?
Reiter: You could do that.

SLD: What’s missing from the tools side to make all this work?
Reiter: The ability to demonstrate what this technology can do is the most important capability. If you look at big corporations, top management is still hesitant to invest in this technology. If we could demonstrate in a credible way what it can do, people will be more successful in getting money to start programs using this technology.
Gianfagna: The way that happens is the early adopters blaze the trail, everyone tries to follow and the market heats up. What’s needed are commercial drivers. The tools aren’t there, but they’re close enough.
Subramaniam: The tools are not the issue. The development needed to support 3D is incremental. It can be done with the existing infrastructure. It’s really the end application.
Radojcic: Other than path-finding, which is hard to do with traditional tools. And the analysis.
Gianfagna: The complexity is higher. We’ve discovered that, too. RTL prototyping for a single chip has a certain set of challenges. When you go to 3D the modeling requirements are much greater, the constraint generation is more complicated. And we need standards. We can generate all the constraints, but we don’t know where to put them and how to express them because there is no agreed upon way to do that.

SLD: Do the standards organizations know where to start with all of this?
Radojcic: Standards are on a good track. We’ve worked with Si2 and Sematech to propose initial blasts for standards so we can feed them into Si2 and the EDA community and accelerate the process. The bits and pieces are moving, and we are on track to have a set of design exchange format standards by early next year.
Reiter: And Wide I/O.
Radojcic: Yes. The standards are channeled and the engine is revving.
Reiter: We have a bunch of players in a 3D enablement center participating. There are 15 companies listed, including Intel, IBM, TSMC, GlobalFoundries, and so on.
Radojcic: The way this was set up was Sematech said we are going to start a 3D enablement center initiative driven by the SIA. All the members of Sematech were mapped into this. Then a number of companies like Qualcomm, LSI and ASE joined.

Rebalancing Power, Performance And Area

Thursday, December 15th, 2011

By Ed Sperling
The tradeoffs between performance, power and area are being fine-tuned to a degree never seen before in the IC business, driven partly by complexity, partly by better tools, and partly by the need to gain a competitive edge in specific applications.

Just being able to make these kinds of tradeoffs is a technological feat that marries everything from high-level modeling and synthesis to prototypes of hardware and software and better characterization of IP. But being able to use the data from these tools more effectively is changing what can be done in design.

“This is going to be a way of life going forward,” said Jack Harding, president and CEO of eSilicon. “If you look at the number of variations between process nodes, types of IP and voltage, no human can sort through all the permutations anymore and come up with an optimal design, and certainly not around PPA.”
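A quick count shows why the permutations outrun manual exploration. The sketch below multiplies out a hypothetical design space of process nodes, supply voltages and per-block IP choices. All of the numbers and names are assumptions chosen for illustration; none come from eSilicon.

```python
from itertools import product


def count_configurations(num_nodes, num_voltages, choices_per_block, num_blocks):
    """Number of distinct designs when every choice is independent."""
    return num_nodes * num_voltages * choices_per_block ** num_blocks


def enumerate_configurations(nodes, voltages, block_options):
    """Yield every (node, voltage, block choices...) combination."""
    return product(nodes, voltages, *block_options)


# Small case: enumerate explicitly to confirm the counting formula.
small = list(
    enumerate_configurations(
        ["40nm", "28nm"],        # hypothetical process nodes
        [0.9, 1.0],              # hypothetical supply voltages
        [["A", "B"], ["A", "B"]],  # two IP blocks, two vendor options each
    )
)
small_count = count_configurations(2, 2, 2, 2)

# Larger case: 3 nodes, 4 voltages, 10 IP blocks with 3 options each.
total = count_configurations(3, 4, 3, 10)

print(f"small design space: {len(small)} configurations")
print(f"larger design space: {total} configurations")
```

Even this modest example produces hundreds of thousands of candidate designs before any power, performance or area estimate has been run on one of them.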

This is particularly true at 28nm, where power, performance and area are not linear extensions from previous nodes. And in stacked die, where multiple generations of technology at different voltages are packaged together, those tradeoffs may be greater still.

“The fact is that we may not know the best combinations anymore because there are too many things to consider,” said Harding. “I’m convinced this is a permanent change, too. I liken it to place and route in the early 1990s, where it was used to help smart guys design chips. By the late 1990s that had to be automated in all chips because the gate count was too high. We’re now approaching tool-assisted SoC architectures.”

Cause and effect
At least part of what’s behind this shift is market demand for more customized solutions. The move to subsystems and off-the-shelf IP means that companies have to find a way to differentiate their chips, something that will become even more apparent as the industry begins shifting to 2.5D and 3D stacks over the next couple years. Even the software might not be enough to differentiate the product in some markets, such as Android phones.

“The stakes are higher and the tools are better,” said Wally Rhines, chairman and CEO of Mentor Graphics. “Now a microwatt matters. It can be the difference in a win or a loss.”

The same is true of area and performance. But the new wrinkle is those variables almost need to be tweaked for each customer. Naveed Sherwani, president and CEO of Open-Silicon, said that for some of the large search engine, cloud computing and social media companies, the emphasis is on performance at any cost. This is contrary to the direction of most data centers, where power has become the major focus due to the cost of running and cooling racks upon racks of servers.

“The reason PPA is changing is because now we can change it,” Sherwani said. “There are a lot more tools that can play with more things that affect power, performance, area and cost. We’re heading toward a very platform-style approach to design, so the changes from one customer to another may be only about 20%. With 3D stacking and memory, the next few years should be very interesting.”

Why now?
As with all significant changes in the IC business, there is no single factor that is responsible. The push to advanced nodes has added more complexity to designs, starting with more transistors (with 3D transistor designs at 14nm), more leakage, more features that require more complex power management, more IP re-use, and a larger software component that needs to be written more quickly and with energy efficiency in mind. On top of that there are better tools for making these kinds of tradeoffs, and all of the big EDA companies are working on better analysis of the data that can be added into major flows.

Still, getting sufficiently good data to make these kinds of tradeoffs isn’t easy—even with better tools.

“PPA perplexes everyone,” said Bernard Murphy, chief technology officer at Atrenta. “The time to do PPA tradeoffs is early on, but the challenge is that there are a lot of unknowns at that time. There is recognition that if you can’t solve the big problem you can break it into smaller pieces, and these days not all of the design is unknown.”

He noted that Atrenta has been examining the effect of bus fabrics on the whole PPA equation. He said the direct influence on power is less, but they do contribute heavily to idle mode power, something that will become particularly apparent once wide I/O becomes mainstream.

“One of the reasons the NoC (network-on-chip) exists in the first place, and why ARM is looking at pseudo-NoCs is to control congestion,” he said. “The bus fabrics are only getting more complicated, and there is very little expertise in detailed performance analysis.”

Education is critical across the board in PPA. Teams of software engineers working with hardware engineers on designs alongside groups that are focused on manufacturability have expanded the scope of many design engineers. To some extent, everyone will have to think like a systems engineer in the future, even if they have their own area of expertise. But the very fact that they are talking to other team members is eliminating some of the silo behavior that various teams have lived with for the past couple of decades.

The future
While PPA has always been a way to spin cost, increasingly it also is seen as a way to improve time to market. Stacked die, and re-use of IP, subsystems and even entire platforms and die, will alter this equation even more, and add far more options for trading off power, performance and area.

Those tradeoffs already are being done on a localized basis, with one IP block versus another or one processor core or multiple cores versus one or more other cores. In the future, it could include entire chips, as well, which may be customized quickly for individual markets or customers.

These changes also are likely to bring shifts within the supply chain. How the pieces will be reassembled is unknown at this point, but most experts agree that more change is inevitable.
