Posts Tagged ‘Semico Research’

Power Issues In 3D

Thursday, April 14th, 2011

By Ann Steffora Mutschler
The challenges associated with implementing IP subsystems range from maintaining a consistent I/O voltage and achieving consistency in metal stacks to managing a clock distribution network and creating adequate isolation between subsystems on a chip. It’s enough to make your brain hurt. Add to that 3D or 2.5D stacking and the engineering considerations grow substantially.

The concept of stacking die has captivated the semiconductor industry with its promise of, among other things, better performance, shorter signal distances and in some cases a smaller footprint. But with it comes additional design complexity and cost considerations.

“In the old days when people talked about IP subsystems very often they were talking about one SoC, because within this one SoC you have IP that has certain well-defined behavior and has a good interface with protocols around it,” said Dian Yang, general manager and senior VP of product management at Apache Design Solutions. “Now the problem is that those kinds of subsystems are getting more and more complicated, so the subsystem itself becomes a gigantic chip. To be embedded inside a chip, sometimes it may not be economically feasible or may not be technically feasible in theory, especially when you look at 3D implementation.”

For example, if an IP subsystem is implemented on a separate die and then stacked with the original SoC, the user interface is well-defined and a micro bump or TSV is used to connect them. But separating the design in order to achieve 3D has disadvantages: you have two wafers, two dies, and have to use a more expensive 3D package, he said.

There are advantages, too, Yang said. “Let’s say the subsystem itself can be independently manufactured from the SoC guys in terms of the process technology: one is in 65nm and the other one can be 40nm or 28nm. The second advantage is that the testing embedded inside the 3D SoC can be a little more difficult but separately, you can do wafer-level testing much easier.”

A third implementation advantage is that if the chip can be separated into two die versus one die and uses a 3D package, sometimes the cost can actually be lower. This isn’t always true, but in some cases, especially when mixing two process technologies, the end cost can indeed be less.
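The cost argument can be illustrated with a rough sketch. It assumes a simple Poisson yield model and a flat packaging adder, and every number in it (areas, cost per square millimeter, defect densities, the 3D overhead) is hypothetical and chosen only to show the trade-off. It also ignores yield loss in the stacking step itself.

```python
import math

def good_die_cost(area_mm2, cost_per_mm2, defects_per_mm2):
    """Cost of one good die, assuming a simple Poisson yield model."""
    die_yield = math.exp(-area_mm2 * defects_per_mm2)
    return area_mm2 * cost_per_mm2 / die_yield

def stacked_cost(dies, stacking_overhead):
    """Total cost of a stack: each good die plus a flat 3D packaging adder."""
    return sum(good_die_cost(*die) for die in dies) + stacking_overhead

# Hypothetical process parameters: (cost per mm^2, defects per mm^2).
ADVANCED = (0.10, 0.005)   # newer node: pricier silicon, more defects
MATURE = (0.05, 0.002)     # older node: cheaper silicon, fewer defects
OVERHEAD = 10.0            # hypothetical flat cost adder for the 3D package

# Large design: one 200 mm^2 die versus two 100 mm^2 dies on mixed nodes.
big_mono = good_die_cost(200, *ADVANCED)
big_stack = stacked_cost([(100, *ADVANCED), (100, *MATURE)], OVERHEAD)

# Small design: one 40 mm^2 die versus two 20 mm^2 dies on mixed nodes.
small_mono = good_die_cost(40, *ADVANCED)
small_stack = stacked_cost([(20, *ADVANCED), (20, *MATURE)], OVERHEAD)

print(f"large design: monolithic {big_mono:.2f}, stacked {big_stack:.2f}")
print(f"small design: monolithic {small_mono:.2f}, stacked {small_stack:.2f}")
```

With these made-up inputs the stacked version wins for the large design, where yield dominates, and loses for the small one, where the packaging adder dominates. That is consistent with the point that the saving shows up only in some cases.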

“Customers may already have a subsystem implemented and well-tested – they don’t want to change anything because there is always risk associated with that…especially subsystems from a third party. For whatever economical reason they don’t want to migrate to 40 or they don’t want to migrate to 28, on the other side, if I want to shrink my whole SoC – that economically makes sense,” Yang said.

Today, RF subsystems commonly are implemented separately from the SoC because to combine them is not easy in terms of technology shrinking. Memory subsystems and IP such as embedded DRAM are also often implemented separately from the SoC as it is much more cost-effective to separate the DRAM using stacked die.

Fig. 1: Example of a contemporary audio IP subsystem. (Source: Semico/Synopsys)

The impact of 3D on IP subsystems
Understanding the impact of 3D on IP subsystem implementation is key to moving ahead, reminded Samta Bansal, product marketing for Encounter digital implementation system, and a 3D IC expert at Cadence. “When I look at immediate implementation of 3D, I actually think about it utilizing the IP subsystems. Immediately, I see we will leverage the 3D with TSV structures as being able to use IPs and build something more, tailoring it to different applications.”

3D impacts IP subsystems on a number of fronts, including what people will actually be designing and partitioning in 3D and how to utilize the IP that is going to come from third parties. “In bringing all of this together, what becomes very important is co-design: how are you going to co-design everything together knowing that IP will come from different sources,” she explained. DFT is another area that is affected by 3D. Every chip could have its own DFT, but when put together as a system how will they talk to one another?

“There are a couple of ways to solve the challenge when you put power in the context of 3D or IP subsystems. Number one, we can solve it by adding additional resources. I can add decoupling capacitance, I can add extra vias, I can do more I/Os and increase the routing area through the power distribution network but that means cost. And the whole idea of doing 3D is to somehow manage the cost in addition to the power and performance that you would get. That is a way to do it, but is that an effective way? It depends on the application that it is targeted for, and the volume you are going to bring up, that may work and it may not work,” Bansal offered.

Good engineering always helps too, she said, pointing to Freescale’s use of stacked decoupling capacitors (decaps) instead of its standard gate oxide decaps, which shows how smart, next-generation decaps can be leveraged to realize some improvement in the clock frequency. That, in turn, helps the power network overall. IBM, in contrast, uses trench capacitors for decoupling. “You can either throw cost at the problem with more capacitance, more routing areas, more vias and try to solve the challenge of power, or you can implement these smart engineering techniques. And you can use EDA tools to optimize the power.”

Looking at power in IP subsystems from a higher level, Bansal believes the dies that are going to be farther away from the die with the connection to the package will be susceptible to noise. They will get noise from their own switching and they will be affected by the power noise from the die below it. From the pin count and routing limitations, sometimes it is very difficult to isolate the power distribution network between the two dies—both for the power and the ground supply network.

Navraj Nandra, director of analog/mixed signal marketing at Synopsys, disagrees on this point. He said the biggest challenge in implementing IP subsystems that contain 3D structures is packaging and stacking up various die in a way that ensures the interaction of the IP functions is managed in terms of signal integrity and noise. “The idea of 3D or 2.5D actually solves a lot of these problems because the goal of 3D is to get your form factor into something that’s mobile, but what it does is reduce a lot of the trace lengths. Imagine if you had two chips communicating with each other across a board and now they are communicating with each other on top of each other through TSVs so that distance now is reduced. This actually makes the challenge from a noise perspective less because there is less distance to communicate over and the signal integrity will improve.”

Bansal doesn’t agree. “Even if you can think of these IP subsystems having their own power and ground networks unique to them, you can imagine the chip higher up in the stack will probably have more increased power noise and voltage drop because there are now a large number of metal and via layers that are added in the conduction path, and also because of the capacitance/inductance noise that must be happening from the intermediate chips. To mitigate that, design teams will not only have to focus on how to design the robust power grid on that IP subsystem, but also to ensure that the power delivery network that gets designed in other chips in that stack are also adequate to meet the system need. They can’t just worry about the design and density of the power grid routing. Now they also need to worry about the via count, the design and placement of the TSVs that are going to connect those power grid networks to the other die. That means when you are thinking about the power grid network or distribution, this co-design becomes very important.”
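Bansal's point about dies higher in the stack can be sketched with a static IR-drop calculation. This is a simplified model under stated assumptions: power enters only from the package at the bottom, each link between tiers is a single lumped resistance, and the capacitive and inductive noise she mentions is ignored. The function name, currents and resistances are hypothetical.

```python
def tier_voltages(supply_v, tier_currents_a, link_resistances_ohm):
    """Return the supply voltage seen at each tier of a die stack.

    Tier 0 sits nearest the package. link_resistances_ohm[k] is the lumped
    resistance (metal, vias, TSVs) between tier k and whatever is below it.
    Each link carries the current of its own tier plus every tier above it.
    """
    voltages = []
    v = supply_v
    for k, resistance in enumerate(link_resistances_ohm):
        current_through_link = sum(tier_currents_a[k:])
        v -= resistance * current_through_link
        voltages.append(v)
    return voltages

# Hypothetical three-die stack fed from a 1.0 V package supply.
currents = [1.0, 0.5, 0.5]          # amps drawn by each tier
links = [0.01, 0.02, 0.02]          # ohms per link, bottom to top
baseline = tier_voltages(1.0, currents, links)

# Doubling the TSV and via count is modeled here as halving each resistance.
more_tsvs = tier_voltages(1.0, currents, [r / 2 for r in links])

for tier, (a, b) in enumerate(zip(baseline, more_tsvs)):
    print(f"tier {tier}: {a:.3f} V baseline, {b:.3f} V with more TSVs")
```

Even in this crude model the top tier sees the lowest voltage, and its margin depends on the resistance of every link beneath it, which is why the power grids of the lower dies and the TSV count have to be planned together.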

In all cases, this will be an interesting space to watch as IP vendors look for ways to combine their IP in unique ways. In fact, Semico Research Corp. predicts that advanced performance multicore SoC will be the device type shipping with the most IP subsystems, reaching 1.558 billion units by 2015, which is a 25% CAGR.
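To put the forecast in perspective, the compound growth formula can be run backward to see what starting volume a 25% CAGR implies. The forecast period is not stated above, so the five-year span in this sketch is purely an assumption, and the resulting starting figure is illustrative rather than a Semico number.

```python
def implied_start_volume(end_volume, cagr, years):
    """Invert end = start * (1 + cagr) ** years to recover the start value."""
    return end_volume / (1.0 + cagr) ** years

END_UNITS_BILLION = 1.558   # from the Semico forecast for 2015
CAGR = 0.25                 # from the Semico forecast
ASSUMED_YEARS = 5           # assumption: the period length is not given

start_units_billion = implied_start_volume(END_UNITS_BILLION, CAGR, ASSUMED_YEARS)
print(f"implied starting volume: {start_units_billion:.3f} billion units")
```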

FPGA Trends Highlight Move To IP Subsystems

Tuesday, April 12th, 2011

By John Blyler
Low-Power Engineering sat down to discuss trends in FPGA design and related IP subsystem aggregation with Rich Wawrzyniak, senior market analyst for ASIC and SoC at Semico Research. What follows are excerpts of that conversation.

LPE: Let’s start by talking about the trends in design starts in the FPGA. Is there anything new?
Wawrzyniak: Tracking actual design starts in the programmable logic space is full of quirks, compared to ASIC starts. The problem is that the majority of programmable logic design starts come from some guy playing around at his desk. While this is important information, such design starts seldom go further than just exploration. This activity occurs at such a granular level that it’s very difficult to know what the desktop designer is actually doing. The other problem in tracking this data comes from the way people view prototyping and designing with FPGAs. For example, you might be using 5 to 10 FPGAs to mirror one ASIC. Is each one of those FPGA instances a design start? If not, what ratio of them does represent a design start? Who knows? Nevertheless, I do track programmable logic design start data from 71 end applications. This chart (Fig. 1) identifies FPGA design starts by market segment.

Fig. 1: FPGA design starts by market segment.

LPE: Most areas are seeing increases.
Wawrzyniak: They are increasing, although others may disagree. Unfortunately, whether specific groups see design starts increasing or decreasing depends upon their comfort level and their affiliation. For example, EDA would like to say that design starts are going down because their revenues are flat to down. Conversely, IP vendors would like to say the opposite, namely, that design starts are on the rise. Each group can support their viewpoint. The EDA companies justify decreases in design starts by not counting derivative designs in an SoC project. By derivative design I mean that you design the most complex family member first, then add or remove features or functions to suit the market needs for other family member products.

LPE: Don’t shuttle test runs and respins add to the confusion about what constitutes a design start?
Wawrzyniak: Test shuttle runs can be a problem, depending upon how involved they become. I do count respins as design starts since they represent (to me) a new design. I know this may seem like I’m splitting hairs. But what all of the interested parties, from EDA and IP vendors to fabless companies and foundries, ultimately want to know is the number of designs that are going to production. I agree that the answer to this question is very important. But my approach is to determine the level of design activity in the market. I feel that this is a better metric of what is actually going on. It reflects the true competitiveness of the silicon. For example, you may have 10,000 design starts but only one of those goes to production. That tells you that something is going on. Rather than focus on the designs that go to production, I look at the activity. What is preventing or enabling more design starts to go into production?

LPE: Do you consider an FPGA to be an SoC?
Wawrzyniak: I define an SoC in terms of IP. If the design includes internally re-used or third-party IP, then it’s an SoC. If it doesn’t, then it isn’t an SoC.

LPE: Traditionally, isn’t an SoC defined as a system that includes at least a processor, memory and interface circuits?
Wawrzyniak: There are several parameters that I have created to define whether it’s an SoC or not. It doesn’t necessarily have to have all of them. For example, if I remember correctly, Sharp had a line of micro-peripherals that didn’t have a CPU core. But they used third-party IP to incorporate other functionality into the SoC. Well, you better count that instead of not counting it. It just isn’t the way that people are used to conceptualizing an SoC (Fig. 2). The way I conceived of these new definitions, new looks at FPGAs, is based around interconnect and IP subsystems.

The term IP subsystem refers to an aggregation of lots of related functional blocks. There aren’t a lot of products in the market yet. IP subsystems have evolved to handle the increasing complexity of SoCs. In order to keep pumping out these complex designs, vendors have had to integrate lots of IP. The reason is that it is almost impossible to manage 100 discrete IP blocks. It can be done, but it’s costing more money, time and resources. The answer is the aggregation of certain IP blocks around particular core functionality, such as security, communication and multimedia. Further, these aggregated blocks are given their own interconnect (in the FPGA fabric) to tune the individual blocks to work better together, e.g., for better performance and low power. For example, Xilinx has been developing IP subsystems for the last year or so.

LPE: Are these IP subsystems coming from several different third party vendors or from just one IP company?
Wawrzyniak: At the moment, it’s mostly from the IDMs. This isn’t surprising, given the history of the semiconductor industry. Once these IDMs prove out a good idea, then everyone is doing it. This approach is at the heart of the trends that drive the semiconductor industry – evolution and integration. It doesn’t make sense to say that the forces of integration only apply to silicon. It’s going to apply to IP. That is what this is a reflection of: it is the next iteration, the next twist in the road for IP.

First Down On The 40nm Line

Tuesday, June 30th, 2009

The race to 40nm is over. Some chipmakers are already there, taping out designs and implementing IP that has already been qualified at the 40nm process.

When exactly volume production begins and when yields improve is a matter of conjecture. TSMC so far is the only major foundry actively using the 40nm process, which is a half-node beyond 45nm. But the Common Platform already has briefed analysts and customers on its 40nm process, even though most of its work is at 45nm, and GlobalFoundries, the AMD spinoff, has 40nm ready to go if there is customer demand.

A side benefit to consumers—and a big headache for design engineers—is that the power envelope continues to shrink with the line-widths. Low power is now standard in every design, which puts pressure on all IP vendors to create low-power versions at least concurrently with their newly qualified IP, if not first—or to make all versions low power. In the past, low-power versions typically trailed initial rollouts by 6 to 18 months.

And while that doesn’t mean all pieces of an SoC design need to be manufactured using a 40nm process—non-volatile memory, for example, is still at least a node behind—it does mean that research is well underway and on track for 32/28nm and that 40nm appears to be a relatively stable manufacturing process.

AMD, with its ATI line, and Nvidia both have 40nm versions of their latest graphics processors, which typically run at the leading edge of Moore’s Law because there is far greater potential for using more cores with existing software than many other chips. Video, in particular, is one of the easier applications to write for multiple cores because graphics rendering can be parsed into discrete units.

Low power everywhere

The power envelope in a more densely-packed piece of silicon has to be significantly lower, however. Signal integrity is a growing problem, according to design engineers, in part because of the density and the amount of current moving through the wires. Higher density also opens up real estate on a single chip for more functions that previously were on multiple chips or even multiple devices.

All of that points to lowering power wherever possible. And it means that to be successful in the market, low power design is a must. Virage Logic, which makes a variety of memory and logic IP, saw the trend clearly at 65nm when it incorporated low-power options into all of its IP instead of offering a separate low-power version.

“At 40 nanometers, if you want to create a new chip it has to be low power,” said Brani Buric, Virage’s executive vice president of marketing and sales. “We used to have high-density, high-speed and low-power versions of our IP. At 40nm, there are no separate low power products. There is a full set of low power features in both our high-density and high-speed IP, whether that’s memories or logic.”

AMD’s graphics processor group rolled out its first product at 40nm this spring. Stan Ossias, director of product management in AMD’s global/discrete graphics unit, said the bulk of the company’s work is still at 55nm and the company got a huge performance gain by re-architecting its 55nm chips.

“A lot of what we do has to do with predicting the readiness of the process at any time,” said Ossias. “We capitalize on the IP that’s available and the design we have to maximize our competitiveness. Last year, we had the choice of going to 40nm using the same architecture, but we thought we could do a better job of reaching our performance goals by redesigning the architecture. We didn’t feel the 40nm process was ready.”

That approach is one that is becoming more common among companies that typically hopped from one process node to the next in the past. The complexity of getting to the next node, along with the rising costs and uncertainties about manufacturability, yield and the IP needed in a design—not all IP available at 40nm has been proven in silicon yet—makes each new process node an increasing risk, and one that is no longer just an automatic decision.

At least part of the risk assessment also has to do with power consumption. Each new node also requires reducing the power consumption, which involves a litany of design tricks ranging from power gating for active power to utilizing power islands for static leakage, different gate structures and a variety of exotic insulation materials.

“Power is one of the fundamental areas we think about with technology evolution,” said Ossias. “Every time we shrink the process, we have to put more and more effort into decreasing power. That involves not just the individual device, but how that device interoperates with other devices. It’s a big consideration.”

40 vs. 45nm

Even moving from 45nm to 40nm is raising some questions. The foundry business is extremely competitive and having the next process used to be a competitive advantage, but so far only TSMC is actively pushing 40nm. The foundry told analysts that it opted for 40nm instead of 45nm because the process could be tuned better for device performance.

Joanne Itow, managing director of manufacturing at Semico Research, said the number of half nodes is exploding. She said that gives both foundries and companies a chance to firm up the processes and move more gradually to the next full node. The Common Platform, for example, is working on 28nm, which is the half node between 32nm and 22nm.

GlobalFoundries, the AMD spinoff, will work with customers on a specific implementation at 40nm or refine its bulk 45nm process, according to spokesman Jon Carvill. But he said the next step under development is a 32nm and 28nm bulk CMOS process.

Still, now that the foundries have reached the node and are working on the next one, the question remains of just how many chipmakers will move to the next half node and how quickly. There is a lot of conjecture now that the pieces are falling into place for 40nm production, but so far there are no definitive answers.