Posts Tagged ‘Qualcomm’


Power Becomes Bigger Issue In Stacked Die

Thursday, May 10th, 2012

By Ed Sperling
Concern over getting the heat out of stacked die is well defined, even if the current raft of existing and proposed solutions ranges from ineffective to exotic and expensive. What is less well understood is how to plan for and manage power inside of stacked die.

While power and heat frequently go hand in hand—where there is heat there is almost always power dissipation—they can be very different from a design standpoint. Each can be affected by the other, and each needs to be modeled as part of a holistic design, but system power budgets may be too high to be acceptable and still low enough not to cause thermal issues. Nevertheless, the number of power issues that can result from stacking die can be far greater simply because there are more possible permutations, and so far there is little information about how to solve this.

The reasons for the dearth of knowledge in this area stem partly from the fact that some of these devices are just now being built—the best knowledge about design always comes from experience and history—and partly because power can vary greatly from one system to the next, from one user to another, and sometimes from one chip to another even within the same design. In a stacked die, all of these come into play in the same package, often with unexpected results. That makes it difficult to model power accurately enough up front, and equally difficult to deal with as the design progresses.

“We’re seeing some complex power management schemes emerging,” said Mike Gianfagna, vice president of marketing at Atrenta. “The problem is that if you have an error, you automatically generate incorrect power management circuitry. The opportunity is to enhance much more complex verification schemes to deal with this.”

He noted that many large chipmakers have their own homegrown version of power modeling, but it will take time—and standards—before there is a systematic way of dealing with it.

What needs to be addressed where
There are several distinct points where power needs to be addressed in a design. The first is at the architectural level, where modeling will be inaccurate. But it can be accurate enough to get an idea of which IP, including processor cores, to choose, which memory, various interconnect schemes, and I/O preferences. Each of those has a different effect on power, and together they have a cumulative effect.

“As you go down in the design flow you refine the power models and the software,” said Ghislain Kaiser, CEO of Docea Power. “But your accuracy depends heavily on the IP. For some IP, if you have an error of more than 20%, it will impact decisions later on. For IP that is small and not power-hungry, an error of that size may not cause any problems. But you do have to think about the global impact of the power, especially in a stacked die.”
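
Kaiser's point about error size can be sketched with simple arithmetic: the same relative error matters much more on a power-hungry block than on a small one. The block names, power figures and the uniform 20% error below are all hypothetical, chosen only to illustrate the effect on a system-level budget.

```python
# Hypothetical block-level estimates: name -> (power in mW, relative model error).
# Every name and number here is an illustrative assumption, not measured data.
blocks = {
    "cpu_cluster": (800.0, 0.20),
    "dram_interface": (300.0, 0.20),
    "audio_codec": (10.0, 0.20),
}


def worst_case_error_mw(estimates):
    """Return each block's worst-case absolute error in mW."""
    return {name: power * err for name, (power, err) in estimates.items()}


def budget_range(estimates):
    """Return (low, nominal, high) total power if all blocks err in the same direction."""
    nominal = sum(power for power, _ in estimates.values())
    spread = sum(power * err for power, err in estimates.values())
    return nominal - spread, nominal, nominal + spread


errors = worst_case_error_mw(blocks)
low, nominal, high = budget_range(blocks)

# Rank blocks by how much their model error can move the system total.
for name, err_mw in sorted(errors.items(), key=lambda item: -item[1]):
    print(f"{name}: +/- {err_mw:.1f} mW ({err_mw / nominal:.1%} of total)")
print(f"total: {low:.0f} to {high:.0f} mW around {nominal:.0f} mW")
```

In this made-up example the large block dominates the uncertainty while the small block barely registers, which is why modeling effort is better spent on the power-hungry IP first.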

While this is complicated enough in a planar SoC, it becomes even more complex in a stacked die because not all of the pieces are necessarily built at the same time. In addition, IP blocks and even entire subsystems can interact in unforeseen ways, sometimes decreasing power consumption as with Wide I/O, and at other times consuming more power than anticipated because of unexpected proximity and other physical effects such as increased temperature.

“There are lots of things going on,” said Andrew Yang, president of Apache Design. “The voltage is fluctuating, so you’ve got on-chip voltage regulators to stabilize the power supply and back biasing to further reduce leakage power. At 20nm, reliability is becoming a key driver. Electromigration and electrostatic discharge are now mandatory for robust volume manufacturing. And we’re not just dealing with IR drop. In a platform solution, IR drop is one small item. You have to consider a full-chip power model.”

No simple answers
In addition to understanding power throughout the flow, Apache has been a strong advocate of understanding power over time, a necessary perspective that further complicates the design process with a fourth dimension. Power can be affected by a number of factors over time—even small increments of time from one die to the next.

“Die-to-die interactions are a form of variability,” said Riko Radojcic, director of design for silicon initiatives at Qualcomm. “You need a timer that understands thermal gradients and the impact of thermal gradients on time. There is a gap there right now.”

How to solve this problem is a big unknown, particularly when it comes to power. Power models and power numbers are dynamic rather than fixed, needing adjustments and tweaking throughout the life of the design and even beyond.

The general consensus is that none of this will ever be automated beyond a certain point, and no single tool will handle all of the power issues—even in 2D designs. In 2.5D and 3D, the number of options and possible interactions increases non-linearly. As the industry progresses into the next dimension, one of the biggest challenges will just be grasping all of the possibilities—and all of the subsequent effects that go along with those options.

Processor Subject To Change

Thursday, February 9th, 2012

By Ann Steffora Mutschler
With power complexity driving sophisticated management techniques, SoC design engineering teams are turning to a new class of customizable processor architectures from ARM, CEVA, NVIDIA, Qualcomm and Tensilica and others to take advantage of the best in power saving techniques.

While these new architectures are novel approaches, the concepts are not especially new, particularly in mobile applications.

“If you look at what mobile processors have been doing, I would argue they’ve been doing some sort of big.LITTLE for a long time,” explained Nandan Nayampally, director of applications processor marketing in the processor division of ARM. “By that I mean you have microcontrollers taking charge when the big application processor is not working, or you’ve got video engines being separate from the main application processing. The compartmentalization of the activities around the chip have been always a focus for mobile because you will save power any which way you can. That’s a given.”

ARM has observed that what has changed recently is that the main OS needs to run more and more of the time, because little apps such as Twitter feeds and Facebook updates are constantly running on top of it.

As fun and/or useful as they are, these apps are killing battery life.

Nayampally explained the big.LITTLE architecture with an example. “Let’s say I’m doing an MP3 playback in the old days. You’d say, ‘I’m running on the big core, I kick off the task to a little core and then turn off the big processor because the MP3 can run just fine on a microcontroller type device. It’s all on the same die. Then suddenly you get a call and it wakes up the big processor and it takes over again. But when you offloaded that MP3 in the olden days—six months or so ago—you actually could have a separate task that wasn’t really run by the OS. Now there are so many more things and services that people are coming to expect that you can’t have them done specifically for targets that are different from the application processor itself and they run on top of the OS. Now you are telling the chip, ‘No, I won’t do these specialized things as separate things for very power-efficient sub-components, they have to be done by the main processor.’ But the main processor also has to become very schizophrenic in the level of performance it requires for the main tasks as well as what it needs for the little tasks.”

Source: ARM

What makes big.LITTLE interesting is that the processors are fully coherent so the software engineer doesn’t have to worry as much about maintaining every piece of data. The coherency in hardware takes care of that. That makes the software development quicker and can actually improve performance and battery life.

big.LITTLE is designed as an extension of dynamic voltage and frequency scaling (DVFS). It supports multiple use models, the simplest of which is meant to be effectively transparent to the OS, Nayampally continued. “The power management software always speaks to a driver that is the right power and performance needed based on what is required. If, for example, you had today’s processor and it was using the lowest performance level it could while doing Twitter update, it just can’t be as efficient as something that was designed to be a fifth smaller or something like that. What if your DVFS had a next step that is more efficient and you can work there for a while? From an OS standpoint, or an application standpoint, it doesn’t matter. It’s just another step in your DVFS. Underneath it what happens is the driver now can do the kick-off to switch the operations from the big core to the little core or from the little core to the big core or cluster in fact.”
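
The "just another step in your DVFS" idea can be sketched as a single table of operating points that spans both clusters. This is a minimal illustration, not ARM's driver. The `OperatingPoint` fields, the table values and the `Driver` class are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    cluster: str    # "little" or "big"
    freq_mhz: int
    capacity: int   # relative performance, arbitrary units
    power_mw: int   # illustrative active power


# Hypothetical table, sorted by capacity. The little-core entries simply
# appear as extra low-end DVFS steps below the big core's lowest point.
OPP_TABLE = [
    OperatingPoint("little", 300, 10, 30),
    OperatingPoint("little", 600, 20, 70),
    OperatingPoint("big", 500, 40, 250),
    OperatingPoint("big", 1000, 80, 600),
    OperatingPoint("big", 1500, 120, 1100),
]


def select_opp(demand, table=OPP_TABLE):
    """Pick the lowest-capacity operating point that still meets demand."""
    for opp in table:
        if opp.capacity >= demand:
            return opp
    return table[-1]  # saturate at the highest step


class Driver:
    """Hides cluster switches behind an ordinary DVFS-style request."""

    def __init__(self):
        self.current = OPP_TABLE[0]
        self.migrations = 0

    def request(self, demand):
        target = select_opp(demand)
        if target.cluster != self.current.cluster:
            self.migrations += 1  # work moves between big and little here
        self.current = target
        return target


driver = Driver()
for demand in (5, 18, 60, 12):
    opp = driver.request(demand)
    print(demand, opp.cluster, opp.freq_mhz)
print("cluster migrations:", driver.migrations)
```

The caller only ever asks for a performance level. Whether that level lands on the big or the little cluster is decided underneath, which is the transparency Nayampally describes.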

NVIDIA’s Tegra 3 employs variable symmetric multiprocessing (vSMP), while Qualcomm uses asynchronous symmetrical multiprocessing (aSMP). Both rest on the same principles that govern ARM’s big.LITTLE architecture.

NVIDIA’s Tegra 3, launched last November, is a quad-core mobile processor for smartphones and tablets, currently shipping in the ASUS Transformer Android tablet. A company spokesman explained that behind Tegra 3’s power efficiency is a fifth, lower-power “companion” CPU core that goes with the four CPU cores and is specifically targeted at battery savings. Tegra 3’s architecture allows it to provide the best combination of performance and battery life by switching between the four main CPU cores and the fifth core for less demanding tasks and active standby mode.

For CEVA, which licenses DSPs, programmability has always been the name of the game, according to Eran Briman, the company’s vice president of marketing. About seven years ago it became apparent that general-purpose DSPs were not going to make the cut for next-generation designs, particularly in 40nm communications designs. In one of its newest offerings, the CEVA-XC DSP software-defined radio architecture, users can run the complete receive and transmit channels entirely in software, except for a few hardware engines that simply don’t make sense in software, he said. To accompany this, CEVA recently released a software development kit that includes advanced power management. Looking ahead, Briman believes there will be fully programmable communications units on SoCs.

CEVA isn’t the only company in the DSP space to see this trend.

“Many baseband designs particularly, when they are operating on complex protocols and care a lot about energy have moved to neither completely hard-wired—because that would be too fragile or intolerant of inevitable corrections and improvements—nor completely general-purpose, because a general-purpose processor is generally much less energy-efficient than something that is more specific to the task at hand,” observed Chris Rowen, CTO at Tensilica. “Especially in low-power baseband processing, we’re seeing more and more optimization of programmable engines to do this, where the baseband subsystem might include 6 or 8 or 10 different cores that are programmable. Some of these still may be fairly general-purpose, because you may say in this function though there’s a wide variety of different tasks that I need to do on the data and it is more energy efficient for me to have one that is shared among these different, diverse functions than to have one piece of hardware for every single function. That would make it too big. Having a programmable solution can in some cases also make it a smaller solution. In general, small is good for energy.”

Tensilica offers a range of DSP cores. It also allows users to build their own customized dataplane processors.

Power Bits: Closer But Better

Thursday, August 11th, 2011

By Ed Sperling
Near-field communications has attracted an enormous amount of attention of late. Banks are allowing customers to scan checks from their cell phones and, increasingly, smart phones are being used for everything from airport access to paying bills at the grocery store. They’re even being used for parking lot fees.

But there’s a hidden price tag behind all of this: the amount of energy needed to drive these transactions. That has led to a slew of development efforts behind the scenes, with the latest entry showing up this week from Texas Instruments. TI is bragging that its new NFC transceiver uses half as much power as the competitors’ products, running only as high as 120 milliamperes in full-power mode and less than 1 microampere when it’s powered down.

TI clearly is not alone in this game of leapfrog. Companies such as Broadcom, Qualcomm and ST are racing in the same direction. But what’s interesting is that energy consumption has become the marketing focus rather than performance. The transceiver comes with eight possible power modes, which is particularly important in matching user preferences with the end devices.

EVM Board. Source: Texas Instruments

Dueling Power Formats

Thursday, August 11th, 2011

By Ed Sperling
Multiple power formats and increasingly complex SoCs don’t sound like a winning formula. So just how bad have things become? Low-Power Engineering asked Sorin Dobre, senior staff engineer at Qualcomm, for a real-world assessment of the situation.

LPE: There are three power formats—CPF, UPF and IEEE 1801. How big a problem is this for Qualcomm?
Dobre: Actually we have CPF 1.1 and 2.0 and UPF 1.0 and UPF 2.0, which is also called IEEE 1801. In the CPF area there is a unified approach. There is backward compatibility and consistent methodology for power intent and verification. In the UPF area there are many inconsistencies between UPF 1.0 and IEEE 1801. Instead of parallel formats there is a level of confusion about what can be used in a consistent fashion.

LPE: Which standard should everyone be using?
Dobre: On the UPF side, IEEE 1801 is the standard that can be used by the EDA community to develop all their tools, and it can be used consistently by designers. Keeping both UPF 1.0 and IEEE 1801 is creating problems. There should be a consistent power intent and a road map for tool development. That’s well defined in the CPF camp, and there is a path toward convergence to have one format in 2012.

LPE: What holds you back from just choosing one or the other?
Dobre: Today we can use IEEE 1801, but not all the EDA tools vendors support that standard. Even if you have a very good standard, you are limited by the set of tools.

LPE: And you have every vendor’s tools?
Dobre: Yes.

LPE: So as a result you have to support both?
Dobre: That’s correct.

LPE: How do you get around this problem?
Dobre: There is no easy way. It adds a lot of complexity. We have to define the power intent files and maintain the power intent files. Having a consistent, fully automated power intent flow for verification is difficult. You need translation from CPF to UPF and back to CPF. And it is not a straightforward translation process.

LPE: What do you want to keep from CPF?
Dobre: We believe we can take the hierarchical methodology, which is very well defined in CPF, and have that ported to the next version of UPF. We also need the automated macro-modeling generation, and it all has to be put into an agnostic format. If we have all of that we can have a very strong and powerful single standard.

LPE: Is there any movement in that direction?
Dobre: There is a big effort by Si2.

LPE: If you’re designing a chip now, what do you have to do now?
Dobre: Today, we define power intent at the system level for the product. We define the power domains and power modes. It’s a high-level description. From this high-level description you create UPF and CPF files and provide a reference to the block-level owners. They have to generate CPF and UPF files. Then there’s an integration process where the lower-level files are integrated with the higher-level power format files. You create a complete power intent file for the whole SoC, which can be used for functional verification and design implementation.

LPE: Do you need separate models for UPF and CPF?
Dobre: With power intent files you are dealing with multiple voltages. There are two ways to describe the models. One is multi-models. The other is libraries. If I have 10 voltages, I can use 10 libraries. It’s a much bigger port to have 10 libraries. It’s much more power efficient to have one model rather than a family of libraries.

LPE: If you have an engineering change order, do you have to adjust both formats as you go forward?
Dobre: That’s a very important aspect of power intent work. You need to define an integration methodology. You need custom integration capability for every change. What is required is to have an integration methodology. All changes at the system level or block level must be integrated with the top-level power intent file. As long as you don’t impact the macro model it won’t impact the top-level power intent file.

LPE: Does having more than one power format increase the risk of something going wrong?
Dobre: Yes. I get more power intent information in CPF than in UPF. Without it you are missing some power intent information. But there also is a lack of consistency. You need to pay attention if you do a check in UPF, you need to make sure you do all the checks with CPF so you have complete information.

LPE: What are the real-world ramifications of this?
Dobre: The biggest impact is in the overall design flow. You don’t have all the verification capabilities.

TSVs Ease Heat In 3D ICs

Thursday, August 11th, 2011

By Ann Steffora Mutschler
In the evolving discussion of 3D ICs and through silicon via (TSV) technology, a key issue engineering teams are facing today is how to reduce the thermal coefficients between substrates in a stacked die. Simply put, what is the best way to get the heat out of the 2.5 or 3D IC?

The answer, of course, is anything but simple.

“In a 3D system, the heat hierarchy is through the package, through the heat sink, through the bumps, through the adhesives and through the stacked tier layers. If the wafers were thicker, the heat would have a chance to flow out horizontally or vertically and dissipate a bit. As wafers are thinned more and more, the heat dissipation becomes an issue, and if you stack them it gets worse. Within them, thermal flux increases, your peak temperature within the stack increases and since your wafers are thinner, you also have a higher temperature gradient across the thinned wafer,” explained Sesh Ramaswami, senior director at Applied Materials.

The first step in managing thermal issues today is accurately calculating the power and leakage in the design, with leakage now one of the most dominant issues to be addressed.

“The low-dielectric constant materials are actually causing more of a problem because they’ve got lower thermal conductivity. That in itself is not helping with thermal gradients on a die,” said Pete McCrorie, product marketing director at Cadence. Thermal analysis technology employs IR drop power rail calculation to generate instance power in the design, so for each of the instances in a design that power is based on activity information. That is added to the leakage power, which is calculated, and it all gets thrown into a solution where the thermal conductivity of the substrate, interconnects, ball bonds and package is extracted and is solved for thermal at that point.

“When you’re stacking—and it doesn’t matter if it’s a 3D package where the package is a system-in-package or MCM, or a number of die on top of each other—with 3D IC it’s the same thing,” observed Navraj Nandra, senior director of marketing for analog/mixed-signal IP at Synopsys. “The difference between a 3D package is that you do all the signaling off the die, so you have wire bonds going off the substrate and connecting into the other substrate. With a 3D IC, you have on-chip signaling, so you’ve got the communication between the various substrates happening through TSVs, for example.”

So when it comes to all the heat issues, everyone dealing with stacking is basically having the same problems. To combat the heat, engineers try to put the active devices—those devices or transistors doing all of the switching—at the top of the substrate hierarchy where a lot of heat is generated. Looking down further into the stack, IC developers are trying to increase the effective heat transfer coefficient, meaning they are looking for technologies that can shift the heat out very quickly from that substrate.

In terms of the packaging aspect of stacking 3D ICs, one approach is to develop a substrate with better thermal conductivity, but that’s cost-prohibitive for most developers. Intel has done research in this area and released a paper in 2007 with its suggestions for managing thermal issues (http://download.intel.com/design/iio/applnots/31505102.pdf). Other approaches leverage familiar techniques that use copper, such as including a copper spreader or copper underfill between the substrates to dissipate the heat.

Then, when it comes to 3D ICs, TSV technology not only gives area, bandwidth and latency benefits, but it also can be used to manage all the thermal problems on a 3D IC by using the signal and power TSVs to dissipate the heat.

Synopsys’ experiments in this area involve taking the concept and introducing more vias or a via array—specifically, a TSV array—to reduce the temperature, Nandra said. “The idea is that if you can understand where the hot spots are going to occur in your design and somehow predict that in your EDA methodology, you can then insert a bunch of TSVs and those will help in the thermal dissipation. The question is how much do they help? We are seeing that they certainly help to reduce the peak temperature and the overall temperature gradients, but they don’t get the minimum temperature down any further.”

Not just for cell phones
Engineers tend to think of low-power designs as wireless-type solutions, but today everyone, including high-performance server developers, is looking at lowering power because of the associated heat and the high energy costs, Cadence’s McCrorie pointed out. “You think about the heating problem of the chip, but then when you try and dissipate the heat from the board and from the environment, that all gets very expensive if you’re generating too much heat.”

Meanwhile, Applied’s Ramaswami believes 3D stacking and TSVs may well pan out in the datacenter. With servers containing multicore CPUs that require lots and lots of data, if DRAMs are used in traditional DIMM approaches there will be several DIMMs on the board. “These DIMMs have a latency factor. They are a little slower because of the wire length, and so on. For the server market the blade would probably have these memory cubes on them [referring to Micron’s Hybrid Memory Cube as an example] with the following advantages: You get more memory per unit volume, which is much closer to the CPU, and because of that the latency goes down and your power dissipation goes down.”

Also, teams building chipsets for datacenters are asking for much lower power consumption for high-speed interfaces than what was typically thought of in the past. “They want something like a 10 or 12Gbps interface, but the power consumption numbers that they are asking for are very similar to what we would have thought in the past would be required by someone in the consumer industry,” Synopsys’ Nandra said. And they don’t always push for the higher performance, opting instead for lower power. “They say, ‘We want the 10Gbps interface, but what we really want is not for you to show us that you can take that 10 to 15Gbps or whatever. We want you to show us your roadmap to get the power consumption of that interface down.’ That’s a different requirement from customers.”

Modeling first
Of course, knowing where to put the TSVs is critical. From the design perspective, the first step is to model the problem with three pieces of information needed: current, resistance and voltage. “Once you’ve modeled the problem then you can think about some kind of automated EDA implementation. The way to think about this is going back to some very basic analogies. In order to do your thermal simulation, you can think of the heat source like a current source, because the current is directly related to heat, and you have an equation to do that,” Nandra said.

Thermal resistance is the second piece, and it is analogous to electrical resistance, so it can be modeled as a resistor. Add to this the temperature gradient, which is analogous to the electric potential, or voltage. With these three parameters, a thermal model can be built and then run on any numerical simulator, such as SPICE, to do the thermal-equivalent simulation, he explained.
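
The analogy can be sketched for a two-die stack: each die's dissipated power acts as a current source, each thermal path is a resistor, and temperature plays the role of node voltage. The function below solves the two resulting nodal equations directly. The topology, the resistance values and the power figures are illustrative assumptions, not numbers from Synopsys.

```python
def solve_two_die_stack(p1_w, p2_w, r1_amb, r2_amb, r12, t_amb_c):
    """Steady-state die temperatures for a two-node thermal network.

    p1_w, p2_w : heat dissipated in die 1 and die 2 (W), the "current sources"
    r1_amb     : thermal resistance from die 1 to ambient (K/W)
    r2_amb     : thermal resistance from die 2 to ambient (K/W)
    r12        : thermal resistance between the two dies (K/W)
    t_amb_c    : ambient temperature (deg C), the "ground" reference
    """
    g1, g2, g12 = 1.0 / r1_amb, 1.0 / r2_amb, 1.0 / r12
    # Nodal equations, with rise = T - T_ambient:
    #   (g1 + g12) * rise1 - g12 * rise2 = p1
    #   -g12 * rise1 + (g2 + g12) * rise2 = p2
    a, b = g1 + g12, -g12
    c, d = -g12, g2 + g12
    det = a * d - b * c
    rise1 = (p1_w * d - b * p2_w) / det
    rise2 = (a * p2_w - c * p1_w) / det
    return t_amb_c + rise1, t_amb_c + rise2


# Hypothetical stack: a 2 W logic die under a 0.5 W memory die.
P_LOGIC, P_MEM = 2.0, 0.5
R_LOGIC_AMB, R_MEM_AMB, R_BETWEEN = 10.0, 5.0, 2.0
T_AMBIENT = 25.0

t_logic, t_mem = solve_two_die_stack(
    P_LOGIC, P_MEM, R_LOGIC_AMB, R_MEM_AMB, R_BETWEEN, T_AMBIENT
)
print(f"logic die: {t_logic:.1f} C, memory die: {t_mem:.1f} C")
```

A real stack has far more nodes, which is where a matrix-based simulator takes over, but the equations keep exactly this form.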

“Then the question is, where do you implement it in the design flow. Fundamentally, the whole idea of 3D ICs is to solve the wiring crisis of interconnects. You’ve tried to solve the RC delay problem by having a vertical interconnect system. But now the next question is, in that EDA model, where do you implement the simulation of the vertical stack of heat?”

Grossly simplified, this is not too terribly complex a problem in terms of modeling, he admitted. “The complexity is the fact that when you’ve got millions of TSVs in your network that you’re simulating, it all goes into this big matrix in SPICE or whatever numerical simulator you’re using and that becomes a challenge.” As such, there is work to be done with simulators for thermal analysis. More knowledge or heuristics need to be built in to help designers determine where to focus the model of the simulation.

Nandra believes that’s the most interesting aspect of this. “You can take this simple model and apply it blindly to the whole 3D IC, and that’s going to make the matrix that you’re running on the simulator huge. Or you can intelligently think, with some heuristics, ‘Okay, I’ve got 15 areas where this thing is going to get hot, and that’s where I want to apply the model.’ The reason you want to apply the model is because once you understand that the hotspot is occurring in this region, that’s where you want to put your TSV array to reduce the heat in that area. Then you need to know how many vias to put there because there is an area impact. You can do your insertion in that region. In the end, it becomes like a synthesis problem in a way. It’s almost like the way Design Compiler started because there was a way that Design Compiler initially worked out how to size gates based on logical effort and then, over years, the scientists that were working on it figured out some heuristics to make the optimization of that logical effort tuned to what you were trying to synthesize. I think that’s the way that this technology in terms of EDA automation is going to go,” he concluded.
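
The "how many vias" question can be sketched under strong simplifying assumptions: each TSV is treated as an ideal copper cylinder conducting heat in parallel with the existing path, and lateral spreading is ignored. The geometry, hotspot power, temperature target and via pitch are all hypothetical. The copper conductivity is an approximate textbook value.

```python
import math

COPPER_K = 400.0  # W/(m*K), approximate bulk thermal conductivity of copper


def tsv_thermal_resistance(length_m, diameter_m, k=COPPER_K):
    """Thermal resistance of one cylindrical via: R = L / (k * A)."""
    area = math.pi * (diameter_m / 2) ** 2
    return length_m / (k * area)


def vias_needed(power_w, max_rise_k, r_path, r_tsv):
    """Smallest via count that keeps the hotspot rise at or below max_rise_k.

    The existing path (r_path) and the vias are modeled as parallel resistors.
    """
    g_required = power_w / max_rise_k       # total conductance needed (W/K)
    g_missing = g_required - 1.0 / r_path   # shortfall of the existing path
    if g_missing <= 0:
        return 0
    return math.ceil(g_missing * r_tsv)


def hotspot_rise(power_w, r_path, r_tsv, n_vias):
    """Temperature rise with n_vias added in parallel with the existing path."""
    return power_w / (1.0 / r_path + n_vias / r_tsv)


# Hypothetical hotspot: 0.5 W through a 40 K/W path, target rise of 10 K.
r_tsv = tsv_thermal_resistance(length_m=50e-6, diameter_m=5e-6)
n = vias_needed(power_w=0.5, max_rise_k=10.0, r_path=40.0, r_tsv=r_tsv)

PITCH_M = 10e-6  # assumed via pitch, used only to estimate the area impact
area_um2 = n * (PITCH_M * 1e6) ** 2

print(f"one via: {r_tsv:.0f} K/W, vias needed: {n}, area: {area_um2:.0f} um^2")
print(f"rise with vias: {hotspot_rise(0.5, 40.0, r_tsv, n):.2f} K")
```

Running this sizing only on the handful of regions a heuristic flags as hot, rather than on the whole die, is what keeps both the via count and the simulation matrix small.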

Additional resources:
Examples of TSV technology in production designs today

1. TSV with interposer – Xilinx
2. TSV through memory – Elpida
3. TSVs through a logic chip – Qualcomm (no product out yet, but discussed at many conferences)

Power Bits: July 15

Friday, July 15th, 2011

By Ed Sperling

Portability Play
Synopsys is working with GlobalFoundries to deliver interoperable process design kits later this year at advanced nodes. iPDKs are particularly important for companies looking to use designs for multiple markets. A general-purpose process, for example, is critical for markets looking for higher performance, while low-power processes are important in applications where battery life is a differentiating factor.

The problem is that many of these designs are not always portable between processes, despite the fact that power and performance are considered tradeoffs in most designs.

The companies said the 65nm G and enhanced low power (LPe) kits are available now. Versions for other process nodes will be available later this year.

Stacked die demo
Imec, the Belgian research organization, demonstrated a stacked die with DRAM on logic at Semicon this week. The chip is a prototype of what is expected to become a mainstream approach as companies seek to re-use existing analog IP and subsystems from previous nodes, as well as to add flexibility and speed to complex designs.

What’s particularly interesting about the prototype is Imec’s description of how heat can be removed from the die. Logic generates a fair amount of heat, but the DRAM die acts as a conductor for some of that heat. Qualcomm observed similar effects in its own stacking research last year.

Imec’s work was done in conjunction with GlobalFoundries, Intel, Micron, Samsung, TSMC, Fujitsu, Sony, Amkor and Qualcomm.

Widening The Channels

Thursday, March 17th, 2011

By Ed Sperling
Wide I/O—both as a specific memory standard and as a generic approach for on-chip networking—has been looked at for the past couple of chip generations as a way of improving SoC performance. Increasingly, it also is being used as a key strategy for reducing energy consumption.

Wide I/O refers to a number of different approaches in on-chip networking, ranging from through-silicon vias in 3D stacks to interposers in 2.5D stacking. It also refers to a standard for memory communication being developed by JEDEC, as well as more dedicated channels for signals. In all cases, the added benefit is a reduction in power needed to drive a signal.

The tradeoff typically is between serial I/O and wide I/O. Serial I/O is simpler to design and works over longer distances, but it is far less power efficient. Wide I/O, in contrast, is higher bandwidth with big power savings—Samsung, for example, estimates its new 1Gbit mobile DRAM based on a 50nm process consumes 87% less power—but the technology is also more complicated to use. And in most cases, it’s also more costly.

Eliminating complexity while adding more
The concept of bigger pipes has always been a last resort for chip architects. It’s well known that shortening the distance a signal travels and reducing the resistance can drive down the amount of power needed for a signal. Reducing the overhead of serialization and deserialization can cut the power even further. But ironically, it has taken an explosion in SoC complexity for chip architects to seriously consider simplifying signal paths.

“We always go through this pendulum swing of what’s the optimal physical implementation vs. what’s the simplest way to do it even if it costs more silicon,” said Steve Roddy, vice president of marketing and business development at Tensilica. “So you can do things with 128 wires using wide I/O, or you can do it with a lot fewer using serialized I/O. The serialized I/O requires deserialization, which costs power. With wide I/O, which could simply be a lot of wires connected to the next block, you can lower the frequency and widen the channel.”
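
The tradeoff can be sketched with the standard first-order model for switching power, activity × C × V² × f per wire, plus a per-bit cost for serialization and deserialization. The wire counts, capacitances, voltage and SerDes energy below are hypothetical and were picked only to show where the savings come from.

```python
def per_wire_rate_hz(bandwidth_bps, wires):
    """Signaling rate each wire must sustain to deliver the total bandwidth."""
    return bandwidth_bps / wires


def link_power_w(bandwidth_bps, wires, cap_f, vdd, activity=0.5,
                 serdes_j_per_bit=0.0):
    """First-order link power: wire switching plus SerDes overhead.

    cap_f            : capacitance per wire (F)
    vdd              : signaling voltage (V)
    activity         : switching activity factor (0..1)
    serdes_j_per_bit : serialize/deserialize energy per bit (J), 0 if unused
    """
    rate = per_wire_rate_hz(bandwidth_bps, wires)
    switching = wires * activity * cap_f * vdd ** 2 * rate
    overhead = serdes_j_per_bit * bandwidth_bps
    return switching + overhead


BANDWIDTH = 12.8e9  # bits per second, same target for both links

# Hypothetical serial link: few fast wires, long trace, SerDes at both ends.
serial_w = link_power_w(BANDWIDTH, wires=4, cap_f=2e-12, vdd=1.2,
                        serdes_j_per_bit=2e-12)

# Hypothetical wide link: many slow wires, short connection, no SerDes.
wide_w = link_power_w(BANDWIDTH, wires=512, cap_f=0.2e-12, vdd=1.2)

print(f"serial: {per_wire_rate_hz(BANDWIDTH, 4) / 1e9:.1f} Gb/s per wire, "
      f"{serial_w * 1e3:.1f} mW")
print(f"wide:   {per_wire_rate_hz(BANDWIDTH, 512) / 1e6:.0f} Mb/s per wire, "
      f"{wide_w * 1e3:.1f} mW")
```

In this model, widening the channel alone does not change the switching term, because wire count and per-wire rate cancel. The savings come from the shorter, lower-capacitance connection and from dropping the SerDes overhead. A lower per-wire frequency may also permit a lower signaling voltage, which the sketch does not model.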

In a 2.5D stack, that extra silicon is easier to justify because it doesn’t add significantly to the overall footprint. In a system-in-package or package-on-package it may involve an interposer, which is another piece of silicon. It also can involve a through-silicon via in a 3D stack, which is wide enough to avoid any congestion.

“With a TSV you don’t need a standard I/O, which includes the I/O circuitry, patch and bond wire,” said Tom Quan, deputy director of design methodology and service marketing at TSMC. “So you get rid of all the I/O circuitry, and you have the same area, power and current. That results in a tremendous power savings. You also get a big boost in timing. And if you use an interposer, that’s silicon so it has the same resistance and capacitance of a standard IC. You can simulate them both together and get a predictable result.”

Eliminating bottlenecks
There are many good reasons for using wider pipes. One is that multicore and multiprocessor implementations generally are inefficient. The whole idea behind these implementations was that software would be able to run across multiple cores and multiple processors. That didn’t work out as planned, due to the inability to parallelize many applications, but cores were still designed to share the same memory.

That’s inefficient from a performance and a power perspective. Cores that are not in use should be turned off or powered way down. Moreover, when they need to connect to memory it should be along a clear path with as little congestion as possible and over the shortest distance possible.

“For some years to come we’re going to be seeing systems in package with interposers as the ideal solution,” said Joe Sawicki, vice president and general manager of Mentor Graphics’ Design-To-Silicon Division. “That will involve a lot faster interconnects, mostly to memory, and potentially to homogeneous logic. One of our customers was developing a digital chip and needed Bluetooth. They did it in a digital IC and they also did it in a SiP. The SiP destroyed the SoC in performance and power.”

But the question also is at what cost. While 2.5D approaches are relatively straightforward, the interposer does add some cost and the TSV can add even more.

“We are pursuing full 3D and so are most of the people in the phone business, primarily because of the form factor and cost,” said Riko Radojcic, director of engineering at Qualcomm. “If you think about an interposer, you’re adding another die to the cost. Conceptually an interposer is an elegant solution and it works fine for someone who sells a product for $100. If you throw in a $1 interposer it’s no big deal. But if you’re making a $5 die and you throw in an interposer, it is a big deal.”
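
Radojcic's arithmetic can be made explicit with a small sketch. It uses only the dollar figures from the quote and assumes the same $1 interposer in both cases; the function name is made up for illustration.

```python
def added_cost_pct(added_cost: float, base_cost: float) -> float:
    """Added cost expressed as a percentage of the base cost."""
    if base_cost <= 0:
        raise ValueError("base_cost must be positive")
    return 100.0 * added_cost / base_cost


INTERPOSER_COST = 1.0  # dollars, the figure from the quote

premium_product = added_cost_pct(INTERPOSER_COST, 100.0)  # 1.0 percent
low_cost_die = added_cost_pct(INTERPOSER_COST, 5.0)       # 20.0 percent

print(f"$100 product: +{premium_product}%  |  $5 die: +{low_cost_die}%")
```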

The same is true of through-silicon vias, although the ultimate advantages of this approach are expected to become more significant over time.

“TSV is expensive but is a good way of meeting the form factor,” said Navraj Nandra, senior director of marketing for Synopsys’ DesignWare Analog and MSIP Solutions Group. “You need to optimize for both low power and low cost packages. It’s like buying a $50k hybrid car that gives you 32mpg compared to a $22k 1.2L, 3-cylinder petrol engine car that gives you 50mpg. Everyone is excited about the hybrid car.”

Optimizing the signals
Behind the hubbub about the I/O technology is another often overlooked piece of the equation. The move to multiple processors and multiple cores was done largely as a knee-jerk response to the end of classical scaling at 90nm. What has happened since then is a much more measured response to how to use these cores more effectively, which requires much more granularity in the design process. Not all cores need to be on an ARM or MIPS processor, for example, and not all of them need to be in one place on an SoC—or even on the same die of a SiP or 3D stack.

In addition, not all of those cores or processors need to be the same size or run the same software.

“In addition to wide I/O there are dedicated point-to-point connections to relieve the system congestion,” said Tensilica’s Roddy. “Those can include general purpose memory and processor. When the system architect knows beforehand what’s going to be in the system they can add those connections up front. So you may have a video decoder and buffer and an audio decoder using separate memories, and those may change depending on whether they end up in a cell phone or a set-top box. But there are some things you don’t know at design time and you need the ability to generate system-specific interconnects, which is what’s being sold by companies like Arteris and Sonics.”

And finally, there is a simple mathematical principle behind the push to reduce power.

“The longer a signal has to travel, the more power it takes,” said Qi Wang, technical marketing group marketing director for Cadence Solutions Marketing. “A lot of issues in design come down to power. If you put the memory outside the chip, that takes power. If you want to speed up performance, that takes power.”
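
One common first-order way to see this is the textbook dynamic power relation for CMOS switching, P = a * C * V^2 * f, combined with the assumption that wire capacitance grows roughly in proportion to wire length. The sketch below is illustrative only: the capacitance per millimeter, supply voltage, frequency and activity factor are assumed values, not figures from Wang or Cadence.

```python
def wire_switching_power_w(
    length_mm: float,
    cap_per_mm_f: float,
    vdd_v: float,
    freq_hz: float,
    activity: float = 0.5,
) -> float:
    """First-order dynamic power of one wire: activity * C * Vdd^2 * f.

    Capacitance is modeled as proportional to length, which is a
    simplification that ignores drivers, repeaters and resistance.
    """
    capacitance_f = cap_per_mm_f * length_mm
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Assumed, illustrative parameters.
CAP_PER_MM_F = 0.2e-12   # 0.2 pF per mm of wire
VDD_V = 1.0              # supply voltage in volts
FREQ_HZ = 1e9            # 1 GHz toggle rate

short_w = wire_switching_power_w(1.0, CAP_PER_MM_F, VDD_V, FREQ_HZ)
long_w = wire_switching_power_w(10.0, CAP_PER_MM_F, VDD_V, FREQ_HZ)

print(f"1 mm wire: {short_w:.2e} W, 10 mm wire: {long_w:.2e} W")
```

Under this model a wire ten times as long burns about ten times the switching power at the same voltage and frequency, and lowering the frequency cuts power in the same proportion.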

Bigger pipes over shorter distances can help solve that problem, and it’s a solution that is beginning to garner much more attention these days.

Power Management Trumps Battery Technology

Thursday, February 10th, 2011

By Ann Steffora Mutschler
The lithium-ion battery has the power to ruin someone’s day, especially when it dies and cannot be charged, not to mention occasional thermal runaways that literally cause explosions. For a technology that is about 30 years old, and approaching its limits, it is mind-boggling that the best brains on the planet haven’t come up with a technologically superior alternative.

But alas, they have not—at least from a realistic cost perspective. While the world waits, electrical engineers and system architects are leveraging power management techniques in the design of chips to do everything they can to make their system as efficient as possible to gain a bit more battery life.

As such, the power management IC industry is healthy. Marijana Vukicevic, principal analyst for power management at IHS iSuppli, predicts the global market for power management semiconductors will reach $36.2 billion in revenue this year, 13.9% higher than $31.8 billion last year. However, she expects growth to slow this year to bring revenues back in line after tremendous growth last year.

This move toward more efficient battery-powered devices is driving continuing demand for power management ICs as consumers everywhere look for longer battery life in their mobile devices—with new design trends likely to emerge in power management ICs, Vukicevic said.

Growth in alternate energy markets, including solar, wind, the electrification of vehicles and the smart grid also will drive growth, along with a move toward greater integration in power ICs. Those suppliers with the technology to further integrate their chips will reap the greatest benefits in terms of revenue.

“There are trends that are pulling several power management ICs into one, which is understandable for some devices,” she said. “Then there are times when some of these functionalities are coming from power management ICs that had already been integrated because the OEMs are looking into having more flexibility or they really want to add a feature that no one else does.”

Understandably, for tablets and iPads, there is a lot of integration because space is restricted and form factor is an issue.

When it comes to techniques, there is always a different issue, she noted. “Whether it is the battery charging, whether people are trying to figure out the best way to charge the battery without damaging the battery because you have to keep the current flowing—there are different techniques that people are applying. Some of these techniques are IP-protected, some of them are not. You do have companies looking into that, of course, because it is a big issue.”

Discrete chip vs. embedded block
In designs today, power management is implemented as discrete devices in a system or as part of the SoC, with the exact breakdown difficult to nail down.

“We have seen both types that are on-chip power management functionality available. There’s a lot of off-chip. It depends if you have a single SoC system. Then the power management has to reside typically on the SoC itself. That would be one reason to put it on the chip,” said Krishna Balachandran, director of product marketing for low-power verification products at Synopsys.

The job of the system architect is challenging. First, before even deciding how to implement the power management, the architect has to determine how to proceed. “There are a plethora of techniques that are available and the architect has to figure out which ones he/she wants and how to partition the design into a number of power domains. So that’s an architectural problem. Even before that, the architects decide how much they want to control power at the system level vs. using software vs. the hardware chip level. That’s a tradeoff they make early on,” he said. “Usually, whatever they are not able to achieve from a system perspective and from a software control perspective, that’s when they start putting the onus on the chip design itself. The system architect goes through a process, figures this out, and then says to the chip design team, ‘You’ve got to deliver me this power for this particular chip.’”

Looking at the smart phone market, there is also a trend toward integration of power management. “There are still functionalities that are outside that one particular IC, but there is a trend of integration because otherwise they would end up with a bunch of different ICs that take up space. Major power functions are integrated with the supporting ones that are not,” Vukicevic said.

The design approach depends on the OEM. “Between OEMs, there is a differentiation on how they do things. For example, sometimes you’ll find an OEM who buys a digital baseband from Qualcomm, for example, and then they buy an analog baseband from Qualcomm, and either power is integrated in that analog baseband or Qualcomm supplies an IC with power management,” she noted.

On the other hand, some OEMs pick and choose how the power is going to be managed. And finally, there is a top layer where software manages power consumption within the device, a layer of firmware and software that is above the hardware. Once you plug in to all of the hardware inside, there is a layer of firmware and a layer of software that is closest to the user, where the user actually can influence power usage, Vukicevic said.

Boosting 4G, Low Power

Monday, February 7th, 2011

By Ed Sperling
LTE Advanced, a significantly faster 4G standard, is gaining steam quickly as the race among the top smart phone providers shifts from features to performance. Unlike existing versions of LTE, LTE Advanced can handle peak data rates of up to 1 gigabit/second, which will be extremely useful in streaming video and online games, as well as better search response time.

With the same phones now available from multiple vendors—Verizon began selling the iPhone in the United States in 2011, with only minor differences, for example—the race is on now to provide faster loading of Web pages and better streaming.

The LTE Advanced standard was submitted to the Telecommunication Standardization Sector in Switzerland in late 2009. It is still awaiting finalization, but that hasn’t slowed the race for an early advantage in this space. Tensilica’s introduction today of five new LTE Advanced DSP cores is a case in point. The cores will be used in baseband systems developed for the 28nm LP process technology.

Building these cores is a first major step toward rolling out products for LTE Advanced this year. Like many of the latest processor cores, DSPs are becoming incredibly complex.

“There are two ways that DSPs fail,” said Chris Rowen, Tensilica’s CTO. “The first is in numerical computation. You add two 16-bit numbers and get a 17-bit number, but if you don’t keep track of that extra bit it can cause a problem. The industry has developed guard bits for this. The second way is the design of the pipeline. DSPs have exposed pipelines, and it’s left to programmers to determine the spacing. The problem is if you execute the instruction and interrupt you only get the result of the first instruction. You need everything to be fully interlocked.”
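
The first failure mode Rowen describes can be shown with a minimal sketch that emulates fixed-width two's-complement arithmetic. The 8 guard bits and the operand values are assumptions picked for illustration; they do not describe any particular Tensilica core.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Wrap an integer into a signed two's-complement field of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


# Two operands that each fit in a signed 16-bit word (maximum 32767).
a = 30_000
b = 30_000

# Without guard bits the 17-bit sum wraps around in a 16-bit register.
no_guard = wrap_signed(a + b, 16)         # -5536

# With 8 assumed guard bits the accumulator is 24 bits wide and keeps the sum.
GUARD_BITS = 8
with_guard = wrap_signed(a + b, 16 + GUARD_BITS)  # 60000

print(f"16-bit result: {no_guard}, 24-bit accumulator: {with_guard}")
```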

The new cores also offer a 4X improvement in flops per watt, with the high-end core capable of running 128 multiply-accumulates, the standard operation for DSP cores. That gain in performance for the same amount of power is particularly important because device power budgets are either fixed or shrinking.

The Tensilica introductions are merely one facet of a relentless global ecosystem push toward faster performance. Huawei showed off download speeds of up to 600 megabits/second 12 months ago at the 2010 GSMA Mobile World Congress. At that point, the company boasted speeds that were about 20 times faster than 3G networks.

And Qualcomm said this year that it has begun to evaluate LTE Advanced features. Qualcomm believes the next significant performance leap will come from leveraging topology, bringing the network closer to the user, and adding many low-power nodes. It said LTE Advanced will “improve capacity, coverage and ensure user fairness.”

Power Bits: Jan. 7

Friday, January 7th, 2011

By Ed Sperling
Microsoft will develop its next version of Windows for AMD, Intel and ARM SoCs. The emphasis is on SoCs, and the focus of SoCs has been on two things: power and the reusability of existing and commercially developed IP.

This is an interesting challenge for Microsoft, as well as for Intel, AMD, and ARM’s slew of partners. A general-purpose OS takes a lot more code to create—and it takes a lot more power to use—than a real-time operating system or an embedded version. The result is greatly reduced battery life and more time with a plug in the wall. Even open-source Linux has the same problem, which is why companies such as Mentor Graphics offer a slimmed down embedded version.

The big question for architects of these SoCs will be one of priorities. What takes precedence? Is it processing power? Is it performance? Or is it segregation of more efficient code for individual cores?

Microsoft’s announcement doesn’t address these kinds of issues. Intel has said next to nothing other than a canned statement from Douglas Davis, VP and GM of the tablet group: “…what is so exciting is how our two companies will be able to match a tailored, low-powered operating system with future generations of our popular Intel Atom processors…”

And comments from ARM, and ARM customers Nvidia, Qualcomm and TI have been no more enlightening. This isn’t a simple problem to solve while maintaining backward compatibility with bloated applications developed when power efficiency was far less critical than ease of use and connectivity. And it’s not one that anyone is likely to be talking about for at least a year. But when they finally do start talking, it will be very interesting to hear how these companies will position Windows and its very large code base.
