Posts Tagged ‘Qualcomm’

Processor Subject To Change

Thursday, February 9th, 2012

By Ann Steffora Mutschler
With power complexity driving sophisticated management techniques, SoC design engineering teams are turning to a new class of customizable processor architectures from ARM, CEVA, NVIDIA, Qualcomm, Tensilica and others to take advantage of the best power-saving techniques.

While these new architectures are novel approaches, the concepts are not especially new, particularly in mobile applications.

“If you look at what mobile processors have been doing, I would argue they’ve been doing some sort of big.LITTLE for a long time,” explained Nandan Nayampally, director of applications processor marketing in the processor division of ARM. “By that I mean you have microcontrollers taking charge when the big application processor is not working, or you’ve got video engines being separate from the main application processing. The compartmentalization of the activities around the chip have been always a focus for mobile because you will save power any which way you can. That’s a given.”

ARM has observed that what has changed recently is that the main OS needs to run more and more of the time, because small apps such as Twitter feeds and Facebook updates are constantly running on top of it.

As fun and/or useful as they are, these apps are killing battery life.

Nayampally explained the big.LITTLE architecture with an example. “Let’s say I’m doing an MP3 playback in the old days. You’d say, ‘I’m running on the big core, I kick off the task to a little core and then turn off the big processor because the MP3 can run just fine on a microcontroller type device.’ It’s all on the same die. Then suddenly you get a call and it wakes up the big processor and it takes over again. But when you offloaded that MP3 in the olden days, six months or so ago, you actually could have a separate task that wasn’t really run by the OS. Now there are so many more things and services that people are coming to expect that you can’t have them done specifically for targets that are different from the application processor itself and they run on top of the OS. Now you are telling the chip, ‘No, I won’t do these specialized things as separate things for very power-efficient sub-components, they have to be done by the main processor.’ But the main processor also has to become very schizophrenic in the level of performance it requires for the main tasks as well as what it needs for the little tasks.”

Source: ARM

What makes big.LITTLE interesting is that the processors are fully coherent so the software engineer doesn’t have to worry as much about maintaining every piece of data. The coherency in hardware takes care of that. That makes the software development quicker and can actually improve performance and battery life.

Designed to be an extension of DVFS, there are multiple use models in which big.LITTLE can work, with the simplest use meant to be effectively transparent to the OS, Nayampally continued. “The power management software always speaks to a driver that is the right power and performance needed based on what is required. If, for example, you had today’s processor and it was using the lowest performance level it could while doing Twitter update, it just can’t be as efficient as something that was designed to be a fifth smaller or something like that. What if your DVFS had a next step that is more efficient and you can work there for a while? From an OS standpoint, or an application standpoint, it doesn’t matter. It’s just another step in your DVFS. Underneath it what happens is the driver now can do the kick-off to switch the operations from the big core to the little core or from the little core to the big core or cluster in fact.”
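The "just another step in your DVFS" idea can be sketched in a few lines of Python. This is only an illustration, not ARM's driver: the operating-point table, the performance and power numbers, and the function names are all hypothetical. The power manager asks for a performance level, and the driver picks the cheapest operating point that satisfies it, switching clusters underneath when needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    name: str         # label for the DVFS step (hypothetical)
    cluster: str      # "little" or "big"
    performance: int  # relative throughput, arbitrary units (made up)
    power_mw: int     # relative power cost, arbitrary units (made up)


# Illustrative table: the little-core points simply extend the DVFS
# ladder below the lowest big-core point.
OPERATING_POINTS = [
    OperatingPoint("little-low", "little", 10, 20),
    OperatingPoint("little-high", "little", 25, 60),
    OperatingPoint("big-low", "big", 40, 200),
    OperatingPoint("big-mid", "big", 70, 450),
    OperatingPoint("big-high", "big", 100, 900),
]


def select_operating_point(required_performance, points=OPERATING_POINTS):
    """Return the lowest-power point that meets the requested performance."""
    candidates = [p for p in points if p.performance >= required_performance]
    if not candidates:
        # Nothing is fast enough: fall back to the fastest point available.
        return max(points, key=lambda p: p.performance)
    return min(candidates, key=lambda p: p.power_mw)


def needs_cluster_switch(current, target):
    """True when moving between points means migrating big <-> little."""
    return current.cluster != target.cluster


if __name__ == "__main__":
    current = select_operating_point(80)  # heavy workload
    target = select_operating_point(5)    # background Twitter-style update
    print(current.name, "->", target.name,
          "| cluster switch:", needs_cluster_switch(current, target))
```

From the caller's side nothing changes: it still requests a performance level. Only the driver knows that the lowest steps live on a different core.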

NVIDIA’s Tegra 3 employs variable symmetric multiprocessing (vSMP), while Qualcomm uses asynchronous symmetrical multiprocessing (aSMP). Both rest on the same principles that govern ARM’s big.LITTLE architecture.

NVIDIA’s Tegra 3, launched last November, is a quad-core mobile processor for smartphones and tablets, currently shipping in the ASUS Transformer Android tablet. A company spokesman explained that behind Tegra 3’s power efficiency is a fifth, lower-power “companion” CPU core that goes with the four CPU cores and is specifically targeted at battery savings. According to NVIDIA, the architecture balances performance and battery life by switching between the four main CPU cores, used for demanding work, and the fifth core, used for less demanding tasks and active standby mode.

For CEVA, which licenses DSPs, programmability has always been the name of the game, according to Eran Briman, the company’s vice president of marketing. About seven years ago it became apparent that general-purpose DSPs were not going to make the cut for next-generation designs, particularly 40nm communications designs. In one of its newest offerings, the CEVA-XC DSP software-defined radio architecture, users can run the complete receive and transmit channels entirely in software, except for a very few hardware engines whose functions simply don’t make sense in software, he said. To accompany this, CEVA recently released a software development kit that includes advanced power management. Looking ahead, Briman believes there will be fully programmable communications units on SoCs.

CEVA isn’t the only company in the DSP space to see this trend.

“Many baseband designs particularly, when they are operating on complex protocols and care a lot about energy have moved to neither completely hard-wired—because that would be too fragile or intolerant of inevitable corrections and improvements—nor completely general-purpose, because a general-purpose processor is generally much less energy-efficient than something that is more specific to the task at hand,” observed Chris Rowen, CTO at Tensilica. “Especially in low-power baseband processing, we’re seeing more and more optimization of programmable engines to do this, where the baseband subsystem might include 6 or 8 or 10 different cores that are programmable. Some of these still may be fairly general-purpose, because you may say in this function though there’s a wide variety of different tasks that I need to do on the data and it is more energy efficient for me to have one that is shared among these different, diverse functions than to have one piece of hardware for every single function. That would make it too big. Having a programmable solution can in some cases also make it a smaller solution. In general, small is good for energy.”

Tensilica offers a range of DSP cores. It also allows users to build their own customized dataplane processors.

Power Bits: Closer But Better

Thursday, August 11th, 2011

By Ed Sperling
Near-field communications has attracted an enormous amount of attention of late. Banks are allowing customers to scan checks from their cell phones and, increasingly, smart phones are being used for everything from airport access to paying bills at the grocery store. They’re even being used for parking lot fees.

But there’s a hidden price tag behind all of this: the amount of energy needed to drive these transactions. That has led to a slew of development efforts behind the scenes, with the latest entry showing up this week from Texas Instruments. TI is bragging that its new NFC transceiver uses half as much power as competitors’ products, drawing no more than 120 milliamperes in full-power mode and less than 1 microampere when it’s powered down.

TI clearly is not alone in this game of leapfrog. Companies such as Broadcom, Qualcomm and ST are racing in the same direction. But what’s interesting is that energy consumption has become the marketing focus rather than performance. The transceiver comes with eight possible power modes, which is particularly important in matching user preferences with the end devices.

EVM Board. Source: Texas Instruments

Dueling Power Formats

Thursday, August 11th, 2011

By Ed Sperling
Multiple power formats and increasingly complex SoCs don’t sound like a winning formula. So just how bad have things become? Low-Power Engineering asked Sorin Dobre, senior staff engineer at Qualcomm, for a real-world assessment of the situation.

LPE: There are three power formats—CPF, UPF and IEEE 1801. How big a problem is this for Qualcomm?
Dobre: Actually we have CPF 1.1 and 2.0 and UPF 1.0 and UPF 2.0, which is also called IEEE 1801. In the CPF area there is a unified approach. There is backward compatibility and consistent methodology for power intent and verification. In the UPF area there are many inconsistencies between UPF 1.0 and IEEE 1801. Instead of parallel formats there is a level of confusion about what can be used in a consistent fashion.

LPE: Which standard should everyone be using?
Dobre: On the UPF side, IEEE 1801 is the standard that can be used by the EDA community to develop all their tools, and it can be used consistently by designers. Keeping both UPF 1.0 and IEEE 1801 is creating problems. There should be a consistent power intent and a road map for tool development. That’s well defined in the CPF camp, and there is a path toward convergence to have one format in 2012.

LPE: What holds you back from just choosing one or the other?
Dobre: Today we can use IEEE 1801, but not all the EDA tools vendors support that standard. Even if you have a very good standard, you are limited by the set of tools.

LPE: And you have every vendor’s tools?
Dobre: Yes.

LPE: So as a result you have to support both?
Dobre: That’s correct.

LPE: How do you get around this problem?
Dobre: There is no easy way. It adds a lot of complexity. We have to define the power intent files and maintain the power intent files. Having a consistent, fully automated power intent flow for verification is difficult. You need translation from CPF to UPF and back to CPF. And it is not a straightforward translation process.

LPE: What do you want to keep from CPF?
Dobre: We believe we can take the hierarchical methodology, which is very well defined in CPF, and have that ported to the next version of UPF. We also need the automated macro-modeling generation, and it all has to be put into an agnostic format. If we have all of that we can have a very strong and powerful single standard.

LPE: Is there any movement in that direction?
Dobre: There is a big effort by Si2.

LPE: If you’re designing a chip now, what do you have to do now?
Dobre: Today, we define power intent at the system level for the product. We define the power domains and power modes. It’s a high-level description. From this high-level description you create UPF and CPF files and provide a reference to the block-level owners. They have to generate CPF and UPF files. Then there’s an integration process where the lower-level files are integrated with the higher-level power format files. You create a complete power intent file for the whole SoC, which can be used for functional verification and design implementation.

LPE: Do you need separate models for UPF and CPF?
Dobre: With power intent files you are dealing with multiple voltages. There are two ways to describe the models. One is multi-models. The other is libraries. If I have 10 voltages, I can use 10 libraries. It’s a much bigger port to have 10 libraries. It’s much more power efficient to have one model rather than a family of libraries.

LPE: If you have an engineering change order, do you have to adjust both formats as you go forward?
Dobre: That’s a very important aspect of power intent work. You need to define an integration methodology. You need custom integration capability for every change. What is required is to have an integration methodology. All changes at the system level or block level must be integrated with the top-level power intent file. As long as you don’t impact the macro model it won’t impact the top-level power intent file.

LPE: Does having more than one power format increase the risk of something going wrong?
Dobre: Yes. I get more power intent information in CPF than in UPF. Without it you are missing some power intent information. But there also is a lack of consistency. You need to pay attention if you do a check in UPF, you need to make sure you do all the checks with CPF so you have complete information.

LPE: What are the real-world ramifications of this?
Dobre: The biggest impact is in the overall design flow. You don’t have all the verification capabilities.

TSVs Ease Heat In 3D ICs

Thursday, August 11th, 2011

By Ann Steffora Mutschler
In the evolving discussion of 3D ICs and through-silicon via (TSV) technology, a key issue engineering teams are facing today is how to reduce the thermal coefficients between substrates in a stacked die. Simply put, what is the best way to get the heat out of a 2.5D or 3D IC?

The answer, of course, is anything but simple.

“In a 3D system, the heat hierarchy is through the package, through the heat sink, through the bumps, through the adhesives and through the stacked tier layers. If the wafers were thicker, the heat would have a chance to flow out horizontally or vertically and dissipate a bit. As wafers are thinned more and more, the heat dissipation becomes an issue, and if you stack them it gets worse. Within them, thermal flux increases, your peak temperature within the stack increases and since your wafers are thinner, you also have a higher temperature gradient across the thinned wafer,” explained Sesh Ramaswami, senior director at Applied Materials.

The first step in managing thermal issues today is accurately calculating the power and leakage in the design, with leakage now one of the most dominant issues to be addressed.

“The low-dielectric constant materials are actually causing more of a problem because they’ve got lower thermal conductivity. That in itself is not helping with thermal gradients on a die,” said Pete McCrorie, product marketing director at Cadence. Thermal analysis technology uses IR-drop power rail calculation to generate the power of each instance in the design, based on activity information. That figure is added to the calculated leakage power, and the total is fed into a solver, where the thermal conductivity of the substrate, interconnects, ball bonds and package is extracted and the thermal solution is computed.

“When you’re stacking—and it doesn’t matter if it’s a 3D package where the package is a system-in-package or MCM, or a number of die on top of each other—with 3D IC it’s the same thing,” observed Navraj Nandra, senior director of marketing for analog/mixed-signal IP at Synopsys. “The difference between a 3D package is that you do all the signaling off the die, so you have wire bonds going off the substrate and connecting into the other substrate. With a 3D IC, you have on-chip signaling, so you’ve got the communication between the various substrates happening through TSVs, for example.”

So when it comes to all the heat issues, everyone dealing with stacking is basically having the same problems. To combat the heat, engineers try to put the active devices—those devices or transistors doing all of the switching—at the top of the substrate hierarchy where a lot of heat is generated. Looking down further into the stack, IC developers are trying to increase the effective heat transfer coefficient, meaning they are looking for technologies that can shift the heat out very quickly from that substrate.

In terms of the packaging aspect of stacking 3D ICs, one approach is to develop a substrate with better thermal conductivity, but that’s cost-prohibitive for most developers. Intel has done research in this area and released a paper in 2007 with its suggestions for managing thermal issues (http://download.intel.com/design/iio/applnots/31505102.pdf). Other approaches leverage familiar techniques that use copper, such as including a copper spreader or copper underfill between the substrates to dissipate the heat.

Then, when it comes to 3D ICs, TSV technology not only gives area, bandwidth and latency benefits, but it also can be used to manage all the thermal problems on a 3D IC by using the signal and power TSVs to dissipate the heat.

Synopsys’ experiments in this area involve taking the concept and introducing more vias or a via array—specifically, a TSV array—to reduce the temperature, Nandra said. “The idea is that if you can understand where the hot spots are going to occur in your design and somehow predict that in your EDA methodology, you can then insert a bunch of TSVs and those will help in the thermal dissipation. The question is how much do they help? We are seeing that they certainly help to reduce the peak temperature and the overall temperature gradients, but they don’t get the minimum temperature down any further.”

Not just for cell phones
Engineers tend to think of low-power designs as wireless-type solutions, but today everyone, including high-performance server developers, is looking at lowering power because of the associated heat and high energy costs, Cadence’s McCrorie pointed out. “You think about the heating problem of the chip, but then when you try and dissipate the heat from the board and from the environment, that all gets very expensive if you’re generating too much heat.”

Meanwhile, Applied’s Ramaswami believes 3D stacking and TSVs may well pan out in the datacenter. With servers containing multicore CPUs that require lots and lots of data, if DRAMs are used in traditional DIMM approaches there will be several DIMMs on the board. “These DIMMs have a latency factor. They are a little slower because of the wire length, and so on. For the server market the blade would probably have these memory cubes on them [referring to Micron’s Hybrid Memory Cube as an example] with the following advantages: You get more memory per unit volume, which is much closer to the CPU, and because of that the latency goes down and your power dissipation goes down.”

Also, teams building chipsets for datacenters are asking for much lower power consumption for high-speed interfaces than what was typically thought of in the past. “They want something like a 10 or 12Gbps interface, but the power consumption numbers that they are asking for are very similar to what we would have thought in the past would be required by someone in the consumer industry,” Synopsys’ Nandra said. And they don’t always push for the higher performance, opting instead for lower power. “They say, ‘We want the 10Gbps interface, but what we really want is not for you to show us that you can take that 10 to 15Gbps or whatever. We want you to show us your roadmap to get the power consumption of that interface down.’ That’s a different requirement from customers.”

Modeling first
Of course, knowing where to put the TSVs is critical. From the design perspective, the first step is to model the problem with three pieces of information needed: current, resistance and voltage. “Once you’ve modeled the problem then you can think about some kind of automated EDA implementation. The way to think about this is going back to some very basic analogies. In order to do your thermal simulation, you can think of the heat source like a current source, because the current is directly related to heat, and you have an equation to do that,” Nandra said.

Thermal resistance is the second element. It plays the role of a resistor, the counterpart of electrical resistance. Add to this the temperature gradient, which is analogous to the electric potential or voltage. A thermal model built on those three parameters can then be run on any numerical simulator, such as SPICE, to do the thermal-equivalent simulation, he explained.
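The analogy maps directly onto Ohm's law: heat flow (watts) stands in for current, thermal resistance (K/W) for electrical resistance, and temperature difference for voltage, so ΔT = P × R. A minimal Python sketch of a one-dimensional path from a heat source to ambient follows. The function names and the numbers in the usage section are illustrative assumptions, not values from any real stack or tool.

```python
def layer_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """Thermal resistance of a uniform slab in K/W: R = t / (k * A)."""
    return thickness_m / (conductivity_w_per_mk * area_m2)


def stack_temperatures(power_w, ambient_c, resistances_k_per_w):
    """Node temperatures along a series thermal path.

    `resistances_k_per_w` is ordered from the heat source toward ambient.
    The same heat flow passes through every layer (like current through
    series resistors), so each node sits at
    ambient + power * (resistance remaining between that node and ambient).
    The returned list starts at the heat source and ends at ambient.
    """
    temps = []
    remaining = sum(resistances_k_per_w)
    for r in resistances_k_per_w:
        temps.append(ambient_c + power_w * remaining)
        remaining -= r
    temps.append(ambient_c)
    return temps


if __name__ == "__main__":
    # Made-up example: 2 W flowing through three layers to 25 C ambient.
    path = [1.0, 2.0, 0.5]  # K/W, source side first
    print(stack_temperatures(2.0, 25.0, path))
    # A 100 um slab, 1 cm^2 in area, with an assumed conductivity
    # of 150 W/(m*K):
    print(layer_resistance(100e-6, 150.0, 1e-4))
```

A real 3D IC needs a full network of lateral and vertical resistances solved as a matrix. The series path only shows how the three quantities relate.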

“Then the question is, where do you implement it in the design flow. Fundamentally, the whole idea of 3D ICs is to solve the wiring crisis of interconnects. You’ve tried to solve the RC delay problem by having a vertical interconnect system. But now the next question is, in that EDA model, where do you implement the simulation of the vertical stack of heat?”

Grossly simplified, this is not a terribly complex problem in terms of modeling, he admitted. “The complexity is the fact that when you’ve got millions of TSVs in your network that you’re simulating, it all goes into this big matrix in SPICE or whatever numerical simulator you’re using and that becomes a challenge.” As such, there is work to be done with simulators for thermal analysis. More knowledge or heuristics need to be built in to help designers determine where to focus the simulation model.

Nandra believes that’s the most interesting aspect of this. “You can take this simple model and apply it blindly to the whole 3D IC, and that’s going to make the matrix that you’re running on the simulator huge. Or you can intelligently think, with some heuristics, ‘Okay, I’ve got 15 areas where this thing is going to get hot, and that’s where I want to apply the model.’ The reason you want to apply the model is because once you understand that the hotspot is occurring in this region, that’s where you want to put your TSV array to reduce the heat in that area. Then you need to know how many vias to put there because there is an area impact. You can do your insertion in that region. In the end, it becomes like a synthesis problem in a way. It’s almost like the way Design Compiler started because there was a way that Design Compiler initially worked out how to size gates based on logical effort and then, over years, the scientists that were working on it figured out some heuristics to make the optimization of that logical effort tuned to what you were trying to synthesize. I think that’s the way that this technology in terms of EDA automation is going to go,” he concluded.
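The heuristic Nandra describes, modeling only the likely hotspots and then deciding how many vias each one needs, can be sketched as follows. This is a toy model under stated assumptions, not Synopsys' method. Each region is reduced to a single thermal resistance to ambient, each TSV is treated as an identical parallel thermal path, and the `Region` class, function names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    power_w: float          # heat generated in the region
    r_base_k_per_w: float   # thermal resistance to ambient without TSVs


def effective_resistance(r_base, r_tsv, n_tsvs):
    """Parallel combination of the base path and n identical TSV paths."""
    return 1.0 / (1.0 / r_base + n_tsvs / r_tsv)


def peak_temperature(region, ambient_c, r_tsv, n_tsvs):
    r_eff = effective_resistance(region.r_base_k_per_w, r_tsv, n_tsvs)
    return ambient_c + region.power_w * r_eff


def tsvs_needed(region, ambient_c, r_tsv, target_c, max_tsvs):
    """Smallest TSV count that meets the target, or None if the budget is too small."""
    for n in range(max_tsvs + 1):
        if peak_temperature(region, ambient_c, r_tsv, n) <= target_c:
            return n
    return None


def plan_tsv_arrays(regions, ambient_c, r_tsv, target_c, max_tsvs, top_k):
    """Size TSV arrays only for the top_k hottest regions (the heuristic step)."""
    hottest = sorted(
        regions,
        key=lambda r: peak_temperature(r, ambient_c, r_tsv, 0),
        reverse=True,
    )[:top_k]
    return {r.name: tsvs_needed(r, ambient_c, r_tsv, target_c, max_tsvs)
            for r in hottest}


if __name__ == "__main__":
    regions = [
        Region("cpu", 1.0, 100.0),
        Region("gpu", 0.5, 100.0),
        Region("io", 0.1, 100.0),
    ]
    print(plan_tsv_arrays(regions, ambient_c=25.0, r_tsv=400.0,
                          target_c=85.0, max_tsvs=16, top_k=2))
```

The `max_tsvs` cap stands in for the area impact mentioned in the quote: every via added to a region costs silicon, so the search stops at a budget rather than at an arbitrarily low temperature.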

Additional resources:
Examples of TSV technology in production designs today

1. TSV with interposer – Xilinx
2. TSV through memory – Elpida
3. TSVs through a logic chip – Qualcomm (no product out yet, but discussed at many conferences)

Power Bits: July 15

Friday, July 15th, 2011

By Ed Sperling

Portability Play
Synopsys is working with GlobalFoundries to deliver interoperable process design kits (iPDKs) later this year at advanced nodes. iPDKs are particularly important for companies looking to use designs for multiple markets. A general-purpose process, for example, is critical for markets looking for higher performance, while low-power processes are important in applications where battery life is a differentiating factor.

The problem is that many of these designs are not always portable between processes, despite the fact that power and performance are considered tradeoffs in most designs.

The companies said the 65nm G and enhanced low power (LPe) kits are available now. Versions for other process nodes will be available later this year.

Stacked die demo
Imec, the Belgian research organization, demonstrated a stacked die with DRAM on logic at Semicon this week. The chip is a prototype of what is expected to become a mainstream approach as companies seek to re-use existing analog IP and subsystems from previous nodes, as well as to add flexibility and speed to complex designs.

What’s particularly interesting about the prototype is Imec’s description of how heat can be removed from the die. Logic generates a fair amount of heat, but the DRAM die acts as a conductor for some of that heat. Qualcomm observed similar effects in its own stacking research last year.

Imec’s work was done in conjunction with GlobalFoundries, Intel, Micron, Samsung, TSMC, Fujitsu, Sony, Amkor and Qualcomm.

Widening The Channels

Thursday, March 17th, 2011

By Ed Sperling
Wide I/O—both as a specific memory standard and as a generic approach for on-chip networking—has been looked at for the past couple of chip generations as a way of improving SoC performance. Increasingly, it also is being used as a key strategy for reducing energy consumption.

Wide I/O refers to a number of different approaches in on-chip networking, ranging from through-silicon vias in 3D stacks to interposers in 2.5D stacking. It also refers to a standard for memory communication being developed by JEDEC, as well as more dedicated channels for signals. In all cases, the added benefit is a reduction in power needed to drive a signal.

The tradeoff typically is between serial I/O and wide I/O. Serial I/O is simpler to design and works over longer distances, but it is far less power efficient. Wide I/O, in contrast, is higher bandwidth with big power savings—Samsung, for example, estimates its new 1Gbit mobile DRAM based on a 50nm process consumes 87% less power—but the technology is also more complicated to use. And in most cases, it’s also more costly.

Eliminating complexity while adding more
The concept of bigger pipes has always been a last resort for chip architects. It’s well known that shortening the distance a signal travels and reducing the resistance can drive down the amount of power needed for a signal. Reducing the overhead of serialization and deserialization can cut the power even further. But ironically, it has taken an explosion in SoC complexity for chip architects to seriously consider simplifying signal paths.

“We always go through this pendulum swing of what’s the optimal physical implementation vs. what’s the simplest way to do it even if it costs more silicon,” said Steve Roddy, vice president of marketing and business development at Tensilica. “So you can do things with 128 wires using wide I/O, or you can do it with a lot fewer using serialized I/O. The serialized I/O requires deserialization, which costs power. With wide I/O, which could simply be a lot of wires connected to the next block, you can lower the frequency and widen the channel.”
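A first-order estimate makes the tradeoff concrete. The sketch below uses the standard dynamic-power relation P = α·C·V²·f per wire and treats each wire's bit rate as its switching frequency. The function name and every number are illustrative assumptions, not measurements of any real link.

```python
def link_power_w(bandwidth_bps, width, cap_per_wire_f, vdd,
                 activity=0.5, serdes_overhead_w=0.0):
    """First-order power of a link carrying `bandwidth_bps` over `width` wires.

    Each wire toggles at bandwidth / width, so for a fixed bandwidth the
    summed switching power does not depend on width. Any saving has to
    come from lower voltage, lower capacitance (shorter wires), or from
    dropping the serializer/deserializer overhead.
    """
    per_wire_rate_hz = bandwidth_bps / width
    switching_w = width * activity * cap_per_wire_f * vdd ** 2 * per_wire_rate_hz
    return switching_w + serdes_overhead_w


if __name__ == "__main__":
    bandwidth = 8e9  # 8 Gbit/s in both cases (made-up figure)

    # One fast serial lane: full voltage plus an assumed SerDes overhead.
    serial = link_power_w(bandwidth, width=1, cap_per_wire_f=1e-12,
                          vdd=1.0, serdes_overhead_w=0.010)

    # 128 slow wires: the lower per-wire rate is assumed to allow a lower
    # supply voltage, and there is no SerDes.
    wide = link_power_w(bandwidth, width=128, cap_per_wire_f=1e-12, vdd=0.6)

    print(f"serial: {serial * 1e3:.2f} mW, wide: {wide * 1e3:.2f} mW")
```

The model deliberately ignores leakage, termination and the area cost of the extra wires, which is where the added complexity and cost of wide I/O show up.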

In a 2.5D stack, that extra silicon is easier to justify because it doesn’t add significantly to the overall footprint. In a system-in-package or package-on-package it may involve an interposer, which is another piece of silicon. It also can involve a through-silicon via in a 3D stack, which is wide enough to avoid any congestion.

“With a TSV you don’t need a standard I/O, which includes the I/O circuitry, patch and bond wire,” said Tom Quan, deputy director of design methodology and service marketing at TSMC. “So you get rid of all the I/O circuitry, and you have the same area, power and current. That results in a tremendous power savings. You also get a big boost in timing. And if you use an interposer, that’s silicon so it has the same resistance and capacitance of a standard IC. You can simulate them both together and get a predictable result.”

Eliminating bottlenecks
There are many good reasons for using wider pipes. One is that multicore and multiprocessor implementations generally are inefficient. The whole idea behind these implementations was that software would be able to run across multiple cores and multiple processors. That didn’t work out as planned, due to the inability to parallelize many applications, but cores were still designed to share the same memory.

That’s inefficient from a performance and a power perspective. Cores that are not in use should be turned off or powered way down. Moreover, when they need to connect to memory it should be along a clear path with as little congestion as possible and over the shortest distance possible.

“For some years to come we’re going to be seeing systems in package with interposers as the ideal solution,” said Joe Sawicki, vice president and general manager of Mentor Graphics’ Design-To-Silicon Division. “That will involve a lot faster interconnects, mostly to memory, and potentially to homogeneous logic. One of our customers was developing a digital chip and needed Bluetooth. They did it in a digital IC and they also did it in a SiP. The SiP destroyed the SoC in performance and power.”

But the question also is at what cost. While 2.5D approaches are relatively straightforward, the interposer does add some cost and the TSV can add even more.

“We are pursuing full 3D and so are most of the people in the phone business, primarily because of the form factor and cost,” said Riko Radojcic, director of engineering at Qualcomm. “If you think about an interposer, you’re adding another die to the cost. Conceptually an interposer is an elegant solution and it works fine for someone who sells a product for $100. If you throw in a $1 interposer it’s no big deal. But if you’re making a $5 die and you throw in an interposer, it is a big deal.”

The same is true of through-silicon vias, although the ultimate advantages of this approach are expected to become more significant over time.

“TSV is expensive but is a good way of meeting the form factor,” said Navraj Nandra, senior director of marketing for Synopsys’ DesignWare Analog and MSIP Solutions Group. “You need to optimize for both low power and low cost packages. It’s like buying a $50k hybrid car that gives you 32mpg compared to a $22k 1.2L, 3-cylinder petrol engine car that gives you 50mpg. Everyone is excited about the hybrid car.”

Optimizing the signals
Behind the hubbub about the I/O technology is another often overlooked piece of the equation. The move to multiple processors and multiple cores was done largely as a knee-jerk response to the end of classical scaling at 90nm. What has happened since then is a much more measured response to how to use these cores more effectively, which requires much more granularity in the design process. Not all cores need to be on an ARM or MIPS processor, for example, and not all of them need to be in one place on an SoC—or even on the same die of a SiP or 3D stack.

In addition, not all of those cores or processors need to be the same size or run the same software.

“In addition to wide I/O there are dedicated point-to-point connections to relieve the system congestion,” said Tensilica’s Roddy. “Those can include general purpose memory and processor. When the system architect knows beforehand what’s going to be in the system they can add those connections up front. So you may have a video decoder and buffer and an audio decoder using separate memories, and those may change depending on whether they end up in a cell phone or a set-top box. But there are some things you don’t know at design time and you need the ability to generate system-specific interconnects, which is what’s being sold by companies like Arteris and Sonics.”

And finally, there is a simple mathematical principle behind the push to reduce power.

“The longer a signal has to travel, the more power it takes,” said Qi Wang, technical marketing group marketing director for Cadence Solutions Marketing. “A lot of issues in design come down to power. If you put the memory outside the chip, that takes power. If you want to speed up performance, that takes power.”

Bigger pipes over shorter distances can help solve that problem, and it’s a solution that is beginning to garner much more attention these days.

Power Management Trumps Battery Technology

Thursday, February 10th, 2011

By Ann Steffora Mutschler
The lithium-ion battery has the power to ruin someone’s day, especially when it dies and cannot be charged, not to mention the occasional thermal runaway that literally causes an explosion. For a technology that is about 30 years old and approaching its limits, it is mind-boggling that the best brains on the planet haven’t come up with a technologically superior alternative.

But alas, they have not—at least from a realistic cost perspective. While the world waits, electrical engineers and system architects are leveraging power management techniques in the design of chips to do everything they can to make their system as efficient as possible to gain a bit more battery life.

As such, the power management IC industry is healthy. Marijana Vukicevic, principal analyst for power management at IHS iSuppli, predicts the global market for power management semiconductors will reach $36.2 billion in revenue this year, 13.9% higher than $31.8 billion last year. However, she expects growth to slow this year to bring revenues back in line after tremendous growth last year.

This move toward more efficient battery-powered devices is driving continuing demand for power management ICs as consumers everywhere look for longer battery life in their mobile devices—with new design trends likely to emerge in power management ICs, Vukicevic said.

Growth in alternate energy markets, including solar, wind, the electrification of vehicles and the smart grid also will drive growth, along with a move toward greater integration in power ICs. Those suppliers with the technology to further integrate their chips will reap the greatest benefits in terms of revenue.

“There are trends that are pulling several power management ICs into one, which is understandable for some devices,” she said. “Then there are times when some of these functionalities are coming from power management ICs that had already been integrated because the OEMs are looking into having more flexibility or they really want to add a feature that no one else does.”

Understandably, for tablets and iPads, there is a lot of integration because space is restricted and form factor is an issue.

When it comes to techniques, there is always a different issue, she noted. “Whether it is the battery charging, whether people are trying to figure out the best way to charge the battery without damaging the battery because you have to keep the current flowing—there are different techniques that people are applying. Some of these techniques are IP-protected, some of them are not. You do have companies looking into that, of course, because it is a big issue.”

Discrete chip vs. embedded block
In designs today, power management is implemented as discrete devices in a system or as part of the SoC, with the exact breakdown difficult to nail down.

“We have seen both types that are on-chip power management functionality available. There’s a lot of off-chip. It depends if you have a single SoC system. Then the power management has to reside typically on the SoC itself. That would be one reason to put it on the chip,” said Krishna Balachandran, director of product marketing for low-power verification products at Synopsys.

The job of the system architect is challenging. First, before even deciding how to implement the power management, the architect has to determine how to proceed. “There are a plethora of techniques that are available and the architect has to figure out which ones he/she wants and how to partition the design into a number of power domains. So that’s an architectural problem. Even before that, the architects decide how much they want to control power at the system level vs. using software vs. the hardware chip level. That’s a tradeoff they make early on,” he said. “Usually, whatever they are not able to achieve from a system perspective and from a software control perspective, that’s when they start putting the onus on the chip design itself. The system architect goes through a process, figures this out, and then says to the chip design team, ‘You’ve got to deliver me this power for this particular chip.’”

Looking at the smart phone market, there is also a trend toward integration of power management. “There are still functionalities that are outside that one particular IC, but there is a trend of integration because otherwise they would end up with a bunch of different ICs that take up space. Major power functions are integrated with the supporting ones that are not,” Vukicevic said.

The design approach depends on the OEM. “Between OEMs, there is a differentiation on how they do things. For example, sometimes you’ll find an OEM who buys a digital baseband from Qualcomm, for example, and then they buy an analog baseband from Qualcomm, and either power is integrated in that analog baseband or Qualcomm supplies an IC with power management,” she noted.

On the other hand, some OEMs pick and choose how the power is going to be managed. And finally, there is a top layer where software manages power consumption within the device. Above all of the hardware inside sits a layer of firmware, and above that a layer of software that is closest to the user, where the user actually can influence power usage, Vukicevic said.

Boosting 4G, Low Power

Monday, February 7th, 2011

By Ed Sperling
LTE Advanced, a significantly faster 4G standard, is gaining steam quickly as the race among the top smart phone providers shifts from features to performance. Unlike existing versions of LTE, LTE Advanced can handle peak data rates of up to 1 gigabit/second, which will be extremely useful in streaming video and online games, as well as better search response time.

With the same phones now available from multiple vendors—Verizon began selling the iPhone in the United States in 2011, with only minor differences, for example—the race is on now to provide faster loading of Web pages and better streaming.

The LTE Advanced standard was submitted to the Telecommunication Standardization Sector in Switzerland in late 2009. It is still awaiting finalization, but that hasn’t slowed the race for an early advantage in this space. Tensilica’s introduction today of five new LTE Advanced DSP cores is a case in point. The cores will be used in baseband systems developed for the 28nm LP process technology.

Building these cores is a first major step toward rolling out products for LTE Advanced this year. Like many of the latest processor cores, DSPs are becoming incredibly complex.

“There are two ways that DSPs fail,” said Chris Rowen, Tensilica’s CTO. “The first is in numerical computation. You add two 16-bit numbers and get a 17-bit number, but if you don’t keep track of that extra bit it can cause a problem. The industry has developed guard bits for this. The second way is the design of the pipeline. DSPs have exposed pipelines, and it’s left to programmers to determine the spacing. The problem is if you execute the instruction and interrupt you only get the result of the first instruction. You need everything to be fully interlocked.”
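The first failure mode Rowen describes is easy to reproduce in a few lines. The following is a minimal sketch, not Tensilica's implementation: the helper names, the sample values and the choice of eight guard bits are assumptions made for illustration. It models a two's-complement accumulator of a given width and shows what happens to the sum of two 16-bit values with and without extra headroom.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a two's-complement register of `bits` width."""
    mask = (1 << bits) - 1
    value &= mask
    # Values with the top bit set represent negative numbers.
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def accumulate(samples, acc_bits):
    """Sum samples in an accumulator that wraps at `acc_bits` bits."""
    acc = 0
    for s in samples:
        acc = wrap_signed(acc + s, acc_bits)
    return acc


samples = [30000, 30000]      # each value fits in a signed 16-bit word
GUARD_BITS = 8                # assumed number of guard bits

no_guard = accumulate(samples, 16)
with_guard = accumulate(samples, 16 + GUARD_BITS)

print(no_guard)    # -5536: the 17th bit was lost and the sum wrapped
print(with_guard)  # 60000: the guard bits absorbed the overflow
```

The second failure mode, exposed pipelines and interrupts, is a hardware scheduling issue and is not modeled here.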

The new chips also offer a 4X improvement in flops per watt, with the high-end chip capable of running 128 multiply-accumulates, the standard measure for DSP cores. That gain in performance for the same amount of power is particularly important because device power budgets are either fixed or shrinking.

The Tensilica introductions are merely one facet of a relentless global ecosystem push toward faster performance. Huawei showed off download speeds of up to 600 megabits/second 12 months ago at the 2010 GSMA Mobile World Congress. At that point, the company boasted speeds that were about 20 times faster than 3G networks.

And Qualcomm said this year that it has begun to evaluate LTE Advanced features. Qualcomm believes the next significant performance leap will come from leveraging topology, bringing the network closer to the user, and adding many low-power nodes. It said LTE Advanced will “improve capacity, coverage and ensure user fairness.”

Power Bits: Jan. 7

Friday, January 7th, 2011

By Ed Sperling
Microsoft will develop its next version of Windows for AMD, Intel and ARM SoCs. The emphasis is on SoCs, and the focus of SoCs has been on two things: power and the reusability of existing and commercially developed IP.

This is an interesting challenge for Microsoft, as well as for Intel, AMD, and ARM’s slew of partners. A general-purpose OS takes a lot more code to create—and it takes a lot more power to use—than a real-time operating system or an embedded version. The result is greatly reduced battery life and more time with a plug in the wall. Even open-source Linux has the same problem, which is why companies such as Mentor Graphics offer a slimmed down embedded version.

The big question for architects of these SoCs will be one of priorities. What takes precedence? Is it processing power? Is it performance? Or is it segregation of more efficient code for individual cores?

Microsoft’s announcement doesn’t address these kinds of issues. Intel has said next to nothing other than a canned statement from Douglas Davis, VP and GM of the tablet group: “…what is so exciting is how our two companies will be able to match a tailored, low-powered operating system with future generations of our popular Intel Atom processors…”

And comments from ARM, and ARM customers Nvidia, Qualcomm and TI have been no more enlightening. This isn’t a simple problem to solve while maintaining backward compatibility with bloated applications developed when power efficiency was far less critical than ease of use and connectivity. And it’s not one that anyone is likely to be talking about for at least a year. But when they finally do start talking, it will be very interesting to hear how these companies will position Windows and its very large code base.

Making Too Much Noise

Thursday, October 7th, 2010

By Ed Sperling
For the better part of a decade talk about signal integrity in mixed-signal designs has been noticeably absent. That’s about to change.

Prior to the adoption of a 130nm process, many semiconductor companies actually went on record saying they were considering abandoning plans to ever put analog and digital on the same chip because the noise on digital would interrupt signals. The issue seemed to die down after that. But at 32nm it has shown up again, driven this time by a multitude of problems—some new, some old, and all of them made worse because there are fewer alternatives.

“The problem has always been there,” said Navraj Nandra, director of analog/mixed signal marketing at Synopsys. “But it has suddenly gotten worse because of the design interfaces at higher speeds. At 40nm and 28nm transistors switch faster. We also have 8 Gbps PCI Express [generation] 3 and DDR3. You have multiple lane configurations with PCI Express. Graphics cards use eight lanes. We’re connecting by 16s. But PCI can use 96 lanes.”

That’s a lot of noise on an advanced chip, where the wires are thinner and thinner at each node and components are packed together more tightly. If a single atom of deposition can change the functionality of a transistor, imagine what can happen when you start adding in parasitics and electromigration.

Power corrupts
If the only thing that changed in an SoC was the manufacturing process—doubling the number of transistors on a piece of silicon for every rev of Moore’s Law—then lowering the voltage would actually improve signal integrity. It isn’t that simple, however.

Adding in multiple voltage supplies increases the noise level on the chip. “At 28nm and beyond we’re seeing 800 millivolt supply voltages and threshold voltages of 300 millivolts,” said Aveek Sarkar, vice president of support at Apache Design Solutions. “Not only is the noise on the supply voltage increasing at each node, but the sensitivity is also magnified.”
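A rough way to see why sensitivity is magnified is to compare a fixed amount of supply noise against the gap between supply and threshold voltage. The sketch below uses the 800 millivolt supply and 300 millivolt threshold from Sarkar's quote. The 50 millivolt noise figure and the older-node values of 1,200 and 400 millivolts are assumptions chosen purely for illustration, and the ratio itself is a simplified indicator rather than a signoff metric.

```python
def noise_fraction(supply_mv, threshold_mv, noise_mv):
    """Supply noise as a fraction of the headroom above threshold."""
    headroom_mv = supply_mv - threshold_mv
    return noise_mv / headroom_mv


NOISE_MV = 50  # assumed supply noise, identical in both cases

# Figures quoted for 28nm and beyond.
advanced_node = noise_fraction(800, 300, NOISE_MV)

# Hypothetical older node, values assumed for comparison only.
older_node = noise_fraction(1200, 400, NOISE_MV)

print(f"advanced node: {advanced_node:.1%} of headroom")
print(f"older node:    {older_node:.1%} of headroom")
```

With these numbers the same 50 millivolts of noise eats 10% of the headroom at the advanced node versus 6.25% at the assumed older one, before accounting for any increase in the noise itself.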

The current is faster, the drive strength is higher, and voltage noise is higher. And the problem gets worse as you add in power gating and multiple power islands, all turning on and off unpredictably and intermittently in close proximity to each other.

It also gets worse when you bring the voltage regulators onto the chip from the PCB.

“That becomes a problem if you want multiple power domains on a chip,” said Qi Wang, technology marketing group director at Cadence. “The regulator is analog and noise becomes a problem. You’ve got big digital areas that generate noise. That can be a big issue, especially for the voltage regulator. People are now overdesigning chips and that’s creating more of a problem as more and more analog is put on the die.”

What’s in the package?
At least part of what will have to change in many designs is the package, which frequently is an afterthought for the total design and most often based on price rather than its effect on the operation of an SoC.

“People want to put in a cost-effective package to cut their costs, but that kind of package was not designed to handle high-speed I/O,” said Synopsys’ Nandra.

That can create a huge problem for signal integrity. But packaging is typically the victim of a silo effect. It’s not part of the up-front architectural decision and it’s not part of the SoC model being created.

“The focus is on the semi design, but the package design is just as important,” said Apache’s Sarkar. “There are now so many different packages that it’s confusing. You have to worry about whether it’s four layers or two layers, and if you have 80 different power domains the package can get very complex. We’re seeing wireless chip packages now that are not uniform.”

Living in a material world
The substrate material is equally important in signal integrity. CMOS has been getting mixed reviews. On one hand, it’s a proven low-cost material with excellent conductivity. On the other, it’s not especially good for mixed-signal applications at advanced nodes. So while Intel may get away with using it for a predominantly digital processor, an SoC has completely different needs and different economics.

Materials such as silicon on insulator and gallium nitride do improve signal integrity, but that improvement comes at a price. SoI is the less expensive of the alternatives, and has been proven to work in designs since 65nm by IBM, AMD, and some of the partners in the Common Platform ecosystem.

The problem is that architects and designers don’t necessarily know what kind of substrate or package they will need up front because the IP they buy from third parties doesn’t include information about noise.

“IP vendors need to provide enough data constraints in their libraries to say how the IP can be used properly,” said Cadence’s Wang. “You need to know, for example, ‘For this pin it can be this close to a digital component,’ or ‘Do you put this within this distance of I/O.’”

He noted this is an important new wrinkle in IP integration. “We need a holistic solution for the ecosystem of IP providers. We need a better model, and we need EDA tools that are better and faster at noise analysis.”

3D stacking
What has many experts particularly worried is the effect of 3D stacking on signal integrity. While most of the focus has been on thermal effects—hot spots caused by putting two or more chips together—there is a magnified effect for noise.

Vendors such as Qualcomm, Freescale and IBM expect 3D stacking to begin rolling out in late 2011 or early 2012—roughly one year from now. From there the approach is expected to grow rapidly, in large part because it shortens the distances that signals need to travel, which in turn boosts performance while lowering the power needed to drive those signals.

But moving the mass SoC market in this direction compounds many of the issues for signal integrity that exist with packaging, substrates and proximity—while adding new ones.

“With a through-silicon via, the power noise is much worse,” said Apache’s Sarkar. “TSVs bring signals closer together, but the silicon substrate is not stacked in terms of coupling with the TSV. So how do you model this?”

Synopsys’ Nandra noted that 3D shifts the problem from the packaging to inside the SoC. “With a stack die you’re communicating inside the die, so the I/O problem is less,” he said. “But within the die now you have interactions between platforms. Basically you’ve just shifted the problem.”

Conclusions
None of this has been lost on the tools vendors. Many are scrambling to bring new tools to market that can analyze noise, heat and IP integration problems, and that can model all of it.

But these are complex issues. There is no single tool that can do everything, and so far these are well outside of existing design flows. Moreover, there are no standards that effectively address the dynamics of using IP in a high-density, highly noisy environment that includes voltage changes, rapid power-up and power-down modes, SerDes and high-speed I/O, and the effects of packaging and substrates.

These are challenging problems that have to be dealt with up front and together, both by design teams and by ecosystems that include IP vendors and foundries. And so far, semiconductor makers have merely scratched the surface.
