<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Low-Power Engineering Community &#187; ARM</title>
	<atom:link href="http://chipdesignmag.com/lpd/blog/tag/arm/feed/" rel="self" type="application/rss+xml" />
	<link>http://chipdesignmag.com/lpd</link>
	<description>Making Semiconductor Architectures More Efficient</description>
	<lastBuildDate>Thu, 09 Feb 2012 17:50:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>Processor Subject To Change</title>
		<link>http://chipdesignmag.com/lpd/blog/2012/02/09/processor-subject-to-change/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2012/02/09/processor-subject-to-change/#comments</comments>
		<pubDate>Thu, 09 Feb 2012 08:01:56 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[CEVA]]></category>
		<category><![CDATA[DSPs]]></category>
		<category><![CDATA[Nvidia]]></category>
		<category><![CDATA[processors]]></category>
		<category><![CDATA[Qualcomm]]></category>
		<category><![CDATA[Tensilica]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3716</guid>
		<description><![CDATA[Customizable processors leverage the best of power management; a variety of approaches combine energy efficiency without sacrificing performance. ]]></description>
			<content:encoded><![CDATA[<p>By Ann Steffora Mutschler<br />
With power complexity driving sophisticated management techniques, SoC design engineering teams are turning to a new class of customizable processor architectures from ARM, CEVA, NVIDIA, Qualcomm, Tensilica and others to take advantage of the best in power-saving techniques.</p>
<p>While these new architectures are novel approaches, the concepts are not especially new, particularly in mobile applications.</p>
<p>“If you look at what mobile processors have been doing, I would argue they’ve been doing some sort of big.LITTLE for a long time,” explained Nandan Nayampally, director of applications processor marketing in the processor division of ARM. “By that I mean you have microcontrollers taking charge when the big application processor is not working, or you’ve got video engines being separate from the main application processing. The compartmentalization of the activities around the chip have been always a focus for mobile because you will save power any which way you can. That’s a given.”</p>
<p>ARM has observed that what has changed recently is that the main OS needs to run more and more of the time, because small apps such as Twitter feeds and Facebook updates are constantly running on top of it.</p>
<p>As fun and/or useful as they are, these apps are killing battery life.</p>
<p>Nayampally explained the big.LITTLE architecture with an example. “Let’s say I’m doing an MP3 playback in the old days. You’d say, ‘I’m running on the big core, I kick off the task to a little core and then turn off the big processor because the MP3 can run just fine on a microcontroller type device. It’s all on the same die. Then suddenly you get a call and it wakes up the big processor and it takes over again. But when you offloaded that MP3 in the olden days—six months or so ago—you actually could have a separate task that wasn’t really run by the OS. Now there are so many more things and services that people are coming to expect that you can’t have them done specifically for targets that are different from the application processor itself and they run on top of the OS. Now you are telling the chip, ‘No, I won’t do these specialized things as separate things for very power-efficient sub-components, they have to be done by the main processor.’ But the main processor also has to become very schizophrenic in the level of performance it requires for the main tasks as well as what it needs for the little tasks.”</p>
<div id="attachment_3717" class="wp-caption alignnone" style="width: 508px"><a href="http://chipdesignmag.com/lpd/files/2012/02/big.LITTLE.jpg"><img class="size-full wp-image-3717   " src="http://chipdesignmag.com/lpd/files/2012/02/big.LITTLE.jpg" alt="" width="498" height="189" /></a><p class="wp-caption-text">Source: ARM</p></div>
<p>What makes big.LITTLE interesting is that the processors are fully coherent so the software engineer doesn’t have to worry as much about maintaining every piece of data. The coherency in hardware takes care of that. That makes the software development quicker and can actually improve performance and battery life.</p>
<p>Designed as an extension of dynamic voltage and frequency scaling (DVFS), big.LITTLE supports multiple use models, the simplest of which is meant to be effectively transparent to the OS, Nayampally continued. “The power management software always speaks to a driver that is the right power and performance needed based on what is required. If, for example, you had today’s processor and it was using the lowest performance level it could while doing Twitter update, it just can’t be as efficient as something that was designed to be a fifth smaller or something like that. What if your DVFS had a next step that is more efficient and you can work there for a while? From an OS standpoint, or an application standpoint, it doesn’t matter. It’s just another step in your DVFS. Underneath it what happens is the driver now can do the kick-off to switch the operations from the big core to the little core or from the little core to the big core or cluster in fact.”</p>
<p>NVIDIA’s Tegra 3 employs variable symmetric multiprocessing (vSMP), while Qualcomm uses asynchronous symmetrical multiprocessing (aSMP). Both rest on the same principles that govern ARM’s big.LITTLE architecture.</p>
<p>NVIDIA’s Tegra 3, launched last November, is a quad-core mobile processor for smartphones and tablets, currently shipping in the ASUS Transformer Android tablet. A company spokesman explained that behind Tegra 3’s power efficiency is a fifth, lower-power “companion” CPU core that accompanies the four main CPU cores and is specifically targeted at battery savings. Tegra 3’s architecture allows it to provide the best combination of performance and battery life by switching between the four main CPU cores and the fifth core, which handles less demanding tasks and active standby mode.</p>
<p>For CEVA, which licenses DSPs, programmability has always been the name of the game, according to Eran Briman, the company’s vice president of marketing. About seven years ago it became apparent that general-purpose DSPs were not going to make the cut for next-generation designs, particularly in 40nm communications designs. In one of its newest offerings, the CEVA-XC DSP software-defined radio architecture, users can run the complete receive and transmit channels entirely in software, except for a very few hardware engines that simply don’t make sense in software, he said. To accompany this, CEVA recently released a software development kit that includes advanced power management. Looking ahead, Briman believes there will be fully programmable communications units on SoCs.</p>
<p>CEVA isn’t the only company in the DSP space to see this trend.</p>
<p>“Many baseband designs particularly, when they are operating on complex protocols and care a lot about energy have moved to neither completely hard-wired—because that would be too fragile or intolerant of inevitable corrections and improvements—nor completely general-purpose, because a general-purpose processor is generally much less energy-efficient than something that is more specific to the task at hand,” observed Chris Rowen, CTO at Tensilica. “Especially in low-power baseband processing, we’re seeing more and more optimization of programmable engines to do this, where the baseband subsystem might include 6 or 8 or 10 different cores that are programmable. Some of these still may be fairly general-purpose, because you may say in this function though there’s a wide variety of different tasks that I need to do on the data and it is more energy efficient for me to have one that is shared among these different, diverse functions than to have one piece of hardware for every single function. That would make it too big. Having a programmable solution can in some cases also make it a smaller solution. In general, small is good for energy.”</p>
<p>Tensilica offers a range of DSP cores. It also allows users to build their own customized dataplane processors.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2012/02/09/processor-subject-to-change/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Step Away From the Spreadsheet</title>
		<link>http://chipdesignmag.com/lpd/blog/2012/02/09/step-away-from-the-spreadsheet/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2012/02/09/step-away-from-the-spreadsheet/#comments</comments>
		<pubDate>Thu, 09 Feb 2012 08:01:48 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[Apache Design]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[Calypto]]></category>
		<category><![CDATA[Mentor Graphics]]></category>
		<category><![CDATA[Synopsys]]></category>
		<category><![CDATA[Tensilica]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3727</guid>
		<description><![CDATA[Complex system-level power analysis requires more than just a worst-case scenario. Planning for power needs to happen earlier in the design process.]]></description>
			<content:encoded><![CDATA[<p>By Ann Steffora Mutschler<br />
Engineers today spend more than a quarter of their time trying to meet power specifications.</p>
<p>A <a href="http://www.deepchip.com/items/0498-04.html">survey</a> of more than 700 engineers by Calypto illustrates just how important and time-consuming power management is today for engineering teams. As consumer devices grow ever more complex, the next challenge is to analyze and optimize power not just at the RTL but at the system level, even if the path to reach that goal is not yet clear.</p>
<p>The opportunities for optimizing a design for power efficiency are greatest at the architectural level of abstraction. The further a design moves downstream, the less effective optimization techniques become, noted Yossi Veller, chief scientist for ESL at Mentor Graphics, in a white paper he co-authored for ARM’s <a href="http://iqmagazineonline.com">IQ Magazine</a>. “Power optimization must begin with architectural analysis, exploration, and optimization of power and timing at the electronic system level (ESL). According to a study by LSI Logic, techniques available at the RTL synthesis phase have the ability to reduce power by 20%; those at the gate level offer a 10% reduction; while those at the layout level can reduce power by only 5%. Waiting until the RTL to begin optimizing for power is a wasted opportunity because power usage can be reduced by 80% at the ESL.”</p>
<div id="attachment_3728" class="wp-caption alignnone" style="width: 607px"><a href="http://chipdesignmag.com/lpd/files/2012/02/Mentor_Figure1.jpg"><img class="size-full wp-image-3728" src="http://chipdesignmag.com/lpd/files/2012/02/Mentor_Figure1.jpg" alt="" width="597" height="281" /></a><p class="wp-caption-text">Fig. 1: The ability to optimize power at the architectural level far exceeds that at lower levels of abstraction.</p></div>
<p>“Traditional power optimization tools are really working at the lower levels of abstraction,” explained William Ruby, senior director of RTL power product engineering at Apache Design. “If you look at synthesis, if you look at physical design, there are some automated techniques that are available in those tools. But those are in a category of additional refinement-type steps. Once you have the design architecture nailed down, then you can add in some optimizations based on those tools and you can get some additional incremental power savings, but the part that is missing is enabling the true design-for-power efficiency. If you look at modern chip architectures, they are extremely complex and the RTL descriptions of these architectures are even more complex such that RTL in some cases is no longer seen as a viable architectural description language. You want to be able to describe the architecture of the design in a high level of abstraction.”</p>
<p>With this description comes the requirement to be able to analyze power. Today, this is done by synthesizing the design from a high-level description such as C++ down to RTL, at which point an RTL power analysis tool can run and give feedback into the architectural domain. But what needs to accompany this synthesis-loop-back type of flow, and give some indication of what the power numbers are, is more intelligence in those high-level tools. They need to point out inefficiencies in a design at both the RTL and architectural levels.</p>
<p>Chris Rowen, CTO and co-founder of Tensilica, sees two big challenges for power analysis tools. “One, it is very, very difficult to isolate where the real problem is. It only makes sense to really measure power at the level when you have really synthesized the logic and laid it out and you actually know what the physical design looks like, because the physical design has a huge impact on what the power dissipation of the circuit is.”</p>
<p>By the time a design has gone through synthesis and place and route, there is very little visibility into the original logic in question. “It all goes into the Cuisinart and all you get is this amorphous mush of gates at the end. So if someone asks you, ‘How much power is being dissipated in my multiplier versus in my divider versus in my register file,’ I don’t know anymore because I have to process them all together in order to get good physical results. But then it all has been aggressively remapped into other logic forms and I can’t isolate the power easily. So you have to work in rather indirect ways to figure out whether the power was being dissipated in one function versus another.”</p>
<p>A second problem, he said, involves system-level tracking of different scenarios. “It is extremely difficult to reach your power goal if you say, ‘Let me use the worst case assumption about each subsystem. I’m going to assume that every piece of my baseband is on, and every piece of my Layer 2 and Layer 3 protocol stack is on, and my image processor is on, and my apps processor is running full out, and all of my RF subsystems are running,’ because of course you’d exceed your power budget by a factor of two or three. Instead people recognize they’re not all on at the same time, the system doesn’t work that way. When you are doing one thing, then you’re typically not doing something else. Therefore, you only have to look at the particular combination of subsystems that is on at that time. However, the software guys have really poor tools to correlate what’s going on in the higher-level operating modes to what’s going on in terms of actual power dissipation in different subsystems. They are completely shooting in the dark where they do not have anything like the kind of accuracy for the modeling of these things.”</p>
<p>As a step towards true system-level power analysis, engineering teams are gradually figuring out that they need to build approximate models of power in addition to simulation environments that are fast enough to run realistic scenarios and to capture real activity. “Ironically getting power information is more than anything else probably a function of getting fast enough simulation, because only if you can run realistic size scenarios will you really gain interesting information,” he said.</p>
<p>This has become one of the big drivers of ESL, which until recently has been relatively slow to catch on. But complexity at advanced nodes, including power considerations, has significantly boosted its appeal.</p>
<p>“What the user would like is to have at the very early stages, when he has a TLM model of the design, is at least a relative assessment what architecture decisions will impact the energy in which direction,” said Frank Schirrmeister, group director for product marketing of the system development suite at Cadence. “He will also want to know how the software impacts all of that. From a technology perspective, TLM models allow you to do that so it’s fairly straightforward to annotate power-related data into TLM models,” he asserted.</p>
<p>Annotating models with power data, just like annotating performance, is a challenge, and it can be approached in three ways:</p>
<p>First, he said, “You can start with your assumptions, with your power budget. TLM models and virtual prototypes allow you to then execute your assumptions so you have in your power envelope/power budget. You say, ‘These tasks should take that much power, I know that from past experience,’ and then you execute your virtual platform with those annotated, estimated data or budgeted data. And you get dynamic results depending on what tasks the software ends up calling, how long a cell phone is used for which task in a day, and so forth.”</p>
<p>Second, annotate back from when you have RTL. “At the RTL level you have these switching formats that you can derive from the RTL to get a good idea about the activity,” Schirrmeister continued.</p>
<p>And third, it can be dealt with at the silicon level by taking previous designs, measuring power information and annotating back into TLM models.</p>
<p>Design engineers are undoubtedly looking for analysis and optimization at the system level so they can do power analysis and power estimation before RTL is available and before they can do gate-level simulations. But are they truly ready to adopt it?</p>
<p>Achim Nohl, technical marketing manager for Synopsys’ solutions group, pointed out that today, power analysis starts with gate-level simulation. “If you talk to a hardware engineer and tell him, ‘We are going to employ virtual prototyping and high-level models to do power analysis,’ he will certainly look at you a little strange because he thinks, ‘I’m doing all those back-end optimizations and all those specific things to optimize power. How will you ever be able to reflect that in a virtual prototype simulation?’ But that’s not the point. For virtual prototyping, the granularity of a system is very much different. You’re not looking at just the memory controller. You’re looking at the CPU with the memory controller, the buses, the interconnect, the peripherals and how all those things are orchestrated to find out where the different hot spots are and what is the best way to program all those pieces. What is the best scheduling technique? That is the concern at that level.”</p>
<p>When a new chip is architected today, estimates are done to determine whether the chip is feasible at all from a power perspective, he said. “Today, people are using spreadsheets in order to do this analysis, and this can only be a worst case analysis because they don’t know the dynamics and can’t reflect the dynamics of the system in those spreadsheets.”</p>
<p>While the pure architectural level tools don’t exist yet, many users are likely content with high-level synthesis tools for the time being. Apache’s Ruby believes they are good in their own respects but they are not actually meant to give architectural guidance; they are just meant to synthesize the design above the RTL.</p>
<p>One final thought for nervous system architects: The architectural tools of the near future will not replace the actual architect unless they become true artificial intelligence, which is not likely to happen any time soon, Ruby concluded.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2012/02/09/step-away-from-the-spreadsheet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Next Big Challenge</title>
		<link>http://chipdesignmag.com/lpd/blog/2012/01/12/the-next-big-challenge/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2012/01/12/the-next-big-challenge/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 08:01:25 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[Mentor Graphics]]></category>
		<category><![CDATA[Tensilica]]></category>
		<category><![CDATA[Texas Instruments]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3647</guid>
		<description><![CDATA[Integration of different operating systems across SoCs with a focus on power is forcing some unusual combinations—and lots of headaches.]]></description>
			<content:encoded><![CDATA[<p>By Ed Sperling<br />
Software is the next big target in the quest to make electronics more energy efficient, but it’s proving a far bigger challenge than most systems architects originally believed it would be.</p>
<p>There are several very large problems to deal with in software. Writing efficient code for small processors isn’t one of them. In fact, the proliferation of small processors across an SoC makes it easier to deal with at least a portion of the software. Code can run directly on the bare metal, some of it can be nothing more than an executable file, and still other code can run on a real-time operating system written for a specific purpose or even on slimmed-down versions of operating system code.</p>
<p>But bringing all of this code under the control of an SoC is another matter, despite the fact that this is the best way to manage power and minimize physical effects in a chip. Solving this problem requires integration and coherency across a chip, which in turn requires software architects and system architects to work together up front. This may be a goal among companies, but it certainly isn’t a reality.</p>
<p>“You need coherence to develop a high-end software design,” said Dan Driscoll, Nucleus software architect for Mentor Graphics’ Embedded Software Division. “At this point integration is a large portion of the effort, and the problem has yet to be solved. One thing that helps is a single development environment. If you use multiple profiling tools it’s more difficult to pull that together into a system.”</p>
<p><strong>Devils in the details</strong><br />
Just understanding the interactions between various hardware portions of an SoC has far exceeded human limits in complex SoCs, even at mainstream process nodes. Most companies use a block or subsystem approach to deal with this complexity, working on smaller pieces and then assembling them into the whole and hoping it works as a single system.</p>
<p>Software increases the complexity by orders of magnitude, because an increasing amount of software now controls functionality across the chip. It determines what remains on, what gets turned off, in what sequence, at what speed, and what gets priority. It also determines how much power and memory can be allocated to a given function or logic subsystem, at least in 2D designs. (In stacked die, it may be possible to dedicate portions of memory to logic blocks to minimize this issue.)</p>
<p>“This is the job of the controller software for the overall system,” said Frank Schirrmeister, group director for product marketing of the system development suite at Cadence. “You tell it to execute this API or put data over here. This is a high-level sequence, and it can do connectivity between different cores of a processor. You also can add up the energy transactions and memory transactions that will trigger.”</p>
<p><strong>Multi-core, many-core, and multiple processors</strong><br />
A second big problem stems from the types of processors being used. The ability to write software applications that can take advantage of multiple cores is an old and well-understood issue—about four decades old, in fact. And while it’s easy for processor makers to add more cores onto a piece of silicon and hand it off to applications developers to deal with, the reality is that most applications cannot be parsed to take advantage of more than eight cores, and in many cases the number is likely to be fewer than four.</p>
<p>Databases, scientific calculations and graphics rendering, where there is extreme redundancy, are the exceptions. Even some games can have functionality parsed across cores. For most other applications, though, the limit is probably two to four cores. And if these cores are running popular general-purpose operating systems such as Windows, Mac OS X or Linux, chances are pretty good that it’s not the most efficient implementation of a function even though it may be the most convenient.</p>
<p>RTOSes have been used by the military for decades as a much more energy-efficient alternative, although most of that work was far less concerned about the energy than about security and performance. Their shift into commercial applications such as mobile phones makes them especially suitable for managing specific functions on separate processor cores in an SoC. It doesn’t make sense, for example, to utilize a multicore general-purpose processor for audio enhancements, and if it isn’t running on a general-purpose processor then it probably doesn’t need a general-purpose OS, either. But those functions still have to work with other parts of the chip without affecting signal integrity or creating hardware proximity effects such as heat, ESD and electromigration.</p>
<p>“The idea of SMP (symmetric multiprocessing) beyond 8 to 16 cores is not realistic for most applications,” said Mentor’s Driscoll. “We’re almost stuck with AMP (asymmetric multiprocessing) as part of large multicore implementations. But we’re seeing cases where you may have a TI OMAP 5, running a dual-core ARM Cortex-A9, an A4 and a DSP. You may have six or seven cores, and a general-purpose operating system going through this part of the system. That operating system may control other DSP interfaces, including RTOSes.”</p>
<p><strong>Verification and testing brain freezes</strong><br />
This approach leads to another problem, though. How do engineering teams verify and test this complex SoC, which now may include multiple types of processors and processor cores, various types of software, and a central software management scheme that probably involves a standard operating system? There may even be middleware making some of the connections, and in homogeneous environments possibly even a virtualization layer that may include hypervisors that can run on bare metal.</p>
<p>“The first thing you have to deal with is a traffic debug issue,” said Cadence’s Schirrmeister. “In many cases, the partitioning may happen by hand. But how you pull this all together may affect your debug strategy. Tensilica presented an extreme example involving a printer design, where they had a block diagram of the functionality and the cores. The printer company used Tensilica cores, which allowed them to replace the functions done in RTL with programmable functions. The connections worked, the memories worked, and the functionality was done in software as bare-metal, low-level software.”</p>
<p>There’s a tradeoff in doing that, however. Driscoll said that pushing functionality down to lower-end processors makes integration more complex. In addition, measuring power consumption becomes more difficult because it means adding up energy transactions that the memory transactions will trigger.</p>
<p>“That means you need data to verify what works at the block level, the subsystem and in the overall system,” Schirrmeister said. “And some chips have processors you can’t access from outside for security reasons. You need flexibility in the software because of security, but you are not allowed to see it from the outside.”</p>
<p><strong>Conclusion</strong><br />
While there has been much attention devoted to finding a common language between hardware and software engineers, the real path forward may be more focused on matching goals at the architectural stage, and then being able to swap information as a design progresses. </p>
<p>Virtual platforms that allow software to be developed earlier in the process help. So do some of the features that are being built into RTOSes these days. In addition, stacked die will help eliminate some issues, while creating new ones. But the real challenges will continue to be integration of hardware and software, and of various types of software with other software—with an eye toward remaining within a power budget and understanding how code affects energy consumed over time.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2012/01/12/the-next-big-challenge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power Bits: Last Laptop Standing, Bacteria Power</title>
		<link>http://chipdesignmag.com/lpd/blog/2012/01/06/power-bits-last-laptop-standing-bacteria-power/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2012/01/06/power-bits-last-laptop-standing-bacteria-power/#comments</comments>
		<pubDate>Fri, 06 Jan 2012 17:36:40 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[News Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Dell]]></category>
		<category><![CDATA[Lenovo]]></category>
		<category><![CDATA[U.S. Naval Research Laboratory]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3625</guid>
		<description><![CDATA[Lenovo rolls out hybrid laptop that promises up to 10 hours of battery life based on Qualcomm chip; U.S. Navy looks at bacteria-powered fuel cells.]]></description>
			<content:encoded><![CDATA[<p><strong>Road Warrior Tools</strong><br />
Being able to fly cross-country using a laptop all the way without plugging it in is one thing. Being able to fly across the Pacific Ocean is quite another.</p>
<p>The race is on not just to extend battery life, but to extend it while actually doing something useful on all mobile devices, whether that’s a PC or a smart phone. That requires a significant amount of specialization in both the processor and the software. </p>
<p>Lenovo’s <a href="http://news.lenovo.com/article_display.cfm?article_id=1543">announcement</a> this week of its ThinkPad X1 Hybrid is a case in point. The laptop includes something called Instant Media Mode, which it calls a second PC. Based on a dual-core Qualcomm processor running Linux, this chip can be used to watch videos, listen to music and surf the Internet. That still leaves the regular Intel chip to do the bulk of the heavy lifting, but it’s an interesting approach. </p>
<p>Lenovo isn’t the first company to come up with this idea, of course. Dell introduced a similar device back in 2009. The current iteration, called Latitude ON and available in its lineup, uses an ARM Cortex M3 core running in a Broadcom chip to achieve what ARM claims is <a href="http://www.arm.com/markets/mobile/dell-latitude-e4200-laptop.php">multi-day battery life</a>.    </p>
<p>This also helps explain why the netbook market segment has largely disappeared overnight, wedged out by tablets on one side and long-life laptops on the other. Interestingly, ARM seems to be the common thread in all of these.</p>
<p><strong>Bacteria In Space</strong><br />
It’s amazing what you can do when you don’t need air to sustain life. The U.S. Naval Research Laboratory is looking at “<a href="http://www.nrl.navy.mil/media/news-releases/2012/navy-researchers-investigate-small-scale-autonomous-planetary-explorers">microrovers</a>,” vehicles that weigh in at about 2.2 pounds (1 kilogram) and are powered by microbes.</p>
<p>The combination of low-power electronics, low energy consumption and microbial fuel cells that continue to generate energy while regenerating themselves is an unusual approach. The anaerobic bacterium (Geobacter sulfurreducens) is expected to have an extremely long lifespan, which will be essential in deep-space exploration.</p>
<div id="attachment_3629" class="wp-caption alignnone" style="width: 382px"><a href="http://chipdesignmag.com/lpd/files/2012/01/image1_8-12r_372x245.jpg"><img src="http://chipdesignmag.com/lpd/files/2012/01/image1_8-12r_372x245.jpg" alt="" width="372" height="245" class="size-full wp-image-3629" /></a><p class="wp-caption-text">Electron microscope image of bacteria used for power. Source: U.S. Navy.</p></div>
<p>The Navy says a portion of the energy generated by the bacteria will be used to maintain electronics, with the rest used to charge a battery or capacitor. From there, the robot will be propelled using a tumbling or hopping motion. </p>
<p>One question, though: Do you need environmental impact statements for this kind of stuff?</p>
<p><em>&#8211;Ed Sperling</em></p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2012/01/06/power-bits-last-laptop-standing-bacteria-power/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>One On One: ARM CTO Mike Muller</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/12/01/one-on-one-arm-cto-mike-muller/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/12/01/one-on-one-arm-cto-mike-muller/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 07:02:54 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[3D stacked die]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[FinFETs]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3555</guid>
		<description><![CDATA[A candid interview about the future of Moore’s Law and the impact of power and leakage on all future designs.]]></description>
			<content:encoded><![CDATA[<p><strong>LPE</strong>: How far does Moore’s Law extend forward and what are we likely to encounter along the way?<br />
<strong> Muller</strong>: The good news is there is no known solution for 7nm. That implies that between now and then it’s okay. When I talk to people they seem fairly confident they’re going to get there. Exactly how they don’t know. Will there be any miracles needed? Yes, probably one or two. But 14nm and down close to 7nm will happen. The bad news is that frequency will be flat with constant leakage.</p>
<p><strong>LPE</strong>: That’s an interesting perspective.<br />
<strong> Muller</strong>: Life is full of tradeoffs. People have traditionally taken different tradeoffs on process development. But in the past a lot of that was about getting frequency uplift. You can trade that in lots of different ways, and there is still frequency uplift to be had. But that costs you in terms of leakage, and people worry about that much more than they used to and where those tradeoffs are made.</p>
<p><strong>LPE</strong>: Does that leakage continue even with the introduction of FinFETs and other techniques?<br />
<strong> Muller</strong>: New process techniques like FinFETs help, but they’re one-off advances. You draw your curve, and there are times when you get ahead of the curve. Then you’re on the gradual slope back down again. So there are one-off things that really help with leakage. But once you’ve done that, you’ve still got three impossible things to do before breakfast to get you back down to 7nm. Those steps are part of the solution, but they don’t solve it to the point where leakage is going away and you don’t need to worry about it anymore.</p>
<p><strong>LPE</strong>: How about dropping the voltage?<br />
<strong> Muller</strong>: We’ve always done voltage scaling, and DVS (dynamic voltage scaling) continues. There will be different learnings about how much voltage scaling you can get. If you can do it, voltage is one of the best things you can do for saving dynamic and static power. That will continue, but the margins are getting harder to find.</p>
<p><strong>LPE</strong>: ARM just introduced its big.LITTLE approach. What’s the thinking behind that?<br />
<strong> Muller</strong>: The idea is that you can crank down the voltage and save power and scale it. There are times when you need performance, which is the ‘big’ part, and there are times when you don’t need that. You cannot build as efficient a microarchitecture for the big cores as you can for the little cores because getting that single-thread performance involves a lot of microarchitecture complexity and speculation, which ultimately costs you power. If you don’t need all of that performance, and your voltage scaling has run out of anywhere to go, the right thing to do is to task migrate onto an identical but smaller core with a simpler microarchitecture. That works wherever you are and on whichever process. It will always be true. You will be able to build much more efficient little cores than big cores.</p>
<p><strong>LPE</strong>: How does this affect the overall device architecture?<br />
<strong> Muller</strong>: This is an OS-level task migration, which happens anyway. You determine how many SMP cores you need to light up. Then you do task migrations. It’s another step to migrate onto a smaller core. That’s something you just build into the OS. You don’t need to add any extra magic. It’s already happening.</p>
<p><strong>LPE</strong>: Is this going to apply in stacked die with rightsizing of functions?<br />
<strong> Muller</strong>: The stacked die is almost an orthogonal issue. It’s happening today with flash and SoCs put into the same package because of packaging constraints. It opens the door to completely different die-to-die memory interfaces, which allow you to build more efficient systems than going off-chip, down-chip to a separately packaged die. It changes some of the memory bandwidth. But it’s just a computer at the end of the day, so main memory bandwidth is one of the fundamental determinants of performance. Stacking allows you to change that. Whether you’re stacking big cores, little cores, or big.LITTLE cores in combination, for different applications you’ll need different combinations. And you exploit that with main memory bandwidth.</p>
<p><strong>LPE</strong>: It doesn’t sound like we’ve made much real progress in terms of true multiprocessing software for most jobs.<br />
<strong> Muller</strong>: When I went to university, which admittedly was a few years ago, I was taught never to trust an MP solution from a hardware guy. That was one of the lectures from a guy who invented the sub-routine. I think he was right, but for low numbers of cores—eight and less—SMP is a fixed problem or a solved problem because you have enough system complexity that you do have a browser and a background task. You don’t have to worry too much about how well you’ve taken an application and threaded it.</p>
<p><strong>LPE</strong>: But you’ve split the functions rather than threaded the application, right?<br />
<strong> Muller</strong>: That’s the first step. And for two, three or four cores you can do that without really having to re-do anything. When you get into re-programming applications like your browser and executing that on multicore, there are a limited number of applications that drive that performance envelope. Your small applet you’re running doesn’t touch it. You’re going to do browsers and virtual reality apps where programmers are willing to go back and figure out how to re-program and rewrite it. It’s true that the general software community is not set up for generating multicore applications. For most applications, you don’t need it. Beyond that, there is database lookup that&#8217;s independent of any one application scaling.</p>
<p><strong>LPE</strong>: So populating an SoC with small processors is a way of splitting off functions?<br />
<strong> Muller</strong>: Yes, and heterogeneous isn’t just about big.LITTLE. It’s about having entire subsystems for tasks, which may be a Cortex-A5 running a complicated audio subsystem that might actually be for custom hardware. If you open up an SoC for a mobile phone you’ll find all of those things in there. The challenge is the programming model for that heterogeneous system, let alone programming the multicore apps processor with lots of cores in it.</p>
<p><strong>LPE</strong>: And you need coherence across all of that, right?<br />
<strong> Muller</strong>: Some of it is about system-level coherency, and some of that is in the programming model. There are three or four emerging standards for that. What they address is which computing where. You still come back to manual placement of the different processing elements for different tasks. That’s not a solved problem.</p>
<p><strong>LPE</strong>: So as you look forward, is power and/or leakage the big issue?<br />
<strong> Muller</strong>: If you go back to ARM 1990, we always talked about power/performance/area and the tradeoff between them. I don’t think that’s changed. If it’s all about power, run at a kilohertz, sub-threshold, and you come up with completely different solutions. If it’s only the Internet of things and tiny embedded microcontrollers, you still have to figure out what’s your budget, what’s your power and what’s your performance, and balance between them. In the future we won’t just worry about power.</p>
<p><strong>LPE</strong>: But in the future will power become more important in the PPA equation?<br />
<strong> Muller</strong>: It depends on who’s talking. We’ve always had power up there as a fundamental part of what we do. There is no sudden change of course. Power really matters in system-level integration, whether it’s megawatts in server farms or milliwatts of active power in a small SoC device. We’ve always worried about that. It’s just maturing for more systems, but it’s something we’ve always done.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/12/01/one-on-one-arm-cto-mike-muller/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power Gating And Power-Centric Programming</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/11/03/power-gating-and-power-centric-programing/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/11/03/power-gating-and-power-centric-programing/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 07:01:40 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[CPF]]></category>
		<category><![CDATA[UPF]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3464</guid>
		<description><![CDATA[Nothing is straightforward, and workarounds have power and performance penalties.]]></description>
			<content:encoded><![CDATA[<p>By Pallab Chatterjee<br />
SoC design has a number of techniques for power management. One of the more prevalent methods is to use power gating to turn blocks on and off based on the applications being run and on mode controls. Power gating, while supported by the two major EDA power design flows, UPF and CPF, still has some implementation challenges.</p>
<p>The flows have to make sure that the states of the logic at the interface to the blocks being turned off do not get corrupted due to changes on the shared ground/supplies. Basic power gating is well known. However, its use in both multiple power supply systems and multi-logic threshold systems remains difficult. Power gating requires the outputs of the switched gates to be isolated from the control signals on the inputs, and also that the outputs be clamped at some state—low, high or &#8220;last value.&#8221; </p>
<p>The power gating function results in a reduction in the logic level swing due to the IR drop of the &#8220;on&#8221; device between the logic cell and the power supply/ground. Gate bias and level-shifting to a second set of power rails to drive the gate buffer control logic allows for the power gating devices to have a reduced IR drop to the virtual supply (VVDD). Timing construction for this type of function, however, is transparent to the UPF/CPF design flows.</p>
<p>A workaround for the logic that has to interface with power-controlled blocks is to use state retention registers. This solution carries a significant area and performance penalty, as it requires a formal and powered-on register bank for each I/O-facing logic block in the sub-block. The gate count is expensive for full state coverage, and partial state coverage has validation issues. There is an additional cost in power and latency. The latency is due to the loading and unloading of the software state for save/restore.</p>
<p>To address these issues, designers can use an enhanced DFF with a connection to the always-on retention register power supply. This cell would have to support save, hold, restore and normal operation functions. UPF and CPF do not always work directly with these non-RTL states, which affects the validation flow. A further challenge is the functional planning and implementation of set and reset signals through the retention registers and the impact of those signals on the data being held for the &#8220;off&#8221; blocks.</p>
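<p>The four cell functions named above (save, hold, restore and normal operation) can be sketched as a small behavioral model. This is only an illustration: the class and mode names are hypothetical, it assumes a single shadow latch on the always-on supply, and it ignores set/reset handling and timing entirely.</p>

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"    # main flop follows D on each clock edge
    SAVE = "save"        # copy the main flop into the always-on shadow latch
    HOLD = "hold"        # switchable domain may be off; shadow keeps its value
    RESTORE = "restore"  # copy the shadow latch back into the main flop


class RetentionFlop:
    """Hypothetical behavioral model of a retention DFF (not a real library cell)."""

    def __init__(self):
        self.q = 0           # main flop, in the switchable power domain
        self.shadow = 0      # shadow latch, on the always-on supply
        self.powered = True  # state of the switchable supply

    def power_off(self):
        self.powered = False
        self.q = None        # main flop contents are lost (unknown)

    def power_on(self):
        self.powered = True

    def clock(self, mode, d=0):
        if mode is Mode.NORMAL and self.powered:
            self.q = d
        elif mode is Mode.SAVE and self.powered:
            self.shadow = self.q
        elif mode is Mode.RESTORE and self.powered:
            self.q = self.shadow
        # Mode.HOLD: nothing changes; the shadow latch retains its value
        return self.q


def demo():
    ff = RetentionFlop()
    ff.clock(Mode.NORMAL, d=1)  # capture a 1 during normal operation
    ff.clock(Mode.SAVE)         # save it before the block is gated
    ff.power_off()
    ff.clock(Mode.HOLD)
    lost = ff.q                 # main flop value while the block is off
    ff.power_on()
    restored = ff.clock(Mode.RESTORE)
    return lost, restored
```

<p>In this sketch, <code>demo()</code> returns <code>(None, 1)</code>: the main flop loses its value while the block is off, and the saved value comes back on restore. A real cell would also have to define what set and reset do to the shadow latch, which is the planning problem described above.</p>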
<p>ARM, in the Cortex-M class products, has implemented low-cost state retention using sub-period clocks and secondary power supplies for the retention devices. These sub-period clocks allow Set and Reset functions to occur on an asynchronous basis with the system clock. The logic blocks are generally built using clocks from a DVFS control system.</p>
<p>The challenge for using these blocks is to not only integrate them into the timing flow of the circuits, but to make sure that the retention registers can safely provide data, at the correct logic level, with the blocks that are on. As application programs gain control of the power gating function, simple state machine-based control for these registers is not sufficient. Programming optimization of the high-level language function now has more interaction with the data flow per block. This results in environments such as OpenCL, which sends tasks to both distributed CPUs and GPUs through common and segmented memory controls, having a great deal of impact on when blocks are on or off. Normally, a compute task that has no output view is contained just in the CPU signal path, and the GPU can be powered down. Under OpenCL, it is possible to have this task sent to both the CPU and the many threads of the GPU and then combine the results in central memory. This has an impact on the power control, because to achieve the performance enhancement of the extra computation capability you cannot tolerate the latency of a turn-on, reset or restore, and then store and turn-off cycle of the GPU. This latency is typically longer than the compute cycle.</p>
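<p>The latency argument above amounts to a break-even check. The sketch below is a simplified illustration, not a scheduler: the function name and all of the timing figures are hypothetical, and it assumes the wake-up, shared compute and shutdown phases run back to back.</p>

```python
def offload_pays_off(cpu_only_us, shared_compute_us, gpu_wake_us, gpu_shutdown_us):
    """Return True if waking a gated GPU still beats running on the CPU alone.

    All arguments are times in microseconds. The model (assumed here) simply
    adds the turn-on/restore time, the shared CPU+GPU compute time, and the
    store/turn-off time, then compares the sum with the CPU-only time.
    """
    total_with_gpu_us = gpu_wake_us + shared_compute_us + gpu_shutdown_us
    return cpu_only_us > total_with_gpu_us


def example():
    # Hypothetical numbers: the GPU is power gated and must be woken first.
    cold = offload_pays_off(cpu_only_us=2000, shared_compute_us=500,
                            gpu_wake_us=3000, gpu_shutdown_us=1000)
    # Same task, but the GPU is already on and stays on afterward.
    warm = offload_pays_off(cpu_only_us=2000, shared_compute_us=500,
                            gpu_wake_us=0, gpu_shutdown_us=0)
    return cold, warm
```

<p>With these made-up figures, <code>example()</code> returns <code>(False, True)</code>: the offload only helps when the GPU is already powered, which is why application-level task dispatch ends up dictating when blocks can be gated.</p>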
<p>The design verification is still hampered by the fact that none of the logic verification environments can model these turn-on and turn-off state transitions as the power supplies change under application software control. The simulations are based on timing for the power supply control switch transitions, and estimates based on RC load for the blocks to be either available or not.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/11/03/power-gating-and-power-centric-programing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Five Important Changes That Will Affect Power</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/11/03/five-important-changes-that-will-affect-power/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/11/03/five-important-changes-that-will-affect-power/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 07:01:22 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[Top Stories]]></category>
		<category><![CDATA[2.5D stacking]]></category>
		<category><![CDATA[3D Stacking]]></category>
		<category><![CDATA[Apache Design]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Cadence]]></category>
		<category><![CDATA[FinFETs]]></category>
		<category><![CDATA[Mentor Graphics]]></category>
		<category><![CDATA[Synopsys]]></category>
		<category><![CDATA[Tensilica]]></category>
		<category><![CDATA[TSMC]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3450</guid>
		<description><![CDATA[A handful of new approaches to energy reduction and power management will provide huge benefits for mobile devices.]]></description>
			<content:encoded><![CDATA[<p>By Ed Sperling<br />
So far most of the energy savings in SoCs have been achieved using two main approaches—turning off most of the chip most of the time, and changing the materials used to insulate against current leakage.</p>
<p>Over the next few years, changes to designs will be more radical, encompass more pieces of a bigger system, and they will be orders of magnitude more effective. From a market standpoint, there is little choice. Computing increasingly is going mobile, and time between charges is a competitive edge. The caveat is that increased battery life has to come with a corresponding increase in functionality. Everything that could be done with a plug now will have to be done without one.</p>
<p>That means rethinking everything from the hardware design to the usage model to the software that runs on those platforms. And it means getting chips out the door at least as quickly, if not more quickly. Here are five trends and approaches that collectively, and sometimes individually, will have a big impact on energy efficiency, power consumption and leakage:</p>
<p><strong>1. Rethinking the basics.</strong> Some of the biggest advances in efficiency will come from optimizing existing technology. There is more to turn off, more pieces to improve, and there are more ways of doing it better.</p>
<p>Consider something as basic as the clock, for example. The big focus has been maximizing frequency for nearly five decades. There are even concurrent clocks to make that happen. But having them always on and always running at the same frequency means they use a lot more energy than necessary. </p>
<p>“Design has always centered around the clock being the heartbeat of the system,” said Chi-Ping Su, senior vice president of R&amp;D for Cadence’s Silicon Realization Group. “So people always assume the clock will be on. What we have found, working with ARM and the processor type of design, is that the clock consumes an extremely large percentage of the power. Timing and frequency are based on the clock. So you build a tree to be the ideal clock and you do everything based on that. When we started looking at it, we started asking why clocks need to be balanced at all.” </p>
<p>So how much energy can be saved? Su contends the amount is up to 30% of clock-tree power and up to 50% of dynamic power for the entire system.</p>
<p>He’s not alone in touting these kinds of numbers. Most SoC tools developers believe that dealing with energy/power/leakage at or before RTL can mean significant savings for the overall design.<br />
“All the low-hanging fruit is still available to chip designers,” said Vic Kulkarni, senior vice president and general manager at Apache Design. “We find that even advanced designers are more concerned with meeting functionality and identifying power bugs. What they forget is the relationship between data, clock, reset and enable—the four signals in an SoC.”</p>
<p><strong>2. Reducing distance and resistance.</strong> Over the next two years the SoC industry will undergo a radical shift that will continue for years to come. Rather than plotting Moore’s Law linearly, transistors will be placed in three dimensions.</p>
<p>Driven partly by re-use, partly by time-to-market pressures and partly by physical limitations, 2.5D and 3D stacking will have an enormous effect on energy consumption and power. By stacking memory and other components on top of logic, the distance a signal must travel can be shortened significantly, along with the energy necessary to drive that signal.</p>
<p>“Moore’s Law is not a law,” said Wally Rhines, chairman and CEO of Mentor Graphics. “But the easiest way to reduce the cost of a transistor for the last 40 years has been shrinking feature sizes and growing wafer sizes. We are coming into an era where it will be more cost effective to stack die than to shrink feature sizes. We will hit it with memory before logic, but as with all new technologies we will adopt it before it is cost effective because of unique capabilities.”</p>
<p>Whether it’s done with an interposer, package-on-package, or flip-chip bumped die, Rhines said there is a 70% decrease in power dissipation if the memory can be put on top of a processor. </p>
<p>And that&#8217;s just for starters. By adding more processors that are sized for a particular function and tying that to just the right amount of memory, rather than a whole memory chip or block, far less power is needed. Companies such as Tensilica and ARM have been making this case for some time. With stacked die, their arguments are likely to receive far more attention.</p>
<p><strong>3. New materials and structures.</strong> Calling a material “new” is something of a misnomer in SoC design. Most of the techniques that we consider revolutionary have been around for decades, but they haven’t been developed enough to the point where they are cost effective, both from a yield and materials standpoint.</p>
<p>Through-silicon vias, for example, have been talked about since the late 1950s, and interposers in 2.5D packages are simply a collection of TSVs on a single die. But there are still issues to be worked out. Shang-Yi Chiang, senior vice president of R&amp;D at TSMC, said questions remain about how to integrate a substrate with an interposer, and how to debug it at different phases of development so it can be tested.</p>
<p>“There are a lot of parasitics to deal with in 2.5D,” Chiang said. “And with 3D we need time to make sure we can calibrate it.”</p>
<p>The other kind of 3D—structures such as FinFETs, tunnel FETs and nanowires—has been on the drawing board since the 1990s. All of these structures can lower leakage by controlling the gate at multiple points. FinFETs are planned in volume for 14nm by both GlobalFoundries and TSMC, while Intel may begin using them as early as 22nm. </p>
<p>These structures hold the promise of radically reducing leakage, along with both static and dynamic power, in all modes of operation—at least initially.</p>
<p>“The problem is these are a one-off thing,” said Mike Muller, chief technology officer at ARM. “FinFETs do reduce leakage, but once you’ve done that you’ve still got three impossible things to do before breakfast. Those kinds of steps are part of the solution.”</p>
<p>Muller said combining those with stacking techniques will go even further. “It opens the door to completely different die-to-die memory interfaces which allow you to build more efficient systems than when you go off the chip, down the serial interface to a separately packaged die. It changes the memory bandwidth, and this is just a computer at the end of the day so memory is one of the fundamentals for performance. Stacking allows you to change that.”</p>
<p><strong>4. Lowering the voltage.</strong> One of the benefits of 3D structures such as FinFETs and stacking of die is that they make it easier to lower the voltage in certain parts of the chip. The reason is that the minimum voltage for DRAM may be higher just to maintain functionality than it is for logic or I/O. By separating those functions into different die, issues such as state retention and leakage can be confined and dealt with independently—the so-called divide-and-conquer approach.</p>
<p>So how low can the voltage go? Several years ago, researchers at IBM said the minimum voltage for an SoC would be at least 0.7 volts. It now appears it can be as low as 0.1 or 0.2 volts, and research is under way to lower it even further. </p>
<p>“You can get down to 0.3 or 0.2 volts without any problems,” Qi Wang, technical marketing group director at Cadence, said during a <a href="http://chipdesignmag.com/lpd/blog/2011/10/21/experts-at-the-table-mobile-design-challenges-2/">recent roundtable</a>. “If you keep the aspect ratio of the depth and the height of a FinFET then you can guarantee the performance, but you do have other physical effects. Nothing is free. But the voltage can go much lower than what the textbooks say.”</p>
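<p>To put the voltages quoted above in perspective, the standard first-order model for CMOS switching power, P = a&middot;C&middot;V&sup2;&middot;f, can be used for a rough estimate. The model and the helper below are an illustration added here, not something the speakers cited, and they ignore leakage and the frequency loss that normally accompanies a lower supply voltage.</p>

```python
def dynamic_power_ratio(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of new to old switching power under P = a * C * V**2 * f.

    Activity factor and switched capacitance are assumed unchanged, so they
    cancel out. Frequencies default to equal, which is optimistic.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Dropping the supply from 0.7 V to 0.3 V at the same clock frequency.
ratio_at_0v3 = dynamic_power_ratio(0.3, 0.7)
```

<p>Under those assumptions the ratio comes out at about 0.18, or roughly an 82% cut in switching power, before accounting for any of the “other physical effects” Wang mentions.</p>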
<p><strong>5. Fixing software.</strong> Software is the last piece of the puzzle to fix, and it’s been one of the hardest for a number of reasons.</p>
<p>First of all, software takes longer to create and perfect than hardware. This is evident in all the bug fixes and updates. All three of the top EDA players are involved in this effort. Synopsys is working on software prototyping to allow software to be written even before the hardware is ready. Mentor has been involved in simplifying the creation of RTOSes and embedded software. And Cadence has shifted its design approach so that software and hardware can be done far more concurrently.</p>
<p>But getting software out on time is only a first step. The next step is to make software function more efficiently, an approach that dates back to the RISC vs. CISC wars of the 1990s. Reduced instruction set computing was more efficient than complex instruction set computing, which boosted performance. By taking that approach one step further, it also can reduce the amount of energy consumed by a particular task, and be used to manage the overall power in a system much more efficiently. </p>
<p>Work on symmetric multiprocessing continues, as well. How far that will go is anyone’s guess, but we now seem to be facing a limit on the number of cores that most applications can use effectively. Talk about an unlimited number of cores has given way to limited numbers of cores and unlimited numbers of processors spread throughout a system—most of which are off most of the time.</p>
<p>Taken together, all five of these trends will have a huge effect on efficiency, power and leakage. And now that battery life is a competitive issue, it also is likely to be used by vendors and seen as a value add instead of an unnecessary engineering cost—or worse, a nuisance.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/11/03/five-important-changes-that-will-affect-power/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power Bits: Driving A Harvester, Redesigning The Data Center</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/11/03/power-bits-driving-a-harvester-redesigning-the-data-center/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/11/03/power-bits-driving-a-harvester-redesigning-the-data-center/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 07:01:10 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[News Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[energy scavenging]]></category>
		<category><![CDATA[HP]]></category>
		<category><![CDATA[IDTechEX]]></category>
		<category><![CDATA[Intel]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3469</guid>
		<description><![CDATA[Energy scavenging will transform automotive electronics; HP takes a new tack in the server wars.]]></description>
			<content:encoded><![CDATA[<p>By Ed Sperling</p>
<p><strong>Highway harvesters</strong><br />
Hybrid cars may be taking on a new dimension. Rather than just running on batteries and gas, they could also generate their own energy.</p>
<p>This has happened already to a small degree with regenerative braking, which puts energy back into the car’s battery. But energy scavenging has come a long way since that concept was first developed.</p>
<p>The next challenge is to go well beyond just powering the infotainment in a car and actually use it to power the motors. <a href="http://www.energyharvestingjournal.com/articles/energy-harvesting-for-vehicles-00003843.asp?sessionid=1">IDTechEx</a> contends that future vehicles may have energy harvesting spread throughout them, from flexible photovoltaic cells that wrap around the car to energy-harvesting shock absorbers.</p>
<p>This creates a new category of car—one that generates as well as uses energy. But the key is that it can use much less energy, go significantly farther on a single charge or tankful of gas, and probably be accomplished for a very low additional cost. </p>
<p><strong>Cooler data centers</strong><br />
There have been two dueling problems inside of data centers for the past decade. One is heat, the other is energy. Both are related. The more servers, the more heat, and the more money it costs to power those servers and to cool them.</p>
<p>HP’s <a href="http://www.hp.com/hpinfo/newsroom/press/2011/111101xa.html?mtxs=rss-corp-news">announcement</a> that it is building extreme low-energy servers based on ARM’s architecture, an effort it calls Project Moonshot, takes an interesting approach to this problem. Rather than treating servers as individual machines, it allows resources to be shared throughout a data center. HP estimates energy can be cut by up to 89% using 94% less space. </p>
<p>Those are interesting numbers, and they speak volumes about the interest in energy costs rather than just performance. Energy costs are the main reason why Facebook just signed a deal to build a data center in northern Sweden, where outside air can be used to cool racks of servers 10 months of the year without turning on the chillers. They’re also the reason why data centers are being built along the Columbia River Gorge in Oregon, and in Arizona, which has a surfeit of energy produced by nuclear reactors.</p>
<p>This doesn’t end the war between Intel and ARM in this space, however. HP is planning to include Atom-based processors from Intel, as well, in the future. But the real key is that after nearly 60 years of focusing on performance in each subsequent server release, the central theme is now all about power. </p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/11/03/power-bits-driving-a-harvester-redesigning-the-data-center/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power Bits: The Battle For Mobile Mindshare</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/10/21/power-bits-the-battle-for-mobile-mindshare/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/10/21/power-bits-the-battle-for-mobile-mindshare/#comments</comments>
		<pubDate>Fri, 21 Oct 2011 14:41:44 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[News Stories]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Intel]]></category>
		<category><![CDATA[Mentor Graphics]]></category>
		<category><![CDATA[Power]]></category>
		<category><![CDATA[RTOS]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3429</guid>
		<description><![CDATA[In the race to win more converts, processor vendors are slugging it out over efficiency first, processing second; software takes on new significance. ]]></description>
			<content:encoded><![CDATA[<p>By Ed Sperling<br />
The race is on to provide enough performance gains to justify an upgrade while relentlessly pushing to extend battery life. The goal is nothing short of turning every mobile device into the processing equivalent of a notebook computer, regardless of the form factor.</p>
<p>This trend has been evident for some time, but momentum is building. ARM this week <a href="http://www.arm.com/about/newsroom/arm-unveils-its-most-energy-efficient-application-processor-ever-with-biglittle-processing.php">introduced</a> its “big.LITTLE” processing approach, along with a Cortex-A7 processor that it claims is five times more energy efficient than the Cortex-A8 with “significantly” better performance in just 20% of the area.</p>
<p>Intel’s 22nm Ivy Bridge chip, meanwhile, has entered production. Intel has said its Tri-Gate FinFET technology would be available at that node as well, dramatically reducing power. It’s questionable whether the rest of the industry will follow Intel on the FinFET path at 22nm/20nm, or whether it will substitute other technologies such as SOI to bridge the gap to 14nm. But either way, the direction is clear: more processing power, but with a big push toward energy efficiency and maybe some area gains thrown in for good measure.</p>
<p>This is only a piece of the puzzle, of course. Software is becoming much more power-aware. Mentor Graphics announced this week that it has added dynamic frequency and voltage scaling capabilities to the kernel of its Nucleus RTOS. While many consumers are used to this kind of capability in general-purpose OSes such as Windows and Mac OS X, RTOSes are also widely used in microcontrollers and in parts of a device that have not been particularly power-aware, if at all. This is the embedded space, after all, and for many companies this has been black-box technology (see <a href="http://www.mentor.com/company/news/mentor-nuclues-rtos">Mentor’s announcement</a>).</p>
<p>In the future, all pieces will be power-aware, including the IP, firmware and software applications. That should give a further boost to energy efficiency, and it will require co-development of hardware and software at an unprecedented scale. Power is a universal concern. There is only one battery, and everyone has to share it, and worry about it.</p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/10/21/power-bits-the-battle-for-mobile-mindshare/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Power Bits: Hidden Cores</title>
		<link>http://chipdesignmag.com/lpd/blog/2011/09/23/power-bits-hidden-cores/</link>
		<comments>http://chipdesignmag.com/lpd/blog/2011/09/23/power-bits-hidden-cores/#comments</comments>
		<pubDate>Fri, 23 Sep 2011 15:58:27 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[News Stories]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[ARM]]></category>
		<category><![CDATA[Nvidia]]></category>

		<guid isPermaLink="false">http://chipdesignmag.com/lpd/?p=3307</guid>
		<description><![CDATA[Nvidia’s new five-core chip turns a general-purpose graphics chip into a power-miser SoC.]]></description>
			<content:encoded><![CDATA[<p>Nvidia has an interesting surprise. Its upcoming four-core Tegra GPU actually has five cores. The extra “companion” core, an ARM Cortex-A9, will be used for less compute-intensive tasks to save battery power.</p>
<p>This is a new idea for a processor company, whether it makes GPUs or CPUs. It’s not a new idea for a systems company. Depending on how you define the system, SoC makers have been doing this for the better part of a decade, and Dell has been offering similar approaches in its laptops for years.</p>
<p>But what’s intriguing here is that Nvidia is basically turning the GPU into an SoC, and if you had mentioned that to Nvidia five years ago its executives probably would have stared at you like you were from another planet. But given Intel’s push into the SoC world, this is no longer such a foreign concept. Nvidia has just released a <a href="http://www.nvidia.com/content/PDF/tegra_white_papers/tegra-whitepaper-0911b.pdf">white paper</a> on the subject. </p>
<p>One of the interesting side notes in that paper is a hint at a basic flaw in Android 3.x, which appears to suffer from the same limitations as more mature OSes such as Windows and OS X. Android supports multiprocessing, but it assumes all cores are created equal. They are, but they shouldn’t be if power is an issue, which raises questions about what exactly general-purpose processors and operating systems will be used for, or limited to, in the future.</p>
<p>The approach that Nvidia has come up with is variable SMP, meaning cores get used as needed and tasks are split depending on where they can run most efficiently. It doesn’t make sense, for example, to do background maintenance on a GPU, while it also doesn’t make sense to do high-performance tasks on an A9. Efficiency is now the driver, and we are simply at the starting point for re-engineering just about everything.</p>
<p><em>&#8211;Ed Sperling </em></p>
]]></content:encoded>
			<wfw:commentRss>http://chipdesignmag.com/lpd/blog/2011/09/23/power-bits-hidden-cores/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

