Posts Tagged ‘ARM’

Experts At The Table: Low-Power Management And Verification

Thursday, March 11th, 2010

By Ed Sperling

Low-Power Engineering moderated a panel featuring Bhanu Kapoor, president of Mimasic; John Goodenough, director of design technology at ARM; and Prapanna Tiwari, CAE manager at Synopsys. What follows are excerpts of their presentations, as well as the question-and-answer exchange that followed.

Bhanu Kapoor: There are two types of power you need to consider: Dynamic power, which is consumed because you are doing some useful activity, and leakage power, which gets consumed whether you’re doing something or not.

Dynamic power depends on switching activity, frequency, capacitance and supply voltage. There are two components of leakage: sub-threshold and gate tunneling. Gate tunneling is addressed by advances in process technology such as metal gates. Sub-threshold leakage grows exponentially as the threshold voltage decreases. At 90nm it was significant, at 65nm it was equal to the dynamic power, and it grows from there.
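The dependencies Kapoor lists can be made concrete with the standard first-order models. The sketch below is illustrative only: the activity factor, capacitance, slope factor `n` and thermal voltage are assumed textbook-style values, not figures from the panel.

```python
import math

def dynamic_power(alpha, c_farads, vdd, freq_hz):
    """First-order switching power: activity * capacitance * Vdd^2 * frequency (watts)."""
    return alpha * c_farads * vdd ** 2 * freq_hz

def subthreshold_leakage_growth(delta_vth, n=1.5, thermal_voltage=0.026):
    """Factor by which sub-threshold current grows when Vth drops by delta_vth volts.

    Uses the usual exp(delta_vth / (n * kT/q)) approximation; n and kT/q are assumed values.
    """
    return math.exp(delta_vth / (n * thermal_voltage))

# Hypothetical block: 20% activity, 1 nF switched capacitance, 1.2 V, 500 MHz.
p = dynamic_power(alpha=0.2, c_farads=1e-9, vdd=1.2, freq_hz=500e6)
growth = subthreshold_leakage_growth(delta_vth=0.1)
print(f"dynamic power: {p:.3f} W, leakage growth for a 100 mV lower Vth: {growth:.1f}x")
```

The quadratic voltage term is why voltage scaling is so attractive for dynamic power, and the exponential term is why lowering the threshold voltage to recover speed makes leakage grow so quickly.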

If you look at the typical smart phone, it’s the same system-on-chip that is running different applications. These different modes of operation have different performance requirements. You can use different voltages to achieve those different levels of performance.

A typical power-managed SoC includes a power-management IC that provides different cores. One core can be a processor. And if it’s an ARM Cortex-A9, there is power management in that core, as well. A second core might be for mixed signal, which potentially could require higher performance. And then there is the power controller, which is on all the time.

All of these power techniques have an implication on verification.

[Slide 5]

If you look at standby leakage, one of the techniques is power gating, which is cutting off power to certain regions. If you don’t need portions of the chip to be on, you can completely shut them down. But that has an effect on performance, because turning a function on and off is a long event compared to a clock cycle. You sometimes need to retain the state so you can come back up fairly quickly.

All of this has an effect on verification, as you can see from the following chart.

[Slide 6]

If you can do gate-level simulation, that is very helpful. You need input/output and power connected, and you need to have appropriately modified your library definitions so power is one of the variables. With domain isolation, once you shut down you have to make sure you are not sending floating values to other regions. You have to isolate them to proper ones and zeros, which you can check with isolation gates using a rule-based checker.
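As a rough illustration of what a rule-based isolation check does, the sketch below flags any signal that leaves a switchable power domain without an isolation cell. The data model, domain names and signal names are all hypothetical; real checkers work from the netlist and a power-intent description rather than a hand-written list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Crossing:
    """One signal that leaves a power domain (all names here are hypothetical)."""
    signal: str
    src_domain: str
    dst_domain: str
    isolated: bool  # True if an isolation cell clamps the signal to 0 or 1

def missing_isolation(crossings, switchable_domains):
    """Return signals driven from a switchable domain into another domain without isolation."""
    return [
        c.signal
        for c in crossings
        if c.src_domain in switchable_domains
        and c.src_domain != c.dst_domain
        and not c.isolated
    ]

crossings = [
    Crossing("cpu_irq", "cpu", "always_on", isolated=True),
    Crossing("dsp_done", "dsp", "cpu", isolated=False),
    Crossing("wake_req", "always_on", "cpu", isolated=False),
]
violations = missing_isolation(crossings, switchable_domains={"cpu", "dsp"})
print(violations)
```

Only `dsp_done` is reported here: `cpu_irq` is already clamped, and `wake_req` originates in a domain that is never shut down, so it cannot float.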

If you have power in your simulation, a lot of rule-based issues can be addressed right up front. Over the years, simulation was not power aware. In the future, simulation will take a more and more important role. Simulation, by default, will incorporate power.

John Goodenough: We are verifying systems on chip. They’re large. They have lots of power domains to match all the application workloads that are going to be demanded on those devices. They have processors and software. Some of the domains are being switched on and off to meet the energy profile. They have virtually every technique available. The state space you’re trying to validate is therefore exploding by an order of magnitude.

One of the things we think about a lot at ARM is that it’s not so much the techniques that you can apply. It’s how you’re going to scale them to tackle these problems. There are lots of clever ways to validate, but not all of them scale effectively into workflows and onto your infrastructure. Power verification is not just about logical verification.

If you get a chip like the one below, you can mess it up in a lot of different ways.

[Slide 3]

Usually, you can fix it in software. But you also can mess up the connectivity between the power domains. If you get your level shifter or always-on buffer or retention register wired up wrong, it’s not going to work. It’s going to be D.O.A. on the bench. A lot of chip failures are being caused by the failure to verify the integrity of the power network.

That’s a non-standard piece of verification, particularly where that interacts with the logical function of the chip and you’re trying to measure the maximum in-rush current and the average in-rush current. If you’re switching domains on and off, what’s the power domain going to look like from an electrical perspective? Is turning one domain on and turning another domain off going to put the voltages on either side of a level shifter into a pathological state that will damage or degrade the transistors and the level-shifting buffer?

There are some very interesting cross-coverage issues between what is traditionally more of the analog verification space on the power network and the logical verification space. We need, when considering power simulation, to run abstracted analog simulations, SPICE-level simulations, and cross between the two.

Unfortunately, the explosion in power states is also increasing because of the number of software states or the number of field configuration states. From a verification standpoint, not only are you adding a multiplier due to power states, you also have things like a secure or non-secure state. Will they work when a chip is configured for a single package and pinout if it uses another package and pinout? There’s an explosion in these operating modes.

The other pressure we have is making sure you’re going to hit a given schedule. In looking at the power metrics it’s important to see how they can be applied into practical workflows and how you can feed performance metrics from wherever you are in the process back up into reporting and closure reporting. If you combine the need for those two, one of the things it leads to is enterprise scaling, both in terms of infrastructure to support the simulation and how you scale this across workgroups that are not co-located.

The other problem you face is that if you do all of the verification, you’re never going to get the chip out the door. You’ve got to have a verification plan and really narrow down which of the power modes are going to be pathological and which ones can be worked around in software. A major part of this power verification is the integration of a VP of engineering risk-reduction play into a more mainstream verification practice.

We’ve come a long way in a lot of the techniques, but at the end of the day you have a block diagram that needs to be simulated. Today that block diagram consists of RTL and some way of describing the power network or the power intent and power state space of the design. You also have to support the verification IP and transactors. You need coverage across the RTL and the power descriptions. It’s not rocket science. It’s just a more complicated block diagram.

[Slide 4]

Smart-Grid Designs Solve Low-Power Riddles

Thursday, February 11th, 2010

By Ellen Konieczny

Imagine that you go to your kitchen to get a drink and pass your home’s energy-usage monitor. Due to a recent heat wave, you see that your energy usage is already at what it usually is for the entire month. Yet you’ve still got one week left in your billing cycle. To keep the bill low, you turn your A/C thermostat up a degree and make a mental note to not keep lights on unnecessarily.

The next day, the weather is more comfortable. You log in from work and turn the A/C off completely. Such capabilities are not farfetched, thanks to plans to roll out smart-grid networks across the globe (see Figure 1). In fact, some utility companies have already tested these technologies. For such two-way communications to be realized on a grand scale, however, the infrastructure, smart meters, and millions of wireless devices involved will need to consume minimal power.

Fig. 1: The various aspects of the smart grid and how they will be connected. (Courtesy of Ember Corp.)

Emmanuel Sambuis, general manager for the metering business at Texas Instruments, says water and gas meters now require 20-year operation from the same battery. In some devices, the requirements are now as much as 25 to 30 years, particularly in areas where batteries are difficult to access or where there are so many devices that changing out batteries can become expensive. In some extreme cases, companies have been developing energy-scavenging solutions that require no batteries at all.

What’s changing, however, is the addition of low-power communications technology inside of even home-area-network products, such as in-home displays and intelligent thermostats. Generally, such low-power communications are RF-based. In the case of power-line communications, regulations also apply and force the energy consumption to be minimal. To raise energy efficiency in e-metering applications, TI has developed an SoC microcontroller that integrates all metering functionality onto a single chip, with ultra-low-power operation so that only simple voltage regulation is required for a complete solution. The MCU provides direct device operation from a 3V supply with the CPU and ESP active at only 2.5 mA. During a power outage, the device can operate in standby mode at 1.1 µA with the real-time clock function active.

It is essential to keep in mind that smart-grid devices will most likely be asleep for the majority of the time. “The biggest challenge is in enabling the battery-operated devices to not be awake for long periods as well as for them to be able to join the network, acquire and process any data as quickly and efficiently as possible, and go back to sleep,” said Skip Ashton, senior vice president of engineering at Ember Corp. “Designing technology–radio, processor, and networking software–which enables devices to do that reliably and securely is the crux. A user of a battery-operated device expects instant operation and control when they are using it but long battery life when they are not. This type of ‘instant-on’ capability requires coordination of the radio as well as the software controlling the devices.”
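A back-of-the-envelope calculation shows why time spent asleep dominates battery life. The sketch below borrows the 2.5 mA active and 1.1 µA standby currents quoted above for the TI MCU purely as illustrative inputs; the 1% duty cycle and 2400 mAh capacity are assumptions, and battery self-discharge and radio current are ignored.

```python
HOURS_PER_YEAR = 24 * 365

def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Time-weighted average current for a device awake a fraction duty_cycle of the time."""
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma

def battery_life_years(capacity_mah, avg_current_ma):
    """Idealized battery life: capacity divided by average draw, ignoring self-discharge."""
    return capacity_mah / avg_current_ma / HOURS_PER_YEAR

avg_ma = average_current_ma(active_ma=2.5, sleep_ma=0.0011, duty_cycle=0.01)  # assumed 1% awake
years = battery_life_years(capacity_mah=2400, avg_current_ma=avg_ma)          # assumed capacity
print(f"average draw: {avg_ma * 1000:.1f} uA, idealized life: {years:.1f} years")
```

Even at a 1% duty cycle the active current accounts for most of the average draw, which is why waking, joining the network and returning to sleep quickly matters as much as the sleep current itself.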

Thankfully, standards bodies like the ZigBee Alliance (www.zigbee.org) include low-power operation as a critical goal as they develop their protocols. For suppliers implementing the protocols and hardware, however, Ashton emphasizes it is important to view the technology offering as a “system” that includes hardware and software. “A tightly integrated platform, which has been developed from the ground up to work together to deliver excellent performance, efficiency in code size, and processing of security and application data, lends itself to better resolve the challenge of minimizing power consumption and extending battery life. Although the standards can prescribe a certain level of behavior, different suppliers can innovate within the standard to improve performance,” Ashton says.

In addition to running IEEE 802.15.4/ZigBee wireless, for example, the MeshConnect modules and integrated circuits (ICs) from California Eastern Laboratories promise to get good range out of a very low-powered device (Fig. 2). According to David Cohen, director of marketing, and Rich Howell, director of business development, the MeshConnect modules and ICs deliver +7 dBm of output power natively (i.e., without using an external power amplifier). The MeshConnect technology draws less than 0.3 µA in standby mode.

Figure 2: With sleep-mode power consumption below 1 µA, the MeshConnect Extended Range Module offers extended battery life

Although such products are impressive innovations on their own, they are only part of a bigger picture. A successful smart grid will require close collaboration between providers of communication devices (RF transceivers and processors, for example), providers of communication software, and designers of communication systems. In addition, success will largely depend on advances in signal processing.

“Smart-grid technology developers look to advanced digital and analog signal-processing technology to power next-generation energy infrastructure,” said Ronn Kliger, energy group director at Analog Devices. “By leveraging ICs optimized for a range of smart-grid applications—from energy-metering solutions to dynamic, grid-integrated management and communication systems—developers are able to design intelligent systems that promote energy efficiency and management flexibility.”

Along with energy-metering ICs, the firm offers RF, power-line carrier communication, power management, and digital signal processing in support of smart-grid applications.

Measurement capabilities also will need to be fine-tuned, as standby or “vampire” power poses a clear threat to the smart grid’s low-power-consumption efforts. Standby power results from electronic devices that are plugged into wall sockets, such as TVs, DVD players, cell phones, and answering machines. Whether they are on or off, they consume power 24 hours a day. It is difficult to accurately measure the power that they actually use, however, which is why they must be accounted for in the smart grid. A number of companies have developed ways to measure even the smallest amounts of energy usage. For example, Teridian claims to provide accuracy of +/-0.5% over a 2000:1 dynamic range.

Of course, the most frightening specter haunting power consumption in smart-grid devices may be standby current. This issue becomes especially critical at the finest silicon technology nodes, where transistor leakage current starts to dominate, and the problem increases with density. Meanwhile, the dynamic power dissipation arising from high-frequency switching of tens of millions of transistors directly affects battery life, packaging and cooling costs, form factor, and reliability.

All of the major EDA companies are now advising SoC developers to consider low power as part of the architecture rather than something implemented later in the flow, and entire flows are becoming power-aware and power-optimized. That goes for the process as well as the components. Third-party IP vendors such as Virage Logic, ARM and Synopsys have now standardized on low-power versions rather than splitting their product lines between IP geared for low power and IP geared for performance. Even in the FPGA space, where concern for power was either an afterthought or a non-issue, all of the major vendors are now offering lower-power solutions. Actel has even developed chips that rival the power consumption of some of the most advanced ASICs.

Low-Power Architectures Go Mainstream

Thursday, January 14th, 2010

By Pallab Chatterjee

Until recently, low-power engineering has been defined by the automated use of EDA tools in the design flow to help cut back on peak dynamic power. The new generation of mobile and video products has forced a change in that methodology.

There are two other fast-rising architectural approaches. The first is multicore, which is prevalent in new product introductions from Nvidia, Samsung SLSI, Imagination Technology, NetlogicMicro, Broadcom, and Qualcomm. To address the usability specs required by e-readers, mobile Internet devices and other mobile information products, a new compute architecture was needed that did not just rely on “function disabling” as a power reduction technique. All of these companies introduced designs that are focused on multicore architectures, where there is complete functionality available at all times even though the process has been optimized for low power.

This low power optimization has to do with custom library design creation, modification of internal clocking schemes, datapath and buffer optimization, memory segmentation and placement, and most importantly dynamic control of the design’s power use and speed based on the data content of the information being processed on a per-packet basis. This re-architecture of products was the key enhancement with the new dual Cortex Nvidia Tegra, which is targeted to e-readers and tablet PCs, as well as the high-performance Alchemy multicore and multithreaded processors for automotive and navigation applications, and the many new video and communications appliances from Broadcom and Qualcomm.

The basis for most of these systems is ARM processor cores (primarily the A8 or A9) or MIPS cores. This shift has allowed both a performance increase in the end systems and a near doubling of operating battery life.

The second prevalent low-power methodology is the segmentation of the design into a CPU and a GPU rather than a single compute engine. While the initial impression is that this takes more power, the GPU is actually more power-efficient on graphics and some video data than the CPU, and on general-use functions the CPU is more power-efficient than the GPU. For most of the smart phones and media processing chips, this approach has replaced bigger single-processor cores with clock-gating and multi-voltage device process solutions.

These architectural changes were implemented to address both the data dependence of power use and the yield-process variability of sub-wavelength manufacturing. Because most of these applications have a very thin and small form factor, they are bound by a fixed or diminishing power envelope. To extend operating time, components can lower the operating voltage, but that brings an associated reduction in performance within the power envelope. To address this aspect of design, mobile handset and mobile computing requirements have driven designs to the smallest-geometry process flows available.

The utilization of these processes (45nm and 40nm, currently) requires restricted design rules, restricted topologies and limited device size diversity to yield well. These designs are optimized with new RTL and physical libraries, new floor plans, and power routing to highlight the data path symmetry that is required by the data sets being processed. One example is a new 3D media processor in 40nm from Samsung for mobile phones, which uses the IMG Tech 3D video and graphics engine and a high-performance, ultra-low-power ARM CPU.

The distributed multicore approach also has been used to lower power in high-performance products. AMD/ATI introduced the 5970 Radeon graphics card at the Consumer Electronics Show. The card has two GPUs and is a DirectX 11 product with more than 4.6TFlops of peak performance. The restructuring of the device/cell library, its reliance on proven 40nm bulk CMOS processing and the use of GDDR5 memory allow the product to operate with a peak power of about 300 watts while requiring only 51 watts for nominal operation. The design was optimized for power and for a data control flow that supports the 3200 parallel stream processors and the 160 texture units. Dynamic power is managed based on how many streams and texture units are needed at any time, which depends on the contents of the data being processed on any given cycle.

Most of these new systems are targeting use of Samsung’s low-power DDR3 memory, which operates at 1.3V vs. 1.5V and offers higher densities than DDR2. These higher-density, low-power solutions can provide in excess of 35% overall power footprint reduction for the design if used with 32nm low-power flash memories in SSD applications rather than rotating media.

The takeaway from CES this year is that architectural engineering and new firmware control methods are now seen as essential to address the functional requirements of the new mobile communication and processing platforms. This is an intelligent shift from recent years, when only feature size reduction and blind tool-based selection of power gating and power routing were in vogue.

Low-Power Tools Lag Behind Cores

Thursday, January 14th, 2010

By John Blyler

Despite the preponderance of news on low power, especially at the recent Consumer Electronics Show, automated design tools have not been well received by low-power chip designers, according to a recent Chip Design Trends (CDT) “EDA Tools and Technology” survey.

These same respondents listed their primary design areas as digital logic, mixed signal and embedded processors, all areas where architectural power decisions lock in the power usage of the end product. But while power tools may come up short among chip designers, the processor IP manufacturers and OEMs have been steadily decreasing the energy consumption of their cores.

[Figure: Type of IC, EDA survey]

For example, Intel just announced its 2010 core lineup of processor platforms. Among them were several embedded processors with core power consumption far below the previous standard levels. In addition to lower power, Intel is pushing the software side of power design – at the chip and board levels. (see “New Intel Processors Benefit Embedded and Challenge Software.”)

On the mobile handset side of the market, ARM continues to make strides in reducing its core power consumption. As James Bruce, Mobile Segment Manager for ARM, puts it: “OEMs typically specify the maximum power of the processor at 300mW for handsets. This power limit is constrained by battery technology and the size of the consumer’s pocket.”

With battery technology advancing slowly, power tools and IP will have to take up the slack. Moreover, as low-power becomes an important consideration in all designs, this picture is likely to change significantly.

Differentiating Embedded Processors

Thursday, December 10th, 2009

By Ann Steffora Mutschler

The embedded processor world addresses a vast range of applications – from the datacenter to the biomedical device – all of which have critical power needs that vary with the use. Power concerns continue to dominate the embedded system, whether the goal is to avoid a noisy fan in a TV set-top box, allow video on a mobile phone or minimize pricey cooling costs in the datacenter.

The leading vendors in this space – ARM, MIPS and ARC (recently acquired by Virage Logic) – employ a variety of tactics and technologies to differentiate, from trimming power by removing some features to making tradeoffs in the embedded software that runs on them.

Putting power into the embedded perspective, Yankin Tanurhan, vice president and general manager for Processor and non-volatile memory Solutions at Virage, explained that in a system where a host does most of the main processing, adding small, deeply embedded processors enables power to be saved on the big number cruncher. The smaller processors can listen to the analog signals, detect certain thresholds, and do whatever small level of signal processing needs to be done in a much smaller power budget than the ‘big gun’ would be using.

ARM looks at the embedded processing world in three segments, which its processor platform addresses in three ways, according to Travis Lanier, a product manager at ARM. ARM’s Cortex-A series is targeted at higher-end applications processing and is geared to running high-level operating systems such as Linux, Android and some flavors of Windows. The Cortex-M series is aimed at the microcontroller space for more deeply embedded applications that run RTOSs, with a focus on minimal gate count and extremely low energy. ARM’s Cortex-R series doesn’t have the ability to run the high-level OSes, but is instead focused on high performance and has a similar feature set.

In terms of differentiating, Lanier noted that it is important to compare the current lineup with what ARM offered historically, and that the company has built on that base to refresh its product line.

On the technology front, ARM uses multiprocessing technology in its embedded processors. “We have some fairly small processors and generally when you think multiprocessing you think of the larger processors – you want to go faster, you want to do more operations per second. With the Cortex-A5, we’ve taken a different twist on that, and we have a fairly small, extremely power efficient processing,” he said.

Multiprocessing in the ARM Cortex-A5 allows up to four cores in a single configuration, doing four times the work of one core, with each core consuming a fraction of the energy a higher-performance processor would. So rather than running one processor at 2GHz, for example, there can be four processors running at 500MHz, with each one taking a fraction of the energy that a 2GHz processor would take.
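The arithmetic behind the four-cores-at-500MHz argument can be sketched with the first-order relation that dynamic power scales with Vdd² × f. The supply voltages below are hypothetical, chosen only to show the effect of running slower cores at a lower voltage, and the sketch assumes the workload parallelizes perfectly across four cores.

```python
def relative_dynamic_power(cores, freq_hz, vdd):
    """Dynamic power in arbitrary units: cores * Vdd^2 * f (same capacitance and activity per core)."""
    return cores * vdd ** 2 * freq_hz

# Same aggregate clock rate (2 GHz) in both cases; voltages are assumed, not ARM figures.
single = relative_dynamic_power(cores=1, freq_hz=2.0e9, vdd=1.2)
quad = relative_dynamic_power(cores=4, freq_hz=0.5e9, vdd=0.9)
ratio = quad / single
print(f"quad-core power relative to single core: {ratio:.2f}")
```

With equal voltages the two configurations would draw the same dynamic power, so in this model the saving comes entirely from the lower voltage that the slower clock permits.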

“When you try to go faster it takes more and more energy, and when you try to get more done per clock cycle it takes more and more energy. We recently released the Cortex-A5 and this is pretty much our most power-efficient processor ever. Rather than just looking at absolute low power or absolute frequency, when we set out to design this one, we looked at a series of benchmarks and said, ‘What’s the maximum amount of work we can get done per unit of energy?’ Rather than saying, ‘We’re targeting 300mWs or we’re targeting 500MHz,’ we said, ‘Let’s just go for the maximum work we can get done per unit of energy,’” Lanier added.

MIPS’ technology, specifically its M14K and M14Kc cores, makes use of a new code compression technology that maintains the native performance of the MIPS architecture. “Most of the applications that MIPS is used in and that comprise many of the embedded markets are very cost sensitive. You can look at cost from many different angles but your code has to be stored somewhere and the smaller you can make it, the less expensive the overall system is, and some systems are more sensitive to that than others,” said Mark Throndson, director of product marketing at MIPS.

For instance, MIPS decided to target the microcontroller space initially because microcontrollers in many cases are basically a self-contained piece of silicon containing on-chip memory and what you see is what you get. “So clearly packing in all of the code into on-chip flash is a very space-critical implementation. But even with portable products, the effect on power and limited built in flash tends to push code space as a higher priority,” he explained.

MIPS also employs coherent multicore technology that includes hardware multithreading in each of the cores in the system. Two to four connected cores appear as one unit to a symmetric multiprocessing operating system.

Software becomes big consideration

In terms of software, there are definite considerations – depending on the application – for the embedded processor. Lanier said this occurs many times with software such as device drivers. On the embedded level, if the software is running on the applications processor, for example, the device driver runs in the background and detects how much activity is on the processor and automatically shuts it down when necessary to conserve power. So when an application is running, it doesn’t interfere.

Then, in the deeply embedded space, such as a Bluetooth headset or biomedical application, the software must be written from scratch and be very power aware as to what modes the processor is put into and how it is activated. You have to focus on getting every little scrap of power you can dig up from the processor.

“The biggest challenge now is that everyone basically wants to have a mainframe computer inside their cell phone. So the question is how to keep adding more and more performance to these small devices where the battery life hasn’t increased. So you still have the same power budget that you had 10 years ago in the battery yet people are expecting more and more performance out of these devices. Part of that you get from Moore’s Law. But lately, as the process technologies have shrunk the voltages have slowed down as far as how much it has dropped. So a lot of the power gains you have from the process technology where the voltage would drop with each new process generation have kind of slowed down starting at 65nm. It’s a lot more incremental how much you reduce the voltage. There’s not as much power savings to be had from Moore’s Law, although the number of transistors increases,” he said.

The key thing about software in an embedded system that wants to operate at low power is that when you have a high-end processor like that, the trick is knowing when you are running out of things to do or ahead of schedule, said Darren Jones, MIPS’ engineering director.

“You want to think about if it is a camera, once you take a picture and you know you are done processing it, the whole thing can shut off,” Jones said. “So the trick is to get done processing as quickly as possible so the whole thing can shut down. Even though it is difficult for hardware to help this, the software can tell them how they are doing against their deadlines. And if they can know that, then they can use the hardware hooks we put in to bring the power down. Whenever you bring the power down, it is costing you something – lowering the MHz or frequency – so you get less performance out of it. You don’t want to do that in a real-time system like your antilock braking system. If it is extremely low power, something like a hearing aid or even an automotive application, then the trick is being super efficient with your code. MIPS processors and the MIPS architecture as a whole are very efficient. When you want to get work done, you get it done as efficiently as possible, get the work done as quickly as possible and then go to a lower power mode.”
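One way to picture software telling the hardware how it is doing against its deadlines is a policy that picks the slowest clock setting that still finishes on time. This is only a sketch of that idea, not MIPS’ mechanism: the function name, cycle counts and clock settings are hypothetical, and it does not model the alternative Jones also describes of finishing as fast as possible and then shutting down.

```python
def pick_frequency(cycles_remaining, seconds_to_deadline, available_hz):
    """Return the lowest available clock that still finishes before the deadline.

    Falls back to the fastest clock when no setting can meet the deadline.
    """
    for freq in sorted(available_hz):
        if cycles_remaining / freq <= seconds_to_deadline:
            return freq
    return max(available_hz)

steps = [100e6, 250e6, 500e6]  # hypothetical clock settings in Hz
relaxed = pick_frequency(1_000_000, 0.005, steps)  # plenty of slack before the deadline
tight = pick_frequency(1_000_000, 0.001, steps)    # deadline cannot be met at any setting
print(relaxed, tight)
```

A hard real-time system such as the antilock braking example would typically skip a policy like this and keep the clock high, since the cost of a missed deadline outweighs the energy saved.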

Jones noted that low power used to be confined to portable electronic devices such as cell phones to conserve battery time. It now has moved to televisions, data centers and almost everything else imaginable.

“Companies are really starting to pay attention to power as a goal, not just performance,” he said. “So even in high performance applications they can’t just burn power.”

The Week In Review: Nov. 20

Friday, November 20th, 2009

By Ed Sperling

Business seems to be picking up everywhere in the design world, with an emphasis on speed—quicker deals, faster product rollouts and overall time to market—and all of it with an underlying emphasis on low power and tighter power budgets. Could it be that after the recession, everyone is trying to get back on track quickly?

Virage Logic completed the acquisition of NXP’s IP technology and its development team. That comes on the heels of its recent acquisition of ARC. The fact that Virage completed both of these acquisitions in a 12-day period is nothing short of an accounting miracle. And just in case the company didn’t have enough to do, it added a Silicon Browser for post-silicon bring-up and system debug.

Android seems to be getting its share of attention these days. Mentor Graphics introduced an Android Development System for Texas Instruments’ OMAP35x processors. TI’s processors also include ARM Cortex-A8 technology, which puts ARM squarely in the center of this effort, as well, with a heavy push toward better battery life. But will any of this take a bite out of the Apple iPhone?

On the get-things-done-quicker side, Digital Imaging Systems used Synopsys’ Galaxy Custom Designer to achieve first-pass silicon in 22 days. Not all of it was from scratch, of course, but that’s still a very tight timetable.

And Atrenta’s deal with Fujitsu’s Kyushu Network Technologies is aimed at reducing design risks in integration of third-party IP from multiple vendors with different clock domains. Translation: Faster time to market.

Also on the business side, Cadence expanded its design alliance with Toshiba for the consumer and mobile markets.

Intel invested millions of euros in an Exascale Computing Research center in France, as part of Intel Labs Europe. This is the second time in two weeks that Intel has paid out big bucks to appease antitrust regulators. This deal will add 900 new research jobs in Europe. That follows Intel’s settlement with AMD, clearing the way for Intel to go after ARM with its Atom chip.

ARM’s comeback was largely a reiteration of the strength of its ecosystem. It struck up a strategic architectural license agreement with Infineon for advanced security applications and created a solutions center for Android.

Hypervisors For Managing Power

Thursday, November 12th, 2009

By Ed Sperling
Hypervisors are headed for a new role inside of multicore chips—managing the various power islands in addition to the cores.

A patent application filed by IBM, entitled “Method and system for hypervisor based power management,” shows the company’s intention to use hypervisors for everything from monitoring power consumption rates to scaling power for individual cores. http://www.faqs.org/patents/app/20080301473

In the well-documented history of hypervisors, this marks a major shift in direction. Hypervisors have been used primarily for running virtual machines on single or multiple cores and for directing applications to take advantage of one or more cores. In effect, they have worked like rudimentary traffic cops, scheduling software functions for processors, memory, logic and buses.

Adding power into the mix changes the basic concept in two fundamental ways. First, it means the operating system becomes less important in a multicore system because critical decisions about what gets turned on and off, how much power is assigned to different processors or other parts of the chip, and what gets prioritized are made by the hypervisor layer rather than the operating system. And second, it means getting chips out the door will become immensely more complicated because just thinking about all the possible permutations for verifying these kinds of systems makes your brain hurt.

“A hypervisor for low power management certainly can work,” said Marc Bryan, product marketing manager for Mentor Graphics’ Codelink products. “This is an extension into the SoC world and configurable IP. Software developers want middleware capability to control the power demands with the SoC.”

He noted this works in both multicore and single core chips and becomes particularly useful in chips with multiple configurable power domains, such as an advanced ARM processor that can contain 14 of those domains. But it’s also like building complexity on complexity.

“This definitely opens up a new set of challenges in design and verification,” he said. “You’re adding complexity in the power domain. The challenge is verifying it. You have to make sure the hardware switches on and off and that the software is included. And with power management software, you have to make sure you can turn on and off the power domain and that the software works correctly with the hardware.”

This is no simple feat. In fact, to the best of anyone’s knowledge, it has never even been attempted.
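To give a rough sense of why the permutations pile up, here is a minimal sketch that assumes each power domain is simply on or off and that any configuration can switch to any other. Both are simplifications, and the function names are hypothetical. Real designs add retention and multiple voltage levels, which only makes the count larger.

```python
from itertools import product

def count_power_states(num_domains, states_per_domain=2):
    """Number of distinct chip-level power configurations."""
    return states_per_domain ** num_domains

def count_transitions(num_domains, states_per_domain=2):
    """Ordered pairs of distinct configurations (assumes any-to-any switching)."""
    n = count_power_states(num_domains, states_per_domain)
    return n * (n - 1)

def enumerate_states(domains):
    """Yield every on/off assignment as a dict; practical only for short lists."""
    for bits in product((False, True), repeat=len(domains)):
        yield dict(zip(domains, bits))
```

For the 14-domain processor mentioned above, count_power_states(14) gives 16,384 configurations and count_transitions(14) gives more than 268 million possible transitions, before software behavior is even considered.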

From the beginning
The concept of a hypervisor has been around for decades. IBM introduced the first implementation back in the 1970s with its System/370 mainframes as a way of virtualizing applications running on the mainframes to make them more efficient.

Fast forward to 2005 and that same technology showed up in the eight-core Cell processor, which IBM created with Sony and Toshiba. Sony used seven of those cores for its PlayStation 3, plus a hypervisor to manage all the cores. It was the classic example of smaller, faster and cheaper compared to the complex multi-million dollar mainframes that were the size of multiple refrigerators.

Almost simultaneously, the same general concept began showing up to manage virtual machines in virtualization software created by companies like VMware and Citrix, which allow multiple operating systems to run on a single core or multiple cores. They also allowed multicore servers to run at higher utilization than the 15% to 20% average at which many were operating, a level that wasted both the power to run the machines and the power to cool the server racks.

Using hypervisors to manage the power itself, however, is new and shows the resilience of this concept of adding programmable controls for functions that typically have been handled by hardware.

“The hypervisor is a way to really start giving us control over power in SoCs,” said EDA consultant Gary Smith. “Put that together with an NoC and you really start moving toward an ESL view of power.”

Market realities
It still could take years before this concept shows up in power management of SoCs, however. While there is a compelling need to simplify power management on chips, this may not be the only approach or even the best approach.

Right now, many of these functions are assigned to the operating system. It’s possible that the operating system can start offering these kinds of capabilities rather than a hypervisor, or that a more robust hypervisor will be built into operating systems.

But hypervisors, at least in IBM’s view of the world, have a distinct advantage. In IBM’s model, the hypervisor runs between the metal and the operating system, almost like an enabling set of middleware. The result is that it can take advantage of whatever changes are made to the hardware and whatever hooks are added much more quickly than those changes can be added into the operating system, where backward compatibility of applications is vital. (See Figure 1)

Figure 1: IBM's hypervisor design


“One problem with doing power in the hypervisor is in the area of security,” said Barry Pangrle, solutions architect for low power design and verification at Mentor. “If you’re creating a medical device and you put more into the hypervisor, that means the hypervisor layer now has to be certified.”

The flip side is that the hooks in the hardware are going to be much more readily available to a hypervisor layer built for a specific purpose than an operating system. “When you’re talking about dynamic voltage frequency scaling, for example, those capabilities tend to run well ahead of what the software guys are using when they write their code. One way to deal with that is to make the OS smarter and use some of the statistics dynamically to help bring down the power levels.”

Another way is to develop new code that wedges between the hardware and the operating system, which is one of the models now being considered in the virtualization world. But when this gets to market and in what form is unknown. What’s interesting is there is a need and a method, and from here anything can happen.
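The idea of software using run-time statistics to bring down power levels can be sketched as a simple utilization-based governor. This is an illustration only, not IBM's or any operating system's actual policy. The operating points, thresholds and function names are all hypothetical, and the power estimate assumes dynamic power scales with frequency times voltage squared.

```python
# Hypothetical operating points: (frequency in MHz, supply voltage in volts).
OPERATING_POINTS = [(300, 0.9), (600, 1.0), (800, 1.1), (1000, 1.2)]

def pick_operating_point(utilization, current_index, up=0.80, down=0.30):
    """Step one operating point up or down based on recent utilization (0.0-1.0)."""
    if utilization > up and current_index < len(OPERATING_POINTS) - 1:
        return current_index + 1   # busy: raise frequency and voltage
    if utilization < down and current_index > 0:
        return current_index - 1   # mostly idle: lower frequency and voltage
    return current_index           # otherwise stay put

def relative_dynamic_power(index):
    """Dynamic power relative to the top point, assuming P is proportional to f * V^2."""
    f, v = OPERATING_POINTS[index]
    f_max, v_max = OPERATING_POINTS[-1]
    return (f * v * v) / (f_max * v_max * v_max)
```

With these made-up numbers, the lowest operating point draws roughly 17% of the dynamic power of the highest one. Whether that decision belongs in the operating system or in a hypervisor layer is exactly the open question.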

Power Delivery Issues

Wednesday, November 11th, 2009

By Ed Sperling
Reducing the voltage in a system on chip is like turning down the water pressure on a home plumbing system. Pretty soon you find out that not all the faucets work properly because there isn’t enough pressure behind them.

While it’s vital to drop the voltage to boost battery life in mobile devices, not to mention reduce the overall power consumption in plug-in devices, the effects aren’t always well understood ahead of time. Power delivery changes with the voltage, and not always in anticipated ways. The problem is that chips are getting so complicated with power islands and multiple cores that it’s difficult to anticipate all the possible permutations up front.

“There are indeed challenges,” said Jan Rabaey, who heads the Wireless Research Center at the University of California at Berkeley. “Fluctuations in currents are an obvious result of turning domains on and off.”

In fact, the more abrupt the on/off states, the greater the likelihood of power delivery problems. “It’s like hitching a car to a trailer and taking off,” said Srikanth Jadcherla, group director for R&D in Synopsys’ verification group. “It doesn’t move the same way.”

And the more power islands, the worse those problems get. “This is something that’s well known in the cell phone industry,” said Bhanu Kapoor, head of Mimasic, a low-power consultancy. “They’ve got ARM cores, DSPs and memory blocks on a cell phone processor and they have a power supply for all of these different modules. But when you need to switch on a new block, the power supply has to deliver power to both. The power supply inductor tries to guard against any change, though, so it actually gives parts a lower voltage. That causes a temporary malfunction.”
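The droop Kapoor describes follows from the basic inductor relation V = L * dI/dt: the faster the current demand changes, the larger the voltage lost across the supply inductance. A minimal sketch, with a hypothetical function name and illustrative values that do not come from any real design:

```python
def inductive_droop(inductance_h, delta_current_a, ramp_time_s):
    """Voltage dropped across a supply inductance: V = L * dI/dt (linear ramp assumed)."""
    if ramp_time_s <= 0:
        raise ValueError("ramp_time_s must be positive")
    return inductance_h * delta_current_a / ramp_time_s
```

For example, a 0.5 A step through 1 nH of inductance costs about 50 mV if the block switches on in 10 ns, but only about 5 mV if the ramp is stretched to 100 ns. That is why abrupt on/off transitions are the ones that cause trouble.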

Thinking about delivery in the architecture
While the effects of power islands have gotten the lion’s share of attention in low-power designs, they’re certainly not the only things that can go wrong. Failing to account for all possibilities up front can cause problems that grow as the chip moves from architecture to design and verification.

“Blocked frequencies and domains shutting off are a result of badly designed power distribution networks, which can happen even if you don’t have power islands,” said Rabaey. “By changing the resonant frequencies of the power network, you may see potential interplay with the clock frequency of the modules. But again, this is a generic problem with power distribution networks and has nothing to do with having power islands or not.”
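The interplay Rabaey mentions can be roughly approximated by treating the power network as a single inductance and capacitance, whose resonance is f = 1 / (2 * pi * sqrt(L * C)). The sketch below makes that lumped-model assumption. The function names and the 10% tolerance are hypothetical, and a real power distribution network has many resonances rather than one.

```python
import math

def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of a lumped LC model: f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

def near_resonance(clock_hz, resonance_hz, tolerance=0.10):
    """True if the clock falls within a fractional tolerance of the resonance."""
    return abs(clock_hz - resonance_hz) <= tolerance * resonance_hz
```

With 1 nH and 100 nF, the resonance lands at about 15.9 MHz, so a 16 MHz clock would be flagged while a 100 MHz clock would not.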

Problems also grow as the semiconductor process shrinks. One of the problems in delivery of power at smaller geometries is the width of the wires themselves. While most engineers went through school with the assumption that electrons move through wires at a fairly constant rate (depending upon the type of wire rather than its thickness), that’s clearly not the case. IBM first began noticing earlier in the decade that the resistance of smaller wires was increasing due to electron collisions with the atoms in the wires. Increased density meant more collisions.

The typical route for chipmakers is to engineer a solution to these kinds of problems. But that also increases the complexity and the price, because it usually means more parts. A 10-cent decoupling capacitor for a chip that is sold in quantities of 50 million adds $5 million to the overall price. And that doesn’t include the additional cost for assembly, which typically adds another nickel, or $2.5 million.
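The arithmetic is easy to reproduce. The short sketch below uses Python's decimal module to keep the cents exact. The function name is hypothetical, and the figures are the ones quoted above.

```python
from decimal import Decimal

def added_cost(per_unit_dollars, units):
    """Total added cost in dollars for a per-unit adder across a production run."""
    return Decimal(per_unit_dollars) * units

UNITS = 50_000_000
part_cost = added_cost("0.10", UNITS)       # the decoupling capacitor itself
assembly_cost = added_cost("0.05", UNITS)   # the extra nickel for assembly
total_cost = part_cost + assembly_cost
```

Here part_cost comes to $5 million and assembly_cost to $2.5 million, so one extra capacitor adds $7.5 million over the life of the chip.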

More parts also mean more complexity in the design. And more complexity means more things can go wrong.

“There was one chip we were developing where the clock gating domain produced a spike in current,” said one engineer, who asked not to be named. “We came up with logic to control the wake up, but when you shut down the clock it staggers it. As you’d expect, it got stuck. So we took off the clock-gating circuitry and there was a huge droop in voltage.”

In another real-world example, chip development was stopped the day before tapeout because there was insufficient decoupling capacitance. That affects timing. The chip arrived at tapeout two days later because a crew of engineers worked solidly for 36 hours to fix the problem. Needless to say, they wished the chip architects had figured this out ahead of time.

The Week In Review: Nov. 6

Friday, November 6th, 2009

It was a very good week for low power engineering, although you have to do more than just scratch the surface to figure out why.

Mentor Graphics connected the dots on test and yield analysis, building on its own internal development in the yield space and coupling that with its LogicVision acquisition. The result is a new solution called Tessent, whose purpose is to sort through the rising mass of data in complex chips and simplify it. This could be a great step forward if it works as well as Mentor’s pitch, particularly in complex designs involving low-power techniques such as power islands and multiple power states at different voltages.

Virage Logic completed its acquisition of ARC International. This should prove to be an interesting marriage, in large part because the sum of the two is greater than the parts. Virage has the capability to target much broader markets than ARC did. It now has IP for the processor, memory and logic areas—and some very close ties to the major foundries. And all of this is a low-power play, which makes it particularly interesting should it ever decide to play alongside ARM and Intel’s Atom.

This proved to be a good week for indirect distribution channels, too. Mentor signed up Authorize Pty as a distributor for its FPGA and PCB products. Those kinds of products have to go through distribution because they’re low margin, high volume. That means direct sales are too expensive. This is how you come out on top down under.

And Synopsys announced that Arrow Electronics, another big distributor, cut its test development schedule by using TetraMAX automatic test pattern generation with multicore processing. Many of the techniques used for design at the nanometer level can be applied to the macro world, as well. Consider this a case in point, and a nice potential growth market for EDA vendors.

Infineon and TSMC will jointly develop a 65nm embedded flash process for microcontrollers used in automobiles and chip cards. Embedded flash is also less susceptible to radiation from packaging than DRAM and draws lower power. That’s been Actel’s whole pitch for its Fusion and Igloo lines.

Also on the foundry side, Chartered Semiconductor’s shareholders approved ATIC’s bid to acquire the company. Our prediction: Globalfoundries becomes the leading edge foundry with the most advanced process technology and SOI substrates, Chartered comes in one or two nodes behind running on either SOI or bulk CMOS, and potentially another foundry is added for older process technology. That way equipment is bought once, processes are developed once, and everything is passed down the line.

Power Briefs

Thursday, October 15th, 2009

Benchmarks
For years the federal government’s ENERGY STAR program has been benchmarking white goods, and more recently it has applied the same kind of power-consumption benchmarking to enterprise servers. But until now, there has been no similar type of standardized benchmarking at the component level.

That’s changing, however. The Embedded Microprocessor Benchmark Consortium has integrated its EnergyBench™ software into all of its benchmark suites (with the exception of GrinderBench™) at no cost to its licensees. The suite has been implemented using National Instruments’ LabVIEW software and X Series data acquisition hardware. While the performance benchmarks are running on the target board, EnergyBench calculates the average energy per iteration using multiple unaliased sampling frequencies and an adaptive statistical process.
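The basic quantity being reported, average energy per iteration, can be illustrated with a much simpler calculation than the one EnergyBench performs. The sketch below is not the consortium's method. It assumes a single fixed sampling rate, evenly spaced voltage and current samples, and a known iteration count, and it leaves out the multiple sampling frequencies and the adaptive statistics entirely. The function name is hypothetical.

```python
def energy_per_iteration(voltages, currents, sample_rate_hz, iterations):
    """Average energy in joules per iteration from evenly spaced V and I samples."""
    if len(voltages) != len(currents):
        raise ValueError("voltage and current traces must be the same length")
    if sample_rate_hz <= 0 or iterations <= 0:
        raise ValueError("sample_rate_hz and iterations must be positive")
    dt = 1.0 / sample_rate_hz                       # seconds between samples
    total_joules = sum(v * i for v, i in zip(voltages, currents)) * dt
    return total_joules / iterations
```

A board drawing a steady 0.5 A at 1.0 V for one second, sampled at 1 kHz while running 10 iterations, works out to about 0.05 joules per iteration.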

More SOI
ARM introduced its first 45nm SOI test chip that it claims can boost power savings by 40%. What makes this particularly interesting is the battle between ARM and Intel. ARM is trying to boost its performance enough to compete with Intel, while also trying to stay in very tight power envelopes, while Intel is trying to drop its power enough to compete with ARM in many portable devices such as smart phones.

The battleground du jour is netbooks, and what’s interesting about SOI technology is it also allows chipmakers to cut power or increase clock frequencies. In fact, ARM claims it can also cut power by 30% and increase the clock frequency by 20%. What happens if it leans completely in the direction of performance?

So far, Intel has said it will not use partially depleted SOI (silicon on insulator) technology, although it is looking at fully depleted SOI. ARM, as part of its alliance with the Common Platform companies, is looking at SOI as a competitive advantage.
