Posts Tagged ‘virtualization’

Virtualization In Your Hand

Thursday, March 11th, 2010

By Ed Sperling

The addition of multiple cores inside of computers has created an enormous opportunity for virtualization. Instead of running one operating system or one application, a single server or multicore PC can run multiple virtualized OSes on a single machine at the same time.

From the standpoint of energy efficiency, this has been a huge gain in data centers and the corporate enterprise. With most servers averaging 10% to 15% utilization, rather than the recommended 80%, one multicore server running a virtualization layer could replace as many as eight less efficient single-core servers. That means less power to run applications, less power consumption by the new machines, and less power needed to cool server racks.
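
The "as many as eight" figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes load adds up linearly across machines and that the consolidated host offers the same capacity per core as one of the old servers. Neither assumption comes from the vendors quoted here, and the function name is made up for the example.

```python
def servers_replaced(avg_utilization_pct, target_utilization_pct, host_cores=1):
    """Estimate how many lightly loaded single-core servers one host can absorb.

    Assumes (for illustration) that load is additive and that each host core
    matches the capacity of one old server. Percentages are whole numbers.
    """
    if avg_utilization_pct <= 0:
        raise ValueError("average utilization must be positive")
    # Capacity budget on the host divided by the load each old server carries.
    return (target_utilization_pct * host_cores) // avg_utilization_pct


# Old servers at 10% load, host filled to the recommended 80%.
print(servers_replaced(10, 80))
# Old servers at 15% load leave room for fewer of them.
print(servers_replaced(15, 80))
```

Under those assumptions the 10% end of the range yields eight servers and the 15% end yields five, which fits the "as many as" wording.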

From an economic standpoint, this all makes sense. But that’s not the end of the road for virtualization. By the end of this year, that same technology will show up in smart phone prototypes, with products using this technology expected to hit the shelves in 2011.

“Our strategy has been that, over a period of time, mobile products would have a mobile interface,” said Srinivas Krishnamurti, director of product management and market development at VMware. “The goal is not just to shrink our existing technology to a mobile PC.”

Much of this has been under tight wraps since VMware bought Trango Virtual Processors in 2008. There has been much speculation in the mobile world about what all this means and how it will unfold, but little information. Details are now starting to emerge.

Krishnamurti said one use is allowing non-standard devices such as an iPhone or an Android device to be supported by corporate IT departments by using one of the cores for connecting to the enterprise. When that core is in use, access to other cores is restricted. But the next phases of development become far more interesting from a power-management standpoint.

Following the data center
Within the enterprise data center, one of the newer applications of virtualization technology is the ability to move processing onto machines, or even cores, that are underutilized and shut down any processors that are not in use. Entire regions of the data center can be shut down on weekends, for example, and loads concentrated where power is already being used. For a large company, that can result in savings of tens of millions of dollars annually.
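
A minimal sketch of that consolidation idea is a bin-packing pass: sort the workloads by size, pack them onto as few hosts as possible, and power down the rest. The first-fit-decreasing heuristic, the 80% ceiling and the sample loads below are assumptions chosen for illustration, not a description of any vendor's placement logic.

```python
def consolidate(loads_pct, capacity_pct=80):
    """Pack workloads onto as few hosts as possible (first-fit decreasing).

    loads_pct: each workload's load as a percentage of one host.
    capacity_pct: utilization ceiling per host (an assumed figure).
    Returns a list of hosts, each a list of the loads placed on it.
    """
    hosts = []
    for load in sorted(loads_pct, reverse=True):
        if load > capacity_pct:
            raise ValueError(f"workload of {load}% exceeds host capacity")
        for host in hosts:
            if sum(host) + load <= capacity_pct:
                host.append(load)  # fits on a host that is already powered on
                break
        else:
            hosts.append([load])  # nothing had room, so power on another host
    return hosts


# Eight lightly used servers, loads given as a percentage of one host.
weekend_loads = [10, 15, 12, 10, 14, 11, 13, 15]
placement = consolidate(weekend_loads)
print(len(placement), "hosts stay on;", len(weekend_loads) - len(placement), "can be shut down")
```

With these sample numbers the eight workloads fit on two hosts, so six machines could be switched off until demand returns.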

In a mobile Internet device, that same strategy can be used to save battery. Lisa Su, senior vice president and general manager of Freescale’s networking and multimedia division, said the ability to partition for Linux and proprietary real-time operating systems opens up all sorts of possibilities for improving power management—particularly as more cores are added into the processors in these devices.

“Whatever we do at the infrastructure level will get down to the device level,” Su said. “We will see it on consumer devices soon.”

Taking full advantage of virtual machines in mobile Internet devices, however, requires that much of the power management be built into the software. A graphics-intensive application such as a game, for example, needs far more power than instant messaging. While those types of applications can be hard-wired into different sizes of cores with different voltages, allowing them to take advantage of whatever core becomes available with a virtual machine requires flexibility in the voltage supplied to the virtual machine running that application, regardless of what core it’s running on. There may be a fixed number of possibilities, or there may be a range of possibilities. So far, none of that has been fully worked out.
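
The "fixed number of possibilities" case can be pictured as a small table of voltage and frequency pairs, with software picking the lowest one that meets an application's needs. The sketch below is hypothetical: the operating points are invented, and the power estimate uses only the standard rule of thumb that dynamic power in CMOS logic scales with voltage squared times frequency.

```python
# Hypothetical operating points as (frequency in MHz, supply in volts),
# ordered from slowest to fastest. The values are illustrative, not from any real chip.
OPERATING_POINTS = [(200, 0.8), (400, 0.9), (800, 1.0), (1000, 1.1)]


def pick_operating_point(required_mhz):
    """Return the slowest (and lowest-voltage) point that meets the demand."""
    for mhz, volts in OPERATING_POINTS:
        if mhz >= required_mhz:
            return mhz, volts
    raise ValueError(f"no operating point reaches {required_mhz} MHz")


def relative_dynamic_power(mhz, volts):
    """Dynamic power is proportional to V^2 * f (switched capacitance held constant)."""
    return volts ** 2 * mhz


messaging = pick_operating_point(150)  # light workload
game = pick_operating_point(900)       # graphics-heavy workload
ratio = relative_dynamic_power(*game) / relative_dynamic_power(*messaging)
print(messaging, game, round(ratio, 1))
```

In this made-up table the game's operating point draws roughly nine times the dynamic power of the messaging one, which is why matching the voltage to the application matters no matter which core the virtual machine lands on.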

Also not fully worked out is how to verify the systems using this kind of technology. While virtualization has thrived in the enterprise, where machines are plugged into the power grid, handheld devices have had to rely on much more creative and painful techniques such as power gating, power islands and various on-off states. How virtualization will work with those states, and how devices will be verified, remains to be seen. For example, will virtualization supplant power islands altogether or be part of the strategy for turning parts of the chip on and off? And will virtualization ultimately require more power than power islands and right-sized cores with tightly coupled software?

Krishnamurti said VMware has been spending a lot of time on slimming down the hypervisor level in the virtualization layer, as well. The current layer for servers takes up about 32 megabytes of storage. In mobile phones, the new layer is expected to take up only 20 to 30 kilobytes. He declined to discuss more details, saying that VMware has a number of patents pending in this field.

“But from all the testing we’ve done so far, the power overhead is not significant,” he said. “The biggest drain on these devices is still the display.”

Greener Data Centers

Thursday, December 10th, 2009

By Ed Sperling

For decades the race inside the data center was all about performance. If you upgraded from an IBM Series/370 mainframe to a Series/380 your applications ran faster. And if you upgraded your PC server from a Pentium II to a Pentium 4 you got significantly better performance.

The race now is to reduce the number of servers altogether, to lower the cooling costs per server rack, and to utilize the servers that are running more effectively. Performance is a “nice to have,” but power reduction is a “must have.”

What’s changed in the thinking of data centers and why are server-class electronics now being subject to the same kinds of power-saving concerns as portable battery devices? There are a number of factors to consider, and all of them are converging at the same point.

A messy legacy

To understand the problem inside data centers requires some history—as much as six decades’ worth in many large companies. Data centers in many ways look like geological striations. While new technology runs many of the most advanced applications, there are old, assembly-code mainframes and even minicomputers still churning cycles each day. In many cases no one knows what’s even running on those computers. But at the risk that it could be important—or worse, that something else might be affected that is known to be important—the fear of turning off these machines is palpable.

Figure 1: IBM’s S/360, circa 1964 (Source: IBM)

Large corporations have been systematically looking through the data on these machines and others over the past several years in an effort to get this old stuff out of the data center. It takes up expensive real estate, uses an enormous amount of power—no one even thought about power as an issue when these machines were installed—and requires expensive cooling because the average data center runs at about 70 to 72 degrees Fahrenheit. The only good news was that early mainframes used water for cooling instead of air, which was much more energy-efficient.

Minicomputers entered the mix in the 1980s as a less-expensive but air-cooled approach. Those computers are still in use in many companies alongside mainframes that pre-date them. Ken Olsen, the founder and CEO of the former Digital Equipment Corp. (bought by Compaq and later absorbed by HP) is famous for saying that in minicomputers there would be no plumbers. While that made it easy to move around the machines, it also paved the way for more expensive cooling since then.

Figure 2: DEC PDP-7 (Source: Wikipedia)

By the 1990s, commodity servers using primarily Intel processors began replacing mainframes. Even IBM and Hewlett-Packard began selling Intel-based machines, usually in the form of blades that could be placed more closely together in a rack. And they were so cheap that business units could afford to use dedicated servers for their individual applications, create their own customized processes and finally put decision-making closer to the customer.

That was the argument, at least, and it was considered the best practice at the time. After 20 years, however, some companies accumulated hundreds of thousands of these servers, often running only one application with utilization rates as low as 5%. And because they were air-cooled, often with raised floor construction that cooled from the bottom instead of the top—heat rises, of course—the cooled air had to be run almost constantly and often ineffectively.

Virtualization and clouds

Virtualization has been touted by Intel over the past half-decade as the ultimate solution to server sprawl. Rather than run one application per machine, many applications could be run using virtual machines. While the concept was new for PC servers, the technology was invented by IBM back in the 1960s and employed in mainframes for decades.

Virtualization also works particularly well with multicore chips. And because it’s impossible to keep cranking up the clock frequency on processors without melting the chip, it’s now a requirement that all new chips have multiple cores. But only database, graphics, some scientific applications and some EDA tools have effectively been able to parse functions across multiple cores. The vast majority can use a maximum of two cores effectively, which creates a business issue for chipmakers. If they can’t figure out a way to use all those cores, there’s no reason to buy new chips.

Virtualization was resurrected as the ultimate solution for that problem. By adding hypervisors to manage the applications running on a single core, and by dynamically scheduling those applications to run on available cores instead of dedicating cores to applications, a system can conserve huge amounts of energy. Old mainframes used this approach primarily to utilize compute resources, but power consumption is the new competitive weapon.

Cloud computing—which is basically used to clean up data centers, often with a virtualized approach to running applications—is another term that has been overhyped in the data center. It generally means outsourcing, although in many companies at least part of the cloud is inside their data center and dedicated for their operation. That turns the IT department into a business unit that can create its own profit-and-loss center and keep track of the overall costs.

Intel’s latest research, which is expected to start showing up in servers made by other companies over the next several years, is to build a cloud on a single chip. (See Figure 3) By adding enough cores—48 is the current number tested by Intel—there is no reason to ever go off the chip. Intel believes the total server power consumption at that point could be measured in less than 125 watts when fully utilized.

Figure 3: Intel’s prototype for a cloud on a chip. (Source: Intel)

What this does, in effect, is bring the resources used in a computer down to the chip level instead of between machines. At that point, the challenge of getting computers to talk to each other and to shift resources will be significantly confined and power consumption will become a much more localized problem.

To some extent, this is no different than what has been happening in smart phones. When cores are not in use they go into various sleep modes. It doesn’t matter, for example, if a game takes a couple seconds to boot, while it is essential that the phone function be always on and ready to work.

The same type of control can be applied to data centers. A search of old data, for example, can stand a wait of several seconds, while a transaction from a customer must be instantaneous. Running a payroll application likewise can stand behind a more critical function in a data center, such as blocking a possible security breach.
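
One simple way to picture that ordering is a priority queue, where the most urgent work always runs first and everything else waits its turn. The sketch below uses Python's standard heapq module. The job names follow the examples above, but the priority numbers are assumptions made for illustration.

```python
import heapq


def run_order(jobs):
    """Return job names in the order a strict-priority scheduler would run them.

    jobs: list of (priority, name) pairs, where a lower number is more urgent.
    A sequence number breaks ties so equal-priority jobs keep arrival order.
    """
    heap = [(priority, seq, name) for seq, (priority, name) in enumerate(jobs)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order


# Priorities here are hypothetical; 0 is the most urgent.
arrivals = [
    (3, "payroll run"),
    (2, "search of old data"),
    (0, "block security breach"),
    (1, "customer transaction"),
]
print(run_order(arrivals))
```

The payroll run arrives first but executes last, while the security response jumps the line. A production scheduler would also need to guard against low-priority jobs waiting forever, which this sketch does not attempt.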

This type of scheduling on a single machine, let alone clusters of machines, is a new concept, however. In the mainframe and minicomputer days, all resources were managed locally. In the PC world, particularly those connected to the Internet, management can be centralized for a global corporation. But in the new model, it also can be centralized on a single machine once again with enough processing power and low enough power requirements to significantly cut costs while also maintaining at least the same performance—even if applications cannot utilize multiple cores.

At that point, it may be more a matter of scheduling priority—and in some cases, paying for that priority access even within a company—than how fast the machines are running. After decades of arguing for centralized control as the most efficient way of using resources, many data center managers are finding it’s also the most efficient way to use power.

Does that sound familiar?

Experts At The Table: Building A Better Mousetrap

Friday, September 4th, 2009

Low-Power Design sat down with Richard Zarr, chief technologist for the PowerWise Brand at National Semiconductor; Jon McDonald, technical marketing engineer in Mentor Graphics’ design creation business unit; Prasad Subramaniam, vice president of design technology at eSilicon; Steve Carlson, vice president of marketing at Cadence Design Systems, and David Allen, product director for power at Atrenta. What follows are excerpts of that conversation.

By Ed Sperling

LPD: How important is it to be green?
Zarr: In the past, when our customers plugged something into the wall they didn’t care. They pushed the problem off. But with some of the legislation, people are starting to care. No system is ever loaded 100% all the time. Even data centers are not always busy. Typically 50% to 80% of the power is wasted. They’re running at high speed and consuming power when they don’t need to be. But they’re not doing anything about it because it’s adding complexity or it’s adding cost. You’re designing the hardware, but someone is taking that and using it in ways that you didn’t design it.
Carlson: I wrote a paper on the effects of virtualization. One of the things they would do in data centers is offload the servers, but the servers would have to go into standby mode when they’re not being used. They didn’t stand by very well because they were never designed to stand by. An improvement in the architecture at the macro level would be a big benefit, but people aren’t doing that unless they’re forced to do it or unless it becomes a competitive advantage.
McDonald: Where people have been investing the time—in the handhelds and at the micro level and device-level optimization—we’ve squeezed a lot of benefit out of that. Things can be made better, but a lot has already been done. At the macro level, almost nothing has been done.
Allen: The great thing about the handhelds is they’re proof points that it can be done. There’s a lot of work going on in the networking companies now, but you’ve got to start at the IC level. Once you’ve got the infrastructure there, then you can start layering on energy efficiency in the lighting, the HVAC in the data center and controlling of peak power.
McDonald: Cisco did a study in 2006 where they determined that if they saved 1% on the power for a network router it was the equivalent of taking tens of thousands of cars off the street. But when you’re designing it, no one cares. They just want to get it out the door and meet performance.
Zarr: Education is a big thing here. Designers are not educated in the vehicles to reduce the power consumption in their designs. It hasn’t been a priority for them.
McDonald: It’s also the delayed benefit. It’s not a benefit to the designer or even the company making the chip.

LPD: If it came down to hitting a deadline for getting a design out the door or cutting power, what’s the likely response?
Carlson: In the case of a very large printer company, it’s getting the chip out the door—even if it ultimately costs more money.
McDonald: Power is not really what most people care about up front. You care about the economics. You care about power only insofar as it affects the economics.
Zarr: It may have more of an impact as we go forward.

LPD: What happens if we trim the margin in designs? Do we gain power savings?
Carlson: There’s incredible waste. If you look at design methodologies for front-end design teams, there was a 5% margin. Now it’s typical to see 20% to 25% margin. One company we’re working with is going to use a 50% timing margin on the design for a battery-operated application. You start to explain what the impact will be on the overall logic architecture and the response you get is, ‘I hadn’t thought of that.’ You need to look at timing and power together. That’s where the real increases in margin occur.
Subramaniam: Margin is an issue, but it’s even more than that. Today we’re overdesigning chips because we are designing for the worst-case scenario that may never occur. So how do we take advantage of the process itself? You need to monitor the chip and lower your voltage accordingly. You’ve designed the chip for the slow corner, but you know that in normal conditions the chip is going to work much faster.
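
Editor's sketch: the monitor-and-lower-voltage approach Subramaniam describes can be pictured as a simple search loop. The example below is hypothetical. The passes_timing callback stands in for whatever on-chip monitor a real design would use, and the voltages, step size and guard band are invented for illustration.

```python
def lowest_safe_voltage(passes_timing, start_mv=1200, floor_mv=600, step_mv=25, guard_steps=1):
    """Step the supply down while the chip still meets timing, then back off.

    passes_timing: callable taking a supply in millivolts and returning True
    if the chip meets timing there (a stand-in for a real on-chip monitor).
    All numeric defaults are illustrative assumptions.
    """
    if not passes_timing(start_mv):
        raise ValueError("chip fails timing at the starting voltage")
    voltage = start_mv
    # Keep lowering the supply as long as the next step down still passes.
    while voltage - step_mv >= floor_mv and passes_timing(voltage - step_mv):
        voltage -= step_mv
    # Add a guard band above the last passing point, capped at the start value.
    return min(voltage + guard_steps * step_mv, start_mv)


# A chip designed for the slow corner at 1200 mV that, on this hypothetical
# die and at this temperature, actually meets timing down to 950 mV.
print(lowest_safe_voltage(lambda mv: mv >= 950))
```

In this example the loop settles at 975 mV, one guard step above the lowest passing point, instead of the 1200 mV worst-case setting.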

LPD: We’ve been adding cores and power domains on a regular basis. Now we’ve got a bunch of this stuff. How do you manage all these pieces?
Allen: You need to start at the architectural level. You can’t retrofit designs on the chip. There are a small number of power architects who can do this. They understand what the tradeoffs are, and from an EDA perspective you have to arm them with the right tools.

LPD: How small?
Allen: At ST there might be four. At TI there might be a half dozen. Maybe that’s enough. You don’t need a whole new power architecture for each derivative. You need a power architecture for the first one, and then you may get 30 or 40 derivatives out of that. But can every small company afford to have one of those guys? No. But big companies do have this expertise.
Carlson: There are sources of expertise to bridge the gap.
Allen: With external expertise, there’s a question of how much the design team learns.
Carlson: It depends on how you structure the engagement. If it’s a turnkey operation, they’re not going to learn much. But you can also teach them how to fish.

LPD: Do we ever get to the point where it’s no longer economical to do this stuff?
Subramaniam: You can probably go quite low on voltage for digital logic. We had a customer running digital logic at 600 millivolts. They could afford to do that because the chip runs at a very low frequency. If you’re willing to go with low performance, you can go to very low voltage on digital logic.
Allen: We’re not quite at the end of this road. Another thing to think about is how much charge is in a battery. That’s not really going to change that much. But there is still a lot of potential for architecture at the high end of the spectrum. Those guys can probably learn a lot.
Zarr: Even architectures that scale frequency will find benefit.

LPD: Is there a limit to how far we want to go down the Moore’s Law roadmap, though?
Subramaniam: There is definitely a tradeoff. Only those with high-volume products will be willing to go to the next step.
Zarr: You never know until the next materials come out. They’re just continuing with strained silicon techniques and SOI.
Subramaniam: There are still a lot of designs done in 0.25 micron and 0.18 micron today. TSMC has not retired a single process since its inception. People will be willing to go back to older nodes if it helps them, but it doesn’t really help with power because they consume more power.

LPD: How do more restrictive design rules affect all of this?
Carlson: That will drive a renaissance in architecture. The process guys will quit solving the problem for you, and you have to be more clever about everything. You can’t just say you’re going to use the next-generation LP process and think you won’t have a problem with it.
Allen: There have been a number of times where the design guys said, ‘Leakage is going to kill us,’ and the process guys said, ‘Don’t worry about it.’ Then it scales to the next generation and it’s something else. The process guys may save us, but they won’t be able to save us forever.
Zarr: Somewhere along the line we’ll have to change materials, whether it’s carbon or something else. Everyone’s trying to avoid making that kind of investment.