Posts Tagged ‘Moore’s Law’
The Growing Legacy Of Moore’s Law
Thursday, December 2nd, 2010 By Ed Sperling
Moore’s Law has defined semiconductor design since it was introduced in 1965, but increasingly it also has begun defining the manufacturing equipment, the cooling needed for end devices, and both the heat and performance of systems.
In the equipment sector the big problem has been the delay in rolling out extreme ultraviolet (EUV). Moore’s Law will require tighter spacing than a 193nm wavelength laser can etch at 22nm and beyond, and at present the only alternative is double patterning. As double patterning implies, it requires a double pass of the laser, as well as a much more complex mask set, and significantly more time and expense per wafer. Moving from 40nm NAND flash to 22nm will require six extra steps. With logic and DRAM, that same shift will require an extra 10 steps each.
Moving to 3D die stacking will alleviate some of this problem. At least the analog and some of the IP can be manufactured using older process technology. But for the memory, the logic, and the processors, being able to etch more efficiently and quickly is a requirement.
Applied Materials, for one, is well aware of this trend. The company’s rollout this week in Japan of a new etch machine, Centris Advantedge Mesa, raises the number of process chambers from four to eight. That basically doubles the etch speed, thereby neutralizing the effect of double patterning. As a selling point it also cuts energy consumption by 35% in the etch process, uses less water and reduces carbon dioxide emissions.
“This is a Moore’s Law machine,” said Thorsten Lill, vice president in Applied’s etch business group. “The goal is to decrease the cost by 30%.”
In addition to increasing the number of chambers, Lill said one of the advantages of this approach is a decrease in the number of defects. He said that when EUV finally does become commercially viable, it will complement the double patterning that will already be a proven technology.
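The claim that doubling the chamber count offsets double patterning can be checked with simple arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes each chamber handles one wafer at a time and that every etch pass takes the same time, neither of which comes from Applied Materials.

```python
def relative_etch_throughput(chambers, passes_per_wafer,
                             baseline_chambers=4, baseline_passes=1):
    """Wafer throughput relative to a four-chamber, single-pass tool.

    Assumes one wafer per chamber at a time and equal time per pass.
    These are simplifying assumptions, not vendor data.
    """
    baseline = baseline_chambers / baseline_passes
    return (chambers / passes_per_wafer) / baseline


# Double patterning on the older four-chamber tool
print(relative_etch_throughput(chambers=4, passes_per_wafer=2))
# Double patterning on the eight-chamber tool
print(relative_etch_throughput(chambers=8, passes_per_wafer=2))
```

Under these assumptions, double patterning halves throughput on four chambers, and eight chambers bring it back to the single-pass baseline.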
This same focus on density is forcing other changes across the supply chain, as well. Large data centers have started to add water cooling, something they eliminated with the advent of client/server computing in the 1990s, because the density of the semiconductors and the density of blade servers inside those cabinets make it almost impossible to cool the uppermost servers in a closed server cabinet. Coupled with increased utilization from virtualization and cloud technology, the amount of heat being generated is moving beyond the capabilities of forced-air cooling.
In response, IBM has begun offering a water-cooling option for its new mainframes, which the company says will boost performance and reduce cooling costs. “We’ve reached the heat transfer limit of air,” said Jack Glass, director of data center planning at Citigroup. “We’ve also reached the acoustic limits. The machines are getting too noisy.”
For related reasons, ARM is winning a bigger toehold in the enterprise server and networking market where it has had almost no influence for decades. “One of the main reasons why people are looking at ARM in servers is power,” said Pete Hutton, vice president of technology and systems for ARM’s advanced product development. “We’ve already shown that in Linux implementations we can improve power by two to three times, and that’s even without us doing software optimization.”
In a data center with thousands of servers and server racks, the cooling savings from that kind of power reduction can amount to millions of dollars a year, enough to warrant serious consideration for Linux-based applications.
In the consumer space, this same density is forcing even more radical changes in design. Multicore is giving way to many-core processors, SoCs are utilizing multiple voltage rails, sleep states and an increasing number of power islands, and designs typically are running at lower voltage than in the past.
This has made it extremely difficult to create designs from scratch, ushering much of the design industry toward 3D stacking. And it has increased the focus on power management around all chips because thermal issues become much more important when silicon die are layered on top of each other.
“The big issue is how you get the heat out so that non-volatile memory doesn’t fail,” said Hutton. “We’ve been looking at how you can aggressively turn off parts of the SoC and have more active control of the chip.”
Perhaps even more troublesome is the other part of the Moore’s Law economic equation that often gets hidden from the rest of the world, namely how to reduce other components in conjunction with the reduction in silicon. Aveek Sarkar, vice president of product engineering and support at Apache Design Solutions, said pressure will continue to cut costs on every side of an SoC, including the package.
“We’re going to see demand for one less layer of package, which means you have to do more with less,” said Sarkar. “You will be forced to squeeze a design into an available number of layers that is far less regular in terms of its structure. That will require a much more detailed power signal integrity analysis than in the past.”
Getting Ready For 15nm
Thursday, October 7th, 2010 By David Lammers
The trends towards vertical transistors, non-silicon channel materials, and resistive RAMs promise to hold center stage at the 2010 IEEE International Electron Devices Meeting (IEDM), set to begin Dec. 6 in San Francisco, Calif. (www.ieee-iedm.org)
Taiwan Semiconductor Manufacturing Co. (TSMC, Hsinchu, Taiwan) will present a 22/20nm technology platform based on a FinFET architecture. The TSMC paper describes a full CMOS technology, complete with silicon germanium stressors, high-k/metal gate, and dual-epitaxy technology. TSMC said it demonstrated a 0.1 µm² SRAM cell, which operated at a 0.45V operating voltage (Vmin) with a 90 mV noise margin.
While TSMC is expected to shift from today’s planar transistors to the vertical FinFET devices at the 14nm generation in the 2015 time frame, the IEDM 22/20nm paper demonstrates that the world’s leading foundry has the FinFET manufacturing challenges well in hand. TSMC used 193nm immersion lithography to achieve NMOS and PMOS drive currents of 1200/1100 µA/µm respectively, at off-currents of 100 nA/µm.

Fig. 1: TSMC will unveil a complete FinFET-based 22/20nm CMOS logic technology at IEDM 2010. Electron microscope images show a cross-section of the vertical fin’s sidewall.
While creating 20nm gate-length vertical transistors is “demanding,” due to parasitic capacitances and other challenges, an abstract of the TSMC paper said the FinFET architecture allows continued scaling with good electrostatic control of the channel. To accomplish its scaling goals, TSMC turned a series of process technology knobs, including embedded SiGe to strain the PMOS channel, stress memorization techniques in the NMOS devices, an optimized contact etch stop layer (CESL), dual work functions, and both epitaxial silicon and boron-doped e-SiGe in the source and drain regions. Compared with planar transistors, the TSMC paper will describe much-improved (100x) leakage from the source and drain regions, critical for low-power mobile systems.
Intel and IQE Inc. researchers will describe their latest advances with a FinFET architecture based on an InGaAs quantum well technology. At the 2009 IEDM, Intel described a surface-channel InGaAs FinFET. The quantum well InGaAs FinFET features fins that are 35nm wide and smaller, 5nm gate-to-drain and gate-to-source separations, and a high-k gate dielectric.
Intel and its research partner have been developing quantum-well compound devices as successors to silicon CMOS. The paper to be presented at the 2010 IEDM takes the InGaAs technology from a planar to a FinFET architecture, which delivers much-improved control of the channel compared with the planar devices described at the previous meetings. Also, the paper describes a high-k dielectric with a Tox of 20.5 Angstroms and good interface properties.
An InGaAs MOSFET will be presented by a team led by the University of Tokyo. The device features a 3.5nm channel, the smallest such device to be described thus far. The dual-gate device was created on a silicon substrate using wafer bonding.
Memories taking resistive turn
On the memory front, researchers from Intel and Micron Technology have developed a 25nm multi-level cell (MLC) NAND memory technology, with a cell size of 0.0028 µm² – the smallest transistor now in production. An air gap was introduced between word lines to control the word line-to-word line capacitance and cell-to-cell interference.
The MLC device uses only 30 to 40 electrons per level, which requires advancements in the insulating tunnel oxide and the inter-poly dielectric in order to confine the charges. The cell has an asymmetric design, with a word line half pitch of 24.5nm and a 28.5nm half pitch in the bit line direction, allowing for insertion of the control gate between the floating gates. The technology is used for 64-Gbit NAND memories.
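The quoted cell size follows from the two half pitches, which a few lines of arithmetic can confirm. This is a rough sketch rather than anything taken from the paper: the function names are hypothetical, the full pitch is taken as twice the half pitch, and the array estimate assumes two bits per cell and a binary gigabit (2^30 bits). It also ignores all peripheral circuitry.

```python
def cell_area_um2(wordline_half_pitch_nm, bitline_half_pitch_nm):
    """Area of one cell in µm², taking the full pitch as twice the half pitch."""
    area_nm2 = (2 * wordline_half_pitch_nm) * (2 * bitline_half_pitch_nm)
    return area_nm2 / 1e6  # 1 µm² = 1,000,000 nm²


def raw_array_area_mm2(capacity_bits, bits_per_cell, area_um2):
    """Silicon area of the cells alone in mm², with no periphery or overhead."""
    cells = capacity_bits / bits_per_cell
    return cells * area_um2 / 1e6  # 1 mm² = 1,000,000 µm²


BITS_PER_CELL = 2            # assumption: MLC storing two bits per cell
CAPACITY_BITS = 64 * 2**30   # assumption: 64 Gbit counted in binary gigabits

area = cell_area_um2(24.5, 28.5)
print(f"cell area: {area:.6f} µm²")
print(f"raw array area: {raw_array_area_mm2(CAPACITY_BITS, BITS_PER_CELL, area):.1f} mm²")
```

The 49nm by 57nm footprint works out to about 0.0028 µm², matching the reported figure, and under the stated assumptions the cells alone would occupy a little under 100 mm².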
The authors will describe how the Intel-Micron team dealt with dopant fluctuations, structural bending, and other challenges presented at such small dimensions.

Fig. 2: Researchers from Intel and Micron Technology will describe the 25nm 64Gbit multi-level cell (MLC) NAND technology. The image shows the select gate and contacts in the bit line direction.
Resistive RAMs (RRAMs), which use a voltage to alter the resistive state of metal-based compounds, have emerged as a path to higher-density non-volatile memories once NAND flash scaling reaches its limit. A functional transition-metal-oxide resistive memory (TMO-RRAM) developed at the National Nano Device Laboratories in Taiwan has a record 9nm half-pitch, with a programming current of less than 1 µA, which compares with about 20 mA for phase-change memories. The researchers controlled the device’s resistivity by changing the chemical composition of the tungsten-oxide layer. They postulate that the memory’s change in resistance is due to the controlled movement of oxygen ions, with a monotonically varying ratio of oxygen and tungsten atoms.
The Taiwan laboratory’s research team includes Chenming Hu, a professor at the University of California, Berkeley. In an abstract of the paper, they said the “unexpectedly low” 1 µA current required to set and reset the RRAM cell makes it a promising candidate for low-power non-volatile memories.
The reported progress with exploratory RRAMs comes amid concerns about power consumption with the phase-change RAMs (PC-RAMs), which use heat to change the resistive state of a chalcogenide material. At IEDM, a team from the IBM/Macronix PCRAM Joint Project will describe a previously unknown failure mechanism for phase-change memories, apparently related to electromigration stemming from the polarity of the operating current.
At the high current densities required to change the state of the chalcogenide material, the researchers found that hole-induced electromigration occurs when current polarity is reversed. The paper claims that the phenomenon causes voids at the interface between the phase-change material and the bottom electrodes, limiting their cycling endurance by four orders of magnitude. The team also will discuss countermeasures to deal with the effect.
IBM researchers also will describe their latest-generation SOI-based embedded DRAM (eDRAM), enhanced with a high-k/metal gate technology. Big Blue claims eDRAM delivers several advantages over SRAM for large on-chip caches, including higher density, better soft error rates, and lower power consumption. The performance rivals SRAM speeds, with the SOI eDRAM delivering a sub-1.5ns latency and 2ns cycle time.
The 32nm eDRAM uses a deep trench capacitor with 25 percent higher capacitance and much less resistance than conventional memory stacks based on SiON/poly gate stacks. IBM said it used a high-k/metal gate technology to reduce leakage and control the threshold voltage of 40 mV. IBM created a 32 Mbit array from cells measuring 0.39 µm². The eDRAM is 3-4x smaller than a comparable SRAM, enabling a much higher-density on-chip cache, the abstract of the paper said.
Moore’s Law vs. Low Power
Thursday, September 17th, 2009 By Ed Sperling
Moore’s Law and low-power engineering are natural-born enemies, and this dissension is becoming more obvious at each new process node as the two forces are pushed closer together.
The basic problem is that shrinking transistors and line widths between wires opens up far more real estate on a chip, which encourages chip architects and marketing chiefs at chipmakers to take advantage of all that extra real estate. But more functionality layered onto a die also increases the demand for power—or makes the development of the chip much more complicated.
One way to deal with all of this is to drop the operating voltage across the chip. But decreasing the supply voltage has its problems.
“If you decrease the supply voltage too much, then circuits don’t work anymore,” said Mark Bohr, an Intel senior fellow and director of process architecture and integration. “There isn’t enough signal-to-noise ratio to make it work. But there’s also no silver bullet for this. One of our ongoing challenges is to scale transistors and operating voltage.”
How to do that is a rather difficult task, however, and engineers and scientists working on the most advanced chips on the planet say it will remain extremely challenging at all future nodes.
“There is a minimum voltage any ‘charge-based device’ can work on,” said Jan Rabaey, professor of electrical engineering and computer sciences at the University of California at Berkeley. “It equals 2 (kT/q) ln(n+1), where n is the subthreshold slope factor of the device. At room temperature and a normal device (n=1.4 – 1.6), this translates to approximately 50 mV. Given the fact that there is margin needed for reliable operation, a practical minimum voltage would be around 100 mV. There are some ways to lower this. High k is not one of them, as the main purpose of that is to reduce gate leakage. More effective is to reduce n (which is 1 for an ideal bipolar device).”
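Rabaey’s expression can be evaluated directly. The snippet below is a minimal sketch that assumes room temperature means 300 K and uses the standard values of the Boltzmann constant and the elementary charge; the function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23       # k
ELEMENTARY_CHARGE_C = 1.602176634e-19  # q


def min_supply_voltage(n, temperature_k=300.0):
    """Return 2 * (kT/q) * ln(n + 1) in volts for slope factor n."""
    thermal_voltage = BOLTZMANN_J_PER_K * temperature_k / ELEMENTARY_CHARGE_C
    return 2.0 * thermal_voltage * math.log(n + 1.0)


# Slope factors: ideal device (1.0), SOI-like (1.2), "normal" devices (1.4 to 1.6)
for n in (1.0, 1.2, 1.4, 1.6):
    print(f"n = {n:.1f}: Vmin = {min_supply_voltage(n) * 1e3:.1f} mV")
```

For n between 1.4 and 1.6 the result lands between roughly 45 and 50 mV, in line with the quoted figure, and it drops as n falls toward 1, which is why reducing the slope factor is the more effective lever.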
Power modeling, power islands
One solution is power modeling, which is almost required as more power islands are added to a system on chip. The advantage is clear: if the majority of functions on a device can be powered down or even off when they’re not in use, the amount of power consumed by the chip can be dramatically reduced.
But complexity increases with the addition of power modeling. It’s harder to design, to route traffic and prioritize that traffic.
“Even at the architectural level people are reluctant to use multiple power domains in their design because they don’t want to complicate their system,” said Prasad Subramaniam, vice president of design technology at eSilicon. “They don’t want to have multiple voltage regulators. A chip already requires two voltages, one for the I/O and one for the core. They don’t want to go beyond that.”
Verification adds another level of complication. It’s much, much harder to verify the chip because that verification has to be done utilizing every state of every different function and in every different possible sequence.
“This is a major problem,” said Srikanth Jadcherla, Synopsys’ group director for R&D for low-power verification. “But it isn’t a tools problem. The tools for verification are there. It’s a methodology and mindset shift. Engineers are not used to doing regression and debugging in this way. You have to change the whole thing under you.”
This is easier said than done. Tools can be swapped out, and even when there is more training involved that can be a relatively painless step. But changing a methodology is radically different.
“If there are six power domains and on/off nodes, then you have 64 possible combinations (more if there are more states than just on and off). You have to make sure the chip still functions in each state and that you can get out of one state and into the next,” said Jadcherla. “RTL engineers never bothered about system states before. Now they have to know the major states. A smart phone has a phone mode and an e-mail mode and a camera mode, so you now need to do mode-based testing. This is not something we see in the design community yet. Low-power verification must be done in the context of the system.”
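The state count Jadcherla describes is easy to reproduce. The following sketch enumerates every combination for six domains; the domain names and the extra "retention" level are hypothetical, chosen only to show how quickly the space grows when a domain has more than two states.

```python
from itertools import product

# Hypothetical domain names, for illustration only
DOMAINS = ("cpu", "gpu", "modem", "camera", "audio", "display")


def power_states(domains, levels=("off", "on")):
    """List every combination of per-domain power levels."""
    return [dict(zip(domains, combo))
            for combo in product(levels, repeat=len(domains))]


def transition_count(states):
    """Number of ordered moves from one state to a different state."""
    return len(states) * (len(states) - 1)


two_level = power_states(DOMAINS)
three_level = power_states(DOMAINS, levels=("off", "retention", "on"))

print(len(two_level))               # on/off only
print(len(three_level))             # with one extra level per domain
print(transition_count(two_level))  # state-to-state moves to consider
```

Six on/off domains give 64 states, a third level per domain raises that to 729, and the 64 on/off states alone allow 4,032 ordered state-to-state moves. That growth is why testing by system mode, rather than exhaustively, becomes attractive.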
New materials, methodologies and technologies—and challenges
At least some of the problems will be dealt with using new materials. While Intel added restrictive design methodologies at 45nm, IBM and AMD changed substrate material from bulk CMOS to partially depleted silicon on insulator (SOI). At 28nm and 22nm, IBM and its ecosystem—which includes AMD—are looking at restrictive design rules and Intel is exploring the possibility of adding fully depleted SOI.
Intel looked into partially depleted SOI technology about the time that IBM did and ruled it out because the cost was too high and the performance benefit based upon that cost was limited. But Bohr said the company is now looking into fully depleted SOI technology. There is no determination whether Intel will use that technology at future nodes, but it remains a possibility.
The difference between partially depleted and fully depleted is that in a fully depleted model the source and drain in a transistor are depleted down to the oxide. The channel is subsequently deeper, which in turn provides better insulation. With SOI, chipmakers typically can get a boost in either performance or power. But with performance now far less of an issue than power in most applications, the bulk of the focus is on using SOI to save power.
“SOI technologies have a slope factor of approximately 1.2-1.3,” said Rabaey. “There is currently a lot of research on the development of devices with an ‘n’ smaller than 1 (such as Tunnel-FETs or TFETs, and other hetero-devices). This would allow for lower voltages. Right now this is purely experimental though.”
Conclusion
There is no simple answer to how power issues need to be addressed. The clear implication, however, is that design will become more complicated in some areas even as it becomes simpler in others. Restrictive design rules will limit what design engineers can do, but they will open up all sorts of possibilities for power modeling and engineering that never existed, or needed to exist, before.
As IBM’s top engineers have said repeatedly, each new node requires some group to feel the pain. In the past, much of that pain was absorbed in the manufacturing and foundry process. The next phase will hit the design engineer and the verification methodology. After that, it’s anyone’s guess.
Experts At The Table: Greener Design
Wednesday, April 15th, 2009 By Ed Sperling
Low-Power Design sat down to discuss green technology and the future of low-power design with Rich Kapusta, Actel vice president of marketing and business development; Tom Quan, TSMC senior director of EDA and design service marketing; and Brani Buric, Virage Logic executive vice president of marketing and sales. What follows are excerpts of that conversation.
LPD: Is communication more open these days between the various parties involved in the design-to-manufacturing flow?
Quan: It’s improved greatly in the past couple years. Traditionally, for people doing digital designs, you needed the SPICE model. That has been available and there is no issue there, except that there is a lot more information than before when you go down to 45, 32 and 28 nanometers. And most of the digital designs will need to have access to the timing models for standard cells and memories. Those are pretty available. The challenge is when you go to a new process node, before the design can take advantage of that the infrastructure has to be available. We have to re-work with the IP vendors so companies can start building RAM—even though the process may change. We still need to go to pre-production so customers can start using it. That’s the part that’s more challenging. The actual mechanism of transferring data is not an issue anymore.
Kapusta: We’re on a process technology that’s somewhat unique, so we’re forced to co-develop the process with the foundry, which in our case is UMC. We’re working on 65nm embedded flash with UMC to give us the best technology for our FPGA families. We have our own process development engineers working with UMC, we’re doing test chips together and we’re tweaking the process together. By the time we finally tape out we’ve seen a couple runs of silicon, we understand what we’re doing, and we’re working together to get it right from the very beginning as opposed to waiting for a foundry to create a process node and jump onto that. We’re co-developing the process node.
Buric: That’s been a trend for several process nodes. Foundries are developing application-specific processes. With TSMC, when you go up to 65nm and 90nm, there is optimization for mixed-signal and ultra-low power processes, and even CIS (CMOS image sensor) processes. The idea of a general-purpose process serves a smaller and smaller market segment. More and more you will see applications that drive huge volumes will also be able to drive modifications to the process.
LPD: Are the foundries seeing that, as well?
Quan: Yes, that definitely is the trend. When you go down to 40, 32 and 22 nanometers, there will be mostly SoC designs. You have fewer of those, but the volume is larger and those customers have very specific requirements for their products to be competitive. Those will be more specialized processes. Last year we introduced the open innovation platform, which allows collaboration to go much deeper and much earlier in the process. One of the main features is design co-optimization so that each side can take full advantage of what’s available on the other side. We can trim 20% to 30% of leakage power even before tapeout. That was not possible before when everything was separate.
LPD: Is the concern low power or performance—or both?
Kapusta: For us it’s all about low power. Our customer base is not performance-driven. We’ve already surpassed all of their performance needs at the current node. When we go to the next node it’s all about getting more and more power out of the system, not making it go 10 times faster.
Quan: For the computer guys, it’s still all about performance.
LPD: That’s the plug-in computers, though, right? Not the notebooks?
Quan: Yes, the servers. Even for laptops, it’s hard to say you want less performance. Intel now has a 2GHz version of the Atom that only takes 1 watt. Communication and consumer are all about low power.
Buric: Atom is a good example. Even where there is a need for performance, those designs are built with low power in mind. If you just said run it at the highest performance possible with no concern for power, it would be in a ceramic package and require liquid nitrogen to cool it down. With everything we have seen at 40nm, they design for performance, but they also design for low power because it is cost-effective. With packaging costs and a huge number of transistors, you cannot afford to make those designs if they are not low power.
LPD: In some designs, the clock speeds on individual cores aren’t getting faster. Are we getting to the point where adoption of new nodes will slow down?
Kapusta: We’re not even talking about 32nm. We’re strategically behind the leading edge because we don’t need that performance and we can get the power we need one or two nodes back. We’re at 65nm now.
Quan: For mixed-signal RF and analog, most designs don’t go that fast. With the 65nm general-purpose process, you can push the 60GHz transceiver. That’s probably the highest frequency we push in any market. But for computing and graphics, the trend is still there and going down. We had a lot of activity at 40nm and 28nm. Traditionally there were bell curves with adoption and maturation. It probably will get flatter, though, as time goes on. The highest revenue producer for us is the 90nm and 65nm nodes. More than 60% of our revenue is there. That’s the sweet spot, and it’s where most of the activity will occur for some time.
Kapusta: Even at 130nm when we came out with our ProASIC 3, it was low power but it wasn’t that low power. It was still higher performance. When we came out with the Igloo line on the same node, we pushed the equation further into the power side. That family is more popular than the ProASIC 3. It’s basically the same architecture but lower power vs. higher performance.
LPD: Will Igloo ever fit into embedded applications?
Kapusta: Right now you can embed an ARM soft core into the chip.
LPD: How about the other way around—embedding Igloo into other chips?
Kapusta: We have had a few conversations about that. Some customers are trying to figure out how to embed it into other processors, but so far, no.
LPD: On a different subject, is the trend toward stacked die?
Quan: Stacked die is a different term for 3D chips. It’s already there for the memory companies. Most of the connections are still through bonding, but one technology that is still in the works is through-silicon vias. You actually drill holes through the wafer and fill them with copper. There are still a lot of issues to solve, but the technology is there and the prototypes are done. Certainly timing has to be worked out, but the real issue is how to distribute the thermal crests of these die and how these conducting columns are supposed to behave. The good news is that silicon is not a bad thermal dissipator. The challenge is when you stack things against each other—a processor next to RAM next to analog. You need to analyze what gets affected most.
LPD: Is it lower power when you stack a die, though?
Quan: If there’s any change, it’s the power dissipation in the interconnect. If you have four cores and it’s flat, the signal needs to travel across these connections on a narrower line versus a fat copper interconnect between die.
Buric: A big problem to solve is how to test for a good die before you start stacking things. Your yield changes when you add these interconnects. The problems multiply.
Kapusta: I think you can get lower power by stacking. If you look at two functions you can choose the lowest power process implementation for each of these two functions separately and stack them together versus making a compromise of integrating them on a less optimal process. If you’re looking at 65nm flash and want to stack it with some memory, you can choose a 40nm SRAM process and get the best of both worlds. If you want to implement it on a 65nm flash, you make compromises. You can make lower-power SIPs (systems in package) by taking the best of each element you’re trying to stack.
LPD: Is there a way of manufacturing these so the cores are not the same?
Kapusta: When you think of multicore, you’re thinking high performance. We have customers implementing multiple soft cores on a fabric. As we start embedding hard cores into our FPGA, people will have the opportunity to use hard cores and soft cores and build a system based on the processing chunks they need rather than being forced into some choice.
Quan: There are two trends. One is a more general-purpose platform where the cores will be different, but each one has the same purpose such as processing or graphics. The other thing we see is where each core is custom. They’re all small, very low power, but there may be 500 or 1,000 of them. Those are for very specific purposes like simulation or processing of security applications. Instead of using a general-purpose application where you waste a lot of power, you make the cores very specific and very low power. Each of the cores is maybe one-hundredth of a standard core in terms of size and power.

