Will Wide I/O Reduce Cache?

Thursday, August 25th, 2011

By Ann Steffora Mutschler
In an ideal world, all new SoC technologies would make the lives of design engineers easier. While this may be true of some techniques, it is not the case with one advanced memory interface technology on the horizon, Wide I/O.

There are claims that Wide I/O could reduce cache, but so far this is not widely understood. In fact, exactly how Wide I/O will be used, what the benefits will be and when it will become a mainstream technology are hazy at best.

Marc Greenberg, director of marketing for Cadence’s SoC realization group, believes Wide I/O will reduce cache but cautions that there is no single answer because every system will do it a little differently. “In some cases you might say some of the L2 or L3 cache could move into a Wide I/O device. That’s certainly a possibility. Or maybe not all of it, but perhaps some of it, maybe none of the L2 but all of the L3. It’s also possible that the Wide I/O becomes sort of an L4 cache to some other, even more distant memory; it becomes a new layer in the memory hierarchy,” he said.

Greenberg believes all of these options will likely be seen in different chips.

Cadence's Greenberg: No simple answers.

“The real thing about cache is that you want to keep small, fast memory close by, and then slower, larger memories farther away. Unless you have a super-fast memory off-chip, Wide I/O will not remove cache from on-chip. The fastest off-chip memory today is still much, much slower than on-chip SRAM, so you’ll always have cache on-chip as far as possible,” said Prasad Saggurti, product marketing manager and senior staff for embedded memory in Synopsys’ test and repair group. “As you need to go to larger sizes—if you go to L2 cache, even that tends to be on-chip. You could have a situation wherein instead of doing DRAM and using that as a L3 or so on being the main memory, you might have an intermediate that replaces DRAM or complements DRAM by having a Wide I/O to regular memory and then have this.”

Synopsys' Saggurti: Cache will always be on-chip.

In general, Wide I/O is seen as a way to address I/O speeds: instead of going to a DRAM through a high-speed serial interface, Wide I/O could be used to reduce latency.

Early adopters of Wide I/O have been in the mobile space for cell phones and tablets. In that case, the Wide I/O is replacing the main memory, observed Cadence’s Greenberg.

“There have been people hinting at not being able to stack enough DRAM on top of, perhaps, a tablet processor, so you might want to have another tier of RAM further out in memory. In that case, the Wide I/O becomes either an L3 or L4 to some even more distant memory.”

Steve Hamilton, applications architect at Sonics, stressed that stacking could theoretically maximize L2 and L3 caches, even though that is not likely any time soon. He said there are some people looking at using through-silicon vias to place a denser memory close to processors. A bigger cache can be placed in the same space, but there are a number of reasons—both physical and economic—why that does not yet make sense.

“This would require a custom memory chip to perfectly match the floorplan of the SoC,” said Hamilton. “Economics don’t support that. Managerially, you would then need to coordinate two custom chip developments to intercept at some point. That adds risk. Then there are restrictions on where the TSV columns could be placed on a die that we don’t fully understand yet. The dies expand and contract in operation due to heating. That may stress the connections or crack the die if it’s not engineered correctly. We don’t have enough experience yet to know those rules.”

Sonics' Hamilton: Unlikely to reduce cache.

It makes a lot more sense to start with a single common interface point, as this allows for mechanical expansion. By defining it as a physical standard, just as other interfaces have done, it allows independent manufacturers such as the DRAM and SoC vendors to do their own thing. The common standard also amortizes the development costs over a larger set of applications. So something like Wide I/O is a perfect starting point for TSV technology, he said.

But when it comes specifically to Wide I/O, Hamilton doesn’t believe the technology will reduce cache at all. “Wide I/O provides a wider (4-channels) interface to DRAM, but operates at a lower frequency than DDR3. Wide I/O also has some painful restrictions on page access rates. So the total bandwidth is only slightly improved. Worse, this is just I/O bandwidth. The actual access time to DRAMs (latency) is not changing at all. Caches are used to minimize read latency and increase memory bandwidth (for an access stream having locality). So as long as there is external DRAM of any type (with high latency) there will be caches.”
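Hamilton's bandwidth point can be illustrated with simple arithmetic: peak interface bandwidth is bus width times transfer rate times channel count, so a wide, slow interface and a narrow, fast one can land close together. The sketch below is only an illustration; the function name and the widths and transfer rates are assumptions chosen for the example, not figures from the article, and the calculation ignores the page-access restrictions he mentions.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfers_per_second, channels=1):
    """Peak (theoretical) bandwidth in GB/s: width x transfer rate x channels."""
    bits_per_second = bus_width_bits * transfers_per_second * channels
    return bits_per_second / 8 / 1e9


# Illustrative figures (assumptions, not taken from the article):
# four 128-bit channels at 200 million transfers/s versus
# one 64-bit channel at 1,333 million transfers/s.
wide = peak_bandwidth_gb_s(128, 200e6, channels=4)
narrow = peak_bandwidth_gb_s(64, 1333e6)

print(f"wide, slow interface:   {wide:.1f} GB/s")
print(f"narrow, fast interface: {narrow:.1f} GB/s")
```

Note that nothing in this calculation touches access latency, which is the quantity caches exist to hide.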

This is the most frustrating thing for SoC developers who need low latency, high bandwidth, and low power from DRAM, and not high density. Meanwhile, DRAM vendors keep marching down the path they understand—more density with each generation. They only reluctantly have moved to newer specs that increase I/O bandwidth. But these specs are increasingly hard to use. They rely, for example, on access in larger chunks than processors may need, while doing nothing to address latency.

“As long as the server guys—who need density—are the majority of demand there is not sufficient motivation for the DRAM vendors to optimize for what the mobile folks need,” Hamilton believes.

Interestingly, eDRAM does have the potential to reduce caching. “Processors generally use private L1 caches, and share L2 caches across small (2 to 4) clusters. eDRAM radically improves latency (and bandwidth, and power). So it becomes possible to consider eliminating or reducing L2 caching when eDRAM is used,” he noted.
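The latency argument can be made concrete with the standard average memory access time (AMAT) model, in which each cache level contributes its hit time plus its miss rate times the cost of going one level further out. The sketch below is a rough illustration under stated assumptions: the hit times, miss rates and memory latencies are hypothetical values picked to show the trend, not measurements of any real device.

```python
def amat(levels, memory_latency):
    """Average memory access time for a chain of caches.

    levels: list of (hit_time, miss_rate) tuples, nearest cache first.
    memory_latency: access time of the memory behind the last cache.
    All times share one unit (for example, processor cycles).
    """
    total = memory_latency
    # Work backwards: each level adds its hit time and pays the
    # cost of everything behind it only on a miss.
    for hit_time, miss_rate in reversed(levels):
        total = hit_time + miss_rate * total
    return total


# Hypothetical numbers chosen only to illustrate the trend.
L1 = (2, 0.05)    # 2-cycle hit, 5% miss rate
L2 = (12, 0.30)   # 12-cycle hit, 30% of L1 misses also miss in L2

for name, latency in [("high-latency external DRAM", 200),
                      ("low-latency embedded DRAM", 30)]:
    with_l2 = amat([L1, L2], latency)
    without_l2 = amat([L1], latency)
    print(f"{name}: with L2 = {with_l2:.2f}, without L2 = {without_l2:.2f}")
```

With these assumed values, removing the L2 roughly doubles the average access time when main memory is slow, but changes it only slightly when main memory is fast. That is the sense in which a low-latency memory makes reducing L2 worth considering, while a faster interface to ordinary DRAM does not.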

3D Integration: Extending Moore’s Law Into The Next Decade

Thursday, August 27th, 2009

By Cheryl Ajluni

At the 46th Design Automation Conference in San Francisco last month, attention turned to a discussion of how to extend the momentum of Moore’s Law into the next decade. One plausible solution, according to Philippe Magarshack, the general manager of Central CAD & Design Solutions at STMicroelectronics, is 3D stacking for complex systems-on-chip (SoCs).

The concept of 3D stacking or integration technology is not new. In fact, 3D stacking of dies has been successfully demonstrated and is currently being commercially employed in some embedded domains (e.g., stacking DRAM memory on CPU cores). A recent 3D IC report from Yole Développement suggests that by 2012, the number of 3D IC-processed wafers could surpass 10 million units, driven in part by handset, wireless and computing applications. Given the intense interest and work going into developing 3D integration technology, this prediction seems just about right—assuming, of course, that a few challenges can first be met.

Exploring the third dimension

Very simply put, 3D integration consists of stacking integrated circuits and connecting them vertically so that they behave as a single device. A 3D chip is therefore just a stack of multiple device layers with direct vertical interconnects tunneling through them. So what’s the big deal about 3D integration?

Today’s semiconductor chips face extreme pressure to achieve increased performance, while reducing their size and accommodating lots of new functionality. When these factors coalesce in traditional 2D chips, longer interconnects result. In SoCs, longer interconnects translate into reduced speed and increased power consumption.
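To see why interconnect length matters so much, a first-order sketch helps. For an unrepeated wire modeled as a uniform distributed RC line, the Elmore delay grows with the square of the length, so halving a wire cuts its delay to roughly a quarter. The per-millimetre resistance and capacitance below are hypothetical placeholders, and real designs insert repeaters that bring the scaling closer to linear.

```python
def elmore_wire_delay(length_mm, r_per_mm, c_per_mm):
    """Elmore delay (seconds) of a uniform distributed RC wire: 0.5 * r * c * L^2."""
    return 0.5 * r_per_mm * c_per_mm * length_mm ** 2


# Hypothetical per-millimetre wire parameters, for illustration only.
R_PER_MM = 1000.0    # ohms per mm
C_PER_MM = 0.2e-12   # farads per mm

long_wire = elmore_wire_delay(4.0, R_PER_MM, C_PER_MM)
short_wire = elmore_wire_delay(2.0, R_PER_MM, C_PER_MM)

print(f"4 mm wire delay: {long_wire * 1e12:.1f} ps")
print(f"2 mm wire delay: {short_wire * 1e12:.1f} ps")
print(f"delay ratio:     {long_wire / short_wire:.1f}")
```

Wire capacitance, and with it switching energy, also rises with length, which is the power side of the same argument.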

A key benefit of 3D integration is that it can reduce the length of interconnects. Additionally, it provides higher transistor density, faster interconnects and heterogeneous technology integration, with potentially lower power, cost and faster time-to-market. As Matt Nowak, director of engineering in the VLSI technology group of Qualcomm’s CDMA technology division, pointed out in a DAC 2008 presentation, the 3D approach “achieves extremely high densities, allowing us to use heterogeneous technologies and reduce form factor. The key is that it enables the use of new differentiating technologies to build new architectures that cannot be built in existing technologies.”

Eyeing recent developments

Up to this point, most efforts in 3D integration have focused on developing different fabrication techniques for stacking multiple device layers and forming the vertical interconnects. Much of the work has been done through collaborations with academia, industry organizations and government-sponsored laboratories around the world. One of the key technologies to come out of this research is a next-generation interconnect technology known as Through-Silicon Via (TSV). The TSV is a vertical electrical connection that passes completely through a silicon wafer or die to produce multilevel chips with an optimum combination of cost, functionality, performance, and power consumption. By using TSV technology, 3D ICs can pack greater functionality into a smaller footprint and realize shorter critical electrical paths, resulting in faster operation.

Some of the other developments to come out of ongoing 3D integration research were recently recognized at the Electronic Components and Technology Conference. Sandia National Laboratories presented details of its tungsten (W) TSV process, which is said to provide a suitably low-resistance metal with a coefficient of thermal expansion close to that of silicon, a conformal via fill, and ready integration into IC fabrication. IMEC introduced a novel process for die-to-wafer bonding (using Cu-Cu bonds) of its 3D SIC technology and a scalable TSV technology for 3D wafer-level packaging. Its TSV technology is designed for 3D structures where interconnects are fabricated after standard CMOS processing.

SEMATECH also is focusing its 3D research on TSV technology, particularly for implementation. The industry organization is actively working to bring together partners from across industry—chipmakers, equipment and materials suppliers, assembly and packaging service companies—to make 3D TSV suitable for high-volume manufacturing (Figure 1).

Figure 1. In contrast to the 2D-SoC or 3D System-in-Package, 3D TSVs offer a cost-effective way to achieve high density and performance, while also being able to integrate non-CMOS products with CMOS. The SEMATECH 3D project is based on cost modeling to assure products will be both manufacturable and affordable.

Help: tool support needed!

While ongoing research and development is absolutely critical to the success of 3D integration, perhaps one of the greatest challenges it faces is tool support in terms of design techniques and methodologies. Without it, engineers have virtually no efficient way to exploit the technology’s benefits. Tool support is especially critical when it comes to 3D integration because vertical stacking tends to increase thermal resistances, further exacerbating temperature-induced problems that can negatively affect system reliability, performance and leakage power. The use of 3D also will significantly complicate the typical design flow.

The key, of course, lies in creating a standardized design environment and methodology for physical design of 3D chips that could support a range of different tools. Having the tools integrated in one place would make it easier for designers to explore and make architectural decisions and then to hand those decisions off to the next stages in the design process.

3D IC integration is still in its infancy and, as a result, tools developed today for one specific application (e.g., stacked memory) may not be suitable for heterogeneous integration tomorrow. Nevertheless, there are some tools available now, with more in development. Some of these tools include:

3D PathFinding

Javelin Design Automation. 3D PathFinding provides a detailed 3D flow for accurate performance/power/cost estimates that can be used for rapid design exploration and optimization of 3D stacked ICs. Developed in collaboration with IMEC and Qualcomm, the solution extends Javelin’s existing PathFinding methodology and j360 Silicon PathFinder physical design prototype platform to support virtual chip design (Figure 2).

Figure 2. Javelin’s 3D PathFinding solution allows the designer to assess the impact of various 3D interconnect strategies throughout the IC design and fabrication process, in a matter of just a few hours or days. Silicon process engineers can use it to fine-tune their technology to the system architecture specifications.

MAX-3D, R3Integrator, R3CAD, and R3Artist; R3Logic

These tools, developed through work conducted as part of research programs sponsored by the Defense Advanced Research Projects Agency (DARPA), enable 3D IC design and analysis. MAX-3D is a 3D mask layout tool whose technology file includes all properties of the stacking process, wafer orientations, bond materials and via electrical/material properties, and also incorporates 2D foundry design kits. R3Integrator is used for die/interposer/package co-design with TSVs. R3CAD is a Java-based, multi-platform tool for 3D design research and prototype study, and R3Artist is an embedded 3D layout editor (Figure 3).

R3Logic is currently collaborating with STMicroelectronics and CEA-LETI to develop a full 3D design flow for 3D heterogeneous system and system-in-package design.

Figure 3. R3Artist features single and multiple wafer technologies, integrated material properties database and solid model extraction, including dielectric layers.

3DCACTI

3DCACTI estimates the optimum access times and power dissipation of a cache using 3D IC technology for a given number of active device layers and by partitioning device layers for various technology nodes. Based on the estimation, it searches for the optimized configuration that provides the best delay, power and area efficiency trade-off according to the cost function for a given number of different 3D partitions.
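The kind of search described here can be sketched generically: estimate delay, power and area for each candidate partitioning, fold them into one cost with user-chosen weights, and keep the cheapest. The code below is a hypothetical illustration of that idea only. It is not 3DCACTI's interface or cost function, and the `CacheConfig` type, the weights and every candidate figure are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    layers: int        # number of active device layers used
    delay_ns: float    # estimated access time
    power_mw: float    # estimated power dissipation
    area_mm2: float    # estimated footprint


def cost(cfg, w_delay=1.0, w_power=1.0, w_area=1.0):
    """Weighted-sum cost; lower is better. The weights absorb the units."""
    return (w_delay * cfg.delay_ns
            + w_power * cfg.power_mw
            + w_area * cfg.area_mm2)


def best_config(candidates, **weights):
    """Return the candidate with the lowest weighted cost."""
    return min(candidates, key=lambda cfg: cost(cfg, **weights))


# Invented estimates for one cache split across 1, 2 or 4 layers.
candidates = [
    CacheConfig(layers=1, delay_ns=2.0, power_mw=100.0, area_mm2=4.0),
    CacheConfig(layers=2, delay_ns=1.5, power_mw=110.0, area_mm2=2.2),
    CacheConfig(layers=4, delay_ns=1.2, power_mw=130.0, area_mm2=1.3),
]

print(best_config(candidates).layers)
print(best_config(candidates, w_delay=50, w_power=0.1, w_area=5).layers)
```

With equal weights the power term dominates and the single-layer option wins; weighting delay and area more heavily shifts the choice to the four-layer partition. The answer depends as much on the cost function as on the estimates.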

3D Magic and PR3D, Massachusetts Institute of Technology

3D Magic is a comprehensive layout methodology for 3D circuit-layout editing and extraction with MAGIC, an open source layout editor developed by UC Berkeley. PR3D is a placement and routing tool for standard cell design in 3D. Both tools were developed through MIT’s Interconnect Focus Center Research Program. MIT also developed SysRel (System-Level IC Reliability) for assessing the interconnect reliability of 3D ICs from a thermal-aware perspective at the circuit-layout level.

Conclusion

With the pressure on traditional 2D chips mounting, 3D integration has begun to establish itself as a viable means of breathing more life into Moore’s Law. It certainly touches on all the hot buttons in the industry today, namely low power, cost and time-to-market. The challenge will be in ensuring that these benefits are realized in a timely and efficient manner. 3D-specific design tools and methodologies are coming to meet this challenge head on. In the meantime, the tools available now, and the groundwork for future tools and methodologies being laid by industry organizations, academia and commercial companies alike, will go a long way in ensuring 3D integration plays a critical role in the future of the semiconductor industry.