Posts Tagged ‘Xilinx’


Using Power Aware IBIS v5.0 Behavioral IO Models To Simulate Simultaneous Switching Noise

Thursday, April 25th, 2013

Simultaneous switching noise (SSN) transient simulations typically require significant CPU and RAM resources. A prominent factor affecting both CPU and RAM resource requirements is the number of MOSFET models included in the post layout extracted IO netlists. By replacing the IO netlists with power aware IBIS v5.0 behavioral models, both the CPU and RAM resource requirements are dramatically reduced. A comparison of several SSN transient simulations in which the aggressor frequency is swept across a wide frequency range is shown. The resultant victim waveforms clearly demonstrate that each SSN transient simulation using post layout extracted IO netlists requires days to run, compared with mere minutes using power aware IBIS v5.0 behavioral models. Most notably, there is no significant loss in accuracy. In fact, in many cases, there is an increase in accuracy due to convergence issues associated with post layout extracted IO netlists. The power aware IBIS v5.0 behavioral models offer both dramatically faster transient simulation times and lower memory requirements. Improving these two key metrics without sacrificing accuracy allows for more aggressive and accurate signal and power integrity analysis than has previously been possible.


Prototyping Now A ‘Must Have’

Thursday, November 29th, 2012

By Ann Steffora Mutschler
No longer a ‘nice to have,’ FPGA-based prototyping is now indispensable for SoC and ASIC development. Semiconductor companies are investing in the infrastructure, the EDA tool chain, the human resources and everything needed to set up an entire department to focus on prototyping, emulation and validation.

“We are seeing these customers invest in significant amounts of equipment because, if you look at a lot of these different companies, they are not just building one chip. They are doing one chip this quarter and they are doing another one next quarter,” observed Kirk Saban, a senior product line manager at Xilinx.

As the name implies, FPGA-based prototyping tools allow a design team to map their RTL code into one or more FPGAs to allow for analysis, debug, and software development long before the actual chip is manufactured. The beauty of this technology for the user is that because the FPGAs are programmable, investment can be made in a hardware infrastructure and scaled up as the need arises.

The use of FPGA-based prototyping is driven partly by shrinking process geometries. The mask cost to spin an ASIC at those geometries goes up every node and the barriers of entry to play in that game are very high, Saban noted. Another market driver is the explosion of smart devices, mobile handsets and tablets, all of which contain multiple processor cores. “If you crack open an iPhone or an iPad or Samsung tablet they all have a system-on-a-chip in there that has multiple processor cores with some custom logic around it. Everybody wants a new gadget every 18 months so all the guys that build these chips have to crank them out at that pace, and the only way they can do that is by turning to this approach to validate their methodology,” Saban asserted.

From the IP provider perspective, Javier Orensanz, director of product management at ARM, said the company is working with partners to improve the software debug experience. “In general, the areas where we focus are modeling software development on processors that are either inside or implemented alongside FPGAs. The big problems for the software developer are actually quite similar to the ones that the silicon vendors face. So when we are talking about hardware, which is buggy or nonfunctional, software-hardware integration problems, issues that happen at the boundary with the FPGA itself and the FPGA subsystem—these are normally the ones that are causing most problems today. The difference between a software developer working around an FPGA and a silicon vendor is that normally OEMs don’t have access to the EDA simulation tools and the development tools that are available for the silicon partners.”

What this means is that the software developer doesn’t always have the visibility into the hardware that they may need, which is what the prototyping provides. There also are specific technical challenges between the FPGA and the CPU that are design-dependent.
“It really depends on the hardware design,” said Orensanz. “Some do a better job than others in providing enough hooks into the subsystem. If the customer designs extra holes on the FPGA, somehow they need to make them available for connection of a debugger to the main data interface of the FPGA. Otherwise they may not be able to have a software debugger connection of all the signals in the system. And then, if we want to concentrate a system trace on some corners of the FPGA, again you need to provide trace interfaces that connect both the FPGA subsystem and the CPU subsystem.”

As such, the ability to thoroughly debug is crucial. Fast turnaround time is needed to locate problems, as well as having as much debug data as possible for greater visibility into the correct module in the design where the bug is occurring.

“We support what is known as real-time debug,” said Mick Posner, director of product marketing for FPGA-based prototyping at Synopsys. “Customers like to use logic analyzers if there are signals or I/O interfaces that they want to see. You can select that to be real-time debug to route it out to a standard Mictor card, and they plug in their logic analyzer, which has virtually unlimited storage. Part of that real-time debug flow is also to set up that analyzer to capture those signals. During those early stages of bringing up the system you want that rapid turnaround time, you want to over-instrument and look at everything; later on in the design cycle you will probably want a deeper visibility.”

Both Synopsys and Xilinx offer debug tools. Xilinx’s approach is more geared toward debugging at the multi-chip partitioning level, though. “What we see is that most of the customers that are using these types of systems are building very large ASICs and SoCs and typically, even with our very largest chip, they need more than one of them to be able to prototype their ASIC, so they need to figure out how they are going to partition their ASIC into multiple FPGAs to be able to prototype it and emulate it,” Saban said. “The ASIC prototyping guys are taking it to another level where they are having to take a huge ASIC and split it into two or four or six or eight [Xilinx] 2000Ts and figure out how to make that all speak together through multiple connectors on a HAPS board. Or they’re building their own board with 8 of our largest FPGAs on it and figuring out how they’re going to partition that and debug it, and make it all work together.”

The good news is that with the FPGA-based prototyping market growing at a fast clip, leveraging the latest-generation FPGAs, and tool vendors trying to capture their share of the market, users will benefit, especially because using this technology is now non-negotiable for SoC and ASIC development.

Experts At The Table: FPGA Prototyping Issues

Thursday, September 27th, 2012

By Ed Sperling
System-Level Design sat down to discuss challenges in the FPGA prototyping world with Troy Scott, product marketing manager at Synopsys; Tom Feist, senior director of marketing at Xilinx; Shakeel Jeeawoody, vice president of marketing at Blue Pearl Software; and Ralph Zak, business development director at EVE. What follows are excerpts of that discussion.

SLD: As we go forward, there are more tradeoffs. What do design teams need to do differently in the future?
Feist: There are a lot of what-ifs. What if I set up a memory architecture like this? Another challenge in developing these systems is getting the architecture right. Once I’ve committed to silicon, that’s a big cost. Did I get the interconnects right? Is my cache depth right? Do I need a level 1, level 2, level 3 cache?
Zak: You have to look at the complexity level and what’s driving prototyping. It’s a requirement. You have to wring out all these problems up front. There is no other way to do it. We’re used to thinking of RTL simulation as 50 cycles per second. I’ve seen chips that were in the single-digit cycles per minute. You can’t simulate the whole chip. The major blocks are so complex they’re hard to simulate.
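
To put the simulation rates Zak cites in perspective, here is a rough back-of-the-envelope sketch in Python. The two simulation rates come from his remarks; the workload of one billion design clock cycles and the 50MHz prototype clock are illustrative assumptions, not figures from the panel, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def wall_clock_seconds(cycles: int, cycles_per_second: float) -> float:
    """Wall-clock time needed to execute `cycles` design clock cycles."""
    if cycles_per_second <= 0:
        raise ValueError("cycles_per_second must be positive")
    return cycles / cycles_per_second


# Assumed workload (hypothetical): one billion design clock cycles.
WORKLOAD_CYCLES = 1_000_000_000

# Execution rates in design clock cycles per second.
rates = {
    "RTL simulation, 50 cycles/s": 50.0,
    "RTL simulation, 5 cycles/min": 5.0 / 60.0,
    "FPGA prototype, 50 MHz (assumed)": 50e6,
}

for name, rate in rates.items():
    seconds = wall_clock_seconds(WORKLOAD_CYCLES, rate)
    days = seconds / SECONDS_PER_DAY
    print(f"{name}: {seconds:,.0f} s ({days:,.1f} days)")
```

Under these assumptions, the workload takes 20 million seconds (about 231 days) at 50 cycles per second, but only 20 seconds on the assumed 50MHz prototype, which is the gap that makes full-chip simulation impractical.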

SLD: Chips are buggy today, no matter what. Even IDMs are fixing chips after rollout.
Scott: These prototypes may reach 70% to 80% of their test coverage in simulation context and then transition to prototype. In those accounts, you have the systems engineers—not even C++ but SystemC—deploying these very early models of the system. The prototyping person needs to be able to communicate those technologies to connect them, but they’re both clearly in the signoff path. What can programmable fabric bring to the party? There are variants, but what if you can extend an architecture by adding a processor subsystem or a special video interface? You may use standard platforms and then just extend that.

SLD: Do FPGAs add another debug capability because the hardware is programmable?
Zak: Yes. The problem is that you’re limited to how much you can squeeze into an FPGA, but it’s at least an order of magnitude greater for logic gates. That will affect your die size and how much programmable logic versus custom logic is in a device. Over time you probably need to move to some sort of hybrid.
Feist: The challenge is that even to fix it, you have to figure out if the programmable fabric is connected to the part that’s broken. Can you actually get to the parts? The chances that you’re going to have the interconnect to that part that you’re trying to fix is almost zero.
Scott: For applications like mil/aero you need triple modular redundancy for SEU (single-event upset) state machines. Rather than trying to figure out which three of these register spaces is correct, the two that say they’re correct could improve the overall bug detection.
Zak: It may be tough to say something occurred in a programmable fabric and it can actually be fixed there. It may be because gates are becoming almost free—at least in one respect. With power management schemes you put in redundant circuits so that if you have a failure in one you have another. We may start seeing more redundancy in the critical parts of circuitry.
Feist: We’re already seeing that in applications where they can’t afford any down time. Now, even in the data centers, they’re requiring it.
Scott: It’s not just mil/aero. FPGA technologies are being applied for ASIC debug. If you go back into emulation and prototyping hardware you’ll see technologies that attach to either a custom or commercial prototyping system to extract system states to achieve near-emulation types of capabilities. Because you need to re-instrument a design in multiple places you need reprogrammability. It’s another interesting application to attach arbitrary probes.
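
The triple modular redundancy Scott mentions can be illustrated with a minimal software sketch of a 2-of-3 majority voter. This is only a model: real TMR is built from replicated hardware registers and voter logic, and the function names and the 8-bit example value here are hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit matches at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def disagreeing_copies(a: int, b: int, c: int) -> list[int]:
    """Indices (0, 1, 2) of the copies that differ from the voted value."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


# Example: three copies of a register, one of which suffers a single bit flip
# (a simulated single-event upset).
golden = 0b1011_0010
upset = golden ^ 0b0000_1000  # flip one bit in the second copy

print(bin(majority_vote(golden, upset, golden)))
print(disagreeing_copies(golden, upset, golden))
```

The voter masks the upset because the two unaffected copies outvote the corrupted one, and the list of disagreeing copies indicates which replica needs to be corrected.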

SLD: If you get everything right, you should have less debugging, right?
Jeeawoody: Yes, but it’s so complex that it’s hard to debug anything up front. That’s why we have these complex debug systems.
Scott: A couple years ago I went to one large vendor’s Web site and printed out the errata for one of its platforms. I thought it was going to be a couple of pages. It was a full ream of paper. With all the methodologies that we provide, there are still huge numbers of errors. And those are well-used, wrung-out parts used in a lot of systems around the world.

SLD: Looking out five years, what happens with FPGAs and the tools that are out there?
Scott: You’re going to involve more people and more IP, so there will be more bottom-up methodologies for divide and conquer. There will be more ASIC IP. That will migrate to the FPGA domain. And you will still need this robust timing closure, which will have to be solved at the RTL. You’re not going to be deep into static timing analysis trying to solve something at the gate level. It will have to be handled at RTL or higher. What we consider ASIC-caliber tools will be pulled in.
Zak: The implementation software is going to get extremely complex. If you start migrating it down the technology curve you’re going to run into timing and electromagnetic issues in terms of crosstalk in an FPGA that you normally deal with in a custom ASIC. The tools will have to get that much more sophisticated. It may get to the point where there are so many gates that dies will shrink. You may only use them in prototyping applications.
Jeeawoody: We’re also going to see more and more IP blocks—hundreds of them. Each block probably has been analyzed individually. When you assemble them all together, the inter-block analysis will become really important. That’s going to be particularly important at the interface level. You need to ensure you get the most utilization and performance.
Feist: As you go down Moore’s Law, the cost per gate is not tracking anymore. As a result, you’re going to see platforms, either in 3D technology or hardened blocks. There will still be a place for glue logic, but you’ll see more platforms in the future for market segments. You’ll see fabric to differentiate those, as well. To grow the market, it won’t be just programmable logic. It will be programmable platforms. We’re looking at how to suck more of the bill of materials into our platform. Process technology and 3D designs will always be in the programmable space. There will be more platforms in the future rather than just process. Even today you could build and program 80% of a device and give it to someone to differentiate. Economically, if you’re going against an ASSP or an ASIC it’s still too expensive for high-volume applications. If the market is ripe, why not harden things? But most people wouldn’t build it all themselves. That’s what’s so attractive about 3D technology. It’s challenging, but it opens the door to a lot of possibilities for FPGA vendors. It’s no longer just a gate array.

Experts At The Table: FPGA Prototyping Issues

Friday, September 7th, 2012

By Ed Sperling
System-Level Design sat down to discuss challenges in the FPGA prototyping world with Troy Scott, product marketing manager at Synopsys; Tom Feist, senior director of marketing at Xilinx; Shakeel Jeeawoody, vice president of marketing at Blue Pearl Software; and Ralph Zak, business development director at EVE. What follows are excerpts of that discussion.

SLD: Are FPGAs starting to be used as platforms for a 3D stack, where you put a memory chip on an FPGA?
Feist: We just introduced 2.5D. Those are slices of an FPGA fabric on an interposer. We released a heterogeneous one with external SerDes and an FPGA. We’ve got the technology now. Where we take it will be driven by markets. Because we are general-purpose, there has to be enough of a market to say it’s worthwhile to go out and build. Memory is an area that would make sense. If you can suck those into the device, you reduce the board problems of routing.
Scott: As far as hybrid ICs, it makes perfect sense. You have a yield benefit, because you don’t have to produce such a large die. You can piece smaller portions together. That’s a big time-to-market, reliability and economic benefit. The thing that’s emerging for us is the expense of software development. You’ll see a lot more high-level development tools that address multicore and the challenges of bringing up the software. That requires platform analysis tools, virtual prototype distribution and new business models. It’s not just about selling synthesis tools.
Zak: The big thing we see from a verification standpoint is application-specific debug. You’ve got networking, PCIe, video, graphics, and you have to move the verification environment up to a level where you’re looking at video streams, packet analysis, and not just the signal. When you’re looking at chips with hundreds of millions of gates you can’t spend your time watching waveforms. You have to be able to identify, capture and reconstruct the instant something gets triggered. You have to build a deterministic debug environment. That’s part of what we’re seeing.
Scott: FPGA vendor tools aren’t enough. These system and software tools have to comprehend the resources of the multi-board systems. They have to understand the interconnect for system-level static timing analysis. They have to know about dedicated memory resources and even verification IP. That’s something the FPGA makers would prefer not to get involved with.

SLD: Still, adoption rate of more sophisticated tools for FPGAs has been much lower than what EDA companies thought it would be. Is that changing?
Scott: Yes. The chips are larger, there’s more collaboration and more IP. They have to migrate the ASIC IP into the system. False path extraction, data clock conversion—things that can really slow a system down—those are all automation-dependent.
Jeeawoody: You need to make sure you can optimize the right portions of the design. One thing we’re also seeing is that the designs are getting so big that you may have to split it into multiple FPGAs. Technology is enabling that split.

SLD: As FPGAs go up in complexity, is there a crossover where you may do part of the design at 20nm and another part at 130nm?
Feist: It’s the system architect making those decisions, typically. If you’re doing a base station, how will you architect that? There are lots of different possibilities. You can use a TI OMAP or an FPGA. There are tradeoffs made up front in terms of how long it will take to develop it. We had one customer that was using five TI chips instead of one FPGA because they would have to migrate over the code base and figure out how to put it in. It was drawing more power, but it was faster time to market. We look to the EDA side to make the process more streamlined, whether it’s high-level synthesis capabilities. What we provide is a simulator, not a verification tool. You can make RTL work with it. The design methodology needs to focus on reducing the pain, but it has to be done at the architectural level or it’s meaningless.

SLD: Is the main concern time to market, or is it performance?
Feist: Designs are evolutionary rather than revolutionary, so you need to be able to show a 3x advantage to get people to change. The reason is that they’re risking their businesses on changing their methodologies, and there is an infrastructure they have to put in place. They don’t have any hardware designers. That’s why you have to have methodologies. They have a lot of C programmers, but they don’t have anyone who knows SystemVerilog. If they’re coming from the ASIC world, that’s not a problem. But if you are coming off of standard, off-the-shelf, you have to put in methodologies they feel comfortable with.
Zak: In the high-performance computing market, people were starting to take FPGA-based boards that were PCIe-oriented, and plugging them in as a co-processor. The algorithms they’re trying to automate—financial trading, oil and gas exploration, image processing—these are all C++ programmers. That’s where they need to go directly to FPGA prototypes. They don’t understand Verilog.
Jeeawoody: It’s a more general problem. When we went from Verilog to SystemC, there were not many people who could think at that level. SystemVerilog took off because it was just an extension of Verilog. They could extend their knowledge of Verilog and apply that to the system. We’re seeing the same thing now with embedded systems, asking the architects to think at a system level.
Scott: The EDA industry is addressing more and more of the software development community. There are more embedded CPUs, and our tools have to account for the distribution of these high-speed prototypes. Often you receive a box, and there had better be a very nice interface. It sometimes is a mix of hardware and SystemC, and it has to complement what they already know. If they’ve got a multicore system and they have to figure out context switching and what threads are running on what processor, there have to be complementary tools there.

SLD: Multicore seems to be an enormous problem. Is it more difficult with an FPGA, or is it still the same?
Feist: The people building multicore systems want 800MHz or 1GHz. If you’re just doing it in fabric you’re going to get anywhere between 200MHz and 500MHz. Then you have to worry about what your interconnect will look like. If you add an accelerator in the software stack, does that slow the system down? That’s why several years ago we decided to use hardware. Otherwise you can’t do multicore on an FPGA. In the past we had a PowerPC in there with an FPGA wrapped around it. It was usable, but it wasn’t ideal. You won’t get a high-performance processing system out of that. We look to the EDA side to profiling a system and bringing up the software. There are a lot of areas where we can provide the fabric and base methodologies, but we won’t provide the entire design flow.
Zak: The challenge is to keep going up. The ITRS roadmap calls for doubling the number of processors. And those are application-specific processors, combined with software, going up 2x to 4x each chip generation. If we scale to 14nm, the typical chip will have 12 processing cores of various types—application specific and general purpose—all running software on them. You’re going to need entirely new chip architectures for the communications between them and managing all the caches. These are what our customers are dealing with. And then getting the timing neutralized so we’re not introducing new issues—that challenge is increasing multiple times.

Experts At The Table: FPGA Prototyping Issues

Thursday, August 23rd, 2012

By Ed Sperling
System-Level Design sat down to discuss challenges in the FPGA prototyping world with Troy Scott, product marketing manager at Synopsys; Tom Feist, senior director of marketing at Xilinx; Shakeel Jeeawoody, vice president of marketing at Blue Pearl Software; and Ralph Zak, business development director at EVE. What follows are excerpts of that discussion.

SLD: Where are the problem areas in creating FPGA prototyping and FPGA platforms?
Jeeawoody: One of the missing pieces is a complete tool set that’s easy to use, and which gets you from A to Z quickly. Constraining the design has been an issue for a lot of people. Automating that process is key.
Feist: Designers are using the biggest and baddest devices out there and they’re usually early in the design cycle. If they’re doing an ASIC, they’re doing a big device. They want everything to work perfectly. And they’re usually the early adopters of the latest process nodes. The vendors find some hiccups, and they’re maturing it as the FPGA prototypers are in the middle of the design. The challenges they face help us bring up our devices and tools along the way. In terms of design flows, it doesn’t matter how big of a device they build. They typically have to spread it across multiple devices, so there are a lot of partitioning challenges behind that to make it easier. But some blocks just don’t break up easily. You run into things like time-domain multiplexing between different I/Os. The prototyping market usually isn’t running at the highest frequencies that will be used when chips go into production. That makes it a little easier. But the biggest challenge involves the people building boards.
Zak: There are a couple challenge areas. One is getting designs into FPGA-based systems, with hundreds of devices. That’s one of the biggest challenges in EDA because you have to take RTL, or in some cases people will be starting with higher-level models, and then you essentially have to push a button that reaches all the way down to timing-correct physical implementation. The better the speed, the more advantageous it is to the software developer.
Scott: The bring-up time is really a profound problem—code migration, the substitutions you have to make in migrating ASIC-style code into an FPGA. The emulation community has made compilers that are more compatible and which can accept ASIC-style coding conventions. But as people really try to achieve the 50MHz to 100MHz kinds of speeds on these prototyping platforms, you really need to use a more high-performance implementation. That means more automation, data clock conversions, and memory substitutions to get the design partitioned so there is no signal contention. That’s one side of it. The other side is that people rely on custom boards. They look at the bill of materials and say they know how to build those. There’s strong inertia to just keep building their own systems, but the practical matter is that the ROI—particularly applying them to multiple projects and scaling them—that’s why we’re seeing a trend toward more off-the-shelf solutions.

SLD: There’s been a lot of talk that to do a complex FPGA you’ll need ASIC-style tools, but the free FPGA tools are still extremely popular. Is that changing?
Jeeawoody: We’ll see the change as people address timing. Timing and timing closure are becoming more difficult as chips get bigger. In the past you could just do synthesis and route. We see that as another barrier—getting FPGA designers to think about timing. That’s where standards and how to use them are becoming important.
Feist: Anyone who uses an FPGA has to use the vendor’s place and route. Beyond that, they can choose. This is an area where the standards will help things. Xilinx has been very proprietary in the past. The tool chain was about 15 years old, and if it wasn’t invented at Xilinx at that point then it didn’t get used. That’s why we’ve redone the tools and tried to use every standard that is out there. Traditionally the guys doing the ASIC emulation don’t know what a UCF constraint is. They were having to convert IP they purchased into something that could be constrained to be timing critical into our format. It had to match up with UCF.
Scott: You’re trying to do two things. You’re trying to maximize the actual runtime speed of the system in a timing-neutral environment. But you’re also trying to minimize the critical path across the prototyping. That may include multiple boards and multiple chips. What you’re doing is neutralizing the timing impact and running a purely functional verification environment. You also want to maximize its runtime speed, so you identify the critical paths, shorten the timing delay and actually create a new critical path, and then iterate to get to the fastest implementation. That’s one of the key challenges: getting that right.

SLD: For a long time, the idea was to do a prototype and roll it back to an ASIC. Is that still the case?
Feist: We do see that, where people start with an FPGA and plan to move it to an ASIC. The crossover points have changed dramatically, particularly at 28nm. We’re now including a full processing system, which makes it quicker to customize, too. It’s not quite an Atom processor but at least you can bring up your software. There was a programmable IP company where the vision was that customers would have to do many versions of a chip. As the cost started going up, they decided to produce one die and put different part numbers on it. Others have started doing that. It has to do with the lifecycle of the end product. Consumer product cycles used to be nine months. They’re now six months.

SLD: Doesn’t that also fragment the market so you can’t get the same kind of volume from one chip?
Feist: Yes, and the way to get more volume is with these hybrid chips. If you have a really long lifecycle, you may start with an FPGA and migrate it to ASIC. Or you may have really high volumes but you’re time-constrained. Before you can start with an ASIC implementation you’re onto a new product. I don’t think we’ve seen much of that, but there is potential for that to become part of the trend.
Jeeawoody: 28nm seems to be the inflection point. There’s always a balance between cost and an FPGA. But at 28nm they’re more comfortable doing production FPGA designs.
Feist: We track on a quarterly basis design wins and we look at who we competed against. It’s hundreds of ASIC replacements per quarter. That’s the growth of the market.
Zak: The other side of this is that you’re seeing a change in the industry and the dynamics of who the players are. Fifteen years ago we were looking at 12,000 ASICs a year and about 4,000 standard parts. We’re still looking at the same number of standard parts, but the ASICs are down to about 1,500 a year.

SLD: Have the FPGAs picked up a lot of the slack?
Zak: It’s that plus the standard parts have all become platform chips. They’re all becoming SoCs. Broadcom and Qualcomm are building what essentially are ASICs, but they’re selling them to multiple vendors. If you don’t have enough volume to build your own parts, you buy from Broadcom, Qualcomm, TI and others. The dynamics of the semiconductor industry have changed.
Scott: PLD and FPGA vendors have been invited to the party. They’re fixing ASIC problems. Power has come down enough. And there’s a lot of hardened embedded functionality.
Feist: The FPGA of 10 years ago is not the same as an FPGA today. Today FPGAs have analog/mixed signal capabilities, A-to-D converters, serial I/O and block RAM and distributed RAM. If you think about an FPGA five years ago—a field-programmable gate array—that doesn’t apply anymore. But every piece of this is still programmable.

Blog Review: Aug. 22

Wednesday, August 22nd, 2012

By Ed Sperling
Mentor’s Nazita Saye looks at the wind resistance of dreadlocks, the best way to design a bicycle and why male athletes shave their legs. This is the kind of stuff aerodynamics engineers think about.

In case you haven’t noticed, there aren’t a whole lot of young engineers running around the hardware industry. How do you change that? Synopsys’ Karen Bartleson and Rich Goldman talked with Xilinx’ Patrick Lysaght about engineering education.

Cadence’s Joe Hupcey interviews his colleague, Bin Ju, about formal technology and what’s the best way to adopt it. Check out the blowing vegetation in the background. It looks as if a storm is moving in. Maybe it is.

Independent blogger Gaurav Jalan applies a handful of laws—Newton’s, Murphy’s, Moore’s, the law of natural selection—to verification. The reverse might also be interesting.

Mentor’s Robin Bornoff pins part five of his epic on the best location for a radiator in a room to a job interview. Apparently reading comprehension skills are required.

Synopsys’ Helen Thibieroz talks with Rambus’ Bing Chuang about speeding up DFT logic and timing verification.

Cadence’s Jason Andrews rolls out the latest installment of his Ubuntu OS epic, this one on improving fonts for SimVision.

DeepChip’s John Cooley looks into rumors about Synopsys’ next acquisition target and what’s behind it. You never can tell who’s just talking and who’s really bargaining—or whether anything will happen even if both parties are serious—but it’s an interesting snapshot of the competitive stakes and who’s got what.

Verilab’s JL Gray ponders the possibility of a $10,000 ASIC. Break out the spreadsheets and sharpen the pencils. This is going to be a long night.

Mentor’s Colin Walls peels back the covers on evaluation boards for anyone doing embedded software development, offering several possible uses and advantages for choosing this option.

Synopsys’ Navraj Nandra looks at what it takes to bring non-volatile memory IP up to automotive quality standards.

Cadence’s Richard Goering interviews Martin Lund, who just jumped ship from Broadcom’s network switching group to Cadence. You’ll find out why.

New Kinds Of Hybrid Chips

Thursday, June 28th, 2012

By Ed Sperling and John Blyler
Crack open any SoC today and it will contain a variety of third-party memory, processor cores, internally and externally developed software and analog. In fact, the main challenge of most chip designs today is integration and software development rather than developing the chip from scratch.

By that definition, almost any chip is a hybrid. But the definition is about to expand significantly over the next few years, as Moore’s Law becomes increasingly difficult to follow and more of the chip is developed in discrete pieces that may go together horizontally, vertically, and sometimes even virtually.

Stacking of die, notably 2.5D configurations, is merely the first step in this process. Going vertical with 2.5D and full 3D versions will likely create a market for subsystems that are silicon-hardened. This makes good sense from a business standpoint, because not every part of the chip needs to be manufactured using the latest process node. In fact, analog developed and verified at older process nodes will likely work fine with a processor core developed at 20nm.

“In general, the trends are toward it being harder and harder from a process technology standpoint for foundries to create a process that is good,” said Hans Bouwmeester, director of IP at Open-Silicon. “That’s true for digital CMOS, for analog, for embedded DRAM and for embedded flash. We’re going to see a lot more heterogeneous die in a package, each in its own process technology. So you’ll have the CPU die in digital using a low-power/high-performance process, a high-speed I/O die with high-speed SerDes, and then you’ll have specialized RF, DRAM, and flash.”

FPGAs could become part of the stack, too. In fact, both Xilinx and Altera have created 2.5D planar chips and have commented publicly that they can be used in stacked configurations with other die.

“What you’ll see is that one side will become more specialized,” said Bouwmeester. “The other side will be everything in a package, which opens up enormous possibilities.”

Software
At least part of this is being made possible by software. Getting to tape-out is still a big problem, but it’s certainly not the only one—and maybe not even the biggest. Software development has become a huge challenge. Recent IBS data (see Figure 3) agrees with other evidence that software has become the big driver of cost and schedule. What is unique to the IBS data is confirmation that this trend accelerates at each lower geometric silicon node. Like chip hardware, software—including firmware, operating systems, middleware and even applications—becomes more complex with each generation of Moore’s Law.


Fig. 3: Software costs escalate with each advancement in process node.

Software tends to be the main product differentiator, in large part because hardware has become a commodity—a trend that will likely continue as subsystems and processor platforms become too expensive for most companies to develop. In a “fast market” such as mobile handsets, manufacturers that miss the market by as little as 9 to 12 months may lose $50M to $100M in potential revenue. This revenue loss combined with the extra development time required by software is one reason why software and hardware co-design approaches are so important. In addition, it explains the rise in popularity of virtual and FPGA-based prototype systems and emulation platforms.

Chips also can be built with the assumption that they’re part of a broader communication and storage scheme. Apple’s iCloud is one example of this, where at least some of the processing is done externally, allowing devices to behave almost like thin clients at times, and as fully functional processors at others. This virtualization allows a whole new set of tradeoffs in design, putting as much or more emphasis on the I/O as on the processor and memory.

Mixing and matching
All of these considerations are the result of a big speed bump in IC design, which has forced the semiconductor industry to look elsewhere for gains in performance and efficiency. Double patterning at 20nm has greatly increased the cost of manufacturing, and it will increase further still at 14nm if EUV isn’t commercially viable. The key sticking point there is how many wafers can be processed per hour using EUV. Throughput is currently far too low for EUV to be a viable replacement for 193nm immersion lithography.

But there also is a possibility of double patterning only part of a chip, and developing the rest on the same planar die in an older node. Luigi Capodieci, R&D fellow at GlobalFoundries, noted this is a very real possibility for reducing development costs in the future. But so are new techniques such as directed self-assembly, which can supplement multi-patterning and potentially help keep the cost down.

Still, cost isn’t the only issue that has to be considered. Heat is difficult to remove from chips that are packaged together. While some of that can be programmed away, running a processor at maximum speed for a short period of time and then shutting down, some of it also has to be engineered out with new structures such as FinFETs and new materials such as silicon on insulator (SOI), which can reduce current leakage that causes heat in the first place. What’s new here is that chips may be a combination of all of these things, with companies investing more money in certain portions of a chip—or a die within a package—and reducing costs in other areas. So areas that don’t generate much heat, or functions that aren’t used as often, won’t require as much engineering or the latest process technology and presumably can be done using single patterning.

Conclusion
IC design and manufacturing have been largely evolutionary. After decades of slicing costs at every new process node, it’s difficult to give up on a model that has worked well. The move to 450mm wafers will help boost efficiency even further, provided that yields are reasonable.

However, there is also a growing recognition that not all parts of a chip will continue down the Moore’s Law path at 20nm and beyond. Some portions of an SoC will remain on that path, others will not. But they may all be part of the same aggregate solution, packaged together in unique ways that can actually improve performance, lower power consumption, and get to market on time and with minimal risk of failure.

Blog Review: May 9

Wednesday, May 9th, 2012

By Ed Sperling
Mentor’s Mike Jensen looks at the “Rooster Tail,” the giant fan of water released out of three dams in the Pacific Northwest. Check out the photos and you’ll know why they call it the Rooster Tail. This would certainly be a rude awakening in the morning.

Synopsys’ Navraj Nandra examines whether DDR3L will ever make its way into mobile DRAM. The answer, apparently, is no.

Cadence’s Richard Goering peers into the details of 20nm RTL to GDSII methodology. This is like looking over Niagara Falls where manufacturability, timing variability and design size and complexity are churning at the bottom. Better reinforce the barrel.

DeepChip’s John Cooley focuses in on the changes on Mentor’s board of directors, notably that two of Carl Icahn’s three choices for the board have been replaced. This gives new meaning to hot swapping.

NXP engineer Chris Hill, standing in for Mentor’s Robin Bornoff, looks at diminishing returns in thermal design of PCBs and how extra copper layers don’t always help. Given the price of copper these days, no one will argue.

Independent blogger Gaurav Jalan examines the list of challenges for verification teams, providing some insight into why it takes so long to get a chip out the door. Even more disturbing, though, is that confidence in the final product appears to be on the wane.

Synopsys’ Eric Huang had one of those “ah-hah” moments about the limits of family involvement in technical subjects. You do what for a living? Is that legal?

Cadence’s Jason Andrews looks at simulation performance on a Zynq virtual platform using VirtualBox compared with native Linux.

Si2’s Steve Schulz previews what his group is doing at DAC this year, which will include a rundown of all the standards efforts under way at the moment—or at least the ones they’re talking about in public.

Mentor’s Brooks Moses looks at the embedded software in a control cluster of an unmanned aircraft and how difficult it is to program to get maximum performance. In this case, “maximum performance” may mean different things to different people.

Managing Complexity With Advanced Packaging

Thursday, March 22nd, 2012

By Ann Steffora Mutschler
Engineering teams across the globe continue to pound the process geometry treadmill to stay on the curve of Dr. Moore to achieve better speed or lower power or smaller die—and it all adds up to increased complexity in the design and packaging. However, with advanced forms of die stacking such as package-on-package, silicon-in-package, 2.5D silicon interposer technology and other techniques, engineering teams now have more degrees of freedom around how chips are constructed.

A significant consideration in moving from one process generation to the next is that there are many IP functions that must migrate. “Sometimes it’s too expensive to port it from one generation to the other and you may not need it as far as the speed or as far as the power,” noted Shafy Eltoukhy, vice president of manufacturing operations for Open-Silicon.

This is where advanced die stacking comes into play. The engineering team may consider going to 28nm for one particular aspect of the function—for example, to get a better speed in the ARM processor—while there are a lot of other interfaces for a particular die that may not have to be in that advanced process node. A USB 2.0 or 3.0 does not have to be in 28nm to achieve the requirements—it could be in 90nm or 40nm, he said.

“The whole notion of re-using IP is common, though something not as commonly discussed is the reusability of die. What we’ve been seeing a fair amount of is companies saying, ‘I’m going to use advanced packaging techniques that are available today and I’m going to take this older generation die that I’ve got sitting on the shelf. And I’m going to make a much smaller new chip to complete it or extend it or interface to it. And I’m going to put that all into a multi-chip module, or advanced packaging structure, and circle back and use a lot of the IP that is in actual hardware form and make that available.’ It’s not mainstream, but reusing IP 15 years ago wasn’t mainstream either,” said Jack Harding, president and CEO of eSilicon.

Engineering teams tend to have a certain function they really want to squeeze and go to the next generation, but there are a lot of other functions in the design that don’t have to be in the latest generation, Eltoukhy observed. In advanced SoCs, customers are paying first and foremost for the IP development. “You are paying more dollar-wise per silicon area for a function that does not have to be in 28nm.”

What process node makes sense
Naturally this leads to a discussion about not bringing every single function into the next generation, especially because some analog and RF functions do not scale very well. So why not stay in the previous generation and partition the design in order to leverage older technology where available and not re-invent it?

“What I have to do instead is some kind of interface between this technology and the new technology. I put only the function that I want in the technology that can handle it and leave the other somewhere else,” he noted.

The question then becomes how to connect these together. “You certainly can connect them on the package level, which people used to call MCM (multi-chip module). You can actually get multiple die and bolt them in the substrate of the package and connect them. But the package technology has been way, way behind compared to the silicon technology, and you may end up with much higher power and slow interfaces and so on,” Eltoukhy explained. This has led to the development of silicon interposer technology in order to replace the substrate interconnect or the package interconnect, which is commonly known as 2.5D stacking.

Essentially, silicon interposer technology connects one die to another instead of connecting to a package, thereby reducing power and improving speed. Xilinx already has made its version of 2.5D-stacked technology available with certain product families.

Another use of 2.5D would be in a processor design that needs to talk to a DRAM, he continued. “Most people have a DDR interface and you go through the board to interface with the memory. But this approach is slow and large. Instead of buying a DRAM package from a DRAM vendor, we ask the vendor to sell us a known good die, which can be attached with processors on an interposer so you don’t have to go outside the chip. The DRAM can talk to the processor right away and the form factor will be much, much smaller. So there are multiple applications for that interposer—mixing the process nodes so that you can reduce the cost and so on, and improving the yield or bringing up some known good die from the DRAM to your die.”

“The application processors, which are really only delivered with package-on-package memory, end up with a very easy knob in that system—they can pile on different amounts of DRAM. To them it’s almost the same design and it is the same software. A couple of bits different in the software and suddenly they’ve got a new derivative part,” said Drew Wingard, CTO of Sonics.

“In many cases the die itself has more package attachment or wire bonding sites than the package may have pins, so you may take the same die and put it into a different package with different amounts of I/O resources available, and then sell those chips—even though they are the same fundamental chip design—at different price points. That’s been going on for a long, long time but with some of the more advanced packaging technologies, there are new degrees of freedom there,” he added.

Tantalizing as they sound, all of these options are still under development. Complicating widespread deployment are two factions in the industry at odds as to the right path forward. On one side are the semiconductor foundries, which would like to enable customers to use an interposer because, at the end of the day, they want to sell more dies to put on the interposer, Eltoukhy explained. “They say, ‘We can give you the interposer but you buy the dies from us and we can glue it together for you.’”

In the other camp are packaging providers such as Amkor and ASE, which fear losing business to the foundries and would like to offer the interposer themselves so that their customers don’t source it from a foundry. “These two camps are fighting now because it requires some investment from a capex point of view,” he added.

Managing complexity, saving dollars
In addition to dealing with complexity, advanced die stacking techniques can save big dollars, eSilicon’s Harding asserted. “You could measure it just in terms of NRE dollars, you could measure it in engineer years of work, you could measure it in terms of time to revenue. By any metric, going down the advanced-package, multi-die solution is better by two orders of magnitude than just actually making a new chip, and I would argue it’s probably better by one order of magnitude by just doing RTL modification, which still has high NRE and a lot of technical risk, albeit you have a product that is closer to being the final product. These decisions are classic risk-reward.”

The Week In Review: March 2

Friday, March 2nd, 2012

By Ed Sperling
Synopsys issued a barrage of announcements, including new products, new relationships, and a new win. The company unveiled its next-generation verification IP based on its new VIPER architecture, with native support for OVM, UVM and VMM. Synopsys claims up to 4x performance over other commercial VIP. This is an interesting number, and likely will spark a volley of announcements from the other Big Three EDA vendors, all of which have been gearing up for what they see as a big opportunity in the VIP space. Synopsys also rolled out 28nm M-PHY IP that supports six different standards for mobile applications.

On the relationship side, Synopsys struck a deal with Arteris to jointly develop an IP solution based on the Low Latency Interface, which cuts the cost of the bill of materials by eliminating a memory chip and reducing the area of a PCB. In a related move, Arteris introduced its low-latency interface digital controller IP, which it says is already silicon-proven in TI’s OMAP platform.

Synopsys also is working to link Springsoft’s debug technology with its own Protocol Analyzer. It also won a deal with BiTMICRO for a slew of EDA tools.

Samsung teamed up with Mentor Graphics to create a DFM sign-off reference solution for Samsung’s foundry. This opens the door to a couple of other big deals for Mentor, as well, considering Samsung is one of the three main companies in the Common Platform. The others are GlobalFoundries and IBM.

Mentor also announced its Q4 financial results, which set a new record. Revenues for the quarter were $320.4 million, up from $307.3 million in the same period in 2011. For the 12 months ended Jan. 31, revenue was $1.015 billion—also a record—up from $914.8 million in fiscal 2010. Net income for Q4 was $57.8 million, up from $50.6 million in Q4 2011, and for the year it was $83.9 million, compared with $28.6 million the previous year. Mentor expects revenue to increase to about $1.1 billion this year.

Cadence unveiled the production release of a virtual platform for Xilinx’s Zynq-7000, which is based on the ARM Cortex-A9 MPCore. After years of EDA companies trying to gain a strong entry into the FPGA world, this is an interesting doorway.

Docea Power rolled out a new tool for architectural-level power and thermal analysis. Given the fact that the biggest savings in power and heat can be obtained at the earliest stages of a design, this is an important step forward. The next challenge is to implement this kind of capability into existing flows so that power and heat models can be integrated easily with other models. Functionality and performance are no longer enough.

Tensilica introduced its second-generation multimode baseband chip, which includes multiple dataplane processors. The chip was co-developed with NTT DOCOMO, Fujitsu, Panasonic and NEC. Tensilica also rolled out Dolby Digital Plus for surround sound on its HiFi Audio DSPs, and it struck a deal with ClariPhy, which will license Tensilica’s dataplane processors for optical networking mixed signal processing.
