Posts Tagged ‘Arteris’


Experts At The Table: Designing At 28nm And Beyond

Thursday, April 5th, 2012

By Ed Sperling
System-Level Design sat down to talk about design at future process nodes with Naveed Sherwani, president and CEO of Open-Silicon; Charles Janac, chairman and CEO of Arteris; Frank Schirrmeister, group director of product marketing for Cadence’s System Development Suite; Behrooz Zahiri, vice president of marketing at Magma (and currently director of marketing at Synopsys), and Charlie Cheng, CEO of Kilopass.

SLD: SoCs have always been on the high end of the cost curve. Will that change as they become more mainstream?
Schirrmeister: There are FPGA SoCs, which may have a dual A9 subsystem on which you can run Linux. And you also have 4 million gates in the large version where you can put your own RTL into it. That’s what’s approaching the ASIC side. You can get it pre-defined and add your components in RTL into the programmable fabric. And that’s already integrated into the system level with virtual platforms and high-level synthesis. Those are making these designs accessible to different markets.
Cheng: There will not be generic SoCs replacing ASSPs. Cost is very high at the system level, and every integration point costs something. It’s hard to say a generic SoC has a place. Every SoC I know is slated for a specific market.

SLD: We start doing different tradeoffs as we move down the curve, right? Time to market is arguably as valuable as paying an extra dollar for a chip.
Janac: Mobility SoCs are selling in huge volumes. That will continue to grow. But they’re designed for a specific purpose. What we’re starting to see is people building FPGA co-processors that can manage the functionality of the SoC at a very low cost because they are in huge volumes and there are very big contracts that can drive the price down. They are using those co-processors to take the SoCs into markets that they were never intended for. The FPGA business is getting very interesting.

SLD: Does software change, as well? Do we move away from a general-purpose operating system, or do we still have a big operating system and many little ones?
Janac: Virtualization allows you to run multiple operating systems at once, invisible to the user and the application software.
Schirrmeister: I was at an automotive conference recently and one executive was talking about a hypervisor to switch between different operating systems even in the car. They’re looking at things like that. But if you look at some classic semiconductor companies making application processors, they’re starting to differentiate in software. Android democratizes everything, and then you add other software. The differentiation can be in hardware and software.
Zahiri: One area that hasn’t been tapped is software controlling the power. Our high-end customers are doing 50 to 100 power domains, mostly for mobile processors. We’re enabling people to design this way. And yet there’s not a significant way from a software point of view where, if you’re dialing your phone, another part of the chip should be shut down. There’s nothing software has done in a big way to control the power.
Janac: It’s a cultural problem. The software people don’t understand the hardware and they don’t want to use the APIs that are available. There is a gap. Computer science graduates want to write Java.
Schirrmeister: When engineers write their iPhone app for the hardware, there is one example where they ignored an API and it sucked the battery dry within an hour. Those are things that need to be validated. They’re an integration problem of hardware and software. I have seen software controlling power domains in the wireless world, though. There are power management ICs that can pull down areas of the processors and certain power areas to a lower voltage. That is software controlled. From my perspective, integration is a huge issue there at several levels. There is the software-hardware integration. There is the issue of verification that the application is running correctly on the chip, which has become so big at an advanced technology node. Then there is the subsystem integration and validation. It’s quite a challenge.

SLD: And from a verification standpoint, you’re never really done, right?
Schirrmeister: The designer has to be confident at the end of the day that it’s not the last thing he does in his career to tape out that chip.

SLD: As we move to 14nm and beyond, will we be involved with the chip at the same level or will it be more an integration of platforms?
Sherwani: We are already doing 14nm chips in development. I don’t see that happening. Our customers are willing to pay the money required, even though there will be fewer customers who do that. But even at 14nm there are still a whole bunch of companies doing chips. There are applications that need that.
Janac: But you do need more and more volume. At 90nm you needed about 100,000 units to break even. At 65nm you needed probably 6 million units. At 40nm, you needed between 10 million and 15 million units. At 28nm you need 50 million to 60 million. At 20nm you will need 100 million units. The markets to support those volumes are fewer and fewer, which is why I see the SoCs and 3D silicon will be prevalent in markets that don’t justify those very complicated deep-submicron dies.

SLD: But you may have a 14nm known good die that is part of that chip, right?
Janac: Yes, you become an assembler.
Sherwani: The other problem we see is that IP companies are not willing to warranty their IP. That’s one of the big problems of known good die. If I buy $5 million to $6 million in IP, one piece of IP out of 100 or 200 pieces that doesn’t work can force me to re-spin that chip. Yet, most IP vendors do not warranty their work. They are not willing to pay us a re-spin cost even if we can prove their IP is the problem. That means I have to go to a model where I can save myself from that problem, but at the same time Open-Silicon has to warranty the chip. That’s why we want to go to known good die. Once IP is proven in a piece of silicon I don’t want to re-integrate again and again. Every time I re-integrate it, I have a chance that I have missed a problem. The world is moving toward 3D kinds of chips that will allow us to address smaller markets but still have high volume for chips. The same chip will go into multiple 3D chips.

SLD: You basically define what a derivative chip is, right?
Sherwani: Yes.
Janac: About 25% of our revenue comes from being able to link die together inside a system in package. It becomes one of the key enablers. You don’t have to be part of the 28nm or 20nm problem because some things like analogs and modems don’t require it. You maintain Moore’s Law by stacking the die.

Experts At The Table: Designing At 28nm And Beyond

Friday, March 30th, 2012

By Ed Sperling
System-Level Design sat down to talk about design at future process nodes with Naveed Sherwani, president and CEO of Open-Silicon; Charles Janac, chairman and CEO of Arteris; Frank Schirrmeister, group director of product marketing for Cadence’s System Development Suite; Behrooz Zahiri, vice president of marketing at Magma (and currently director of marketing at Synopsys), and Charlie Cheng, CEO of Kilopass.

SLD: Where will the biggest challenges be at future nodes?
Schirrmeister: For us it’s the combination of hardware and software that gets interesting. You may have a network operator who determines he wants coverage for the NFL on Sundays. That trickles through the design chain of what the network needs in terms of bandwidth and what the devices need to be able to process. As an EDA vendor, there are huge challenges for us because what used to be a small IP model has grown into a subsystem. People are building chips as an assembly of subsystems. The integration and the verification become a big issue at both the subsystem and the system level. There are lots of ways to grow.
Janac: I see things getting fragmented, concentrated and disintermediated. Nobody can afford to do everything themselves, so you wind up focusing on your core competencies. The people in those core competencies become more concentrated. The chip world also gets more concentrated because there won’t be many people who can afford to build a platform at 20nm. But the components for that platform are going to be disaggregated. Companies will have to outsource a big chunk of those designs and a lot of the tools they used to do themselves. So the little chip companies die. They will have a really tough time, particularly at the leading edge. The EDA industry has a lot of problems because it will be sharing the volume, which is going to explode, and it will be hurt by the fact that the number of projects will decline. You wind up with someone owning 80% of the processor market, someone owning 80% of the interconnects, and someone owning 80% of the memory. The DSPs get concentrated. Tools get concentrated, where someone owns place and route and someone owns simulation and ESL.

SLD: But you have to redefine what’s a chip company, don’t you? Are Open-Silicon and eSilicon chip companies?
Janac: Yes. And if I’m a small company I have to go to Open-Silicon or eSilicon because I can’t afford a staff of engineers to get a chip out.

SLD: But traditionally they were not considered chip companies.
Zahiri: They’re an aggregator of chip demand. Maybe eSilicon and Open-Silicon become the equivalent of midsize to large chip companies, aggregating the demand of the little companies that have to go to that model to be competitive and survive in the marketplace.
Sherwani: Along these lines, one of the challenges I see is that we’ve set up the market to expect 50% gross margins and 30% net margins. If your IP is coming from ARM and Kilopass and other companies, then how do you achieve those kinds of margins? You can’t. And if you can’t achieve those kinds of margins then you also have a business problem, and your business structure has to change. If you do everything in-house you’re not paying all the up-front fees to IP vendors. So first there is a problem of size. And second, even if you have the size there is a profitability problem with respect to the expectation that has been fed to Wall Street.
Janac: If your gross margin goes down, your operating margin has to improve, which means you can’t do enough R&D. So instead of using 25% of revenue for R&D you can only afford to do 10%. The PC guys are reasonably profitable at 25% margins because they don’t do any R&D. Intel does it. That’s why people are starting to outsource the IP. They can’t afford to do the R&D as the gross margin drops.
Sherwani: That’s one piece. But if you look at what’s going on in chip companies, the R&D budget goes down for IP, but it doesn’t go away. It goes into software. The number of software engineers is increasing. The market expectation still remains for hardware gross margins, but your expenses are going up.
Schirrmeister: You can’t just look at the chip in isolation. You have to look at it holistically. One large OEM says it’s losing money on every TV it sells. They have to get it back other ways. You can’t look at these things in isolation.
Janac: It gets back to the business model. If you don’t have a good business model and you just keep squeezing the margins then you’ll go out of business. But there are people who have innovative business models, like Amazon and Apple, that can afford to sell the hardware at cost.
Cheng: Worrying about margins and R&D is like worrying about the 120 companies that went out of business selling cars. As businesses mature, the technology content gets very high and it costs a lot. It’s not that companies don’t have good gross margins. There are a lot of companies with margins of 60% or more. But the ones that assemble IP and add 10% original content are not going to be very successful if they don’t differentiate, and they won’t be good customers long-term for the EDA vendors because 70% of those chips are memory and another 20% are IP that’s licensed from the outside. So they may only be doing 10% of the chip. This is why EDA revenue has been flat. If you look at the surviving car companies, they’ve been very profitable over time because there’s a high barrier to entry and it’s a fixed market.

SLD: But more pieces have to go together into something that’s coherent, and that’s more difficult than ever before, right?
Janac: I just met with a customer that spent $500 million on their platform and they have 180 IPs. They still make most of those IPs themselves, but integration is the issue.
Cheng: Integration isn’t any worse today than in the past.
Janac: It’s absolutely worse. And the reason is that you have an incredible amount of computing in smartphones, and that’s even trivial compared to what it’s going to be. You can’t afford to keep that device turned on except at times when you need it. One of the complexities of 20nm and 14nm is that you need a portion of the chip to do its job and then you shut it down. From a power perspective, you can’t afford to keep it on. And you don’t want it to be big, so you can’t afford a huge battery. It is very complex. You have frequency domains, power domains, power regions. You have as many as seven modems—WiFi, Bluetooth, CDMA, GSM and LTE.
Schirrmeister: What customers are telling us is that getting to an acceptable confidence level in verification is a very difficult thing, driven by the integration of all the components they have. Given that you’re taping out a chip and you can’t make a change tomorrow—that’s the pivotal point where you have to have enough confidence. The integration challenges are huge.

SLD: The promise of stacked die is that if you have a base platform you can start shifting into vertical markets quickly because a lot of the integration is already done, right?
Janac: Yes. The application processors that are being made for phones can be shifted into dashboard control, automotive infotainment and home gateways. What’s also going to happen is that the low end of the SoC market is going to disappear because the costs are too high. You’ll get 3D silicon, where people are selling dies with specific functionality on trailing-edge processes. You’ll wind up with FPGA SoCs.
Sherwani: But that’s a good thing. You could build viable chip companies that are on trailing processes with known good die that we can put into 3D stacks. You don’t have to push them all the way to 22nm. There’s no need for that. A lot of people will stay on 65nm, and that will justify keeping those fabs alive for a long time. It actually helps with the overall investment we need to put into 14nm.

SLD: Are the specialty fabs that are coming online capable of doing all this integration work?
Sherwani: They don’t need to. The interposer technology we have today doesn’t have to be much better. At 22nm you’ll see many people bringing out 3D chips, buying known good die from a bunch of people and putting these MCM-style 3D chips together. That will lead to many companies, which we consider sub-optimal today, becoming viable. And I don’t think these small SoC companies will disappear. They will start doing specialty silicon.
Janac: They will be the known-good-die companies.
Sherwani: Yes. They will be working with GlobalFoundries and TSMC at 65nm. They don’t have to run at 1.2GHz. They can run at 300MHz and be just fine. And you don’t have this area constraint. So area constraints and power constraints can be reduced. Today you have one chip and something that is 1.2GHz can run fine at 100MHz. Not everything is being pushed to that level.
Janac: And then you’re moving from 2D integration to 3D integration. That opens up a whole bunch of opportunities that are untapped today.
Sherwani: Just because of 3D, there are huge opportunities. I also think that IC design and computing will completely change if we can change the memory. The idea in the past was to dumb down the memory because you could pull the gross margin into the microprocessor. After 25 years of dumbing down the memory we do have standard interfaces, but memory isn’t doing much. When you look at 3D memory, it has 20X the performance of DDR3. It is one-sixth the power and one-tenth the space of DDR memory. A new era of intelligent memory will do a lot more than just keeping the bits. It will become very close to the processor, which changes the processor design. And many new applications are possible. If the architecture changes and memory and processors are very close together, many new things can happen. That is what you will see in the next five to seven years. You will be able to put terabit memories on top of processors in the same 3D package.

Blog Review: March 28

Wednesday, March 28th, 2012

By Ed Sperling
Mentor’s Dennis Brophy looks back on the life of the man who first pulled him into the standards world, Don Loughry. It’s a good story and a great eulogy to one of the stars of the standards effort.

Cadence’s Richard Goering examines an all-too-common phenomenon in testing a chip—exploding it. Testing a chip with everything on is a lot different than testing it with the normal functional power. Make sure you check out the photo.

Synopsys’ Navraj Nandra looks at non-volatile memory and why it’s important for smartphones with near-field communications. When you swipe your phone, speed and battery life are critical.

How many TVs can U.S. households hold? Apparently not as many as TV makers would like. IHS iSuppli’s Lisa Hatamiya predicts flat panel shipments will fall for the first time ever this year.

Mentor’s Michael Ford compares the taming of young music students to the orchestration of the chip manufacturing process. Just imagine if Toscanini had been in charge of a 28nm fab.

Cadence’s Adam Sherer digs into verification of power-aware designs and why they should be running low power in every regression test.

In case you’ve wondered where you can augment your verification skills for AMS, Synopsys’ Helene Thibieroz details who’s teaching this summer at UC Santa Cruz. Bring your surfboard.

Mentor’s Mike Jensen rolls out Part 5 of his analog modeling epic, this one focusing on implementation of equations using VHDL-AMS.

And in case you missed the most recent issue of the System-Level Design newsletter, here are some standout blogs:

–Mentor’s Jon McDonald sheds light on cycle-accurate models and why they’re not always necessary or even good.

–Synopsys’ Achim Nohl shares some insights about virtualization and ARM’s big.LITTLE processor.

–Cadence’s Frank Schirrmeister steps back and assesses how many of ESL’s core pieces have moved beyond the early adopter phase.

–Sonics’ Frank Ferro asserts that speed is still the crucial requirement for all SoCs. Damn the torpedoes, full speed ahead.

–Arteris’ Kurt Shuler looks ahead to the coming shakeout in the design industry and who’s going to be affected.

–Atrenta’s Mike Gianfagna compares SoC development to an old video game with much higher stakes.

–eSilicon’s Javier DeLaCruz looks at which companies will be the drivers of TSV packaging.

–Methodics’ Simon Butler expounds on the continuous build approach and why it’s necessary to take SoC design out of the Dark Ages.

Experts At The Table: Designing At 28nm And Beyond

Thursday, March 22nd, 2012

By Ed Sperling
System-Level Design sat down to talk about design at future process nodes with Naveed Sherwani, president and CEO of Open-Silicon; Charles Janac, chairman and CEO of Arteris; Frank Schirrmeister, group director of product marketing for Cadence’s System Development Suite; Behrooz Zahiri, vice president of marketing at Magma (and currently director of marketing at Synopsys), and Charlie Cheng, CEO of Kilopass.

SLD: As we move to 28nm and below, what will we have to do differently than in the past?
Sherwani: We see issues at the system level. One involves 3D chips (stacked die). How do you actually put these together? Memory already has gone 3D. Electrical, physical and mechanical tools, both on the simulation and analysis sides, are not that sophisticated. Most of that work is being done manually today. Second, we have several customers that have come to us and asked us to put two or three chips together at 22nm. The science to combine the chips is not well known. This exercise is costing more than if it had been done from scratch. You should be able to do this very quickly. We don’t have tools like that, so we will have to develop them. A third area is that we see our design teams growing, but our verification is growing super-linearly. Right now verification teams are almost 3x the time of other teams. That is not sustainable.
Schirrmeister: Somewhere around 40nm software became more dominant. It passed the effort needed for hardware. How do you create for this vast mass of available space new things to differentiate your chip? And how do you integrate all of these pieces together? The whole integration and verification in the context of the software and the hardware together is the big challenge.

SLD: Does that software include just the drivers, or is it more?
Schirrmeister: It goes all the way up to the application level. The more enabling hardware you have, the higher up to the software you reach because you need to partition which component runs, which processor it uses. There are different layers from the very low-level bare metal to applications that may be split across different processors.

SLD: Let’s go back to the original question. What challenges are ahead?
Janac: One of the things we’re trying to do is to bring computing closer to the person. It’s going from the mainframe to PC to smartphone, which will be the new personal computer. What people are trying to do is build things of frightening complexity into something as small as a smartphone and with the functionality of a PC. They’re not quite there yet, but they’ll get there. At 28nm, we’ve got power domains, frequency domains, disparate pieces of IP on disparate chips, and we’re trying to re-use software. You have to make all of that work together. That’s one of the biggest challenges. You have IP with different protocols and sources. There are requirements that looked like science fiction a few years ago. And you have to make it all work. It’s a very big challenge and it’s very costly. Some of these things cost hundreds of millions of dollars to build.
Cheng: Integration of software and miniaturization are a challenge we’ve been dealing with over the past three decades. It’s not just 28nm that’s the problem. But what 28nm does bring is very different packaging, which drives the silicon decision, as well as confusion in transistors. That will cost the industry $10 billion to $30 billion. The reason is that 28nm is supposed to be the generation of high k/metal gate. I’m not sure that’s going to happen. TSMC has re-introduced bulk silicon with no metal gates. The question is do you want to take a chance on bulk silicon and save 30% in cost, or go with high k/metal gate that will be more expensive but may not work. That’s a lot of money for an experiment.
Zahiri: Design costs are rising. With 28nm, you’re spending 20% or so more than what we spent in the past. Semiconductors are not growing at that pace. Something has to give. So chipmakers will go in two directions. One is to figure out how to use the same resources and the same schedule and the same number of people to get these 28nm chips out, which are more complex and much bigger. Those that can’t do that will go away. So 28nm, and especially 20nm, will be the test to see which companies will re-tool, re-position and re-architect to take advantage of this real estate. We’re trying to understand which companies will survive and how we can help them.

SLD: We’re talking about complexity and convergence, as well as business issues. As they become more entangled, do we have to rethink everything?
Schirrmeister: According to IBS, at 28nm the price of fab construction is $3 billion; process R&D is $1.2 billion; the design cost is $50 million to $90 million and each mask set is $2 million to $3 million. For 22nm/20nm, fab construction is $4 billion to $7 billion; process R&D is $2.1 billion to $3 billion; design cost is $120 million to $500 million, and a mask set costs $5 million to $8 million. As an EDA vendor, our investment is huge, as well. What you will see, more and more, is the whole notion of collaborative design. It’s hard to do it all by yourself. The business dynamic behind this is becoming very interesting for who can afford to do a design.

SLD: So who can afford to do a design?
Janac: The business models and technology are getting intertwined. There are a bunch of people getting squeezed in a traditional business model, where they have to do more and more to save a lot of money, and their margins are suffering. Their R&D footprints are expanding and they’re trying to fight that with outside IP and more efficient EDA tools. But you also have companies like Apple and Amazon, where they don’t care what the cost of silicon is. They make money in different ways. The silicon is an enabler, and they can afford to make chips that are 50% bigger or to give away their tablets at cost because they have a different business model that supports that silicon effort.

SLD: Can the industry survive on four or five of these large companies?
Janac: One of the key issues will be access to silicon. Where it starts to get problematic is, at some point, even though you have this business model you have to make silicon. So what fabs are able to give you that cost advantage? There are starting to be fewer and fewer of them. As you go to 28nm, you only have a choice of a few players. You have TSMC and GlobalFoundries, plus Samsung and Intel, and then some smaller players on the periphery like ST and Panasonic.
Sherwani: That’s a good point. At 14nm, who are the players going to be? You can basically say it will be four players.

SLD: And what will they be building? Will it be platforms for a stacked die, or just their own chips?
Sherwani: These guys cannot afford to have fabs and make chips just for themselves. That model won’t work. Even Intel is being forced to revisit that model, which is why they’ve got a custom foundry. They’re projecting a need for five to seven years out. Intel is one of the biggest producers of silicon on the planet, and even they are forced to think like that. What will emerge is that people will move more and more toward platforms, with software as a differentiator rather than hardware. So if you look out four years, you may have a home gateway that is a standardized platform with a processor core and I/O chips and some custom silicon, and then a huge software investment that differentiates one platform from another. The hardware differentiation will be less. There will be some custom hardware, but that may only account for 10% or 15% of the hardware. That will be a platform you can get from a few suppliers working with these foundries. That’s how you cut down the costs. You don’t do those kinds of chips. We see a lot of chips, but there isn’t much difference between them. There may be six or seven chips that are essentially the same.
Schirrmeister: If you look at ITRS (International Technology Roadmap for Semiconductors) data, they already have characterized the design challenges in those platforms. There is a networking platform, which is essentially a bunch of packet processing engines with a smart interconnect. There is a stationary platform, which is a compute platform where you have the embarrassingly parallel portion that plugs into a wall outlet. And then there is a mobile platform, where you try to distribute software across different cores. There is a lot of similarity in how the chips are structured at the block level. But there also are a lot of challenges for how IP providers get their design ready for 28nm and beyond, how to integrate all of that. Some of the integration challenges are very complex. And the software on top of that is very difficult to create.

SLD: Where do the tools vendors see their future?
Sherwani: Once the platform is stable, you can always have disruptions. There are very few of those disruptions, though. When Android becomes stable, phones will look very much the same. Then someone will invent a new generation of phones, and until that stabilizes a lot of innovation will happen.
Zahiri: As EDA vendors, we are trying to intercept that innovation. We don’t see the world as just software or hardware. It’s a combination of both. But more important, we see that we are moving this forward. Whether it’s graphics or something that will allow a battery to last two days instead of a day, or whether it’s just making our industry more productive, the consumer market will remain demanding. We are trying to figure out the ways to meet that need and facilitate innovation.

Coherency Becomes A Stack Of Issues

Thursday, March 22nd, 2012

By Ed Sperling
As complexity increases and the industry increasingly shifts away from ASICs to SoCs, the concept of coherency is beginning to look more like a stack of issues than a discrete piece of the design.

There are at least five levels of coherency that need to be considered already, with more likely to surface as stacked die become mainstream over the next few years. Perhaps even more mind-numbing, this stack itself will have to take on a level of coherency over the next couple of generations of chips.

Let’s take a closer look.

Cache coherency
The concept of keeping data coherent historically was relegated to processor makers such as IBM, Intel and AMD, which have focused on improving performance through faster access to data. One solution to that improved performance has been multithreading and multiprocessing. Along with that, these vendors have added in various levels of cache memory for faster recall of important data.

More cores also make it harder to effectively use these caches. Data has to be kept consistent, which requires more system overhead in terms of processing and power just to maintain that coherency. And it gets even harder as more cores are added into an SoC, which increasingly are not the same size, do not run at the same frequency, and sometimes do not even connect directly to the main CPU.

“With cache coherency, some of the traffic may be serviced by the cache on another GPU,” said Drew Wingard, CTO at Sonics. “If you’re just using an ARM core, the CPU coherence is sufficient. But the GPU uses its own local memory. You really want it to be fully cache coherent across all of those.”

But even finding the data to maintain consistency may be a problem in a complex SoC.

“You can view what’s in memory, or view it and be able to change what’s in memory, but first you have to find it,” said Kurt Shuler, vice president of marketing at Arteris. “If you have four cores, the most efficient way to hook them up is for each core to have its own cache and graphics to have its own cache. If you change something, you have to snoop in all the caches to make sure it’s consistent.”

But there is also a move in the completely opposite direction—sharing memories among multiple cores—because it reduces the number of components on the bill of materials. The Low-Latency Interface specification from the MIPI Alliance is a case in point, where a memory can be shared between a modem and an applications processor. Intel, meanwhile, has added on-chip graphics that share memory with the CPU.

“The whole design gets more complex,” said Shuler. “You have more traffic beyond the cores, and from a power standpoint the overhead goes up.”

Still, cache coherency is one of the better-understood pieces of this stack. It has been an issue ever since multiprocessing was first employed in the 1960s. “Snooping” has been widely used since that time.
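The snooping idea Shuler describes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular interconnect works: the `Cache` and `SnoopBus` classes, the write-through policy and the invalidate-on-write rule are all hypothetical choices made to keep the example short.

```python
class Cache:
    """Toy private cache: address -> value, with no capacity limit."""
    def __init__(self, name):
        self.name = name
        self.lines = {}


class SnoopBus:
    """Hypothetical shared bus that lets every cache observe every write."""
    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, cache, addr):
        # Hit: serve from the private cache. Miss: fetch the line from memory.
        if addr not in cache.lines:
            cache.lines[addr] = self.memory[addr]
        return cache.lines[addr]

    def write(self, cache, addr, value):
        # Snoop step: every other cache drops its (now stale) copy.
        for other in self.caches:
            if other is not cache:
                other.lines.pop(addr, None)
        cache.lines[addr] = value
        self.memory[addr] = value  # write-through, an assumption for brevity


memory = {0x10: 1}
bus = SnoopBus(memory)
cpu0, cpu1, gpu = Cache("cpu0"), Cache("cpu1"), Cache("gpu")
for c in (cpu0, cpu1, gpu):
    bus.attach(c)

first = bus.read(gpu, 0x10)    # the GPU caches the old value
bus.write(cpu0, 0x10, 99)      # cpu0 writes; the GPU copy is invalidated
second = bus.read(gpu, 0x10)   # the GPU misses and refetches the new value
```

Even in this sketch the cost is visible: every write touches every attached cache, which is the traffic and power overhead that grows as cores are added.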

Software coherency
A newer facet of coherency involves embedded software. Because SoCs now include an increasing amount of software in the design, engineering teams now have to wrestle with coherency issues that previously were dealt with by the operating system.

“Fundamentally you’ve got two combined issues here,” said Andy Meyer, verification architect for Mentor Graphics’ Design Verification Technology Division. “You’ve got cache coherency, where the same data is being viewed in a couple places. And then you’ve got an issue with consistency in the simple code in a uniprocessor that now has to run on a second processor. The ordering of events can change in multiprocessing.”
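Meyer's point about ordering can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the two instruction sequences, the variable names `x`, `y`, `r1` and `r2`, and the helper functions are hypothetical, and the model assumes each core's operations take effect in program order. It enumerates every way two cores' operations can interleave and collects the results.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    n = len(a) + len(b)
    for positions in combinations(range(n), len(a)):
        slots = set(positions)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(schedule):
    """Execute one interleaving against shared memory and return (r1, r2)."""
    shared = {"x": 0, "y": 0}
    regs = {}
    for op, target, source in schedule:
        if op == "store":
            shared[target] = source
        else:  # "load": copy a shared variable into a register
            regs[target] = shared[source]
    return regs["r1"], regs["r2"]


# Hypothetical code for two cores: each stores one flag, then loads the other.
core0 = [("store", "x", 1), ("load", "r1", "y")]
core1 = [("store", "y", 1), ("load", "r2", "x")]

uniprocessor = run(core0 + core1)  # one fixed order on a single processor
outcomes = {run(s) for s in interleavings(core0, core1)}
```

Run back to back on one processor, the code always produces the same pair of values. Spread across two cores, the same code can produce three different results depending on timing, and hardware with a weaker memory model than this sketch assumes may allow more.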

Those problems crop up regularly in verification, but not always with the expected results. It’s difficult to effectively write the stimulus in a testbench for coherency. What happens, for example, when a core is shut down to save power?

“The scariest part is when there is no OS support,” said Meyer. “There’s also a big problem with heterogeneous cache, such as when you have a CPU working with a GPU.”

Another issue has to do with effective coverage in verification, already a problem for complex SoCs. States frequently are distributed across multiple chips and multiple boards. Timing varies from one state to another, and can be particularly problematic if snooping functions are tied to a state. And parallelism continues to baffle even the most advanced teams.

“Standard coverage methods don’t work well here,” said Meyer. “You have to query in ways you traditionally didn’t have the power to query and ask questions across months of regressions. For instance, ‘Have we been here ever—or in the last two months.’ Until coverage steps up, people with deep knowledge of verification running hundreds of full-time emulator systems are finding out at the last minute that it’s not okay to ship.”
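The kind of question Meyer describes, whether a state has ever been hit or only hit recently, might look something like the following Python sketch. Everything here is hypothetical: the regression history, the state names, and the helper functions `last_hit` and `hit_within` are invented for illustration and do not reflect any particular coverage tool.

```python
from datetime import date, timedelta

# Hypothetical regression history: (run date, coverage states hit in that run).
history = [
    (date(2012, 1, 5), {"core1_off+snoop_pending", "all_on"}),
    (date(2012, 3, 20), {"all_on"}),
    (date(2012, 4, 2), {"all_on", "gpu_dirty+cpu_read"}),
]


def last_hit(history, state):
    """Return the most recent date on which a state was covered, or None."""
    hits = [day for day, states in history if state in states]
    return max(hits) if hits else None


def hit_within(history, state, today, days):
    """Return True if the state was covered within the last `days` days."""
    seen = last_hit(history, state)
    return seen is not None and today - seen <= timedelta(days=days)


today = date(2012, 4, 5)
state = "core1_off+snoop_pending"
ever = last_hit(history, state) is not None
recent = hit_within(history, state, today, days=60)
```

In this made-up history the powered-down-core state was reached once, months ago, and not since. A coverage report that only merges results would show it as covered, while the time-aware query flags that no recent regression has exercised it.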

I/O coherency
Tied in with both cache coherency and software coherency is I/O coherency. Increased communication on a chip, between chips, and between a chip and the outside world has turned what used to be a relatively straightforward networking issue into a complex jumble of prioritization and synchronization.

“You have to deal with this even in single processors,” said Sonics’ Wingard. “You may have a PCI core streaming data into memory. Today, without I/O coherence, it’s difficult to determine what is coming in. The CPU has no way of knowing what was transferred when it does a copy from non-cache to cache.”

He noted that personal computers have had I/O coherency for a long time, particularly with direct memory access. DMA was developed initially to help solve the bottleneck that occurred when a CPU was involved in an I/O transfer. Rather than tie up the CPU with that transfer, the CPU continued running, then accepted an interrupt when the transfer was completed.

But with more of this being moved onto a chip, keeping coherency while moving data back and forth from more places is becoming much more difficult.
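The problem Wingard describes can be shown with a short Python sketch. It is a simplified model, not a description of any real DMA controller: the `Memory` and `CpuCache` classes, the `dma_write` function, its `snoopers` parameter, and the address `0x2000` are all assumptions made for the example. A device writes directly into memory while the CPU still holds an older copy of the same location in its cache.

```python
class Memory:
    """Hypothetical main memory, modeled as a dictionary of addresses."""

    def __init__(self):
        self.cells = {}


class CpuCache:
    """Hypothetical CPU cache that fills from memory on a miss."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = self.memory.cells.get(address, 0)
        return self.lines[address]

    def invalidate(self, address):
        self.lines.pop(address, None)


def dma_write(memory, address, value, snoopers=()):
    """Device write straight to memory; coherent only if caches are notified."""
    memory.cells[address] = value
    for cache in snoopers:
        cache.invalidate(address)


memory = Memory()
cache = CpuCache(memory)
cache.read(0x2000)                               # CPU caches the old contents

dma_write(memory, 0x2000, 7)                     # no I/O coherence
stale = cache.read(0x2000)                       # CPU still sees the old value

dma_write(memory, 0x2000, 9, snoopers=[cache])   # coherent transfer
fresh = cache.read(0x2000)                       # CPU sees the new data
```

Without the notification step the CPU keeps reading its cached copy and never sees what the device transferred. With it, the stale line is dropped and the next read picks up the new data from memory.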

Ecosystem coherency
One of the least addressed facets of the coherency stack involves business and communication issues across a supply chain for a particular SoC rather than the actual technology itself. Even where competitive suspicions can be overcome, the very different approaches taken for designing components, IP and software, as well as language barriers, create one of the more difficult and less tangible challenges in the coherency stack.

“The challenge going forward is that you have a bunch of people who may not be that skilled in system development driving the chip and spec for one design, and other suppliers trying to orchestrate things,” said Mike Gianfagna, vice president of marketing at Atrenta. “So you bring them together to solve a problem for one customer in 12 weeks and then they move on. You’ve got corporations coming together and bringing all these pieces together almost like the way a movie is done. But is there a coherent way to communicate data and information risks and still provide good visibility from a power/performance/area point of view?”

For decades this task has been handled by IDMs, but in the SoC world there are far fewer IDMs these days. Many of these chips are built using third-party IP such as cores from ARM or MIPS, DSPs from companies such as Tensilica, and standard IP from the Big Three EDA vendors.

Coherency in stacked die
It’s uncertain whether stacking of die, either in 2.5D or 3D configurations, will make coherency easier or harder. The answer is likely to be a little of both.

“With 2.5D and 3D, you’re looking at low-power memory access,” said Arteris’ Shuler. “You put the DRAM closer to the CPU, the addressing is wider and you get rid of some of the latency. But you also need coherency across all of this.”

No one is sure yet how multiple high-speed communication channels between die will affect coherency. If the channel between the cores is wider and shorter, that will improve data speed, but if processors and DRAM are scattered on multiple die, with some of them shut down, some partially shut down, and others fully active, it may make it harder to keep track of data and make sure it is all synchronized.

The Week In Review: March 9

Friday, March 9th, 2012

By Ed Sperling
Mentor Graphics introduced a new Calibre DFM flow for GlobalFoundries 45/40nm and 32/28nm processes, which it claims can significantly boost yield and improve turnaround time for full-chip designs. Also on the DFM side, Mentor rolled out the next version of its PADS suite for PCB design through manufacturing, adding the ability to link high-speed associated nets and assign constraints.

Cadence introduced its Encounter RTL-to-GDSII flow for high-performance and giga-scale designs down to 20nm. What’s especially interesting here is support for double-patterning, one of the big issues with progressing down Moore’s Law because foundries have unique ways for doing this. Cadence also launched a business incubation program in Australia to boost entrepreneurship in this market. Nice design, mate.

Arteris inked a deal with Carbon Design Systems to enable NoC interconnect IP to be generated, managed and distributed using Carbon’s IP Exchange portal.

Atrenta announced that 10 IP providers have qualified soft IP for TSMC’s 9000 IP library using Atrenta’s IP Handoff Kit. The tool checks for syntactical and semantic correctness, power consumption and clock domain issues, among other things.

The Week In Review: March 2

Friday, March 2nd, 2012

By Ed Sperling
Synopsys issued a barrage of announcements, including new products, new relationships, and a new win. The company unveiled its next-generation verification IP based on its new VIPER architecture, with native support for OVM, UVM and VMM. Synopsys claims up to 4x performance over other commercial VIP. This is an interesting number, and likely will spark a volley of announcements from the other Big Three EDA vendors, all of which have been gearing up for what they see as a big opportunity in the VIP space. Synopsys also rolled out 28nm M-PHY IP that supports six different standards for mobile applications.

On the relationship side, Synopsys struck a deal with Arteris to jointly develop an IP solution based on the Low Latency Interface, which cuts the cost of the bill of materials by eliminating a memory chip and reducing the area of a PCB. In a related move, Arteris introduced its low-latency interface digital controller IP, which it says is already silicon-proven in TI’s OMAP platform.

Synopsys also is working to link Springsoft’s debug technology with its own Protocol Analyzer. It also won a deal with BiTMICRO for a slew of EDA tools.

Samsung teamed up with Mentor Graphics to create a DFM sign-off reference solution for Samsung’s foundry. This opens the door to a couple of other big deals for Mentor, as well, considering Samsung is one of the three main companies in the Common Platform. The others are GlobalFoundries and IBM.

Mentor also announced its Q4 financial results, which set a new record. Revenues for the quarter were $320.4 million, up from $307.3 million in the same period in 2011. For the 12 months ended Jan. 31, revenue was $1.015 billion—also a record—up from $914.8 million in fiscal 2010. Net income for Q4 was $57.8 million, up from $50.6 million in Q4 2011, and for the year it was $83.9 million, compared with $28.6 million the previous year. Mentor expects revenue to increase to about $1.1 billion this year.

Cadence unveiled the production release of a virtual platform for Xilinx’s Zynq-7000, which is based on the ARM Cortex-A9 MPCore. After years of EDA companies trying to gain a strong entry into the FPGA world, this is an interesting doorway.

Docea Power rolled out a new tool for architectural-level power and thermal analysis. Given the fact that the biggest savings in power and heat can be obtained at the earliest stages of a design, this is an important step forward. The next challenge is to implement this kind of capability into existing flows so that power and heat models can be integrated easily with other models. Functionality and performance are no longer enough.

Tensilica introduced its second-generation multimode baseband chip, which includes multiple dataplane processors. The chip was co-developed with NTT DOCOMO, Fujitsu, Panasonic and NEC. Tensilica also rolled out Dolby Digital Plus for surround sound on its HiFi Audio DSPs, and it struck a deal with ClariPhy, which will license Tensilica’s dataplane processors for optical networking mixed signal processing.

Blog Review: Feb. 29

Wednesday, February 29th, 2012

By Ed Sperling
Synopsys’ Hezi Saar digs into the future of mobile devices—dropping some hints about what’s still to come. Expect more granularity and customization. This should raise the stress level inside IT departments. Forget going postal. The new phrase may be “going terminal.”

Cadence’s Richard Goering reports on the Accellera town hall meeting for the future of EDA standards. Standards are important. So was this meeting. But can you imagine if these people really comprised an entire town?

Mentor’s Colin Walls is headed to Embedded World in Nuremberg, Germany. We expect a full report. Embedded software, as design teams well know, is becoming a very big deal in IC design.

DeepChip’s John Cooley has compiled a list of Magma tool users voting to save those tools post-merger. Given the fact that the merger was completed last week, the ball is in Synopsys’ court.

TLM Central’s Tom De Schutter interviews Alex Braun of the European SystemC User’s Group. Given the fact that the early adopters of ESL were in Japan and Europe, this is a good sounding board for what’s going on in system-level design.

Cadence’s Frank Schirrmeister looks at virtual divides and fixed subsystems. As customers demand pre-integrated and pre-verified IP, subsystems will grow in importance.

Mentor’s Harry Foster talks about Mentor’s Verification Academy for UVM. With all of the Big Three EDA vendors now on board with UVM, it will be interesting to see where they carve out their differentiation.

Synopsys’ Helene Thibieroz details what happened at the HSPICE special interest group event during DesignCon. If you missed it, here’s the short version.

And in case you didn’t see the most recent System-Level Design newsletter, here are some standout blogs:

–Synopsys’ Achim Nohl digs deep into pre-silicon SoC bring-up and why models make it easier.

–Cadence’s Frank Schirrmeister examines the differences between software and hardware developers and how to bridge the two worlds.

–Sonics’ Frank Ferro predicts we will never stop talking about SoC architectures for mobile devices.

–And Arteris’ Kurt Shuler compares C2C and MIPI LLI on the path to stacking of die.

Different Tradeoffs

Thursday, February 23rd, 2012

By Ed Sperling
The push to “smaller, faster and cheaper” hasn’t changed since ICs were first introduced, but the context for those requirements is beginning to shift—with enormous consequences.

What was once done on multiple chips continues to migrate to a single chip or package because of cost, but in some cases the decisions about what goes where go well beyond an individual device to include a network of systems. Power and heat have forced some of those decisions. Others are being driven by shorter market windows that affect business decisions about exactly when to move to smaller, faster and cheaper, and whether to keep a design in two dimensions or move to three. In some cases, it even has evolved into a tradeoff about sharing resources to make up for additional costs elsewhere in a design.

“Form factor is everything in a lot of these cases, and you’re being forced to make tradeoffs involving a lot of different pieces,” said Mike Gianfagna, vice president of marketing at Atrenta. “But that requires you to know exactly what you’re doing. A lot of times you don’t. What happens when you reduce the number of layers? Do you know the impact on the system? You may not. But competitive pressure is also forcing you to rethink everything.”

Rethinking designs
Some of these changes are as fundamental as where the processing gets done. While the concept of cloud computing has been around since the days of time sharing on mainframe computers in the 1960s, the ability to offload processing and storage on the fly—and to load balance across compute farms around the globe—adds a modern twist to it all.

The result is a handheld device with the performance capabilities of a compute farm—but with the design focused far less on local processing and storage and more on communication and battery life.

This is evident with a number of upcoming communications schemes and protocols in the handheld market. LTE Advanced, for example, which is expected to find its way into smart phones and base stations over the next four years, focuses on reducing power while increasing performance. One of the best ways to do that is by shifting what processing is done where.

“One of the key decisions is how much processing and intelligence is in the cell phone versus the cloud,” said Graham Wilson, a product marketing manager at Tensilica. “You also have to understand deeply what cores are being used for. There is no room for fat. We’re also going to see a big shift in infrastructure from homogeneous to heterogeneous.”

That means rather than a giant cell tower on the highest hill or building, smaller boxes will be mounted on houses and strung together in a mesh network. “Every house will have its own femto cell or pico cell box so they’re less reliant on the macro cell and they work off each other,” Wilson said.

That changes what resources can be committed within a design to processing, to communication, to storage, and where it can be done best, whether it’s a central processing unit or lots of smaller processors for individual uses. It also makes it possible to cut costs in ways other than shrinking the process geometries in a design.

The Low-Latency Interface working group of the MIPI Alliance, for example, is currently working on a new standard that allows DRAM to be shared between two chips. NoC technology vendors, in particular, have seen this push because it requires a highly efficient network-on-chip infrastructure.

“The big advantage is that it allows you to get rid of an entire memory chip,” said Kurt Shuler, vice president of marketing at Arteris. “The modem and the application processor are sharing the same memory. You also reduce the number of pins, which is important because it allows you to use those pins for other things.”

He notes there is a very slight performance hit. But the ability to eliminate an entire memory chip can save a couple of dollars in a design. Multiply that by millions of units and the savings are huge, far greater than just shrinking the features on a die.

Rethinking packaging
Stacking die offers another alternative to improving performance and time to market, but the tradeoff will be in cost unless additional components can be eliminated. Adding an interposer layer or TSVs will be expensive—at least initially—even though 2.5D and full 3D stacking hold the promise of dramatically improving performance through shorter distances, bigger pipes for data, and lower power because signals will not have to be driven as far.

While this packaging approach is still under development, foundries report that chips are rolling out using this approach. “This is already happening,” said Luigi Capodieci, R&D Fellow at GlobalFoundries. “It’s mostly a decision of which design processes to use in the chip, and that decision will have to be made by the chip designers.”

Stacked die also allow IP developed at older nodes—particularly analog—to be attached through Wide I/O to other chips developed at more advanced processes. That, at least in theory, substantially reduces the time it takes to design a chip because much of it can be based on what has been previously developed.

“Re-use leads to a reduction in time to market,” said Shrikrishna Gokhale, COO and managing director of Open-Silicon’s India unit. “This opens up the lifecycle of different IP and puts the emphasis on packaging and re-use.”

It also puts greater emphasis on software-hardware co-design, he said, and requires more emphasis on defining partitioning earlier in the architecture phase. In addition, it requires a rethinking of what gets done where. Some portions of the design that used to be in separate locations now have to be co-located in the same place because of the constant need to update models and data for both hardware and software teams.

“The logic front-end design needs to be done at the same location as the software,” he said. “That’s less important at the back end, which is the physical implementation.”

Other tradeoffs are less obvious, though, particularly to design engineers. One involves weight.

“Half the weight of a tablet is the battery,” said Drew Wingard, CTO of Sonics. “You can’t afford to add a bigger battery so you have to do an increasing amount of computation with lower power. That means you look at more efficient ways of doing that computing. One is using the GPU as a general-purpose CPU, which allows you to get a lot of performance at low energy.”

He noted that utilizing the GPU requires it to be easily accessible to software developers. And it requires much better management of clock domains, voltages and on-off functionality within an acceptable power budget. And to be really energy-efficient, users need to be able to easily input their own usage models.

Rethinking manufacturing
Some of the changes that are under way are forcing a major shift in manufacturing, too. Staying on the Moore’s Law road map has always been a given for high-volume digital designs, but with double patterning required at 14nm and the delay in extreme ultraviolet lithography, alternatives are being considered that could have ramifications throughout IC design.

“Double patterning is the biggest issue we’re dealing with right now,” said Jean-Marie Brunet, director of product marketing for model-based DFM and place and route integration at Mentor Graphics. “We’re even looking at triple patterning, but there is no way to have density balance between the layers when you do that.”

Lars Liebman, an IBM distinguished engineer, said his company has been working on commercializing self-assembly for finFETs because even multi-patterning isn’t sufficient beyond 14nm. That has implications throughout the design chain. For one thing, it can increase the density on existing process nodes. For another, many of the tools for automating design, particularly on the DFM side, will need to be rewritten.

Conclusion
Area, power and performance have always been the standard metrics for tradeoff in any IC design. What’s changing significantly is why those tradeoffs are being made and where the benefits will show up. Changes targeted at an individual chip in the past, or even a block or subsystem, may now be aimed at a much broader level.

The good news is that infrastructure changes, everything from manufacturing approaches to communications networks, evolve much more slowly and deliberately than those made in the individual device or chip. The bad news is that infrastructure sometimes moves so slowly that it can affect what’s done elsewhere in this much broader system. But some change is underway at every level, and managing that change, along with the tradeoffs it will demand, will be much more challenging in the future.

Derivative ICs: A Look At The Options

Thursday, February 23rd, 2012

By Ann Steffora Mutschler
With the cost of designing and producing even moderately advanced SoCs skyrocketing, semiconductor companies and systems houses must find ways to defray the cost across a larger number of end uses than ever. As such, many companies have adopted a platform-based derivative design approach, with the platform serving as the first SoC design of a new family.

That strategy is expected to become far more widespread as mainstream process nodes push beyond 40nm, outsourcing of entire subsystems becomes more popular, and stacking of die gains traction.

“There used to be this idea that you would define this platform and all the chips would be derivatives of that platform,” said Drew Wingard, CTO at Sonics. “Now it’s more of, ‘No, we can’t afford to think about trying to build the platform itself. Instead, we are going to build the first chip, but we are going to use a platform-based design approach, which basically says we’re going to set ourselves up so that we can make derivatives.’ The way to sell this to more people is to personalize it for different customers or for different markets or for different regions of the world by making these derivatives. I think that’s a very common need that’s expressed to us by our customers, and it’s all of course about defraying the cost.”

But what exactly is a platform? A commonly cited definition was provided by Alberto Sangiovanni-Vincentelli, a professor of electrical engineering at the University of California at Berkeley, who is best known for co-founding both Synopsys and Cadence. He defined a platform as a set of choices that you’ve made, and because you’ve made those choices you can then abstract upwards from them to a higher level of design abstraction. And because you’ve fixed those aspects, you can refine downward.

“So the question then in a platform design is, ‘What things do you fix, what decisions do you make and what decisions do you make flexible,’” said Wingard. “What things can you make different in each of the different derivatives? That’s what gets interesting about this whole theory of platform-based design.”

Of course, companies like Intel have been derivative powerhouses from the very beginning, noted Naveed Sherwani, president and CEO of Open-Silicon. “That’s actually what their strength is. They would have one or two teams chase new designs, then they would have 5, 10, or 15 teams do derivatives. The challenge is not actually to do derivative engineering. The challenge is how to do derivative engineering in the most cost-effective fashion.”

What is considered cost-effective today, however, is a different formula from what applied five years ago.

“Derivatives are being used to chase smaller and smaller markets because markets are fragmenting,” Sherwani said. “We have the long tail effect. We don’t have that $3 billion market to chase. Those are few and far between. Now we have to go chase that $200 million or $400 million market. Previously you could be more inefficient in your derivative design and you could afford to be late, but now you have to be very efficient because you are chasing a smaller market.”

A second consideration is to determine if a derivative will be outsourced. That requires effort in every aspect of chip development. “You have to have a full vertical team, and that’s not quite the case when we’re doing ASICs. Somebody is giving us fully functional RTL and our job is to do physical design, tape it out and get a working part. That is a completely different phenomenon when you go to derivative design,” Sherwani continued.

Closely connected with this decision is the rapid rise in various forms of die stacking—system in package or 2.5D with a silicon interposer—which has given semiconductor and systems developers a greater degree of freedom to choose the right option for their situation.

In this realm, Jack Harding, president and CEO of eSilicon, explained the company is seeing large semiconductor and systems companies that say, “I’ve got this large inventory of successful parts. I’ve got for example, 180nm die sitting on the shelf and I’ve got this other 65nm die I got from the guy down the street and I got this RF radio I got from Skyworks and I’m going to put that all together and make some sort of product as opposed to actually making a brand new chip. The thought process is much like making a derivative. You’re making an architecture, you’re considering the software implications, you want to use as much of each, but you’re approaching it from two radically different perspectives.”

One involves making a chip that is a superset of something a company has already created. A second involves piecing it together with components the company can acquire. “Or we can try to piece it together with stuff that you have or you think you can get,” Harding said. “Let’s make a chip that is packaged or a device that is good enough and get that to your customers in six months instead of three years and the NRE is $100,000 instead of $6 million. If it really takes off, then we’ll go and make the ASIC and we’ll make that single, advanced node part. If you think about the need to have a derivative, there are really two vectors to go down. One is the die-level, advanced-packaging approach. The second is to simply modify the RTL and re-tape out a version of the chip which is kind of ‘close enough.’”

A third option, of course, is to never make a chip and simply go to an FPGA.

Outsourcing grows, but not because of cost
No matter which approach is used, outsourced teams play larger roles than in the past. “A lot of global companies have that mantra of you build one design in one location and then you ship it somewhere else for derivatives,” said Michal Siwinski, group director of product marketing in the system and software realization group at Cadence. “I’m seeing at least in the leading companies something a little bit different. The geo teams are not just being treated as a cost reduction. Rather, due to the sheer complexity of what it takes to build these massive SoC platforms, you basically have these globalized teams that might take a lead on a specific derivative.”

In general, the big savings that made headlines a decade ago by outsourcing to lower-cost labor areas are no longer an option.

“Due to the complexity of re-integration, I’m not seeing that so much now,” Siwinski said. “A whole product line with its derivatives might be led by one geography versus the other, but in general the complexity of doing that derivative and the time and cost are at this point so prohibitive on their own. If you consider that a new design is $150 million, and the derivative might be $100 million, that’s still not cheap. So it’s not about notions of having lower cost to the derivatives. Rather it’s utilizing a global workforce where a lot of that expertise needs to be built in all the geographies.”

Architectural decisions drive derivatives
With all of this complexity, there is a lot more up-front work required to define what the platform is and what new market segments it can serve, in addition to the main business unit’s market. “There’s more coordination within the corporation and more cooks in the kitchen up front,” said Kurt Shuler, vice president of marketing at Arteris. “That takes a long time and there’s a lot of coordination of the different groups, because what happens is when that platform goes from one business unit to another business unit, the IP that that second business unit uses may be totally different than what the guys who created the platform for.”

Jon McDonald, a technical marketer at Mentor Graphics, noted, “When you think about outsourcing, it all comes back to understanding what needs to be done. The traditional approach when people just started RTL design was that it was easy when a group is all in one location. If you have a design group and they’re all together, they have meetings daily or weekly. Everybody is talking and you communicate a lot of things informally. When you go to outsource something you’ve got to have a contract. And it’s not just a business contract. There’s got to be a development contract. If that development contract is not very specific and very rigorous, there are opportunities for miscommunication, there are opportunities for something to be delivered that doesn’t satisfy the need.”

To combat this, in many situations the delivery contract is an executable transaction level model, which is completely unambiguous and allows each party to say, “This is what I need: this is the performance I need. You need to deliver a subsystem or part or whatever it is that does exactly what this does with this performance and this architecture,” he continued. “By putting a little bit more into the upfront specification/the upfront architecture, if you have an executable model for the upfront architecture and a prime [mil/aero contractor] hands that to a subcontractor, it makes the process so much easier because they know what it needs to do.”
