Posts Tagged ‘ARM’

The Week In Review: Sept. 30

Friday, September 30th, 2011

By Ed Sperling
Synopsys created the first TLM Web portal, complete with an initial offering of 600 models, and inked a deal to distribute ARM’s Cortex processor models from its new TLMCentral site. Synopsys said it hopes the portal will spur investment in virtual prototyping.

Mentor Graphics won a deal with Fujitsu for its embedded software development environment, which will be used for Fujitsu’s general-purpose 32-bit microcontrollers. What’s interesting here is that Fujitsu chose Mentor’s Embedded Sourcery CodeBench for ARM’s microcontroller IP, which will be included in the Fujitsu product. It’s an unusual keyhole into the microcontroller space.

ARM cut another deal, too, which must have had the corporate lawyers hopping. Open-Silicon signed a multi-year agreement to license a broad portfolio of ARM technology, which allows Open-Silicon to offer ARM’s IP with its own design and manufacturing services. We may be witnessing a change in the wholesale distribution model.

Tensilica inked a deal with Fraunhofer IIS, which allows Erlangen, Germany-based Fraunhofer to become a design center partner for Tensilica’s HiFi Audio DSPs. Fraunhofer, incidentally, is part of the Fraunhofer-Gesellschaft research organization, which is partly funded by the German government.

ST-Ericsson reportedly gained a 10x improvement in time by using Cadence’s mixed-signal flow for its 40nm baseband chip. Create automation tools for analog engineers, force them to hit tight schedules within budget, and apparently they’ll use these tools.

The Week In Review: Sept. 23

Friday, September 23rd, 2011

By Ed Sperling
Summer is over, literally and figuratively.

Mentor Graphics extended its collaboration with NuFlare Technology for advanced mask generation, combining the Calibre DFM suite of tools with NuFlare mask writers. Mentor also moved into Brazil with a Portuguese-language version of PADS for designing PCBs, which gives an interesting indication of where that market is heading. It also rolled out new ATPG tools for test and is working with ARM on testing ARM processor-based designs.

Synopsys beefed up its own ATPG tools with a volume diagnostics flow and improved yield ramping.

Altis Semiconductor, the French specialty foundry, is standardizing on Cadence’s MaskCompose reticle and wafer synthesis technology. Shanghai-based Giantec Semiconductor has adopted Cadence’s Encounter and Virtuoso flows.  And Fujitsu is standardizing on Cadence’s DFM technology for 28nm and mixed-signal designs. In addition, Cadence introduced DFI 2.0-compliant design and verification IP.

Sonics rolled out the gigahertz version of its network-on-chip (NoC) technology, which provides lots of headroom for future derivative chips and stacked die. For anyone worried about a communications bottleneck, this should silence all fears.

The Week In Review: June 17

Friday, June 17th, 2011

By Ed Sperling
MIPS has positioned itself head-to-head with ARM in the Android world, adding yet another competitor. The other one is Intel’s Atom, of course. MIPS’ stake on this one involves a smartphone that passed the Android Compatibility Test Suite.

Moortec Semiconductor taped out its embedded temperature sensor IP using TSMC’s 40LP and 28HP processes and Synopsys’ custom design solution. Who says analog isn’t migrating down the process curve? Moortec is based in Plymouth, U.K.

TSMC’s net sales, which are a good indication of how the semiconductor industry is faring, were down 0.7% from April to May—basically flat—but they are still up 6.3% from last May, which was well into the recovery period. Revenue was up 12.2% in the same period compared with 2010.

GlobalFoundries, meanwhile, swapped out its top leadership team. Ajit Manocha will replace Doug Grose as acting CEO. James Norling will become executive chairman and Ibrahim Ajami the vice chairman, while COO Chia Song Hwee (former CEO of Chartered Semiconductor, which was acquired by GlobalFoundries) will leave the company in August.

DAC Report: June 6

Monday, June 6th, 2011

By Ed Sperling
Mentor Graphics unveiled a common embedded software development platform, using everything from virtual prototyping to emulation to speed development time. Given that software causes the largest delay in the design process these days, this is a very big deal.

Synopsys inked a multiyear agreement with ARM under which Synopsys will provide ARM engineering teams extended access to its tools and ARM will provide Synopsys access to its Cortex-A15 processor. Considering that ARM views the A15 as its chief weapon against Intel’s Atom processor, this is an important move.

Cadence signed up with IMEC to create an automated way of testing 3D stacked ICs. IMEC has created the standard methodology in this space, which is rather convoluted because it’s not always possible to directly connect to pieces deep inside a 3D stack. Mentor announced a similar deal several months ago.

Cadence also inked a deal with ARM to provide verification IP for ARM’s AMBA 4 Coherency Extensions protocol. And it announced that it is working with TSMC to provide seed IP to support USB 2.0 and 3.0.

Sonics rolled out early support for phase two of AMBA 4, which is interesting because the NoC now includes support for the popular on-chip bus. Arteris, the other NoC vendor, has taken a similar position. What’s particularly interesting is that there is now cooperation in two directions—from ARM to the NoC and back to ARM.

Sonics also rolled out support for TSMC’s Reference Flow 12. Most of the EDA world has voiced support for this flow over the past few weeks.

Atrenta announced a big win with Fujitsu Kyushu Network Technologies, which has adopted SpyGlass AutoVerify for advanced linting analysis.

EDA Inflections On Technology Innovation

Thursday, May 26th, 2011

By John Blyler
Everyone talks about innovation. Start-up companies are the most visible vehicle for innovation, but also the most risky, with a 1-in-10 chance of modest success. Less visible is the innovation that constantly must occur in fully formed, large companies if they are to continue to succeed. System-Level Design (SLD) talked with the three major EDA companies about the challenges to innovation: Michael McNamara, vice president and general manager of system-level design at Cadence; Serge Leef, vice president and general manager of new ventures in the system-level engineering division of Mentor Graphics; and Michael Jackson, vice president of engineering for physical design at Synopsys.

SLD: How do you foster innovation inside an existing, successful company? How do you create new products inside a large and often bureaucratic commercial organization?

McNamara: A good example was the C-to-Silicon product line, a high-level synthesis tool. This product started inside of Cadence labs back in 1998, originating from a project called Metropolis with UC Berkeley. This project focused on the system-level development space and was the genesis of platform-based design. The idea was that you would have a single platform that you then customize for different usages. A modern example is today’s Droid phones that use TI’s OMAP to incorporate an ARM processor, graphics, Bluetooth and other radios into a collection of IP that is aggregated together. Others then start with this platform to add their unique innovation and, wham, in 6 months you have a cell phone.
In 2002, an internal design team was exploring the idea of high-level synthesis. The idea was that a C-code program running on a microprocessor could serve as a specification for what would need to be done by the hardware device. In those days, we had a silicon compiler but needed to create a register-transfer-level (RTL) language as an in-between language. It was too hard to go all the way from C to transistors.

Interestingly enough, there was a product in the late 90’s called Behavioral Compiler, but it turned out to be a failure. The promise of high-level synthesis was huge, but it just wasn’t happening. Finding out why was the first goal of our research task and is a key difference between research and implementation (or product) groups. Research labs can step back and examine why things aren’t working as expected. We started by interviewing some two dozen companies that were doing various levels of high-level synthesis.

Our internal R&D group asked them why high-level synthesis wasn’t working. Those interviews identified a couple of issues. One was the manufacturing challenge of correcting design issues, i.e., if RTL is automatically generated, how do you accommodate changes such as specification or place and route changes?

Another issue identified by the R&D group was the challenge of reuse. Designers really want a way to specify how the C-code program would be implemented in any given design. One example is the use of decoding and compression algorithms for movies across different end-user applications, from smart phones to laptop computers to large home entertainment theater systems. You are using the same exact algorithms, but each implementation has a different set of power, performance and quality requirements. Quality issues might be that the smaller screen sizes on a smart phone don’t need the same display resolution as a home theater.

That was the nature of the research part of this project. The researchers gathered a bunch of data, then formulated three or four ideas that they believed constituted the major roadblocks for the adoption of this technology. Around this same time, Cadence had serendipitously purchased a high-level synthesis tool from a company called Get-to-Chip. Suddenly, the researchers at Berkeley labs had a high-level synthesis tool that they could play with and use to try out some of their ideas for addressing the technology roadblocks, like ECO implementation and the separation of constraints from design.

The Get-to-Chip tool read and generated Verilog, but it didn’t support C or C++ or SystemC. But the researchers could use this tool to develop a prototype. The first step involved research to identify an opportunity. The second step involved building a prototype.

SLD: Are there other ways in which large companies try to innovate new ideas?

Leef: I manage a non-traditional venture portfolio where we attempt to identify opportunities in markets or application domains that are adjacent to Mentor’s products, technologies or know-how. In other words, I’m looking for adjacent places where our current assets or expertise can be leveraged. As the IC and PCB markets are reasonably well understood and served, our focus tends to be mainly in the systems space. There are basically four operational models that we follow, which I’ll list from least to the most expensive.

Adaptation is the least expensive because it augments an existing horizontal product with domain-specific libraries, design examples and application notes to create a vertical product. One example would be SystemVision, a horizontal mechatronic simulator that is augmented with models applicable to implantable medical devices.

The next least-expensive approach is to repurpose a relevant in-house technology and retarget it to a different domain. For example, we have a good understanding of optimization techniques in the EDA space. We could consider retargeting this technology to a different, but adjacent, domain in the automotive electronic simulation market.
Let me explain. A modern vehicle is a complex distributed compute/control system with a myriad of possible trade-offs. Mapping of software modules to hardware subsystems has tremendous impact on cabling, performance, power, cost and weight. While these trade-offs are managed manually today, it is easy to see how the number of alternatives becomes impossible for a human to comprehend in a state-of-the-art vehicle containing 80 on-board computers and thousands of software modules. While the cost function needed to assess relative “goodness” of alternatives is quite complex, algorithmically, this problem is not very different from IC Place and Route. Thus, optimization algorithms that were originally created for IC optimization can be applied to automotive E/E design.
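
To make the place-and-route analogy concrete, here is a minimal sketch in Python. It is not Mentor’s algorithm: the cost function, the module and ECU names, the traffic weights and the greedy one-move-at-a-time search are all illustrative assumptions. Traffic that crosses between computers stands in for cabling, and overloading a computer is penalized, much as wire length and congestion are in IC placement.

```python
from itertools import product

def mapping_cost(assign, traffic, load, capacity, overload_penalty=100.0):
    """Hypothetical cost: cross-ECU traffic (a stand-in for cabling)
    plus a penalty for every unit of load above an ECU's capacity."""
    wiring = sum(w for (a, b), w in traffic.items() if assign[a] != assign[b])
    used = {}
    for module, ecu in assign.items():
        used[ecu] = used.get(ecu, 0) + load[module]
    overload = sum(max(0, used.get(e, 0) - cap) for e, cap in capacity.items())
    return wiring + overload_penalty * overload

def improve_mapping(assign, traffic, load, capacity):
    """Greedy local search: try moving one module at a time to another ECU
    and keep the move only if the total cost drops. Stops at a local optimum."""
    assign = dict(assign)
    best = mapping_cost(assign, traffic, load, capacity)
    improved = True
    while improved:
        improved = False
        for module, ecu in product(sorted(assign), sorted(capacity)):
            if assign[module] == ecu:
                continue
            previous = assign[module]
            assign[module] = ecu
            cost = mapping_cost(assign, traffic, load, capacity)
            if cost < best:
                best, improved = cost, True
            else:
                assign[module] = previous  # undo a move that did not help
    return assign, best

# Invented example data: four software modules, two on-board computers.
traffic = {("brake", "abs"): 5, ("abs", "dash"): 1, ("radio", "dash"): 4}
load = {"brake": 2, "abs": 2, "radio": 1, "dash": 1}
capacity = {"ecu0": 4, "ecu1": 4}
start = {"brake": "ecu0", "abs": "ecu1", "radio": "ecu0", "dash": "ecu1"}

final, cost = improve_mapping(start, traffic, load, capacity)
print(final, cost)
```

A production tool would use a far richer cost function and a search that can escape local optima (simulated annealing is a common choice in placement), but the structure of the problem is the same: a discrete assignment scored by a cost model.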

A third approach to adjacent markets is called incorporation. Here we identify useful third-party companies whose unique technologies can be plugged into one of our existing products. Of course, such plug-ins would need to drastically improve or alter the opportunity size for whatever it is that we have. For example, consider the development of virtual prototypes, which typically include models of microprocessors, microcontrollers and DSP cores. It is quite expensive to develop these models organically. Let’s imagine that there is a third-party supplier of inexpensive, fast instruction-set simulators. We might acquire a license to such a technology, then snap it into multiple simulation products. We would incorporate that technology as opposed to doing a stand-alone business acquisition.

The last and most expensive approach would be development. In this situation, we have some unique know-how, but that is it. In those cases, we’d invest in R&D to create something new based purely on our understanding of the problem and requisite technologies needed to address it.

SLD: Would Calibre be an example of an R&D project?

Leef: Calibre didn’t come through any kind of structured venture portfolio management. Rather, Calibre was something of a skunkworks exercise, where a bunch of people worked for long periods (sometimes on their own time) without management noticing what they were doing. So they succeeded against organizational forces rather than because some infrastructure was in place to support the development. What I am trying to do is create an environment where programs like Calibre can be identified and nurtured in a repeatable way, as opposed to spontaneously, which is what happened with Calibre.

Overall, we tend to view our innovation efforts in a similar way to venture capitalists (VCs). Currently, we are running projects that are in the A, B, and C stages of funding and development, similar to series funding in the VC world (see Figure 1). Basically, we have a Pre-A stage, in which we explore concepts to decide if a business plan is warranted. In Stage A, a specific market and product opportunity are identified and we develop a prototype. Stage B consists of creating a commercial-strength product and engaging with early customers. Stage C is where we deploy the product to a broad set of customers and hopefully start to generate revenue.

We do have a significant advantage over independent start-ups and that is a powerful sales organization. So if we determine that what we created has potential, we have mechanisms in place to sell and deploy the product and monetize the opportunity.

Fig. 1

SLD: How is R&D funded within a large company?

Jackson: We fund new technology and product development in our business units as opposed to an independent R&D organization. We have had many successful innovations doing this. Some recent examples include the creation of a new router (Zroute), new test compression (DFTMax), new RTL exploration (DC Explorer) and a new constraint analyzer (Galaxy Constraint Analyzer).

SLD: Do you favor internal development or acquisition as a way to innovate technology?

Jackson: Generally we rely more on home-grown development. This is especially true in areas where we are creating a replacement product or extending a product to address an adjacent area. Homegrown development is also used for new product areas, but acquisition also can play a role here.

Mobile Applications Drive New Architectures

Thursday, April 28th, 2011

By Pallab Chatterjee
The push toward mobility in consumer devices is having an impact on the entire component flow.

Mobile devices are dominated by two key factors—an overriding power constraint and very high data bandwidth. The power constraints are on the mobile device side and on the cloud-based support server side. The high data bandwidth issues are due to the limited processing power available and the need to switch between functions, rather than keeping a common memory load and multiprocessing of the data.

The power side for the mobile devices has been discussed in depth. The impact on the rest of the system is less well known. Because mobile devices have to process data on a limited power budget, the support for these devices (the carrier and connection network, and the computing cloud that the device is connected to) has to pick up the slack on the processing front. New custom chipsets and processor architectures are being created to address some high-volume connection tasks such as display view transcoding, security processing and authentication, and sensor/imaging data processing. These chips are making their way into the network connectivity side, with multicore being the dominant format for network processors. Also on the networking side, the addition of dedicated, power-optimized AES encryption/decryption blocks allows for secure data traffic on a per-block basis with mobile devices.

Also on the power side is the change to high-bandwidth interfaces such as 10G, 40G (organized as 4 lanes of 10G), and 100G (organized as 4 lanes of 25G). While it would appear these interfaces consume more power, the reality is that when implemented in pairs, the lower duty cycle and larger packet size enable low power. For the 100G interfaces, the ability to implement the 25-28G lanes with 32nm and below CMOS offers huge power savings, as the PHY/MAC pairs actually consume less dynamic and active power than 10G lanes implemented in 40nm processes.

The data bandwidth is one of the keys behind the multicore architectures of both mobile devices and server designs. To optimally process data, database access, still images, video content, audio content, gaming graphics, and sensor data (touch screen, gyroscope, GPS, etc.), separate processing engines are usually employed. This is a key driver for multicore where the task base can be continually loaded, and only the data sets get changed. In order to handle the diversity and volume of data sets to be processed, wide- and high-bandwidth data paths are needed. Servers have moved to deep memory architectures to support the cloud computing from smartphones and tablets.

Similarly, the data bandwidth of broadband and wireless is increasing. For broadband, there is a need to put more data per channel on existing lines. This is being done with new wide data architectures that support multiple lanes of SerDes driving the network. To handle the large variety of data that is being presented, cross-point switch architectures and multicore internal bus architectures are changing. These new buses are both externally expandable and support individualized power and data management for each core on the bus.

These different architectures are responsible for the division in use model of the various available cores. Tensilica cores tend to be used in audio processing applications, MIPS and Freescale cores are used in network transaction and security processing, ARM cores are used as generalized CPUs for mobile devices, x86 architectures dominate the main server side and specialty DSPs abound in sensor processing. As the data consumption system moves to being more mobile-centric, the whole ecosystem from servers to delivery is now shifting to a true ultra-thin client computing model.

3D ICs: No Simple Answers

Thursday, March 31st, 2011

By Pallab Chatterjee
Just how ready is the semiconductor industry for stacked die? That was the subject of a recent panel discussion involving ARM, Atrenta, Xilinx, Samsung and Mentor Graphics.

The reasoning behind 3D stacking is becoming clearer at each node. I/O count and delay times are forcing different configurations, but the time frames for these changes and the gating constraints are still somewhat fuzzy.

In the area of uses, the discussion focused on three areas–memories, SoCs and computing systems (processor cores and memory). Memories have been using stacked die approaches for many years. These stacks use traditional wirebond technology, feature either standard or thinned die, and have a known cost model.

The advantage of memories is that there is common pin-out and stacked devices can utilize existing memory test methodologies by adjusting the address range for the design. The die in this application are stacked from the top of one die to the bottom of the next die. These products have shipped literally billions of parts in this technology at a very similar price point to standard wire bond. This methodology supports using known good die for the design, has compatibility with current design tools and has known thermal performance.
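
As a rough illustration of what "adjusting the address range" can mean for a stack of identical memory die, the sketch below treats the stack as one flat address space and splits each address into a die index and an offset within that die. The function name, the die count and the die size are hypothetical; real parts may select die through chip-select pins or other schemes.

```python
def decode_stacked_address(address, die_count, words_per_die):
    """Split a flat address into (die index, offset within that die).

    Assumes identical die stacked into one contiguous address range,
    which is an illustrative simplification.
    """
    if not 0 <= address < die_count * words_per_die:
        raise ValueError("address outside the stacked range")
    return divmod(address, words_per_die)

# A test sweep written for one 1024-word die covers a 4-die stack
# simply by extending its address range to 4 * 1024.
print(decode_stacked_address(2050, die_count=4, words_per_die=1024))
```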

Computing systems have a different target for stacked die. These systems, however, require a different architecture. There is local 3D memory for each core that is connected, where the core is placed in the die by way of a vertical interconnect. These applications have very high I/O counts that cannot be run to peripheral I/O, so they cannot use memory-style connections. The die are stacked in a top-to-top format. These are the designs targeted for TSVs.

There are questions, however, about whether the TSVs should be part of the IP blocks and whether the models for the IP should include the timing for vertically stacked memory. The challenge with including them in the IP is related to the large variability in post-processing options for TSV creation by the fabs. The tools needed to model the TSVs and verify the IP is being used properly are still lacking, according to the panelists. Moreover, the thermal models, changes in strain after thinning, and multi-layer capacitive coupling for the die being stacked face-to-face are issues that need to be dealt with for generalized IP use.

These problems are not unsolvable. Xilinx has released products using multi-die technology, and for fixed topology applications there is an understanding of how to solve these problems. The generalized use of TSVs randomly distributed over a custom processor die leads to the creation of custom memory configurations and pin-outs, as well.

It is unlikely that a standards group will drive the memory compilers and designers to a standard pin-out for the blocks. Because the processor cores are soft IP and have different optimization tradeoffs, there is no standardized application target that would allow for the performance tradeoffs of the cores to hit a standard pin-out. In general, these will be custom designs and custom applications. The stacked die setup is targeted for very high volume or high ASP products that can justify the high cost of test.

With respect to SoCs, this platform will likely be one of the last to address TSVs because of the impact on the design and release cycle. Packaging, thermal, timing and power issues for multi-die SoCs are very complicated and are beyond the capacity of most EDA tools, especially in the context of billion-device ICs that already are pushing the limits of the tools. Advances are being made in this area, and tool vendors have discussed options for system verification that are being targeted at this use. These are still in development. The current releases of the tools address some but not all of the use models for TSVs and stacked die, or silicon interposer and stacked die, and are not yet at the design tradeoff stage.

In addition, this whole area is still bracketed by cost. Traditional system-in-package and wirebond-based stacked die are still the most cost-effective for consumer commodity chips. The key is to identify a device, market and performance metric that can justify the high production cost of this technology now.

The Quest For A Better IP Integration Methodology

Thursday, March 31st, 2011

By Ed Sperling
With the amount of IP in SoC designs now hitting an estimated 70% to 90%, companies are scrambling to figure out a way to more consistently integrate that IP and to test that it will work as expected.

This is easier said than done, however, for a number of reasons:

  1. There are numerous types of IP, ranging from I/O to logic and memory.
  2. Not all IP is of equal quality.
  3. Not all IP is used the way it was intended, or even consistently from one chip to the next.
  4. Re-use within companies of their own IP frequently doesn’t conform to any standards.

So far, standards efforts in this area have been relatively modest. The SPIRIT consortium introduced IP-XACT to document IP and provide tools to access meta data, but that’s a far cry from a consistent methodology for integrating IP.

“In the old days all you had to do was characterize the IP,” said Jean-Marie Brunet, director of product marketing for model-based DFM and place-and-route integration at Mentor Graphics. “Now you try to create context with lithography and stress. You need to instantiate the IP in corner cases and the surrounding context. It’s random at this point, which means there is not a lot of predictability.”

That becomes even more critical at future nodes. At 20nm, for example, double patterning makes IP even harder to characterize and re-use. And fill at 28nm and 20nm can have an effect on density, which in turn affects min/max values. That also has an effect on IP.

“These are problems for the IP creator and the SoC integrator,” said Brunet. “You almost need a ring around every IP, but that blows the area. And double patterning is not done the same way from the IDM to the foundry, so you need a situational solution for each version.”

There also has to be a better way of defining what is good IP. A piece of IP that functions perfectly in one design may not function the same way in all designs because of issues ranging from noise—a problem that has been particularly acute for RF and some analog IP—to electromagnetic interference, physical stress and exactly how the IP is used.

“The big issue we’ve found is that different IP is being delivered in different states of readiness and quality with a different understanding of what it means to actually be IP,” said Neil Hand, group director for product marketing in Cadence’s new business group. “Today when you deliver IP you do some amount of generalized skeleton code, floor planning and I/O placement. But there is a lack of consistency in this.”

He noted that at 70% to 90% IP content in SoCs, any amount of overhead in making IP come together and work properly is unacceptable. “What’s needed is to unify the delivery of IP. After that, everything falls into place.”

Verifying quality
Behind that delivery is a need to have more consistent quality, which means the IP can be used under a variety of circumstances and still work as planned.

“Integration is an issue, but the bigger problem from a customer standpoint is to figure out which IP is good and which IP is not good,” said Gideon Intrater, vice president of marketing and applications at MIPS. “The risk is huge. What you’re looking for is IP that is isolated enough from the rest of the system. With sensitive analog or RF you still want to be able to drop it into the chip and have enough rules in place for using that IP. But you also have to consider that the more aggressive the process technology, the more IP you put in a chip and the more power and power rails, which are noise—all of that is going to impact how the IP behaves.”

IP certainly needs to be tested once it’s in a design, but it also needs to be tested and properly characterized well before that. Large IP vendors typically build reference designs using worst-case scenarios to test the limits of their products. With Synopsys’ DDR3 and DDR4, for example, the company has built the memory into what Navraj Nandra, Synopsys’ senior director of marketing for DesignWare analog and MSIP IP, calls “cheap and nasty packages.”

“What we don’t know is how the customer will implement IP inside an SoC,” Nandra said. “But there is a lot you can do to mitigate potential issues if you know what they are.”

The largest merchant IP vendors—ARM, Synopsys and MIPS—all use this method of testing all possible configurations and developing data sheets for problems that can erupt along the way. Jack Browne, senior vice president of sales and marketing at Sonics and a former executive at MIPS, said that once an IP company has more than 20 customers and has developed more than 5 to 10 products, it has figured out the quality issues. “As customers do their second and third transaction with an IP company, they’ve got the quality issues worked out on their side, too.”

Internally developed IP and most custom-built analog IP rarely have that kind of information available, however. And as companies attempt to move their existing IP to the next process node, or when they attempt to use the old methods of putting in IP blocks as they become available, problems can erupt that no one ever considered.

“The interconnect ends up being the sticky point in chips,” said Kurt Shuler, director of marketing at Arteris. “If you use Wide I/O to memory on a mobile phone you get better bandwidth, but the question that has to be answered is where you put everything. You need to floor plan all the IP blocks earlier. And often the people doing the interconnect and the people doing the IP don’t understand the IP inputs as well as they need to.”

Future directions
The question now is just how much IP will be sold pre-integrated as subsystems or even as complete die for use in 3D stacking.

“The methodologies for putting subsystems together and SoCs together are not all that different,” said Ajoy Bose, chairman and CEO of Atrenta. “There is some methodology in place today, even if it involves homegrown solutions and scripts. What’s more of a challenge is trying to fit your own ideas into an existing situation.”

That’s been the problem with commercial IP from the start. It’s possible to write IP specifically for an SoC design that is smaller in area, uses lower power and has no proximity issues because it is developed for a specific design. But getting the design out the door on time using internally developed IP is impossible.

“Right now you create IP, sign off on that IP, you import the IP, validate it in an SoC and hand it off to implementation,” said Bose. “This is similar to what the enterprise software industry was doing with analysis, human resources and inventory. Then enterprise applications were created to connect all the software together into a single integrated package. We’re seeing the same trend in IP with the subsystem becoming more popular. It helps that the semiconductor companies are aligning themselves vertically, too. With each vertical they know the pieces that are used.”

In many cases this job will fall to value-chain producers such as eSilicon, Open-Silicon and Global Unichip, which are among the largest commercial IP integrators and testers. Kalar Rajendiran, eSilicon’s senior director of marketing, said his company has developed a four-step methodology for selecting, managing, integrating and testing IP. What’s important in this process is an understanding of how that IP performs in chips over time and for multiple customers.

“The really heavy lifting is in selecting the IP,” he said. “Choosing IP suppliers is very important. Once we qualify the IP we document it in a database with version control. We also audit the supplier’s methodology—what they use to develop and verify that IP—and we do a site visit to the IP supplier to meet with them. We’ve been doing this for 10 years. We have proof points about why not to go with certain suppliers. In some cases it’s because they cause problems for other industry players.”

At this point those kinds of capabilities are a competitive advantage, and problems with the integration and testing of IP loom large for many companies. That may change as the IP industry continues to consolidate and tools become available, but at that point the problem also may be less about integration than about customization of IP for specific needs.

The Enterprise Effect

Thursday, February 24th, 2011

By Pallab Chatterjee
In the enterprise it’s all about speed and power—as in more speed and less power—and those changes are forcing shifts in the chip architectures as well as the processes used to develop those chips.

The next generation of network control chips was discussed at the Linley Data Center Conference. The keys for the new networks are 10G data lanes to be used with 10G, 40G and 100G applications. For 100G, the alternative to 10 lanes of 10G is 4 lanes of 25Gb/s, which is also being designed at 40nm.
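The lane arithmetic behind the two 100G configurations can be checked with a short sketch. The function name `aggregate_gbps` is hypothetical, and the figures are nominal lane rates only; line-coding overhead is an assumption left out of the calculation.

```python
# Minimal sketch: nominal aggregate bandwidth for a multi-lane link.
# aggregate_gbps is a hypothetical helper, not part of any vendor tool.
# Assumption: nominal lane rates, ignoring line-coding overhead.

def aggregate_gbps(lanes: int, lane_rate_gbps: float) -> float:
    """Return the nominal aggregate rate in Gb/s for a lane configuration."""
    if lanes <= 0 or lane_rate_gbps <= 0:
        raise ValueError("lanes and lane rate must be positive")
    return lanes * lane_rate_gbps


# The two 100G configurations mentioned in the text.
configs = {
    "10 lanes x 10G": aggregate_gbps(10, 10),
    "4 lanes x 25G": aggregate_gbps(4, 25),
}

for name, total in configs.items():
    print(f"{name}: {total:g} Gb/s nominal")
```

Both configurations reach the same nominal aggregate rate; the 4-lane option simply trades fewer lanes for a faster per-lane rate.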

The 40nm processes provide the data speed that is needed, plus the power savings required to maintain the reliability of the die and package. The trend is that these high-speed switches need to be available not as single PHYs, but as duals and quads. The 40nm node allows for target power at about 3W for these parts, which will enable 24- and 48-channel switch products.

The PHY that is being provided by most of the vendors can, with the 40nm process, support security data processing. The architecture for many of the high-throughput data systems includes local data analysis, decryption, policy and authentication testing off the early data bus just after the transceivers. These application processors can be on the same die or separate die from the PHY.

In applications where there are separate server processor chips, the trend is toward 32nm processes with multicore configurations. Intel is offering 6- and 10-core products under the Westmere architecture. For the upcoming Sandy Bridge architectures, it is featuring 8 and 12 cores using the 32nm process. On the server processor side, there also are 32nm products from AMD using the new “Bulldozer” architecture. Rounding out the server side there are also new cores from ARM with the Cortex-A15.

For dedicated application processors, a number of multicore processors are now available using 40nm processes. These include the 16-core Octeon from Cavium Networks, the 8-core QorIQ from Freescale, the 4-core ACP3448 from LSI, and the 8-core XLP family of processors from NetLogic Microsystems. Also in this space is the Netronome NFP-3240, which is a 40-core 40Gbps flow processor that is a co-processor to the Xeon main processor for network traffic handling.

One of the power/performance drivers is network security. Federal Information Processing Standard (FIPS) 140 focuses on cryptography and security systems, not on items such as firewalls, Web filters, spam and virus protection, or content and flow control. The cryptographic modules are constantly increasing in the complexity of their algorithms and in how much of the data they touch.

The Growing Importance Of Subsystems

Thursday, February 24th, 2011

By Ed Sperling
A growing reliance on third-party IP is beginning to expand well beyond just IP blocks and into full subsystems, opening significant growth opportunities for companies competing in this market as well as enormous business and technical challenges.

The IP market is ripe for this kind of convergence. Complexity at advanced process nodes coupled with time-to-market demands has elevated third-party IP from an emergency fix inside most designs to a necessity. By some reports IP now accounts for up to 90% of an SoC design. What’s changing is that IP increasingly is being integrated with other IP, software and even hard IP, so it can be plugged into an SoC with far fewer of the integration problems associated with single IP blocks.

“What’s driving this are the tablet and smart phone markets,” said Prasad Subramaniam, vice president of design technology at eSilicon. “Those are the highest growth markets, and companies like ARM and MIPS are creating blocks around their cores to harden and pre-verify. We’re even seeing this happening with entire reference designs, which include those subsystems. In the past this was just a reference design. But there are a lot of OEMs—especially Asian companies—that don’t have the resources to do these designs from scratch, so they’re picking up the design and fixing whatever is necessary to get to market quickly.”

James Mac Hale, vice president of Asia operations at Sonics, had a similar view: “We’ve seen SoCs move from integrating cores to subsystems. The number of cores is going up, and so is the complexity and desire for re-use. The challenge is that once you combine all these subsystems, how do you design the overall system behavior? Each reacts differently on its own.”

Getting a handle on the changes in this market is no simple task, in part because this trend is just beginning to take shape and in part because of the breadth of what’s happening. Even defining IP can lead to arguments about what is and is not considered commercial intellectual property. Adding more pieces into the mix only confuses the definition.

Synopsys, for one, defines IP subsystems based upon function. That definition includes the integration of one or more pieces of IP with the software stack running on top of it, all of which is configured for a specific application. That could be an audio-based subsystem with an ARC processor, the necessary codecs, interconnects for such things as a headset or speakers, or it could be a USB subsystem with the controller, PHY, a software stack on the USB, integration services and verification IP, according to John Koeter, vice president of marketing for IP and systems at Synopsys.

“The majority of IP is still going through traditional sales channels,” said Koeter. “By our best estimates, about 5% of the IP market today is made up of IP subsystems. Over the next three years we expect that to double or triple.”

Exactly what those subsystems evolve into, however, is anyone’s guess. “One way to look at this is that right now you have semiconductor IP like USBs or memory, and at the other end you have platform-based design, which is architectural re-use but not IP,” said Mike Gianfagna, vice president of marketing at Atrenta. “The subsystem is what’s in between. What’s interesting about all of this is that a few years ago what we now consider a subsystem was a full chip.”

Business challenges and changes
As the market for subsystems grows it also will create significant fallout across the industry, which is why chipmakers, IP developers and tools companies all are scrambling to position themselves for this shift. Rather than just another form of outsourcing of pre-developed IP, the convergence of multiple IP blocks, development tools, software and even services threatens to shake up the power structure in this segment of the industry.

It’s not certain at this point who will lead the subsystems effort, and whether the leaders will emerge from existing IP vendors, software developers, tools companies, or some combination of all three that has yet to come together.

Neil Hand, group director for marketing in Cadence’s SoC realization group, believes that while foundries are well equipped to deliver IP, the EDA companies are better equipped to deliver a subsystem. “This is the functional space, which is where EDA companies live,” Hand said. “They can combine IP with high-level synthesis and high-level modeling. It’s a natural direction for EDA companies to be working with IP.”

eSilicon’s Subramaniam believes it also could be the chip companies that ultimately sell the combined subsystems. “I see the chip companies becoming more like IP companies, particularly as 3D stacking evolves,” he said. “You might see memory companies initially, but it also could be analog companies or an RF company selling the subsystems.”

At least part of what will drive these changes is the push toward Wide I/O and the recognition that multicore and many-core strategies need to be re-evaluated. The initial idea behind multicore was that either software would be written in parallel or that virtualization would work when parallelization wasn’t possible. Despite the devotion of enormous resources to parallelization of software, the best that companies have been able to muster for many applications is to thread certain functions onto two or four cores. In a 16-core processor, that still leaves 12 cores idle.
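The idle-core figure is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes each threaded function keeps exactly one core busy while the remaining cores do no useful work.

```python
# Minimal sketch of the idle-core arithmetic in the paragraph above.
# idle_cores and core_utilization are hypothetical names for illustration.
# Assumption: each threaded function occupies exactly one core.

def idle_cores(total_cores: int, threaded_cores: int) -> int:
    """Return how many cores sit idle when only some are used."""
    if not 0 <= threaded_cores <= total_cores:
        raise ValueError("threaded_cores must be between 0 and total_cores")
    return total_cores - threaded_cores


def core_utilization(total_cores: int, threaded_cores: int) -> float:
    """Return the fraction of cores doing useful work."""
    return (total_cores - idle_cores(total_cores, threaded_cores)) / total_cores


# The 16-core case from the text, threaded onto two or four cores.
for used in (2, 4):
    idle = idle_cores(16, used)
    share = core_utilization(16, used)
    print(f"{used} of 16 cores busy: {idle} idle, {share:.1%} utilization")
```

Even in the better case of four busy cores, three quarters of the processor is unused, which is the motivation the article gives for heterogeneous cores sized to specific applications.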

Virtualization hasn’t worked out as planned, either, despite its success in the server world. The solution in enterprise IT departments has been to virtualize servers to improve utilization of the servers, which typically had been running at 5% to 15% of capacity, by most industry estimates. While that improved efficiency in large server farms with thousands of server racks, because it reduced cooling costs and the cost of powering the servers, the strategy is actually inefficient at the processor level because all the cores must be homogeneous and too many need to be in the “on” state to take advantage of this approach. Moreover, one of the limitations of multicore and many-core systems is the shared memory.

With wide I/O, more dedicated memory and heterogeneous cores sized for specific applications, performance can be ratcheted up significantly while simultaneously reducing power. That basically turns a core into a subsystem, and one that may or may not be independently designed by a third party and tweaked slightly for re-use.

“The tricky part is what happens near the interface,” Gianfagna said. “Timing, power and the performance of a subsystem, or even a block, are now affected by its neighbors. That means you have to re-check it in the context of full chip integration. We will need tools at the subsystem and the system level to do that. In my opinion, that’s a huge opportunity for EDA. It’s also a modeling and methodology challenge.”

Signal traffic also is affected. Sonics’ Mac Hale said connectivity is one of the top issues that needs to be addressed as IP is combined into subsystems. “We need much more flexible interfaces to deal with this,” he said. “System-level IP is becoming much more important these days, and that includes subsystems. We need to understand how different subsystems interact.”

Impact of 3D stacking
3D stacking and Wide I/O are expected to bolster sales of pre-integrated IP even more. While solving the issue of traffic bottlenecks, they also significantly raise the complexity of the interactions.

“It’s not really off-the-shelf subsystems,” said Cadence’s Hand. “They have to be tweaked by traffic patterns. Long-term there may be a whole memory subsystem, but right now it’s getting together pieces that work. That could include a Qualcomm baseband subsystem, which is incredibly complex. It also could be a compute subsystem that includes a processor and graphics chip. But while there is a demand for off-the-shelf IP that works together, customers are still wary of taking everything off the shelf.”

In one respect this is a significant market shift. From an EDA tools perspective, however, it amounts to a tweak—at least for the moment.

Michael Buehler-Garcia, director of Calibre design solutions marketing at Mentor Graphics, said that whether it’s blocks or subsystems or even moving devices onto printed circuit boards, the basic idea hasn’t changed. As a result he believes many of the tools at the back end of the design should work fine, at least until stacking of chips begins over the next couple of years.

“At that point you’re going to be doing the kind of make vs. buy decisions that you’re doing now with third-party IP, but it could be a proven die instead of IP,” Buehler-Garcia said. “From an EDA perspective, until you are doing tradeoff analysis and tuning with the TSV (through-silicon via), the tools we have now will work with extra scripts.”

Conclusions
The push toward subsystems will continue unfolding over the next few years, driven by ever-increasing complexity and an understanding of where companies truly add value and where they’re adding a function that is required by a particular market segment. That makes subsystems a design shortcut, and one that is particularly useful when the marketing department adds another requirement late in the design cycle.

“An IP subsystem becomes the bridge between the system-level design and the implementation,” said Synopsys’ Koeter. “You get a virtualized model of the IP subsystem, accelerated chip-level verification and you start seeing software integration of the stack.”

It also becomes a business opportunity in its own right for companies that can build these flexible subsystems, and for those that can sell them in a coherent way.

“The market opportunity is for a catalog of proven silicon so you pick out what you want,” said Mentor’s Buehler-Garcia. “That is a quick way to get to market.”
