Posts Tagged ‘ARM’


DAC Report: June 6

Monday, June 6th, 2011

By Ed Sperling
Mentor Graphics unveiled a common embedded software development platform, using everything from virtual prototyping to emulation to speed development time. Given that software causes the largest delay in the design process these days, this is a very big deal.

Synopsys inked a multiyear agreement with ARM under which Synopsys will provide ARM engineering teams extended access to its tools and ARM will provide Synopsys access to its Cortex-A15 processor. Considering that ARM views the A15 as its chief weapon against Intel’s Atom processor, this is an important move.

Cadence signed up with IMEC to create an automated way of testing 3D stacked ICs. IMEC has created the standard methodology in this space, which is rather convoluted because it’s not always possible to directly connect to pieces deep inside a 3D stack. Mentor announced a similar deal several months ago.

Cadence also inked a deal with ARM to provide verification IP for ARM’s AMBA 4 Coherency Extensions protocol. And it announced that it is working with TSMC to provide seed IP to support USB 2.0 and 3.0.

Sonics rolled out early support for phase two of AMBA 4, which is interesting because the NoC now includes support for the popular on-chip bus. Arteris, the other NoC vendor, has taken a similar position. What’s particularly interesting is that there is now cooperation in two directions—from ARM to the NoC and back to ARM.

Sonics also rolled out support for TSMC’s Reference Flow 12. Most of the EDA world has voiced support for this flow over the past few weeks.

Atrenta announced a big win with Fujitsu Kyushu Network Technologies, which has adopted SpyGlass AutoVerify for advanced linting analysis.

EDA Inflections On Technology Innovation

Thursday, May 26th, 2011

By John Blyler
Everyone talks about innovation. Start-up companies are the most visible vehicle for innovation, but also the riskiest, with a 1-in-10 chance of modest success. Less visible is the innovation that constantly must occur in fully formed, large companies if they are to continue to succeed. System-Level Design (SLD) talked with the three major EDA companies about the challenges to innovation: Michael McNamara, vice president and general manager of system-level design at Cadence; Serge Leef, vice president and general manager of new ventures in the system-level engineering division of Mentor Graphics; and Michael Jackson, vice president of engineering for physical design at Synopsys.

SLD: How do you foster innovation inside an existing, successful company? How do you create new products inside a large and often bureaucratic commercial organization?

McNamara: A good example was the C-to-Silicon product line, a high-level synthesis tool. This product started inside of Cadence labs back in 1998, originating from a project called Metropolis with UC Berkeley. This project focused on the system-level development space and was the genesis of platform-based design. The idea was that you would have a single platform that you then customize for different usages. A modern example is today’s Droid phones that use TI’s OMAP to incorporate an ARM processor, graphics, Bluetooth and other radios into a collection of IP that is aggregated together. Others then start with this platform to add their unique innovation and, wham, in 6 months you have a cell phone.
In 2002, an internal design team was exploring the idea of high-level synthesis. The idea was that a C-code program running on a microprocessor could serve as a specification for what would need to be done by the hardware device. In those days, we had a silicon compiler but needed to create a register-transfer-level (RTL) language as an in-between language. It was too hard to go all the way from C to transistors.

Interestingly enough, there was a product in the late 90’s called Behavioral Compiler, but it turned out to be a failure. The promise of high-level synthesis was huge, but it just wasn’t happening. Finding out why was the first goal of our research task and is a key difference between research and implementation (or product) groups. Research labs can step back and examine why things aren’t working as expected. We started by interviewing some two dozen companies that were doing various levels of high-level synthesis.

Our internal R&D group asked them why high-level synthesis wasn’t working. Those interviews identified a couple of issues. One was the manufacturing challenge of correcting design issues, i.e., if RTL is automatically generated, how do you accommodate changes such as specification or place and route changes?

Another issue identified by the R&D group was the challenge of reuse. Designers really want a way to specify how the C-code program would be implemented in any given design. One example is the use of decoding and compression algorithms for video in different end-user applications, from smart phones to laptop computers to large home entertainment theater systems. You are using the exact same algorithms, but each implementation has a different set of power, performance and quality requirements. A quality issue might be that the smaller screen on a smart phone doesn’t need the same display resolution as a home theater.

That was the nature of the research part of this project. The researchers gathered a bunch of data, then formulated three or four ideas that they believed constituted the major roadblocks for the adoption of this technology. Around this same time, Cadence had serendipitously purchased a high-level synthesis tool from a company called Get-to-Chip. Suddenly, the researchers at Berkeley labs had a high-level synthesis tool that they could play with to try out some of their ideas for addressing the technology roadblocks, like ECO implementation and the separation of constraints from design.

The Get-to-Chip tool read and generated Verilog, but it didn’t support C or C++ or SystemC. But the researchers could use this tool to develop a prototype. The first step involved research to identify an opportunity. The second step involved building a prototype.

SLD: Are there other ways in which large companies try to innovate new ideas?

Leef: I manage a non-traditional venture portfolio where we attempt to identify opportunities in markets or application domains that are adjacent to Mentor’s products, technologies or know-how. In other words, I’m looking for adjacent places where our current assets or expertise can be leveraged. As the IC and PCB markets are reasonably well understood and served, our focus tends to be mainly in the systems space. There are basically four operational models that we follow, which I’ll list from least to the most expensive.

Adaptation is the least expensive because it augments an existing horizontal product with domain-specific libraries, design examples and application notes to create a vertical product. One example would be SystemVision, a horizontal mechatronic simulator that is augmented with models applicable to implantable medical devices.

The next least-expensive approach is to repurpose a relevant in-house technology and retarget it to a different domain. For example, we have a good understanding of optimization techniques in the EDA space. We could consider retargeting this technology to a different, but adjacent, domain in the automotive electronic simulation market.
Let me explain. A modern vehicle is a complex distributed compute/control system with a myriad of possible trade-offs. Mapping of software modules to hardware subsystems has tremendous impact on cabling, performance, power, cost and weight. While these trade-offs are managed manually today, it is easy to see how the number of alternatives becomes impossible for a human to comprehend in a state-of-the-art vehicle containing 80 on-board computers and thousands of software modules. While the cost function needed to assess relative “goodness” of alternatives is quite complex, algorithmically, this problem is not very different from IC Place and Route. Thus, optimization algorithms that were originally created for IC optimization can be applied to automotive E/E design.
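To make the mapping problem concrete, here is a minimal Python sketch. Everything in it is hypothetical: the module names, loads, ECU capacities and traffic weights are invented for illustration, and the cost function only counts cross-ECU traffic as a stand-in for cabling. It uses exhaustive search, which works only at toy sizes; a vehicle with 80 computers and thousands of modules would need the kind of heuristic optimization used in place and route.

```python
import itertools

# Hypothetical processing load of each software module.
MODULE_LOAD = {"brake_ctrl": 3, "door_lock": 1, "infotainment": 4, "climate": 2}
# Hypothetical processing capacity of each on-board computer (ECU).
ECU_CAPACITY = {"ecu_front": 5, "ecu_rear": 6}
# Hypothetical traffic between module pairs. Pairs placed on different
# ECUs need cabling, so their weight is added to the cost.
TRAFFIC = {
    ("brake_ctrl", "climate"): 1,
    ("door_lock", "infotainment"): 2,
    ("infotainment", "climate"): 3,
}


def cost(assignment):
    """Return the cross-ECU traffic cost, or None if an ECU is overloaded."""
    load = {ecu: 0 for ecu in ECU_CAPACITY}
    for module, ecu in assignment.items():
        load[ecu] += MODULE_LOAD[module]
    if any(load[ecu] > ECU_CAPACITY[ecu] for ecu in ECU_CAPACITY):
        return None
    return sum(w for (a, b), w in TRAFFIC.items() if assignment[a] != assignment[b])


def best_mapping():
    """Try every module-to-ECU assignment and keep the cheapest feasible one."""
    modules = sorted(MODULE_LOAD)
    ecus = sorted(ECU_CAPACITY)
    best = None
    for choice in itertools.product(ecus, repeat=len(modules)):
        assignment = dict(zip(modules, choice))
        c = cost(assignment)
        if c is not None and (best is None or c < best[0]):
            best = (c, assignment)
    return best


if __name__ == "__main__":
    print(best_mapping())
```

The search space grows as the number of ECUs raised to the number of modules, which is why manual or exhaustive approaches stop scaling and placement-style algorithms become attractive.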

A third approach to adjacent markets is called incorporation. Here we identify useful third-party companies whose unique technologies can be plugged into one of our existing products. Of course, such plug-ins would need to drastically improve or alter the opportunity size for whatever it is that we have. For example, consider the development of virtual prototypes, which typically include models of microprocessors, microcontrollers and DSP cores. It is quite expensive to develop these models organically. Let’s imagine that there is a third-party supplier of inexpensive, fast instruction-set simulators. We might acquire a license to such a technology, then snap it into multiple simulation products. We would incorporate that technology as opposed to doing a stand-alone business acquisition.

The last and most expensive approach would be development. In this situation, we have some unique know-how, but that is it. In those cases, we’d invest in R&D to create something new based purely on our understanding of the problem and requisite technologies needed to address it.

SLD: Would Calibre be an example of an R&D project?

Leef: Calibre didn’t come through any kind of structured venture portfolio management. Rather, Calibre was something of a skunkworks exercise, where a bunch of people worked for long periods (sometimes on their own time) without management noticing what they were doing. So they succeeded against organizational forces rather than because some infrastructure was in place to support the development. What I am trying to do is create an environment where programs like Calibre can be identified and nurtured in a repeatable way, as opposed to spontaneously, which is what happened with Calibre.

Overall, we tend to view our innovation efforts in a similar way to venture capitalists (VCs). Currently, we are running projects that are in the A, B, and C stages of funding and development, similar to series funding in the VC world (see Figure 1). Basically, we have a Pre-A stage, in which we explore concepts to decide if a business plan is warranted. In Stage A, a specific market and product opportunity are identified and we develop a prototype. Stage B consists of creating a commercial-strength product and engaging with early customers. Stage C is where we deploy the product to a broad set of customers and hopefully start to generate revenue.

We do have a significant advantage over independent start-ups and that is a powerful sales organization. So if we determine that what we created has potential, we have mechanisms in place to sell and deploy the product and monetize the opportunity.

Fig. 1

SLD: How is R&D funded within a large company?

Jackson: We fund new technology and product development in our business units as opposed to an independent R&D organization. We have had many successful innovations doing this. Some recent examples include the creation of a new router (Zroute), new test compression (DFTMax), new RTL exploration (DC Explorer) and a new constraint analyzer (Galaxy Constraint Analyzer).

SLD: Do you favor internal development or acquisition as a way to innovate technology?

Jackson: Generally we rely more on home-grown development. This is especially true in areas where we are creating a replacement product or extending a product to address an adjacent area. Homegrown development is also used for new product areas, but acquisition also can play a role here.

Mobile Applications Drive New Architectures

Thursday, April 28th, 2011

By Pallab Chatterjee
The push toward mobility in consumer devices is having an impact on the entire component flow.

Mobile devices are dominated by two key factors—an overriding power constraint and very high data bandwidth. The power constraints are on the mobile device side and on the cloud-based support server side. The high data bandwidth issues are due to the limited processing power available and the need to switch between functions, rather than keeping a common memory load and multiprocessing of the data.

The power side for the mobile devices has been discussed in depth. The impact on the rest of the system is less well known. Because mobile devices have to process data on a limited power budget, the support for these devices (the carrier and connection network, and the computing cloud that the device is connected to) has to pick up the slack on the processing front. New custom chipsets and processor architectures are being created to address some high-volume connection tasks such as display view transcoding, security processing and authentication, and sensor/imaging data processing. These chips are making their way into the network connectivity side, with multicore being the dominant format for network processors. Also on the networking side, the addition of dedicated, power-optimized AES encryption/decryption blocks allows for secure data traffic on a per-block basis with mobile devices.

Also on the power side is the change to high-bandwidth interfaces such as 10G, 40G (organized as 4 lanes of 10G), and 100G (organized as 4 lanes of 25G). While it would appear these interfaces consume more power, the reality is that when implemented in pairs, the lower duty cycle and larger packet size enable low power. For the 100G interfaces, the ability to implement the 25-28G lanes with 32nm and below CMOS offers huge power savings, as the PHY/MAC pairs actually consume less dynamic and active power than 10G lanes implemented in 40nm processes.

The data bandwidth is one of the keys behind the multicore architectures of both mobile devices and server designs. To optimally process data, database access, still images, video content, audio content, gaming graphics, and sensor data (touch screen, gyroscope, GPS, etc.), separate processing engines are usually employed. This is a key driver for multicore where the task base can be continually loaded, and only the data sets get changed. In order to handle the diversity and volume of data sets to be processed, wide- and high-bandwidth data paths are needed. Servers have moved to deep memory architectures to support cloud computing from smartphones and tablets.

Similarly, the data bandwidth of broadband and wireless is increasing. For broadband, there is a need to put more data per channel on existing lines. This is being done with new wide data architectures that support multiple lanes of SerDes driving the network. To handle the large variety of data that is being presented, cross-point switch architectures and multicore internal bus architectures are changing. These new buses are both externally expandable and support individualized power and data management for each core on the bus.

These different architectures are responsible for the division in use model of the various available cores. Tensilica cores tend to be used in audio processing applications, MIPS and Freescale cores are used in network transaction and security processing, ARM cores are used as generalized CPUs for mobile devices, x86 architectures dominate the main server side and specialty DSPs abound in sensor processing. As data consumption systems become more mobile-centric, the whole ecosystem from servers to delivery is now shifting to a true ultra-thin client computing model.

3D ICs: No Simple Answers

Thursday, March 31st, 2011

By Pallab Chatterjee
Just how ready is the semiconductor industry for stacked die? That was the subject of a recent panel discussion involving ARM, Atrenta, Xilinx, Samsung and Mentor Graphics.

The reasoning behind 3D stacking is becoming clearer at each node. I/O count and delay times are forcing different configurations, but the time frames for these changes and the gating constraints are still somewhat fuzzy.

In the area of uses, the discussion focused on three areas–memories, SoCs and computing systems (processor cores and memory). Memories have been using stacked die approaches for many years. These stacks use traditional wirebond technology, feature either standard or thinned die, and have a known cost model.

The advantage of memories is that there is common pin-out and stacked devices can utilize existing memory test methodologies by adjusting the address range for the design. The die in this application are stacked from the top of one die to the bottom of the next die. These products have shipped literally billions of parts in this technology at a very similar price point to standard wire bond. This methodology supports using known good die for the design, has compatibility with current design tools and has known thermal performance.
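The point about adjusting the address range can be illustrated with a small Python sketch. It assumes, purely for illustration, a stack of identical dies addressed as one flat range, where the upper part of the address selects the die and the remainder is the local address; the article does not say how any particular product decodes addresses, and the sizes below are hypothetical. Under that assumption, the same write/read-back test used for a single die simply runs over a longer range.

```python
WORDS_PER_DIE = 1024  # hypothetical capacity of one memory die
DIES_IN_STACK = 4     # hypothetical number of dies in the stack


def decode(address):
    """Split a flat stack address into (die index, address within that die)."""
    if not 0 <= address < WORDS_PER_DIE * DIES_IN_STACK:
        raise ValueError("address outside stacked range")
    return divmod(address, WORDS_PER_DIE)


def write_read_test(stack):
    """Write a pattern to every word in the extended range, then read it back."""
    total = WORDS_PER_DIE * DIES_IN_STACK
    for address in range(total):
        die, offset = decode(address)
        stack[die][offset] = address & 0xFF
    for address in range(total):
        die, offset = decode(address)
        if stack[die][offset] != address & 0xFF:
            return False
    return True


if __name__ == "__main__":
    # Model each die as a plain list of words.
    stack = [[0] * WORDS_PER_DIE for _ in range(DIES_IN_STACK)]
    print(write_read_test(stack))
```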

Computing systems have a different target for stacked die. These systems, however, require a different architecture. There is local 3D memory for each core that is connected, where the core is placed in the die by way of a vertical interconnect. These applications have very high I/O counts that cannot be run to peripheral I/O, so they cannot use memory-style connections. The die are stacked in a top-to-top format. These are the designs targeted for TSVs.

There are questions, however, about whether the TSVs should be part of the IP blocks and whether the models for the IP should include the timing for vertically stacked memory. The challenge with including them in the IP is related to the large variability in post-processing options for TSV creation by the fabs. The tools needed to model the TSVs and verify the IP is being used properly are still lacking, according to the panelists. Moreover, the thermal models, changes in strain after thinning, and multi-layer capacitive coupling for the die being stacked face-to-face are issues that need to be dealt with for generalized IP use.

These problems are not unsolvable. Xilinx has released products using multi-die technology, and for fixed topology applications there is an understanding of how to solve these problems. The generalized use of TSVs randomly distributed over a custom processor die leads to the creation of custom memory configurations and pin-outs, as well.

It is unlikely that a standards group will drive the memory compilers and designers to a standard pin-out for the blocks. Because the processor cores are soft IP and have different optimization tradeoffs, there is no standardized application target that would allow for the performance tradeoffs of the cores to hit a standard pin-out. In general, these will be custom designs and custom applications. The stacked die setup is targeted for very high volume or high ASP products that can justify the high cost of test.

With respect to SoCs, this platform will likely be one of the last to address TSVs because of the impact on the design and release cycle. Packaging, thermal, timing and power issues for multi-die SoCs are very complicated and are beyond the capacity of most EDA tools, especially in the context of billion-device ICs that already are pushing the limits of the tools. Advances are being made in this area, and tool vendors have discussed options for system verification that are being targeted at this use. These are still in development, and the current releases of the tools address some but not all of the use models for TSVs and stacked die, or silicon interposer and stacked die, but they are not yet at the design tradeoff stage.

In addition, this whole area is still bracketed by cost. Traditional system-in-package and wirebond-based stacked die are still the most cost-effective for consumer commodity chips. The key is to identify a device, market and performance metric that can justify the high production cost of this technology now.

The Quest For A Better IP Integration Methodology

Thursday, March 31st, 2011

By Ed Sperling
With the amount of IP in SoC designs now hitting an estimated 70% to 90%, companies are scrambling to figure out a way to more consistently integrate that IP and to test that it will work as expected.

This is easier said than done, however, for a number of reasons:

  1. There are numerous types of IP, ranging from I/O to logic and memory.
  2. Not all IP is of equal quality.
  3. Not all IP is used the way it was intended, or even consistently from one chip to the next.
  4. Re-use within companies of their own IP frequently doesn’t conform to any standards.

So far, standards efforts in this area have been relatively modest. The SPIRIT consortium introduced IP-XACT to document IP and provide tools to access meta data, but that’s a far cry from a consistent methodology for integrating IP.

“In the old days all you had to do was characterize the IP,” said Jean-Marie Brunet, director of product marketing for model-based DFM and place-and-route integration at Mentor Graphics. “Now you try to create context with lithography and stress. You need to instantiate the IP in corner cases and the surrounding context. It’s random at this point, which means there is not a lot of predictability.”

That becomes even more critical at future nodes. At 20nm, for example, double patterning makes IP even harder to characterize and re-use. And fill at 28nm and 20nm can have an effect on density, which in turn affects min/max values. That also has an effect on IP.

“These are problems for the IP creator and the SoC integrator,” said Brunet. “You almost need a ring around every IP, but that blows the area. And double patterning is not done the same way from the IDM to the foundry, so you need a situational solution for each version.”

There also has to be a better way of defining what is good IP. A piece of IP that functions perfectly in one design may not function the same way in all designs because of issues ranging from noise—a problem that has been particularly acute for RF and some analog IP—to electromagnetic interference, physical stress and exactly how the IP is used.

“The big issue we’ve found is that different IP is being delivered in different states of readiness and quality with a different understanding of what it means to actually be IP,” said Neil Hand, group director for product marketing in Cadence’s new business group. “Today when you deliver IP you do some amount of generalized skeleton code, floor planning and I/O placement. But there is a lack of consistency in this.”

He noted that at 70% to 90% IP content in SoCs, any amount of overhead in making IP come together and work properly is unacceptable. “What’s needed is to unify the delivery of IP. After that, everything falls into place.”

Verifying quality
Behind that delivery is a need to have more consistent quality, which means the IP can be used under a variety of circumstances and still work as planned.

“Integration is an issue, but the bigger problem from a customer standpoint is to figure out which IP is good and which IP is not good,” said Gideon Intrater, vice president of marketing and applications at MIPS. “The risk is huge. What you’re looking for is IP that is isolated enough from the rest of the system. With sensitive analog or RF you still want to be able to drop it into the chip and have enough rules in place for using that IP. But you also have to consider that the more aggressive the process technology, the more IP you put in a chip and the more power and power rails, which are noise—all of that is going to impact how the IP behaves.”

IP certainly needs to be tested once it’s in a design, but it also needs to be tested and properly characterized well before that. Large IP vendors typically build reference designs using worst-case scenarios to test the limits of their products. With Synopsys’ DDR3 and DDR4, for example, the company has built the memory into what Navraj Nandra, Synopsys’ senior director of marketing for DesignWare analog and MSIP IP, calls “cheap and nasty packages.”

“What we don’t know is how the customer will implement IP inside an SoC,” Nandra said. “But there is a lot you can do to mitigate potential issues if you know what they are.”

The largest merchant IP vendors—ARM, Synopsys and MIPS—all use this method of testing all possible configurations and developing data sheets for problems that can erupt along the way. Jack Browne, senior vice president of sales and marketing at Sonics and a former executive at MIPS, said that once an IP company has more than 20 customers and has developed more than 5 to 10 products, it has figured out the quality issues. “As customers do their second and third transaction with an IP company, they’ve got the quality issues worked out on their side, too.”

Internally developed IP and most custom-built analog IP rarely have that kind of information available, however. And as companies attempt to move their existing IP to the next process node, or when they attempt to use the old methods of putting in IP blocks as they become available, problems can erupt that no one ever considered.

“The interconnect ends up being the sticky point in chips,” said Kurt Shuler, director of marketing at Arteris. “If you use Wide I/O to memory on a mobile phone you get better bandwidth, but the question that has to be answered is where you put everything. You need to floor plan all the IP blocks earlier. And often the people doing the interconnect and the people doing the IP don’t understand the IP inputs as well as they need to.”

Future directions
The question now is just how much IP will be sold pre-integrated as subsystems or even as complete die for use in 3D stacking.

“The methodologies for putting subsystems together and SoCs together are not all that different,” said Ajoy Bose, chairman and CEO of Atrenta. “There is some methodology in place today, even if it involves homegrown solutions and scripts. What’s more of a challenge is trying to fit your own ideas into an existing situation.”

That’s been the problem with commercial IP from the start. It’s possible to write IP specifically for an SoC design that is smaller in area, uses lower power and has no proximity issues because it is developed for a specific design. But getting the design out the door on time using internally developed IP is impossible.

“Right now you create IP, sign off on that IP, you import the IP, validate it in an SoC and hand it off to implementation,” said Bose. “This is similar to what the enterprise software industry was doing with analysis, human resources and inventory. Then enterprise applications were created to connect all the software together into a single integrated package. We’re seeing the same trend in IP with the subsystem becoming more popular. It helps that the semiconductor companies are aligning themselves vertically, too. With each vertical they know the pieces that are used.”

In many cases this job will fall to value-chain producers such as eSilicon, Open-Silicon and Global Unichip, which are among the largest commercial IP integrators and testers. Kalar Rajendiran, eSilicon’s senior director of marketing, said his company has developed a four-step methodology for selecting, managing, integrating and testing IP. What’s important in this process is an understanding of how that IP performs in chips over time and for multiple customers.

“The really heavy lifting is in selecting the IP,” he said. “Choosing IP suppliers is very important. Once we qualify the IP we document it in a database with version control. We also audit the supplier’s methodology—what they use to develop and verify that IP—and we do a site visit to the IP supplier to meet with them. We’ve been doing this for 10 years. We have proof points about why not to go with certain suppliers. In some cases it’s because they cause problems for other industry players.”

At this point those kinds of capabilities are a competitive advantage, and problems with the integration and testing of IP loom large for many companies. That may change as the IP industry continues to consolidate and tools become available, but at that point the problem also may be less about integration than about customization of IP for specific needs.

The Enterprise Effect

Thursday, February 24th, 2011

By Pallab Chatterjee
In the enterprise it’s all about speed and power—as in more speed and less power—and those changes are forcing shifts in the chip architectures as well as the processes used to develop those chips.

At the Linley Data Center Conference, the next generation of network control chips was discussed. The keys for the new networks are 10G data lanes to be used with 10G/40G and 100G applications. For 100G, the alternative to 10 lanes of 10G was 4 lanes of 25Gb/s, also being designed in 40nm.
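As a quick check on the lane arithmetic, the following Python sketch computes the aggregate rate for the lane organizations mentioned above. It is only illustrative: the dictionary keys are labels invented here, and the figures are nominal lane rates that ignore any encoding or protocol overhead.

```python
# Lane organizations from the text: (number of lanes, nominal Gb/s per lane).
# The labels are made up for this example.
LANE_CONFIGS = {
    "40G as 4 x 10G": (4, 10),
    "100G as 10 x 10G": (10, 10),
    "100G as 4 x 25G": (4, 25),
}


def aggregate_gbps(lanes, rate_per_lane):
    """Nominal aggregate bandwidth of a multi-lane link in Gb/s."""
    return lanes * rate_per_lane


if __name__ == "__main__":
    for label, (lanes, rate) in LANE_CONFIGS.items():
        print(f"{label}: {lanes} lanes, {aggregate_gbps(lanes, rate)} Gb/s total")
```

Both 100G organizations reach the same nominal total; the 4 x 25G option does it with fewer than half the lanes, which is where the faster per-lane signaling becomes the design challenge.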

The 40nm processes give the advantage of the data speed that was needed, plus power savings that are required to keep the reliability of the die and package. The trend is that these high-speed switches need to be available not as single PHYs, but as duals and quads. The 40nm node allows for target power at about 3W for these parts, which will enable 24- and 48-channel switch products.

The PHY that is being provided by most of the vendors can, with the 40nm process, support security data processing. The architecture for many of the high-throughput data systems includes local data analysis, decryption, policy and authentication testing off the early data bus just after the transceivers. These application processors can be on the same die or separate die from the PHY.

In applications where there are separate server processor chips, the trend is toward 32nm processes with multicore configurations. Intel is offering 6- and 10-core products under the Westmere architecture. For the upcoming Sandy Bridge architectures, they are featuring 8 and 12 cores using the 32nm process. On the server processor side, there also are 32nm products from AMD using the new “Bulldozer” architecture. Rounding out the server side there are also new cores from ARM with the Cortex-A15.

For dedicated application processors, a number of multicore processors are now available using 40nm processes. These include the 16-core Octeon from Cavium Networks, the 8-core QorIQ from Freescale, the 4-core ACP3448 from LSI, and the 8-core XLP family of processors from Netlogic Micro. Also in this space is the Netronome NFP-3240, which is a 40-core 40Gbps flow processor that is a co-processor to the Xeon main processor for network traffic handling.

One of the power/performance drivers is the security aspects of the networks. The Federal Information Processing Standards (FIPS) 140 is focused on cryptography and security systems, not on items such as firewalls, Web filters, spam and virus protection, or content and flow control. The cryptographic modules are constantly increasing in complexity of their algorithms and degree of touch of the data.

The Growing Importance Of Subsystems

Thursday, February 24th, 2011

By Ed Sperling
A growing reliance on third-party IP is beginning to expand well beyond just IP blocks and into full subsystems, opening significant growth opportunities for companies competing in this market as well as enormous business and technical challenges.

The IP market is ripe for this kind of convergence. Complexity at advanced process nodes coupled with time-to-market demands has elevated third-party IP from an emergency fix inside most designs to a necessity. By some reports IP now accounts for up to 90% of an SoC design. What’s changing is that IP increasingly is being integrated with other IP, software and even hard IP, so it can be plugged into an SoC with far fewer integration problems associated with single IP blocks.

“What’s driving this are the tablet and smart phone markets,” said Prasad Subramaniam, vice president of design technology at eSilicon. “Those are the highest growth markets, and companies like ARM and MIPS are creating blocks around their cores to harden and pre-verify. We’re even seeing this happening with entire reference designs, which include those subsystems. In the past this was just a reference design. But there are a lot of OEMs—especially Asian companies—that don’t have the resources to do these designs from scratch, so they’re picking up the design and fixing whatever is necessary to get to market quickly.”

James Mac Hale, vice president of Asia operations at Sonics, had a similar view: “We’ve seen SoCs move from integrating cores to subsystems. The number of cores is going up, and so is the complexity and desire for re-use. The challenge is that once you combine all these subsystems, how do you design the overall system behavior. Each reacts differently on its own.”

Getting a handle on the changes in this market is no simple task, in part because this trend is just beginning to take shape and in part because of the breadth of what’s happening. Even defining IP can lead to arguments about what is and is not considered commercial intellectual property. Adding more pieces into the mix only confuses the definition.

Synopsys, for one, defines IP subsystems based upon function. That definition includes the integration of one or more pieces of IP with the software stack running on top of it, all of which is configured for a specific application. That could be an audio-based subsystem with an ARC processor, the necessary codecs, interconnects for such things as a headset or speakers, or it could be a USB subsystem with the controller, PHY, a software stack on the USB, integration services and verification IP, according to John Koeter, vice president of marketing for IP and systems at Synopsys.

“The majority of IP is still going through traditional sales channels,” said Koeter. “By our best estimates, about 5% of the IP market today is made up of IP subsystems. Over the next three years we expect that to double or triple.”

Exactly what those subsystems evolve into, however, is anyone’s guess. “One way to look at this is that right now you have semiconductor IP like USBs or memory, and at the other end you have platform-based design, which is architectural re-use but not IP,” said Mike Gianfagna, vice president of marketing at Atrenta. “The subsystem is what’s in between. What’s interesting about all of this is that a few years ago what we now consider a subsystem was a full chip.”

Business challenges and changes
As the market for subsystems grows it also will create significant fallout across the industry, which is why chipmakers, IP developers and tools companies all are scrambling to position themselves for this shift. Rather than just another form of outsourcing of pre-developed IP, the convergence of multiple IP blocks, development tools, software and even services threatens to shake up the power structure in this segment of the industry.

It’s not certain at this point who will lead the subsystems effort, and whether the leaders will emerge from existing IP vendors, software developers, tools companies, or some combination of all three that has yet to come together.

Neil Hand, group director for marketing inside of Cadence’s SoC realization group, believes that while foundries are well equipped to deliver IP the EDA companies are better equipped to deliver a subsystem. “This is the functional space, which is where EDA companies live,” Hand said. “They can combine IP with high-level synthesis and high-level modeling. It’s a natural direction for EDA companies to be working with IP.”

eSilicon’s Subramaniam believes it also could be the chip companies that ultimately sell the combined subsystems. “I see the chip companies becoming more like IP companies, particularly as 3D stacking evolves,” he said. “You might see memory companies initially, but it also could be analog companies or an RF company selling the subsystems.”

At least part of what will drive these changes is the push toward Wide I/O and the recognition that multicore and many-core strategies need to be re-evaluated. The initial idea behind multicore was that either software would be written in parallel or that virtualization would work when parallelization wasn’t possible. Despite the devotion of enormous resources to parallelization of software, the best that companies have been able to muster for many applications is to thread certain functions onto two or four cores. In a 16-core processor, that still leaves at least 12 cores idle.
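The arithmetic behind that idle-core figure is easy to check. The short Python sketch below is only an illustration of the numbers quoted above; the function names are hypothetical and are not drawn from any tool mentioned in the article.

```python
def idle_cores(total_cores: int, threaded_cores: int) -> int:
    """Return how many cores sit idle when work is threaded onto only some of them."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    if not 0 <= threaded_cores <= total_cores:
        raise ValueError("threaded_cores must be between 0 and total_cores")
    return total_cores - threaded_cores


def utilization(total_cores: int, threaded_cores: int) -> float:
    """Return the fraction of cores that are doing useful work."""
    # Reuse the validation in idle_cores before dividing.
    idle = idle_cores(total_cores, threaded_cores)
    return (total_cores - idle) / total_cores


# The article's case: a 16-core processor with functions threaded onto 2 or 4 cores.
for used in (2, 4):
    print(f"16 cores, {used} busy: {idle_cores(16, used)} idle, "
          f"{utilization(16, used):.1%} utilized")
```

With four busy cores a quarter of the processor is in use; with two, only an eighth.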

Virtualization hasn’t worked out as planned, either, despite its success in the server world. The solution in enterprise IT departments has been to virtualize servers to improve utilization of the servers, which typically had been running at 5% to 15% of capacity, by most industry estimates. While that improved efficiency in large server farms with thousands of server racks, because it reduced cooling costs and the cost of powering the servers, the strategy is actually inefficient at the processor level because all the cores must be homogeneous and too many need to be in the “on” state to take advantage of this approach. Moreover, one of the limitations of multicore and many-core systems is the shared memory.

With wide I/O, more dedicated memory and heterogeneous cores sized for specific applications, performance can be ratcheted up significantly while simultaneously reducing power. That basically turns a core into a subsystem, and one that may or may not be independently designed by a third party and tweaked slightly for re-use.

“The tricky part is what happens near the interface,” Gianfagna said. “Timing, power and the performance of a subsystem, or even a block, are now affected by its neighbors. That means you have to re-check it in the context of full chip integration. We will need tools at the subsystem and the system level to do that. In my opinion, that’s a huge opportunity for EDA. It’s also a modeling and methodology challenge.”

Signal traffic also is affected. Sonics’ Mac Hale said connectivity is one of the top issues that needs to be addressed as IP is combined into subsystems. “We need much more flexible interfaces to deal with this,” he said. “System-level IP is becoming much more important these days, and that includes subsystems. We need to understand how different subsystems interact.”

Impact of 3D stacking
3D stacking and Wide I/O are expected to bolster sales of pre-integrated IP even more. While solving the issue of traffic bottlenecks, they also significantly raise the complexity of the interactions.

“It’s not really off-the-shelf subsystems,” said Cadence’s Hand. “They have to be tweaked by traffic patterns. Long-term there may be a whole memory subsystem, but right now it’s getting together pieces that work. That could include a Qualcomm baseband subsystem, which is incredibly complex. It also could be a compute subsystem that includes a processor and graphics chip. But while there is a demand for off-the-shelf IP that works together, customers are still wary of taking everything off the shelf.”

In one respect this is a significant market shift. From an EDA tools perspective, however, it amounts to a tweak—at least for the moment.

Michael Buehler-Garcia, director of Calibre design solutions marketing at Mentor Graphics, said that whether it’s blocks or subsystems or even moving devices onto printed circuit boards, the basic idea hasn’t changed. As a result he believes many of the tools at the back end of the design should work fine, at least until stacking of chips begins over the next couple of years.

“At that point you’re going to be doing the kind of make vs. buy decisions that you’re doing now with third-party IP, but it could be a proven die instead of IP,” Buehler-Garcia said. “From an EDA perspective, until you are doing tradeoff analysis and tuning with the TSV (through-silicon via), the tools we have now will work with extra scripts.”

Conclusions
The push toward subsystems will continue unfolding over the next few years, driven by ever-increasing complexity and an understanding of where companies truly add value and where they’re adding a function that is required by a particular market segment. That makes subsystems a design shortcut, and one that is particularly useful when the marketing department adds another requirement late in the design cycle.

“An IP subsystem becomes the bridge between the system-level design and the implementation,” said Synopsys’ Koeter. “You get a virtualized model of the IP subsystem, accelerated chip-level verification and you start seeing software integration of the stack.”

It also becomes a business opportunity in its own right for companies that can build these flexible subsystems, and for those that can sell them in a coherent way.

“The market opportunity is for a catalog of proven silicon so you pick out what you want,” said Mentor’s Buehler-Garcia. “That is a quick way to get to market.”

What’s A Cell Phone?

Thursday, January 27th, 2011

By Ed Sperling
Just because a smart phone is sold by Verizon or AT&T mobile no longer means that it will be used primarily as a phone.

That distinction may sound trivial, but it has deep implications for the components that are used inside of these devices, how they’re used, and who wins the designs. Shifts such as this can also lead to broad changes in who buys the tools to develop the components, which tools they buy, and what sorts of flows they create with those tools.

There are several fundamental reasons why this shift is occurring, and all of them intersect and support the others.

Generation, geography and culture
First, there is a huge generational and geographical gap between what’s important in phones. For older users, voice conversations are the most important feature. For younger users, texting and games are key. And for business people on the go, the most useful features are a combination of voice and e-mail.

“This trend began in China, where a phone is not considered a voice device,” said Charlie Cheng, CEO of Kilopass. “It’s very textual and graphical. You use it for text, Facebook and browsing. Young people all use it that way, too. My kids think it’s a novelty when I call them on the phone.”

The tablet has blurred the lines even further. While most of the comparisons have been between tablets and personal computers, the real volume market overlap will be between smart phones and tablets. Both are capable of texting, videoconferencing and e-mail, and each can go places and do things that the other cannot. A tablet has huge possibilities in the business world and in places such as hospitals, where a touchscreen is preferable to a keyboard because it can be wiped clean. It’s also better for making presentations. But while it fits in a briefcase, it doesn’t fit in a pocket—something that may change as flexible screens begin production.

“A phone is no longer a phone,” said Vishal Kapoor, vice president of product management for SoC realization at Cadence. “The three most important issues are security, management of data—including how much of that is local information—and the video or graphics. Even bandwidth is no longer a problem technologically, although not all of the phones can take advantage of 4G yet.”

Power and performance
The second major change is in performance and power. While the two typically are tradeoffs on the same SoC, they’re not necessarily tradeoffs in the same package.

For the past couple of generations, smart phones have been able to hold their own as full-fledged number-crunching and computing devices. They can be used to surf the Web, do e-mail, download documents and photos, and even update those documents. While the form factor is limiting, the tradeoff in portability may suffice for executives or salespeople on the road.

But the real opportunity is less in conventional desktop or notebook computing than a raft of new applications. Apple reportedly is working with Visa, for example, on “swipe and go” technology, where a smart phone is used as a checkout device that can replace credit cards using near-field communication technology. Phones already are being used as boarding passes on many airlines, particularly in Europe.

These features are a sign of just how far performance has increased on these devices. Apple, MIPS, Freescale, ARM and Synopsys (through its ARC acquisition) all have developed very powerful multicore processors that draw very low power when used in conjunction with such approaches as power islands and power gating.

“What’s changing is that you’re starting to put a lot of personal information into these devices,” said Cheng. “There’s a lot of money involved in this and there’s a lot at stake.”

The Android effect
A third factor that is contributing to this shift is Android. The operating system developed by Google is spawning a revolution in how devices such as smart phones are used—and who wins the designs.

MIPS, which was one of the first adherents of the Android platform, is experiencing huge growth, much of it because of Android. The company’s revenues grew 44% in Q2 of 2010 vs. Q2 of 2009.

“The playing field is wide open,” said Art Swift, vice president of marketing and business development at MIPS. “What’s changed is those companies that were the leaders in the past in mobile will not necessarily be the leaders in the future. That’s especially true with tablets. Part of the market driver here was Android, and it’s wide open. There’s a whole cast of new players.”

Not all of it is happening in the usual places, either. John Koeter, vice president of marketing for IP and systems at Synopsys, says Android is opening up other markets that didn’t exist in the past.

“We’re not going to play in the shootout between MIPS and ARM,” said Koeter. “But we are seeing new markets for things like picture frames that can be interconnected and run applications. There are all sorts of new and interesting applications.”

Samsung's new Android-based Galaxy tablet.

Design challenges and opportunities
The challenge for chip companies working in these transitional technology markets is figuring out where the volume adoption will be, how to best utilize the technology to serve multiple markets, and how to add in enough flexibility so that certain features can be given priority where necessary.

In some cases this may utilize a system-in-package approach with an interposer technology or a network-on-chip architecture for improving signal traffic flow. In some cases it may be multiple cores within a complex SoC that serve the same purpose. And in the future, it will likely be multiple chips on a 3D stack, where different functions can be developed and then manufactured as needed for different markets.

What will have to change, however, will be the pace of tool adoption and development. One of the big complaints among system-level tools vendors is that not everything can be integrated into a flow because high-level tools don’t necessarily work perfectly with older tools that utilize lower levels of abstraction. That has stymied the growth of high-level synthesis and software prototyping, for example.

But these kinds of changes may bring new players into the market, raising the competitive stakes to develop chips more quickly, more efficiently and with more flexibility. That means newer tools will be required, and it can quickly force a competitive upgrade among existing companies and spur growth in areas that have been slow to develop, particularly in the ESL space.

Conclusions
So what will be the most important features on a phone in the future? That will depend to a large extent on the applications and what’s important to users and companies that buy these devices. A phone will still have to be able to make phone calls, but that may be just a lesser feature on these increasingly complex devices.

“A phone will have to be reliable and clear,” said Koeter. “But once it meets that standard, then it’s all about a whole new experience.”

Embedded Computing Down To Two Major Camps

Thursday, January 27th, 2011

By Pallab Chatterjee
The 2011 CES show was highlighted by the large number of tablet computers and mobile devices that support Internet access. The form factor for these devices is based on use models, but the computing capabilities are based on the power and operational life between charges. The platforms are drawing dividing lines between x86 cores vs. ARM cores, and CPUs vs. GPUs.

While on the high level the tradeoffs were on screen size, battery life, UI and apps available, the hardware battle for the BOM was with a much smaller set of players. These new tablets were derived from either smart phones or laptops. The ARM CPU selection is the dominant platform for the oversized smart-phone derived products. The x86 CPU selection is the dominant platform for the reduced function laptop computer products.

The ARM devices are available in multiple forms. These include straight single and multicore CPUs, embedded CPUs in graphics chips, and full mobile chipsets with memory controllers and display control. These can be found in Nvidia’s Tegra products and in parts from Marvell and Qualcomm, such as Qualcomm’s Snapdragon products. In many cases, these ARM cores are combined in systems with MIPS cores and graphics acceleration from Imagination Technologies.

The ARM platform is highly power optimized for battery operation and for the RF interface. The key driver for the platform in the tablet space is the development environment. The platform software, available from both ARM and third parties, covers bus architecture for multicore design, operating system functions and high-level applications. To complete the tablet design, there are many suppliers making compatible IP for USB, HDMI, microphones, headsets, eSATA, and other components that have firmware interoperability with ARM. The majority of the ARM-based designs shown used the Android operating system rather than cell phone OSes.

The x86 platforms are primarily coming from the business applications and higher data throughput side. The main chipset uses Intel Atom cores, and some of the “lighter” duty reduced function tablets featured VIA x86 cores. These platforms were running Windows derivatives and Linux. The highly promoted Moblin environment was not really present in the new product mix. These x86 products have separate CPUs and GPUs. They feature either Intel graphics or Nvidia graphics co-processors. These tablet chipsets are lower-power versions of those in the netbook/ultra-light laptop market, and several also included the Imagination Technologies POWERVR graphics and shader engine.

As the tablet form factor does not allow for a fan and has a thickness that is driven by the height of the peripheral connectors, balancing power for the applications is the major challenge. Some of the new tablets released played with the options of single-core CPU, multicore CPU, single GPU, multi-GPU, and a hardware codec. As one of the high-power and high-use applications is playback of video and display of high-resolution graphics, efficient utilization of the H.264 codec is key.

Based on the compression involved with the data and the length of the video streams being viewed, there are power tradeoffs between doing the video decoding in software on a CPU or a GPU and using a hardware codec. There are similar issues for the camera interface if it supports stills, video, and single/dual/triple (S3D on one side, single on other) data streams going to a single codec. The format for the tablets affects not only the configuration of the cores but also their duty cycle. If a single part gets an extended operation, this changes the thermal gradient on the die and the board, so the software has to balance the activities to help balance the heat.
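The decode tradeoff comes down to energy, which is average power multiplied by time. The Python sketch below shows how such a comparison might be set up. Every number in it is a made-up placeholder, not a measurement of any CPU, GPU or codec, and the names `DecodeOption`, `energy_wh` and `playback_hours` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodeOption:
    name: str
    active_power_w: float  # ASSUMED average power while decoding, in watts


def energy_wh(option: DecodeOption, minutes: float) -> float:
    """Energy used to decode a stream of the given length, in watt-hours."""
    return option.active_power_w * minutes / 60.0


def playback_hours(option: DecodeOption, battery_wh: float, baseline_w: float) -> float:
    """Hours of playback from a battery, given the decoder plus the rest of the system."""
    return battery_wh / (option.active_power_w + baseline_w)


# Placeholder figures for illustration only.
options = [
    DecodeOption("software on CPU", 2.0),
    DecodeOption("software on GPU", 1.2),
    DecodeOption("hardware codec", 0.3),
]

for opt in options:
    print(f"{opt.name}: {energy_wh(opt, 120):.2f} Wh for a 2-hour video, "
          f"{playback_hours(opt, battery_wh=25.0, baseline_w=1.5):.1f} h of playback")
```

Longer streams widen the gap between options in absolute terms, which is why the length of the video matters as much as the decoder chosen.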

Connecting System-Level Flows To Implementation Tools

Thursday, January 27th, 2011

By Ann Steffora Mutschler
With the complexity explosion occurring in SoC design today, there is a relentless force to push design decisions further up in terms of abstraction. Resolving issues at the gate level is not possible any more because there just isn’t enough time or resources. Further, the resulting design may not even be competitive because optimization at the gate level can leave a lot of power and/or performance on the table.

Performing optimizations earlier in the flow will almost always result in a better design, noted Mike Gianfagna, vice president of marketing at Atrenta. “The movement toward 3D IC stacks is making the whole problem even more urgent, since iterating at the detailed implementation level in a multi-chip design will make design schedules far too unpredictable.”

All of this requires system-level flows to be much more productive and efficient than ever. “A fundamental issue with these types of flows in the past was that they existed in isolation,” he said. “The user performed some exploratory work and then did it all over again during chip implementation. That ‘gap’ between system level and implementation level flows is now closing. The tool chain needs to be integrated and consistent if true efficiency is to be realized. Better modeling and standards, such as IP-XACT, are helping. The development of architectural level chip assembly tools is also making a big difference.”

Not everyone is convinced, however. eSilicon chief architect Sam Stewart pointed out that engineers who have been doing ASIC design for 15 to 20 years are sticking with what they know. “There is a lot of time spent just figuring out how to hook up all these blocks as well as understanding how each of these blocks works.”

“On all of the projects I’ve worked on, sometimes you don’t have a good handle of what is a good solution. What people do is they generally work in C or SystemC and try to capture something about how the system would work, so you could say that’s a system flow. It’s a system flow in the sense that what they are doing is capturing what an ASIC would do. You have to have something to translate what you are doing in SystemC into an implementation, which is instantiating things, and connecting these instances,” Stewart explained.

He said one customer has spent the past two years on its ASIC. If asked what they did during that time, Stewart said that they would say they spent their two years implementing the system—conceiving it and then hooking up all the various pieces on the ASIC. As for the high-level system aspect of it, they would say that was the easy part.

“The difficult part is first specifying the ASIC. There is an ARM CPU, for example. What does it talk to, what is the width of the bus, etc. Then, this block over here, what does it talk to? Our customers would claim that system flow stuff is probably not as much of a concern as whether there is an easier way to design an ASIC,” he continued. Stewart also noted that it is not clear exactly what information needs to be, and should be, carried forward.
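The specification questions Stewart raises (what each block talks to and how wide the bus is) can be captured as data and checked mechanically. The Python sketch below is an assumption-laden illustration, not a description of any vendor's chip-assembly tool. The block names, port names and widths are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    block: str
    name: str
    width: int  # bus width in bits


def check_connections(ports, connections):
    """Return a list of problems: unknown ports, width mismatches, unconnected ports."""
    by_key = {(p.block, p.name): p for p in ports}
    problems = []
    connected = set()
    for src, dst in connections:
        for key in (src, dst):
            if key not in by_key:
                problems.append(f"unknown port {key[0]}.{key[1]}")
        if src in by_key and dst in by_key:
            connected.update((src, dst))
            a, b = by_key[src], by_key[dst]
            if a.width != b.width:
                problems.append(
                    f"width mismatch: {a.block}.{a.name} ({a.width}) -> "
                    f"{b.block}.{b.name} ({b.width})"
                )
    for key in by_key:
        if key not in connected:
            problems.append(f"unconnected port {key[0]}.{key[1]}")
    return problems


# Hypothetical design: every name and width here is invented for the example.
ports = [
    Port("cpu", "bus_m", 64),
    Port("mem_ctrl", "bus_s", 64),
    Port("dma", "bus_m", 64),
    Port("usb", "bus_s", 32),
    Port("uart", "bus_s", 32),
]
connections = [
    (("cpu", "bus_m"), ("mem_ctrl", "bus_s")),
    (("dma", "bus_m"), ("usb", "bus_s")),
]

for problem in check_connections(ports, connections):
    print(problem)
```

Even a toy check like this catches the two classes of error that consume hookup time: ports that do not match and ports that were never wired at all.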

This information gap should ease as tool providers solidify flows and get the word out to customers directly, or at conferences and other events.

What designers need today in terms of strengthening the connection between system-level flows and implementation flows falls into three areas, according to Frank Schirrmeister, director of marketing for system-level solutions at Synopsys. The first area includes interfaces to the front end, such as UML. Second, the huge model ecosystem and the tools needed to enable the system-level ecosystem. The third area comes in once decisions have been made and the system-level partitioning has been determined, along with the system-level flows, so that everything doesn’t have to be re-entered at implementation.

Source: Synopsys

From left to right, the first thing needed is correspondence of IP blocks implemented from RTL down to models of those blocks, Schirrmeister said. “We all hear these numbers like, ‘We are at 60% IP re-use on the hardware side going towards 70% over the next couple of years.’ That’s all good and proper if you integrate this at the RT level. But you want to have models in the front end that represent, so that may be everything from a model for a processor core like the ARM processor cores, as they all come with the system-level models, or instruction-set simulators. You want the same for any type of peripheral.” For example, Synopsys has a USB 3.0 model that can be used to write drivers against, so the driver doesn’t have to be changed when in RTL.
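The idea of writing a driver against a model so that it runs unchanged on a more detailed implementation can be sketched in a few lines of Python. This is a generic illustration only. The register addresses, bit fields and class names are hypothetical and do not describe the Synopsys USB 3.0 model or any real USB controller.

```python
from typing import Protocol

# Hypothetical register map, invented for this example.
CTRL_ADDR = 0x00
STATUS_ADDR = 0x04
CTRL_ENABLE = 0x1
STATUS_READY = 0x1


class RegisterBus(Protocol):
    """The only interface the driver is allowed to see."""

    def read(self, addr: int) -> int: ...
    def write(self, addr: int, value: int) -> None: ...


class BehavioralModel:
    """Fast functional model: reports ready as soon as it is enabled."""

    def __init__(self) -> None:
        self.regs = {CTRL_ADDR: 0, STATUS_ADDR: 0}

    def read(self, addr: int) -> int:
        return self.regs[addr]

    def write(self, addr: int, value: int) -> None:
        self.regs[addr] = value
        if addr == CTRL_ADDR:
            self.regs[STATUS_ADDR] = STATUS_READY if value & CTRL_ENABLE else 0


class DetailedBackend:
    """Stand-in for a more detailed backend: ready only after several status polls."""

    def __init__(self, latency_polls: int = 3) -> None:
        self.ctrl = 0
        self.polls_left = latency_polls

    def read(self, addr: int) -> int:
        if addr == CTRL_ADDR:
            return self.ctrl
        if addr == STATUS_ADDR:
            if not self.ctrl & CTRL_ENABLE:
                return 0
            if self.polls_left > 0:
                self.polls_left -= 1
                return 0
            return STATUS_READY
        raise KeyError(addr)

    def write(self, addr: int, value: int) -> None:
        if addr != CTRL_ADDR:
            raise KeyError(addr)
        self.ctrl = value


def enable_device(bus: RegisterBus, max_polls: int = 10) -> bool:
    """Driver code: set the enable bit, then poll status until ready or timeout."""
    bus.write(CTRL_ADDR, bus.read(CTRL_ADDR) | CTRL_ENABLE)
    for _ in range(max_polls):
        if bus.read(STATUS_ADDR) & STATUS_READY:
            return True
    return False


print(enable_device(BehavioralModel()))
print(enable_device(DetailedBackend()))
```

The driver function never changes between the two calls. Only the object behind the register interface does, which is the property that lets software development start before RTL is available.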

Then, looking toward the middle of the diagram but still in the left-hand column of block implementation for the new blocks that aren’t re-used as IP, he explained, “you need implementation from the high-level model to the implementation, and there’s a verification linkage there and implementation linkage.” Synopsys plays in both areas with high-level synthesis tools, whereby the high-level C model is the basis for generating the implementation. As part of that, all testbenches are also generated to validate that what was implemented is correct. “It’s the same issue as RTL downward. You have RTL to implementation, RTL to gate, you have the synthesis flow and then you have functional verification by simulation and eventually equivalence checking. And the same will happen here throughout, where equivalence checking goes together with high-level synthesis,” Schirrmeister explained.

Also still in the block implementation column is processor design. If a processor designer wants an application-specific processor or a custom processor for offloading the most intense tasks, he or she takes a high-level model and generates the implementation from there. Synopsys has a C dialect language for instruction set architecture, while other vendors use the nML description for processors.

In the right column, “Software Development,” Synopsys hasn’t abstracted that much in the areas that are close to hardware, he said. “Essentially the trick is what we do today with virtual prototyping. We execute the software on a model of the hardware. We are not executing the model of the software on the model of the hardware. That would be the next step upwards.” On the software side today it is more in the software implementation, with similar linkages as for high level synthesis but for automated software implementation. “What we are following fairly closely are things like code generation from UML or Mathworks Simulink, which are equivalent to high level synthesis on the software side,” Schirrmeister said.

Moving to the middle column, “Chip Integration/HW-SW Integration” includes for Synopsys IP-XACT-compliant “core” tools (coreAssembler, coreBuilder, coreConsultant) and for Mentor Graphics, Platform Express.

The correspondence on top of that is the system-level and prototyping tools, which at the transaction level hook things together.

Synopsys is providing high-level models to system-level fabrics, such as the Sonics fabric, ARM’s AMBA fabric and the Arteris fabric. These models allow the interconnect to be configured, as well as configuration of all blocks in the design, and all of the items from the left column in the diagram fit together on the hardware side. The software is mapped onto the fabric, the design team assesses whether it will fit together, and once the system is configured appropriately, it can be linked down into a tool like ARM’s AMBA Designer, Mentor’s Platform Express, or Synopsys’ coreAssembler.
