Posts Tagged ‘NoC’

Proprietary On-Chip Connections Yield To NoC Designs

Thursday, September 22nd, 2011

By John Blyler
Interconnect technologies are nothing new at Intel. During the recent Intel Developer Forum (IDF) 2011, several processor-centric interconnect technologies were on display in the company’s Labs Pavilion. Most noticeable of these were the Many-core Applications Research Community (MARC) and its derivative, the Many Integrated Core (MIC) project.

In terms of interconnect fabric, the MARC platform relies on an open standard, the Message Passing Interface (MPI), to communicate between as many as 48 Pentium cores within a single die. The goal of this research is to develop the interconnect hardware and parallel software applications that would support the “millions of processor” program. In this activity, Intel has been working with the U.S. government on a project called Ubiquitous High-Performance Computing (UHPC).

Interconnect strategies change as vendors move from processor-centric to SoC third-party IP-based designs. While Intel laid out its SoC development strategy years ago, few details concerning the interconnect fabric have been made public. Bill Leszinske, the company’s general manager of technical planning and business development at the Atom processor SoC development group, recently revealed that the Intel interconnect fabric will serve as a “chassis” within which a variety of Intel and third-party IP can be swapped in and out for different applications. The company calls this proprietary chassis the Intel On-Chip System Fabric (IOSF). It is analogous to the ARM community’s Advanced Microcontroller Bus Architecture (AMBA) interconnect platform. Other proprietary on-chip bus structures include MIPS SoC-it and IBM’s CoreConnect, to mention a few. These buses have bridging capabilities to ARM’s AMBA bus or to the Open Core Protocol (OCP) socket standard for IP cores from OCP-IP.

Leszinske is quoted as saying that the IOSF is a scalable fabric that supports multicore operation and maintains PCI bus ordering. This last item is critical because Intel’s Atom processor uses the PCI bus to connect to the outside world, for example, to provide embedded programmability via Altera’s FPGA core (see “Intel Teams Up with Altera”). The popular PCI bus is also an important interface between ARM processors and Xilinx FPGA fabric (see “FPGAs Move to IP through Processor Interface”).

NoC vs. internal buses
The growing demand for low-power, high-performance chips is putting new demands on the on-chip IP interconnect architecture. Perhaps that is why many chip companies have migrated from internal interconnect technology to on-chip networks. This approach allows them to protect their legacy IP cores and any proprietary communication features while providing access to third-party IP vendors. But how do overall SoC networks, such as a network-on-chip, relate to proprietary buses like Intel’s IOSF or ARM’s AMBA?

Drew Wingard, Sonics’ CTO, puts it this way: “Our principal competitor is internal technology, which is typically derived from either legacy computer buses or the various flavors of ARM’s AMBA specifications. Intel’s IOSF represents such an internal technology, and their press interviews about IOSF make it clear that supporting the ordering requirements of PCI is crucial to them for supporting their large, existing software base.”

Figure: Intel’s hierarchical approach to SoC integration, with separate interconnection fabrics (networks) for Intel IP and most third-party IP.

Processor-centric companies like ARM and Intel need interconnect architectures to grow an ecosystem of third party IP providers. But these providers have widely varying communication requirements that are difficult to manage.

Here is where NoCs can be of great value. Philippe Boucard, chief architect and co-founder of Arteris, explains that before NoC technology was available, IDMs would use hybrid-bus technology to connect IPs to a centralized crossbar, which would then route the traffic throughout the chip. In the past five years, NoC on-chip interconnect architectures began to replace proprietary hybrid-bus technology.

“Our NoC IP uses Network Interface Units to convert the ARM protocol into a packetized protocol format. Instead of having a centralized crossbar, the NoC interconnects are distributed throughout the SoC. On top of that, the NoC provides several services, such as security, quality of service, software bring-up, power management, domain management, and so forth.”

There are multiple challenges facing today’s SoC designers. Chips must meet the often-conflicting requirements of low power, high performance, small die size, low cost, low heat generation and development in a very tight time-to-market period. The problem with traditional, proprietary hybrid-bus interconnects is that any change in the IP requires a physical change in the overall system topology, including the buses. With a NoC architecture, only the interconnect needs to be reconfigured.

Complex designs have spurred the growth of design re-use via semiconductor IP. To handle all of this IP, on-chip interconnects had to become more complex. Proprietary internal buses have been giving way to more open on-chip interconnect specifications. NoCs further reduce chip complexity by providing an easily reconfigured communication subsystem between the majority of IP cores on an SoC.

3D IC Stacking Challenges

Wednesday, September 21st, 2011

System-Level Design talks with Sonics CEO Grant Pierce about the challenges of stacking die, what has to change and why.


The Week In Review: July 22

Friday, July 22nd, 2011

By Ed Sperling
Synopsys rolled out its next-generation prototyping solution called Virtualizer, integrating technology it developed with the technology it acquired from Virtio, CoWare and VaST. The rollout is a strong indication of how complexity is forcing some interesting changes. Synopsys also won a deal with Ricoh for Processor Designer to speed DSP design, and another with Renesas for HAPS-64 Systems.

Cadence won a deal with Shenzhen-based HiSilicon for its Virtuoso parallel simulator. HiSilicon says the tool offers up to 24 times the performance on various circuits with different CPU configurations.

Arteris inked a deal with Toshiba for its FlexNoC interconnect IP. Toshiba will be using NoC technology in its future SoCs.

An Analysis Of Blocking Vs. Non-Blocking Flow Control In On-Chip Networks

Thursday, April 28th, 2011

High-end System-on-Chip (SoC) architectures consist of tens of processing engines. These processing engines have varied traffic profiles consisting of priority traffic that requires minimal latency, controlled-bandwidth traffic that requires low service jitter on the throughput, and best-effort traffic that can tolerate highly variable service. In this paper, we investigate the trade-off between multi-threaded non-blocking (MTNB) flow control and single-threaded tag (STT) based flow control in the realm of Open Core Protocol (OCP) [1] specifications. Specifically, we argue that the non-blocking multi-threaded flow-control protocol is more suitable for latency minimization of the priority traffic and jitter minimization of controlled-bandwidth traffic than a single-threaded tag (STT) based protocol. We present experimental results comparing MTNB against STT-based protocols on representative DTV data flows. On average, in the STT-based system, the latency of priority traffic is increased by 2.73 times and the latency of controlled-bandwidth traffic is increased by 1.14 times when compared to the MTNB system, under identical configurations.
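As a rough illustration of why a stalled request matters, the toy model below compares one shared in-order queue with one queue per thread. It is a minimal sketch, not the white paper's MTNB or STT models: the request names, thread labels and cycle counts are made up, and it assumes each thread talks to its own target, so threads can make progress in parallel.

```python
def completion_times(requests, non_blocking):
    """Return {request name: cycle at which it completes}.

    requests     -- list of (name, thread, service_cycles), all issued at cycle 0
    non_blocking -- False: one shared FIFO, served strictly in issue order
                    True:  one FIFO per thread, threads served independently
    """
    next_free = {}  # cycle at which each queue can start its next request
    done = {}
    for name, thread, cycles in requests:
        queue = thread if non_blocking else "shared"
        start = next_free.get(queue, 0)
        done[name] = start + cycles
        next_free[queue] = done[name]
    return done


# Hypothetical trace: slow bulk transfers interleaved with short priority requests.
requests = [
    ("bulk0", "bulk", 8),
    ("prio0", "priority", 1),
    ("bulk1", "bulk", 8),
    ("prio1", "priority", 1),
]

blocking = completion_times(requests, non_blocking=False)
threaded = completion_times(requests, non_blocking=True)

for name in ("prio0", "prio1"):
    print(name, "shared queue:", blocking[name], "per-thread queues:", threaded[name])
```

In this made-up trace the priority requests wait behind the bulk transfers in the shared queue but not in their own thread. The cycle counts are illustrative only and say nothing about the 2.73 times figure reported above.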


From Bus To Crossbar To Network on Chip

Thursday, November 19th, 2009

Network-On-Chip (“NoC”) technology is rapidly displacing traditional bus and crossbar approaches for SoC on-chip interconnect. What defines a NoC? Many terms are being used in the industry: On-Chip Networks, Interconnect Fabrics, Networks-On-Chips and so on. Let’s use the term “on-chip interconnect” as an umbrella name for all approaches. The Network-on-Chip is one specific architecture and is defined as “an on-chip interconnect with decoupled transaction layer, transport layer and physical layer”. This article explains the underlying concepts of Network-on-Chip technology, and details why NoC technology is superior to bus and crossbar approaches.
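One way to picture the decoupled layers in that definition is the sketch below, where each layer is a separate function and the link width can change without touching the transaction. It is only an illustration under stated assumptions: the node numbers, the address map, the one-byte command code and the link widths are all hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Transaction layer: what the IP core asks for."""
    command: str  # "READ" or "WRITE"
    address: int
    data: bytes = b""


@dataclass(frozen=True)
class Packet:
    """Transport layer: where the request travels."""
    source: int
    destination: int
    payload: bytes


def to_packet(txn, source, address_map):
    """Pick the destination node from the address and wrap the transaction."""
    destination = next(
        node for (low, high), node in address_map.items() if low <= txn.address < high
    )
    payload = txn.command[:1].encode("ascii") + txn.address.to_bytes(4, "big") + txn.data
    return Packet(source, destination, payload)


def to_flits(packet, link_bytes):
    """Physical layer: cut the packet into pieces as wide as the link."""
    raw = bytes([packet.source, packet.destination]) + packet.payload
    return [raw[i:i + link_bytes] for i in range(0, len(raw), link_bytes)]


# Hypothetical address map: two targets, each owning one address range.
ADDRESS_MAP = {(0x0000_0000, 0x1000_0000): 0, (0x1000_0000, 0x2000_0000): 1}

txn = Transaction("WRITE", 0x1000_0004, b"\xde\xad\xbe\xef")
packet = to_packet(txn, source=3, address_map=ADDRESS_MAP)

narrow = to_flits(packet, link_bytes=4)  # same packet on a 32-bit link
wide = to_flits(packet, link_bytes=8)    # same packet on a 64-bit link
print(len(narrow), "flits on the narrow link,", len(wide), "on the wide link")
```

The transaction and the packet are identical in both cases; only the physical-layer step differs, which is the kind of separation the definition describes.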


To Bus Or Not To Bus, That Is The Question

Friday, July 31st, 2009

By Ann Steffora Mutschler

When you hear the words, “block interface,” your ears may not perk up, but as system architects well understand, making the right choice between a bus and a non-bus interface on an SoC is absolutely critical to a design’s success in terms of power efficiency, reusability and performance.

How many of the problems in new chip designs have to do with the interconnect and the bus as opposed to any functional block issue is a matter of debate because it is extremely difficult to re-architect a chip once it has been done one way. The design team could start over, but most teams don’t have that choice and are left to try to fix something that isn’t quite working right. The chip probably works, but it’s a whole lot slower or uses a lot more power than the spec.

From a system design tool perspective most designs today are bus-centric, but there are problems with that approach. Steve Roddy, vice president of marketing at Tensilica, says the traditional reliance on buses—where there is one master at a time doing one transaction at a time—doesn’t scale.

“We see more and more of our customers taking advantage of the ability to add what we call designer-defined ports and queues that are interfaces on a processor specific to a given chip architecture or design,” Roddy says. “That has been one of the drivers of our business more and more as people really utilize that to either get higher performance or to better balance bus traffic or to reduce power in their system. As a general proxy, if half or more of our customers are taking advantage of that, it would suggest that it’s a fairly widespread problem and half or more of the complex design need something other than just straight, traditional buses.”

One of the big challenges, in general, with block interfaces is reuse and the ability to be able to hook up to something that can accept a variety of protocols and standards quickly, according to Charles Janac, chairman and CEO of Arteris. As such, new approaches have come into the design arena. He notes that Arteris’ technology performs protocol conversion at the edge of the network and, in effect, acts as a protocol converter so AXI and OCP IPs, for example, can run side by side without making any modification to them whatsoever.
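Protocol conversion at the network edge can be sketched as a pair of small adapters that map each protocol's request into one common internal form. The signal names (AWADDR, WDATA, ARADDR, MCmd, MAddr, MData) follow the AXI and OCP specifications, but the dictionaries, the common packet layout and the function names are hypothetical simplifications, and this is not a description of Arteris’ implementation.

```python
def from_axi(request):
    """Map a simplified AXI-style request to the common packet form."""
    if "AWADDR" in request:  # write address channel present: a write
        return {"cmd": "WRITE", "addr": request["AWADDR"], "data": request["WDATA"]}
    return {"cmd": "READ", "addr": request["ARADDR"], "data": None}


def from_ocp(request):
    """Map a simplified OCP-style request to the same common packet form."""
    cmd = {"WR": "WRITE", "RD": "READ"}[request["MCmd"]]
    return {"cmd": cmd, "addr": request["MAddr"], "data": request.get("MData")}


# The same write, expressed by two IP blocks that speak different protocols.
axi_write = {"AWADDR": 0x4000, "WDATA": 0xCAFE}
ocp_write = {"MCmd": "WR", "MAddr": 0x4000, "MData": 0xCAFE}

axi_packet = from_axi(axi_write)
ocp_packet = from_ocp(ocp_write)
print(axi_packet == ocp_packet)
```

Because both adapters produce the same internal form, the rest of the network does not need to know which protocol an IP block speaks, and the IP blocks themselves stay unmodified.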

“We’ve gotten to the point where the network interface units are very low latency and they don’t cost very many gates,” Janac says. “It’s a pretty proven technology at this point, so it’s one of the key approaches to effective IP reuse. Once you start writing different interfaces, and different wrappers, it just gets too costly and too complex in a hurry. Then you’re kind of stuck with one kind of IP. As SoCs get more and more complex, no one provides all the IP — the IP has to come from internal sources, external IP vendors, legacy, some of it is designed from scratch – and all of it has to play together as easily as possible.”

That’s not to say that bus-based architectures are always bad. But getting it right is becoming much more difficult.

“If you have a bus-based system, it may make reuse of a component easier because you plug it to a bus, you define a base address and then you program it, with all the drivers ported fairly easily,” says Frank Schirrmeister, director of product marketing for system-level solutions at Synopsys. “But if you don’t get the right bandwidth to this component at the right time in your design because you have a scenario which you didn’t foresee, then your design won’t work. That’s why you have issues with cell phones not receiving a call while taking a picture and playing a video game at the same time. There’s simply too much going on. Those are the scenarios that are easily overlooked.”
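The overlooked-scenario problem can be made concrete with a small check that enumerates which combinations of simultaneously active blocks exceed a shared bus budget. This is a minimal sketch: the block names, the bandwidth demands and the bus capacity are invented for illustration and do not come from any real design.

```python
from itertools import combinations

BUS_CAPACITY_MBPS = 800  # hypothetical shared-bus budget

# Hypothetical peak demand of each block, in MB/s.
DEMAND_MBPS = {"modem": 50, "camera": 400, "game_gpu": 450, "display": 200}


def overloaded(active, demand=DEMAND_MBPS, capacity=BUS_CAPACITY_MBPS):
    """True if the blocks in `active` together ask for more than the bus can carry."""
    return sum(demand[name] for name in active) > capacity


def failing_scenarios(demand=DEMAND_MBPS, capacity=BUS_CAPACITY_MBPS):
    """List every combination of blocks whose combined demand exceeds capacity."""
    names = sorted(demand)
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
        if overloaded(combo, demand, capacity)
    ]


print(overloaded(["modem", "camera"]))              # call while taking a picture
print(overloaded(["modem", "camera", "game_gpu"]))  # call, picture and game together
for combo in failing_scenarios():
    print("over budget:", ", ".join(combo))
```

Each block fits within the budget on its own, so the failures only show up when the concurrent use cases are enumerated, which is the kind of scenario the quote describes as easy to overlook.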

Still, complexity is driving the use of direct block interfaces.

“Two years ago, we were seeing 50 to 80 IP blocks,” says Jack Browne, senior vice president of sales and marketing at Sonics, an interconnect IP provider. “At 45nm and 32nm we are seeing up to 150 IP blocks, and there can be two dozen masters that want some share of the memory bandwidth.”

Browne notes that he is starting to see design activity pick up as people get to the point where they have to do new platforms not just derivatives of existing designs.

Who’s adopting non-bus approaches?

Early adopters of the network-on-chip approach are people designing mobility SoCs (complex application processors) that share the problems of complexity, low power, and limited space and that are made in very high volumes. As a result, they are hitting all the constraints at once. Digital television and set-top box applications are not too far behind in complexity, and are good candidates for non-bus (or direct) block interface approaches like network-on-chip.

“Once you get beyond 65-nm and below, the network-on-chip has a broad application in those kinds of SoCs,” says Janac.

Tensilica’s Roddy observes that the folks who do system modeling and iterative analysis tend to be more proactive at looking at the newer forms of on-chip interconnect. “Where people are going to make substantive changes and, let’s say, have had three successive and successful projects from older architectures and now they are entering a new market, or adopting some new standard, they recognize they are going to have to have four times as much data flow and four times as much bus traffic and know they are going to have to do something different. It is those technology dislocations and new platform designs that will cause people to look at their bag of ingredients and determine if they need to add something new.”

But even with the benefits seemingly clear, there are still roadblocks to adoption of direct block interfaces.

“The No. 1 issue is unfamiliarity,” Roddy says. “If the hardware designer and software programmer have been accustomed to a monolithic view of the world, one that has stayed largely static for 20 or 30 years, even before the notion of integrating things together in a more complex SoC, it’s kind of like everything is memory mapped, the programmer’s view of the world is relatively simplified, and he didn’t have to think about how things actually happen on chip. To the degree that the changes in underlying hardware architecture preserve that idealistic abstraction of the world for the programmer, all the better. If you can have a dedicated link in the hardware that recognizes a particular type of transfer is being requested and gets mapped to a specific hardware channel as opposed to the common bus channel, that would certainly make life easy for the great mass of programmers. And there are always 10 or 100 programmers trying to write code for something versus one hardware designer trying to build something.”

NoC Your SoCs Off

Thursday, February 19th, 2009

By Ed Sperling

The network on a chip (NoC) approach is gaining ground as an essential part of a system on a chip (SoC), providing the same kind of time-to-market advantage that well-tested intellectual property blocks provide.

This follows almost eight years of hype about NoCs’ potential with little to show for it. Times have changed and there appear to be two main drivers, one technological and the other business-related. From a technology standpoint, the real key is that chip designs are becoming far too complex to create all the interconnects necessary to get an SoC out the door on time and on budget. From a business perspective, the downturn has cut into staffing of design teams so severely that most companies don’t have the manpower left to develop complex interconnects on a chip that also has multiple cores, multiple power islands, as well as shared busses and memory.

“The key trend that makes such technologies more important is simply the increasing levels of integration, which significantly increase the amount and complexity of the on-chip communication—particularly in the sharing of key resources such as external DRAM,” said Jim Hogan, a venture capitalist familiar with this market. “This complexity permeates every part of the SoC design, from the increasing fraction of circuit delay due to wiring at deeper process nodes up through the massively deeper pipelining required to keep modern DRAMs operating at high efficiency, to the QoS scheduling required to ensure that general purpose software on CPUs can co-exist with real-time communications and multimedia traffic. NoCs provide a structured framework for managing these growing complexities and will therefore become the dominant approach for complex SoCs.”

But structured does not mean standardized. Far from it, in fact. While NoCs fit into standardized EDA flows and work with standards, they are one of the key components that must radically change from design to design.

“At 45nm, and with some designs at 65nm, companies have started to see issues with interconnects,” said Charlie Janac, CEO of Arteris. “Projects cost more, they last longer, or they’re being canceled. There’s more problem solving, and the interconnect is more important. When we had single-core chips, it was a choice between a mainframe versus distributed network computing. Now we’re dealing with four to six cores, algorithmic engines, graphics, peripherals and on-chip/off-chip memory. All of this requires more communication on a chip.”

Defining NoC

So what exactly is a NoC? Definitions vary, and likely will evolve as NoCs become both more necessary and more widely deployed. And some of the standard definitions are fuzzy at best. Wikipedia, for example, defines a NoC as “an emerging paradigm for communications within large VLSI systems implemented on a single silicon chip.”

Most chip architects view NoCs as more of an evolutionary step than a radically new concept, though, with the difference being that a NoC is now a discrete part of the development process instead of being included as a piece of something else.

“I like to use the phrase ‘network on chip’ to describe what we do and have been doing for a few years,” said James Aldis, SoC architect at Texas Instruments. “My definition is based around the idea of the NoC being a separate component in the top-level assembly, with a point-to-point interface to each other top-level component. This is distinct from a traditional ‘bus’ where the bus is the top-level assembly. The alternative view is that a NoC is really something with a network-style architecture, where you send out bus requests and responses on the same wires. This alternative view means that the external interfaces of the NoC are not traditional ‘bus-style’ but rather ‘network-style.’ Transactions are captured in packets rather than being represented by separate address, data and command busses. This alternative view is not yet real in the IP industry. You can’t buy IP with this sort of interface on its boundary. It may be used internally in some companies.”

The NoC is particularly attractive at advanced process nodes because of the increasing complexity and the ability to isolate some of that complexity in the network.

“With the advent of SoCs, a lot of complexity has moved into the interconnect. No one building such chips is really using the old ‘bus’ paradigm anymore,” said Geert Rosseel, senior director at Pixelworks. “The interconnect now has to manage communication between IP blocks having very heterogeneous bandwidth and latency requirements and possibly living on different clock and power domains. The interconnect is now managing CPU-type requests with networking and real-time media (video and audio) traffic, usually all directed to shared resources such as memory. In my opinion, everyone building an SoC is already implementing some kind of complex on-chip communication system.”

But the NoC takes that one step further.

“What sets the concept of a NoC apart is the idea of developing an architecturally clean and unified approach to solving this problem,” Rosseel said. “You put all communication complexity in the network with the IP conforming to some simple interface standard. Once you have this ‘clean’ separation, you can develop an interconnect based on internal protocols that are optimized to meet the performance, area and power requirements.”

Looking forward and backward

The final caveat for most NoCs is that they have to embrace both new and existing technology. That includes a number of existing on-chip protocols, such as the Open Core Protocol (OCP), ARM’s Advanced eXtensible Interface (AXI) and Advanced High-performance Bus (AHB), as well as an alphabet soup of proprietary and lesser-known acronyms.

Ian Mackintosh, chairman of OCP-IP, said the real key is to maintain openness, while embracing existing standards. “The world is heterogeneous,” Mackintosh said. “People have worked up from single bus generators to intelligent networks on chips where you need predictive performance of the NoC.”

OCP-IP has been working on a way to standardize NoC benchmarking to help sort through years of attempts to get this right. For further reading on this subject, check out the white paper entitled “An Initiative Towards Open Network-on-Chip Benchmarks.”

The Quest For Faster Data Throughput On A Chip

Thursday, February 19th, 2009

By Ed Sperling

As with all network topologies, the general rule is the faster the better.

Jack Browne, VP of sales and marketing at Sonics, said his customers are asking for higher-speed interconnects. “Right now we’re at 300MHz,” he said. “They want to more than double that in the very near future and eventually get to 1GHz.”

Getting to that speed is no simple matter, and several approaches are under consideration.

One approach now being tested is a wireless network on a chip. Intel, STMicroelectronics and Philips are all experimenting with these techniques, sources say. And in the commercial NoC space, companies such as Arteris, Sonics, Silistix and Inventure are working on similar technologies.

Partha Pande, assistant professor at Washington State University, said it’s too early to tell which approach will win. “This is a big research problem,” Pande said. “On-chip wireless networks are very promising. The big problem there is the on-chip antennae and how small you can make them. One approach is carbon nanotubes, but there are manufacturing problems.”

Serialized packets are another approach, but the tradeoff so far has been increased latency. At least part of that is caused by the complexity of designing systems with dedicated wires, shared busses and segmented busses, as well as algorithms that do not take advantage of all the options. Parallelization remains one of the chief conundrums for all levels of chip and software design.

Brad McCredie, IBM’s chief architect for the Power6 chip, said what is happening on a chip becomes evident when you look outside the chip, because everything is being consolidated into the chip.

“There’s been a lot of research into optical and on-chip optical, but economics never let that happen,” he said. “Whether it happens in the future we don’t know. But between chips, there is a firm direction toward a parallel bus. In cluster configurations we’re seeing packets.”

He said IBM currently is working on 3GHz packet-switch networks on chip for DARPA. But those chips are using parallelized packet switching. The bulk of the work so far has been serialized, and experts say that has created latency issues.

“The main bottleneck right now is parallelizing software,” said Pande. “This is a very hot research topic right now. Packets are another big research problem.”

One approach is to divide the packets into six parts, slimming down the data being sent and avoiding storage of the packets in cache. But Pande said there is still an enormous amount of work to be done, and so far there is no clear winner emerging from the research.

The Trouble With On-Chip Interfaces

Wednesday, December 17th, 2008

By Ed Sperling

The trouble with standards is that many of them arise out of need rather than through careful planning, and often unilaterally.

The typical scenario in chip design is that a company has an issue to solve, so it comes up with a solution. When it gets what it believes is critical mass behind the standard, the company that developed the solution opens it up to the rest of the industry, hoping that it will either attract new customers or get enough of a jump on the market to create incremental business.

This has been repeated with languages—hardware description and software programming, to name a couple—as well as intellectual property and just about every other tool used in chip design, development and verification. And when there is more than one approach, those competing and often incompatible technologies are typically integrated so that everything can work together and the industry can move on to the next challenge.

That appears to be happening now in the on-chip interface world, where ARM’s AMBA, IBM’s CoreConnect and OCP-IP are all battling for attention. Both AMBA and CoreConnect are entrenched in their individual markets, but with multicore chips becoming common, the separate approaches present challenges to engineers.

“All of this technology is good,” said Sudeep Pasricha, author of the book “On-Chip Communication Architectures: System on Chip Interconnect” and an assistant professor in Colorado State University’s department of electrical and computer engineering. “The bad is there are a lot of issues making it all work together. If you integrate an ARM core with a CoreConnect bus standard, there’s a mismatch of protocols. You can fix it. You can develop components that work with the different standards. But it’s expensive and it takes time.”

Multicore Multiplexing

The problem gets exponentially worse in multicore chips, where every device is basically a network on chip running under a system on chip. Cores need to communicate across that network, but frequently they are heterogeneous collections of IP. That means multiple vendors building technology on a single substrate using different protocols and interfaces. The opportunity for confusion increases with every core.

In fact, Pasricha said IBM is in the process of developing its own NoC for the Cell processor that uses packet switching for chips with 50 to 100 cores. The interface is being custom-developed by IBM, he said.

OCP-IP, meanwhile, is looking to represent the middle ground in all of this, raising up the level of abstraction by adding connections in much the same way that middleware does for disparate application software. “Our approach was to develop a socket to deal with all kinds of IP, whether it’s a graphics processor or a media processor,” said Ian Mackintosh, chairman of OCP-IP. “AMBA is very well accepted around the processor subsystem, but OCP (Open Core Protocol) will handle the broader system better. We also have worked closely with OSCI (the Open SystemC Initiative) so we are TLM 2.0 compatible. Our TL3 is compatible with TLM 2.0.”

OCP-IP currently is benchmarking NoCs to ensure there is no performance degradation when various interfaces are used. “This is becoming critical because of the diverse sets of IP that are being used,” said Mackintosh. “We’re not dealing with just one processor anymore.”

And just to make matters even more confusing, the industry isn’t dealing with a single NoC approach, either. In addition to IBM’s new NoC and Texas Instruments’ OMAP platform, there are four other commercial NoC players: Sonics, Silistix (United States), Arteris (France), and Inventure (Japan).

The bottom line: Even as we resolve some of the confusion, more is being added.