New Challenges For Hardware Engineers
Tuesday, September 16th, 2008
It used to be fun to be a chip architect. You could wake up in the morning, grab a cup of strong black coffee and run through a few power and performance tradeoff calculations before deciding on the high-level architecture. That would set the engineering direction for months, if not years. On a good day, after introducing a steady infusion of caffeine into your bloodstream, you felt like the all-powerful creator of an electronic universe.
That dream job began showing its first signs of vulnerability at the 130nm process node, especially as the SoC began emerging as the leading design platform. The job description weakened further at 90nm, and by 65nm it had devolved into something far less satisfactory. The trend only gets worse from here. More people are entering the conceptual design phase of building a chip with each rev of Moore’s Law. Suddenly, there are people talking about power budgets and yield, and verification engineers trying to build in ways to solve their problems earlier. Managers are screaming for first-time silicon success. And software engineers (who, incidentally, no one has ever understood very well) are now sitting at the table at initial conception, slurping Diet Coke or Mountain Dew, and speaking a language no hardware engineer can understand.
Welcome to the brave new world of hardware engineering. It’s called system-level design, and it’s become so complex that just to get the job done now requires steady and concurrent input of multiple disciplines. Engineers are struggling to keep up with multiple power domains, multiple cores that exist only because classical scaling for performance died at 90nm, and timing issues that get complicated by shared busses, shared memory, and shared resources within engineering groups.
“The technologies for low-power design are well understood for silicon,” says Nikhil Jayaram, director of CPP engineering at Cisco Systems. “The challenge is in the complexity of those technologies. You have to ask yourself, can you pull it off in a reasonable design cycle?”
The answer is always yes, of course, but the cost is not always easy to swallow. Complexity is measured in terms of additional resources. Jayaram said that number is about 20% to 50% extra per design, depending upon the complexity of the design itself. Why? “You have to buy more tools and use more people.”
There are plenty of tools, too. In order to address this complexity, vendors have been introducing a steady stream of new tools that raise abstraction levels or combine multiple tasks. Those go hand in hand with new standards such as TLM 2.0. But the learning curve on these new tools and standards is quite steep, demanding time from engineers who are hard pressed already. Even the IP that is supposed to simplify chip design and development is so complicated that it often needs additional IP just to be able to ensure it can be debugged or manufactured properly.
One verification engineer at a very large, well-known chip maker (he asked for anonymity because he didn’t get approval from his bosses before talking to System-Level Design), said overload is becoming a serious issue among engineers.
“Designers are required to become experts in three completely different languages that the industry has standardized on as mainstream,” says the engineer. “The languages are SVA (System Verilog Assertion) for the assertion-based methodology, SV (System Verilog) for the testbench methodology, and C/C++ for system-level hardware/software verification. A verification engineer cannot get by without becoming an expert in these three languages. The way to deal with this is through the right schooling so that engineers come out with the expertise in all three. Standards have definitely helped with this. The frustration of course will be for the engineers that are on the job for many years and now need to become skillful in three different areas. As things are today, I am finding it very difficult to justify all three methodologies to my customers and they are missing out on quality because of this.”
That’s only part of the problem in verification. While five years ago engineers were complaining about getting too little data back from foundries such as TSMC, UMC and Chartered Semiconductor, they’re now complaining about being flooded with data. There are volumes of it, literally, and there’s no way other than plain luck to pinpoint a bug without running tests on broad areas of that data. TLM 2.0 purportedly will help (see related story), but the learning curve for TLM 2.0 tools is fairly steep. How do you construct a test model, for example, using object-oriented code?
There’s a reason why verification is still 70 percent of the NRE time budget and cost for developing new chips. Despite throwing lots of money, resources, and the best minds in the world at the problem, that number hasn’t budged much.
IP, Verification IP, and insurance IP
Nowhere is this overload more evident than in the IP world. Why write a piece of code for a standard interface or a piece of memory if someone with experts on the bleeding edge of technology has already done it? That way of thinking is growing. IP is a big market, but the problems that surfaced five years ago, when companies bought advanced IP only to face challenges and potentially huge expense getting it to work, remain enormous.
Buying IP isn’t like buying a pair of shoes. It’s more like setting up a deep partnership that lasts for the life of a chip’s many iterations. And getting those partnerships to work properly can be a time-consuming process. That explains why many of the smaller IP companies have evaporated even though a decade ago pundits said the low barrier to entry for IP startups would create a vast array of parts that could be simply plugged into a system on chip. Things didn’t work out so well in the real world.
“When you walk into a partnership you need to get a complete match on the methodologies and tool sets,” said an engineer, who spoke on condition that he not be named. “This is soooo difficult. Very high level managers are finding themselves bleeding trying to make this work. Your tool set may be delivered by multiple vendors in addition to internal tools. Internal tools cause even more problems that are related to support, IP, etc.”
The engineer noted that standards will help solve this: everything from standard formats to standard languages and standard methodologies, which is what the new verification IP committee is trying to tackle.
Business, As Usual?
Beyond all of this, there is the incursion of the business groups. It was bad enough to build chips that worked. Now they have to be built on time, within a financial budget, and they have to include more complex technology and tricks than ever before.
One solution for keeping chips in budget is using the lowest-cost tools. The problem with that approach, say engineers, is that not all tools share exactly the same functionality. So what happens when you run simulators such as VCS (Verilog Compiler Simulator, formerly from Chronologic but now owned by Synopsys), IUS (Cadence Incisive Unified Simulator), and Mentor Graphics’ ModelSim? The answers to that question vary by project, and frequently within the same project.
But no matter how bad it looks, at each new process node there will be more cooks in the kitchen. You can fight it, ignore it, embrace it, but know that only the last choice is the right answer.
–Ed Sperling
Cross-Talking with TLM 2.0
Tuesday, September 16th, 2008
By Ed Sperling
It’s almost like flying over the Great Plains of the United States. On the ground it’s hard to see above the corn stalks, but in an airplane you can see the entire horizon even if you can’t see those stalks anymore.
The analogy is similar to where most of the major players in chip design say the engineering for systems on chips needs to go. With millions more gates available at each new process node, compounded by multiple power domains and incredibly complex timing issues, scrutinizing detail at the RTL or pin level is becoming less important than seeing the big picture and drilling down from there. The diagram is a lot easier to plan, follow and verify at a higher altitude, even if the details are a little blurry.
This top-down approach is the basis of the new Transaction-Level Modeling (TLM) 2.0 standard created by a working group of the Open SystemC Initiative. It’s still not perfect—in fact, some engineers say it’s a long way from that—but it’s a lot better than what was there before. And it opens the door for more concurrent design possibilities so that increasingly complex SoCs can be developed at least as quickly as previous generations of chips and pieces can be re-used much more easily.
What’s new?
The first attempt at raising the level of abstraction into what became known and overhyped as electronic system-level design, or ESL, was the TLM 1.0 standard, which was introduced in June 2005. However, TLM 1.0 didn’t allow engineers to bridge together various different tools, so verification engineers had no easy way of linking back to the design engineers or the chip architects. Some chip developers developed their own proprietary bridges between those areas, with inconsistent levels of success.
TLM 2.0 adds structure to this confusion, allowing users of tools that comply with the standards to build models that can be tested and verified across an entire system on a chip.
“The biggest challenge today is models,” said Glenn Perry, general manager of ESL/HDL Design Creation at Mentor Graphics. “If you get some from your IP vendors and some from your internal modeling group, there is no guarantee that these models work together. Historically there have been different protocols for the way these models interface and communicate with each other. TLM 2.0 provides a broad communication standard that makes it easy for these models to connect. The tools themselves—only recently a few EDA vendors have taken it into the mainstream. Design analysis, synthesis and verification.”
How chip developers define those models determines the speed and granularity for developing design components and generating test results. TLM is an abstraction of a design. It can be analyzed and simulated, and engineers can interact with it in various ways. For example, loosely timed abstractions give more general results more quickly, while more exact timing models generate more detailed results—although more slowly. These are the kinds of tradeoffs designers will need to consider in the future, as shown in the following diagram from the Open SystemC Initiative, which developed TLM 2.0.
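The speed-versus-detail tradeoff described above can be illustrated with a small sketch in plain C++. It does not use SystemC or the TLM 2.0 API; the names SimResult, loosely_timed_burst and detailed_burst, and the latency parameters, are hypothetical and chosen only for illustration. Both functions model the same burst read. The loosely timed version charges one lump latency in a single step, while the more detailed version steps through every word and records when each one completes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result record for this sketch (not part of any standard).
struct SimResult {
    std::uint64_t total_ns;                   // simulated time consumed by the burst
    std::uint64_t events;                     // steps the model had to execute
    std::vector<std::uint64_t> beat_done_ns;  // per-word completion times, if modeled
};

// Loosely timed view: the whole burst is one transaction with one lump latency.
// Fast to simulate (a single step), but no per-word timing detail is produced.
SimResult loosely_timed_burst(std::uint32_t words,
                              std::uint64_t setup_ns,
                              std::uint64_t per_word_ns) {
    assert(words > 0);
    SimResult r{};
    r.total_ns = setup_ns + words * per_word_ns;
    r.events = 1;
    return r;
}

// More detailed view: one step for the setup phase plus one step per word.
// Slower to simulate, but the completion time of every word is visible.
SimResult detailed_burst(std::uint32_t words,
                         std::uint64_t setup_ns,
                         std::uint64_t per_word_ns) {
    assert(words > 0);
    SimResult r{};
    std::uint64_t now = setup_ns;
    r.events = 1;  // the setup phase counts as one step
    for (std::uint32_t i = 0; i < words; ++i) {
        now += per_word_ns;
        r.beat_done_ns.push_back(now);
        ++r.events;
    }
    r.total_ns = now;
    return r;
}
```

In this toy model both views agree on the total time for the burst. They differ in how many steps the simulator must execute and in how much intermediate timing is available to inspect afterward.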
Regardless of which approach is taken, however, all the tools have to work in the same sandbox. “There are two kinds of design flows,” said Jakob Engblom, technical marketing manager at Virtutech in Sweden. “One is from the system guys, who build a box, not an SoC, and then they make the hardware work. The second is from the software developers, who have to make the software work with the hardware. Clearly, there is more and more need for concurrent design, and it needs to be done on the system level, not the component level. An SoC is more than a hardware flow. There is only value at the hardware level if you can add a higher level of abstraction to run software, too.”
Engblom said that simulation at the pin level is no longer viable because it takes too long. There are simply too many parts to simulate—millions of gates, multiple power domains, complex memory and logic structures and shared busses. That job becomes even more complicated when you factor in multiple cores and the software that needs to be developed to take advantage of those cores, and in future iterations of chip development, stacked dies.
The tradeoff with TLM 2.0 is using a loosely-timed abstraction. At every level you gain performance and lose detail. But you can connect that to more detailed models where you need them. This is a standard, not a device library. You use it to build models and test at the level of abstraction you need. There’s still no getting around creating a system map and a lot of hard work, but the choice is simulating small with a great level of detail or simulating the big picture with a lower level of detail.
Debugging
That’s particularly useful in the verification world, which accounts for about 70 percent of the time it takes to develop a chip. While models have to be relatively accurate in the design phase, they have to be 100 percent accurate in the verification phase. There is little future for companies that develop chips that do not comply with the original design, or which fail unexpectedly.
What has been frustrating in this area, however, is that verification engineers are working with massive amounts of data and no effective way to pinpoint where bugs are. As a result, they have to pore through all the data to find the bugs. Functional verification proposes to move verification up a level of abstraction, as well, but the models still need to be integrated into the overall system. TLM 2.0 supports those kinds of models, which ultimately may reduce the time it takes to debug complex chips.
“What’s changed is that now you can build all the models in a way that’s useful for them,” said Mike Meredith, president of the Open SystemC Initiative, which created the TLM 2.0 standard as part of a working group involving all the major EDA vendors as well as companies such as STMicroelectronics, Broadcom, Texas Instruments, Infineon, and an array of ESL startups such as Virtutech and Meredith’s own Forte Design Systems. “The standard is agnostic about the processor and the busses.”
TLM also allows engineers to describe a test bench in transaction terms. That means data can be looked at functionally rather than trying to understand the individual bits in a transaction. But you can have a TLM-based test bench and still be relatively lost if you haven’t evolved the debug platform, said Perry. “One nice addition is the debug transaction interface. You can do really intelligent things with this.”
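As a rough illustration of what describing a test bench in transaction terms can look like, here is a minimal sketch in plain C++. It is not the TLM 2.0 API: the Transaction struct, the MemoryTarget class and the write_then_read_matches helper are hypothetical names invented for this example. The point is only that the test reasons about whole reads and writes (command, address, data, status) rather than about individual signal bits.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

enum class Command { Read, Write };
enum class Status { Ok, AddressError };

// Hypothetical transaction record: one functional read or write.
struct Transaction {
    Command cmd;
    std::uint32_t address;
    std::uint32_t data;
    Status status;
};

// Hypothetical word-addressed memory model that consumes transactions.
class MemoryTarget {
public:
    explicit MemoryTarget(std::uint32_t size_bytes) : size_(size_bytes) {
        assert(size_bytes % 4 == 0);  // this sketch assumes whole 32-bit words
    }

    // Executes one transaction and reports the outcome in t.status.
    void transport(Transaction& t) {
        if (t.address >= size_ || t.address % 4 != 0) {
            t.status = Status::AddressError;
            return;
        }
        if (t.cmd == Command::Write) {
            mem_[t.address] = t.data;
        } else {
            auto it = mem_.find(t.address);
            t.data = (it == mem_.end()) ? 0u : it->second;  // unwritten words read as 0
        }
        t.status = Status::Ok;
    }

private:
    std::uint32_t size_;
    std::map<std::uint32_t, std::uint32_t> mem_;
};

// Transaction-level check: write a value, read it back, compare the results.
bool write_then_read_matches(MemoryTarget& target,
                             std::uint32_t address,
                             std::uint32_t value) {
    Transaction w{Command::Write, address, value, Status::Ok};
    target.transport(w);
    Transaction r{Command::Read, address, 0u, Status::Ok};
    target.transport(r);
    return w.status == Status::Ok && r.status == Status::Ok && r.data == value;
}
```

A test written this way stays the same whether the target behind transport is a simple functional model like this one or a more detailed model, as long as both accept the same transaction record.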
The future—more speed, more tweaks
But that intelligence may be limited to certain areas of chip design. Said one engineer, who asked not to be named: “My basic complaint regarding TLM 2.0 is the TLM working group focused too much on memory-mapped buses and the SOC or single ASIC as a focus. In my opinion they left the system out of ESL. In our systems the processor is important, but the processor and its surrounding registers and such are only 10% or less of a single ASIC, and frankly are not really a challenge in ASIC design.”
The engineer said it’s more important to model a host operating system such as Windows or Linux, running specific drivers and customer applications, talking to a virtual storage host bus adapter, and running actual firmware, which is in turn talking to a virtual storage area network with hundreds or even thousands of attached devices. “TLM 2.0 helps us with a tiny, tiny sliver of that rather large task,” he said. “We will use TLM 2.0 when picking up models from vendors where it makes sense, but we will not be rewriting any code to use TLM 2.0. Nor will new code development use TLM 2.0 directly. I think the rush to standardize TLM 2.0 is premature since it really has not yet proven itself. TLM 2.0 has some good features. Defining the possible modeling levels and defining how they interact is good. The idea of sockets to bundle ports together is good, though it can add a large coding burden for someone trying to implement a module to conform to an interface. So in a nutshell, TLM 2.0 is okay for a vendor writing a handful of modules connected to an AXI bus, but it is less suited to modeling large systems with many custom or specialized interfaces. Sure TLM 2.0 can be forced to work in these situations, but in the end it is no better than what I have now.”
Already, changes are afoot to rectify some of these problems. Sources involved in the TLM 2.0 discussions say the next steps are improving performance in such areas as direct memory access, which includes access time between the CPU and the memory. In C++, performance reportedly is almost double what it is in SystemC, the basis of TLM 2.0.
There also is room for improvement in the future with network-on-chip designs, where multiple bus architectures have to be bridged together. Some of that capability is included in TLM 2.0, but expect enhancements in future updates on the standard.
In addition, there are some tricks being used by companies that currently are not included in the standard. One is to build more optimized allocators, which can speed up performance by three to five times. The trick is learning the methodology in the standard and understanding it well enough to be able to use it more effectively. For many companies working with TLM 2.0, the standard is just a starting point. It’s also a bridge point between various disciplines that have never worked seamlessly together—people with different areas of expertise who now must work on projects concurrently instead of in series.
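The article does not say how those optimized allocators are built. One common general technique, shown here purely as an assumption about what such a trick might look like, is to recycle transaction objects through a free list instead of allocating a new one from the heap for every transfer. The Payload and PayloadPool names below are hypothetical and are not taken from the TLM 2.0 standard, and the sketch makes no claim about the size of any speedup.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical transaction object that a model would otherwise allocate per transfer.
struct Payload {
    unsigned address = 0;
    unsigned data = 0;
};

// Minimal free-list pool: released payloads are reused instead of reallocated.
class PayloadPool {
public:
    // Returns a cleared payload, touching the heap only when the free list is empty.
    Payload* acquire() {
        if (free_.empty()) {
            storage_.push_back(std::unique_ptr<Payload>(new Payload()));
            return storage_.back().get();
        }
        Payload* p = free_.back();
        free_.pop_back();
        *p = Payload();  // reset fields so stale data does not leak between uses
        return p;
    }

    // Hands a payload back to the pool; the caller must not use it afterward.
    void release(Payload* p) {
        assert(p != nullptr);
        free_.push_back(p);
    }

    // Number of payloads that were actually allocated from the heap.
    std::size_t heap_allocations() const { return storage_.size(); }

private:
    std::vector<std::unique_ptr<Payload>> storage_;  // owns every payload
    std::vector<Payload*> free_;                     // payloads ready for reuse
};
```

With this pool, a model that issues transactions one after another and releases each before acquiring the next performs a single heap allocation no matter how many transactions it sends.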
“What you’re going to see is synergy in design teams as they begin talking to each other and as people learn new skills,” said OSCI’s Meredith. “As development becomes a critical part of design, you’re going to see entirely new job categories emerge.”

