Published in the Winter 2012 issue of Chip Design Magazine
Over the past decade, advances in both semiconductor process technologies and design-implementation solutions have enabled the development of highly integrated systems-on-a-chip (SoCs). Those SoCs, in turn, have fueled dramatic growth in portable consumer-electronic appliances worldwide. Yet a new challenge now threatens to both undermine design productivity and offset the integration benefits of SoCs. At 65nm and below, a combination of longer wire lengths and reduced spacing between wires has increased interconnect coupling capacitance to such an extent that it significantly affects design timing.
Increased coupling capacitance has markedly reduced the correlation between synthesis and layout results. This makes it more challenging than ever to synthesize a design whose quality of results (QoR) for timing, area and power carries through to physical implementation. Uncorrelated results necessitate numerous time-consuming iterations between synthesis, floorplanning and place and route (P&R) before a design meets strict timing, area and power goals across its multiple corners and operating modes.
To improve correlation, Integrated Device Technology (IDT)—a developer of complex, application-optimized, mixed-signal ICs for timing, serial switching and interfaces—has partially relied upon new synthesis technologies to develop its Tsi721 PCIe2-to-S-RIO2 protocol-conversion bridge. This ~16-million-gate SoC converts PCIe Gen2 at a 20-Gbaud rate to RapidIO 2 (also at 20 Gbaud) and vice versa.
Inside the device, there are eight channels for DMA and messaging—each of which can keep up with the full line rate of 20 Gbaud at 64-Byte packets or greater. Using the new synthesis technologies discussed in this article, IDT was able to reduce iterations and decrease the time and resources needed to meet its challenging design goals. Its aim was to bring the world’s first PCIe2-to-S-RIO2 SoC device to market with first-pass success.
Of course, this meant overcoming the impact that uncorrelated results have on SoC design-implementation time. Design iterations caused by uncorrelated QoR are resource intensive. They also pose a significant risk to tapeout schedules. Figure 1 compares design-implementation times for two example scenarios involving a hypothetical 65-nm SoC project. The first assumes nominal correlation of timing, area and power between synthesis and layout while the second assumes high correlation. The individual time slices represent the time spent performing synthesis and P&R. For the sake of brevity, the verification tasks aren’t shown.
Figure 1: High correlation between synthesis and layout reduces design iterations while decreasing design-implementation time.
In the first scenario, a substantial amount of implementation time and effort is needed to achieve design closure. After initial synthesis by the front-end design team, all timing, area and power goals apparently have been met. The netlist is therefore passed to the back-end design team. After placement is completed, timing violations reveal a divergence of QoR between synthesis and layout. The design’s timing goals haven’t been met after all, which means changes must be made to the register transfer level (RTL), physical constraints or both.
Once these changes have been made, the design is re-synthesized and placement is performed again. At this stage, timing violations again indicate that synthesis QoR hasn’t carried through to the layout. The iterations continue in this manner until timing closure is finally achieved and all of the design goals are met.
In the second scenario, the synthesis and layout results are highly correlated. As a result, only one pass is needed to achieve design closure. Reducing the number of design iterations slashes the implementation time and effort required for a project. Let’s see how this is accomplished.
For starters, facilitating timing-driven placement optimization in synthesis is critical. In Design Compiler 2010, Synopsys has increased delay modeling accuracy in its topographical technology. It now accounts for coupling capacitance and cell density in addition to wire lengths—a necessity for designs fabricated at 65nm and below. More accurate interconnect delay modeling improves timing accuracy while helping to identify design issues early in the flow.
Achieving much better placement optimization also means that these results can be used to create a better starting point for the design’s physical-implementation phase. Passing physical guidance to IC Compiler for seed placement makes it possible to achieve very strong correlation between synthesis and layout—enabling not only fewer design iterations, but faster placement runtimes.
Figure 2: Passing physical guidance from Design Compiler to IC Compiler leads to highly correlated timing paths.
Extensive evaluation yielded some very interesting results. Figure 2 displays scatter plots that compare synthesis delays with post-optimized placement delays for a block that has particularly challenging timing requirements. The plot on the left reflects baseline results obtained without physical guidance technology; a large number of delays fall outside the ±5% range. In contrast, the plot on the right reflects results obtained using physical guidance technology in synthesis. Here, most of the delays fall within the ±5% range. These results confirmed our expectations that highly correlated placement would translate into highly correlated timing results.
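The ±5% comparison behind such a scatter plot can be expressed in a few lines. The following Python sketch is only an illustration: the function name, the choice to measure deviation relative to the post-placement delay, and the sample delays are all assumptions, not data or methods from the IDT evaluation.

```python
def fraction_within_band(synth_delays, placed_delays, tolerance=0.05):
    """Return the fraction of paths whose synthesis delay lies within
    +/- tolerance of the post-placement delay.

    Assumption: deviation is measured relative to the post-placement delay.
    """
    if len(synth_delays) != len(placed_delays):
        raise ValueError("delay lists must have the same length")
    if not synth_delays:
        raise ValueError("at least one path is required")
    within = sum(
        1
        for synth, placed in zip(synth_delays, placed_delays)
        if abs(synth - placed) <= tolerance * placed
    )
    return within / len(synth_delays)


# Made-up path delays in nanoseconds, for illustration only.
synth = [1.00, 2.00, 3.00, 4.00]
placed = [1.02, 2.30, 2.95, 4.50]
print(f"{fraction_within_band(synth, placed):.0%} of paths within +/-5%")
```

A higher fraction corresponds to a tighter cluster of points around the diagonal of the scatter plot.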
Moreover, the worst negative slack (WNS) and total negative slack (TNS) of the post-optimized placement timing with physical guidance improved over the baseline case by 27% and 37%, respectively. Clearly, when synthesis passes physical guidance to placement, it creates a better starting point for physical implementation, and better post-placement QoR is a natural byproduct.
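For readers less familiar with these metrics, WNS is the most negative path slack and TNS is the sum of all negative path slacks. The Python sketch below shows one way to compute both and to express an improvement as a percentage reduction in magnitude. The function names and slack values are hypothetical, and the article does not state how its 27% and 37% figures were calculated, so the percentage formula is an assumption.

```python
def worst_negative_slack(slacks):
    """Most negative slack, or 0.0 if no path violates timing."""
    return min(min(slacks), 0.0)


def total_negative_slack(slacks):
    """Sum of the slacks of all violating (negative-slack) paths."""
    return sum(s for s in slacks if s < 0)


def improvement_pct(baseline, guided):
    """Percentage reduction in the magnitude of a negative-slack metric.

    Assumption: improvement = (|baseline| - |guided|) / |baseline| * 100.
    """
    if baseline == 0:
        raise ValueError("baseline has no negative slack to improve on")
    return (abs(baseline) - abs(guided)) / abs(baseline) * 100


# Made-up endpoint slacks in nanoseconds, for illustration only.
baseline_slacks = [-0.4, -0.1, 0.2, -0.5]
guided_slacks = [-0.2, 0.1, 0.3, -0.3]

wns_gain = improvement_pct(worst_negative_slack(baseline_slacks),
                           worst_negative_slack(guided_slacks))
tns_gain = improvement_pct(total_negative_slack(baseline_slacks),
                           total_negative_slack(guided_slacks))
print(f"WNS improved by {wns_gain:.0f}%, TNS improved by {tns_gain:.0f}%")
```

Because TNS aggregates every violating path, it reflects how widespread the violations are, while WNS reflects only the single worst path.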
Figure 1 highlighted the synthesis-P&R iterations. But it didn’t call attention to the synthesis-floorplan iterations that often occur during each synthesis step. Synthesis frequently reveals design issues, such as timing violations and routing congestion, which require changes to the floorplan. In a traditional flow (depicted in the top scenario of Figure 3), the design is passed to the back-end design team. That team makes the needed adjustments to the floorplan. Next, the design is passed back to the front-end design team for re-synthesis based on the new floorplan constraints.
Figure 3: Access to floorplanning from inside synthesis reduces the impact that synthesis-floorplanning iterations have on design implementation time.
This process continues until the synthesis QoR goals are met. These iterations also can occur during the synthesis phase of subsequent synthesis-P&R iterations. For example, a routing-congestion issue could surface during P&R that requires RTL changes and re-synthesis.
All of the back-and-forth handoffs between the front-end and back-end design teams incur extra delays. Thus, another way to improve SoC design productivity is to provide instant access to floorplan creation and modification from inside the Design Compiler environment. This approach, which is shown as the bottom scenario of Figure 3, avoids the inefficiencies involved with synthesis-floorplanning handoffs. It also leads to faster convergence to an optimal floorplan.
With pushbutton access to floorplanning, RTL designers can easily explore a range of floorplan options. They also can fix any issues from within synthesis, thereby creating a better starting point for physical implementation. In addition, routing-congestion prediction and mitigation capabilities make it easy to identify potential congestion hot spots—whether caused by the floorplan or highly interconnected logic structures in the netlist. These capabilities also make it easy to perform targeted optimizations to remove the congestion before the handoff to P&R. This further reduces the number of synthesis-P&R iterations.
Aggressive SoC design goals for 65nm and below must be met in a timely manner that makes the most efficient use of IDT’s design resources. To accomplish this, it’s essential that the synthesis solution achieve excellent QoR and preserve these results downstream in the physical-implementation flow. By creating a better starting point for physical implementation, Synopsys’ topographical technology with physical guidance enables highly correlated results between synthesis and P&R. The result is better design timing and up to 2X faster placement. Deploying the new synthesis technologies at IDT has minimized design iterations, helping the company mitigate project risk and meet design goals in shorter timeframes.
Stacy is a Staff Design Engineer with Integrated Device Technology working with the Ottawa Design Center (formerly Tundra Semiconductor). Her responsibilities include synthesis & timing constraints, formal verification and Static Timing Analysis. She has a BS in Electrical Engineering from the University of Southern Maine in Gorham, Maine and began her career working for Quadic Systems, a small design center that was acquired by Tundra Semiconductor back in 2000.
Chris Allsup, marketing manager in Synopsys’ synthesis and test group, has more than 20 years of combined experience in IC design, field applications, sales, and marketing. He earned a BSEE degree from UC San Diego and an MBA degree from Santa Clara University. Chris is a member of the IEEE Computer Society and has authored numerous articles and papers on design and test.