Chip Design Magazine


Posts Tagged ‘Nvidia’

Blog Review – Monday, April 10, 2017

Monday, April 10th, 2017

This week, there are traps and lures in the IoT, as discussed by ARM and Maxim Integrated; Xilinx believes a video tutorial is a good use of time; get cosy with SNUG for some insight; and ON Semiconductor is keeping an eye on you.

Beware of delivery men bearing IoT gifts, warns Donnie Garcia, ARM, who also looks at trap doors and NXP’s Kinetis KBOOT bootloader to foil a new attack vector, and advertises a related webinar on April 25.

Nagging parents had the right idea, decides Russ Klein, Mentor Graphics, remembering entreaties to turn off lights. He now applies their energy-saving advice to SoCs and embedded systems, with the help of the Veloce emulator.

Gabe Moretti, Chip Design, gets a bit saucy, trying to figure out just what Portable Stimulus is. He gets down to the nitty gritty with how the Accellera Systems Initiative can help, but still believes some areas need to be attended to. Let’s hope the industry pays heed.

More warnings about connected devices come from Kris Ardis, Maxim Integrated. While a Jacquard print may not be to everyone’s taste, the idea of protecting the IoT and its data has universal appeal.

The appeal of Agile design is not lost on Randy Smith, Sonics, who writes about the concept and Agile software development. He deftly dives into advances in Agile hardware design and IC methodology for Agile techniques – keeping every design engineer on their toes.

A visit to ISC West, the security expo, has made Jason Liu, ON Semiconductor, think about surveillance systems, as he throws a spotlight on one of the company’s introductions.

14 minutes does not sound like a long time to pack in all you need to know about Zynq UltraScale+ MPSoCs and Vivado Design Suite, but Steve Leibson, Xilinx, points readers towards an interesting, informative video, which he describes as a fast and painless way to see the development tools used in a fully operational system.

It sounds like a self-satisfied neck-warmer, but SNUG (Synopsys User Group) events can be informative. Tom De Schutter attended the one in Silicon Valley and relates what he learned from the technical track with experts from ARM, NVIDIA, Intel and Synopsys about prototyping latch-based designs, ARM CPU and GPU increasing densities and more besides.

Striving to improve the lot of IoT designers, John Blyler, Embedded Systems, talks to Jim Bruister, SOC Solutions, about markets, licensing, open source and five elements that will drive improvement.

Compiled by Caroline Hayes, Senior Editor

Blog Review – Tuesday, November 22, 2016

Tuesday, November 22nd, 2016

New specs for PCI Express 4.0; Smart homes gateway webinar this week; sensors – kits and tools; the car’s the connected star; Intel unleashes AI

Change is good – isn’t it? Richard Solomon, Synopsys, prepares for the latest draft of PCI Express 4.0, with some hacks for navigating the 1,400 pages.

Following a triumphant introduction at ARM TechCon 2016, Diya Soubra, ARM, examines the ARM Cortex-M33 from the dedicated co-processor interface to security around the IoT.

Steer clear of manipulating a layout hierarchy, advises Rishu Misri Jaggi, Cadence Design Systems. She advocates the Layout XL Generation command to put together a Virtuoso Layout XL-compliant design, with some sound reasoning – and a video – to back up her promotion.

A study to save effort is always a winner and Aditya Mittal and Bhavesh Shrivastava, Arrow, include the results of their comparisons in performing typical debug tasks. Although the aim is to save time, the authors have spent time in doing a thorough job on this study.

Are smart homes a viable reality? Benny Harven, Imagination Technologies, asks for a diary note for a webinar later this week (Nov 23) on smart home gateways – how to make them cost-effective and secure.

Changes in working practice mean sensors and security need attention and some help. Scott Jones, Maxim Integrated, looks at the company’s latest reference design.

Still with sensors, Brian Derrick, Mentor Graphics Design, looks at how smartphones are opening up opportunities for sensor-based features for the IoT.

This week’s LA Auto Show inspires Danny Shapiro, NVIDIA, to look at how the company is driving technology trends in vehicles. Amongst the name dropping (Tesla, Audi, IBM Watson) some of the pictures in the blog inspire pure auto-envy.

A guide to artificial intelligence (AI) by Douglas Fisher, Intel, has some insights into where and how it can be used and how the company is ‘upstreaming’ the technology.

Caroline Hayes, Senior Editor

Blog Review – Monday, October 10, 2016

Monday, October 10th, 2016

This week, bloggers look at the newly released ARM Cortex-R52 and its support, NVIDIA floats the idea of AI in automotives, Dassault Systèmes looks at underwater construction, Intrinsic-ID’s CEO shares his thoughts on security, and there is a glimpse into the loneliness of the long distance debugger.

There is a peek into the Xilinx Embedded Software Community Conference as Steve Leibson, Xilinx, shares the OKI IDS real-time, object-detection system using a Zynq SoC.

The lure of the ocean, and the glamor of Porsche and Volvo SUVs, meant that NVIDIA appealed to all-comers at its inaugural GPU Technology Conference Europe. It parked a Porsche Macan and a Volvo XC90 on top of the Ocean Diva, docked at Amsterdam. Making waves were the Xavier SoC, the Quadro demonstration and a discussion about AI in the automotive industry.

Worried about IoT security, Robert Vamosi, Synopsys, looks at the source code that targets firmware on IoT devices, and fears where else it may be used.

Following the launch of the ARM Cortex-R52 processor, which raises the bar in terms of functional safety, Jason Andrews looks at the development tools available for the new ARMv8-R architecture, alongside a review of what’s new in the processor offering.

If you are new to portable stimulus, Tom A, Cadence, has put together a comprehensive blog about the standard designed to help developers with verification reuse, test automation and coverage. Of course, he also mentions the role of the company’s Perspec System Verifier, but this is an informative blog, not a marketing pitch.

Undersea hotels sound like the holiday of the future, and Deepak Datye, Dassault Systèmes, shows how structures for wonderful pieces of architecture can be realized with the company’s 3DExperience Platform.

Capturing the frustration of an engineer mid-debug, Rich Edelman, Mentor Graphics, contributes a long, blow-by-blow account of that elusive, thankless task, which he names UVM Misery, where a customer’s bug is your bug now.

Giving Pim Tuyls, CEO of Intrinsic-ID, a grilling about security, Gabe Moretti, Chip Design magazine, teases out the difference between security and integrity and how to increase security in ways that will be adopted across the industry.

Blog Review – Monday, July 25, 2016

Monday, July 25th, 2016

This week, the blogosphere is chasing Pokémon, applying virtual reality as a medical treatment, cooking up a treat with multi-core processors, revisiting Hybrid Memory Cube, analysing convergence in the automotive market – and then there was the ARM acquisition.

Have you bumped into anyone, head down as they hunt for Pikachu in Pokémon GO? Eamonn Ahearne, ON Semiconductor, vents some frustration but also celebrates the milestone in virtual reality that is sweeping the nation(s).

Virtual reality is also occupying Samantha Zee, Nvidia, who relates an interesting, and moving, case study about pain management using virtual reality.

The role of the car is changing, and Andrew Macleod, Mentor Graphics, identifies that convergence is a driving force, boosting mobility with an increase in the vehicle’s electronics content, presenting new challenges for system engineers.

You are always going to grab my attention with a blog that mentions food, or a place where food can be made. Taylor K, Intel, compares multi-core processors to a kitchen, albeit an industrial one. Food for thought.

Virtualization is revitalizing embedded computing, according to Alex Voica, Imagination Technologies. The blog has copious mentions of the company’s involvement in the technology, but also some design ideas for IoT, device authentication, anti-cloning and robotics.

What will the SoftBank acquisition of ARM mean for the IoT industry, ponders Gabe Moretti, Chip Design Magazine. Is SoftBank underestimating other players in the market?

Still with ARM, which recently bought UK imaging company, Apical, Freddi Jeffries, ARM, has a vision – how computing will use images, deep learning and neural networks. Exciting times, not just for the company, but for the industry.

Micron’s Hybrid Memory Cube (HMC) architecture is five years old, so Priya Balasubramanian, Cadence Design Systems, delves into the memory technology which is having a mini resurgence, and the support options available.

Caroline Hayes, Senior Editor

Blog Review – Monday, June 13, 2016

Monday, June 13th, 2016

DAC 2016 highlights; Medical technology and IoT; Autonomous car market races ahead; Remote controlled beer; Secure connectivity

Distinguishing between Big Data and Business Intelligence, ScientistBob, Intel, identifies a ‘watershed’ moment for Big Data and Intel’s steps with Intel Xeon processors to deliver the next step in data analytics.

In response to FCC regulations, the prpl Foundation addresses next-generation security for connected devices. Alexandru Voica, Imagination Technologies, has collected some useful information (demo, white paper, devices, kits and links) to show the progress made.

A fascinating medical application is detailed by Steve Leibson, Xilinx, as he describes how the Zynq-7000 SoC is used in an eye-tracking computer interface. The video is a little ‘salesy’ and could have benefitted from some more examples of use rather than talking heads but has some practical engineering information about how the processing moves to the SoC.

Continuing the medical theme, Thierry Marchal, ANSYS, tantalizes readers ahead of a medical IoT webinar (June 22) by Cambridge Consultants. He has some interesting statistics to put the topic into context, some graphics and an exploration of the communications protocols involved.

The 53rd DAC saw ARM launch ARM Artisan physical IP, including POP IP, targeting mainstream mobile designs. Brian Fuller, ARM, adds some meat to the bones with comment from Will Abbey, general manager, ARM’s Physical Design Group.

Automotive design at DAC captured the interest of Christine Young, Cadence, who reports on the keynote by Lars Reger, CTO Automotive Business Unit, NXP Semiconductors. She looks at the security issues for vehicles from the family car to trucks.

Beer that comes to you takes the slog out of summer al fresco dining, doesn’t it? The Atmel team details the use of an ATmega8 MCU for a remote controlled beer crate, with a link to the build recipe list.

Here in the UK, we are knee-deep in discussions about how to get on with our neighbours as an EU membership referendum looms. A model for happy international relations is here in the blog by Devi Keller, Semiconductor Industry Association, which records the 20 years of the World Semiconductor Council (WSC).

A trip to Detroit for Robert Bates, Mentor Graphics, for its IESF conference, was a source of great material for all things related to autonomous cars. Keynotes and networking led him to consider safety and neural network questions around the technology.

Putting it all into practice, the first Self Racing Cars track event is gleefully reported by Danny Shapiro, Nvidia. There are some great images capturing the spirit of a ground-breaking event, a momentous occasion in the motorsports and automotive world that took place last weekend. Of course, the company’s technology is used and there is a handy list of what was used and where.

Caroline Hayes, Senior Editor

Smart Bluetooth, Sensors and Hackers Showcased at CES 2015

Wednesday, January 14th, 2015

Internet of Things (IoT) devices ranged from Bluetooth gateways and smart sensors to intensive cloud-based data processors and hackathons – all powered by ARM.

By John Blyler, Editorial Director

Connectivity continues to be a major theme at the International Consumer Electronics Show (CES). The only difference each year is the way in which the connectivity is expressed in products. For example, this year’s (2015) event showcased an increase in gateway networking devices that permitted Bluetooth Low Energy-equipped gadgets to connect to a WiFi router or other interfaces with the outside world.

According to a recent IHS report, the global market for low-power, Bluetooth Smart integrated circuits (IC) will see shipments rise nearly tenfold over the next 5 years. This is good news for very low power wireless semiconductor intellectual property (IP) and device manufacturers in the wearable and connected markets. One example out of many is Atmel’s BTLC1000 chip, which the company claims will improve battery life by over 30% compared with current devices. The chip architecture is based on an ARM® Cortex®-M0 processor.

Bluetooth Smart is the intelligent, low-power version of traditional Bluetooth wireless technology that works with existing smartphone and tablet applications. The technology brings smart connectivity to everyday devices such as toothbrushes, heart-rate monitors, fitness devices and more. (See Wearable Technologies Meet Bluetooth Low Energy)

For the IoT to be useful, sensor data at the edge of the connectivity node must be communicated to the cloud for high performance processing of all the IoT data. This year’s CES showcased a number of multicore 64-bit devices like NVIDIA’s ARM-based Tegra X1. Another example of a high-end computing system is Samsung’s Exynos 5422 processor, which is based upon ARM’s big.LITTLE™ technology and contains four Cortex-A15 cores and four Cortex-A7 cores. These types of products can run Android and 4K video displays on a 28nm process node.

Team mbed

Many embedded software developers enjoy the challenge of creating something new. Today, it is fashionable to call these people hackers, in part because they exhibit the prerequisite mindset, namely, “one who programs enthusiastically…”  – from the Hacker’s Jargon File, circa 1988.

Special events called hackathons have been created for these enthusiastic programmers to practice and demonstrate their skills. For example, back in August of 2014, ARM provided a group of hackers known as Team mbed™ with hardware and software development platforms for the AT&T Hackathon at Super Mobility Week. Last week, Team mbed returned to participate in the AT&T Hackathon at CES 2015. The team consisted of Internet of Things (IoT) industry participants from Freescale, Multi-Tech, Nordic Semiconductor, STMicroelectronics, u-blox and ARM. The team was supplied with a number of cool resources including ARM mbed-enabled development boards, connectivity modules, and a variety of different actuators and sensors. These resources, combined with available guidance and inspiration, enabled the developers to bring their own ideas to reality.

Following the show’s IoT theme, these software developers were given a ‘smorgasbord’ of sensors and actuators to go along with a variety of hardware platforms and I/O connectivity subsystems including Bluetooth®, cellular, Ethernet, and Wi-Fi®.  Recent projects built around this IoT platform are highlighted at (see Figure 1).

Figure 1: Krisztian Flautner, GM of IoTBU at ARM, discusses this new mbed offering that sets out to simplify and speed up the creation and deployment of Internet of Things (IoT) products

Next to connectivity, sensors are the defining component of any IoT technology. Maybe that is why sensor companies have been a growing presence on the CES show floor. This year, sensor-related vendors accounted for over 10% of total exhibitors. Much new IoT sensor technology is implemented using tiny MEMS physical structures. At CES, a relatively new company known as InvenSense announced a Sensor System on Chip that combines an ARM Cortex-M0 processor with 2 motion co-processors (see Figure 2). This combination enables 6-axis motion measurement, all in a 3mm x 3mm x 1mm package. To complete the package, this device has its own RTOS that is compatible with Android Lollipop.

Figure 2: InvenSense chip with sensors.

Such sensor systems on chip would make a fine addition for the resources available for Team mbed at their next hackathon.

ESL Market Potential

Wednesday, October 9th, 2013

By Gabe Moretti

In preparing for my first panel discussion on ESL, I spoke with Gary Smith to get his views on this segment of the EDA market.  As usual Gary was very helpful, so if I misunderstood it is completely my fault.  First the bad news: there are still no standards systems architects can rely upon.  This means that during product implementation the characteristics of the architectural design are derived in a subjective manner.  Designers implement what they understand, not necessarily what is intended.

This lack of standards could also be good news, since our industry has been quite attentive to such requirements, and certainly has the proven expertise to develop effective standards.  The experience gained by architects and designers can be used by standards developers.  We now not only understand what is required but have a better idea of what might work and what does not work.

Of course Accellera has developed standards that are useful in system level development: SystemVerilog, VMM, UVM, and IP-XACT.  But, and this is Gary’s point, they address development and verification, not architectural design.  Although there have been attempts to develop an architecture description language, those initiatives are still by and large academic exercises.

On a positive note, ESL is growing, maybe not as aggressively as once thought from a financial point of view, but its methods are based on solid and proven grounds and offer much room for expansion.  The revenue hockey stick phenomenon predicted a couple of years ago did not materialize.  But revenue from ESL tools is steadily growing and shows a definite positive trend.  Platform-based design, according to Gary, is now an established methodology and it has made possible the implementation of very complex systems while cutting development costs, sometimes by up to 44%.  Platform-based design requires both certified IP blocks and verified firmware.  Accellera is dedicating significant energy to IP issues and its work is well accepted by IP vendors.

During the discussion on ESL the panelists were very focused on the software side of the problem when it comes to verification.  Virtual prototyping and system emulation offer significant growth opportunities in EDA.  Jon MacDonald of Mentor observed that “there is a strong need for each tool to have the ability to interact with representations from other design spaces.  Engineering disciplines have been compartmentalized for a long time to address significant issues in system complexity.  Each discipline needs development tools focused on reducing the complexity of that domain.”

Gary Smith also reinforced the observation made by Frank Schirrmeister of Cadence:

“Fact is that the core customers of EDA – the semiconductor houses – have taken on over the last two decades huge amounts of additional expertise as part of their developments. Where a set of drivers and a partnership with operating system vendors may have been enough 15 years ago, today the same vendors have to provide chips with reference ports of Android, Linux, Chrome OS and Windows Mobile just to win the socket. We all have to learn and deal with the aspects of those adjacent markets as they increasingly simply become a “must Deal with” for existing EDA customers.”

Gary pointed out that the role of the semiconductor companies has changed.  The foundries have taken on an increasing role in providing IP products constituting entire subsystems that include both hardware and software components.  This change is not an overt decision on the part of foundries to change their business model.  It is, instead, the natural result of process requirements.  Processes below 45 nm require cells that are very foundry specific and have indigenous software drivers, often to address power consumption requirements.

What came out of the panel is that the electronics part of a large heterogeneous system still plays the most important role.  The electronics subsystem is fundamental to the construction of the internet of things, certainly the most promising architecture of the near and more distant future.  The strategic item in designing a heterogeneous system is the communication of information from the non-electronic subsystem to the computational part.  This is the area that offers significant growth potential for EDA companies, together with the expansion in embedded software development and verification.

During the panel discussion Bill Neifert of Carbon observed that “like it or not, software can have a significant impact on the behavior of the system. Far too often, hardware decisions are made with no real data from the software team and software teams are forced to live with decisions the hardware team made without their input.”

And Brett Cline of Forte added: “Integrating non-electronic components could help more accurately model the system prior to construction with obvious benefits. There are plenty of tools and technologies available to help model the non-electronic portions of a system. At some point, the tool flow becomes extremely complex and modeling the entire system becomes prohibitively expensive or difficult. Should EDA companies choose to tackle this problem, then getting the tool flow and integration will be paramount to being successful.”

Virtual prototyping needs to expand its role.  From an almost exclusively software verification tool it must grow to properly support “what-if” analysis to efficiently allow architects to trade-off hardware/software combinations in order to identify the optimum architecture.  This is particularly important when dealing with power consumption issues.

Of course, without cost and time constraints, architects could experiment with a number of different architectural solutions using present tools.  But such a scenario is not practical even in the academic world, let alone in the commercial one.  The creation of algorithms and computational architectures that allow modeling of complex systems in a practical manner is a challenge as well as an opportunity for EDA.

Electronics companies have been developing embedded software, called firmware, at least since the introduction of the programmable calculator in 1970.  As the IP blocks have grown to become real subsystems, they incorporate firmware.  Thus companies like ARM and NVIDIA are selling IP products that contain a significant amount of firmware.  The certification of these subsystems for each process node from the various foundries is becoming a necessary part of their commercialization.

Finally, we must realize that expanding the role of ESL tools within a design will be critical in lowering the cost of backend work.  Optimization of the layout of a chip can be simplified by a clean architectural design that avoids many of the problems inherent in an inefficient and crowded layout.

Rethinking SoC Architectures

Thursday, May 24th, 2012

By John Blyler and Staff
Virtualization and coherency, two concepts that can trace their origins back several decades, are suddenly gaining attention these days—but for entirely different reasons and uses.

A good way to think about virtualization is as an opportunistic use of available resources. Rather than waiting in a queue for a single processor core in a multicore SoC, for example, virtualization allows a compute task to take advantage of whatever processor is available if another is in use.

The concept is hardly a new one. Virtualization was invented by IBM in the late 1960s as a way of running batch processing while also still doing other work. But virtualization also creates a challenge for keeping caches in sync, which is why the concept of cache coherency was created. And as more cores are added into SoCs, rather than more processors within a single machine or on multiple machines, cache coherency has moved from mainframe to PC to processor and now across multiple processor cores.
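
To make the coherency problem concrete, here is a minimal sketch of a write-through, write-invalidate scheme, in which a write by one core discards the copies held by every other core. The `Cache` class, the `bus` list and the addresses are hypothetical names chosen for illustration; real protocols (MESI and its relatives) track more states and are considerably more involved.

```python
class Cache:
    """One core's private cache in a toy write-through, write-invalidate scheme."""

    def __init__(self, memory, bus):
        self.memory = memory      # shared backing store: address -> value
        self.lines = {}           # locally cached copies: address -> value
        self.bus = bus            # every cache attached to the same bus
        bus.append(self)

    def read(self, addr):
        # On a miss, fetch the value from shared memory and keep a copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        # Tell every other cache to drop its (now stale) copy of this address.
        for other in self.bus:
            if other is not self:
                other.lines.pop(addr, None)
        self.lines[addr] = value
        self.memory[addr] = value  # write-through keeps memory up to date


memory = {0x10: 1}
bus = []
core0 = Cache(memory, bus)
core1 = Cache(memory, bus)

first_read = core1.read(0x10)    # core1 now holds a private copy
core0.write(0x10, 2)             # invalidates core1's copy
second_read = core1.read(0x10)   # miss, so core1 refetches the new value
```

Without the invalidation loop in `write`, the second read on `core1` would return the stale value, which is exactly the situation coherency is meant to prevent.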

What’s changing now is that these concepts are spreading well beyond just the processors. Virtualization is being applied to memory, storage, I/O and graphics processing units (GPUs). But to make all of that work efficiently, coherency will have to grow well beyond just the cache, and that may prove to be a very difficult problem, particularly in a multivendor ecosystem.

The starting point
Much of this shift has come into focus inside large data centers over the past decade as a way of reducing costs. In the 1990s, the availability of inexpensive blade servers and ever-increasing density made it possible to begin replacing expensive mainframes and minicomputers with off-the-shelf commodity machines. You could stuff them into a single cabinet and blast in chilled air to cool them sufficiently. Two things happened to change this equation. One is that the cost of electricity suddenly went up, because these cabinets were running hotter than ever before as density and current leakage increased. The second was that data centers had bought so many servers over a period of 15 years that just the cost of keeping them running was beginning to show up as seven-figure annual expenses for many large data centers.

Virtualization proved an effective way of reducing that cost because it allowed data centers to increase server utilization from an average of 5% to 15% utilization all the way up to 85% or more. That meant fewer servers overall, less electricity, less heat to remove, and far more available real estate. But with that problem now under control—at least for the moment—data center managers have shifted their attention to the exponential rise in the amount of data being stored. In the 1990s, most of the data was simply text or code. It is now a combination of text, video, data and voice, raising the same kinds of fiscal red flags about powering and cooling storage as servers prior to virtualization.
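
As a rough worked example of that consolidation arithmetic, the sketch below takes a hypothetical data center of 1,000 servers averaging 10% utilization (inside the 5% to 15% range quoted above) and asks how many servers would carry the same work at 85%. The fleet size and the function name are assumptions for illustration, and the calculation ignores virtualization overhead and headroom for peak loads.

```python
import math


def servers_needed(total_work, utilization):
    """Servers required to carry `total_work` (measured in fully busy servers)
    when each server runs at the given average utilization (0 to 1)."""
    return math.ceil(total_work / utilization)


# Hypothetical fleet: 1,000 servers averaging 10% utilization.
before_servers = 1000
before_utilization = 0.10
after_utilization = 0.85

total_work = before_servers * before_utilization        # 100 servers' worth of work
after_servers = servers_needed(total_work, after_utilization)
reduction = 1 - after_servers / before_servers

print(f"{after_servers} servers, a reduction of {reduction:.1%}")
```

Under these assumptions the fleet shrinks from 1,000 servers to 118, which is where the savings in electricity, cooling and floor space come from.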

“What we’re seeing now is a move toward virtualized storage,” said Bob Pierce, flash business development group director at Cadence. “The next step is to merge storage and memory, which is why we’re seeing such strong interest in PCI Express. It’s a great transport vehicle. We’re also going to see back-end storage mixing with front-end storage data. What’s in between will be cache in the form of virtualized memory.”

This more fluid boundary between storage and cache has ramifications at all levels of design. It can affect everything from a processor to multiple processor cores on a single SoC, on multiple chips in a stacked die, and on multiple systems in a grid or mesh network.

“What’s happening is that you’re moving the back end closer to the processor,” said Pierce. “It changes the way big data and databases will be addressed in the future. If you have four CPUs, you can take them and, using PCI Express, prioritize them into a given drive sector and share them. That’s where all the VC startup money is these days. It’s the ability to configure servers for the function necessary at any given time. But you also have to virtualize storage and memory, and it has to be done dynamically.”

PCI Express has the added advantage of providing a single protocol to keep all of this data coherent. While it’s useful to store and retrieve data quickly, it all has to be updated to reflect any changes that were made in any part of the system.

Adding other resources
Mixing storage and cache is fairly obvious, though. Less obvious is the mixing of processing between CPUs and GPUs.

“In the past, GPUs were directly assigned to a virtual machine (VM),” said Sumit Gupta, senior director of Tesla GPU Computing at Nvidia. “Every VM would get a full GPU, which meant that each server was limited by the number of GPUs it could hold.”

Nvidia’s new Kepler GPU architecture uses more cores—192 vs. 32—compared with its predecessors, and a significantly lower frequency of .175GHz compared with the old 1.35GHz. The result is faster processing with less power.

“We invented several technologies in order to virtualize the GPU, including an improved Memory Management Unit in the GPU,” he noted. “This is key because most of the data acted upon by the GPU comes from memory.”

But memory is being virtualized, as well, making this whole scheme even more complicated. Startup Memoir Systems touts its solution, which it calls algorithmic memory, as an intelligent virtualization scheme for almost any available memory in a system. And there are moves afoot to do the same for the multiple I/O feeds to improve the speed of downloads and uploads from a system.

Making it all work together
While virtualization makes sense from a performance standpoint, complex systems aren’t just about performance. Coherency is a critical piece, and it’s an extremely difficult one.

“The reality is that I/O coherency has been around for a long time in the x86 world,” said Laurent Moll, CTO at Arteris (and formerly a systems architect at both Nvidia and Broadcom). “The next frontier is when you start adding in other devices, and there’s a big disruption when you’re adding full coherency between the CPU and other things. It’s easiest when you have a small team designing the cache and all the protocols. When you start plugging multiple things together it gets a lot harder. You need to be a lot clearer about the specification, the verification and the tests that need to be run.”

He said there are two key challenges in this scheme. One is simply getting it right, which is difficult for multiple companies using different teams and with different cultures. “It’s very easy to have corner cases that the guys who wrote the spec didn’t think about,” he said. The second challenge is that there is no known path to do this. Quite simply, it has never been done before.

The upside of getting this right is a huge boost in performance. Being able to utilize more resources at any time can improve speed on almost every part of a chip or system, and virtualization plus coherency is a big win for the user.

The downside is that, assuming this can be done in the first place, it also could have an impact on power. The whole goal of most advanced SoC designs is to keep the majority of silicon dark except when it’s needed, and even then to run at maximum performance for a very short time to get everything done quickly. Having more resources to manage on an ad hoc basis solves the use model issue for performance, but it can create havoc on power management schemes.

In addition, it may require new software to even work in the first place. Cadence’s Pierce said some of this won’t even make sense on platforms such as Android until the multithreaded OS release called Ice Cream Sandwich becomes more prevalent.

Computational Powerhouse Hidden In Island Jungle

Thursday, May 27th, 2010

By John Blyler
Perhaps one of the more unusual applications of serious computing power is innocuously located within the sweltering jungles of Puerto Rico. Inside a modern, air-conditioned data center in the Aeronomy Department of the Arecibo RF Telescope Observatory, scientists and engineers study the upper regions of the Earth’s atmosphere. Their goal is to help improve the reliability of radio and satellite communications for the military, security, and global telecommunications industries.

Achieving this goal requires compute-intensive systems to handle the complex algorithms and exponential manipulations associated with high-resolution RF signal analysis. What follows is a brief history of the decades-long challenge to implement the most appropriate computing technology, from early computers to today’s multicore parallel architectures.

Resolution matters
Using atmospheric radar, engineers transmit a coded RF signal into various layers of the atmosphere. They then compare it to the received signal. In the past, a digitally modulated Barker Code was used to compress the pulse width of the transmitted signal bursts, according to Arun Venkataraman, head of the Arecibo Observatory Computer Department. Pulse compression lets the radar transmit a wider pulse, and so achieve a better signal-to-noise ratio, while keeping the range resolution of a much shorter pulse, a necessity for high-distance resolution. The Barker pulse employs binary-phase-shift-keying (BPSK) modulation, which is why it’s also used in many spread-spectrum applications like the 1- and 2-Mbit/s IEEE 802.11x wireless-communications standards.
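As a rough illustration of why a Barker code suits pulse compression, the sketch below correlates a simulated echo against the 13-chip Barker sequence. The delay, the padding, and the function name are assumptions made for the example and do not describe Arecibo’s actual processing chain.

```python
# Barker-13 sequence as BPSK chips (+1 = 0 degrees, -1 = 180 degrees of phase).
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def correlate(received, code):
    """Slide `code` across `received` and return the matched-filter output.

    Output index j corresponds to a shift of j - (len(code) - 1) samples.
    """
    n, m = len(received), len(code)
    out = []
    for shift in range(-(m - 1), n):
        total = 0
        for k, chip in enumerate(code):
            idx = shift + k
            if 0 <= idx < n:
                total += received[idx] * chip
        out.append(total)
    return out


# Hypothetical noise-free echo: the coded pulse returns 20 samples late.
DELAY = 20
echo = [0] * DELAY + BARKER_13 + [0] * 10

compressed = correlate(echo, BARKER_13)
peak = max(compressed)
peak_position = compressed.index(peak)
# Convert the output index back to a delay in samples.
peak_delay = peak_position - (len(BARKER_13) - 1)
# Everything away from the peak is a range sidelobe.
max_sidelobe = max(abs(v) for i, v in enumerate(compressed) if i != peak_position)

print(peak, peak_delay, max_sidelobe)
```

The 13-chip echo collapses to a single peak of height 13 at the 20-sample delay, and no sidelobe exceeds 1. That is the sense in which a long transmitted pulse is "compressed" to the width of one chip.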

Continuing refinement in both RF performance and electronic technology has resulted in the adoption of higher-resolution (and hence longer code length) signals like the coded-long-pulse (CLP) sequence. This pulse sequence is so long that it must be decoded in sections. This requires a compute-intensive decoding process that involves the calculation of numerous autocorrelation functions. The resulting data is used to determine the intensity or concentrations of the reflective atmospheric layers all the way up to the uppermost regions (i.e., the ionosphere). All of this information is critical to understanding how radar signals interact within the different layers of the atmosphere.
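To give a feel for where the computing load comes from, here is a minimal sketch of estimating an autocorrelation function by averaging lagged products section by section. It is not the CLP decoder itself. The function name, the section length, and the test tone are hypothetical choices for the example.

```python
import cmath


def section_acf(samples, section_len, max_lag):
    """Average x[n] * conj(x[n + lag]) within consecutive sections.

    Returns a list acf[lag] for lag = 0..max_lag. Products never span a
    section boundary, so each section can be processed independently.
    """
    if max_lag >= section_len:
        raise ValueError("max_lag must be smaller than section_len")
    if len(samples) < section_len:
        raise ValueError("need at least one full section of samples")
    sums = [0j] * (max_lag + 1)
    counts = [0] * (max_lag + 1)
    for start in range(0, len(samples) - section_len + 1, section_len):
        section = samples[start:start + section_len]
        for lag in range(max_lag + 1):
            for n in range(section_len - lag):
                sums[lag] += section[n] * section[n + lag].conjugate()
                counts[lag] += 1
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical input: a complex tone that repeats every 8 samples.
tone = [cmath.exp(2j * cmath.pi * 0.125 * n) for n in range(64)]
acf = section_acf(tone, section_len=16, max_lag=8)
```

Even this toy version costs roughly sections × lags × section length multiply-adds. Because the sections are independent, the work divides naturally across many cores.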

Achieving the computing performance necessary to decode and analyze the CLP signals requires the use of two sets of Intel’s Xeon 5500 processors (16 cores each or 32 cores total).

How were these compute-intensive calculations handled before the advent of modern multicore processing systems? What options—ranging from array processors to field-programmable gate arrays (FPGAs) and graphics processing units (GPUs)—are being explored for the even-greater computational needs of the future?

Past computational technology
In the 1970s, some of the earliest atmospheric radar work at Arecibo used array processors from a company called Floating Point Systems (Portland, Ore.). Array processors are specially designed to process arrays (i.e., matrices of numbers). Several of the floating-point-system (FPS) array processors were connected to a 24-bit Unisys Aries minicomputer. One of the FPS processors was used for control while another functioned as a co-processor to calculate Fast Fourier Transform (FFT) sequences.

As computational needs increased, the FPS and minicomputer were replaced in the 1990s by a SkyBolt system. It included Intel’s 8000 and 9000 series processors. According to Venkataraman, multiple SkyBolt systems were needed to achieve the necessary processing power. Because it proved extremely difficult to synchronize all of these systems, the effort was abandoned. Interestingly, interconnect issues remain a challenge for today’s parallel-processing multicore processors.

For a while, the engineers at Arecibo struggled with general-purpose machines. They actually built platforms based on G5 processors, the 64-bit PowerPC architecture (IBM’s PowerPC 970) that succeeded Motorola’s PowerPC 7400 series. The engineers even created a cluster of Apple G5 processors connected with Sun’s Grid tools to create a parallel processing environment. But even these successes weren’t enough to meet the growing computational needs. Finally, the Intel Xeon platforms were used to achieve the current compute power. But will these workhorses be sufficient for the future?

Future computations
To meet the ever-increasing need for greater computing power, the computer department at Arecibo considered buying a Cray supercomputer. But careful analysis by Venkataraman suggested that the same processing performance could be had for less money by complementing the existing array-processor architecture with several graphics processing units. As a result, computer engineers are currently evaluating nVidia’s 512-core Tesla series of high-performance GPUs. Their goal is to use the inherent parallel architecture of the graphics chips to achieve greater processing speed and performance. Both the Tesla and the soon-to-be-released nVidia Fermi, with thousands of cores in a board, are being considered.

Using GPUs is nothing new at Arecibo. In addition to the Aeronomy department, Astronomy is another field where intensive data calculations are needed. According to the observatory’s Web site: “In 1974, the first pulsar in a binary system was discovered, leading to the important confirmation of Einstein’s theory of general relativity and a Nobel Prize in 1993 for astronomers Russell Hulse and Joseph Taylor.”

Venkataraman said that shortly afterward, Taylor bought a special-purpose vector array processor called the Super Harvard Architecture Single-Chip Computer (SHARC), a high-performance digital signal processor (DSP) from Analog Devices. This type of machine was needed to handle coherent dedispersion of the signals from the pulsars under study. Dispersion occurs due to the great distances between pulsars and radio receivers on the Earth. What starts out as a sharp, well-defined signal from a pulsar becomes a flattened smear of frequencies when it reaches the Earth. Coherent dedispersion is a technique that greatly improves the timing of pulsars. It requires numeric transformations that involve the multiplication of millions of complex exponential functions. This is an extremely processor-intensive set of calculations. The Arecibo Observatory works with the Green Bank Ultimate Pulsar Processing Instrument (GUPPI) at the National Radio Astronomy Observatory (NRAO) in West Virginia to perform these complex calculations.
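The idea behind coherent dedispersion can be shown with a toy model. Treat the smearing as a frequency-dependent phase rotation, then undo it by multiplying every frequency bin by the conjugate complex exponential. In the sketch below, the quadratic phase, the `ALPHA` value, and the 64-sample pulse are stand-ins chosen for illustration. A real system derives the phase from the physics of the interstellar medium and the measured dispersion of each pulsar, and it uses FFTs rather than this slow direct transform.

```python
import cmath


def dft(x, inverse=False):
    """Direct O(n^2) discrete Fourier transform (forward or inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def toy_dispersion(n, alpha):
    """Toy all-pass response: exp(i * alpha * k^2) for signed frequency bin k."""
    response = []
    for k in range(n):
        signed_k = k if k < n // 2 else k - n
        response.append(cmath.exp(1j * alpha * signed_k ** 2))
    return response


N = 64
ALPHA = 0.05  # hypothetical smearing strength, not a physical value

# A sharp pulse: all of the energy sits in one sample.
pulse = [0j] * N
pulse[10] = 1 + 0j

H = toy_dispersion(N, ALPHA)

# Smear: rotate the phase of each frequency bin.
smeared = dft([s * h for s, h in zip(dft(pulse), H)], inverse=True)

# Dedisperse: multiply each bin by the conjugate exponential.
restored = dft([s * h.conjugate() for s, h in zip(dft(smeared), H)], inverse=True)

peak_smeared = max(abs(v) for v in smeared)
peak_restored = max(abs(v) for v in restored)
```

The smeared pulse keeps all of its energy but spreads it over many samples, so its peak drops well below 1. After the conjugate multiplication, the peak returns to sample 10. Scaling this from 64 bins to millions of them is what makes the calculation so processor-intensive.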

Why go to all the trouble of processing out the smear to find the originating signal? The reason is simple: Pulsars are great cosmic clocks, which makes them ideal for timing experiments. For example, exacting timing data is needed to precisely calculate how fast the Earth is moving versus how fast the planet’s tectonic plates are shifting. Such information is essential for predicting earthquakes and tsunamis. Another application of pulsar timing is in the detection of gravitational waves—a prerequisite to furthering our understanding of Einstein’s theory of general relativity.

The ongoing advances in the semiconductor chip and electronics industries, from simple microprocessors and processing arrays in the 1970s to today’s multicore systems with complementary graphics engines, have enabled signal processing and decoding at Arecibo and other major radio observatories. The information gleaned from these calculations has led to advancements in such earth-bound concerns as defense, security, and telecommunications as well as cosmic realizations about deep-space environments and even gravitational waves. Time will tell what new advancements lie just around the next set of data computations.

Chip Vendors Find System Coverage Helps Bottom Line

Thursday, November 19th, 2009

By John Blyler
Today’s newspapers and websites are cluttered with companies reporting increased profits on lower sales. Simply put, this means that companies have laid off employees and cut other costs in order to show a profit. One big result of these cutbacks in workers, which this time included engineers and sometimes entire design teams, is that companies will have fewer resources to bring new products to market or to maintain existing product lines.

Such cuts may save the company money in the short term, but they may also limit what the company can do in the future. One interesting effect is to shift the task of product development further up the market supply chain. If an original equipment manufacturer (OEM) in the electronics market has cut its design team, it may mean the chip supplier to the OEM must now pick up the task of actually manufacturing their portion of the end-product.

While this is hardly a new trend, the recent downturn has accelerated it. Consider the case of Stretch, a fabless chip company that provides video surveillance and digital video recorder chips and add-in reference board designs to large end-product manufacturers. In the past, Stretch’s big focus was on developing complex video algorithm systems and associated intellectual property (IP) for the growing video surveillance market. Board-level reference designs for its chips were provided as a matter of course, an essential but complementary part of its main business.

Today, however, Stretch’s OEM customers – the makers of commercial video surveillance equipment and video recording devices – are asking for a lot more. They now want Stretch to turn its board-level reference designs, which are basically prototypes, into high-volume, manufacturable subsystems, which the OEM will then incorporate into its larger end product.

Many start-up companies have been caught off guard by their customers’ requests to create the manufacturing ecosystem necessary for volume production of reference board designs. Bob Beachler, Stretch’s vice president of marketing, operations and systems design, explains that most start-up companies expect their customers to take the reference designs, modify them as needed for product differentiation and then manufacture the modified designs themselves. Today, though, many of these same customers are asking the start-up chip supplier to use the reference design as is and manufacture these designs in volume for the OEM.

What has this meant for a start-up like Stretch? “We’ve had to invest time – not really add more manpower – to find the right contract manufacturers to build the reference design board,” Beachler said. It has also meant that the chip company had to do all of the qualification (EMI) and environmental (shake-rattle-roll) testing, plus the defect screening. He confirmed that Stretch doesn’t do the actual manufacturing. That task has been outsourced to two firms in China. But the board-level manufacturing portion of the business has grown to become half of the company’s total business while the other half is surveillance chips—the place this all started.

This trend won’t work for every business. It seems to work best for well-defined products and markets, where form factors of the boards (add-in cards) and functionality are standard. Beachler said that two pioneers of this approach are Intel in the CPU and CPU motherboard market, and nVidia in the graphics card space. Both of these companies were primarily chip companies that expanded to include a system-level (printed circuit board) business.

Both companies now do substantial volumes in board-level manufacturing, even though they have an ecosystem of other manufacturers that make similar products. For example, Intel produces several slightly different motherboards for a given chipset. But companies like Asus, Gigabyte, MSI and others take Intel’s basic reference designs and customize them to produce a wide variety of motherboards. Such supply-chain companies then sell these boards in volumes that far outstrip Intel’s versions.

The trend toward more complete system solutions within the semiconductor and board-level supply chains is nothing new. What is new is the disappearance of design teams within some of the OEMs. That is a development that is being watched closely by many within the chip design industry.