AI Sparks Hyper-Competition

Summit Points to the Future for Accelerators

Big data center operators say they are seeing a steady stream of new architectures for accelerating deep learning neural networks—and the flow is just getting started, according to comments at last week’s AI Hardware Summit. One analyst pegged the number of established and startup companies designing AI accelerators at a whopping 130. 

“The machine-learning revolution has reopened the opportunity for new architectures…let a thousand flowers bloom,” said Alphabet Chairman and former Stanford President John Hennessy in an opening keynote at the event. Such domain-specific chips don’t have to be compatible with legacy object code so the industry “can introduce new architectures faster than in general-purpose computing,” he added.

Potential users from Alibaba, Facebook, Google, and Uber said the chip vendors need to show their benchmark scores, make their software easy to use, and conform to emerging standards.

“We are sampling a few vendors’ upcoming products, and one issue is using their software correctly…it takes a long time to vet hardware and a lot of time to bring new software into our ecosystem,” said Linjie Xu, director of applied AI architecture at Alibaba Cloud, speaking on a panel.

“China has a huge market for AI acceleration…but we want to see something real,” he said, noting Alibaba released an open-source version of one of its preferred benchmarks.

“A lot of new accelerators are coming next year or after, each for a different workload, but we’re limited by the resources we have” to evaluate them, said Samar Dalal, a senior manager of architecture and design at Uber. The lack of “a mature software stack is a limiter for deploying” new hardware, he agreed.

Panelists gave some examples of variations in their emerging workloads as a sign of how accelerators, too, will come in many flavors over time. For example, Alibaba does significant work in optimizing traffic patterns for smart cities, natural-language processing is a big focus for Uber, and Google recently started an initiative in digital health using AI.

Hard and Soft Standards

To give the many chips a single socket, Whitney Zhao, a hardware engineer in Facebook’s infrastructure group, led the design of the open-source accelerator module (called the OAM), as well as a motherboard that can accommodate a handful of them. The next step for her group is to design or specify “standard tooling and utilities to do management and monitoring” for the systems, Zhao said on the panel.

“For different workloads, accelerator solutions may be different, so we defined a one-size-fits-all platform, and now more than 30 companies support it with [accelerator startup] Habana Labs being one of the first,” she said.

The hyperscalers said vendors need to show performance benchmarks for their chips, starting with the metrics set by the MLPerf group and ideally extending to some of the hyperscalers’ own company-specific workloads. MLPerf’s new inference benchmarks open a door to discussions of possible standards for which of several popular precision formats to use.

Xu of Alibaba encouraged chip designers to join those discussions. “New precision formats need to be something others will want. We’ve seen cases of great ideas that didn’t materialize…this is a really new area,” he said.

The so-called bfloat16 format Google helped define is gaining traction—Intel even re-spun its first Nervana training accelerator to support it. “It’s probably time to standardize that [format], so we should talk to the IEEE” group that handles floating-point specs, said Cliff Young, a Google software engineer who moderated the panel.
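For readers unfamiliar with the format, the following is a minimal Python sketch of what bfloat16 keeps from a standard 32-bit float: the sign bit, all 8 exponent bits, and only the top 7 mantissa bits. It is an illustration under stated assumptions, not any vendor's implementation. The function names are hypothetical, and the conversion simply truncates the low 16 bits, whereas real hardware typically rounds to nearest.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x (hypothetical helper).

    Assumption: plain truncation of the float32 encoding, for clarity.
    Real accelerators usually round to nearest instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the upper half: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float."""
    # Pad the dropped 16 mantissa bits with zeros and decode as float32.
    (value,) = struct.unpack(">f", struct.pack(">I", (bits16 & 0xFFFF) << 16))
    return value


def roundtrip(x: float) -> float:
    """Show the value that survives a trip through bfloat16."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))
```

Because the exponent field is the same width as in float32, a value such as 1e38 stays finite after conversion, while precision drops to roughly two to three decimal digits: `roundtrip(3.14159)` yields 3.140625.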

Among other standards efforts, Xu applauded Habana’s use of Ethernet as an interconnect for its accelerators. In the future, such an approach could enable big data centers to virtualize and use pools of accelerators across their far-flung networks so they won’t need to maintain separate installations of training and inference systems, he added.

Overall, the tech landscape is wide open for the hyperscalers who are hungry to drive AI today. For example, Google focused all its jobs on x86 processors until deep learning came along and it was forced to adopt GPUs to get the performance it needed.

“That broke all our [software] models, and that enabled us to [make and adopt] TPUs [Google’s own accelerators] more easily,” said Young. “Now we ask what the customer wants and try to virtualize it. Six years ago, Google was early in this space and we had to set our own [internal] standards–that’s not the case now,” he added.

The frameworks used to describe neural network models are, for the moment, relatively stable with a lot of momentum behind Google’s TensorFlow. “It is the #1 framework used in Alibaba, but we use many others as well as our own framework built on top of others” to handle the needs of specific workloads, said Xu.

However, developers “are willing to change frameworks quickly,” said Young.  Frameworks are not like programming languages such as Fortran that developers once expected they might use for their whole career, he noted.

Market Dynamics

The new accelerators face many challenges, according to Karl Freund, an analyst with Moor Insights & Strategy who spoke at the event. They require a lot of time and money to design and make.

The new ASICs will have a relatively hard time keeping up with changing neural-network models. “Data center AI demands programmability,” Freund said.

Today, Nvidia’s programmable GPUs dominate the market for chips that accelerate the training of neural nets in big data centers. Intel dominates the market for running inference jobs with the trained models, also in big data centers.

Both companies will be hard to beat, Freund said. Nvidia is delivering good performance and has an extensive network of tools and partners. Intel has acquired three accelerator makers (Mobileye, Movidius, and Nervana) and has a new GPU called Xe in the works.

Meanwhile “a lot of other companies don’t have working silicon yet–it takes time,” Freund said. 

The winners will be companies that strike a compelling balance of delivering high performance on specific workloads while being able to run software for other jobs, said Hennessy. They also will solve specific problems like handling both dense and sparse linear algebra well, he added.

The good news is that the new AI style of computing is reviving a market for processors whose old tricks, like using advanced speculation and large caches, were running out of gas.

“There is no simple way forward for getting performance. Only path left is…to do just a few tasks, but do them extremely well…it’s a whole new world,” Hennessy said.
