Startup leverages Google’s TensorFlow
Battery-powered devices will get a new option for hardware-accelerated speech interfaces next year if Kurt Busch makes his targets this year. The chief executive of Syntiant aims to sample a novel machine-learning chip in 2018 and to raise a Series B round to make it in volume.
The startup is designing a 20 tera-operations/watt chip using 4- to 8-bit precision to speed up AI operations initially for voice recognition. It uses an array of hundreds of thousands of NOR cells, computing TensorFlow neural-network jobs in the analog domain.
Syntiant will release a reference design pairing its sub-watt chip with an Infineon MEMS microphone. If it is successful, the two will collaborate on other designs. “We want to make it extremely easy to add voice control to any kind of device,” said Busch.
“Today, the ecosystem is only supported by devices plugged into the wall,” said Busch. “Nobody is offering an always-on, battery-powered solution … we can be a leader enabling that.”
Syntiant is using a processor-in-memory architecture defined by its CTO, Jeremy Holleman, a researcher at the University of North Carolina at Charlotte. Holleman published academic work in the area as far back as 2014 at the International Solid-State Circuits Conference.
Another startup with academic roots, Mythic, is taking a similar approach using a 40-nm Fujitsu NOR cell. But it appears to be targeting imaging applications more than speech and has Lockheed Martin as a partner for use of its chips in drones.
IBM Research is working on a similar architecture based on ReRAM. Today’s emerging MRAM, memristor, and other memories are reigniting academic work in processor-in-memory chips that dates back to the 1990s.
The architecture is gaining attention because it is ideal for executing at very low power the massively parallel multiply-accumulate operations in deep learning. Syntiant and Mythic both claim that they will process machine-learning jobs at orders-of-magnitude less power than digital chips.
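The operation these chips execute in analog is the multiply-accumulate at the heart of every neural-network layer. As a rough illustration (not Syntiant's actual scheme — the uniform scaling and 4-bit width here are assumptions), the NumPy sketch below shows how a dot product computed on 4-bit quantized weights and activations compares with full precision:

```python
import numpy as np

def quantize(x, bits=4):
    """Uniformly quantize x to signed `bits`-bit integers plus a float scale."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for 4-bit signed
    scale = max(np.max(np.abs(x)) / qmax, 1e-12)    # avoid divide-by-zero
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 64))   # one layer's weight matrix
acts = rng.standard_normal(256)            # input activations

qw, sw = quantize(weights, bits=4)
qa, sa = quantize(acts, bits=4)

# The multiply-accumulate: integer dot products, rescaled back to float.
# In a processor-in-memory design, this sum happens in the memory array itself.
y_quant = (qa @ qw) * (sa * sw)
y_full = acts @ weights

rel_err = np.linalg.norm(y_quant - y_full) / np.linalg.norm(y_full)
print(f"relative error at 4-bit precision: {rel_err:.3f}")
```

The point of the exercise: the errors introduced by 4- to 8-bit math are modest for inference, which is why the startups accept reduced precision in exchange for the power savings of computing in the memory array.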
An Achilles heel of the approach is that the chips are hard to program, due to both their massive parallelism and their use of analog computing. To overcome the hurdles, Syntiant’s chip will essentially act as “a silicon implementation of Google’s TensorFlow framework,” said Busch.
The company’s Syntiant Simulator is essentially an add-on for TensorFlow supporting the chip’s low-precision math and other unique hardware characteristics. Users will train neural nets on Google’s AI cloud service and download the resulting weights to the chip via the simulator.
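That workflow — train in floating point, then deploy the weights at the chip's reduced precision — can be sketched with a toy NumPy stand-in (this is not Syntiant's toolchain; the logistic-regression model and 4-bit rounding are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy linearly separable 2-class problem standing in for a real dataset.
X = rng.standard_normal((400, 16))
true_w = rng.standard_normal(16)
y = (X @ true_w > 0).astype(float)

# Step 1: train a logistic-regression "network" in full precision,
# as one would train a net in TensorFlow on a cloud service.
w = np.zeros(16)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))     # sigmoid predictions
    w -= 0.1 * X.T @ (p - y) / len(y)      # gradient-descent step

# Step 2: "download" the weights at 4-bit precision, as a chip-side
# simulator might, by rounding to 16 signed levels.
scale = np.abs(w).max() / 7
w4 = np.round(w / scale).clip(-8, 7) * scale

acc_float = np.mean((X @ w > 0) == (y > 0.5))
acc_4bit = np.mean((X @ w4 > 0) == (y > 0.5))
print(f"float accuracy: {acc_float:.3f}, 4-bit accuracy: {acc_4bit:.3f}")
```

Training stays in standard floating-point tooling; only the deployed weights are coarsened, which is what lets the hardware keep its analog, low-precision datapath.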
The downside of the approach is that users will not be able to work in the many other AI frameworks, including those favored by Amazon, Baidu, Microsoft, and others. However, Syntiant eventually expects to add support for other frameworks such as Caffe, which is popular in China.
All of the architectures face two other challenges: the designs are more difficult to port to new process nodes, and they need to handle a basket of analog effects such as noise and device variation.
In addition, MRAM or other memories are expected to replace flash beyond the 28-nm node. So the startups face a potentially significant redesign early on in their roadmaps.
Syntiant aims to start with a microwatt-class device handling perhaps a few million weights and migrate to handling larger neural nets. However, it doesn’t expect that the architecture will scale to handling training jobs or use in data centers.
“Our first device is supporting one [neural-net] architecture for base functionality with an option to do two or three others,” said Busch. “The first product targets speech and can do limited imaging jobs.”
So far, the startup is staying mum on the types and sizes of neural nets that it will support as well as its foundry and NOR supplier. Busch did say that it is not using Intel’s foundry service but “a COTS flow and a merchant foundry.”
Intel Capital led the $5 million Series A round that the startup closed in May to get to first samples. Busch is now working on a Series B round to fund general availability of the chip in 2019.
“It feels a lot like where we were in networking in 1995 or so,” said Busch, who started his career as a design engineer working on Ethernet and Token Ring chips.
“There was a lot of pent-up demand. Deep learning is a powerful methodology, and the current CPUs and GPUs are not optimized for its needs on parallelism and memory access.”
— Rick Merritt, Silicon Valley Bureau Chief, EE Times