Voice control and audio interfaces are popping up in a wide spectrum of applications these days, and the possibilities extend far beyond Alexa, Google, and Siri. But adding an audio interface to your design is a complex task, and design teams face serious challenges in meeting common constraints like privacy, security, noise immunity, handling multiple speakers or sound sources, and performance. Simply extracting clear voice information from a noisy environment can be a complex task involving multiple microphones and beamforming/DSP processing.
This week, at the Consumer Electronics Show (CES) in Las Vegas, Aaware (in cooperation with Avnet) is demonstrating a development kit called the “Sound Capture Platform” for their single-chip embedded Linux audio subsystem with advanced and configurable multi-microphone processing capability, as well as hardware AI and DSP acceleration. The solution is built on a Xilinx Zynq-7000 with dual-core Arm Cortex A-9 processors, and it takes advantage of Zynq’s FPGA fabric for acceleration of DSP and AI tasks, as well as Zynq’s mixed-signal capabilities for audio input handling. The solution enables what Aaware calls “far-field, voice-enabled digital products.”
It is interesting to see what amounts to a highly optimized third-party ASSP built, bundled, and sold as IP on top of the Xilinx Zynq SoC/FPGA platform and Avnet’s MiniZed Zynq SoC development platform. Years ago, we thought the concept of a “fabless” chip company was novel. With Aaware, we are seeing an example of a further extension of that concept with what amounts to a “chipless” chip company – taking a silicon platform designed and produced by another company and adding IP to it (plus a development kit produced by yet another company, and IP from still another company) to create a powerful offering, focusing the engineering energy of the company even more narrowly on value-added technology.
The results are impressive. The Aaware platform comes with a concentric circular array of 13 microphones that may be configured in different combinations, so you can create a design with the number of mics you need depending on your noise rejection and localization needs. Aaware separates wake words and follow-on speech from the interfering background noise with remarkable performance. I participated in a demo on a loud trade show floor, and the system was able to capture and localize a very soft-spoken wake word along with the subsequent commands against an extremely challenging background with countless other human voices competing. Aaware is also capable of filtering out any built-in or external speaker noise. The setup is designed to allow us to easily integrate Aaware into our applications with third-party speech and natural language engines. It also provides a means to pass source localization data to downstream applications, such as video, which would come in handy in improving the performance of multi-sensor AI applications such as robotics and surveillance.
The processing capabilities of Aaware are compact, leaving a lot of headroom in the Zynq device, so there should be plenty of on-chip processor capability and FPGA fabric real-estate left over for your application-specific logic, interfaces, and code. That means adding the Aaware device to your BOM also brings a lot of extra general-purpose Zynq capability along, essentially for free. For many of the applications that are likely to need the capabilities Aaware provides, that means you can likely use the Aaware-enabled Zynq as the main application SoC in your design.
Aaware is architected with the idea to keep all the audio processing on-chip or in-system, which is a huge plus, given the growing concern about the privacy and security of voice-enabled devices that use cloud-based services for speech recognition. By keeping all processing local, you rid yourself of a huge number of headaches – first trying to secure the data as it moves to and from the cloud, and then trying to convince your customers that you have actually succeeded in those efforts. If all the processing is local and the data never leaves your device, the question becomes moot (at least as far as bad guys intercepting the data as it travels back and forth over the network or hangs out in the data center). Of course, Aaware does not preclude you sending data to the cloud for further processing, but in most applications the substantial processing/acceleration capability of the Zynq device will make that unnecessary.
Aaware’s key strength is the DSP algorithms that separate the signal from the noise and localize the signal source. For wake word detection, the company uses “TrulyHandsfree” wake word detection technology from Sensory, Inc., again keeping to the philosophy of spending their engineering resources as much as possible where they add the most value to the solution. By partnering with Avnet, Xilinx, and Sensory, Aaware is able to bring a sophisticated, scalable, volume-ready solution to market (with Avnet’s distribution and support as well) with minimal risk – both in their ability to deliver product and for the engineering and product teams who adopt it.
The development kit is well-supported by Avnet, and it includes reference designs that should have you up and running very quickly with a design that is at least similar to what you’re planning to develop. And, Aaware doesn’t limit you to just voice interface applications. With a little more design work on your part and the flexibility of the Zynq device, you can use it for identification and localization of virtually any type of sound, opening the door for applications such as industrial equipment diagnostics that listen for “good” or “bad” sounds emanating from operating gear. The noise rejection, localization, and accelerated processing capabilities Aaware brings to the table open the door to a vast range of applications.
Aaware says they have partnered with a number of other solutions providers to extend the solutions capabilities with things like your own dialogue and natural language interface for your embedded design. We expect a large number of design teams will opt for Aaware’s solution rather than the currently-popular cloud-based options because of the security, reliability, and in-system performance brought by the Zynq device. On a related note, we would also not be surprised to see other “chipless” chip companies emerge with a similar business model for other kinds of applications. If you have an idea for some IP/software combo that would deliver unique value on top of a ready-made platform like Zynq, it makes sense to build on top of a flexible and powerful FPGA SoC to quickly get to a marketable solution.