
Aaware of Wake Words

Audio Startups Challenge Boundaries

Few tech areas are hotter right now than smart speakers and smart home assistants, with Amazon Alexa, Google Assistant, Microsoft Cortana, and Apple Siri all tripping over each other to earn our trust as personal DJ, home shopper, lighting tech, news reporter, and research assistant. Devices ranging from the $50 Echo Dot and Google Home Mini up to the $350-$400 Apple HomePod and Google Home Max are coming into our living rooms, plugging in, connecting up, and… listening.

They may be listening to you now, in fact. Listening with arrays of finely-tuned microphones. Listening and processing and parsing and judging. Was that a wake word? Did you call? Can I help you with something? The general population seems all too eager to bring listening devices created by some of the largest corporations in the world into their homes and give them free rein to do pretty much whatever they like with the information they gather. 

It’s a tad ironic that some of the same folks who walk around in tinfoil hats just in case the government is secretly trying to monitor their brain waves from space have no compunction at all about paying the likes of Google and Amazon to put microphones in their homes or posting zettabytes of personal information on Facebook. Apparently, for-profit businesses are a lot less scary than our elected officials and sworn public servants.

What could possibly go wrong? 

Nevertheless, smart speaker sales more than tripled during 2017, and they show no sign of slowing down for the time being. For a startup, entering this market against these multi-billion-dollar behemoths would appear to be an exercise in futility. But when technology and culture crash into a discontinuity this large, enormous opportunities can spin off like eddy currents from the rapids. In this case, the confluence of technologies and trends is a bit mind-boggling, and a number of startups are jumping in to surf the wave.

From a buzzword saturation point of view, personal digital assistants embody IoT, artificial intelligence, big data, cloud computing, high-speed networking, compute acceleration, low-power design, digital signal processing (DSP), and a host of other “hot tech topics” rolled into one giant burrito. Serve that with a thick sauce of security and privacy concerns and you’ve got yourself a pretty volatile cocktail.

With voice control, the rubber meets the road at the microphone. Pulling in the crazy gamut of sound waves bouncing around the typical household and accurately divining the quietly uttered “Alexa” from the din is a formidable engineering feat. Last month, we got an impressive demonstration of a new technology from Silicon Valley startup Aaware, whose “Acoustically Aware” technology uses an array of MEMS digital microphones connected to a Xilinx Zynq SoC to implement proprietary far-field voice interfaces. Aaware says its algorithms use advanced DSP techniques, including adaptive beamforming, to provide always-on far-field voice isolation and recognition with a very high degree of selectivity and immunity from background noise, while minimizing distortion to the original voice.
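To give a feel for what beamforming does, here is a minimal sketch of the simplest fixed variant, delay-and-sum. It is not Aaware's method, which is proprietary and adaptive. Everything specific in it is an assumption chosen for illustration: an 8-microphone linear array with 4 cm spacing, a 16 kHz sample rate, whole-sample delays, and a 1 kHz test tone arriving once from straight ahead (the talker) and once from 60 degrees off-axis (the interferer).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in room-temperature air (assumed)

def arrival_delays(mic_x, angle_deg, sample_rate):
    """Whole-sample arrival delay at each mic for a far-field source.

    mic_x holds mic positions in meters along a line; angle_deg is measured
    from broadside (0 = straight ahead). Delays are shifted to be >= 0.
    """
    s = math.sin(math.radians(angle_deg))
    raw = [round(-x * s / SPEED_OF_SOUND * sample_rate) for x in mic_x]
    base = min(raw)
    return [d - base for d in raw]

def delayed(signal, delay):
    """The signal as one mic hears it: shifted later by `delay` samples."""
    return ([0.0] * delay + signal)[:len(signal)]

def delay_and_sum(channels, steer):
    """Advance each channel by its steering delay, then average them."""
    n = len(channels[0]) - max(steer)
    return [sum(ch[i + d] for ch, d in zip(channels, steer)) / len(channels)
            for i in range(n)]

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

def array_gain_demo():
    """Return (talker power, interferer power) after steering at the talker."""
    fs = 16000
    mic_x = [0.04 * m for m in range(8)]      # hypothetical array geometry
    tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(1600)]
    look = arrival_delays(mic_x, 0.0, fs)     # steer toward the talker
    off = arrival_delays(mic_x, 60.0, fs)     # interferer direction
    talker_out = delay_and_sum([delayed(tone, d) for d in look], look)
    noise_out = delay_and_sum([delayed(tone, d) for d in off], look)
    return power(talker_out), power(noise_out)

if __name__ == "__main__":
    p_talk, p_noise = array_gain_demo()
    print("talker power:", p_talk, "interferer power:", p_noise)
```

Sound from the steered direction adds up in phase across the microphones and passes through unchanged, while sound from elsewhere arrives with mismatched delays and partly cancels. An adaptive beamformer goes further by adjusting its weights as the noise field changes, which this sketch does not attempt.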

Aaware takes advantage of the extreme accelerated compute performance and power efficiency available in the Zynq devices to perform noise, echo, and reverb cancellation – the company claims up to -25 dB signal-to-interference ratio (SIR). It also performs source separation, discerning where sounds come from, which enables downstream applications to steer things like cameras. Additionally, if multiple people are talking, source separation is critical for downstream automatic speech recognition (ASR) and natural language processing (NLP) to be successful. Aaware algorithms are flexible with respect to the number of microphones and the configuration of the microphone array, so a wide variety of applications are possible.
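For a sense of scale, the decibel figure can be converted back to a plain ratio. The sketch below uses the standard definition of SIR as ten times the base-10 logarithm of signal power over interference power. The function names are made up for this example, and reading the claim as an input condition (the voice is still recoverable when the interference is that much stronger) is an assumption, not something the company spelled out.

```python
import math

def sir_db(signal_power, interference_power):
    """Signal-to-interference ratio in decibels."""
    return 10.0 * math.log10(signal_power / interference_power)

def power_ratio(db):
    """Convert a decibel value back to a linear power ratio."""
    return 10.0 ** (db / 10.0)

if __name__ == "__main__":
    ratio = power_ratio(-25.0)  # signal power / interference power
    print("interference is about", round(1.0 / ratio), "times the voice power")
    print("or about", round(math.sqrt(1.0 / ratio), 1), "times its amplitude")
```

At -25 dB, the interference carries roughly 316 times the power of the voice, or about 18 times its amplitude.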

Aaware has packaged their technology into a development platform (sold through Avnet) that includes an array of 13 MEMS microphones and a choice of two different Zynq boards. Currently, they are using the Xilinx Zynq® 7010, which packs a peak DSP performance of 100 GMACs, delivered by FPGA fabric connected to dual ARM Cortex-A9 cores, up to 1 GB of DDR, and WiFi/BT, Gigabit Ethernet, or USB for connectivity. These kits also come with a standard Ubuntu Linux software environment and a standard ALSA-based audio interface. You should be able to unbox the kit and be at the “Hello World” (or “Okay Google”) stage on day one.

This is one of the first examples we’ve seen in practice of a trend we expect to become common – using an FPGA or (more often, perhaps) an FPGA SoC such as Zynq to produce what is essentially a third-party ASSP. While Xilinx and Intel/Altera struggle to make these awesomely powerful devices more developer friendly, early adopters such as Aaware are developing broadly-applicable technologies based on these chips, giving their customers all the benefits of a typical ASIC or ASSP implementation, but without the enormous risk associated with custom silicon development. This new breed of company takes the “fabless” concept one step further, acting as a kind of “fabless/chipless” semiconductor supplier.  

One question that lingers over this approach, however, is the behavior of the FPGA companies themselves. Xilinx, in particular, has a longstanding reputation for trashing their own ecosystem by developing and offering IP, tools, and even end products that compete directly with their partners’ offerings. This has led to many startups keeping a wary eye on the FPGA suppliers’ level of involvement in their ventures. It will be interesting to see whether this behavior changes over time, since the range of applications for this new category of FPGA SoC devices is far too vast for the FPGA companies to go it alone and truly take advantage of the available opportunities.

In a “canary in the coal mine” test for Aaware’s technology and for this type of offering, a startup called Mycroft has integrated Aaware into a full-blown open-source, hackable smart speaker, the Mycroft Mark II, and launched the product on Kickstarter. In addition to taking advantage of Aaware’s high-performance voice isolation and recognition capability, Mycroft is banking on consumers resonating with the decidedly non-corporate approach, touting the fact that no audio data will be captured, uploaded, or used for nefarious purposes such as ad targeting by the big-data overlords.

Judging from Mycroft’s Kickstarter performance (they blew through their goal within hours and raced to 3x within days), there is a lot of interest out there in embracing the convenience of smart voice assistants while keeping a wary eye on big brother. Judging from Aaware’s starring role in the Mark II and their quick entry into Avnet’s line card, there is also a lot of interest out there in developing voice-based applications that require some pretty tricky audio processing as the cost of admission. It will be exciting to watch.
