
Aaware of Wake Words

Audio Startups Challenge Boundaries

Few tech areas are hotter right now than smart speakers and smart home assistants, with Amazon Alexa, Google Assistant, Microsoft Cortana, and Apple Siri all tripping over each other to earn our trust as personal DJ, home shopper, lighting tech, news reporter, and research assistant. Devices ranging from the $50 Echo Dot and Google Home Mini up to the $350-$400 Apple HomePod and Google Home Max are coming into our living rooms, plugging in, connecting up, and… listening.

They may be listening to you now, in fact. Listening with arrays of finely-tuned microphones. Listening and processing and parsing and judging. Was that a wake word? Did you call? Can I help you with something? The general population seems all too eager to bring listening devices created by some of the largest corporations in the world into their homes and give them free rein to do pretty much whatever they like with the information they gather. 

It’s a tad ironic that some of the same folks who walk around in tinfoil hats just in case the government is secretly trying to monitor their brain waves from space have no compunction at all about paying the likes of Google and Amazon to put microphones in their homes or posting zettabytes of personal information on Facebook. Apparently, for-profit businesses are a lot less scary than our elected officials and sworn public servants.

What could possibly go wrong? 

Nevertheless, smart speaker sales more than tripled during 2017, and they show no sign of slowing down for the time being. For a startup, entering this market against these multi-billion dollar behemoths would appear to be an exercise in futility. But when technology and culture crash into a discontinuity this large, enormous opportunities can spin off like eddy currents from the rapids. In this case, the confluence of technologies and trends is a bit mind-boggling, and a number of startups are jumping in to surf the wave.

From a buzzword saturation point of view, personal digital assistants embody IoT, artificial intelligence, big data, digital signal processing, cloud computing, high-speed networking, compute acceleration, low-power design, and a host of other “hot tech topics” rolled into one giant burrito. Serve that with a thick sauce of security and privacy concerns and you’ve got yourself a pretty volatile cocktail.

With voice control, the rubber meets the road at the microphone. Pulling in the crazy gamut of sound waves bouncing around the typical household and accurately divining the quietly uttered “Alexa” from the din is a formidable engineering feat. Last month, we got an impressive demonstration from Silicon Valley startup Aaware, whose “Acoustically Aware” technology uses an array of MEMS digital microphones connected to a Xilinx Zynq SoC to implement proprietary far-field voice interfaces. Aaware says their algorithms use advanced DSP techniques, including adaptive beamforming, to provide always-on far-field voice isolation and recognition with a very high degree of selectivity and immunity from background noise, while minimizing distortion to the original voice.
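Aaware hasn’t published its algorithms, but the basic idea behind steering a microphone array toward a talker is easy to sketch. Below is a minimal delay-and-sum beamformer in Python/NumPy; it is purely illustrative (Aaware’s adaptive approach is proprietary), and the array geometry, sample rate, and look direction are all assumptions for the sketch.

# Minimal delay-and-sum beamformer sketch (illustrative only; Aaware's
# adaptive beamforming is proprietary). Geometry and rates are assumptions.
import numpy as np

def delay_and_sum(mic_signals, mic_positions, look_dir, fs, c=343.0):
    """Steer a microphone array toward look_dir and sum the aligned channels.

    mic_signals:   (num_mics, num_samples) time-domain samples
    mic_positions: (num_mics, 3) microphone coordinates in meters
    look_dir:      unit vector pointing from the array toward the talker
    fs:            sample rate in Hz
    c:             speed of sound in m/s
    """
    num_mics, num_samples = mic_signals.shape

    # Far-field plane wave: mics with a larger projection onto look_dir sit
    # closer to the talker and hear the wavefront earlier.
    lateness = -(mic_positions @ look_dir) / c     # relative arrival offset, seconds
    lateness -= lateness.min()                     # earliest mic becomes zero
    shifts = np.round(lateness * fs).astype(int)   # integer-sample approximation

    aligned = np.zeros((num_mics, num_samples))
    for m in range(num_mics):
        s = shifts[m]
        # Advance later-arriving channels so all copies of the talker line up.
        aligned[m, :num_samples - s] = mic_signals[m, s:]

    # Coherent average: the look direction adds up, off-axis noise partially cancels.
    return aligned.mean(axis=0)

A production far-field front end would use fractional-delay filters and adapt its weights to the changing noise field, which is exactly the kind of per-sample arithmetic that keeps the Zynq’s DSP fabric busy.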

Aaware takes advantage of the extreme accelerated compute performance and power efficiency available in the Zynq devices to perform noise, echo, and reverb cancellation – the company claims it can handle signal-to-interference ratios (SIR) as low as -25 dB. It also performs source separation, discerning where sounds come from, which enables downstream applications to steer things like cameras. Additionally, if multiple people are talking, source separation is critical for downstream automatic speech recognition (ASR) and natural language processing (NLP) to be successful. Aaware’s algorithms are flexible with respect to the number of microphones and the configuration of the microphone array, so a wide variety of applications are possible.
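For the “where did that sound come from” piece, a classic building block is estimating the time difference of arrival (TDOA) between microphone pairs. The sketch below uses GCC-PHAT (generalized cross-correlation with phase transform); again, it’s a textbook stand-in rather than Aaware’s source-separation algorithm, and the microphone spacing mentioned in the closing comment is an assumption.

# GCC-PHAT sketch for estimating the time difference of arrival (TDOA)
# between two microphones. A textbook building block for localization,
# not Aaware's source-separation algorithm.
import numpy as np

def gcc_phat_delay(sig_a, sig_b, fs, max_tau=None):
    """Return the estimated delay (seconds) of sig_b relative to sig_a."""
    n = len(sig_a) + len(sig_b)
    A = np.fft.rfft(sig_a, n=n)
    B = np.fft.rfft(sig_b, n=n)
    cross = B * np.conj(A)              # the phase carries the relative delay
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:             # limit the search to physically possible lags
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # center zero lag
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# For a pair of mics spaced d meters apart (an assumed geometry), the arrival
# angle from broadside follows as theta = arcsin(c * tau / d), with the
# argument clipped to [-1, 1] before the arcsin.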

Aaware has packaged their technology into a development platform (sold through Avnet) that includes an array of 13 MEMS microphones and a choice of two different Zynq boards. Currently, they are using the Xilinx Zynq-7010, which packs a peak DSP performance of 100 GMACs delivered by FPGA fabric connected to dual ARM Cortex-A9 cores, up to 1 GB of DDR, and Wi-Fi/BT, Gigabit Ethernet, or USB for connectivity. The kits also come with a standard Ubuntu Linux software environment and an ALSA-based audio interface. You should be able to unbox the kit and be at the “Hello World” (or “Okay Google”) stage on day one.
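As a hypothetical day-one smoke test, assuming the kit’s microphone array shows up as an ordinary ALSA capture device under the Ubuntu image, a few lines of Python can record a clip with arecord and run a crude energy check for speech-like activity. The device name, sample rate, and threshold below are placeholders, not values from Aaware’s or Avnet’s documentation.

# Hypothetical "day one" smoke test, assuming the kit's mic array appears
# as a normal ALSA capture device. Device name, sample rate, and threshold
# are placeholders, not vendor-documented values.
import subprocess
import wave
import numpy as np

DEVICE = "default"   # placeholder ALSA device name
RATE = 16000         # Hz, assumed
SECONDS = 3

# Grab a short clip with ALSA's stock arecord utility.
subprocess.run(
    ["arecord", "-D", DEVICE, "-f", "S16_LE", "-r", str(RATE),
     "-c", "1", "-d", str(SECONDS), "capture.wav"],
    check=True,
)

# Crude energy check: did anything speech-like rise above the noise floor?
with wave.open("capture.wav", "rb") as wf:
    samples = np.frombuffer(wf.readframes(wf.getnframes()), dtype=np.int16)

frame = RATE // 10   # 100 ms frames
rms = [np.sqrt(np.mean(samples[i:i + frame].astype(float) ** 2))
       for i in range(0, len(samples) - frame, frame)]
print("peak frame RMS:", int(max(rms)))
print("heard something" if max(rms) > 500 else "silence")   # 500 is an arbitrary threshold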

This is one of the first examples we’ve seen in practice of a trend we expect to become common – using an FPGA or (more often, perhaps) an FPGA SoC such as Zynq to produce what is essentially a third-party ASSP. While Xilinx and Intel/Altera struggle to make these awesomely powerful devices more developer friendly, early adopters such as Aaware are developing broadly-applicable technologies based on these chips, giving their customers all the benefits of a typical ASIC or ASSP implementation, but without the enormous risk associated with custom silicon development. This new breed of company takes the “fabless” concept one step further, acting as a kind of “fabless/chipless” semiconductor supplier.  

One question that lingers over this approach, however, is the behavior of the FPGA companies themselves. Xilinx, in particular, has a longstanding reputation for trashing their own ecosystem by developing and offering IP, tools, and even end products that compete directly with their partners’ offerings. This has led many startups to keep a wary eye on the FPGA suppliers’ level of involvement in their ventures. It will be interesting to see whether this behavior changes over time, since the applicability of this new category of FPGA SoC devices is far too broad for the FPGA companies to go it alone and truly take advantage of the available opportunities.

In a “canary in the coal mine” test for Aaware’s technology and for this type of offering, a startup called Mycroft has integrated Aaware into a full-blown open-source, hackable smart speaker – the Mycroft Mark II – and launched the product on Kickstarter. In addition to taking advantage of Aaware’s high-performance voice isolation and recognition capability, Mycroft is banking on consumers resonating with the decidedly non-corporate approach, touting the fact that no audio data will be captured, uploaded, or used for nefarious purposes such as ad targeting by the big-data overlords.

Judging from Mycroft’s Kickstarter performance (they blew through their goal within hours and raced to 3x within days), there is a lot of interest out there in embracing the convenience of smart voice assistants while keeping a wary eye on Big Brother. Judging from Aaware’s starring role in the Mark II and their quick entry into Avnet’s line card, there is also a lot of interest out there in developing voice-based applications that require some pretty tricky audio processing as the cost of admission. It will be exciting to watch.
