Sensory Boosts Performance of Embedded Wake Word and Speech Recognition by Infusing Smarter AI

Santa Clara, Calif., April 27, 2017 – Sensory, a Silicon Valley-based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded AI technologies, today announced that it has made significant updates to the embedded AI in its TrulyHandsfree™ technology to dramatically boost its performance and accuracy, while staying small and low power.

Introduced in 2009, TrulyHandsfree revolutionized voice user interfaces by offering the first commercially successful embedded small-vocabulary speech recognition system to feature an always-listening wake word. Incorporating Sensory’s smartest and most efficient deep neural network technologies to date, TrulyHandsfree 5.0 takes embedded voice interfaces to new heights, offering an on-device voice user interface experience that is more natural and intuitive than ever before. At the same time, a new shallow learning approach compresses the models so that they run at ultra-low power with minimal memory and MIPS. Today, TrulyHandsfree can be found in leading mobile phones, sports cameras, IoT devices, and even toys!

Smarter Speech Activation for Improved Accuracy

Early on, accuracy concerns were the major limiting factor that prevented mass adoption of voice wakeup technology. The risk of false fires had to be minimized to ensure that devices didn’t mistakenly activate at inappropriate times. TrulyHandsfree was the first solution capable of offering this consistent reliability, and since its introduction into products like the MotoX and Galaxy S series smartphones, Sensory’s voice models and neural networks have continually evolved to offer better performance. Today, Sensory’s latest deep neural network models for embedded AI have allowed the company to deliver a 5X reduction in false accepts compared to version 4.01, nearly eliminating the chances of the speech recognition system activating when not actually summoned by the user. A new shallow learning approach takes the biggest speech models and compresses them down by a factor of 5-10 with no decrease in accuracy. Additionally, the latest neural network models offer greater reliability for user-defined triggers, giving users the option to select the wake word they prefer while retaining the same accuracy and performance offered with specialized fixed triggers.

Enhanced Security Makes Sure That It’s You Speaking

One of the greatest challenges facing the IoT industry is user and data security. TrulyHandsfree 5.0 includes a layer of security in the voice interface that combines Sensory’s expertise in voice biometrics with deep neural nets to authenticate users, limiting who can access the device. TrulyHandsfree 5.0’s embedded speaker verification technology is highly flexible, allowing users to enroll their voice and their own custom trigger or passphrase, restricting unauthorized users from accessing the voice user interface. Even if an unauthorized person learns the trigger or passphrase, Sensory’s voice biometrics technology will recognize that it’s not the enrolled user speaking and will not authenticate them, preventing them from accessing the device.

Advanced Signal Processing for Voice Barge-In and Far-Field Speech Recognition

TrulyHandsfree 5.0 also adds voice barge-in, enabled by Sensory’s proprietary Acoustic Echo Cancellation (AEC) technology. Users can interrupt a device while it is playing voice prompts, music or other sounds by saying the trigger phrase, and can then control music playback by voice or issue any other supported speech command. This provides a more fluid voice user interface experience. Sensory’s new AEC technology is tuned specifically to maximize speech recognition system accuracy. This boosts the performance not only of the embedded TrulyHandsfree speech recognizer, but also of any cloud-based speech recognition system that the speech requests are passed to.

Further, the overall performance of voice user interface systems is greatly affected by the signal-to-noise ratio of the audio signal received. Previous versions of TrulyHandsfree boasted excellent robustness to noise. With version 5.0, however, Sensory incorporates new deep learning noise suppression algorithms that reduce the level of ambient noise provided to the speech recognizer to ensure that wake words and voice requests are heard clearly, further improving TrulyHandsfree’s recognition hit rate. This is especially helpful in home, automotive and mobile applications where background noise can overshadow the volume of the user’s voice.

Same Low-Power and Efficient Footprint

Today, voice has surpassed all other interface options for a growing list of device categories. However, most devices on the market today rely on cloud services for AI processing, and these cloud-based solutions cannot be accessed completely hands-free without a client-side voice trigger technology. Many of today’s always-listening voice-enabled device applications, especially low-power devices that don’t have the required resources to run completely off the cloud, can benefit from a hybrid client/cloud approach that taps TrulyHandsfree technology. TrulyHandsfree is extremely resource- and power-efficient, with ports available for platforms ranging from today’s most powerful applications processors to low-power DSPs. For ultra-low power devices that have limited battery capacity, such as wearables, Sensory offers its Low Power Sound Detector (LPSD) hardware component for DSPs and smart microphones, which allows low-power configurations of TrulyHandsfree to operate at an average battery draw of less than 1mA.

“The demand for voice user interfaces continues to grow rapidly and TrulyHandsfree 5.0 will allow more manufacturers to incorporate low cost, low power voice user interfaces on device without sacrificing the cloud accuracy,” said Todd Mozer, CEO of Sensory. “TrulyHandsfree 5.0 offers the most advanced and efficient embedded AI technologies we’ve ever created. Additionally, we’ve set the bar higher than ever before for speech recognition accuracy by applying our new proprietary echo cancellation and noise reduction algorithms that we are confident will boost far-field voice performance for IoT devices of all kinds.”

TrulyHandsfree is the most widely deployed embedded speech recognition engine in the world, having enabled a hands-free voice user experience on more than 2 billion devices from leading brands worldwide. Additionally, Sensory can deliver voice triggers for all major IoT cloud services, including Amazon AVS, Apple Siri, Google Assistant and Microsoft Cortana, and provide developer support for cloud service interfaces on Linux, Android, iOS and Windows as well as support for dozens of proprietary DSPs, microcontrollers, smart microphones and other low-power embedded devices.

For more information about this announcement, Sensory or its technologies, please contact sales@sensory.com. For press inquiries, contact press@sensory.com.

About Sensory

Sensory Inc. creates a safer and superior UX through vision and voice technologies. Sensory’s technologies are widely deployed in consumer electronics applications including mobile phones, automotive, wearables, toys, IoT and various home electronics. Sensory’s product line includes TrulyHandsfree voice control, TrulySecure biometric authentication, and TrulyNatural large vocabulary natural language embedded speech recognition. Sensory’s technologies have shipped in over a billion units of leading consumer products. Visit Sensory at www.sensory.com.
