Sensory Makes Wake Words Smarter with New High-Res Voice Recognition and Authentication

Sensory, the leader and originator of wake words for personal assistants, raises the performance bar for always-listening wake word and speech recognition, with upgraded turn-key voice interface solutions for devices of all shapes, sizes and power requirements

Santa Clara, Calif., September 27, 2018 – Sensory, a Silicon Valley-based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded AI technologies, today announced that it has made significant upgrades to the embedded AI in its sixth generation of TrulyHandsfree™, boosting the technology’s already industry-leading wake word performance and accuracy by more than 65 percent. Additionally, TrulyHandsfree boasts improved deep neural network training that allows for even better near- and far-field speech recognition performance in all room conditions.

“In ideal conditions, an always-listening wake word recognizer should only wake the voice UI when it hears the wake word, and never false fire,” said Todd Mozer, CEO of Sensory. “This is how we designed TrulyHandsfree to operate, and with the upgraded AI technologies in version 6.0, it works exactly as it should even in far less than ideal conditions. TrulyHandsfree has long been the performance benchmark that other voice UI solutions strive to match. With our sixth generation, the bar for efficiency, security, wake word accuracy, and near- and far-field performance has been set much higher, making it the ideal wake word solution for any kind of voice-enabled product design.”

Greater Wake Word and Speech Recognition Accuracy

Version 6.0 improves performance and word recognition accuracy relative to the last two generations of TrulyHandsfree by reducing wake word false positives by more than 65%. This reduction is due to Sensory’s new high-resolution speech feature front-end, which works from a higher-resolution digitized representation of the speech audio, combined with the introduction of on-device wake word post-qualification. Post-qualification uses intelligence about wake word events to better discriminate against false positives. Beyond wake word performance, the new front-end has improved upon what was already the benchmark for embedded speech recognizers, contributing a significant boost in TrulyHandsfree 6.0’s speech recognition performance and accuracy over previous generations. The front-end’s accuracy improvements also enable TrulyHandsfree to support multiple wake words, like “Okay Google,” “Alexa™,” “Hey Cortana,” “Hey Siri” and “Xiaodu, Xiaodu,” in a single implementation with great performance. This allows device makers to create new products with a user-friendly voice interface that works with more than one digital assistant technology.
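
The release does not describe how post-qualification works internally. As a rough illustration of the general idea only, a second check on each candidate wake word event might look like the sketch below. The `WakeEvent` fields, the `post_qualify` function and all threshold values are hypothetical and are not part of TrulyHandsfree or its SDK.

```python
from dataclasses import dataclass


@dataclass
class WakeEvent:
    """A candidate detection from a first-stage wake word spotter (hypothetical)."""
    score: float          # first-stage confidence, assumed to lie in 0..1
    duration_ms: int      # length of the audio segment that triggered
    recheck_score: float  # confidence from a second, more careful pass


def post_qualify(event, min_score=0.5, min_ms=300, max_ms=1500, min_recheck=0.7):
    """Return True only if the candidate passes every secondary check.

    All thresholds are illustrative assumptions, not vendor values.
    """
    if event.score < min_score:
        return False  # first stage was not confident enough
    if not (min_ms <= event.duration_ms <= max_ms):
        return False  # too short or too long to plausibly be the wake word
    return event.recheck_score >= min_recheck  # second pass must agree
```

In a design like this, the first stage can stay cheap and permissive, while the second check runs only on the rare candidate events, which is one common way to cut false positives without raising the always-on cost.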

Tolerance of Accents, Dialects and Varying Room Conditions

Sensory upgraded the machine learning within TrulyHandsfree to take advantage of high-resolution audio information for deep neural network training. This improved training allows the algorithms to anticipate a variety of factors associated with wake word performance, including how one person, or a population of people, may pronounce a wake word. It also takes into consideration acoustic challenges like various room configurations, device placement, room size, reverb and echo. This ensures superior always-listening speech recognition performance regardless of where the device is placed in a room.

Enhanced Security Makes Sure That It’s You Speaking

Some of the most notable challenges facing the IoT industry are privacy and data security. TrulyHandsfree 6.0 does all processing on device, keeping voice data completely safe by never storing it or sending it to the cloud. Additionally, TrulyHandsfree 6.0 includes a layer of voice biometrics recognition in the voice interface for user authentication and security. TrulyHandsfree’s embedded high-resolution voice enrollment and speaker verification (SV) technology is flexible, allowing users to enroll their voice and their own custom wake word or passphrase, restricting unauthorized users from accessing the voice user interface. Even if an unauthorized person learns the custom wake word or passphrase, Sensory’s voice biometrics technology will recognize that it is not the enrolled user speaking and will not authenticate them.
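
To make the enrollment and verification flow concrete, here is a minimal sketch of one common approach to speaker verification: average a few enrollment feature vectors into a voiceprint, then accept a new utterance only if the passphrase matched and the voice is close enough to that voiceprint. This is a generic textbook technique, not Sensory’s algorithm. The function names, the toy three-dimensional vectors and the 0.8 threshold are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def enroll(samples):
    """Average several enrollment vectors into a single voiceprint."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def authenticate(phrase_matched, voiceprint, embedding, threshold=0.8):
    """Accept only if the passphrase matched AND the voice resembles the enrollee.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return phrase_matched and cosine(voiceprint, embedding) >= threshold


# Toy usage: the enrolled user is accepted; a different voice saying the same
# passphrase is rejected because its vector points in a different direction.
voiceprint = enroll([[1.0, 0.0, 0.2], [0.9, 0.1, 0.2]])
same_speaker = authenticate(True, voiceprint, [1.0, 0.05, 0.2])
impostor = authenticate(True, voiceprint, [0.0, 1.0, 0.1])
```

Requiring both conditions is what lets a system of this kind reject someone who has learned the passphrase but does not sound like the enrolled user.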

Improved Barge-in and Far-Field Performance 

Sensory offers an upgraded acoustic echo cancellation (AEC) solution, specifically tuned to provide an ideal voice barge-in experience with TrulyHandsfree, that supports single-mic input systems with mono or stereo sound sources. This technology allows users to interrupt their devices by saying the wake word while the device is in the middle of playing voice prompts, music or other sounds, at any volume level. Sensory’s new high-resolution speech feature front-end and AEC also play major roles in making TrulyHandsfree better at hearing users in a variety of room sizes and configurations, making TrulyHandsfree 6.0’s far-field wake word performance second to none.
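
For readers unfamiliar with echo cancellation, the sketch below shows the textbook normalized least-mean-squares (NLMS) adaptive filter, one widely used way to subtract a device’s own playback from its microphone signal so that a wake word spoken over the playback remains audible. It is not Sensory’s implementation. It handles a single mono reference only, and the function names, tap count, step size and simulated echo path are assumptions chosen for the demonstration.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Remove the echo of `ref` (loudspeaker signal) from `mic`, sample by sample.

    Returns the residual signal, which is what a wake word recognizer would hear.
    """
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = m - echo_estimate
        power = sum(x * x for x in history) + eps  # normalization term
        weights = [w + mu * error * x / power for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def demo_residual_ratio():
    """Simulate playback echoing through a short, made-up echo path.

    Returns residual energy divided by microphone energy over the last
    200 samples; a small value means the echo was largely cancelled.
    """
    rng = random.Random(0)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    path = [0.6, 0.3, 0.1]  # hypothetical echo path, for the simulation only
    mic = [
        sum(path[k] * ref[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(ref))
    ]
    out = nlms_echo_cancel(mic, ref)
    mic_energy = sum(x * x for x in mic[-200:])
    out_energy = sum(x * x for x in out[-200:])
    return out_energy / mic_energy
```

The simulation contains no speech and no background noise, so it shows only that the filter learns the echo path. A real barge-in system must also avoid adapting while the user is talking, which this sketch does not attempt.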

High-Resolution Speech Recognition, Same Efficient Footprint

In addition to being more accurate, TrulyHandsfree 6.0’s new high-resolution speech feature models are scalable and extremely efficient. Its wake word, command and control, SV and AEC all feature a compact footprint, and work with a wide range of processors, from low-power DSPs to high-power applications processors. TrulyHandsfree’s AEC technology is also scalable and can be tuned to optimally balance acoustic echo cancellation performance requirements and MIPS/power restrictions. For ultra-low-power devices that have limited battery capacity, such as wearables, Sensory offers its Low Power Sound Detector (LPSD) hardware component for DSPs and smart microphones. LPSD allows low-power configurations of TrulyHandsfree wake words to operate at an average battery draw of under 1 mA.

TrulyHandsfree is the most widely deployed embedded speech recognition engine in the world, having enabled a hands-free voice user experience on more than 2 billion devices from leading brands worldwide. TrulyHandsfree offers support for every voice UI application with several types of wake word options, such as independent fixed wake words, user-enrolled fixed wake words and user-defined wake words. Sensory offers off-the-shelf wake word models for all major IoT cloud services, including Amazon AVS, Apple Siri, the Google Assistant and Microsoft Cortana, as well as wake word models for third-party devices that support cloud AI systems from Baidu, Alibaba or Tencent. Sensory can also combine multiple wake words into one solution.

Sensory’s TrulyHandsfree supports US English, UK English, Arabic, Dutch, French, German, Italian, Japanese, Korean, Mandarin, Portuguese, Russian, Spanish, Swedish and Turkish. The TrulyHandsfree SDK is available for Android, iOS, Linux, QNX and Windows. Sensory provides developer support for cloud service interfaces on Linux, Android, iOS and Windows as well as support for dozens of proprietary DSPs, microcontrollers, smart microphones and other low-power embedded devices.

Additionally, ultra-low-power deeply embedded ports of TrulyHandsfree are available for leading DSP/MCU IP cores from ARM, Cadence, CEVA, NXP, Synopsys and Verisilicon, as well as for integrated circuits from Ambiq Micro, Analog Devices, Cirrus Logic, DSP Group, Fortemedia, Intel, Knowles, Microchip (Microsemi), NXP, Qualcomm, QuickLogic, Realtek, Synaptics, STMicroelectronics, TI, Yamaha, and XMOS.
