
We Already Have Cars That Can See; How About Cars That Can Hear?

As I’ve mentioned before (and as I will doubtless remark on again), if we were to ask how many senses humans have, most people’s knee-jerk reaction would be to say “Five” and to recite those senses we were all taught at school: Sight, Sound, Smell, Taste, and Touch. In reality, as I wrote in a blog about the possibilities of alien life, we all have 20 or more senses.

Many artificial intelligence (AI), machine learning (ML), and deep learning (DL) applications at the edge—where the internet rubber meets the real-world road—are focused on sight. This is predominantly realized using RGB cameras and the visible spectrum, although this is being extended into the infrared (IR) realm as we speak (for example, see Within Five Years, All New Cars Will Be Able to See at Night!).

Work is taking place on olfactory (smell, e-nose) and gustatory (taste, e-tongue) sensors, but these are a long way from becoming mainstream. As for touch, there are robot arms whose manipulators boast force/torque sensing capability. The example I always think of here is trying to thread a nut onto a bolt (machine screw) with my eyes closed: I could quickly detect if I’d crossed the thread, reverse the rotation a little, and try again. Similarly, in addition to using machine vision to detect, identify, and guide the picking-up of nuts and bolts, a robot equipped with force/torque sensing can be trained to recognize and address cross-thread-type problems.
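For the software-minded among us, here’s a minimal sketch of that retry logic in Python. Everything in it is an illustrative assumption rather than any real robot’s interface: the torque thresholds, the (turns, torque) trace format, and the example traces are all made up. The key observation is simply that torque spiking before the nut could possibly have seated suggests a crossed thread.

# A hedged sketch of cross-thread detection from a torque trace.
# All thresholds and traces below are illustrative assumptions.

CROSS_THREAD_TORQUE_NM = 0.8  # assumed limit; a clean run-down stays below this

def check_threading(torque_trace, min_turns_to_seat=4.0):
    """Classify a trace of (turns_completed, torque) readings.

    High torque early in the run-down suggests a crossed thread;
    high torque only after enough turns suggests the nut has seated.
    """
    for turns, torque in torque_trace:
        if torque > CROSS_THREAD_TORQUE_NM:
            return "cross-threaded" if turns < min_turns_to_seat else "seated"
    return "incomplete"  # never hit seating torque; keep driving

# A clean run-down versus an early spike (both invented):
print(check_threading([(1, 0.1), (3, 0.2), (5, 0.9)]))  # seated
print(check_threading([(0.5, 0.9)]))                    # cross-threaded

On a “cross-threaded” verdict, the controller would do exactly what I do: back the rotation off a little and try again.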

Another example that just popped into my mind was when I was about 12 years old visiting my cousin Gillian, who is a year younger than me. We were playing phonograph records (for younger readers, think 1-foot-diameter plastic CDs). The way this worked (again, this is for younger readers) was that you placed the record on a special platform called a record deck, and then you gently moved an arm with a needle on the end to place the tip of the needle onto the spinning record. This particular record deck also had the ability to move its arm automatically. Something must have gone wrong inside because the arm was reluctant to move. If I had been the one working the deck, I would have immediately realized something was wrong and returned the arm to its base location while I considered the problem. As it was, I watched in horror as Gillian applied extreme force to make the arm bend to her will (and I’m not using the word “bend” in a figurative sense).

As another example of touch sensing, while spending time at an aircraft manufacturing facility in the south of England circa the late 1970s, I learned a lot about the creation of the turbine blades used in jet engines. The final test for each blade at this facility was for a member of a small team to run a fingernail down the blade. These folks were blind, one result of which was to heighten their other senses. Based on the vibrations they detected, they gave the ultimate “Pass” or “Fail” on each blade. Remember that any failed blades had already passed every other form of inspection; these blades were sliced and diced until the root cause of the problem was found, which ultimately resulted in modifications to the upstream manufacturing and/or testing protocols.

Still on touch, my column BeBop RoboSkin Provides Tactile Awareness for Robots introduced a skin-like covering that can provide humanoid robots with tactile awareness that exceeds the capabilities of human beings with respect to spatial resolution and sensitivity. One example that blew me away was to see an image of a robot finger reading braille.

And so, finally, we arrive at the sense of sound, which is woefully underrated by many people. In my younger years, I spent some time in factories and industrial settings. I remember performing “let’s meander around to see if we can discover anything interesting” walks with grizzled old engineers. As we passed each machine (generator, pump, motor, etc.), it wasn’t uncommon to see them rest their hand on the mechanism for a moment. Every now and then, they would detect an unexpected vibration (too subtle for me to sense), causing them to make a note of that machine for future investigation. Similarly, these unsung heroes were constantly listening for anything untoward. Considering the volume of ambient noise, it amazed me how they could detect relatively insignificant hisses, squeaks, scrapes, and graunching noises that alerted them to a potential problem.

We can, of course, attach sensors directly to machines, and this is indeed the way we typically do things. Having said this, it doesn’t take much for me to imagine a future in which robots are rolling around factories, placing their robot hands on machines to detect vibrations, listening for unwanted sounds with their e-ears, sniffing for unexpected smells with their e-noses, and taste-testing anomalous fluids with their e-tongues.
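To make the “hand on the machine” trick a little more concrete, here’s a hedged Python sketch: capture a vibration signal, split its spectrum into coarse frequency bands, and flag any band whose energy has shifted relative to a known-good baseline. The signals, sample rate, and band count are all invented for illustration.

import numpy as np

def band_energies(signal, n_bands=8):
    """Split the signal's power spectrum into coarse frequency bands."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    return np.array([band.sum() for band in np.array_split(spectrum, n_bands)])

def anomaly_score(signal, baseline):
    """Largest relative per-band energy change versus the healthy baseline."""
    return np.max(np.abs(band_energies(signal) - baseline) / (baseline + 1e-9))

# Healthy machine: a 50 Hz hum. Suspect machine: same hum plus a 180 Hz squeak.
t = np.arange(0, 1, 1 / 1000)  # one second sampled at 1 kHz
healthy = np.sin(2 * np.pi * 50 * t)
suspect = healthy + 0.3 * np.sin(2 * np.pi * 180 * t)

baseline = band_energies(healthy)
print(anomaly_score(healthy, baseline))  # ~0: nothing to report
print(anomaly_score(suspect, baseline))  # large: note this machine for investigation

The grizzled engineers were doing something similar in wetware; they simply carried the baseline around in their heads.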

The reason for my sensory-based cogitations and ruminations is that I was just chatting with Anders Hardebring, CEO of Imagimob, and Steve Tateosian, SVP of IoT and Industrial MCUs at Infineon Technologies. Why was I chatting with both these guys? Well, as a global semiconductor leader in power systems and the internet of things (IoT), Infineon creates the sensors, microcontrollers, communication devices, security technologies, and related software that are indispensable in bringing the consumer IoT and the industrial IoT (IIoT) to life. Meanwhile, Imagimob is a leading player in the fast-growing market for tiny machine learning (TinyML) and automated machine learning (AutoML), providing an end-to-end development platform for machine learning on edge devices. Imagimob’s platform enables a wide range of use cases, such as audio event detection, voice control, predictive maintenance, gesture recognition, and signal classification. All this explains why Infineon acquired Imagimob in May 2023.

So, what does Imagimob bring to Infineon’s party? Oh, so, so much. First, they offer IMAGIMOB Studio, which is a state-of-the-art development platform for AI/ML on edge devices. Employing the Graph UX format, users can visualize their ML modeling workflows and leverage advanced capabilities to develop edge device models better and faster. IMAGIMOB Studio’s Graph UX interface is designed to bring greater ease and clarity to the ML modeling process while offering advanced new capabilities such as built-in data collection and real-time model evaluation for Infineon hardware.
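To give a feel for what visualizing a workflow as a graph means in practice, here’s a toy Python illustration; it is emphatically not Imagimob’s actual API, just the shape of the idea. Each node is a processing stage, and data flows along the edges from capture to verdict.

# A toy graph-style workflow: each node is a stage, edges carry the data.
# Every function here is a stand-in, purely to illustrate the shape of the
# pipeline a tool like Graph UX lets you wire up visually.

def collect(_):     return [0.1, 0.9, 0.2, 0.8]   # stand-in for live data capture
def preprocess(x):  return [abs(v) for v in x]    # stand-in for filtering/windowing
def model(x):       return sum(x) / len(x)        # stand-in for the trained model
def decide(score):  return "event" if score > 0.4 else "no event"

pipeline = [collect, preprocess, model, decide]   # the "graph": a chain of named nodes

data = None
for node in pipeline:
    data = node(data)
print(data)  # -> event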

As you hopefully recall, I was waffling on about using sound to detect problems. Well, an example I’d certainly never heard before (yes, pun intended) is using sound to detect bad welds. My initial reaction was to think of hitting a pre-welded object with a hammer and listening to the ensuing ringing noise, but I was wrong. It turns out that a welder with years of experience can tell the difference between a good weld and a bad weld by the sound as the weld is being made. But what if you are using a robot to do the welding for you? Well, a US company used IMAGIMOB Studio to create an AI/ML model that can use sound to accurately tell a good weld from a bad weld. This weld detection was captured in action at the 2023 TinyML Summit.
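I have no visibility into the model that company actually built, but here’s a hedged sketch of the general recipe for this kind of audio classification: extract MFCC features from short clips of weld audio and train a small classifier on them. The file names and labels below are hypothetical.

import numpy as np
import librosa
from sklearn.svm import SVC

def clip_features(path, sr=16000):
    """Average the MFCCs over a clip to get one fixed-length feature vector."""
    audio, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13).mean(axis=1)

# Hypothetical labeled recordings: 1 = good weld, 0 = bad weld.
paths = ["good_01.wav", "good_02.wav", "bad_01.wav", "bad_02.wav"]
labels = [1, 1, 0, 0]

clf = SVC(kernel="rbf").fit(np.array([clip_features(p) for p in paths]), labels)
print(clf.predict([clip_features("new_weld.wav")]))  # 1 = sounds like a good weld

A production model would, of course, be trained on far more data and then shrunk down to run on a microcontroller, but the bones of the approach are the same.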

My understanding is that IMAGIMOB Studio is so easy to use that it was interns at the aforementioned US company who created this weld quality detection model in a couple of weeks. Of course, “easy” is a subjective term, and some of the interns I meet these days are scary when it comes to knowing stuff. This is why the folks at Infineon and Imagimob have just launched a library of IMAGIMOB Ready Models.

These little rascals (the Ready Models, not the guys and gals at Infineon and Imagimob) are production-ready edge AI/ML models for companies that either don’t have internal AI/ML know-how or want to enable AI/ML features in their edge devices without the time and cost associated with building a custom model themselves. These models will run on any Arm Cortex-M4 processor, but they are tuned to run on Infineon’s PSoC 6 IoT microcontrollers. The dual-core 32-bit Arm Cortex-M4 and Cortex-M0+ architecture in PSoC 6 devices lets designers optimize for power and performance simultaneously. Furthermore, in addition to these processors, PSoC 6 devices also boast (nay, flaunt) programmable analog and digital functionality.
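As a conceptual illustration only (I’m not describing the Ready Models’ real on-device interface), here’s how a streaming audio detector typically consumes data on a small processor: samples are enqueued a frame at a time, and the model runs whenever its rolling window has accumulated enough fresh material. The window size, hop, and stand-in “model” are all assumptions for the sketch.

from collections import deque

class StreamingDetector:
    """Conceptual mock of a streaming edge audio model; not a real API."""

    def __init__(self, model, window=16000, hop=8000):
        self.model = model            # callable: list of samples -> score
        self.window = window          # e.g., one second of audio at 16 kHz
        self.hop = hop                # run the model every half window
        self.buf = deque(maxlen=window)
        self.since_last = 0

    def enqueue(self, frame):
        """Feed new samples; return a score once a full, fresh window is ready."""
        self.buf.extend(frame)
        self.since_last += len(frame)
        if len(self.buf) == self.window and self.since_last >= self.hop:
            self.since_last = 0
            return self.model(list(self.buf))
        return None  # not enough new data yet

# Stand-in "model": flags windows with high average amplitude.
detector = StreamingDetector(lambda w: sum(abs(s) for s in w) / len(w))
for frame in ([0.01] * 8000, [0.01] * 8000, [0.9] * 8000):
    score = detector.enqueue(frame)
    if score is not None:
        print(f"window score: {score:.3f}")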

There are four existing Ready Models with more to follow. These four focus on Coughing Detection, Snoring Detection, Baby Cry Detection, and Emergency Vehicle Siren Detection, all of which are audio AI/ML applications.

To be honest, it was the Emergency Vehicle Siren Detection Ready Model that caught my attention. There have been a couple of occasions when I’ve been driving along on autopilot with my mind visiting “Max’s World” (where the flowers are more colorful, the butterflies are bigger, the birds sing sweeter, and the beer flows plentiful and cold), failing to register the siren song of an emergency vehicle until it was embarrassingly close. Having my car detect such a sound in the distance and alert me ahead of time would be advantageous in its own right.

Imagine the uses of a siren detection Ready Model (Source: Imagimob)

In addition to alerting human drivers, autonomous cars that already use a mix of camera, lidar, and radar sensors will also benefit from the ability to hear emergency vehicles in the distance. My understanding is that new regulations are coming that may make this sort of audio detection mandatory for new vehicles as soon as 2027, which makes me think that (a) a lot of automotive OEMs will be scrambling for a solution and (b) Infineon and Imagimob already offer an “off-the-shelf” solution in the form of a Ready Model.
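One practical wrinkle worth sketching: a per-window “siren” score shouldn’t map straight to a dashboard alert, or every stray blip of noise would set it off. Here’s a hedged Python sketch of the usual hysteresis trick, with every threshold invented for illustration: require several consecutive confident windows before alerting, and several quiet ones before clearing.

def siren_alerts(scores, on_threshold=0.8, off_threshold=0.4, hold=3):
    """Yield (score, alert_active), with hysteresis to suppress one-off blips."""
    active, streak = False, 0
    for score in scores:
        if not active:
            streak = streak + 1 if score >= on_threshold else 0
            if streak >= hold:
                active, streak = True, 0
        else:
            streak = streak + 1 if score <= off_threshold else 0
            if streak >= hold:
                active, streak = False, 0
        yield score, active

# One noisy blip (0.9) doesn't alert; a sustained run of high scores does.
scores = [0.1, 0.9, 0.1, 0.85, 0.9, 0.95, 0.9, 0.2, 0.1, 0.1]
for s, a in siren_alerts(scores):
    print(f"{s:.2f} -> {'ALERT' if a else 'quiet'}")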

I don’t know about you, but this has given me a lot to think about. What say you? Do you have any thoughts you’d care to share on any of this with the rest of us?
